Stories linking everyone in Telecom


This year has become the moment of truth for VoIP businesses. At the same time, it has created a “grow fast or die” situation for smaller VoIP vendors, who are now under tremendous pressure from Google, Facebook, Apple, Microsoft, Slack, Zoom, and the gang, each pushing its solution forward as the default option for the “new normal.” As Tsahi Levent-Levi wisely points out in his blog: “COVID-19 is causing all communication vendors to fast forward and accelerate their roadmaps by 6-18 months. Those that don’t are going to be left behind on the other side of this pandemic.”

In today’s story we will introduce you to WebTrit, our next-generation softphone solution, and discuss WebRTC, the breakthrough VoIP technology that made WebTrit possible. We will also explain why (in our opinion) the WebRTC ecosystem will soon replace traditional SIP telephony, and why it is time for you to act now.

But before we start comparing SIP (which over the last 10 years was the de facto standard for VoIP communications, from hardware deskphones to dialer apps for smartphones) to WebRTC, let’s do a quick recap, just three paragraphs, of what’s been going on in telecommunications over the past, say, two centuries.

A (Very) Brief History of Telecommunications in Three Paragraphs

Bell, Gray, and Meucci invented the telephone sometime in the 19th century. It simplified communication, enabled various new business models (pizza delivery?) and contributed to the decline of the global population of postal pigeons.

Fast forward a century: international telephony prospers, but calls are too expensive. An ecosystem of long-distance call providers appears. The predominant protocol for VoIP communications is H.323, created by the ITU (International Telecommunication Union). It tries to mimic the functionality of telephony networks (SS7), which makes it fairly complex, so only large operators can set up and operate VoIP networks properly. Then, suddenly, it’s 1996: Handley, Schulzrinne, Schooler, and Rosenberg invent SIP, which makes it much easier to write a VoIP client or a server. So a single hacker can create a whole VoIP PBX, like Mark Spencer did with the Asterisk project! And fast forward again another decade: SIP has democratized long-distance telephony, killed international mobile roaming, and made PSTN call providers obsolete.

Demo of an early SIP SoftPhone. Yes, this 2008 video now looks like digital archaeology to millennials.

However, SIP only defines signaling, i.e., how to establish or disconnect a call. A separate technology is required for media delivery, i.e., passing the voice stream from the caller to the callee over the IP network. Hence the appearance of an ecosystem of Real-time Transport Protocol (RTP) and its secure twin, SRTP. SIP calls were cheap, while transport layer solutions were expensive to develop. This launched a race to invent that was akin to the moon race of the 1960s.
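To make the signaling/media split concrete, here is a minimal sketch of a SIP INVITE carrying an SDP body. The SIP headers only set up the call; the SDP lines tell the other side where and how to send the RTP voice stream. All addresses, tags, and the `InviteParams` shape are hypothetical values chosen for illustration, and a real client would need much more (authentication, responses, retransmissions).

```typescript
// Hypothetical parameters for illustration only.
interface InviteParams {
  caller: string;   // e.g. "alice@example.com"
  callee: string;   // e.g. "bob@example.com"
  callId: string;
  localIp: string;  // address where the caller wants to receive RTP
  rtpPort: number;  // UDP port where the caller wants to receive RTP
}

const CRLF = "\r\n";

// SDP body: describes the media session, i.e. the RTP side of the call.
function buildSdp(p: InviteParams): string {
  const lines = [
    "v=0",
    `o=- 0 0 IN IP4 ${p.localIp}`,
    "s=-",
    `c=IN IP4 ${p.localIp}`,
    "t=0 0",
    `m=audio ${p.rtpPort} RTP/AVP 0`, // payload type 0 = PCMU (G.711 u-law)
    "a=rtpmap:0 PCMU/8000",
  ];
  return lines.join(CRLF) + CRLF;
}

// SIP request: pure signaling. It says who calls whom, not how audio travels.
function buildInvite(p: InviteParams): string {
  const body = buildSdp(p);
  const headers = [
    `INVITE sip:${p.callee} SIP/2.0`,
    `Via: SIP/2.0/UDP ${p.localIp}:5060;branch=z9hG4bK-demo`,
    "Max-Forwards: 70",
    `From: <sip:${p.caller}>;tag=demo`,
    `To: <sip:${p.callee}>`,
    `Call-ID: ${p.callId}`,
    "CSeq: 1 INVITE",
    `Contact: <sip:${p.localIp}:5060>`,
    "Content-Type: application/sdp",
    `Content-Length: ${body.length}`, // body is plain ASCII, so characters == bytes
  ];
  // An empty line separates the SIP headers from the SDP body.
  return headers.join(CRLF) + CRLF + CRLF + body;
}
```

Notice that nothing in the SIP headers mentions audio at all: the codec and the RTP port live entirely in the SDP body, which is exactly why a separate media ecosystem had to grow next to SIP.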

Rise of WebRTC and How PortaOne Contributed

In 2007–2008, PortaOne worked with Stockholm-based Global IP Solutions (GIPS), which delivered high-performance VoIP codecs that produced good sound quality even over a poor Internet connection (this was before the omnipresence of 4G/LTE and the proliferation of fiber connections!) and on low-end customer equipment, such as a five-year-old laptop (five years old in 2007, even). We made it possible to use these codecs and the GIPS SDK in conjunction with PortaSwitch to build VoIP applications that would reach a majority of Internet users.

In 2007, Globe7, a supplier of ad-powered communications, had over 30 million active users. The Globe7 service was running on the largest (at the time!) PortaSwitch installation across several datacenters in the UK and India. Another company, based in Estonia, used GIPS codecs inside their peer-to-peer calling software. This company was Skype, which, by 2010, had 600+ million users worldwide and was acquired by Microsoft in 2011. In 2010, Google acquired GIPS for $68.2 million. Then, in May 2011, Google released an open-source project for browser-based real-time communication called WebRTC (Web Real-Time Communication).

Why Is Google Interested?

Unlike its archrival Amazon (hence the term “amazoning”), Google is (or at least was) used to “settling accounts” with competitors in a “not evil” manner, preferring to build a new ecosystem instead. And with WebRTC, that actually worked. Google built the entire “in-browser VoIP” customer experience from scratch, treating each market participant as equal. But what were the true reasons for this coding philanthropy?

In the past, Google demonstrated the ability to leverage its major strengths: search domination (hello, PageRank patents), Android mobile OS, and the Chrome browser (not to mention YouTube and Gmail). From the very beginning, the WebRTC story had a lot to do with strength No. 3, i.e., Chrome: the browser, the OS, and the netbook triune.

Before acquiring GIPS, Google acquired Gizmo5 in 2009. At the time of acquisition, Gizmo5 was developing freeware SIP softphones, which Google decided to add to their already existing VoIP portfolio. Getting an understanding of that portfolio requires some digital historiography effort and a good glass of… green tea. Nevertheless, we will try.

Google, VoIP and WebRTC: “It’s Complicated”

At the end of the ’00s, there were already two VoIP products available from Google: Talk and Voice. Google launched Talk in 2005 as a desktop companion to Gmail. At first, Google Talk allowed Gmail customers to see their new email notifications and use text chat via Extensible Messaging and Presence Protocol (XMPP), which itself grew out of Jabber (acquired by Cisco in 2008). Then Google introduced voice calls and voicemail for Talk.

Early demo of Google Talk UI, from before Google launched WebRTC. Beware of the bear!

The Google Voice Story

Google Voice came into existence in 2009, two years after Google acquired GrandCentral Communications. While Google Talk targeted mostly digital natives (the core audience of Gmail at that time), Google Voice originally targeted conventional landline customers. Each Google Voice customer was assigned a US number, which Google then used for international SIP telephony and iPhone-like visual voicemail, later adding automated voicemail transcriptions.

Google Duplex AI recorded demo with multi-billion-dollar CEO Sundar Pichai ironizing about live human beings.

But all that was not enough to beat Skype, which was growing superfast throughout the ’00s, until it was acquired by Microsoft in 2011. By the time Google launched Voice, Skype already had Skype Credits for international landline calls and had released the first Skypephone in 2007. For the sake of digital archaeology, we wish that Microsoft had not killed the original Skype website announcements. However, it did 😭, so all that is left to us in 2020 are third-party reviews of the Skypephone.

A video presentation from Google’s 2015 event, which reflects Google’s vision for WebRTC for the coming decade.

Chrome’s Market Share Soars

Let’s get back to Google’s world domination and how it leverages its superpowers. BTW, we are intentionally leaving out the “Hangouts Saga” because including that would require another glass of green tea or a healthy drink of your preference (actually, several glasses). So, by 2010 Chrome was completely cross-platform on the desktop, and in 2012 it was ported to Android (and has completely replaced Android’s native browser since Android Lollipop in 2014). iOS followed suit, with Safari remaining dominant and Apple requiring any app (including Chrome) to use WebKit, Apple’s own rendering engine. Guess what Google did then?

While the concept of Chrome OS and Chromebook enjoyed modest success, the Chrome browser gained over 50% market share on desktops (including laptops) sometime in 2015 (according to Statista) and never dropped below that point. So, promoting an open standard that turns any device running that over-50-percent-market-share browser into a softphone with features comparable to a modern office PBX did not sound like pure philanthropy anymore. However, the existing “king” (aka SIP) was still in power. And that brings us to a relevant (in Google’s own terms) question.

What Is Wrong with SIP?

In brief: nothing. Except for broken security, but this is a problem common to almost any communications protocol. SIP is a great protocol… from the ’90s. It created a revolution in voice calls and enabled the existence of the whole VoIP industry. However, a lot has changed since then. Session initiation was a big deal in the times of rotary phones and landline giants like AT&T or Deutsche Telekom. In 2020, a VoIP session can be initiated by anything: from SMS to a microwave oven.

All the money in the SIP/VoIP business currently comes not from VoIP calls themselves (unless you are Amazon, Azure, or Oracle providing the cloud processing power), but from (1) PSTN termination, and (2) custom-developed transport layer applications. And PSTN termination was successfully challenged by Skype, WhatsApp, Viber and the like, each pushing its own “secret sauce” for the transport layer.

OK, what if there is no need for the “secret sauce” anymore, and any teenager can now build her own softphone using the default browser APIs and a week of basic JavaScript coding? This hip throw makes all the existing VoIP moguls and their “secret transport layer sauce” irrelevant, together with their codec wars, patents, and proprietary code. All that customers and the new generation of developers need is a fast and reliable web browser, equipped with voice and video libraries “out of the box.” Here, Google suddenly got an unexpected ally: Facebook.
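To give a feel for how little code the calling side needs, here is a hedged sketch of the first steps of an outgoing call with the standard browser APIs (`getUserMedia`, `RTCPeerConnection.addTrack`, `createOffer`, `setLocalDescription`). The small `*Like` interfaces are our own stand-ins so the sketch compiles outside a browser, and `sendToPeer` is a hypothetical signaling function (WebRTC deliberately leaves signaling up to the application). A real softphone would also have to handle the remote answer and ICE candidates, which are omitted here.

```typescript
// Minimal structural stand-ins for the browser types (assumed shapes, so this
// compiles without DOM typings). In a browser, RTCPeerConnection and
// navigator.mediaDevices are expected to fit these shapes.
interface TrackLike { kind: string }
interface StreamLike { getTracks(): TrackLike[] }
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<StreamLike>;
}
interface OfferLike { type?: string; sdp?: string }
interface PeerLike {
  addTrack(track: TrackLike, stream: StreamLike): unknown;
  createOffer(): Promise<OfferLike>;
  setLocalDescription(offer: OfferLike): Promise<void>;
}

// Start an audio-only outgoing call and hand the SDP offer to the signaling layer.
async function startOutgoingCall(
  pc: PeerLike,
  devices: MediaDevicesLike,
  sendToPeer: (offer: OfferLike) => void, // hypothetical: e.g. a WebSocket send
): Promise<void> {
  // 1. Ask the browser for the microphone.
  const stream = await devices.getUserMedia({ audio: true, video: false });
  // 2. Attach each captured track to the peer connection.
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }
  // 3. Let the browser generate the SDP offer and apply it locally.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  // 4. Deliver the offer to the callee over whatever signaling you choose.
  sendToPeer(offer);
}

// In a browser this would be called roughly as (not executed here):
//   startOutgoingCall(new RTCPeerConnection(), navigator.mediaDevices, send);
```

Codecs, encryption, and network traversal never appear in this code: the browser takes care of them, which is the whole point of the paragraph above.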

Hey Facebook!

Unlike Google, for a long time Facebook did not have either an operating system or any popular desktop application besides WhatsApp Web (formerly known as WhatsApp Desktop). This made Facebook, whose primary moneymaker is still social networking, highly dependent on a web browser in “all things desktop.” Moreover, Facebook’s Messenger app for Android (and iOS, although with certain Apple-specific nuances) now uses WebRTC for VoIP calls.

But forget the money! Since 2019, messaging has been officially Facebook’s main strategic focus. That’s totally understandable, given that Facebook by that time owned the world’s largest “yellow pages” directory, packed with petabytes of additional information. And its customers are still voluntarily providing and updating this info on a regular (sometimes excessively regular) basis.

The WhatsApp Drama

Adopting WebRTC meant “writing off” a substantial part of the $19-billion investment Facebook made in WhatsApp back in 2014. However, Facebook still invested, even though WebRTC had already been around for several years. This means that, unlike with Instagram, acquiring WhatsApp was more about removing a rival than it was about incorporating it as a feature. Basically, years of coding its own VoIP platform at WhatsApp ended up with Facebook launching a standalone macOS client for Facebook Messenger (a direct internal competitor to WhatsApp) in 2020. Speaking in “distracted boyfriend” meme terms:

Facebook's relationship with WebRTC and WhatsApp in distracted GF meme terms
Facebook's relationship with WebRTC and WhatsApp in distracted BF meme terms
To be gender-balanced we decided to include both sides of the story 😂 Pick the one you like.

So while Google and Facebook compete heavily in the online advertising business, Facebook has to rely on Google Chrome as its key service delivery vehicle for the end customers in all matters VoIP. And what’s Chrome’s default solution for VoIP? Right, it’s WebRTC.

Are You Here, Siri?

WebRTC’s “take it and sell it as yours” approach to the technology stack is the opposite of everything in Apple’s business philosophy. Obviously, this brought about a Mexican standoff that lasted up until recently. Lots of vendors, including Facebook, put pressure on Apple to start supporting WebRTC. Facebook went as far as to intentionally leave the “video call” button available in the web version of Messenger for Safari, only to advise interested customers to start using a “better” browser (hello, Chrome and Firefox, the “younger twin”).

Courtesy: Techcrunch. Although we did create added value for this picture by recropping the screenshot without leftovers of Messenger’s background 😉

Finally, Safari 11 for iOS came with WebRTC support in 2017. This feels like a “forced concession.” Nevertheless, it is great that one of the largest hardware vendors in the world is “on board” now.

Codec wars, hardware acceleration, and Apple’s persistence with WebKit remain big issues. However, it is now obvious that Apple has accepted WebRTC as a de facto standard for browser-based VoIP on macOS. This, on the other hand, does not mean WebRTC’s road to iOS and tvOS will be fast and easy. Apple has a lot of proprietary code and patents there. It will not be willing to give up its “walled garden,” which, after all, it was the first one to create.

What WebRTC Means to the Existing SIP Ecosystem (and What You Can Do About It)

It is the early ’20s now and you are not a founder of Skype or WhatsApp (or maybe you are… who knows who is reading our blog?). There is no Microsoft or Facebook to bail you out with their billions and cover years of coding a “secret sauce” VoIP solution. Moreover, while you are still playing the old “SIP + transport layer” game, your competitor might already be on WebRTC. And that might help this competitor divert the now-available resources to some relevant products and features, sharpening its competitive edge and drawing your customers away.

This autumn, PortaOne-backed startup WebTrit launched an open platform and an SDK for new communication apps. WebTrit consists of a WebRTC Gateway and various clients or “dialers” for browsers (Chrome, Firefox and Safari) and mobile platforms (Android, iOS). It also has an API connecting the Gateway to PortaSwitch. The dialer is a Vue.js-based web app. You can add anything you want and structure it according to the business model of your choice. For now, WebTrit operates mostly in MCU mode and runs on top of the Janus server.

Why Did We Select Janus WebRTC Server?

  1. Great performance. WebRTC Hacks conducted comparative testing, and Janus demonstrated outstanding performance, especially in load-intensive use cases, for example: conferencing with 300+ participants.
  2. Vibrant community. OK, forks-and-stars on GitHub is not really a valid community metric. Nevertheless, Janus has a sound presence on GitHub with frequent daily updates (as of Autumn 2020). It’s true that Jitsi is bigger. However, let’s be honest: it is unfair to compare an end customer client to a server. Getting back to the physical world, there is JanusCon. It is the customer conference and community get-together where people can actually see and touch each other. Before the 2020 lockdown, that is.
  3. Great real-time (and we mean it) support. Once, when the WebTrit CTO had an issue, Lorenzo replied to Yuri and corrected the code within minutes (!). You don’t get this level of attention and agility with most commercial-grade software.

What Did the WebTrit Team Do?

On a very basic level: they “taught” the PortaSwitch backend to communicate with Janus and developed a customizable (and white-label-able) frontend caller app. The app is currently available for browsers (Chrome, Firefox, and Safari) and mobile platforms (Android, iOS). Here’s a high-level diagram explaining how WebTrit works:

WebTrit Architecture

WebTrit architecture based on WebRTC
How does WebTrit work? Here’s an explanatory diagram that is less acronym-loaded than the “dev one” on our Git.

The WebRTC Gateway, together with the PortaSIP and PortaBilling core modules, forms the center of the WebTrit backend. The Gateway handles authentication and customer data (such as call history or contacts) from the various frontend clients via HTTPS. It also manages calls and call notifications via WebSocket with SSL and receives/delivers call media (audio and video) via WebRTC. For mobile apps, there is an additional communication channel between the Gateway and the mobile clients. It delivers push notifications via Firebase Cloud Messaging (FCM) for Android and Apple Push Notification service (APNs) for iOS. To dive deeper into the WebTrit architecture, you can send an inquiry to schedule a call with our Chief Architect for WebTrit.
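The division of labor between these channels can be summarized in a few lines of code. This is only an illustrative sketch of the mapping described above: the type names and the `channelFor` function are our own hypothetical shorthand, not part of the WebTrit SDK or API.

```typescript
// Hypothetical names for illustration; not WebTrit identifiers.
type Platform = "browser" | "android" | "ios";
type Traffic = "auth" | "customerData" | "callControl" | "media" | "push";
type Channel = "HTTPS" | "WebSocket (SSL)" | "WebRTC" | "FCM" | "APNs";

// Which channel carries which kind of traffic between a client and the Gateway.
function channelFor(traffic: Traffic, platform: Platform): Channel | null {
  switch (traffic) {
    case "auth":
    case "customerData": // call history, contacts
      return "HTTPS";
    case "callControl": // calls and call notifications
      return "WebSocket (SSL)";
    case "media": // audio and video
      return "WebRTC";
    case "push": // the extra channel exists for mobile apps only
      if (platform === "android") return "FCM";
      if (platform === "ios") return "APNs";
      return null;
  }
}
```

For example, `channelFor("media", "ios")` yields `"WebRTC"`, while `channelFor("push", "browser")` yields `null`, because the description above mentions the push channel for mobile clients only.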

Who Should Use WebTrit (and WebRTC)?

Our dear existing clients definitely should. With a subscription cost of a few hundred dollars per month via PortaOne iPaaS, this is definitely a deal. Can you find a developer, a tester, a designer, and a PM all together for such a salary in your market? If yes, please tell us and we will open a dev office where you are 😂.

Installation in a “private cloud” on a local physical server (extending your PortaSwitch) with a flat fee is also possible. It is relevant in regions where no sufficiently close cloud data centers are available. More good news: with iPaaS cloud architecture you don’t have to upgrade your PortaSwitch to the most recent version to use WebTrit for your business. Please inquire with our sales department for the details.

Some Use Cases

For businesses currently looking for a softswitch+WebRTC solution, WebTrit gives solid grounds to become a PortaSwitch customer. Here are just a few reasons to do this:

1. Eliminate the physical phone device or standalone softphone

Video calling is just the beginning of a long journey here. The technology for VR and panoramic or 360 calls is already available. Its mass-market adoption is a matter of a year or two. This opens a whole new set of opportunities for businesses. Take for example furniture design, sports and wellness, construction, or repairs. Imagine a carpenter. Today, she has to wait for a customer to set up an appointment. Then she arrives to do the measurements for a new kitchen (a half-hour drive plus all the scheduling). What if the carpenter could conduct a 10-minute VR call to explain “all that you want” to the customer? Wouldn’t it be cheaper and more time-efficient?

While click-to-call is already an industry standard after having been implemented by FB Messenger, WhatsApp, Viber, and others, there is still a lot to improve on the receiving side, for example, for your call center employees. WebTrit can help.

3. Create crowd collaboration and coordination apps

COVID is not eternal (hopefully). The golden era of conventions, “unconferences,” and other corporate beauty pageants will return. And it will bring a totally new level of augmented reality, particularly now that people have had a taste of what good organization can do to create truly hybrid physical+virtual events. However, even with COVID, WebTrit enables various interesting video-on-demand emergency scenarios.

4. Sell ad-supported long distance calls to landlines

Weirdly, there are still people who call internationally via landline in 2020. WebTrit enables various hybrid call monetization scenarios (something akin to what Globe7 did in the late ’00s). For example: watch a video ad to get X minutes of credit. There are plenty of businesses that are already doing this.

5. Sell to gamers! 👾

Google launched Stadia in November 2019. It is completely WebRTC-based. Microsoft countered with xCloud beta the same month and a commercial launch in September 2020. Lagging slightly behind, Facebook is due to launch its own soon. Meanwhile, Apple still does not know what to do with this avalanche. What is definite: Cloud gaming is the next frontier for VoIP and VR/360 telephony.

6. Apply interesting SIP-based call routing scenarios to your existing VoIP

Connect WebRTC-originated calls to the traditional cloud PBX functionality, such as IVR, call queues, and hunt groups. While they’re waiting on hold, would your customers prefer to listen to annoying muzak or watch a relevant educational video from your company or a partner?

Integrate WebRTC group calls with PSTN landlines in Zoom, Slack, Discord, Microsoft Teams, or any other OTT app of your preference.

7. Name your own

Seriously. We love WebTrit and WebRTC and we will be happy for it to grow together with our customers.
