Telecom configuration management is similar to any other business in terms of product differentiators. “It’s very easy to be different but very difficult to be better,” Jonathan Ive once said before leaving Apple. By the end of the 2000s, PortaOne was definitely different from most of the softswitch and billing solutions on the market. However, we had to constantly make it better to stay competitive.
This was particularly true because our major competitors, Broadsoft (acquired by Cisco in 2018) and Metaswitch (part of Microsoft since 2020), were also building outstanding products. That’s why various new essential features of PortaSwitch started appearing. Back in the late 2000s, we saw building our telco configurator as one of those essential features.
Automate or Die! And Inspiration from Seattle and Portland
The main issue was that manual installation required a human being to manage it. And even superhumans (but still humans) sometimes make mistakes. This meant we had to accept huge responsibility and bear all the costs related to the manual configuration and numerous double checks. Obviously, we started looking for ways to automate the server configuration process.
Luckily, some of our founders lived in Seattle at that time. This let them visit various technology meetups, frequented by people from the Microsoft mafia, the Seattle technology cluster and nearby Portland. The idea of devops and various forms of automation for server configurations was in the air.
Right around that time, in 2009, two companies that would grow to dominate the devops industry throughout the 2010s appeared. They were Chef from Seattle and Puppet from Portland. Both companies use specialized computer languages to enable system configuration “recipes”. Puppet does that with its own declarative language. Chef uses a pure-Ruby DSL. Currently both are gravitating towards various cloud automation technologies and containerization. In 2020 Progress, a bigger tech company specializing in devops and UI, acquired Chef.
The Nuances of the Telecom Configuration Management
Telecom infrastructure of that era differed substantially from the rest of IT. Even in VoIP the load intensity was higher, and fault tolerance directly affected client satisfaction (a failure meant the calls did not go through). After all, getting a webpage or some information via an API after having to wait for 2-5 seconds is tolerable, while having to wait for the same amount of time while on a VoIP call usually kills the conversation.
Here another piece of inspiration came from Telenor’s K2 ISP billing system from the 1990s. Driven by the Scandinavian approach to “automate everything”, it was one of the earliest attempts at automating telecom server configuration.
Two Fundamental Telecom Configuration Management Issues We Were Addressing with Configurator
The first issue is a typical situation in which an admin needs to make a change on a system that contains many servers. Any change beyond the truly trivial on a single server (e.g. “let’s add another IP address for our VoIP services”) usually requires some changes on all the others (in our case, this IP address should be added to the “white list” so other servers treat it as one of our own). When done manually (i.e. by editing text configuration files), this took a lot of time, killed productivity and triggered mistakes. And since a single mistake on one server meant that the whole task failed, probability theory tells us that as the number of servers increased, bad things were almost guaranteed to happen.
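The probability argument can be made concrete with a quick calculation. This is a minimal sketch, assuming each manual edit goes wrong independently and with the same probability; the 2% error rate is an illustrative assumption, not a measured figure.

```python
def all_servers_ok(per_server_error: float, servers: int) -> float:
    """Probability that every one of `servers` manual edits is correct,
    assuming independent errors with the same per-server error rate."""
    return (1.0 - per_server_error) ** servers


# Hypothetical 2% chance of a mistake per manually edited server.
for n in (1, 10, 50, 100):
    chance = all_servers_ok(0.02, n)
    print(f"{n:>3} servers: {chance:.0%} chance the change succeeds everywhere")
```

Under these assumptions the chance of a clean change across 50 servers is already down to roughly one in three, which is why a single propagated change beats fifty hand edits.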
Second, sometimes you need to roll back to the last consistent configuration. However, when there is no “history of changes”, you need either to run a chain of experiments to restore the status quo or to remember to create manual backups each time you change the configuration.
We created (1) a configuration recording module and database, (2) an automation framework for propagating changes and configuration scripts to virtually any number of servers, and (3) a GUI to manage the whole process visually, whether it is a change of minor options or an entire system update. Clients started telling us that it was heaven.
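The first two pieces, a recorded history of changes and a single propagation step, can be sketched in a few lines. This is only an illustration: `ConfigStore`, `propagate`, the server names and the `trusted_ips` option are all hypothetical, and plain dictionaries stand in for real servers.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ConfigStore:
    """Hypothetical stand-in for a configuration recording module:
    every committed change is kept as a full snapshot."""
    history: list = field(default_factory=lambda: [{}])

    @property
    def current(self) -> dict:
        return self.history[-1]

    def commit(self, **changes) -> dict:
        """Record a new snapshot: the previous one plus the given changes."""
        snapshot = copy.deepcopy(self.current)
        snapshot.update(changes)
        self.history.append(snapshot)
        return snapshot

    def rollback(self) -> dict:
        """Drop the latest snapshot and return the previous one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current


def propagate(config: dict, servers: dict) -> None:
    """Push the same snapshot to every server (here: plain dicts)."""
    for name in servers:
        servers[name] = copy.deepcopy(config)


servers = {"sip-1": {}, "sip-2": {}, "web-1": {}}
store = ConfigStore()

propagate(store.commit(trusted_ips=["10.0.0.1"]), servers)
propagate(store.commit(trusted_ips=["10.0.0.1", "10.0.0.2"]), servers)
propagate(store.rollback(), servers)  # back to the last consistent state
print(servers["sip-2"])
```

Because every server receives the same recorded snapshot, a rollback is just another propagation of an earlier entry in the history, with no experiments and no manual backups.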
Automated Telecom Configuration Management: Its Friends and Foes
One of the major reasons behind the success of the Big Reinstall in 2010 was the PortaOne Configurator, a centralized telecom configuration management server. System configuration is a tricky business. Usually the people who perform it (a.k.a. the “system administrators caste”) like the idea of simplification in general. However, sometimes there is a subconscious pushback because of “job security” (indeed, who wants their trade to be commoditized?) or (frequently) there are not enough resources and willpower to change things fundamentally. This is why manual system administration is still widespread in the industry in 2021, and has even successfully jumped on the bandwagon of cloud DevOps.
On the other hand, it’s impossible to build a sustainable business model based on manual labor. Any project manager or person responsible for business outcomes would tell you that automated configuration management is good, be it cloud configuration in the 2020s (hello, iPaaS) or configuring the various bare metal servers and virtual machines of the 2000s. So the path to building the Configurator went through various implementations of the natural human desire to automate the work which can be automated.
The Three Great Virtues
Larry Wall, the American programmer and author of the Perl programming language, once formulated the three great virtues of a programmer. According to Wall, these are laziness (to reduce overall energy expenditure), impatience (you are not willing to wait for someone else to make your life better) and hubris (it makes you write and maintain programs that other people will admire). We followed all three when creating the PortaOne Configurator. But first we needed a great team to build this feature, which is still in use more than a decade after its initial release.
The Dream Team
The Strugatsky Brothers (think the J. K. Rowling of the former Soviet Union) had their National Institute for the Technology of Witchcraft and Thaumaturgy (NITWITT or NII ChaVo in the original). We had our smart, young and hungry dev team, its nucleus germinating from the best students of Chernihiv Poly and Chernihiv Ped (the latter still does not have an English website as of 2021, which is a shame, given the number of its alumni working in the Valley).
The Laziness
Among the Dream Team was Roman Yepyshev, who came up with the idea of the “porta-updater” utility. The “lazy” idea was to automate the updating process for various instances of our software. Roman demoed it to his superiors Oleksandr Kosenko and Mike Kidik (by the way, both are still with PortaOne in 2021!). Another tool developed around that time was DBup, a database updater, as you could have guessed. It allowed the PortaOne team to change the database schema (structure of tables, columns, etc.) easily and securely to match the new versions of the program code. DBup even allowed most of the changes to be applied to a “live” database, which greatly reduced the downtime during updates and allowed a longer preparation cycle. It also performed “the lazy” task of preserving all links to MySQL (and later Oracle SQL, hence DBup2) during system updates.
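The general technique behind a database updater of this kind is versioned schema migration: the database remembers which schema version it is on, and the updater applies only the steps it is missing. The sketch below illustrates that idea and is not DBup’s actual implementation. It uses SQLite from the Python standard library in place of MySQL or Oracle, and the `accounts` table and the migration list are hypothetical.

```python
import sqlite3

# Hypothetical migrations as (version, SQL statement) pairs.
MIGRATIONS = [
    (1, "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE accounts ADD COLUMN balance REAL NOT NULL DEFAULT 0"),
    (3, "CREATE INDEX idx_accounts_name ON accounts (name)"),
]


def current_version(db: sqlite3.Connection) -> int:
    # PRAGMA user_version is SQLite's built-in integer slot for a schema version.
    return db.execute("PRAGMA user_version").fetchone()[0]


def upgrade(db: sqlite3.Connection) -> int:
    """Apply every migration newer than the recorded schema version."""
    for version, statement in MIGRATIONS:
        if version > current_version(db):
            db.execute(statement)
            db.execute(f"PRAGMA user_version = {version}")
            db.commit()
    return current_version(db)


db = sqlite3.connect(":memory:")
upgrade(db)  # fresh install: applies all three steps
db.execute("INSERT INTO accounts (name) VALUES ('alice')")
db.commit()
upgrade(db)  # running it again changes nothing
print(current_version(db), db.execute("SELECT name, balance FROM accounts").fetchall())
```

Because each step is recorded as soon as it is applied, the updater can be run repeatedly, and existing rows survive the schema changes.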
The Impatience
Originally porta-updater was a CLI (command-line interface) tool. The team thought that was lame and started building a web GUI. Alex Strelets did that, while Roman created the roles-instances model and the agent-core architecture. Ultimately, we combined the two tools (porta-updater and DBup) into a single feature, and thus the Configurator was born.
The agent-core architecture required a lot of coding for the actual agents (for various hardware and software variations). Andriy Kosachenko, a young graduate of Chernihiv Poly, got this task as a “test assignment” for a junior coder position. It looks like he coped well. Fast forward a decade: in 2019 Andriy Kosachenko became the CTO of PortaOne.
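To give a feel for why an agent-core split needs one agent per platform variation, here is a rough sketch. The article does not describe the Configurator’s internals, so every class, role and option name below is hypothetical: a core keeps options per role, and a platform-specific agent renders them for each instance of that role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One service of a given role running on a given host (names are hypothetical)."""
    role: str      # e.g. "sip" or "web"
    host: str
    platform: str  # selects which agent manages this host


class Agent:
    """Base agent: turns options from the core into a platform-specific config file."""
    def render(self, options: dict) -> str:
        raise NotImplementedError


class LinuxAgent(Agent):
    def render(self, options: dict) -> str:
        return "\n".join(f"{key}={value}" for key, value in sorted(options.items()))


class FreeBSDAgent(Agent):
    def render(self, options: dict) -> str:
        return "\n".join(f'{key}="{value}"' for key, value in sorted(options.items()))


class Core:
    """Keeps options per role and asks the matching agent to render them per instance."""
    def __init__(self, agents: dict):
        self.agents = agents
        self.role_options = {}

    def set_option(self, role: str, key: str, value) -> None:
        self.role_options.setdefault(role, {})[key] = value

    def apply(self, instances: list) -> dict:
        return {
            inst.host: self.agents[inst.platform].render(self.role_options.get(inst.role, {}))
            for inst in instances
        }


core = Core({"linux": LinuxAgent(), "freebsd": FreeBSDAgent()})
core.set_option("sip", "max_calls", 500)
rendered = core.apply([
    Instance("sip", "sip-1", "linux"),
    Instance("sip", "sip-2", "freebsd"),
])
print(rendered["sip-2"])
```

The option is set once per role, while each new platform only needs one more `Agent` subclass, which is where most of the coding effort in such a design tends to go.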
The Hubris
The BSD-to-Oracle migration during MR19 created a porting problem. Porta-updater and DBup1 had been written for FreeBSD, and now the team had to port them to Oracle Linux. After the rewrite came the next issue: the sysadmin caste, with its brightest members slowly realising it was time to morph into the new and cool devops gang. Obviously, no one wanted “bad things being said” regarding the Configurator. That’s how young Mykola Marzhan joined the Configurator team as a release engineer, responsible for liaising with the sysadmins and devops on the clients’ side.
MR21. The Softswitch Configurator (2008… 2021 and on)
So, to sum up the early “release history” of the Configurator: it all started during MR18 with FreeBSD. Then came the MR19 switch to Oracle Linux, and the Configurator was put on pause for porting to the new platform. MR20 saw the first practical implementation of the core. From MR21 to MR23 the team implemented interfaces and agents and trained our engineers to work with it. After the original release during MR21 our clients saw the benefit.
During MR22 the team implemented update scenarios, which allow updating platforms between maintenance releases visually via the web interface rather than from the CLI. Later, other essential features appeared at clients’ request, such as “deposit files” and custom modifications. The idea behind “deposit files” is depositing files into source RPMs (SRPMs), from which we then compile the binary RPMs. The aim of this feature is (again) to minimize human mistakes and allow custom modifications by the client’s dev team. This way customers who take advantage of PortaOne’s “we have nothing to hide!” policy, and so have access to PortaSwitch’s source code and make changes to it, can have their modifications automatically transferred to the next software version during the update.
MR30 marked a transition to Ext JS, a great JavaScript application framework by Sencha for building interactive cross-platform web applications. Before Ext JS we used a home-grown solution. During the Ext JS migration support engineers hated the dev team again. The old UI just “worked fine”. So why change it and build new neural links? The dust settled somewhere around MR32. Now any support engineer would defend the Ext JS interface until the last breath. The Configurator continues serving our clients as of 2021. We scheduled the most recent updates to it in MR95.
Agile Development (for Telcos) Requires Agile Telecom Configuration Management and DevOps
Therefore, to sum up: manual system configuration beyond a single server is like tilling barren earth. Winston Churchill (or Steve Blank) once said that “intellect without will is worthless, will without intellect is dangerous”. In telecom, agile code is indeed useless without agile configuration management.
It’s like having a supercar engine on a scooter frame: impressive but useless. The whole concept of modern telecom devops is built around this idea of syncing the dev cycle to the configuration management and ability to implement the changes to the code fast on multiple installations in a reliable manner. With Configurator we’ve laid down the foundation for feature-driven product development based on the analysis of our experimental data. This brings us to the next chapter in our corporate retrospective.
P.S. 2020: Here Comes Cisco and BroadWorks 😂
In 2020 BroadWorks Release 24 came from our honorable business rivals at Cisco. Among other things they promise “release-independent software delivery for the cloud era” by “introducing support for RedHat Ansible playbooks – an industry-leading software deployment toolset”. What do Ansible playbooks do essentially? Right, they “orchestrate and automate software version management for the BroadWorks servers”. Plus, it is still only half of the solution: while Ansible allows some changes to be applied automatically to multiple servers at once, it is the sysadmin who is still responsible for preparing the “playbooks” and ensuring they are run in the correct sequence. Huh, that was lightning fast, dear mesdames and sirs! And that’s kinda what the Chernihiv dudes from PortaOne’s Institute for the Technology of Witchcraft and Thaumaturgy did back in 2009.