By Frank Orozco, Chief CDN Technology & Products Officer
As recently as a decade ago, it was common for companies to deploy software in monthly or weekly cycles, changing thousands of lines of code at a time. Today that pace has quickened dramatically as teams strive to deliver smaller batches of changes daily or hourly. This approach, known as continuous delivery (CD), has radically transformed software production pipelines across the tech industry. For example, in 2008, Etsy deployed its code twice a week; today it deploys around 60 times per day.
Unfortunately, many content delivery networks (CDNs) have not kept pace with these changes. Sluggish deployment cycles and large batches of code changes remain the norm. That's an enormous missed opportunity. As stewards of the internet, CDNs have a responsibility to their customers to avoid service degradation, ensure maximum uptime and safeguard the quality and reliability of their products. Especially when paired with other practices, such as continuous integration (CI), CD is an ideal tool for accomplishing these goals and more.
We're fortunate at Verizon Digital Media Services (VDMS) to have built a robust CI/CD workflow that can provide a model for other CDNs. At Velocity in San Jose this June, we'll be laying out our CI/CD process as well as EdgeControl, a set of next-generation CDN management tools we are currently developing to allow our customers to benefit from our experience in safely deploying thousands of changes to a global network. Here's a preview of what's in store.
To any software engineering organization, CI/CD offers numerous advantages. It reduces time-to-market by getting measurable changes to customers quickly while also minimizing risk, since fewer changes are pushed out at once. This results in more agility and greater stability and, ultimately, happier users and customers. However, not all CI/CD workflows are created equal, and CDNs looking to implement these practices should build out their processes with care.
A CI/CD workflow increases efficiency and reliability in part by reducing the number of humans in the deployment pipeline. W. Edwards Deming, one of the forefathers of lean engineering, put it best in the third of his 14 points for management: "Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place." In other words, companies shouldn't rely on a team of 100 QA engineers to inspect their product (in this case, code) for quality. Instead, they should build testing into the production process itself by reducing human inspection and adding automation.
A CDN looking to build a CI/CD workflow should make a rigorous, automated testing procedure the backbone of that process. For example, at VDMS, every time a developer checks in a change to even one line of code, our dedicated CI servers run that change through more than 40,000 automated unit tests before integrating it into the main code repository. (That is where the "integration" in "continuous integration" comes in – such changes are being tested and added to the repository continuously as developers work.)
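The gating step described above can be sketched in a few lines. This is a minimal illustration, not our actual CI server: the names run_suite and gate_change, and the idea of passing the integration step in as a callable, are assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical test registry: test name -> zero-argument callable
# that raises an exception when the test fails.
TestSuite = Dict[str, Callable[[], None]]


def run_suite(suite: TestSuite) -> List[str]:
    """Run every test and return the names of the tests that failed."""
    failures = []
    for name, test in suite.items():
        try:
            test()
        except Exception:
            failures.append(name)
    return failures


def gate_change(suite: TestSuite, integrate: Callable[[], None]) -> bool:
    """Integrate a change only if the whole suite passes.

    `integrate` stands in for whatever merges the change into the
    main repository; it is never called when any test fails.
    """
    if run_suite(suite):
        return False
    integrate()
    return True
```

The point of the design is that a failing test blocks integration automatically, so no human has to inspect the change before it is rejected.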
Our CI/CD pipeline for the CDN also runs through percentage- and region-based canaries (tests on controlled subsets of the population) to catch bugs before code is deployed across our entire network. This robust, automated workflow is designed to ensure that code written by developers gets to production in a safe manner, facilitating smoother deployment.
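One way to picture percentage- and region-based canaries is as a filter over the server fleet. The sketch below is an assumption-laden illustration rather than a description of our pipeline: the Server type, the bucket hashing scheme and the canary_targets function are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Server:
    name: str
    region: str


def bucket(name: str) -> int:
    """Map a server name to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def canary_targets(
    servers: List[Server], percent: int, region: Optional[str] = None
) -> List[Server]:
    """Pick the servers that should receive the canary build.

    A server is selected when it is in the requested region (if one
    is given) and its bucket falls below `percent`.
    """
    return [
        s
        for s in servers
        if (region is None or s.region == region) and bucket(s.name) < percent
    ]
```

Because the bucket is derived from a hash of the server name, raising the percentage only ever adds servers to the canary set, so a rollout can widen step by step without reshuffling which machines run the new build.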
At VDMS, we take these processes even further with internal tools that give our developers increased visibility into code that is deployed at the edge of our network. These tools let developers see deltas in CPU usage, memory, errors and so on between code currently in production and the canary being deployed. These metrics help us understand if a given deployment is a good build, or if it will cause problems when rolled out more widely.
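A comparison of this kind can be sketched as a simple difference between two metric snapshots. The metric names, the limits and the functions metric_deltas and regressions below are illustrative assumptions, not the interface of our internal tools.

```python
from typing import Dict, List


def metric_deltas(
    production: Dict[str, float], canary: Dict[str, float]
) -> Dict[str, float]:
    """Return canary minus production for every metric both report."""
    return {
        name: canary[name] - production[name]
        for name in production
        if name in canary
    }


def regressions(deltas: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """List the metrics whose increase exceeds the allowed limit.

    A metric with no entry in `limits` is flagged on any increase.
    """
    return sorted(
        name for name, delta in deltas.items() if delta > limits.get(name, 0.0)
    )


# Hypothetical snapshots, for illustration only.
production = {"cpu_pct": 50.0, "memory_mb": 200.0, "errors": 3.0}
canary = {"cpu_pct": 52.0, "memory_mb": 200.0, "errors": 9.0}
limits = {"cpu_pct": 5.0, "memory_mb": 10.0, "errors": 1.0}
flagged = regressions(metric_deltas(production, canary), limits)
```

With these made-up numbers, only the error count rises by more than its limit, so the canary would be held back rather than rolled out more widely.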
VDMS's Edgecast CDN controls around 10 percent of the world’s internet traffic. Just as an outage at Amazon Web Services (AWS) would take hundreds of internet services like Slack and Twilio offline, any outage in one of VDMS's core systems would impact a large number of customers. We adopted a rigorous CI/CD practice in part because we take our responsibility for the internet very seriously.
We're also excited about how CI/CD will help us build a deeper relationship with our customers. We want them to be able to bring in our CDN as a part of their own CI/CD workflows so they can drive it through their own automation. We're also building our EdgeControl toolset so we can share our most successful techniques and processes with them, binding our separate workflows more closely together. For this CDN, adopting CI/CD has been a win-win.