Build vs. Buy – The Cost of Implementing Continuous Delivery
Continuous Delivery Pre-requisites
Like anything in business, not everything can be solved with technology alone. Going from quarterly/monthly deployments to daily/hourly deployments isn’t just about automation. Your application architecture(s) and team(s) must be able to support the fundamentals of Continuous Delivery.
If you can’t break up your apps into independent services or microservices, then you’re never going to get enough parallel tracks of development and deployment to do hourly release cycles.
People are by nature the biggest resisters of change, so if things are challenging today without CD, what’s life going to be like when you crank the change dial up to 11? Before considering build vs. buy for a CD platform, you really need to think through your target architecture(s) and culture.
Don’t Bundle CI and CD Together
Chances are you’ve already got a mature Continuous Integration (CI) platform. Let’s face it–you’ve been spoiled with solutions like Jenkins, Atlassian Bamboo and CircleCI for taking code to artifact. It’s entirely possible you might be trying to over-extend these solutions to cover CD as well. Bottom line here is to not boil the ocean.
Try to focus your Continuous Delivery (CD) initiative on the process of taking artifact to production. Yes, you can’t have CD without CI, but bundling these two processes together will create massive dependencies, complexity, and scope for your project. Instead, focus on integration and the APIs between CI and CD; after all, your business/teams are likely to have multiple CI processes, tools, and repositories.
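To make the "integrate, don’t bundle" point concrete, here is a minimal sketch of a CI-to-CD handoff in Python. Everything in it is hypothetical: the field names, the `ArtifactEvent` type, and the example URL are assumptions for illustration, not the API of any particular CI or CD product. The idea is simply that each CI tool posts the same small payload describing an artifact, and the CD side validates it without caring which tool built it.

```python
import json
from dataclasses import dataclass

# Hypothetical contract: the only fields a CI tool must send to the CD side.
REQUIRED_FIELDS = ("service", "version", "artifact_url", "source_ci")


@dataclass(frozen=True)
class ArtifactEvent:
    """A reference to a built artifact, handed from CI to CD."""
    service: str
    version: str
    artifact_url: str
    source_ci: str  # which CI system produced the artifact


def parse_artifact_event(body: str) -> ArtifactEvent:
    """Validate a JSON payload posted by any CI tool and return an event."""
    data = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return ArtifactEvent(**{name: data[name] for name in REQUIRED_FIELDS})


if __name__ == "__main__":
    payload = json.dumps({
        "service": "checkout",
        "version": "1.4.2",
        "artifact_url": "https://artifacts.example.com/checkout-1.4.2.tar.gz",
        "source_ci": "jenkins",
    })
    print(parse_artifact_event(payload))
```

Because the contract is just "here is an artifact," teams can keep multiple CI tools and repositories while the CD side stays the same.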
Time is the killer, not cost
The stereotypical view of build vs. buy software always comes down to the raw dollars/pounds/euros/bitcoins it costs. Cost is important; however, with CD, it’s the “opportunity cost” that matters the most. How quickly can you acquire CD capabilities that will allow you to accelerate the deployment of innovation in your business? What would daily/hourly deployments be worth to your business, and what does every month without them cost you?
For example, if it takes a year to build or implement a CD platform, then that’s 12 months of opportunity cost gone. More to the point, if your competitors are all doing CD today, what will 12 months of innovation idling mean to your business?
Today, beating your competition to the punch is all that really matters.
Scoping Continuous Delivery Requirements & MVP
From the 350 or so customer meetings we’ve had since May, it’s become clear that the 1-2% of customers who nailed CD had spent 3-5 years mastering it. CD isn’t just about provisioning infrastructure and managing configuration; if it were, CD would have been solved by now. Below is a breakdown of what we’ve learned from the various types of customers who shared features of their CD initiative. You might want to leverage these findings as an idea of scope for your own CD project.
As you might expect, CD requirements and complexity increase as the size of your business increases. It’s why most modern enterprises have hundreds of DevOps engineers working on initiatives like AWS/Azure migration, Docker, Kubernetes, Continuous Delivery, Application Performance Monitoring and so on.
Key Continuous Delivery Capabilities to Consider:
- Triggers – When and how do you execute your CD pipelines?
- Pipeline management – How do you structure and manage your pipelines across different apps, services, and environments?
- Infrastructure management – How do you provision infrastructure or compute for each of your apps, services, and environments?
- Configuration management – How do you version control and deploy your application run-time configuration?
- Deployment workflows – How do you deploy your apps and services to different environments? (e.g. Canary, Blue/Green, Rolling)
- Automated Testing – How do you ensure artifacts are ready to be deployed to production?
- Verification & health checks – How do you verify the impact and health of artifacts in production? Specifically performance, throughput, errors, and quality of service.
- Rollback – How do you manage failure and anomalies in production?
- Security, auditing & secrets management – How do you manage API keys and credentials across your apps, environments, and pipelines?
- Analytics & reporting – How do you baseline, debug and report on deployments?
- Notifications & workflow – How are teams informed of deployments, events, and status?
- Scalability – How many apps, services, environments, and users need to be supported?
- UX – Will the solution be self-service? If so, will users actually adopt and use it?
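Several of the capabilities above (deployment workflows, verification and rollback) tend to interact, so here is a rough Python sketch of how they might fit together. The `Stage` and `Pipeline` classes and the stage names are hypothetical, and the stage functions are stand-ins; a real platform would call your infrastructure and monitoring tools at each step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Stage:
    """One step in a pipeline, e.g. a canary deploy or a health check."""
    name: str
    run: Callable[[str], bool]  # receives the artifact version, returns True if healthy


@dataclass
class Pipeline:
    service: str
    stages: List[Stage]
    log: List[str] = field(default_factory=list)

    def deploy(self, new_version: str, current_version: str) -> str:
        """Run stages in order; stop and roll back at the first failure.

        Returns the version that is live when the pipeline finishes.
        """
        for stage in self.stages:
            healthy = stage.run(new_version)
            self.log.append(f"{stage.name}: {'ok' if healthy else 'failed'}")
            if not healthy:
                self.log.append(f"rollback to {current_version}")
                return current_version
        return new_version


if __name__ == "__main__":
    pipeline = Pipeline(
        service="checkout",
        stages=[
            Stage("canary deploy", lambda version: True),
            Stage("verify error rate", lambda version: False),  # simulated failed check
            Stage("rolling deploy", lambda version: True),
        ],
    )
    live = pipeline.deploy(new_version="1.4.2", current_version="1.4.1")
    print(live)          # version left in production
    print(pipeline.log)  # audit trail of what happened
```

Even this toy version hints at the scope: the log is the start of auditing and reporting, and every stand-in function is an integration somebody has to build and maintain.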
You’re probably considering CD for some business outcome; if so, make sure this is measured so you can manage and highlight the progress you are making. Having simple dashboard and reporting capabilities on your CD platform is key to measuring things like deployment velocity, success/failure rate, and mean-time-to-production for your code. You’ll need to over-communicate your success to keep the momentum up so it’s important to have information to share.
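As a sketch of what that measurement could look like, the Python below computes the three numbers mentioned above from a list of deployment records. The `Deployment` record and its fields are hypothetical, and the definitions are assumptions: velocity is deployments per day over a reporting window, and mean-time-to-production is the average commit-to-deploy time for successful deployments. Your organization may define these differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it reached production (or failed trying)
    succeeded: bool


def deployment_metrics(deployments: List[Deployment], window_days: int) -> dict:
    """Summarize deployments that happened within a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    successes = [d for d in deployments if d.succeeded]
    lead_times = [d.deployed_at - d.committed_at for d in successes]
    mean_lead = sum(lead_times, timedelta()) / len(lead_times) if lead_times else None
    return {
        "deployments_per_day": len(deployments) / window_days,
        "success_rate": len(successes) / len(deployments),
        "mean_time_to_production": mean_lead,
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    records = [
        Deployment(start, start + timedelta(hours=2), True),
        Deployment(start, start + timedelta(hours=4), True),
        Deployment(start, start + timedelta(hours=6), True),
        Deployment(start, start + timedelta(hours=1), False),
    ]
    print(deployment_metrics(records, window_days=2))
```

Numbers like these are the raw material for the dashboard: a handful of trends you can share every week to show the initiative is paying off.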
The Build Option
Let’s assume you go down the build option. Below are some conservative “build and QA costs” that customers have shared with us over the past six months. The majority of SMB and Mid-market customers told us that it would take 2-5 FTEs one year to build out basic CD capabilities for their business. Enterprises were more like 10-15 FTEs per line of business. This effort includes the integration work required for existing CI platforms as well as the CD capabilities. For ballpark costs, let’s assume $200k/year for a fully loaded developer and $160k/year for a fully loaded QA resource.
Obviously, these costs are ballpark and will vary across organizations. Also, don’t forget the 12 months of “opportunity cost” that building CD incurs.
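To put rough numbers on those team sizes, here is a small Python sketch using the ballpark salaries above. The split between developers and QA is not something customers gave us in a single figure, so the sketch makes one explicit assumption: it brackets the cost between "everyone is QA" (cheapest) and "everyone is a developer" (most expensive), which holds for any mix in between.

```python
DEV_COST_PER_YEAR = 200_000  # fully loaded developer (ballpark from the text)
QA_COST_PER_YEAR = 160_000   # fully loaded QA resource (ballpark from the text)


def build_cost_range(min_ftes: int, max_ftes: int, years: int = 1) -> tuple:
    """Return (low, high) build cost in dollars for a team-size range.

    Assumption: low = smallest team, all QA; high = largest team, all developers.
    Any real dev/QA mix lands somewhere between the two.
    """
    if not 0 < min_ftes <= max_ftes or years <= 0:
        raise ValueError("expected 0 < min_ftes <= max_ftes and years > 0")
    low = min_ftes * QA_COST_PER_YEAR * years
    high = max_ftes * DEV_COST_PER_YEAR * years
    return low, high


if __name__ == "__main__":
    for label, (lo_ftes, hi_ftes) in {
        "SMB / Mid-market": (2, 5),
        "Enterprise (per line of business)": (10, 15),
    }.items():
        low, high = build_cost_range(lo_ftes, hi_ftes)
        print(f"{label}: ${low:,} - ${high:,} for year one")
```

Under these assumptions the first year lands at roughly $320k to $1M for SMB/Mid-market and $1.6M to $3M per line of business for enterprises, before any maintenance.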
Don’t Forget About Maintenance
Over time, how does your CD platform support multiple:
- Cloud platforms (AWS, Google, Azure, CloudFoundry, On-prem)
- Application Artifacts (Docker images, zip, rpm, war, jar, lambda functions, …)
- Application Run-times (Java, .NET, Python, Ruby, PHP, Node, …)
- Load Balancers (F5, AWS, Nginx, Consul, Akamai, …)
- Monitoring Tools (AppDynamics, New Relic, Dynatrace, Splunk, Sumo Logic, Elastic, …)
- Authentication & Key Mgmt Providers (LDAP, SAML, AWS, Hashicorp)
Once you’ve figured that out, you can then multiply that effort/cost by two for maintaining new versions and upgrades of the above. Bottom line is that maintenance for your CD platform can become a never-ending journey depending on the size of your business and applications. You can easily have 1-3 FTEs/year ($200k-$600k) dedicated to CD maintenance for SMB/Mid organizations and another 4-5 FTEs/year ($800k-$1M) for enterprises.
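One way to get a feel for the maintenance surface is to simply count it. The Python sketch below takes only the names listed above (ignoring the "…" entries, so these are lower bounds) and counts the individual integrations plus the theoretical number of stack combinations. Treat the combination count as an upper-bound illustration: it assumes every option in one category can pair with every option in the others, which no real organization does.

```python
from math import prod

# Only the names listed in the text; the "…" entries are deliberately left out.
SUPPORT_MATRIX = {
    "cloud platforms": ["AWS", "Google", "Azure", "CloudFoundry", "On-prem"],
    "artifacts": ["Docker images", "zip", "rpm", "war", "jar", "lambda functions"],
    "run-times": ["Java", ".NET", "Python", "Ruby", "PHP", "Node"],
    "load balancers": ["F5", "AWS", "Nginx", "Consul", "Akamai"],
    "monitoring tools": ["AppDynamics", "New Relic", "Dynatrace", "Splunk",
                         "Sumo Logic", "Elastic"],
    "auth & key mgmt": ["LDAP", "SAML", "AWS", "Hashicorp"],
}


def integration_count(matrix: dict) -> int:
    """Number of individual integrations to build and keep current."""
    return sum(len(options) for options in matrix.values())


def combination_count(matrix: dict) -> int:
    """Theoretical number of distinct stacks if every option could pair freely."""
    return prod(len(options) for options in matrix.values())


if __name__ == "__main__":
    print(integration_count(SUPPORT_MATRIX))  # 32
    print(combination_count(SUPPORT_MATRIX))  # 21600
```

Even if you only ever run a small fraction of those combinations, each integration has its own release cadence, and each new version is another thing to test.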
Remember, technologies like Docker, Kubernetes, Mesosphere, Slack, Lambda and self-driving cars didn’t exist a few years ago. Just like CD, change in technology is always constant. Today, Kubernetes and Lambda are the best things since sliced bread; 18 months later they’ll be superseded by other OSS projects and technologies that will blow everyone’s mind.
No matter how well we plan for life, we’ll always encounter things that slow us down. Here are a few gotchas to look out for when building your CD platform:
Failure & Downtime – building deployment capabilities that automate and touch production environments isn’t a trivial thing. With great power comes great responsibility. Like Spider-Man, you don’t want to be in too many sticky situations. Don’t underestimate the cost of failure in production as you test your CD platform.
Recruiting – the average lead time to hire an engineer these days is between 2 and 4 months, assuming you can find one. According to Glassdoor, there are 31,401 DevOps jobs right now. If you have the resources, awesome–but don’t underestimate how difficult it can be to hire engineers.
Attrition & competing projects – losing engineering resources and context-switching them between projects can also be a challenge. Is building your CD platform a short-term or long-term strategy for the business? What are the odds of your engineers being pulled off this initiative?
Open-Source Isn’t Free
I know what you’re thinking: can’t we just download a solution like Spinnaker and be off to the races? Not exactly. Like many OSS projects, Spinnaker is a framework that needs to be personalized for your apps and organization.
From the customers we’ve spoken with, Spinnaker still requires a dedicated team to set up, personalize, manage and maintain. It’s certainly better than starting from scratch, but OSS doesn’t imply “zero costs.” I spoke with an enterprise yesterday afternoon that had six people dedicated to personalizing Spinnaker for their organization. That’s coming up on $1M cost per year.
In addition, you need things like dedicated documentation, security, support, and training.
The Buy Option
The core benefit of buying any CD solution is time-to-market. Instead of waiting 12 months for a v1 of your custom build, you can take a commercial solution and implement it in weeks/months (or a few hours with Harness). A potential drawback here is the evaluation period, as you might be interested in evaluating multiple vendors and performing POCs before you eventually buy and deploy.
Not many modern CD vendors exist, so you might find that many of the solutions on the market are traditional on-premises software that was designed for the last generation of application architectures and stacks (monolithic & SOA). Harness recently released CD support for AWS Lambda environments and I think we were one of the first to support serverless architectures. Another potential drawback is that some CD solutions might only support a certain percentage of your tools, technologies, and architectures. In contrast, with a custom CD build, you’re building things to the spec of your environments.
Another pro for buying a CD solution is that total cost of ownership is significantly lower, both short-term and long-term. I would strongly argue it’s anywhere from 3-10X cheaper. Someone else looks after the maintenance, which, for an enterprise, is where most of the expense goes. If you don’t believe me, then check the version history for your current deployment scripts. Let someone else worry about supporting the different clouds, run-times, tools, and APIs.
Finally, many CD solutions require you to change your existing CI/CD process or tooling. “Soup to nuts,” I believe, is the phrase they use: they provide end-to-end automation and coverage, but at a big cost. Think of these as transformational projects that can take consultants months to scope, plan, deploy and configure. At Harness, we believe integration and APIs rule in the world of CI/CD. We’ve yet to meet a single customer who wants to throw away their current investments in CI/CD. Reuse is fine, but replacement is often a non-starter.
Building a CD platform often makes sense when the number of apps, compute nodes, tools and technology stacks in your organization is relatively small, predictable, and manageable.
Buying a CD platform often makes sense when your environments are rapidly changing with new services, technology stacks, and tools.
Another way to look at things is – do you want to innovate in building a Continuous Delivery platform, or do you want to focus and innovate in your core business?
What path did you take with CD in your organization? What were the main drivers and considerations for your decision?
We would love to hear about your experiences.