May 25, 2021

Kubernetes CI/CD Best Practices

Containerization and Kubernetes have ushered in a new paradigm of consistency in the computing world, giving engineering teams more velocity and agility. The convergence provided by a common declarative language for describing application and operational tasks makes Kubernetes a popular platform for running distributed workloads.

You author the desired state in declarative YAML and, once it is applied, Kubernetes is off to the races resolving and fulfilling the state that has been declared, e.g. the number of replicas of an application. If there is any deviation, Kubernetes works to resolve the difference between the actual and declared state, e.g. when a pod/container dies, a replacement is spun up.
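To make the declared-versus-actual idea concrete, here is a minimal sketch of a reconciliation pass in plain Python. It is a toy model, not how Kubernetes controllers are actually implemented; `ClusterState`, `reconcile`, and the `app-N` pod names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Toy stand-in for a cluster: just a list of running pod names."""
    pods: list = field(default_factory=list)
    next_id: int = 0  # counter used to name newly started pods


def reconcile(state: ClusterState, desired_replicas: int) -> list:
    """Move the toy cluster to the declared replica count; return the actions taken."""
    actions = []
    # Too few pods: start replacements until the declared count is met.
    while len(state.pods) < desired_replicas:
        name = f"app-{state.next_id}"
        state.next_id += 1
        state.pods.append(name)
        actions.append(f"start {name}")
    # Too many pods: stop the most recently started ones.
    while len(state.pods) > desired_replicas:
        actions.append(f"stop {state.pods.pop()}")
    return actions


if __name__ == "__main__":
    cluster = ClusterState()
    print(reconcile(cluster, 3))   # initial rollout of three pods
    cluster.pods.remove("app-1")   # simulate a pod dying
    print(reconcile(cluster, 3))   # one replacement is started
```

Calling `reconcile` again when nothing has changed returns an empty list, which is the point of declaring state rather than issuing commands.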

For those deploying to Kubernetes for the first time, the experience can be pretty rapid and anticlimactic. After authoring a minimal deployment.yaml and giving kubectl the command to apply it, you’re off to the races. When the time comes to make a change, Kubernetes takes advantage of one of its strengths, the rolling update, to make changes incrementally. If you are used to platforms where you had to hand-write the rolling update rules, watching a rolling update occur makes Kubernetes seem like a breeze.
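As an illustration of how small a minimal Deployment can be, the sketch below builds one as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The application name `hello-app` and the image `registry.example.com/hello-app:1.0.0` are placeholders, not anything from a real registry.

```python
import json


def minimal_deployment(name: str, image: str, replicas: int = 3) -> dict:
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Placeholder name and image, used for illustration only.
    manifest = minimal_deployment("hello-app", "registry.example.com/hello-app:1.0.0")
    print(json.dumps(manifest, indent=2))
```

Saving the output to a file such as deployment.json and running `kubectl apply -f deployment.json` would create the Deployment. No update strategy is declared here because a Deployment uses a rolling update by default.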

Even with all of the benefits that Kubernetes has, good CI/CD practices are still key. Kubernetes did not magically erase the discipline that your CI/CD journey required before it came into the picture. Leverage Kubernetes’ strengths to further your CI/CD journey.

Best Practices for CI and Kubernetes

Continuous Integration (CI) is the process of build automation. For example, a Java application needs to be built into a JAR and then, if headed to Kubernetes, needs to be Dockerized and potentially packaged/described in a format such as a Helm Chart. In the containerized world, since containers are immutable, any change results in a new image, so your CI process will be called often to build and package new images.

It can sound like a chicken-or-the-egg argument, but running your Continuous Integration process on Kubernetes is a prudent move. Building and packaging software can take a lot of compute resources. With modern approaches where every commit kicks off a build, it can be really taxing on infrastructure, especially with containerized builds. Building and packaging software is a great use case for Kubernetes because modern CI tools focus on creating ephemeral build runners/nodes in Kubernetes. As build requests come in, a new instance is spun up to create the build artifacts and then spun down when the job is complete.
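One way to picture an ephemeral build runner is as a Kubernetes Job that is created per commit and cleaned up after it finishes. The sketch below is an assumption about how such a Job could look, not how any particular CI tool does it. The builder image, the `COMMIT_SHA` variable, the label, and the resource figures are hypothetical.

```python
def build_job(commit_sha: str, builder_image: str) -> dict:
    """Build a batch/v1 Job manifest for a one-off, per-commit build."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"build-{commit_sha[:12]}",
            "labels": {"purpose": "ci-build"},  # hypothetical label
        },
        "spec": {
            "backoffLimit": 0,               # do not retry a failed build
            "ttlSecondsAfterFinished": 300,  # clean up the finished Job after 5 minutes
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "build",
                        "image": builder_image,
                        # Hypothetical variable the build script would read.
                        "env": [{"name": "COMMIT_SHA", "value": commit_sha}],
                        # Illustrative sizing for a compute-heavy build.
                        "resources": {"requests": {"cpu": "2", "memory": "4Gi"}},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    import json
    job = build_job("3f2a9c1d7e4b8a6f0c5d", "registry.example.com/ci/builder:1.0")
    print(json.dumps(job, indent=2))
```

Because the runner exists only for the life of the Job, the cluster pays for build compute only while a build is in flight.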

The Continuous Integration confidence-building steps that can easily run in an ephemeral container are unit tests, integration tests, and security scans. Image/container scanning steps in particular can be pretty compute-intensive, since they decompose and validate the Docker layers, similar to running compute-heavy build tasks. Because each build could introduce new dependencies or new versions of dependencies, running a container scan every time you build a new image is important.

However, some items need to outlive an ephemeral container and require more durable storage. The exit step of Continuous Integration is publishing the created artifacts/packages to an artifact repository and/or the manifests to a source code management/package manager solution. In the Kubernetes world, this can also include creating the manifests that Kubernetes needs to deploy, for example Helm Charts or Kustomize/Jsonnet resources. A goal of CI with Kubernetes is to produce an easily deployable artifact, and package/configuration/templating managers allow for that.

Artifact repositories are storage-heavy by design, and that is the Achilles heel of running one on a cluster. Unless highly available/durable storage is available to workloads on your Kubernetes cluster, running your artifact repository as SaaS or off the K8s cluster makes sense. Having a deployable artifact/manifest is only part of the equation of getting your idea into the hands of the end user; the next step is the deployment.

Best Practices for CD and Kubernetes

The goal of Continuous Delivery (CD) is to get your changes into production in a safe manner. Kubernetes can deploy very quickly, especially with a recreate strategy, where all the Pods are killed and replaced at once rather than incrementally as with a rolling strategy, but this causes downtime. Most of us deal with workloads that are already running, where downtime would be a detriment. Because of the immediate nature of Kubernetes, resisting the urge to deploy as rapidly as possible seems counterintuitive but is needed for confidence.
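The trade-off between the two strategies can be sketched with a little arithmetic. The function below is a simplified model that assumes the Deployment defaults of 25% for both maxSurge and maxUnavailable, with maxSurge rounded up and maxUnavailable rounded down. It ignores readiness timing and edge cases, and `rollout_bounds` is a hypothetical helper, not a Kubernetes API.

```python
def rollout_bounds(replicas: int, strategy: str = "RollingUpdate",
                   max_surge_pct: int = 25, max_unavailable_pct: int = 25) -> dict:
    """Estimate the fewest available pods and the most total pods during a rollout."""
    if strategy == "Recreate":
        # All old pods are stopped before new ones start, so availability drops to zero.
        return {"min_available": 0, "max_total": replicas}
    # Integer arithmetic: surge rounds up, unavailable rounds down.
    surge = -(-replicas * max_surge_pct // 100)
    unavailable = replicas * max_unavailable_pct // 100
    return {"min_available": replicas - unavailable, "max_total": replicas + surge}


if __name__ == "__main__":
    print(rollout_bounds(10))                       # rolling update with defaults
    print(rollout_bounds(10, strategy="Recreate"))  # fast, but with downtime
```

For ten replicas, the rolling model keeps at least eight pods available and never runs more than thirteen, while recreate drops to zero before coming back.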

The confidence-building exercises that applications went through before Kubernetes did not magically disappear with Kubernetes. For example, testing and coverage requirements did not evaporate. If anything, Kubernetes introduces additional concerns. It is not unusual to run conformance tests to validate the Kubernetes infrastructure you are deploying to, for portability reasons. Portability is a big draw for leveraging Kubernetes in the first place.

Similar to running Continuous Integration steps on Kubernetes, running certain Continuous Delivery steps on Kubernetes itself is prudent. Standing up test infrastructure and then spinning it down is easily achievable on a Kubernetes cluster. Depending on the length of the confidence-building steps, a long-lived workflow component may be needed for orchestration. The same design principles and decisions about running long-standing/stateful workloads on or off Kubernetes apply to that orchestration.

Leveraging release strategies such as a blue-green or canary release is very possible with Kubernetes. While it is possible to do this by hand with several well-crafted Kubernetes manifests applied at the right times, tooling to cover these release strategies is increasing. Building in proper health checks, such as liveness and readiness probes, so that incremental deployments can continue is key when architecting for Kubernetes. The safety that was needed before Kubernetes did not go away with Kubernetes. As the ecosystem and tooling continue to mature, new paradigms will appear.
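As a rough illustration of the health checks mentioned above, the sketch below adds HTTP liveness and readiness probes to a container spec. The `/healthz` and `/ready` paths, the timings, and the `with_probes` helper are assumptions. Your application would need to serve whatever endpoints you configure.

```python
def with_probes(container: dict, port: int) -> dict:
    """Return a copy of a container spec with liveness and readiness probes added."""
    probed = dict(container)  # shallow copy so the caller's dict is not modified
    # Liveness: the kubelet restarts the container if this check keeps failing.
    probed["livenessProbe"] = {
        "httpGet": {"path": "/healthz", "port": port},  # hypothetical endpoint
        "initialDelaySeconds": 10,
        "periodSeconds": 10,
        "failureThreshold": 3,
    }
    # Readiness: the pod receives Service traffic only while this check passes,
    # which is what lets a rolling update wait for new pods before continuing.
    probed["readinessProbe"] = {
        "httpGet": {"path": "/ready", "port": port},  # hypothetical endpoint
        "periodSeconds": 5,
        "failureThreshold": 3,
    }
    return probed


if __name__ == "__main__":
    import json
    base = {"name": "hello-app", "image": "registry.example.com/hello-app:1.0.0"}
    print(json.dumps(with_probes(base, 8080), indent=2))
```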

Furthering the Journey

With Kubernetes blurring the line between infrastructure and application, a common system design paradox, “can the author be the enforcer?”, can easily play out in Kubernetes. Prior to Kubernetes, development engineers deploying directly to production was not the norm. Production was usually fronted by some sort of CI/CD platform with varying levels of automation and approvals.

With Kubernetes, depending on how far you take isolation versus having a single cluster, you can easily run the build, the confidence-building steps, and the deployment on and into the same cluster, separated by Namespace. With modern tooling and the GitOps movement gaining traction, authors can now enforce standards such as drift detection and self-healing of the declarative states of deployments.
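Drift detection boils down to comparing what was declared (for example, in Git) with what is live in the cluster. The sketch below shows that comparison on plain dictionaries. It is a simplified model and not how any specific GitOps tool works, and `find_drift` is a hypothetical helper.

```python
def find_drift(declared: dict, live: dict, path: str = "") -> list:
    """List (path, declared_value, live_value) for each declared field that differs."""
    drift = []
    for key, want in declared.items():
        here = f"{path}.{key}" if path else key
        have = live.get(key)
        if isinstance(want, dict):
            # Recurse into nested sections; a missing section is treated as empty.
            nested = have if isinstance(have, dict) else {}
            drift.extend(find_drift(want, nested, here))
        elif want != have:
            drift.append((here, want, have))
    # Fields present only in the live state (e.g. status) are ignored on purpose.
    return drift


if __name__ == "__main__":
    declared = {"spec": {"replicas": 3, "template": {"image": "hello-app:1.0.0"}}}
    live = {"spec": {"replicas": 5, "template": {"image": "hello-app:1.0.0"}},
            "status": {"readyReplicas": 5}}
    for field_path, want, have in find_drift(declared, live):
        print(f"{field_path}: declared {want}, live {have}")
```

Self-healing is then a matter of re-applying the declared state whenever the returned list is not empty.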

Kubernetes has the ability to react in a generic sense. With monitoring and observability tools continuing to evolve and feed into judgment calls about whether a deployment is successful, taking action to further the deployment or roll it back on Kubernetes is certainly possible today, for example on the Harness Software Delivery Platform. As more organizations further their Kubernetes journey for all of the benefits Kubernetes offers, it is wise not to forget the discipline (which was around before Kubernetes) while embracing the new paradigms.
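The kind of judgment call described above can be sketched as a small decision function that compares error rates before and after a deployment. This is an illustrative assumption only and not how Harness or any other platform implements verification. The thresholds and the `decide` helper are hypothetical.

```python
def error_rate(sample: dict) -> float:
    """Fraction of failed requests in a sample with 'requests' and 'errors' counts."""
    return sample["errors"] / sample["requests"] if sample["requests"] else 0.0


def decide(baseline: dict, candidate: dict,
           tolerance: float = 0.01, min_requests: int = 100) -> str:
    """Return 'wait', 'promote', or 'rollback' for a newly deployed version."""
    # Not enough traffic yet to judge the new version.
    if candidate["requests"] < min_requests:
        return "wait"
    # Roll back if the new version errors noticeably more than the old one.
    if error_rate(candidate) - error_rate(baseline) > tolerance:
        return "rollback"
    return "promote"


if __name__ == "__main__":
    old = {"requests": 1000, "errors": 10}
    print(decide(old, {"requests": 50, "errors": 0}))     # too little traffic so far
    print(decide(old, {"requests": 1000, "errors": 15}))  # within tolerance
    print(decide(old, {"requests": 1000, "errors": 50}))  # clearly worse
```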

Cheers,

Ravi
