Auto-Pruning Kubernetes Resources with Harness


Harness's Auto-Pruning feature helps manage Kubernetes deployments by automatically removing untracked resources, preventing clutter, and ensuring seamless rollback. This enhances deployment efficiency and reduces manual clean-up efforts, providing a streamlined and reliable CI/CD process for Kubernetes environments.

Nobody wants dangling resources in their deployment environment. They add clutter and consume capacity unnecessarily. This is particularly painful when the environment is managed by multiple administrators and developers with distributed ownership and knowledge. Tracing these resources becomes even harder with advanced deployment strategies, such as canary and blue-green, especially when a release must be rolled back. To save you from this situation, we at Harness provide a controlled and customizable solution called Auto-Pruning, so you can focus on more important tasks in peace.

Have we got you excited? Read on to learn more about our approach and design.

A Day-to-Day K8s Deployment With Harness

Standard Kubernetes Deployment With Harness — Release R1 - Successful Deployment

Before discussing pruning, we must understand a regular K8s deployment with Harness. We support multiple deployment strategies, such as rolling, canary, and blue-green. Furthermore, you can choose from multiple manifest sources. Let’s explore more of this with a simple rolling deployment example.

First, you must create a Service. Then, select the manifest source and artifact that you want to deploy by providing their respective connector information.

Next, define your target infrastructure, such as cluster and namespace, in the Infrastructure Definition section.

Finally, create a workflow with the Rolling Deployment strategy using the Service and Infrastructure Definition that you just created, and run the workflow to deploy to your environment.

That's it! Happy Continuous Delivery!

The Problem of Orphaned Resources

At an organizational level, deployments are frequent and complex. Different environments must be set up on the Kubernetes cluster to roll out or test different releases, so teams update their manifests often. Over time, this leaves a trail of unowned, untracked resources: resources that were removed from the manifest during an update but still exist on the cluster. Besides being untracked, these resources might cause unexpected behavior.

One of our customers faced exactly this scenario: an undesired Ingress left in their cluster was causing incorrect routing. Manually managing and cleaning up these resources remains a constant problem for many of our customers.

Therefore, we rolled out our Auto-Pruning feature to deal with these problems and improve the Kubernetes experience.

Breaking Down the Problem

Back to our problem of dangling resources in the environment – we subdivided the various aspects as follows:

Deciding the filter criteria for pruning and handling rollbacks with different strategies
Where, what, and how
Storage and security concerns

In the following sections, we discuss our design for addressing each of these aspects.

Deciding the Filter Criteria for Pruning and Handling Rollbacks With Different Strategies

Rolling Deployment Case A

Let's say, with the first release (R1), you deploy resources (A, B, C) from a manifest (M1).

Next, with the second release (R2), you update your manifest to version M2, which now contains resources A, B, and D, and you deploy it.

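Now, in the production environment, you have resources A, B, C, and D, where C is an undesired resource: it is no longer in the manifest, but it still exists on the cluster. With Auto-Pruning, C is removed once R2 deploys successfully.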

Release R2 Case A - Auto-Pruning Post Successful Deployment

Rolling Deployment Case B

Now suppose release R2 fails instead, and the production environment must be rolled back to its state after release R1. In this case, we must recreate the resources that were pruned while deploying R2 (in our example, resource C).

Note that a rollback always restores the environment to its state after the last successful deployment. So even if several failed releases were deployed between the current release and the last successful one, the rollback still restores the environment to the state of that last successful deployment.
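The rollback rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Harness's implementation: the `Release` record, the `succeeded` flag, and both helper functions are hypothetical names, and resources are reduced to plain string labels.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class Release:
    """Hypothetical record of one release; not a Harness data type."""
    number: int
    succeeded: bool
    resources: FrozenSet[str]


def last_successful(history: List[Release]) -> Optional[Release]:
    """Return the newest successful release, skipping any failed ones."""
    for release in sorted(history, key=lambda r: r.number, reverse=True):
        if release.succeeded:
            return release
    return None


def resources_to_recreate(target: Release, on_cluster: Set[str]) -> Set[str]:
    """Resources that belong to the rollback target but are gone from the cluster."""
    return set(target.resources) - set(on_cluster)


# R1 succeeded; R2 and R3 failed one after the other.
history = [
    Release(1, True, frozenset({"A", "B", "C"})),
    Release(2, False, frozenset({"A", "B", "D"})),
    Release(3, False, frozenset({"A", "B", "E"})),
]
target = last_successful(history)
missing = resources_to_recreate(target, on_cluster={"A", "B", "D", "E"})
print(target.number, sorted(missing))
```

Here the rollback target is R1 even though two failed releases came after it, and C is the resource that has to be recreated.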

Release R2 Case B - Rollback After Failed Deployment

Blue-Green Deployment

The blue-green strategy works differently, so Auto-Pruning needs separate handling for it. Consider the following scenario:

Suppose you have already deployed two releases, R1 and R2, successively. R1 contains resources A, B, and C as defined in manifest M1, and R2 contains resources A, B, and D, as defined in manifest M2. Currently, your release R1 is associated with stage setup, and release R2 is associated with prod setup. As a result, your cluster contains resources A, B, C, and D.

Next, you update your manifest to M3, which contains resources A, B, and E, and you want to deploy this new Release R3 as prod setup. So, after the deployment completes, your release R2 will be associated with stage setup, and R3 with prod. Now, your cluster contains resources A, B, C, D, and E collectively.

In the above scenario, resource C is undesired. More generally, resources that are present only in the release associated with the previous stage setup are candidates for pruning.
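As a rough illustration of this rule, the candidate set can be expressed as a set difference. The function name and the use of plain string labels for resources are assumptions made for this sketch; it is not taken from the Harness codebase.

```python
from typing import Iterable, Set


def blue_green_prune_candidates(
    old_stage: Iterable[str],
    new_stage: Iterable[str],
    new_prod: Iterable[str],
) -> Set[str]:
    """Resources found only in the previous stage release.

    Anything still referenced by the new stage or new prod release is kept.
    """
    return set(old_stage) - (set(new_stage) | set(new_prod))


r1 = {"A", "B", "C"}  # previous stage setup
r2 = {"A", "B", "D"}  # previous prod, becomes stage after R3 is deployed
r3 = {"A", "B", "E"}  # new prod setup

print(blue_green_prune_candidates(old_stage=r1, new_stage=r2, new_prod=r3))
```

With the releases from the scenario above, the only candidate is C: D is still needed by the new stage setup and E by the new prod setup.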

Where, What, and How

Harness stores the release history in a ReleaseHistory ConfigMap on the cluster. It contains metadata about workloads, CRDs, and versions. At any point in time, it holds the state of two deployments: the current one and the last successful one.

For pruning, we use this ConfigMap. When a deployment starts with a new manifest M_NEW, we compare the resources deployed in the last successful release with the resources in M_NEW. Any resource that is in the last successful release but not in M_NEW is pruned after the deployment.

Recreation in Rollback: We also save the rendered YAML of each resource as a string in the ConfigMap. If a rollback occurs, we use that YAML to recreate the resources that were already pruned during the current deployment.
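The comparison described above amounts to a set difference over resource identities. The following is a minimal sketch, assuming a resource is identified by its kind, namespace, and name; `ResourceId` and `resources_to_prune` are hypothetical names chosen for illustration.

```python
from typing import Iterable, List, NamedTuple


class ResourceId(NamedTuple):
    """Hypothetical identity of a Kubernetes resource."""
    kind: str
    namespace: str
    name: str


def resources_to_prune(
    last_successful: Iterable[ResourceId],
    new_manifest: Iterable[ResourceId],
) -> List[ResourceId]:
    """Resources in the last successful release that are absent from M_NEW."""
    new_ids = set(new_manifest)
    return [resource for resource in last_successful if resource not in new_ids]


a = ResourceId("Deployment", "default", "a")
b = ResourceId("Service", "default", "b")
c = ResourceId("ConfigMap", "default", "c")
d = ResourceId("Ingress", "default", "d")

# Manifest M1 deployed A, B, C; manifest M2 contains A, B, D.
print(resources_to_prune(last_successful=[a, b, c], new_manifest=[a, b, d]))
```

For the rolling Case A example, this yields only C, which is then deleted after the new release has deployed successfully.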

Customizability: There may be cases where you don't want a specific resource to be pruned. To address this, we provide an annotation, harness.io/skipPruning: "true", that you can add to the resource YAML. Quote the value, because Kubernetes annotation values must be strings.
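To show how such an opt-out can take effect, here is a small sketch that filters prune candidates by this annotation. The annotation key comes from the text above; the function name and the representation of resources as dictionaries parsed from YAML are assumptions for this example.

```python
from typing import Any, Dict

SKIP_PRUNING_ANNOTATION = "harness.io/skipPruning"


def is_prunable(resource: Dict[str, Any]) -> bool:
    """Return False when the resource opts out of pruning via the annotation."""
    metadata = resource.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    value = str(annotations.get(SKIP_PRUNING_ANNOTATION, ""))
    return value.lower() != "true"


protected = {
    "kind": "ConfigMap",
    "metadata": {"name": "keep-me", "annotations": {SKIP_PRUNING_ANNOTATION: "true"}},
}
ordinary = {"kind": "ConfigMap", "metadata": {"name": "stale"}}

candidates = [r for r in (protected, ordinary) if is_prunable(r)]
print([r["metadata"]["name"] for r in candidates])
```

Only the resource without the annotation remains a prune candidate.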

Handling Interventions: Several kinds of intervention in your environment can make auto-pruning unsuccessful, such as someone manually deleting a resource that was due to be pruned in the current deployment, or the cluster being unreachable. In these cases, we let the pipeline continue with the remaining steps instead of failing it on the spot.
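One way to picture this best-effort behavior is a loop that records failures instead of raising them. This is a sketch under assumptions: `prune_best_effort` and the `delete` callback are hypothetical, and a real delete call would talk to the cluster rather than to the stand-in used here.

```python
from typing import Callable, Iterable, List, Tuple


def prune_best_effort(
    candidates: Iterable[str],
    delete: Callable[[str], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try to delete every candidate; collect failures instead of aborting."""
    pruned: List[str] = []
    skipped: List[Tuple[str, str]] = []
    for resource in candidates:
        try:
            delete(resource)
            pruned.append(resource)
        except Exception as error:  # e.g. already deleted, or cluster unreachable
            skipped.append((resource, str(error)))
    return pruned, skipped


def fake_delete(resource: str) -> None:
    """Stand-in for a real delete call; pretends C was removed by hand."""
    if resource == "C":
        raise LookupError("resource not found")


pruned, skipped = prune_best_effort(["C", "F"], fake_delete)
print(pruned, skipped)
```

The caller can log the skipped entries as warnings and carry on with the next pipeline step.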

Storage and Security

Rendered manifests can exceed the ConfigMap storage limit, which is 1 MiB. That's why we compress the rendered manifest with Java's standard Deflater, using the BEST_COMPRESSION setting, and encode it before saving it to the ReleaseHistory ConfigMap. This solved our problem of storing manifests larger than 1 MiB.

Another issue we faced was that customers don't want their CRDs to be public. That's why we save the rendered YAML of CRDs in Kubernetes Secrets instead of the ConfigMap.
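The effect of this step can be reproduced with Python's zlib module, which implements the same DEFLATE algorithm that Java's Deflater uses. The sketch below is an analog, not the Harness code: the function names are hypothetical, and Base64 is an assumed choice for the encoding step, which the text above does not specify.

```python
import base64
import zlib

CONFIGMAP_LIMIT_BYTES = 1024 * 1024  # 1 MiB


def pack_manifest(rendered_yaml: str) -> str:
    """Compress at the highest level, then encode as text for storage."""
    compressed = zlib.compress(rendered_yaml.encode("utf-8"), zlib.Z_BEST_COMPRESSION)
    return base64.b64encode(compressed).decode("ascii")


def unpack_manifest(packed: str) -> str:
    """Reverse pack_manifest, e.g. to recreate resources during a rollback."""
    return zlib.decompress(base64.b64decode(packed)).decode("utf-8")


# A deliberately repetitive manifest whose raw size is above the limit.
manifest = "apiVersion: v1\nkind: ConfigMap\n" * 100_000
packed = pack_manifest(manifest)

print(len(manifest.encode("utf-8")) > CONFIGMAP_LIMIT_BYTES)
print(len(packed) < CONFIGMAP_LIMIT_BYTES)
```

Real manifests compress less dramatically than this repetitive sample, so a size check before writing the data is still a sensible safeguard.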

Conclusion

Many customers already use Auto-Pruning to keep their environments clean, saving the manual effort of tracking and cleaning up these resources. It also helps them keep their cluster configuration in sync with the manifest, without worrying about unexpected behavior caused by an orphaned resource.

To leverage Auto-Pruning, customers are no longer limited to using Native Helm, which is only available with a basic rollout strategy. Instead, they can now opt for Kubernetes deployments, where they can use it alongside advanced rollout strategies, such as canary and blue-green, as well as with the option of multiple manifest stores.

We hope you enjoyed this article on using Harness to improve your Kubernetes experience. So, what are you waiting for? Hop on with Captain Canary and start your journey with Harness.

If you’re not ready yet, keep on reading and learning more! We’ve mentioned canary and blue-green release strategies a few times, so why not familiarize yourself more with these concepts? Read our piece now: Intro To Deployment Strategies: Blue-Green, Canary, And More.

Tathagat Chaurasiya
