At some point in our careers, most of us have gotten the proverbial e-mail [or Slack message these days] asking "are you still using that?". Our software is powered and created by time and resources, though we might not have much visibility into the resources we are actually consuming until that fateful notification arrives. 

With the advent of Kubernetes, there is now a common way for application and infrastructure teams to describe application infrastructure. Simply describe what you require in YAML and Kubernetes will take care of the rest, or so it seems. Kubernetes itself still requires infrastructure to run on, and workloads placed in Kubernetes are not immune to being incorrectly sized.

One of the main goals of Continuous Efficiency is to point out where the hot spots are and help teams prioritize action that increases efficiency and decreases spend. When looking at the findings for the first time, there are some common approaches for reducing your footprint. 

Quick Intro to Continuous Efficiency 

Continuous Efficiency is available on the Harness Platform and integrates seamlessly into your existing user experience there. 

Continuous Efficiency

Once you turn on Continuous Efficiency and dig into your services for the first time, you will notice reporting broken into a few different buckets. For your Kubernetes and ECS based resources, Continuous Efficiency reports cost in three buckets: utilized, idle, and unallocated. 

Kubernetes Pod Costs

These buckets are explained in our previous blog. Though let's say you crack open Continuous Efficiency and are looking for immediate next steps to reduce spend. Cloud providers charge for any and all resources being used. 

Is Kubernetes Free?

As the saying goes, there is no such thing as a free lunch [unfortunately!]; cloud vendors will bill you across multiple dimensions. Reducing your footprint on any of the below dimensions is important to reducing your cloud cost. 

What you pay for Public Cloud

The lowest-hanging fruit is to remove Kubernetes Worker Nodes from the pool. By doing this, all of the associated infrastructure can be reduced as well, e.g. the underlying compute. If unallocated costs are high, removing nodes from the pool will shift workloads to the remaining Worker Nodes, thus driving density up. 

Kubernetes Tips and Tricks

Kubernetes as a resource manager is a pretty powerful tool. Define/describe what you need and Kubernetes will try, to the best of its ability, to fulfill the state you described. Though Kubernetes itself runs as a cluster, and worker node size/capacity is subject to the underlying compute plus system overhead. Based on findings in Continuous Efficiency, you might be seeing high idle costs or high unallocated costs. 

High Idle Costs:

Easier to solve for, but a big producer of waste on your Kubernetes cluster, are over-sized or underutilized Pods. When declaring resources in Kubernetes, you typically deal with requests and limits. With requests, Kubernetes guarantees those resources during scheduling. Once a limit is hit, CPU can be throttled, and the dreaded Out of Memory [OOM] killer might run if memory exceeds the limit. 
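As a minimal sketch of what that declaration looks like, here is the resources stanza on a container spec; the deployment name, image, and values are illustrative, not a recommendation:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app          # hypothetical deployment name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - name: sample-app
        image: nginx:1.21   # stand-in image
        resources:
          requests:         # guaranteed to the Pod at scheduling time
            cpu: "250m"
            memory: "256Mi"
          limits:           # CPU throttled / OOM killed past these
            cpu: "500m"
            memory: "512Mi"
```

The gap between what a Pod requests here and what it actually uses is exactly what shows up as idle cost.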

Kubernetes Cost Trends

With the above deployment, CPU utilization peaked at an average of 37% and memory mostly stayed below 50%. A safe measure would be to re-deploy with a smaller resource request. Those changes can be made in the Harness Platform by kicking off a Pipeline to execute the resource request reduction. 
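If you want to sketch the same reduction from the command line, kubectl can patch requests and limits in place and roll the change out; the deployment name and the new values below are hypothetical, picked to sit a bit above the observed utilization:

```shell
# Lower requests/limits on a hypothetical deployment to match observed
# utilization (~37% CPU, <50% memory), leaving some headroom.
kubectl set resources deployment/sample-app \
  --requests=cpu=200m,memory=256Mi \
  --limits=cpu=400m,memory=512Mi

# Watch the rollout replace Pods with the new, smaller requests.
kubectl rollout status deployment/sample-app
```

This is a one-off sketch; in practice you would bake the new values into your manifests so the next deploy does not undo them.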

Kubernetes Resource Requests and Limits

Tuning what is deployed inside of Kubernetes is a quick way to achieve cost savings. Though like any piece of software, Kubernetes itself needs to be managed to be optimized. 

High Unallocated Costs:

Below, we were testing the next version of HarnessU, and even with several "test" students, we were nowhere close to the cluster limits. 

Kubernetes Metric Server

If we continued to run the Kubernetes Worker Nodes at the same size, we would never use anywhere close to what we have available, thus producing a high unallocated cost. 

Machine sizing is an important task. Let's say you have selected a cloud instance with 4 CPUs and 16 GB of memory, and you want to place two Pods that each request 8 GB; most likely you will only be able to place one of those Pods per machine, since 100% of the 16 GB will not be available: the operating system and Kubernetes require overhead. Knowing that, you double your instance count to get the capacity. That is what I did above, but we were nowhere close to the limit when running our workloads concurrently. 
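The arithmetic above can be sketched in a few lines of shell; the 2 GB overhead figure is an assumption for illustration, as the real reservation varies by distribution and node size:

```shell
# Back-of-the-envelope bin packing for the example above: a node with
# 16 GB of memory cannot fit two 8 GB Pods once OS + kubelet overhead
# is reserved.
node_mem_gb=16
overhead_gb=2           # assumed OS + Kubernetes system reservation
pod_request_gb=8

allocatable=$((node_mem_gb - overhead_gb))
pods_per_node=$((allocatable / pod_request_gb))

echo "Allocatable: ${allocatable} GB, pods per node: ${pods_per_node}"
```

With only one 8 GB Pod fitting per node, the other ~6 GB of allocatable memory sits unallocated, which is exactly the bucket Continuous Efficiency flags.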

The prudent move here would be to reduce the number of worker nodes and right-size based on anticipated work. Instead of a larger instance, I could pick a smaller instance type if I want some separation of workloads across instances; combined with the idle-cost strategy of redeploying with lower resource requests, savings would abound. 

Key Commands for High Unallocated Costs

The strategy is to drain the workload away from a node, remove the node from the cluster, and finally remove the infrastructure the worker node was using. Kubernetes has a handy command called drain. Simply get the node name you need and then drain away. 

kubectl get nodes
kubectl drain <node-name>

kubectl delete node <node-name>

From the cloud perspective, shutting down the underlying compute removes all of the node's costs. Though if you are leveraging some sort of autoscaler, make sure to reduce the count on the autoscaler as well, or your work will be thwarted by the mighty autoscaler. Making changes can be scary, especially when you don't have data to back them up or know where to look for the proverbial needle in the haystack. Harness is here to make cloud cost savings easy. 
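As one example of taming the autoscaler, if your worker nodes belong to an AWS Auto Scaling group, you can lower the desired and minimum counts so the group does not simply replace the node you drained; the group name and counts here are hypothetical:

```shell
# Shrink a hypothetical Auto Scaling group backing the worker nodes so
# the autoscaler does not spin a replacement for the drained node.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-k8s-workers \
  --desired-capacity 2 \
  --min-size 2
```

Other providers have equivalents (e.g. resizing a managed node pool); the point is the same: the cluster's view of capacity and the cloud's view must shrink together.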

Cloud Cost Savings Made Easy

As more workloads head toward the public cloud, the engineer-to-infrastructure ratio continues to increase; e.g. one engineer is responsible for vast amounts of infrastructure. Given that platforms like Kubernetes are generic, one or more clusters can support an entire enterprise's worth of workloads, and knowing where to tune is a needle-in-a-haystack problem. With Continuous Efficiency, you know exactly when and where costs spike. With the Harness Platform, you can kick off a pipeline to remediate the cost.  

Cheers!

-Ravi

Keep Reading