Moving to Kubernetes won’t guarantee lower cloud costs. This blog post shares how to manage costs for containerized applications running in Kubernetes.
Whether you are using Amazon Elastic Container Service (ECS) or any flavor of Kubernetes, this content can help FinOps teams succeed. A core tenet of managing cloud costs is understanding the operating models of our cloud-based workloads. Container technologies allow applications to run independently on shared computing resources, but this creates challenges in cost visibility, resource optimization, and budgeting.
Note that ECS tasks and container instances are equivalent to Kubernetes pods and nodes, respectively. In this blog post, we will use Kubernetes terminology.
What does Kubernetes provide?
Kubernetes, across all of its distributions, is a container orchestration platform. Containers run an application, and they are built from container images that define everything needed to run that application. Kubernetes manages these containers by grouping one or more of them into a pod. Pods can be scheduled and scaled within a cluster of compute nodes.
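As a minimal sketch, a pod is declared with a manifest like the one below (the pod name, image, and port are illustrative):

```yaml
# pod.yaml — a minimal pod running a single container.
apiVersion: v1
kind: Pod
metadata:
  name: web-server          # illustrative pod name
spec:
  containers:
    - name: web
      image: nginx:1.25     # container image packaging everything the app needs
      ports:
        - containerPort: 80
```

Applying this manifest with `kubectl apply -f pod.yaml` asks the scheduler to place the pod on a node in the cluster.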
Namespaces provide a way to organize Kubernetes resources such as pods and deployments. A namespace can mirror an organization’s structure: for example, one namespace per team, or a sandbox namespace for developers.
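A per-team namespace might look like this (the team name and label are assumptions for illustration):

```yaml
# One namespace per team mirrors the org structure.
apiVersion: v1
kind: Namespace
metadata:
  name: team-payments       # illustrative team name
  labels:
    team: payments          # labels like this also make cost allocation and chargeback easier
```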
How to optimize Kubernetes workloads?
In a previous blog post, we discussed the contributors to cloud costs and how workloads can contribute to utilized, idle, and unallocated costs. Idle and unallocated costs are waste and represent underutilized cluster resources. In the unallocated case, you have reserved nodes that run no active workloads.
There are different strategies to consider when optimizing workloads in Kubernetes. Let’s discuss these in more detail.
Configuring Quality of Service for Pods
You can ensure a pod receives a fixed or minimum amount of node resources by specifying resource requests and limits in the container’s configuration. The Kubernetes scheduler uses this configuration to allocate pods to nodes across a cluster. These settings map to three Quality of Service (QoS) classes for pods running in Kubernetes: Guaranteed, Burstable, and Best Effort. Each class is assigned based on how you configure resource requests and limits for CPU and memory.
The Guaranteed QoS class ensures pods receive exactly the amount of CPU and memory they request. Burstable allocation lets pods access additional resources only when required, and Best Effort allows pods to run whenever there is spare capacity on the node. I recommend reviewing the Kubernetes documentation to learn how to make use of each class. You can achieve higher utilization by applying different QoS classes to your workloads.
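The class follows from how requests and limits are set. A sketch of the two explicit cases (pod names, image, and resource amounts are illustrative):

```yaml
# Guaranteed: every container sets limits equal to requests.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "500m"       # equal to requests → Guaranteed QoS
          memory: "256Mi"
---
# Burstable: requests are lower than limits, so the pod can use spare capacity.
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: "250m"
          memory: "128Mi"
        limits:
          cpu: "1"          # higher than requests → Burstable QoS
          memory: "512Mi"
# A pod that sets no requests or limits at all is classified Best Effort.
```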
Configuring The Kubernetes Scheduler
The Kubernetes scheduler watches for newly created pods that have no node assigned. The scheduler is responsible for finding the best node for every pod it discovers, and it does so in two steps: filtering and scoring.
The filtering step determines the set of nodes where it is possible to schedule the pod. Node affinity expresses a preference or requirement for pods to run on a particular set of nodes. You can also taint a node to repel a set of pods, then apply a toleration to allow specific pods to schedule onto nodes with matching taints. This is a great way to dedicate a set of compute resources to a team within an organization.
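A sketch of dedicating nodes to one team, combining a taint with a matching toleration and node affinity (the `team=payments` key/value, node label, and image are assumptions for illustration):

```yaml
# First, taint and label the dedicated node, e.g.:
#   kubectl taint nodes node-1 team=payments:NoSchedule
#   kubectl label nodes node-1 team=payments
# Then only pods like this one can (and must) land on it:
apiVersion: v1
kind: Pod
metadata:
  name: payments-worker
spec:
  tolerations:
    - key: "team"
      operator: "Equal"
      value: "payments"
      effect: "NoSchedule"   # tolerates the taint that repels other pods
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "team"
                operator: In
                values: ["payments"]   # require the team's labeled nodes
  containers:
    - name: worker
      image: payments-app:1.0          # illustrative image name
```

The toleration keeps other teams’ pods off the node; the affinity keeps this team’s pods on it.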
The scoring step ranks the remaining nodes to find the most suitable placement for the pod. The kube-scheduler assigns the pod to the node with the highest ranking; if several nodes have equal scores, it selects one of them at random.
There are two supported ways to configure the filtering and scoring behavior of the scheduler: Scheduling Policies and Scheduling Profiles. I recommend looking into both to determine what makes the most sense for your workloads; see the Kubernetes documentation for details.
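As one cost-oriented sketch, a scheduling profile can bias scoring toward bin-packing, filling existing nodes before spreading onto new ones (the scheduler name is illustrative; this assumes the `kubescheduler.config.k8s.io/v1` API):

```yaml
# kube-scheduler configuration favoring bin-packing to reduce idle capacity:
# NodeResourcesFit scores nodes higher the more allocated they already are.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: bin-packing-scheduler   # illustrative profile name
    pluginConfig:
      - name: NodeResourcesFit
        args:
          scoringStrategy:
            type: MostAllocated            # pack pods densely instead of spreading
            resources:
              - name: cpu
                weight: 1
              - name: memory
                weight: 1
```

Denser packing leaves whole nodes empty sooner, which lets an autoscaler remove them and cut idle cost.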
Vertical and Horizontal Rightsizing
The final suggestion is to consider resizing your resources. In vertical rightsizing, you change the size of your nodes. You may need vertical rightsizing if you run highly available workloads whose pods must be spread across different underlying nodes. This often leads to a higher percentage of idle cost: for example, a node instance may have 12 cores while your pod only uses six. If a highly available service runs two instances of an application this way, then 50% of the cost of running each node is waste.
In horizontal rightsizing, you change the number of nodes in your cluster. Horizontal rightsizing is beneficial if you do not have enough utilization or workloads across your cluster. You can only scale down so far, however, so alternatively you can turn off or spin down compute resources when they are not in use, for example during the weekend. Linedata uses Harness to optimize its AWS workloads this way.
Serverless container architectures also handle resource allocation for you, so you pay only for what you use. Still, without access to reserved infrastructure, it is harder to optimize for different workloads.
Leveraging Cloud Costs Visibility
Managing the costs of Kubernetes and container-based workloads is just as important as managing any other cloud workload. This post has shared ways to optimize your Kubernetes spend, and a key component of optimization is visibility. Tagging and labeling your containers can be a tedious process that by itself provides little insight into your strategy, allocation, and usage. If your public cloud solution does not provide enough visibility, consider trying Harness’s Continuous Efficiency (CE).
Leverage CE to correlate pod resource consumption with your cloud bill, track unallocated costs contributed by freed resources, and monitor service instance usage so you can perform scheduler optimizations where needed. It is great for any FinOps journey!