Kubernetes Cost Management Strategies: Cost Visibility

Let’s dive into how you can achieve cost visibility to any level of granularity at any scale. These strategies are not mutually exclusive, and in fact, each provides an increasingly complete picture of cost.

With cost visibility, you want to be able to reasonably attribute costs to their origins. Depending on the context of the business, this could be down to the level of the developer, project, application, service, business unit, or anything else that makes sense for your organization’s needs. In Kubernetes, we achieve this by looking at clusters, nodes, namespaces, workloads, and pods.

The most basic way to get visibility is to use the Kubernetes API to see how resources are allocated, and then connect that data to a tool like Prometheus to see what’s being actively utilized versus what’s unallocated or idle. This can work as a start but is difficult to scale, which is why you’ll want a more holistic approach.
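To make the used, idle, and unallocated split concrete, here is a minimal sketch in Python. It assumes you have already pulled a node’s allocatable CPU and summed pod requests from the Kubernetes API, and a usage figure from a metrics tool such as Prometheus. The `NodeCpu` type, the `cost_breakdown` function, and the hourly price are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeCpu:
    capacity: float   # allocatable CPU cores on the node
    requested: float  # sum of CPU requests of pods scheduled on the node
    used: float       # measured CPU usage in cores (e.g. from a metrics query)


def cost_breakdown(node: NodeCpu, hourly_cost: float) -> dict[str, float]:
    """Split a node's hourly cost into used, idle, and unallocated shares."""
    per_core = hourly_cost / node.capacity
    used = min(node.used, node.capacity)
    # Idle: requested by pods but not actually consumed.
    idle = max(node.requested - used, 0.0)
    # Unallocated: capacity that is neither requested nor in use.
    unallocated = max(node.capacity - max(node.requested, used), 0.0)
    return {
        "used": used * per_core,
        "idle": idle * per_core,
        "unallocated": unallocated * per_core,
    }


if __name__ == "__main__":
    # Hypothetical 8-core node at $0.40/hour with 5 cores requested, 2 in use.
    print(cost_breakdown(NodeCpu(capacity=8.0, requested=5.0, used=2.0), 0.40))
```

This only looks at CPU on a single node. A real report would also cover memory and storage and would aggregate across nodes and time, which is where the scaling difficulty shows up.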

This article contains an excerpt from our eBook, Cost Management Strategies for Kubernetes. If you like the content you see, stick around to the end where we’ll link the full eBook for you. It’s free - and best of all, ungated.

Examples of What You’ll See

With cost visibility, you’ll be able to identify the contributors to your costs. In terms of contributions to wasteful spending in Kubernetes, you can expect to find:

  • Cluster sizes bigger than necessary, even with Cluster Autoscaler enabled
  • Unallocated resources within a cluster that contribute to underutilized resources
  • Low pod density, indicating that resources are allocated to the cluster or node but not being used by any pod or workload
  • Mismatches between requests and actual usage of CPU and memory resources, resulting in overprovisioned pods or pod throttling
  • Shared storage across resources that may not be fully utilized, or storage not allocated to any pod that ends up as unallocated storage volumes you’re still paying for
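As a rough illustration of the request-versus-usage mismatch in the list above, the sketch below classifies a pod by the ratio of measured usage to its request. The function name and the 0.5 and 0.9 thresholds are assumptions for the example, not recommended values, and actual throttling depends on limits as well as requests.

```python
def classify_pod(requested: float, used: float,
                 low: float = 0.5, high: float = 0.9) -> str:
    """Label a pod by how its measured usage compares with its request."""
    if requested <= 0:
        return "no-request"        # nothing to compare against
    ratio = used / requested
    if ratio < low:
        return "overprovisioned"   # paying for capacity the pod doesn't use
    if ratio > high:
        return "underprovisioned"  # usage is at or beyond the request
    return "right-sized"


if __name__ == "__main__":
    # Hypothetical CPU figures in cores: (requested, used).
    for name, (req, use) in {"api": (2.0, 0.4), "worker": (1.0, 0.95)}.items():
        print(name, classify_pod(req, use))
```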

Strategy One: Tag and Label Management

Tags are a native capability of most cloud resources, and they’re called labels in Kubernetes. They let you assign key-value identifiers to a resource so that you can find it later based on any combination of key-value pairs. This gives you full flexibility to create any kind of identification scheme, and you can tag resources manually or in code, which makes it simple to build good governance around resource identification.

With a robust set of tags and good tag management policies in place, it becomes possible to slice and dice the cost visibility and ownership into a relevant context for anyone who needs it. In an ideal world, you’d have all of the tags on every resource that everyone needs, to see costs in whatever way makes most sense for them. 

However, because managing tags is largely a manual process, it can become quite the challenge to scale this up. Tag management is hard to do well, so tools that help here are valuable. You’ll want to look for the following in a robust tag management tool:

  • Ability to enforce OPA-based governance rules
  • Reporting that can surface non-compliant resources (e.g. showing what percentage of resources don’t have the “Team” tag)
  • Help with tag compliance through automated tagging (e.g. adding tags as part of a CI/CD pipeline, enabling you to answer questions such as “do frequent deployments impact cost efficiency?”)
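The compliance report described in the list above can be sketched in a few lines of Python. This assumes each resource has already been fetched as a dictionary with a `labels` mapping. The function name, the resource shape, and the sample data are hypothetical.

```python
def missing_tag_report(resources: list[dict], required: list[str]) -> dict[str, float]:
    """Return, for each required tag key, the percentage of resources lacking it."""
    total = len(resources)
    report = {}
    for key in required:
        # A tag counts as missing if the key is absent or its value is empty.
        missing = sum(1 for r in resources if not r.get("labels", {}).get(key))
        report[key] = 100.0 * missing / total if total else 0.0
    return report


if __name__ == "__main__":
    # Hypothetical inventory of four resources.
    inventory = [
        {"name": "checkout", "labels": {"Team": "payments", "Env": "prod"}},
        {"name": "search", "labels": {"Team": "discovery"}},
        {"name": "batch-job", "labels": {"Env": "dev"}},
        {"name": "cache", "labels": {"Team": "platform", "Env": "prod"}},
    ]
    print(missing_tag_report(inventory, ["Team", "Env"]))
```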

There are tools out there that make auto-tagging easier, and if you use those, you’ll want to make sure you get full coverage of your architecture in any tagging effort. You’ll also want to consider how you’ll actually do the reporting if you don’t have a visualization tool, dashboard, or some other way to collate tag information for chargeback, showback, or financial reporting.

Risks

Using tag management as your primary visibility mechanism into your infrastructure relies heavily on the teams using the resources to appropriately tag them. You have to ensure there are guidelines or governance in place that guarantee all resources are tagged appropriately; if resources aren’t tagged, your entire visibility game plan could fall apart.

In addition to making sure there are no untagged resources, you’ll want to consider that there are also untaggable resources. Examples of these are shared resources. If you’re trying to do chargeback or showback for a shared resource, how do you attribute costs across multiple consumers of a resource?
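One common answer to that question is to split the shared cost in proportion to each consumer’s measured usage. The sketch below shows that approach in Python. It is only one possible allocation policy, and the function name and figures are made up for illustration.

```python
def split_shared_cost(total_cost: float,
                      usage_by_consumer: dict[str, float]) -> dict[str, float]:
    """Allocate a shared resource's cost in proportion to each consumer's usage."""
    if not usage_by_consumer:
        return {}
    total_usage = sum(usage_by_consumer.values())
    if total_usage == 0:
        # No usage recorded: fall back to an even split.
        even = total_cost / len(usage_by_consumer)
        return {name: even for name in usage_by_consumer}
    return {
        name: total_cost * usage / total_usage
        for name, usage in usage_by_consumer.items()
    }


if __name__ == "__main__":
    # Hypothetical: a $100 shared volume, with usage measured in GB per team.
    print(split_shared_cost(100.0, {"payments": 30.0, "search": 10.0}))
```

Other policies, such as an even split or a fixed percentage per team, fit the same function shape. What matters for chargeback is that the rule is agreed on up front.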

When It’s Most Useful

Tag management is great for two kinds of organizations:

  1. Small organizations that don’t have complex reporting needs, aren’t worrying about optimizing costs, or don’t have a big infrastructure footprint
  2. Organizations that can create and enforce robust tag governance policies and are able to leverage the tag data to meet reporting needs

Typically, organizations that can effectively leverage the tag management strategy are early SMBs and enterprises with complex financial controls and governance already in place in other parts of the organization.

Strategy Two: Organizational Mapping

If chargebacks and showbacks are the most important thing, then you need to map each resource back to the unit of your organization that’s consuming it, whether that’s by business unit, product, application, microservice, cluster, or workload. 

We’ve repeatedly seen at Harness that this is the goal for organizations at scale. This kind of information brings insights into key business questions, such as:

  • “How much is a customer costing us?”
  • “Am I charging the right amount for my SaaS product?” 
  • “Can I standardize my costs across products or teams?”

Doing this, in practice, can be notoriously difficult. You’ll likely want to implement this strategy as you scale, so it’ll pay to think about how to do this in advance. If you’re using tag management, for example, you’ll need to make sure your tag hygiene is excellent and built to support this kind of allocation.

Risks

The biggest risk with this strategy is in implementation. Successful organizational mapping requires business context at all levels of the organization and a strong ability to understand and get transparency into your Kubernetes costs at any granularity. In addition, you need to be able to map each Kubernetes cluster, node, namespace, and workload to the business context you’re looking at. This can require a large investment in maintaining this mapping, even across cases such as a company reorganization or an infrastructure change event.
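At its simplest, the mapping is a lookup table from a Kubernetes construct to a business context, plus a roll-up of costs through it. The Python sketch below assumes costs are already known per namespace. The function name, the mapping, and the numbers are hypothetical. Anything without a mapping lands in an `unmapped` bucket, so gaps left by a reorganization or infrastructure change stay visible.

```python
from collections import defaultdict


def roll_up(costs_by_namespace: dict[str, float],
            namespace_to_unit: dict[str, str]) -> dict[str, float]:
    """Sum namespace costs per business unit; unknown namespaces go to 'unmapped'."""
    totals: dict[str, float] = defaultdict(float)
    for namespace, cost in costs_by_namespace.items():
        unit = namespace_to_unit.get(namespace, "unmapped")
        totals[unit] += cost
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical monthly costs and ownership mapping.
    costs = {"checkout": 1200.0, "cart": 800.0, "search": 500.0, "sandbox": 150.0}
    owners = {"checkout": "commerce", "cart": "commerce", "search": "discovery"}
    print(roll_up(costs, owners))
```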

When It’s Most Useful

Organizational mapping is most useful for organizations that:

  • Need accurate chargeback and showback across a variety of different use cases or business contexts
  • Want to use cost allocation data as a core metric to inform and speed up business decisions 
  • Have governance requirements that they need to methodically execute on to ensure compliance

Typically, we at Harness have found that the types of organizations that can most effectively leverage this strategy are mid-market companies and enterprises, both of which want a view of cloud costs that more closely correlates to business goals.

Strategy Three: Deployment Correlation

Deployment correlation is the practice of mapping costs to individual engineering deployments, whether to production, dev, test, staging, or QA. With insight into which cloud resources are associated with deployments, organizations can start to answer questions such as:

  • “Can I get a root cause analysis of an unexpected change in costs?”
  • “Can I attribute profit/loss to an individual feature change?”
  • “Are there efficiencies I can create across our deployment strategy?”
  • “What is the correlation between deployment frequency and our cloud costs?”

We at Harness find that this strategy is emerging as a new methodology for cloud cost visibility, including visibility into Kubernetes. Organizations are now looking to understand the impact of feature changes to their costs and even more granularly understand the different variables involved in cloud spend. For example, organizations want to understand when there is an unexpected change in their cloud costs, when it began, and which code deployment introduced the issue. This allows them to quickly triage potential cost snowballs and remedy the specific deployment.
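The triage flow in that example can be sketched as two small steps: find the first day costs jumped, then find the latest deployment on or before that day. The Python below is a minimal illustration under simplifying assumptions (daily cost totals, a fixed percentage threshold, one deployment log). The function names, the 25% threshold, and the data are hypothetical and do not describe how any particular product does it.

```python
from datetime import date
from typing import Optional


def find_cost_jump(daily_costs: list[tuple[date, float]],
                   threshold: float = 0.25) -> Optional[date]:
    """Return the first day whose cost rose more than `threshold` over the prior day."""
    for (_, prev), (day, cost) in zip(daily_costs, daily_costs[1:]):
        if prev > 0 and (cost - prev) / prev > threshold:
            return day
    return None


def last_deployment_before(deployments: list[tuple[date, str]],
                           day: date) -> Optional[tuple[date, str]]:
    """Return the most recent deployment made on or before `day`, if any."""
    candidates = [d for d in deployments if d[0] <= day]
    return max(candidates, key=lambda d: d[0], default=None)


if __name__ == "__main__":
    # Hypothetical daily costs and deployment log.
    costs = [(date(2024, 1, 1), 100.0), (date(2024, 1, 2), 102.0),
             (date(2024, 1, 3), 150.0)]
    deploys = [(date(2024, 1, 1), "v1"), (date(2024, 1, 3), "v2")]
    jump = find_cost_jump(costs)
    if jump is not None:
        print(jump, last_deployment_before(deploys, jump))
```

A deployment that precedes a jump is a suspect, not a proven cause. The sketch narrows down where to look first.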

Of course, attributing Kubernetes costs to deployments requires an intimate knowledge of the continuous delivery pipeline and how Kubernetes resources are being allocated within each deployment. If managing CI/CD pipelines wasn’t already enough, now you have to combine that with your cost visibility strategy! On the tail end of the process, however, you have a complete view across all dimensions of your costs, which can be very powerful indeed.

Harness recognizes the value of such an approach. As a software delivery platform, Harness can tie into your CI/CD pipelines and pull this information to give you a view of how deployments are affecting the costs incurred by your Kubernetes workloads, including the exact change that caused the shift in costs.

Risks

The primary risk with this strategy is the time and effort required to implement. With the level of involvement going beyond just your cost management strategy and spilling into your continuous delivery pipelines, this is no small feat. While the rewards can be great, the cost of building and maintaining such a solution can be a deterrent. And, while many tools struggle to solve this, there are some that help correlate your CI/CD to your cloud costs.

When It’s Most Useful

The organizations that find deployment correlation most useful are those that:

  • Want to associate engineering and infrastructure changes with changes in cost, including anomalous cost spikes
  • Use cost as a baseline metric to drive marginal improvements to the business
  • Have a goal of increasing cost efficiency among engineering teams 

At Harness, we’ve found that there isn’t necessarily a select group of organizations that are able to best leverage the deployment correlation strategy. Rather, the driving factors are either a desire to further improve cost management, or to further refine the software delivery process. While this typically is something organizations deal with as they scale, even smaller organizations find value in creating a scalable process here early on.

Conclusion

These three cost visibility strategies each provide a layer of information that moves you closer to full visibility into your costs. Whether you use one of these exclusively or several together, the first step in any cost optimization exercise is to see where the money is going. Even for your personal budgeting, how well can you decide where to spend (or not spend) money if you don’t know where you’re spending it? There are a variety of valid methodologies for gaining cost visibility. No matter what you choose, the most important thing is understanding what you need before implementing anything.


The next couple of blog posts will cover the other two strategies: cost savings and cost forecasting. However, if you don’t want to wait for the posts, you can simply download the full eBook right now. It’s free and doesn’t require an email address! Download the Cost Management Strategies for Kubernetes eBook now.
