The advent of cloud computing has enabled us to develop faster and ship more than ever before, but we’re quickly realizing (if we haven’t already) that it’s come with a tradeoff: we’re spending a lot more on cloud than we anticipated. Our battle now is not about increasing our speed of innovation but about cloud cost management: managing and optimizing the costs that have come with faster innovation.

While we want to empower developers to build, test, and deploy code quickly, we also want to be as efficient as we can with our spend. Whether it’s a team in finance, operations, engineering, or the new hotness of FinOps, someone is on the hook for keeping those cloud costs in check!

Let’s go through a few approaches we can tangibly take to understand and manage cloud costs.

Method 1: Manual Cloud Cost Management

Pros:

  • Gets the job done for small-scale operations
  • Keeps management cost low
  • Easy to get started using native tools

Cons:

  • Need strong governance to scale
  • Manual and prone to error
  • Can miss savings opportunities

Who Should Use This Method?

Typically, this method is best used by early startups and, in some cases, growing SMBs, and it’s most effective when these requirements are met:

  • Teams with few enough cloud resources to keep track of from memory
  • Teams who can ensure 100% tag completeness across their resource fleet
  • Teams who care more about fast growth than cost optimization

The Spreadsheet Approach

Naturally, we can manage our cloud costs by keeping track of what we’re using and where in a spreadsheet. Makes sense and seems easy enough, right? This approach can make a lot of sense when we’re spinning up ten AWS EC2 instances and three S3 buckets, but anyone who’s done their own management knows that it quickly becomes a chore and is effectively impossible as we scale, especially when we start introducing things like containers.

Think about the work involved in managing just your thirteen AWS resources. Now, think about what that would mean if you introduced containerization and exploded the complexity of managing these resources: you have to keep track of how containers and resources map to each other, and the usage at any given point in time, making it difficult to get an accurate accounting of what’s going on if you do it manually.

The effort looks something like the figure below: as we add more cloud resources, the effort it takes to manage them manually grows exponentially.

Figure: Mapping exponential growth of cloud resources to cost management effort. Great for business, terrible for your workload.

Making Manual Management Easier with Tagging

If this model works best for our org, it doesn’t have to be hard. Cloud providers offer a concept called tagging specifically so that we can organize and manage our cloud resources. We can then use the cloud provider’s built-in cost dashboard to see what things cost based on how we’ve tagged those resources.

Tags are essentially key-value pairs assigned to each resource, just like in a hashmap. They let us define how we want to identify resources and assign the key-values required to uniquely identify each one. For example, in our small pool of resources, we know that to uniquely identify a resource, we only need to know the owner and the application, so we can tag every resource accordingly. In this instance, we might end up with tags on our three AWS S3 buckets that read {“owner”: “foo”, “application”: “bar”}, {“owner”: “joe”, “application”: “schmoe”}, and {“owner”: “sally”, “application”: “pally”}. Armed with this identification ability, we can hop into our AWS Cost Explorer dashboard, understand exactly what these resources are costing us, and map those costs to our internal model.
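To make the idea concrete, here is a minimal Python sketch of what a tag-based cost breakdown does under the hood. It is not how AWS Cost Explorer is implemented; the resource IDs, the dollar amounts, and the cost_by_tag helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical billing records: each resource carries its tags and a monthly cost.
# The IDs and dollar amounts are invented for this example.
RESOURCES = [
    {"id": "bucket-1", "tags": {"owner": "foo", "application": "bar"}, "monthly_cost": 12.50},
    {"id": "bucket-2", "tags": {"owner": "joe", "application": "schmoe"}, "monthly_cost": 30.00},
    {"id": "bucket-3", "tags": {"owner": "sally", "application": "pally"}, "monthly_cost": 7.25},
    {"id": "instance-1", "tags": {"owner": "foo", "application": "bar"}, "monthly_cost": 95.00},
]


def cost_by_tag(resources, tag_key):
    """Sum monthly cost per value of one tag key.

    Resources that lack the tag are grouped under "untagged" so that
    gaps in tagging stay visible instead of silently disappearing.
    """
    totals = defaultdict(float)
    for resource in resources:
        value = resource["tags"].get(tag_key, "untagged")
        totals[value] += resource["monthly_cost"]
    return dict(totals)


if __name__ == "__main__":
    for owner, total in sorted(cost_by_tag(RESOURCES, "owner").items()):
        print(f"{owner}: ${total:.2f}")
```

Grouping by "application" instead of "owner" only requires changing the tag key, which is why a consistent tagging scheme matters more than any particular tool.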

Ultimately, doing this relies heavily on creating strong governance around the usage of cloud resources. We want to be consistent in how we identify resources, regardless of how they are invoked, and follow an identification schema that lets us group and find resources so we can keep track of costs and eventually optimize them. At that point, we can come to the table at any time to do a costing exercise and see where costs are coming from and how we might be able to optimize them.

This approach always runs the risk of things being tagged incorrectly or not being tagged at all, and the burden falls on us to maintain the policies and validate the tags over time. The strategy relies on good tag coverage and good tag accuracy to understand costs, and if either slips, the whole strategy breaks down. On the other hand, it does carry the benefit of giving us lots of control, and we’ll definitely understand our costs better than ever!
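Validating tags over time can be partly automated. The sketch below is one possible way to measure tag coverage, assuming a policy that requires the owner and application tags from the earlier example. The inventory and the audit_tags helper are hypothetical, and the check only catches missing or empty tags, not tags that are present but wrong.

```python
# Assumed tagging policy: every resource must carry these keys.
REQUIRED_TAGS = ("owner", "application")

# Hypothetical inventory with two deliberately incomplete resources.
INVENTORY = [
    {"id": "bucket-1", "tags": {"owner": "foo", "application": "bar"}},
    {"id": "bucket-2", "tags": {"owner": "joe"}},                      # missing application
    {"id": "bucket-3", "tags": {"owner": "", "application": "pally"}},  # empty owner
    {"id": "instance-1", "tags": {"owner": "sally", "application": "pally"}},
]


def audit_tags(resources, required=REQUIRED_TAGS):
    """Return (coverage, violations).

    coverage is the fraction of resources carrying every required tag
    with a non-empty value; violations maps each failing resource id
    to the list of tag keys it is missing.
    """
    violations = {}
    for resource in resources:
        tags = resource.get("tags", {})
        missing = [key for key in required if not tags.get(key)]
        if missing:
            violations[resource["id"]] = missing
    total = len(resources)
    coverage = 1.0 if total == 0 else (total - len(violations)) / total
    return coverage, violations


if __name__ == "__main__":
    coverage, violations = audit_tags(INVENTORY)
    print(f"Tag coverage: {coverage:.0%}")
    for resource_id, missing in violations.items():
        print(f"{resource_id} is missing: {', '.join(missing)}")
```

Running a check like this on a schedule turns tag hygiene from a periodic cleanup into a number we can watch, though someone still has to fix the violations it reports.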

Method 2: Using Cloud Cost Management Software

Pros:

  • Simple way to understand cloud costs and optimization opportunities
  • Automate a lot of the manual work of managing cloud resources
  • Tag hygiene is a thing of the past

Cons:

  • Costs money, potentially a lot
  • Creates a “cloud cost management” function in an organization, potentially creating more inefficiencies
  • Acts as a wrapper around the root problem

Who Should Use This Method?

Typically, this method is best used by SMBs or early mid-market companies, and it’s most effective when these requirements are met:

  • Organizations with multiple teams using multiple different groups of cloud resources
  • Teams who can’t rely only on good governance and tag hygiene to understand costs
  • Teams who want to keep costs in check at a point in time or periodically

Those of us who have tried to keep using the manual method as we hit our stride and grow our use cases know that it becomes nearly impossible to properly track and tag all of the resources in order to understand costs and create more efficiency. In fact, trying to maintain tag hygiene sometimes creates more inefficiency than it removes, so manual management stops being a workable way to manage cloud costs as we scale.

At this point, we want to strongly consider some kind of management software, whether built in-house or provided by a vendor, to help with a few core problems:

  • Surfacing savings or optimization opportunities based on usage patterns
  • Tracking costs to their appropriate resource or event (including potentially enforcing and maintaining tag hygiene)
  • Identifying potential cost overruns based on a budget and proactively resolving them
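As an illustration of the last item, identifying a potential overrun can start with something as simple as a run-rate projection. The sketch below assumes spend accrues at a constant daily rate and that spend_to_date includes today’s spend. Real tools likely use richer forecasting models; the function names and figures here are hypothetical.

```python
import calendar
from datetime import date


def forecast_month_end(spend_to_date, today):
    """Project month-end spend by extending the average daily spend so far.

    Assumes a constant daily rate and that spend_to_date includes today.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = spend_to_date / today.day
    return daily_rate * days_in_month


def check_budget(spend_to_date, budget, today):
    """Compare the projected month-end spend against a monthly budget."""
    forecast = forecast_month_end(spend_to_date, today)
    return {
        "forecast": round(forecast, 2),
        "over_budget": forecast > budget,
        "projected_overrun": round(max(0.0, forecast - budget), 2),
    }


if __name__ == "__main__":
    # Hypothetical numbers: $6,000 spent by June 10 against a $15,000 budget.
    print(check_budget(6000.0, 15000.0, date(2024, 6, 10)))
```

A check like this is only useful if it runs early enough in the month for someone to act on it, which is the "proactively resolving" half of the problem.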

With these capabilities in place, we can confidently say we understand our cloud costs.

The Rise and Fall of Top-Down Cloud Cost Management

But who understands the cloud costs, and who is responsible for understanding them? We have to realize that with cloud cost management software, we’ve flipped the script. When we were small, engineers owned and managed resources and cost, but with cost management software we go from bottom-up cloud cost management to top-down: engineers are tasked with innovation while a manager (usually sitting between a few different teams and having no context) is tasked with managing the costs.

With cost management software, suddenly we abstract away a lot of the really difficult work required with manual management of cloud costs, and there’s a good chance we’ll net a bunch of savings that previously hadn’t been discovered via some great “reserved” resource pricing or rightsizing opportunities. That, and there’s a good chance we’re able to get this view without any tagging! What’s not to like?

The promise of such a top-down management method is realized early on in the process and time to value is certainly minimized because there are a lot of upfront opportunities to optimize the cloud resource fleet. But there’s a caveat: even in a scenario where we can track every resource and identify every savings opportunity, it’s tough to actually implement them because chances are, we can’t just go and make changes. We’ll have to contend with individual budget owners, CI/CD issues and velocity, and rope multiple owners and functions into every conversation. 

Figure: Cloud cost budget ownership as a separate functional role works on paper but struggles in practice.

Not only does this mean the savings we found might never actually get implemented, but also that we’ve solved one problem only to run into a brick wall. A top-down cost optimizer who doesn’t get into the weeds can’t solve these problems, just as an armchair analyst can’t do the same job as someone on the ground.

We can definitely understand our costs now, but we’re not much closer to actually managing them well or optimizing them, which was the whole idea.

As it turns out, cloud cost management software provides great point-in-time views of what’s going on and automates some of the thinking around what to do at a high level, but it’s really just a nicer way of doing the same manual management. Now we get to spend time trying to talk others into cost optimizations rather than trying to find the optimizations in the first place. We’ve made progress, but this situation is still far from ideal.

If we really want to create a long-term scalable model, we have to consider that the top-down approach that traditional cloud cost management software provides may not be the best fit for our needs. It’s awesome for a high-level view and powerful recommendations, but that way of implementing a strategy doesn’t work for everyone. For those teams, it’s important to solve the core efficiency problem, which is a different kind of challenge.

Method 3: Collaborative Cloud Cost Management

Pros:

  • Diffuses responsibility for cloud costs across the whole org
  • Solves the root problem of cloud cost mismanagement
  • Scalable, and transforms teams from reactive to proactive about cloud costs

Cons:

  • Costs money, potentially a lot
  • High upfront cost of retraining engineering teams and habits
  • Can be tricky to implement without the right tooling

Who Should Use This Method?

This method works well for companies at all stages, but particularly well for growing SMBs, mid-market companies, and especially enterprises. It’s most effective when these requirements are met:

  • Teams with large distributed groups of cloud resources, including container fleets
  • Teams who need to track down and understand costs quickly without overhead
  • Teams who want to create continuous efficiency, balancing costs without slowing down innovation

There are a lot of good things about cloud cost management software, and we definitely want to preserve those. In particular, regardless of the solution we choose, we want to ensure we can have a simple view that helps us understand our cloud costs. But we want to get away from top-down management of cloud costs, because we see that as we scale to greater heights, it creates a lot of inefficiencies and doesn’t get to the core of the problem: the engineering teams that are building things.

We want our engineers to build awesome stuff and we don’t want to slow down their velocity. After all, this is the whole reason we moved to the cloud and created this innovation model. At the same time, we can’t escape the fact that this model has created cost concerns and we want to get more efficient about it.

Let’s take a step back and see what we know:

  1. Things worked really well when there was a tiny engineering team that could keep track of costs themselves, but that’s not scalable and it’s definitely not how it works at our size
  2. We can create powerful discrete or point-in-time views of our costs and opportunities to optimize, but by the time we can get anything done, it’s stale
  3. We want to be efficient continuously rather than discretely, but without sacrificing the ability to see the big picture, including the cost analytics and fleet-wide optimization opportunities

Long story short, we want the control we had when we could handle it ourselves, we want to keep moving fast, and we want to operate at scale. There’s no need to imagine whether that could happen, because that’s where collaborative cost management comes into the picture.

Figure: Collaborative cloud cost management illustrated simply. It shifts the responsibility for cloud costs down the chain instead of leaving it as a top-level function.

The collaborative cost management model leverages the best of bottom-up ownership of resources as well as the top-down view of what’s happening. The question to answer then is how to implement such a model.

Implementing Collaborative Cost Management

At its core, the model is composed of two parts. First, we need a top-down view of things — we already have that from our cost management solution. Second, we need to bring engineers back into the cost picture and empower them to track and manage their costs — and we can already track resource-level costs, so this bottom-up need is already teed up to be solved.

If we can do both of these things, suddenly we have a better understanding of our overall costs than we ever did, and we do it so much faster. Imagine this: when we look at a point-in-time view of our cloud costs and have questions, 1) we can easily find the answers instead of having to spend time tracking down budget owners; and 2) we find that we’re already managing our costs pretty well because engineers are on top of it from the get-go. In this scenario, we’re never worrying about our costs, we’re still innovating quickly, and now we’ve created continuous efficiency.
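As a rough illustration of finding answers without chasing budget owners, the sketch below takes per-owner costs for two consecutive days and ranks the owners by how much their spend changed. It assumes costs are already attributed to owners; the owner names, the figures, and the explain_spike helper are hypothetical.

```python
def explain_spike(previous_day, current_day):
    """Rank owners by change in daily cost, largest increase first.

    Both arguments map an owner to that owner's cost for the day.
    Owners that appear on only one of the two days count as 0.0 on the other.
    """
    owners = set(previous_day) | set(current_day)
    deltas = {
        owner: current_day.get(owner, 0.0) - previous_day.get(owner, 0.0)
        for owner in owners
    }
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical daily costs per owner.
    yesterday = {"foo": 100.0, "joe": 40.0, "sally": 25.0}
    today = {"foo": 105.0, "joe": 130.0, "sally": 25.0}

    total_change = sum(today.values()) - sum(yesterday.values())
    print(f"Total daily cost changed by ${total_change:.2f}")
    for owner, delta in explain_spike(yesterday, today):
        print(f"  {owner}: {delta:+.2f}")
```

The top-down number says that something changed, and the per-owner ranking says whom to ask first. In a collaborative model, that owner has likely already seen the same data.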

Figure: Instead of leading and dictating from the top, we’re bringing cost management to the masses and democratizing it, creating a better cost management model.

To create this model, we need the right tools in place and we need to get engineers involved. Engineers will need training, of course, but they’ll also need tooling that makes their involvement possible. Hopefully our cost management solution makes this easy.

Once the engineers know what they’re responsible for and how to stay on top of it, we’re off to the races and have a new and efficient collaborative cloud cost management model in place. All pretty easy to do!

Making Cloud Cost Management Easy

Here’s the plug: Harness Continuous Efficiency. All of the goodies we want in our cloud cost management model packed right in, and it’s built into software, so most of the heavy lifting is done already. The best part? It’s part of a CI/CD platform, so it ties right into your usual engineering process instead of adding yet another tool to the long list.

  • We can do cloud cost analysis and see what’s happening at a high level, slice and dice the data to find the individual resources down to the application level, and find optimization opportunities galore.
  • Something spiked in the cloud costs? Cool, let’s just go find out what events were happening at the time and nail that sucker down before it snowballs, even if it’s in Amazon ECS or Kubernetes.
  • Everyone needs a budget, maybe even the engineers now. We can set these at any level in Harness and track resource costs to them, even seeing what that forecast will look like so we can deal with it before it’s a problem at the end of the month.

Do you want to learn more about which tools there are out there? We mapped out the top cloud cost management tools to consider.

Managing cloud costs is a big problem, but it really shouldn’t be. We conquered the data center, so why can’t we conquer the cloud? Get your free trial of Harness Continuous Efficiency to make your cloud cost management easy.
