Once you have good visibility into your costs, you can use that data to predict what they will be in the future. There’s no avoiding it: accurate cost forecasting is all about statistical analysis. Because a forecast is inherently a projection, the best you can do is use what you know from existing data to estimate what future costs will look like.
Cost forecasting typically takes two forms:
- Forecasting what you’ll spend at some point in the future
- Predicting whether current cost patterns will exceed budgeted spend
You can use historical data to predict what you’ll have spent by a point in time, and in this way, you can do the following:
- Avoid overspending and cost surprises
- Simplify budget and capacity planning
- Implement better governance (e.g. budgets are soft limits, quotas are hard limits)
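To make the two forms of forecasting concrete, here is a minimal sketch that projects month-end spend from the average daily spend so far and checks it against a budget. The function names and dollar figures are hypothetical, and a straight run-rate projection is only the simplest possible model: it assumes spend continues at the same daily pace and ignores seasonality.

```python
from datetime import date
import calendar


def project_month_end_spend(spend_to_date: float, today: date) -> float:
    """Extrapolate month-end spend from the average daily spend so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_run_rate = spend_to_date / today.day
    return daily_run_rate * days_in_month


def will_exceed_budget(spend_to_date: float, today: date, budget: float) -> bool:
    """Return True when the projected month-end spend is above the budget."""
    return project_month_end_spend(spend_to_date, today) > budget


# Hypothetical numbers: $4,200 spent through June 10 against a $12,000 budget.
projected = project_month_end_spend(4200.0, date(2024, 6, 10))
over_budget = will_exceed_budget(4200.0, date(2024, 6, 10), 12000.0)
```

With these inputs the run rate is $420 per day, so the 30-day projection is $12,600 and the budget check flags an overrun.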
This article contains an excerpt from our eBook, Cost Management Strategies for Kubernetes. If you like the content you see, stick around to the end where we’ll link the full eBook for you. It’s free – and best of all, ungated.
Examples of What You’ll See
With good cost forecasting capability, you’ll be able to plan better for the future and create a strong, nearly self-sustaining feedback cycle between the three steps of your Kubernetes cost management strategy. When you can forecast, you can do the following:
- Predict Kubernetes costs to accurately budget for future development
- Proactively avoid budget overspend by predicting cost fluctuations, such as seasonality, and provisioning resources accordingly
- Project and enforce budgets against Kubernetes consumption of cloud resources
- Map costs to teams that own them so they know what they’ll be responsible for, thereby improving transparency, visibility, and accountability
- Avoid chargeback surprises by creating channels of communication that align teams around cost changes that may result in budget overruns
Strategy One: Soft and Hard Limits
This is the primary strategy for organizations that do not have a good way to predict their costs, though organizations of all kinds use it. Setting limits on spend is an easy way to cap resource usage and ensure you don’t accidentally go bankrupt. However, it can be a reactive, imprecise method that is disconnected from the reality of the business.
Let’s say you use a simple Kubernetes autoscaler to provision more resources as demand increases; in user terms, you provision more resources to run the app as more users come on board. Then, suddenly, more users arrive than you planned for. You have hard limits, such as resource quotas, defined in your control plane to keep costs within budget, and now you’re blocked from spinning up more resources. What happens? Your app crashes, or certain users can’t access it. That’s never a good outcome.
In an ideal situation, you want to be able to set that limit, but have it set intelligently so that you make efficient use of infrastructure and don’t run into walls, like in this example. When you’re setting more intelligent boundaries, you have more leeway for unexpected overhead while still keeping your infrastructure and cost issues to a minimum.
One way to set more intelligent boundaries is to use a tool, like Harness Cloud Cost Management, that can analyze your historical usage pattern data and let you know what requests and limits to set for your workloads.
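As a rough illustration of deriving requests and limits from historical usage, the sketch below uses percentiles of past CPU samples. This is an assumption for illustration only, not how Harness Cloud Cost Management computes its recommendations: the choice of the 50th and 99th percentiles, the 20% headroom, and all names and sample values are hypothetical.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_cpu(samples_millicores: list[float], headroom_pct: int = 20) -> dict[str, int]:
    """Suggest a CPU request and limit from historical usage samples."""
    request = percentile(samples_millicores, 50)  # typical usage
    peak = percentile(samples_millicores, 99)     # near-worst-case usage
    limit = peak * (100 + headroom_pct) / 100     # leave room for unexpected spikes
    return {"request_m": math.ceil(request), "limit_m": math.ceil(limit)}


# Hypothetical CPU usage samples for one workload, in millicores.
usage = [120, 135, 150, 160, 180, 200, 210, 240, 300, 450]
recommendation = recommend_cpu(usage)
```

Anchoring the request to typical usage and the limit to observed peaks plus headroom is one way to avoid both extremes: limits so low that users hit a wall, and limits so high that capacity sits idle.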
As illustrated above, you want to avoid limits that are set too low and may create problems for users or for the balance of the infrastructure. Without good forecasting, you’ll tend to err on the side of setting limits too high to optimize for performance, which results in missed efficiency opportunities.
When It’s Most Useful
Setting limits via resource quotas is useful for all organizations across the board. Limits are a great way to ensure you don’t spend more than you can afford to, and that you’re staying within reasonable expectations. However, be careful not to rely on limits alone as a way to manage and forecast your costs.
Strategy Two: Regression Analysis
A more advanced statistical approach is regression analysis. Now you’re getting into the weeds of what it takes to actually predict cost: which variables are involved, and how they interact to inform an accurate projection of future costs.
Usually with regression analysis, you’ll end up with an equation that covers the majority of use cases for what impacts your costs, and you’ll be able to plug in values to the equation that will help you predict what your cost will look like at a point in time. To do this, you’ll need to have an intimate understanding of what contributes to cost and what affects those contributors, which is no small feat if you’re not an analyst or statistician. You’ll often have to pull together a team to do this analysis and create the “plug and play” equation or visualization of your cost forecast.
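To show what such a “plug and play” equation can look like, here is a minimal sketch that fits a straight line to one cost driver with ordinary least squares. It assumes a single driver (monthly active users) and made-up, deliberately noise-free data so the arithmetic is easy to check. A real analysis would involve several drivers and noisy data.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical history: monthly active users (thousands) and monthly cost (USD).
users = [10, 12, 15, 18, 20]
cost = [2500, 2900, 3500, 4100, 4500]

intercept, slope = fit_line(users, cost)


def predict_cost(expected_users: float) -> float:
    """Plug a driver value into the fitted equation."""
    return intercept + slope * expected_users


cost_at_25k_users = predict_cost(25)
```

Here the fitted equation works out to cost = 500 + 200 × users, so plugging in 25 (thousand users) gives a projected monthly cost of $5,500.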
The key here is that regression analysis is powerful, but it often limits you to a point in time – you can make discrete predictions of what will happen and adjust accordingly, but you can’t continuously come back and see how the projection is changing. Doing the analysis itself takes a long time, and because you have a manual equation, you need to come back and do the math again every time you want to project your costs.
In regression analysis, you always take on the risk of being wrong. What if you’re correlating the wrong variables in the wrong way and your forecast is way off? If you act on those projections, you might end up wildly over- or under-budgeting for the upcoming period, neither of which is ideal. The goal of forecasting is to paint an accurate picture of the future so that you can plan for it. A common target is at least 80% accuracy, and higher is better.
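One way to put a number on forecast accuracy is to compare past forecasts with what was actually billed. The sketch below assumes accuracy is defined as 1 minus the mean absolute percentage error (MAPE), which is a common convention but not the only one. The figures are hypothetical.

```python
def forecast_accuracy(actuals: list[float], forecasts: list[float]) -> float:
    """Return 1 - MAPE, where MAPE is the mean absolute percentage error."""
    errors = [abs(a - f) / a for a, f in zip(actuals, forecasts)]
    mape = sum(errors) / len(errors)
    return 1 - mape


# Hypothetical monthly spend (USD): what was billed versus what was predicted.
billed = [10000, 12000, 8000]
predicted = [9000, 12600, 8800]

accuracy = forecast_accuracy(billed, predicted)
meets_target = accuracy >= 0.8
```

The three monthly errors are 10%, 5%, and 10%, so the accuracy comes out at roughly 92%, which clears an 80% bar.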
Another risk is that the analysis is a point-in-time exercise. Typically, a project is commissioned, a team comes back with a number, and you plan based on that. But what happens if some major event occurs that completely changes the projection? You can do a best-, worst-, and average-case analysis, but it still doesn’t capture the full picture of how things change day to day. To offset this, you can create a running spreadsheet that you refer to daily or at any other interval, but unless you can pull in the people who need to see it and the people who need to act on it, it may become a fruitless endeavor.
When It’s Most Useful
At the end of the day, regression analysis is most useful when your primary use case is a one-time future projection to enable planning. Most typically, we’ve seen organizations care about regression analysis when they’re in a position to start projecting into the future and plan ahead. This is usually at the mid-market to enterprise stages of an organization, though it doesn’t exclude any others who may need the same capabilities. Arguably, all organizations need to use this method at some point.
While it can take a lot of upfront investment to do regression analysis properly, it’s very useful any time you want to project what costs will look like at a point in time, and to be able to adjust budgets accordingly. It’ll be most effective when it’s paired with effective visualizations and action mechanisms to work from the results of the analysis.
Strategy Three: Machine Learning
Think of leveraging machine learning as pumping steroids into regression analysis. Machine learning provides you the power to create incredibly accurate regression models that find all of the contributing factors, what affects them, and what effect they have on the final result. And all of this can be done with minimal human intervention, and on a continuous basis.
Imagine if you could see how projections change daily so you could adjust allocation of funds before you run into problems. And imagine if you could do this just by looking at a dashboard instead of a bloated spreadsheet, all based on your exact business context. That’s what machine learning for cost forecasting lets you do.
However, implementing a machine learning cost forecasting model is no easy feat. Large organizations sometimes devote entire teams to solving the problem, and at the very least, you need a strong ML engineer who can architect, build, and validate the system. There are lots of good open-source ML models out there that specialize in projection, but you’ll need to adapt them for cost forecasting specifically and train them on your specific business context.
Because it’s so powerful and provides so much value, Harness includes machine learning-based forecasting out of the box. Simply connect your billing data and let the engine do the rest. You can group together resources that are relevant to you, set a budget, and see your forecasted costs versus your budget. This enables you to be more proactive about staying within budget, and helps you do better capacity planning and budgeting for future cycles.
This is where it really starts to become a question of how accurate your forecasts need to be, how often they need to be run, and whether they’re a core part of your organizational needs. Is it worth it to you to invest in building an ML-based cost forecasting solution, or can you live with another forecasting strategy? Of course, you can buy solutions that will do this, but you’ll want to ensure they are able to work for your business context. You’ll also want to consider the granularity at which you need to be able to forecast costs.
In addition to developing the solution itself, how will you be using it? Will it be purely to forecast costs and do budget planning, or can you stretch and do other things with continuous forecasting capabilities? For example, can you devise ways to proactively alert teams as soon as a projection indicates they’re on track to go over budget? Consider the risk here of not fully utilizing your investment in machine learning, especially since it’s not uncommon for solutions with poor ROI to be scrapped down the line.
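As one example of an action mechanism, here is a minimal sketch that turns a daily forecast snapshot into alert messages for teams on track to go over budget. The `TeamForecast` type, the 90% warning threshold, and the team names and figures are all hypothetical. The projected spend would come from whatever forecasting model you use, and delivering the messages (email, chat, ticketing) is left out.

```python
from dataclasses import dataclass


@dataclass
class TeamForecast:
    team: str
    budget: float
    projected_spend: float  # produced by whatever forecasting model you use


def budget_alerts(forecasts: list[TeamForecast], warn_ratio: float = 0.9) -> list[str]:
    """Build alert messages for teams projected to approach or exceed budget."""
    alerts = []
    for item in forecasts:
        ratio = item.projected_spend / item.budget
        if ratio > 1:
            overrun = item.projected_spend - item.budget
            alerts.append(f"{item.team}: projected to exceed budget by ${overrun:,.0f}")
        elif ratio >= warn_ratio:
            alerts.append(f"{item.team}: projected at {ratio:.0%} of budget")
    return alerts


# Hypothetical daily forecast snapshot for three teams.
snapshot = [
    TeamForecast("checkout", budget=10000, projected_spend=11500),
    TeamForecast("search", budget=8000, projected_spend=7600),
    TeamForecast("platform", budget=20000, projected_spend=12000),
]
alerts = budget_alerts(snapshot)
```

Running this on every new forecast means the checkout team hears about a projected overrun, and the search team gets an early warning, before the bill arrives. The platform team, comfortably under budget, gets no alert.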
When It’s Most Useful
Machine learning-based cost forecasting is most useful when you pair it with effective visualizations and when you have action mechanisms tied to the forecasts. Think about it: for something that continuously spits out projections, it’s easiest to see those changes visually; and when things change constantly, you need to be able to adjust the outputs tied to the original analysis. More tangibly, you probably want a dashboard tied to your relevant context and desired granularity; and you want a way to make changes, or at least alert decision makers that something has changed and they need to act on it.
While the level of power and detail afforded by an ML solution typically seems like something only the largest or most complex of organizations could utilize, the beauty of ML software is that it can be used by organizations of any scale – it’s just a way to utilize vast troves of data in the most insightful way possible. Any organization that wants to be data-driven and create accurate cost projections will find this strategy useful.
Predictable costs in Kubernetes and the cloud are the holy grail for many people involved in cloud financial management. Whether it’s a high-level finance need or a question of budget for engineering teams, knowing what you need to spend makes everyone’s lives easier. Just as organizations knew what they would spend on infrastructure in the data center, they want the same certainty for cloud costs, which can be elusive.
This concludes our mini strategy pieces about cost visibility, cost savings, and cost forecasting. We have one last eBook excerpt to go over though: Simplifying Kubernetes Cost Management with Harness. However, if you don’t want to wait for that post to go live, you can simply download the full eBook right now – it’s free and doesn’t require an email address! Download the Cost Management Strategies for Kubernetes eBook now.