Autoscaling Kubernetes Clusters With Custom Metrics

Kubernetes autoscaling allows a cluster to scale automatically by adding more pods or nodes, ensuring that it always provides enough capacity to run an application.

By Pranjal Kumar
April 30, 2019

Kubernetes is ruling the container orchestration world. It’s a truly portable system that has powerful capabilities for deploying, scaling and managing containerized applications.

It is an interesting space to look into, and it becomes even more interesting once you add autoscaling into the mix.

Why Custom Metrics instead of Traditional Metrics?

Autoscaling is natively supported in Kubernetes. By default, you can automatically scale the number of pods based on observed CPU utilization (a traditional metric).
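For example, a minimal CPU-based HPA can be declared as follows; the Deployment name sample-app and the replica and utilization thresholds are illustrative:

```yaml
# Illustrative sketch: scale a hypothetical Deployment "sample-app"
# between 2 and 10 replicas, targeting 50% average CPU utilization.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```

The same object can also be created imperatively with kubectl autoscale, but a declarative manifest is easier to keep under version control.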

However, in many scenarios you want to scale your application based on other monitored metrics, such as the number of incoming requests or memory consumption. Starting with Kubernetes 1.7, you can do this by leveraging Prometheus and the Kubernetes API aggregation layer.

Custom metrics give you more control and visibility into which parameters a service should be autoscaled on.

Flow diagram for autoscaling in a Kubernetes cluster

Prometheus

Prometheus is widely used to monitor all the components of a Kubernetes cluster, including the control plane, the worker nodes, and the applications running on the cluster.
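As a rough sketch, a Prometheus scrape configuration that discovers pods through Kubernetes service discovery might look like this; the job name and the prometheus.io/scrape annotation convention are common defaults rather than anything specific to this setup:

```yaml
# Sketch: scrape every pod that opts in via the prometheus.io/scrape annotation.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Record the pod's namespace and name as labels on the scraped series.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```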

Prometheus-Stackdriver Adapter

A sidecar for the Prometheus server that can send metrics to Stackdriver.

Once Prometheus scrapes the metrics from the various pods, this sidecar sends the metrics and their metadata to Stackdriver.
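A rough sketch of how the sidecar can be attached to the Prometheus pod spec; the image tag, project, location, and cluster name are placeholders, and the flags shown are assumptions based on the sidecar's documented options:

```yaml
# Sketch: extra container added alongside the Prometheus server container.
containers:
  - name: stackdriver-prometheus-sidecar
    image: gcr.io/stackdriver-prometheus/stackdriver-prometheus-sidecar:<tag>  # placeholder tag
    args:
      - --stackdriver.project-id=<gcp-project>          # placeholder project
      - --stackdriver.kubernetes.location=<region>      # placeholder location
      - --stackdriver.kubernetes.cluster-name=<cluster> # placeholder cluster name
      - --prometheus.wal-directory=/data/wal            # must point at Prometheus's WAL
    volumeMounts:
      - name: prometheus-data   # same volume the Prometheus container writes to
        mountPath: /data
```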

API/Metrics Server

The metrics server uses the Kubernetes API to expose metrics, so that they are available in the same manner as the rest of the Kubernetes API. It aims to provide only the core metrics, such as memory and CPU usage of pods and nodes; all other metrics need to be served through the custom metrics API described next.
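The aggregation layer learns about the metrics server through an APIService object; a typical registration (the Service name and namespace shown are the usual defaults, and the exact TLS settings vary by installation) looks roughly like this:

```yaml
# Registers the metrics.k8s.io group with the API aggregation layer,
# pointing it at the metrics-server Service in kube-system.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  version: v1beta1
  service:
    name: metrics-server
    namespace: kube-system
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true
```

With this in place, kubectl top and the HPA controller can read pod and node metrics from the /apis/metrics.k8s.io/v1beta1 endpoint.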

Custom Metrics API

The custom metrics server exposes its endpoint through the Kubernetes API, but before that the metrics need to be converted into the format the API expects.
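A custom metrics adapter is registered with the aggregation layer in the same way, under the custom.metrics.k8s.io group; in this sketch the adapter Service name and namespace are assumptions:

```yaml
# Registers the custom metrics API, served here by a hypothetical
# "custom-metrics-apiserver" Service in the "monitoring" namespace.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  version: v1beta1
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true
```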

The following manifest describes a HorizontalPodAutoscaler object that scales a Deployment based on the target average value for a metric.

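A minimal sketch of such a manifest, assuming a Pods-type metric named http_requests_per_second served through the custom metrics API and a Deployment named sample-app (both names are illustrative), using the autoscaling/v2beta1 API:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        # Custom metric exposed by the pods and served via custom.metrics.k8s.io.
        metricName: http_requests_per_second
        # Scale so that the average value across pods stays at or below this target.
        targetAverageValue: "100"
```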

Once you have all these components set up, running kubectl describe hpa should produce output similar to the following.

Output of kubectl describe hpa, showing the configured metrics, their current values, and the scaling events

Once the HPA comes into play, the deployment starts scaling up and down automatically based on the configuration provided. As shown in the image, HPA events are recorded whenever the pods scale up or down.


Harness & Kubernetes with HPA

Harness provides an easy way of configuring HPA for services deployed on Kubernetes, with configuration entirely in YAML. The user only needs to provide the HPA details while configuring the Service in Harness, and that’s all.

After the deployment is done, the Service is up and running with HPA enabled. With such an easy setup, Harness supports all types of HPA (a sketch of a multi-metric HPA follows the list):

  1. Multiple metrics based HPA
  2. Default metrics based HPA
  3. Custom/External metrics based HPA
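As a sketch of the first category, an HPA can combine the default CPU metric with the custom metric assumed earlier (the Deployment name and thresholds are again illustrative); the controller scales on whichever metric requires the most replicas:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
    # Default resource metric, served by the metrics server.
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 60
    # Custom metric, served by the custom metrics API.
    - type: Pods
      pods:
        metricName: http_requests_per_second
        targetAverageValue: "100"
```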

Summary

In this blog, I covered how to use Prometheus, the custom metrics API server, and HPA driven by custom metrics. Before configuring HPA, the first thing you must truly understand is which part of the application causes the high-load situation, so that you can configure the proper scaling policy and allow the application to survive peak times.

At Harness, we have implemented HPA in various places to ensure that our different microservices run smoothly. Harness makes things easy and efficient for users by providing autoscaling behind the scenes and tying the metrics in the system into the continuous delivery process, taking away the pain of scaling and monitoring the system.

Pranjal Kumar.
