Kubernetes Series Part 5/6 – Not Everything Fits in Kubernetes

Kubernetes is pitched to “solve” all of our problems, much like Hadoop in years gone by.

By Ravi Lachhman
August 5, 2019

We are getting pretty dangerous in the Kubernetes world after part four of the series; we understand some widely used operational tasks in Kubernetes. The next logical step, clearly, is to stuff every application we have into Kubernetes and start calling our vendors demanding a deployment.yaml so 100% of what we have runs in Kubernetes.

As you start your Kubernetes journey, you will soon find out there is a good amount of thought required to confidently leverage a fast-moving platform; what was par for the course even a few years ago is changing.

Summarizing the blog post in three bullets (see a trend?):

  • What works best: Stateless workloads that were built in the age of Kubernetes. 
  • What works well: Stateful/stateless workloads that are built in the modern age of Kubernetes. 
  • What does not work so well: Stateful/clusterable workloads not built in the age of Kubernetes.

My father would say, when the lottery jackpot gets very large, “if you don’t have a ticket, you don’t have a chance.” In the Kubernetes world, “if you don’t have a container, you don’t have a chance.”

You still need a container (99% sure it is Docker)

If we take our knowledge back a few weeks to part one, we learned about what this container thing is. As a refresher, containers are immutable (not designed to change) and ephemeral (will die). This design paradigm is a relatively new one, but modern cloud native platforms and application infrastructure take it into account.

Most folks who leverage Kubernetes are using some sort of Docker image and Docker Engine to be orchestrated. There are other container formats and runtimes that vendors have a Kubernetes play for, such as rkt and the Mesos containerizer, but those have been overshadowed by the Kubernetes + Docker combination.

A next logical division is brownfield versus greenfield development. The benefit of building something new, aka greenfield, is that you can design for the new paradigm. With existing, aka brownfield, applications, it can feel like fitting a square peg into a round hole.

With existing applications, it can certainly take a few investigatory passes to see if you can even Dockerize your application. There are a lot of items fighting against you, even down to the language/runtime your application was written in. Java has been going through a transformation, with the runtime being updated to respect container limits more gracefully.

With Kubernetes, having more than one instance of your application is as simple as increasing the replica count. Some applications, though, are designed to run only one at a time and were not built with distributed computing in mind.
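As a sketch of what "increasing the replica count" looks like in practice, the `replicas` field of a Deployment manifest is the whole story (the names and image below are placeholders, not from any real application):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                        # placeholder name
spec:
  replicas: 3                         # bump this number to run more instances
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-registry/my-app:1.0   # placeholder image
```

The same change can be made imperatively with `kubectl scale deployment my-app --replicas=3`; either way, this only ends well if the application tolerates multiple copies of itself running at once.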

Fallacies of Distributed Computing

James Gosling, the inventor of Java, and Peter Deutsch came up with the eight Fallacies of Distributed Computing. Summing the fallacies up: there are multiple people working on things at once, and resources we tend to see as always available, such as bandwidth and the network, are actually restricted.

Multiple teams deploying to Kubernetes while we are all collectively learning can really exacerbate these fallacies. Developing a distributed system has a unique set of challenges, and Kubernetes does not auto-magically solve them, such as the fallacy of zero latency. Kubernetes is a great platform for us to build on but is just one interpretation of how a distributed system should be. Distributed systems that were built before the Kubernetes paradigm can be quite difficult to shoe-horn into a Kubernetes cluster.

My cluster needs, um, a (Kubernetes) cluster?

The first set of workloads hitting Kubernetes in the 2015–2016 time frame were stateless applications, though a majority of enterprise applications need to manage state and are stateful.

If you are unfamiliar with stateful vs stateless applications, imagine a stateless application as a weather application and a stateful application as a banking application. It does not matter which endpoint tells us the weather, e.g. we can request “what is the weather in zip code 94105” and get the same result from any node; that is some sweet idempotence! Moving money around is a different story. There are distributed transactions involved, and the endpoints need to be aware of what the other nodes are doing, e.g. replication. We can request “money from Ravi” and one node fulfills that request, then has to somehow persist that information to the rest of the cluster or Ravi might be paying twice.

Clustering and replication are not new to the application world. If you are using a Java stack, application servers have been supporting this for almost two decades now. Take Kubernetes out of the picture: not that long ago, Java applications that had to cluster together relied on application server / in-memory solutions that use UDP/multicast for member nodes to find each other. Getting your Kubernetes Worker Nodes to allow UDP between each other was an ordeal. Application servers have since been updated, and tooling such as Weave Net by Weaveworks has emerged, but those are more recent additions to the ecosystem.

We can take a look at a more modern solution, Apache Kafka. Born at LinkedIn, Kafka is a streaming messaging platform that predates Kubernetes by only two years. Confluent, the company behind Kafka, has been on a journey over the last four years to make their platform consumable on Kubernetes by the masses. They certainly faced challenges with their clustering and replication mechanisms, which took time for the industry and the Kubernetes project to build solutions around.
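Kubernetes does have a primitive aimed at exactly this class of workload: the StatefulSet, which gives each replica a stable network identity and its own persistent volume, so cluster members can find each other and keep their state across restarts. A minimal sketch, with illustrative names (this is not Confluent's actual manifest):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka                     # illustrative name
spec:
  serviceName: kafka-headless     # a headless Service gives each pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: example/kafka:2.3        # placeholder image
  volumeClaimTemplates:           # each replica gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Unlike Deployment pods, the replicas come up as kafka-0, kafka-1, kafka-2 with predictable identities, which is what clustering and replication protocols tend to need.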

Clustering and replication are just half the stateful battle. The second half is where the state will eventually live: the persistence layer.

Persistence, the pest

Early on during the Docker bloom, concerns about storage started to arise. Clearly the mounts and volumes inside a container will not be persisted once that container dies and respawns. A good design has storage volumes that live outside the container and can be externally managed.
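In Kubernetes terms, that externally managed storage is a PersistentVolumeClaim mounted into the Pod: the data outlives the container. A sketch with placeholder names and sizes:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                  # placeholder claim name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: my-registry/my-app:1.0   # placeholder image
      volumeMounts:
        - name: data
          mountPath: /var/lib/app     # data here survives container restarts
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```

Where the volume actually lives (cloud block storage, NFS, a storage product) is the cluster operator's problem, which is exactly the separation of concerns a good design wants.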

With the bloom of containers, especially as they come and go, pinning down the storage has been a challenge. In the early days of Kubernetes, the recommendation was that platforms requiring persistence, such as databases, should not run on Kubernetes. Highly specialized databases with very specific file system requirements would most likely run on a purpose-built cluster not managed by Kubernetes, or even on bare metal.

Both Docker and Kubernetes are evolving to support external volumes better. Purpose-built solutions such as Portworx are helping solve the container storage conundrum. The Container Storage Interface (CSI) can also be credited with strengthening storage options and interoperability across the container ecosystem. Even with all of these enhancements, the radio dials still seem to point towards generics.

Kubernetes is generic

To be a successful orchestrator, Kubernetes had to have the ability to orchestrate a large swath of workloads, and to capture the most workloads, Kubernetes needed to be generic.

Liveness and Readiness checks aside, after dabbling in Kubernetes you start to see the iconic pets vs cattle analogy at play. If your cattle is sick, you get a new one, unlike your beloved pet; so replace an unhealthy container vs healing it. There have been enhancements in the scaling realm with the Horizontal Pod Autoscaler [HPA], allowing container resource utilization to drive the Pod replica count. The HPA is a great mechanism, but left unchecked it can easily exhaust your cluster capacity.
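A sketch of an HPA driving replicas off CPU utilization; the `maxReplicas` ceiling is the guardrail against the runaway scaling just described (target names are placeholders, and the API version shown is the one current around this writing):

```yaml
apiVersion: autoscaling/v2beta2   # current HPA API version as of Kubernetes ~1.15
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                  # placeholder Deployment to scale
  minReplicas: 2
  maxReplicas: 10                 # the ceiling that keeps the HPA from exhausting the cluster
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # add replicas when average CPU passes 70%
```

Note how generic this is: the HPA sees CPU percentages, not your application's queue depths or response times.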

As a tip of the hat toward part six of the blog series, there are new capabilities being added and extended in the Kubernetes ecosystem which make Kubernetes more application-aware and application-specific. Up until this point in the blog series, we are still talking about generics which Kubernetes can take action on.

Cluster awareness and response time

For Kubernetes to take action on an event such as a failure or a scaling trigger does take a little bit of time. The Horizontal Pod Autoscaler only knows what is given to it.

Going back to part four, scaling your Kubernetes cluster is important. Adding and subtracting worker nodes should be considered a common task. At some point the Horizontal Pod Autoscaler will not be able to place work, and you will end up with unfulfilled or rejected resource requests. At that point you need to scale the cluster itself, which takes time.
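Those "unfulfilled resource requests" are literal: the scheduler places Pods based on the resource requests they declare, and when no node has room, Pods sit Pending until the cluster grows. A hedged sketch of the requests and limits the scheduler works from (the numbers are illustrative, not a recommendation):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: my-registry/my-app:1.0   # placeholder image
      resources:
        requests:                 # what the scheduler uses to place the Pod
          cpu: "500m"             # half a CPU core
          memory: "256Mi"
        limits:                   # hard ceiling enforced at runtime
          cpu: "1"
          memory: "512Mi"
```

When the HPA scales a Deployment out, every new replica brings these requests with it; multiply them by the replica count and you can see how a cluster runs out of room.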

This is why there is such importance placed on Application Performance Monitoring [APM] solutions, which help anticipate, with more specific context, when to trigger a scaling event. Very large Kubernetes clusters have additional systemic help in the form of another cluster manager.

In what I call the “Illuminati Clusters,” some of the largest clusters in the world typically have another layer behind Kubernetes managing a cluster of clusters. Google Omega, Microsoft Apollo, and Apache Mesos are all examples of “cluster of clusters” managers. If you clicked on one of those links, you will find some lengthy scholarly papers on distributed systems. Those platforms are designed to orchestrate other platforms; harken back to your computer science university days and two-level schedulers.

Your workloads might not warrant investing in a two-level scheduler or a cluster manager for your Kubernetes cluster, but don’t be scared; peel the band-aid off and start somewhere.

Incremental is your friend

Learning where the challenges in Kubernetes are is important as you design your applications and platforms for the future. Don’t be discouraged; Kubernetes is a platform that is helping bring the power of distributed computing to the masses.

A common approach for any sort of technology change is an incremental one. Start with the low-hanging fruit and build incremental confidence. Start with the stateless pieces of your applications, which are the easiest to orchestrate. You can learn what Dockerizing takes and how to scale up and down for traffic. As the journey continues, you can look at bringing on stateful applications with varying levels of persistence requirements. Having some workloads deployed to Kubernetes and some not is par for the course.

Harness – bridging the divide

Harness is the perfect platform to help bridge the Continuous Delivery divide across heterogeneous infrastructure. Whether you are firing up Kubernetes for the first time or well on your way to having all of your workloads on Kubernetes, Harness has your back. With Harness, you can easily split where a pipeline deploys, so you can target Kubernetes and non-Kubernetes destinations alike.

Stay tuned for the final chapter, part six, where Kubernetes is headed.

-Ravi
