September 18, 2024

Evolution of Harness Infrastructure


Eighteen months ago, we took the initiative to evolve our cloud infrastructure. In this article, we share what we have learned and highlight important aspects of our approach.

The objectives for this initiative were:

  1. Standardize production infrastructure with infrastructure as code (IaC) to streamline operations and scale efficiently with growth.
  2. Provide developers with production-like environments for their dev/test use.

Ours is a cloud-native stack with dozens of microservices deployed in a Kubernetes cluster. We needed to create new production clusters to scale and to offer our SaaS service in more geographies (outside the US). We opted for OpenTofu and Terragrunt as our IaC tools and standardized on Helm Charts as service artifacts. Our CI process produces Docker images and corresponding Helm Charts (versioned, immutable artifacts with the image tags baked into the chart).

Our Tiered Approach for Infrastructure Stack

We divided our stack into four tiers from the bottom (Tier-1) to the top (Tier-4), as illustrated in the diagram below:

Figure: Harness Infrastructure Stack

We drew the tier boundaries along separation of concerns for security and operations. Each tier has the following function:

  • Tier-1 (aka Cloud Infra Tier): At this tier, we manage the fundamental building blocks of the cloud infrastructure: the (GCP) Project, IAM roles, VPC, and Harness delegate.
  • Tier-2 (aka Compute Tier): At this tier, we manage compute resources, including Kubernetes clusters, the service mesh, and other cluster resources such as the external secret manager.
  • Tier-3 (aka Application Infra Tier): At this tier, we manage application infrastructure, including namespaces, databases, and the external secrets needed by the microservices in the application tier (aka Tier-4).
  • Tier-4 (aka Application Tier): At this tier, we deploy microservices (through Helm Charts).
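The tier layout above can be modeled as a small data structure. The following Python sketch is illustrative only: the `Tier` class and the `apply_order` helper are hypothetical names, and the rule that every lower tier must exist before a higher one is applied is an assumption inferred from the bottom-to-top ordering, not a description of the actual pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    number: int
    name: str
    manages: tuple


# Tier contents as listed in the article, bottom (1) to top (4).
TIERS = (
    Tier(1, "Cloud Infra", ("GCP project", "IAM roles", "VPC", "Harness delegate")),
    Tier(2, "Compute", ("Kubernetes clusters", "service mesh", "external secret manager")),
    Tier(3, "Application Infra", ("namespaces", "databases", "external secrets")),
    Tier(4, "Application", ("microservices (Helm Charts)",)),
)


def apply_order(target: int) -> list:
    """Return the tier names that must be in place, bottom-up, up to `target`.

    Assumption: each tier depends on all tiers below it.
    """
    if not 1 <= target <= len(TIERS):
        raise ValueError(f"unknown tier: {target}")
    return [tier.name for tier in TIERS if tier.number <= target]
```

For example, `apply_order(3)` lists the Cloud Infra, Compute, and Application Infra tiers in that order.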

Tier-1 needs the most privileged access (it needs to be IAM admin). Our Security Operations team operates this tier from their workstations. The Tier-1 setup also deploys a Harness Delegate with an IAM role that carries the permissions (scoped to the Project) required to manage the other tiers. We operate Tiers 2-4 through Harness pipelines. Harness’ RBAC system provides granular controls for managing access at the environment level. For production environments, we restrict access to Tier-2 and Tier-3 to Cloud Engineers (who manage our production infrastructure), while Tier-4 is available to individual application teams for their independent service deployments.
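The production access rules described here amount to a small lookup table. The sketch below is a simplified illustration in Python, not Harness' RBAC API: the role names and the `can_operate` function are hypothetical, and the table encodes only what the paragraph above states.

```python
# Who may operate each tier in a production environment.
# Role names are hypothetical labels for the groups named in the article.
PRODUCTION_ACCESS = {
    1: {"security-operations"},  # operated from workstations, needs IAM admin
    2: {"cloud-engineers"},      # operated through Harness pipelines
    3: {"cloud-engineers"},      # operated through Harness pipelines
    4: {"application-teams"},    # independent service deployments
}


def can_operate(role: str, tier: int) -> bool:
    """Return True if `role` is allowed to operate `tier` in production."""
    return role in PRODUCTION_ACCESS.get(tier, set())
```

With this table, `can_operate("application-teams", 4)` is true, while `can_operate("application-teams", 2)` is false.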

We use External Secret Manager to pass secrets from the lower to the upper tier. Cloud and Application engineering teams never see the secrets. We use keyless workload identities wherever applicable in our application and infrastructure tiers.

This tiered approach confines the most privileged access to the lowest tier and a small group of operators, while giving application teams the flexibility and agility they need for their development flows.

Devspaces - On-Demand Dev/Test environments

We have more than a dozen independent development teams. Devspaces are on-demand, production-like environments where the teams can do their feature testing. We use the same infrastructure stack to build Devspaces. Each devspace is implemented as an isolated namespace (Tier-3 and Tier-4) in a shared cluster (Tier-1 and Tier-2). Developers can deploy feature builds to their devspaces while the rest of the stack runs a production-like configuration managed by the central team.
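The split between shared and per-devspace tiers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `plan_devspace` function, the `dev-` namespace prefix, and the shape of the returned plan are hypothetical and not taken from the Harness repository.

```python
SHARED_TIERS = (1, 2)        # cloud infra and compute, reused by every devspace
PER_DEVSPACE_TIERS = (3, 4)  # application infra and application, one set per devspace


def plan_devspace(team: str, shared_cluster: str) -> dict:
    """Describe what a new devspace reuses and what it provisions.

    The namespace naming scheme is an assumption made for this example.
    """
    return {
        "cluster": shared_cluster,
        "namespace": f"dev-{team}",
        "reuse_tiers": list(SHARED_TIERS),
        "provision_tiers": list(PER_DEVSPACE_TIERS),
    }
```

Because only Tier-3 and Tier-4 are provisioned per devspace, creating one never touches the cluster or the cloud project underneath it.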

Figure: Devspaces in a shared cluster

Devspaces have proven to be very versatile in our development process. They have removed the bottleneck of a shared integration environment, enabling each development team to do end-to-end feature testing in its own environment. We use them for feature testing, performance testing, demo environments for early feedback, and documentation. In a typical week, we see over a thousand feature build deployments across a hundred-odd devspaces. To keep costs under control, we have built features such as TTL and team-wise cost visibility for devspaces.
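A TTL check of the kind mentioned above could look like the following Python sketch. It is an assumption-laden illustration, not the actual implementation: the `Devspace` fields, the `expired_devspaces` helper, and the idea of measuring the TTL from creation time are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Devspace:
    name: str
    team: str
    created_at: datetime
    ttl: timedelta

    def is_expired(self, now: datetime) -> bool:
        # Assumption: the TTL clock starts when the devspace is created.
        return now >= self.created_at + self.ttl


def expired_devspaces(spaces, now: datetime) -> list:
    """Return the names of devspaces whose TTL has elapsed at time `now`."""
    return [space.name for space in spaces if space.is_expired(now)]
```

A periodic job could call `expired_devspaces` and hand the resulting names to whatever pipeline tears down the corresponding namespaces.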

Built-in compliance with Git version control

We keep all aspects of our infrastructure under version control in Git. This includes infrastructure and pipeline definitions as well as environment-specific configurations. Git provides us with an audit trail through commit history, and we use Pull Request flows to govern changes. Git-based versioning also makes every environment setup fully repeatable.

Figure: Benefits of as-code

Environment as a Service

Standardized IaC driven by Harness pipelines has provided a very flexible mechanism for creating environments for various use cases at Harness. We call this approach Environment-as-a-Service. The diagram below depicts the different use cases in which we employ this.

Figure: Environment as a Service powered by IaC and Pipelines

Conclusion

This initiative has had a significant impact at Harness. In the last few months, we have created three new production clusters, one integration and two QA environments, and over a hundred devspaces. The time to create a new production cluster has dropped from many weeks to a few hours.

We are working with some of our large enterprise customers who are interested in adopting our approach. In the future, we plan to improve documentation and system usability and to open-source our infrastructure repository so that others can benefit from this work.
