
TL;DR: Today, Harness is introducing the Harness Cursor Plugin, bringing the power of the Harness AI-native software delivery platform directly into Cursor. This integration, along with the Harness Secure AI Coding hook for Cursor, allows developers and AI agents to move from code changes to vulnerability detection, CI/CD execution, security validation, approvals, deployments, and operational insight without leaving the editor.
AI has completely changed how we write code. You can spin up functions, refactor entire files, and generate tests in seconds. The inner loop, writing and iterating on code, has never been faster. But the moment you try to ship that code, everything slows down. This is what we call the AI Velocity Paradox.
You are suddenly back to juggling pipelines, waiting on approvals, checking security scans, debugging failed runs, and bouncing between tools just to get a change into production.
That gap, between fast code and slow delivery, is what we kept running into. So we built something to fix it.
Today, we are introducing the Harness Plugin for Cursor, a way to go from PR to production without leaving your editor.
If you are using agentic coding tools, such as Cursor, you have probably felt this.
You can:
But shipping still depends on everything outside your editor:
And none of that got simpler just because AI showed up. In fact, AI makes the problem more obvious.
Now you can create changes faster than your delivery process can safely handle. And if those controls are not tight, you are introducing a whole new category of risk. Fast-moving code with fragmented governance.
AI did not break software delivery. It exposed how disconnected it already was.
Instead of jumping between tools, what if you could just tell your editor what you want to happen?
Something like:
“Deploy PR #4821 to staging once the security scan passes, and Slack me if anything fails.”
That is the idea behind the Harness Cursor Plugin.
It connects Cursor directly to Harness, so you can trigger and manage your entire delivery workflow using natural language, right inside Cursor.

No tab switching. No manual orchestration. No guessing what is happening in the pipeline.
Once connected, you can use Cursor to interact with your delivery system just as you do with your code.
For example, you can:
This builds on what we introduced last month, Secure AI Coding, which integrates directly with Cursor and scans code at the moment of generation rather than waiting for a PR review. Developers see inline vulnerability warnings with the option to send flagged code back to the agent for remediation, without leaving their workflow. Under the hood, it leverages Harness's Code Property Graph (CPG) to trace data flows across the entire codebase, surfacing complex vulnerabilities that simpler linting tools would miss.
The key thing is that you are no longer just interacting with code. You are interacting with the entire delivery system from the same place.
One of the biggest concerns with AI in delivery is obvious:
“Are we about to let agents push code to production without guardrails?”
No.
With Harness, everything runs through the controls that you can rely on:

Instead of being manual checkpoints spread across tools, they are enforced automatically as part of the workflow while you stay in flow.
So AI can help move things faster, but it cannot bypass the governance that matters.
Most integrations today expose APIs or bolt AI onto existing systems. That is not what we wanted to do.
We designed the Harness Cursor Plugin specifically for how AI agents actually work:
Because shipping software is not a single action. It is a chain of decisions across CI, CD, security, approvals, and operations. If AI is going to help here, it needs access to that full picture. That’s where the Harness Software Delivery Knowledge Graph comes into play. It provides the necessary context for AI to take actions for you.
The knowledge graph models the relationships between services, pipelines, environments, policies, and operational signals in real time. Instead of treating each step in delivery as an isolated task, it creates a connected system of record that AI can reason over. This allows agents to understand not just what to do, but when and why to do it, based on dependencies, risk signals, and historical behavior.

In practice, this means smarter automation: deployments that adapt to context, approvals that are triggered based on policy and impact, and faster root cause analysis because the system already understands how everything is connected.
This is not just about convenience. It is a shift in how software actually moves from idea to production.
Instead of:
You get a single, connected workflow:
All accessible from your editor. Cursor accelerates the building. Harness governs the shipping. And the handoff between the two disappears.
Watch the demo:
If you want to try it:
For example:
“Run the CI pipeline for this branch, check if the security scan passed, and promote to staging if it did.”
That is it.
AI is not just changing how we write code. It is changing expectations for how fast we should be able to ship it. But speed without control does not work in real environments. What we are building toward is something simpler:
A world where every step, from PR to production, is:
Without forcing developers to leave their flow. This plugin is one step in that direction.
We’ve come a long way in how we build and deliver software. Continuous Integration (CI) is automated, Continuous Delivery (CD) is fast, and teams can ship code quickly and often. But environments are still messy.
Shared staging systems break when too many teams deploy at once, while developers wait on infrastructure changes. Test environments get created and then forgotten, and over time, what is running in the cloud stops matching what was written in code.
We have made deployments smooth and reliable, but managing environments still feels manual and unpredictable. That gap has quietly become one of the biggest slowdowns in modern software delivery.
This is the hidden bottleneck in platform engineering, and it's a challenge enterprise teams are actively working to solve.
As Steve Day, Enterprise Technology Executive at National Australia Bank, shared:
“As we’ve scaled our engineering focus, removing friction has been critical to delivering better outcomes for our customers and colleagues. Partnering with Harness has helped us give teams self-service access to environments directly within their workflow, so they can move faster and innovate safely, while still meeting the security and governance expectations of a regulated bank.”
At Harness, Environment Management is a first-class capability inside our Internal Developer Portal. It transforms environments from manual, ticket-driven assets into governed, automated systems that are fully integrated with Harness Continuous Delivery and Infrastructure as Code Management (IaCM).

This is not another self-service workflow. It is environment lifecycle management built directly into the delivery platform.
The result is faster delivery, stronger governance, and lower operational overhead without forcing teams to choose between speed and control.
Continuous Delivery answers how code gets deployed. Infrastructure as Code defines what infrastructure should look like. But the lifecycle of environments has often lived between the two.

Teams stitch together Terraform projects, custom scripts, ticket queues, and informal processes just to create and update environments. Day two operations such as resizing infrastructure, adding services, or modifying dependencies require manual coordination. Ephemeral environments multiply without cleanup. Drift accumulates unnoticed.
The outcome is familiar: slower innovation, rising cloud spend, and increased operational risk.
Environment Management closes this gap by making environments real entities within the Harness platform. Provisioning, deployment, governance, and visibility now operate within a single control plane.
Harness is the only platform that unifies environment lifecycle management, infrastructure provisioning, and application delivery under one governed system.
At the center of Environment Management are Environment Blueprints.
Platform teams define reusable, standardized templates that describe exactly what an environment contains. A blueprint includes infrastructure resources, application services, dependencies, and configurable inputs such as versions or replica counts. Role-based access control and versioning are embedded directly into the definition.

Developers consume these blueprints from the Internal Developer Portal and create production-like environments in minutes. No tickets. No manual stitching between infrastructure and pipelines. No bypassing governance to move faster.
Consistency becomes the default. Governance is built in from the start.
Environment Management handles more than initial provisioning.
Infrastructure is provisioned through Harness IaCM. Services are deployed through Harness CD. Updates, modifications, and teardown actions are versioned, auditable, and governed within the same system.
Teams can define time-to-live policies for ephemeral environments so they are automatically destroyed when no longer needed. This reduces environment sprawl and controls cloud costs without slowing experimentation.
Harness Environment Management also introduces drift detection. As environments evolve, unintended changes can occur outside declared infrastructure definitions. Drift detection provides visibility into differences between the blueprint and the running environment, allowing teams to detect issues early and respond appropriately. In regulated industries, this visibility is essential for auditability and compliance.
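The core of drift detection can be illustrated with a small sketch: diff the declared state against the observed state. The field names here are invented for illustration and are not Harness's implementation.

```python
# Minimal drift-detection sketch: compare what the blueprint declares
# against what is actually running, and report every mismatch.
def detect_drift(declared: dict, observed: dict) -> dict:
    drift = {}
    for key, want in declared.items():
        have = observed.get(key)
        if have != want:
            drift[key] = {"declared": want, "observed": have}
    # Resources running in the environment but absent from the definition
    # (e.g. a security group someone added by hand) are also drift.
    for key in observed.keys() - declared.keys():
        drift[key] = {"declared": None, "observed": observed[key]}
    return drift

declared = {"replicas": 2, "instance_type": "m5.large"}
observed = {"replicas": 5, "instance_type": "m5.large", "debug_sg": "sg-123"}
drift = detect_drift(declared, observed)
# "replicas" changed out-of-band, and "debug_sg" was added manually.
```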

For enterprises operating at scale, self-service without control is not viable.
Environment Management leverages Harness’s existing project and organization hierarchy, role-based access control, and policy framework. Platform teams can control who creates environments, which blueprints are available to which teams, and what approvals are required for changes. Every lifecycle action is captured in an audit trail.
This balance between autonomy and oversight is critical. Environment Management delivers that balance. Developers gain speed and independence, while enterprises maintain the governance they require.
"Our goal is to make environment creation a simple, single action for developers so they don't have to worry about underlying parameters or pipelines. By moving away from spinning up individual services and using standardized blueprints to orchestrate complete, production-like environments, we remove significant manual effort while ensuring teams only have control over the environments they own."
— Dinesh Lakkaraju, Senior Principal Software Engineer, Boomi
Environment Management represents a shift in how internal developer platforms are built.
Instead of focusing solely on discoverability or one-off self-service actions, it brings lifecycle control, cost governance, and compliance directly into the developer workflow.
Developers can create environments confidently. Platform engineers can encode standards once and reuse them everywhere. Engineering leaders gain visibility into cost, drift, and deployment velocity across the organization.
Environment sprawl and ticket-driven provisioning do not have to be the norm. With Environment Management, environments become governed systems, not manual processes. And with CD, IaCM, and IDP working together, Harness is turning environment control into a core platform capability instead of an afterthought.
This is what real environment management should look like.

Engineering teams are generating more shippable code than ever before — and today, Harness is shipping five new capabilities designed to help teams release confidently. AI coding assistants lowered the barrier to writing software, and the volume of changes moving through delivery pipelines has grown accordingly. But the release process itself hasn't kept pace.
The evidence shows up in the data. In our 2026 State of DevOps Modernization Report, we surveyed 700 engineering teams about what AI-assisted development is actually doing to their delivery. One finding stands out: while 35% of the most active AI coding users are already releasing daily or more, those same teams have the highest rate of deployments needing remediation (22%) and the longest MTTR at 7.6 hours.
This is the velocity paradox: the faster teams can write code, the more pressure accumulates at the release, where the process hasn't changed nearly as much as the tooling that feeds it.
The AI Delivery Gap
What changed is well understood. For years, the bottleneck in software delivery was writing code. Developers couldn't produce changes fast enough to stress the release process. AI coding assistants changed that. Teams are now generating more change across more services, more frequently than before — but the tools for releasing that change are largely the same.
In the past, DevSecOps vendors built entire separate products to coordinate multi-team, multi-service releases. That made sense when CD pipelines were simpler. It doesn't make sense now. At AI speed, a separate tool means another context switch, another approval flow, and another human-in-the-loop at exactly the moment you need the system to move on its own.
The tools that help developers write code faster have created a delivery gap that only widens as adoption grows.
Today Harness is releasing five capabilities, all natively integrated into Continuous Delivery. Together, they cover the full arc of a modern release: coordinating changes across teams and services, verifying health in real time, managing schema changes alongside code, and progressively controlling feature exposure.
Release Orchestration replaces Slack threads, spreadsheets, and war-room calls that still coordinate most multi-team releases. Services and the teams supporting them move through shared orchestration logic with the same controls, gates, and sequence, so a release behaves like a system rather than a series of handoffs. And everything is seamlessly integrated with Harness Continuous Delivery, rather than in a separate tool.
AI-Powered Verification and Rollback connects to your existing observability stack, automatically identifies which signals matter for each release, and determines in real time whether a rollout should proceed, pause, or roll back. Most teams have rollback capability in theory. In practice it's an emergency procedure, not a routine one. Ancestry.com made it routine and saw a 50% reduction in overall production outages, with deployment-related incidents dropping significantly.
Database DevOps, now with Snowflake support, brings schema changes into the same pipeline as application code, so the two move together through the same controls with the same auditability. If a rollback is needed, the application and database schema can roll back together seamlessly. This matters especially for teams building AI applications on warehouse data, where schema changes are increasingly frequent and consequential.
Improved pipeline and policy support for feature flags and experimentation lets teams deploy safely and release progressively to the right users, even as AI-generated code drives up the number of releases. They can quickly measure impact on technical and business metrics, and stop or roll back when results are off track. All of this happens within the familiar Harness interface they already use for CI/CD.
Warehouse-Native Feature Management and Experimentation lets teams test features and measure business impact directly with data warehouses like Snowflake and Redshift, without ETL pipelines or shadow infrastructure. This way they can keep PII and behavioral data inside governed environments for compliance and security.
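To illustrate the verification idea behind AI-Powered Verification and Rollback, here is a minimal, hypothetical sketch of a proceed/pause/rollback decision driven by post-deploy metrics. The thresholds and metric names are invented for illustration, not Harness's actual logic.

```python
# Hedged sketch: compare live metrics after a rollout step against a
# pre-deploy baseline and decide whether to proceed, pause, or roll back.
def verify(baseline: dict, current: dict) -> str:
    error_delta = current["error_rate"] - baseline["error_rate"]
    latency_ratio = current["p99_latency_ms"] / baseline["p99_latency_ms"]
    if error_delta > 0.05 or latency_ratio > 2.0:
        return "rollback"   # clearly unhealthy: undo automatically
    if error_delta > 0.01 or latency_ratio > 1.25:
        return "pause"      # ambiguous: hold the rollout, alert a human
    return "proceed"

decision = verify(
    {"error_rate": 0.01, "p99_latency_ms": 200},
    {"error_rate": 0.012, "p99_latency_ms": 210},
)
```

The design point is that rollback stops being an emergency procedure and becomes a routine outcome of the same check that gates every rollout.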
These aren't five separate features. They're one answer to one question: can we safely keep going at AI speed?
Traditional CD pipelines treat deployment as the finish line. The model Harness is building around treats it as one step in a longer sequence: application and database changes move through orchestrated pipelines together, verification checks real-time signals before a rollout continues, features are exposed progressively, and experiments measure actual business outcomes against governed data.
A release isn't complete when the pipeline finishes. It's complete when the system has confirmed the change is healthy, the exposure is intentional, and the outcome is understood.
That shift from deployment to verified outcome is what Harness customers say they need most. "AI has made it much easier to generate change, but that doesn't mean organizations are automatically better at releasing it," said Marc Pearce, Head of DevOps at Intelliflo. "Capabilities like these are exactly what teams need right now. The more you can standardize and automate that release motion, the more confidently you can scale."
The real shift here is operational. The work of coordinating a release today depends heavily on human judgment, informal communication, and organizational heroics. That worked when the volume of change was lower. As AI development accelerates, it's becoming the bottleneck.
The release process needs to become more standardized, more repeatable, and less dependent on any individual's ability to hold it together at the moment of deployment. Automation doesn't just make releases faster. It makes them more consistent, and consistency is what makes scaling safe.
For Ancestry.com, implementing Harness helped them achieve 99.9% uptime by cutting outages in half while accelerating deployment velocity threefold.
At Speedway Motors, progressive delivery and 20-second rollbacks enabled a move from biweekly releases to multiple deployments per day, with enough confidence to run five to ten feature experiments per sprint.
AI made writing code cheap. Releasing that code safely, at scale, is still the hard part.
Harness Release Orchestration, AI-Powered Verification and Rollback, Database DevOps, Warehouse-Native Feature Management and Experimentation, and Improved Pipeline and Policy Support for FME are available now. Learn more and book a demo.


The question for enterprise AI in 2026 is no longer just which model. It’s which harness.
An agent harness is the system around the model. It decides what the agent remembers, what context it sees, what tools it can call, what it is allowed to do, and what happens when it is wrong.
The model provides intelligence. The harness provides control.
This is where the real engineering is happening. When Claude Code's source was accidentally exposed earlier this year, reports put it at more than half a million lines. None of that was the model. All of it was the system around the model.
The model gets you started. The harness gets you to production.
Software engineering is one of the first places this plays out. AI coding tools are writing and editing code. Autonomous agents are starting to deploy, operate, and respond to incidents. These are not suggestions anymore. They are changes to running software, made by agents acting on their own.
And one harness is not enough.
Software engineering has two halves at the level that matters for agent harness design. Software development, where code gets written. Software delivery, where code becomes running software.
The inner loop is software development. Code gets written, edited, tested, and reviewed. Coding agents work here, close to the developer and bounded by the repository. Whether they live in an IDE, a terminal, a background session, or a web workspace doesn’t change what they do. They help one person write better code faster.
The outer loop is software delivery. Code becomes software that is built, tested, secured, deployed, verified, operated, and sometimes rolled back. That includes CI, security scans, deployments, infrastructure, feature flags, incidents, and approvals.
The two loops are different. The inner loop is about individual productivity. The outer loop is about organizational execution under risk. It crosses teams, touches production, uses secrets, enforces policy, and leaves an audit trail.
An agent delivering software can’t be a coding assistant with API access. It has to run inside a system that enforces the organization’s rules.
The stakes are easier to see by starting with what breaks.
Security. An agent with broad access to deploy, provision, and push config changes is a new attack surface. Prompt injection through a PR description, a poisoned dependency, or a malicious issue comment can turn an autonomous agent into the most privileged insider threat in the company. It acts under its own identity, with its own scoped credentials, doing exactly what it’s authorized to do. The attacker just redirects the authorization. Without an identity model and governed execution, every action the agent can take becomes a potential action path for an attacker.
Compliance. An agent that ships code without the same policy gates, approvals, and audit trails humans use creates a parallel path that regulators and auditors will challenge. A single deployment that skipped EU data residency review can trigger a finding that takes quarters to close. Cyber insurers are starting to scrutinize AI governance, and some are exploring exclusions or tighter terms for poorly governed AI. Within a year or two, “we have autonomous agents deploying code without an evidence trail” will be impossible to defend. Autonomous delivery without verification is autonomous liability.
Confident bad decisions. An agent with partial context looks like it’s working. It deploys during a change freeze. It rolls out a config change that breaks an upstream service. It enables a feature flag during an incident. Each failure is locally reasonable and globally wrong. Without the full knowledge graph, the agent keeps making the wrong call.
AI-specific failure modes. Autonomous agents fail in ways that deterministic automation doesn’t. They hallucinate actions, generating and deploying a Kubernetes manifest that doesn’t match reality. They get stuck in loops, rolling back and redeploying the same change until a human kills the process. They’re confidently wrong, proposing a fix that passes a weak policy gate and breaks production an hour later. No attacker involved. Without verification strong enough to catch them, errors reach production.
All of this has happened with deterministic automation, one mistake at a time. With autonomous agents, errors happen in parallel. A coding agent with bad context can push 10 broken PRs in 10 minutes. A delivery agent without verification can deploy 20 services before anyone notices.
Speed used to be the feature. With autonomous agents, speed is also the damage multiplier.
A software delivery agent needs four things: memory, context, tools, and verification. The shape and stakes of each element are distinct.
Suppose a team is shipping a new version of a retailer’s checkout service on Thursday. Checkout depends on payments, inventory, fraud, and identity.
A Software Delivery Knowledge Graph is a connected map of services, teams, pipelines, deployments, incidents, policies, scorecards, and artifacts. Nodes and edges show how they all relate.
To answer “Is checkout safe to ship Thursday?”, the agent has to know which services checkout depends on, what their scorecards look like, whether any have open critical CVEs, whether there’s a change freeze, and who’s on call Thursday night.
That’s a graph query. If the agent doesn’t have the graph, it’s guessing.
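To see why it is a graph query, here is a toy sketch. The nodes, edges, and attributes below are invented for illustration; a real Software Delivery Knowledge Graph covers far more signal.

```python
# Toy dependency graph for the checkout example. Every name and
# attribute here is illustrative, not real Harness data.
graph = {
    "checkout":  {"depends_on": ["payments", "inventory", "fraud", "identity"]},
    "payments":  {"critical_cves": 0, "scorecard": "pass"},
    "inventory": {"critical_cves": 0, "scorecard": "pass"},
    "fraud":     {"critical_cves": 1, "scorecard": "pass"},  # open critical CVE
    "identity":  {"critical_cves": 0, "scorecard": "pass"},
}

def safe_to_ship(service: str, change_freeze: bool) -> tuple[bool, list[str]]:
    """Walk the service's dependencies and collect reasons to block."""
    reasons = []
    if change_freeze:
        reasons.append("change freeze in effect")
    for dep in graph[service]["depends_on"]:
        node = graph[dep]
        if node["scorecard"] != "pass":
            reasons.append(f"{dep}: failing scorecard")
        if node["critical_cves"] > 0:
            reasons.append(f"{dep}: open critical CVE")
    return (not reasons, reasons)

ok, why = safe_to_ship("checkout", change_freeze=False)
# Blocked: fraud carries an open critical CVE, even though checkout
# itself looks fine in isolation.
```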
Memory is the durable map. Context is the live signal. Memory tells the agent how the delivery system is connected. Context tells it what’s happening now.
Back to checkout. The agent sees that a chaos experiment last week showed payments fail when its Redis cache is unavailable. It sees that yesterday’s security scan flagged a critical CVE in a library fraud detection depends on. It sees that the new version changes the same config flag that caused an incident two weeks ago.
None of this is in the pull request. All of it matters.
Context isn’t something you assemble from scratch at runtime. It accumulates in the harness long before the agent is asked to act.
People often assume “tools” means function calls to APIs. For a software delivery agent, it means something different. The agent can deploy to Kubernetes, run a database migration, apply a feature flag, trigger a security scan, run a chaos experiment, open and close an incident. Real actions, inside your network, using your credentials, under your policies, with full audit logging.
At Harness, every action runs through a Delegate: a lightweight worker inside your environment. Your VPC, your Kubernetes cluster, your data center. The agent issues an instruction. The Delegate executes it inside your perimeter and returns the result.
Secrets are decrypted inside the Delegate. Never in the agent’s context window, never in a model provider's memory, never in an audit log.
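The pattern can be sketched in a few lines. The names and the stand-in migration below are hypothetical, but the shape is the point: secrets resolve inside the delegate, and only the outcome travels back.

```python
# Illustrative sketch of the Delegate pattern -- not Harness's actual
# implementation. The secret store lives inside your perimeter.
SECRET_STORE = {"db_password": "s3cr3t"}

def run_migration(target: str, password: str) -> str:
    # Stand-in for the real action executed against your infrastructure.
    return "applied"

def delegate_execute(instruction: dict) -> dict:
    # The agent sends a secret *reference*; the plaintext is resolved
    # here, inside your network, and never enters the agent's context.
    secret = SECRET_STORE[instruction["secret_ref"]]
    status = run_migration(instruction["target"], secret)
    # Only the outcome is returned -- no secret material in the response.
    return {"status": status, "target": instruction["target"]}

# The agent only ever sees the reference and the result:
outcome = delegate_execute({"target": "orders-db", "secret_ref": "db_password"})
```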
An agent with arbitrary production access is dangerous. An agent constrained by governed execution is governable.
This is the pillar coding and personal productivity agents don’t need at this depth. Software delivery agents do.
Three mechanisms make it concrete:
For checkout, the Thursday release is blocked unless the scorecard passes, no critical CVEs are open, no change freeze applies, and an EU compliance approver signs off. If any of those fail, the agent cannot deploy. If they all pass, the deployment runs through a Delegate and an evidence record is written.
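That gate logic can be sketched in a few lines. The check names are illustrative, not Harness's policy syntax; the point is that the deploy action is unreachable until every check passes.

```python
# Hypothetical sketch of a release gate: the agent's deploy request
# only executes if every policy check holds. Check names are invented.
def gate_release(checks: dict[str, bool]) -> str:
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        return "blocked: " + ", ".join(failed)
    return "deploy via delegate; evidence record written"

checks = {
    "scorecard_pass": True,
    "no_critical_cves": True,
    "no_change_freeze": True,
    "eu_compliance_approved": False,  # approver has not signed off yet
}
result = gate_release(checks)  # blocked until the approval lands
```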
The rules of the organization are enforced in the harness. The agent operates inside them.
I mentioned that an agent needs memory, context, tools, and verification. The good news: a modern software delivery platform like Harness already has the foundations, because truly automated delivery has always needed those four things.
A note on our name. We called the company Harness in 2017 because the original thesis was a safety harness for code: let developers move fast without breaking things. Pipelines, policies, approvals, rollbacks, evidence. The scaffolding that lets speed and safety coexist.
That thesis hasn’t changed. The mover has. Developers are still moving fast. AI agents are moving fast too, and faster. The harness has to hold both.
Pipelines aren’t agents. Pipelines are the harness that lets agents safely act. They’re the control plane where agent actions are evaluated, constrained, and executed under policy.
The word “pipeline” carries baggage. Many people hear “script runner.” That isn’t what we mean. Harness pipelines are production orchestration engines: loops, matrix runs, parallel stages, conditions, approvals, OPA gates, rollback, retries, and deterministic-plus-agentic step-chaining.
An agent step can run inside a loop. A deterministic step can pass output to an agent, then to a policy gate, an approval, another agent, and a deployment. The agent isn’t replacing the pipeline. The agent is one kind of step the pipeline already knows how to run.
Harness pipelines execute hundreds of millions of runs a year across enterprise production systems. That isn’t a theoretical runtime for agents. It’s a runtime already hardened at scale, on real delivery, under real policy, with real rollback. That’s the difference between a script runner and a production harness for autonomous action.
The rest of the foundation maps the same way. The Delegate is how actions reach your infrastructure. The Software Delivery Knowledge Graph is the memory. Our platform modules are the tools. Scorecards, policy gates, and signed evidence are the verification. Harness AI, the intelligence layer on top, uses all four of these elements.
We didn’t set out to build an agent harness. We set out to build a software delivery platform with AI at its core. It turns out those two things are the same.
Coding agents (IDE copilots, background agents, terminal-based assistants, cloud coding sessions) are built for a different job. They know your codebase, your style, your recent commits. That’s a real harness, bounded by the repository and the developer. A software delivery harness has different scope, memory, risks, and accountability.
A coding agent’s memory is the repository. A software delivery agent’s memory is the organization.
The context gap. Ask your coding assistant: “Is it safe to deploy this checkout change to production tonight?” It can’t answer. It doesn’t know the current scorecard, the change freeze status, last week’s chaos test results, or who’s on call. None of that lives inside the developer's workspace. A coding agent can write a change. It can’t know if the change is safe to ship.
The blast radius gap. A coding agent’s bad change usually gets caught before it hurts anything: in review, in CI, in a security scan, on a policy gate. Fifteen minutes wasted, not a production incident. A software delivery agent’s worst day is customer data exposure, a production outage, or a regulatory incident. Same agent paradigm, radically different blast radius.
The safety-net gap. Both kinds of agents are moving toward less human oversight. The difference is what catches them when they’re wrong. A coding agent mistake gets caught downstream: by CI, by security scans, by policy gates, by the delivery harness itself. A delivery agent mistake has nothing downstream. It is the downstream.
The control-plane gap. Could a coding agent call Harness as a backend? Of course. It should. But the caller isn’t the control plane. The software delivery harness decides whether the request is allowed, how it executes, and what evidence is retained.
The preference gap. Developers are going to pick their own coding agents. Most enterprises already run two or three: Cursor on some teams, Claude Code on others, Copilot on others, whatever ships next year on yet other teams. That’s healthy. Software development is distributed by design. Software delivery is the opposite: it’s centralized. One company, one delivery control plane. One set of policies, one audit trail, one source of evidence, one place where credentials are held.
The winning pattern is the two meeting cleanly: whichever coding agent the developer picks, the deployment passes through the same delivery harness.
Managed agents. Stateful APIs. Server-side memory. Model providers are extending into harness territory, and for many use cases, that works. For software delivery specifically, the architecture runs into a different set of constraints.
The credentials problem. Every software delivery action requires production credentials: cloud admin roles, Kubernetes service accounts, database passwords, secrets manager keys. The most sensitive assets in the company. Enterprises spend years building the controls around them: vaults, rotation, scoped access, audit trails. A model-provider-hosted agent loop would require those credentials to flow through the model provider’s infrastructure on every action. Few CISOs will approve it. Few auditors will sign off. In regulated industries, it’s often a non-starter.
The inversion. A model can be hosted anywhere. Any provider, any cloud. Execution has to happen inside the enterprise, using credentials that never leave. The model stays outside. The control plane runs inside. Intelligence can live anywhere. The control plane can’t.
The live-state problem. A software delivery agent’s answer to “Is this safe to ship?” depends on a state that changes every minute. The current change freeze. The latest incident. The newest CVE. Who’s on call right now. Whether the deployment window just closed. A model provider can reason about what you put in the prompt. It doesn’t naturally own the current state of your delivery system. A model provider knows the world. The harness has to know your world, right now.
The accountability problem. When a delivery agent does something wrong, the model provider isn’t on the incident bridge. The on-call engineer is. The platform lead is. The CTO is. The company is the one that has to explain the outage to customers, the finding to regulators, the miss to the board. Accountability can’t be outsourced. The harness that constrains the agent can’t be either.
A model provider can be the brain. It can’t be the harness for delivery.
More and more code will be written by AI. The bottleneck is shifting from code generation to safe delivery.
Coding agents help developers write code. Software delivery agents help teams safely deliver and operate it. Two harnesses. Two categories. Two sets of winners.
The foundation for software delivery is ready. The agents that need it are arriving now. The category now has a name.
We’ve always called it Harness. The idea just got bigger.


“We’ve been operating in a hybrid environment with both OpenTofu and Terragrunt, and Harness has made it much easier to bring those workflows together into a single, consistent platform with IaCM. The addition of Terragrunt support is a valuable step toward simplifying how we manage infrastructure at scale.”
— Lead Platform Engineer, Enterprise Customer
Infrastructure as Code is now a standard for modern cloud operations, with most enterprises using IaC to provision and manage environments. However, as adoption grows, so does complexity. Teams are no longer managing a handful of environments. They are operating across multiple regions, accounts, and services, often at massive scale.
This is where traditional approaches begin to fall short.
As organizations scale their infrastructure, Terraform alone is often not enough. Teams adopt Terragrunt to manage complex, multi-environment deployments, but they are often forced to stitch together fragmented tooling that lacks visibility, governance, and consistency.
At Harness, we are changing that.
Today, we are excited to announce native Terragrunt support in Harness IaCM, bringing it to full parity with Terraform and OpenTofu while delivering capabilities that go beyond what is available in standalone tooling. This is more than support: it is about making Terragrunt a first-class citizen of enterprise infrastructure management.
With Harness IaCM, teams can now:

Terragrunt has become a critical layer for managing infrastructure at scale because it simplifies how teams structure and reuse configurations across environments. Harness builds on that foundation with deep, native integration, enabling platform teams to operate with both flexibility and control.
This is especially important for enterprises where a single deployment spans multiple environments and services. Harness abstracts that complexity while maintaining governance, auditability, and consistency.
Terragrunt is part of a broader shift toward multi-tool infrastructure strategies.
Modern teams are no longer standardized on a single IaC tool. Instead, they operate across:

This creates challenges around consistency, visibility, and governance. Harness IaCM is built for this reality. We are evolving IaCM into a unified control plane for multi-IaC workflows, where teams can manage different frameworks with a consistent experience, shared policies, and centralized visibility.
This means:
Instead of managing infrastructure in silos, teams can now operate from a single platform across the entire lifecycle.
The next phase of Infrastructure as Code is not just about supporting more tools. It is about making infrastructure systems more intelligent and automated.
We are investing in two key areas:
We are continuing to support modern frameworks like AWS CDK, enabling developer-centric infrastructure workflows alongside provisioning, configuration, and orchestration tools.
We are introducing intelligence into IaC workflows to simplify tasks such as drift management and optimization. This helps teams reduce manual effort and operate more efficiently at scale.
Together, these investments move IaCM toward a unified, multi-IaC platform that combines flexibility, governance, and automation. Terragrunt has become essential for managing infrastructure at scale, but until now it hasn't had a platform that truly supports it. As infrastructure continues to grow in complexity, our focus remains the same: helping teams move faster, reduce risk, and scale with confidence, no matter which IaC tools they use.


The release of Anthropic Mythos and Project Glasswing marks a pivotal new chapter in software development. As the industry advances, the speed and economics of vulnerability exploitation have fundamentally shifted. What once took weeks of manual reconnaissance can now be scaled rapidly through automated models. However, this is not just a security problem to solve. It is a massive engineering opportunity to build cleaner, more robust systems. By leaning into AI-accelerated defense, engineering teams are uniquely positioned to lead the charge and redesign the landscape of modern software architecture.
To succeed in this new era, the traditional silos separating security and engineering must fall. Defense at machine speed requires a unified front.
The foundation of AI-accelerated defense relies on sound, proactive engineering practices. Developers must take ownership of architectural hygiene from the ground up.
Even with the best architecture, unexpected friction will occur. Resilient engineering means planning comprehensively for your ecosystem.
To keep pace with the increased velocity of engineering teams, security teams must also evolve their operational models.
Engineering leaders and developers are in the perfect position to navigate this industry inflection point. By taking ownership of these structural changes today, you ensure the long-term viability of your products and the enduring strength of your codebase. Bring your security, infrastructure, and engineering teams together into the same room and start building your shared roadmap today.


What happens when your Infrastructure as Code management strategy works perfectly in dev, scales reasonably well in staging, and then quietly fractures across seventeen production workspaces because nobody documented which Terragrunt wrapper goes with which AWS account? You spend Friday afternoon reverse-engineering DRY patterns that made sense six months ago, wondering why your team is managing three different IaC execution engines with four incompatible workflow philosophies.
This scenario isn't hypothetical. It's the reality of organizations that adopted IaC incrementally, layer by layer, without a unified management approach. One team standardized on OpenTofu for new infrastructure. Another maintained legacy Terraform configurations because migration felt risky. A third discovered Terragrunt and used it to wrangle complexity across AWS regions, but now those wrappers exist outside any centralized governance model. Each decision was rational in isolation. Together, they created an orchestration problem masquerading as a tooling problem.
The actual challenge isn't choosing between Terraform, OpenTofu, or Terragrunt. It's managing their outputs, enforcing policy consistently across execution contexts, and ensuring that infrastructure changes don't outpace your ability to understand what's deployed.
Most platform teams don't set out to run multiple IaC tools simultaneously. They inherit Terraform state from acquisitions, adopt OpenTofu for licensing predictability, and introduce Terragrunt because someone needed to stop copying backend configurations across 40 AWS accounts. The tools themselves aren't the problem. The problem is that each tool introduces its own state management assumptions, module resolution logic, and workflow expectations.
Terragrunt, for instance, exists specifically to solve Terraform's verbosity problem. It lets you define backend configurations once and reference them across environments. It supports dependency graphs so you can deploy a VPC before attempting to create subnets. These capabilities are valuable, but they also mean your actual infrastructure logic now spans two layers: the Terraform or OpenTofu code that defines resources, and the Terragrunt configuration that orchestrates execution.
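For readers who haven't used Terragrunt, a minimal sketch shows both ideas, shared backend configuration and explicit dependencies, in one file. The paths, labels, and output names below are illustrative, not a prescribed layout:

```hcl
# terragrunt.hcl for a hypothetical subnets module.
include "root" {
  # Inherit the remote_state/backend block defined once at the repo root,
  # instead of copying it into every environment directory.
  path = find_in_parent_folders()
}

dependency "vpc" {
  # Terragrunt applies ../vpc first and exposes its outputs here.
  config_path = "../vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

The resource code stays in plain Terraform or OpenTofu modules; this layer only decides where state lives, what runs first, and which outputs feed which inputs.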
When you lack centralized Infrastructure as Code management, those layers drift independently. Someone updates a Terragrunt dependency graph without realizing it breaks a downstream workspace. Another engineer modifies an OpenTofu module but forgets that three different Terragrunt configurations depend on its output structure. You don't discover these issues until a deployment fails in production, and the postmortem reveals that nobody had visibility into the full dependency chain.
The typical response to multi-IaC complexity is to standardize on one tool and deprecate the others. That works if you're early in your IaC journey. It's impractical if you're managing hundreds of workspaces across regulated environments where compliance audits expect immutable infrastructure definitions and audit trails for every state change.
Here's what actually happens: platform teams create custom CI/CD pipelines for each tool. Terraform runs in Jenkins. OpenTofu runs in GitHub Actions. Terragrunt configurations use a shell script someone wrote during an incident. Each pipeline implements drift detection differently. Policy enforcement exists as scattered OPA rules that don't share a common evaluation context. When an auditor asks, "How do you prevent unapproved infrastructure changes?", the honest answer is, "We run some checks in some places, and we hope teams remember to use them."
This isn't negligence. It's what emerges when Infrastructure as Code management tooling doesn't natively support the reality of polyglot IaC environments. Teams need a system that treats OpenTofu, Terraform, and Terragrunt as execution details, not architectural boundaries. The workflow layer—plan generation, policy evaluation, approval gates, state locking—should remain consistent regardless of which engine interprets the configuration.
Running `terragrunt apply` successfully doesn't mean your infrastructure is well-managed. It means Terragrunt successfully invoked OpenTofu or Terraform and applied a configuration. The actual management work—validating inputs, enforcing cost policies, detecting drift, promoting changes through environments—exists outside the execution layer.
This is where most homegrown solutions collapse under their own weight. You build a wrapper script that runs Terragrunt with the right flags. Then you add pre-commit hooks for policy checks. Then you integrate Sentinel or OPA, but only for workspaces that someone remembered to configure. Then you add Slack notifications so people know when drift occurs, but the notifications don't include enough context to act on them. Eventually, you have a Rube Goldberg machine that works until it doesn't, and debugging requires institutional knowledge that exists in one person's head.
The fundamental issue is that IaC workflow optimization requires thinking beyond execution engines. You need orchestration that understands module dependencies, workspace relationships, and policy boundaries. You need variable management that doesn't require copying YAML files between repositories. You need drift detection that runs automatically and surfaces meaningful deltas, not raw Terraform output dumped into a log file.
Treating Terragrunt as an afterthought—something teams bolt onto existing Terraform or OpenTofu pipelines—misses its architectural intent. Terragrunt exists because managing backend configurations, passing outputs between modules, and orchestrating multi-account deployments shouldn't require copying boilerplate across dozens of directories. When Infrastructure as Code management platforms support Terragrunt natively, they acknowledge this reality: the DRY principle applies to infrastructure orchestration, not just resource definitions.
Native Terragrunt support means the platform understands dependency graphs without requiring custom parsing logic. It means workspace templates can reference Terragrunt configurations directly, rather than forcing teams to flatten everything into monolithic Terraform modules. It means policy enforcement applies before Terragrunt invokes the underlying execution engine, catching invalid configurations before they generate failed plans.
This matters most in organizations running multi-region or multi-cloud architectures. A typical pattern: one Terragrunt configuration defines networking across AWS regions, another manages Kubernetes clusters, a third provisions databases. Each configuration depends on outputs from the others. Without native orchestration, teams either write brittle shell scripts to sequence these dependencies or accept that deployments sometimes fail halfway through because someone applied changes out of order.
The real test of an Infrastructure as Code management platform isn't whether it runs OpenTofu or Terraform. It's whether it provides consistent state visibility, policy enforcement, and audit trails across both. If your platform requires separate workflows for each execution engine, you've automated the mechanics but not the governance.
Consider policy evaluation. A reasonable security requirement: no S3 buckets should allow public read access. With fragmented tooling, you implement this rule multiple times. Once for Terraform workspaces using Sentinel. Again for OpenTofu configurations using OPA. A third time for Terragrunt-managed infrastructure, where you're not sure which policy engine applies because Terragrunt is just orchestrating calls to Terraform or OpenTofu. When an audit occurs, you can't prove consistent enforcement because there's no unified policy evaluation layer.
The same fragmentation affects drift detection. Terraform Cloud detects drift for Terraform-managed resources. Your OpenTofu workspaces might run scheduled reconciliation jobs, or they might not—it depends on whether someone configured them. Terragrunt configurations drift silently unless you've built custom tooling to periodically run `terragrunt plan` and parse the output. The result: partial visibility across your infrastructure estate, where "managed by IaC" becomes aspirational rather than descriptive.
Organizations exploring Terraform alternatives often focus on licensing or community governance. Those considerations matter, but they don't address the operational question: how do you manage infrastructure deployed with multiple execution engines without creating parallel workflow systems?
OpenTofu integration means more than "we can run OpenTofu commands." It means workspaces provisioned for OpenTofu behave identically to Terraform workspaces at the orchestration layer. Variable sets apply consistently. Policy evaluation uses the same rule sets. Drift detection runs on the same schedule. Approval workflows follow the same governance model. The execution engine becomes an implementation detail, not a workflow boundary.
This distinction matters during migrations. Teams don't flip entire infrastructure estates from Terraform to OpenTofu overnight. They migrate incrementally, starting with non-critical workspaces and expanding as confidence grows. If your Infrastructure as Code management platform treats each engine as a separate silo, you're managing two parallel systems during the transition. If the platform abstracts execution details behind a unified orchestration layer, the migration becomes a configuration change, not an architectural overhaul.
The hard problems in infrastructure management aren't technical; they're organizational. How do you ensure that 40 engineers across six teams follow the same approval process for production changes? How do you enforce cost policies without blocking legitimate deployments? How do you maintain audit trails that satisfy compliance requirements without turning every infrastructure change into a bureaucratic ordeal?
IaC orchestration platforms solve these problems by decoupling policy from execution. Instead of embedding governance rules in CI/CD pipelines—where they're invisible, untestable, and easy to bypass—you define them once at the platform level. Instead of writing custom scripts to sequence Terragrunt dependencies, you describe the dependency graph declaratively and let the platform handle execution order. Instead of building bespoke drift detection logic, you configure detection schedules and let the platform surface meaningful deltas.
This approach doesn't eliminate complexity. It consolidates complexity into a layer designed to manage it. Your IaC configurations remain simple: modules that define resources, Terragrunt wrappers that eliminate boilerplate, workspace configurations that specify execution context. The orchestration platform handles everything else: state locking, policy evaluation, approval workflows, audit logging, drift remediation.
Harness Infrastructure as Code Management approaches these challenges by treating the execution engine as a deployment detail, not an architectural constraint. Whether you're running OpenTofu, Terraform, or Terragrunt, the orchestration layer remains consistent: standardized pipelines for plan generation and apply operations, unified policy enforcement across all workspaces, centralized drift detection that surfaces actionable insights.
For teams managing infrastructure across multiple clouds, regions, or execution engines, Harness IaCM provides the orchestration layer that makes polyglot IaC environments manageable. The platform doesn't force you to standardize on a single tool. It provides governance, visibility, and workflow consistency regardless of which engine interprets your configurations.
The promise of Infrastructure as Code—reproducible deployments, version-controlled infrastructure, collaborative development—only materializes when you have consistent orchestration across execution engines. Running Terraform in one pipeline, OpenTofu in another, and Terragrunt through a shell script doesn't scale. It creates workflow fragmentation that defeats governance and slows teams down.
Effective Infrastructure as Code management platforms abstract execution details behind unified workflows. They treat Terragrunt as a first-class orchestration primitive, not an afterthought. They provide native support for OpenTofu alongside Terraform, recognizing that organizations migrate gradually, not overnight. Most importantly, they enforce policy, detect drift, and maintain audit trails consistently across all workspaces, regardless of which engine runs the actual infrastructure changes.
The technical lesson: orchestration complexity belongs in platforms designed to manage it, not scattered across custom scripts and fragmented CI/CD pipelines. The operational lesson: governance doesn't slow teams down when it's embedded in the workflow rather than bolted on afterward. Multi-IaC environments are manageable when you have the right orchestration layer. Without it, you're just running tools in parallel and hoping they don't conflict.
Explore how Harness Infrastructure as Code Management handles multi-IaC orchestration, or review the technical documentation for implementation details. The product roadmap outlines upcoming capabilities for workflow optimization and policy enforcement.




Most development teams today build everything around Git, and deploy with GitOps principles.
Code sits in version-controlled repositories, changes go through PRs, and deployments are handled through modern CI/CD. That part is pretty standard at this point, especially when using a modern DevOps platform like Harness.
MongoDB fits into that developer world and workflow pretty naturally. Data is stored in documents that look a lot like JSON, the format many developers already use in application code and APIs. Under the hood, MongoDB stores those documents as BSON, which is essentially a binary form of JSON that supports additional data types like dates, object IDs, and binary data. That means developers get a familiar model to work with, while MongoDB gets a format that is efficient for storing and querying application data.

Looks just like JSON, with native types like ObjectId and dates powered by BSON.
The tradeoff is that structure isn’t always defined upfront. Schemas change over time, and not always in a clean or consistent way.
Collections can contain documents with different shapes. Index changes can directly impact performance. These aren’t problems on their own, but they require discipline to manage safely.
MongoDB changes are often handled outside the standard development workflow, whether that’s by developers, platform teams, or database teams.
Teams rely on application-level updates or one-off scripts to backfill data, modify structures, or create indexes. These approaches work, but they’re not always consistently versioned in Git. Execution can vary across environments, and review or validation is often informal.
The result is limited visibility into what changed, when it changed, and how it was applied. Over time, that leads to inconsistencies between environments and increased risk during deployment.
Flexibility is powerful, but without proper controls it introduces risk.
To solve this, teams need to bring MongoDB changes into the same workflow they already trust for application code: Git-driven, reviewable, and automated.
GitOps for MongoDB isn’t about changing how Mongo works. It’s about changing how changes are managed.
Instead of handling updates through scripts or application logic alone, database changes are treated like application code. Index creation, schema validation rules, and migration scripts are all defined in Git and tracked over time. This includes MongoDB’s native schema validation rules, which can be versioned and applied consistently across environments.
Changes need to go through pull requests, just like any other code change. This allows developers, platform teams, and DBAs to review what’s being modified before anything runs in an environment.
From there, pipelines handle the validation and deployment. Changes are applied consistently across environments, rather than being run manually and potentially differently each time.
In practice, this means a new field, an index, or a backfill isn’t just a script someone runs once. It’s a versioned change that can be reviewed, tested, and repeated.
This isn’t about forcing rigid schemas onto MongoDB. It’s about making changes visible, consistent, and easier to manage as systems grow.
Harness DB DevOps provides the structure to do this. With Harness, changes are defined as changesets, stored in Git, and deployed through pipelines with built-in validation and policy checks.
To demonstrate how this works, we will walk through a practical MongoDB change from start to finish.
Here’s a simple example: A team needs to add a new userPreferences field to the users collection and create an index to support a new query.
Instead of writing a script and running it manually, we define the change and commit it to Git.

1. Define the change in Git
A developer creates the update as a changeset. That includes the logic to add or backfill the new field, along with the createIndex operation needed for performance. The change is committed alongside application code, like any other update.
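As an illustration only, assuming the Liquibase-style changeset format that DB DevOps tooling commonly builds on, with the MongoDB extension, such a change might look like the following; the id, author, and key names are hypothetical:

```yaml
# Illustrative changeset sketch; syntax assumes the Liquibase MongoDB extension.
databaseChangeLog:
  - changeSet:
      id: users-add-preferences
      author: jane.dev
      changes:
        # Backfill the new field on existing documents.
        - runCommand:
            command: '{"update": "users", "updates": [{"q": {"userPreferences": {"$exists": false}}, "u": {"$set": {"userPreferences": {}}}, "multi": true}]}'
        # Create the index that supports the new query.
        - createIndex:
            collectionName: users
            keys: '{"userPreferences.theme": 1}'
```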
2. Open a pull request
From there, the change goes through a pull request. Other developers or DBAs can review what’s being changed before anything runs. If something looks off, it gets caught here instead of in production.
3. Let the pipeline take over
Once the change is approved, the pipeline takes over.
The Pipeline

Before anything gets applied, the change is validated and previewed against the target environment. This helps catch issues early, whether it’s a conflict, a bad query pattern, or something that could impact performance.
This is especially important for heavy operations like index creation on massive collections, where resource contention and performance degradation are real risks. Instead of running those changes manually, pipelines can enforce safe rollout strategies like rolling index creations across replica sets, without manual intervention.
Policies are enforced as part of that same process, with required approvals, environment rules, and other guardrails checked automatically so teams aren’t relying on someone to manually verify every step.
Once everything passes, the change is deployed through the pipeline and applied consistently across environments, moving from dev to staging to production in a controlled way. No one is logging into a database to run scripts by hand.
Now, everything is tracked. You can see what was applied, where it was deployed, when it happened, and who approved it, with a full history available if something needs to be reviewed or rolled back later.
Sound familiar? This workflow should sound a lot like application delivery, where changes are versioned, reviewed, validated before deployment, and visible after.
Traditionally, database changes have been tightly controlled by DBAs. They review scripts, approve changes, and sometimes execute them manually in each environment. That model helps reduce risk, but it doesn’t scale as teams grow and release more frequently.
With a GitOps approach, that control doesn't disappear; it moves earlier in the process.
Instead of reviewing every individual change, database teams define policies and standards up front. Those rules are then enforced automatically through pipelines. Every change must pass the same checks before it reaches an environment, without requiring manual intervention each time.
In practice, this means:
The role of the database team evolves from gatekeeper to system designer. Rather than being involved in every deployment, they define the guardrails that ensure every deployment is safe.
Developers still move quickly, but now within a controlled, repeatable system.
Bringing MongoDB into a Git-driven workflow changes how teams ship.
MongoDB's flexibility doesn't eliminate the need for structure; it just shifts the responsibility for maintaining consistency from the database itself to your development processes.
If your application is managed through Git, your database should be too.


If you've ever run an ALTER TABLE on a busy MySQL table in production, you know the feeling. The change is small. The risk isn't. Long-running table locks, queued writes, application timeouts, replication lag, a five-minute migration that turns into a half-hour incident review.
We're shipping an integration that takes that anxiety out of the loop. Harness Database DevOps now supports Percona Toolkit for MySQL as part of Liquibase-based schema management. Flip a checkbox at schema creation, and eligible changes execute through pt-online-schema-change instead of native MySQL DDL.
Native ALTER TABLE on MySQL can lock tables for as long as the change takes to apply. On a large or hot table, that means writes pile up, dependent services start timing out, and replicas fall behind.
Percona Toolkit handles the same change very differently. pt-online-schema-change creates a shadow table with the new schema, copies your data over in small chunks, uses triggers to keep the original and shadow tables in sync, then performs an atomic swap with minimal lock time. The practical upside: schema changes you can run during business hours, not at 2 AM with a runbook open.
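To make the mechanics concrete, here is a sketch of how a pipeline step might shell out to the tool. The flags shown (`--alter`, `--dry-run`/`--execute`, the `D=...,t=...` DSN) are standard Percona Toolkit options; the wrapper function itself is illustrative, not part of the Harness integration:

```python
# Illustrative wrapper that builds a pt-online-schema-change invocation.
import shlex

def build_ptosc_command(database: str, table: str, alter_sql: str,
                        execute: bool = False) -> list:
    cmd = ["pt-online-schema-change",
           "--alter", alter_sql,
           f"D={database},t={table}"]
    # pt-osc refuses to run without an explicit --dry-run or --execute
    cmd.append("--execute" if execute else "--dry-run")
    return cmd

cmd = build_ptosc_command("shop", "orders", "ADD COLUMN discount INT")
print(shlex.join(cmd))
```

A dry run first is the usual practice: it validates the change against the live table without copying any data.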
The integration is enabled per schema. When you create a Database Schema in Harness DB DevOps:
That's it. With the box unchecked (the default), Harness DB DevOps applies your changelogs using native MySQL operations through Liquibase, exactly as before. Check it, and eligible changes route through Percona Toolkit instead.
Percona Toolkit isn't a silver bullet for every DDL. A few cases need extra thought.
Adding or dropping foreign keys can break during the table swap, so plan those changes carefully or apply them outside the toolkit. Tables without a primary key or unique index won't migrate safely either, since pt-online-schema-change needs one to chunk data deterministically. And a handful of specific operations sit outside the safe-change envelope: dropping a primary key, complex column reordering, and some storage engine swaps.
You'll also want to give the database user the right privileges: ALTER, SELECT, INSERT, and UPDATE on the target table, plus CREATE and DROP on the database for shadow table management.
The full list of supported patterns, edge cases, and required permissions is in the Harness DB DevOps docs.
If you're already running Harness DB DevOps for MySQL, the next schema you create is a good place to try this. Turn it on against a non-critical environment first, watch how it behaves on your workload, and the path to using it in production gets a lot shorter.
For teams running MySQL at scale, that's one fewer reason to schedule schema changes around your customers' sleep.
If you aren't already using Database DevOps, speak with our experts about how you can achieve zero-downtime database schema migrations.
Your production problems aren't just random. If a Kubernetes node fails every 72 hours or your CI runners crash every 4 builds, that's a clear pattern. Mean Time to Failure (MTTF) turns these failures into data that you can control, plan for, and improve over time.
For platform engineering leaders, MTTF should not be a decoration on a dashboard; it should be a decision-making tool. With the right calculations, you can set realistic SLOs, plan capacity, and cut down on developer toil by focusing on the components that break most often. Below you'll find exact formulas for distributed systems, data collection patterns that avoid common mistakes, and a playbook for turning reliability improvements into measurable ROI through automated resilience practices and faster recovery metrics.
Stop letting unpredictable failures drain your team's time and budget. With Harness Continuous Integration and Continuous Delivery, you can turn MTTF insights into concrete pipeline changes, progressive delivery strategies, and guardrails that keep reliability improving release after release.
Mean Time to Failure (MTTF) is the average operating time of non-repairable components before failure across a population.
At a basic level:
MTTF = total operating time ÷ number of failures
If 100 CI runners each run for 50 hours during a week (5,000 runner‑hours total) and 20 runners experience at least one hard failure, then:
MTTF = 5,000 ÷ 20 = 250 hours
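The runner example above can be written as a small helper: total operating hours divided by the number of instances that failed.

```python
# MTTF = total operating time / number of failures, per the formula above.
def mttf(total_operating_hours: float, failures: int) -> float:
    if failures == 0:
        raise ValueError("no failures observed; MTTF is undefined (censored data)")
    return total_operating_hours / failures

runner_hours = 100 * 50   # 100 CI runners, 50 hours each = 5,000 runner-hours
print(mttf(runner_hours, failures=20))  # → 250.0
```

The zero-failures guard matters in practice: a window with no observed failures is censored data, not evidence of infinite MTTF.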
Historically, MTTF is used for physical assets you replace instead of fix (light bulbs, disks, sealed devices). In software, the same concept fits ephemeral resources such as:
MTTF tells you how long things run, on average, before they fail and must be replaced. MTTF is an approximation, not a strict reliability model.
Three reliability metrics show up in every platform review:
Use them to answer different questions:
For example:
Your platform scorecards should display all three together, alongside SLO health and error budget burn, so teams see the full reliability picture instead of optimizing a single metric in isolation.
The theoretical rules around MTTF and MTBF are straightforward; the ambiguity comes when you apply them to real cloud‑native stacks. Concrete examples help.
These components typically behave like non‑repairable items:
For each of these, you can treat a single lifecycle (from start to failure/termination) as one observation in your MTTF dataset.
These components behave more like classic repairable systems:
For these, you care more about how much uptime you get between failures (MTBF) and how quickly you can restore full health (MTTR).
It is tempting to say “our nodes have an MTTF of 720 hours, so our service is very reliable.” That is only true if your architecture masks those failures from users. User‑facing reliability lives at the service boundary, measured via SLOs and error budgets; component MTTF is an input that helps you:
MTTF helps you understand where things break; SLOs and MTTR tell you how much that matters to customers.
The MTTF calculation is trivial. The work is in collecting honest data across a distributed system without losing important details.
For each component type, decide exactly what counts as “failed,” for example:
Document these in your platform taxonomy so every team logs and reports failures the same way.
For each instance in the population you’re measuring, capture:
Then compute:
MTTF = total operating time across all instances ÷ number of failed instances
This gives you MTTF for that class (e.g., “Linux GPU runners in prod”).
Never pool dissimilar components into a single MTTF number. Instead:
Example:
Fleet MTTF (weighted) = (1,000 + 100) ÷ (5 + 1) ≈ 183 hours, not the naive (200 + 100) ÷ 2.
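The per-class versus naive averaging point can be sketched directly: compute MTTF per component class, then weight the fleet number by exposure rather than averaging the class MTTFs. The class names are illustrative; the figures match the example above.

```python
# Per-class MTTF, plus an exposure-weighted fleet MTTF.
classes = {
    # class name: (total operating hours, failures) - illustrative labels
    "web-nodes": (1_000, 5),    # class MTTF = 200 h
    "gpu-runners": (100, 1),    # class MTTF = 100 h
}

per_class = {name: hours / fails for name, (hours, fails) in classes.items()}
total_hours = sum(h for h, _ in classes.values())
total_fails = sum(f for _, f in classes.values())
fleet_weighted = total_hours / total_fails

print(per_class)              # per-class MTTFs: 200.0 and 100.0
print(round(fleet_weighted))  # → 183, not the naive (200 + 100) / 2 = 150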
Some instances will still be running when you take the snapshot. If you drop them:
When censored samples are common, use basic survival analysis (like Kaplan–Meier) so that "still running" instances add to the exposure instead of being thrown away. If you give them clear timestamps and labels, observability tools and data teams can usually take care of this for you.
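The simplest censoring-aware version of the calculation looks like this: still-running instances contribute their hours to exposure but not to the failure count, instead of being dropped. (A real analysis would use Kaplan-Meier; the numbers here are made up for illustration.)

```python
# Exposure-based MTTF that keeps right-censored (still running) instances.
observations = [
    # (hours observed, failed?) - False means still running at the snapshot
    (120, True), (80, True), (200, False), (150, False), (90, True),
]

exposure = sum(h for h, _ in observations)                 # 640 hours
failures = sum(1 for _, failed in observations if failed)  # 3

print(exposure / failures)  # ≈ 213 hours, vs. ~97 if censored runs were dropped
```

Dropping the two still-running instances would more than halve the estimate, which is exactly the pessimistic bias described above.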
MTTF becomes strategically important when you use it to shape SLOs, error budgets, and reliability investments, not just track uptime.
If a class of components has an MTTF of 72 hours, a single instance will fail about:
8,760 hours/year ÷ 72 ≈ 121 failures/year
With multiple instances and redundancy, not every failure becomes a user‑visible incident, but you can still estimate:
MTTF highlights which components generate excessive manual work:
Use this to:
Because MTTF underpins incident rates, any improvement can be tied to measurable gains:
Treat MTTF as a leading indicator: when you raise it on critical components, you should see downstream improvements in SLO attainment and delivery cadence.
Once you know which components have the lowest MTTF and the highest operational cost, you can systematically improve them. In modern delivery pipelines, four patterns tend to pay off quickly.
Flaky CI is one of the most common sources of low MTTF and wasted engineering time.
You can improve CI‑related MTTF by:
Result: higher MTTF for pipelines and runners, fewer broken builds, and fewer interruptions for developers.
You cannot prevent every bad change, but you can limit how many become full‑blown incidents that count against your service‑level MTTF.
Key tactics:
This keeps effective MTTF for user‑facing services higher, even if underlying components still fail regularly.
Many MTTF regressions start as “just one more config change” that slips past informal reviews. Prevent those with:
This ensures the MTTF gains you’ve earned are not eroded by ad‑hoc changes and one‑off exceptions.
To sustainably raise MTTF, you need confidence that your architecture and runbooks can handle real failures, not just happy‑path tests.
By running targeted chaos experiments on the components with the lowest MTTF, you can:
When failures happen, MTTF tells you how often they occur. AI‑powered automation helps you decide what to do next—fast—so more failures stay under control and never become major incidents.
Harness AI‑assisted deployment verification analyzes metrics and logs during and after each deployment:
The result is fewer deployments turning into user‑visible failures and a higher effective MTTF for your services, because many problematic changes are automatically rolled back before customers notice.
On the CI side, AI‑driven analysis works with Test Intelligence and analytics to:
SLOs and error budgets turn raw data into rules. Instead of making teams watch dashboards and make decisions on their own, you can:
This closes the loop: MTTF informs SLO design, SLOs define the guardrails, and AI-powered verification and rollbacks act on those guardrails at machine speed.
Want to turn MTTF insights into automated reliability improvements?
Explore Harness CI/CD to reduce failure rates, enforce guardrails, and improve SLO performance.
MTTF can feel abstract until you have to justify reliability decisions or explain incident patterns to stakeholders. These FAQs break down the most common questions practitioners ask about MTTF and how it relates to other reliability metrics.
MTTF is the average lifetime of non-repairable components, like pods or ephemeral CI runners, before they fail and are replaced. MTBF measures how long repairable systems, like databases or long-running services, stay up between failures before they must be restored.
Use MTTF when you need to know how often failures happen so you can plan for redundancy or auto-healing. Use MTTR to measure how quickly you can restore user-facing services after they go down. The two metrics complement each other and typically inform SLO and error-budget decisions together.
MTTF estimates are very uncertain when there aren't many failures. To make the number more reliable, put similar workloads together, add up the exposure hours for each class, and think of MTTF as a range or trend instead of a single point. If a part didn't fail in your window, don't assume that it will never fail; instead, treat that as incomplete data.
MTTF is most often skewed by dropping instances that are still running when the measurement is taken (right-censoring), by pooling environments (staging, load, and production) into one metric, and by inconsistent or unclear definitions of failure across teams. Fixing these problems usually improves MTTF's usefulness more than any advanced statistical technique would.
MTTF breaks down when failures are very rare or when you're measuring systems that are repaired rather than replaced. In those cases, MTBF and MTTR, viewed through SLOs and error budgets, usually give better guidance than a single MTTF value.
When the MTTF is higher on important parts, there are fewer problems, fewer pages, and less time lost by developers fixing them. You can link improvements directly to faster safe release velocity, lower downtime risk, and lower operational costs when you combine MTTF with SLOs, error budgets, chaos engineering, and AI-powered automation.


Modern application security spans from code to runtime. Vulnerabilities are found at every stage of the software development lifecycle (SDLC) - in the code developers write, open source packages they pull in, container images they build, and cloud infrastructure where it all runs. But finding vulnerabilities is no longer enough. With attack surfaces sprawling across pipelines, registries, and production environments, the harder problem is fixing the vulnerabilities that actually matter.
Understanding what’s important increasingly depends on correlating multiple data points. A critical CVE buried in a dependency looks very different depending on whether the vulnerable function is actually reachable, the library is used in production, or the affected service is internet-facing. Without runtime context, security and development teams are often left triaging noise instead of actually reducing risk. And fixing vulnerabilities discovered in production can be challenging without being able to follow the trail back to the repo and line of code where the vulnerability can be found.
No matter where application security lives in your organization - and increasingly, it lives in more than one place - Harness and Wiz are working together to make sure you're covered. Whether your team is shifting left from cloud security or pushing right from the development pipeline, integrating Harness and Wiz brings code and runtime findings together so you always have the context you need to act.
Application security used to have a clear owner. The AppSec team ran the scanners, triaged findings, and created tickets for developers. But "shift left" has been pushing security earlier into the development process, and ownership has been migrating toward the teams that actually write and ship code. Today, the DevSecOps or platform engineering team owns application security tooling in many organizations. They're the ones who know exactly where a vulnerability lives in code, who owns it, and how to get developers to fix it.
But as applications move to the cloud, cloud security and infrastructure teams have a stake in application security outcomes as well. They're the ones with visibility into what's actually running in production - what's internet-facing, what's over-privileged, what's actively being exploited. Cloud security platforms have expanded their focus from purely infrastructure and runtime back through the SDLC to code. For many cloud teams, application security isn't a handoff; it feeds into their cloud risk picture.
The result is that application security now has multiple stakeholders with different vantage points. DevSecOps teams see risk through the lens of the CI/CD pipeline and the developer workflow. Cloud security teams see it through the lens of the deployed environment and the blast radius of a breach. Neither view is complete on its own. The good news is that these teams don't have to choose between their tools or their workflows. They need integration that lets each team work in their context while sharing the signals that make both more effective.
DevSecOps teams need to expand right. SAST and SCA tools often generate more findings than any team can fix. Runtime context helps separate signal from noise. Knowing that a vulnerable service is actively internet-facing or that a dependency with a critical CVE is actually loaded in production changes how a team prioritizes. Without it, developers are left triaging based on CVSS scores alone. With it, they can focus effort where exposure is real and the risk in production is highest.

Harness Security Testing Orchestration (STO) makes it easy to orchestrate Wiz Code across your CI/CD pipelines. With a pre-built integration, you can deploy Wiz Code in just a few clicks instead of needing to create a custom integration or write custom scripts. Harness orchestrates Wiz Code alongside all your other scanners so you know your pipelines always get the required security tests, without needing to manually coordinate multiple tools.

Once Wiz Code is integrated, STO aggregates findings with other scanners in your pipeline, automatically deduplicating vulnerabilities so teams aren't triaging the same issue twice. The consolidated view means developers and security engineers can see the full picture in one place, understanding pipeline-level risk and assigning tickets to developers. In addition, Harness Policy as Code lets teams define and take action at the pipeline level instead of tool by tool, so decisions about what to fail a build on, what to flag for review, and what to pass through are applied consistently and holistically across every scan and pipeline.
Cloud security is pushing left - past runtime, past containers, all the way back to the code and open source packages that vulnerabilities originate from. The driver is enabling action. A misconfigured cloud resource or a vulnerable container image is more actionable when you can tie it back to the specific dependency introduced in a pull request, the developer who owns the code, and the pipeline that shipped it. Runtime findings without code context are just alerts. With code context, they become actionable work items that can be routed to the right person and fixed at the source.
Wiz Application Security Posture Management (ASPM) is designed to aggregate findings from across the SDLC and correlate them with runtime context - what's deployed, what's exposed, and what's actually at risk. By integrating Harness SAST and SCA scanner findings directly into Wiz, cloud security teams can connect the dots between a vulnerable open source package or insecure code pattern and the running workloads it affects. That correlation is what turns a list of CVEs into a prioritized risk picture that reflects what's actually happening in production.
For cloud security teams already working in Wiz, this integration means Harness SAST and SCA become part of their existing workflow rather than a separate tool to check. Code-level findings surface alongside runtime signals in the same platform where cloud risk is already being managed, analyzed, and acted on. Teams get broader coverage without adding friction, and the context that makes those findings meaningful - reachability, exposure, business criticality - is already there when they need it.
DevSecOps and cloud security teams are generally not competing - they're looking at risk from different angles. One team lives in the development pipeline; the other lives in the cloud. Both need visibility into what the other sees to do their jobs well. When those views are siloed, findings get duplicated, priorities diverge, and the vulnerabilities that matter most fall through the cracks between teams.
Harness and Wiz close that gap from both directions. DevSecOps teams get runtime signals from Wiz Code inside the pipeline context where they already work, so they can prioritize fixes based on real-world exposure. Cloud security teams get code-level findings from Harness SAST and SCA inside the risk context where they already work, so they can trace production risk back to its source. Each team keeps their workflow. Both teams get the full picture.
The right combination of these integrations depends on how your organization is structured, where application security ownership sits today, and where you want it to go. If you're a Wiz customer evaluating how Harness SAST and SCA fit into your security program, or a Harness customer looking to bring runtime context into your pipelines, contact your Harness account team to understand how you can map the integrations to your specific environment.


Application security testing tools promise coverage and accuracy, but teams often struggle just to get started. One of the biggest friction points in dynamic application security testing is configuring authentication correctly so a scanner can even access a target application, let alone API endpoints that power the functionality.
Whether it’s API keys, bearer tokens, or custom auth flows, setting up authentication for scans frequently requires trial-and-error and engineering support. This reality of scanning configuration slows down security validations, delays insights, and makes it difficult to integrate with AI-driven tooling that depends on fast, accurate access to API endpoints.
Today, we’re excited to introduce AI-Powered Custom Authentication Generation—a new capability designed to eliminate this friction and help teams move from setup to security insights faster than ever.
With this release, teams can now generate and refine authentication configurations using natural language and LLMs. Instead of manually configuring authentication logic or relying on additional support, users can simply describe their requirements and let AI handle the rest.
The average time to configure authentication for API security testing is measured in seconds, whereas older manual approaches can take hours and require extensive trial-and-error.
Here are a few highlights:
Authentication setup has long been one of the most frustrating parts of security scanning. Access control mechanisms are already complex due to security hardening used to protect applications and APIs. Successfully automating authentication flows so a machine can access an app or endpoint raises the bar substantially.
Some of the common pain points include:
What should be a simple prerequisite, gaining authenticated context into an application, becomes a major bottleneck to dynamic application security testing.
The new AI-powered authentication feature in Harness API Testing removes these barriers entirely by reworking how authentication config is created and managed.
Users can navigate to the authentication configuration page, select the custom option, and simply describe what they need. For example:
“Generate an API key-based authentication hook where the token <token> is injected into the request header <authorization>.”
With a single click on “Generate with AI,” the system produces a complete, ready-to-use authentication script. This functionality eliminates the need to write or stitch together configurations manually.
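The generated script's exact shape is product-specific, but conceptually the prompt above asks for something like the sketch below: a hook that injects an API key into a named request header before each scan request is sent. The names here (`inject_auth`, `API_TOKEN`, `HEADER_NAME`) are illustrative, not Harness-generated output.

```python
# Illustrative API-key injection hook of the kind the prompt describes.
API_TOKEN = "example-token"      # in practice, pulled from a secret store
HEADER_NAME = "Authorization"    # the header named in the prompt

def inject_auth(headers: dict) -> dict:
    """Return a copy of the request headers with the API key injected."""
    authed = dict(headers)
    authed[HEADER_NAME] = API_TOKEN
    return authed

print(inject_auth({"Accept": "application/json"}))
```

Returning a copy rather than mutating the caller's headers keeps the hook safe to apply repeatedly across scan requests.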

The feature supports a range of common authentication mechanisms, including:
This flexibility ensures teams can quickly configure access regardless of how their application or API is secured. Learn more details about the supported authentication types.
Authentication requirements often evolve. Instead of starting over, users can iteratively refine their configurations using natural language prompts.
For example, if you want to change how credentials are injected into the auth flow, you can simply say:
“Change the injection type to header name.”
By selecting “Refine with AI,” the system updates the existing configuration accordingly—no manual edits required.
Every AI-generated or modified configuration includes inline comments that explain what changed. These comments make it easier for teams to:

Additionally, no credentials are stored in logs or persisted in prompts. Any sensitive authentication material is masked and encrypted at rest.
By reducing setup errors and simplifying authentication configuration, this Harness API Testing feature directly improves scan success rates. Teams can spend less time troubleshooting authentication issues and more time analyzing real security findings.
This release is more than just a usability improvement. It’s a foundational step towards enabling AI-driven security workflows.
By removing the friction of authentication setup, teams can:
Ultimately, this translates to a faster time-to-value and a more scalable approach to dynamic application security testing.
AI-Powered Custom Authentication Generation is available immediately with your existing Harness subscription. You can find related technical documentation here.
Current Customers: Log in to your dashboard today to start generating authentication configurations with AI.
New to the Platform? If you aren't yet protected, contact us to schedule a personalized demo.


There is a version of the Legal team that exists in most companies: thorough, careful, and quietly overwhelmed. Good lawyers are spending their days on tasks that really should not require a lawyer at all.
We decided early on that this was not the team we wanted to be.
At Harness, the AI-first approach is not just saved for the engineering team. It is how every team operates, including Legal. That means we stopped asking “should we use AI?” a long time ago and started asking “how do we build with AI?” The result is a Legal team that does not just use AI tools. It develops them. We close faster, advise smarter, and frankly, have more fun doing it!
Every tool in our stack has a job. Here is what that looks like in practice:
The honest answer: the relationship between Legal and the rest of the business.
Turnarounds that used to take days take hours. Quality has gone up, not down. And because teams can self-serve answers to routine questions through our Legal Playbooks, the requests that do reach us are the ones that genuinely need us. We spend less time being a checkpoint and more time being a partner. That is a different job, and a more meaningful one.
Moving fast with AI does not mean being reckless about it:
What makes this more than a policy is the culture around it. We run regular sessions where the team shares what they are learning: tools worth trying, prompting approaches that actually work for legal drafting, and ways to get more out of what we already use. When one person figures something out, everyone benefits. That collective curiosity is what stops this from becoming shelfware and keeps it genuinely evolving.

If this is what Legal looks like at Harness, imagine the rest.
Every team here operates this way. Not because they are told to, but because it is genuinely a better way to work. If you are looking for a company where AI is woven into how things actually get done, not just what gets announced, we are hiring!


If your Terraform install is insecure or inconsistent, it can quickly slow down your delivery. A single compromised file or a misconfigured backend can stop deployments for many services. Teams that set up Terraform correctly from the start can scale easily and avoid compliance issues.
The answer is to install Terraform with strong security measures right from the beginning. Use verified binaries, encrypt your state, and set up automated CI/CD integration from day one. This method includes OS-specific setup, security checklists, GitOps alignment, and governance that can grow with your company. Want to speed up secure infrastructure automation? Harness Infrastructure as Code Management offers AI-powered pipelines with built-in governance for enterprises.
One misconfigured Terraform install can cause hours of pipeline failures across many services. When setting up Terraform on development machines, build agents, and production, focus on consistency and security for reliable automation. Start with verified binaries, pinned versions, and automated checks to keep your infrastructure stable.
Always get Terraform from HashiCorp’s official repositories, not from third-party mirrors or unofficial packages. For macOS, use the official Homebrew tap (brew tap hashicorp/tap && brew install hashicorp/tap/terraform).
On Linux, add HashiCorp’s GPG-signed package repository instead of using versions from your distribution, which may be outdated. Windows users should download signed binaries directly from releases.hashicorp.com. This helps keep your infrastructure safe from compromised or outdated packages.
To make builds reproducible, control the exact Terraform version in every environment. Download the specific version you need, such as from https://releases.hashicorp.com/terraform/1.6.0/terraform_1.6.0_linux_amd64.zip, and check the SHA256 checksum against HashiCorp’s signed SHASUMS file before extracting.
Keep your version-pinned install scripts in your infrastructure repository so teams can create identical environments. If you use Terraform with Harness, delegates manage versions for you, but local development still needs consistent versioning.
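The checksum step above can be scripted rather than done by hand. Here is a minimal sketch that verifies a downloaded archive against the expected SHA256 from HashiCorp's signed SHASUMS file; the file paths are placeholders:

```python
# Verify a downloaded Terraform archive against its expected SHA256
# before extracting or executing anything from it.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large archives don't need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, expected: str) -> None:
    actual = sha256_of(path)
    if actual != expected:
        raise RuntimeError(f"checksum mismatch: {actual} != {expected}")
```

Run this in your pinned install script, with `expected` taken from the SHASUMS entry for the exact version and platform you downloaded.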
After installing Terraform, run terraform version to make sure the right version is active and in your PATH. Set up the plugin cache directory (TF_PLUGIN_CACHE_DIR) to avoid repeated provider downloads and check that you have write permissions.
Write a simple script to check the Terraform binary location, version, and basic provider setup. Run this script automatically in your CI/CD pipelines, container builds, and onboarding workflows to catch problems before they affect deployments. While local installation is useful for development, enterprise teams should standardize Terraform execution through an IaCM platform. This ensures consistent environments across developers, CI/CD pipelines, and production systems without relying on manual setup.
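A minimal version of that verification script might look like this. The pinned version number is an example, and the checks are a sketch, not an exhaustive validation:

```python
# Check that terraform is on PATH, matches the pinned version, and that
# the plugin cache directory (if configured) is writable.
import os
import shutil
import subprocess

PINNED = "1.6.0"  # example pinned version

def check_terraform() -> list:
    """Return a list of problems; empty means the environment looks healthy."""
    problems = []
    binary = shutil.which("terraform")
    if binary is None:
        return ["terraform not found on PATH"]
    out = subprocess.run([binary, "version"], capture_output=True, text=True)
    if PINNED not in out.stdout:
        problems.append(f"expected pinned version {PINNED}, "
                        f"got: {out.stdout.splitlines()[0]}")
    cache = os.environ.get("TF_PLUGIN_CACHE_DIR")
    if cache and not os.access(cache, os.W_OK):
        problems.append(f"plugin cache dir not writable: {cache}")
    return problems
```

Exiting non-zero when the list is non-empty makes the same script usable as a CI gate and an onboarding check.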
Installing Terraform is only the beginning. In enterprise settings where you manage important infrastructure and need to meet regulations, hardening your Terraform setup turns a basic install into a system ready for production and governance. These controls are significantly easier to enforce when Terraform is managed through an IaCM platform that centralizes execution, credentials, and policy enforcement.
Credential Management and Execution Isolation:
Provider Security and Integrity:
State Management and Backend Security:
Make your Terraform CI/CD setup consistent by including the binary in versioned container images or reusable templates that all services use. This prevents differences between developer machines, build agents, and production. This approach can become even more scalable when implemented through an IaCM tool integrated with your CI/CD platform where Terraform execution, policy checks, and governance are built into reusable workspaces and modules.
When updating Terraform versions or security patches, make changes in your template library instead of updating each pipeline one by one. We recommend this version-controlled method for enterprise customers.
Use Policy as Code checks to enforce governance by validating Terraform versions, approved modules, and provider rules before running any plans. OPA can review Terraform plans in your CI/CD pipeline, automatically approving safe changes and flagging risky ones for manual review.
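To show the shape of such a gate, here is a sketch expressed in Python for illustration (real OPA policies are written in Rego). It inspects the JSON representation of a Terraform plan (`terraform show -json plan.out`) and flags resource deletions for manual review; the "deletions are risky" rule is an assumed example policy:

```python
# Illustrative plan gate: flag resources whose planned actions include
# a deletion, using Terraform's JSON plan format (resource_changes).
RISKY_ACTIONS = {"delete"}  # assumed rule: deletions need manual review

def review_plan(plan_json: dict) -> list:
    """Return addresses of resources that need manual review."""
    flagged = []
    for rc in plan_json.get("resource_changes", []):
        actions = set(rc.get("change", {}).get("actions", []))
        if actions & RISKY_ACTIONS:
            flagged.append(rc["address"])
    return flagged

plan = {"resource_changes": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["delete"]}},
    {"address": "aws_iam_role.ci",    "change": {"actions": ["update"]}},
]}
print(review_plan(plan))  # → ['aws_s3_bucket.logs']
```

In a pipeline, an empty result auto-approves the apply step, while a non-empty result routes the change to a human reviewer.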
Pair this with GitOps workflows, where pull requests start plans and approved merges trigger applies. This creates clear audit trails for compliance and keeps developers moving quickly. Instead of treating Terraform as a standalone CLI step, IaC tools allow you to manage infrastructure workflows as first-class citizens within your delivery pipelines.
DevOps teams running hundreds of services need Terraform installation methods that scale and stay secure and compliant. Here are practical answers to common questions from teams in regulated settings.
Start with package repositories that include GPG verification rather than direct binary downloads to prevent compromised or malicious software packages. Install from official HashiCorp repositories with signed packages, verify SHA256 checksums, and run Terraform from isolated build environments with limited-access credentials that only provide necessary permissions. Keep your state files in encrypted, secure storage with access controls and comprehensive audit logging.
Include Terraform in your container images with specific versions, or use custom binaries to keep all pipeline runs consistent. Pin exact builds in your pipeline templates and use policy-as-code to allow only approved releases before running plans. This keeps development and production in sync and maintains clear compliance records.
Make reusable install scripts that check checksums and pin builds, then share them through central config management or container registries. Use remote execution on dedicated infrastructure for security and audit trails. Apply OPA policies to control which Terraform releases and providers your teams can use.
Running Terraform remotely on dedicated infrastructure gives you better security and audit trails. Running it locally on developer machines can cause compliance and credential issues. Use isolated build environments or cloud-managed services that run Terraform plans with proper authentication and detailed logs for production. Even better, IaC platforms standardize this by enforcing remote execution with built-in security, auditability, and role-based access controls.
Set up golden path templates with pinned Terraform installs that update all services automatically. Distribute approved releases using container images or package managers, or use platforms that handle governance for you. IaC platforms automate this by centrally managing Terraform versions and enforcing them across all pipelines and environments.
Standardizing how you install Terraform sets the stage for everything else. Pinning versions, using verified binaries, and securing remote state help your teams work quickly and stay compliant. These best practices are the base for templates that scale to hundreds of services.
Once you have this foundation, the real benefits come when your install standards connect to automated pipelines and GitOps workflows. Using centralized templates and modules for Terraform means security updates propagate automatically while developers keep their flexibility. Policy-as-code makes sure every deployment meets enterprise needs without slowing things down. At this stage, adopting an IaC Platform approach becomes the recommended path. By managing Terraform through platforms like Harness, teams can standardize execution, enforce governance, and scale infrastructure delivery without increasing operational overhead.
Are you ready to move from manual installs to enterprise-level automation and governance? Harness Infrastructure as Code Management offers AI-powered templates, a central control plane, and automated checks to make your Terraform setup a real advantage.



AI-assisted coding is accelerating the velocity of software releases, heightening the challenge of ensuring stable deployments, and platform teams are feeling the hit. The State of AI-assisted Software Development DORA report measured a negative impact on software delivery stability: “an estimated 7.2% reduction for every 25% increase in AI adoption.”
The DORA report advises:
Considered together, our data suggest that improving the development process does not automatically improve software delivery—at least not without proper adherence to the basics of successful software delivery, like small batch sizes and robust testing mechanisms.
A robust testing mechanism rapidly gaining momentum is testing in production. Let’s take a closer look at how this practice boosts software delivery stability and supports the software development lifecycle (SDLC). We’ll also consider how to make testing in production, specifically A/B testing at scale, work for you.
Testing in production (TIP) means testing new software code on live traffic in active real-world environments. TIP is complementary to pre-production testing and does not replace it. It does, however, carry tangible benefits:
Feature flags are instrumental to safe testing in production because they decouple deployment and exposure at the most granular level. With feature flags, you can implement incremental feature release techniques and unlock progressive experimentation. With carefully crafted A/B testing, you empower rapid feedback loops that confirm real feature value, validate high-quality software, and increase team productivity and satisfaction.
These testing and verification capabilities are more crucial than ever in this “AI moment,” where AI-assisted coding enjoys wide adoption and funding.
A/B testing is the process of simultaneously testing two different versions of a web page or product feature in order to optimize a behavioral or performance metric, while ensuring guardrail metrics are not negatively impacted. A/B testing spans the whole spectrum of software verification: you can safely carry out architectural validation on fundamental architectural changes or gather behavioral analytics on UI variations.
Progressive experimentation with feature flags lets you roll out changes to a small slice of users first, catch problems early, and expand only when the data looks good.
The key is keeping deployment and release separate: you decouple them by delivering new features in a dormant state. Code goes out behind a flag. You validate it with real traffic.
A/B testing built into your CI/CD pipeline means you're making data-driven decisions based on observed metrics. Advanced feature flagging correlates statistical data, with pinpoint precision, to the actual feature variation causing the impact. Even when multiple features are rolled out concurrently, an enterprise-grade feature management platform will effectively parse the data, alert you to the impactful variant, and enable you to roll back any negative feature in seconds. The time/cost savings and safety benefits are astounding.
A/B testing provides a great experience for both marketing teams and engineers:
An enterprise-level platform like Harness provides Feature Management and Experimentation, bringing flags, monitoring, and full experimentation freedom into a finely-tuned, seamless end-to-end software delivery tech stack for your platform team. Integrating A/B testing and feature flags directly into CI/CD pipelines empowers your teams with self-service experimentation while maintaining enterprise governance and security.
Bundling features into cliff-jump releases puts every user account at risk simultaneously. A progressive ramp—starting with just 1 or 2% of traffic, and gradually increasing—means a bug in your checkout flow only affects a fraction of users before you catch it. Progressive delivery validates that SLOs are holding before exposure expands. p95 latency spiking? Error rate creeping up? You catch it when a tiny fraction of users is affected—not thousands—and Harness CD integrates cleanly with Jenkins, GitLab, or GitHub Actions.
The deploy-and-hold pattern is the keystone. Ship code in the "off" state behind a feature flag and nothing changes for users until you're ready. Deploy at 11 AM on a Tuesday instead of 1 AM on a Sunday. No change windows, no dashboard babysitting. Code is in production, the feature is dark, and you flip the switch when you're ready to monitor it. That's the freedom of progressive experimentation with feature flags in practice.
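A guardrail-gated ramp like this can be sketched in a few lines of Python. The stage percentages, metric names, and budgets below are illustrative assumptions, not Harness defaults:

```python
# Illustrative ramp stages: advance exposure only while guardrail metrics hold.
RAMP_STAGES = [1, 5, 25, 50, 100]  # percent of traffic

def next_stage(current_pct: int, p95_latency_ms: float, error_rate: float,
               p95_budget_ms: float = 300.0, error_budget: float = 0.01) -> int:
    """Return the next rollout percentage, or 0 to signal an instant rollback."""
    if p95_latency_ms > p95_budget_ms or error_rate > error_budget:
        return 0  # kill switch: the feature goes dark again
    stages_after = [s for s in RAMP_STAGES if s > current_pct]
    return stages_after[0] if stages_after else current_pct
```

In a real platform this decision is automated per evaluation window, so nobody has to babysit a dashboard between stages.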
Raw telemetry is information in theory and chaos in practice. AI-powered monitoring watches flag-level metrics—not just "something is slower," but "checkout button variant B is adding 43ms of p95 latency." That specificity matters. When you have six active experiments running, your engineers are not flipping through dashboards trying to isolate which one broke something. The system tells you.
If your team is already running feature flags with health monitoring, you're closer to a full experimentation platform than you might think. Targeting logic, rollout percentages, kill switches—that's already experiment infrastructure. What's missing is experiment tracking, statistical analysis, and deterministic assignment.
To implement experiments with your feature flagging:
An experimentation system built on top of your feature flagging makes A/B testing a cinch and eliminates operational bottlenecks and technical debt for your platform team.
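As an illustration of the experiment-tracking gap, here is a minimal exposure recorder. The class and its fields are our own sketch, not a Harness SDK; real systems emit these records as events into an analytics pipeline for later statistical analysis:

```python
from collections import Counter

class ExposureTracker:
    """Minimal experiment tracking: record which variant each user actually saw."""

    def __init__(self):
        self.assignments = {}   # (user_id, experiment) -> variant
        self.counts = Counter() # (experiment, variant) -> unique users exposed

    def record(self, user_id: str, experiment: str, variant: str) -> None:
        key = (user_id, experiment)
        if key not in self.assignments:  # count each user once per experiment
            self.assignments[key] = variant
            self.counts[(experiment, variant)] += 1
```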
A/B testing doesn't have to be complicated. It can run as part of a structured rollout with automated KPI metrics and guardrails:

The seven stages are built into your pipeline and completed with minimal human intervention:
A common mistake is ramping too fast and drawing conclusions from thin data. If your sample size is too low, your experiment will be underpowered and unlikely to detect a reasonably sized impact. Verify in advance that your sample is large enough to detect effects of the size that matters to you.
Progressive experimentation requires patience. Premature conclusions produce unreliable results, and unreliable results produce bad decisions.
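As a rough planning aid, the standard two-proportion power calculation can be sketched as below. The defaults (95% confidence, 80% power) are common conventions; treat this as a back-of-the-envelope sketch, not a substitute for a proper power calculator:

```python
import math

def sample_size_per_variant(baseline_rate: float, min_detectable_lift: float,
                            z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Approximate users needed per variant to detect a relative lift
    on a conversion-style metric (two-proportion test)."""
    p = baseline_rate
    delta = baseline_rate * min_detectable_lift  # absolute effect size
    n = 2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / delta ** 2
    return math.ceil(n)
```

For example, detecting a 10% relative lift on a 5% baseline conversion rate requires roughly 30,000 users per variant, which is why ramping to a conclusion in a day rarely works.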
Every experiment should have a documented hypothesis, defined success metrics, blast radius assessment, and rollback plan before it touches production. Feature flag lifecycle management also keeps technical debt from quietly accumulating—flags that never get retired are toggle debt and a production surprise waiting to happen.
The goal isn't just fewer 3 a.m. incidents, though that's a welcome side effect. The real win is replacing gut feel with data at every stage of delivery.
With modern testing in production: feature flags decouple deploy from release, progressive ramps limit blast radius, AI-powered guardrails catch regressions before they spread, and centralized analytics replace the multi-tool sprawl that makes experimentation feel expensive.
Every time you release a feature you can ramp gradually up to 100% using percentage-based rollouts, alert on specific pre-decided latency increases, and enforce minimum sample sizes before promotion. Let every release become a decision backed by actual evidence, not optimism.
Harness Feature Management & Experimentation consolidates flags, release monitoring, and A/B testing, so every deployment is a controlled experiment—not a gamble.
How do you pick guardrail metrics without blocking every release?
Start with your existing SLO metrics and be conservative. Grafana's SLO guidance recommends event-based SLIs over percentiles for cleaner signals. Focus on business-critical user journeys first.
What's a practical ramp schedule for a mid-sized SaaS team?
Every team has slightly different criteria to consider before safely ramping up. Release monitoring with automated guardrails removes the need for someone to manually review metrics at each stage—which is the only way this actually scales.
How do you handle sample ratio mismatch?
Monitor assignment ratios continuously using chi-squared tests. Harness FME’s attribution and exclusion algorithm is honed to ensure accurate samples. In addition, FME reassesses experiment health in real-time, including sample ratio.
Filter bot traffic early too. Microsoft's bot detection research shows bots can skew conversion rates by 15–30%. Behavioral signals like sub-10-second session duration or unusual referrer patterns are a practical starting point for exclusion algorithms.
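The chi-squared check mentioned above can be sketched in a few lines. The 3.84 critical value corresponds to p < 0.05 at one degree of freedom; production SRM alerting typically uses stricter thresholds to avoid false alarms on continuously monitored experiments:

```python
def srm_check(observed_a: int, observed_b: int, expected_ratio: float = 0.5,
              critical: float = 3.84) -> bool:
    """One-degree-of-freedom chi-squared test for sample ratio mismatch.
    Returns True when the observed split deviates significantly from plan."""
    total = observed_a + observed_b
    exp_a = total * expected_ratio
    exp_b = total - exp_a
    chi2 = (observed_a - exp_a) ** 2 / exp_a + (observed_b - exp_b) ** 2 / exp_b
    return chi2 > critical
```

A 52/48 split on 10,000 users already trips this test, which is exactly the kind of assignment bug that silently invalidates experiment results.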
Should you A/B test infrastructure changes or just product features?
A/B testing works best for user-facing changes where behavior matters. Infrastructure changes are better suited to progressive rollouts with guardrail monitoring—different changes, different success metrics. Performance and reliability for engineering experiments; conversion and engagement for growth. Keep the tooling integrated in your pipeline either way.
How do you maintain consistent user experiences across devices and services?
Deterministic hashing on stable user IDs. Hash user ID plus experiment name to generate consistent assignments and make sure the same user sees the same variant whether they're on mobile, desktop, or clearing cookies every 20 minutes. Avoid session-based bucketing—it creates flickering experiences, causes re-bucketing, and erodes trust in experiment data. Lean on SDK-side evaluation for consistency that holds across your entire stack.
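A minimal sketch of that deterministic hashing, assuming a stable user ID and an even two-way split (real SDKs also support weighted splits and dedicated hashing seeds):

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically bucket a user: the same inputs always yield the
    same variant, across devices, services, and SDK instances."""
    key = f"{user_id}:{experiment}".encode()
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 10_000
    # Map the bucket onto an even split across the variant list.
    return variants[bucket * len(variants) // 10_000]
```

Because the assignment is a pure function of user ID and experiment name, there is no session state to flicker and no re-bucketing when cookies are cleared.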


At 2 am, your migration goes live. By 2:07, error rates spike, and rollback isn’t an option. Cloud migrations, API rewrites, and architecture transformations rarely fail because of bad code. They fail because of how that code is released.
Most teams still rely on a “big bang” cutover where infrastructure, services, and user-facing changes go live at once. This concentrates risk into a single moment. When something breaks, rollback is slow, visibility is limited, and the blast radius is large.
This is not just anecdotal. According to BCG, more than half of transformation efforts fail to achieve their intended outcomes within three years.
The difference between success and failure is not the migration itself. It is the release strategy.
“Cloud migration” sounds simple, but in practice, it is a layered transformation.
Most migrations combine several of the following:
These rarely happen in isolation. Teams often try to ship them together in a single coordinated release. That coupling increases complexity and multiplies risk.
Before your next migration, list every system involved. If they are all released together, you are carrying unnecessary risk.
The failure mode is consistent:
There is no safe way to validate behavior in production. There is no gradual exposure. Rollback often requires redeploying an old stack that may no longer be compatible.
Even worse, teams lack a reliable baseline. They cannot answer simple questions:
Without that, migration becomes guesswork.
Modern teams are adopting a different model:
Feature flags provide a control layer that separates deployment from exposure. Code can exist in production without being active for all users.
This enables:
Start by putting one service behind a feature flag and releasing it to internal users first.
Instead of switching everything at once:
If something fails, you reduce traffic or revert instantly.
This shifts migration from a single high-risk event to a series of measurable steps.
A common migration strategy is the strangler fig pattern.
Feature flags make this executable in production by controlling routing and exposure. But to make this work in practice, you need a control layer that can manage traffic in real time.
Below is a simplified view of how feature flags act as a control plane during migration:

Fig: Feature-flag–driven progressive traffic routing during migration
Two things matter here:
This is not just a toggle. It is a runtime decision and an observability layer.
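In code, that runtime routing decision can be sketched as follows. The backend names and the source of the rollout percentage are illustrative assumptions, not a specific product API:

```python
import hashlib

def route(user_id: str, new_stack_pct: int) -> str:
    """Strangler-fig routing: the flag's rollout percentage decides, per user,
    whether traffic goes to the new service or stays on the legacy one.
    Hashing keeps each user's routing stable across requests."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "new-service" if bucket < new_stack_pct else "legacy-service"
```

Turning `new_stack_pct` down to 0 is the instant rollback: no redeploy, no old stack to resurrect.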
A successful migration is not defined by deployment success. It is defined by outcomes.
Key metrics include:
These metrics are not theoretical. They are what teams use to validate migrations in real production environments.
In the Beyond the Toggle ebook, a legacy Spark batch pipeline was replaced with a streaming architecture, with a progressive rollout rather than a cutover.
The new system showed faster processing and lower costs before the full rollout.
From the webinar, teams often go further:
This allows validation of both performance and data integrity before committing.
Define your baseline metrics before migration. If you cannot measure improvement, you cannot prove success.
Staging environments cannot replicate production conditions. They lack:
Feature flags enable safe production testing through controlled exposure.
Not all canary releases are percentage-based. Some teams roll out by country or user segment first, then expand globally.
To make this safe:
A migration is a sequence of decisions, not a single moment.
At each stage:
In one example from the webinar:
This approach removes pressure from a single “launch moment” and distributes risk across stages.
Modern flag systems avoid becoming a bottleneck:
This ensures minimal latency and high reliability.
Not all migrations are equal.
The key is incremental transition, not avoidance.
Feature flags are temporary by design.
If left unmanaged, they accumulate and create complexity. Teams need:
Emerging approaches include automation that detects stale flags and generates pull requests to remove them.
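A stale-flag detector of that kind can be sketched as below. The record shape (`rollout_pct`, `last_changed`) is a hypothetical data model; a real implementation would read flag metadata from your flag platform's API and open a cleanup pull request for each hit:

```python
from datetime import datetime, timedelta

def stale_flags(flags: dict, max_age_days: int = 30, now=None) -> list:
    """Return flags fully rolled out (100%) and untouched for max_age_days:
    prime candidates for removal from the codebase."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=max_age_days)
    return [name for name, f in flags.items()
            if f["rollout_pct"] == 100 and f["last_changed"] < cutoff]
```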
Adopting progressive delivery is not just a tooling decision. It changes how teams release software.
Key considerations:
Feature flags do not bypass controls. They enhance them by adding visibility and control at runtime.
For migration use cases, a Feature Flag platform should provide:
Flags should not feel like a bolt-on. They should be part of how software is built and released.
The biggest mistake teams make is treating migration as a moment.
It is not.
It is a controlled progression of changes, each validated in production under real conditions.
Feature flags enable this by:
The result is simple:
Migrations become reversible, observable, and data-driven.
Want a deeper breakdown of these patterns and real-world examples? Read the full ebook or see a demo.
Need more info? Contact Sales