
Engineering teams are generating more shippable code than ever before — and today, Harness is shipping five new capabilities designed to help teams release confidently. AI coding assistants lowered the barrier to writing software, and the volume of changes moving through delivery pipelines has grown accordingly. But the release process itself hasn't kept pace.
The evidence shows up in the data. In our 2026 State of DevOps Modernization Report, we surveyed 700 engineering teams about what AI-assisted development is actually doing to their delivery. One finding stands out: while 35% of the most active AI coding users are already releasing daily or more, those same teams have the highest rate of deployments needing remediation (22%) and the longest MTTR at 7.6 hours.
This is the velocity paradox: the faster teams can write code, the more pressure accumulates at the point of release, where the process hasn't changed nearly as much as the tooling that feeds it.
The AI Delivery Gap
What changed is well understood. For years, the bottleneck in software delivery was writing code. Developers couldn't produce changes fast enough to stress the release process. AI coding assistants changed that. Teams are now generating more change across more services, more frequently than before — but the tools for releasing that change are largely the same.
In the past, DevSecOps vendors built entire separate products to coordinate multi-team, multi-service releases. That made sense when CD pipelines were simpler. It doesn't make sense now. At AI speed, a separate tool means another context switch, another approval flow, and another human-in-the-loop at exactly the moment you need the system to move on its own.
The tools that help developers write code faster have created a delivery gap that only widens as adoption grows.
Today Harness is releasing five capabilities, all natively integrated into Continuous Delivery. Together, they cover the full arc of a modern release: coordinating changes across teams and services, verifying health in real time, managing schema changes alongside code, and progressively controlling feature exposure.
Release Orchestration replaces Slack threads, spreadsheets, and war-room calls that still coordinate most multi-team releases. Services and the teams supporting them move through shared orchestration logic with the same controls, gates, and sequence, so a release behaves like a system rather than a series of handoffs. And everything is seamlessly integrated with Harness Continuous Delivery, rather than in a separate tool.
AI-Powered Verification and Rollback connects to your existing observability stack, automatically identifies which signals matter for each release, and determines in real time whether a rollout should proceed, pause, or roll back. Most teams have rollback capability in theory. In practice it's an emergency procedure, not a routine one. Ancestry.com made it routine and saw a 50% reduction in overall production outages, with deployment-related incidents dropping significantly.
Database DevOps, now with Snowflake support, brings schema changes into the same pipeline as application code, so the two move together through the same controls with the same auditability. If a rollback is needed, the application and database schema can roll back together seamlessly. This matters especially for teams building AI applications on warehouse data, where schema changes are increasingly frequent and consequential.
Improved pipeline and policy support for feature flags and experimentation enables teams to deploy safely and release progressively to the right users, even as AI-generated code drives release volume up. Teams can quickly measure impact on technical and business metrics, and stop or roll back when results are off track. All of this happens within the familiar Harness interface they already use for CI/CD.
Warehouse-Native Feature Management and Experimentation lets teams test features and measure business impact directly with data warehouses like Snowflake and Redshift, without ETL pipelines or shadow infrastructure. This way they can keep PII and behavioral data inside governed environments for compliance and security.
These aren't five separate features. They're one answer to one question: can we safely keep going at AI speed?
Traditional CD pipelines treat deployment as the finish line. The model Harness is building around treats it as one step in a longer sequence: application and database changes move through orchestrated pipelines together, verification checks real-time signals before a rollout continues, features are exposed progressively, and experiments measure actual business outcomes against governed data.
A release isn't complete when the pipeline finishes. It's complete when the system has confirmed the change is healthy, the exposure is intentional, and the outcome is understood.
That shift from deployment to verified outcome is what Harness customers say they need most. "AI has made it much easier to generate change, but that doesn't mean organizations are automatically better at releasing it," said Marc Pearce, Head of DevOps at Intelliflo. "Capabilities like these are exactly what teams need right now. The more you can standardize and automate that release motion, the more confidently you can scale."
The real shift here is operational. The work of coordinating a release today depends heavily on human judgment, informal communication, and organizational heroics. That worked when the volume of change was lower. As AI development accelerates, it's becoming the bottleneck.
The release process needs to become more standardized, more repeatable, and less dependent on any individual's ability to hold it together at the moment of deployment. Automation doesn't just make releases faster. It makes them more consistent, and consistency is what makes scaling safe.
Implementing Harness helped Ancestry.com achieve 99.9% uptime, cutting outages in half while tripling deployment velocity.
At Speedway Motors, progressive delivery and 20-second rollbacks enabled a move from biweekly releases to multiple deployments per day, with enough confidence to run five to 10 feature experiments per sprint.
AI made writing code cheap. Releasing that code safely, at scale, is still the hard part.
Harness Release Orchestration, AI-Powered Verification and Rollback, Database DevOps, Warehouse-Native Feature Management and Experimentation, and Improved Pipeline and Policy Support for FME are available now. Learn more and book a demo.

On March 19th, the risks of running open execution pipelines — where what code runs in your CI/CD environment is largely uncontrolled — went from theoretical to catastrophic.
A threat actor known as TeamPCP compromised the GitHub Actions supply chain at a scale we haven't seen before (tracked as CVE-2026-33634, CVSS 9.4). They compromised Trivy, the most widely used vulnerability scanner in the cloud-native ecosystem, and turned it into a credential-harvesting tool that ran inside victims' own pipelines.
Between March 19 and March 24, 2026, organizations running affected tag-based GitHub Actions references were sending their AWS tokens, SSH keys, and Kubernetes secrets directly to the attacker. SANS Institute estimates over 10,000 CI/CD workflows were directly affected. According to multiple security research firms, the downstream exposure extends to tens of thousands of repositories and hundreds of thousands of accounts.
Five ecosystems. Five days. One stolen Personal Access Token.
This is a fundamental failure of the open execution pipeline model — where what runs in your pipeline is determined by external references to public repositories, mutable version tags, and third-party code that executes with full privileges. GitHub Actions is the most prominent implementation.
The alternative, governed execution pipelines, where what runs is controlled through policy gates, customer-owned infrastructure, scoped credentials, and immutable references, is the model we designed Harness around years ago, precisely because we saw this class of attack coming.
TeamPCP wasn't an anomaly; it was the inevitable conclusion of an eighteen-month escalation in CI/CD attack tactics.
CVE-2025-30066. Attackers compromised a PAT from an upstream dependency (reviewdog/action-setup) and force-pushed malicious code to every single version tag of tj-actions/changed-files. 23,000 repositories were exposed. The attack was later connected to a targeted campaign against Coinbase. CISA issued a formal advisory.
This proved that the industry's reliance on mutable tags (like @v2) was a serious structural vulnerability. According to Wiz, only 3.9% of repositories pin to immutable SHAs. The other 96% are trusting whoever owns the tag today.
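One way to see where a repository stands is a short audit script that flags mutable references in workflow files. This is a heuristic sketch, not an official GitHub tool; the regex and the sample workflow below are illustrative:

```python
import re

# Heuristic audit for GitHub Actions references. A pinned reference is
# "owner/repo@<40-hex-sha>"; anything else (a tag like @v2 or a branch
# like @master) is mutable and can be silently redirected by whoever
# controls the repository. The sample workflow below is illustrative.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([\w.-]+/[\w./-]+)@(\S+)", re.MULTILINE)
SHA_RE = re.compile(r"^[0-9a-f]{40}$")

def find_mutable_refs(workflow_yaml: str) -> list:
    """Return action references that are not pinned to an immutable SHA."""
    return [
        f"{action}@{ref}"
        for action, ref in USES_RE.findall(workflow_yaml)
        if not SHA_RE.match(ref)
    ]

workflow = """
jobs:
  scan:
    steps:
      - uses: actions/checkout@8ade135a41bc03ea155e62e844d188df1ea18608
      - uses: aquasecurity/trivy-action@master
      - uses: aquasecurity/setup-trivy@v0.2.0
"""
print(find_mutable_refs(workflow))
# ['aquasecurity/trivy-action@master', 'aquasecurity/setup-trivy@v0.2.0']
```

Running something like this across a repository's .github/workflows directory gives a quick inventory of which references an attacker could silently redirect.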
Shai-Hulud 2.0 was the first self-replicating worm in the CI/CD ecosystem. It backdoored 796 npm packages representing over 20 million weekly downloads, including packages from Zapier, PostHog, and Postman.
It used TruffleHog to harvest 800+ credential types, registered compromised machines as self-hosted GitHub runners named SHA1HULUD for persistent C2 over github.com, and built a distributed token-sharing network where compromised machines could replace each other's expired credentials.
PostHog's candid post-mortem revealed that attackers stole their GitHub bot's PAT via a pull_request_target workflow exploit, then used it to steal npm publishing tokens from CI runner secrets. Their admission that this kind of attack "simply wasn't something we'd prepared for" reflects the industry-wide gap between application security and CI/CD security maturity. CISA issued another formal advisory.
TeamPCP went after the security tools themselves.
They exploited a misconfigured GitHub Actions workflow to steal a PAT from Aqua Security's aqua-bot service account. Aqua detected the breach and initiated credential rotation — but reporting suggests the rotation did not fully cut off attacker access. TeamPCP appears to have retained or regained access to Trivy's release infrastructure, enabling the March 19 attack weeks after initial detection.
On March 19, they force-pushed a malicious "Cloud Stealer" to 76 of 77 version tags in trivy-action and all 7 tags in setup-trivy. Simultaneously, they published an infected Trivy binary (v0.69.4) to GitHub Releases and Docker Hub. Every pipeline referencing those tags by name started executing the attacker's code on its next run. No visible change to the release page. No notification. No diff to review.
TeamPCP's payload was purpose-built for CI/CD runner environments:
Memory Scraping. It read /proc/*/mem to extract decrypted secrets held in RAM. GitHub's log-masking can't hide what's in process memory.
Cloud Metadata Harvesting. It queried the AWS Instance Metadata Service (IMDS) at 169.254.169.254, pivoting from "build job" to full IAM role access in the cloud.
Filesystem Sweep. It searched over 50 specific paths — .env files, .aws/credentials, .kube/config, SSH keys, GPG keys, Docker configs, database connection strings, and cryptocurrency wallet keys.
Encrypted Exfiltration. All data was bundled into tpcp.tar.gz, encrypted with AES-256 and RSA-4096, and sent to typosquatted domains like scan.aquasecurtiy[.]org (note the "tiy"). These domains returned clean verdicts from threat intelligence feeds during the attack. As a fallback, the stealer created public GitHub repos named tpcp-docs under the victim's own account.
The malicious payload executed before the legitimate Trivy scan. Pipelines appeared to work normally. CrowdStrike noted: "To an operator reviewing workflow logs, the step appears to have completed successfully."
Sysdig observed that the vendor-specific typosquat domains were a deliberate deception — an analyst reviewing CI/CD logs would see traffic to what appears to be the vendor's own domain.
It took Aqua five days to fully evict the attacker, during which TeamPCP pushed additional malicious Docker images (v0.69.5 and v0.69.6).
Why did this work so well? Because GitHub Actions is the leading example of an open execution pipeline — where what code runs in your pipeline is determined by external references that anyone can modify.
This trust problem isn't new. Jenkins had a similar issue with plugins. Third-party code ran with full process privileges. But Jenkins ran inside your firewall; exfiltrating data required getting past your network perimeter.
GitHub Actions took the same open execution approach but moved execution to cloud-hosted runners with broad internet egress, making exfiltration trivially easy. TeamPCP's Cloud Stealer just needed to make an HTTPS POST to an external domain, which runners are designed to do freely.
Here are a few reasons why open execution pipelines break at scale:
Mutable Trust. When you use @v2, you are trusting a pointer, not a piece of code. Tags can be silently redirected by anyone with write access. TeamPCP rewrote 76 tags in a single operation. 96% of the ecosystem is exposed.
Flat Privileges. Third-party Actions run with the same permissions as your code. No sandbox. No permission isolation. This is why TeamPCP targeted security scanners — tools that by design have elevated access to your pipeline infrastructure. The attacker doesn't need to break in. The workflow invites them in.
Secret Sprawl. Secrets are typically injected into the runner's environment or process memory during job execution, where they remain accessible for the job's duration. TeamPCP's /proc/*/mem scraper didn't need any special privilege. It just needed to be running on the same machine.
Unbounded Credential Cascades. There is no architectural boundary that stops a credential stolen in one context from unlocking another. TeamPCP proved this definitively: Trivy → Checkmarx → LiteLLM → AI API keys across thousands of enterprises. One PAT, five ecosystems.
Harness CI/CD pipelines are built as governed execution pipelines — where what runs is controlled through customer-owned infrastructure, policy gates, scoped credentials, immutable references, and explicit trust boundaries. At its core is the Delegate — a lightweight worker process that runs inside your infrastructure (your VPC, your Kubernetes cluster), executes tasks locally, and communicates with the Harness control plane via outbound-only connections.
When we designed this architecture, we assumed the execution plane would become the primary target in the enterprise. If TeamPCP tried to attack a Harness-powered environment, they would hit three architectural walls.
The Architecture.
The Delegate lives inside your VPC or cluster. It communicates with our SaaS control plane via outbound-only HTTPS/WSS. No inbound ports are opened.
The Defense.
You control the firewall. Allowlist app.harness.io and the specific endpoints your pipelines need, deny everything else. TeamPCP's exfiltration to typosquat domains would fail at the network layer — not because of a detection rule, but because the path doesn't exist. Both typosquat domains returned clean verdicts from threat intel feeds. Egress filtering by allowlist is more reliable than detection by reputation.
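In spirit, the control is just deny-by-default set membership rather than reputation scoring. A toy sketch follows; real enforcement lives in your firewall or Kubernetes NetworkPolicy, not in application code, and the internal vault hostname is hypothetical:

```python
# A toy sketch of deny-by-default egress control. Real enforcement lives
# in your firewall or Kubernetes NetworkPolicy, not in application code;
# the internal vault hostname is hypothetical.
ALLOWED_EGRESS = {
    "app.harness.io",        # Harness control plane
    "vault.internal.corp",   # hypothetical internal secret manager
}

def egress_permitted(host: str) -> bool:
    """Only explicitly allowlisted hosts are reachable; everything else fails."""
    return host in ALLOWED_EGRESS

# The typosquat domain used in the attack is blocked not because a feed
# flagged it, but because the network path simply does not exist.
print(egress_permitted("scan.aquasecurtiy.org"))  # False
print(egress_permitted("app.harness.io"))         # True
```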
The Architecture.
Rather than bulk-injecting secrets as flat environment variables at job start, Harness can resolve secrets at runtime through your secret manager — HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault — via the Delegate, inside your network. Harness SaaS stores encrypted references and metadata, not plaintext secret values.
The Defense.
TeamPCP's Cloud Stealer worked because in an open execution pipeline, secrets are typically injected into the runner's process memory where they remain accessible for the job's duration. In a governed execution pipeline, this exposure is structurally reduced: secrets can be resolved from your controlled vault at the point they're needed, rather than broadcast as environment variables to every step in the pipeline.
An important caveat: Vault-based resolution alone doesn't eliminate runtime exfiltration. Once a secret is resolved and passed to a step that legitimately needs it — say, an npm token during npm publish — that secret exists in the step's runtime. If malicious code is executing in that same context (for example, a tampered package.json that exfiltrates credentials during npm run test), the secret is exposed regardless of where it came from. This is why the three walls work as a system: Wall 2 reduces the surface of secret exposure, Wall 1 blocks the exfiltration path, and (as we'll see) Wall 3 limits the blast radius to the scoped environment. No single wall is sufficient on its own.
To further strengthen how pipelines use secrets, leverage ephemeral credentials — AWS STS temporary tokens, Vault dynamic secrets, or GCP short-lived service account tokens — that auto-expire after a defined window, often minutes. Even if TeamPCP’s memory scraper extracted an ephemeral credential, it likely would have expired before the attacker could pivot to the next target.
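As a toy illustration of why the expiry window matters; the 15-minute TTL mirrors the minimum AWS STS session duration, and the token value is a placeholder:

```python
from datetime import datetime, timedelta, timezone

# Toy model of a short-lived credential. The 15-minute TTL mirrors the
# minimum AWS STS session duration; the token value is a placeholder.
class EphemeralToken:
    def __init__(self, value: str, ttl: timedelta = timedelta(minutes=15)):
        self.value = value
        self.expires_at = datetime.now(timezone.utc) + ttl

    def is_valid(self, at: datetime) -> bool:
        """A scraped token is useless once its window has elapsed."""
        return at < self.expires_at

now = datetime.now(timezone.utc)
token = EphemeralToken("PLACEHOLDER-SESSION-TOKEN")
print(token.is_valid(now))                       # True: inside the window
print(token.is_valid(now + timedelta(hours=2)))  # False: expired before a pivot
```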
The Architecture.
Harness supports environment-scoped delegates as a core architecture pattern. Your "Dev" scanner delegate runs in a different cluster, with different network boundaries and different credentials, than your "Prod" deployment delegate.
The Defense.
The credential cascade that defined TeamPCP hits a dead end. Stolen Dev credentials cannot reach Production publishing gates or AI API keys, because those credentials live in a different vault, resolved by a different delegate, in a different network segment. If the Trivy compromise only yielded credentials scoped to a dev environment, the attack stops at phase one.
Beyond the walls, governed execution pipelines provide additional structural controls: policy gates on what is allowed to run, immutable references to pipeline dependencies, and an audit trail for every execution.
Architecture is a foundation, not a guarantee. Governed execution pipelines are materially safer against this class of attack, but you can still create avoidable risk by running unvetted containers on delegates, skipping egress filtering, using the same delegate across dev and prod, granting overly broad cloud access, exposing excessive secrets to jobs that don't need them, or using long-lived static credentials when ephemeral alternatives exist.
I am not claiming that Harness is safe and GitHub Actions is unsafe. That would be too simplistic.
What I am claiming is that governed execution pipelines — where what runs is controlled through policy gates, customer-owned infrastructure, scoped credentials, and immutable references — are a materially safer foundation than open execution pipelines. We designed Harness as our implementation of a governed execution pipeline. But architecture is a starting point — you still have to operate it well.
As we enter the era of Agentic AI — where AI is generating pipelines, suggesting dependencies, and submitting pull requests at machine speed — we can no longer rely on human review to catch a malicious tag in an AI-generated PR.
But there's a more fundamental shift: AI agents will become the primary actors inside CI/CD pipelines. Not just generating code — autonomously executing tasks, selecting dependencies, making deployment decisions, remediating incidents.
Now imagine an AI agent in an open execution pipeline — downloaded from a public marketplace, referenced by a mutable tag, executing with full privileges, making dynamic runtime decisions you didn't define. It has access to your secrets, your cloud credentials, and your deployment infrastructure. Unlike a static script, an agent makes decisions at runtime — fetching resources, calling APIs, modifying files.
If TeamPCP showed us what happens when a static scanner is compromised, imagine what happens when an autonomous AI agent is compromised — or simply makes a decision you didn't anticipate.
This is why governed execution pipelines aren't just a security improvement — they're an architectural prerequisite for the AI era. In a governed pipeline, even an AI agent operates within structural boundaries: it runs on infrastructure you control, accesses only scoped secrets, has restricted egress, and its actions are audited. The agent may be autonomous, but the pipeline constrains what it can reach.
The questions every engineering leader should be asking:
If you use Trivy, Checkmarx, or LiteLLM: did any pipeline execute the affected versions between March 19 and March 24, and have you rotated every credential those runners could reach, including cloud tokens, SSH keys, and Kubernetes secrets?
If you use GitHub Actions: are your actions pinned to immutable SHAs rather than mutable tags, and is runner egress restricted to an explicit allowlist?
For the longer term: is what runs in your pipelines controlled by policy gates, scoped credentials, and infrastructure you own, or by whoever controls a tag today?
I'm writing this as the CEO of a company that competes with GitHub in the CI/CD space. I want to be transparent about that.
But I'm also writing this as someone who has spent two decades building infrastructure software and who saw this threat model coming. When we designed Harness, the open execution pipeline model had already evolved from Jenkins plugins to GitHub Actions — each generation making it easier for third-party code to run with full privileges and, by moving execution further from the customer's network perimeter, making exfiltration easier. We deliberately chose to build governed execution pipelines instead.
The TeamPCP campaign didn't teach us anything new about the risk. What it did was make the difference between open and governed execution impossible for the rest of the industry to ignore.
Open source security tools are invaluable. The developers and companies who build them — including Aqua Security and Checkmarx — are doing essential work. The problem isn't the tools. The problem is running them inside open execution pipelines where third-party code has full privileges, secrets sit in memory, and exfiltration faces no structural barrier.
If you want to explore how the delegate architecture works in practice, we're here to show you. But more importantly, regardless of what platform you choose, please take these structural questions seriously. The next TeamPCP is already studying the credential graph.

Over the last few years, something fundamental has changed in software development.
If the early 2020s were about adopting AI coding assistants, the next phase is about what happens after those tools accelerate development. Teams are producing code faster than ever. But what I’m hearing from engineering leaders is a different question:
What’s going to break next?
That question is exactly what led us to commission our latest research, State of DevOps Modernization 2026. The results reveal a pattern that many practitioners already sense intuitively: faster code generation is exposing weaknesses across the rest of the software delivery lifecycle.
In other words, AI is multiplying development velocity, but it’s also revealing the limits of the systems we built to ship that code safely.
One of the most striking findings in the research is something we’ve started calling the AI Velocity Paradox, a term we coined in our 2025 State of Software Engineering Report.
Teams using AI coding tools most heavily are shipping code significantly faster. In fact, 45% of developers who use AI coding tools multiple times per day deploy to production daily or faster, compared to 32% of daily users and just 15% of weekly users.
At first glance, that sounds like a huge success story. Faster iteration cycles are exactly what modern software teams want.
But the data tells a more complicated story.
Among those same heavy AI users, the failure signals are also the strongest: they report the highest rate of deployments needing remediation (22%) and the longest MTTR, at 7.6 hours.
What this tells me is simple: AI is speeding up the front of the delivery pipeline, but the rest of the system isn’t scaling with it. It’s like running trains faster than the tracks were built to handle: friction builds, the ride gets bumpy, and it feels like we could be on the edge of disaster.

The result is friction downstream: more incidents, more manual work, and more operational stress on engineering teams.
To understand why this is happening, you have to step back and look at how most DevOps systems actually evolved.
Over the past 15 years, delivery pipelines have grown incrementally. Teams added tools to solve specific problems: CI servers, artifact repositories, security scanners, deployment automation, and feature management. Each step made sense at the time.
But the overall system was rarely designed as a coherent whole.
In many organizations today, quality gates, verification steps, and incident recovery still rely heavily on human coordination and manual work. In fact, 77% say teams often have to wait on other teams for routine delivery tasks.
That model worked when release cycles were slower.
It doesn’t work as well when AI dramatically increases the number of code changes moving through the system.
Think of it this way: if AI doubles the number of changes engineers can produce, your pipelines must either scale through automation to absorb that volume, or the system begins to crack under pressure. The burden falls directly on developers to deploy services safely, certify compliance checks, and keep rollouts continuously progressing. When failures happen, they have to jump in and remediate at whatever hour.
These manual tasks, naturally, inhibit innovation and cause developer burnout. That’s exactly what the research shows.
Across respondents, developers report spending roughly 36% of their time on repetitive manual tasks like chasing approvals, rerunning failed jobs, or copy-pasting configuration.
As delivery speed increases, the operational load increases. That burden often falls directly on developers.
The good news is that this problem isn’t mysterious. It’s a systems problem. And systems problems can be solved.
From our experience working with engineering organizations, we've identified a few principles that consistently help teams scale AI-driven development safely.
When every team builds pipelines differently, scaling delivery becomes difficult.
Standardized templates (or “golden paths”) make it easier to deploy services safely and consistently. They also dramatically reduce the cognitive load for developers.
Speed only works when feedback is fast.
Automating security, compliance, and quality checks earlier in the lifecycle ensures problems are caught before they reach production. That keeps pipelines moving without sacrificing safety.
Feature flags, automated rollbacks, and progressive rollouts allow teams to decouple deployment from release. That flexibility reduces the blast radius of new changes and makes experimentation safer.
It also allows teams to move faster without increasing production risk.
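The decoupling is easy to sketch: code can be fully deployed while a runtime flag controls who actually sees it. A toy percentage rollout follows; this is not the Harness FME SDK, and the flag name and bucketing scheme are illustrative:

```python
# Toy sketch of decoupling deployment from release: the code is fully
# deployed, but exposure is a runtime decision. Not the Harness FME SDK;
# flag names and the bucketing scheme are illustrative.
ROLLOUT_PERCENT = {"new-checkout": 10}  # raised progressively after deploy

def is_enabled(flag: str, user_id: int) -> bool:
    """Deterministic percentage rollout: a user always gets the same answer."""
    return (user_id % 100) < ROLLOUT_PERCENT.get(flag, 0)

print(is_enabled("new-checkout", 7))   # True: in the 10% cohort
print(is_enabled("new-checkout", 55))  # False: still sees the old path
```

Raising the percentage widens exposure without another deployment; setting it to zero is an instant rollback of the release without touching the deployed artifact.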
Automation alone doesn’t solve the problem. What matters is creating a feedback loop: deploy → observe → measure → iterate.
When teams can measure the real-world impact of changes, they can learn faster and improve continuously.
AI is already changing how software gets written. The next challenge is changing how software gets delivered.
Coding assistants have increased development teams' capacity to innovate. But to capture the full benefit, the delivery systems behind them must evolve as well.
The organizations that succeed in this new environment will be the ones that treat software delivery as a coherent system, not just a collection of tools.
Because the real goal isn’t just writing code faster. It’s learning faster, delivering safer, and turning engineering velocity into better outcomes for the business.
And that requires modernizing the entire pipeline, not just the part where code is written.


Shift-Left FinOps reframes cloud cost optimization as an engineering responsibility rather than a retrospective financial review.
Instead of analyzing cloud spend after infrastructure has already been deployed, organizations integrate FinOps automation and cost governance directly into development workflows.
Developers receive immediate feedback about the financial impact of infrastructure changes during development, not weeks later during billing reconciliation.
This approach aligns FinOps best practices with modern platform engineering workflows built around Infrastructure as Code, automated CI/CD pipelines, and Policy as Code governance.
The result is proactive cost management, where waste is prevented before resources ever reach production.
Most organizations still treat FinOps as a retrospective discipline.
Finance teams review monthly cloud bills, identify anomalies, and ask engineering teams to investigate cost spikes.
This approach worked when infrastructure provisioning happened slowly through manual processes.
Modern cloud environments operate differently.
Teams now deploy infrastructure changes dozens or even hundreds of times per day through automated pipelines.
By the time billing data arrives, the operational context behind those provisioning decisions has already disappeared.
This delay introduces several operational challenges.
Without consistent tagging and governance policies, organizations struggle to determine which team owns a resource, which project or environment it belongs to, and whether it is still needed.
Missing metadata makes cloud financial management and cost attribution significantly harder.
When cost issues are discovered weeks later, teams must retrofit governance controls onto already-running infrastructure.
This reactive approach introduces operational risk and disrupts delivery schedules.
Instead of preventing waste, teams are forced into post-deployment cloud cost optimization efforts.
Developers often make infrastructure decisions without understanding their financial implications.
Without real-time cost visibility, engineers cannot evaluate tradeoffs between instance sizes, storage tiers, and managed versus self-hosted services.
Shift-Left FinOps addresses these problems by embedding Infrastructure as Code cost control and governance earlier in the development lifecycle.
Implementing Shift-Left FinOps requires three foundational capabilities: Policy as Code for cost governance, Infrastructure as Code as the enforcement point, and developer-facing cost visibility.
Together, these capabilities enable FinOps automation and proactive cost management across cloud environments.
Policy as Code frameworks translate financial governance requirements into enforceable rules.
These rules automatically evaluate infrastructure definitions before resources are deployed.
Instead of relying on documentation or manual enforcement, policy as code provides automated cloud cost governance directly within engineering workflows.
Effective cost governance policies typically address three categories of waste.
Policies ensure every resource includes the metadata required for cost attribution.
Examples include team, project, environment, and cost-center tags.
Governance rules prevent developers from provisioning oversized instances for workloads that do not require maximum capacity.
This supports long-term cloud cost optimization and prevents overprovisioning.
Policies identify resources missing automated shutdown schedules, retention rules, or lifecycle controls.
These controls prevent idle resources from generating unnecessary costs.
Example conceptual tagging policy:
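A minimal sketch of such a policy in Python; production systems typically express this in a Policy as Code engine like Open Policy Agent, and the required tag names here are illustrative:

```python
# Illustrative required-tag set; real policies are defined per organization.
REQUIRED_TAGS = {"team", "project", "environment", "cost-center"}

def validate_tags(resource: dict) -> list:
    """Return a violation for each required cost-attribution tag that is missing."""
    tags = resource.get("tags", {})
    return [f"missing required tag: {t}" for t in sorted(REQUIRED_TAGS - set(tags))]

resource = {
    "type": "aws_instance",
    "tags": {"team": "payments", "project": "checkout"},
}
print(validate_tags(resource))
# ['missing required tag: cost-center', 'missing required tag: environment']
```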

With Policy as Code enforcement, developers receive immediate feedback during infrastructure validation rather than after deployment.
Infrastructure as Code provides the technical foundation for shift-left cloud cost governance.
IaC allows infrastructure changes to be version-controlled, peer-reviewed, and automatically validated before deployment.
These characteristics create natural enforcement points for Infrastructure as Code cost control policies.
Developers run policy checks locally before committing infrastructure changes.
This prevents cost policy violations from entering shared repositories.
Early validation enables proactive cost management during development.
Pull request pipelines automatically validate infrastructure definitions against governance policies.
If cost policies fail, the merge is blocked.
Example validation workflow:
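As a hedged sketch, a validation step might evaluate the resources in a Terraform plan against the cost rules and fail the build on any violation. The resource shape and the instance-type allowlist below are illustrative:

```python
# Illustrative governance rules: every resource needs tags, and instance
# types are capped. The resource shape loosely mirrors a parsed Terraform
# plan but is simplified for the sketch.
ALLOWED_INSTANCE_TYPES = {"t3.micro", "t3.small", "t3.medium"}

def check_plan(planned_resources: list) -> list:
    """Evaluate planned resources and return all cost-policy violations."""
    violations = []
    for r in planned_resources:
        name = r.get("address", "<unknown>")
        if not r.get("tags"):
            violations.append(f"{name}: no cost-attribution tags")
        itype = r.get("instance_type")
        if itype and itype not in ALLOWED_INSTANCE_TYPES:
            violations.append(f"{name}: oversized instance type {itype}")
    return violations

plan = [
    {"address": "aws_instance.web", "instance_type": "t3.small", "tags": {"team": "web"}},
    {"address": "aws_instance.batch", "instance_type": "m5.24xlarge", "tags": {}},
]
print(check_plan(plan))
# In CI, a non-empty result would fail the step and block the merge.
```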

This ensures consistent cloud cost governance across all infrastructure deployments.
Production deployment pipelines include a final validation step before infrastructure changes are applied.
This layered validation model creates defense in depth for FinOps automation: local checks catch violations during development, pull request pipelines block them from merging, and deployment pipelines stop anything that slips through.
Shift-Left FinOps only works when developers have access to cost insights during infrastructure planning.
Without cost visibility, governance policies feel arbitrary and difficult to follow.
Cost estimation should occur during infrastructure planning stages, not after deployment.
When developers run terraform plan, they should also see estimated monthly costs associated with proposed infrastructure changes.
This allows developers to evaluate architectural tradeoffs such as instance sizing, storage tiers, and reserved versus on-demand capacity.
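A toy estimator shows the kind of feedback that makes these tradeoffs visible at plan time. The hourly rates are hypothetical; real tooling pulls current prices from the cloud provider's pricing API:

```python
# Hypothetical on-demand hourly rates; real tooling pulls current prices
# from the cloud provider's pricing API.
HOURLY_RATE = {"t3.small": 0.0208, "m5.xlarge": 0.192}
HOURS_PER_MONTH = 730

def monthly_cost(instance_type: str, count: int = 1) -> float:
    """Estimated monthly cost for `count` instances of the given type."""
    return round(HOURLY_RATE[instance_type] * HOURS_PER_MONTH * count, 2)

# Tradeoff surfaced at plan time: four small instances vs. one large one.
print(monthly_cost("t3.small", 4))  # 60.74
print(monthly_cost("m5.xlarge"))    # 140.16
```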
Integrating cost feedback into existing workflows improves adoption.
Examples include estimated-cost comments on pull requests, inline output alongside terraform plan, and dashboards scoped to each team's services.
These feedback loops support FinOps best practices by making cost awareness part of everyday development work.
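A rough sketch of such a feedback loop, assuming the plan output has already been parsed into a list of resources. The price table below contains placeholder values for illustration, not real cloud prices.

```python
# Rough cost feedback: map planned resources to an estimated monthly
# cost using a static price table (prices are made-up placeholders).
MONTHLY_PRICE_USD = {
    ("aws_instance", "t3.medium"): 30.0,
    ("aws_instance", "m5.xlarge"): 140.0,
    ("aws_db_instance", "db.r5.large"): 175.0,
}

def estimate_monthly_cost(planned_resources) -> float:
    """Sum the estimated monthly cost of resources added by a plan."""
    total = 0.0
    for res in planned_resources:
        # Unknown resource types contribute zero rather than failing the run.
        total += MONTHLY_PRICE_USD.get((res["type"], res["size"]), 0.0)
    return total

plan = [
    {"type": "aws_instance", "size": "m5.xlarge"},
    {"type": "aws_db_instance", "size": "db.r5.large"},
]
print(f"Estimated monthly cost: ${estimate_monthly_cost(plan):.2f}")  # $315.00
```

Printing this figure as a pull request comment or alongside plan output turns an abstract governance policy into a concrete number developers can react to.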
Organizations implementing shift-left cloud cost governance frequently encounter predictable challenges.
Understanding these patterns helps teams implement FinOps automation successfully.
Governance policies that generate frequent false positives slow developer velocity.
Developers may attempt to bypass governance controls.
Policies should focus on real sources of cloud cost waste.
Shift-left cost governance fails when platform teams must manually review every infrastructure change.
All governance rules should be automated through Policy as Code and CI/CD validation pipelines.
Providing policies without cost visibility creates confusion.
Developers need both the governance policies themselves and the cost visibility that explains why those policies exist.
Successful governance systems integrate into existing workflows such as pull requests, CI/CD pipelines, and local development tooling.
The most effective governance systems remain invisible until violations occur.
Harness Cloud Cost Management enables organizations to implement Shift-Left FinOps and proactive cost management at scale.
The platform integrates FinOps automation and cost governance directly into platform engineering workflows.
Harness CCM connects cloud environments across AWS, Azure, and GCP.
This provides unified cost visibility across multi-cloud infrastructure.
Real-time cost allocation allows developers to see cost breakdowns during infrastructure provisioning rather than waiting for billing cycles.
Teams can enforce governance policies such as required tagging, instance size limits, and automated lifecycle controls.
These policies automatically validate infrastructure changes during development and CI/CD pipelines.
Harness CCM also supports cloud cost optimization through automated insights and recommendations, including rightsizing guidance and identification of idle resources.
These capabilities help organizations implement modern cloud financial management practices while maintaining developer velocity.
Shift-Left FinOps prevents waste before resources are deployed by embedding cost governance directly into development workflows.
This proactive model improves cloud cost optimization and cloud financial management outcomes.
Typical implementations include Infrastructure as Code tools such as Terraform, policy engines such as Open Policy Agent, and CI/CD validation pipelines. These technologies enable automated cloud cost governance through policy as code.
No, it does not have to slow developers down. When implemented correctly, automated cost policies provide immediate feedback without manual approvals.
This improves delivery speed while maintaining cost control and FinOps automation.
Organizations define standardized governance policies that apply across AWS, Azure, and GCP.
These policies focus on universal patterns such as required tagging, instance size limits, and lifecycle rules.
This approach supports consistent multi-cloud cost governance and financial management.
Shift-Left FinOps transforms cloud cost optimization from a reactive financial process into a proactive engineering practice.
By embedding Policy as Code governance, Infrastructure as Code cost control, and automated FinOps workflows into development pipelines, organizations prevent waste before infrastructure reaches production.
The result is stronger cloud cost governance, improved cloud financial management, and more efficient cloud operations.
Developers gain real-time cost insights.
Platform teams enforce governance automatically.
Finance teams gain accurate cost attribution across environments.
At the scale of modern cloud infrastructure, proactive cost management is essential for sustainable cloud growth.


Database systems store some of the most sensitive data of an organization such as PII, financial records, and intellectual property, making strong database governance non-negotiable. As regulations tighten and audit expectations increase, teams need governance that scales without slowing delivery.
Harness Database DevOps addresses this by applying policy-driven governance using Open Policy Agent (OPA). With OPA policies embedded directly into database pipelines, teams can automatically enforce rules, capture audit trails, and stay aligned with compliance requirements. This blog outlines how to use OPA in Harness to turn database compliance from a manual checkpoint into a built-in, scalable part of your DevOps workflow.
Organizations face multiple challenges when navigating database compliance: manual review bottlenecks, inconsistent enforcement across teams, and audit trails that are expensive to reconstruct after the fact.
These challenges highlight the necessity of embedding governance directly into database development and deployment pipelines, rather than treating compliance as a reactive checklist.
Harness Database DevOps is designed to offer a comprehensive solution to database governance - one that aligns automation with compliance needs. It enables teams to adopt policy-driven controls on database change workflows by integrating the Open Policy Agent (OPA) engine into the core of database DevOps practices.
What is OPA and Policy as Code?
OPA is an open source, general-purpose policy engine that decouples policy decisions from enforcement logic, enabling centralized governance across infrastructures and workflows. Policies in OPA are written in the Rego declarative language, allowing precise expression of rules governing actions, access, and configurations.
Harness implements Policy as Code through OPA, enabling teams to store, test, and enforce governance rules directly within the database DevOps lifecycle. This model ensures that compliance controls are consistent, auditable, and automatically evaluated before changes reach production.
Here’s a structured approach to implementing database governance with OPA in Harness:
Start by cataloging your regulatory obligations and internal governance policies. Examples include change approval requirements for production schemas, data retention mandates, and access control rules. Translate these requirements into concrete rules that can be expressed in Rego.
Within the Harness Policy Editor, define OPA policies that codify governance rules. For example, a policy might block any migrations containing operations that remove columns in production environments without explicit DBA approval.
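In Harness, such a rule would be written in Rego; the Python snippet below only illustrates the decision logic that the policy would encode. The migration fields (`environment`, `dba_approved`, `operations`) are hypothetical, not the actual Harness input schema.

```python
# Illustration of the policy logic: block any migration that drops a
# column in production unless a DBA has explicitly approved it.
def deny_reasons(migration: dict) -> list[str]:
    """Return deny reasons, in the spirit of an OPA 'deny' rule."""
    reasons = []
    drops_column = any(op["type"] == "dropColumn" for op in migration["operations"])
    if (migration["environment"] == "production"
            and drops_column
            and not migration.get("dba_approved", False)):
        reasons.append("dropColumn in production requires explicit DBA approval")
    return reasons

migration = {
    "environment": "production",
    "dba_approved": False,
    "operations": [{"type": "dropColumn", "table": "orders", "column": "legacy_sku"}],
}
print(deny_reasons(migration))  # one deny reason: approval is missing
```

An empty list means the change proceeds; a non-empty list blocks the pipeline, mirroring how OPA deny rules gate a deployment.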
Harness policies are modular and reusable: you can import and extend them as part of broader governance packages, enabling cross-team reuse and centralized management of rules.
By expressing governance as code, you ensure consistency and remove ambiguity in policy enforcement.
Policies can be linked to specific triggers within your database deployment workflow, for instance, evaluating rules before a migration is applied or before a pipeline advances to production. This integration ensures that non-compliant changes are automatically blocked, while compliant changes proceed seamlessly, maintaining the balance between speed and control.
Harness evaluates OPA policies at defined decision points in your pipeline, such as pre-deployment checks. This prevents risky actions, enforces access controls, and aligns every deployment with governance objectives without manual intervention.
Audit Trails and Traceability
Every policy evaluation is logged, creating an auditable trail of who changed what, when, and why. These logs serve as critical evidence during compliance audits or internal reviews, reducing the overhead and risk associated with traditional documentation practices.
By enforcing the principle of least privilege, policies ensure that users and applications possess only the necessary permissions for their specific roles. This restriction on access is crucial for minimizing the potential attack surface and maintaining compliance with regulatory requirements for data access governance.
Database governance is an essential pillar of enterprise compliance strategies. By embedding OPA-based policy enforcement within Harness Database DevOps, organizations can automate compliance controls, minimize risk, and maintain developer productivity. Policy as Code provides a scalable, auditable, and consistent framework that aligns with both regulatory obligations and the need for agile delivery.
Transforming database governance from a manual compliance burden into an automated, integrated practice empowers teams to innovate securely, confidently, and at scale - ensuring that every change respects the policies that protect your data, your customers, and your brand.


Most organizations begin their software supply chain journey the same way: they implement Software Composition Analysis (SCA) to manage open source risk. They scan for vulnerabilities, generate SBOMs, and remediate CVEs. For many teams, that feels like progress—and it is.
But it is not enough.
As discussed in the webinar conversation with Information Security Media Group, open-source visibility is necessary, but it is not sufficient. Modern applications are no longer just collections of third-party libraries. They are built, packaged, signed, stored, promoted, deployed, and increasingly augmented by AI systems. Each stage introduces new dependencies and new trust boundaries.
Events like Log4j made open source risk impossible to ignore. However, the evolution of the threat landscape has demonstrated that attackers are no longer limited to exploiting known vulnerabilities in libraries. They are targeting the mechanics of delivery itself—ingestion, build pipelines, artifact storage, and CI/CD automation. Organizations that stop at SCA are securing one layer of a much broader system.
Artifacts are the final outputs of the build process—container images, binaries, scripts, Helm charts, JAR files. They are what actually run in production. Yet many organizations treat artifact management as operational plumbing rather than a security control point.
The webinar highlighted how focusing exclusively on source code can obscure the reality that artifacts may be tampered with after build, altered during promotion, or stored in misconfigured registries. Visibility into open source dependencies does not automatically guarantee artifact integrity. An attacker who compromises a registry or intercepts promotion workflows can distribute malicious artifacts at scale.
The key risk lies in assuming that once an artifact is built, it is trustworthy. Without signing, provenance tracking, and gating at the registry level, artifacts become one of the most exploitable surfaces in the supply chain.
CI/CD systems hold credentials, secrets, deployment paths, and signing keys. They connect development directly to production. As one of the speakers noted during the discussion, pipelines should be treated as privileged infrastructure and assumed to be potential targets of compromise.
A compromised runner can publish malicious artifacts, exfiltrate secrets, or promote unauthorized builds. This is not theoretical. Attacks involving poisoned GitHub Actions and manipulated build systems demonstrate how easily the pipeline itself can become the distribution mechanism.
Security must therefore extend beyond scanning artifacts to enforcing strict governance within pipelines. This includes least privilege access, ephemeral credentials, audit trails, and Policy as Code enforcement to ensure required security checks cannot be bypassed.
Container ecosystems introduce additional risk vectors. Malicious images uploaded to public registries, typosquatting packages, and compromised upstream components can all infiltrate environments through seemingly legitimate pulls.
Organizations that implicitly trust external registries transfer vendor risk into their own infrastructure. Without upstream proxy controls, cool-off periods, or quarantine mechanisms, container ingestion becomes another blind spot.
The supply chain does not stop at internal code repositories. It extends to every external source that feeds into the build.
Modern delivery pipelines integrate numerous third-party services. Vendors often have privileged access to environments or automated integrations into CI/CD workflows. If a vendor is compromised, that risk propagates downstream.
The webinar discussion emphasized that the supply chain must be viewed as a “trust fabric.” Pipelines, registries, and vendors are all part of that fabric. A weakness in any one node can cascade across the system.
Build systems represent one of the most underestimated attack surfaces in modern software delivery. A source tree may pass review and scanning, yet small modifications in the build process can fundamentally alter the resulting artifact.
Examples discussed during the session included pre-install hooks, registry overrides, runtime installers, or seemingly minor shell script changes that introduce malicious behavior before artifacts are signed or scanned. These changes can bypass traditional SCA tools because the underlying source code appears clean.
This is why build integrity must be verifiable. Provenance should be recorded and tied to specific systems and identities. Build steps should be signed and attested. Promotion gates should require verification of those attestations before artifacts move forward.
Trust must be anchored in the build output, not assumed from the source input.
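As a simplified sketch of this idea, the snippet below attests an artifact's digest at build time and verifies it at promotion time. A real system would use asymmetric signatures and provenance formats (e.g. in-toto attestations or Sigstore); the shared-key HMAC here only keeps the example self-contained.

```python
# Simplified promotion gate: verify that an artifact's digest matches
# a signed attestation before allowing promotion.
import hashlib
import hmac

SIGNING_KEY = b"build-system-secret"  # placeholder; never hardcode real keys

def attest(artifact: bytes) -> str:
    """Build-time step: sign the artifact's SHA-256 digest."""
    digest = hashlib.sha256(artifact).hexdigest()
    return hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()

def may_promote(artifact: bytes, attestation: str) -> bool:
    """Promotion gate: recompute the digest and verify the attestation."""
    expected = attest(artifact)
    return hmac.compare_digest(expected, attestation)

artifact = b"container-image-layer-bytes"
signed = attest(artifact)                    # produced at build time
print(may_promote(artifact, signed))         # True: untampered
print(may_promote(artifact + b"!", signed))  # False: modified after build
```

Any byte changed after the build invalidates the attestation, which is exactly the property that makes registry-level tampering detectable at the promotion gate.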
CI/CD pipelines are often viewed as automation tools, but they are in fact highly privileged systems that bridge development and production. They hold secrets, manage deployment logic, and often operate with broad permissions across infrastructure.
The webinar discussion stressed that pipelines must be treated as untrusted environments by default. This does not imply mistrust of developers, but rather recognition that any high-privilege system is an attractive target.
Policy-as-code frameworks, strict RBAC, auditability, and enforcement of mandatory security checks ensure that controls cannot be disabled under pressure to ship. Developers may unintentionally bypass safeguards when under deadlines. Governance mechanisms must therefore be systemic, not optional.
In complex environments where multiple tools—GitHub Actions, Jenkins, Docker, Kubernetes—are integrated together, misconfiguration becomes another source of risk. Each tool has its own security model. Without centralized governance, complexity compounds vulnerability.
As if artifacts, pipelines, and containers were not enough, AI-native applications are adding an entirely new dimension to supply chain security.
Modern applications increasingly rely on Large Language Models, prompt libraries, embeddings, model weights, and training datasets. These components influence runtime behavior in ways that traditional code does not. Yet they are rarely tracked or governed with the same rigor as open-source dependencies.
The concept of an “AI Bill of Materials” is emerging, but no standardized framework currently exists. Organizations are integrating AI features faster than governance standards can keep pace.
The risks differ from traditional CVEs. Poisoned training data can subtly manipulate model behavior. Backdoored model weights can introduce hidden functionality. Prompt injection attacks can trick systems into exposing sensitive information. Shadow AI systems may be deployed without formal oversight.
Unlike deterministic software, AI systems produce probabilistic outputs. Static security testing does not fully address this unpredictability. Security teams must now consider model provenance, data lineage, vendor trust, and runtime behavior monitoring as part of the supply chain equation.
Even the build-versus-buy decision for LLMs becomes a supply chain governance choice. Building offers control but introduces operational burden and long-term responsibility. Buying accelerates deployment but increases trust dependency on external vendors. In both cases, AI components extend the trust fabric and must be governed accordingly.
Moving beyond SCA requires structured controls across the full lifecycle of delivery.
Ingest-Time Controls ensure that risky packages and images are prevented from entering developer workflows in the first place through dependency firewalls, upstream proxy governance, and vendor controls.
Build-Time Integrity requires signed environments, provenance attestations, and enforcement of SLSA-style compliance so that artifacts can be cryptographically tied to verified build processes.
Promotion-Time Governance introduces artifact registry gating, quarantine workflows, and policy enforcement to prevent unauthorized or tampered artifacts from advancing to production.
Runtime Verification ensures continuous monitoring of deployment health, secret usage, and, increasingly, AI behavior to detect anomalous activity after release.
This layered approach transforms supply chain security from a reactive scanning function into an operational control system embedded directly into software delivery workflows.
Software supply chain security has evolved.
It is no longer an open-source vulnerability problem alone. It is a trust management challenge spanning artifacts, pipelines, containers, vendors, and AI components.
Organizations that succeed will not stop at generating reports. They will enforce policy at every stage. They will treat CI/CD as privileged infrastructure. They will require attestations before promotion. They will govern ingestion and monitor runtime behavior. They will extend security controls into AI-native systems.
Supply chain security must move beyond visibility. It must deliver enforceable control across the entire software delivery lifecycle. Software supply chain security isn’t about scanning more. It’s about governing every stage of software delivery from code to artifact to pipeline to production to AI.
Ready to see how Harness embeds supply chain security directly into CI/CD, artifact governance, and AI-powered verification?
Explore Harness Software Supply Chain Security solutions and secure your delivery pipeline end-to-end.
No. SCA is necessary but not sufficient: it provides visibility into open-source vulnerabilities but does not protect CI/CD pipelines, artifact integrity, container ecosystems, or AI components.
Pipelines hold credentials, signing keys, and deployment paths. A compromised runner can inject malicious artifacts or exfiltrate secrets.
Artifact governance includes registry gating, quarantine workflows, attestation verification, and policy enforcement before artifacts are promoted or deployed.
An AI-BOM would catalog AI components such as models, prompts, embeddings, and training data. Standards are still emerging.
Modern supply chain attacks exploit ingestion workflows, build steps, compromised pipelines, or malicious container images rather than known CVEs.
Building offers control but carries high cost and operational burden; buying offers speed and vendor expertise but introduces trust and governance dependencies.


For decades, SCM has meant one thing: Source Code Management. Git commits, branches, pull requests, and version history. The plumbing of software delivery. But as AI agents show up in every phase of the software development lifecycle, from writing a spec to shipping code to reviewing a PR, the acronym is quietly undergoing its most important transformation yet.
And this isn't a rebrand. It's a rethinking of what a source repository is, what it stores, and what it serves, not just to developers, but to the agents working alongside them.
AI agents in software development are powerful but contextually blind by default. Ask a coding agent to implement a feature and it will reach out and read files, one by one, directory by directory, until it has assembled enough context to act. Ask a code review agent to assess a PR and it will crawl through the codebase to understand what changed and why it matters.
Anthropic's 2026 Agentic Coding Trends Report documents this shift in detail: the SDLC is changing dramatically as single agents evolve into coordinated multi-agent teams operating across planning, coding, review, and deployment. The report projects the AI agents market to grow from $7.84 billion in 2025 to $52.62 billion by 2030. But as agents multiply across the lifecycle, so does their hunger for codebase context, and so does the cost of getting that context wrong.
This approach has two brutal failure modes: it burns time and tokens assembling context file by file, and it still misses the context that lives outside the files the agent happened to open.
The result? Agents that hallucinate implementations because they missed a key abstraction three directories away. Code reviewers that flag style issues but miss architectural regressions. PRD generators that know the syntax of your codebase but not its soul.
The bottleneck is not the model. It is the absence of a pre-computed, semantically rich, always-available representation of the entire codebase: a context engine.
Consider a simple task: "Add rate limiting to the /checkout endpoint."
Without a context engine, a coding agent opens checkout.go, reads the handler function, and writes a token-bucket rate limiter inline at the top of the handler. The code compiles. The tests pass. The PR looks clean.
The agent missed three things: the service's existing rate-limiting middleware, the shared interface that other limiters implement, and the standard metrics pipeline that operational dashboards depend on.
The code works. The team that maintains it finds it wrong in every way that matters. A senior engineer catches these issues in review, requests changes, and the cycle restarts. Multiply this by every agent-generated PR across every team, every day.
With a context engine, the same agent queries before writing code: "How is rate limiting implemented in this service?" The context engine returns the existing middleware, the shared interface it implements, the standard metrics pipeline, and the test conventions used in adjacent modules.
The agent writes a new rate limiter that follows the established pattern, implements the shared interface, emits metrics through the standard pipeline, and includes tests that match the existing style. The PR wins approval on the first pass.
The difference is context quality, not model quality.
The Language Server Protocol (LSP) transformed developer tooling in the past decade. By standardizing the interface between editors and language-aware backends, LSP gave every IDE, from VS Code to Neovim, access to autocomplete, go-to-definition, hover documentation, and real-time diagnostics. LSP was designed to serve a specific consumer: a human developer, working interactively, in a single file at a time. That design made the right trade-offs for its era: compute on demand, focus on the file under the cursor, and optimize for interactive latency over whole-repo throughput.
For interactive development, these are strengths. LSP excels at what it was built to do.
Agents are a different class of consumer. They don't sit in a file waiting for cursor events. They operate across entire repositories, across SDLC phases, often in parallel. They need the full semantic picture before they start, not incrementally as they navigate.
Agents need not a replacement for LSP, but a complement: something pre-built, always available, queryable at repo scale, and semantically complete, ready before anyone opens a file.
Lossless Semantic Trees (LST), pioneered by the OpenRewrite project (born at Netflix, commercialized by Moderne), take a different approach to code representation.
Unlike the traditional Abstract Syntax Tree (AST), an LST preserves formatting, whitespace, and comments, carries full type attribution, and can be printed back to the exact original source.
This is the first layer of a Source Context Management system. Not raw files. Not a running language server. A pre-indexed semantic tree of the entire codebase, queryable by agents at any time.
A proper Source Context Management system is not a single component. It is a three-layer stack that turns a repository from a file store into something agents can actually reason over.
Every file in the repository is parsed into an LST and simultaneously embedded into a vector representation. This creates two complementary indices: a structural index derived from the LSTs and a semantic index derived from the embeddings.
The LST and semantic indices are projected into a code knowledge graph, a property graph where nodes are functions, classes, modules, interfaces, and comments, and edges are relationships: calls, imports, inherits, implements, modifies, tests.
This graph enables queries like "what calls this function?", "which tests cover this module?", and "what breaks if this interface changes?"
The context engine exposes itself through a Model Context Protocol (MCP) server or REST API, so any agent (whether a coding agent, a review agent, a risk assessment agent, or a documentation agent) can query the context engine directly, retrieving precisely the subgraph or semantic chunk it needs, without ever touching the raw file system.
The key insight: agents never read files. They query the context engine.
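A toy illustration of the idea: a property graph of call relationships that an agent can query for blast radius instead of crawling files. The API shape and function names are hypothetical, not an actual context engine interface.

```python
# Toy code knowledge graph: nodes are functions, edges are "calls".
# An agent queries relationships instead of reading source files.
from collections import defaultdict

class ContextEngine:
    def __init__(self):
        self.called_by = defaultdict(set)  # callee -> direct callers

    def add_call(self, caller: str, callee: str):
        self.called_by[callee].add(caller)

    def blast_radius(self, node: str) -> set[str]:
        """Everything that transitively depends on `node`."""
        seen, stack = set(), [node]
        while stack:
            for caller in self.called_by[stack.pop()]:
                if caller not in seen:
                    seen.add(caller)
                    stack.append(caller)
        return seen

engine = ContextEngine()
engine.add_call("checkout_handler", "rate_limiter")
engine.add_call("payment_handler", "rate_limiter")
engine.add_call("api_router", "checkout_handler")
print(sorted(engine.blast_radius("rate_limiter")))
# -> ['api_router', 'checkout_handler', 'payment_handler']
```

A production context engine would add typed nodes (classes, modules, tests) and more edge kinds (imports, implements, tests), but the query pattern is the same: a graph traversal instead of a directory crawl.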
A single context engine can serve every phase of the software development lifecycle.
A PRD agent queries the context engine to understand existing capabilities, technical constraints, and module boundaries before generating a requirements document. It produces specs grounded in what the system actually is, not what someone thinks it is.
A spec agent traverses the code graph to identify affected components, surface similar prior implementations, flag integration points, and propose an architecture, all without reading a single file directly.
A coding agent retrieves the precise subgraph surrounding the feature area: the types it needs to implement, the interfaces it must satisfy, the patterns used in adjacent modules, the test conventions for this package. It writes code that fits the codebase, not just code that compiles.
A review agent queries the context engine to understand the semantic diff, not just what lines changed, but what that change means for the rest of the system. It can immediately surface affected callers, impacted tests, and interfaces whose contracts the change touches.
A risk agent scores every PR against the code graph, identifying high-centrality nodes (code that many things depend on), historically buggy modules, and changes that cross team ownership boundaries. No DORA metrics spreadsheet required.
A documentation agent can traverse the code graph to generate living documentation (architecture diagrams, module dependency maps, API contracts) that updates automatically as the codebase evolves. Design principles can be encoded as graph constraints and validated on every merge.
When a production incident occurs, an on-call agent queries the context engine with the failing component and gets an immediate blast-radius map, the last 10 changes to that subgraph, the owners, and the test coverage status. Time-to-understanding drops from hours to seconds.
The business case is simple: fewer review cycles, more first-pass approvals, and faster time-to-understanding during incidents.
This is not a theoretical architecture. Tools exist today: OpenRewrite for lossless semantic trees, embedding models and vector stores for semantic search, property graph databases for the code knowledge graph, and the Model Context Protocol for the agent-facing interface.
The missing piece is not any individual component. It is the platform that assembles them into a unified, repo-attached context engine that every agent in the SDLC can query through a single interface.
Source Context Management faces real engineering challenges: keeping the index fresh as the codebase changes, scaling graph and vector queries to repository size, and representing many languages in a single consistent model.
This is the shift:
A repository is not a collection of files. A repository is a knowledge graph with a version history attached.
Git's job is to version that knowledge. The context engine's job is to make it queryable. The agent's job is to act on it.
Follow this model and the consequences are concrete. Every CI/CD pipeline should include a context engine update step, as natural as running tests. Every developer platform should expose a context engine API alongside its code hosting API. Every AI coding tool should be evaluated not just on model quality but on context engine quality.
Source code repositories that don't invest in their context layer will produce agents that are fast but wrong. Repositories with rich, well-maintained context engines will produce agents that feel like senior engineers, because they have the same depth of understanding of the codebase that a senior engineer carries in their head.
The LSP gave us IDE intelligence. Git gave us version control. Docker gave us portable environments. Kubernetes gave us cluster orchestration. Each of these was an infrastructure primitive that unlocked a new generation of developer tooling.
The context engine is the next such primitive. It is the prerequisite for every agentic SDLC capability worth building. And like every infrastructure primitive before it, the teams and platforms that build it first will be hard to catch.
SCM is no longer just about managing source code. It's about managing the context that makes the source code understandable.


Did Terraform vendor lock-in just become your biggest operational risk without you noticing? When HashiCorp changed Terraform's license from MPL to BSL in August 2023, the legal terms were not the only thing that changed. The move fundamentally shifted the operational landscape for thousands of platform teams who built their infrastructure automation around what they believed was an open, community-driven tool. If your organization runs Terraform at scale, you're now facing a strategic decision that wasn't on your roadmap six months ago.
The uncomfortable truth is that most teams didn't architect for IaC portability. Why would they? Terraform was open source. It was the standard. And now, many organizations find themselves in a position they swore they'd never be in again after the Kubernetes wars: locked into a single vendor's roadmap, pricing model, and strategic priorities.
This isn't theoretical; it's the reality platform engineers are dealing with right now.
Terraform lock-in wasn't always a concern. For years, Terraform represented the opposite of vendor lock-in. It was open source, cloud-agnostic, and community-driven. Teams built entire operational models around it. They trained engineers, standardized on HCL, built module libraries, and integrated Terraform deeply into CI/CD pipelines. These were sensible investments at the time.
Then HashiCorp moved to the Business Source License. Suddenly, the "open" in "open source" came with conditions. The BSL restricts certain commercial uses, and while many organizations technically fall outside those restrictions, the change introduced uncertainty.
The deeper problem is architectural. Most teams didn't design for IaC engine portability because they didn't need to. Terraform state files, provider interfaces, and workflow patterns became embedded assumptions. Module libraries assumed Terraform syntax. Pipelines called `terraform plan` and `terraform apply` directly. When every workflow is tightly coupled to a single tool's CLI and API, switching becomes expensive.
This is classic vendor lock-in, even if it happened gradually and without malice.
The immediate cost of Terraform lock-in isn't the license itself; it's what you can't do when you're locked in.
If HashiCorp decides to sunset features, deprecate APIs, or introduce breaking changes, you either adapt or go without, stuck on an outdated version with mounting technical debt.
The operational risk compounds over time. When you're locked into a single IaC tool, you're also locked into its limitations. If drift detection isn't native, you build workarounds. If policy enforcement is bolted on, you maintain custom integrations. If the state backend causes performance issues at scale, you optimize around the bottleneck rather than solving the root problem.
And then there's the talent risk. If your team only knows Terraform, and the industry shifts toward other IaC paradigms, you're either retraining everyone or competing for a shrinking talent pool. Monocultures are fragile.
The good news is that escaping Terraform lock-in doesn't require a full rewrite. It requires a deliberate strategy to introduce portability into your IaC architecture.
OpenTofu emerged as the open-source fork of Terraform immediately after the license change. It's MPL-licensed, community-governed through the Linux Foundation, and API-compatible with Terraform 1.5.x. For most teams, OpenTofu migration is the lowest-friction path to regaining control over your IaC engine.
Migrating to OpenTofu doesn't mean abandoning your existing Terraform workflows. Because OpenTofu maintains compatibility with Terraform's core primitives, you can run OpenTofu side-by-side with Terraform during a transition. This lets you validate behavior, test edge cases, and build confidence before committing fully.
The strategic advantage of OpenTofu is not just licensing; it is optionality. Once you're no longer tied to HashiCorp's roadmap, you can evaluate IaC engines based on technical merit rather than sunk cost.
The harder part of escaping IaC vendor lock-in is decoupling your operational workflows from Terraform-specific patterns. This means abstracting your pipelines so they don't hardcode `terraform plan` and `terraform apply`. It means designing module interfaces that could theoretically support multiple engines. It means treating the IaC engine as an implementation detail rather than the foundation of your architecture.
This is where infrastructure as code portability becomes a design principle. If your pipelines call a generic "plan" and "apply" interface, switching engines becomes a simple configuration change, not a migration project.
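One way to sketch that interface, assuming a configuration value selects the engine. The `terraform` and `tofu` binary names are the real CLI names; everything else here is illustrative.

```python
# Engine-agnostic IaC interface: pipelines call plan()/apply() and the
# engine binary becomes a configuration detail, not an embedded assumption.
import subprocess

ENGINE_BINARIES = {"terraform": "terraform", "opentofu": "tofu"}

class IacRunner:
    def __init__(self, engine: str, workdir: str = "."):
        self.binary = ENGINE_BINARIES[engine]  # KeyError = unsupported engine
        self.workdir = workdir

    def command(self, action: str) -> list[str]:
        """Build the CLI invocation for 'plan' or 'apply'."""
        args = [self.binary, action]
        if action == "apply":
            args.append("-auto-approve")
        return args

    def run(self, action: str):
        """Execute the engine CLI in the workspace directory."""
        return subprocess.run(self.command(action), cwd=self.workdir, check=True)

# Switching engines is a one-line config change, not a pipeline rewrite:
print(IacRunner("terraform").command("plan"))  # ['terraform', 'plan']
print(IacRunner("opentofu").command("apply"))  # ['tofu', 'apply', '-auto-approve']
```

Because OpenTofu keeps CLI compatibility with Terraform's core commands, a thin wrapper like this is usually all the abstraction a pipeline needs during a side-by-side transition.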
The reality is that most large organizations will eventually run multiple IaC tools. Some teams will use OpenTofu. Others will stick with Terraform for compatibility with existing state. New projects might adopt Terragrunt for DRY configurations or Pulumi for type-safe infrastructure definitions.
Fighting this diversity creates friction. Embracing it requires tooling that supports multi-IaC environments without forcing everyone into a lowest-common-denominator workflow. You need a platform that treats OpenTofu, Terraform, and other engines as first-class citizens, not as competing standards.
Harness Infrastructure as Code Management was built to solve the multi-IaC problem that most teams are only now realizing they have. It doesn't force you to pick a single engine. It doesn't assume Terraform is the default. It treats OpenTofu and Terraform as equally supported engines, with workflows that abstract away engine-specific details while preserving the flexibility to use either.
This matters because escaping Terraform lock-in isn't just about switching tools. It's about building infrastructure automation that doesn't collapse the next time a vendor changes direction.
Harness IaCM supports OpenTofu and Terraform natively, which means you can run both engines in the same platform without maintaining separate toolchains. You get unified drift detection, policy enforcement, and workspace management across engines. If you're migrating from Terraform to OpenTofu, you can run both during the transition and compare results side-by-side.
The platform also supports Terragrunt, which means teams that have invested in DRY Terraform configurations don't have to throw away that work to gain vendor neutrality. You can keep your existing module structure while gaining the operational benefits of a managed IaC platform.
Beyond engine support, Harness IaCM addresses the systemic problems that make IaC vendor lock-in so painful. The built-in Module and Provider Registry means you're not dependent on third-party registries that could introduce their own lock-in. Variable Sets and Workspace Templates let you enforce consistency without hardcoding engine-specific logic into every pipeline. Default plan and apply pipelines abstract away the CLI layer, so switching engines doesn't require rewriting every workflow.
Drift detection runs continuously, which means you catch configuration drift before it becomes an incident. Policy enforcement happens at plan time, which means violations are blocked before they reach production. These aren't afterthoughts or plugins. They're native platform capabilities that work the same way regardless of which IaC engine you're using.
And because Harness IaCM is part of the broader Harness Platform, you can integrate IaC workflows with CI/CD, feature flags, and policy governance without duct-taping together disparate tools. This is the architectural model that makes multi-IaC tool management practical at scale.
Explore the Harness IaCM product or dive into the technical details in the IaCM docs.
Escaping Terraform lock-in is not about abandoning Terraform everywhere tomorrow. It's about regaining strategic control over your infrastructure automation. It's about designing for portability so that future licensing changes, roadmap shifts, or technical limitations don't force another painful migration.
The teams that will navigate this transition successfully are the ones that treat IaC engines as interchangeable components in a larger platform architecture. They're the ones that build workflows that abstract away engine-specific details. They're the ones that invest in tooling that supports multi-IaC environments without creating operational chaos.
If your organization is still locked into Terraform, now is the time to architect for optionality. Start by evaluating OpenTofu migration paths. Decouple your pipelines from engine-specific CLI calls. Adopt a platform that treats IaC engines as implementation details, not strategic dependencies.
Because the next time a vendor changes their license, you want to be in a position to evaluate your options, not scramble for a migration plan.


AI made writing code faster. It didn’t make releasing that code safer.
That’s the tension platform teams are dealing with right now. Development velocity is rising, but release operations still depend on too many manual decisions, too many disconnected tools, and too much tribal knowledge. Teams can deploy more often, but they still struggle to standardize how features are exposed, how approvals are handled, how risky changes are governed, and how old flags get cleaned up before they turn into debt.
That’s where the latest Harness FME integrations matter.
Harness Feature Management & Experimentation is no longer just a place to create flags and run tests. With recent pipeline integration and policy support, FME becomes part of a governed release system. That’s the bigger story.
Feature flags are valuable. But at scale, value comes from operationalizing them.
The software delivery gap is getting easier to see.
In a recent Harness webinar, Lena Sano, a software developer on the Harness DevRel team, and I framed the problem clearly: AI accelerates code creation, but the release system behind it often still looks manual, inconsistent, and fragile.
That perspective matters because both Lena and I sit close to the problem from different angles. I brought the platform and operating-model view. Lena showed what it actually looks like when feature release becomes pipeline-driven instead of person-driven.
The tension we described is familiar to most platform teams. When more code gets produced, more change reaches production readiness. That doesn’t automatically translate into safer releases. In fact, it usually exposes the opposite. Teams start batching more into each launch, rollout practices diverge from service to service, and approvals become a coordination tax instead of a control mechanism.
That’s why release discipline matters more in the AI era, not less.
Feature flags solve an important problem: they decouple deployment from release.
That alone is a major improvement. Teams can deploy code once, expose functionality gradually, target cohorts, run experiments, and disable a feature without redeploying the whole application.
But a flag by itself is not a release process.
I made the point directly in the webinar: feature flags are “the logical end of the pipeline process.” That line gets to the heart of the issue. When flags live outside the delivery workflow, teams get flexibility but not consistency. They can turn things on and off, but they still don’t have a standardized path for approvals, staged rollout, rollback decisions, or cleanup.
That’s where many programs stall. They adopt feature flags, but not feature operations.
The result is predictable: rollout paths that vary by team, approvals handled in side channels, and flags that never get cleaned up.
This is why platform teams need more than flagging. They need a repeatable system around feature release.
The recent Harness FME pipeline integration addresses exactly that gap.
In the webinar demo, Lena showed a feature release workflow where the pipeline managed status updates, targeting changes, approvals, rollout progression, experiment review, and final cleanup. I later emphasized that “95% of it was run by a single pipeline.”
That’s not just a useful demo line. It’s the operating model platform teams have been asking for.
The first value of pipeline integration is simple: teams get a common release language.
Instead of every service or squad improvising its own process, pipelines can define explicit rollout stages and expected transitions. A feature can move from beta to ramping to fully released in a consistent, visible way.
That sounds small, but it isn’t. Standardized states create transparency, reduce confusion during rollout, and make it easier for multiple teams to understand where a change actually is.
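A minimal sketch of what standardized rollout states can look like, assuming hypothetical stage names; this illustrates the idea of explicit, enforced transitions, not the FME state model itself:

```python
# Illustrative rollout state machine; stage names are assumptions, not an FME API.
ALLOWED_TRANSITIONS = {
    "beta": {"ramping"},
    "ramping": {"ramping", "released", "beta"},  # ramp further, finish, or back off
    "released": set(),                           # terminal: cleanup comes next
}

def advance(current: str, target: str) -> str:
    """Move a feature between rollout stages, rejecting illegal jumps."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Because the allowed transitions live in one place, a pipeline can refuse a jump straight from beta to fully released, which is exactly the kind of improvised shortcut that makes rollouts hard to reason about.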
Approvals are often where release velocity goes to die.
Without pipelines, approvals happen per edit or through side channels. A release manager, product owner, or account team gets pulled in repeatedly, and the organization calls that governance.
It isn’t. It’s coordination overhead.
Harness pipelines make approvals part of the workflow itself. That means platform teams can consolidate approval logic, trigger it only when needed, and capture the decision in the same system that manages the rollout.
That matters operationally and organizationally. It reduces noise for approvers, creates auditability, and keeps release evidence close to the actual change.
One of the most useful ideas in the webinar was that rollback should depend on what actually failed.
If the problem is isolated to a feature treatment, flip the flag. If the issue lives in the deployment itself, use the pipeline rollback or redeploy path. That flexibility matters because forcing every incident through a full application rollback is both slower and more disruptive than it needs to be.
With FME integrated into pipelines, teams don’t have to choose one blunt response for every problem. They can respond with the right mechanism for the failure mode.
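The decision logic is simple enough to sketch. The scope labels and response names below are illustrative assumptions, not Harness APIs:

```python
# Map each failure mode to the narrowest remediation that addresses it.
# Labels are hypothetical; the point is matching response to blast radius.
RESPONSES = {
    "feature_treatment": "kill_flag",       # flip the flag, no redeploy needed
    "deployment": "pipeline_rollback",      # roll back the deployed artifact
    "infrastructure": "redeploy_previous",  # redeploy a known-good version
}

def choose_response(failure_scope: str) -> str:
    """Pick the remediation that matches where the failure actually lives."""
    try:
        return RESPONSES[failure_scope]
    except KeyError:
        raise ValueError(f"unknown failure scope: {failure_scope}") from None
```

The design choice worth noting: the cheapest response (a flag flip) is reserved for failures isolated to a treatment, so a bad feature never forces a full application rollback.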
That’s how release systems get safer.
Most organizations talk about flag debt after they’ve already created it.
The demo tackled that problem directly by making cleanup part of the release workflow. Once the winning variant was chosen and the feature was fully released, the pipeline paused for confirmation that the flag reference had been removed from code. Then targeting was disabled and the release path was completed.
That is a much stronger model than relying on someone to remember cleanup later.
Feature flags create leverage when they’re temporary control points. They create drag when they become permanent artifacts.
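One way to keep flags temporary is to surface the stragglers automatically. A small sketch, assuming a hypothetical flag record shape rather than the FME data model:

```python
from datetime import date, timedelta

def stale_flags(flags: list[dict], today: date, max_age_days: int = 30) -> list[str]:
    """Return names of flags fully released longer than max_age_days ago.

    These are cleanup candidates: the flag reference should be removed from
    code and targeting disabled. The record shape here is an assumption.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [
        f["name"]
        for f in flags
        if f["status"] == "released" and f["released_on"] <= cutoff
    ]
```

Run as a scheduled pipeline step, a check like this turns "someone should clean that up" into a concrete, reviewable list.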
Pipelines standardize motion. Policies standardize behavior.
That’s why the recent FME policy integration matters just as much as pipeline integration.
As organizations move from dozens of flags to hundreds or thousands, governance breaks down fast. Teams start hitting familiar failure modes: flags without owners, inconsistent naming, unsafe default treatments, production targeting mistakes, segments that expose sensitive information, and change requests that depend on people remembering the rules.
Policy support changes that.
Harness now brings Policy as Code into feature management so teams can enforce standards automatically instead of managing them with review boards and exceptions.
This is the core release management tradeoff most organizations get wrong.
They think the only way to increase safety is to add human checkpoints everywhere. That works for a while. Then scale arrives, and those checkpoints become the bottleneck.
Harness takes a better approach. Platform teams can define policies once using OPA and Rego, then have Harness automatically evaluate changes against those policy sets in real time.
That means developers get fast feedback without waiting for a meeting, and central teams still get enforceable guardrails.
That is what scalable governance looks like.
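In Harness, the rules themselves are written in Rego and evaluated by OPA. As a language-neutral illustration of what such a policy set might encode, here is a Python sketch; the specific rules and flag shape are assumptions, not shipped policies:

```python
import re

def violations(flag: dict) -> list[str]:
    """Evaluate a flag change against illustrative guardrails.

    Mimics the kind of checks a Rego policy set could express:
    naming conventions, required ownership, and safe defaults.
    """
    problems = []
    if not re.fullmatch(r"[a-z][a-z0-9_]*", flag.get("name", "")):
        problems.append("name must be lower_snake_case")
    if not flag.get("owner"):
        problems.append("flag must declare an owner")
    if flag.get("default_treatment") != "off":
        problems.append("default treatment must be 'off' for new flags")
    return problems
```

Evaluated at change time, an empty result means the change proceeds; any violation blocks it with a specific, actionable message instead of a review-board queue.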
The strongest part of the policy launch is that it doesn’t stop at the flag object itself.
It covers the areas where release risk actually shows up: flag configuration, targeting rules, segments, and the change requests that modify them.
That matters because most rollout failures aren’t caused by the existence of a flag. They’re caused by how that flag is configured, targeted, or changed.
Governance only works when it matches how organizations are structured.
Harness policy integration supports that with scope and inheritance across the account, organization, and project levels. Platform teams can set non-negotiable global guardrails where they need them, while still allowing business units or application teams to define more specific policies in the places that require flexibility.
That is how you avoid the two classic extremes: the wild west and the central committee.
Global standards stay global. Team-level nuance stays possible.
The most important point here is not that Harness added two more capabilities.
It’s that these capabilities strengthen the same release system.
Pipelines standardize the path from deployment to rollout. FME controls release exposure, experimentation, and feature-level rollback. Policy as Code adds guardrails to how teams create and change those release controls. Put together, they form a more complete operating layer for software change.
That is the Harness platform value.
A point tool can help with feature flags. Another tool can manage pipelines. A separate policy engine can enforce standards. But when those pieces are disconnected, the organization has to do the integration work itself. Process drift creeps in between systems, and teams spend more time coordinating tools than governing change.
Harness moves that coordination into the platform.
This is the same platform logic that shows up across continuous delivery and GitOps, Feature Management & Experimentation, and modern progressive delivery strategies. The more release decisions can happen in one governed system, the less organizations have to rely on handoffs, tickets, and tribal knowledge.
The webinar and the new integrations point to a clearer operating model for modern release management.
Use CD to ship the application safely. Then use FME to expose the feature by cohort, percentage, region, or treatment.
Standardize stages, approvals, status transitions, and evidence collection so every release doesn’t invent its own operating model.
Move governance into Policy as Code. Don’t ask people to remember naming standards, metadata requirements, targeting limits, or approval conditions.
Use the flag, the pipeline, or a redeploy path based on the actual failure mode. Don’t force every issue into one response pattern.
Treat cleanup as a first-class release step, not a future best intention.
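The "expose by cohort or percentage" step above is worth making concrete. A common way to implement percentage exposure is deterministic hash bucketing, sketched below; this illustrates the general technique, not FME's actual assignment algorithm:

```python
import hashlib

def in_rollout(user_id: str, flag: str, percentage: float) -> bool:
    """Deterministically decide whether a user falls inside a rollout percentage.

    Hashing user and flag together means each user gets a stable decision per
    flag, and raising the percentage only ever adds users, never reshuffles them.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < percentage / 100
```

Determinism is the key property: a user who saw the feature at 10% still sees it at 25%, which keeps experiment cohorts clean as the rollout ramps.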
This is the shift platform engineering leaders should care about. The goal isn’t to add feature flags to the stack. It’s to build a governed release system that can absorb AI-era change volume without depending on heroics.
If this model is working, the signal should show up in operational metrics.
Start with the operational indicators that tell you whether release governance is scaling or just getting noisier.
AI made software creation faster, but it also exposed how weak most release systems still are.
Feature flags help. Pipelines help. Policy as code helps. But the real value shows up when those capabilities work together as one governed release model.
That’s what Harness FME now makes possible. Teams can standardize rollout paths, automate approvals where they belong, enforce policy without slowing delivery, and clean up flags before they become operational debt. That is what it means to release fearlessly on a platform, not just with a point tool.
Ready to see how Harness helps platform teams standardize feature releases with built-in governance? Contact Harness for a demo.
Pipelines automate deployment and standardize release workflows. Feature flags decouple deployment from feature exposure, which gives teams granular control over rollout, experimentation, and rollback. Together, they create a safer and more repeatable release system.
It brings feature release actions into the same workflow that manages delivery. Teams can standardize status changes, targeting, approvals, rollout progression, and cleanup instead of handling those steps manually or in separate tools.
At scale, manual governance breaks down. Policy as code lets platform teams enforce standards automatically on flags, targeting rules, segments, and change requests so safety doesn’t depend on people remembering the rules.
Teams can enforce naming conventions, ownership and tagging requirements, safer targeting defaults, environment-specific rollout rules, segment governance, and approval requirements for sensitive change requests.
It reduces risk by combining progressive rollout controls with standardized workflows and automated governance. Teams can limit blast radius, catch unsafe changes earlier, and respond with the right rollback path when issues appear.
It shows how Harness connects delivery automation, feature release control, and governance in one system. That reduces toolchain sprawl and turns release management into a platform capability rather than a collection of manual steps.
They make cleanup part of the workflow. When the rollout is complete and the winning treatment is chosen, the pipeline should require confirmation that the flag has been removed from code and no longer needs active targeting.


Releasing fearlessly isn't just about getting code into production safely. It's about knowing what happened after the release, trusting the answer, and acting on it without stitching together three more tools.
That is where many teams still break down.
They can deploy. They can gate features. They can even run experiments. But the moment they need trustworthy results, the workflow fragments. Event data moves into another system. Metric definitions drift from business logic. Product, engineering, and data teams start debating the numbers instead of deciding what to do next.
That's why Warehouse Native Experimentation matters.
Today, Harness is making Warehouse Native Experimentation generally available in Feature Management & Experimentation (FME). After proving the model in beta, this capability is now ready for broader production use by teams that want to run experiments directly where their data already lives.
This is an important launch on its own. It is also an important part of the broader Harness platform story.
Because “release fearlessly” is incomplete if experimentation still depends on exported datasets, shadow pipelines, and black-box analysis.
The AI era changed one thing fast: the volume of change.
Teams can create, modify, and ship software faster than ever. What didn't automatically improve was the system that turns change into controlled outcomes. Release coordination, verification, experimentation, and decision-making are still too often fragmented across different tools and teams.
That's the delivery gap.
In a recent Harness webinar, Lena Sano, a software developer on the Harness DevRel team, and I showed why this matters. Our point was straightforward: deployment alone is not enough. As I said in the webinar, feature flags are “the logical end of the pipeline process.”
That framing matters because it moves experimentation out of the “nice to have later” category and into the release system itself.
When teams deploy code with Harness Continuous Delivery, expose functionality with Harness FME, and now analyze experiment outcomes with trusted warehouse data, the release moment becomes a closed loop. You don't just ship. You learn.
Warehouse Native Experimentation extends Harness FME with a model that keeps experiment analysis inside the data warehouse instead of forcing teams to export data into a separate analytics stack.
That matters for three reasons.
First, it keeps teams closer to the source of truth the business already trusts.
Second, it reduces operational drag. Teams do not need to build and maintain pipelines that move assignment and event data around just to answer basic product questions.
Third, it makes experimentation more credible across functions. Product teams, engineers, and data stakeholders can work from the same governed data foundation instead of arguing over two competing systems.
General availability makes this model ready to support production experimentation programs that need more than speed. They need trust, repeatability, and platform-level consistency.
Traditional experimentation workflows assume that analysis can happen somewhere downstream from release. That assumption does not hold up well anymore.
When development velocity rises, so does the volume of features to evaluate. Teams need faster feedback loops, but they also need stronger confidence in the data behind the decision. If every experiment requires moving data into another system, recreating business metrics, and validating opaque calculations, the bottleneck just shifts from deployment to analysis.
That's the wrong pattern for platform teams.
Platform teams are being asked to support higher release frequency without increasing risk. They need standardized workflows, strong governance, and fewer manual handoffs. They do not need another disconnected toolchain where experimentation introduces more uncertainty than it removes.
Warehouse Native Experimentation addresses that by bringing experimentation closer to the release process and closer to trusted business data at the same time.
This launch matters because it changes how experimentation fits into the software delivery model.


Warehouse Native Experimentation lets teams run analyses directly in supported data warehouses rather than exporting experiment data into an external system first.
That is a meaningful shift.
It means your experiment logic can operate where your product events, business events, and governed data models already exist. Instead of copying data out and hoping definitions stay aligned, teams can work from the warehouse as the source of truth.
For organizations already invested in platforms like Snowflake or Amazon Redshift, this reduces friction and increases confidence. It also helps avoid the shadow-data problem that shows up when experimentation becomes one more separate analytics island.
Good experimentation depends on metric quality.
Warehouse Native Experimentation lets teams define metrics from the warehouse tables they already trust. That includes product success metrics as well as guardrail metrics that help teams catch regressions before they become larger incidents.
This is a bigger capability than it may appear.
Many experimentation programs fail not because teams lack ideas, but because they cannot agree on what success actually means. When conversion, latency, revenue, or engagement are defined differently across tools, the experiment result becomes negotiable.
Harness moves that discussion in the right direction. The metric should reflect the business reality, not the reporting limitations of a separate experimentation engine.
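To make the idea concrete, here is a minimal sketch of an experiment readout that pairs a success metric with a latency guardrail. The input dicts stand in for aggregates you would compute in the warehouse; the field names and threshold are assumptions for illustration:

```python
def evaluate(control: dict, treatment: dict,
             max_latency_regression: float = 0.05) -> dict:
    """Compare a success metric between variants and check a guardrail.

    Each variant dict holds precomputed warehouse aggregates (assumed shape):
    user count, conversion count, and p95 latency in milliseconds.
    """
    def rate(v: dict) -> float:
        return v["conversions"] / v["users"]

    lift = rate(treatment) - rate(control)
    latency_ratio = treatment["p95_latency_ms"] / control["p95_latency_ms"]
    return {
        "lift": lift,
        # Guardrail: treatment may not regress p95 latency by more than 5%.
        "guardrail_ok": latency_ratio <= 1 + max_latency_regression,
    }
```

Because the aggregates come straight from governed warehouse tables, anyone who doubts the readout can inspect the query that produced them rather than a black box.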
Speed matters. Trust matters more.
Warehouse Native Experimentation helps teams understand impact with results that are transparent and inspectable. That gives engineering, product, and data teams a better basis for action.
The practical benefit is simple: when a result looks surprising, teams can validate the logic instead of debating whether the tool is doing something hidden behind the scenes.
That transparency is a major part of the launch story. Release fear decreases when teams trust both the rollout controls and the data used to judge success.
Warehouse Native Experimentation is valuable on its own. But its full value shows up when you look at how it fits into the Harness platform.
In the webinar, Lena demonstrated a workflow where a pipeline controlled flag status, targeting, approvals, rollout progression, and even cleanup. I emphasized that “95% of it was run by a single pipeline.”
That is not just a demo detail. It is the operating model platform teams want.
Pipelines make releases consistent. They reduce team-to-team variation. They create auditability. They turn release behavior into a reusable system instead of a series of manual decisions.
Harness FME gives teams the ability to decouple deployment from release, expose features gradually, target specific cohorts, and run experiments as part of a safer delivery motion.
That is already powerful.
It lets teams avoid full application rollback when one feature underperforms. It lets them isolate problems faster. It gives product teams a structured way to learn from real usage without treating every feature launch like an all-or-nothing event.
Warehouse Native Experimentation completes that model.
Now the experiment does not end at exposure control. It continues into governed analysis using the data infrastructure the business already depends on. The result is a tighter loop from release to measurement to decision.
That is why this is a platform launch.
Harness is not asking teams to choose between delivery tooling, experimentation tooling, and warehouse trust. The platform brings those motions together: pipelines to ship safely, FME to control exposure, and warehouse-native analysis to judge impact.
That is what “release fearlessly” looks like when it extends beyond deployment.
Engineering leaders should think about this launch as a better operating model for software change.
Release with control. Use pipelines and feature flags to separate deployment from feature exposure.
Verify with the right signals. Use guardrail metrics and rollout logic to contain risk before it spreads.
Learn from trusted data. Run experiments against the warehouse instead of recreating the truth somewhere else.
Standardize the process. Make approvals, measurement, and cleanup part of the same repeatable workflow.
This is especially important for platform teams trying to keep pace with AI-assisted development. More code generation only helps the business if the release system can safely absorb more change and turn it into measurable outcomes.
Warehouse Native Experimentation helps make that possible.
This feature will be especially relevant for teams that already run mature data warehouses, operate under strong governance requirements, or need to scale experimentation across many teams.
As software teams push more change through the system, trusted experimentation can no longer sit off to the side. It has to be part of the release model itself.
Harness now gives teams a stronger path to do exactly that: deploy safely, release progressively, and measure impact where trusted data already lives. That is not just better experimentation. It is a better software delivery system.
Ready to see how Harness helps teams release fearlessly with trusted, warehouse-native experimentation? Contact Harness for a demo.
Warehouse Native Experimentation is a capability in Harness FME that lets teams analyze experiment outcomes directly in their data warehouse. That keeps experimentation closer to governed business data and reduces the need to export data into separate analysis systems.
GA signals that the capability is ready for broader production adoption. For platform and product teams, that means Warehouse Native Experimentation can become part of a standardized release and experimentation workflow rather than a limited beta program.
Traditional approaches often require moving event data into a separate system for analysis. Warehouse-native experimentation keeps analysis where the data already lives, which improves trust, reduces operational overhead, and helps align experiment metrics with business definitions.
Safer releases are not only about deployment controls. They also require trusted feedback after release. Warehouse Native Experimentation helps teams learn from production changes using governed warehouse data, making release decisions more confident and more repeatable.
Harness pipelines help standardize the release workflow, while Harness FME controls rollout and experimentation. Warehouse Native Experimentation adds trusted measurement to that same motion, closing the loop from deployment to exposure to decision.
Organizations with mature data warehouses, strong governance requirements, and a need to scale experimentation across teams will benefit most. It is especially relevant for platform teams that want experimentation to be part of a consistent software delivery model.


A financial services company ships code to production 47 times per day across 200+ microservices. Their secret isn't running fewer tests; it's running the right tests at the right time.
Modern regression testing must evolve beyond brittle test suites that break with every change. It requires intelligent test selection, process parallelization, flaky test detection, and governance that scales with your services.
Harness Continuous Integration brings these capabilities together, pairing intelligent test selection with machine learning that detects deployment anomalies and automatically rolls back failures before they impact customers. This framework covers definitions, automation patterns, and scale strategies that turn regression testing into an operational advantage. Ready to deliver faster without fear?
Managing updates across hundreds of services makes regression testing a daily reality, not just a testing concept. Regression testing in CI/CD ensures that new code changes don’t break existing functionality as teams ship faster and more frequently. In modern microservices environments, intelligent regression testing is the difference between confident daily releases and constant production risk.
These terms often get used interchangeably, but they serve different purposes in your pipeline. Understanding the distinction helps you avoid both redundant test runs and dangerous coverage gaps.
In practice, you run them sequentially: retest the fix first, then run regression suites scoped to the affected services. For microservices environments with hundreds of interdependent services, this sequencing prevents cascade failures without creating deployment bottlenecks.
The challenge is deciding which regression tests to run. A small change to one service might affect three downstream dependencies, or even thirty. This is where governance rules help. You can set policies that automatically trigger retests on pull requests and broader regression suites at pre-production gates, scoping coverage based on change impact analysis rather than gut feel.
To summarize: regression testing checks that existing functionality still works after a change. Retesting verifies that a specific bug fix works as intended. Both are essential, but they serve different purposes in CI/CD pipelines.
The regression testing process works best when it matches your delivery cadence and risk tolerance. Smart timing prevents bottlenecks while catching regressions before they reach users.
This layered approach balances speed with safety. Developers get immediate feedback while production deployments include comprehensive verification. Next, we'll explore why this structured approach becomes even more critical in microservices environments where a single change can cascade across dozens of services.
Modern enterprises managing hundreds of microservices face three critical challenges: changes that cascade across dependent systems, regulatory requirements demanding complete audit trails, and operational pressure to maintain uptime while accelerating delivery.
A single API change can break dozens of downstream services you didn't know depended on it.
Financial services, healthcare, and government sectors require documented proof that tests were executed and passed for every promotion.
Catching regressions before deployment saves exponentially more than fixing them during peak traffic.
With the stakes clear, the next question is which techniques to apply.
Modern CI/CD demands regression testing that balances thoroughness with velocity. The most effective techniques fall into three categories: selective execution, integration safety, and production validation, plus a few pragmatic variants you’ll use day-to-day.
These approaches work because they target specific failure modes. Smart selection outperforms broad coverage when you need both reliability and rapid feedback.
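Change-impact-based selection is easy to sketch: given a service dependency graph, a change to one service selects regression suites for every service it can reach. The graph below is a hypothetical example, not a real topology:

```python
# Hypothetical dependency graph: service -> set of services that depend on it.
DEPENDENTS = {
    "payments": {"checkout", "invoicing"},
    "checkout": {"storefront"},
}

def affected_services(changed: set[str]) -> set[str]:
    """Walk the graph to find every service a change can cascade into.

    The result scopes which regression suites actually need to run,
    instead of running everything on every pull request.
    """
    seen = set(changed)
    frontier = list(changed)
    while frontier:
        svc = frontier.pop()
        for dependent in DEPENDENTS.get(svc, set()):
            if dependent not in seen:
                seen.add(dependent)
                frontier.append(dependent)
    return seen
```

A change to `payments` selects suites for four services here; a change to `storefront` selects only its own. That asymmetry is the whole value of impact analysis over gut feel.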
Managing regression testing across 200+ microservices doesn't require days of bespoke pipeline creation. Harness Continuous Integration provides the building blocks to transform testing from a coordination nightmare into an intelligent safety net that scales with your architecture.
Step 1: Generate pipelines with context-aware AI. Start by letting Harness AI build your pipelines based on industry best practices and the standards within your organization. The approach is interactive, and you can refine the pipelines with Harness as your guide. Ensure that the standard scanners are run.
Step 2: Codify golden paths with reusable templates. Create Harness pipeline templates that define when and how regression tests execute across your service ecosystem. These become standardized workflows embedding testing best practices while giving developers guided autonomy. When security policies change, update a single template and watch it propagate to all pipelines automatically.
Step 3: Enforce governance with Policy as Code. Use OPA policies in Harness to enforce minimum coverage thresholds and required approvals before production promotions. This ensures every service meets your regression standards without manual oversight.
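Harness evaluates these gates with OPA's Rego language, but the gating logic is easy to see in a language-neutral sketch. The payload fields (`coverage`, `approvals`) and thresholds below are illustrative assumptions, not the actual Harness policy schema:

```python
def evaluate_promotion(pipeline: dict,
                       min_coverage: float = 80.0,
                       required_approvals: int = 2) -> list:
    """Return policy violations; an empty list means the promotion is allowed."""
    violations = []
    coverage = pipeline.get("coverage", 0.0)
    approvals = pipeline.get("approvals", 0)
    if coverage < min_coverage:
        violations.append(
            f"coverage {coverage}% is below the required {min_coverage}%")
    if approvals < required_approvals:
        violations.append(
            f"{approvals} approvals given, {required_approvals} required")
    return violations

# A build at 72.5% coverage with one approval is denied with
# actionable messages rather than a bare failure:
# evaluate_promotion({"coverage": 72.5, "approvals": 1})
```

The value of expressing this as policy-as-code is that the deny messages travel with the decision, so developers see exactly which standard they missed.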
With automation in place, the next step is avoiding the pitfalls that derail even well-designed pipelines.
Regression testing breaks down when flaky tests erode trust and slow suites block every pull request. These best practices focus on governance, speed optimization, and data stability.
Regression testing in CI/CD enables fast, confident delivery when it’s selective, automated, and governed by policy. Applied well, it transforms from a release bottleneck into an automated protection layer: selective test prioritization, automated regression gates, and policy-backed governance create confidence without sacrificing speed.
The future belongs to organizations that make regression testing intelligent and seamless. When regression testing becomes part of your deployment workflow rather than an afterthought, shipping daily across hundreds of services becomes the norm.
Ready to see how context-aware AI, OPA policies, and automated test intelligence can accelerate your releases while maintaining enterprise governance? Explore Harness Continuous Integration and discover how leading teams turn regression testing into their competitive advantage.
These practical answers address timing, strategy, and operational decisions platform engineers encounter when implementing regression testing at scale.
Run targeted regression subsets on every pull request for fast feedback. Execute broader suites on main-branch merges with parallelization. Schedule comprehensive regression testing before production deployments, then use core end-to-end tests as synthetic checks during canary rollouts to catch issues under live traffic.
Retesting validates a specific bug fix — did the payment timeout issue get resolved? Regression testing ensures that the fix doesn’t break related functionality like order processing or inventory updates. Run retests first, then targeted regression suites scoped to affected services.
There's no universal number. Coverage requirements depend on risk tolerance, service criticality, and regulatory context. Focus on covering critical user paths and high-risk integration points rather than chasing percentage targets. Use policy-as-code to enforce minimum thresholds where compliance requires it, and supplement test coverage with AI-powered deployment verification to catch regressions that test suites miss.
No. Full regression on every commit creates bottlenecks. Use change-based test selection to run only tests affected by code modifications. Reserve comprehensive suites for nightly runs or pre-release gates. This approach maintains confidence while preserving velocity across your enterprise delivery pipelines.
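As a rough sketch of what change-based selection looks like, assuming a coverage map built from a prior instrumented run (the project layout and test names here are hypothetical):

```python
from pathlib import PurePosixPath

def select_tests(changed_files: list, coverage_map: dict) -> set:
    """Return the test modules affected by a change set.

    coverage_map maps a source module name to the tests known to
    exercise it (e.g. built from a previous instrumented coverage run).
    """
    selected = set()
    for path in changed_files:
        module = PurePosixPath(path).stem  # "src/payments.py" -> "payments"
        selected |= coverage_map.get(module, set())
    return selected

coverage_map = {
    "payments": {"test_payments", "test_checkout"},
    "inventory": {"test_inventory"},
}
# A payments-only change runs only the payment-related suites:
affected = select_tests(["src/payments.py"], coverage_map)
```

Real test-intelligence systems track coverage at a much finer granularity, but the principle is the same: the change set, not the calendar, decides which tests run.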
Quarantine flaky tests immediately, rather than letting them block pipelines. Tag unstable tests, move them to separate jobs, and set clear SLAs for fixes. Use failure strategies like retry logic and conditional execution to handle intermittent issues while maintaining deployment flow.
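A minimal sketch of that retry-then-quarantine failure strategy, with illustrative function names (real pipelines would express this as step-level retry policies and quarantine tags):

```python
import time

def run_with_retries(test_fn, max_attempts: int = 3,
                     delay_seconds: float = 0.0) -> str:
    """Return 'passed', or 'quarantine' if every attempt failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            test_fn()
            return "passed"
        except AssertionError:
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # back off before the next try
    return "quarantine"

# A test that fails once and then passes is retried to green:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 2:
        raise AssertionError("intermittent failure")

# A consistently failing test gets routed to quarantine for triage
# instead of blocking the deployment flow:
def always_fails():
    raise AssertionError("real failure")
```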
Treat test code with the same rigor as application code. That means version control, code reviews, and regular cleanup of obsolete tests. Use policy-as-code to enforce coverage thresholds across teams, and leverage pipeline templates to standardize how regression suites execute across your service portfolio.


When an offensive security AI agent can compromise one of the world’s most sophisticated consulting firms in under two hours with no credentials, guidance, or insider knowledge, it’s not just a breach; it’s a warning sign for the entire industry.
That’s exactly what happened when an AI agent targeted McKinsey’s Generative AI platform, Lilli. The agent chained together application flaws, API misconfigurations, and AI-layer vulnerabilities into a machine-speed attack. This wasn’t a novel zero-day exploit. It was the exploitation of familiar application security gaps and newer AI attack vectors, amplified by AI speed, autonomy, and orchestration.
Enterprises are already connecting functionality and troves of data through APIs. Increasingly, they’re wiring up applications with Generative AI and agentic workflows to accelerate their businesses. The risk of intellectual property loss and sensitive data exposure is amplified exponentially. Organizational teams must rethink their AI security strategy and likely also revisit API security in parallel.
Let’s be precise about what happened, without blaming McKinsey for moving at a pace with application and AI technology that much of the industry is already adopting.
The offensive AI agent probing McKinsey’s AI system was quickly able to:
From there, the AI agent accessed:
Even experienced penetration testers don’t move this fast, not without AI tools to augment their testing. Many would struggle to find the type of SQL injection flaw present, let alone all the other elements in the attack chain.
What makes this security incident different and intriguing is how the AI agent crossed layers of the technology stack that are now prominent in AI-native designs.
McKinsey’s search API was vulnerable to blind SQL injection. The AI agent discovered that while values were parameterized (a security best practice), it could still inject into JSON keys used as field names in the backend database and analyze the resulting error messages. Through continued probing and evaluation of these error messages, the agent mapped the query structure and extracted production data.
These are long-known weaknesses in how applications are secured. Many organizations rely on web application firewall (WAF) instances to filter and monitor web application traffic and to stop attacks such as SQL injection. However, attack methods constantly evolve. Blind SQL injection, where attackers infer information from the system without seeing direct results, is harder to detect and works by analyzing system responses to invalid queries, such as those that delay server response. These attacks can also be made to look like normal data traffic.
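To make the key-injection pattern concrete, here is a minimal sketch (with an illustrative schema, not McKinsey's actual system) showing how parameterizing values while trusting JSON keys as field names still leaves an injection path, and how an allowlist closes it:

```python
import sqlite3

def vulnerable_search(conn, filters: dict):
    # Values go through placeholders (good), but the JSON keys are
    # interpolated into the SQL as field names (the flaw).
    clause = " AND ".join(f"{key} = ?" for key in filters)
    return conn.execute(
        f"SELECT name FROM users WHERE {clause}", list(filters.values())
    ).fetchall()

def safe_search(conn, filters: dict):
    # Same query shape, but keys are validated against an allowlist first.
    allowed = {"name", "role"}
    unknown = set(filters) - allowed
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    clause = " AND ".join(f"{key} = ?" for key in filters)
    return conn.execute(
        f"SELECT name FROM users WHERE {clause}", list(filters.values())
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "user")])

# Searching for a user that doesn't exist still leaks the admin row,
# because the crafted *key* rewrites the WHERE clause:
leaked = vulnerable_search(conn, {"role = 'admin' OR name": "nobody"})
```

Note that the vulnerable version still "uses parameterized queries" in the narrow sense; the lesson is that every string reaching the SQL text, including field names, needs validation.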
Security teams need monitoring capabilities that analyze application traffic over time to identify anomalous behaviors and the signals of an attack.
The offensive agent quickly performed reconnaissance of McKinsey’s system to understand its API footprint and discovered that 22 API endpoints were unauthenticated, one of which served as the initial and core point of compromise.
The public API documentation served as a roadmap for the AI agent, detailing the system's structure and functionality. This presents a tricky proposition, since well-documented APIs and API schema definitions are critical to increasing adoption of productized APIs, enabling AIs to find your services, and facilitating agent orchestration.
APIs aren’t just data pipes anymore; they’re also control planes for AI systems.
APIs serve as control planes in AI-native designs, managing the configuration of model commands and access controls, and also connecting the various AI and data services. Compromising this layer enables attackers to manipulate AI configuration, control AI behavior, and exfiltrate data.
The major oversight here was the presence of 22 unauthenticated API endpoints that allowed unfettered access. This is a critical API security vulnerability, known as broken authentication.
Lack of proper authorization enabled the AI agent to manipulate unique identifiers assigned to data objects within the API calls, increase its own access permissions (escalate privileges), and retrieve other users' data. The weakness is commonly known as broken object-level authorization (BOLA), where system checks fail to restrict user or machine access to specific data. McKinsey’s AI design also allowed direct API access to backend systems, potentially exposing internal technical resources and violating zero-trust architecture (ZTA) principles. With ZTA, you must presume that the given identity and the environment are compromised, operate with least privilege, and ensure controls are in place to limit blast radius in the event of an attack. At a minimum, all identities must be continuously authenticated and authorized before accessing resources.
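A minimal sketch of the missing control, using a hypothetical data model: the broken handler trusts whatever object ID arrives in the request, while the fixed handler re-checks ownership on every fetch:

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    owner: str
    body: str

STORE = {
    "doc-1": Document("doc-1", "alice", "alice's notes"),
    "doc-2": Document("doc-2", "bob", "bob's notes"),
}

def get_document_bola(user: str, doc_id: str) -> Document:
    # Broken: the caller is authenticated, but nothing checks whether
    # this user may read this particular object ID.
    return STORE[doc_id]

def get_document_checked(user: str, doc_id: str) -> Document:
    doc = STORE[doc_id]
    if doc.owner != user:  # the object-level authorization check
        raise PermissionError(f"{user} may not read {doc_id}")
    return doc
```

In a real system the ownership check would consult a policy engine or ACL rather than a single `owner` field, but the invariant is the same: authorization happens per object, per request.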
A breach in an AI system essentially provides centralized access to all organizational knowledge. A successful intrusion can grant control over system logic via features such as writable system prompts. This enables attackers to rewrite AI guardrails, subtly steering AI to bypass compliance policies, generate malicious code, or leak sensitive information.
New risks arise when organizations aim to improve AI system usefulness by grounding them with other sources (e.g., web searches, databases, documents, files) or using retrieval-augmented generation (RAG) pipelines that connect data sources to AI systems. This is done to tweak the prompts sent to LLMs and improve the quality of responses. However, attackers exploit these connections to corrupt the information processing or trick the AI into revealing sensitive or proprietary data.
With its elevated access, the AI agent had the ability to gain influence over:
A breach in the AI layer is not just a security incident, but a core attack on the integrity and competence of the business.
The rise of generative AI has further dissolved traditional security perimeters and created critical new attack vectors. Attackers can now target core mechanisms of institutional intelligence and reasoning, not just data.
Traditional "defense in depth" thinking segments application and AI protection into isolated layers, commonly WAFs, API gateways, API runtime security, and AI guardrails. While offering granular protection, such approaches inadvertently create a critical security blind spot: they fail to track sophisticated, multi-stage attacks that exploit handoffs between application layers.
Modern attacks are fluid campaigns. They may target frontend code as the initial attack vector, abuse APIs to attack business logic, bypass access controls enforced by gateways, pivot to database services for data exfiltration, and leverage access to manipulate reasoning of AI services.
The fatal flaw is the inability to maintain a single, unbroken chain of contextual awareness across the entire sequence. Each isolated WAF, gateway, or AI guardrail only sees a segment of the event and loses visibility once the request passes to the next layer. This failure to correlate events in real-time across APIs, applications, databases, and AI services is the blind spot that attackers exploit. By the time related signals are gathered and correlated in an organization’s SIEM, the breach has already occurred. True resilience requires a unified runtime platform to quickly identify, correlate, and respond to complex application attack chains.
To connect signals and stop advanced attacks, organizations need correlated visibility and control across their application, API, and AI footprint. This essential capability comes from three key elements.
A platform must identify your application assets by combining and analyzing traffic signals from:
Runtime protection must go beyond simple authentication checks. It requires a deep understanding of other application context including:
Threat detection and prevention must happen at multiple levels during runtime, which include:
The incident with McKinsey's AI system didn’t introduce new vulnerabilities. It revealed something more important.
AI systems amplify every weakness across your stack, and AI excels at finding them.
Act now by reevaluating your AI security posture, unifying security monitoring, and bridging gaps that AI can exploit before attackers do.
It’s fortunate this event was essentially a research experiment and not a motivated threat actor. Attackers are already thinking in terms of AI-native designs. It’s not about endpoints or services for them; it’s about attack chains that enable them to get to your organization’s data or intelligence.
When reviewing your application security strategy, it’s not whether you have application firewalls, API protection, or AI guardrails to mitigate attacks; it’s whether they work together effectively.
Eight years ago, we shipped Continuous Verification (CV) to solve one of the most miserable parts of a great engineer’s job: babysitting deployments.
The idea was simple but powerful. At 3:00 AM, your best engineers shouldn't be staring at dashboards waiting to see if a release went sideways. CV was designed to think like those engineers, watching your APM metrics, scanning your logs, and making the call for you. Roll forward or roll back, automatically, based on what the data actually said.
It worked. Customers loved it. Hundreds of teams stopped losing sleep over deployments.
But somewhere along the way, we noticed a new problem creeping in: setting up CV had become its own burden.
To get value from Continuous Verification, you had to know what to look for. Which metrics matter for this service? Which log patterns indicate trouble? Which thresholds separate a blip from a real incident?
When we talk to teams trying to use Argo Rollouts and set up automatic verification with its analysis templates, we hear that they hit the same challenges.
For teams with deep observability expertise, this was fine. For everyone else—and honestly, for experienced teams onboarding new services—it added friction that shouldn't exist. We’d solved the hardest part of deployments, but we’d left engineers with a new "homework assignment" just to get started.
That’s what AI Verification & Rollback is designed to fix.
AI Verification & Rollback builds directly on the CV foundation you already trust, but adds a layer of intelligence before the analysis even begins. Instead of requiring you to define your metrics and log queries upfront, the system queries your observability provider—via MCP server—at the moment of deployment to determine what actually matters for the service you just deployed.
What that means in practice:
At our user conference six months ago, we showed this running live—triggering a real deployment, watching the MCP server query Dynatrace for relevant signals, and walking through a live failure analysis that caught a bad release within minutes. The response was immediate. Engineers got it instantly, because it matched how they already think about post-deploy monitoring.
We’ve spent the past six months hardening what we showed you. A few highlights:
We're not declaring CV legacy today. AI Verification & Rollback is not yet a full replacement for traditional Continuous Verification across all use cases and customer configurations. CV remains the right choice for many teams, and we're committed to supporting it.
Bottom line: AI V&R is ready for many teams to use. It's available now, and for teams setting up verification for the first time—or looking to reduce the operational overhead of maintaining verification configs—it's the faster, smarter path forward.
The takeaway here is simple: If you've been putting off setting up Continuous Verification because of the configuration overhead, this is the version you were waiting for.
Ready to stop babysitting your releases? Drop the AI V&R step into your next pipeline and see what it finds.
How is your team currently handling the "3:00 AM dashboard stare"—and how much time would you save if the pipeline just told you why it rolled back?


AI has officially made writing code cheap.
Your developers are shipping more changes, across more microservices, more frequently than ever before. If you’re a developer, it feels like a golden age.
But for the Release Engineer? This isn't necessarily a celebration; it’s a scaling nightmare.
We’re currently seeing what I call the "AI delivery gap." It’s that uncomfortable space between the breakneck speed at which we can now generate code and the manual, spreadsheet-driven processes we still use to actually release it.
The reality is that while individual CI/CD pipelines might be automated, the coordination between them remains a stubbornly human bottleneck. We’ve automated the "how" of shipping code, but we’re still stuck in the Dark Ages when it comes to the "when" and "with whom."
Today, we are introducing Harness Release Orchestration alongside four other capabilities that ensure confident releases. Release Orchestration is designed to transform the release management process from a fragmented, manual effort into a standardized, visible, and scalable operation.

Most release engineers I talk to spend about 40% of their time "chasing humans for status." You’re checking Slack threads for sign-offs, updating Confluence pages, and obsessively watching spreadsheets to ensure Team A’s service doesn't break Team B’s dependency. (And let’s be honest, it usually does anyway.)
We could call it a team sport, but it’s really a multi-team sport. Teams from multiple services and functions need to come together to deliver a big release.
If we rely on a person to coordinate, we can’t move fast enough.
Harness Release Orchestration moves beyond the single pipeline. It introduces a process-based framework that acts as your release "blueprint."
Release management software isn’t an entirely new idea. It’s been tried before, but never widely adopted. The industry went wrong by building separate tools for continuous delivery and release orchestration.
With separate tools, you incur integration overhead, have multiple places to look, and live with a disjointed workflow.
We’ve built ours alongside our CD experience, so everything is as seamless and fast as possible. Yes, this is for releases more complex than a single microservice that an app team delivers on its own. No, that doesn’t mean introducing heavyweight processes and standalone tools.
Here’s the “gotcha”: the biggest barrier to adopting a new release tool is the hassle of migrating. You likely have years of proven workflows documented in SharePoint/Confluence, in early-release management tools like XL Release, or in the fading memory of that one person who isn't allowed to retire.
Harness AI now handles the heavy lifting. Our AI Process Ingestion can instantly generate a comprehensive release process from a simple natural-language prompt, existing documentation, or an export from an existing tool.
What used to take months of manual configuration now takes seconds. Simply put, we’re removing the friction of modernization.
For the Release Engineer, the goal is leverage. You shouldn't need to perform heroics every Friday night to ensure a successful release. (Though if you enjoy the adrenaline of a 2:00 AM war room, I suppose I can’t stop you.)
Harness Release Orchestration creates a standardized release motion that scales with AI-driven output. It allows you to move from being a "release waiter" to a "release architect."
AI made writing code cheap. Harness makes releasing it safe, scalable, and sustainable.


Engineering teams are generating more shippable code than ever before — and today, Harness is shipping five new capabilities designed to help teams release confidently. AI coding assistants lowered the barrier to writing software, and the volume of changes moving through delivery pipelines has grown accordingly. But the release process itself hasn't kept pace.
The evidence shows up in the data. In our 2026 State of DevOps Modernization Report, we surveyed 700 engineering teams about what AI-assisted development is actually doing to their delivery. The finding stands out: while 35% of the most active AI coding users are already releasing daily or more, those same teams have the highest rate of deployments needing remediation (22%) and the longest MTTR at 7.6 hours.
This is the velocity paradox: the faster teams can write code, the more pressure accumulates at the release, where the process hasn't changed nearly as much as the tooling that feeds it.
The AI Delivery Gap
What changed is well understood. For years, the bottleneck in software delivery was writing code. Developers couldn't produce changes fast enough to stress the release process. AI coding assistants changed that. Teams are now generating more change across more services, more frequently than before — but the tools for releasing that change are largely the same.
In the past, DevSecOps vendors built entire separate products to coordinate multi-team, multi-service releases. That made sense when CD pipelines were simpler. It doesn't make sense now. At AI speed, a separate tool means another context switch, another approval flow, and another human-in-the-loop at exactly the moment you need the system to move on its own.
The tools that help developers write code faster have created a delivery gap that only widens as adoption grows.
Today Harness is releasing five capabilities, all natively integrated into Continuous Delivery. Together, they cover the full arc of a modern release: coordinating changes across teams and services, verifying health in real time, managing schema changes alongside code, and progressively controlling feature exposure.
Release Orchestration replaces Slack threads, spreadsheets, and war-room calls that still coordinate most multi-team releases. Services and the teams supporting them move through shared orchestration logic with the same controls, gates, and sequence, so a release behaves like a system rather than a series of handoffs. And everything is seamlessly integrated with Harness Continuous Delivery, rather than in a separate tool.
AI-Powered Verification and Rollback connects to your existing observability stack, automatically identifies which signals matter for each release, and determines in real time whether a rollout should proceed, pause, or roll back. Most teams have rollback capability in theory. In practice it's an emergency procedure, not a routine one. Ancestry.com made it routine and saw a 50% reduction in overall production outages, with deployment-related incidents dropping significantly.
Database DevOps, now with Snowflake support, brings schema changes into the same pipeline as application code, so the two move together through the same controls with the same auditability. If a rollback is needed, the application and database schema can roll back together seamlessly. This matters especially for teams building AI applications on warehouse data, where schema changes are increasingly frequent and consequential.
Improved pipeline and policy support for feature flags and experimentation lets teams deploy safely and release progressively to the right users, even as AI-generated code drives up the number of releases. Teams can quickly measure impact on technical and business metrics, and stop or roll back when results are off track, all within the familiar Harness interface they already use for CI/CD.
Warehouse-Native Feature Management and Experimentation lets teams test features and measure business impact directly with data warehouses like Snowflake and Redshift, without ETL pipelines or shadow infrastructure. This way they can keep PII and behavioral data inside governed environments for compliance and security.
These aren't five separate features. They're one answer to one question: can we safely keep going at AI speed?
Traditional CD pipelines treat deployment as the finish line. The model Harness is building around treats it as one step in a longer sequence: application and database changes move through orchestrated pipelines together, verification checks real-time signals before a rollout continues, features are exposed progressively, and experiments measure actual business outcomes against governed data.
A release isn't complete when the pipeline finishes. It's complete when the system has confirmed the change is healthy, the exposure is intentional, and the outcome is understood.
That shift from deployment to verified outcome is what Harness customers say they need most. "AI has made it much easier to generate change, but that doesn't mean organizations are automatically better at releasing it," said Marc Pearce, Head of DevOps at Intelliflo. "Capabilities like these are exactly what teams need right now. The more you can standardize and automate that release motion, the more confidently you can scale."
The real shift here is operational. The work of coordinating a release today depends heavily on human judgment, informal communication, and organizational heroics. That worked when the volume of change was lower. As AI development accelerates, it's becoming the bottleneck.
The release process needs to become more standardized, more repeatable, and less dependent on any individual's ability to hold it together at the moment of deployment. Automation doesn't just make releases faster. It makes them more consistent, and consistency is what makes scaling safe.
For Ancestry.com, implementing Harness helped them achieve 99.9% uptime by cutting outages in half while accelerating deployment velocity threefold.
At Speedway Motors, progressive delivery and 20-second rollbacks enabled a move from biweekly releases to multiple deployments per day, with enough confidence to run five to ten feature experiments per sprint.
AI made writing code cheap. Releasing that code safely, at scale, is still the hard part.
Harness Release Orchestration, AI-Powered Verification and Rollback, Database DevOps, Warehouse-Native Feature Management and Experimentation, and Improved Pipeline and Policy Support for FME are available now. Learn more and book a demo.


Here's a scenario that probably sounds familiar: a developer needs a sandbox environment to test something. They file a ticket. Then they wait. And wait. Maybe a day goes by, maybe three. Meanwhile, your platform team is buried in provisioning requests, and somewhere, someone has already spun up an unsanctioned workaround that bypasses every governance policy you've put in place.
It's a lose-lose. Developers lose velocity, platform teams lose their sanity, and security gaps quietly multiply.
An Internal Developer Portal flips this whole dynamic. Instead of ticket queues and manual provisioning, you get self-service sandbox environment automation with guardrails baked in. Developers get instant access to governed, policy-driven environments. Platform engineers get visibility and control — without becoming a bottleneck.
In this post, we'll walk through how to build sandbox automation that actually works, set governance that holds up at scale, accelerate developer onboarding, and measure ROI in a way that makes leadership pay attention.
Harness Internal Developer Portal brings enterprise-grade orchestration together with developer-friendly self-service — built on the Backstage framework with the security, scalability, and governance that enterprises actually need.
Let's be real: if your developers are waiting days for a test environment while your security team is blocking deployments they can't validate, something is broken. Teams end up either skipping testing altogether or delaying releases — neither of which is a good outcome. A solid sandbox environment strategy breaks this cycle by giving teams secure, isolated spaces to validate changes without compromising governance or speed.
The best way to think about sandboxes? Treat them as disposable. Each sandbox environment gets the minimum privileges needed for the task at hand — whether that's validating a database migration, testing a third-party API integration, or experimenting with a new service configuration.
If something breaks, you just delete the sandbox environment and start fresh. No incident reports. No painful rollback procedures. No 2 a.m. pages to the on-call engineer. Teams that adopt this pattern tend to catch configuration errors much earlier in the development cycle.
Think of sandbox environments as your first line of defense. When you're dealing with untrusted code, unfamiliar dependencies, or configuration changes you're not 100% sure about, run them in an isolated sandbox first. Validate the behavior, confirm it's clean, and only then promote the changes to staging or production.
This pattern is incredibly effective at catching supply chain issues, configuration drift, and integration failures before they ever touch your shared systems. It shifts security left in a way that's practical, not just aspirational.
Not every test needs a full production replica. A smart sandbox environment strategy uses a tiered approach: lightweight developer sandboxes for day-to-day feature work, partial-copy environments for integration testing with realistic (but masked) data, and full-copy sandboxes reserved for final validation before release.
Add in ephemeral PR environments that spin up automatically when a pull request is opened and tear themselves down after a set time window, and you've got realistic testing without persistent infrastructure costs or manual provisioning overhead.
When sandbox environment requests take days instead of minutes, you're not just slowing down development — you're creating a governance problem that only gets worse as you scale. Automating provisioning through your IDP eliminates this friction while keeping the controls you need.
Here's what that looks like in practice:
This approach transforms your IDP from a service catalog into a genuine self-service platform — one that scales governance instead of bypassing it.
Not all sandbox environments are created equal. Picking the wrong type wastes resources and creates either security gaps or unnecessary developer friction. Knowing which type fits each workflow helps platform teams design self-service templates that match what developers actually need.
Developer sandboxes prioritize speed and productivity. Security detonation environments prioritize total isolation, even at the cost of convenience. Most teams will use a mix across their workflows.
Here's something that frustrates every engineering leader: a new hire joins your team, and it takes them three days just to get a working development environment. That's three days of lost productivity, plus a terrible first impression of your engineering culture.
With an Internal Developer Portal, you can bundle everything a new developer needs — sandbox environment provisioning, documentation, credentials, and golden path templates — into a single catalog request. Day one, they click a button, and they're up and running. No more Slack threads asking "where's my environment?"
Use Scorecards — a native feature in Harness IDP — to track the metrics that actually matter: time-to-first-PR-environment, sandbox environment mean time to recovery, and reuse rates. When self-service becomes the default, ticket provisioning drops dramatically.
Scaling sandbox environments across your organization doesn't mean you need to double your platform team or accept governance trade-offs. Harness IDP extends the Backstage framework with the enterprise capabilities that self-managed Backstage doesn't offer out of the box — fine-grained RBAC, hierarchical organization, native Scorecards tied to your KPIs, access to the 200+ Backstage plugin ecosystem, and environment management that scales with your teams.
Ready to ship governed, self-service sandbox environments that actually reduce toil? Try Harness Internal Developer Portal and see how fast your teams move when the friction disappears.
These are the questions that come up most often when platform leaders are planning sandbox environment rollouts or making the case to leadership.
Sandbox environments create isolated failure boundaries where you can test risky changes, third-party integrations, and security patches without touching shared systems. Teams validate behavior in production-like conditions, catching issues before they reach staging or production — which means fewer incidents and faster recovery when things do go wrong.
Focus on time-to-first-environment for new developers, sandbox environment utilization rates, and ticket reduction percentages. Also track cost per sandbox hour and cleanup effectiveness. Developer satisfaction scores and time saved on provisioning requests are the numbers that help you demonstrate platform ROI to leadership and justify continued investment.
The biggest culprit is sandbox environments that get spun up and never torn down. Set TTL-based auto-teardown policies so environments self-destruct after a defined window — a few hours for ephemeral PR environments, a couple of weeks for longer-lived developer sandboxes. Pair that with resource quotas enforced through policy-as-code, and you've got cost controls that don't depend on anyone remembering to clean up after themselves.
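The teardown policy described above reduces to simple reaper logic. This is a minimal illustration, not Harness's implementation; the `Sandbox` class and the inventory are hypothetical stand-ins for whatever your platform actually tracks.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Sandbox:
    name: str
    created_at: datetime
    ttl: timedelta  # hours for ephemeral PR envs, days for developer sandboxes

def expired(sandbox: Sandbox, now: datetime) -> bool:
    """True once the sandbox has outlived its TTL and should self-destruct."""
    return now - sandbox.created_at >= sandbox.ttl

# A periodic reaper job would iterate the inventory and tear down anything expired.
now = datetime.now(timezone.utc)
inventory = [
    Sandbox("pr-1234", now - timedelta(hours=6), timedelta(hours=4)),
    Sandbox("dev-alice", now - timedelta(days=3), timedelta(days=14)),
]
to_teardown = [s.name for s in inventory if expired(s, now)]
print(to_teardown)  # -> ['pr-1234']
```

The point of encoding this as policy rather than convention is that nobody has to remember to clean up: the reaper runs on a schedule and the TTL travels with the environment.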
Yes, and this is where a platform approach really pays off. Instead of building separate provisioning workflows for each cloud provider, golden path templates in your IDP abstract the infrastructure layer. Developers request a sandbox environment from a single catalog — whether it lands on AWS, Azure, GCP, or an on-prem cluster is handled by the template logic and your team's policies. Harness IDP supports multi-cloud and multi-region deployments natively, so you're not stitching together one-off scripts for each environment.


On March 19th, the risks of running open execution pipelines — where what code runs in your CI/CD environment is largely uncontrolled — went from theoretical to catastrophic.
A threat actor known as TeamPCP compromised the GitHub Actions supply chain at a scale we haven't seen before (tracked as CVE-2026-33634, CVSS 9.4). They compromised Trivy, the most widely used vulnerability scanner in the cloud-native ecosystem, and turned it into a credential-harvesting tool that ran inside victims' own pipelines.
Between March 19 and March 24, 2026, organizations running affected tag-based GitHub Actions references were sending their AWS tokens, SSH keys, and Kubernetes secrets directly to the attacker. SANS Institute estimates over 10,000 CI/CD workflows were directly affected. According to multiple security research firms, the downstream exposure extends to tens of thousands of repositories and hundreds of thousands of accounts.
Five ecosystems. Five days. One stolen Personal Access Token.
This is a fundamental failure of the open execution pipeline model — where what runs in your pipeline is determined by external references to public repositories, mutable version tags, and third-party code that executes with full privileges. GitHub Actions is the most prominent implementation.
The alternative, governed execution pipelines, where what runs is controlled through policy gates, customer-owned infrastructure, scoped credentials, and immutable references, is the model we designed Harness around years ago, precisely because we saw this class of attack coming.
TeamPCP wasn't an anomaly; it was the inevitable conclusion of an eighteen-month escalation in CI/CD attack tactics.
CVE-2025-30066. Attackers compromised a PAT from an upstream dependency (reviewdog/action-setup) and force-pushed malicious code to every single version tag of tj-actions/changed-files. 23,000 repositories were exposed. The attack was later connected to a targeted campaign against Coinbase. CISA issued a formal advisory.
This proved that the industry's reliance on mutable tags (like @v2) was a serious structural vulnerability. According to Wiz, only 3.9% of repositories pin to immutable SHAs. The other 96% are trusting whoever owns the tag today.
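The pinning gap is easy to audit for. Below is a rough sketch, not an official tool, that flags `uses:` references in a workflow file pointing at a mutable ref instead of a full 40-character commit SHA; the regexes are simplifications and will miss edge cases such as reusable-workflow references.

```python
import re

# A 40-hex-character ref is an immutable commit SHA; anything else
# (v2, v4.1.0, main) is a mutable pointer the tag owner can redirect.
USES_RE = re.compile(r"uses:\s*([\w./-]+)@([\w.-]+)")
SHA_RE = re.compile(r"^[0-9a-f]{40}$")

def unpinned_actions(workflow_yaml: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    findings = []
    for action, ref in USES_RE.findall(workflow_yaml):
        if not SHA_RE.match(ref):
            findings.append(f"{action}@{ref}")
    return findings

workflow = """
steps:
  - uses: actions/checkout@v4
  - uses: aquasecurity/trivy-action@0123456789abcdef0123456789abcdef01234567
"""
print(unpinned_actions(workflow))  # -> ['actions/checkout@v4']
```

A check like this can run as a pre-merge gate so an AI-generated or human-authored PR cannot introduce a new mutable reference.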
The first self-replicating worm in the CI/CD ecosystem. Shai-Hulud 2.0 backdoored 796 npm packages representing over 20 million weekly downloads — including packages from Zapier, PostHog, and Postman.
It used TruffleHog to harvest 800+ credential types, registered compromised machines as self-hosted GitHub runners named SHA1HULUD for persistent C2 over github.com, and built a distributed token-sharing network where compromised machines could replace each other's expired credentials.
PostHog's candid post-mortem revealed that attackers stole their GitHub bot's PAT via a pull_request_target workflow exploit, then used it to steal npm publishing tokens from CI runner secrets. Their admission that this kind of attack "simply wasn't something we'd prepared for" reflects the industry-wide gap between application security and CI/CD security maturity. CISA issued another formal advisory.
TeamPCP went after the security tools themselves.
They exploited a misconfigured GitHub Actions workflow to steal a PAT from Aqua Security's aqua-bot service account. Aqua detected the breach and initiated credential rotation — but reporting suggests the rotation did not fully cut off attacker access. TeamPCP appears to have retained or regained access to Trivy's release infrastructure, enabling the March 19 attack weeks after initial detection.
On March 19, they force-pushed a malicious "Cloud Stealer" to 76 of 77 version tags in trivy-action and all 7 tags in setup-trivy. Simultaneously, they published an infected Trivy binary (v0.69.4) to GitHub Releases and Docker Hub. Every pipeline referencing those tags by name started executing the attacker's code on its next run. No visible change to the release page. No notification. No diff to review.
TeamPCP's payload was purpose-built for CI/CD runner environments:
Memory Scraping. It read /proc/*/mem to extract decrypted secrets held in RAM. GitHub's log-masking can't hide what's in process memory.
Cloud Metadata Harvesting. It queried the AWS Instance Metadata Service (IMDS) at 169.254.169.254, pivoting from "build job" to full IAM role access in the cloud.
Filesystem Sweep. It searched over 50 specific paths — .env files, .aws/credentials, .kube/config, SSH keys, GPG keys, Docker configs, database connection strings, and cryptocurrency wallet keys.
Encrypted Exfiltration. All data was bundled into tpcp.tar.gz, encrypted with AES-256 and RSA-4096, and sent to typosquatted domains like scan.aquasecurtiy[.]org (note the "tiy"). These domains returned clean verdicts from threat intelligence feeds during the attack. As a fallback, the stealer created public GitHub repos named tpcp-docs under the victim's own account.
The malicious payload executed before the legitimate Trivy scan. Pipelines appeared to work normally. CrowdStrike noted: "To an operator reviewing workflow logs, the step appears to have completed successfully."
Sysdig observed that the vendor-specific typosquat domains were a deliberate deception — an analyst reviewing CI/CD logs would see traffic to what appears to be the vendor's own domain.
It took Aqua five days to fully evict the attacker, during which TeamPCP pushed additional malicious Docker images (v0.69.5 and v0.69.6).
Why did this work so well? Because GitHub Actions is the leading example of an open execution pipeline — where what code runs in your pipeline is determined by external references that anyone can modify.
This trust problem isn't new. Jenkins had a similar issue with plugins: third-party code ran with full process privileges. But Jenkins ran inside your firewall; exfiltrating data required getting past your network perimeter.
GitHub Actions took the same open execution approach but moved execution to cloud-hosted runners with broad internet egress, making exfiltration trivially easy. TeamPCP's Cloud Stealer just needed to make an HTTPS POST to an external domain, which runners are designed to do freely.
Here are a few reasons why open execution pipelines break at scale:
Mutable Trust. When you use @v2, you are trusting a pointer, not a piece of code. Tags can be silently redirected by anyone with write access. TeamPCP rewrote 76 tags in a single operation. 96% of the ecosystem is exposed.
Flat Privileges. Third-party Actions run with the same permissions as your code. No sandbox. No permission isolation. This is why TeamPCP targeted security scanners — tools that by design have elevated access to your pipeline infrastructure. The attacker doesn't need to break in. The workflow invites them in.
Secret Sprawl. Secrets are typically injected into the runner's environment or process memory during job execution, where they remain accessible for the job's duration. TeamPCP's /proc/*/mem scraper didn't need any special privilege. It just needed to be running on the same machine.
Unbounded Credential Cascades. There is no architectural boundary that stops a credential stolen in one context from unlocking another. TeamPCP proved this definitively: Trivy → Checkmarx → LiteLLM → AI API keys across thousands of enterprises. One PAT, five ecosystems.
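The "Secret Sprawl" point above is easy to demonstrate: once secrets are injected as flat environment variables, any code that runs in the job can enumerate them. A toy illustration with fake values, no real runner involved:

```python
import os

# Simulate a CI runner bulk-injecting secrets as environment variables
# at job start (names and values are illustrative, not real credentials).
os.environ["AWS_SECRET_ACCESS_KEY"] = "example-not-a-real-key"
os.environ["NPM_TOKEN"] = "example-not-a-real-token"

def third_party_step() -> dict[str, str]:
    """Any code in the job, including a compromised dependency,
    inherits the full environment; no privilege escalation required."""
    return {k: v for k, v in os.environ.items()
            if any(s in k for s in ("TOKEN", "SECRET", "KEY", "PASSWORD"))}

leaked = third_party_step()
print(sorted(leaked))  # includes both injected secret names
```

Nothing here is an exploit; it is ordinary process behavior. That is precisely the problem: the attack surface is the design.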
Harness CI/CD pipelines are built as governed execution pipelines — where what runs is controlled through customer-owned infrastructure, policy gates, scoped credentials, immutable references, and explicit trust boundaries. At its core is the Delegate — a lightweight worker process that runs inside your infrastructure (your VPC, your Kubernetes cluster), executes tasks locally, and communicates with the Harness control plane via outbound-only connections.
When we designed this architecture, we assumed the execution plane would become the primary target in the enterprise. If TeamPCP tried to attack a Harness-powered environment, they would hit three architectural walls.
The Architecture.
The Delegate lives inside your VPC or cluster. It communicates with our SaaS control plane via outbound-only HTTPS/WSS. No inbound ports are opened.
The Defense.
You control the firewall. Allowlist app.harness.io and the specific endpoints your pipelines need, deny everything else. TeamPCP's exfiltration to typosquat domains would fail at the network layer — not because of a detection rule, but because the path doesn't exist. Both typosquat domains returned clean verdicts from threat intel feeds. Egress filtering by allowlist is more reliable than detection by reputation.
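Allowlist egress is, at its core, set membership rather than reputation scoring. A toy sketch of the policy logic (hostnames are illustrative; in practice this is enforced by your firewall or network policy, not application code):

```python
from urllib.parse import urlparse

# Deny-by-default egress: only the destinations the pipeline actually needs.
EGRESS_ALLOWLIST = {"app.harness.io", "registry.npmjs.org"}  # illustrative

def egress_allowed(url: str) -> bool:
    """Firewall-style check: the connection succeeds only if the
    destination host is explicitly allowlisted."""
    host = urlparse(url).hostname or ""
    return host in EGRESS_ALLOWLIST

print(egress_allowed("https://app.harness.io/gateway"))        # -> True
print(egress_allowed("https://scan.aquasecurtiy.org/upload"))  # -> False
```

A brand-new typosquat domain has a perfect reputation score; it simply is not in the set, so the path does not exist.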
The Architecture.
Rather than bulk-injecting secrets as flat environment variables at job start, Harness can resolve secrets at runtime through your secret manager — HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault — via the Delegate, inside your network. Harness SaaS stores encrypted references and metadata, not plaintext secret values.
The Defense.
TeamPCP's Cloud Stealer worked because in an open execution pipeline, secrets are typically injected into the runner's process memory where they remain accessible for the job's duration. In a governed execution pipeline, this exposure is structurally reduced: secrets can be resolved from your controlled vault at the point they're needed, rather than broadcast as environment variables to every step in the pipeline.
An important caveat: Vault-based resolution alone doesn't eliminate runtime exfiltration. Once a secret is resolved and passed to a step that legitimately needs it — say, an npm token during npm publish — that secret exists in the step's runtime. If malicious code is executing in that same context (for example, a tampered package.json that exfiltrates credentials during npm run test), the secret is exposed regardless of where it came from. This is why the three walls work as a system: Wall 2 reduces the surface of secret exposure, Wall 1 blocks the exfiltration path, and (as we'll see) Wall 3 limits the blast radius to the scoped environment. No single wall is sufficient on its own.
To further strengthen how pipelines use secrets, leverage ephemeral credentials — AWS STS temporary tokens, Vault dynamic secrets, or GCP short-lived service account tokens — that auto-expire after a defined window, often minutes. Even if TeamPCP’s memory scraper extracted an ephemeral credential, it likely would have expired before the attacker could pivot to the next target.
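The value of ephemeral credentials comes down to a validity window measured in minutes. A minimal sketch of the expiry logic (the class and TTL are illustrative, not a Harness or cloud-provider API):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class EphemeralCredential:
    token: str
    issued_at: datetime
    ttl: timedelta  # e.g. 15 minutes for an STS-style session token

    def is_valid(self, now: datetime) -> bool:
        """The token is only honored inside its issuance window."""
        return now - self.issued_at < self.ttl

issued = datetime.now(timezone.utc)
cred = EphemeralCredential("example-session-token", issued, timedelta(minutes=15))

# Usable immediately inside the step that requested it...
assert cred.is_valid(issued + timedelta(minutes=5))
# ...but worthless to an attacker replaying it hours later.
assert not cred.is_valid(issued + timedelta(hours=2))
print("expired as expected")
```

The stolen artifact still exists in the attacker's exfiltration bundle; it just no longer opens anything by the time it gets there.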
The Architecture.
Harness supports environment-scoped delegates as a core architecture pattern. Your "Dev" scanner delegate runs in a different cluster, with different network boundaries and different credentials, from your "Prod" deployment delegate.
The Defense.
The credential cascade that defined TeamPCP hits a dead end. Stolen Dev credentials cannot reach Production publishing gates or AI API keys, because those credentials live in a different vault, resolved by a different delegate, in a different network segment. If the Trivy compromise only yielded credentials scoped to a dev environment, the attack stops at phase one.
Beyond the three walls, governed execution pipelines provide additional structural controls.
Architecture is a foundation, not a guarantee. Governed execution pipelines are materially safer against this class of attack, but you can still create avoidable risk by running unvetted containers on delegates, skipping egress filtering, using the same delegate across dev and prod, granting overly broad cloud access, exposing excessive secrets to jobs that don't need them, or using long-lived static credentials when ephemeral alternatives exist.
I am not claiming that Harness is safe and GitHub Actions is unsafe. That would be too simplistic.
What I am claiming is that governed execution pipelines — where what runs is controlled through policy gates, customer-owned infrastructure, scoped credentials, and immutable references — are a materially safer foundation than open execution pipelines. We designed Harness as our implementation of a governed execution pipeline. But architecture is a starting point — you still have to operate it well.
As we enter the era of Agentic AI — where AI is generating pipelines, suggesting dependencies, and submitting pull requests at machine speed — we can no longer rely on human review to catch a malicious tag in an AI-generated PR.
But there's a more fundamental shift: AI agents will become the primary actors inside CI/CD pipelines. Not just generating code — autonomously executing tasks, selecting dependencies, making deployment decisions, remediating incidents.
Now imagine an AI agent in an open execution pipeline — downloaded from a public marketplace, referenced by a mutable tag, executing with full privileges, making dynamic runtime decisions you didn't define. It has access to your secrets, your cloud credentials, and your deployment infrastructure. Unlike a static script, an agent makes decisions at runtime — fetching resources, calling APIs, modifying files.
If TeamPCP showed us what happens when a static scanner is compromised, imagine what happens when an autonomous AI agent is compromised — or simply makes a decision you didn't anticipate.
This is why governed execution pipelines aren't just a security improvement — they're an architectural prerequisite for the AI era. In a governed pipeline, even an AI agent operates within structural boundaries: it runs on infrastructure you control, accesses only scoped secrets, has restricted egress, and its actions are audited. The agent may be autonomous, but the pipeline constrains what it can reach.
The questions every engineering leader should be asking:
If you use Trivy, Checkmarx, or LiteLLM:
If you use GitHub Actions:
For the longer term:
I'm writing this as the CEO of a company that competes with GitHub in the CI/CD space. I want to be transparent about that.
But I'm also writing this as someone who has spent two decades building infrastructure software and who saw this threat model coming. When we designed Harness, the open execution pipeline model had already evolved from Jenkins plugins to GitHub Actions — each generation making it easier for third-party code to run with full privileges and, by moving execution further from the customer's network perimeter, making exfiltration easier. We deliberately chose to build governed execution pipelines instead.
The TeamPCP campaign didn't teach us anything new about the risk. What it did was make the difference between open and governed execution impossible for the rest of the industry to ignore.
Open source security tools are invaluable. The developers and companies who build them — including Aqua Security and Checkmarx — are doing essential work. The problem isn't the tools. The problem is running them inside open execution pipelines where third-party code has full privileges, secrets sit in memory, and exfiltration faces no structural barrier.
If you want to explore how the delegate architecture works in practice, we're here to show you. But more importantly, regardless of what platform you choose, please take these structural questions seriously. The next TeamPCP is already studying the credential graph.


Looking back over the last ten years in enterprise technology, paradigm shifts are occurring more frequently: DevOps and Platform Engineering have matured, Cloud Native infrastructure has become mainstream, and the new frontier, depending on where you are in adoption, is AI. As your adoption and maturity curves progress, operationalizing these paradigms becomes important. Many sophisticated firms that must manage, and even innovate on, paradigms outside their core business model turn to Product Portfolio Management (PPM) to run the business side of the paradigm.
A core pillar of PPM is resource management: the management of skillsets across the portfolio. We looked inward at our own Professional Services team and ran a Job Task Analysis (JTA) to understand how Harness itself built a portfolio of implementation experts.
Organizations looking to scale not only Harness but other platforms or paradigms can use the Harness Implementation Job Task Analysis as one framework to apply to their specific needs.
In 2026, it is exciting to say that DevOps is mature. Most of the industry agrees that the pillars and practices DevOps set out to achieve are now the norm for engineering organizations. Turn the clock back 15 years, and DevOps was still very much an emerging paradigm. The trio of people, process, and technology followed as the movement matured.
In an early episode of ShipTalk, we talked to Nidhi Allipuram, who at the time was leading a DevOps organization at a large insurance company, about how she went from zero to one in scaling a DevOps idea into a DevOps org. As with any adoption of a new or fast-moving paradigm (think of AI today), business considerations typically follow the maturity curve. Her process was incremental: first gather internal expertise, then answer "is this problem even worth solving?" Once momentum builds, scale follows.
The episode was well timed: I was becoming a first-time manager here at Harness just as Harness itself was turning into a multi-module platform. I was curious how she load-balanced skills on her team to keep expanding into an evolving paradigm. Her advice was solid: balance the team to grow complementary skillsets so you can keep up with internal customer demand. This is exactly what resource management in a PPM-based discipline strives for. The Job Task Analysis we just concluded can serve the same purpose in resource management when scaling a program or portfolio.
Looking at the Implementation Job Task Analysis and how Harness scales our own Professional Services practice, a mixture of hard and soft skills is needed. The hard skills are vertical skills in the tooling, platforms, and ecosystems Harness participates in. Just as important are the soft skills: problem solving, requirements gathering, and stakeholder management. The hard skills answer the "how" and "where" questions; the soft skills answer the equally important "why" and "what" questions. Because Harness is a platform of outcomes, solving for both is essential.
Extrapolating this to building a portfolio or program in an evolving or even mature technology domain, the soft skills underpin the original organic question: "is this problem even worth solving?" Innovation in technology rarely arrives as a single big bang; most of it happens incrementally, through a solid team and a consensus-driven approach. Getting consensus is itself a challenge, and a typical approach is requirements and stakeholder management: gathering inputs from many points of view and blending them into trade-offs.
Distilling the findings of the Implementation Job Task Analysis: for highly technical practitioners, stakeholder and requirements management can be challenging because there is a lot of grey area. In the Site Reliability Engineering (SRE) world, for example, defining your SLIs and SLOs is one set of challenges, but getting consensus to renegotiate them is a different order of problem. As we look toward AI, these fundamentals do not go away.
AI is a rapidly evolving paradigm that is changing how we use technology. In a recent Cloud Native Podcast episode, the topic was Generative AI in Platform Engineering; AI, especially as it is used in Platform Engineering, is still early in the adoption curve. In 2026, The Linux Foundation will hold its first AI-centric conference, AGNTCon + MCPCon, a telling sign that maturity will eventually arrive. Early or late on the maturity curve, the fundamentals do not go away.
On the Cloud Native Podcast, the conversation turned to what Kubernetes looked like at the first KubeCon in 2015, and how AGNTCon + MCPCon is a similar signal for 2026. The panel reflected on their own Kubernetes journeys; the early questions about innovation versus control and operationalization that came up then are surfacing again as they look toward AI.
Crucial to navigating the dials between innovation and control is, again, the fundamental soft skill of building stakeholder and internal customer alignment through incremental consensus. In the Implementation Job Task Analysis, the people who are agents of change in software delivery blend hard and soft skills on a daily basis. These very people are available to help further your innovation goals with top-notch software delivery via the Harness Platform.
As you embrace new paradigms and the Harness Platform, we are here to help. Harness has been at the forefront of Cloud Native, and now AI-centric, software delivery. Our platform continues to evolve to meet the new and recurring challenges of the technology adoption curve. Harness Professional Services are the people who help your organization make these new paradigms a reality. Take a look at our Services Offerings, which range from consulting and coaching to training. We are excited to share how we scale our own internal skills to help our customers, partners, and the broader community better the software delivery craft.