Released March 22, 2022

Introducing Harness Service Reliability Management

Today, we are announcing a new module in the Harness Software Delivery Platform that helps developers maintain high velocity while continuously improving the reliability of application services. Harness Service Reliability Management (SRM) is designed to improve collaboration and governance between engineering and reliability teams so that they can adopt a modern Site Reliability Engineering (SRE) program built on Service Level Objectives (SLOs), as outlined by Google in the SRE Handbook.

Harness Service Reliability Management is for teams that want a better way to balance the velocity of feature releases and bug fixes with the stability and reliability needs of a production environment. With Harness SRM, you no longer need to choose between velocity and confidence. It helps you ensure that your best developers continue to deliver highly reliable software at high velocity while putting guardrails in place for other developers or sensitive projects.

Adopting SLO-Driven Software Delivery

Making the SRE model successful in your organization requires both a cultural shift and new knowledge and skills. Harness SRM was designed to help companies of all sizes rapidly adopt and implement an SRE model while avoiding these common challenges:

  • Wasted time manually tracking SLOs and error budgets.
  • Conflict between engineering and reliability teams due to a lack of collaboration in defining governance.
  • Engineering teams surprised by work stoppages because they lack visibility into SLOs and error budgets.
  • Work stoppage negotiations between engineering and reliability teams on a service-by-service basis.
  • Trouble scaling site reliability engineering practices.
  • Problems maintaining high feature delivery velocity while also ensuring high reliability.

Achieving Excellence in SLO-Driven Software Delivery

Harness SRM is a solution for both engineering and reliability teams. Within SRM, teams collaborate to define Service Level Indicators (SLIs), SLOs, and Error Budgets. SRM users also create reliability guardrails within their CI/CD pipelines. These guardrails use SLO and Error Budget data to determine whether a pipeline is allowed to proceed to the next stage. If SLOs are violated too often, Error Budgets become depleted, which causes the guardrails to stop pipeline execution. Once execution is stopped, explicit approval must be provided before the pipeline can proceed. All of this is tracked in the SRM audit log for compliance purposes.
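To make the guardrail idea concrete, here is a minimal Python sketch of how an error budget could drive a pipeline gate. It is an illustration only, not the Harness SRM API: the `SLO` class, the `error_budget_remaining` and `guardrail_allows` functions, and the request-count-based budget are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """Hypothetical SLO: the fraction of requests that must succeed."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests must succeed


def error_budget_remaining(slo: SLO, total_requests: int, failed_requests: int) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been overspent.
    """
    if not 0.0 < slo.target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    if total_requests == 0:
        return 1.0  # no traffic yet, so no budget has been spent
    allowed_failures = (1.0 - slo.target) * total_requests
    return 1.0 - failed_requests / allowed_failures


def guardrail_allows(remaining: float, min_remaining: float = 0.0,
                     approved: bool = False) -> bool:
    """Let the pipeline proceed if budget remains or someone explicitly approved."""
    return approved or remaining > min_remaining


# Example: a 99.9% SLO over 100,000 requests allows roughly 100 failures.
checkout_slo = SLO(name="checkout-availability", target=0.999)
healthy = error_budget_remaining(checkout_slo, 100_000, 40)
depleted = error_budget_remaining(checkout_slo, 100_000, 120)
```

In this sketch, a pipeline with the `healthy` budget proceeds, while one with the `depleted` budget is stopped until `approved=True` is passed, mirroring the explicit approval step described above. A real system would also record each decision in an audit log.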

Service Reliability Management - A New Harness Module

To promote better production reliability, Service Reliability Checks are performed across all stages of the software delivery lifecycle. Some of these reliability checks, like native error tracking from Harness, require an agent to be added to the application service. All other reliability checks are performed via integrations with external tools (APM, log analytics, testing, etc.). The goal of these checks is to identify as many reliability issues as possible before they reach production. If done properly, production reliability will continually improve.
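As a rough illustration of checks that come from different sources feeding a single decision, the Python sketch below runs a set of named checks for one stage and reports which ones failed. The `run_stage_checks` function and the check names are hypothetical; the lambdas stand in for real integrations and do not reflect how Harness SRM is implemented.

```python
from typing import Callable, Dict, List

# A check is any zero-argument callable that returns True when it passes.
Check = Callable[[], bool]


def run_stage_checks(checks: Dict[str, Check]) -> List[str]:
    """Run every check for a stage and return the names of those that failed."""
    failures = []
    for name, check in checks.items():
        if not check():
            failures.append(name)
    return failures


# Placeholder checks; in practice each would query an agent or external tool.
staging_checks: Dict[str, Check] = {
    "error-tracking": lambda: True,
    "apm-latency": lambda: False,
    "log-analytics": lambda: True,
}
failed_checks = run_stage_checks(staging_checks)
```

Running every check instead of stopping at the first failure gives the team a complete picture of a stage's reliability issues before release.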

Conclusion - Frenemies No More!

Reliability and engineering teams don’t want to be in conflict with each other, and now with Harness SRM, they don’t need to be. They can evolve to a new collaborative relationship where both teams work in harmony to deliver software faster with the confidence that it will be reliable. 

Interested in learning more or getting started with Harness Service Reliability Management? Click here for more information.
