April 24, 2025

Using Feature Flags with Open Telemetry

Integrating feature flags with Open Telemetry allows for enhanced observability by tracing the impact of flags on application behavior through span attributes, which enables querying traces based on specific flag states.

Enhancing Observability with Feature Flags and Open Telemetry

Already using Open Telemetry? If so, you might be curious about the benefits of integrating feature flags into your application. This article explains how to level up your observability with a simple, code-level integration. Follow along as this guide reviews the benefits of integrating these technologies and offers suggestions for next steps.

This article assumes code-level familiarity with Open Telemetry, an open source observability platform. Open Telemetry requires an implementation, and there are many to choose from. For the purpose of this article and code demonstration, I chose to use Honeycomb.io and Java. Honeycomb’s Java collector uses the -javaagent bytecode instrumentation technique to instrument the application.

Threaded Echo Server

The sample application is written in Java and is a simple socket server. Clients connect to the server on default port 5009. The client can type a word or phrase, and the Threaded Echo Server will respond with the phrase verbatim.

Listing 1: Threaded Echo Server handles client interaction.

The Threaded Echo Server sleeps for two seconds when the feature flag “next_step” is turned on. The sleep is wrapped in a second span, “span2”, which is named “next_step” after the flag. When the “next_step” flag is off, only the usual doSomeWork is performed.
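The flag-gated span can be illustrated with a minimal sketch. This is not the article's actual listing. To stay free of dependencies, it uses a hypothetical SketchSpan class in place of the Open Telemetry API (where you would typically call tracer.spanBuilder("next_step").startSpan() and later span.end()), and a plain map in place of the feature flag SDK. The class names, the "on" treatment string, and the configurable sleep duration are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an Open Telemetry span: records name, children, and duration. */
class SketchSpan {
    final String name;
    final List<SketchSpan> children = new ArrayList<>();
    private final long startNanos = System.nanoTime();
    private long endNanos = -1;

    SketchSpan(String name, SketchSpan parent) {
        this.name = name;
        if (parent != null) {
            parent.children.add(this); // nest this span under its parent
        }
    }

    void end() {
        endNanos = System.nanoTime();
    }

    long durationMillis() {
        return (endNanos - startNanos) / 1_000_000;
    }
}

class EchoHandlerSketch {
    private final Map<String, String> flag2treatments; // flag name -> treatment (assumed "on"/"off")
    private final long sleepMillis;                     // the article's server sleeps 2000 ms

    EchoHandlerSketch(Map<String, String> flag2treatments, long sleepMillis) {
        this.flag2treatments = flag2treatments;
        this.sleepMillis = sleepMillis;
    }

    /** Handles one client word under the session's span and returns the echo. */
    String handleWord(String word, SketchSpan echoSpan) {
        if ("on".equals(flag2treatments.get("next_step"))) {
            // The nested span is named after the flag that creates it.
            SketchSpan span2 = new SketchSpan("next_step", echoSpan);
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            } finally {
                span2.end(); // always close the span, even if interrupted
            }
        }
        return doSomeWork(word); // performed whether the flag is on or off
    }

    private String doSomeWork(String word) {
        return word; // echo the phrase verbatim
    }
}
```

With the flag on, each handled word adds one “next_step” child to the session span; with the flag off, no child span is created.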

With “next_step” switched on, we can find this trace in Honeycomb.

Figure A: Honeycomb displays a client session.

In Figure A, the client types four words. Each next_step span lasts almost exactly two seconds, the time our code sleeps.

With “next_step” toggled off, this is the trace.

Figure B: Honeycomb displays an Echo Server client session with feature flag toggled off.

The feature flag had two impacts.

It introduced a new, nested span to the work of handling a client word. The span was named after the feature flag that creates it, resulting in the green bars that illustrate each word of a single session in a single trace.

It also introduced two seconds of sleep time into handling the word, making it easier to see the new span instances.

Finding Flags

Seeing that flags and spans can interact is useful on its own, but there is an additional opportunity.

Listing 2: Preparing the Threaded Echo Server’s top-level “echo” span with feature flag treatments

In Listing 2, the program expects three flags to be in use: “next_step”, “multivariant_demo”, and “new_onboarding”. Using Harness FME, all flags are evaluated up front and stored in a flag2treatments map. This means that a dynamic change to a flag treatment will be ignored for the rest of the program execution. Techniques for avoiding that are a topic for another blog.
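As a rough sketch of that preparation step (again, not the article's listing), the code below evaluates the three flags once, caches them in a flag2treatments map, and copies each treatment onto the top-level “echo” span as an attribute. The AttributeSpan class is a hypothetical stand-in for an Open Telemetry span, the getTreatment function stands in for whatever call your flag SDK provides, and the feature_flag. attribute prefix is an assumed naming choice.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a span that accepts string attributes. */
class AttributeSpan {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();

    AttributeSpan(String name) {
        this.name = name;
    }

    AttributeSpan setAttribute(String key, String value) {
        attributes.put(key, value);
        return this;
    }
}

class FlagAttributesSketch {
    // The three flags named in the article.
    static final List<String> FLAGS =
            List.of("next_step", "multivariant_demo", "new_onboarding");

    /** Evaluates every flag once; later changes to a flag are not seen by this map. */
    static Map<String, String> evaluateAll(Function<String, String> getTreatment) {
        Map<String, String> flag2treatments = new LinkedHashMap<>();
        for (String flag : FLAGS) {
            flag2treatments.put(flag, getTreatment.apply(flag));
        }
        return flag2treatments;
    }

    /** Creates the top-level "echo" span and records each cached treatment on it. */
    static AttributeSpan startEchoSpan(Map<String, String> flag2treatments) {
        AttributeSpan span = new AttributeSpan("echo");
        // The "feature_flag." prefix is an assumption, chosen to keep flag attributes easy to find.
        flag2treatments.forEach(
                (flag, treatment) -> span.setAttribute("feature_flag." + flag, treatment));
        return span;
    }
}
```

For example, FlagAttributesSketch.startEchoSpan(FlagAttributesSketch.evaluateAll(flag -> "off")) yields a span with three attributes, one per flag.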

In this example, that’s fine. The treatments are cached, but they are also stored as span attributes. Why would you want to put the feature flag “impressions” into the span?

Figure C: Query for traces by feature flag impression

If your span has the feature flag name and treatment, you can query for traces that show (or don’t show) a particular flag. This makes it much easier to isolate your trace sessions when you’re looking to handle a problem specific to a feature flag.
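In Honeycomb this filtering happens in the query UI, but the idea can be sketched in plain Java as a filter over exported trace records. Everything here is illustrative: TraceRecord is a hypothetical record of a trace's top-level span attributes, and the feature_flag.next_step key is an assumed attribute name.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical record of one trace and the attributes on its top-level span. */
class TraceRecord {
    final String traceId;
    final Map<String, String> attributes;

    TraceRecord(String traceId, Map<String, String> attributes) {
        this.traceId = traceId;
        this.attributes = attributes;
    }
}

class FlagQuerySketch {
    /** Returns the ids of traces whose span attribute `key` equals `value`. */
    static List<String> traceIdsWhere(List<TraceRecord> traces, String key, String value) {
        return traces.stream()
                .filter(t -> value.equals(t.attributes.get(key))) // a missing key never matches
                .map(t -> t.traceId)
                .collect(Collectors.toList());
    }
}
```

Calling traceIdsWhere(traces, "feature_flag.next_step", "on") keeps only the sessions that ran with the flag on, which is the same narrowing the query in Figure C performs.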

Next Steps

Feature flags are not good candidates for bytecode instrumentation. The hard part of introducing a flag is not the SDK, but rather the thought that must go into what you want to supply when the flag is toggled on or off (or, with multi-variant flags, set to each treatment).

In one vision of the future, a span is synonymous with a flag. Flags would be reverse dependent on the span/flag that includes them. You could turn on and off whole portions of live, running application code by identifying the span or spans that require rollback. The flagging toggle interface isn’t well suited to this broad purpose though, and the complexity could be overkill, depending on how many spans you have used to instrument your app.

In the near term, consider manually wrapping each flag-gated code path in a span named after the flag itself, which gives you Open Telemetry analytics on your flag.
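One way to make that manual wrapping repeatable is a small helper, sketched here under stated assumptions. RecordedSpan is a hypothetical stand-in for an Open Telemetry span, withFlagSpan is an invented helper name, and the "treatment" attribute key is an illustrative choice. With the real API, the helper would start a span, set the attribute, run the body, and end the span in a finally block.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for a span; records its name, attributes, and whether it ended. */
class RecordedSpan {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    boolean ended;

    RecordedSpan(String name) {
        this.name = name;
    }
}

class FlagSpanWrapper {
    final List<RecordedSpan> finished = new ArrayList<>();

    /** Runs `body` inside a span named after the flag, tagged with the treatment in effect. */
    <T> T withFlagSpan(String flagName, String treatment, Supplier<T> body) {
        RecordedSpan span = new RecordedSpan(flagName);
        span.attributes.put("treatment", treatment);
        try {
            return body.get();
        } finally {
            span.ended = true;   // end the span even if the body throws
            finished.add(span);
        }
    }
}
```

A call such as wrapper.withFlagSpan("new_onboarding", "on", () -> showNewOnboarding()) keeps the flag decision and its span together at the call site; showNewOnboarding is a placeholder for your own code.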
