Baggage in APM Agents #757

AlexanderWert · 2023-02-06T09:28:42Z

Description

Problem

The baggage concept allows users to propagate additional (custom) context downstream together with the tracing context. For example, when there are services

A -> B -> C

service A could propagate information to B and C that would otherwise be available only in the scope of A (e.g. a user ID, a device.id for mobile apps, or a product ID). Such additional context can be extremely valuable when troubleshooting performance and reliability issues.
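To make the propagation idea concrete, here is a minimal sketch of how baggage entries could travel in a `baggage` HTTP header in the style of the W3C Baggage format (comma-separated `key=value` members, percent-encoded values, optional `;property` suffixes). The helper names `encode_baggage` and `decode_baggage` and the sample keys are hypothetical and not part of any agent API; the sketch also skips the size limits and key validation that a real implementation would need.

```python
from typing import Dict
from urllib.parse import quote, unquote


def encode_baggage(entries: Dict[str, str]) -> str:
    """Serialize entries into a W3C-style `baggage` header value (hypothetical helper)."""
    # Values are percent-encoded; keys are assumed to be valid header tokens already.
    return ",".join(f"{key}={quote(value, safe='')}" for key, value in entries.items())


def decode_baggage(header: str) -> Dict[str, str]:
    """Parse a `baggage` header value back into a dict, ignoring properties."""
    entries: Dict[str, str] = {}
    for member in header.split(","):
        member = member.strip()
        if not member or "=" not in member:
            continue  # skip empty or malformed list members
        key, _, rest = member.partition("=")
        value = rest.split(";", 1)[0]  # drop optional ";property" metadata
        entries[key.strip()] = unquote(value.strip())
    return entries


# Service A attaches baggage to its outgoing request headers.
outgoing_headers = {
    "baggage": encode_baggage({"device.id": "abc-123", "product.id": "p 42"})
}

# Service B (and later C) reads the same entries from the incoming request.
received = decode_baggage(outgoing_headers["baggage"])
```

B would forward the same header unchanged on its own outgoing call to C, which is what makes the context available along the whole A -> B -> C path.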

Example:
Assume that, in the scheme above, A is a mobile app that propagates a device.id (unique, but not attributable to a person, so not PII) through the baggage, and C is a backend service on the downstream invocation path of A. Suppose C shows an increased error rate for the error group XYZ. With baggage in place, the device.id could easily be added as an additional label to all errors. By counting the number of unique device.ids on XYZ, an SRE could then answer how many users (i.e. mobile devices) are affected by that error.
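The counting step in this example could look roughly like the sketch below. The error records, their field names (`error_group`, `labels`), and the sample values are made up for illustration; in practice this would more likely be a unique-count aggregation in the backend rather than client-side code.

```python
from typing import Dict, List, Set

# Hypothetical error documents captured by service C, each labeled with the
# device.id that arrived through the baggage.
errors: List[Dict] = [
    {"error_group": "XYZ", "labels": {"device.id": "d-1"}},
    {"error_group": "XYZ", "labels": {"device.id": "d-2"}},
    {"error_group": "XYZ", "labels": {"device.id": "d-1"}},  # same device again
    {"error_group": "ABC", "labels": {"device.id": "d-3"}},  # different group
    {"error_group": "XYZ", "labels": {}},                    # no baggage received
]


def affected_devices(records: List[Dict], group: str) -> Set[str]:
    """Return the distinct device.ids seen on errors of the given group."""
    return {
        record["labels"]["device.id"]
        for record in records
        if record["error_group"] == group and "device.id" in record.get("labels", {})
    }


devices = affected_devices(errors, "XYZ")
affected_count = len(devices)
```

Errors that arrive without a device.id are left out of the count, so the result is a lower bound on the number of affected devices.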

Solution

- Create a spec for APM agents.
- Implement a baggage container as part of the tracing context. Where possible, reuse the implementation / SDK from OpenTelemetry.
- Propagate the baggage together with the tracing context (see the W3C spec). In this iteration, propagation should be implemented only for HTTP (in case propagation for other protocols implies additional work).
- Use the OpenTelemetry API for baggage declaration. This requires bridging the baggage part of the OTel API to our internal implementation.
- Define config option(s) that specify which baggage to attach where, something like:
  - baggage_to_attach_on_transactions = myBaggageKey_1, key_abc*
  - baggage_to_attach_on_spans = [EMPTY]
  - baggage_to_attach_on_errors = *
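As a rough illustration of how such options might be evaluated, the sketch below filters baggage keys against comma-separated wildcard patterns. It uses Python's `fnmatch.fnmatchcase` as a stand-in matcher, which is an assumption: it is case-sensitive and also interprets `?` and `[...]`, whereas the matching rules an agent would actually use are left to the spec. The helper names and the sample baggage are hypothetical, and `[EMPTY]` is represented as an empty string.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def parse_patterns(raw: str) -> List[str]:
    """Split a comma-separated option value into trimmed, non-empty patterns."""
    return [pattern.strip() for pattern in raw.split(",") if pattern.strip()]


def select_baggage(baggage: Dict[str, str], patterns: List[str]) -> Dict[str, str]:
    """Keep only the baggage entries whose key matches at least one pattern."""
    return {
        key: value
        for key, value in baggage.items()
        if any(fnmatchcase(key, pattern) for pattern in patterns)
    }


# Option values taken from the proposal; "" stands in for [EMPTY].
config = {
    "baggage_to_attach_on_transactions": "myBaggageKey_1, key_abc*",
    "baggage_to_attach_on_spans": "",
    "baggage_to_attach_on_errors": "*",
}

# Hypothetical baggage received with the tracing context.
baggage = {"myBaggageKey_1": "a", "key_abc_user": "b", "other": "c"}

attached = {
    option: select_baggage(baggage, parse_patterns(raw))
    for option, raw in config.items()
}
```

With these values, transactions would receive `myBaggageKey_1` and `key_abc_user`, spans would receive nothing, and errors would receive every baggage entry.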

Spec Issue

[META 757] Spec: Baggage in APM Agents #758

Agent Issues


GeorgeGkinis · 2024-04-10T09:16:29Z

I see no plan for support in the RUM agent?

Currently baggage seems to get dropped?
https://www.elastic.co/guide/en/apm/agent/rum-js/4.x/opentracing.html#opentracing-baggage

We have a use case where we want to know which user-facing services are impacted by error.culprits on other services down the line.
Being able to propagate data from our frontend (i.e. which button was clicked) down the line could solve our use case, as the documents with error.culprit filled would include the impacted client-facing product/services and allow prioritising which issues to resolve first.

AlexanderWert added meta apm-agents labels Feb 6, 2023
