The baggage concept allows users to propagate additional (custom) context downstream together with the tracing context. For example, when there are services
A -> B -> C
service A could propagate information to B and C that would otherwise be available only in the scope of A (e.g. user ID, device.id in the case of mobile, a product ID, etc.). Such additional context can be extremely valuable when troubleshooting performance and reliability issues.
Example:
Let's assume that in the scheme above, A is a mobile app that propagates a device.id (unique, but not attributable to a person, so no PII) through the baggage. C is a backend service on the downstream invocation path of A. Let's assume C has an increased error rate for the error group XYZ. With the baggage implementation, the device.id could easily be added as an additional label to all errors. Thus, just by counting the unique device.ids on XYZ, an SRE could answer the question of how many users (i.e. mobile devices) are affected by that error.
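The counting step in this example can be sketched in a few lines of Python. This is only an illustration: the document shape and the field names (`error_group`, `labels`) are hypothetical and do not reflect any actual APM data model.

```python
from collections import defaultdict

# Hypothetical error documents; the field names are illustrative only.
errors = [
    {"error_group": "XYZ", "labels": {"device.id": "dev-1"}},
    {"error_group": "XYZ", "labels": {"device.id": "dev-2"}},
    {"error_group": "XYZ", "labels": {"device.id": "dev-1"}},
    {"error_group": "ABC", "labels": {"device.id": "dev-3"}},
]


def affected_devices(error_docs, label="device.id"):
    """Map each error group to the set of distinct label values seen on its errors."""
    devices = defaultdict(set)
    for doc in error_docs:
        value = doc.get("labels", {}).get(label)
        if value is not None:
            devices[doc["error_group"]].add(value)
    return dict(devices)


# Number of distinct devices affected per error group.
counts = {group: len(ids) for group, ids in affected_devices(errors).items()}
```

In a real deployment this aggregation would presumably run as a unique-count query in the backing store rather than in application code.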
Solution
Create spec for APM agents
Implement a baggage container as part of the tracing context. Where possible, reuse the implementation / SDK from OpenTelemetry.
Propagate the baggage together with the tracing context (see W3C SPEC). In this iteration, propagation should be implemented only for HTTP if propagation for other protocols implies additional work.
We will use the OpenTelemetry API for baggage declaration. This requires bridging the baggage part of the OTel API to our internal implementation.
Define config option(s) that specify which baggage entries to attach to which event type (transactions, spans, errors).
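For the HTTP propagation step listed above, the W3C Baggage format carries entries in a `baggage` header as comma-separated `key=value` members with percent-encoded values. The following is a simplified sketch of encoding and decoding such a header. The function names are hypothetical, and the sketch does not validate keys, enforce the size limits in the specification, or preserve member properties.

```python
from urllib.parse import quote, unquote


def encode_baggage(entries):
    """Serialize a dict into a `baggage` header value (comma-separated key=value members)."""
    return ",".join(
        f"{key}={quote(str(value), safe='')}" for key, value in entries.items()
    )


def decode_baggage(header):
    """Parse a `baggage` header value into a dict, dropping optional ;properties."""
    entries = {}
    for member in header.split(","):
        member = member.strip()
        if not member or "=" not in member:
            continue  # skip empty or malformed members
        pair = member.split(";", 1)[0]  # ignore properties after the first ';'
        key, _, value = pair.partition("=")
        entries[key.strip()] = unquote(value.strip())
    return entries


# Example: what service A might send and what service C might read back.
header = encode_baggage({"device.id": "abc 123", "product.id": "p-42"})
received = decode_baggage(header)
```

An agent that reuses the OpenTelemetry SDK, as suggested above, would rely on its propagator instead of hand-written parsing like this.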
We have a use case where we want to know which user-facing services are impacted by error.culprits on other services down the line.
Being able to propagate data from our frontend (i.e. which button was clicked) down the line could solve our use case, as the documents with error.culprit filled would include the impacted client-facing product/services and allow prioritising which issues to resolve first.
Example values for the config option(s) described under Solution:
baggage_to_attach_on_transactions = myBaggageKey_1, key_abc*
baggage_to_attach_on_spans = [EMPTY]
baggage_to_attach_on_errors = *
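One way an agent could evaluate such options is to filter the current baggage by key pattern before attaching it as labels. The sketch below assumes the options have already been parsed into lists of patterns and that `*` is a case-sensitive wildcard. Both are assumptions, and the names `CONFIG` and `baggage_to_attach` are hypothetical. Note that `fnmatchcase` also interprets `?` and `[...]`, which may be broader than the intended syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical parsed form of the three options; an empty list attaches nothing.
CONFIG = {
    "transactions": ["myBaggageKey_1", "key_abc*"],
    "spans": [],
    "errors": ["*"],
}


def baggage_to_attach(baggage, patterns):
    """Return the baggage entries whose key matches any configured pattern."""
    return {
        key: value
        for key, value in baggage.items()
        if any(fnmatchcase(key, pattern) for pattern in patterns)
    }


baggage = {"myBaggageKey_1": "a", "key_abcdef": "b", "device.id": "dev-1"}

on_transactions = baggage_to_attach(baggage, CONFIG["transactions"])
on_spans = baggage_to_attach(baggage, CONFIG["spans"])
on_errors = baggage_to_attach(baggage, CONFIG["errors"])
```

With this sample baggage, transactions receive only the two matching keys, spans receive nothing, and errors receive every entry, including `device.id`.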
Spec Issue
Agent Issues