Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have the global Propagators default to a no-op instance. #721

Conversation

carlosalberto
Copy link
Contributor

@carlosalberto carlosalberto commented Jul 20, 2020

Fixes #428

Changes

Have the global Propagators default to a no-op instance. The change itself is small but the side effects are important. This change is proposed as:

  • Some APIs may not be able to provide propagation without a SDK.
  • Keep it simple ("least surprise property").
  • Avoid dealing with any unexpected, non-obvious corner case.

Also see this comment

Note: There was initial, overall agreement on also having the TraceContext propagator moved out of the API, but will do that in a separate PR as a follow up.

cc @dyladan

@carlosalberto carlosalberto requested a review from a team July 20, 2020 23:53
@carlosalberto carlosalberto added spec:context Related to the specification/context directory area:api Cross language API specification issue release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Jul 20, 2020
@dyladan
Copy link
Member

dyladan commented Jul 20, 2020

Observe that API/SDK will still be expected to contain a default implementation of TraceContext, B3 and Zipkin

Does this mean it will be contained in the API or the SDK?

@Oberon00
Copy link
Member

I think either the SDK or a separate propagator pacakage that only depends on API+context layer? A propagator implementation is no API, so I see no reason for it to be there.

@carlosalberto
Copy link
Contributor Author

A propagator implementation is no API, so I see no reason for it to be there.

Observe that TraceContext propagators are usually in API, based on what I observed from a few languages. Other propagators will probably land in either the SDK or a contrib/extension package.

@Oberon00
Copy link
Member

Did these languages also have it as the default API-only propagator?

Anyway, this reminds me that we should really have a document explaining the goals of the API / SDK split (I'm not sure there is an agreed set of these currently) and guidelines to decide what to put where.

@dyladan
Copy link
Member

dyladan commented Jul 21, 2020

A propagator implementation is no API, so I see no reason for it to be there.

Observe that TraceContext propagators are usually in API, based on what I observed from a few languages. Other propagators will probably land in either the SDK or a contrib/extension package.

It's odd to me that a propagator would be the one piece of actual functionality in an otherwise totally no-op api.

@carlosalberto
Copy link
Contributor Author

It's odd to me that a propagator would be the one piece of actual functionality in an otherwise totally no-op api.

The idea, IIRC, is that users can do inter-process code propagation with the API alone if they want.

@Oberon00
Copy link
Member

This would also be possible with the propagator being in a separate artifact. But I agree that would be more complicated, although it varies by language how much boilerplate, etc. it is to create a separate artifact.

@carlosalberto
Copy link
Contributor Author

Did these languages also have it as the default API-only propagator?

In Java/Python they do, for a start.

@SergeyKanzhelev
Copy link
Member

In .NET propagation is enabled when either logging or tracing is enabled. At any verbosity. There may be no tracing exporter configured, but IDs will be propagated. So error logs can be correlated with the originating request on front end service. This also simplifies monitoring using service mesh as critical-only verbosity logs are configured on many .NET apps out of the box.

I would consider the .NET implementation to be a legit implementation of OpenTelemetry so I would suggest to add a language to proposed change that either default propagator is OK.

Furthermore, I would advocate to include propagators to API so longer term both scenarios - logs correlation and service mesh-based monitoring would work out of the box for any application. I believe we can get to the level of expectations when not propagated headers would be a surprising behavior.

That's said if there is a strong pushback - it's OK to migrate propagators to API in later versions. But please, include the language in specification that allows default propagator to be W3C one.

@Oberon00
Copy link
Member

Default if SDK is included is another thing, but the API must be completely no-op by default IMHO.

@yurishkuro
Copy link
Member

I think baggage propagation should be on by default, even in the absence of any SDK.

@Oberon00
Copy link
Member

I don't think we have baggage (in the OpenTracing sense) even specified yet.

@yurishkuro
Copy link
Member

baggage/correlation context. They are independent of any telemetry reporting.

@Oberon00
Copy link
Member

I was under the impression that Correlation context was write-only for the application, and only to be used to associate labels with metrics and attributes with spans? But maybe I got it exactly wrong?

@anuraaga
Copy link
Contributor

Can someone describe what it means exactly to do propagation without any SDK? Generally the actual propagation is implemented in instrumentation which relies on having SDK, for example the servlet instrumentation extracts from headers, OkHttp client instrumentation writes to headers. I can only imagine that without an SDK, a user would be calling the API methods directly to read / write baggage with their (presumably non-instrumented) server/client. Is there even a concept of "on by default" for this case?

@carlosalberto
Copy link
Contributor Author

Generally the actual propagation is implemented in instrumentation which relies on having SDK

Instrumentation only needs an API to work, and actually it would be a bad idea to have it rely on any specific SDK implementation.

@anuraaga
Copy link
Contributor

Instrumentation only needs an API to work

Ah sorry, I keep on forgetting this - I think it's because I just can't shake the weird feeling of depending on, e.g., opentelemetry-instrumentation-grpc but it only doing propagation without actually instrumenting unless the SDK is also brought in. Anyways, it's not related to this PR.

@carlosalberto
Copy link
Contributor Author

But please, include the language in specification that allows default propagator to be W3C one.

What about having the TraceContext propagator in the API without being installed, and have either the user manually or some configuration/util library 'automatically' install it? Not enough for your requirements?

I personally don't like the idea of having different defaults in different languages, as it may create confusion, so I'd try to avoid that if possible 😉

@carlosalberto
Copy link
Contributor Author

Ping @SergeyKanzhelev ;)

@SergeyKanzhelev
Copy link
Member

@carlosalberto I don't think we can eliminate different defaults at this stage. In .NET many components are pre-instrumented. In other languages, user needs to install additional libraries to instrument components.

What the text was saying is that once something is instrumented - context have to be propagated. So if some library is pre-instrumented, it is automatically a part of a distributed tracing. So it will send Trace Context to the backend that can be used to correlate things. If language/component is not instrumented, there is an explicit step to enable API. I have doubts anybody will enable API without the desire to get Trace Context (even without SDK). Since it was an explicit step - there shouldn't be much concern.

Blocking PR for now.

WDYT?

@carlosalberto
Copy link
Contributor Author

@SergeyKanzhelev Ping ;)

@SergeyKanzhelev
Copy link
Member

SergeyKanzhelev commented Aug 14, 2020

The feedback specifically for this PR is:

  • from @Oberon00: expand the definition of the user.
  • from me: we should allow languages to decide on global propagators. There is no need to limit what languages will decide to do since we already have examples of this working.
  • from me, @bogdandrutu and @yurishkuro: Why no-op shouldn't proppogate? What breaks if this is required?

@carlosalberto @Oberon00 and @jkwatson seems to be voting for this PR.

@carlosalberto if you want to move this PR forward, I'd suggest to re-word it into:

  • libraries instrumented with OpenTelemetry API MUST call into propagation APIs to inject and extract context even when no SDK present or monitoring disabled. Default propagators implementation may be either no-op or propagation using TraceContext.
  • there should be a way for application owner disable propagation of context. Languages must define either global switch or per-library.

I believe this will satisfy all feedback.

@Oberon00
Copy link
Member

Default propagators implementation may be either no-op or propagation using TraceContext.

I really think we have to decide on either one or the other (per situation), and I think at least without an SDK this must be no-op.

@SergeyKanzhelev
Copy link
Member

Default propagators implementation may be either no-op or propagation using TraceContext.

I really think we have to decide on either one or the other (per situation), and I think at least without an SDK this must be no-op.

Can you please elaborate on a reasoning to make a definitive decision here? I see languages and instrumentation patterns as being too different. Also, with the introduction of logging we will have more information on this as we might (?) want to use context for logging even without tracing enabled. But would it even be possible in different languages as logging APIs/consumers are very language specific - based on existing loggers.

So I think postponing this decision for now is OK. It can always be adjusted in future versions.

@carlosalberto carlosalberto requested a review from a team August 22, 2020 13:21
@carlosalberto
Copy link
Contributor Author

@Oberon00 Added a note on the fact that any component (other than the API) is allowed to call Set Global Propagator (this makes much more sense than restricting it to the application developer IMHO).

@SergeyKanzhelev Added a note on the fact that .Net is an exception and will provide a set of default Propagators. Wondering if you know exactly what Propagators will be used (could add that as documentation); but more importantly, was wondering if you want me to add an explanation on why this is needed (if YES, please propose a sentence or two on the matter).

cc @reyang

@SergeyKanzhelev
Copy link
Member

I don't think that suggested phrasing reflects the feedback given. Let me add more comments

specified in [api-correlationcontext.md](../correlationcontext/api.md#serialization),
in order to provide propagation even in the presence of no-op
OpenTelemetry implementations.
If offered, the global `Propagator`s MUST default to a no-op instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosalberto in this PR are you advocating for code cleanliness or scenario? This may be the root of the misunderstanding. I think myself, @yurishkuro and @bogdandrutu all advocating for a long term goal of having context ambiently propagated everywhere. It may be that the extensions you mentioned below would be the mechanism of this enablement, but I believe this context should be everywhere.

For example, to explain better what happens in .NET. - there is no context propagation when nothing is enabled. But even smaller level of logging enablement (which is default) will enable context propagation. We clearly recognize some people want to turn it off, But default is ON.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the lack of context. A week ago, in the Specification SIG call (at least @bogdandrutu was there), we came to an agreement on having the API with no-op Propagators by default (.Net will keep its current behavior, given the specific needs of how it's used in the .Net framework and how things are instrumented by default).

But even smaller level of logging enablement (which is default) will enable context propagation

Oh, actually this is what I tried to convey when I mentioned OpenTelemetry extensions (same in spirit, although not quite the same).

cc @bogdandrutu @yurishkuro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosalberto
Copy link
Contributor Author

@open-telemetry/specs-approvers Please review/approve this PR. It has been standing idle for a week now, and although I'm waiting for feedback from @SergeyKanzhelev before moving on, general approval needs to happen as well.

specification/context/api-propagators.md Outdated Show resolved Hide resolved
specification/context/api-propagators.md Outdated Show resolved Hide resolved
specification/context/api-propagators.md Show resolved Hide resolved
Comment on lines +257 to +258
Note: OpenTelemetry .NET is an exception to this rule as his global
`Propagator`s default to a set of specific `Propagator`s, including `TraceContext`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to reference the .NET documentation giving more details on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reyang @SergeyKanzhelev Any 'official' documentation we could mention here for the .NET case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 16, 2020
@arminru
Copy link
Member

arminru commented Sep 17, 2020

@carlosalberto Can we close this as there seems to be consensus on #930?

@carlosalberto
Copy link
Contributor Author

Let's go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:context Related to the specification/context directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Default global propagators
10 participants