-
Notifications
You must be signed in to change notification settings - Fork 849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GlobalOpenTelemetry trigger of autoconfiguration is opt-in #5010
GlobalOpenTelemetry trigger of autoconfiguration is opt-in #5010
Conversation
Codecov ReportBase: 91.27% // Head: 91.27% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #5010 +/- ##
=========================================
Coverage 91.27% 91.27%
- Complexity 4885 4892 +7
=========================================
Files 552 552
Lines 14433 14454 +21
Branches 1373 1375 +2
=========================================
+ Hits 13173 13193 +20
Misses 874 874
- Partials 386 387 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
I think this is a good change. How can we best ensure that this won't be a badly breaking change for users, or at least that they will be able to proactively adjust for it? |
2051d96
to
1a2a776
Compare
I've added a commit that allows |
+ " To enable, run your JVM with -D" | ||
+ GLOBAL_AUTOCONFIGURE_ENABLED_PROPERTY | ||
+ "=true"); | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is null the right response here, rather than the no-op implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the caller of this...I think we might be doing the wrong thing right now. If this method returns null, we're calling set
with the no-op impl. That will mean (I think?) that we will always end up returning the no-op, because of the check in the set
method. I think this is wrong because we do want people to be able to set
after calling a uninitialized get
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will mean (I think?) that we will always end up returning the no-op, because of the check in the set method.
That is correct. Here's a summary of the behavior as of right now on main
:
- If
GlobalOpenTelemetry.get()
is called beforeGlobalOpenTelemetry.set()
, anOpenTelemetry
instance is resolved and set usingGlobalOpenTelemetry.set()
. - If autoconfigure is on the classpath and autoconfigure initializes without error,
GlobalOpenTelemetry
will be set to an autoconfigured instance ofOpenTelemetrySdk
. - If autoconfigure is not on the classpath, or some error occurs during initialization,
GlobalOpenTelemetry
will be set toOpenTelemetry.noop()
.
I think this is wrong because we do want people to be able to set after calling a uninitialized get.
I'm not so sure. The current behavior is inconvenient because it forces users to either call GlobalOpenTelemetry.set
OR have autoconfiguration do so before any call to GlobalOpenTelemetry.get
. Trying to call GlobalOpenTelemetry.set
later triggers an exception which can only be resolved by adjusting app initialization ordering. But the benefit of this is that GlobalOpenTelemetry.get
always returns the same value, regardless of the order in which its called. I.e. you can't have one caller receive a OpenTelemetry.noop()
and another caller receive an autoconfigured OpenTelemetrySdk
. This is a nice invariant to be able to rely on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I can see that side of things as well (wanting in invariant). I guess it boils down to: in the case of incorrect initialization ordering, do you want a) nothing to be emitted from the SDK at all or b) at least some things emitted, from the instruments/tracers/etc that are initialized after the SDK initialization has taken place.
I don't know that there's an easy answer to this one. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup it's a good question and is kind of a design philosophy question between fail fast or allow partial success.
I'm prefer to the fail fast because I think the absence of all telemetry provides a really strong signal to go and fix the issue.
If we do want to change this behavior should probably do it in a different PR to separate concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I'll be able to make sure no one will call get before I create the OpenTelemetry object.
I will need to doublecheck this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I'll be able to make sure no one will call get before I create the OpenTelemetry object.
You don't need to. If you rely on GlobalOpenTelemetry.get()
calling AutoConfiguredOpenTelemetrySdk.initialize()
, you will just need to opt in to that behavior with an environment variable or system property.
The current behavior should still be possible, but it shouldn't come at the expense of other scenarios that don't want the side affect. It makes sense to the AutoConfiguredOpenTelemetrySdk.initialize()
side affect opt-in rather than opt-out because autoconfigure users should be comfortable adding environment variables / system properties so it shouldn't be a problem be a big lift to add one more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much support this change, I think it makes using GlobalOpenTelemetry
less scary by removing this side-effect, which is important for Java agent users.
I'm not sure it's the best solution to this problem:
The more generalized problem is how do SDK extension components (exporters, processors, etc) get access to OpenTelemetry so they can instrument themselves?
because ideally the SDK extension components (e.g. a SpanExporter) would get access to the specific OpenTelemetry instance which they were used to create, instead of the GlobalOpenTelemetry instance which might be a different OpenTelemetry instance (e.g. configured to a different endpoint)
@trask opened #5021 to propose a solution to this. I think both this PR and a solution like #5021 are needed. |
GlobalOpenTelemetry and SDK initializationBelow is what I think is an exhaustive list of ways the SDK can be initialized and set in GlobalOpenTelemetry (or not). Today, all of these initialization methods exist with the exception of option 3 (which is new with this PR). This PR also modifies the existing option 4.
Note: for all of the auto-configuration options listed above, an SDK will always be created and set in GlobalOpenTelemetry. When SDK auto-configuration is disabled with In the case where the OpenTelemetry Kubernetes Operator is used with |
@deejgregor its all so simple! 😅
Just one clarification that you can use autoconfigure without |
Thanks. I've added it as 7. I was struggling with how to include this when I made the list last night (and this is important, since it is the case that was referenced in the creation of this issue!). I also tweaked 5 for the case with I also added notes on the Kubernetes Operator and which cases I think apply. Next up: the real reason why I made this list: to summarize impacts and pros/cons for a few use cases (and why I overall l am 👍 this change). |
I am 👍 with this change and having In #5010 (comment) I tried to enumerate the cases where the SDK is initialized and/or
I believe that's it. Only people initializing via a call to The users that I don't believe would be impacted at all are those using the Java agent (as long as it is enabled; case 1), those that don't have To continue with @trask's comment:
Personally, I'd like to see more libraries support OpenTelemetry without requiring explicit OpenTelemetry injection and Now, one effect of more libraries using Lastly, for OSGi, we want to avoid any uses of |
@jkwatson pending any more comments from you, I'd like to merge this after another day or two. There's quite a bit of work which is dependent on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
…metry#5010) * Do not initialize AutoConfiguredOpenTelemetrySdk in OpenTelemetry.get * GlobalOpenTelemetry triggers autoconfigure based on env var / system property
GlobalOpenTelemetry.get
has a side affect where ifGlobalOpenTelemetry.set
has not been called andopentelemetry-sdk-extension-autoconfigure
is on the classpath, the OpenTelemetrySdk will automatically be configured and set viaGlobalOpenTelemetry.set
.This is problematic for the reasons described in this comment:
Essentially there is no way to include
opentelemetry-sdk-extension-autoconfigure
and haveAutoConfiguredOpenTelemetrySdkBuilder.setResultAsGlobal(false)
since any code that callsGlobalOpenTelemetry.get
will trigger autoconfiguration undermine the attempt to not set GlobalOpenTelemetry.Then there's the question of who / what instrumentation wants this side affect? As @mateuszrzeszutek says:
This PR requires users to opt-in to the side affect by setting
otel.java.global-autoconfigure.enabled=true
.