[logs] Mitigate unwanted object creation during configuration reload #5514
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #5514 +/- ##
==========================================
+ Coverage 83.38% 85.30% +1.91%
==========================================
Files 297 289 -8
Lines 12531 12580 +49
==========================================
+ Hits 10449 10731 +282
+ Misses 2082 1849 -233
|
Is there no way to prevent the creation of these processors? Also, there was some discussion around this happening due to the usage of |
I haven't come up with a way to prevent it. The problem is in these configuration delegates users may do this: options.AddProcessor(new MyProcessor());
Kind of but not totally. We could change these two spots... opentelemetry-dotnet/src/OpenTelemetry/Logs/ILogger/OpenTelemetryLoggingExtensions.cs Line 195 in c7c7a69
opentelemetry-dotnet/src/OpenTelemetry/Logs/ILogger/OpenTelemetryLoggingExtensions.cs Line 252 in c7c7a69
... to use But we have this public ctor we can't change: opentelemetry-dotnet/src/OpenTelemetry/Logs/ILogger/OpenTelemetryLoggerProvider.cs Line 36 in c7c7a69
Anyone using that will run into the problem. Also we can't prevent anyone from doing We may want to also support hot reload of things like I decided to go this route because I thought it would work more broadly. |
+1, I think the issue is not about "thread leaks", it is "unwanted objects creation during configuration refresh". |
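The "unwanted object creation during configuration refresh" described above can be reproduced outside OpenTelemetry. The following is a minimal sketch, not the SDK's code: `DemoOptions` and `FakeProcessor` are hypothetical stand-ins for `OpenTelemetryLoggerOptions` and a real processor, and the sketch assumes the `Microsoft.Extensions.DependencyInjection`, `Microsoft.Extensions.Configuration`, and `Microsoft.Extensions.Options.ConfigurationExtensions` packages are referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

// Hypothetical stand-in for a processor/exporter that is expensive to create.
public sealed class FakeProcessor
{
    public static int Created;
    public FakeProcessor() => Created++;
}

// Hypothetical stand-in for OpenTelemetryLoggerOptions.
public sealed class DemoOptions
{
    public string? Name { get; set; }
    public List<FakeProcessor> Processors { get; } = new();
}

public static class ReloadDemo
{
    // Counts FakeProcessor creations for one startup read plus one reload.
    public static int CountCreations(bool useMonitor)
    {
        FakeProcessor.Created = 0;

        IConfigurationRoot config = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string?> { ["Demo:Name"] = "a" })
            .Build();

        var services = new ServiceCollection();
        // Binding to IConfiguration registers a change token source, which is
        // what lets IOptionsMonitor react to reloads.
        services.Configure<DemoOptions>(config.GetSection("Demo"));
        // The problematic pattern: a configuration delegate that creates objects.
        services.Configure<DemoOptions>(o => o.Processors.Add(new FakeProcessor()));

        using ServiceProvider sp = services.BuildServiceProvider();

        if (useMonitor)
        {
            var monitor = sp.GetRequiredService<IOptionsMonitor<DemoOptions>>();
            _ = monitor.CurrentValue; // first build runs the delegates
            config.Reload();          // invalidates the cached options
            _ = monitor.CurrentValue; // rebuilt options, delegates ran again
        }
        else
        {
            var options = sp.GetRequiredService<IOptions<DemoOptions>>();
            _ = options.Value;        // first build runs the delegates
            config.Reload();          // IOptions does not listen for reloads
            _ = options.Value;        // same cached instance
        }

        return FakeProcessor.Created;
    }

    public static void Main()
    {
        Console.WriteLine($"IOptionsMonitor: {CountCreations(useMonitor: true)} processor(s) created");
        Console.WriteLine($"IOptions:        {CountCreations(useMonitor: false)} processor(s) created");
    }
}
```

With the Options library versions I am familiar with, the `IOptionsMonitor` path should report one more creation than the `IOptions` path, because the delegate runs again after `Reload()`.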
That should be fine, right? If a user is explicitly using
We could still make that available in future using some newer public API or parameter. However, our default setup can avoid |
How will they know
Let's say some contributor comes and adds an option Simply switching to I just pushed updates. What I decided to do was just disable reload completely for
|
The golden answer: unit test. |
That's the thing. If I did what @utpilla originally suggested and just switched to But I think the point is moot. The current form of the PR disabling reload I think checks all the boxes. If in the future this friendly contributor wants to add |
From #5513
@CodeBlanch Could you also tell why is this only a problem for Logs and not other signals? Or why was |
src/OpenTelemetry/CHANGELOG.md
@@ -2,6 +2,12 @@

## Unreleased

* New instances of `OpenTelemetryLoggerOptions` will no longer be created during |
I guess the description (which is phrased as a behavior change) here might be confusing to users - most of them probably don't even know about `OpenTelemetryLoggerOptions`.
Consider being more explicit here by starting with "Fixed an issue ..." (whether `OpenTelemetryLoggerOptions` will be created or not is an implementation detail that we might change later if we figured out how to properly support dynamic configuration reload).
New version pushed
Sorry I don't understand, could you add a unit test to make sure that even if configuration got changed, there is only one processor/exporter created? (fail the test if there is more than one exporter created) |
A lot of history here. I'll do my best to try and explain it 🤣 There are two different concepts: Builders and Options. Builders use the
Would it fix the issue in Logging to not do this?

if (configureOptions != null)
    services.Configure(configureOptions);

It will fix some issues (but not all) and it will introduce whole new issues 🤣 For example users can do these types of things...

services.Configure<OpenTelemetryLoggerOptions>(o => o.AddProcessor(...));
services.Configure<OpenTelemetryLoggerOptions>(o => o.AddSomeExporterViaSomeExtensions(...));
services.AddOptions<OpenTelemetryLoggerOptions>()
    .Configure<IConfiguration>((options, config) =>
    {
        if (config.GetValue("OpenTelemetry:EnableLogging", false))
        {
            options.AddProcessor(new BatchLogRecordExportProcessor(new MyExporter()));
        }
    });

If we remove our But removing it will introduce ordering issues. For example this...

services.AddLogging(logging => logging.AddOpenTelemetry(o => o.IncludeFormattedMessage = true));
services.PostConfigure<OpenTelemetryLoggerOptions>(o => o.IncludeFormattedMessage = false);

Options API has a certain order it executes in. In that code the If we did something like this instead...

var options = sp.GetRequiredService<IOptions<OpenTelemetryLoggerOptions>>().Value;
if (configureOptions != null)
    configureOptions(options); // Manually invoke configuration delegate instead of using Options API

What that is doing is manually invoking the configuration delegate on the final options instance. That will make the order essentially:

// Execute PostConfigure
o.IncludeFormattedMessage = false; // What the user wanted
// Manually invoke configure delegate
o => o.IncludeFormattedMessage = true; // What the user ends up with

Logging using |
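The ordering concern above can be checked in isolation. This is a small sketch under stated assumptions: `DemoOptions` is a hypothetical stand-in for `OpenTelemetryLoggerOptions`, and `ViaManualInvocation` only imitates the "manually invoke the delegate" idea from the comment rather than any code in the SDK. It assumes the `Microsoft.Extensions.DependencyInjection` and `Microsoft.Extensions.Options` packages.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

// Hypothetical stand-in for OpenTelemetryLoggerOptions.
public sealed class DemoOptions
{
    public bool IncludeFormattedMessage { get; set; }
}

public static class OrderingDemo
{
    // Configure delegates run first, PostConfigure delegates run last,
    // so the PostConfigure value is the one the user ends up with.
    public static bool ViaOptionsApi()
    {
        var services = new ServiceCollection();
        services.Configure<DemoOptions>(o => o.IncludeFormattedMessage = true);
        services.PostConfigure<DemoOptions>(o => o.IncludeFormattedMessage = false);

        using ServiceProvider sp = services.BuildServiceProvider();
        return sp.GetRequiredService<IOptions<DemoOptions>>().Value.IncludeFormattedMessage;
    }

    // Invoking the delegate by hand on the final instance runs it after
    // PostConfigure, which silently overrides what the user asked for.
    public static bool ViaManualInvocation()
    {
        Action<DemoOptions> configureOptions = o => o.IncludeFormattedMessage = true;

        var services = new ServiceCollection();
        services.PostConfigure<DemoOptions>(o => o.IncludeFormattedMessage = false);

        using ServiceProvider sp = services.BuildServiceProvider();
        DemoOptions options = sp.GetRequiredService<IOptions<DemoOptions>>().Value;
        configureOptions(options);
        return options.IncludeFormattedMessage;
    }

    public static void Main()
    {
        Console.WriteLine($"Options API:       {ViaOptionsApi()}");
        Console.WriteLine($"Manual invocation: {ViaManualInvocation()}");
    }
}
```

The two methods disagree on the final value of `IncludeFormattedMessage`, which is the ordering regression the comment warns about.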
The tests are there already: What I was saying is those tests are not possible using the design @utpilla was suggesting. |
I wasn't following this. I'm good 👍 as long as there is unit test coverage. |
…h/opentelemetry-dotnet into sdk-log-options-reload
src/OpenTelemetry/Logs/ILogger/OpenTelemetryLoggingExtensions.cs
Co-authored-by: Reiley Yang <[email protected]>
// Note: We disable built-in IOptionsMonitor features for
// OpenTelemetryLoggerOptions as a workaround to prevent unwanted
// objects (processors, exporters, etc.) being created by
// configuration delegates during reload of IConfiguration.
services.DisableOptionsMonitor<OpenTelemetryLoggerOptions>();
Do you need all this infrastructure? Instead, can't you just use `IOptions` instead of `IOptionsMonitor` everywhere?
This PR is changing OTel to use `IOptions` (instead of `IOptionsMonitor`) for `OpenTelemetryLoggerOptions`. But my thinking is, we can't prevent users from accessing `IOptionsMonitor` and we can't prevent some future dev from re-introducing it. The infrastructure here is so we can make it deterministic and have unit tests validating it will work correctly should `IOptionsMonitor<OpenTelemetryLoggerOptions>` sneak into the process anywhere.
- Do we really expect someone outside of OpenTelemetry to try to get an `IOptionsMonitor<OpenTelemetryLoggerOptions>`? What would they do with it?
- If someone does explicitly use `IOptionsMonitor`, don't they want updates as the config changes? That's why they choose IOptionsMonitor.

Would it be better to just fail in this case if we explicitly want to block it?
While it's unclear why anyone else would need to monitor `OpenTelemetryLoggerOptions` for changes when the core library won't react to changes itself, it feels like a code smell to go out of our way to prevent it by replacing core options services and changing how they work. What if someone wants to just new up an `OpenTelemetryLoggerOptions` and calls `options.AddProcessor(new BatchLogRecordExportProcessor(new OtlpLogExporter(new())))` in a test or something? Should they know that this will spawn a thread that will never get stopped?
I think the core issue is how `BatchExportProcessor<T>` spawns a thread in its constructor.
opentelemetry-dotnet/src/OpenTelemetry/BatchExportProcessor.cs
Lines 60 to 65 in 876e4fa
this.exporterThread = new Thread(this.ExporterProc)
{
    IsBackground = true,
    Name = $"OpenTelemetry-{nameof(BatchExportProcessor<T>)}-{exporter.GetType().Name}",
};
this.exporterThread.Start();
I know that this API has already shipped, so adding a `StartAsync` method or something like that may not be feasible, but could you unseal `OnStart` in the base class and start the thread there? Or lazily start the thread some other way? If not, should we deprecate the `BaseProcessor<LogRecord>` overload of `AddProcessor()` and tell people to use the `Func<IServiceProvider, BaseProcessor<LogRecord>>` overload instead?
If we have to override the `IOptionsMonitor<OpenTelemetryLoggerOptions>`, I agree with @eerhardt that it should throw from everything with a `NotSupportedException` indicating that reloading `OpenTelemetryLoggerOptions` is completely unsupported.
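To make the "lazily start the thread some other way" idea concrete, here is a rough sketch using only the base class library. `LazyWorker<T>` is a hypothetical type invented for illustration and is not how `BatchExportProcessor<T>` is implemented; it only shows that constructing an instance need not start a thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical worker that defers thread creation until the first item arrives.
public sealed class LazyWorker<T> : IDisposable
{
    private readonly BlockingCollection<T> queue = new();
    private readonly Action<T> export;
    private readonly Lazy<Thread> thread;

    public LazyWorker(Action<T> export)
    {
        this.export = export;

        // Nothing is started here. Lazy<T> defaults to a thread-safe mode in
        // which the factory runs at most once.
        this.thread = new Lazy<Thread>(() =>
        {
            var t = new Thread(this.ExporterProc) { IsBackground = true, Name = "LazyWorker" };
            t.Start();
            return t;
        });
    }

    public bool IsThreadStarted => this.thread.IsValueCreated;

    public void Enqueue(T item)
    {
        _ = this.thread.Value; // starts the worker thread on first use
        this.queue.Add(item);
    }

    private void ExporterProc()
    {
        // Blocks until items arrive; ends once CompleteAdding is called
        // and the queue has been drained.
        foreach (T item in this.queue.GetConsumingEnumerable())
        {
            this.export(item);
        }
    }

    public void Dispose()
    {
        this.queue.CompleteAdding();
        if (this.thread.IsValueCreated)
        {
            this.thread.Value.Join(); // wait for remaining items to be exported
        }

        this.queue.Dispose();
    }
}
```

With a design along these lines, an instance created inside a configuration delegate and then discarded would not leave a background thread running, although it would still be an unwanted allocation.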
I just pushed updates so that accessing `OpenTelemetryLoggerOptions` via `IOptionsMonitor` or `IOptionsSnapshot` will result in a `NotSupportedException`. Technically breaking, but I think the impact will be very low.

> If not, should we deprecate the BaseProcessor overload of AddProcessor() and tell people to use the Func<IServiceProvider, BaseProcessor overload instead?

This is the plan yes (more or less)! We have another API for building logging pipelines which does not suffer from these issues. The plan is to make that a stable API (#5442) and then we can `Obsolete` these `AddProcessor` methods on `OpenTelemetryLoggerOptions`.
Good discussion and thanks for the suggestions/ideas!
While I agree with most parts here, I suggest that we focus on "mitigate the issue quickly with minimum change/risk" instead of trying to get a complete solution in this PR. I'm specifically concerned about throwing exception in this PR. I suggest that we take the feedback and think about exception or other approaches in a follow up PR once the mitigation/hotfix is released.
> I suggest that we focus on "mitigate the issue quickly with minimum change/risk" instead of trying to get a complete solution in this PR

So that means just changing our references of `IOptionsMonitor<OpenTelemetryLoggerOptions>` => `IOptions<OpenTelemetryLoggerOptions>`? Don't do anything to prevent users from accessing IOptionsMonitor. Am I understanding that correctly?
If so, I agree that since this is a hotfix, keep it scoped to resolving the issue at hand.
I don't have strong opinion regarding the actual solution (e.g. IOptions/IOptionsMonitor, singleton, etc.), as long as it meets the two conditions #5514 (review) I'm good 👍.
I don't think throwing exception in this PR is the right thing to do, might be a good topic for another PR.
LGTM, I'm fine with whatever approach stops the unwanted object creation, as long as the following conditions are met:
- It is a simple change with very low risk, ideally with minimal lines of code touched.
- It is not giving user another shock (e.g. throwing exceptions and killing their app).
Any improvements can be made in future PRs after this hotfix.
…pen-telemetry#5514) Co-authored-by: Reiley Yang <[email protected]>
Fixes #5513
Changes
- `OpenTelemetryLoggerOptions` as a singleton and ignore `IConfiguration` reload(s).
Merge requirement checklist
- `CHANGELOG.md` files updated for non-trivial changes