-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[api] Fix memory leaks in TracerProvider.GetTracer API #4906
[api] Fix memory leaks in TracerProvider.GetTracer API #4906
Conversation
Codecov Report
@@ Coverage Diff @@
## main #4906 +/- ##
==========================================
+ Coverage 83.21% 83.51% +0.29%
==========================================
Files 295 295
Lines 12294 12324 +30
==========================================
+ Hits 10231 10292 +61
+ Misses 2063 2032 -31
Flags with carried forward coverage won't be shown. Click here to find out more.
|
{ | ||
if (this.tracers == null) | ||
{ | ||
// Note: We check here for a race with Dispose and return a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we need to set this.tracers = null
inside the same lock. Else we could still run into a situation where some thread calling Dispose
sets this.tracers
to null
after this if
check and before the new entry is added to the dictionary. We would want to return a no-op tracer in that case, but we would end up returning a valid tracer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked it a couple times. I think it is good! Could be I'm not seeing something though. Can you write out a flow for me that you think is flawed? Here are a couple flows I'm imagining.
Case where Dispose runs in the middle of the writer and gets the lock...
- Writer thread reads the
this.tracers
on Line 58. It is valid so it begins its work. - Dispose thread sets
this.tracers
tonull
. - Dispose thread takes the lock.
- Reader thread misses the cache and tries to take the lock. It has to wait.
- Dispose thread finishes its clean up and releases the lock.
- Writer thread gets the lock. Now it checks
this.tracers == null
. This will betrue
now and it will return a no-op instance.
Case where Dispose runs in the middle of the writer and waits on the lock...
- Writer thread reads the
this.tracers
on Line 58. It is valid so it begins its work. - Reader thread misses the cache and takes the lock. Inside the lock it checks
this.tracers == null
which isfalse
. It begins to do its work. - Dispose thread sets
this.tracers
tonull
. - Dispose thread tries to takes the lock. It has to wait.
- Writer thread adds a new tracer to the cache and releases the lock. It doesn't care that
this.tracers
is now actuallynull
because it is working on a local copy. - Dispose thread gets the lock and makes all the tracers in the cache no-ops including the one that was just added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For case 2,
Writer thread adds a new tracer to the cache and releases the lock. It doesn't care that this.tracers is now actually null because it is working on a local copy.
I think this is more of design choice. Yes, it doesn't care that this.tracers
is now actually null but it could care about it 😄.
I was thinking we could offer a stronger guarantee that we would never return a Tracer
when TracerProvider
is disposed or being disposed. We could avoid this limbo state where the Dispose method may or may not have marked the newly returned Tracer
no-op when its being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I merged the PR because I think what's there will work well enough. I'll circle back to this comment when I have a sec to see if I can simplify it or clean it up in a way that doesn't introduce a bunch of contention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a non-blocking comment: #4906 (comment)
Changes
TracerProvider
now maintains a cache of theTracer
s it has issued. When disposed it will turn them into no-op instances and release their associatedActivitySource
s.Details
Consider the following simple application:
Running that we will see memory growing per iteration that is never released:
What's going on here?
Today we create a
Tracer
each timeGetTracer
is called which is handed its ownActivitySource
. Creating spuriousActivitySource
s is dangerous because there is a static list of all active sources.Tracer
does NOT implementIDisposable
so users aren't given a chance to do this correctly.After the cache introduced on this PR the graph looks like this:
Merge requirement checklist
CHANGELOG.md
files updated for non-trivial changes