Logs: Add log record pooling #3385
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main    #3385      +/-   ##
==========================================
- Coverage   86.36%   86.22%   -0.14%
==========================================
  Files         258      261       +3
  Lines        9269     9411     +142
==========================================
+ Hits         8005     8115     +110
- Misses       1264     1296      +32
LGTM with a changelog entry.
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal int RemoveReference()
Minor: I have personally found AddRef/Release to be commonly used, though that might be a C++-only convention.
How passionately do you feel about this? I looked around and don't see any good example or prior art in .NET. AddRef w/ Release seems a bit odd to me for a .NET API. AddRef and RemoveRef would, I think, be better? Or AddRef w/ ReleaseRef.
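For context, here is a minimal sketch of what an Interlocked-based reference count could look like behind these method names (the type name, field name, initial value, and AddReference body are assumptions for illustration, not the PR's actual code):

using System.Runtime.CompilerServices;
using System.Threading;

internal sealed class PooledRecordSketch
{
    // Hypothetical field; a pooled record is handed back once the count drops to zero.
    private int poolReferenceCount = 1;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    internal void AddReference()
    {
        Interlocked.Increment(ref this.poolReferenceCount);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    internal int RemoveReference()
    {
        // Returns the decremented count so the caller can decide when to
        // return the record to the pool (i.e. when it reaches zero).
        return Interlocked.Decrement(ref this.poolReferenceCount);
    }
}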
only 1% 😄
Btw, this is a good article https://docs.microsoft.com/en-us/windows/win32/com/rules-for-managing-reference-counts
LGTM
return true;
}

current = current.Next;
I guess this may be improbable but still possible - might have a concurrency issue here in the event another thread changes the value of Next.
Now that you mention it, ya there are some issues here. I don't think AddProcessor is thread-safe for tracer or logger providers. I'm not going to tackle it on this PR. I don't even know if it is worth tackling at all, but we could at least add comments on our "AddProcessor" methods about thread safety?
This is the current behavior of AddProcessor:
- If multiple threads are calling AddProcessor simultaneously, the caller is responsible for doing proper synchronization (this follows the general .NET approach - member functions are not thread safe by default). See the sketch after this list.
- It is safe for a thread to call AddProcessor while other threads are using the processors (e.g. by generating Activity objects which go through the processor chain).
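A minimal sketch of that caller-side synchronization (the wrapper class and lock are illustrative, not part of the SDK, and this assumes AddProcessor is accessible from the calling code):

using OpenTelemetry;
using OpenTelemetry.Logs;

internal static class ProcessorRegistration
{
    private static readonly object SyncRoot = new object();

    // Callers that may add processors from multiple threads serialize the calls
    // themselves, since AddProcessor is an ordinary instance member and is not
    // guaranteed to be thread safe.
    public static void Register(OpenTelemetryLoggerProvider provider, BaseProcessor<LogRecord> processor)
    {
        lock (SyncRoot)
        {
            provider.AddProcessor(processor);
        }
    }
}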
And here is my general thinking regarding thread safety:
1. All the telemetry APIs must be thread safe, following the OpenTelemetry spec.
2. Any other APIs (e.g. the APIs exposed by the SDK package) should follow the general .NET guidance, which is:
   - static methods should be thread safe
   - member functions will not be thread safe
3. Certain functions might provide additional thread safety guarantees in addition to 2), and this should be called out explicitly in the API XML documentation:
opentelemetry-dotnet/src/OpenTelemetry/BaseExporter.cs
Lines 76 to 79 in 9bb7aaf
/// <remarks>
/// This function guarantees thread-safety.
/// </remarks>
public bool ForceFlush(int timeoutMilliseconds = Timeout.Infinite)
👍 makes sense. Agreed we should follow .NET's guidance here.
Different concern, but same area of code... someone could do the following:
var processor = new CompositeProcessor<LogRecord>(new SomeProcessor());
loggerProvider.AddProcessor(processor);
processor.AddProcessor(new BatchLogRecordExportProcessor(new SomeExporter()));
The provider would end up using the ThreadStaticPool, correct? Maybe this is far-fetched as well and not worth doing anything about 😆. I suppose it would require constantly inspecting for any changes to the processor pipeline.
Ya, that's a problem. We could add an event on CompositeProcessor like internal EventHandler ProcessorAdded that we could fire and use to listen for changes. Just an idea; not going to do anything on this PR to address it.
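To make the idea a bit more concrete, here is a standalone sketch of that event pattern (this is not the SDK's CompositeProcessor; the type and members are hypothetical):

using System;
using System.Collections.Generic;

internal sealed class ProcessorChainSketch<T>
{
    private readonly List<T> processors = new List<T>();

    // The provider could subscribe to this and re-evaluate which LogRecord pool
    // to use whenever the pipeline changes after initialization.
    internal event EventHandler? ProcessorAdded;

    public void AddProcessor(T processor)
    {
        this.processors.Add(processor);
        this.ProcessorAdded?.Invoke(this, EventArgs.Empty);
    }
}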
You probably won't like this idea, but we could drop the ThreadStaticPool for now - if we went with the event solution, that definitely seems fit for its own PR. I assume the SharedPool would still have perf benefits even when no batching is involved.
That said, this is not a blocker at all; I think it's probably fine to roll with this as a known edge case - and I do think it's an edge case. I haven't encountered users who modify their processor pipeline after initialization.
// only one of the logRecords will make it back into the
// pool. Anything lost in the race will be collected by the GC
// and the pool will issue new instances as needed
this.pool[returnSnapshot % this.Capacity] = logRecord;
Instead of losing the logRecord object, we could try to write it to the next available index by using CompareExchange instead of Exchange:
while (true)
{
    var rentSnapshot = Volatile.Read(ref this.rentIndex);
    var returnSnapshot = Volatile.Read(ref this.returnIndex);
    if (returnSnapshot - rentSnapshot >= this.Capacity)
    {
        return; // buffer is full
    }
    if (Interlocked.CompareExchange(ref this.returnIndex, returnSnapshot + 1, returnSnapshot) == returnSnapshot)
    {
-       this.pool[returnSnapshot % this.Capacity] = logRecord;
+       if (Interlocked.CompareExchange(ref this.pool[returnSnapshot % this.Capacity], logRecord, null) == null)
+       {
+           return;
+       }
    }
}
I like this idea; however, I think it might be optimizing for the wrong case. The race called out in the comment is for "many threads hammering rent/return", which is more of a theoretical exercise. The actual use (right now) is many threads renting/reading (log emitters) and one thread returning/writing (batch exporter). In that scenario, this race should really never be hit. So the Interlocked.CompareExchange would be more expensive than just doing this.pool[returnSnapshot % this.Capacity] = logRecord, which works for the actual use case we have right now. Are you open to me updating the comment with the solution and considerations so we have it captured (in the event the use case changes), but leaving it as-is for now so we keep better perf for exporters returning records?
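For reference, here is roughly what the kept plain-write return path looks like when spelled out with that reasoning (field names come from the snippets above; the method signature is an assumption):

internal void Return(LogRecord logRecord)
{
    while (true)
    {
        var rentSnapshot = Volatile.Read(ref this.rentIndex);
        var returnSnapshot = Volatile.Read(ref this.returnIndex);

        if (returnSnapshot - rentSnapshot >= this.Capacity)
        {
            return; // buffer is full; drop the record and let the GC reclaim it
        }

        if (Interlocked.CompareExchange(ref this.returnIndex, returnSnapshot + 1, returnSnapshot) == returnSnapshot)
        {
            // Plain write: if multiple returners ever race to the same slot, only
            // one record survives and the pool issues new instances as needed.
            // This is cheaper than a CompareExchange on the slot, which matters
            // because the common case is a single batch exporter thread returning.
            this.pool[returnSnapshot % this.Capacity] = logRecord;
            return;
        }
    }
}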
Just updated the comment with more details.
> The race called out in the comment is for "many threads hammering rent/return", which is more of a theoretical exercise. The actual use (right now) is many threads renting/reading (log emitters) and one thread returning/writing (batch exporter). In that scenario, this race should really never be hit.
I think we could still have a race for multiple threads returning/writing. We could have an emitter thread returning because OnExport failed and the exporter thread returning simultaneously, but I guess it's okay to ignore that scenario as it might not be as common.
@@ -57,9 +57,14 @@ public override void OnEnd(LogRecord data)
     // happen here.
     Debug.Assert(data != null, "LogRecord was null.");

-    data!.BufferLogScopes();
+    data!.Buffer();
Let's add a TODO to add a targeted test for #2905.
I had some tests from #2932, so I just updated them and added them to this PR.
internal void Buffer()
{
    this.BufferLogStateValues();
Add a comment/TODO for "state" - do we need to do something similar for state as well?
Updated
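For context, a rough sketch of what the new Buffer method might cover, based on the snippets in this thread (only the BufferLogStateValues call is visible above; the scope call and ordering are assumptions):

internal void Buffer()
{
    // Copy the parsed state values (key/value pairs) into storage owned by the
    // LogRecord so they survive after the log call returns.
    this.BufferLogStateValues();

    // Likewise capture scope data before the scopes can be disposed.
    this.BufferLogScopes();
}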
}

var attributeStorage = this.AttributeStorage ??= new List<KeyValuePair<string, object?>>(stateValues.Count);
attributeStorage.AddRange(stateValues);
Add a comment that this is what triggers the iteration/enumeration to "snap" what we need before things can get disposed.
Updated
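For illustration, here is roughly what the requested comment could look like on that snippet (the wording is a suggestion, not necessarily what landed in the PR):

// Note: AddRange enumerates stateValues right here, copying each key/value pair
// into AttributeStorage. This "snaps" the data we need before the underlying
// state object can be recycled or disposed once the log call returns.
var attributeStorage = this.AttributeStorage ??= new List<KeyValuePair<string, object?>>(stateValues.Count);
attributeStorage.AddRange(stateValues);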
LGTM!
Left a few nits, mostly about adding comments or TODOs.
[Goal is to pare down the amount of changes on #3305]
Fixes #1582
Fixes #2905
Changes
- Adds two LogRecord pool implementations. One uses a ThreadStatic and the other implements a circular buffer. The type of pool to use is determined by the processor(s) registered with OpenTelemetryLoggerProvider (a rough sketch of one possible selection rule follows).
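To illustrate how the pool choice could key off the registered processors, here is a rough sketch (the helper type, method, and exact rule are assumptions based on the discussion above, not the PR's implementation):

using System.Collections.Generic;
using OpenTelemetry;
using OpenTelemetry.Logs;

internal static class LogRecordPoolSelectorSketch
{
    // If any processor holds records beyond OnEnd (e.g. a batch export processor),
    // records outlive the emitting thread, so the shared circular-buffer pool is
    // the safe choice; otherwise a ThreadStatic record can simply be reused.
    public static bool ShouldUseSharedPool(IEnumerable<BaseProcessor<LogRecord>> processors)
    {
        foreach (var processor in processors)
        {
            if (processor is BatchLogRecordExportProcessor)
            {
                return true;
            }

            // Note: a CompositeProcessor could also contain a batch processor; the
            // conversation above covers the case where one is added after the fact.
        }

        return false;
    }
}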
TODOs
- CHANGELOG.md updated for non-trivial changes