This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
Port to 3.1 - Fix a native memory leak in EventPipe #28038
Merged
Port of dotnet/runtime#35924.
A customer reported a seemingly unbounded memory usage growth over time when using AppInsights: microsoft/ApplicationInsights-dotnet#1678.
EventPipeBuffers were being allocated using malloc, and over time this may cause fragmentation within glibc's internal data structures, resulting in a memory leak. A usage pattern that worsens this leak is to start tracing, fill up all the buffers, stop tracing, and repeat that many times. This fix mitigates the issue by making EventPipeBuffer use ClrVirtualAlloc instead of malloc.
Customer Impact
On Linux platforms (specifically ones that use glibc), customers may see unexpected growth in the native heap size when using EventPipe for a long time (e.g. with the Application Insights Service Profiler). This makes our first-party and third-party tracing solutions potentially problematic in production scenarios when used for extended periods.
Regression?
No, EventPipe buffers were always allocated using malloc.
Testing
The fix was tested against the scenario the customer provided. Additional performance tests that use EventPipe showed minimal performance regression (<4% in the worst case).
Risk
I believe the risk is low: the fix is localized to a three-line change and the behavior is well understood. I have tested the fix locally for the past three weeks across various performance measurements, and it passed all the runtime tracing tests that I ran locally and all the CI tests when I merged dotnet/runtime#35924.