This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Port to 3.1 - Fix a native memory leak in EventPipe #28038

Merged 1 commit on May 13, 2020


sywhang commented May 7, 2020

Port of dotnet/runtime#35924.

A customer reported seemingly unbounded growth in memory usage over time when using AppInsights: microsoft/ApplicationInsights-dotnet#1678.

EventPipeBuffers were allocated using malloc. Over time this can cause fragmentation within glibc's internal heap data structures, resulting in a memory leak. The usage pattern that worsens the leak is to start tracing, fill up all the buffers, stop tracing, and repeat that many times. This fix mitigates the issue by making EventPipeBuffer use ClrVirtualAlloc instead of malloc.
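To illustrate the idea behind the change, here is a minimal sketch of a buffer whose memory comes straight from the OS rather than from malloc. It is not the runtime's code: `TraceBuffer`, `AllocateTraceBuffer`, `FreeTraceBuffer`, and `RunTraceSessions` are hypothetical names, and POSIX `mmap`/`munmap` are used as a stand-in for `ClrVirtualAlloc`, on the assumption that on Linux it ultimately maps pages from the OS in a similar way.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>
#include <sys/mman.h>

// Hypothetical stand-in for an EventPipe buffer; not the runtime's actual class.
struct TraceBuffer {
    unsigned char* data = nullptr;
    std::size_t size = 0;
};

// Map `size` bytes directly from the OS, bypassing the malloc arenas.
// Returns false if the mapping fails.
bool AllocateTraceBuffer(TraceBuffer& buf, std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;
    buf.data = static_cast<unsigned char*>(p);
    buf.size = size;
    return true;
}

// Unmap the pages so they go back to the OS instead of a heap free list.
void FreeTraceBuffer(TraceBuffer& buf) {
    if (buf.data != nullptr) {
        munmap(buf.data, buf.size);
        buf.data = nullptr;
        buf.size = 0;
    }
}

// Simulates the pattern from the description: start tracing, fill every
// buffer, stop tracing, repeat. Returns the number of sessions completed.
int RunTraceSessions(int sessions, int buffersPerSession, std::size_t bufferSize) {
    std::vector<TraceBuffer> buffers(static_cast<std::size_t>(buffersPerSession));
    int completed = 0;
    for (int s = 0; s < sessions; ++s) {
        bool ok = true;
        for (TraceBuffer& buf : buffers) {
            if (!AllocateTraceBuffer(buf, bufferSize)) { ok = false; break; }
            std::memset(buf.data, 0xAB, buf.size);  // fill the buffer
        }
        // Stop tracing: release everything that was mapped in this session.
        for (TraceBuffer& buf : buffers) FreeTraceBuffer(buf);
        if (!ok) break;
        ++completed;
    }
    return completed;
}
```

Because each buffer is its own mapping, releasing it does not leave holes inside a malloc arena, which is the fragmentation the description attributes to the old allocation path.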

Customer Impact

On Linux platforms (specifically those that use glibc), customers may see unexpected growth in the native heap size when EventPipe is used over a long period (e.g., with Application Insights Service Profiler). This makes our first-party and third-party tracing solutions potentially problematic in production scenarios that run for extended periods.

Regression?

No, EventPipe buffers were always allocated using malloc.

Testing

The fix was tested against the scenario the customer provided and . Additional performance tests that use EventPipe showed minimal performance regression (<4% in the worst case).

Risk

I believe the risk is low: the fix is localized to a three-line change and its behavior is well understood. I have tested the fix locally for the past three weeks with various performance measurements, and it passed all the runtime tracing tests I ran locally as well as all the CI tests when I merged dotnet/runtime#35924.

sywhang self-assigned this May 7, 2020
jeffschwMSFT added the Servicing-consider (Issue for next servicing release review) label May 7, 2020
jeffschwMSFT added this to the 3.1.x milestone May 7, 2020
leecow added the Servicing-approved (Approved for servicing release) label and removed the Servicing-consider (Issue for next servicing release review) label May 7, 2020
leecow modified the milestones: 3.1.x, 3.1.15, 3.1.5 May 7, 2020
Anipik merged commit b174565 into dotnet:release/3.1 May 13, 2020

ckurtin commented May 20, 2020

Greetings,

Any idea when the 3.1.5 release including this fix will be available? Our project, which should be production-ready by next week, is affected by this memory leak (.NET Core 3.1.4 + App Insights). Should we proceed by temporarily removing Application Insights?

Anipik commented May 20, 2020

3.1.5 should be available before the second week of June.

sywhang (Author) commented May 20, 2020

@ckurtin In the meantime, you can set the environment variable MALLOC_ARENA_MAX to something like 2 to avoid the leak.
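The workaround only requires that `MALLOC_ARENA_MAX` is present in the environment before the process starts, so in most deployments exporting it in the shell or service definition is enough. Where that is not convenient, a small launcher can do the same thing. The sketch below is one way to write it; `SetArenaCap` and `LaunchWithArenaCap` are hypothetical helper names, and the value `2` is simply the one suggested above.

```cpp
#include <cstdlib>
#include <unistd.h>

// Put the arena cap into this process's environment so that any program
// exec'd from here inherits it. Returns false if setenv fails.
bool SetArenaCap(const char* value) {
    return setenv("MALLOC_ARENA_MAX", value, /*overwrite=*/1) == 0;
}

// Hypothetical launcher: applies the cap, then replaces this process with
// the command in argv (argv[0] is the program, the array is null-terminated).
// Returns only if something failed.
int LaunchWithArenaCap(const char* value, char* const argv[]) {
    if (!SetArenaCap(value)) return 1;
    execvp(argv[0], argv);
    return 1;  // reached only when execvp could not start the command
}
```

A `main` that calls `LaunchWithArenaCap("2", argv + 1)` would let something like `./launcher dotnet MyApp.dll` (hypothetical names) start the application with the cap applied.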
