-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Instrumentation.Runtime] Reintroduce current heap size metric #683
[Instrumentation.Runtime] Reintroduce current heap size metric #683
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #683 +/- ##
==========================================
+ Coverage 77.56% 77.58% +0.02%
==========================================
Files 176 176
Lines 5308 5313 +5
==========================================
+ Hits 4117 4122 +5
Misses 1191 1191
|
"process.runtime.dotnet.gc.heap.live", | ||
() => GC.GetTotalMemory(false), | ||
unit: "bytes", | ||
description: "Count of bytes currently in use by live objects in the GC heap."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is the estimated bytes allocated, not necessarily "used by live objects" (e.g. I think it will contain fragmentation/gaps that are not connected to any live objects").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can borrow some wording from https://learn.microsoft.com/en-us/dotnet/api/system.gc.gettotalmemory?view=net-6.0#returns, e.g. "A number that is the best available approximation of the number of bytes currently allocated in managed memory."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to improve wording to indicate this is an approximation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current wording comes from here.
I guess this is the estimated bytes allocated, not necessarily "used by live objects" (e.g. I think it will contain fragmentation/gaps that are not connected to any live objects").
Unless I'm looking at the wrong place, looking at the implementation (e.g gc.cpp
, line 44841 for 6.0.0), fragmentation is specifically excluded.
I preferred current wording, because it is more specific.
If wording from docs is preferred, I will use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lachmatt! Let me consult the owner of GC and see if we can improve the description here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, I've been jumping between a number of different issues. I've been chatting more with Maoni and we've got some misgivings about using for using GC.GetTotalMemory() in runtime's own counter. My plan for the moment is to write up a proposal and open a runtime GitHub issue to discuss deprecating the gc-heap-size counter. As part of that we would change the tutorial to reference the gc-committed counter instead. If the guidance in that tutorial changed would there be any other motivation to add this counter or would it now be moot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the issue I just opened in the runtime repo discussing moving/removing/hiding the gc-heap-size counter: dotnet/runtime#77530
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @lachmatt mentioned the only downside to gc-committed
is that it requires .NET 6.0 so no .NET Framework support and we still see many users on .NET Framework. I'm not concerned about .NET Core 3.1 given its upcoming EOL.
In the same line with proposals 1 & 5 on dotnet/runtime#77530 we could document that the counter is available for legacy purposes and that gc-committed
is preferred to track actual memory used by GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking out loud: how much gc-committed
diverges from the native perf counter tracking MEM_COMMIT for the typical .NET Framework application?
If the overall consensus is that adding gc-heap-size
is not desirable. The docs could guide users that lack the gc-committed
to the proper native counter, and explain what is expected to be different with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the app, total committed memory and GC committed memory could be very different numbers. I've been continuing to noodle on this and the feedback from dotnet/runtime#77530. I think the main commonality from proposals (1) and (5) is that neither of them gets rid of the heap size metric, they just try to re-position it to reduce accidental misuse. If this is will be the only metric we have for .NET Framework then maybe rather than avoiding it we just be careful with naming it. From my perspective, the key distinction of GetTotalMemory() isn't that it updates between GCs, it is that it doesn't include fragmentation and other GC committed memory pools. Perhaps if we named it process.runtime.dotnet.gc.heap.total_objects_size that would make it clearer what this metric is really measuring.
@@ -48,6 +48,12 @@ static RuntimeMetrics() | |||
() => GetGarbageCollectionCounts(), | |||
description: "Number of garbage collections that have occurred since process start."); | |||
|
|||
MeterInstance.CreateObservableUpDownCounter( | |||
"process.runtime.dotnet.gc.heap.total_objects_size", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very inconsistent with other metrics names (e.g.
opentelemetry-dotnet-contrib/src/OpenTelemetry.Instrumentation.Runtime/RuntimeMetrics.cs
Line 53 in 005fdb7
"process.runtime.dotnet.gc.allocations.size", |
"process.runtime.dotnet.gc.heap.total_objects_size", | |
"process.runtime.dotnet.gc.live_objects.size", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed in 25e9cf8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'live' usually means that an object is reachable via references from a root, but that isn't necessarily true for these objects at the current instant in time. We know these objects haven't been collected yet, but they may already be unreachable. We can brainstorm different names but I would avoid "live". For example "process.runtime.dotnet.gc.objects.size"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'live' usually means that an object is reachable via references from a root, but that isn't necessarily true for these objects at the current instant in time. We know these objects haven't been collected yet, but they may already be unreachable. We can brainstorm different names but I would avoid "live". For example "process.runtime.dotnet.gc.objects.size"?
Sounds good to me, and the name "process.runtime.dotnet.gc.objects.size" is aligned with the existing ones.
@noahfalk would you help to keep an eye on the instrument description
? Current it says "live objects" which needs to be adjusted based on your input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed in 48e021b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@@ -34,6 +34,9 @@ | |||
|
|||
This does not affect applications targeting .NET Framework. | |||
|
|||
* Add "process.runtime.dotnet.gc.objects.size" metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/open-telemetry/opentelemetry-dotnet-contrib/tree/main/src/OpenTelemetry.Instrumentation.Runtime#processruntimedotnetgcheapsize
Could you update the readme file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be also add a note about how is it different from the gc.heap.size (the fact that this exclude fragmentation, and works anytime even before the 1st GC event itself etc.) A lot such useful info was in this PR as comments. Its good to get the gist into the readme file itself to not lose them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
The changes look good. Requested to update readme file as well. (It could be done as a follow PR as well)
@lachmatt could you also update PR desc to reflect the final content of the PR? |
Updated.
I will do that in a separate PR. |
Merging. A follow up is needed to update readme. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Fixes #682.
Changes
Add
process.runtime.dotnet.gc.objects.size
instrument that reports count of bytes currently in use by objects in the GC heap that haven't been collected yet (excludes fragmentation and other GC committed memory pools), and uses API available on all targets.