Discussion: Reducing confusion around the System.Runtime gc-heap-size EventCounter #77530
Comments
In our set of System.Runtime counters we have a bunch of GC related counters and one of them is 'gc-heap-size'. I want to propose deprecating it or making it less initially visible and instead encourage developers to use the 'gc-committed' counter to measure the size of the GC heap. I'm opening this issue to solicit feedback and discuss if and how we should do this.
Why deprecate/move/hide it?
My main goal is to reduce confusion and simplify the diagnostics story around the GC:
Potential for confusion aside, there is nothing inherently wrong with the gc-heap-size counter for developers who can interpret its meaning correctly. While we could certainly improve the docs as a mitigation, the scenario will be more robust if it doesn't require all developers to read docs to avoid the sharp edges.
What would we change?
I've thought of a few options, and I'd be glad to hear other suggestions.
What is the impact of moving/removing a counter?
As far as I am aware we have never done this in the BCL since EventCounters were introduced, so we don't have a clear precedent to learn from. This is roughly what I would expect:
Your feedback?
Currently the options I am leaning towards are 5 followed by 1, but I could really imagine this going in many different directions and feedback is much appreciated!
@davidfowl @jkotas @Maoni0 @davmason @sebastienros @reyang @cijothomas @samsp-msft @jander-msft - You may have an opinion on this, or know others who do.
Would this also reduce the overhead of monitoring the counters if you just want a simple default set?
We can duplicate the simple counters in both sets. I like option 1. I would do all counter adjustments in one bigger breaking change instead of a numbing trickle of smaller breaking changes over several releases. What would be the GC counters to keep in the simple set and GC counters to move or add to the advanced set?
My vote is for 5. @noahfalk would you consider keeping existing things unchanged, introducing new categories where things are better sorted out, and then encouraging folks (via documentation and examples) to move to the newly introduced categories? I feel this is the same as when we compare Win32 perf counters vs. event counters: it seems scary if certain Win32 perf counters can be renamed or disappear. Regarding the tooling, I don't know if users would treat dotnet-counters as an ad-hoc troubleshooting tool (e.g. like an exception message, where in general a developer shouldn't expect the message to follow a specific format and parse it), or if they would treat the output as an established contract (e.g. windbg, where the output has been used by multiple extensions). It seems dotnet-counters can be used for both scenarios, judging by the command line arguments it offers.
Yep, although I expect it is a really small difference. Most of these counters probably generate a value in under 100ns, and a typical app might poll them on a timer once every 10-60 seconds. So this is 0.00001% kind of perf savings.
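The estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the 100 ns cost and the 10-60 second polling interval come from the comment, while the count of 20 counters and the function name are assumptions made for illustration.

```python
def polling_overhead_fraction(num_counters, cost_per_counter_ns, poll_interval_s):
    """Fraction of one core's time spent computing counter values."""
    busy_ns = num_counters * cost_per_counter_ns
    interval_ns = poll_interval_s * 1_000_000_000
    return busy_ns / interval_ns


# Assumed inputs: 20 counters (hypothetical), 100 ns each (from the comment).
fastest_polling = polling_overhead_fraction(20, 100, 10)  # polled every 10 s
slowest_polling = polling_overhead_fraction(20, 100, 60)  # polled every 60 s

# Multiply by 100 to express the fractions as percentages.
print(f"{slowest_polling * 100:.7f}% to {fastest_polling * 100:.7f}%")
```

Under these assumptions both ends of the range stay within a small factor of the 0.00001% figure, so trimming the default set is unlikely to matter for overhead.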
True :)
Probably a very subjective question that could be a topic of discussion all on its own. If we were going to do something like that I'd be tempted to be a little bold and move most of our current GC related stats. My starting proposal would be that we keep the minimal set of info a dev needs to look at to determine if the GC is impacting their app negatively. IMO that is only three metrics:
Everything else like generation sizes, special heap sizes, collection counts, budgets, fragmentation, allocation rates, etc. would go in the dedicated GC section and you look at it when you have need to go deeper.
Certainly it seems technically possible to do a two phase plan where first we add/duplicate things where we want it + update docs, then later we do a second phase where we remove/de-dupe things we don't want. Doing it in one phase vs. two phases probably hinges on how disruptive we believe the change is and I'm hoping this discussion is one avenue that we get some feedback on it.
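As a rough sketch of that two phase plan, the sets below model which counters each group exposes. The counter names are existing System.Runtime counters, but the set names, the function names, and the choice of which counters move are hypothetical and are not the list proposed in this thread.

```python
# Hypothetical starting point: everything lives in the default set.
default_set = {"gc-heap-size", "gc-committed", "time-in-gc", "gen-0-size"}
advanced_set = set()

# Illustrative choice only: counters we pretend should move to the advanced set.
to_move = {"gc-heap-size", "gen-0-size"}


def phase_one(default, advanced, moving):
    """Add/duplicate: moved counters appear in both sets, so nothing breaks yet."""
    return set(default), advanced | (moving & default)


def phase_two(default, advanced, moving):
    """Remove/de-dupe: moved counters disappear from the default set."""
    return default - moving, set(advanced)


default_p1, advanced_p1 = phase_one(default_set, advanced_set, to_move)
default_p2, advanced_p2 = phase_two(default_p1, advanced_p1, to_move)

print(sorted(default_p1), sorted(advanced_p1))
print(sorted(default_p2), sorted(advanced_p2))
```

After phase one, existing consumers of the default set see no change; only phase two is breaking, which is why its timing depends on how disruptive the change turns out to be.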
Yeah I believe that is accurate. "dotnet-counters monitor" is designed for ad-hoc live monitoring by a human and "dotnet-counters collect" is designed for offline review. You could run it as a simple form of free production monitoring that pre-dates more comprehensive options like dotnet-monitor or OpenTelemetry. Thanks! And more feedback definitely welcome.
I like aspects of 1 & 5. Really like the idea of "recommended" counters. I'm imagining docs like...
...which seems much clearer. Doesn't require obsoleting anything, but does require reading docs 😄 I could also imagine groups of counters like System.Runtime.GC.Recommended \ System.Runtime.GC.Full.
Closing this issue. Feel free to open new issues for documentation gaps or other specific suggestions.