Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: APIs to expose existing runtime EventCounter data #65989

Closed
noahfalk opened this issue Mar 1, 2022 · 12 comments
Closed

[API Proposal]: APIs to expose existing runtime EventCounter data #65989

noahfalk opened this issue Mar 1, 2022 · 12 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Diagnostics.Metric
Milestone

Comments

@noahfalk
Copy link
Member

noahfalk commented Mar 1, 2022

Background and motivation

EventCounters were our original xplat metrics API and while they aren't going away, we recently added System.Diagnostics.Meter API in collaboration with OpenTelemetry and we expect it will get more use going forward. There are some counters already exposed by the RuntimeEventSource EventCounters and OpenTelemetry would like to wrap the equivalent counter data using the new API. Most of this data is available via public APIs, but a few of the data points use private APIs. We'd like to make those APIs public (or expose some other public API that has the same data):

  • GC.GetLastGCPercentTimeInGC()
  • System.Reflection.Assembly.GetAssemblyCount()
  • Exception.GetExceptionCount()

See this issue in the OpenTelemetry repro for more context on the work.

@tarekgh

Updated Proposal

The original version of the API proposal is included below for reference but the current plan is:

  1. GetAssemblyCount() and GetExceptionCount() aren't needed because the same data can be reproduced like this. These APIs aren't frequently needed so even though these expressions might be a little less obvious it should be a non-issue for the handful of devs who will need to write the code.
AppDomain.FirstChanceException += (_ => s_exceptionCount++);`
AppDomain.GetAssemblies().Length
  1. For GetLastGCPercentTimeInGC() we are instead exposing [API Proposal]: Exposing the total paused duration in GC. #66036 which should give a more useful and general purpose statistic.

This issue is staying open only to track waiting on the GC API.

Original Proposal (we are not planning to do this)

namespace System
{
    public class GC
    {
        public static double GetLastGCPercentTimeInGC();
    }
}

namespace System.Reflection
{
    public class Assembly
    {
        public static int GetAssemblyCount();
    }
}

namespace System
{
    public class Exception
    {
        public static int GetExceptionCount();
    }
}

API Usage

static Meter s_myMetrics = new Meter("MyCoolRuntimeMetrics");
static ObservableGauge s_timeInGc = s_myMetrics.CreateObservableGauge<double>("PercentTimeInGC", GC.GetLastGCPercentTimeInGC);
static ObservableGauge s_assemblyCount = s_myMetrics.CreateObservableGauge<double>("AssemblyCount", Assembly.GetAssemblyCount);
static ObservableGauge s_exceptionCount = s_myMetrics.CreateObservableGauge<double>("ExceptionCount", Exception.GetExceptionCount);

Alternative Designs

No response

Risks

No response

@noahfalk noahfalk added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 1, 2022
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 1, 2022
@tarekgh
Copy link
Member

tarekgh commented Mar 1, 2022

CC @Maoni0 @steveharter @stephentoub @jkotas for any API feedback.

@tarekgh tarekgh added area-System.Diagnostics.Metric and removed untriaged New issue has not been triaged by the area owner labels Mar 1, 2022
@tarekgh tarekgh added this to the 7.0.0 milestone Mar 1, 2022
@stephentoub
Copy link
Member

GetLastGCPercentTimeInGC

Should this be a value on GCMemoryInfo instead? I notice we've already exposed some time/duration/percentage information there.

@tarekgh
Copy link
Member

tarekgh commented Mar 1, 2022

@stephentoub I think @noahfalk proposed it in GC because this is already existed today there, but it is internal method. It makes sense to have it in GCMemoryInfo though.

@jkotas
Copy link
Member

jkotas commented Mar 1, 2022

Should GetExceptionCount return long? If you have misbehaving process that throws exceptions a lot, int can overflow in less than a day. Similar JitInfo methods return long for similar reason.

@tarekgh
Copy link
Member

tarekgh commented Mar 1, 2022

Should GetExceptionCount return long? If you have misbehaving process that throws exceptions a lot, int can overflow in less than a day. Similar JitInfo methods return long for similar reason.

The internal implementation returning uint. It looks like returning long would make sense.

https://source.dot.net/#System.Private.CoreLib/src/System/Exception.CoreCLR.cs,163

@jkotas
Copy link
Member

jkotas commented Mar 1, 2022

The events that correspond to the exception and assembly counters are on AppDomain:

  • Exception.GetExceptionCount() <-> AppDomain.FirstChanceException event
  • Assembly.GetAssemblyCount() <-> AppDomain.AssemblyLoad event, AppDomain.GetLoadedAssemblies property

The user code can actually implement these counters on its own using these events today.
Should we aim to some sort of consistency here? E.g. have the counters exposed next to the event that they are counting?

@Maoni0
Copy link
Member

Maoni0 commented Mar 1, 2022

chatted with @noahfalk about this. my recommendation would be that we use a much better way to show this data. the historical % time in GC counter value has always been questionable - as in, because it shows whatever the percentage that the last GC observed, it could give very false information.

in .NET 7 we are already planning to add an API to show the accumulated GC pause (either an as overload of the GetGCMemoryInfo API or as a standalone API to make the cost smaller - that part is still being discussed. @cshung is working on this). it'd be much better to show this instead. you can calculate a much more accurate % time in GC this way (since you know the interval; this would work the same way as the other event counters that show an ever increasing value like # of gen0 count).

@noahfalk
Copy link
Member Author

noahfalk commented Mar 1, 2022

I was filing this quickly to capture the request rather than doing much careful consideration of where the best API was. Don't put weight on the exact shape/location/naming I had there.

Should this be a value on GCMemoryInfo instead?

Yeah, that is reasonable (@Maoni0 gave more detail)

have the counters exposed next to the event that they are counting?

That makes sense to me.

It looks like returning long would make sense

👍

@cshung
Copy link
Member

cshung commented Mar 3, 2022

Here is the proposal @Maoni0 was referring to above.

@tarekgh
Copy link
Member

tarekgh commented May 20, 2022

@noahfalk the GC API is already in now.

@noahfalk
Copy link
Member Author

Closing now that the GC API is in, thanks @tarekgh!

@ghost ghost locked as resolved and limited conversation to collaborators Jun 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Diagnostics.Metric
Projects
None yet
Development

No branches or pull requests

6 participants