Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update default values for explicit histogram bucket boundaries #4797

Closed
noahfalk opened this issue Aug 22, 2023 · 7 comments · Fixed by #4820
Closed

Update default values for explicit histogram bucket boundaries #4797

noahfalk opened this issue Aug 22, 2023 · 7 comments · Fixed by #4820
Assignees
Labels
enhancement New feature or request metrics Metrics signal related
Milestone

Comments

@noahfalk
Copy link

Feature Request

Currently the OpenTelemetry SDK sets default histogram boundaries to { 0, 5, 10, 25, 50, 75,
100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}. These values make sense for time durations of network
requests measured in ms, however OTel semantic conventions encourage using seconds. This means that most histograms will be timing things that take less than 5 seconds and the default buckets are not at all useful. I am in particular concerned with histograms that were added in .NET 8 to ASP.NET Core and System.Net because I expect it will be extremely common for users to want to monitor those histograms. The current workaround is to specify a custom view with alternate buckets for every histogram. This approach functions, but adds complexity to what should be a very simple and common use case. You can see how the getting started guide for ASP.NET Core metrics has to include a custom view right now: https://learn.microsoft.com/en-us/aspnet/core/log-mon/metrics/metrics?view=aspnetcore-8.0#create-the-starter-app

There are several tasks that could help improve the situation on different timescales:

  • Hardcode the names of well-known instruments and provide more appropriate defaults for those
  • Change the global histogram defaults for OpenTelemetry.NET
  • Change the global histogram defaults in the specification for all OpenTelemetry clients
  • Advance the metrics hint APIs towards stability and work with the .NET libraries team in .NET 9 to implement those APIs so that instrumentation authors can provide their own defaults.

The .NET team is glad to work with you on that last bullet in the .NET 9 timeframe, but we are hoping one of the other options can be adopted as well to offer a solution more quickly.

cc @JamesNK

@noahfalk noahfalk added the enhancement New feature or request label Aug 22, 2023
@cijothomas
Copy link
Member

I am in particular concerned with histograms that were added in .NET 8 to ASP.NET Core and System.Net because I expect it will be extremely common for users to want to monitor those histograms

Do they use ms or secs?
There is a similar issue where the Otel instrumentation from this repo is being modified to emit "s" instead of "ms" : https://github.com/open-telemetry/opentelemetry-dotnet/pull/4766/files#r1291903760

@JamesNK
Copy link
Contributor

JamesNK commented Aug 22, 2023

All duration counters in .NET are seconds. That is what OTel semantic conventions recommends. But the histogram defaults are 1000 times too large because they're designed for milliseconds.

See open-telemetry/opentelemetry-specification#3509

@samlam
Copy link
Contributor

samlam commented Aug 22, 2023

Hey guys, I can probably take this on.

@cijothomas
Copy link
Member

Hey guys, I can probably take this on.

Its not clear what do you propose to do? Can you write down the plan, before moving to actual code, to make sure everyone agrees with the proposal.

Changing the default bounds for Histogram is not something we can just do in the SDK, as it'll be breaking change. (The spec hasn't changed its defaults as well, due to same reason that they are part of Stable spec).

@CodeBlanch CodeBlanch added the metrics Metrics signal related label Aug 22, 2023
@CodeBlanch
Copy link
Member

@cijothomas I'm going to buddy with @samlam on this one. Let me discuss with him a bit and we'll add a proposal or bring it to the SIG.

@cijothomas cijothomas added this to the 1.7.0 milestone Aug 22, 2023
@cijothomas
Copy link
Member

Tagging for 1.7.0, to align with .NET 8 release.

@JamesNK
Copy link
Contributor

JamesNK commented Aug 24, 2023

I like the idea to automatically change buckets based on the unit.

If that isn't possible, then I think the opentelemetry-dotnet should hardcode buckets for built-in histograms for .NET 8. It will be a temporary measure until .NET 9 when the hint API will be available and these histograms will be annotated.

If we go down the hardcoding route, .NET 8's histogram counter names are below, grouped by meter name.


Microsoft.AspNetCore.RateLimiting:

  • aspnetcore.rate_limiting.request_lease.duration
  • aspnetcore.rate_limiting.request.time_in_queue

System.Net.Http:

  • http.client.connection.duration
  • http.client.request.duration
  • http.client.request.time_in_queue

Microsoft.AspNetCore.Hosting:

  • http.server.request.duration

Microsoft.AspNetCore.Server.Kestrel:

  • kestrel.connection.duration
  • kestrel.tls_handshake.duration

Microsoft.AspNetCore.Http.Connections:

  • signalr.server.connection.duration

System.Net.NameResolution:

  • dns.lookups.duration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request metrics Metrics signal related
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants