Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider http.status_code_class attribute #1056

Open
alanwest opened this issue Nov 14, 2022 · 10 comments
Open

Consider http.status_code_class attribute #1056

alanwest opened this issue Nov 14, 2022 · 10 comments
Assignees

Comments

@alanwest
Copy link
Member

alanwest commented Nov 14, 2022

Inspired by open-telemetry/opentelemetry-specification#2943 (comment).

because metrics are sensitive to cardinality, I've seen instrumentations using strings like 4xx, 5xx for status code.

Proposal is to add a new attribute for grouping status codes by class, i.e., 1xx, 2xx, 3xx, etc. See: https://datatracker.ietf.org/doc/html/rfc9110#section-15.

Open questions:

  • Do we want a new attribute?
  • Should the value of the attribute be 1xx, 2xx, etc OR informational, successful?
  • For metrics, is this attribute conditionally required if status is present?
  • For metrics, would http.status_code become optional?
  • Should traces also have this attribute?
@codeboten
Copy link

Should the value of the attribute be 1xx, 2xx, etc OR informational, successful?

I would prefer grouping starting with the first digit of the status code class.

The httpcheck receiver in the collector currently reports http.status_class as an attribute on some metrics it produces https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/httpcheckreceiver#http-check-receiver

@joaopgrassi
Copy link
Member

joaopgrassi commented Nov 22, 2022

For metrics, is this attribute conditionally required if status is present?

I guess that would make more sense than making status_code optional, which I'm not sure would even be possible since it is conditionally required today. I'd also think the value being the first digit is a more "natural" solution.

@carlosalberto
Copy link
Contributor

Hey @alanwest - I suggest you bring it to the Semantic Conventions SIG (every Monday after the maintainers meeting).

@trask
Copy link
Member

trask commented Feb 22, 2023

@alanwest @codeboten @joaopgrassi please let us know if you think this should be considered as a blocker for HTTP semantic convention stability? my current assumption is that http.status_code is low-cardinality enough for most usage, and http.status_code_class could be added later as an opt-in attribute (and users of that attribute could opt-out of of http.status_code)

@alanwest
Copy link
Member Author

alanwest commented Feb 24, 2023

please let us know if you think this should be considered as a blocker for HTTP semantic convention stability?

@trask I am not inclined to consider this a blocker for stability. I opened this issue based on open-telemetry/opentelemetry-specification#2943 (comment).

@yurishkuro do you feel this should be a blocker?

edit: just to expand a bit on why I do not think this is a blocker.. I personally think that a good default behavior is to include the full status code for metrics. For users who do experience cardinality issues due to status code, instrumentation might be able to be configured to turn off http.status_code and enable http.status_code_class. In this way, I believe http.status_code_class could be an additive, non-breaking change in the future.

@Oberon00
Copy link
Member

What is our general stance one such "trivially derivable" attributes? The http.status_code_class attribute will always be redundant with the (required) http.status_code. It seems conceptually useless to report this from the agent/client side already, but it might be useful in processing further down the line (maybe as soon as in some aggregating exporter in-process).

I feel like this needs some more general discussion

@trask
Copy link
Member

trask commented Feb 27, 2023

I personally think that a good default behavior is to include the full status code for metrics. For users who do experience cardinality issues due to status code, instrumentation might be able to be configured to turn off http.status_code and enable http.status_code_class. In this way, I believe http.status_code_class could be an additive, non-breaking change in the future.

my thought also 👍

@Oberon00 do you think this needs more discussion before removing it as a blocker for HTTP semconv stability?

@Oberon00
Copy link
Member

Oberon00 commented Mar 1, 2023

Offering status_code_class as an alternative to status_code implies making status_code non-required (relaxing the condition). Wouldn't that be a breaking change? Just one example: backends may rely on presence of status_code to detect network errors, as discussed in open-telemetry/opentelemetry-specification#3253

@trask
Copy link
Member

trask commented Mar 2, 2023

Wouldn't that be a breaking change?

ya, makes sense, as soon as you opt-out of http.status_code you would no longer be producing HTTP semconv compliant telemetry, which means we wouldn't be able to define this kind of optional behavior in the HTTP semconv spec.

I added a comment open-telemetry/opentelemetry-specification#3225 (comment)

I see a couple of options:

  • make http.status_code Recommended instead of Conditionally Required
  • users could write a metric view to normalize http.status_code values, e.g. map values [400-499] to 400 and maintain semconv compliance (at the cost of lying a bit)
  • users could write a metric view to replace http.status_code with my.status_code_class and lose benefits of HTTP semconv compliance

@trask
Copy link
Member

trask commented Mar 3, 2023

We discussed this in the HTTP semconv meeting today.

I have sent open-telemetry/opentelemetry-specification#3289 to clarify that

Attribute requirement levels apply to instrumentation. Because users can transform their telemetry in a number of ways
(e.g. metric views, span processors, and collector transformations), these requirement levels cannot be relied on by
telemetry consumers.

I think this gives the flexibility to make something like http.status_code_class Opt-In (and http.status_code Opt-Out) in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Can be addressed after stability
Development

No branches or pull requests

7 participants