Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguity interpreting keys and values from environment variables #501

Closed
andrewhsu opened this issue Mar 3, 2020 · 4 comments
Closed
Labels
bug Something isn't working priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory spec:resource Related to the specification/resource directory

Comments

@andrewhsu
Copy link
Member

From the specification SIG mtg today, a concern was raised interpreting keys and values:

What is allowed in the value of a resource? Is “a=\”b=c\”” valid?
The Java code, copied from OpenCensus, documents that the above example is valid, while the actual code would throw it away https://github.com/open-telemetry/opentelemetry-java/blob/24e470522d744c1249cbf818deab4a26d580b3e4/sdk/src/main/java/io/opentelemetry/sdk/resources/EnvVarResource.java#L62
jmacd proposes the W3C CorrelationContext encoding (urlencoding) but that’s still missing from the spec

Related issue in the golang SIG: open-telemetry/opentelemetry-go#332

Proposed solution from @jmacd : https://w3c.github.io/correlation-context/#value-format

@jmacd
Copy link
Contributor

jmacd commented Mar 4, 2020

Since I'd like to make progress on this, I'm gathering some links.

Here are Datadog's guidelines on naming metrics and metric tags: https://docs.datadoghq.com/developers/guide/what-best-practices-are-recommended-for-naming-metrics-and-tags

Metric names (which I'd argue are similar in function to Span names):

Metric names must start with a letter.
Can only contain ASCII alphanumerics, underscores, and periods. Other characters are converted to underscores.
Should not exceed 200 characters (though less than 100 is generally preferred from a UI perspective)
Unicode is not supported.
It is recommended to avoid spaces.

Metric labels:

Tags must start with a letter.
May contain alphanumerics, underscores, minuses, colons, periods, and slashes. Other characters are converted to underscores.
Any trailing underscore will get removed, whether if it originated from a converted character or if it was in the original tag value.
Tags can be up to 200 characters long and support Unicode.
Tags are converted to lowercase.
For optimal functionality, it is recommended to use the key:value syntax.

And this is problematic. From Datadog's perspective, a metric label consists of both its key and value, so values are constrained not to contain certain characters. I am not advocating that we follow this path.

Here are Prometheus' guidelines on naming metrics and tags: https://prometheus.io/docs/practices/naming/

Metric names:

The metric name specifies the general feature of a system that is measured (e.g. http_requests_total - the total number of HTTP requests received). It may contain ASCII letters and digits, as well as underscores and colons. It must match the regex [a-zA-Z_:][a-zA-Z0-9_:]*.

Metric labels:

Label names may contain ASCII letters, numbers, as well as underscores. They must match the regex [a-zA-Z_][a-zA-Z0-9_]*. Label names beginning with __ are reserved for internal use.

Label values may contain any Unicode characters.

A label with an empty label value is considered equivalent to a label that does not exist.

@jmacd
Copy link
Contributor

jmacd commented Mar 4, 2020

Discussion points in today's meeting:

We don't want to impose an expensive sanitization step for metric names and label keys on all exporters.
We don't want to crash when the user produces metric names or label keys that are invalid.
We don't want to restrict the character set of label values.
We will recommend that exporters with character set restrictions on metric names and labels address invalid characters on their own.
We will not endorse protocols that restrict the character set of label values. (This point emphatically disqualifies the current Dogstatsd protocol.)

@jmacd
Copy link
Contributor

jmacd commented Mar 4, 2020

Also related #345, #431, #459.

@jmacd jmacd added spec:metrics Related to the specification/metrics directory spec:resource Related to the specification/resource directory labels Mar 4, 2020
@jmacd jmacd changed the title ambiguity interpreting keys and values Ambiguity interpreting keys and values from environment variables Mar 4, 2020
@reyang reyang added bug Something isn't working release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Jul 10, 2020
@bogdandrutu bogdandrutu added the priority:p2 Medium priority level label Jul 24, 2020
@andrewhsu
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory spec:resource Related to the specification/resource directory
Projects
None yet
Development

No branches or pull requests

4 participants