Restore Support for Summary Metric in OTLP as a compatibility feature for Prometheus #1146
Labels: area:data-model, priority:p1, release:required-for-ga, spec:metrics
What are you trying to achieve?
Why do we need the summary metric type?
Support for the summary metric type in OTLP is essential for fully supporting Prometheus’ metric types. One of the project's key mandates is to maintain full interoperability with popular open protocols, including Prometheus. Without support for the Summary metric, OpenTelemetry's compatibility with Prometheus is currently broken. This is a blocker for the Prometheus receiver, the Prometheus exporter, and the Prometheus remote write exporter.
What did you expect to see?
We believe that the summary metric should be implemented as a compatibility feature even if it is not treated as a first-class citizen in OTLP.
Although it has been suggested that the summary metric type could be derived from the histogram metric type, the current histogram implementation is still uncertain and under discussion. We therefore believe it is important to start supporting the summary metric on its own, as it is an essential component for the release.
We would like to implement the summary data point as before, that is, to restore the previous implementation of the summary metric in metrics.proto. See open-telemetry/opentelemetry-proto#199.
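As a rough illustration of what a summary data point needs to carry, here is a minimal Python sketch. The class and field names (`SummaryPoint`, `total`, `quantile_values`) and the metric name are hypothetical and are not the definitions in metrics.proto. The only thing taken as given is the Prometheus text exposition shape for a summary: one series per quantile, plus `_sum` and `_count` series.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SummaryPoint:
    # Hypothetical names for illustration only; not the fields in metrics.proto.
    name: str
    count: int    # number of observations
    total: float  # sum of all observed values
    # (quantile, value) pairs, e.g. (0.5, 0.12) for the median
    quantile_values: List[Tuple[float, float]] = field(default_factory=list)


def to_prometheus_lines(point: SummaryPoint) -> List[str]:
    """Render the point as Prometheus text exposition sample lines."""
    lines = [
        f'{point.name}{{quantile="{q}"}} {v}' for q, v in point.quantile_values
    ]
    lines.append(f"{point.name}_sum {point.total}")
    lines.append(f"{point.name}_count {point.count}")
    return lines


point = SummaryPoint("rpc_duration_seconds", 10, 2.47, [(0.5, 0.12), (0.9, 0.41)])
exposition = to_prometheus_lines(point)
```

A histogram data point cannot reproduce the quantile series exactly, because it carries bucket counts rather than precomputed quantile values.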
Additional context
What’s a summary metric type?
The Summary metric is a metric type used for calculating configurable quantiles over a sliding time window. An x-quantile is the observation value that ranks at number x*N among N observations. For example, the 0.5-quantile is the 50th percentile: the value at or below which 50% of observations fall.
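The rank-based definition above can be sketched in a few lines of Python. This is an exact computation over a fixed list, under the assumption that the rank x*N is rounded up and clamped to the range 1..N. It is not the streaming, sliding-window estimation that Prometheus client libraries use, and the function name and sample data are made up for illustration.

```python
import math


def quantile(observations, x):
    """Return the x-quantile (0 <= x <= 1) of observations by rank."""
    if not observations:
        raise ValueError("no observations")
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be in [0, 1]")
    ordered = sorted(observations)
    n = len(ordered)
    # Rank x*N, rounded up and clamped to [1, N]; ranks are 1-based.
    rank = min(max(math.ceil(x * n), 1), n)
    return ordered[rank - 1]


# Hypothetical request latencies in seconds.
latencies = [0.12, 0.05, 0.31, 0.08, 0.22, 0.95, 0.17, 0.10, 0.41, 0.06]
median = quantile(latencies, 0.5)  # 5th of 10 sorted values
p90 = quantile(latencies, 0.9)     # 9th of 10 sorted values
```

Here half of the observations are at or below `median`, and nine of the ten are at or below `p90`.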
Current Behavior with Prometheus Summary Metrics
Our current behavior is to drop any incoming summary metrics in the PrometheusReceiver (in the OpenTelemetry Collector). The PrometheusReceiver turns every summary metric into a nil metric, so its metric information is lost. This can be seen in this link.
Developers should not need to change their Prometheus workflows when migrating from Prometheus to OpenTelemetry. We should not drop any of their existing metrics, as they may already have alerting and pipelines built on these metrics.
Current Usage of Quantiles in other Protocols
Although our primary use case is to fully support Prometheus metric types, other backends, such as StatsD and Amazon CloudWatch, also use summary metric types.
Links to related issues and PRs
cc: @bogdandrutu @jmacd @rakyll @tigrannajaryan @alvinlin123 @amanbrar1999 @JasonXZLiu