Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore Support for Summary Metric in OTLP as a compatibility feature for Prometheus #1146

Closed
alolita opened this issue Oct 26, 2020 · 2 comments
Assignees
Labels
area:data-model For issues related to data model priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory

Comments

@alolita
Copy link
Member

alolita commented Oct 26, 2020

What are you trying to achieve?

Why do we need the summary metric type?
Support of the summary metric type in OTLP is essential for fully supporting Prometheus’ metric types. One of the key mandates that the project has is to maintain full interoperability with popular open protocols (including Prometheus). However, currently OpenTelemetry's compatibility with Prometheus is broken without this support for the Summary metric. This is a blocker for the Prometheus Receiver, Prometheus exporter, and the Prometheus remote write exporter.

What did you expect to see?

We believe that the summary metric should be implemented as a compatibility feature even if it is not treated as a first-class citizen in OTLP.

Although suggestions have been made of using the histogram metric type to derive the summary metric type, we believe that due to uncertainties and ongoing discussions with the current histogram implementation it is important to start working on supporting the summary metric on its own as it is an essential component for release.

We would like implement the summary data point as before which means we re-implement the previous implementation of the summary metric in metrics.proto. See open-telemetry/opentelemetry-proto#199

Additional context

What’s a summary metric type?
The Summary metric is a metric type used for calculating configurable quantiles over a sliding time window. An x-quantile is an observation value that ranks at number x*N among N observations, for example the 0.5-quantile represents the 50th percentile, hence the value at the 0.5-quantile represents the threshold within which 50% of observations have been made.

Current Behavior with Prometheus Summary Metrics
Our current behavior is to drop any incoming summary metrics from the PrometheusReceiver (in the OpenTelemetry Collector). The PrometheusReceiver turns all summary metrics into a Nil metric, dropping our metric information. This can be seen in this link.

Developers should not need to change their Prometheus workflows when migrating from Prometheus to OpenTelemetry. We should not be dropping any of their existing metrics as they may already have alerting and pipelines based off of these metrics.

Current Usage of Quantiles in other Protocols
Although our primary use case is to fully support Prometheus metric types, there are other backends such as StatsD and Amazon CloudWatch also utilizing summary metrics types.

Links to related issues and PRs

cc: @bogdandrutu @jmacd @rakyll @tigrannajaryan @alvinlin123 @amanbrar1999 @JasonXZLiu

@alolita alolita added the spec:metrics Related to the specification/metrics directory label Oct 26, 2020
@andrewhsu andrewhsu added priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA area:miscellaneous For issues that don't match any other area label labels Oct 27, 2020
@gcacace
Copy link

gcacace commented Oct 28, 2020

Proposal in open-telemetry/opentelemetry-proto#227

tigrannajaryan pushed a commit to open-telemetry/opentelemetry-collector that referenced this issue Nov 6, 2020
_Note:_ This PR is based off the Summary datapoint proto (that has not been merged upstream) in this [PR](https://github.com/open-telemetry/opentelemetry-proto/pull/227/files).

**Description:** 

This PR has two main parts: 

* Implement the [Summary datapoint proto](https://github.com/open-telemetry/opentelemetry-proto/pull/227/files) in the OTel Collector. Changes were made to _metrics_struct.go_ to create the generated Summary structs (using the `make genproto` and `make genpdata` commands).
* Add in the conversion between OpenCensus summary metrics and OTel Summary datapoints. This conversion directly adds compatibility for the Prometheus Summary metric in the PrometheusReceiver. 

**Link to tracking Issue:** [opentelemetry-specification/issues/1146](open-telemetry/opentelemetry-specification#1146) and [opentelemetry-collector/issues/1638](#1638)

A lot of the code in this PR is auto-generated from creating the structs and protos. The following files are were auto-generated:
* generated_metrics.go
* generated_metrics_test.go
* trace_config.pb.go 
* metrics.pg.go

**Testing:**

* Updated unit tests in _oc_to_metrics.go_ for conversion between OC summary metrics to OTel summary metrics
* Created end-to-end tests for the Summary metric in PrometheusReceiver
@jmacd jmacd added area:data-model For issues related to data model and removed area:miscellaneous For issues that don't match any other area label labels Feb 4, 2021
@bogdandrutu
Copy link
Member

This is done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:data-model For issues related to data model priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

No branches or pull requests

5 participants