-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding a feature to prometheusexporter to export exemplars along with… #9945
Adding a feature to prometheusexporter to export exemplars along with… #9945
Conversation
|
d16e3b9
to
c05fa2d
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
c05fa2d
to
3fc352c
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
5984a4f
to
cba2de1
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
20a9e8c
to
1b93643
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the changes! I think you will need a changelog entry for this to pass. I also left a small (non-blocking) comment to know your opinion
This comment was marked as resolved.
This comment was marked as resolved.
9f46af4
to
865c123
Compare
Thanks for your patience @arun-shopify, sorry it took so long to merge this. |
Hi, I was waiting to test the exemplar support from the prometheus exporter. So, I did a local build with this PR. When I restarted my collector, I am seeing the following error in the collector logs: 2022/06/28 16:48:50 http: panic serving 10.42.15.30:39238: no bucket was found for given exemplar value If I curl the metrics endpoint, it returns with empty data. My pipeline goes like this: Java Agent (otel exporter trace/metrics) -> otel collector (prometheus exporter) -> prometheus The metrics was working fine previously, however it was not getting exemplar obviously. I tried enable and disabling the "enable_open_metrics" flag, but it does not seem to matter and I get the same error. |
@alvinhom, I tested it out using the latest
|
@alvinhom if the issue is not (necessarily) related to this PR, I would suggest you open an issue for this, so that more people can easily see it and participate |
I am pretty sure the issue is related to this PR as the error message is coming from the updated golang library and the collector was working correctly until I upgraded. Maybe it has something to do with exemplar at the Inf range. I am not sure. I will try to get more information by adding a logging to the upstream Java agent to log all the metrics with exemplars and then see if it will help. |
I think the issue is if there are exemplars associated with the last bucket x:Inf. I added the following code to the collector_test.go:TestConvertDoubleHistogramExemplar
Basically, adding an exemplar associated with the last bucket. When I tried to run the test, I get the same errors: --- FAIL: TestConvertDoubleHistogramExemplar (0.00s) goroutine 148 [running]: |
So indeed the panic only happens after this PR. From the comment above the line where this panics: I would assume this is not supported and that the problem comes from the application producing the data, but it would be great to confirm this |
support for |
So the panic is because the value found does not belong to any bucket bounds. @alvinhom would the value happen be outside of the default bucket bound |
@alvinhom in your test case, if the value is |
…ong with… (open-telemetry#9945)" This reverts commit e5a0a29.
@arun-shopify @alvinhom after discussion with the release manager, since this is a widely used component and the issue can cause the Collector to crash without any known workaround, we have decided to revert this PR before it lands in any Collector release (the revert PR is #12074). The functionality added by this PR can of course be accepted in a future PR, after we understand the issue better. |
@alvinhom would you be able share the config you are using? |
The metric data is produced by the Java opentelemetry agent. The agent just outputs the data and forwards to the otel-collector, and I use prometheus exporter to export that to Prometheus.I do not have any custom instrumentation code. Also, I would assume that exemplar data from the largest boundary to inf should be supported. I am not sure why it is not supported. |
We are using the default bucket bounds. But I am pretty sure that we have HTTP requests that have greater than 10s response time as there some some code that issue complex query to DB. But we definitely want to capture those exemplars as those are valuable for our investigation. |
The data pipeline is like this: OpenTelemetry Java Instrumentation agent (with auto instrumentation) --> OTel Collector Contrib --> Prometheus The collector configuration file is like this:
|
Root cause of the issue appears to be in prometheus/client_golang. The const code assumes that the Reverting seems overkill here rather than simply filtering out exemplars whose value is greater than the final bucket size and fixing forward, but too late for that. |
I implemented the workaround as suggested by above to filter out exemplars at the +Inf bucket and the panic is gone. The fix should really be in the prometheus/client_golang. I have also found that with my pipeline from Java agent -> otel collector, the trace id is not set in the prometheus labels. I had to add code to get the traceid and spanid and add it to the Prometheus labels. With the above changes, I was able to get the exemplars working and integrate it with Grafana and see the exemplars. |
new PR with the fix: |
Description:
Adding a feature to
prometheusexporter
to export exemplars along with histogram metrics. In order to add this feature to theprometheusexporter
, we made a PR (prometheus/client_golang#986, merged) to add exemplars support to the prometheus/client_golang library.Exemplars only show up in Open Metrics format, so I have added a config option
EnableOpenMetrics
in prometheus request handler that can be used to toggle the output format.Testing:
Example config for testing the
prometheusexporter
inopenetelemerty-collector
:To view the exemplars from prometheus exporter, you have to set the following header in your request.
Accept: application/openmetrics-text;
Example curl command:
Sample output for one of the buckets: