
[exporter/prometheusexporter] Support for delta metrics from multiple sources #11870

Closed
douglasbgray opened this issue Jul 1, 2022 · 5 comments

Comments

@douglasbgray

Is your feature request related to a problem? Please describe.

I would like to support exporting metrics sent by mobile devices. I am unable to use cumulative metrics because there are millions of devices sending metrics, and each metric would need to be labelled by device ID to avoid contention. This is not scalable, so I need to use delta metrics.

The PR below mostly has the feature I need, but it only supports delta metrics from a single source (in that use case, a statsd server submitting the deltas):

#9919

Describe the solution you'd like

I am proposing the addition of a boolean flag to the config. The flag would be "accept_deltas_from_multiple_sources" and would default to false to maintain current behavior. I am open to a shorter name if you can think of one.
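
A minimal sketch of the config side, assuming the flag is added to the exporter's Config struct; the field name and mapstructure tag are part of this proposal, not an existing option:

// Sketch only: proposed addition to the prometheusexporter Config struct.
type Config struct {
	// ... existing exporter options ...

	// AcceptDeltasFromMultipleSources skips the start-timestamp check in the
	// accumulator so delta points from many producers can be summed into one
	// cumulative series. Defaults to false (current behavior).
	AcceptDeltasFromMultipleSources bool `mapstructure:"accept_deltas_from_multiple_sources"`
}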

Then, in accumulator.go, the section below would perform the start-timestamp comparison only when the flag is false. Likewise, the line that sets the start timestamp would only run when the flag is false.

// Delta-to-Cumulative
if doubleSum.AggregationTemporality() == pmetric.MetricAggregationTemporalityDelta && ip.StartTimestamp() == mv.value.Sum().DataPoints().At(0).StartTimestamp() {
	ip.SetStartTimestamp(mv.value.Sum().DataPoints().At(0).StartTimestamp())
	// ... (rest of the delta-to-cumulative accumulation elided) ...
}
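
Roughly, the guarded version could look like the following; the acceptDeltasFromMultipleSources field on the accumulator is hypothetical plumbing from the proposed config option:

// Sketch only; a.acceptDeltasFromMultipleSources would be populated from the new config flag.
sameWriter := ip.StartTimestamp() == mv.value.Sum().DataPoints().At(0).StartTimestamp()
if doubleSum.AggregationTemporality() == pmetric.MetricAggregationTemporalityDelta &&
	(a.acceptDeltasFromMultipleSources || sameWriter) {
	if !a.acceptDeltasFromMultipleSources {
		ip.SetStartTimestamp(mv.value.Sum().DataPoints().At(0).StartTimestamp())
	}
	// ... existing delta-to-cumulative accumulation continues unchanged ...
}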

Describe alternatives you've considered

Alternatives:

  1. Label each metric by device ID; this is not scalable.
  2. Fudge the value of SetStartTimestamp in the metrics payload so that all devices use the same pre-determined fixed value (sketched after this list). This is a hack that works around the logic by misusing the start timestamp, and it may cause other issues downstream.
  3. Copy the contents of the exporter into my custom collector and add my changes on top. This would make picking up upstream updates difficult.
  4. Clone the contrib project and add my changes on top. Maintenance would be less of an issue, just a series of periodic rebases, but I may have conflicts to resolve. Still less than ideal.
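
For alternative 2, the fudge would amount to forcing every data point to a shared start time before it reaches the exporter, for example with a small helper like this (hypothetical code using the pdata API; the fixed epoch is an arbitrary placeholder):

import (
	"time"

	"go.opentelemetry.io/collector/pdata/pcommon"
	"go.opentelemetry.io/collector/pdata/pmetric"
)

// forceFixedStart is a hypothetical helper that rewrites every sum data point
// to the same pre-determined start timestamp so the accumulator's comparison
// always matches, regardless of which device produced the point.
func forceFixedStart(m pmetric.Metric) {
	fixedStart := pcommon.NewTimestampFromTime(time.Date(2022, 1, 1, 0, 0, 0, 0, time.UTC))
	dps := m.Sum().DataPoints()
	for i := 0; i < dps.Len(); i++ {
		dps.At(i).SetStartTimestamp(fixedStart)
	}
}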

Additional context

If this proposal is accepted, I can open a PR with the change, or if it is preferred that a code owner make the change, that also works for me.

@gouthamve
Member

Hi, this looks like a violation of the "Single Writer" principle: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#single-writer

Having said that, I do see the validity of the use-case, and I am not sure how to deal with it.

@douglasbgray
Author

douglasbgray commented Jul 5, 2022

Yes, it would be a violation of the single-writer principle. What I have done in the past is:

  1. Custom REST servlet and payload ->
  2. Kafka ->
  3. Flink (performs aggregation) ->
  4. Metrics storage.

I was trying to see if I could use OTEL to eliminate most of that infrastructure.

Besides the single writer violation, my other concern would be the scalability of the accumulator in the Prometheus exporter. If I were to implement this, I could have potentially thousands of calls per second trying to add to the same total. Normally, I would use an Atomic Counter to do this, but that may not be possible here, given the current data structure.
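
For context, the atomic-counter approach I have used elsewhere looks roughly like this; nothing like it exists in the accumulator today, which stores whole pmetric data points behind a sync.Map, so the value update is not a simple numeric add:

import "sync/atomic"

// deltaTotal is a hypothetical per-metric-signature counter, not existing code.
type deltaTotal struct {
	total int64
}

// add folds one delta into the running total; atomic.AddInt64 scales to many
// concurrent writers without taking a lock.
func (d *deltaTotal) add(delta int64) int64 {
	return atomic.AddInt64(&d.total, delta)
}

Double-valued sums would need a CAS loop or a mutex, which is part of why this may not fit the current data structure.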

@github-actions
Contributor

Pinging code owners: @Aneurysm9. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions bot added the Stale label Nov 16, 2022
@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned May 26, 2023