
[receiver/statsd] Gaps in counter data (delta temporality) #18470

Closed
matej-g opened this issue Feb 8, 2023 · 4 comments · Fixed by #18498
Labels
bug Something isn't working priority:p2 Medium receiver/statsd statsd related issues

Comments

@matej-g
Contributor

matej-g commented Feb 8, 2023

Component(s)

receiver/statsd

What happened?

Description

First of all, I'm relatively new to the StatsD protocol, but I'm trying to get the statsd receiver -> OTLP exporter use case working for some of my users, so I also wanted to make sure I'm not misunderstanding how this should actually work.

When sending a simple counter with delta temporality to the collector for testing, I see that the timestamps on the exports after aggregation do not add up. What I mean is that the receiver does not build an 'unbroken sequence' as described in the spec: the start timestamp of the current point does not match the timestamp of the preceding point, i.e. this condition is not fulfilled:

For subsequent points in an unbroken sequence:

- For points with delta aggregation temporality, the StartTimeUnixNano of each point matches the TimeUnixNano of the preceding point

Example:

  • Timestamps for preceding point: StartTimestamp: 2023-02-08 20:15:50.70338 +0000 UTC, Timestamp: 2023-02-08 20:15:51.297241 +0000 UTC
  • Timestamps for next point: StartTimestamp: 2023-02-08 20:16:50.701975 +0000 UTC, Timestamp: 2023-02-08 20:16:51.397854 +0000 UTC
    (see below for a full log excerpt)

Since the backend I export to expects this condition, I'm unable to export delta temporality metrics: they are treated as reset due to the non-matching start timestamps.

In comparison, when doing the same with OTLP receiver -> OTLP exporter, I'm getting the expected results and each point's start timestamp correctly matches the timestamp of the preceding point.

Steps to Reproduce

  • Have any application periodically send a simple counter value to the Statsd receiver
  • Observe the timestamp values logged by the collector across different aggregation intervals (with detailed logging on)

Expected Result

The timestamp of the preceding point matches the start timestamp of the next point

Actual Result

Gaps between the timestamp of the preceding point and the start timestamp of the next point

Collector version

latest main (fac9f8b)

Environment information

Environment


OS: MacOS 12.6.2
Compiler (Go version): go1.19.5 darwin/arm64

Running as a binary on my host machine (no container)

OpenTelemetry Collector configuration

receivers:
  statsd:
    endpoint: "localhost:8125"
    aggregation_interval: 60s
    is_monotonic_counter: true
  otlp:
    protocols:
      grpc:
      http:

exporters:
  otlp:
    endpoint: "<redacted>"
    headers:
      <redacted>
  logging:
    verbosity: detailed

processors:
  batch:

service:
  pipelines:
    metrics:
      receivers: [otlp, statsd]
      processors: [batch]
      exporters: [logging, otlp]

Log output

{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-02-08T21:43:55.690+0100    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 1}
2023-02-08T21:43:55.690+0100    info    ResourceMetrics #0
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: web.page.views
     -> Description: 
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Delta
NumberDataPoints #0
StartTimestamp: 2023-02-08 20:42:55.528066 +0000 UTC
Timestamp: 2023-02-08 20:42:56.49073 +0000 UTC
Value: 59
        {"kind": "exporter", "data_type": "metrics", "name": "logging"}


2023-02-08T21:44:55.675+0100    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 1}
2023-02-08T21:44:55.676+0100    info    ResourceMetrics #0
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: web.page.views
     -> Description: 
     -> Unit: 
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Delta
NumberDataPoints #0
StartTimestamp: 2023-02-08 20:43:55.527763 +0000 UTC
Timestamp: 2023-02-08 20:43:55.590627 +0000 UTC
Value: 60
        {"kind": "exporter", "data_type": "metrics", "name": "logging"}

Additional context

No response

@matej-g matej-g added bug Something isn't working needs triage New item requiring triage labels Feb 8, 2023
@github-actions github-actions bot added the receiver/statsd statsd related issues label Feb 8, 2023
@github-actions
Contributor

github-actions bot commented Feb 8, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@matej-g matej-g changed the title [receiver/statsd] Gaps in metrics with delta temporality [receiver/statsd] Gaps in counter data (delta temporality) Feb 9, 2023
@matej-g
Contributor Author

matej-g commented Feb 9, 2023

It seems like setting the start timestamp and timestamp in GetMetrics() would solve this, instead of doing it in Aggregate() each time we build the counter, but I'm not sure if this is correct.

@dmitryax
Member

dmitryax commented Feb 9, 2023

It seems like setting the start timestamp and timestamp in GetMetrics() would solve this, instead of doing it in Aggregate() each time we build the counter, but I'm not sure if this is correct.

Hi @matej-g. Thanks for reporting. I think you're right. lastIntervalTime should be set to the last flush time, not the time when we saw a previous statsd line. Can you submit a PR?

@dmitryax dmitryax added priority:p2 Medium and removed needs triage New item requiring triage labels Feb 9, 2023
@matej-g
Contributor Author

matej-g commented Feb 10, 2023

Thanks for confirming @dmitryax, PR is incoming 👍

dmitryax pushed a commit that referenced this issue Feb 10, 2023
These changes make sure that the start timestamp and timestamp of successive data points for counters align, as is required for delta temporality metrics (see description of the issue in #18470 and the relevant spec part - https://opentelemetry.io/docs/reference/specification/metrics/data-model/#resets-and-gaps)

Signed-off-by: Matej Gera <[email protected]>