
OpenTelemetry Collector 0.104.0 (core, win) apparently not respecting timeout as literally documented #10568

Open
mkarg opened this issue Jul 9, 2024 · 4 comments
Labels
bug Something isn't working

Comments


mkarg commented Jul 9, 2024

Describe the bug
It looks like the OpenTelemetry Collector's timeout value does not impose a hard limit on the number of requests sent per second.
From the documentation, a user expects at most one request per timeout interval.
Apparently, the collector actually sends many requests per second instead.

So either the documentation should state this behavior more clearly, or (even better) there should be a separate option to set a hard limit on requests per second.

Steps to reproduce

receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:
    timeout: 10s

exporters:
  otlphttp:
    endpoint: https://otlp-gateway-prod-eu-west-2.grafana.net/otlp
    headers:
      Authorization: Basic ...token...

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]

What did you expect to see?
Grafana Cloud should receive at most one request per ten seconds.

What did you see instead?

1.720523126297038e+09	info	exporterhelper/retry_sender.go:118	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "otlphttp", "error": "Throttle (5s), error: rpc error: code = ResourceExhausted desc = error exporting items, request to https://otlp-gateway-prod-eu-west-2.grafana.net/otlp/v1/metrics responded with HTTP Status Code 429, Message=the request has been rejected because the tenant exceeded the request rate limit, set to 75 requests/s across all distributors with a maximum allowed burst of 750 (err-mimir-tenant-max-request-rate). To adjust the related per-tenant limits, configure -distributor.request-rate-limit and -distributor.request-burst-size, or contact your service administrator., Details=[]", "interval": "5.575989996s"}

Since Grafana Cloud complains about receiving more than 75 requests per second, and since no other senders are in use, the OpenTelemetry Collector apparently sent far more than one request per ten seconds.

What version did you use?
0.104.0 core

What config did you use?
see above

Environment
Windows 10 Pro

Additional context
Grafana Cloud (Free Subscription)

mkarg added the bug label Jul 9, 2024

mkarg commented Jul 10, 2024

It would be highly appreciated if someone of the collector team could post a configuration that really limits the requests per second as needed for free Grafana Cloud subscription! Thanks! :-)


atoulme commented Jul 10, 2024

Please read https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md?plain=1#L29 on send_batch_size - you will need to override the default.
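For concreteness, a hedged sketch of such an override (the values here are illustrative, not recommendations; `send_batch_size` and `send_batch_max_size` are documented batch-processor settings):

```yaml
processors:
  batch:
    timeout: 10s
    send_batch_size: 20000      # flush once this many items accumulate (default: 8192)
    send_batch_max_size: 20000  # upper cap on the size of a sent batch (default: 0, meaning no cap)
```

Raising `send_batch_size` reduces how often size-triggered flushes occur for a given ingest rate.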


mkarg commented Jul 10, 2024

I have read that, but I do not understand how this could work reliably in any way: how should the collector administrator know the correct batch size to stay within a requests-per-second limit?
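The concern above can be made concrete with back-of-the-envelope arithmetic: keeping size-triggered flushes under a request-rate limit requires knowing the ingest rate in advance, which is exactly what an administrator typically cannot predict. A minimal sketch, assuming steady traffic (`min_batch_size` is a hypothetical helper for illustration, not a Collector API):

```python
import math

def min_batch_size(ingest_items_per_s: float, max_requests_per_s: float = 75) -> int:
    """Smallest send_batch_size that keeps size-triggered flushes under the limit.

    Ignores timer-triggered flushes, which add at most 1/timeout requests/s,
    and assumes a steady, known ingest rate -- the assumption at issue here.
    """
    return math.ceil(ingest_items_per_s / max_requests_per_s)

# At 1,000,000 items/s against Grafana Cloud's 75 req/s limit:
print(min_batch_size(1_000_000))  # 13334
```

If the actual ingest rate exceeds the estimate, the computed batch size no longer guarantees staying under the limit, which is the reliability problem raised in this comment.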


jmacd commented Sep 16, 2024

@mkarg It is difficult to see how a timeout setting could "limit the number of requests sent per second": the timeout is an upper bound on how long a batch may wait before being flushed, not a lower bound on the interval between requests.
The batch processor you are using is somewhat unofficially deprecated. There is new batching functionality under development, and some options were posted in #9462. For what it's worth, you appear to want batching by request size, which was under discussion and does not appear to be implemented at this time.
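To make the flush semantics concrete, here is a minimal sketch (not Collector code; `flushes_per_second` and the steady-traffic model are illustrative assumptions): the batch processor flushes whenever either `timeout` elapses or `send_batch_size` items (default 8192) have accumulated, whichever comes first, so a high ingest rate forces flushes far more often than once per timeout.

```python
def flushes_per_second(ingest_rate: float,
                       send_batch_size: int = 8192,
                       timeout_s: float = 10.0) -> float:
    """Approximate flush (request) rate for a steady ingest rate in items/s.

    Size-triggered flushes dominate when traffic is heavy; the timer only
    guarantees at least one flush per timeout interval when traffic is light.
    """
    size_triggered = ingest_rate / send_batch_size  # flushes forced by batch size
    return max(size_triggered, 1.0 / timeout_s)     # at least one per timeout

# With the default send_batch_size of 8192, a heavy ingest rate of
# 1,000,000 items/s yields roughly 122 flushes/s -- well above 75 req/s,
# despite a 10 s timeout.
print(flushes_per_second(1_000_000))
```

Under this model, raising `send_batch_size` (not `timeout`) is what lowers the request rate under heavy load, which matches the advice given earlier in the thread.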
