-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTel-Arrow receiver admission control broken in several ways #36074
Labels
bug
Something isn't working
Comments
jmacd
added a commit
to jmacd/opentelemetry-collector-contrib
that referenced
this issue
Oct 29, 2024
This was referenced Oct 29, 2024
andrzej-stencel
pushed a commit
that referenced
this issue
Oct 31, 2024
#### Description Adds a no-op implementation of the BoundedQueue used by the OTel-Arrow receiver for admission control. #### Link to tracking issue Part of #36074. #### Testing Adds a new end-to-end test to verify admission control with/without waiters and unbounded. The test was taken from #36033. #### Documentation Added: "0" request_limit_mib indicates no admission control.
djaglowski
pushed a commit
that referenced
this issue
Nov 1, 2024
#### Description Simplifies the admission control logic for OTAP payloads. We call Acquire() once after uncompressing the data, instead of once with compressed size and once with the difference. #### Link to tracking issue Part of #36074. #### Testing One test is replaced with logic to verify certain BoundedQueue actions. ~Note: the OTel-Arrow test suite will not pass with this PR until it merges with #36078.~ Originally developed in #36033. #### Documentation Not user-visible.
This was referenced Nov 1, 2024
ArthurSens
pushed a commit
to ArthurSens/opentelemetry-collector-contrib
that referenced
this issue
Nov 4, 2024
…try#36081) #### Description Adds a no-op implementation of the BoundedQueue used by the OTel-Arrow receiver for admission control. #### Link to tracking issue Part of open-telemetry#36074. #### Testing Adds a new end-to-end test to verify admission control with/without waiters and unbounded. The test was taken from open-telemetry#36033. #### Documentation Added: "0" request_limit_mib indicates no admission control.
ArthurSens
pushed a commit
to ArthurSens/opentelemetry-collector-contrib
that referenced
this issue
Nov 4, 2024
…telemetry#36082) #### Description Simplifies the admission control logic for OTAP payloads. We call Acquire() once after uncompressing the data, instead of once with compressed size and once with the difference. #### Link to tracking issue Part of open-telemetry#36074. #### Testing One test is replaced with logic to verify certain BoundedQueue actions. ~Note: the OTel-Arrow test suite will not pass with this PR until it merges with open-telemetry#36078.~ Originally developed in open-telemetry#36033. #### Documentation Not user-visible.
sbylica-splunk
pushed a commit
to sbylica-splunk/opentelemetry-collector-contrib
that referenced
this issue
Dec 17, 2024
…try#36081) #### Description Adds a no-op implementation of the BoundedQueue used by the OTel-Arrow receiver for admission control. #### Link to tracking issue Part of open-telemetry#36074. #### Testing Adds a new end-to-end test to verify admission control with/without waiters and unbounded. The test was taken from open-telemetry#36033. #### Documentation Added: "0" request_limit_mib indicates no admission control.
sbylica-splunk
pushed a commit
to sbylica-splunk/opentelemetry-collector-contrib
that referenced
this issue
Dec 17, 2024
…telemetry#36082) #### Description Simplifies the admission control logic for OTAP payloads. We call Acquire() once after uncompressing the data, instead of once with compressed size and once with the difference. #### Link to tracking issue Part of open-telemetry#36074. #### Testing One test is replaced with logic to verify certain BoundedQueue actions. ~Note: the OTel-Arrow test suite will not pass with this PR until it merges with open-telemetry#36078.~ Originally developed in open-telemetry#36033. #### Documentation Not user-visible.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Component(s)
receiver/otelarrow
What happened?
Description
Several aspects of the OTel-Arrow admission control mechanism are broken or not working as intended.
admission.BoundedQueue
has a source of complexity, which causes it to have fallible APIs. The Arrow admission path was performing multiple calls to Acquire, once with compressed data and once with uncompressed data. This leads to error handling that could be avoided if it were not for the complication.obsrecv
, so no observability happens when admission control fails. This is a major bug.Proposed solution
First, eliminate the complication that necessitates fallible APIs. The problem is the two calls to Acquire() once with compressed size and once with uncompressed size. Because compressed size is typically so much smaller the uncompressed size, the advantage of these two Acquire calls does not outweigh the complexity cost.
Therefore, we can eliminate fallible APIs from the admission package. This may be done by returning a closure from Acquire() to perform the correct release. The potential for mis-use is greatly reduced.
The bounded queue implementation should transition to LIFO. To avoid fallible APIs, transition to LIFO, and fix the race condition is a substantial change. The BoundedQueue tests will be completely rewritten.
Finally, the OTel-Arrow receiver should perform admission control Acquire() once after it computes the uncompressed size of the request, meaning it will stop using the otlp-pdata-size header. The OTel-Arrow exporter should continue to emit this header while older receivers are still in use, but it can be removed eventually.
Collector version
v0.111.0
Environment information
Environment
Any.
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: