The ingester implements a limit on the number of concurrent inflight push requests that it can handle. When the limit is reached, the ingester rejects additional push requests. This limit exists to prevent the ingester from being overloaded and crashing when too many requests are sent to it at once.
The problem is that it can be difficult to find the right value for the limit. Not all requests are equal in size, and a value that works for tiny 1 KiB requests will not work for requests that are 25 times bigger.
We have tried to solve this problem by introducing a limit on total request size (#6010, #5958), but those checks don't provide adequate protection because they happen too late.
I see two possible solutions:
We could modify the distributor to split big requests into smaller ones. Today the distributor splits an incoming push request into per-ingester subrequests based on series sharding. When the distributor constructs these subrequests, it could further split each one by size. This would guarantee to ingesters that no request is larger than a specified size, and it would simplify finding the right limit for inflight requests.
Once grpc/grpc-go#6652 ("transport: Pass Header metadata to tap handle") is available, the distributor could provide the request size as metadata when sending a push request to the ingester. This would solve the problem that we ran into in #6010 and #5958, namely that we don't know the push request size before we read and parse the request. By that time it may be too late.
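To make the first option more concrete, here is a minimal sketch of splitting a per-ingester subrequest by size. It is an assumption-laden illustration, not the distributor's code: `series`, its `size` field, and `splitBySize` are hypothetical, and it assumes the encoded size of each series is known or cheap to compute.

```go
package main

// series is a hypothetical stand-in for one timeseries in a push request.
type series struct {
	labels string
	size   int // encoded size in bytes (assumed to be known up front)
}

// splitBySize groups items, in order, into batches whose total size does not
// exceed maxBytes. A single item larger than maxBytes cannot be split by this
// sketch, so it ends up alone in its own (oversized) batch.
func splitBySize(items []series, maxBytes int) [][]series {
	var batches [][]series
	var cur []series
	curSize := 0
	for _, s := range items {
		// Close the current batch if adding this item would exceed the cap.
		if len(cur) > 0 && curSize+s.size > maxBytes {
			batches = append(batches, cur)
			cur, curSize = nil, 0
		}
		cur = append(cur, s)
		curSize += s.size
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}
```

One caveat with this sketch: the size guarantee only holds if no single series exceeds the cap, so a real implementation would need a separate rule for oversized series.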