Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make gRPC response code configurable when rate limiting occurs #4735

Closed
nikolay-pshenichny opened this issue Oct 15, 2018 · 7 comments
Closed
Assignees
Labels
enhancement Feature requests. Not bugs or questions.

Comments

@nikolay-pshenichny
Copy link
Contributor

Current envoy implementation returns UNAVAILABLE if a gRPC call was rate-limited (rate limit service responded with OVER_LIMIT). This behavior matches the HTTP to gRPC Status Code Mapping , but may not always be what a "client" wants to see ( Google Cloud documentation suggests mapping 429 to RESOURCE_EXHAUSTED).

Use cases:

  • a client service needs to know more details in order to pick the right retry policy, or not retry the call at all.
  • a service owner needs to know more details about why a gRPC call failed.

Feature request details:

This feature request is to make the gRPC response code configurable for the scenarios when a gRPC call was rate-limited by Envoy on the server-side.

Suggested config option names: rate_limited_response_code or just response_code
Suggested choices: UNAVAILABLE (default) and RESOURCE_EXHAUSTED
Suggested location for the new config option: config.ratelimit.v2.RateLimitServiceConfig

Relevant Links:

@mattklein123 mattklein123 added the enhancement Feature requests. Not bugs or questions. label Oct 17, 2018
@venilnoronha
Copy link
Member

I'd like to work on this.

@ramaraochavali
Copy link
Contributor

@venilnoronha How are you planning to fix this? Are you planning to introduce the config in rate limiting service configuration for this specific status and return the status accordingly or are you planning to provide a higher level config that selects one of grpc code state mapping that applies to all services/status codes?

@venilnoronha
Copy link
Member

I'm thinking in the direction of a generic status mapping approach, as there are a few statuses that map to UNAVAILABLE. See https://github.com/grpc/grpc/blob/master/doc/http-grpc-status-mapping.md.

@lizan
Copy link
Member

lizan commented Oct 23, 2018

Looks like gRPC C core does map 429 to RESOURCE_EXHAUSTED:
https://github.com/grpc/grpc/blob/master/src/core/lib/transport/status_conversion.cc#L82

But Java and Go does map 429 to UNAVAILABLE:
https://github.com/grpc/grpc-java/blame/master/core/src/main/java/io/grpc/internal/GrpcUtil.java#L305
https://github.com/grpc/grpc-go/blob/cfb96008516cbda549042f43fdd23befea331172/internal/transport/http_util.go#L91

This is confusing so I would recommend to double check with gRPC maintainers.

Aside from that, it might be useful anyway by adding a response status (HTTP/gRPC) in RateLimitResponse, so the rate limit service can override the status code Envoy send to downstream.

@ramaraochavali
Copy link
Contributor

@lizan Yes, We should check with gRPC maintainers and sort that confusion. On the Envoy side implementation instead of doing specific status code mappings at specific services as needed, can we have couple of well known mappings and allow the user to choose one via CLI/bootstrap? This may be more generic than implementing some thing specific to rate limit. WDYT?

@lizan
Copy link
Member

lizan commented Oct 23, 2018

@ramaraochavali Given the only deviation is mapping of HTTP 429 Too Many Requests, I doubt there is a need for generic code mapping override. IMO making rate limit service respond custom status code is more usable in long term, just like we added header there before.

@ramaraochavali
Copy link
Contributor

@lizan See the status mapping here in Envoy

Status::GrpcStatus Utility::httpToGrpcStatus(uint64_t http_response_status) {
. It maps currently maps 429, 502,503 and 504 to UNAVAILABLE as per https://github.com/grpc/grpc/blob/master/doc/http-grpc-status-mapping.md. However this https://cloud.google.com/apis/design/errors#generating_errors maps 504 to DEAD_LINE_EXCEEDED for example. Should we handle that difference as well?
Related issue #3619

NihilWater pushed a commit to NihilWater/gitaly that referenced this issue Nov 23, 2022
What
---
Change the returned gRPC code from `Unavailable` to `ResourceExhausted`
when the user reaches concurrency limits.

Why
---
In https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8056 and
https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8071 we saw a
single user paging the on-call because of the high error rate. When we
looked at the error rate it was because they were reaching [concurrency
limits](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8056#note_1174303820).
Rate limiting a user is not an error for us but normal behavior just
like a 429 HTTP status code.

In
https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16844#note_1175622086
we looked into the best gRPC code to return, and
`ResourceExhausted` was the best one where we could differeiencate
between a real server error and a user error. This goes against the [grpc
mapping](https://github.com/grpc/grpc/blob/master/doc/http-grpc-status-mapping.md)
but also follows what the [envoy proxy
does](envoyproxy/envoy#4735).

Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16844
Reference: https://gitlab.com/gitlab-org/gitaly/-/issues/4637
Signed-off-by: Steve Azzopardi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests. Not bugs or questions.
Projects
None yet
Development

No branches or pull requests

5 participants