Allow disable ratelimit quota handler #22314
@clwluvw which backends did you make requests to in your performance benchmark? We are currently fixing a bug related to quotas increasing request latency, but it only happens for certain backends and certain types of quotas. I'm interested in hearing more about which test cases you've run.
I'm not sure that what I see in the wrapper code relates to a specific backend, but my tests were on
Note that with the fixes above, you'll still incur 1x RTT of latency for logins when using auth methods that implement the

There's one more fix in flight which should allow you to get parity with pre-1.12 latency for such cases (unless you have a role-based quota applied to the namespace/mount) by setting a config option:
@biazmoreira I used token-based auth (basically the root token) for my requests. Do you still think those PRs would account for the CPU time I saw in the wrapper?
@clwluvw Instead of bypassing the wrapper altogether, we pinpointed what in the wrapper was causing the increase in latency and decided to fix it. You can check that https://github.com/hashicorp/vault/pull/22597/files modifies the wrapper you mentioned in the PR description. We think this will fix the issue, but as Mike mentioned, it might still happen in cases where role-based quotas exist and the auth method implements
Also, can you give us more details about the requests you used for the benchmark? Which auth methods, APIs, secrets engines, etc.?
I used the root token for authentication and the transit engine (https://developer.hashicorp.com/vault/api-docs/secret/transit#generate-data-key) to generate the data key. The latency for key generation was in microseconds, but with the rate-limit wrapper I always got more than 1 ms... After disabling it (and rebuilding from source) I could get sub-ms latency from the API. The CPU profiler was showing that...
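For reference, the data-key call used in the benchmark follows the transit API linked above; a request along these lines (the key name `my-key`, address, and token are placeholders, not from the thread):

```shell
# Generate a wrapped data key with the transit engine; "my-key" and
# VAULT_ADDR/VAULT_TOKEN are illustrative placeholders.
curl \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    --request POST \
    "$VAULT_ADDR/v1/transit/datakey/wrapped/my-key"
```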
No, we make an in-memory query for login requests specifically, to see whether the path has a role-based quota.
Thanks for all the context, @clwluvw. We have just merged the last change that we think could fix the latency issue you're seeing. Would you be able to test it once more with the latest commit from main?
@biazmoreira I built from (e1895d8) and, in addition, added some time measurement:

```diff
diff --git a/http/util.go b/http/util.go
index 19bdbf836c..f2be83350f 100644
--- a/http/util.go
+++ b/http/util.go
@@ -12,6 +12,7 @@ import (
 	"net"
 	"net/http"
 	"strings"
+	"time"

 	"github.com/hashicorp/vault/sdk/logical"
@@ -40,6 +41,7 @@ var (
 func rateLimitQuotaWrapping(handler http.Handler, core *vault.Core) http.Handler {
 	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		startTime := time.Now()
 		ns, err := namespace.FromContext(r.Context())
 		if err != nil {
 			respondError(w, http.StatusInternalServerError, err)
@@ -127,8 +129,8 @@ func rateLimitQuotaWrapping(handler http.Handler, core *vault.Core) http.Handler
 			return
 		}
+		core.Logger().Info("ratelimit time", "spend", time.Since(startTime).Seconds())
 		handler.ServeHTTP(w, r)
-		return
 	})
}
```

and here are the sample times (in seconds):
Still, ~0.5 ms is consumed by the wrapper.
@biazmoreira Do you have any thoughts on this? Would the PR I linked to this issue be suitable for such use cases?
Hey @clwluvw, we would like to understand your issue in more depth. Would you be able to give us more details about the use case in which this problem appears? Was this meant to be only a performance benchmark, or are you facing this issue in production on every request?

Unfortunately, we won't be moving forward with the suggestion implemented in your PR. We would like to understand your use case a bit more before we think of other solutions to this specific problem. One suggestion, though, is to configure

Looking forward to hearing more from you about the use case. Let me know if there's anything else I can help with!
@biazmoreira The issue appears on a default Vault installation: no rate-limit configuration, authenticated with the root token, and regardless of which API is called (in my case, the one I mentioned above), the rate-limit handler always consumes ~0.5 ms. This becomes a problem for installations where microseconds matter; sometimes you can't close your eyes to 0.5 ms of overhead. To reproduce it, apply the diff I pasted, bring up a Vault instance (no HA, in-memory backend, no rate limits), issue a request to one of the APIs (probably the one I mentioned is best), and check the output. In my case, I see 0.5 ms consumed by the wrapper even though I haven't configured any rate limits anywhere. Can't we make the wrapper cost fewer microseconds, or make it configurable to disable it (as there is already a disable option for one of the other wrappers)?
We acknowledge your concerns and share the belief that we should minimize unneeded latency, especially in request hot paths. However, there are instances where some latency increase cannot be entirely avoided. In practice, we prefer to refrain from adding configuration options that disable features solely due to latency concerns, as it may set an undesirable precedent; imagine if we followed this approach for every feature that introduces latency. One potential way to address your latency issue without resorting to a feature toggle is to use `rate_limit_exempt_paths`. Have you considered this as an alternative to your proposed solution?
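For reference, exempt paths are set on the quota config endpoint (`sys/quotas/config`); a sketch of such a request, where the listed paths and the address/token variables are illustrative, not taken from the thread:

```shell
# Exempt specific paths from rate-limit quota evaluation
# (paths shown are placeholders; adjust to your deployment).
curl \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    --request POST \
    --data '{"rate_limit_exempt_paths": ["sys/health", "transit/datakey"]}' \
    "$VAULT_ADDR/v1/sys/quotas/config"
```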
As per @akshya96's reply and our engineering methodology, I'm going to go ahead and close this issue at this time. If needed, please feel free to re-open it.
**Is your feature request related to a problem? Please describe.**
In performance benchmarks, it has been observed that `rateLimitQuotaWrapping` consumes significant CPU time and adds ~0.5 ms of latency to each request.

**Describe the solution you'd like**
Like other handlers, make it possible to drop `rateLimitQuotaWrapping` entirely.

**Describe alternatives you've considered**
Drop the effect of the wrapper when there is no rate limiter in place.
**Additional context**