Diagnose if rate limit timeouts are due to pipeline timeouts in 100 name orders #7846

beautifulentropy · 2024-11-25T19:02:18Z

By default, go-redis will retry a request up to 3 times. Check whether retries are applied to individual requests in a pipeline or to the entire pipeline. The theory we are trying to test is that a timeout on 1 or 2 keys in a 100-name order results in the whole pipeline being retried, and thus considerably more load on those same shards.
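To put rough numbers on the concern, here is a back-of-the-envelope sketch in Go. It assumes, hypothetically, that a failed pipeline is resent in full on every retry; whether go-redis actually does this is exactly what needs to be checked. The function names and the 100-command, 2-timeout scenario are illustrative only.

```go
package main

import "fmt"

// commandsSentWholePipeline returns how many commands reach Redis if every
// retry resends the entire pipeline (hypothetical behavior, to be verified).
func commandsSentWholePipeline(pipelineSize, retries int) int {
	return pipelineSize * (retries + 1)
}

// commandsSentPerCommand returns how many commands reach Redis if only the
// commands that timed out are resent on each retry.
func commandsSentPerCommand(pipelineSize, timedOut, retries int) int {
	return pipelineSize + timedOut*retries
}

func main() {
	// 100 commands, 2 of which keep timing out until all 3 retries are used.
	fmt.Println(commandsSentWholePipeline(100, 3)) // 400
	fmt.Println(commandsSentPerCommand(100, 2, 3)) // 106
}
```

Under these assumptions, whole-pipeline retries would send roughly four times the commands to the affected shards, while per-command retries would add only a handful.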

We can add a label to our metrics to bin transactions by count. With an upper limit of ~105 for new-order rate limit checks, including per-name checks, we could use bins like: 1-25, 26-50, 51-75, 76-105, 106+. With these deployed, we should be able to correlate timeouts with query size.
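A minimal sketch of the binning, assuming the bin edges proposed above. The function name `binLabel` is hypothetical, and the `"0"` bucket for empty batches is an added assumption that is not part of the proposal. The returned string would be used as the metric label value.

```go
package main

import "fmt"

// binLabel maps the number of transactions in a batch to a coarse bucket
// suitable for use as a metric label value.
func binLabel(count int) string {
	switch {
	case count <= 0:
		// Assumption: empty batches get their own bucket.
		return "0"
	case count <= 25:
		return "1-25"
	case count <= 50:
		return "26-50"
	case count <= 75:
		return "51-75"
	case count <= 105:
		return "76-105"
	default:
		return "106+"
	}
}

func main() {
	for _, n := range []int{1, 25, 26, 100, 105, 106} {
		fmt.Printf("%d -> %s\n", n, binLabel(n))
	}
}
```

Keeping the label to a handful of fixed values avoids the cardinality problem that labeling by exact count would create.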

For batch operations, include the operation and the number of keys in the error message. This should help diagnose whether we are getting `i/o timeout` errors disproportionately for larger requests, or for certain operations. Also, make the ignored errors part of the overall WFE request logs, which allows us to get additional context, like whether certain requesters or domain names are getting disproportionately many errors. Related to #7846.

aarongable added this to the Sprint 2024-12-03 milestone Dec 3, 2024

aarongable assigned jsha Dec 3, 2024

jsha mentioned this issue Dec 4, 2024

ratelimits: add detail to error messages #7871

Merged

aarongable modified the milestones: Sprint 2024-12-03, Sprint 2024-12-10 Dec 10, 2024
