Ingester flush queue is backing up again #254
I think S3 latency was a red herring - those instances were doing more writes. We're also seeing this on dev. Took a stack dump (https://gist.github.com/tomwilkie/a1159d8974d231965fd04a6c26d0105b) and all the flush goroutines were in backoff sleeps. Upped read throughput on dev to match write throughput, and progress started to be made. Have upped read throughput on prod to match.
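For context on why the queue backs up rather than erroring out, here is a minimal sketch of the retry-with-backoff pattern the stack dump shows (not the actual ingester code; the function names, error, and backoff bounds are illustrative). When the store throttles most read/write attempts, every flush goroutine spends nearly all its time parked in the sleep, so the queue fills faster than it drains:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// errThrottled stands in for the "throughput exceeded" style error the
// chunk/index store returns when we exceed provisioned capacity. (Hypothetical name.)
var errThrottled = errors.New("provisioned throughput exceeded")

// flushChunk is a placeholder for the real flush call to the store.
// Here it simulates a store that throttles most attempts, as happens when
// read throughput is provisioned well below write throughput.
func flushChunk() error {
	if rand.Intn(10) < 8 {
		return errThrottled
	}
	return nil
}

// flushWithBackoff retries a single flush with exponential backoff.
// Under heavy throttling, the goroutine is almost always inside the
// time.Sleep below - which is where the stack dump shows them.
func flushWithBackoff(minBackoff, maxBackoff time.Duration, maxRetries int) error {
	backoff := minBackoff
	for i := 0; i < maxRetries; i++ {
		err := flushChunk()
		if err == nil {
			return nil
		}
		if !errors.Is(err, errThrottled) {
			return err
		}
		time.Sleep(backoff) // flush goroutines parked here
		backoff *= 2
		if backoff > maxBackoff {
			backoff = maxBackoff
		}
	}
	return fmt.Errorf("flush abandoned after %d throttled attempts", maxRetries)
}

func main() {
	start := time.Now()
	err := flushWithBackoff(100*time.Millisecond, 5*time.Second, 10)
	fmt.Printf("flush result after %v: %v\n", time.Since(start), err)
}
```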
Looks like this might be self-inflicted - flux has some very high cardinality metrics (see fluxcd/flux#417).
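One way to spot which metrics are contributing the most series (illustrative only - the Prometheus address and the exact query are assumptions, not something we ran) is to ask the Prometheus HTTP API for series counts per metric name:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

// queryResponse covers just the fields of the /api/v1/query response we need.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			Metric map[string]string `json:"metric"`
			Value  []interface{}     `json:"value"` // [timestamp, "value"]
		} `json:"result"`
	} `json:"data"`
}

func main() {
	// Count series per metric name and keep the ten biggest offenders.
	promQL := `topk(10, count by (__name__)({__name__=~".+"}))`

	// Assumed Prometheus address; replace with the real one.
	u := "http://localhost:9090/api/v1/query?query=" + url.QueryEscape(promQL)

	resp, err := http.Get(u)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var qr queryResponse
	if err := json.NewDecoder(resp.Body).Decode(&qr); err != nil {
		log.Fatal(err)
	}

	for _, r := range qr.Data.Result {
		fmt.Printf("%-60s %v series\n", r.Metric["__name__"], r.Value[1])
	}
}
```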
Still happening even though we've disabled flux scraping.
The error was
This is in prod now. |
Things we should do to help:
Noticed when deploying to prod. Slack logs: