Add stoplight for object storage failures, return HTTP 503 #13043
Can we make this opt-in through an environment variable? It seems bad if people on reliable storage providers see increased unavailability due to transient network errors. For example, if I'm using DigitalOcean Spaces and there's a one-off timeout, I don't want that to trigger a stoplight, because I know it's probably not indicative of a more serious failure.
@nightpool The threshold is 3 and the cooldown is 60 seconds by default. Can we find a combination that you would be happy with without requiring extra configuration? Even a cooldown of 10 seconds would greatly reduce the number of concurrent requests that time out. Edit: Changed the threshold to 10 failures and the cooldown to 30 seconds. If object storage fails 10 times, requests fail instantly for the next 30 seconds, after which it can fail another 10 times. Is that OK with you?
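For readers following along, here is a minimal sketch of the threshold/cooldown behavior described in the comment above. It is not the code in this PR: the class name `StorageCircuitBreaker`, the `OpenError` exception, and the injectable `clock` are all hypothetical. It also assumes that a successful call resets the failure count, which the comment does not state.

```ruby
# Illustrative circuit breaker ("stoplight") sketch. Hypothetical names throughout;
# this is not the PR's implementation.
class StorageCircuitBreaker
  # Raised instead of calling the storage backend while the circuit is open.
  class OpenError < StandardError; end

  def initialize(threshold: 10, cooldown: 30,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @threshold = threshold # failures allowed before the circuit opens
    @cooldown  = cooldown  # seconds during which calls fail instantly
    @clock     = clock     # injectable so the behavior can be checked without sleeping
    @failures  = 0
    @opened_at = nil
    @mutex     = Mutex.new
  end

  # Runs the block unless the circuit is open, in which case it fails fast.
  def run
    @mutex.synchronize do
      if @opened_at
        raise OpenError, 'object storage unavailable' if @clock.call - @opened_at < @cooldown

        # Cooldown elapsed: allow another full round of attempts.
        @opened_at = nil
        @failures  = 0
      end
    end

    begin
      result = yield
    rescue StandardError
      record_failure
      raise
    end

    # Assumption: a success clears previously counted failures.
    @mutex.synchronize { @failures = 0 }
    result
  end

  private

  def record_failure
    @mutex.synchronize do
      @failures += 1
      @opened_at = @clock.call if @failures >= @threshold
    end
  end
end
```

A caller would wrap each storage operation in `breaker.run { ... }` and translate `StorageCircuitBreaker::OpenError` into an HTTP 503 response. That way, requests arriving during the cooldown fail immediately instead of each waiting for its own timeout.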
One thing that this could have enabled is receiving and processing statuses with media attachments when the object storage is down. However, I think the error handling in
Seems fine to me, although I'm a bit worried about temporary storage server failures causing the server to silently skip downloading media attachments/emoji. It's probably a bit more useful overall, but I'm afraid this would make the behavior slightly more surprising and slightly more difficult to debug.
Reduce traffic congestion due to timeouts