Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry S3 get request on 500 Internal Server Error #510

Merged
merged 1 commit into from
Nov 30, 2021

Conversation

mosyp
Copy link
Contributor

@mosyp mosyp commented Nov 29, 2021

Description

We're getting 500 errors in kafka-delta-ingest on get requests (especially with several workers), which is expected and such requests have to be retried

<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message>

We've already tested this in high load environments and this fix helped us avoid unnecessary restarts https://github.com/delta-io/delta-rs/tree/kdi-partition-values-quick-fix.

I've tried to find some universal approach to retry every s3 requests, but couldn't figure out one due to rusoto restrictions

xianwill
xianwill previously approved these changes Nov 29, 2021
@fvaleye
Copy link
Collaborator

fvaleye commented Nov 29, 2021

@mosyp I was really surprised that the S3 retry was not managed by rusoto for these internal errors: rusoto/rusoto#234

If you want another point of view for this workaround: https://github.com/abetterinternet/prio-server/pull/597/files

@mosyp
Copy link
Contributor Author

mosyp commented Nov 29, 2021

@fvaleye
Oh cool ty, will take a look at it tomorrow

fvaleye
fvaleye previously approved these changes Nov 30, 2021
Copy link
Collaborator

@fvaleye fvaleye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mosyp
Copy link
Contributor Author

mosyp commented Nov 30, 2021

@fvaleye the solution in the links you've shared seems huge to implement within this PR and current needs, so I've opened #514.

Also, since rusoto is not maintained anymore, we can give some hope for https://github.com/awslabs/aws-sdk-rust to do the job in a future

@mosyp mosyp merged commit 88803ea into delta-io:main Nov 30, 2021
@mosyp mosyp deleted the s3-retry branch November 30, 2021 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants