Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blocks upload: Don't do cleanUp when meta.json fails to upload. #3626

Merged
merged 3 commits into from
Dec 22, 2020
Merged

Blocks upload: Don't do cleanUp when meta.json fails to upload. #3626

merged 3 commits into from
Dec 22, 2020

Conversation

pstibrany
Copy link
Contributor

@pstibrany pstibrany commented Dec 15, 2020

Under certain conditions, when upload of meta.json file fails, it will actually appear on the bucket eventually. S3 has been observed to behave like this when it reports 503 "SlowDown" error. Currently block.Upload calls cleanUp to delete partially uploaded block, but if meta.json can still appear in the bucket later, this actually creates invalid block – with meta.json file only, but no index or chunks.

Changes

This PR removes call to cleanUp if only meta.json upload fails. If meta.json has really failed, this will result into partial block (without meta.json) and be eventually removed by other components, or upload will be retried by Shipper. But at least it won't generate corrupted blocks.

For more details, see discussion at https://cloud-native.slack.com/archives/CL25937SP/p1608021430416900.

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Verification

Added unit test.

Signed-off-by: Peter Štibraný <[email protected]>
@pstibrany
Copy link
Contributor Author

Failed TestTSDBStore_Series_SplitSamplesIntoChunksWithMaxSizeOf120 test seem to be unrelated to this change.

Signed-off-by: Peter Štibraný <[email protected]>
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

Copy link
Member

@onprem onprem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well. 👍

@GiedriusS
Copy link
Member

GiedriusS commented Dec 22, 2020

This is a very interesting case. Thank you for your work on this! AIUI pragmatically this means that the meta.json file would have to appear within 30 seconds because we do a Sync every 30 seconds. I guess this lines up with AWS S3 restrictions - namely that the limitation is 3500 requests per second. So, we still have 29 seconds left. Is my understanding correct, @pstibrany? Either way, merging this (:

@GiedriusS GiedriusS merged commit 6a183d0 into thanos-io:master Dec 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants