-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compactor: Rare race of sharded compactors / slow upload causing data loss. #1919
Comments
The workaround here is to change cc @mattrco |
This replaces man 4 inconsistent meta.json syncs places in other components. Fixes: #1335 Fixes: #1919 Fixes: #1300 * One place for sync logic for both compactor and store * Corrupted disk cache for meta.json is handled gracefully. * Blocks without meta.json are handled properly for all compactor phases. * Synchronize was not taking into account deletion by removing meta.json. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Better observability for syncronize process. * More logs for store startup process. * Remove Compactor Syncer. * Added metric for partialUploadAttempt deletions. * More tests. TODO in separate PR: * More observability for index-cache loading / adding time. Signed-off-by: Bartek Plotka <[email protected]> Signed-off-by: Bartlomiej Plotka <[email protected]>
This replaces man 4 inconsistent meta.json syncs places in other components. Fixes: #1335 Fixes: #1919 Fixes: #1300 * One place for sync logic for both compactor and store * Corrupted disk cache for meta.json is handled gracefully. * Blocks without meta.json are handled properly for all compactor phases. * Synchronize was not taking into account deletion by removing meta.json. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Better observability for syncronize process. * More logs for store startup process. * Remove Compactor Syncer. * Added metric for partialUploadAttempt deletions. * More tests. TODO in separate PR: * More observability for index-cache loading / adding time. Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
|
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
Fixes: #1335 Fixes: #1919 Fixes: #1300 * Clean up of meta files are now started only if block which is being uploaded is older than 2 days (only a mitigation). * Blocks without meta.json are handled properly for all compactor phases. * Prepare for future implementation of https://thanos.io/proposals/201901-read-write-operations-bucket.md/ * Added metric for partialUploadAttempt deletions and delayed it. * More tests. Signed-off-by: Bartlomiej Plotka <[email protected]>
@mattrco this should be fixed on master now. |
Thanks @bwplotka, is |
It is required but for different reasons, mainly for non-consistent object storages. I would suggest leaving the default value. |
As per discussion, #1901 sharding compactor is currently not fully safe. Please do not use it or use only for smaller setups.
We are working on this, but the reason is also partially explained in https://docs.google.com/document/d/1QvHt9NXRvmdzy51s4_G21UW00QwfL9BfqBDegcqNvg0/edit#
TL;DR:
The text was updated successfully, but these errors were encountered: