Preservation bug fix: properly handle rapid updates #96
Two behaviors of the preservation service set up a race condition that can cause an update to an AIP to lose data:
If the update arrives before the previously saved bags have migrated to S3, the preservation will either fail to pull over a head bag or pull over the wrong one (i.e., not the latest). The data (or updated metadata) from that bag is then not represented in the update and is effectively lost. When this has occurred in the past, all of the data was lost. Further, the output version and sequence number would be wrong, and the new output bags would overwrite previously saved ones.
(In actuality, the data has been recoverable, thanks to versioning being turned on in the S3 bucket.)
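The race can be illustrated with a toy model. This is only a sketch of the failure mode described above, not the preservation service's code: `BagStores`, `head_bag_seq`, and `naive_update` are hypothetical names, and the sketch assumes bags are identified by an integer sequence number and that an update looks only at long-term storage to find the head bag.

```python
from dataclasses import dataclass, field


@dataclass
class BagStores:
    """Hypothetical model of the two places a saved bag can live."""
    pending: dict = field(default_factory=dict)    # saved, not yet migrated
    long_term: dict = field(default_factory=dict)  # stand-in for the S3 bucket

    def save(self, seq, payload):
        # A bag with the same sequence number silently replaces the old one.
        self.pending[seq] = payload

    def migrate(self):
        # Move everything that is pending into long-term storage.
        self.long_term.update(self.pending)
        self.pending.clear()


def head_bag_seq(stores):
    """Highest sequence number visible in long-term storage, or None."""
    return max(stores.long_term, default=None)


def naive_update(stores, payload):
    """Assumed naive behavior: trust long-term storage for the head bag.

    Returns the sequence number assigned to the new bag and the list of
    pending sequence numbers that the update never saw.
    """
    head = head_bag_seq(stores)
    new_seq = 0 if head is None else head + 1
    unseen = sorted(s for s in stores.pending if head is None or s > head)
    stores.save(new_seq, payload)
    return new_seq, unseen


stores = BagStores()
stores.save(0, "initial data")
stores.migrate()                      # bag 0 reaches long-term storage
stores.save(1, "first update")        # bag 1 is saved but not yet migrated
new_seq, unseen = naive_update(stores, "second update")
```

In this run the second update picks bag 0 as the head bag because bag 1 has not migrated yet, so it reuses sequence number 1 and replaces the pending bag that held the first update. That mirrors the symptoms described above: stale input, a wrong sequence number, and an overwritten bag.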
This PR addresses this bug with three major changes: