Consistency Problem with Async. Deletion #783

Closed
windkit opened this issue Jul 6, 2017 · 2 comments

Comments

@windkit
Contributor

windkit commented Jul 6, 2017

Description

Asynchronous deletion can cause a consistency problem if another modification occurs before the asynchronous deletion is handled.

Root Cause

Directory Deletion

  • The list of objects under the directory is pulled when leo_storage handles the deletion; by that time the list may have changed
  • The timestamp of the deletion request is not taken into account, so a re-created file can be incorrectly deleted
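
The race described above can be illustrated with a small sketch. It is not LeoFS code: the in-memory `store`, the `StoredObject` type, the integer logical timestamps, and both handler functions are hypothetical names made up for the illustration. It assumes each object carries a last-modified timestamp that can be compared with the time of the deletion request.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    key: str
    modified_at: int  # hypothetical logical timestamp of the last write


def handle_dir_delete_naive(store, prefix):
    """Delete whatever is under the prefix at the time of handling."""
    for key in [k for k in store if k.startswith(prefix)]:
        del store[key]


def handle_dir_delete_with_request_time(store, prefix, requested_at):
    """Delete only objects last written before the deletion was requested."""
    doomed = [
        k for k, obj in store.items()
        if k.startswith(prefix) and obj.modified_at < requested_at
    ]
    for key in doomed:
        del store[key]


def scenario(use_request_time):
    # t=1: two objects exist under tmp/
    store = {
        "tmp/a": StoredObject("tmp/a", modified_at=1),
        "tmp/b": StoredObject("tmp/b", modified_at=1),
    }
    # t=2: the client requests deletion of tmp/ (queued, not yet handled)
    requested_at = 2
    # t=3: tmp/a is re-created before the queued deletion is handled
    store["tmp/a"] = StoredObject("tmp/a", modified_at=3)
    # t=4: the queued deletion is finally handled
    if use_request_time:
        handle_dir_delete_with_request_time(store, "tmp/", requested_at)
    else:
        handle_dir_delete_naive(store, "tmp/")
    return sorted(store)


naive_result = scenario(use_request_time=False)
filtered_result = scenario(use_request_time=True)
```

In this sketch the naive handler removes the re-created `tmp/a` along with `tmp/b`, while the handler that compares against the request time keeps `tmp/a` and removes only `tmp/b`.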

Object Deletion

  • The timestamp of the deletion request is recorded as the time leo_storage starts to handle it, not the time of the original request
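
A rough sketch of why the recorded timestamp matters for a single object follows. It assumes, for illustration only, that replicas are reconciled by keeping the record with the newer timestamp (last-writer-wins); the `Record` type, the `reconcile` function, and the integer timestamps are hypothetical and do not come from the LeoFS code base.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    timestamp: int          # hypothetical logical timestamp
    deleted: bool           # True for a deletion marker (tombstone)
    body: Optional[bytes] = None


def reconcile(a: Record, b: Record) -> Record:
    """Assumed last-writer-wins rule: keep the record with the newer timestamp."""
    return a if a.timestamp >= b.timestamp else b


REQUESTED_AT = 10  # client sends DELETE
PUT_AT = 15        # client re-creates the object
HANDLED_AT = 20    # the queued DELETE is handled

put = Record(timestamp=PUT_AT, deleted=False, body=b"new data")

# Tombstone stamped with the handling time: it looks newer than the PUT.
tombstone_handle_time = Record(timestamp=HANDLED_AT, deleted=True)
# Tombstone stamped with the original request time: it is older than the PUT.
tombstone_request_time = Record(timestamp=REQUESTED_AT, deleted=True)

with_handle_time = reconcile(put, tombstone_handle_time)
with_request_time = reconcile(put, tombstone_request_time)
```

With the handling time on the tombstone, the deletion wins and the re-created object is lost (`with_handle_time.deleted` is true). With the request time, the later PUT wins and the new data survives.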

Action to take

Clarify the consistency model, especially for the mix of synchronous and asynchronous operations and the timestamp recorded for reconciliation.

Related Issue

Spark first cleanup the temporary folder and then start to write data into the folder #595

@windkit
Contributor Author

windkit commented Jul 7, 2017

The problem here is that between Request and Handle there can be state changes (other operations).
Because we do not currently record the timestamp at request time, we have no way to distinguish which state changes happened before the request and which happened after it.

@windkit
Contributor Author

windkit commented Aug 2, 2017

Issue fixed with Spark 2.2.0 + Hadoop 2.7.3 and Spark 1.6.1 + Hadoop 2.6.3,
using hadoop-aws-2.7.3.jar and aws-java-sdk-1.7.4.jar
