Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage collection #3

Closed
hackerwins opened this issue Dec 5, 2019 · 2 comments · Fixed by #57 or #59
Closed

Garbage collection #3

hackerwins opened this issue Dec 5, 2019 · 2 comments · Fixed by #57 or #59
Labels
enhancement 🌟 New feature or request

Comments

@hackerwins
Copy link
Member

hackerwins commented Dec 5, 2019

CRDT only changes flag, the tombstone when an element is deleted to avoid breaking when concurrent editing occurs. Even if the user deletes an element it still takes up space in memory. So after a certain point, we have to delete elements the tombstone marked.

If all clients attached to the document, have received all the changes for a particular checkpoint, we can delete the elements that have deleted before that checkpoint.

The DB stores checkpoints to keep track of the point of the change received by clients.

@hackerwins hackerwins added the enhancement 🌟 New feature or request label Jan 8, 2020
@hackerwins
Copy link
Member Author

Other CRDT libraries garbage collection and tombstone.

Y.js:

We can't garbage collect deleted structs (tombstones) while ensuring a unique order of the structs. But we can 1. merge preceeding structs into a single struct to reduce the amount of meta information, 2. we can delete content from the struct if it is deleted, and 3. we can garbage collect tombstones if we don't care about the order of the structs anymore (e.g. if the parent was deleted).

https://github.com/yjs/yjs/blob/705dce78384bac8f25fa11d05e3b2551325a5d26/README.md#yjs-crdt-algorithm

Automerge

Yes, RGA uses tombstones. I think this is okay for now, as in most cases, the tombstones are not really the dominant cost. Also, I think it is desirable to retain history (so you can see what was changed when), and that requires remembering deleted parts of the document anyway.
automerge/automerge#221 (comment)

@hackerwins
Copy link
Member Author

hackerwins commented May 20, 2020

Yorkie's GC logic looks like this:

  • Agent returns MinPulledTicket to the client in the response of PushPull API
    • find min_pulled_server_seq
      document `d1` server_seq: 10
      client `c1` server_seq: 5
      client `c2` server_seq: 7
      
      min_pulled_server_seq: 5
      
    • create MinPulledTicket, time.Ticket of min_pulled_server_seq
  • The client purge elements(marked with tombstone) that were removed earlier than the MinPulledTimeTicket received in response to PushPull.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🌟 New feature or request
Projects
None yet
1 participant