Channel backfill when using vector clock sequence #1084

Closed
adamcfraser opened this issue Aug 21, 2015 · 1 comment

@adamcfraser
Collaborator

The current channel backfill algorithm includes logic that depends on a single global sequence, particularly when resuming a partially completed backfill (when a previous backfill was interrupted either by hitting the _changes limit or by a disconnection).

Components involved

  • The user and role documents store the sequence number at which the principal was granted access to the channel (seqAddedAt). This is stored as a {vbucket, sequence} tuple when using a clock-based sequence.
  • When performing a backfill, seqAddedAt is included as the TriggeredBy value in the sequence that's returned on the changes feed.
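
As a rough illustration of the two components above, the following Python sketch models seqAddedAt as a (vbucket, sequence) pair and attaches it as the TriggeredBy value of a feed sequence. The names `VbSequence` and `FeedSequence`, and their fields, are hypothetical and chosen only for this sketch; they are not Sync Gateway types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VbSequence:
    """Hypothetical {vbucket, sequence} tuple, as described for seqAddedAt."""
    vbucket: int
    sequence: int


@dataclass(frozen=True)
class FeedSequence:
    """Hypothetical sequence value returned on the changes feed."""
    position: VbSequence
    # Set only for entries sent as part of a backfill (assumption of this sketch).
    triggered_by: Optional[VbSequence] = None

    @property
    def is_backfill(self) -> bool:
        return self.triggered_by is not None


# The principal was granted access at vbucket 12, sequence 340 (made-up values).
seq_added_at = VbSequence(vbucket=12, sequence=340)
backfill_entry = FeedSequence(VbSequence(5, 17), triggered_by=seq_added_at)
regular_entry = FeedSequence(VbSequence(5, 18))
```

Carrying TriggeredBy on each backfill entry is what lets a later request recognise that a backfill was still in progress.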

Tasks

  1. Triggering backfill at the appropriate sequence
    Backfill should be triggered whenever seqAddedAt is less than the vbSequence in the since clock
  2. Identifying backfill completion
    The existing backfill uses the global sequence to identify when a backfill is complete - you're done when you get to seqAddedAt. When using a clock-based sequence, backfill requests to channels instead need to return two sets of entries (in listed order):
    • All entries from 0 to the current since value, marked as TriggeredBy `seqAddedAt`
    • All entries in the channel after the since value, with no TriggeredBy
      This is needed to ensure that partially completed backfills can resume where they left off.
  3. Handling for partially completed backfill
    When a backfill is only partially completed, we need to finish it on a subsequent request before processing additional sequences (otherwise we lose the TriggeredBy value, and potentially restart the backfill on a later request). This needs to handle the following scenario:
    • Between backfill requests, an entry is added to another channel in an earlier vbucket than seqAddedAt. The standard weaving of channel sets would send that entry first. To handle this, we need to prioritize channels that are doing backfill during aggregation, regardless of vbucket. The sort criteria for aggregation should be [isBackfill, vbucket, sequence].
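
Tasks 2 and 3 could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `Entry`, `backfill_channel`, and `aggregate` are hypothetical names, the since clock is modelled as a plain dict from vbucket to sequence, and "from 0 to the current since value" is assumed to include entries exactly at the since value.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple

VbSeq = Tuple[int, int]  # hypothetical (vbucket, sequence) pair


@dataclass(frozen=True)
class Entry:
    channel: str
    vbucket: int
    sequence: int
    triggered_by: Optional[VbSeq] = None  # set only on backfill entries


def backfill_channel(entries: List[Entry], since: Dict[int, int],
                     seq_added_at: VbSeq) -> List[Entry]:
    """Task 2: entries up to `since` tagged TriggeredBy, then the rest untagged."""
    def at_or_before_since(e: Entry) -> bool:
        # Assumption: a vbucket missing from the since clock counts as sequence 0.
        return e.sequence <= since.get(e.vbucket, 0)

    ordered = sorted(entries, key=lambda e: (e.vbucket, e.sequence))
    tagged = [replace(e, triggered_by=seq_added_at)
              for e in ordered if at_or_before_since(e)]
    untagged = [e for e in ordered if not at_or_before_since(e)]
    return tagged + untagged


def aggregate(feeds: List[List[Entry]]) -> List[Entry]:
    """Task 3: merge channel results, sorted by [isBackfill, vbucket, sequence]."""
    merged = [e for feed in feeds for e in feed]
    # `triggered_by is None` is False for backfill entries, so they sort first.
    return sorted(merged,
                  key=lambda e: (e.triggered_by is None, e.vbucket, e.sequence))


since = {1: 10, 5: 20}
seq_added_at = (5, 15)
channel_a = [Entry("A", 5, 12), Entry("A", 5, 25), Entry("A", 1, 3)]
channel_b = [Entry("B", 1, 11)]  # new entry in an earlier vbucket than seqAddedAt

result = aggregate([backfill_channel(channel_a, since, seq_added_at), channel_b])
for e in result:
    print(e.channel, e.vbucket, e.sequence, e.triggered_by)
```

In this made-up data, the entry in channel B sits in vbucket 1, which a plain [vbucket, sequence] ordering would place ahead of channel A's backfill entry in vbucket 5. Sorting on the backfill flag first keeps both backfill entries ahead of it, so an interrupted feed never moves on to regular entries while backfill entries are still outstanding.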
@adamcfraser
Collaborator Author

Fixed with 87b4cdc
