The current channel backfill algorithm includes logic that's dependent on a single global sequence, particularly when resuming a partially completed backfill (when a previous backfill was interrupted by either hitting the _changes limit or disconnection).
Components involved
The user and role documents store the sequence number at which the principal was granted access to the channel (seqAddedAt). This is stored as a {vbucket, sequence} tuple when using a clock-based sequence.
When performing a backfill, seqAddedAt is included as the TriggeredBy value in the sequence that's returned on the changes feed.
Tasks
Triggering backfill at the appropriate sequence
Backfill should be triggered whenever seqAddedAt is less than the vbSequence in the since clock
Identifying backfill completion
The existing backfill uses the global sequence to identify when a backfill is complete - you're done when you get to seqAddedAt. When using a clock-based sequence, backfill requests to channels instead need to return two sets of entries (in listed order):
All entries from 0 to the current since value, marked as TriggeredBy seqAddedAt.
Then, all entries in the channel after the since value, with no TriggeredBy.
This is needed to ensure partially completed backfills are able to complete where they left off.
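The two-set ordering above can be sketched roughly as follows. This is only an illustration, not Sync Gateway code: the `VbSeq`, `Entry`, `SinceClock`, and `backfillChannel` names are hypothetical, and it assumes that an entry at or before the since value for its vbucket counts as backfill (whether the boundary is inclusive is not stated above).

```go
package main

// VbSeq is a hypothetical {vbucket, sequence} tuple.
type VbSeq struct {
	Vb  uint16
	Seq uint64
}

// Entry is a hypothetical changes-feed entry for one channel.
type Entry struct {
	DocID       string
	At          VbSeq
	TriggeredBy *VbSeq // nil when the entry is not part of a backfill
}

// SinceClock is a hypothetical since clock: vbucket -> last sequence seen.
type SinceClock map[uint16]uint64

// backfillChannel returns the channel's entries as two sets, in order:
// first the entries at or before the since value (assumed inclusive),
// marked as TriggeredBy seqAddedAt, then the entries after the since
// value with no TriggeredBy. Input order is preserved within each set.
func backfillChannel(channel []Entry, since SinceClock, seqAddedAt VbSeq) []Entry {
	var backfill, regular []Entry
	for _, e := range channel { // e is a copy, so the input is not mutated
		if e.At.Seq <= since[e.At.Vb] {
			trigger := seqAddedAt
			e.TriggeredBy = &trigger
			backfill = append(backfill, e)
		} else {
			e.TriggeredBy = nil
			regular = append(regular, e)
		}
	}
	return append(backfill, regular...)
}
```

Because every backfill entry carries TriggeredBy, a client that is interrupted partway through the first set still reports a sequence that identifies the backfill, which is what allows the next request to resume it.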
Handling for partially completed backfill
When we have a partially completed backfill, we need to complete that backfill on a subsequent request before processing additional sequences (otherwise we lose the TriggeredBy value, and potentially restart the backfill on a subsequent request). This needs to handle the scenario where:
Between backfill requests, an entry is added to another channel in an earlier vbucket than seqAddedAt. The standard weaving of channel sets would send that entry first. To handle this, we need to prioritize channels that are doing backfill during aggregation, regardless of vbucket. The sort criteria for aggregation should be [isBackfill, vbucket, sequence].
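A minimal sketch of the proposed [isBackfill, vbucket, sequence] ordering, assuming a hypothetical `PendingEntry` type and `sortForAggregation` helper that are not part of the existing code:

```go
package main

import "sort"

// PendingEntry is a hypothetical entry waiting to be aggregated
// across the channels on a changes feed.
type PendingEntry struct {
	Channel    string
	IsBackfill bool
	Vb         uint16
	Seq        uint64
}

// sortForAggregation orders entries by [isBackfill, vbucket, sequence]:
// entries from channels doing backfill come first regardless of vbucket,
// then entries are ordered by vbucket, then by sequence.
func sortForAggregation(entries []PendingEntry) {
	sort.SliceStable(entries, func(i, j int) bool {
		a, b := entries[i], entries[j]
		if a.IsBackfill != b.IsBackfill {
			return a.IsBackfill // backfill entries sort ahead of non-backfill
		}
		if a.Vb != b.Vb {
			return a.Vb < b.Vb
		}
		return a.Seq < b.Seq
	})
}
```

With this ordering, a non-backfill entry in vbucket 1 is sent after backfill entries in vbucket 7, so an in-progress backfill is not displaced by a new entry in an earlier vbucket.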