Items in SG channel not reflected in change feed. #491

Closed
lightandshadow68 opened this issue Nov 6, 2014 · 9 comments

@lightandshadow68

We have at least one item listed in an SG channel that is not syncing to the client for at least one user. I've verified the item is in the channel as indicated by :4985//_dump/channels. We're using Sync Gateway 1.0.3 and CBL master as of Oct 30th.

The item in question looks like this in the channel dump...

["24e25756-a3e6-4f29-8a78-d65620346219",67780]    {"flags":28,"rev":"141-3ee97f69d2b7487bbadf073a16086a41"}    d053366b-a1c7-44ec-bf7c-0d636eb4c08f

... but does not appear when we manually pull the change feed for a user assigned to that channel.

We thought the particular flags value might have been somehow related, but noticed a number of other items with the same flags were showing up in the change feed.

What would be our next best step for tracking this down?

@polfer

polfer commented Nov 13, 2014

Observed a few more documents not syncing to our clients, and we have additional test results.

The following results can be seen from the internal channels view (_dump/channels):

{"id":"23131623-1223-43d3-9e0d-c09b6d4c5368","key":["*",67774],"value":{"rev":"2-c852cc10adb55c4075a6d58ae65719f4"}},
{"id":"worktype-appliances-worktypedetail-dishwasher","key":["*",67774],"value":{"rev":"20-39dc9cd2fd2b29a200961d73dc02ed5f","flags":24}},
{"id":"2314cd61-8b91-45b9-b9b6-52f1a0dd5eac","key":["*",67775],"value":{"rev":"2-f5802a3b11bf084e937dbf38012688f5"}},
{"id":"worktype-appliances-worktypedetail-disposal","key":["*",67775],"value":{"rev":"20-1804afce3b1309c613cfc97d497862cb","flags":24}},
{"id":"233591f3-67ef-474f-96cf-5ea14ac8a6c8","key":["*",67776],"value":{"rev":"2-bc367f3820fd3b96b699a377d0335aed"}},
{"id":"worktype-appliances","key":["*",67776],"value":{"rev":"20-118b811f594976160691f79b0ad1a9ae","flags":24}},

The documents "worktype-appliances*" do not show up in a corresponding call to the _changes feed. In each case of a document missing from the _changes feed we see the duplicate sequence numbers above (with or without flags on the documents). When not in the _changes feed, the documents do not, of course, show up on the client (which can break us).
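
One quick way to scan a dump for this pattern is to group the rows by sequence number and flag any sequence claimed by more than one document. A minimal Go sketch, assuming rows shaped like the ones pasted above; the `row` type and the `duplicateSeqs` helper are made up for illustration and are not part of Sync Gateway:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// row mirrors the shape of the dump rows; only the fields needed here are decoded.
type row struct {
	ID  string        `json:"id"`
	Key []interface{} `json:"key"` // [channel, sequence]
}

// duplicateSeqs groups document IDs by sequence number and returns only
// the sequences that are claimed by more than one document.
func duplicateSeqs(rows []row) map[uint64][]string {
	bySeq := map[uint64][]string{}
	for _, r := range rows {
		if len(r.Key) != 2 {
			continue
		}
		seq, ok := r.Key[1].(float64) // encoding/json decodes JSON numbers into float64
		if !ok {
			continue
		}
		bySeq[uint64(seq)] = append(bySeq[uint64(seq)], r.ID)
	}
	for seq, ids := range bySeq {
		if len(ids) < 2 {
			delete(bySeq, seq) // deleting during range is safe in Go
		}
	}
	return bySeq
}

func main() {
	// Three rows copied from the dump above, wrapped in a JSON array.
	data := `[
	{"id":"23131623-1223-43d3-9e0d-c09b6d4c5368","key":["*",67774],"value":{"rev":"2-c852cc10adb55c4075a6d58ae65719f4"}},
	{"id":"worktype-appliances-worktypedetail-dishwasher","key":["*",67774],"value":{"rev":"20-39dc9cd2fd2b29a200961d73dc02ed5f","flags":24}},
	{"id":"2314cd61-8b91-45b9-b9b6-52f1a0dd5eac","key":["*",67775],"value":{"rev":"2-f5802a3b11bf084e937dbf38012688f5"}}
	]`

	var rows []row
	if err := json.Unmarshal([]byte(data), &rows); err != nil {
		panic(err)
	}

	dups := duplicateSeqs(rows)
	seqs := make([]uint64, 0, len(dups))
	for seq := range dups {
		seqs = append(seqs, seq)
	}
	sort.Slice(seqs, func(i, j int) bool { return seqs[i] < seqs[j] })
	for _, seq := range seqs {
		fmt.Println(seq, dups[seq])
	}
}
```

With only these three rows as input, sequence 67774 is the one reported, since two documents share it.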

It appears that the missing documents are getting skipped over by MultiChangesFeed in changes.go by this conditional:

if !options.Since.Before(minSeq) {
    continue // out of order; skip it
}

Unfortunately, that check appears to reject documents we actually need (documents that come back fine on a direct get). It looks like the check was introduced with commit da9037d for #314. If we back up one revision to 6265244 and retest, the missing documents reappear in the _changes feed.
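
To illustrate why a check like this would drop one of two documents that share a sequence number, here is a simplified Go model. It is not the actual Sync Gateway code: the `entry` type and `filterInOrder` function are hypothetical, and it assumes the feed advances its "since" value to each sequence it emits.

```go
package main

import "fmt"

// entry is a hypothetical, stripped-down change entry.
type entry struct {
	Seq   uint64
	DocID string
}

// filterInOrder models a feed that emits an entry only when its sequence is
// strictly greater than the last sequence seen ("since"), then advances since.
func filterInOrder(since uint64, entries []entry) (sent, skipped []entry) {
	for _, e := range entries {
		if !(since < e.Seq) { // analogous to !options.Since.Before(minSeq)
			skipped = append(skipped, e) // treated as out of order
			continue
		}
		sent = append(sent, e)
		since = e.Seq
	}
	return sent, skipped
}

func main() {
	// Sequence/ID pairs taken from the channel dump above.
	entries := []entry{
		{67774, "23131623-1223-43d3-9e0d-c09b6d4c5368"},
		{67774, "worktype-appliances-worktypedetail-dishwasher"},
		{67775, "2314cd61-8b91-45b9-b9b6-52f1a0dd5eac"},
		{67775, "worktype-appliances-worktypedetail-disposal"},
	}
	sent, skipped := filterInOrder(67773, entries)
	fmt.Println("sent:", len(sent), "skipped:", len(skipped))
	for _, e := range skipped {
		fmt.Println("skipped", e.Seq, e.DocID)
	}
}
```

Under that assumption, the second entry at each duplicated sequence fails the strict comparison and is skipped, which would match the worktype-appliances* documents going missing.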

Hoping this helps narrow this down... Let us know if there is anything else we can do to help.

@amazkovoi

This issue could be related to the one we are seeing #506

@jessliu jessliu added this to the 1.1.0 milestone Nov 18, 2014
@ajres ajres added ready and removed in progress labels Nov 21, 2014
@tleyden tleyden added epic and removed size-medium labels Dec 5, 2014
@ajres

ajres commented Dec 5, 2014

@lightandshadow68

Can you confirm which version of Couchbase Server you are using when this issue occurs?

Do you only see this issue when the Couchbase Server is or has been under heavy load?

Andy

@polfer

polfer commented Dec 6, 2014

Hey Andy... We're using 2.5.1 enterprise edition (build-1083) with the 1.0.3 gateway, and no, the issue occurs consistently regardless of the specific load. This is becoming pretty concerning for us given the fact that documents that had been showing up and syncing now do not. We can actually get this with virtually no load on the server at all.

I notice that the 1.1.0 build is being prepped for release, and that the target was for a fix on this before 1.1.0. Is that still the target here?

@tleyden
Contributor

tleyden commented Dec 7, 2014

@polfer anything you can provide to help us reproduce this would be very valuable

@polfer

polfer commented Dec 7, 2014

Totally open to helping any way we can. Still seeing this on one of our staging servers that we could get you remote access to, and could even share a backup of that. We can also get you logs on any test calls you'd like to make. Just name a time and place or what you need. What we can't easily provide you is a test harness that replicates the failed state every time from a clean bucket.

@ajres

ajres commented Dec 8, 2014

After looking at the system in question, one thing stands out as a potential issue:

There are multiple sets of document revisions that share the same local sequence id.

For example:

_all_docs?limit=1&update_seq=true&startkey=worktype-appliances-worktypedetail-dishwasher
{"key":"worktype-appliances-worktypedetail-dishwasher","id":"worktype-appliances-worktypedetail-dishwasher","value":{"rev":"20-39dc9cd2fd2b29a200961d73dc02ed5f"},"update_seq":67774}
_all_docs?limit=1&update_seq=true&startkey=23131623-1223-43d3-9e0d-c09b6d4c5368
{"key":"23131623-1223-43d3-9e0d-c09b6d4c5368","id":"23131623-1223-43d3-9e0d-c09b6d4c5368","value":{"rev":"2-c852cc10adb55c4075a6d58ae65719f4"},"update_seq":67774}

@snej can you confirm that multiple revisions should not share the same local sequence id?

@snej
Contributor

snej commented Dec 9, 2014

Yikes! This is definitely messed-up. Every revision is supposed to get a unique sequence number. I have no idea how multiple revisions could end up with the same one. The sequences are generated by an atomic 'increment' call to a key in the bucket, which should never return the same value twice.
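
As a rough in-process analogy for why an atomic increment should never hand out the same value twice, here is a small Go sketch. It uses `sync/atomic` on a local counter rather than a key in a Couchbase bucket, and the `allocateSequences` helper is invented for illustration, so it says nothing about how the duplicates on the affected system came about.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// allocateSequences has several goroutines draw sequence numbers from one
// shared counter via atomic increment, and reports how many values were
// handed out in total and how many of them were distinct.
func allocateSequences(workers, perWorker int) (total, distinct int) {
	var counter uint64
	var mu sync.Mutex
	seen := map[uint64]bool{}

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				seq := atomic.AddUint64(&counter, 1) // atomic read-modify-write
				mu.Lock()
				seen[seq] = true
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return workers * perWorker, len(seen)
}

func main() {
	total, distinct := allocateSequences(8, 1000)
	fmt.Println("allocated:", total, "distinct:", distinct)
}
```

If every increment is atomic, the two counts are equal; duplicates would only appear if the counter itself were reset or rewound outside the increment path.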

@ajres

ajres commented Dec 10, 2014

Closing this ticket as the issue has only occurred on a single system, and is isolated to a fixed time window.

I have opened a new ticket, #522, to document the correct process for backing up and restoring an SG instance or cluster backed by a Couchbase Server instance or cluster.

@ajres ajres added done and removed ready labels Dec 10, 2014