conflict causes revisions to grow without limit #1127

Closed
yonahforst opened this issue Sep 10, 2015 · 9 comments

@yonahforst

yonahforst commented Sep 10, 2015

I'm experiencing another issue similar to #961 where a document's revisions grow very large (several hundred revisions in the _raw output).

Short summary:
Deleting a document locally and then recreating it before replication has had a chance to pull a copy from sync_gateway causes the document's revisions to grow without limit.

Steps to reproduce:

  1. Clone my copy of helloCBL, branch deleted_database (https://github.com/joshblour/couchbase-lite-tutorial-ios/tree/deleted_database).
  2. Start sync_gateway with "revs_limit": 3 (my config.json: https://gist.github.com/joshblour/620a60cd658938a58e10).
  3. Start helloCBL and let it run for a minute or two (a timer updates the document every 10 seconds).
  4. Look at the document at http://127.0.0.1:5985/sync_gateway/_raw/foobar12345. There should only be 3 revisions. Great!
  5. Stop helloCBL and run it again. (A line at the beginning of sayHello deletes any existing database with the same name and then recreates it lazily.)
  6. Wait a minute or two, then check the document again (same URL). There should now be more than 3 revisions, and if you let it run they just keep growing. Here's a copy of mine: https://gist.github.com/joshblour/7787df936bef16c7788c
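
To make the check in steps 4 and 6 repeatable, the revision count can be read from the _raw body programmatically. This is a minimal Go sketch, not part of helloCBL or sync_gateway. It assumes the revision IDs are listed under `_sync.history.revs` in the _raw output; that path, the `rawDoc` type and the `countRevs` helper are all assumptions made for illustration, so adjust them to match what your _raw output actually contains.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawDoc models only the part of a _raw response this sketch needs.
// ASSUMPTION: revision IDs live under "_sync.history.revs".
type rawDoc struct {
	Sync struct {
		History struct {
			Revs []string `json:"revs"`
		} `json:"history"`
	} `json:"_sync"`
}

// countRevs returns the number of revision IDs found in a _raw document body.
func countRevs(body []byte) (int, error) {
	var doc rawDoc
	if err := json.Unmarshal(body, &doc); err != nil {
		return 0, err
	}
	return len(doc.Sync.History.Revs), nil
}

func main() {
	// Hand-written sample, not real sync_gateway output.
	sample := []byte(`{"_sync":{"history":{"revs":["1-a","2-b","3-c"]}}}`)

	n, err := countRevs(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("revisions:", n)
}
```

In practice the body would come from an HTTP GET against the _raw URL in step 4 rather than from a literal.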

Note: this only happens if the device creates a copy of the document before the replication has had a chance to pull the existing one. If it waits for the initial replication to finish before making any updates, the rev count stays at 3.

This could be the correct behaviour but I would have expected there to be a conflict and the server version to win because of the higher revision number.

@yonahforst changed the title from "revs grow very large (rev limit ignored) after device deletes then creates db" to "conflict causes revisions to grow without limit" on Sep 11, 2015
@adamcfraser
Collaborator

I expect you do have a conflict here - the restarted helloCBL will create a new revision branch starting from a new root, since it doesn't find the matching rev numbers in the other (already pruned) branch. That new branch is preventing pruning of the larger branch, as the pruning algorithm only prunes beyond the lowest non-deleted leaf rev generation count.

Resolving the conflict (marking the leaf node as deleted in the newly created branch) would allow pruning to complete as usual.

I believe this is the expected behaviour of the current pruning algorithm. In general, effective pruning also requires conflicts to be resolved. If you've got a scenario where conflicts like these aren't able to be resolved/tombstoned, share the details and we can see whether there are any refinements possible to the pruning algorithm.

@zgramana
Contributor

Possibly related to #961.

@yonahforst
Author

@adamcfraser thanks for the explanation. I'm resolving the conflicts on the device, but when I try to push to sync_gateway it panics with "panic serving 127.0.0.1:49824: can't find rev:...". Full log: https://gist.github.com/joshblour/9af630246594ee2f9ee2

This could be related to #1007.

@yonahforst
Author

I've flushed the db and I'm no longer seeing this issue. Closing...

@zgramana removed the ready label on Sep 16, 2015
@tleyden reopened this on Sep 16, 2015
@tleyden
Contributor

tleyden commented Sep 16, 2015

Re-opening, since there may still be things to get to the bottom of.

@mastohhh

I'm having the same problem.

I first saw a sync problem, then high CPU usage on the sync_gateway node (we're in test mode with only 4 iPads and it takes more than 50% of CPU). That's why I opened the logs and saw this panic.

The problem occurs when there are a lot of revisions, for example 5000 on a single document.

When the panic occurs, the replication fails, and devices are not synced.

Here's my configuration:

  • 3 nodes of Couchbase Server 3.1.0 Enterprise Edition
  • 1 node of sync_gateway 1.1.0 Community Edition
  • couchbase-lite-ios 1.1.1
  • 4 iPads with CBLIncrementalStore, which has a default rev limit of 20

Here's my sync_gateway config file:

{
  "interface": ":4984",
  "adminInterface": ":4985",
  "log": ["Access"],
  "databases": {
    "database": {
      "server": "http://x.x.x.x:8091/",
      "bucket": "default",
      "revs_limit": 20,
      "users": {
        "GUEST": {"disabled": true, "admin_channels": ["*"]}
      },
      "sync": `

function (doc, oldDoc) {
    if (!doc.owner) throw({forbidden : "Documents must have a owner"});

    var documentForOwner = 'documentFor_' + doc.owner;
    var restaurantForOwner = 'restaurantFor_' + doc.owner;

    var channels = [documentForOwner];

    var restaurantTypes = [
                           'Account',
                           'Item',
                           'LiveTillID',
                           'Option',
                           'Person',
                           'Till'
                           ];

    // route channels
    if (restaurantTypes.indexOf(doc.type) !== -1) {
        channels.push(restaurantForOwner);
    }
    else {
        if (!doc.tillID && !oldDoc.tillID) {
            throw({forbidden : "Documents must have a tillID"});
        }
        else {
            var tillChannel;
            if (doc.tillID) {
                tillChannel = 'till_' + doc.tillID;
            }
            else {
                tillChannel = 'till_' + oldDoc.tillID;
            }

            channels.push(tillChannel);
        }
    }

    if (doc.type === 'LiveTillID') {
        var tillChannel;
        if (doc.tillID) {
            tillChannel = 'till_' + doc.tillID;
        }
        else {
            tillChannel = 'till_' + oldDoc.tillID;
        }

        access(doc.owner, tillChannel);
    }

    if (doc.type === 'Restaurant') {
        access(doc.owner, restaurantForOwner);
    }

    channel(channels);
}
      `
    }
  }
}

As you can see, there's no bucket shadowing.

Here's the panic message:

2015/09/22 07:04:36 http: panic serving x.x.x.x:60709: can't find rev: 2976-a2ad8a07e110c0d17d08fcc6ff9b8d74
goroutine 4748 [running]:
net/http.func·011()
    /usr/local/go/src/net/http/server.go:1130 +0xbb
github.com/couchbase/sync_gateway/db.RevTree.getInfo(0xc2089ca210, 0xc2093c0a50, 0x25, 0xc208d82000)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/db/revtree.go:122 +0xd8
github.com/couchbase/sync_gateway/db.RevTree.getHistory(0xc2089ca210, 0xc2093c0a50, 0x25, 0x0, 0x0, 0x0)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/db/revtree.go:138 +0x184
github.com/couchbase/sync_gateway/db.(*Database).updateDoc(0xc2089367e0, 0xc20929c3c0, 0x39, 0x100, 0xc20929ccc0, 0x0, 0x0, 0x0, 0x0)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/db/crud.go:665 +0x8c0
github.com/couchbase/sync_gateway/db.(*Database).PutExistingRev(0xc2089367e0, 0xc20929c3c0, 0x39, 0xc2093c03c0, 0xc209a78000, 0xb2, 0xb2, 0x0, 0x0)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/db/crud.go:445 +0x447
github.com/couchbase/sync_gateway/rest.(*handler).handleBulkDocs(0xc208fb1170, 0x0, 0x0)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/rest/bulk_api.go:397 +0xd45
github.com/couchbase/sync_gateway/rest.(*handler).invoke(0xc208fb1170, 0xd8c5f0, 0x0, 0x0)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/rest/handler.go:159 +0x4b8
github.com/couchbase/sync_gateway/rest.func·015(0x7f685ff7d150, 0xc208f81400, 0xc208db2270)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/rest/handler.go:86 +0x7d
net/http.HandlerFunc.ServeHTTP(0xc2081de8e0, 0x7f685ff7d150, 0xc208f81400, 0xc208db2270)
    /usr/local/go/src/net/http/server.go:1265 +0x41
github.com/gorilla/mux.(*Router).ServeHTTP(0xc2081ff6d0, 0x7f685ff7d150, 0xc208f81400, 0xc208db2270)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/gorilla/mux/mux.go:86 +0x29e
github.com/couchbase/sync_gateway/rest.func·017(0x7f685ff7d150, 0xc208f81400, 0xc208db2270)
    /home/couchbase/jenkins/workspace/sync-gateway-unix-builds/release/1.1.0/enterprise/app-under-test/sync_gateway/src/github.com/couchbase/sync_gateway/rest/routing.go:236 +0x32f
net/http.HandlerFunc.ServeHTTP(0xc2085595a0, 0x7f685ff7d150, 0xc208f81400, 0xc208db2270)
    /usr/local/go/src/net/http/server.go:1265 +0x41
net/http.serverHandler.ServeHTTP(0xc208538cc0, 0x7f685ff7d150, 0xc208f81400, 0xc208db2270)
    /usr/local/go/src/net/http/server.go:1703 +0x19a
net/http.(*conn).serve(0xc208f812c0)
    /usr/local/go/src/net/http/server.go:1204 +0xb57
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:1751 +0x35e

Could this be related to #807? @snej, do you think this problem could come from couchbase-lite-ios, or is it only a sync_gateway issue?

@mastohhh

In addition, couchbase-lite-ios throws this error:

Error Domain=NSURLErrorDomain Code=-1005 "The network connection was lost." 
UserInfo=0x8074e2a0 {NSErrorFailingURLStringKey=http://x.x.x.x:4984/database/_bulk_docs, 
_kCFStreamErrorCodeKey=-4, NSErrorFailingURLKey=http:/x.x.x.x:4984/database/_bulk_docs, 
NSLocalizedDescription=The network connection was lost., _kCFStreamErrorDomainKey=4, 
NSUnderlyingError=0x7e92cc00 "The network connection was lost."}

@mastohhh

OK, I saw in database.go that the default value of revs_limit is 1000:

const DefaultRevsLimit = 1000

That explains why the errors come quickly when I set this in my config file:

    ...
     "revs_limit":20,
    ...

It also explains why the problem appears once there are more than 1000 revisions when I remove that line.

Are there known issues with setting revs_limit to 1000000? Performance issues?

@zgramana added the ready label and removed the in progress label on Oct 2, 2015
@zgramana added the backlog label and removed the ready label on Apr 18, 2016
@adamcfraser
Collaborator

Closing, as the original issue is the expected behaviour for revision trees with conflicts.
