Moving from TAP to DCP #404

Closed
jessliu opened this issue Aug 18, 2014 · 7 comments

@jessliu

jessliu commented Aug 18, 2014

Per expectations for Couchbase Server (CBS) 3.0, all tools and clients that speak directly to CBS should move from the TAP protocol to DCP. A drop in performance is expected when both protocols are used together, and the CBS team strongly recommends DCP as the protocol of choice going forward.

The expert on this new protocol is Mike W.

@jessliu
Author

jessliu commented Aug 29, 2014

Needs further discussion of both forward and backward compatibility. A discussion with the Smart Client team could also give us insight into what has already been prototyped and how to prioritize this.

@jessliu jessliu added icebox and removed backlog labels Sep 8, 2014
@jessliu jessliu added this to the 1.2.0 milestone Sep 8, 2014
@jessliu jessliu removed this from the 1.2.0 milestone Oct 15, 2014
@adamcfraser adamcfraser self-assigned this Oct 17, 2014
@zgramana zgramana added backlog and removed icebox labels Oct 24, 2014
@jessliu jessliu added ready and removed backlog labels Oct 24, 2014
@zgramana zgramana added this to the 1.1.0 milestone Oct 24, 2014
@zgramana zgramana removed the epic label Oct 24, 2014
@adamcfraser
Collaborator

Completed an initial analysis and proof of concept for the DCP migration.

go-couchbase supports DCP (under its older name, UPR), but setting up a DCP feed takes a bit more work than setting up a TAP feed. The process is:

  1. Start the DCP feed (StartUprFeed)
  2. For each vBucket, add a request stream for that vBucket to the feed (feed.UprRequestStream), specifying start/end sequence and vBucket info.
  3. Listen to the DCP event stream
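
The three steps above can be sketched roughly as follows. This is not the go-couchbase API: `Feed`, `StreamRequest`, `Event` and `fakeFeed` are hypothetical stand-ins invented for illustration, and the in-memory fake only exists so the sketch runs without a cluster. The real calls (StartUprFeed, feed.UprRequestStream) take different arguments.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for a DCP event (not the go-couchbase type).
type Event struct {
	VBucket uint16
	Seq     uint64
	Key     string
}

// StreamRequest carries the per-vBucket parameters named in step 2.
type StreamRequest struct {
	VBucket  uint16
	VBUUID   uint64
	StartSeq uint64
	EndSeq   uint64
}

// Feed is a hypothetical interface covering steps 2 and 3.
type Feed interface {
	RequestStream(req StreamRequest) error
	Events() <-chan Event
}

// fakeFeed is an in-memory Feed used only to make the sketch runnable.
type fakeFeed struct {
	data map[uint16][]Event
	out  chan Event
}

func newFakeFeed(data map[uint16][]Event) *fakeFeed {
	total := 0
	for _, evs := range data {
		total += len(evs)
	}
	// Buffer every event so RequestStream never blocks in this sketch.
	return &fakeFeed{data: data, out: make(chan Event, total)}
}

// RequestStream queues the events with StartSeq < Seq <= EndSeq.
// That range rule is an assumption of this fake, not a protocol statement.
func (f *fakeFeed) RequestStream(req StreamRequest) error {
	evs, ok := f.data[req.VBucket]
	if !ok {
		return fmt.Errorf("unknown vBucket %d", req.VBucket)
	}
	for _, ev := range evs {
		if ev.Seq > req.StartSeq && ev.Seq <= req.EndSeq {
			f.out <- ev
		}
	}
	return nil
}

func (f *fakeFeed) Events() <-chan Event { return f.out }

// Close ends the event stream; a real feed would stay open for new mutations.
func (f *fakeFeed) Close() { close(f.out) }

// openStreams performs step 2: one stream request per vBucket.
func openStreams(feed Feed, vbuckets []uint16, start, end uint64) error {
	for _, vb := range vbuckets {
		req := StreamRequest{VBucket: vb, StartSeq: start, EndSeq: end}
		if err := feed.RequestStream(req); err != nil {
			return fmt.Errorf("vBucket %d: %v", vb, err)
		}
	}
	return nil
}

func main() {
	// Step 1: start the feed (here, the in-memory fake).
	feed := newFakeFeed(map[uint16][]Event{
		0: {{VBucket: 0, Seq: 1, Key: "doc-a"}},
		1: {{VBucket: 1, Seq: 1, Key: "doc-b"}},
	})
	// Step 2: request a stream for every vBucket, starting from seq 0.
	if err := openStreams(feed, []uint16{0, 1}, 0, ^uint64(0)); err != nil {
		fmt.Println("stream setup failed:", err)
		return
	}
	feed.Close()
	// Step 3: listen to the event stream.
	for ev := range feed.Events() {
		fmt.Printf("vb=%d seq=%d key=%s\n", ev.VBucket, ev.Seq, ev.Key)
	}
}
```

The point of the sketch is the shape of the work: unlike a TAP feed, the caller has to loop over vBuckets and handle a per-vBucket failure.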

A basic POC has been completed that swaps out the TAP feed for a DCP feed (starting from seq 0), converts DCP events to walrus.TapEvents, and sends those through the standard SG processing.
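
For anyone following along, the conversion step might look something like the sketch below. The types are local stand-ins, not the real walrus.TapEvent or the go-couchbase event type, and the opcode mapping (expirations treated as deletions, control events skipped) is an assumption for illustration rather than what the POC actually does.

```go
package main

import "fmt"

// DCPOpcode and DCPEvent are hypothetical stand-ins for the DCP event type.
type DCPOpcode int

const (
	DCPMutation DCPOpcode = iota
	DCPDeletion
	DCPExpiration
	DCPSnapshotMarker
	DCPStreamEnd
)

type DCPEvent struct {
	Opcode  DCPOpcode
	VBucket uint16
	Seq     uint64 // per-vBucket sequence number
	Key     []byte
	Value   []byte
}

// TapOpcode and TapEvent are hypothetical stand-ins for walrus.TapEvent.
type TapOpcode int

const (
	TapMutation TapOpcode = iota
	TapDeletion
)

type TapEvent struct {
	Opcode   TapOpcode
	Key      []byte
	Value    []byte
	Sequence uint64
}

// toTapEvent maps a DCP event onto the TAP-style event the existing
// processing expects. It returns false for events with no TAP equivalent
// in this sketch (snapshot markers, stream end).
func toTapEvent(ev DCPEvent) (TapEvent, bool) {
	switch ev.Opcode {
	case DCPMutation:
		return TapEvent{Opcode: TapMutation, Key: ev.Key, Value: ev.Value, Sequence: ev.Seq}, true
	case DCPDeletion, DCPExpiration:
		// Assumption: an expiration is handled the same way as a deletion.
		return TapEvent{Opcode: TapDeletion, Key: ev.Key, Sequence: ev.Seq}, true
	default:
		return TapEvent{}, false
	}
}

func main() {
	events := []DCPEvent{
		{Opcode: DCPSnapshotMarker, VBucket: 3},
		{Opcode: DCPMutation, VBucket: 3, Seq: 1, Key: []byte("doc-a"), Value: []byte(`{"n":1}`)},
		{Opcode: DCPDeletion, VBucket: 3, Seq: 2, Key: []byte("doc-a")},
	}
	for _, ev := range events {
		if tap, ok := toTapEvent(ev); ok {
			fmt.Printf("tap opcode=%d key=%s seq=%d\n", tap.Opcode, tap.Key, tap.Sequence)
		}
	}
}
```

Note that the sequence carried over is per-vBucket, so it is not comparable across vBuckets the way a single feed-wide counter would be.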

Open Tasks/Issues

  1. How to start a DCP stream for new mutations only. When creating a request stream for a vBucket, you can specify the sequence number that you want to start with. However, this requires you to provide a specific sequence number and vBucket uuid - you can't simply send a seqno=maxInt, for example, to 'start listening for new mutations'. This is to support full data integrity on DCP restart (including rollback in the case of vBucket failure), but doesn't exactly line up with the current SG functionality.
  2. The sequence number restart would provide us more flexibility in the bucket shadowing scenario - the shadower can track DCP sequence information and use that on restart. Work is required to determine how best to persist the sequence info, etc.
  3. DCP events have a different structure, in particular different Opcodes than the TapEvents we currently support. Need to review whether any of the new events can provide functional benefit to SG.
  4. Comprehensive replication testing using DCP stream to ensure all event types that SG cares about are being handled properly.
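
As a starting point for item 2, the sequence info to persist could be as small as one (vBucket UUID, sequence) pair per vBucket. The sketch below is only one possible shape: `Checkpoint`, `Checkpoints`, `Observe` and `LoadCheckpoints` are hypothetical names, JSON is an arbitrary choice of encoding, and the handling of a changed vBucket UUID is deliberately simplified (real rollback handling would need more than this).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Checkpoint is the restart position for one vBucket (hypothetical type).
type Checkpoint struct {
	VBUUID uint64 `json:"vbuuid"`
	Seq    uint64 `json:"seq"`
}

// Checkpoints maps vBucket number to its last processed position.
type Checkpoints map[uint16]Checkpoint

// Observe records a processed event. Within the same vBucket UUID the
// sequence only moves forward; a different UUID replaces the entry
// (a simplification of failover/rollback handling).
func (c Checkpoints) Observe(vb uint16, vbuuid, seq uint64) {
	cur, ok := c[vb]
	if ok && cur.VBUUID == vbuuid && seq <= cur.Seq {
		return
	}
	c[vb] = Checkpoint{VBUUID: vbuuid, Seq: seq}
}

// Marshal encodes the checkpoints so they can be stored, for example in a
// document in the bucket or in a local file.
func (c Checkpoints) Marshal() ([]byte, error) {
	return json.Marshal(c)
}

// LoadCheckpoints decodes previously stored checkpoints.
func LoadCheckpoints(data []byte) (Checkpoints, error) {
	c := Checkpoints{}
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	return c, nil
}

func main() {
	cps := Checkpoints{}
	cps.Observe(0, 1234, 10)
	cps.Observe(0, 1234, 8) // ignored: older than what we already have
	cps.Observe(1, 5678, 3)

	data, err := cps.Marshal()
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}

	// On restart, the stored values would feed the per-vBucket stream requests.
	restored, err := LoadCheckpoints(data)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	for vb, cp := range restored {
		fmt.Printf("vb=%d restart from vbuuid=%d seq=%d\n", vb, cp.VBUUID, cp.Seq)
	}
}
```

Where and how often to write the encoded checkpoints is the open question; writing on every event would be expensive, so some batching seems likely.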

@jessliu
Author

jessliu commented Oct 31, 2014

Hi @adamcfraser, based on these findings, does moving a currently live Sync Gateway cluster from TAP to DCP mean that an upgrade from our existing Sync Gateway to the next version will have to be permanent?

@jessliu jessliu added size-small and removed epic labels Oct 31, 2014
@adamcfraser
Collaborator

I don't think the choice of feed inherently makes the change permanent. In a Bucket Shadowing scenario you'd have the same initial overhead that we already have today: when the SG is restarted in the new mode, the entire feed (TAP or DCP) would need to be reprocessed to get the two buckets back in step.

@adamcfraser
Collaborator

Completed a preliminary performance comparison between DCP and TAP in a dev environment. For retrieval/backfill of 13000 records, the difference isn't statistically significant for the number of tests I ran: over 10 runs of each, DCP was on average 2% faster than TAP. Benchmarks against a production dataset would be recommended, but based on these preliminary results I don't think performance considerations should influence plans to support DCP.

@adamcfraser
Collaborator

Depends on #486

@adamcfraser
Collaborator

Reviewed the work Steve Yen has done to support DCP for cbft. His implementation does a better job of abstracting away individual vBucket management and of recovering from unexpected topology changes. It is scheduled to be added to go-couchbase. Once that's in place, we need to review it and incorporate it into our DCP uptake.
