-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VReplication: data migration from another Vitess cluster #7546
VReplication: data migration from another Vitess cluster #7546
Conversation
611f613
to
7b3fc63
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is brilliant. I can't find anything to nitpick, but I'll let @sougou and @shlomi-noach do a more thorough review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have nothing to add besides what Deepthi already said: this is brilliant!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Some comments and question inside. Since this PR was long, I've marked with multiple ( 👍 ) reactions to both acknowledge that the code looks fine, as well as to self-mark points of interest.
require.NotNil(t, vc) | ||
defaultReplicas = 0 | ||
defaultRdonly = 0 | ||
//defer vc.TearDown() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this line commented?
go/vt/wrangler/traffic_switcher.go
Outdated
if vschema == nil { | ||
return fmt.Errorf("no vschema found for keyspace %s", keyspace) | ||
} | ||
if strings.HasPrefix(tableSpecs, "{") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does {
stand for? What makes the string special when there's {
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment in the code: essentially the user can specify a proper vschema for a sharded keyspace or just a list of tables. The vschema is specified as an inline json string.
…ommands and tests Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
…r Migrate. Migrate vtctl command. Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
…er and add test for Migrate. Get test working Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
77d1fc2
to
5631d56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
Summary
See related RFC for the motivation and design of cross-cluster migration. This PR adds
Details
Migrate
Migrate is mainly a subset of the MoveTables workflow. The parts of the MoveTables implementation which this PR changes are:
Simplified Workflow
Switching traffic is no longer relevant for Migrate workflows since we are not serving these tables in the first place.
Source Topo
There are two topologies involved now. The source keyspace belongs to the external cluster, so we need all topo queries to be routed through the topo of the external cluster.
Proto
We add a new attribute to BinlogSource, ExternalCluster, to indicate that the source data resides in an external cluster. Note that this cluster needs to be always mounted in the target Vitess cluster during the entire workflow.
Routing Rules
While moving tables between keyspaces in a cluster the tables are first served from the source keyspace. Once traffic is switched the tables get served from the target. Vitess uses Routing Rules to ensure that vtgate sees the correct keyspace. However for Migrate, the tables being imported do not exist in the cluster until the import is completed.
VSchema updates
For the same reasons as mentioned above, we only add tables in the target cluster after the Migrate is completed
VDiff
VDiff has been changed to use the topo of the external cluster for getting source data.
Mount
A new sub-key, ExternalCluster, has been added to the global root node of the topo. These will have one child node per type of cluster supported. For now we only support an external vitess cluster. A mysql cluster type will be supported next to point to a regular mysql server. Each cluster is stored under the cluster type with the cluster name and containing the serialized proto as the leaf.
Testing
We add another cluster and test MoveTables into a "local" cluster from this additional "external" cluster. Since the underlying code is the same, we need to test for two things mainly:
Action Item: We need to add the new test to the required CI tests cluster_endtoend_vreplication_migrate.yml after merge
Test Framework
Currently the VReplication e2e tests work on a single cluster. For cross-cluster tests we need two independent clusters. We had a lot of hard-coded globals to setup ports, temp directories, etc that are the basic parameters needed to setup a cluster. We now allow multiple clusters to be created.
However: we have an anti-pattern in Vitess: vt executables look for an environment variable VTDATAROOT for certain parameters, like the log directory, path for log files, stdout etc. So each time we need to create vt processes in the "other" cluster we need to set the appropriate VTDATAROOT.
Related Issue(s)
#7545
Checklist
Impacted Areas in Vitess
Components that this PR will affect: