OnlineDDL: Revert for VReplication based migrations #7478

Merged

@shlomi-noach shlomi-noach commented Feb 10, 2021

Ready for review

This PR offers lossless revert for online DDL. Revert is supported for:

CREATE TABLE and DROP TABLE using any online ddl_strategy (either online, gh-ost, pt-osc):

  • CREATE TABLE : the revert is to DROP, but actually implemented by RENAME...
  • CREATE TABLE IF NOT EXISTS: if the CREATE was a noop, then so is the revert
  • DROP TABLE: online DDL implements DROP via RENAME. Revert "creates" the table by renaming it back into place
  • DROP TABLE IF EXISTS: if the DROP was a noop then so is the revert

ALTER TABLE using ddl_strategy=online

Only VReplication-based ALTER migrations are revertible.

Any reverted migration

It is possible to revert a revert, to revert the revert of a revert, and so on: it's turtles all the way down.

Eligible migrations

On top of the above constraints, you may only revert a migration if:

  • It is the last successful migration to run on a table.

At this time we only support reverting the last migration, which means you cannot revert a migration that precedes the last one, and you can't "pop" the stack of migrations. However, notice that when you revert a migration, the revert itself is a (hopefully) successful migration, and you may revert the revert.
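As a rough illustration of the eligibility rule above, here is a minimal Go sketch. The `Migration` struct, its fields, and `canRevert` are hypothetical names made up for this example; they are not the Vitess implementation, which reads migration details from `_vt.schema_migrations`.

```go
package main

import (
	"errors"
	"fmt"
)

// Migration is a hypothetical, simplified view of one migration entry.
type Migration struct {
	UUID       string
	Table      string
	Successful bool
}

// canRevert returns nil only when uuid is the last successful migration
// on its table. history is assumed to be ordered from oldest to newest.
func canRevert(history []Migration, uuid string) error {
	var target *Migration
	for i := range history {
		if history[i].UUID == uuid {
			target = &history[i]
			break
		}
	}
	if target == nil {
		return errors.New("migration not found")
	}
	if !target.Successful {
		return errors.New("only successful migrations can be reverted")
	}
	// Walk backwards to find the newest successful migration on the same table.
	for i := len(history) - 1; i >= 0; i-- {
		m := history[i]
		if m.Table == target.Table && m.Successful {
			if m.UUID == uuid {
				return nil
			}
			return fmt.Errorf("migration %s is not the last successful migration on %s", uuid, target.Table)
		}
	}
	return errors.New("unreachable: target is successful and present in history")
}

func main() {
	history := []Migration{
		{UUID: "a", Table: "t1", Successful: true},
		{UUID: "b", Table: "t1", Successful: true},
		{UUID: "c", Table: "t1", Successful: false},
	}
	fmt.Println(canRevert(history, "a")) // rejected: "b" ran successfully after "a"
	fmt.Println(canRevert(history, "b")) // accepted: "c" failed, so "b" is the last success
}
```

Because a revert is itself recorded as a migration, a successful revert would simply be appended to `history` in this model, which is what makes "revert the revert" fall out of the same rule.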

What's in a revert?

A revert is an online schema migration of its own. You may submit a revert via vtctl OnlineDDL <keyspace> revert <uuid>. This generates a new UUID, which is the job ID for the new revert migration.

Revert migrations are queued and scheduled the same as any other migration. They can be tracked, cancelled, and retried like any other migration.

A revert migration is indicated by a SQL statement of the form revert fb697374_7b25_11eb_85f4_f875a4d24e90. This statement is internal and not supported in VTGate or in any MySQL connection. Specialized SQL syntax is a work in progress.
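To make the statement form above concrete, here is a small Go sketch that extracts the UUID from such a statement. `parseRevertUUID` and the regular expression are illustrative assumptions, not the parser used in this PR; the UUID layout (8_4_4_4_12 hex digits, underscore-separated) is inferred from the single example given.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// revertStatementRegexp matches "revert <uuid>", where the UUID uses
// underscores as separators, as in the example statement.
var revertStatementRegexp = regexp.MustCompile(
	`(?i)^revert\s+([0-9a-f]{8}_[0-9a-f]{4}_[0-9a-f]{4}_[0-9a-f]{4}_[0-9a-f]{12})$`)

// parseRevertUUID returns the UUID of the migration being reverted,
// or an error when the statement is not a revert statement.
func parseRevertUUID(statement string) (string, error) {
	m := revertStatementRegexp.FindStringSubmatch(strings.TrimSpace(statement))
	if m == nil {
		return "", fmt.Errorf("not a revert statement: %q", statement)
	}
	return m[1], nil
}

func main() {
	uuid, err := parseRevertUUID("revert fb697374_7b25_11eb_85f4_f875a4d24e90")
	fmt.Println(uuid, err)

	_, err = parseRevertUUID("alter table t engine=innodb")
	fmt.Println(err)
}
```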

It's important to note that a revert migration does not modify the entry for the migration it reverts. It reads the details of the reverted migration from _vt.schema_migrations, and, since it is a migration in itself, has its own entry in that table.

Implementation

The revert implementation for CREATE and DROP is relatively simple, in the form of counter-queries that revert the effects of the original migration. A revert never actually CREATEs a table, and never actually DROPs a table. It's implemented behind the scenes by RENAMEs. There is special logic to handle the case for CREATE TABLE IF NOT EXISTS and DROP TABLE IF EXISTS.
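The counter-query idea can be sketched roughly as follows. This is a simplified model, not the code in this PR: `revertQuery`, the `Action` type, and the holding-table name passed as `artifact` are all hypothetical, and the real implementation may build its statements differently.

```go
package main

import "fmt"

// Action is the kind of DDL the original migration ran.
type Action string

const (
	Create Action = "create"
	Drop   Action = "drop"
)

// revertQuery returns the counter-query that undoes a CREATE or DROP migration.
// wasNoop is true when the original was an IF [NOT] EXISTS statement that
// changed nothing; in that case the revert is a no-op too (empty query).
// artifact is a hypothetical name for the renamed-away copy of the table.
func revertQuery(action Action, table, artifact string, wasNoop bool) (string, error) {
	if wasNoop {
		return "", nil
	}
	switch action {
	case Create:
		// Undo a CREATE by renaming the table away instead of dropping it.
		return fmt.Sprintf("RENAME TABLE `%s` TO `%s`", table, artifact), nil
	case Drop:
		// Undo a DROP (itself a rename) by renaming the table back into place.
		return fmt.Sprintf("RENAME TABLE `%s` TO `%s`", artifact, table), nil
	}
	return "", fmt.Errorf("unsupported action %q", action)
}

func main() {
	q, _ := revertQuery(Create, "customer", "_customer_holding", false)
	fmt.Println(q)
	q, _ = revertQuery(Drop, "customer", "_customer_holding", false)
	fmt.Println(q)
	q, _ = revertQuery(Create, "customer", "_customer_holding", true)
	fmt.Printf("%q\n", q)
}
```

Using a rename in both directions is what keeps the data available for a later revert of the revert.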

The implementation for ALTER with ddl_strategy='online' is more elaborate, and includes:

  • Finding the artifact table from the to-be-reverted ALTER migration
  • Finding the _vt.vreplication entry from the to-be-reverted ALTER migration
  • Running sanity checks
  • Re-evaluating a VReplication rule/filter for the new migration
  • Reading the pos from the to-be-reverted VReplication stream. That pos was taken while tables were swapped and while no writes took place on the table.
  • Constructing a new VReplication stream from the above
  • Running the new migration

We use the artifacts value in _vt.schema_migrations to identify what the original table was.
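The steps listed above can be pictured with a very small Go model. Everything here is hypothetical: `CompletedAlter`, `RevertStream`, and `planRevert` are invented for illustration and do not correspond to Vitess types. The direction of the new stream (from the current table back to the artifact table) is an assumption drawn from the description of lossless reverts, not a statement about the code.

```go
package main

import (
	"errors"
	"fmt"
)

// CompletedAlter is a hypothetical summary of what a revert reads about
// the ALTER migration it is about to revert.
type CompletedAlter struct {
	Table    string // the table the application uses (post-ALTER schema)
	Artifact string // the pre-ALTER table kept around after the swap
	Pos      string // position recorded while tables were swapped, with no writes in flight
}

// RevertStream is a hypothetical description of the new VReplication stream.
type RevertStream struct {
	Source   string
	Target   string
	StartPos string
}

// planRevert runs basic sanity checks and describes a stream that replays
// changes made since the swap from the current table onto the artifact table.
func planRevert(m CompletedAlter) (RevertStream, error) {
	if m.Table == "" || m.Artifact == "" {
		return RevertStream{}, errors.New("missing table or artifact table")
	}
	if m.Pos == "" {
		return RevertStream{}, errors.New("missing VReplication position")
	}
	return RevertStream{Source: m.Table, Target: m.Artifact, StartPos: m.Pos}, nil
}

func main() {
	stream, err := planRevert(CompletedAlter{
		Table:    "customer",
		Artifact: "_customer_old", // placeholder name
		Pos:      "pos-at-cutover", // placeholder, not a real position format
	})
	fmt.Printf("%+v %v\n", stream, err)
}
```

Starting from the recorded position is what lets the new stream skip a full copy: only writes made after the swap need to be applied to the old table.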

Lossless

The user has 24 hours to revert a migration without losing data.

  • When you revert a DROP, the table reappears with all data intact.
  • When you revert an ALTER, VReplication makes sure to apply any changes made to the table following the migration, so that you do not lose data.
    • This assumes the new table did not receive changes that cannot be reverted. For example, you may have altered a VARCHAR(10) to VARCHAR(20). If the new table was then populated with texts longer than 10 characters, the migration cannot be reverted.
    • Revert-the-revert, revert-the-revert-the-revert and so forth are likewise lossless.
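The VARCHAR example above boils down to a simple check: do the values written since the migration still fit the old column definition? The Go sketch below models only that one case; `fitsOldWidth` is a hypothetical helper, and this PR does not necessarily perform such a pre-check.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// fitsOldWidth reports whether every value would fit a VARCHAR(oldWidth)
// column. VARCHAR lengths are in characters, so runes are counted, not bytes.
func fitsOldWidth(values []string, oldWidth int) bool {
	for _, v := range values {
		if utf8.RuneCountInString(v) > oldWidth {
			return false
		}
	}
	return true
}

func main() {
	// Values written after altering VARCHAR(10) to VARCHAR(20).
	short := []string{"alice", "bob"}
	long := []string{"alice", "a-much-longer-name"}

	fmt.Println(fitsOldWidth(short, 10)) // true: a revert loses nothing
	fmt.Println(fitsOldWidth(long, 10))  // false: the second value has 18 characters
}
```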

Tests

Online DDL Revert is tested via a specialized end-to-end test. See below to get a sense of what gets tested:

--- PASS: TestSchemaChange (525.06s)
    --- PASS: TestSchemaChange/CREATE_TABLE_IF_NOT_EXISTS_where_table_does_not_exist (20.04s)
    --- PASS: TestSchemaChange/revert_CREATE_TABLE_IF_NOT_EXISTS_where_did_not_exist (20.05s)
    --- PASS: TestSchemaChange/revert_revert_CREATE_TABLE_IF_NOT_EXISTS_where_did_not_exist (20.05s)
    --- PASS: TestSchemaChange/revert_revert_revert_CREATE_TABLE_IF_NOT_EXISTS_where_did_not_exist (20.05s)
    --- PASS: TestSchemaChange/online_CREATE_TABLE (23.55s)
    --- PASS: TestSchemaChange/revert_CREATE_TABLE (20.05s)
    --- PASS: TestSchemaChange/revert_revert_CREATE_TABLE (20.05s)
    --- PASS: TestSchemaChange/fail_revert_older_change (20.05s)
    --- PASS: TestSchemaChange/CREATE_TABLE_IF_NOT_EXISTS_where_table_exists (20.04s)
    --- PASS: TestSchemaChange/revert_CREATE_TABLE_IF_NOT_EXISTS_where_table_existed (20.05s)
    --- PASS: TestSchemaChange/revert_revert_CREATE_TABLE_IF_NOT_EXISTS_where_table_existed (20.06s)
    --- PASS: TestSchemaChange/fail_online_CREATE_TABLE (20.05s)
    --- PASS: TestSchemaChange/online_ALTER_TABLE_0 (20.07s)
    --- PASS: TestSchemaChange/online_ALTER_TABLE_1 (20.07s)
    --- PASS: TestSchemaChange/revert_ALTER_TABLE (20.09s)
    --- PASS: TestSchemaChange/revert_revert_ALTER_TABLE (20.08s)
    --- PASS: TestSchemaChange/revert_revert_revert_ALTER_TABLE (20.09s)
    --- PASS: TestSchemaChange/online_DROP_TABLE (20.04s)
    --- PASS: TestSchemaChange/revert_DROP_TABLE (20.07s)
    --- PASS: TestSchemaChange/revert_revert_DROP_TABLE (20.06s)
    --- PASS: TestSchemaChange/online_DROP_TABLE_IF_EXISTS (20.07s)
    --- PASS: TestSchemaChange/revert_DROP_TABLE_IF_EXISTS (20.08s)
    --- PASS: TestSchemaChange/revert_revert_DROP_TABLE_IF_EXISTS (20.07s)
    --- PASS: TestSchemaChange/revert_revert_revert_DROP_TABLE_IF_EXISTS (20.07s)
    --- PASS: TestSchemaChange/fail_online_DROP_TABLE (20.07s)
    --- PASS: TestSchemaChange/fail_revert_failed_online_DROP_TABLE (20.07s)
PASS

Points of interest for the reviewer


Original comment, when this PR was just WIP:

This PR extends #7419 . It only makes sense to review it after #7419 is reviewed & merged.

This experimental PR introduces revert for Online DDL via VReplication. With this PR it is possible to revert a successfully completed DDL performed by VReplication on a table (where this was the last migration executed on said table). The revert operation may take place for as long as the old table is still available, which is now set to 24h.

This is still evolving, more writeup coming later.

Related Issue(s)

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Impacted Areas in Vitess

Components that this PR will affect:

  • Query Serving
  • VReplication
  • Cluster Management
  • Build/CI
  • VTAdmin

@shlomi-noach shlomi-noach changed the title Vreplication online ddl revert OnlineDDL: Revert for VReplication based migrations Feb 10, 2021
…cessful migration on the table, and that there's no pending migrations on that table

Signed-off-by: Shlomi Noach <[email protected]>
@shlomi-noach
Contributor Author

Convenience link to track changes on top of vreplication-online-ddl branch (#7419): planetscale/vitess@vreplication-online-ddl...vreplication-online-ddl-revert

Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
@GuptaManan100 GuptaManan100 left a comment

LGTM!! 🚀

@rohit-nayak-ps rohit-nayak-ps left a comment

this is brilliant :-)

@@ -0,0 +1,776 @@
/*
Copyright 2019 The Vitess Authors.

Contributor

copyright: 2021?

Contributor Author

fixed

testSelectTableMetrics(t)
})
t.Run("fail revert older change", func(t *testing.T) {
// We shouldn't be able to revert one-before-last succcessfulk migration.

Contributor

succcessfulk=>successful

Contributor Author

fixed

fmt.Println("# 'vtctlclient OnlineDDL show recent' output (for debug purposes):")
fmt.Println(result)
assert.Equal(t, len(clusterInstance.Keyspaces[0].Shards), strings.Count(result, uuid))
// We ensure "full word" regexp becuase some column names may conflict

Contributor

becuase=>because

Contributor Author

fixed


// GetRevertUUID works when this migration is a revert for another migration. It returns the UUID
// fo the reverted migration.
// The functio nreturns error when this is not a revert migration.

Contributor

functio nreturns =>function returns an

Contributor Author

fixed

Signed-off-by: Shlomi Noach <[email protected]>
4 participants