OnlineDDL: implementing -postpone-completion, ALTER VITESS_MIGRATION ... COMPLETE #9171

Merged: 14 commits, Nov 21, 2021
36 changes: 34 additions & 2 deletions doc/releasenotes/13_0_0_summary.md
@@ -1,16 +1,48 @@
## Major Changes

### ddl_strategy: -postpone-completion flag

## Incompatible Changes
`ddl_strategy` (either `@@ddl_strategy` in VtGate or `-ddl_strategy` in `vtctl ApplySchema`) supports the flag `-postpone-completion`.

This flag indicates that the migration should not auto-complete. It applies to:

- any `CREATE TABLE`
- any `DROP TABLE`
- `ALTER TABLE` in the `online` strategy
- `ALTER TABLE` in the `gh-ost` strategy

Note that this flag is not supported for the `pt-osc` strategy.

Behavior of migrations with this flag:

## Syntax changes
- An `ALTER TABLE` migration begins and runs, but does not cut over.
- `CREATE` and `DROP` migrations are silently held back: they are not even scheduled.
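
The rules above can be read as a small decision table. The following is only an illustrative sketch, not Vitess code: the function `postponeBehavior` and its return strings are hypothetical, and it assumes one reading of the note about `pt-osc`, namely that the flag is unsupported for that strategy regardless of statement type.

```go
package main

import "fmt"

// postponeBehavior is a hypothetical helper (not part of Vitess) that
// summarizes how a migration submitted with -postpone-completion behaves,
// following the release-note text.
func postponeBehavior(strategy, statement string) string {
	// Assumption: pt-osc is treated as unsupported for every statement type.
	if strategy == "pt-osc" {
		return "unsupported"
	}
	switch statement {
	case "CREATE", "DROP":
		// Held back in the queue until explicitly completed.
		return "queued, not scheduled"
	case "ALTER":
		if strategy == "online" || strategy == "gh-ost" {
			// The migration does its work but waits before cutting over.
			return "runs, but does not cut over"
		}
	}
	return "not covered by the release notes"
}

func main() {
	fmt.Println(postponeBehavior("online", "ALTER"))
	fmt.Println(postponeBehavior("gh-ost", "DROP"))
	fmt.Println(postponeBehavior("pt-osc", "ALTER"))
}
```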

### alter vitess_migration ... cleanup

A new query is supported:

```sql
alter vitess_migration '9748c3b7_7fdb_11eb_ac2c_f875a4d24e90' cleanup
```

This query tells Vitess that a migration's artifacts can be cleaned up as soon as possible, which allows Vitess to free disk resources sooner. As a reminder, once a migration's artifacts are cleaned up, the migration is no longer revertible.

### alter vitess_migration ... complete

A new query is supported:

```sql
alter vitess_migration '9748c3b7_7fdb_11eb_ac2c_f875a4d24e90' complete
```

This command indicates that a migration executed with `-postpone-completion` may now complete. Behavior:

- For running `ALTER`s (`online` and `gh-ost`) that are ready to cut over: the cut-over happens soon, though not immediately, since it depends on the polling interval, replication lag, etc.
- For running `ALTER`s (`online` and `gh-ost`) that are only partway through the migration: they cut over automatically once they finish their work, as if `-postpone-completion` had not been specified.
- For queued `CREATE` and `DROP` migrations: they are "unblocked" and become eligible for scheduling. They are scheduled at the scheduler's discretion; there is no guarantee that they will run immediately.
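
To make the sequence concrete, here is a minimal sketch that assembles the statements a client might send: set the strategy with the flag, submit the DDL, then issue the `complete` command. The helpers `completeQuery` and `postponedWorkflow`, the table `demo_table`, and the column `demo_col` are hypothetical names used only for illustration. In practice the UUID is returned when the migration is submitted; the one below is the sample UUID from these notes.

```go
package main

import "fmt"

// completeQuery builds the statement that asks Vitess to complete a
// postponed migration, using the same Sprintf pattern as the tests' revert query.
func completeQuery(uuid string) string {
	return fmt.Sprintf("alter vitess_migration '%s' complete", uuid)
}

// postponedWorkflow lists, in order, the statements for a postponed migration.
// It is a hypothetical helper: it only builds strings and sends nothing.
func postponedWorkflow(strategy, ddl, uuid string) []string {
	return []string{
		fmt.Sprintf("set @@ddl_strategy='%s -postpone-completion'", strategy),
		ddl,
		completeQuery(uuid),
	}
}

func main() {
	steps := postponedWorkflow(
		"online",
		"alter table demo_table add column demo_col int", // hypothetical DDL
		"9748c3b7_7fdb_11eb_ac2c_f875a4d24e90",           // sample UUID from the notes
	)
	for _, s := range steps {
		fmt.Println(s + ";")
	}
}
```

Keeping the `complete` statement in its own helper mirrors how the end-to-end tests build `revert vitess_migration '%s'` with `fmt.Sprintf`.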

## Incompatible Changes

## Deprecations
14 changes: 14 additions & 0 deletions go/test/endtoend/onlineddl/ghost/onlineddl_ghost_test.go
@@ -253,6 +253,20 @@ func TestSchemaChange(t *testing.T) {
onlineddl.CheckCancelMigration(t, &vtParams, shards, uuid, false)
onlineddl.CheckRetryMigration(t, &vtParams, shards, uuid, false)
})
t.Run("successful online alter, postponed, vtgate", func(t *testing.T) {
uuid := testOnlineDDLStatement(t, alterTableTrivialStatement, "gh-ost -postpone-completion", "vtgate", "ghost_col")
// Should be still running!
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusRunning)
// Issue a complete and wait for successful completion
onlineddl.CheckCompleteMigration(t, &vtParams, shards, uuid, true)
// This part may take a while, because we depend on vreplication polling
status := onlineddl.WaitForMigrationStatus(t, &vtParams, shards, uuid, 60*time.Second, schema.OnlineDDLStatusComplete, schema.OnlineDDLStatusFailed)
fmt.Printf("# Migration status (for debug purposes): <%s>\n", status)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)

onlineddl.CheckCancelMigration(t, &vtParams, shards, uuid, false)
onlineddl.CheckRetryMigration(t, &vtParams, shards, uuid, false)
})
t.Run("throttled migration", func(t *testing.T) {
uuid := testOnlineDDLStatement(t, alterTableThrottlingStatement, "gh-ost --max-load=Threads_running=1", "vtgate", "ghost_col")
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusRunning)
75 changes: 52 additions & 23 deletions go/test/endtoend/onlineddl/revert/onlineddl_revert_test.go
@@ -219,37 +219,38 @@ func TestSchemaChange(t *testing.T) {
require.Equal(t, 1, len(shards))

var uuids []string
ddlStrategy := "online"
// CREATE
t.Run("CREATE TABLE IF NOT EXISTS where table does not exist", func(t *testing.T) {
// The table does not exist
uuid := testOnlineDDLStatement(t, createIfNotExistsStatement, "online", "vtgate", "")
uuid := testOnlineDDLStatement(t, createIfNotExistsStatement, ddlStrategy, "vtgate", "")
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
})
t.Run("revert CREATE TABLE IF NOT EXISTS where did not exist", func(t *testing.T) {
// The table existed, so it will now be dropped (renamed)
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
})
t.Run("revert revert CREATE TABLE IF NOT EXISTS where did not exist", func(t *testing.T) {
// Table was dropped (renamed) so it will now be restored
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
})
t.Run("revert revert revert CREATE TABLE IF NOT EXISTS where did not exist", func(t *testing.T) {
// Table was restored, so it will now be dropped (renamed)
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
})
t.Run("online CREATE TABLE", func(t *testing.T) {
uuid := testOnlineDDLStatement(t, createStatement, "online", "vtgate", "just-created")
uuid := testOnlineDDLStatement(t, createStatement, ddlStrategy, "vtgate", "just-created")
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
@@ -258,43 +259,43 @@ func TestSchemaChange(t *testing.T) {
})
t.Run("revert CREATE TABLE", func(t *testing.T) {
// This will drop the table (well, actually, rename it away)
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
})
t.Run("revert revert CREATE TABLE", func(t *testing.T) {
// Restore the table. Data should still be in the table!
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
testSelectTableMetrics(t)
})
t.Run("fail revert older change", func(t *testing.T) {
// We shouldn't be able to revert the one-before-last successful migration.
uuid := testRevertMigration(t, uuids[len(uuids)-2])
uuid := testRevertMigration(t, uuids[len(uuids)-2], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusFailed)
})
t.Run("CREATE TABLE IF NOT EXISTS where table exists", func(t *testing.T) {
// The table exists. A noop.
uuid := testOnlineDDLStatement(t, createIfNotExistsStatement, "online", "vtgate", "")
uuid := testOnlineDDLStatement(t, createIfNotExistsStatement, ddlStrategy, "vtgate", "")
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
})
t.Run("revert CREATE TABLE IF NOT EXISTS where table existed", func(t *testing.T) {
// Since the table already existed, and thus was not created by the reverted migration,
// we expect to _not_ drop it in this revert. A noop.
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
})
t.Run("revert revert CREATE TABLE IF NOT EXISTS where table existed", func(t *testing.T) {
// Table was not dropped, thus isn't re-created, and it just still exists. A noop.
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
@@ -324,6 +325,7 @@ func TestSchemaChange(t *testing.T) {
// If it fails, it has nothing to do with revert.
// We run this test because we expect its functionality to work in the next step.
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var wg sync.WaitGroup
wg.Add(1)
go func() {
@@ -342,13 +344,14 @@ func TestSchemaChange(t *testing.T) {
// This reverts the last ALTER TABLE.
// And we run traffic on the table during the revert
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
runMultipleConnections(ctx, t)
}()
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
cancel() // will cause runMultipleConnections() to terminate
@@ -360,13 +363,14 @@ func TestSchemaChange(t *testing.T) {
// This reverts the last revert (reapplying the last ALTER TABLE).
// And we run traffic on the table during the revert
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
runMultipleConnections(ctx, t)
}()
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
cancel() // will cause runMultipleConnections() to terminate
@@ -378,20 +382,45 @@ func TestSchemaChange(t *testing.T) {
// For good measure, let's verify that revert-revert-revert works...
// So this again pulls us back to first ALTER
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
runMultipleConnections(ctx, t)
}()
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
cancel() // will cause runMultipleConnections() to terminate
wg.Wait()
checkMigratedTable(t, tableName, alterHints[0])
testSelectTableMetrics(t)
})
t.Run("postponed revert", func(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
runMultipleConnections(ctx, t)
}()
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy+" -postpone-completion")
uuids = append(uuids, uuid)
// Should be still running!
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusRunning)
// Issue a complete and wait for successful completion
onlineddl.CheckCompleteMigration(t, &vtParams, shards, uuid, true)
// This part may take a while, because we depend on vreplication polling
status := onlineddl.WaitForMigrationStatus(t, &vtParams, shards, uuid, 60*time.Second, schema.OnlineDDLStatusComplete, schema.OnlineDDLStatusFailed)
fmt.Printf("# Migration status (for debug purposes): <%s>\n", status)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
cancel() // will cause runMultipleConnections() to terminate
wg.Wait()
checkMigratedTable(t, tableName, alterHints[1])
testSelectTableMetrics(t)
})

// DROP
t.Run("online DROP TABLE", func(t *testing.T) {
@@ -402,15 +431,15 @@ func TestSchemaChange(t *testing.T) {
})
t.Run("revert DROP TABLE", func(t *testing.T) {
// This will recreate the table (well, actually, rename it back into place)
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, true)
testSelectTableMetrics(t)
})
t.Run("revert revert DROP TABLE", func(t *testing.T) {
// This will reapply DROP TABLE
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
@@ -426,21 +455,21 @@ func TestSchemaChange(t *testing.T) {
})
t.Run("revert DROP TABLE IF EXISTS", func(t *testing.T) {
// Table will not be recreated because it didn't exist during the DROP TABLE IF EXISTS
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
})
t.Run("revert revert DROP TABLE IF EXISTS", func(t *testing.T) {
// Table still does not exist
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
})
t.Run("revert revert revert DROP TABLE IF EXISTS", func(t *testing.T) {
// Table still does not exist
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
checkTable(t, tableName, false)
@@ -456,7 +485,7 @@ func TestSchemaChange(t *testing.T) {
})
t.Run("fail revert failed online DROP TABLE", func(t *testing.T) {
// Cannot revert a failed migration
uuid := testRevertMigration(t, uuids[len(uuids)-1])
uuid := testRevertMigration(t, uuids[len(uuids)-1], ddlStrategy)
uuids = append(uuids, uuid)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusFailed)
checkTable(t, tableName, false)
@@ -494,9 +523,9 @@ func testOnlineDDLStatement(t *testing.T, alterStatement string, ddlStrategy str
}

// testRevertMigration reverts a given migration
func testRevertMigration(t *testing.T, revertUUID string) (uuid string) {
func testRevertMigration(t *testing.T, revertUUID string, ddlStrategy string) (uuid string) {
revertQuery := fmt.Sprintf("revert vitess_migration '%s'", revertUUID)
r := onlineddl.VtgateExecQuery(t, &vtParams, revertQuery, "")
r := onlineddl.VtgateExecDDL(t, &vtParams, ddlStrategy, revertQuery, "")

row := r.Named().Row()
require.NotNil(t, row)
@@ -506,7 +535,7 @@ func testRevertMigration(t *testing.T, revertUUID string) (uuid string) {
fmt.Println("# Generated UUID (for debug purposes):")
fmt.Printf("<%s>\n", uuid)

time.Sleep(time.Second * 20)
_ = onlineddl.WaitForMigrationStatus(t, &vtParams, shards, uuid, 20*time.Second, schema.OnlineDDLStatusComplete, schema.OnlineDDLStatusFailed)
return uuid
}

33 changes: 32 additions & 1 deletion go/test/endtoend/onlineddl/vrepl/onlineddl_vrepl_test.go
@@ -276,7 +276,6 @@ func TestSchemaChange(t *testing.T) {
retainArtifactSeconds := row.AsInt64("retain_artifacts_seconds", 0)
assert.Equal(t, int64(-1), retainArtifactSeconds)
}

})
t.Run("successful online alter, vtctl", func(t *testing.T) {
insertRows(t, 2)
@@ -288,6 +287,23 @@ func TestSchemaChange(t *testing.T) {
onlineddl.CheckRetryMigration(t, &vtParams, shards, uuid, false)
onlineddl.CheckMigrationArtifacts(t, &vtParams, shards, uuid, true)
})
t.Run("successful online alter, postponed, vtgate", func(t *testing.T) {
insertRows(t, 2)
uuid := testOnlineDDLStatement(t, alterTableTrivialStatement, "online -postpone-completion", "vtgate", "test_val", false)
// Should be still running!
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusRunning)
// Issue a complete and wait for successful completion
onlineddl.CheckCompleteMigration(t, &vtParams, shards, uuid, true)
// This part may take a while, because we depend on vreplication polling
status := onlineddl.WaitForMigrationStatus(t, &vtParams, shards, uuid, 60*time.Second, schema.OnlineDDLStatusComplete, schema.OnlineDDLStatusFailed)
fmt.Printf("# Migration status (for debug purposes): <%s>\n", status)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)

testRows(t)
testMigrationRowCount(t, uuid)
onlineddl.CheckCancelMigration(t, &vtParams, shards, uuid, false)
onlineddl.CheckRetryMigration(t, &vtParams, shards, uuid, false)
})
t.Run("throttled migration", func(t *testing.T) {
insertRows(t, 2)
for i := range shards {
@@ -491,6 +507,21 @@ func TestSchemaChange(t *testing.T) {
// this table did not exist
checkTables(t, schema.OnlineDDLToGCUUID(uuid), 0)
})
t.Run("Online DROP TABLE IF EXISTS for nonexistent table, postponed", func(t *testing.T) {
uuid := testOnlineDDLStatement(t, onlineDDLDropTableIfExistsStatement, "online -postpone-completion", "vtgate", "", false)
// Should be still queued, never promoted to 'ready'!
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusQueued)
// Issue a complete and wait for successful completion
onlineddl.CheckCompleteMigration(t, &vtParams, shards, uuid, true)
// This part may take a while, because we depend on vreplication polling
status := onlineddl.WaitForMigrationStatus(t, &vtParams, shards, uuid, 60*time.Second, schema.OnlineDDLStatusComplete, schema.OnlineDDLStatusFailed)
fmt.Printf("# Migration status (for debug purposes): <%s>\n", status)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusComplete)
onlineddl.CheckCancelMigration(t, &vtParams, shards, uuid, false)
onlineddl.CheckRetryMigration(t, &vtParams, shards, uuid, false)
// this table did not exist
checkTables(t, schema.OnlineDDLToGCUUID(uuid), 0)
})
t.Run("Online DROP TABLE for nonexistent table, expect error, vtgate", func(t *testing.T) {
uuid := testOnlineDDLStatement(t, onlineDDLDropTableStatement, "online", "vtgate", "", false)
onlineddl.CheckMigrationStatus(t, &vtParams, shards, uuid, schema.OnlineDDLStatusFailed)