-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent migrations #9192
Concurrent migrations #9192
Conversation
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
While #9171 is still unmerged, this is the diff for this branch: planetscale/vitess@online-ddl-postpone-completion...planetscale:online-ddl-concurrent-revert-migrations |
Signed-off-by: Shlomi Noach <[email protected]>
…ions Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
…ting migrations on table Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
…s() before runNextMigration() can actually run a new migration Signed-off-by: Shlomi Noach <[email protected]>
…s() before runNextMigration() can actually run a new migration Signed-off-by: Shlomi Noach <[email protected]>
…er than 'queued') Signed-off-by: Shlomi Noach <[email protected]>
…ones Signed-off-by: Shlomi Noach <[email protected]>
…e reason this is required is that there could be a migration in 'ready' state, which conflicts with some running migration. But we still want to be able to run _other_ -allow-concurrent migrations that do not conflict with running migrations Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
This PR has evolved a bit. We now support concurrent migrations as follows:
With the new scheduler:
Also:
Just some possible scenarios:
|
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
This is now ready to review, given #9171 is merged. |
go/vt/vttablet/onlineddl/executor.go
Outdated
@@ -2179,45 +2235,67 @@ func (e *Executor) executeMigration(ctx context.Context, onlineDDL *schema.Onlin | |||
return nil | |||
} | |||
|
|||
// runNextMigration picks one 'ready' migration that is able to run, and executes it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is at the heart of the PR: the scheduler here gets to decide whether there's a migration thta can be executed, concurrently or not concurrently, based on existence or non-existence of running migrations.
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feedback is all on code comments.
May merge after fixing them.
go/vt/vttablet/onlineddl/executor.go
Outdated
// isAnyNonConcurrentMigrationRunning sees if there's any migration running right now | ||
// that does not have -allow-concurrent. | ||
// such a running migration will for example prevent a new non-concurrent migration from running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy-pasta. Need the correct description of this function instead.
go/vt/vttablet/onlineddl/executor.go
Outdated
@@ -1561,36 +1634,15 @@ func (e *Executor) CancelPendingMigrations(ctx context.Context, message string) | |||
} | |||
|
|||
// scheduleNextMigration attemps to schedule a single migration to run next. | |||
// possibly there's no migrations to run. Possibly there's a migration running right now, | |||
// in which cases nothing happens. | |||
// possibly there's no migrations to run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there are
go/vt/vttablet/onlineddl/executor.go
Outdated
// possibly there's no migrations to run. | ||
// The effect of this function is to move a migration from 'queued' state to 'ready' state, is all. | ||
// Notice that the query sqlScheduleSingleMigration embeds some logic inside. We may choose | ||
// to refactor the logic into the app |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm.. which app is this referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarified, moved explanation into the function itself. This is more a developer's note for future me/us.
go/vt/vttablet/onlineddl/executor.go
Outdated
// Possible scenarios: | ||
// - no migration is in 'ready' state -- nothing to be done | ||
// - a migration is 'ready', but conflicts with other running migrations -- try another 'ready' migration | ||
// It is therefore possible that there is a 'ready' migration, and still this function only handles one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence is confusing. I would have thought that it is possible that there is a "ready" migration and still this function chooses none.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow the wording was really bad. Changed to:
// Note that per the above breakdown, and due to potential conflicts, it is possible to have one or
// more 'ready' migration, and still none is executed.
go/vt/vttablet/onlineddl/executor.go
Outdated
// a vreplication migration from a pre-PRS/ERS that we still need to learn about? | ||
// We're going to be careful here, and avoid running new migrations until we have | ||
// a better picture. It will likely take a couple seconds till next iteration. | ||
// execution. This delay ony takes place shortly after Open(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// execution. This delay ony takes place shortly after Open(). | |
// This delay only takes place shortly after Open(). |
go/vt/vttablet/onlineddl/executor.go
Outdated
RequestContext: row["migration_context"].ToString(), | ||
// getNonConflictingMigration finds a single 'ready' migration which does not conflict with running migrations. | ||
// Conflicts are: | ||
// - a migration is 'ready' but is not set to run _concurrnetly_, and there's a running migration that is also non-concurrent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// - a migration is 'ready' but is not set to run _concurrnetly_, and there's a running migration that is also non-concurrent | |
// - a migration is 'ready' but is not set to run _concurrently_, and there's a running migration that is also non-concurrent |
go/vt/vttablet/onlineddl/executor.go
Outdated
if i == 0 { | ||
break | ||
// no non-conflicting migration found... | ||
// Either all ready migrations are conflicting, or there's no ready migrations... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no ready migration OR there are no ready migrations.
Addressed all comments. |
Signed-off-by: Shlomi Noach <[email protected]>
Woohoo! That's an important one merged. |
See updated #9192 (comment)
Description
Ongoing work towards supporting concurrent REVERT migrations.
Up till now, Vitess would only ever run a single migration at a time, for a given keyspace.
Anyway, we wish to support the following:
ddl_strategy
, something like-allow-concurrent
or similarThis changes the scheduler logic, which up till now assumed that if any migration was running, then no need to schedule the next one. From now on, it will need to consider these scenarios:
This means a more programmatic approach to scheduling next migration (where today some of the logic is a simple SQL)
Related Issue(s)
Checklist