Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migration,server: plumb in initial version to the migration manager #57118

Merged
merged 5 commits into from
Nov 28, 2020

Conversation

irfansharif
Copy link
Contributor

@irfansharif irfansharif commented Nov 25, 2020

It'll let us return early, and makes the manager "more functional" in its
behavior.

We also promote the clusterversion type to a first-class citizen in the
external facing API for pkg/migration. This package concerns itself with
migrations between cluster versions and we should have our API reflect as much.

The proto changes are safe, we haven't had a major release with the previous
proto definitions.

Release note: None

@irfansharif irfansharif requested review from knz and tbg November 25, 2020 15:20
@irfansharif irfansharif requested a review from a team as a code owner November 25, 2020 15:20
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@irfansharif irfansharif force-pushed the 201125.mgr-plumb-from branch 5 times, most recently from b81d921 to c5ee31b Compare November 25, 2020 21:50
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r1, 2 of 2 files at r2, 8 of 8 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)


pkg/migration/manager.go, line 254 at r3 (raw file):

	for _, cv := range cvs {
		h := &Helper{Manager: m}
		clusterVersion := cv

What good is this? Let's avoid unnecessary aliasing. Why not just for _, clusterVersion := range cvs?

And for deleting old ones. This stuff is endlessly confusing.

Release note: None
Jump to symbol for a given version tag will now group together the
intent of the version itself (instead of expecting readers to know to
grep for the same version in the block below).

Release note: None
It'll let us return early, and makes the manager "more functional" in
its behavior. We should also be using the ClusterVersion type when
talking about migrations, so we make that change here.

Release note: None
in the external facing API for pkg/migration.

The migration package concerns itself with migrations between cluster
versions, so we should have our API reflect as much. The proto changes
are safe, we haven't had a major release with the previous proto
definitions.

Release note: None
@irfansharif irfansharif force-pushed the 201125.mgr-plumb-from branch from c5ee31b to 3783e63 Compare November 27, 2020 22:35
Copy link
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @tbg)


pkg/migration/manager.go, line 254 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

What good is this? Let's avoid unnecessary aliasing. Why not just for _, clusterVersion := range cvs?

Done (was just a brain fart).

craig bot pushed a commit that referenced this pull request Nov 27, 2020
57118: migration,server: plumb in initial version to the migration manager r=irfansharif a=irfansharif

It'll let us return early, and makes the manager "more functional" in its
behavior. 

We also promote the clusterversion type to a first-class citizen in the
external facing API for pkg/migration. This package concerns itself with
migrations between cluster versions and we should have our API reflect as much.

The proto changes are safe, we haven't had a major release with the previous
proto definitions.

Release note: None

57155: clusterversion,*: remove VersionContainsEstimatesCounter r=irfansharif a=irfansharif

This, and all surrounding migration code and tests, are now safe to
remove. It mostly served as documentation, which we've moved to the
field itself. Part of #47447. Fixes #56401.

(While here, Let's also tell git that `versionkey_string.go` is a
generated file.)

Release note: None

---

First commit is from #57154.

57156: sql,clusterversion: remove VersionAuthLocalAndTrustRejectMethods r=irfansharif a=irfansharif

It's an old cluster version, introduced in the 19.2 release cycle. It's
now safe to remove. Part of #47447. Fixes #56398.


Release note: None

---

See only last commit. First two are from #57154 and #57155 respectively.

Co-authored-by: irfan sharif <[email protected]>
@craig
Copy link
Contributor

craig bot commented Nov 27, 2020

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Nov 28, 2020

Build succeeded:

@craig craig bot merged commit bbb7e7c into cockroachdb:master Nov 28, 2020
@irfansharif irfansharif deleted the 201125.mgr-plumb-from branch November 28, 2020 00:16
craig bot pushed a commit that referenced this pull request Nov 28, 2020
57186: migration: introduce fence versions r=irfansharif a=irfansharif

The migrations infrastructure makes use of internal fence versions when
stepping through consecutive versions. We'll need to first bump the
fence version for each intermediate cluster version, before bumping the
"real" one. Doing so allows us to provide the invariant that whenever a
cluster version is active, all nodes in the cluster (including ones
added concurrently during version upgrades) are running binaries that
know about the version.

It's instructive to walk through how we expect a version migration from
v21.1 to v21.2 to take place, and how we behave in the presence of new
v21.1 or v21.2 nodes being added to the cluster.
  - All nodes are running v21.1
  - All nodes are rolled into v21.2 binaries, but with active cluster
    version still as v21.1
  - The first version bump will be into v21.2(fence=1), see the
    migration manager above for where that happens
Then concurrently:
  - A new node is added to the cluster, but running binary v21.1
  - We try bumping the cluster gates to v21.2(fence=1)

If the v21.1 nodes manages to sneak in before the version bump, it's
fine as the version bump is a no-op one (all fence versions are). Any
subsequent bumps (including the "actual" one bumping to v21.2) will fail
during the validation step where we'll first check to see that all nodes
are running v21.2 binaries.

If the v21.1 node is only added after v21.2(fence=1) is active, it won't
be able to actually join the cluster (it'll be prevented by the join
RPC).

We rely on the invariant we introduced earlier that requires new
user-defined versions (users being crdb engineers) to always have
even-numbered Internal versions. This reserved the odd numbers to slot
in fence versions for each user-defined one.

Release note: None

---

Only the last two commits here are of interest. Earlier commits are from
#57154 and #57118.


Co-authored-by: irfan sharif <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants