Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing upgrade path from / downgrade path to v8.0.0 #7294

Merged
merged 32 commits into from
Jan 20, 2021

Conversation

shlomi-noach
Copy link
Contributor

@shlomi-noach shlomi-noach commented Jan 14, 2021

Description

Followup to team discussion, we wish to add CI (endtoend) tests that validate an upgrade from a previous tagged version.

This PR introduces endtoend version upgrade test, specifically testing an upgrade path from v8.0.0. (side note, v9.0.0 will be released later this month, and then we will add a test for a v9.0.0 upgrade path).

The test verifies that a vitess cluster which was built, started and populated by a v8.0.0 release, is readable by a HEAD (the PR branch) build.

The way this works:

  • .github/workflows/cluster_endtoend_upgrade.yml is the workflow file
  • it checks out v8.0.0, and builds the binaries
  • stores binaries in a safe place
  • checks out HEAD
  • gets the binaries we built for v8.0.0
  • runs go/test/endtoend/versionupgrade/upgrade_test.go
    • the test creates sharded and unsharded keyspaces and tablets
    • creates tables in keyspaces and shards
    • populated the tables
    • verifies data is populated
    • shuts down without removing data (-keep-data)
  • Checks out HEAD again, cleaning up the workflow running directory
  • Builds HEAD
  • Starts vitess on top of existing data, which means:
    • etcd2 loads from existing data
    • mysqld starts with existing data
    • vttablets start with existing data (and are already in SERVING mode)
  • Very data is still populated.

So this test basically starts a "latest" version of Vitess on top of preexisting v8.0.0 data, and SELECTs data from tables to prove the upgrade is good.
We may add additional tests to verify the upgrade path is good.

What kind of changes were made:

  • Existing endtoend tests intentionally randomized paths, ports, etc. But we have to be able to run two vitess clusters on exact same paths, ports etc. To that effect we introduce these flags:
    • -force-vtdataroot (path)
    • -force-port-start (port number)
    • -force-base-tablet-uid (number)
      With these three supplied, all locations and ports are idempotent.
  • test.go has a -skip-build flag; this is required because we want to be able to run the latest upgrade test, on a past release, which may not have this test (specifically, v8.0.0 does not have this test). We use this to run a latest test on top of pre-built v8.0.0 binaries.
  • LocalProcessCluster has a field called ReusingVTDATAROOT which indicates whether this cluster started from empty, and initialized itself (false), or whether the cluster reused an existing dataset (true)
  • cluster_process.go uses that field to decide whether it should initialize mysql, or create keyspaces, or what state it expects tablet to start with, etc.
    -go/test/endtoend/cluster/vtctlclient_process.go uses that field to determine if it needs to populate database tables and data or not.

EDIT

Symmetrically, the workflow now also runs a downgrade path test.

Related Issue(s)

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Impacted Areas in Vitess

Components that this PR will affect:

  • Query Serving
  • VReplication
  • Cluster Management
  • Build
  • VTAdmin

Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
…lace; if reusing existing data, we do not create and do not populate. In both cases, we now test for expected table data

Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
@shlomi-noach
Copy link
Contributor Author

A "fun" loophole I got into is this:

I've added a an endtoend upgrade test. In CI, I check out v8.0.0 to run the test, and then I check out pr/branch HEAD to run the test again.
Problem is... the test does not exist in v8.0.0. Next problem is that I can't precompile the binaries because test.go itself does that. So it seems like the endtoend test is coupled with the vitess binary test. I need to override that.

Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
@shlomi-noach
Copy link
Contributor Author

shlomi-noach commented Jan 17, 2021

Also copying this to main comment


This PR introduced endtoend version upgrade test, specifically testing an upgrade path from v8.0.0. (side note, v9.0.0 will be released later this month, and then we will add a test for a v9.0.0 upgrade path).

The test verifies that a vitess cluster which was built, started and populated by a v8.0.0 release, is readable by a HEAD (the PR branch) build.

The way this works:

  • .github/workflows/cluster_endtoend_upgrade.yml is the workflow file
  • it checks out v8.0.0, and builds the binaries
  • stores binaries in a safe place
  • checks out HEAD
  • gets the binaries we built for v8.0.0
  • runs go/test/endtoend/versionupgrade/upgrade_test.go
    • the test creates sharded and unsharded keyspaces and tablets
    • creates tables in keyspaces and shards
    • populated the tables
    • verifies data is populated
    • shuts down without removing data (-keep-data)
  • Checks out HEAD again, cleaning up the workflow running directory
  • Builds HEAD
  • Starts vitess on top of existing data, which means:
    • etcd2 loads from existing data
    • mysqld starts with existing data
    • vttablets start with existing data (and are already in SERVING mode)
  • Verify data is still populated.

So this test basically starts a "latest" version of Vitess on top of preexisting v8.0.0 data, and SELECTs data from tables to prove the upgrade is good.
We may add additional tests to verify the upgrade path is good.

What kind of changes were made:

  • Existing endtoend tests intentionally randomized paths, ports, etc. But we have to be able to run two vitess clusters on exact same paths, ports etc. To that effect we introduce these flags:
    • -force-vtdataroot (path)
    • -force-port-start (port number)
    • -force-base-tablet-uid (number)
      With these three supplied, all locations and ports are idempotent.
  • test.go has a -skip-build flag; this is required because we want to be able to run the latest upgrade test, on a past release, which may not have this test (specifically, v8.0.0 does not have this test). We use this to run a latest test on top of pre-built v8.0.0 binaries.
  • LocalProcessCluster has a field called ReusingVTDATAROOT which indicates whether this cluster started from empty, and initialized itself (false), or whether the cluster reused an existing dataset (true)
  • cluster_process.go uses that field to decide whether it should initialize mysql, or create keyspaces, or what state it expects tablet to start with, etc.
    -go/test/endtoend/cluster/vtctlclient_process.go uses that field to determine if it needs to populate database tables and data or not.

@shlomi-noach shlomi-noach changed the title WIP: testing version upgrade Testing upgrade path from v8.0.0 Jan 17, 2021
@shlomi-noach shlomi-noach marked this pull request as ready for review January 17, 2021 12:32
@shlomi-noach shlomi-noach requested a review from sougou as a code owner January 17, 2021 12:32
@shlomi-noach
Copy link
Contributor Author

Ready for review.

@shlomi-noach shlomi-noach requested review from deepthi and a team January 17, 2021 12:32
@shlomi-noach
Copy link
Contributor Author

shlomi-noach commented Jan 17, 2021

I'll create a separate PR to test a downgrade; basically it's a duplication of existing test. EDIT: eventually this was done in this PR

@shlomi-noach
Copy link
Contributor Author

This PR should be backported to v9.x

Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
@shlomi-noach
Copy link
Contributor Author

This PR now also includes a downgrade path test

@shlomi-noach shlomi-noach changed the title Testing upgrade path from v8.0.0 Testing upgrade path from / downgrade path to v8.0.0 Jan 17, 2021
Signed-off-by: Shlomi Noach <[email protected]>
Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent! Just one typo, can merge after fixing that.

rm -rf /tmp/vtdataroot
mkdir -p /tmp/vtdataroot
source build.env
# We pass -skip-build so that we use he v8.0.0 binaries, not HEAD binaries
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# We pass -skip-build so that we use he v8.0.0 binaries, not HEAD binaries
# We pass -skip-build so that we use the v8.0.0 binaries, not HEAD binaries

@deepthi
Copy link
Member

deepthi commented Jan 20, 2021

This is a great start towards completely automating the upgrade/downgrade testing.
Other scenarios that need to be tested:

  • mixed mode: only one or two of the 3 main components (vttablet/vtgate/vtctld) is/are upgraded / downgraded

Signed-off-by: Shlomi Noach <[email protected]>
@shlomi-noach shlomi-noach merged commit 9ac7dc8 into vitessio:master Jan 20, 2021
@shlomi-noach shlomi-noach deleted the ci-version-upgrade branch January 20, 2021 04:32
@shlomi-noach
Copy link
Contributor Author

#7323 is a resubmission as a backport to 9.0

@shlomi-noach shlomi-noach mentioned this pull request Jan 25, 2021
8 tasks
@askdba askdba added this to the v9.0 milestone Jan 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants