
sqlinstance: add binary_version column to instances table #98830

Merged · 1 commit · Mar 21, 2023

Conversation

Contributor

@healthy-pod healthy-pod commented Mar 17, 2023

This code change adds a binary_version column to the instances table.

This is done by adding the column to the bootstrap schema for system.sql_instances,
and piggy-backing on the existing code for the V23_1_SystemRbrReadNew migration,
which overwrites the live schema for this table with the bootstrap copy.

This redefinition of the meaning of the V23_1_SystemRbrReadNew migration is backward-incompatible
and is only possible because this commit is merged before the v23.1 branch is cut.

Release note: None
Epic: CRDB-20860
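
For context, a hedged sketch of the bootstrap-schema side of the change: the CREATE TABLE statement for system.sql_instances grows a binary_version column. The column list and primary key below are illustrative of the 23.1-era schema, not copied from pkg/sql/catalog/systemschema.

```go
// Illustrative only: the bootstrap schema for system.sql_instances with the
// new column. Exact columns and the primary key are assumptions.
const sqlInstancesTableSchema = `
CREATE TABLE system.sql_instances (
    id             INT8 NOT NULL,
    addr           STRING,
    session_id     BYTES,
    locality       JSONB,
    sql_addr       STRING,
    crdb_region    BYTES NOT NULL,
    binary_version STRING, -- new in this change
    CONSTRAINT "primary" PRIMARY KEY (crdb_region, id)
)`
```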

@healthy-pod healthy-pod requested a review from a team March 17, 2023 01:34
@healthy-pod healthy-pod requested a review from a team as a code owner March 17, 2023 01:34
@cockroach-teamcity
Member

This change is Reviewable

@healthy-pod healthy-pod marked this pull request as draft March 17, 2023 01:34
@healthy-pod healthy-pod force-pushed the migrate-instances-table branch 2 times, most recently from e447a40 to 99eb0be on March 17, 2023 02:36
@healthy-pod healthy-pod marked this pull request as ready for review March 17, 2023 03:39
@healthy-pod healthy-pod requested review from jeffswenson and knz March 17, 2023 03:39
Contributor

@knz knz left a comment


It would be good to link your PR to one of our epics.

Reviewed 23 of 23 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod and @jeffswenson)


pkg/sql/sqlinstance/instancestorage/row_codec.go line 341 at r1 (raw file):

		binaryVersionColumnIdx: func(datum tree.Datum) error {
			if datum != tree.DNull {
				binaryVersion = roachpb.MustParseVersion(string(tree.MustBeDString(datum)))

I recommend using the variant that checks the format and returns an error if the value is invalid.
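
A minimal sketch of that suggestion, assuming roachpb.ParseVersion as the error-returning counterpart of roachpb.MustParseVersion; decodeBinaryVersion is a hypothetical helper for illustration, not code from the PR:

```go
package instancestorage

import (
	"github.com/cockroachdb/cockroach/pkg/roachpb"
	"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
)

// decodeBinaryVersion is a hypothetical helper: roachpb.ParseVersion checks
// the format and returns an error on malformed input, where
// roachpb.MustParseVersion would panic.
func decodeBinaryVersion(datum tree.Datum) (roachpb.Version, error) {
	if datum == tree.DNull {
		// A NULL column leaves the version zero-valued.
		return roachpb.Version{}, nil
	}
	return roachpb.ParseVersion(string(tree.MustBeDString(datum)))
}
```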

// getDeprecatedSqlInstancesDescriptorWithoutBinaryVersion returns the system.sql_instances
// table descriptor that was being used before adding a new column in the
// current version.
func getDeprecatedSqlInstancesDescriptorWithoutBinaryVersion() *descpb.TableDescriptor {
Collaborator


It's possible to bootstrap the DB with the V22_2 system database. It makes writing upgrade tests a lot easier and the tests are more representative of how the upgrades will run in production. See #98613 for an example.
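
A sketch of that setup (a fragment, imports elided; the knob names are assumptions modeled on upgrade tests of this era such as #98613, not verified signatures):

```go
// Bootstrap at the v22.2 system schema so the upgrade under test runs the
// way it would in production, with a binary new enough to finalize to 23.1.
ctx := context.Background()
tc := testcluster.StartTestCluster(t, 1, base.TestClusterArgs{
	ServerArgs: base.TestServerArgs{
		Knobs: base.TestingKnobs{
			Server: &server.TestingKnobs{
				BootstrapVersionKeyOverride: clusterversion.V22_2, // assumed knob
				BinaryVersionOverride: clusterversion.ByKey(
					clusterversion.V23_1_SystemRbrReadNew),
			},
		},
	},
})
defer tc.Stopper().Stop(ctx)
```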

Contributor Author


Done

schemaExistsFn: columnExists,
},
} {
if err := migrateTable(ctx, cs, d, op, tableID, systemschema.SQLInstancesTable()); err != nil {
Collaborator


This upgrade isn't actually doing anything in production. The RBR index migration replaces the descriptor with the statically defined descriptor, which would include the column you are adding.

func sqlInstanceMigration(codec keys.SQLCodec) rbrMigration {

Contributor Author


Should I then remove the migration and keep the test case (run it after RBR index migration)? Anything else I should do? Actually I can't find the version associated with the RBR index migration. Can you please give more details?

Collaborator


This is the upgrade and version gate that will replace the descriptor with the new version.

upgrade.NewTenantUpgrade(

You should be able to delete your version key and upgrade logic. I think it is a good idea to keep the test: it documents the fact that the new column is an intended side effect of the RBR migration.
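
The kept test could then reduce to an assertion along these lines (a sketch assuming a *sqlutils.SQLRunner named sqlDB and the testify require package):

```go
// After the cluster steps through V23_1_SystemRbrReadNew, the new column
// must exist as a side effect of the descriptor overwrite.
var n int
sqlDB.QueryRow(t, `
	SELECT count(*) FROM [SHOW COLUMNS FROM system.sql_instances]
	WHERE column_name = 'binary_version'`,
).Scan(&n)
require.Equal(t, 1, n)
```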

@healthy-pod healthy-pod force-pushed the migrate-instances-table branch from 99eb0be to 0132479 on March 17, 2023 19:35
@healthy-pod
Contributor Author

> It would be good to link your PR to one of our epics.

Done

> I recommend using the variant that checks the format and returns an error if the value is invalid.

Done

@healthy-pod healthy-pod force-pushed the migrate-instances-table branch 3 times, most recently from 77240e9 to 3174505 on March 20, 2023 20:00
@healthy-pod
Contributor Author

Ready for another look (pending clarification on @knz's question here).

@healthy-pod healthy-pod requested review from knz and jeffswenson March 20, 2023 21:36
Contributor

@knz knz left a comment


If you remove the migration, then this entire PR doesn't do what it says on the label, as it does not add a binary_version column to the instances table. I'll want the clarification before I even start the review.

Reviewed 1 of 2 files at r2, 12 of 12 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod and @jeffswenson)

Contributor

@knz knz left a comment


Now I understand. Great work. :lgtm: modulo updating the commit message (and PR description).

(See below for another followup action)

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @healthy-pod and @jeffswenson)


-- commits line 5 at r3:
Now that I understand what this is doing, please add more words to this commit message to explain it as well:

"""
This is done by adding the column to the bootstrap schema for system.sql_instances,
and piggy-backing on the existing code for the V23_1_SystemRbrReadNew migration
that overwrites the live schema for this table by the bootstrap copy.

This redefinition of the meaning of the V23_1_SystemRbrReadNew is backward-incompatible
and is only possible because this commit is merged before the v23.1 branch is cut.
"""


pkg/upgrade/upgrades/upgrades.go line 254 at r3 (raw file):

		toCV(clusterversion.V23_1_SystemRbrReadNew),
		upgrade.NoPrecondition,
		backfillRegionalByRowIndex,

Please file an issue to be processed at very high priority by either the Serverless or Schema team, to enhance this function with a hardwired copy of the bootstrap descriptors at that version.

Otherwise, any additional changes to these tables in future CRDB versions will result in schema corruption for folks upgrading from an older version.

@healthy-pod healthy-pod force-pushed the migrate-instances-table branch from 3174505 to 03a8cf5 on March 20, 2023 22:11
Collaborator

@jeffswenson jeffswenson left a comment


LGTM

@healthy-pod
Contributor Author

TFTRs!

bors r=[knz,JeffSwenson]

@craig
Contributor

craig bot commented Mar 21, 2023

Build failed (retrying...):

@craig
Contributor

craig bot commented Mar 21, 2023

Build failed (retrying...):

@knz
Contributor

knz commented Mar 21, 2023

bors r-

(merge conflict)

@craig
Contributor

craig bot commented Mar 21, 2023

Canceled.

healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Jul 17, 2023
Previously, tenant upgrades in UA required a user to issue a `SET CLUSTER SETTING version =`
statement to finalize an upgrade. This UX is different from what we have in single-tenant
SH/Dedicated deployments, where auto upgrade starts an attempt to finalize the cluster
version after every node startup, in case the node was started with a new binary version
that all nodes now support upgrading to.

In UA, we have two differences:

1. What to upgrade?
    - In a multi-tenant deployment, the storage and sql layers are upgraded separately.
    - The storage layer upgrade finalization is still handled by the existing auto upgrade logic.
    - In this change, we ensure that the sql layer is also auto-upgraded when possible.
2. When to upgrade?
    - In a single-tenant deployment, all layers share the same binary version and cluster version.
      Hence, an upgrade attempt is only needed when a new node starts to ensure that the cluster is
      auto-upgraded if the new binary version supports an upgrade.
    - In a multi-tenant deployment, in addition to the condition above, the sql server upgrade is
      also constrained by the storage cluster version. It is possible for all SQL instances to have
      binary versions that support an upgrade but the upgrade will still be blocked by the storage
      cluster version if it’s equal to the current tenant cluster version.

This code change does the following:

1. Adds logic to run a SQL server upgrade attempt (mostly adopted from the original auto upgrade code)
   within the following ordered constraints (previously we merged cockroachdb#98830 to make getting the binary
   versions of instances easier):
    - Ensure that upgrade is not blocked by `preserve_downgrade_option`.
    - Exit if tenant cluster version is equal to storage cluster version [upgrade already completed].
    - Upgrade to storage cluster version if the binary version of all SQL instances supports that.
    - Exit if tenant cluster version is equal to the minimum instance binary version [upgrade blocked
      due to too low binary version].
    - Upgrade to the minimum instance binary version.

2. Runs the logic above when a SQL server is started.
    - This covers the case where a SQL server binary upgrade allows for an upgrade to the
      tenant cluster version.

3. Checks for change in storage cluster version every 30 seconds and starts an upgrade attempt if
   it was changed.
    - This covers the case where the binary versions of all SQL instances allow for an upgrade
      but it’s blocked due to the storage cluster version.

Release note: None
Epic: CRDB-20860
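
Distilled into code, the ordered constraints read roughly as follows. This is a sketch of the logic described above, not the shipped implementation; the preserve_downgrade_option check is assumed to run before this function is consulted, and the name tenantUpgradeTarget is illustrative.

```go
// tenantUpgradeTarget returns the version to finalize the tenant to, or
// ok=false when no upgrade attempt should run.
func tenantUpgradeTarget(
	tenantVersion, storageVersion, minInstanceBinaryVersion roachpb.Version,
) (target roachpb.Version, ok bool) {
	// Upgrade already completed: the tenant has caught up with storage.
	if tenantVersion == storageVersion {
		return roachpb.Version{}, false
	}
	// Every SQL instance's binary supports the storage version: finalize to it.
	if !minInstanceBinaryVersion.Less(storageVersion) {
		return storageVersion, true
	}
	// Blocked by a too-low binary version: nothing more to do once the tenant
	// version has reached the minimum instance binary version.
	if tenantVersion == minInstanceBinaryVersion {
		return roachpb.Version{}, false
	}
	return minInstanceBinaryVersion, true
}
```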
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Sep 15, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Sep 25, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Oct 6, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Oct 6, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Oct 9, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Oct 9, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Oct 9, 2023
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this pull request Oct 9, 2023
craig bot pushed a commit that referenced this pull request Oct 10, 2023
102427: pkg/server: support tenant auto-upgrade r=stevendanna,knz a=healthy-pod

Previously, tenant upgrades in UA required a user to issue a `SET CLUSTER SETTING version =`
statement to finalize an upgrade. This UX is different from what we have in single-tenant
SH/Dedicated deployments, where auto upgrade starts an attempt to finalize the cluster
version after every node startup, in case the node was started with a new binary version
that all nodes now support upgrading to.

In UA, we have two differences:

1. What to upgrade?
    - In a multi-tenant deployment, the storage and sql layers are upgraded separately.
    - The storage layer upgrade finalization is still handled by the existing auto upgrade logic.
    - In this change, we ensure that the sql layer is also auto-upgraded when possible.
2. When to upgrade?
    - In a single-tenant deployment, all layers share the same binary version and cluster version.
      Hence, an upgrade attempt is only needed when a new node starts to ensure that the cluster is
      auto-upgraded if the new binary version supports an upgrade.
    - In a multi-tenant deployment, in addition to the condition above, the sql server upgrade is
      also constrained by the storage cluster version. It is possible for all SQL instances to have
      binary versions that support an upgrade but the upgrade will still be blocked by the storage
      cluster version if it’s equal to the current tenant cluster version.

This code change does the following:

1. Adds logic to run a SQL server upgrade attempt (mostly adopted from the original auto upgrade code)
   within the following ordered constraints (previously we merged #98830 to make getting the binary
   versions of instances easier):
    - Ensure that the upgrade is not blocked by the secondary tenant's setting of preserve_downgrade_option
      or an all-tenant override of that value.
    - Exit if tenant cluster version is equal to the minimum instance binary version [upgrade already completed].
    - Upgrade to storage cluster version if the binary version of all SQL instances supports that.
    - Exit if storage cluster version is less than the minimum instance binary version [upgrade blocked
      due to low storage cluster version].
    - Upgrade to the minimum instance binary version.

2. Runs the logic above when a SQL server is started.
    - This covers the case where a SQL server binary upgrade allows for an upgrade to the
      tenant cluster version.

3. Checks for change in storage cluster version every 10 seconds and starts an upgrade attempt if
   it was changed.
    - This covers the case where the binary versions of all SQL instances allow for an upgrade
      but it’s blocked due to the storage cluster version.

Release note: Support for automatic finalization of virtual cluster version
upgrades was added. A new setting `cluster.auto_upgrade.enabled` was added to
enable and disable automatic cluster version upgrade (finalization); it
will be used with automatic upgrade at both the storage level and the virtual
cluster level (each has its own setting value).
Epic: [CRDB-20860](https://cockroachlabs.atlassian.net/browse/CRDB-20860)

Co-authored-by: healthy-pod <[email protected]>