Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/backfill: fix bug adding new columns to new index with volatile default #81486

Conversation

ajwerner
Copy link
Contributor

@ajwerner ajwerner commented May 18, 2022

In #58990 we added support for the index backfiller to evaluate expressions. This unblocked the usage of virtual computed columns in secondary expressions, so wasn't a totally bogus change, but probably was trying to do too much without understanding all of the implications. The original motivation for that change was to unblock the then nascent declarative schema changer prototype which wanted to only implement #47989 as a column change protocol. In order to do that, it needed to evaluate default expressions when adding new columns to a new primary index. For whatever reason, probably lack of understanding, I made it such that all default expressions for columns which were in the adding state were evaluated. This is wrong in cases where the column has already been backfilled into the current primary index; it's wrong because volatile expressions will evaluate to different values causing the newly backfilled secondary index to be bogus and corrupt.

This change addresses the failure of the above change by being more thoughtful about whether it's sane to evaluate a default expression when backfilling into an index. Note that it's still insane to backfill a volatile expression for a new column into the key of a new primary index. As of writing, there is no way to do this.

Backports will address #81448.

Release note (bug fix): In 21.1 a bug was introduced whereby default values
were recomputed when populating data in new secondary indexes for columns
which were added in the same transaction as the index. This arises, for example
in cases like ALTER TABLE t ADD COLUMN f FLOAT8 UNIQUE DEFAULT (random()).
If the default expression is not volatile, then the recomputation is harmless.
If, however, the default expression is volatile, the data in the secondary
index will not match the data in the primary index: a corrupt index will have
been created. This bug has now been fixed.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner ajwerner force-pushed the ajwerner/fix-index-backfill-volatile-default-new-column branch 4 times, most recently from ee6ce40 to 7b2e172 Compare May 19, 2022 12:55
@ajwerner ajwerner marked this pull request as ready for review May 19, 2022 12:58
@ajwerner ajwerner requested a review from a team May 19, 2022 12:58
Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)

a discussion (no related file):
Overall I like that you moved some of this logic out so it can be unit tested, and the additional comments also help.

I mostly just left questions I had while reading this.



pkg/sql/backfill/index_backfiller_cols.go line 21 at r1 (raw file):

// indexBackfillerCols holds the interned set of columns used by the
// indexBackfiller, and the information about which columns will need to

[nit] s/, and/ and/ ?


pkg/sql/backfill/index_backfiller_cols.go line 35 at r2 (raw file):

	// the index being backfilled is a new primary index and the columns do not
	// exist in the currently public primary index.
	addedCols []catalog.Column

Is this case possible in the non-declarative schema changer? I believe we block primary index swaps with any other concurrent schema changes, but am I missing another way to get this case in the non-declarative schema changer?


pkg/sql/logictest/testdata/logic_test/alter_table line 2500 at r1 (raw file):


statement ok
create table t81448_b as select b from t81448

Out of curiosity, is this guaranteed to always scan the primary index rather than the secondary index? Looking at the current implementation I believe it will, but is there any reasons we couldn't in theory scan the secondary index on b.


pkg/sql/backfill/index_backfiller_cols_test.go line 191 at r1 (raw file):

			// This is the case where we're building a new primary index as part
			// of an add column.
			name: "one virtual, one new mutation column in source used in new primary",

Looks like this case might be duplicated from above.

@ajwerner
Copy link
Contributor Author

Note that it's still insane to backfill a volatile expression for a new column into the key of a new primary index. As of writing, there is no way to do this.

I filed this as follow-up #81521

@ajwerner ajwerner force-pushed the ajwerner/fix-index-backfill-volatile-default-new-column branch from 7b2e172 to f2af107 Compare May 19, 2022 13:55
Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)

a discussion (no related file):

Previously, stevendanna (Steven Danna) wrote…

Overall I like that you moved some of this logic out so it can be unit tested, and the additional comments also help.

I mostly just left questions I had while reading this.

TFTR!



pkg/sql/backfill/index_backfiller_cols.go line 21 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

[nit] s/, and/ and/ ?

Done.


pkg/sql/backfill/index_backfiller_cols.go line 35 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Is this case possible in the non-declarative schema changer? I believe we block primary index swaps with any other concurrent schema changes, but am I missing another way to get this case in the non-declarative schema changer?

No, this is just for the declarative schema changer. I added a note.


pkg/sql/logictest/testdata/logic_test/alter_table line 2500 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Out of curiosity, is this guaranteed to always scan the primary index rather than the secondary index? Looking at the current implementation I believe it will, but is there any reasons we couldn't in theory scan the secondary index on b.

I made it explicit


pkg/sql/backfill/index_backfiller_cols_test.go line 191 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Looks like this case might be duplicated from above.

Removed.

…efault

Release note (bug fix): In 21.1 a bug was introduced whereby default values
were recomputed when populating data in new secondary indexes for columns
which were added in the same transaction as the index. This arises, for example
in cases like `ALTER TABLE t ADD COLUMN f FLOAT8 UNIQUE DEFAULT (random())`.
If the default expression is not volatile, then the recomputation is harmless.
If, however, the default expression is volatile, the data in the secondary
index will not match the data in the primary index: a corrupt index will have
been created. This bug has now been fixed.
@ajwerner ajwerner force-pushed the ajwerner/fix-index-backfill-volatile-default-new-column branch from f2af107 to d531343 Compare May 19, 2022 14:59
@chengxiong-ruan
Copy link
Contributor

pkg/sql/backfill/index_backfiller_cols.go line 82 at r3 (raw file):

		if column.IsComputed() && column.IsVirtual() {
			computedVirtual.Add(column.GetID())
			ib.computedCols = append(ib.computedCols, column)

per the comment on computedCols:

// computedCols are the columns in this index which are computed and do
// not have concrete values in the source index. This is virtual computed
// columns and stored computed columns which are non-public and not part
// of the currently public primary index.

what if a column is virtual+computed, but it's in the key of the source primary index?

@chengxiong-ruan
Copy link
Contributor

a discussion (no related file):

Previously, ajwerner wrote…

TFTR!

Thanks for fixing this!

@ajwerner
Copy link
Contributor Author

TFTR!

bors r+

Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @stevendanna)


pkg/sql/backfill/index_backfiller_cols.go line 82 at r3 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

per the comment on computedCols:

// computedCols are the columns in this index which are computed and do
// not have concrete values in the source index. This is virtual computed
// columns and stored computed columns which are non-public and not part
// of the currently public primary index.

what if a column is virtual+computed, but it's in the key of the source primary index?

it'll be counted as a computed column and re-computed. It ought not to matter that we're recomputing it given computed virtual columns in all other settings are recomputed and we assume that that's safe. I suppose using the physical value might save us in some rare edge case involving redefining the expression illegally, but I don't think I care about that case given it's just one special case of a broader problem.

@craig
Copy link
Contributor

craig bot commented May 19, 2022

Build succeeded:

@blathers-crl
Copy link

blathers-crl bot commented May 19, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from d531343 to blathers/backport-release-21.1-81486: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1.x failed. See errors above.


error creating merge commit from d531343 to blathers/backport-release-21.2-81486: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants