-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/backfill: fix bug adding new columns to new index with volatile default #81486
sql/backfill: fix bug adding new columns to new index with volatile default #81486
Conversation
ee6ce40
to
7b2e172
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)
a discussion (no related file):
Overall I like that you moved some of this logic out so it can be unit tested, and the additional comments also help.
I mostly just left questions I had while reading this.
pkg/sql/backfill/index_backfiller_cols.go
line 21 at r1 (raw file):
// indexBackfillerCols holds the interned set of columns used by the // indexBackfiller, and the information about which columns will need to
[nit] s/, and/ and/ ?
pkg/sql/backfill/index_backfiller_cols.go
line 35 at r2 (raw file):
// the index being backfilled is a new primary index and the columns do not // exist in the currently public primary index. addedCols []catalog.Column
Is this case possible in the non-declarative schema changer? I believe we block primary index swaps with any other concurrent schema changes, but am I missing another way to get this case in the non-declarative schema changer?
pkg/sql/logictest/testdata/logic_test/alter_table
line 2500 at r1 (raw file):
statement ok create table t81448_b as select b from t81448
Out of curiosity, is this guaranteed to always scan the primary index rather than the secondary index? Looking at the current implementation I believe it will, but is there any reasons we couldn't in theory scan the secondary index on b.
pkg/sql/backfill/index_backfiller_cols_test.go
line 191 at r1 (raw file):
// This is the case where we're building a new primary index as part // of an add column. name: "one virtual, one new mutation column in source used in new primary",
Looks like this case might be duplicated from above.
I filed this as follow-up #81521 |
7b2e172
to
f2af107
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)
a discussion (no related file):
Previously, stevendanna (Steven Danna) wrote…
Overall I like that you moved some of this logic out so it can be unit tested, and the additional comments also help.
I mostly just left questions I had while reading this.
TFTR!
pkg/sql/backfill/index_backfiller_cols.go
line 21 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
[nit] s/, and/ and/ ?
Done.
pkg/sql/backfill/index_backfiller_cols.go
line 35 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Is this case possible in the non-declarative schema changer? I believe we block primary index swaps with any other concurrent schema changes, but am I missing another way to get this case in the non-declarative schema changer?
No, this is just for the declarative schema changer. I added a note.
pkg/sql/logictest/testdata/logic_test/alter_table
line 2500 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Out of curiosity, is this guaranteed to always scan the primary index rather than the secondary index? Looking at the current implementation I believe it will, but is there any reasons we couldn't in theory scan the secondary index on b.
I made it explicit
pkg/sql/backfill/index_backfiller_cols_test.go
line 191 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Looks like this case might be duplicated from above.
Removed.
…efault Release note (bug fix): In 21.1 a bug was introduced whereby default values were recomputed when populating data in new secondary indexes for columns which were added in the same transaction as the index. This arises, for example in cases like `ALTER TABLE t ADD COLUMN f FLOAT8 UNIQUE DEFAULT (random())`. If the default expression is not volatile, then the recomputation is harmless. If, however, the default expression is volatile, the data in the secondary index will not match the data in the primary index: a corrupt index will have been created. This bug has now been fixed.
f2af107
to
d531343
Compare
per the comment on
what if a column is virtual+computed, but it's in the key of the source primary index? |
a discussion (no related file): Previously, ajwerner wrote…
Thanks for fixing this! |
TFTR! bors r+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @stevendanna)
pkg/sql/backfill/index_backfiller_cols.go
line 82 at r3 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
per the comment on
computedCols
:// computedCols are the columns in this index which are computed and do // not have concrete values in the source index. This is virtual computed // columns and stored computed columns which are non-public and not part // of the currently public primary index.
what if a column is virtual+computed, but it's in the key of the source primary index?
it'll be counted as a computed column and re-computed. It ought not to matter that we're recomputing it given computed virtual columns in all other settings are recomputed and we assume that that's safe. I suppose using the physical value might save us in some rare edge case involving redefining the expression illegally, but I don't think I care about that case given it's just one special case of a broader problem.
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from d531343 to blathers/backport-release-21.1-81486: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.1.x failed. See errors above. error creating merge commit from d531343 to blathers/backport-release-21.2-81486: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
In #58990 we added support for the index backfiller to evaluate expressions. This unblocked the usage of virtual computed columns in secondary expressions, so wasn't a totally bogus change, but probably was trying to do too much without understanding all of the implications. The original motivation for that change was to unblock the then nascent declarative schema changer prototype which wanted to only implement #47989 as a column change protocol. In order to do that, it needed to evaluate default expressions when adding new columns to a new primary index. For whatever reason, probably lack of understanding, I made it such that all default expressions for columns which were in the adding state were evaluated. This is wrong in cases where the column has already been backfilled into the current primary index; it's wrong because volatile expressions will evaluate to different values causing the newly backfilled secondary index to be bogus and corrupt.
This change addresses the failure of the above change by being more thoughtful about whether it's sane to evaluate a default expression when backfilling into an index. Note that it's still insane to backfill a volatile expression for a new column into the key of a new primary index. As of writing, there is no way to do this.
Backports will address #81448.
Release note (bug fix): In 21.1 a bug was introduced whereby default values
were recomputed when populating data in new secondary indexes for columns
which were added in the same transaction as the index. This arises, for example
in cases like
ALTER TABLE t ADD COLUMN f FLOAT8 UNIQUE DEFAULT (random())
.If the default expression is not volatile, then the recomputation is harmless.
If, however, the default expression is volatile, the data in the secondary
index will not match the data in the primary index: a corrupt index will have
been created. This bug has now been fixed.