-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: async validation of foreign keys in ADD CONSTRAINT #37433
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @lucy-zhang, and @vivekmenezes)
pkg/sql/backfill.go, line 273 at r1 (raw file):
log.VEventf( ctx, 2, "backfiller tried to add constraint %+v but found existing constraint %+v, presumably due to a retry",
How can a retry cause the constraint to be seen. We usually go through the pain of cleaning up before a retry. Perhaps an error is warranted here?
pkg/sql/schema_changer.go, line 1470 at r1 (raw file):
log.Infof( ctx, "attempted to drop constraint %s, but it hadn't been added to the table descriptor yet",
should this be just an error?
pkg/sql/logictest/testdata/logic_test/fk, line 2154 at r1 (raw file):
statement error foreign key violation: "b" row a_z='z2', a_y='y2', a_x='x2' has no match in "a" ALTER TABLE b VALIDATE CONSTRAINT fk_ref
i see you have taken out a lot of tests above. Can you keep these tests while adding an FK constraint rather than through the validation step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 16 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @lucy-zhang, and @vivekmenezes)
pkg/sql/schema_changer.go, line 1220 at r1 (raw file):
// applying a schema change. If a column being added is reversed and droped, // all new indexes referencing the column will also be dropped. func (sc *SchemaChanger) reverseMutations(ctx context.Context, causingError error) error {
do we need a drop on failure
bool on ConstraintToUpdate?
I'm just thinking that if you add constraint ... not valid
and later run validate constraint
we should not drop on failure.
Right now I don't think we support not valid
and I if I'm reading this correctly, I think VALIDATE CONSTRAINT
is still just run in the alter txn, not via schema changer, right? SO in that case, this isn't currently an issue, but when we do add NOT VALID
and make VALIDATE CONSTRAINT
also a schema change, then thinking about the backwards compatibility of an explicit bool and its zero value, it seems like we won't have any way to turn off automatic dropping by older nodes unless we build it in now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @vivekmenezes)
pkg/sql/backfill.go, line 273 at r1 (raw file):
Previously, vivekmenezes wrote…
How can a retry cause the constraint to be seen. We usually go through the pain of cleaning up before a retry. Perhaps an error is warranted here?
It can happen if the cluster goes down in the state after the constraint has been added to the table descriptor in the Validating
state, but before it becomes Validated
and is removed from the list of mutations. I can manually reproduce this on a single-node cluster by calling Sleep
at https://github.com/cockroachdb/cockroach/pull/37433/files#diff-008234c9a29752f8b897974c95348fa4R228, doing a hard shutdown at that time, and then restarting the cluster. Not sure if there's a way to write a test for this.
pkg/sql/schema_changer.go, line 1220 at r1 (raw file):
Previously, dt (David Taylor) wrote…
do we need a
drop on failure
bool on ConstraintToUpdate?
I'm just thinking that if youadd constraint ... not valid
and later runvalidate constraint
we should not drop on failure.Right now I don't think we support
not valid
and I if I'm reading this correctly, I thinkVALIDATE CONSTRAINT
is still just run in the alter txn, not via schema changer, right? SO in that case, this isn't currently an issue, but when we do addNOT VALID
and makeVALIDATE CONSTRAINT
also a schema change, then thinking about the backwards compatibility of an explicit bool and its zero value, it seems like we won't have any way to turn off automatic dropping by older nodes unless we build it in now?
Yeah, VALIDATE CONSTRAINT
is still in the ALTER TABLE
transaction now.
I've been thinking that Validating
should basically mean "drop on failure," so it only occurs as part of ALTER TABLE
without NOT VALID
, and a constraint added using NOT VALID
and then validated using VALIDATE CONSTRAINT
will never be Validating
. So, NOT VALID
just corresponds to having an Unvalidated
constraint in the mutation, and results in skipping the extra step where the descriptor is published with the constraint and the subsequent validation query.
It's not ideal that the names of the states imply a Unvalidated -> Validating -> Validated
sequence of states, when the real possibilities are actually either Validating -> Validated
or Unvalidated -> Validated
. But I think the "drop on failure" state can be pretty much captured by Validating
and an extra boolean would just be duplicating that state (even if, arguably, the state should be on the mutation instead of the constraint itself). I'm open to also putting in a boolean instead, though I'm not sure Validating
would mean anything separate in that case.
We do also need to add a boolean flag for this mutation type to also support VALIDATE CONSTRAINT
for backward compatibility, to only run the validation without adding anything. But I think I'd rather add one or both flags in a separate PR.
pkg/sql/schema_changer.go, line 1470 at r1 (raw file):
Previously, vivekmenezes wrote…
should this be just an error?
This can happen if we have to roll back a schema change before it has reached the backfill stage.
pkg/sql/logictest/testdata/logic_test/fk, line 2154 at r1 (raw file):
Previously, vivekmenezes wrote…
i see you have taken out a lot of tests above. Can you keep these tests while adding an FK constraint rather than through the validation step?
The cases where validation fails on ADD CONSTRAINT are already covered. I added a few more tests for the cases where ADD CONSTRAINT succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt)
ad61b38
to
ef8bd9f
Compare
I updated this with some minor fixes/tests:
|
Don't we want to support |
Yeah, we'll want to support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 16 files at r1, 8 of 8 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @lucy-zhang)
pkg/sql/backfill.go, line 1255 at r2 (raw file):
// cluster version, it will be used in the InternalExecutor that performs the // validation query. // validateFkInTxn validates foreign key constraints within the provided
looks like this comment got copy/pasted from below.
pkg/sql/backfill.go, line 1308 at r2 (raw file):
fkName string, ) error { ie := evalCtx.InternalExecutor.(*SessionBoundInternalExecutor)
Are we doing this pattern often enough that it'd make sense to have a scheme-changer executor that DRYs it? (doesn't need to happen here of course)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt)
pkg/sql/backfill.go, line 1255 at r2 (raw file):
Previously, dt (David Taylor) wrote…
looks like this comment got copy/pasted from below.
Done.
pkg/sql/backfill.go, line 1308 at r2 (raw file):
Previously, dt (David Taylor) wrote…
Are we doing this pattern often enough that it'd make sense to have a scheme-changer executor that DRYs it? (doesn't need to happen here of course)
Yeah, I think so. The existing interface for modifying the TableCollection with tcModifier
is also kind of cumbersome - I think all of this could use some refactoring now that we have several different uses for this pattern. The issue for this was intentionally left open: #34304
Currently, validating an unvalidated foreign key constraint requires running `VALIDATE CONSTRAINT`, which executes a potentially long-running query in the user transaction. In this PR, `ADD CONSTRAINT` for foreign keys is now processed by the schema changer, with the validation for existing rows running in a separate transaction from the original user transaction. The foreign key mutation (represented by the new `FOREIGN_KEY` enum value in `ConstraintToUpdate`) is processed in the same way as check constraint mutations: the foreign key is in the mutations list while other columns and indexes with the same mutation ID are backfilled, then added to the appropriate index in the `Validating` state before being validated, and is finalized when the validation query for existing rows completes successfully. If validation fails, the transaction is rolled back, with the foreign key (and backreference) removed from the table descriptor(s) as part of the rollback. Adding foreign keys to columns or indexes being added is still not supported and is left for later work. Also unsupported is adding a foreign key constraint in the same transaction as `CREATE TABLE` that is either validated or that rolls back the entire transaction on failure. In this PR, the constraint is just left unvalidated; This needs a follow-up PR to queue a separate mutation for validating the constraint after it's been added. Release note (sql change): Foreign keys that are added to an existing table using `ALTER TABLE` will now be validated for existing rows, with improved performance compared to running `ADD CONSTRAINT` followed by `VALIDATE CONSTRAINT` previously.
bors r+ |
37433: sql: async validation of foreign keys in ADD CONSTRAINT r=lucy-zhang a=lucy-zhang Currently, validating an unvalidated foreign key constraint requires running `VALIDATE CONSTRAINT`, which executes a potentially long-running query in the user transaction. In this PR, `ADD CONSTRAINT` for foreign keys is now processed by the schema changer, with the validation for existing rows running in a separate transaction from the original user transaction. The foreign key mutation (represented by the new `FOREIGN_KEY` enum value in `ConstraintToUpdate`) is processed in the same way as check constraint mutations: the foreign key is in the mutations list while other columns and indexes with the same mutation ID are backfilled, then added to the appropriate index in the `Validating` state before being validated, and is finalized when the validation query for existing rows completes successfully. If validation fails, the transaction is rolled back, with the foreign key (and backreference) removed from the table descriptor(s) as part of the rollback. Adding foreign keys to columns or indexes being added is still not supported and is left for later work. Also unsupported is adding a foreign key constraint in the same transaction as `CREATE TABLE` that is either validated or that rolls back the entire transaction on failure. In this PR, the constraint is just left unvalidated; This needs a follow-up PR to queue a separate mutation for validating the constraint after it's been added. Release note (sql change): Foreign keys that are added to an existing table using `ALTER TABLE` will now be validated for existing rows, with improved performance compared to running `ADD CONSTRAINT` followed by `VALIDATE CONSTRAINT` previously. Co-authored-by: Lucy Zhang <[email protected]>
Build succeeded |
Currently, validating an unvalidated foreign key constraint requires running
VALIDATE CONSTRAINT
, which executes a potentially long-running query in theuser transaction. In this PR,
ADD CONSTRAINT
for foreign keys is nowprocessed by the schema changer, with the validation for existing rows running
in a separate transaction from the original user transaction.
The foreign key mutation (represented by the new
FOREIGN_KEY
enum value inConstraintToUpdate
) is processed in the same way as check constraintmutations: the foreign key is in the mutations list while other columns and
indexes with the same mutation ID are backfilled, then added to the appropriate
index in the
Validating
state before being validated, and is finalized whenthe validation query for existing rows completes successfully. If validation
fails, the transaction is rolled back, with the foreign key (and backreference)
removed from the table descriptor(s) as part of the rollback.
Adding foreign keys to columns or indexes being added is still not supported
and is left for later work. Also unsupported is adding a foreign key constraint
in the same transaction as
CREATE TABLE
that is either validated or thatrolls back the entire transaction on failure. In this PR, the constraint is
just left unvalidated; This needs a follow-up PR to queue a separate mutation
for validating the constraint after it's been added.
Release note (sql change): Foreign keys that are added to an existing table
using
ALTER TABLE
will now be validated for existing rows, with improvedperformance compared to running
ADD CONSTRAINT
followed byVALIDATE CONSTRAINT
previously.