-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: run check constraint mutations and validation by schema changer #32504
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great Stuff! A couple really minor nits. The schema_changer.go stuff is in code with which I have a little less familiarity but overall approach looks great (though as commented below, I still get confused about how txn semantics work with schema changes that "fail").
Reviewed 19 of 21 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/backfill.go, line 47 at r1 (raw file):
// checkConstraintBackfillChunkSize is the maximum number of rows // processed per chunk during check constraint validation. checkConstraintBackfillChunkSize = 800
How did we pick this number?
pkg/sql/backfill/backfill.go, line 111 at r1 (raw file):
); err != nil { return err } else {
this really should be allowed -- minimizing the scope in which typed
has to be bound is nice and clean -- but sadly the linter doesn't like it since first block ends in return. dumb linter.
pkg/sql/backfill/backfill.go, line 183 at r1 (raw file):
} if val == tree.DBoolFalse { return roachpb.Key{}, errors.Errorf("validation of CHECK %q failed on row: %s",
I think there's a proper pgerror code we should be returning -- though I don't know if returning it here will make it back to the client.
pkg/sql/logictest/testdata/logic_test/schema_change_in_txn, line 611 at r1 (raw file):
statement ok ALTER TABLE forlater ADD CONSTRAINT e_0 CHECK (e > 0)
Is it sort of weird that you get a success reply here, but if you then ran SELECT ... WHERE e < 0
in the txn, you could observe a violation?
pkg/sql/logictest/testdata/logic_test/schema_change_in_txn, line 613 at r1 (raw file):
ALTER TABLE forlater ADD CONSTRAINT e_0 CHECK (e > 0) statement error pq: validation of CHECK "e > 0" failed on row: k='c', v='d', d=NULL, e=0
So you get an error from commit but the txn does commit, right? just the CHECK doesn't get added? I realize some of our other schema changes have similar behavior, but it really seems like a strange violation of the all-or-nothing aspect of committing a txn. Maybe I just need to get over that expectation though.
3fe0f30
to
91eb4a8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/backfill.go, line 47 at r1 (raw file):
Previously, dt (David Taylor) wrote…
How did we pick this number?
I chose a number larger than any of the other backfill chunk sizes because check backfills don't actually write any data. Testing and benchmarking could be done to tune this.
pkg/sql/backfill/backfill.go, line 183 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I think there's a proper pgerror code we should be returning -- though I don't know if returning it here will make it back to the client.
You're right, there exists 23514 aka check_validation
data exception. However error codes from the schema changer get replaced with a statement_completion_unknown
code.
pkg/sql/logictest/testdata/logic_test/schema_change_in_txn, line 611 at r1 (raw file):
Previously, dt (David Taylor) wrote…
Is it sort of weird that you get a success reply here, but if you then ran
SELECT ... WHERE e < 0
in the txn, you could observe a violation?
Yea this is kinda weird, a similar result is observed with unique constraints as well.
pkg/sql/logictest/testdata/logic_test/schema_change_in_txn, line 613 at r1 (raw file):
Previously, dt (David Taylor) wrote…
So you get an error from commit but the txn does commit, right? just the CHECK doesn't get added? I realize some of our other schema changes have similar behavior, but it really seems like a strange violation of the all-or-nothing aspect of committing a txn. Maybe I just need to get over that expectation though.
Right, the txn which added the mutation does commit. The violation is strange but it's communicated to the user through the error code and name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sqlbase/structured.proto, line 480 at r2 (raw file):
} message CheckConstraint {
Not sure if pulling out this message type is safe or not.
da57c52
to
77a92c4
Compare
Gave this PR another pass: updating the commit message, adding tests, adding a |
cb6ba5c
to
6f9be42
Compare
This change moves check constraint adds and drops through the schema changer when the transaction commits, moving the constraint through the same intermediate states as when index/columns are added or dropped. The only small differences are: - Constraint adds can start in write-only and immediately start being enforced if all used columns are in write-only or public. - Constraint drops can move immediately from public to absent if they have not yet been validated. Therefore, the check constraint is no longer immediately public when it is added, but allows data validation of the constraint to be performed when the constraint is added (and is now the default behaviour). Constraints can now also be added on columns in the process of being added. This change also ensures that there are no data anomalies in either versions of the schema when dropping a validated check constraint, as previously the transition moved the constraint from public -> absent. Writes on the new version were not being checked even though nodes on an older version expect all rows to conform to the constraint. Fixes cockroachdb#33593. See also cockroachdb#33550. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sqlbase/structured.go, line 2261 at r4 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Updated
AddCheckMutation
to match the implementation for indexes and columns.
Also removed a test in schema_changer_test.go
that was testing this special case.
bors r+ |
32504: sql: run check constraint mutations and validation by schema changer r=lucy-zhang a=eriktrinh This change moves check constraint adds and drops through the schema changer when the transaction commits, moving the constraint through the same intermediate states as when index/columns are added or dropped. The only small differences are: - Constraint adds can start in write-only and immediately start being enforced if all used columns are in write-only or public. - Constraint drops can move immediately from public to absent if they have not yet been validated. Therefore, the check constraint is no longer immediately public when it is added, but allows data validation of the constraint to be performed when the constraint is added (and is now the default behaviour). Constraints can now also be added on columns in the process of being added. This change also ensures that there are no data anomalies in either versions of the schema when dropping a validated check constraint, as previously the transition moved the constraint from public -> absent. Writes on the new version were not being checked even though nodes on an older version expect all rows to conform to the constraint. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits. Co-authored-by: Erik Trinh <[email protected]>
Build succeeded |
Currently, constraints are added in the `Unvalidated` state, and are not validated for existing rows until ALTER TABLE ... VALIDATE CONSTRAINT is run. With this change, check constraints will be validated asynchronously after they are added by default (and similar changes to FKs are to follow). This addresses the problematic long-running transactions caused by the current implementation of VALIDATE CONSTRAINT. This PR is a rework of cockroachdb#32504, and has the same tests. With this change, check constraints will be added to the table descriptor in the new `Validating` state, visible to CRUD operations, and a mutation is queued indicating that the constraint is to be validated. During the backfill step, the constraint is validated for existing rows. If validation succeeds, then the constraint moves to the `Validated` state; otherwise, it is dropped. The behavior when dropping constraints (either via DROP CONSTRAINT or indirectly when a column is dropped) is unchanged: no mutation is enqueued. As part of this change, check constraints can be added to non-public columns in the process of being added, including columns that were created earlier in the same transaction. The main difference between this PR and cockroachdb#32504 is that cockroachdb#32504 does not add the constraint to the table descriptor until it has been validated. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits.
Currently, constraints are added in the `Unvalidated` state, and are not validated for existing rows until ALTER TABLE ... VALIDATE CONSTRAINT is run. With this change, check constraints will be validated asynchronously after they are added by default (and similar changes to FKs are to follow). This addresses the problematic long-running transactions caused by the current implementation of VALIDATE CONSTRAINT. This PR is a rework of cockroachdb#32504, and has the same tests. With this change, check constraints will be added to the table descriptor in the new `Validating` state, visible to CRUD operations, and a mutation is queued indicating that the constraint is to be validated. During the backfill step, the constraint is validated for existing rows. If validation succeeds, then the constraint moves to the `Validated` state; otherwise, it is dropped. The behavior when dropping constraints (either via DROP CONSTRAINT or indirectly when a column is dropped) is unchanged: no mutation is enqueued. As part of this change, check constraints can be added to non-public columns in the process of being added, including columns that were created earlier in the same transaction. The main difference between this PR and cockroachdb#32504 is that cockroachdb#32504 does not add the constraint to the table descriptor until it has been validated. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits.
Currently, constraints are added in the `Unvalidated` state, and are not validated for existing rows until ALTER TABLE ... VALIDATE CONSTRAINT is run. With this change, check constraints will be validated asynchronously after they are added by default (and similar changes to FKs are to follow). This addresses the problematic long-running transactions caused by the current implementation of VALIDATE CONSTRAINT. This PR is a rework of cockroachdb#32504, and has the same tests. With this change, check constraints will be added to the table descriptor in the new `Validating` state, visible to CRUD operations, and a mutation is queued indicating that the constraint is to be validated. During the backfill step, the constraint is validated for existing rows. If validation succeeds, then the constraint moves to the `Validated` state; otherwise, it is dropped. The behavior when dropping constraints (either via DROP CONSTRAINT or indirectly when a column is dropped) is unchanged: no mutation is enqueued. As part of this change, check constraints can be added to non-public columns in the process of being added, including columns that were created earlier in the same transaction. The main difference between this PR and cockroachdb#32504 is that cockroachdb#32504 does not add the constraint to the table descriptor until it has been validated. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits.
34301: sql: validate check constraints with the schema changer r=lucy-zhang a=lucy-zhang Currently, constraints are added in the `Unvalidated` state, and are not validated for existing rows until ALTER TABLE ... VALIDATE CONSTRAINT is run. With this change, check constraints will be validated asynchronously after they are added by default (and similar changes to FKs are to follow). This addresses the problematic long-running transactions caused by the current implementation of VALIDATE CONSTRAINT. This PR is a rework of #32504 and has the same tests. With this change, check constraints will be added to the table descriptor in the new `Validating` state, visible to CRUD operations, and a mutation is queued indicating that the constraint is to be validated. During the backfill step, the constraint is validated for existing rows. If validation succeeds, then the constraint moves to the `Validated` state; otherwise, it is dropped. The behavior when dropping constraints (either via DROP CONSTRAINT or indirectly when a column is dropped) is unchanged: no mutation is enqueued. As part of this change, check constraints can be added to non-public columns in the process of being added, including columns that were created earlier in the same transaction. The main difference between this PR and #32504 is that #32504 does not add the constraint to the table descriptor until it has been validated. See #34238 for more context. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits. 34720: rpc: always trace incoming RPCs if tracing is enabled r=andreimatei a=andreimatei Before this patch, an incoming RPC would not be traced if the caller wasn't traced. Usually this was inconsequential, as the caller is generally traced if tracing is enabled in various ways. However, for the status server (AdminUI calls) this was not true - the caller (the browser, through a HTTP->gRPC gateway) was never tracing a call. This patch makes the server create a span regardless of the caller if tracing is enabled. Fixes #34310 Release note: None 34798: roachtest: add two more indexes to bulk index creation test r=vivekmenezes a=vivekmenezes Release note: None 34829: jobs: only include running and very recently finished jobs in SHOW JOBS r=dt a=dt On a cluster that has been running for a long time or with frequent periodic jobs, SHOW JOBS can output an unbounded, massive wall of text. This makes it hard to find the jobs you are likely interested in -- those that are running or have very recently finished. This changes SHOW JOBS to only include running jobs or those that finished in the last 12h. The full listing of jobs is still available via crdb_internal.jobs. Release note (general change): SHOW JOBS only returns running and recently finished jobs. Older jobs can still be inspected via the crdb_internal.jobs table. Co-authored-by: Lucy Zhang <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Vivek Menezes <[email protected]> Co-authored-by: David Taylor <[email protected]>
This change moves check constraint adds and drops through the schema
changer when the transaction commits, moving the constraint through the
same intermediate states as when index/columns are added or dropped. The
only small differences are:
enforced if all used columns are in write-only or public.
have not yet been validated.
Therefore, the check constraint is no longer immediately public when it
is added, but allows data validation of the constraint to be performed
when the constraint is added (and is now the default behaviour).
Constraints can now also be added on columns in the process of being
added.
This change also ensures that there are no data anomalies in either
versions of the schema when dropping a validated check constraint, as
previously the transition moved the constraint from public -> absent.
Writes on the new version were not being checked even though nodes on an
older version expect all rows to conform to the constraint.
Release note (sql change): Check constraint adds by default will
validate table data with the added constraint asynchronously after the
transaction commits.