-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: improvements to constraint validation #34238
Comments
@lucy-zhang this looks like a good plan. parts 1 and 2 look good. I'd move part 1c: implement erik's backfiller for future. I'd like to motivate that by create a large table and proving that chunking is needed for better progress reporting and schema lease extensions. |
I realized that a problem with using SQL queries for validation is that we can't run queries on non-public columns, so we won't be able to add constraints on columns that are being added. So we do actually need the backfiller to preserve that functionality. I opened #34301, which implements the check constraint validation changes with Erik's backfiller implementation. |
34301: sql: validate check constraints with the schema changer r=lucy-zhang a=lucy-zhang Currently, constraints are added in the `Unvalidated` state, and are not validated for existing rows until ALTER TABLE ... VALIDATE CONSTRAINT is run. With this change, check constraints will be validated asynchronously after they are added by default (and similar changes to FKs are to follow). This addresses the problematic long-running transactions caused by the current implementation of VALIDATE CONSTRAINT. This PR is a rework of #32504 and has the same tests. With this change, check constraints will be added to the table descriptor in the new `Validating` state, visible to CRUD operations, and a mutation is queued indicating that the constraint is to be validated. During the backfill step, the constraint is validated for existing rows. If validation succeeds, then the constraint moves to the `Validated` state; otherwise, it is dropped. The behavior when dropping constraints (either via DROP CONSTRAINT or indirectly when a column is dropped) is unchanged: no mutation is enqueued. As part of this change, check constraints can be added to non-public columns in the process of being added, including columns that were created earlier in the same transaction. The main difference between this PR and #32504 is that #32504 does not add the constraint to the table descriptor until it has been validated. See #34238 for more context. Release note (sql change): Check constraint adds by default will validate table data with the added constraint asynchronously after the transaction commits. 34720: rpc: always trace incoming RPCs if tracing is enabled r=andreimatei a=andreimatei Before this patch, an incoming RPC would not be traced if the caller wasn't traced. Usually this was inconsequential, as the caller is generally traced if tracing is enabled in various ways. However, for the status server (AdminUI calls) this was not true - the caller (the browser, through a HTTP->gRPC gateway) was never tracing a call. This patch makes the server create a span regardless of the caller if tracing is enabled. Fixes #34310 Release note: None 34798: roachtest: add two more indexes to bulk index creation test r=vivekmenezes a=vivekmenezes Release note: None 34829: jobs: only include running and very recently finished jobs in SHOW JOBS r=dt a=dt On a cluster that has been running for a long time or with frequent periodic jobs, SHOW JOBS can output an unbounded, massive wall of text. This makes it hard to find the jobs you are likely interested in -- those that are running or have very recently finished. This changes SHOW JOBS to only include running jobs or those that finished in the last 12h. The full listing of jobs is still available via crdb_internal.jobs. Release note (general change): SHOW JOBS only returns running and recently finished jobs. Older jobs can still be inspected via the crdb_internal.jobs table. Co-authored-by: Lucy Zhang <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Vivek Menezes <[email protected]> Co-authored-by: David Taylor <[email protected]>
Everything that isn't tracked in a separate issue is done here |
This issue is for tracking all work related to #32118. The overall goal is for constraint validation on existing rows of a table (which can take a long time) to happen in a separate transaction from the user transaction of the ALTER TABLE query, by running the validation through the schema changer. As part of the change, the validation of existing rows will occur when the constraint is added (which is what Postgres does by default), instead of when VALIDATE CONSTRAINT is run.
This will be done by adding a new type of mutation, representing a constraint to be validated, and a new
ConstraintValidity
state. When ALTER TABLE ... ADD CONSTRAINT is run, a constraint is added to the table descriptor (as before), but in the (new)Validating
state, and will be visible to CRUD operations. A mutation for validating the new constraint is also added to the mutations list. The validation for existing rows then happens during the backfill step; if it succeeds, then the state of the constraint on the table descriptor will be updated toValidated
, and if it fails, the constraint will be dropped.The validation itself should use a backfiller, but in the short term can also plug in to the existing validation implementations that use SQL queries. A backfiller for check constraints has already been implemented by Erik in #32504, but it's not high-priority for FK constraints.
A planned list of changes/PRs is as follows:
planner
Use Erik's backfiller implementation instead of the SQL query (this could be done in the previous steps, but it makes sense to do it separately to keep the PRs smaller)(We'd do this based on need, but there's no indication that we'd need to yet)(Future) Implement FK validation using a backfiller(We'd do this based on need, but there's no indication that we'd need to yet)Unvalidated
state, and for VALIDATE CONSTRAINT to add a mutation to validate it; if the validation fails, the constraint is not dropped but remainsUnvalidated
, which is how Postgres does itThe text was updated successfully, but these errors were encountered: