Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schemachanger: Implement ADD CONSTRAINT ... CHECK #91153

Merged
merged 7 commits into from
Dec 14, 2022

Conversation

Xiang-Gu
Copy link
Contributor

@Xiang-Gu Xiang-Gu commented Nov 2, 2022

This PR implemented ALTER TABLE ... ADD CHECK ... in the declarative
schema changer. It can be broken into two big categories:

  1. Enabling adding/dropping path of ConstraintName element;
  2. Implementing ALTER TABLE ... ADD CHECK in the builder.

Epic: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch 2 times, most recently from acae595 to ae5bff8 Compare November 18, 2022 20:02
@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch from ae5bff8 to d401f30 Compare November 29, 2022 18:53
@Xiang-Gu Xiang-Gu marked this pull request as ready for review November 29, 2022 18:58
@Xiang-Gu Xiang-Gu requested a review from a team November 29, 2022 18:58
@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch 3 times, most recently from b64748a to 01011ba Compare November 29, 2022 21:49
Copy link
Contributor

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done!

Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I have a few comments to make while you deal with test failures.

@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch 2 times, most recently from 50fcb79 to c7c7ce4 Compare November 30, 2022 22:55
@Xiang-Gu Xiang-Gu requested a review from a team as a code owner November 30, 2022 22:55
@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch from c7c7ce4 to 0d673da Compare December 1, 2022 15:30
@Xiang-Gu Xiang-Gu marked this pull request as draft December 1, 2022 16:02
@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch 3 times, most recently from c68a257 to d6df5d4 Compare December 5, 2022 17:05
@Xiang-Gu
Copy link
Contributor Author

Xiang-Gu commented Dec 5, 2022

Now that the CI is green (finally), this is ready for another look.

Most of the combat for CI failure revolves around the improvement we made in the new schema changer that a check constraint references sequences by ID (rather than by name). That difference has consequences that "leaked" to the legacy schema changer, and I need to modify the legacy schema change to handle those consequences.

E.g., if we created a table and added a constraint that references a sequence in the new schema changer, and later drop that sequence cascade in the old schema changer, the old schema changer will see that this sequence is referenced by a table's constraint, which previously won't happen at all (a known limitation), and does not know what to do. I taught it to ignore/skip those tables.

@Xiang-Gu Xiang-Gu marked this pull request as ready for review December 5, 2022 18:17
@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch 5 times, most recently from 03e4c54 to ec33635 Compare December 6, 2022 18:02
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed all but the last commits, I'll continue later. Those are coming along nicely.

@@ -106,22 +106,3 @@ type ConstraintDetail struct {
// Only populated for Check Constraints.
CheckConstraint *TableDescriptor_CheckConstraint
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the ConstraintDetail struct definition also. It's not used at all.

return err
}
if constraint.AsUniqueWithIndex() != nil {
constraint.AsUniqueWithIndex().IndexDesc().Name = op.Name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to be careful with index-backed constraints, because their names are already modelled by the scpb.IndexName element. I'm not keen on having different elements represent the same thing so can you make ConstraintName really represent non-index-backed-constraint names? And in this line of code, return an assertion error.

return errors.AssertionFailedf("unknown constraint type")
}
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This op implementation should probably live in constraint.go instead of index.go.

// Use the DistSQLTypeResolver because we need to resolve types by ID.
resolver := descs.NewDistSQLTypeResolver(descriptors, txn)
// Use a schema resolver because we need to resolve types by ID and table by name.
resolver := NewSkippingCacheSchemaResolver(descriptors, sessiondata.NewStack(sessionData), txn, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment what the last nil is.


minSupportedClusterVersion := alterTableAddConstraintMinSupportedClusterVersion[cmdKey]
return b.EvalCtx().Settings.Version.IsActive(b, minSupportedClusterVersion)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry but this code is still needlessly complex. Rather than dispatching in a switch block why not introduce a supportedAlterTableAddConstraintStatements map in the same vein as the other maps? That way you can get rid of all these useless panic(errors.AssertionFailedf("previously checked guardrail violated: Unknown add constraint command type")) which are effectively dead code. Also instead of keying on free-form strings, key on the reflect.Type of an AST node.

Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a pass on the last commit and didn't find anything shocking. I requested a few changes which will be noisy so I'm holding off on reviewing further. Still, I think this is close. Great work!

@@ -172,6 +176,7 @@ var elementSchemaOptions = []rel.SchemaOption{
rel.EntityAttr(ConstraintID, "ConstraintID"),
rel.EntityAttr(ReferencedSequenceIDs, "UsesSequenceIDs"),
rel.EntityAttr(ReferencedTypeIDs, "UsesTypeIDs"),
rel.EntityAttr(IndexIDToValidate, "IndexIDToValidate"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't actually need this new attribute type, instead rel.EntityAttr(IndexID, "IndexIDToValidate"), will do.

// IndexIDToValidate is the index id to hint to the check constraint validation SQL query. It is used
// exclusively by sql.validateCheckExpr).
optional uint32 index_id_to_validate = 9 [(gogoproto.customname) = "IndexIDToValidate",
(gogoproto.casttype) = "IndexID", (gogoproto.nullable) = false];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that you actually need to add this value to the protobuf, which I'd really prefer to leave alone.
Instead you can plumb a catalog.Index argument to sql.ValidateCheckConstraint which is exclusively used by the declarative schema changer, and pass its ID along to validateCheckExpr. Upstream you need to pass that index ID to the check constraint validation op.

The legacy schema changer exclusively seems to use validateCheckInTxn which could simply pass 0 to validateCheckExpr.

func dropDependentOnSequence(ctx context.Context, p *planner, seqDesc *tabledesc.Mutable) error {
for len(seqDesc.DependedOnBy) > 0 {
dependent := seqDesc.DependedOnBy[0]
// numDependedOnByTablesToSkip is a temporary hack to skip those tables that reference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of describing a hack as a temporary, describe the conditions which need to be fulfilled to remove the hack. One should always assume that, unless somebody's actively working on part of the code right now, it's not going to change soon.

@@ -329,6 +329,9 @@ message CheckConstraint {
Expression embedded_expr = 4 [(gogoproto.nullable) = false, (gogoproto.embed) = true];
// FromHashShardedColumn indicates whether this check constraint comes from a hash sharded column.
bool from_hash_sharded_column = 5;
// index_id_to_validate is the index id to hint to the check constraint validation SQL query. It is used
// exclusively by sql.validateCheckExpr).
uint32 index_id_to_validate = 6 [(gogoproto.customname) = "IndexIDToValidate", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/sem/catid.IndexID"];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you rename this to Index_id_for_validation please?

)

// These rules ensure that columns and constraints on those columns come
// to existence in the appropriate order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it turns out this rule binds a constraint to an index, not to columns.

@@ -400,12 +401,16 @@ func updateBackReferencesInSequences(
return nil
})
if forwardRefs.Contains(seqID) {
// Add `colID` to `updated`, if not exists.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this comment is not useful.

} else {
// Remove `colID` from `updated`, if exists.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: neither is this one.

registerDepRule(
"constraint dependent public right before constraint",
"constraint dependent public right before index-backed constraint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name change is inconsistent with the definition of isConstraint

@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch from ec33635 to 8050fb9 Compare December 7, 2022 22:44
Copy link
Contributor Author

@Xiang-Gu Xiang-Gu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@postamar Thanks for your review. I've addressed your feedback and pushed the changes. If you get a chance, I'd appreciate it if you can give it another look!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan, @fqazi, and @postamar)


pkg/sql/backfill.go line 1540 at r7 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

update comment?

Done.

Code quote:

rn runHistoricalTxn.Exec(ctx, func(

pkg/sql/backfill.go line 1541 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: comment what the last nil is.

done


pkg/sql/drop_sequence.go line 238 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

Instead of describing a hack as a temporary, describe the conditions which need to be fulfilled to remove the hack. One should always assume that, unless somebody's actively working on part of the code right now, it's not going to change soon.

done. I think my previous comments were too verbose and I've shortened it and put the condition to remove this hack in the comment as well.

Code quote:

umDependedOnByTablesToSki

pkg/sql/catalog/descpb/constraint.go line 108 at r13 (raw file):

Previously, postamar (Marius Posta) wrote…

Remove the ConstraintDetail struct definition also. It's not used at all.

done, thanks for pointing this out!


pkg/sql/catalog/descpb/structured.proto line 946 at r18 (raw file):

argument
I hope our slack discussion got you on board with this solution, so I will keep it as is. If you have further doubts, let me know and we can discuss more.

For other reviewers, the most important reason that we want to add such a field in this CheckConstraint element is to allow writing the dep rule easily: the to-be-checked primary index needs to be ready before the check constraint reaches validated. Having such a field allows us to express the dependency rule with a join on this index_id_to_validate field.


pkg/sql/catalog/schemaexpr/expr.go line 86 at r6 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

It's okay to leave this as a follow on

Done.


pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table.go line 171 at r17 (raw file):

Previously, postamar (Marius Posta) wrote…

I'm sorry but this code is still needlessly complex. Rather than dispatching in a switch block why not introduce a supportedAlterTableAddConstraintStatements map in the same vein as the other maps? That way you can get rid of all these useless panic(errors.AssertionFailedf("previously checked guardrail violated: Unknown add constraint command type")) which are effectively dead code. Also instead of keying on free-form strings, key on the reflect.Type of an AST node.

A difficulty of that approach will be that we re-use the same *tree.UniqueConstraintTableDef type for UNIQUE, UNIQUE WITHOUT INDEX, and PRIMARY KEY constraints, so keying on reflect.Type won't work. That's why I chose to just key on free-form strings.

I can definitely remove all those panic code since we shouldn't expect those lines to be hit (I was just being paranoid). Let me know if you have a better alternative solution here (I am not very satisfied with the free-form strings either).

Code quote:

inSupportedClusterVersion := alterTableAddConstraintMinSupportedClusterVersion[cmdKey]

pkg/sql/schemachanger/scbuild/internal/scbuildstmt/alter_table_add_constraint.go line 54 at r7 (raw file):

Previously, postamar (Marius Posta) wrote…

My point was more that we already check for this in alter_table.go and that checking a second time isn't helpful.

I removed those lines.


pkg/sql/schemachanger/scbuild/internal/scbuildstmt/helpers.go line 316 at r7 (raw file):

Previously, postamar (Marius Posta) wrote…

Oh nice. I thought otherwise for some reason. Excellent! That's a relief.

Done.


pkg/sql/schemachanger/scexec/scmutationexec/references.go line 404 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: this comment is not useful.

I spent some time understanding what this function is doing, so I added those comments. But yeah, I think what's better is to add some comments for the whole function. To that regard, I removed those two lines of comments and added some for the whole function.

Code quote:

// Add `colID` to `updated`, if not exists.

pkg/sql/schemachanger/scexec/scmutationexec/references.go line 413 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: neither is this one.

acked above


pkg/sql/schemachanger/scpb/elements.proto line 334 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

can you rename this to Index_id_for_validation please?

done

Code quote:

 index_id_to_validate = 6 [(gogoproto.cus

pkg/sql/schemachanger/scplan/internal/rules/dep_add_constraint.go line 23 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

this name change is inconsistent with the definition of isConstraint

thanks for catching this. Fixed


pkg/sql/schemachanger/screl/attr.go line 179 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

You don't actually need this new attribute type, instead rel.EntityAttr(IndexID, "IndexIDToValidate"), will do.

done

Code quote:

rel.EntityMapping(t((*scpb.CheckConstraint)(nil)),

pkg/sql/schemachanger/scexec/scmutationexec/index.go line 363 at r14 (raw file):

Previously, postamar (Marius Posta) wrote…

We have to be careful with index-backed constraints, because their names are already modelled by the scpb.IndexName element. I'm not keen on having different elements represent the same thing so can you make ConstraintName really represent non-index-backed-constraint names? And in this line of code, return an assertion error.

Done. I renamed the element to ConstraintWithoutIndexName.


pkg/sql/schemachanger/scexec/scmutationexec/index.go line 374 at r14 (raw file):

Previously, postamar (Marius Posta) wrote…

This op implementation should probably live in constraint.go instead of index.go.

agreed. I created a new constraint.go in this package and moved all the op implementations related to constraints there.

Code quote:

	} else if constraint.AsUniqueWithoutIndex() != nil {
		constraint.AsUniqueWithoutIndex().UniqueWithoutIndexDesc().Name = op.Name
	} else if constraint.AsCheck() != nil {
		constraint.AsCheck().CheckDesc().Name = op.Name

pkg/sql/schemachanger/scplan/internal/rules/dep_add_column_and_constraint.go line 21 at r18 (raw file):

Previously, postamar (Marius Posta) wrote…

As it turns out this rule binds a constraint to an index, not to columns.

done. I've changed the comments as well as the file name.

Code quote:

// These rules ensure that columns and constraints on those columns come
// to existence in the appropriate order.

@postamar
Copy link
Contributor

postamar commented Dec 8, 2022

FYI, you forgot about removing index_id_to_validate from structured.proto which was arguably my number 1 concern.

@Xiang-Gu
Copy link
Contributor Author

Xiang-Gu commented Dec 8, 2022

FYI, you forgot about removing index_id_to_validate from structured.proto which was arguably my number 1 concern.

Don't we need it? Otherwise, how could we pass this information from scpb.CheckConstraint to a descpb.TableDescriptor_CheckConstraint?

(I indeed forget to change its name to index_id_for_validation though but I just changed it)

@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch from 8050fb9 to 6ac0367 Compare December 8, 2022 22:03
@postamar
Copy link
Contributor

postamar commented Dec 9, 2022

Like I already said, I don't think that we need it because we only ever need it in the declarative schema changer, and there the ID is readily available from the element.

@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch 2 times, most recently from 5aee5f1 to ddcb466 Compare December 12, 2022 22:40
@Xiang-Gu
Copy link
Contributor Author

@postamar Sorry, I didn't fully understand you previously. Yeah, I agree it's nicer to leave structure.proto alone and plumb in the index id for validation via scop.ValidateCheckConstraint operation. I've pushed the changes and can you retake a look at it? Thank you so much!

1. retain check constraint in the dropping path on `Checks`. The old
schema changer has the behavior of retaining the to-be-dropped
check constraint on the `Checks` slice. We will do the same in the
new schema changer.

2. when adding a check constraint, we enqueue it to the mutation slice
and then immediately fast-forward its mutation state to WRITE_ONLY.
Previously it will stay in DELTE_ONLY. This change makes it clear that
the check constraint, at this stage, is "enforced".

3. removed some dead code.
We want to be rigiorous in our naming of the `ConstraintName` element,
which is really just for non-index-backed constraint.
ConstraintName element is deemed simple dependents of constraint and
hence has transition between PUBLIC <==> ABSENT. This commit thus
is primarily implementing the scop `SetConstraintName`.
During constraint validation, a side hustle is to resolve table names
and type so we can pretty print them in log/error messages. The existing
resolver is not adequate if the check constraint references sequences
or user-defined types.
`DequalifyAndValidateExpr` is used to validate an expression such as
the expr has given type, its volatility is "alright", etc. This function
will be used in the builder for `ADD CONSTRAINT ... CHECK` but it
currently expects a table descriptor, which we might not have in the
builder. To circumvent that limitation, this commit refactor this
function by plumbing in the needed column information and lookupFn.
`minClusterVersion` is used to version gate each fully supported
stmts in the declarative schema changer. Previously, there is only one
`minClusterVersion` for stmt `ALTER TABLE ADD CONSTRAINT`, which is fine
bc we only supported one `ADD CONSTRAINT` (namely, `ADD PRIMARY KEY`)
so far. We will need to extend this as we are going to support more
`ADD CONSTRAINT` in the declarative schema changer.
This commit implements `ADD CONSTRAINT ... CHECK`. Thanks to the
previous infrastructural work for enabling adding/dropping check
constraint, we only need to add code in the new schema changer builder.

It also enabled `ADD CONSTRAINT ... CHECK` to be on by default. It
requires us to turn off new schema changer in various tests bc those
tests depends on specific behavior/knobs of the old schema changer.
@Xiang-Gu Xiang-Gu force-pushed the implement-add-constraint-check branch from ddcb466 to 51e2201 Compare December 13, 2022 17:54
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks! We really don't want to touch those descpb protobufs unless we need to.

@Xiang-Gu
Copy link
Contributor Author

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Dec 14, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants