-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: support the ADD COLUMN ... REFERENCES syntax #47082
Conversation
❌ The GitHub CI (Cockroach) build has failed on 40bd2ca3. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan. |
0eadd27
to
24f5a09
Compare
❌ The GitHub CI (Cockroach) build has failed on 24f5a09a. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan. |
❌ The GitHub CI (Cockroach) build has failed on 24f5a09a. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan. |
Is this blocked on anything? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass with some general comments.
Reviewed 8 of 8 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)
pkg/sql/create_table.go, line 738 at r1 (raw file):
} else { // No existing suitable index was found. if ts == NonEmptyTable && !isFKOnNewlyAddedColumn(tbl, originColumns) {
I don't like the idea of making an exception for ADD COLUMN ... REFERENCES
in our past stance to not auto-create new indexes for FKs unless the table is empty. My understanding is that the current implementation to auto-create an index is temporary because the index requirement will be lifted in 20.2. If we're fine with auto-creating indexes in more cases now because we know it'll be temporary, then why don't we just allow it in all cases?
Or we could disallow ADD COLUMN ... REFERENCES
for now unless the table is empty, which would be more consistent with what we have now.
(FWIW, we used to avoid creating new indexes because the pre-19.1 index backfiller was too expensive, but the SST-based index backfiller is much faster. But this is all a moot point if the plan is to get rid of the requirement for the index in the first place in 20.2. If that turns out to not make it into 20.2 for some reason, maybe we can revisit the index performance discussion.)
pkg/sql/logictest/testdata/logic_test/alter_table, line 1109 at r1 (raw file):
ROLLBACK subtest add_col_references
Eventually at some point we'll want to have tests for
- table created earlier in same transaction
- table references itself
- trying to add more foreign key constraints on the same column in the same transaction (not allowed)
- adding a column and then separately adding a foreign key in the same transaction (as I understand it, with this PR it will be allowed)
pkg/sql/sem/tree/alter_table.go, line 164 at r1 (raw file):
} d.CheckExprs = nil if d.HasFKConstraint() {
If we do end up making an exception to not auto-creating indexes for ALTER COLUMN REFERENCES
, is it possible to rewrite the AST here to be add column
, add index
, alter column ... references
instead of having special cases later in the code to add an index? (If this isn't currently possible, I think we should do whatever it takes to make it possible. It seems desirable to have ALTER COLUMN REFERENCES
behave identically to writing out each "command" manually.
cc @jordanlewis @awoods187 do you have an opinion on this? Do you know what users expect around this behavior? |
My thoughts on the matter are that we should automatically create the index for the foreign key constraint. Forcing a user to separately create the index for their new column just means not using the command. A similar case is the "add column unique", where we build the unique index while building the column.
I don't think so -- there create index isn't part of the alter table ast. Though I think that we can do whatever is done by the |
I think this would probably be fine, at least to start, because my guess is that most of these things get auto run by ORMs and stuff at startup - and that's the most frustrating thing for a user, when they're just running their basic migrations and stuff and something doesn't work. I'm guessing that once they have something in prod with data, they'll be more willing to rewrite stuff to a slightly different form if it doesn't quite work, as long as we include a helpful error mressag.e |
Yeah, I agree with this. I also think we should stop auto-creating indexes once the optimizer gets CASCADE support and we won't need an index on the referencing side anymore, which is planned for 20.2. So it's likely that once we release, users won't even have encountered the "empty table only" restriction for this feature at all. |
(#22253 is the issue for getting rid of the index requirement) |
ede1644
to
6ccf57e
Compare
Updated to not automatically create the index, and added more tests, PTAL!
I believe this is allowed now (#43417) |
❌ The GitHub CI (Cockroach) build has failed on 6ccf57ee. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
❌ The GitHub CI (Cockroach) build has failed on 6ccf57ee. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to do anything to add RSG tests for the new syntax, or anything along those lines?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @lucy-zhang, and @rohany)
pkg/sql/create_table.go, line 675 at r2 (raw file):
} if len(targetCols) != len(originColumns) {
nit: Since we're renaming srcCols
to originColumns
, can we also rename targetCols
to referencedColumns
?
pkg/sql/create_table.go, line 734 at r2 (raw file):
// a DEFAULT expression of NULL and a NOT NULL constraint. if d.Actions.Delete == tree.SetDefault || d.Actions.Update == tree.SetDefault { for _, sourceColumn := range originColumns {
nit: Maybe for _, originColumn := range originColumns
or something like that.
pkg/sql/logictest/testdata/logic_test/alter_table, line 1109 at r1 (raw file):
trying to add more foreign key constraints on the same column in the same transaction (not allowed)
I believe this is allowed now (#43417)
Oh, right, thanks.
pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):
statement error adding a REFERENCES constraint while the column is being added not supported BEGIN; ALTER TABLE childid ADD id2 INT UNIQUE NOT NULL DEFAULT 0; ALTER TABLE childid ADD CONSTRAINT fk_id FOREIGN KEY (id2) REFERENCES parentid (k);
Should we keep this test and update it to verify that these statements work? (Maybe we also need one for the add column ...; create index ...; add foreign key...
case, for an empty table, to make sure we can find the index that's been created already instead of making a new one.)
pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):
// It returns either an index that is active, or an index that was created // in the same transaction that is currently running. func FindFKOriginIndexInTxn(
I see we're still using FindFKOriginIndex
in the FK check helper implementation. What exactly happens when the table descriptor is in the state where the index is still a mutation, but we've started enforcing the FK constraint for the non-public index? How does the FK check helper know about the index?
A more general comment that I probably should have thought of earlier: While this PR is mostly about the new supported syntax to create a column, add an index, and add a FK all at once (for a new or empty table), it also would introduce the ability to (explicitly) create an index and add a FK based on it in the same transaction even for non-empty tables, right? Something like this currently doesn't work, but would start working:
create table t (a int); create table parent (a int primary key);
insert into parent values (1); insert into t values (1);
begin;
create index on t (a);
alter table t add foreign key (a) references parent;
commit;
I'm starting to think that we should actually consider "index + FK created in same transaction" as a separate new feature, and test specifically for it, since it has ramifications (including on the SQL exec side of things) beyond just supporting ADD COLUMN REFERENCES
. Either that, or we could come up with some special handling to disallow it, though I'm not sure what a principled way to do that would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)
pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
I see we're still using
FindFKOriginIndex
in the FK check helper implementation. What exactly happens when the table descriptor is in the state where the index is still a mutation, but we've started enforcing the FK constraint for the non-public index? How does the FK check helper know about the index?A more general comment that I probably should have thought of earlier: While this PR is mostly about the new supported syntax to create a column, add an index, and add a FK all at once (for a new or empty table), it also would introduce the ability to (explicitly) create an index and add a FK based on it in the same transaction even for non-empty tables, right? Something like this currently doesn't work, but would start working:
create table t (a int); create table parent (a int primary key); insert into parent values (1); insert into t values (1); begin; create index on t (a); alter table t add foreign key (a) references parent; commit;
I'm starting to think that we should actually consider "index + FK created in same transaction" as a separate new feature, and test specifically for it, since it has ramifications (including on the SQL exec side of things) beyond just supporting
ADD COLUMN REFERENCES
. Either that, or we could come up with some special handling to disallow it, though I'm not sure what a principled way to do that would be.
Of course, once we get rid of the requirement for an index on the referencing side, the schema changer will no longer care about indexes on the table, and the old FK check implementation won't be used, so maybe this just won't be a problem anymore. Though maybe we'll still need to keep the old code path around for backward compatibility during the upgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @lucy-zhang, and @RaduBerinde)
pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Should we keep this test and update it to verify that these statements work? (Maybe we also need one for the
add column ...; create index ...; add foreign key...
case, for an empty table, to make sure we can find the index that's been created already instead of making a new one.)
I'll add the tests for this near the other references tests.
pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Of course, once we get rid of the requirement for an index on the referencing side, the schema changer will no longer care about indexes on the table, and the old FK check implementation won't be used, so maybe this just won't be a problem anymore. Though maybe we'll still need to keep the old code path around for backward compatibility during the upgrade?
Based on our discussions offline, it seems like this problem already exists when we create an index for the FK automatically. We can investigate that and move forward. @RaduBerinde do you have an idea about the probability that we remove the need for the origin indexes this release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)
pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
Based on our discussions offline, it seems like this problem already exists when we create an index for the FK automatically. We can investigate that and move forward. @RaduBerinde do you have an idea about the probability that we remove the need for the origin indexes this release?
I am optimistic we can remove it in this release.
RE rsg/sqlsmith tests -- IDTS for RSG, and sqlsmith doesn't attempt to make foreign keys right now. |
I talked about this non-public index problem with @jordanlewis and since it is existing and we plan on getting rid of the restriction anyway, it shouldn't block this work. However, we should maybe add a more informative error if this happens during a write or delete. One wonky thing is the index that gets created when adding a column with unique. If we allow this index to be used as the backing index for the foreign key it would become much more likely to see the error with the non-public foreign key since the period where the index is not visible could be much longer. In that case, should we not allow using that index for the added column? The downside is now an extra index would be created if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked about this non-public index problem with @jordanlewis and since it is existing and we plan on getting rid of the restriction anyway, it shouldn't block this work.
That's fine with me.
If we allow this index to be used as the backing index for the foreign key it would become much more likely to see the error with the non-public foreign key since the period where the index is not visible could be much longer.
I'm not sure I follow. Longer compared to if we auto-created a non-unique index?
Reviewed 1 of 7 files at r2, 2 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)
pkg/sql/create_table.go, line 675 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
nit: Since we're renaming
srcCols
tooriginColumns
, can we also renametargetCols
toreferencedColumns
?
Not a big deal but I think matching referencedCols
and originCols
is nicer.
pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):
CREATE INDEX ON t2 (x); ALTER TABLE t2 ADD FOREIGN KEY (x) REFERENCES t1 (x); COMMIT
Do we have a variant of this test and the one above it where the tables are created outside the transaction? There's a separate, simpler code path when the tables are created in the same transaction (they don't go through the schema changer).
pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
I'll add the tests for this near the other references tests.
(see comment above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant that we auto create the index for a unique column even if the table has data -- so if we allowed this unique index to be used for the foreign key on a table that had alot of data, there could be a larger window of time where the constraint is active and the index is not yet public. Or is this not a worry?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)
pkg/sql/create_table.go, line 675 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Not a big deal but I think matching
referencedCols
andoriginCols
is nicer.
I think you have a slightly out of date revision -- i updated this.
pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Do we have a variant of this test and the one above it where the tables are created outside the transaction? There's a separate, simpler code path when the tables are created in the same transaction (they don't go through the schema changer).
Added
pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
(see comment above)
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug (manifesting in an error returned from FindFKOriginIndex
on writes) only happens in the interval between when we've added the validating
constraint to the table and when the schema change is done (i.e., the interval when we actually validate the constraint), so all this happens after the index backfill. Does that address your concern?
Focusing just on 20.1/19.2 and potential backported fixes for a bit: I just tried to write a test for this bug (with an auto-added index) and realized it affects writes for the entire CASCADE chain, not just the empty table to which we're adding a FK constraint with an auto-generated index. I agree that putting in a more helpful error would be good (and we'd want to document this as a known limitation). The alternative of fixing the FK check helper to look for non-public indexes seems like too invasive of a change to backport. I can open an issue separately for this.
Back to 20.2: This PR does expand the set of cases in which the bug can be encountered, by opening up new ways in which a new FK can depend on a new index. But if we're confident that we won't need an index anymore for 20.2, then can we just live with this increased buggy behavior on master for a little while longer? That seems to be the consensus, but I wanted to verify that we're fine with this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)
pkg/sql/create_table.go, line 675 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
I think you have a slightly out of date revision -- i updated this.
Oh, I meant that currently we have referencedCols
and originColumns
and we might as well abbreviate both.
pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):
Previously, rohany (Rohan Yadav) wrote…
Added
I think it'd be useful to just have a test covering a minimal case of the buggy case we've been discussing:
BEGIN
CREATE INDEX ...
ALTER COLUMN ... ADD FOREIGN KEY ...
COMMIT
where the FK uses the new index. This by itself works, even though the bug happens in a later stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that makes sense -- so the length of the backfill doesn't affect the probability this bug surfaces.
I agree that putting in a more helpful error would be good (and we'd want to document this as a known limitation). The alternative of fixing the FK check helper to look for non-public indexes seems like too invasive of a change to backport. I can open an issue separately for this.
I can add these errors as a separate PR. I'll make an issue to point to as well.
But if we're confident that we won't need an index anymore for 20.2, then can we just live with this increased buggy behavior on master for a little while longer? That seems to be the consensus, but I wanted to verify that we're fine with this.
Yeah, i think this is fine. If it looks like we can't get rid of the restriction by the release, we can just remove the ability for the new index to be used at that point.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)
pkg/sql/create_table.go, line 675 at r2 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Oh, I meant that currently we have
referencedCols
andoriginColumns
and we might as well abbreviate both.
Ohh, sorry for the confusion. Updated.
pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
I think it'd be useful to just have a test covering a minimal case of the buggy case we've been discussing:
BEGIN CREATE INDEX ... ALTER COLUMN ... ADD FOREIGN KEY ... COMMIT
where the FK uses the new index. This by itself works, even though the bug happens in a later stage.
Is this test not what you are looking for?
statement ok
DROP TABLE t1, t2 CASCADE;
CREATE TABLE t1 (x INT PRIMARY KEY);
CREATE TABLE t2 (y INT);
statement ok
BEGIN;
ALTER TABLE t2 ADD COLUMN x INT;
CREATE INDEX ON t2 (x);
ALTER TABLE t2 ADD FOREIGN KEY (x) REFERENCES t1 (x);
COMMIT
query TT
SHOW CREATE t2
----
t2 CREATE TABLE t2 (
y INT8 NULL,
x INT8 NULL,
CONSTRAINT fk_x_ref_t1 FOREIGN KEY (x) REFERENCES t1(x),
INDEX t2_x_idx (x ASC),
FAMILY "primary" (y, rowid, x)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one comment about the tests.
Reviewed 1 of 1 files at r4, 1 of 1 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)
pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):
Previously, rohany (Rohan Yadav) wrote…
Is this test not what you are looking for?
statement ok DROP TABLE t1, t2 CASCADE; CREATE TABLE t1 (x INT PRIMARY KEY); CREATE TABLE t2 (y INT); statement ok BEGIN; ALTER TABLE t2 ADD COLUMN x INT; CREATE INDEX ON t2 (x); ALTER TABLE t2 ADD FOREIGN KEY (x) REFERENCES t1 (x); COMMIT query TT SHOW CREATE t2 ---- t2 CREATE TABLE t2 ( y INT8 NULL, x INT8 NULL, CONSTRAINT fk_x_ref_t1 FOREIGN KEY (x) REFERENCES t1(x), INDEX t2_x_idx (x ASC), FAMILY "primary" (y, rowid, x) )
The only difference is the column being created in the transaction vs. prior to it.
Fixes cockroachdb#32917. This PR adds support for the add column references statement by allowing the foreign key building code to use columns and indexes added in the current txn. The schema changer already understands how to add the combination of the three in the same transaction. Release note (sql change): This PR adds support for the `ALTER TABLE ... ADD COLUMN ... REFERENCES ...` syntax for tables that are empty.
Ah, sorry for the confusion (again). Updated. |
bors r=lucy-zhang |
Build succeeded |
Fixes #32917.
This PR adds support for the add column references
statement by allowing the foreign key building code
to use columns and indexes added in the current txn.
The schema changer already understands how to add
the combination of the three in the same transaction.
Release note (sql change): This PR adds support for the
ALTER TABLE ... ADD COLUMN ... REFERENCES ...
syntax.