Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support the ADD COLUMN ... REFERENCES syntax #47082

Merged
merged 1 commit into from
May 4, 2020

Conversation

rohany
Copy link
Contributor

@rohany rohany commented Apr 6, 2020

Fixes #32917.

This PR adds support for the add column references
statement by allowing the foreign key building code
to use columns and indexes added in the current txn.
The schema changer already understands how to add
the combination of the three in the same transaction.

Release note (sql change): This PR adds support for the
ALTER TABLE ... ADD COLUMN ... REFERENCES ... syntax.

@rohany rohany requested review from jordanlewis, thoszhang and a team April 6, 2020 17:46
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@blathers-crl
Copy link

blathers-crl bot commented Apr 6, 2020

❌ The GitHub CI (Cockroach) build has failed on 40bd2ca3.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@rohany rohany force-pushed the add-references branch 2 times, most recently from 0eadd27 to 24f5a09 Compare April 6, 2020 19:12
@blathers-crl
Copy link

blathers-crl bot commented Apr 6, 2020

❌ The GitHub CI (Cockroach) build has failed on 24f5a09a.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 6, 2020

❌ The GitHub CI (Cockroach) build has failed on 24f5a09a.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan.

@fire
Copy link

fire commented Apr 22, 2020

Is this blocked on anything?

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass with some general comments.

Reviewed 8 of 8 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)


pkg/sql/create_table.go, line 738 at r1 (raw file):

	} else {
		// No existing suitable index was found.
		if ts == NonEmptyTable && !isFKOnNewlyAddedColumn(tbl, originColumns) {

I don't like the idea of making an exception for ADD COLUMN ... REFERENCES in our past stance to not auto-create new indexes for FKs unless the table is empty. My understanding is that the current implementation to auto-create an index is temporary because the index requirement will be lifted in 20.2. If we're fine with auto-creating indexes in more cases now because we know it'll be temporary, then why don't we just allow it in all cases?

Or we could disallow ADD COLUMN ... REFERENCES for now unless the table is empty, which would be more consistent with what we have now.

(FWIW, we used to avoid creating new indexes because the pre-19.1 index backfiller was too expensive, but the SST-based index backfiller is much faster. But this is all a moot point if the plan is to get rid of the requirement for the index in the first place in 20.2. If that turns out to not make it into 20.2 for some reason, maybe we can revisit the index performance discussion.)


pkg/sql/logictest/testdata/logic_test/alter_table, line 1109 at r1 (raw file):

ROLLBACK

subtest add_col_references

Eventually at some point we'll want to have tests for

  • table created earlier in same transaction
  • table references itself
  • trying to add more foreign key constraints on the same column in the same transaction (not allowed)
  • adding a column and then separately adding a foreign key in the same transaction (as I understand it, with this PR it will be allowed)

pkg/sql/sem/tree/alter_table.go, line 164 at r1 (raw file):

			}
			d.CheckExprs = nil
			if d.HasFKConstraint() {

If we do end up making an exception to not auto-creating indexes for ALTER COLUMN REFERENCES, is it possible to rewrite the AST here to be add column, add index, alter column ... references instead of having special cases later in the code to add an index? (If this isn't currently possible, I think we should do whatever it takes to make it possible. It seems desirable to have ALTER COLUMN REFERENCES behave identically to writing out each "command" manually.

@rohany
Copy link
Contributor Author

rohany commented Apr 24, 2020

I don't like the idea of making an exception for ADD COLUMN ... REFERENCES in our past stance to not auto-create new indexes for FKs unless the table is empty. My understanding is that the current implementation to auto-create an index is temporary because the index requirement will be lifted in 20.2. If we're fine with auto-creating indexes in more cases now because we know it'll be temporary, then why don't we just allow it in all cases?
Or we could disallow ADD COLUMN ... REFERENCES for now unless the table is empty, which would be more consistent with what we have now.

cc @jordanlewis @awoods187 do you have an opinion on this? Do you know what users expect around this behavior?

@rohany
Copy link
Contributor Author

rohany commented Apr 25, 2020

My thoughts on the matter are that we should automatically create the index for the foreign key constraint. Forcing a user to separately create the index for their new column just means not using the command. A similar case is the "add column unique", where we build the unique index while building the column.

is it possible to rewrite the AST here

I don't think so -- there create index isn't part of the alter table ast. Though I think that we can do whatever is done by the UNIQUE constraint here too. When we create the column desc from the column def, we also create an index for the column and add the mutation for it. We can update that to be if unique or has a references. Then we can avoid the special casing within the foreign key building logic.

@jordanlewis
Copy link
Member

Or we could disallow ADD COLUMN ... REFERENCES for now unless the table is empty, which would be more consistent with what we have now.

I think this would probably be fine, at least to start, because my guess is that most of these things get auto run by ORMs and stuff at startup - and that's the most frustrating thing for a user, when they're just running their basic migrations and stuff and something doesn't work.

I'm guessing that once they have something in prod with data, they'll be more willing to rewrite stuff to a slightly different form if it doesn't quite work, as long as we include a helpful error mressag.e

@thoszhang
Copy link
Contributor

I think this would probably be fine, at least to start, because my guess is that most of these things get auto run by ORMs and stuff at startup - and that's the most frustrating thing for a user, when they're just running their basic migrations and stuff and something doesn't work.

Yeah, I agree with this.

I also think we should stop auto-creating indexes once the optimizer gets CASCADE support and we won't need an index on the referencing side anymore, which is planned for 20.2. So it's likely that once we release, users won't even have encountered the "empty table only" restriction for this feature at all.

@thoszhang
Copy link
Contributor

(#22253 is the issue for getting rid of the index requirement)

@rohany rohany force-pushed the add-references branch 4 times, most recently from ede1644 to 6ccf57e Compare April 27, 2020 16:49
@rohany
Copy link
Contributor Author

rohany commented Apr 27, 2020

Updated to not automatically create the index, and added more tests, PTAL!

trying to add more foreign key constraints on the same column in the same transaction (not allowed)

I believe this is allowed now (#43417)

@blathers-crl
Copy link

blathers-crl bot commented Apr 27, 2020

❌ The GitHub CI (Cockroach) build has failed on 6ccf57ee.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link

blathers-crl bot commented Apr 27, 2020

❌ The GitHub CI (Cockroach) build has failed on 6ccf57ee.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do anything to add RSG tests for the new syntax, or anything along those lines?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @lucy-zhang, and @rohany)


pkg/sql/create_table.go, line 675 at r2 (raw file):

	}

	if len(targetCols) != len(originColumns) {

nit: Since we're renaming srcCols to originColumns, can we also rename targetCols to referencedColumns?


pkg/sql/create_table.go, line 734 at r2 (raw file):

	// a DEFAULT expression of NULL and a NOT NULL constraint.
	if d.Actions.Delete == tree.SetDefault || d.Actions.Update == tree.SetDefault {
		for _, sourceColumn := range originColumns {

nit: Maybe for _, originColumn := range originColumns or something like that.


pkg/sql/logictest/testdata/logic_test/alter_table, line 1109 at r1 (raw file):

trying to add more foreign key constraints on the same column in the same transaction (not allowed)

I believe this is allowed now (#43417)

Oh, right, thanks.


pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):


statement error adding a REFERENCES constraint while the column is being added not supported
BEGIN; ALTER TABLE childid ADD id2 INT UNIQUE NOT NULL DEFAULT 0; ALTER TABLE childid ADD CONSTRAINT fk_id FOREIGN KEY (id2) REFERENCES parentid (k);

Should we keep this test and update it to verify that these statements work? (Maybe we also need one for the add column ...; create index ...; add foreign key... case, for an empty table, to make sure we can find the index that's been created already instead of making a new one.)


pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):

// It returns either an index that is active, or an index that was created
// in the same transaction that is currently running.
func FindFKOriginIndexInTxn(

I see we're still using FindFKOriginIndex in the FK check helper implementation. What exactly happens when the table descriptor is in the state where the index is still a mutation, but we've started enforcing the FK constraint for the non-public index? How does the FK check helper know about the index?

A more general comment that I probably should have thought of earlier: While this PR is mostly about the new supported syntax to create a column, add an index, and add a FK all at once (for a new or empty table), it also would introduce the ability to (explicitly) create an index and add a FK based on it in the same transaction even for non-empty tables, right? Something like this currently doesn't work, but would start working:

create table t (a int); create table parent (a int primary key);
insert into parent values (1); insert into t values (1);
begin;
create index on t (a);
alter table t add foreign key (a) references parent;
commit;

I'm starting to think that we should actually consider "index + FK created in same transaction" as a separate new feature, and test specifically for it, since it has ramifications (including on the SQL exec side of things) beyond just supporting ADD COLUMN REFERENCES. Either that, or we could come up with some special handling to disallow it, though I'm not sure what a principled way to do that would be.

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)


pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

I see we're still using FindFKOriginIndex in the FK check helper implementation. What exactly happens when the table descriptor is in the state where the index is still a mutation, but we've started enforcing the FK constraint for the non-public index? How does the FK check helper know about the index?

A more general comment that I probably should have thought of earlier: While this PR is mostly about the new supported syntax to create a column, add an index, and add a FK all at once (for a new or empty table), it also would introduce the ability to (explicitly) create an index and add a FK based on it in the same transaction even for non-empty tables, right? Something like this currently doesn't work, but would start working:

create table t (a int); create table parent (a int primary key);
insert into parent values (1); insert into t values (1);
begin;
create index on t (a);
alter table t add foreign key (a) references parent;
commit;

I'm starting to think that we should actually consider "index + FK created in same transaction" as a separate new feature, and test specifically for it, since it has ramifications (including on the SQL exec side of things) beyond just supporting ADD COLUMN REFERENCES. Either that, or we could come up with some special handling to disallow it, though I'm not sure what a principled way to do that would be.

Of course, once we get rid of the requirement for an index on the referencing side, the schema changer will no longer care about indexes on the table, and the old FK check implementation won't be used, so maybe this just won't be a problem anymore. Though maybe we'll still need to keep the old code path around for backward compatibility during the upgrade?

Copy link
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @lucy-zhang, and @RaduBerinde)


pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

Should we keep this test and update it to verify that these statements work? (Maybe we also need one for the add column ...; create index ...; add foreign key... case, for an empty table, to make sure we can find the index that's been created already instead of making a new one.)

I'll add the tests for this near the other references tests.


pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

Of course, once we get rid of the requirement for an index on the referencing side, the schema changer will no longer care about indexes on the table, and the old FK check implementation won't be used, so maybe this just won't be a problem anymore. Though maybe we'll still need to keep the old code path around for backward compatibility during the upgrade?

Based on our discussions offline, it seems like this problem already exists when we create an index for the FK automatically. We can investigate that and move forward. @RaduBerinde do you have an idea about the probability that we remove the need for the origin indexes this release?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)


pkg/sql/sqlbase/table.go, line 539 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

Based on our discussions offline, it seems like this problem already exists when we create an index for the FK automatically. We can investigate that and move forward. @RaduBerinde do you have an idea about the probability that we remove the need for the origin indexes this release?

I am optimistic we can remove it in this release.

@rohany
Copy link
Contributor Author

rohany commented Apr 28, 2020

RE rsg/sqlsmith tests -- IDTS for RSG, and sqlsmith doesn't attempt to make foreign keys right now.

@rohany
Copy link
Contributor Author

rohany commented Apr 28, 2020

I talked about this non-public index problem with @jordanlewis and since it is existing and we plan on getting rid of the restriction anyway, it shouldn't block this work. However, we should maybe add a more informative error if this happens during a write or delete. One wonky thing is the index that gets created when adding a column with unique. If we allow this index to be used as the backing index for the foreign key it would become much more likely to see the error with the non-public foreign key since the period where the index is not visible could be much longer. In that case, should we not allow using that index for the added column? The downside is now an extra index would be created if ADD COLUMN .. UNIQUE .. REFERENCES ... was executed on an empty table.

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked about this non-public index problem with @jordanlewis and since it is existing and we plan on getting rid of the restriction anyway, it shouldn't block this work.

That's fine with me.

If we allow this index to be used as the backing index for the foreign key it would become much more likely to see the error with the non-public foreign key since the period where the index is not visible could be much longer.

I'm not sure I follow. Longer compared to if we auto-created a non-unique index?

Reviewed 1 of 7 files at r2, 2 of 2 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)


pkg/sql/create_table.go, line 675 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

nit: Since we're renaming srcCols to originColumns, can we also rename targetCols to referencedColumns?

Not a big deal but I think matching referencedCols and originCols is nicer.


pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):

CREATE INDEX ON t2 (x);
ALTER TABLE t2 ADD FOREIGN KEY (x) REFERENCES t1 (x);
COMMIT

Do we have a variant of this test and the one above it where the tables are created outside the transaction? There's a separate, simpler code path when the tables are created in the same transaction (they don't go through the schema changer).


pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

I'll add the tests for this near the other references tests.

(see comment above)

Copy link
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that we auto create the index for a unique column even if the table has data -- so if we allowed this unique index to be used for the foreign key on a table that had alot of data, there could be a larger window of time where the constraint is active and the index is not yet public. Or is this not a worry?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)


pkg/sql/create_table.go, line 675 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

Not a big deal but I think matching referencedCols and originCols is nicer.

I think you have a slightly out of date revision -- i updated this.


pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

Do we have a variant of this test and the one above it where the tables are created outside the transaction? There's a separate, simpler code path when the tables are created in the same transaction (they don't go through the schema changer).

Added


pkg/sql/logictest/testdata/logic_test/fk, line 2529 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

(see comment above)

^

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug (manifesting in an error returned from FindFKOriginIndex on writes) only happens in the interval between when we've added the validating constraint to the table and when the schema change is done (i.e., the interval when we actually validate the constraint), so all this happens after the index backfill. Does that address your concern?

Focusing just on 20.1/19.2 and potential backported fixes for a bit: I just tried to write a test for this bug (with an auto-added index) and realized it affects writes for the entire CASCADE chain, not just the empty table to which we're adding a FK constraint with an auto-generated index. I agree that putting in a more helpful error would be good (and we'd want to document this as a known limitation). The alternative of fixing the FK check helper to look for non-public indexes seems like too invasive of a change to backport. I can open an issue separately for this.

Back to 20.2: This PR does expand the set of cases in which the bug can be encountered, by opening up new ways in which a new FK can depend on a new index. But if we're confident that we won't need an index anymore for 20.2, then can we just live with this increased buggy behavior on master for a little while longer? That seems to be the consensus, but I wanted to verify that we're fine with this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)


pkg/sql/create_table.go, line 675 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

I think you have a slightly out of date revision -- i updated this.

Oh, I meant that currently we have referencedCols and originColumns and we might as well abbreviate both.


pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):

Previously, rohany (Rohan Yadav) wrote…

Added

I think it'd be useful to just have a test covering a minimal case of the buggy case we've been discussing:

BEGIN
CREATE INDEX ...
ALTER COLUMN ... ADD FOREIGN KEY ...
COMMIT

where the FK uses the new index. This by itself works, even though the bug happens in a later stage.

Copy link
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that makes sense -- so the length of the backfill doesn't affect the probability this bug surfaces.

I agree that putting in a more helpful error would be good (and we'd want to document this as a known limitation). The alternative of fixing the FK check helper to look for non-public indexes seems like too invasive of a change to backport. I can open an issue separately for this.

I can add these errors as a separate PR. I'll make an issue to point to as well.

But if we're confident that we won't need an index anymore for 20.2, then can we just live with this increased buggy behavior on master for a little while longer? That seems to be the consensus, but I wanted to verify that we're fine with this.

Yeah, i think this is fine. If it looks like we can't get rid of the restriction by the release, we can just remove the ability for the new index to be used at that point.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @lucy-zhang)


pkg/sql/create_table.go, line 675 at r2 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

Oh, I meant that currently we have referencedCols and originColumns and we might as well abbreviate both.

Ohh, sorry for the confusion. Updated.


pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

I think it'd be useful to just have a test covering a minimal case of the buggy case we've been discussing:

BEGIN
CREATE INDEX ...
ALTER COLUMN ... ADD FOREIGN KEY ...
COMMIT

where the FK uses the new index. This by itself works, even though the bug happens in a later stage.

Is this test not what you are looking for?

statement ok
DROP TABLE t1, t2 CASCADE;
CREATE TABLE t1 (x INT PRIMARY KEY);
CREATE TABLE t2 (y INT);

statement ok
BEGIN;
ALTER TABLE t2 ADD COLUMN x INT;
CREATE INDEX ON t2 (x);
ALTER TABLE t2 ADD FOREIGN KEY (x) REFERENCES t1 (x);
COMMIT

query TT
SHOW CREATE t2
----
t2  CREATE TABLE t2 (
    y INT8 NULL,
    x INT8 NULL,
    CONSTRAINT fk_x_ref_t1 FOREIGN KEY (x) REFERENCES t1(x),
    INDEX t2_x_idx (x ASC),
    FAMILY "primary" (y, rowid, x)
)

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one comment about the tests.

Reviewed 1 of 1 files at r4, 1 of 1 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/logictest/testdata/logic_test/alter_table, line 1284 at r3 (raw file):

Previously, rohany (Rohan Yadav) wrote…

Is this test not what you are looking for?

statement ok
DROP TABLE t1, t2 CASCADE;
CREATE TABLE t1 (x INT PRIMARY KEY);
CREATE TABLE t2 (y INT);

statement ok
BEGIN;
ALTER TABLE t2 ADD COLUMN x INT;
CREATE INDEX ON t2 (x);
ALTER TABLE t2 ADD FOREIGN KEY (x) REFERENCES t1 (x);
COMMIT

query TT
SHOW CREATE t2
----
t2  CREATE TABLE t2 (
    y INT8 NULL,
    x INT8 NULL,
    CONSTRAINT fk_x_ref_t1 FOREIGN KEY (x) REFERENCES t1(x),
    INDEX t2_x_idx (x ASC),
    FAMILY "primary" (y, rowid, x)
)

The only difference is the column being created in the transaction vs. prior to it.

Fixes cockroachdb#32917.

This PR adds support for the add column references
statement by allowing the foreign key building code
to use columns and indexes added in the current txn.
The schema changer already understands how to add
the combination of the three in the same transaction.

Release note (sql change): This PR adds support for the
`ALTER TABLE ... ADD COLUMN ... REFERENCES ...` syntax
for tables that are empty.
@rohany rohany force-pushed the add-references branch from fa3cfff to 83caa70 Compare May 4, 2020 18:54
@rohany
Copy link
Contributor Author

rohany commented May 4, 2020

Ah, sorry for the confusion (again). Updated.

@rohany
Copy link
Contributor Author

rohany commented May 4, 2020

bors r=lucy-zhang

@craig
Copy link
Contributor

craig bot commented May 4, 2020

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: Support ALTER TABLE ADD COLUMN colname REFERENCES tablename(colname);
6 participants