Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CockroachDB automatically creates duplicates of indices created within a transaction #46127

Closed
mqudsi opened this issue Mar 15, 2020 · 9 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@mqudsi
Copy link
Contributor

mqudsi commented Mar 15, 2020

CockroachDB always creates an auto-index on foreign keys if an index is not created at the time the table is created. Very helpful.

However, it does not recognize that a CREATE TABLE followed by a separate CREATE INDEX within a single transaction are equivalent to a CREATE TABLE with an inline definition of an index. As a result, duplicate indices are created (the user-created index plus the auto-created one).

Illustrated:

BEGIN TRANSACTION;
CREATE TABLE foo (id INT8 UNIQUE, name STRING);
CREATE TABLE bar (
        id
                INT8 UNIQUE,
        fooId
                INT8 REFERENCES foo (id),
        value
                STRING
);
CREATE INDEX idx_bar_foo_id ON bar (fooid);
COMMIT TRANSACTION;

Followed up with SHOW INDEX FROM bar:

root@:26257/defaultdb> SHOW INDEX FROM bar;
  table_name |           index_name            | non_unique | seq_in_index | column_name | direction | storing | implicit
-------------+---------------------------------+------------+--------------+-------------+-----------+---------+-----------
  bar        | primary                         |   false    |            1 | rowid       | ASC       |  false  |  false
  bar        | bar_id_key                      |   false    |            1 | id          | ASC       |  false  |  false
  bar        | bar_id_key                      |   false    |            2 | rowid       | ASC       |  false  |   true
  bar        | bar_auto_index_fk_fooid_ref_foo |    true    |            1 | fooid       | ASC       |  false  |  false
  bar        | bar_auto_index_fk_fooid_ref_foo |    true    |            2 | rowid       | ASC       |  false  |   true
  bar        | idx_bar_foo_id                  |    true    |            1 | fooid       | ASC       |  false  |  false
  bar        | idx_bar_foo_id                  |    true    |            2 | rowid       | ASC       |  false  |   true
(7 rows)

The discrepancy is clearer when looking at the output of SHOW CREATE because it makes immediately obvious that it's a duplicate and that creating the index separately is no different than creating it inline:

root@:26257/defaultdb> SHOW CREATE bar;
  table_name |                            create_statement
-------------+--------------------------------------------------------------------------
  bar        | CREATE TABLE bar (
             |     id INT8 NULL,
             |     fooid INT8 NULL,
             |     value STRING NULL,
             |     CONSTRAINT fk_fooid_ref_foo FOREIGN KEY (fooid) REFERENCES foo(id),
             |     UNIQUE INDEX bar_id_key (id ASC),
             |     INDEX bar_auto_index_fk_fooid_ref_foo (fooid ASC),
             |     INDEX idx_bar_foo_id (fooid ASC),
             |     FAMILY "primary" (id, fooid, value, rowid)
             | )
(1 row)

Environment:

  • CockroachDB 19.2.4
  • Server OS: Windows 10
  • Client app: cockroach sql

Additional context

This behavior differs from that of PostgreSQL and has a negative impact on performance and size. I believe the auto index should be created only when the creation is committed and not before then.

ORM drivers for PostgreSQL exist that (for the obvious simplicity reasons) prefer to map logical tables to SQL via separate creation of table and indices (example: npgsql). Removing the auto-created index by hand is of course an option, but it is tedious and unnecessary as it should not be too difficult to delay the creation of the index until the the entirety of the transaction has been committed.

Jira issue: CRDB-5104

@awoods187
Copy link
Contributor

Nice find! Thank you for filing!

@awoods187 awoods187 added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Mar 16, 2020
@rohany
Copy link
Contributor

rohany commented Mar 16, 2020

This behavior differs from that of PostgreSQL and has a negative impact on performance and size. I believe the auto index should be created only when the creation is committed and not before then.

As of now, postgres does not need an index at the source of a foreign key constraint, while cockroachDB does. I believe we are working on removing this constraint, but I don't know if there are trackers for it (cc @awoods187, @RaduBerinde). Because we require an index at the source of the foreign key relation, we must create it while executing the CREATE TABLE statement because a command after the CREATE TABLE could try and insert data into bar, and we need the ability to verify the foreign key constraint.

In 20.1 you'll be able to drop the auto generated index as long as there is another suitable index on the table for the foreign key constraint, if that helps at all.

@mqudsi
Copy link
Contributor Author

mqudsi commented Mar 17, 2020

Right, but if the index is created before data is inserted that should obviate the need to do so.

Eg I don’t know if cockroachdb supports deferred constraints but you could possibly treat all FK as deferred for the duration of the transaction they were created in until either an FK index is added, an attempt to insert data within that transaction is detected, or the transaction is committed?

If this is strictly a code flow that is triggered by the creation of a table within a transaction, it should have no impact on performance in future transactions.

@rohany
Copy link
Contributor

rohany commented Mar 17, 2020

We don't support deferrable constraints right now. Thanks for opening the issue! We'll see where it fits in on the roadmap.

@rohany rohany removed their assignment Mar 26, 2020
@thoszhang
Copy link
Contributor

@rohany I believe this is fixed as part of #47082. We can now recognize indexes created earlier in the same transaction as usable for the foreign key constraint.

@mqudsi
Copy link
Contributor Author

mqudsi commented May 5, 2020

Excellent, thank you!

@rohany
Copy link
Contributor

rohany commented May 5, 2020

No i think this is still a problem -- here the index is created after we add the foreign key.

@rohany rohany reopened this May 5, 2020
@thoszhang
Copy link
Contributor

Sorry, you're right. Well, the real solution here is to stop auto-creating indexes and just let FK checks be planned with the index that was created later in the transaction, so this will be fixed once #46674 is done (and we make the necessary changes on the schema change side).

@jlinder jlinder added the T-sql-schema-deprecated Use T-sql-foundations instead label Jun 16, 2021
@ajwerner
Copy link
Contributor

Okay, this really is fixed now.

@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
None yet
Development

No branches or pull requests

6 participants