sql: permit foreign key relationships without indexes on the source side #36859
Comments
Note that (1) is also required because the foreign key needs foremost to be a key, which can only be guaranteed with a unique index. There is an extra difficulty here that @knz brought up. We currently use the |
The way that FK relationships are currently encoded in table descriptors is a There are two other problems caused by this
(For context, when I discussed the original design with relevant parties, we found out that it was the fruit of a) not checking what standard SQL had to say, so even when accounting for (or tolerating) human mistakes, which make point d totally a-OK, casual effort on points a and b would have averted the current situation) |
@knz since it sounds like you have an opinion on a better design, would you care to share it? To me, it seems like the correct representation would be simply to store the table and referenced columns of the foreign key. That way, the optimizer would have full freedom as to how to plan the lookup. Does that make sense? |
So for the narrow view of "the semantic information defined by a FOREIGN KEY constraint definition in CREATE", yes there should be some object that merely stores a reference to the referenced table and its columns (not an index). It can also store the various other semantic constraint properties such as match type and cascading actions. That object presumably would be stored inside TableDescriptor, not IndexDescriptor as done currently. Also the columns should be referenced by column ID, not by column name. That's simple because (mostly due to a fortunate accident of design) column IDs are stable in cockroachdb. The table should be referenced by table ID presumably, but then care should be taken because TRUNCATE made table IDs non-stable (btw I hate this "feature" of TRUNCATE, but that's out of scope here). That would be about it for this "forward" FK constraint definition. However: beyond that narrow view, casual inspection of
|
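The representation described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: every type, field, and method name here is hypothetical and does not mirror CockroachDB's actual descriptor definitions. It shows a foreign key stored on the table, pointing at the referenced table by ID and at columns by column ID, with no reference to any index.

```go
package main

import "fmt"

// All names in this sketch are hypothetical; they are not CockroachDB's
// real descriptor types.

type TableID uint32
type ColumnID uint32

// ForeignKeyConstraint holds only the semantic content of a FOREIGN KEY
// clause: which columns reference which columns of which table, plus the
// match type and the cascading actions. It deliberately names no index.
type ForeignKeyConstraint struct {
	Name                string
	OriginColumnIDs     []ColumnID // columns in the referencing table
	ReferencedTableID   TableID    // the referenced table, by ID
	ReferencedColumnIDs []ColumnID // columns in the referenced table
	Match               string     // e.g. "SIMPLE" or "FULL"
	OnDelete            string     // e.g. "NO ACTION" or "CASCADE"
	OnUpdate            string
}

// TableDescriptor carries the constraint at the table level rather than
// inside an index descriptor.
type TableDescriptor struct {
	ID          TableID
	Name        string
	OutboundFKs []ForeignKeyConstraint
}

// Validate checks the structural invariants of this sketch: at least one
// origin column, and the same number of columns on both sides.
func (fk ForeignKeyConstraint) Validate() error {
	if len(fk.OriginColumnIDs) == 0 {
		return fmt.Errorf("foreign key %q has no origin columns", fk.Name)
	}
	if len(fk.OriginColumnIDs) != len(fk.ReferencedColumnIDs) {
		return fmt.Errorf("foreign key %q: %d origin columns but %d referenced columns",
			fk.Name, len(fk.OriginColumnIDs), len(fk.ReferencedColumnIDs))
	}
	return nil
}

func main() {
	// The IDs below are made up for the example.
	customers := TableDescriptor{
		ID:   52,
		Name: "customers",
		OutboundFKs: []ForeignKeyConstraint{{
			Name:                "fk_zipcode",
			OriginColumnIDs:     []ColumnID{2},
			ReferencedTableID:   53,
			ReferencedColumnIDs: []ColumnID{1},
			Match:               "SIMPLE",
			OnDelete:            "NO ACTION",
			OnUpdate:            "NO ACTION",
		}},
	}
	for _, fk := range customers.OutboundFKs {
		fmt.Println(fk.Name, "valid:", fk.Validate() == nil)
	}
}
```

Because nothing in the constraint names an index, whatever plans the check is free to pick any access path. The sketch does not address the caveat raised above about TRUNCATE changing table IDs.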
I'm closing this as a duplicate of #37255, which has a proposal for how to upgrade the descriptors to avoid the problems we've discussed here already. |
Now that #36854 is merged, each of the 4 indexes in TPC-C that are required because of this issue is clearly marked with the suffix "_fk_idx". |
Currently, creating a foreign key between two tables requires that both tables have a "relevant" index that can be used to look values up precisely on both sides.

For example, consider a foreign key from a `customers` table to a `zipcodes` table, where a `zipcode` column in `customers` references the `id` column in the `zipcodes` table. Currently, there need to be two indexes:

- one on the `id` column of the `zipcodes` table, to permit fast checks for zipcode existence when adding a new customer
- one on the `zipcode` column of the `customers` table, to permit fast checks for any customers that reference a particular zipcode when deleting a row from the `zipcodes` table.

The first requirement is reasonable, since most schemas with foreign keys need to support inserts into the referencing table.

The second requirement is somewhat unreasonable, because it's very common for a schema to have a reference table that is never deleted from. In this case, like the `zipcodes` table in our example, requiring that "reverse" index on the `customers` table is a waste of storage and compute resources, since every insert into `customers` now also requires updating that other index.

This issue tracks permitting foreign key relationships that don't require an index on the source side. I expect that this work will be mostly driven by the movement of foreign key checks into the optimizer. Once the optimizer is the entity deciding which index to use for reverse lookups on deletes, this issue can be closed, since the optimizer should be able to decide that a precise lookup is unavailable and fall back to a full table scan, which is the desirable outcome in this scenario.