-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clean up various non-unique indexes #3530
Conversation
CREATE UNIQUE INDEX ON omicron.public.sled ( | ||
rack_id, | ||
id | ||
) WHERE | ||
time_deleted IS NULL; | ||
) WHERE time_deleted IS NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the cases mentioned in the PR description. If we want to paginate the sleds in a rack, we need the sled id in this index, and then it can also be unique.
rack_id | ||
) WHERE time_deleted IS NULL; | ||
|
||
CREATE INDEX ON omicron.public.sled ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This index is superfluous because CockroachDB always has an index on the primary key. (Note that having an index on "rack_id" and one on "id" is not the same as having one on (rack_id, id)
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This index also includes a filter predicate time_deleted IS NULL
which makes it a partial index no? Perhaps not super relevant for this particular table but in the limit there could be more old/deleted records that a partial index could speed up access for only live records.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out. I had missed that. Are you saying that even if CockroachDB already indexes on "id", it might be beneficial to have an index on "id" limited only to live rows because querying it could be faster? Yeah, tThat's definitely possible. I think we probably want to wait and see if that's true, though, since it's somewhat speculative and has downsides (i.e., the extra disk space and memory usage and cost to update it on writes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have a doc about it that says as much, but as you say, the benefit is an empirical question. They frame the advantages in contrast with a full index, as opposed to no index/one you're stuck with regardless.
@@ -152,8 +148,9 @@ CREATE TABLE omicron.public.sled_resource ( | |||
); | |||
|
|||
-- Allow looking up all resources which reside on a sled | |||
CREATE INDEX ON omicron.public.sled_resource ( | |||
sled_id | |||
CREATE UNIQUE INDEX ON omicron.public.sled_resource ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the cases mentioned in the PR description. If we want tp aginate the sled_resources on a sled, we need the "id" here, too.
) WHERE time_deleted IS NULL; | ||
|
||
CREATE INDEX ON omicron.public.switch ( | ||
CREATE UNIQUE INDEX ON omicron.public.switch ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another of the cases mentioned in the PR description. This one's arguably unnecessary since we don't expect to have many switches in a rack.
rack_id | ||
) WHERE time_deleted IS NULL; | ||
|
||
CREATE INDEX ON omicron.public.switch ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another superfluous index because the primary key always has an index.
CREATE UNIQUE INDEX ON omicron.public.identity_provider ( | ||
silo_id, | ||
name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the few functional changes in this PR. This index was not unique and I think that was a bug. See this comment.
I also flipped the order of fields in order to use this index to paginate the IdPs in a Silo by name.
CREATE UNIQUE INDEX ON omicron.public.saml_identity_provider ( | ||
silo_id, | ||
id | ||
) WHERE | ||
time_deleted IS NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above -- needed to flip the order of these fields.
CREATE UNIQUE INDEX ON omicron.public.saml_identity_provider ( | ||
silo_id, | ||
name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this to ensure that SamlIdentityProvider names are unique within a Silo.
@@ -861,8 +865,9 @@ CREATE UNIQUE INDEX ON omicron.public.instance ( | |||
|
|||
-- Allow looking up instances by server. This is particularly | |||
-- useful for resource accounting within a sled. | |||
CREATE INDEX ON omicron.public.instance ( | |||
active_sled_id | |||
CREATE UNIQUE INDEX ON omicron.public.instance ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the cases mentioned in the PR description. To paginate the instances on a sled, the id needs to be here, and then this can be a UNIQUE index.
@@ -976,8 +981,9 @@ CREATE UNIQUE INDEX ON omicron.public.disk ( | |||
) WHERE | |||
time_deleted IS NULL; | |||
|
|||
CREATE INDEX ON omicron.public.disk ( | |||
attach_instance_id | |||
CREATE UNIQUE INDEX ON omicron.public.disk ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the cases mentioned in the PR description. To paginate the disks attached to an instance, the id needs to be here, and then this can be a UNIQUE index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for the improvements!
This addresses part (but not all) of #1497 and #2124.
This could use some careful review.
A common pattern here is that we had a non-unique index on some table T property P, like table
instance
propertyactive_sled_id
. I think people usually added these because they wanted to be able to find the rows having a certain value for that property (e.g., Instances on a given Sled). I believe this only works if you don't plan to paginate that query. If you do, you need a pagination key, and that needs to be in the index, too, or else Cockroach can't use the index to serve the paginated query. That's why the index oninstance
is now on(active_sled_id, id)
. With the id there, this can be aUNIQUE
index.