some non-unique indexes in dbinit.sql are fishy #1497
Comments
Thanks for noting this, Dave. I'll look at the indexes related to IP Pools. I did want to mention something about this:
I'm pretty sure that all indexes implicitly store the primary key, to support index joins against the table's primary index. However, the docs don't say that primary key columns are always added to the index key itself, and I've not seen that in my experience (other than the implicit storage of the key).
Scratch that, I was wrong:
That's from here. The primary key is implicitly used in secondary indexes.
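For anyone following along, this is easy to check directly. The following is an illustrative sketch (the table here is hypothetical, not from dbinit.sql), relying on the documented CockroachDB behavior that secondary indexes implicitly contain the primary key columns:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE example (
    id UUID PRIMARY KEY,
    silo_id UUID NOT NULL,
    name STRING NOT NULL
);
CREATE INDEX ON example (silo_id);

-- SHOW INDEXES should list `id` as an implicit column of the
-- secondary index on silo_id, confirming the quoted docs.
SHOW INDEXES FROM example;
```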
This one should be unique; I've filed #1498.
This index is used here. The fact that it's not unique seems right to me, as we can have more than one pool restricted to a given project.

This index:

is used here, to list the addresses for an instance. I think you're right that it could be upgraded to a unique index.

This index:

is used here, to check that an IP Pool has no allocated IP addresses at the time it's deleted. There may be a better way to do that, but I don't think we can make it unique by adding another column (the IDs of the addresses are in another table).
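A hedged sketch of what that delete-time check might look like, using the column names that appear in the index list below (the actual query in Nexus may be structured differently):

```sql
-- Sketch only: refuse to delete an IP Pool that still has allocated
-- addresses. The addresses live in instance_external_ip, a different
-- table, which is why the index on (ip_pool_id, ip_pool_range_id)
-- can't be made unique just by adding another column of its own table.
SELECT 1
FROM instance_external_ip
WHERE ip_pool_id = $1
LIMIT 1;
```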
I was able to insert multiple SAML provider records for the same silo, since the name-silo combination in the identity_provider table is not a unique index.

@davepacheco - Would this be a problem if the operator accidentally inserted a bad SAML IdP record after the initial configuration (and users had begun adding projects and VMs)? Currently, the only way to clear bad IdP records is to delete the entire silo. If the bad IdP record somehow gets used for authentication, it will be impossible to fix via the API.
Yes, this is a problem.
Okay. I just wonder why name-silo shouldn't be unique, or even why the IdP shouldn't be unique per silo (since each silo should only be backed by one IdP). Since customers are going to set up the first silo and IdP via some RSS wizard (script or UI), it's highly unlikely that an operator would use the create-provider API until they need to create another silo on their own. The scenario I mentioned above is possible but low-probability (hence not an immediate concern).
Yeah, I do think you're right. I marked this "MVP" because I hope it would be easy to fix (add the "UNIQUE"), and even if it's low-probability that we hit it, if we do hit it after shipping, it'll be much more work to fix. |
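The fix being discussed would look roughly like this. This is a sketch, not the actual dbinit.sql change: the real statement may name the index explicitly and use a fully qualified table name.

```sql
-- Before (sketch): nothing prevents two IdPs with the same name
-- in one silo.
-- CREATE INDEX ON identity_provider (name, silo_id);

-- After (sketch): making (silo_id, name) UNIQUE both enforces one
-- IdP name per silo and, with silo_id first, supports listing the
-- IdPs in a silo.
CREATE UNIQUE INDEX ON identity_provider (silo_id, name);
```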
I missed seeing #2124. The lack of a unique IdP index by name was already mentioned there.
#3530 cleans up many of these (and several more that have been introduced since then). I am still not sure what to do about the non-unique indexes that remain. |
We have a bunch of non-unique indexes in the database. I'm a little skeptical of these because, among other things, they can't support paginated queries unless they include a unique column (like `id`) as the last column in the index. (I believe that CockroachDB implicitly includes the primary key already, and it's possible it uses this to address the above problem? But (a) if we're relying on that, we should verify it, and (b) if so, then we probably still want to be consistent. Right now, a bunch of our indexes explicitly include the `id` for this purpose and a bunch of indexes don't.)

There's at least one good use case for non-unique indexes, which is when you expect to fetch a bunch of rows based on some property, you expect there could be a large number of them, and you don't care which ones you get. An example is the index on `console_session.time_created` that's intended to be used to expire old sessions. This is fine because we'll wind up querying for sessions older than, say, a day ago, and we don't care which ones we find. (In fact, we'll probably keep querying and expiring everything we find until we run out.)

We should take a pass through the non-unique indexes and audit them.
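Concretely, the session-expiration pattern looks something like this sketch (the actual Nexus query may differ; CockroachDB allows `LIMIT` on `DELETE`):

```sql
-- Sketch: a non-unique index on time_created is fine here, because
-- we select by a range and don't care which matching rows come back.
DELETE FROM console_session
WHERE time_created < now() - INTERVAL '1 day'
LIMIT 100;
```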
Examples that look intended for pagination or lookup but that I think won't work for the above reasons:

- `sled.rack_id` (needs the sled's `id` too, and then it'll be unique)
- `service.sled_id` (needs the service's `id` too, and then it'll be unique)
- `region.volume_id` (needs the region's `id` too, and then it'll be unique)
- `disk.attach_instance_id` (needs the disk's `id` too, and then it'll be unique)

These I think are wrong only because they have the fields in the wrong order:

- `identity_provider.(id, silo_id)` -- I think this probably needs to be `(silo_id, id)`, because otherwise it can't be used to quickly list IdPs in a Silo
- `identity_provider.(name, silo_id)` -- I think this probably needs to be `(silo_id, name)` for the same reason
- `saml_identity_provider.(id, silo_id)` -- same

These I think are unique, just not constrained to be:

- `metric_producer.(oximeter_id, id)`

Ones I'm not sure about without more investigation:

- `Dataset.size_used`
- `ip_pool_range.project_id`
- `instance_external_ip.(ip_pool_id, ip_pool_range_id)`
- `instance_external_ip.instance_id`
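The general fix for the first group would be to append the row's `id` and constrain the index to be unique, e.g. for `sled.rack_id` (a sketch; the real dbinit.sql statement may name the index and qualify the table):

```sql
-- Sketch: appending the primary key makes the index unique and gives
-- paginated listings a total order to resume from.
CREATE UNIQUE INDEX ON sled (rack_id, id);

-- A listing can then use keyset pagination, e.g.:
-- SELECT * FROM sled WHERE rack_id = $1 AND id > $2 ORDER BY id LIMIT $3;
```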
Here are all of them: