instance create sagas look up multiple things by name multiple times #1536
Comments
This seems easy to get wrong. It makes me wonder if we should have the Nexus layer only accept authz resources (rather than ever accepting names) and have callers (mostly the HTTP layer, but sometimes the saga layer) use LookupPath directly. I think it'd be fair to say that the mapping of an API URL path to a resource is part of the API layer anyway, not Nexus. I'm reminded too that right after that quote from RFD 192, it also says that some operations could be done without a separate lookup step altogether. To do that, you do want to pass the names all the way down to the datastore. Doing this correctly also requires integrating authz into the query, which feels like a long way off. So maybe we could consider this an optimization to be re-added later. A middle ground might be to accept |
Once RFD-322 starts getting implemented I believe we should solve this at the API level. E.g. we should at least move everything to operate as id by default. Id or |
There are three places in the instance-create saga where we look up things by name well into the saga:

- We call `Nexus::instance_attach_disk()`/`Nexus::instance_detach_disk()` and pass along the organization_name, project_name, instance_name, and disk_name from the instance create request. This causes a new lookup of the organization/project/instance/disk path. That seems bad: there's no guarantee that it will find the same disk (or instance or project or organization, for that matter). So we might attach or detach the wrong disk, or we might fail spuriously.
- In `instance_ensure()`, we look up the instance by name in order to call `Nexus::instance_set_runtime()`. This could find a different instance than the one we created. We should be able to use the instance_id stored from a previous action instead.
- In `sic_delete_instance_record`, the undo action for the (early) create-instance-record action, we look up the instance that we created by its name. We should use the instance id instead.

RFD 192 talks about this:
In terms of fixing: my first thought here is to do the lookup of organization and project once early in the saga to get the project_id. Then (again, still early in the saga) look up anything else we need (like all the disks) using `LookupPath(...).project_id(...).disk_name(...)` and store their ids. Thereafter, we can always use the ids.

We'll probably want to change `Nexus::instance_attach_disk()` and `Nexus::instance_detach_disk()` to accept an `authz::Instance` and `authz::Disk` (and have the lookups done in the caller).

Then I think we can remove the "instance_name" output from any saga action (so we're not ever tempted to use the instance name). We can also remove `organization_name` and `project_name` from the `Params` (for the same reason).
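
To make the "resolve names once, then only use ids" idea concrete, here is a minimal, self-contained sketch. It does not use Omicron's real types: `Datastore`, `ResolvedParams`, `resolve_early`, and `attach_disks` are all hypothetical stand-ins, and plain integers stand in for UUIDs. In the real code the early step would presumably go through `LookupPath` and the later steps would take authz resources.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a resource id (the real code uses UUIDs).
type Id = u64;

/// Toy datastore, purely illustrative (not Omicron's datastore).
#[derive(Default)]
struct Datastore {
    /// project name -> project id
    projects: HashMap<String, Id>,
    /// (project id, disk name) -> disk id
    disks: HashMap<(Id, String), Id>,
    /// recorded (instance id, disk id) attachments
    attachments: Vec<(Id, Id)>,
}

impl Datastore {
    fn lookup_project(&self, name: &str) -> Option<Id> {
        self.projects.get(name).copied()
    }

    fn lookup_disk(&self, project_id: Id, name: &str) -> Option<Id> {
        self.disks.get(&(project_id, name.to_string())).copied()
    }

    /// Later saga steps only ever see ids, never names.
    fn attach_by_id(&mut self, instance_id: Id, disk_id: Id) {
        self.attachments.push((instance_id, disk_id));
    }
}

/// Ids resolved once, early in the saga, and stored for later actions.
struct ResolvedParams {
    project_id: Id,
    disk_ids: Vec<Id>,
}

/// Early saga step: the only place where names are looked up.
fn resolve_early(
    ds: &Datastore,
    project_name: &str,
    disk_names: &[&str],
) -> Result<ResolvedParams, String> {
    let project_id = ds
        .lookup_project(project_name)
        .ok_or_else(|| format!("project {project_name:?} not found"))?;
    let disk_ids = disk_names
        .iter()
        .map(|name| {
            ds.lookup_disk(project_id, name)
                .ok_or_else(|| format!("disk {name:?} not found"))
        })
        .collect::<Result<Vec<_>, _>>()?;
    Ok(ResolvedParams { project_id, disk_ids })
}

/// Later saga step: uses the stored ids, so a concurrent rename or
/// re-creation under the same name cannot redirect it to another disk.
fn attach_disks(ds: &mut Datastore, instance_id: Id, resolved: &ResolvedParams) {
    for &disk_id in &resolved.disk_ids {
        ds.attach_by_id(instance_id, disk_id);
    }
}
```

In this sketch, if the name "data" is re-pointed at a different disk between `resolve_early` and `attach_disks`, the attachment still targets the disk that was resolved at the start. A second by-name lookup at attach time would instead pick up whichever disk holds the name at that moment, which is the hazard described above.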