Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use opctx_alloc in start saga to query boundary switches #4274

Merged
merged 1 commit into from
Oct 13, 2023

Conversation

gjcolombo
Copy link
Contributor

Finding boundary switches in the instance start saga requires fleet query access. Use the Nexus instance allocation context to get it instead of the saga initiator's operation context. (This was introduced in #3873; it wasn't previously a problem because the instance create saga set up instance NAT state, and that saga received a boundary switch list as a parameter, which parameter was generated by using the instance allocation context. #4194 made this worse by making instance create use the start saga to start instances instead of using its own saga nodes.)

Update the instance-in-silo integration test to make sure that instances created by a silo collaborator actually start. This is unfortunately not very elegant. The instance_simulate function family normally uses OpContext::for_tests to get an operation context, but that context is associated with a user that isn't in the test silo. To get around this, add some simulation interfaces that take an explicit OpContext and then generate one corresponding to the silo user for the test case in question. It seems like it'd be nicer to give helper routines like instance_simulate access to a context that is omnipotent across all silos, but I wasn't sure how best to do this. I'm definitely open to suggestions here.

Tested via cargo tests.

Fixes #4272.

Finding boundary switches requires fleet access; use the special instance-
allocation context to do this (this context was already used for sled lookups
earlier in the relevant saga step). This was not previously a problem because
NAT entries were always created by the instance create saga, which got a
boundary switch list from its creator that the creator obtained by querying
using the instance allocation context.

Update the instance-in-silo integration test to make sure that its instances
do actually start. This required a somewhat heavy lift: test helper functions
like `instance_simulate` use `OpContext::for_tests` to get an operation context
to use to query the database, but this context is associated with a user in the
default silo. To get around this, the test hacks up a "background" test context
with the user it created in its test silo. This seems to work OK, but I'm open
to alternative ways of doing this.

Tested with cargo tests.
@gjcolombo gjcolombo requested a review from davepacheco October 13, 2023 03:07
Copy link
Collaborator

@davepacheco davepacheco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for diagnosing and fixing this!

It seems like it'd be nicer to give helper routines like instance_simulate access to a context that is omnipotent across all silos, but I wasn't sure how best to do this. I'm definitely open to suggestions here.

Agreed this would be convenient for testing. But I think it's in tension with the goal of strong silo separation. I believe that right now, under no conditions can silo users see siloed resources contained within other silos. #1681 talks about reasons we want to relax that (and reasons not to).

Short of that, we can at least make it ergonomic in the test suite to assume different identities (first by having different identities available and then by making it easy to use them). Maybe we could have a version of the helper function that we use to create silos that also creates users with different roles and provides OpContext's for them? (Not asking you to do this in this PR.)

#1374 is also related to all this.

@gjcolombo
Copy link
Contributor Author

Thanks for the review! Your comments on silo separation and test suite ergonomics make sense to me. I'd like to do a pass over the instance integration tests soon to try to do a bit of helper routine housekeeping and will see what I can do to clean this up as part of that work.

@gjcolombo gjcolombo merged commit ee8d93b into main Oct 13, 2023
21 checks passed
@gjcolombo gjcolombo deleted the gjcolombo/start-saga-authz branch October 13, 2023 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

User with silo collaborator role is unable to start an instance
2 participants