"Testing sagas" is not easy in Omicron, but it should be #1799
Comments
My first pass at doing this:
02c50b5 has an implementation using a

Now that I've done that, I'm admittedly not sure if it's the right approach. My real end-goal is to make "the guts of sagas exposed to tests, so I can perform fine-grained operations on them". Minimizing the interface exposed to Nexus certainly makes this possible, but it doesn't seem like the only approach. It also seems possible to simply add functions to Nexus for tests, exposing more of the internals of saga execution, such that tests can manipulate them.
FWIW, omicron/nexus/src/saga_interface.rs Lines 48 to 50 in fbe8dc5
Nexus around to run the saga.
In the limit, saga actions are going to wind up talking to Sled Agent, boundary services, the database -- basically everything. So it's hard to unit test them without mocking those things. And if we do mock those things (as I think we should for most of those), then it's also easy to stand up Nexus, I think. See also oxidecomputer/steno#31.
Yeah, I basically did this in my PR with the
This was definitely something I felt; the distinction of "Nexus' interface for sagas" vs "Nexus' interface for HTTP endpoints" seems very unclear - both have access to:
It could be argued that the tight coupling between "Nexus' interface" and "what sagas want to do" means that there isn't much point in trying to isolate them. I don't love that, but I'm not sure the alternative is better.
Definitely interested here. Even the existing
#1835 partially worked on this, by testing unwind safety
@smklein Is there any additional saga testing work you were looking to track with this issue? I think the error injection and node replay facilities are pretty robust at this point; it seems like the basic primitives are all there, and we just need to arrange them conveniently (e.g. what I described in #3896) and then make sure the appropriate tests exist for all relevant sagas (like the ones listed in #2052 and #2094).
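For readers unfamiliar with the idea, a node replay (idempotency) check can be sketched as follows. This is a minimal std-only model, not the Omicron or steno test facilities: `State`, `Action`, `run`, and `is_replay_safe` are hypothetical names made up for illustration. The sketch runs each action twice in a row, as if the executor had crashed before recording completion, and compares the result with a clean run.

```rust
// Hypothetical std-only model; none of these names come from Omicron or steno.
type State = Vec<&'static str>;
type Action = fn(&mut State);

/// Idempotent: creates the record only if it does not already exist.
fn create_record_idempotent(state: &mut State) {
    if !state.contains(&"disk-record") {
        state.push("disk-record");
    }
}

/// Not idempotent: a replay creates a duplicate record.
fn create_record_naive(state: &mut State) {
    state.push("disk-record");
}

/// Runs every action in order. The action at index `replayed` (if any) runs
/// twice, as if the executor crashed after the action finished but before
/// its completion was recorded.
fn run(actions: &[Action], replayed: Option<usize>) -> State {
    let mut state = State::new();
    for (i, &action) in actions.iter().enumerate() {
        action(&mut state);
        if replayed == Some(i) {
            action(&mut state);
        }
    }
    state
}

/// True if replaying any single action leaves the final state unchanged.
fn is_replay_safe(actions: &[Action]) -> bool {
    let expected = run(actions, None);
    (0..actions.len()).all(|i| run(actions, Some(i)) == expected)
}
```

In this model, a saga built from `create_record_idempotent` passes the check, while one built from `create_record_naive` fails it because the replay leaves a duplicate record behind.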
I think this issue has been appropriately replaced by follow-ups -- I really wanted the idempotency + unwind tests, and now we have both. I'll mark this issue as closed in favor of more recent+specific issues.
Sagas are tricky to write in Nexus right now - writing them not only involves "doing all the operations you need to do", but also:
So, how do we test sagas today? Mostly through our integration tests, which poke at both the external Nexus API, and internally at the database. This is a pretty coarse-grained mechanism for testing -- it would be much nicer if we could (for example) inject errors directly into individually constructed sagas.
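To make "inject errors directly into individually constructed sagas" concrete, here is a rough sketch of what such a test harness could look like. It is a std-only toy model, not the steno API: `Node`, `run_saga`, `disk_create_nodes`, and the three step names are all hypothetical. The runner fails a chosen node on purpose and then undoes the completed nodes in reverse order.

```rust
// Hypothetical std-only model of a saga; this is not the steno API.
type State = Vec<&'static str>;
type ActionResult = Result<(), String>;

/// One saga node: a forward action plus the undo action that reverses it.
struct Node {
    name: &'static str,
    action: fn(&mut State) -> ActionResult,
    undo: fn(&mut State),
}

fn create_record(state: &mut State) -> ActionResult { state.push("disk-record"); Ok(()) }
fn delete_record(state: &mut State) { state.retain(|r| *r != "disk-record"); }
fn alloc_region(state: &mut State) -> ActionResult { state.push("region"); Ok(()) }
fn free_region(state: &mut State) { state.retain(|r| *r != "region"); }
fn finalize(state: &mut State) -> ActionResult { state.push("finalized"); Ok(()) }
fn unfinalize(state: &mut State) { state.retain(|r| *r != "finalized"); }

/// A toy "disk create" saga with three nodes (step names are made up).
fn disk_create_nodes() -> Vec<Node> {
    vec![
        Node { name: "create_record", action: create_record, undo: delete_record },
        Node { name: "alloc_region", action: alloc_region, undo: free_region },
        Node { name: "finalize", action: finalize, undo: unfinalize },
    ]
}

/// Runs nodes in order. If `inject_error_at` names a node index, that node
/// fails instead of running, and every previously completed node is undone
/// in reverse order.
fn run_saga(nodes: &[Node], inject_error_at: Option<usize>, state: &mut State) -> ActionResult {
    for (i, node) in nodes.iter().enumerate() {
        let result = if inject_error_at == Some(i) {
            Err(format!("injected error at {}", node.name))
        } else {
            (node.action)(state)
        };
        if let Err(e) = result {
            for done in nodes[..i].iter().rev() {
                (done.undo)(state);
            }
            return Err(e);
        }
    }
    Ok(())
}
```

A test can then loop over every node index, inject a failure there, and assert that the state is empty afterwards, which checks unwind behavior node by node without standing up the rest of the control plane.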
I tried pulling out some saga-constructing goop in the "disk create" saga. My goal was to create a test exercising sagas, and node failure, without requiring most of the rest of the control plane to be up-and-running. Basically, something more akin to a "unit test" than a full "nexus integration test".
The big problem? The SagaContext object we pass to all sagas in Nexus has a reference back to the Nexus struct itself. And initializing the Nexus structure requires bringing up a lot of other services, so we're in a big ball of coupled services.

It would be nice to create some better isolation between the SagaContext object and "all of Nexus" - looser coupling here would make it easier to create fine-grained tests to poke at more saga failure cases.
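One possible shape for that looser coupling, sketched under assumptions: the saga context depends on a narrow trait instead of the whole Nexus struct, and tests supply an in-memory fake. The trait `SagaServices`, the struct `FakeServices`, and the two disk functions are hypothetical illustrations, not the actual Omicron types. The `SagaContext` here is likewise a simplified stand-in for the real one.

```rust
/// Hypothetical narrow interface listing only what saga actions need.
trait SagaServices {
    fn insert_disk_record(&mut self, name: &str) -> Result<(), String>;
    fn delete_disk_record(&mut self, name: &str);
}

/// Simplified stand-in for a saga context that depends on the narrow trait
/// rather than on all of Nexus.
struct SagaContext<S: SagaServices> {
    services: S,
}

/// In-memory fake used by tests; `fail_inserts` simulates a database error.
struct FakeServices {
    disks: Vec<String>,
    fail_inserts: bool,
}

impl SagaServices for FakeServices {
    fn insert_disk_record(&mut self, name: &str) -> Result<(), String> {
        if self.fail_inserts {
            return Err(format!("simulated failure inserting {}", name));
        }
        self.disks.push(name.to_string());
        Ok(())
    }

    fn delete_disk_record(&mut self, name: &str) {
        self.disks.retain(|d| d.as_str() != name);
    }
}

/// Saga action written against the trait, so it runs with any implementation.
fn disk_create_action<S: SagaServices>(
    ctx: &mut SagaContext<S>,
    name: &str,
) -> Result<(), String> {
    ctx.services.insert_disk_record(name)
}

/// Matching undo action.
fn disk_create_undo<S: SagaServices>(ctx: &mut SagaContext<S>, name: &str) {
    ctx.services.delete_disk_record(name);
}
```

The tradeoff raised in the comments still applies: if the set of things sagas need converges on "everything Nexus can do", the trait grows until it mirrors Nexus, and standing up a Nexus with mocked dependencies may be the simpler route.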