-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deadlock in iroh-docs actor loop #2345
Comments
4 tasks
github-merge-queue bot
pushed a commit
that referenced
this issue
Jun 6, 2024
…2346) ## Description Fixes #2345 The iroh-docs actor loop can easily be deadlocked from the client side: If you call any RPC method that returns a stream, and the stream is longer than what the RPC layer buffers, and you call and await any other docs method *while consuming the stream*, the docs actor will deadlock. (It will only happen though if the stream is longer than the capacity of the intermediate channel that goes from the actor to the RPC layer, which is why this does not *always* happen) This is the case for all methods that return iterators. The solution is twofold: * Run single-threaded executor in iroh-docs actor loop * For actions returning iterators/streams, spawn a task on that executor to forward the store iterator into the stream, yielding when the receiver is not consuming fast enough To be able to spawn the iterators onto a task, they have to be `'static`. Which they can be - but only when operating on snapshots. So this PR fixes the potential for deadlock. It has the downside, however, that whenever calling a docs client function that returns an iterator, the current write transaction will be committed first, which has a perfomance penalty. However this is preferable to deadlocks, IMO. ## Breaking Changes <!-- Optional, if there are any breaking changes document them, including how to migrate older code. --> ## Notes & open questions This will need tests and likely documentation of the perfomance implications. ## Change checklist - [ ] Self-review. - [ ] Documentation updates if relevant. - [ ] Tests if relevant. - [ ] All breaking changes documented.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Calling
iroh.docs.list
, or other docs methods that return streams, is deadlock prone at the moment: If there are more items than the RPC layer will buffer a call to anyiroh.docs
RPC method while consuming the stream will deadlock.Reason:
iroh-docs internally runs in a synchronous thread loop. When the client calls list_docs or other streaming methods, this sends a message over the RPC interface into that message loop. The docs thread will get an iterator from the store and push all items into a reply channel, until it is done. However it is a synchronous thread, so it cannot do anything else until it is done.
And this may lead to the folllowing deadlock sitution: If, while consuming such a stream, you happen to call any function that goes back into the docs thread, this will - deadlock! Because the docs thread is still not done with processing the previous iterator, it won't reply to the new call, and you won't continue consuming the stream - classical deadlock.
Thanks to @dev-msp for spotting this.
See #2346 for a fix.
The text was updated successfully, but these errors were encountered: