Ipc failure while streaming #346
c9f2d96 to 2aed7dd
Lol.. ok so it's 5dc07f0.. No idea why yet, but dropping that commit from this history and gonna rebase all dependents.
2aed7dd to f7cc993
Lul, and turns out ba6a2b1 causes the So gonna drop that one for now as well..
f7cc993 to 6d34893
6895172 to b722238
@@ -319,7 +319,7 @@ async def _invoke(
         BrokenPipeError,
     ):
         # if we can't propagate the error that's a big boo boo
-        log.error(
+        log.exception(
This is important for showing which traceback couldn't be propagated to the far-end actor's caller task.
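For context on the `log.error()` -> `log.exception()` swap, here is a minimal stdlib-only sketch (not tractor's code; the logger names and the `capture_log()` helper are made up for illustration). `Logger.exception()` logs at ERROR level and appends the active traceback, while `Logger.error()` records only the message unless `exc_info` is passed.

```python
import io
import logging


def capture_log(use_exception: bool) -> str:
    """Log a failed "error propagation" and return the captured text."""
    buf = io.StringIO()
    handler = logging.StreamHandler(buf)
    log = logging.getLogger(f"demo.{use_exception}")  # hypothetical logger name
    log.addHandler(handler)
    log.propagate = False  # keep demo output out of the root logger

    try:
        try:
            raise ValueError("original task error")
        except ValueError:
            # stand-in for failing to ship the error over IPC
            raise BrokenPipeError("could not ship error to far end")
    except BrokenPipeError:
        if use_exception:
            # message plus the full (chained) traceback
            log.exception("Failed to ship error to caller")
        else:
            # message only, the traceback is lost
            log.error("Failed to ship error to caller")

    log.removeHandler(handler)
    return buf.getvalue()


error_text = capture_log(use_exception=False)
exception_text = capture_log(use_exception=True)
```

With `log.exception()` the captured text includes both the `BrokenPipeError` and the chained original `ValueError`, which is the traceback that otherwise never reaches the caller.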
@@ -826,7 +829,12 @@ async def _push_result(

     if ctx._backpressure:
         log.warning(text)
-        await send_chan.send(msg)
+        try:
+            await send_chan.send(msg)
If using backpressure on a stream we need to be sure we don't crash when the local feed memchan has been broken.
Need a test for this, though I can't remember if this was the original issue that caused the `ipc_failure_during_stream.py` example to inf hang before?
tractor/_streaming.py
Outdated
@@ -609,7 +610,14 @@ async def open_stream(
     # XXX: Make the stream "one-shot use". On exit, signal
     # ``trio.EndOfChannel``/``StopAsyncIteration`` to the
     # far end.
-    await self.send_stop()
+    try:
+        await self.send_stop()
Also need a test for this:
- one end of a stream is closed
- the other end trying to close before any other stream methods `.receive()`/`.send()` are called (thus detecting the closure before trying to send a stop msg)
    finally:
        # NOTE: this is ABSOLUTELY REQUIRED to avoid
        # the following wacky bug:
        # <tractorbugurlhere>
I don't think I can remember the case that caused this unfortunately 😂 though pretty sure it was in piker.
Unfortunately this may just go in without a false positive case ..
When backpressure is used and a feeder mem chan breaks during msg delivery (usually because the IPC allocating task already terminated) instead of raising we simply warn as we do for the non-backpressure case. Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid doing an arbiter-registry lookup if the current actor **is** the registrar.
Use a task nursery in the subactor to spawn tasks which cancel the IPC channel mid stream to simulate the most concurrent case we're likely to see. Make `main()` accept a `debug_mode: bool` for parametrization. Fill out detailed comments/docs on this example.
With the new fancy `_pytest.pathlib.import_path()` we can do real parametrization of the example-script-module code and thus configure whether the child, parent, or both silently break the IPC connection. Parametrize the test for all the above mentioned cases as well as the case where the IPC never breaks but we still simulate the user hammering ctl-c / SIGINT to terminate the actor tree. Adjust expected errors based on each case and heavily document each of these.
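A minimal sketch of the idea of importing an example script by path and calling its `main()` with different flags. It uses the stdlib `importlib.util` as a stand-in for `_pytest.pathlib.import_path()` (whose signature varies across pytest versions), and the script contents, file name, and the `break_parent`/`break_child` parameter names are all hypothetical, not the real example's interface.

```python
import importlib.util
import itertools
import tempfile
from pathlib import Path

# hypothetical stand-in for an example script with a parametrizable main()
EXAMPLE_SRC = '''
def main(debug_mode: bool = False, break_parent: bool = False,
         break_child: bool = False) -> str:
    if break_parent and break_child:
        return "both"
    if break_parent:
        return "parent"
    if break_child:
        return "child"
    return "none"
'''


def load_module(path: Path):
    """Import a script file as a module object, given its path."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "example_script.py"
    script.write_text(EXAMPLE_SRC)
    mod = load_module(script)

    # run main() once per (parent, child) combination
    outcomes = {
        (parent, child): mod.main(break_parent=parent, break_child=child)
        for parent, child in itertools.product([False, True], repeat=2)
    }
```

In an actual test suite the loop would more likely be a `@pytest.mark.parametrize` decorator over the same combinations, with the expected error adjusted per case.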
We weren't doing this originally I *think* just because of the path-dependent nature of the way the code was developed (originally being mega pedantic about one-way vs. bidirectional streams) but it doesn't seem like there's any issue just calling the stream's `.aclose()`; it also has the benefit of just being less code and fewer logic checks B)
710dee0 to 13c9ead
Variety of streaming related teardown fixes for when IPC goes down before streams are terminated gracefully. Mostly discovered during dev of initial draft of pikers/piker#420.
TODO:
- `OSError: [Errno 9] Bad file descriptor` in cluster test #345 => dropping 5dc07f0 did it