
Add support for pipes. #621

Merged
merged 15 commits into from
Aug 29, 2018

Conversation

@Fuyukai Fuyukai (Member) commented Aug 22, 2018

This is Step 4 of #4 (comment), and adds support to Trio for reading from and writing to os.pipe file descriptors.

@njsmith njsmith mentioned this pull request Aug 22, 2018
codecov bot commented Aug 22, 2018

Codecov Report

Merging #621 into master will increase coverage by 0.02%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #621      +/-   ##
==========================================
+ Coverage   99.28%   99.31%   +0.02%     
==========================================
  Files          91       93       +2     
  Lines       10785    10954     +169     
  Branches      770      782      +12     
==========================================
+ Hits        10708    10879     +171     
+ Misses         58       56       -2     
  Partials       19       19
Impacted Files                              Coverage Δ
trio/tests/subprocess/test_unix_pipes.py    100% <100%> (ø)
trio/_subprocess/unix_pipes.py              100% <100%> (ø)
trio/testing/_check_streams.py              99.31% <0%> (+0.68%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fc7c7e0...aaad5b5.

@njsmith njsmith (Member) commented Aug 23, 2018

Highlevel comment on both this and #622: this functionality is great, and obviously an important step towards #4. One question we need to answer along the way is how we're going to expose it – eventually it'll be part of some full-fledged subprocess spawning API, but should we expose the individual pieces, and if so, how? For WaitForSingleObject, this was pretty easy: it's an OS-specific core primitive, and the proper API was obvious, so, make it public in trio.hazmat, done. For these, it's a little more complicated: these particular implementations are specific to Unix/Linux, but the concepts are more general; and, I think we'll probably want to fine-tune the API as we go. (For example, with pipes we need to create a pipe where one end is wrapped in trio and non-blocking, and the other is in blocking mode to be passed to the child; and with waitpid, I suspect we might adjust how we track the WaitpidResult once we have Popen objects that could hold them – like maybe we'll kick off the waitpid call immediately when the process is created, and stash the WaitpidResult in the Popen object?)

So, my suggestion for right now is:

  • Go ahead and add the implementations, maybe even in a subpackage like trio/_subprocess/...
  • Go ahead and write the tests
  • Don't export them publicly
  • Don't write newsfragments (because there's nothing for users yet)

That way we can incrementally write the different pieces, review them, test them, etc., and eventually we'll figure out how to hook them together into a coherent subprocess API.

Does that make sense?

@njsmith njsmith (Member) commented Aug 23, 2018

Two more quick comments before I go to bed:

  • We're going to need __del__ methods here... most of trio doesn't bother, because e.g. SocketStream holds a trio.socket.SocketType, which holds a socket.socket... and socket.socket already has a __del__ that makes sure the OS-level socket object is eventually closed. os.pipe doesn't provide that kind of wrapper, so we have to take care of __del__ ourselves.

  • Check out trio.testing.check_one_way_stream
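A minimal stdlib-only sketch of the first point, using a hypothetical FdHolder name: the object owns the raw fd, and its __del__ is the last-resort guarantee that the fd eventually gets closed (the same service socket.socket provides for SocketStream automatically).

```python
import os

class FdHolder:
    """Hypothetical sketch: an object that owns a raw fd and makes
    sure it is eventually closed, even if nobody calls close()."""

    def __init__(self, fd: int):
        self._fd = fd

    def close(self) -> None:
        if self._fd != -1:
            os.close(self._fd)
            self._fd = -1  # sentinel: -1 is never a valid fd

    def __del__(self):
        # Last-resort cleanup; os.pipe hands out bare ints, so unlike
        # socket.socket there's no wrapper doing this for us.
        self.close()

r, w = os.pipe()
holder = FdHolder(r)
del holder  # on CPython the refcount hits zero and __del__ closes r
try:
    os.close(r)
except OSError:
    pass  # already closed by __del__, as intended
os.close(w)
```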

@Fuyukai Fuyukai (Member, Author) commented Aug 23, 2018

Re: check_one_way_stream - this can't be used, because it does a check that closing side W of a pipe ensures side R cancels reading (which, well, it doesn't, because one half of the fds are closed on a fork-with-exec).

@njsmith njsmith left a comment (Member)

Some more detailed comments, in addition to the more general ones above...

self._pipe = pipefd

async def aclose(self):
os.close(self._pipe)
Member

We should also make sure that after aclose, future operations will fail with ClosedStreamError, even if the file descriptor value gets reallocated to a new file. One way would be to add a self._closed attribute that we check before each operation. The other option (what socket.socket does) is to assign self._pipe = -1, since -1 cannot be a valid fd.
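A synchronous sketch of the -1 approach (the ClosedStreamError stand-in here is hypothetical, defined just for the example): after close, the stored fd is -1, so any later operation fails with EBADF no matter who now owns the old fd number.

```python
import os

class ClosedStreamError(Exception):
    """Stand-in for trio's ClosedStreamError, just for this sketch."""

class PipeLike:
    def __init__(self, fd: int):
        self._pipe = fd

    def close(self) -> None:
        os.close(self._pipe)
        self._pipe = -1  # -1 can never be a real fd, so fd reuse is harmless

    def read(self, n: int) -> bytes:
        try:
            return os.read(self._pipe, n)
        except OSError as e:
            # os.read(-1, ...) always fails with EBADF
            raise ClosedStreamError from e

r, w = os.pipe()
p = PipeLike(r)
os.write(w, b"hi")
assert p.read(2) == b"hi"
p.close()
# Even if another file now reuses the old fd number, p can't touch it:
# p.read(2) raises ClosedStreamError
os.close(w)
```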

Member

The -1 thing is also baked into the _translate_socket_errors_to_stream_errors... on Unix, its whole job is to convert EBADF into a ClosedStreamError (because EBADF is what you get when you try to use -1 as an fd), and to convert everything else into BrokenStreamError.

It might make sense to write a _translate_pipe_errors_to_stream_errors, even if it's very similar to the socket version, just because both error-converters are so tightly coupled to internal implementation decisions that it's nice to not have to think about code in other files when looking at them.

Member

Oh, and we should be calling notify_fd_close here too.

class _PipeMixin:
def __init__(self, pipefd: int):
if not isinstance(pipefd, int):
raise TypeError("PipeSendStream needs a pipe fd")
Member

Error message has wrong class name

with view[total_sent:] as remaining:
total_sent += os.write(self._pipe, remaining)

await self.wait_send_all_might_not_block()
Member

This is going to crash if the pipe buffer fills up – os.write will raise BlockingIOError (EWOULDBLOCK), and we don't have anything to catch it. In SocketStream.send_all, it has a slightly easier job, because it doesn't call the raw OS send function directly – it calls the trio.socket send method, and that takes care of retrying and waiting. See _nonblocking_helper and _try_sync in trio.socket for the gory details... not that they're perfect, they're probably a bit over-elaborate themselves.

await self.wait_send_all_might_not_block()

async def wait_send_all_might_not_block(self) -> None:
await _core.wait_socket_writable(self._pipe)
Member

We shouldn't be using socket functions on non-socket fds, that's just confusing :-). Of course we can get away with it on Unix, and this is Unix-only code, so it's more a style thing. But if this code were accidentally run on Windows, then calling the socket functions would invoke undefined behavior, while using the fd versions would give a clean error.

def make_pipe() -> Tuple[PipeReceiveStream, PipeSendStream]:
"""Makes a new pair of pipes."""
(r, w) = os.pipe2(os.O_NONBLOCK)
return PipeReceiveStream(r), PipeSendStream(w)
Member

I suspect it'll end up being a bit nicer if we make the stream __init__ methods accept arbitrary pipes, and set them to non-blocking mode at that point. Then when we only want to use one side of a pipe with trio, we can write

r_raw, w_raw = os.pipe()
r_trio = PipeReceiveStream(r_raw)
# ... now use r_trio and w_raw ...

Also, elsewhere in trio where we return a pair like this, we do (sender, receiver). Which yeah is the opposite of os.pipe. But trio's way is easier to remember ("data flows from left to right, like you read").

Neither of these is a big issue right now since we're keeping this private, and once we start trying to use it for real we'll probably find other things we want to tweak. But I figured I'd mention them.
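The suggested __init__-side setup can be sketched with plain fcntl (the helper name is hypothetical); only the end that trio wraps is switched to non-blocking, while the other end stays in blocking mode, ready to hand to a child process:

```python
import fcntl
import os

def set_nonblocking(fd: int) -> None:
    """What the stream __init__ would do, per the suggestion above:
    put an arbitrary fd into non-blocking mode."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

r_raw, w_raw = os.pipe()  # plain blocking pipe from the OS
set_nonblocking(r_raw)    # only the trio-wrapped end goes non-blocking
# w_raw stays blocking, suitable for passing to a subprocess

try:
    os.read(r_raw, 1)     # empty pipe + O_NONBLOCK -> BlockingIOError
except BlockingIOError:
    pass
os.close(r_raw)
os.close(w_raw)
```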

@njsmith njsmith (Member) commented Aug 24, 2018

Re: check_one_way_stream - this can't be used, because it does a check that closing side W of a pipe ensures side R cancels reading (which, well, it doesn't, because one half of the fds are closed on a fork-with-exec).

Not sure I follow... we wouldn't be doing fork/exec in the middle of a call to check_one_way_stream, right? Obviously when we actually use the pipe streams it'll be in a more complicated situation involving subprocesses, but that doesn't mean we can't use check_one_way_stream in a simple in-process case to validate that the stream code works and handles the edge cases correctly.

@Fuyukai Fuyukai (Member, Author) commented Aug 24, 2018

The check_one_way_stream runs a test that ensures that closing the S side of the stream stops the R side from working - with pipes, this can't happen due to child processes and what-not.

@njsmith njsmith (Member) commented Aug 24, 2018

The check_one_way_stream runs a test that ensures that closing the S side of the stream stops the R side from working - with pipes, this can't happen due to child processes and what-not.

This isn't a problem. It would be a problem if we stopped in the middle of a test to spawn a child process and shared our pipe object with it, but we won't do that :-). If you're just using a pipe object within a process, then it follows the usual rule where closing the send side causes the receive side to get an EOF:

In [1]: import os

In [2]: R, S = os.pipe()

In [3]: os.write(S, b"x")
Out[3]: 1

In [4]: os.close(S)

In [5]: os.read(R, 10)
Out[5]: b'x'

In [6]: os.read(R, 10)
Out[6]: b''

@@ -0,0 +1,25 @@
import pytest
Member

Instead of making another copy of this, I'm thinking we should put the subprocess tests into trio/tests/subprocess/? Or _subprocess/ or test_subprocess/, I guess, whichever looks nicest. (And at some point we should move the _core tests into trio/tests/ too. And probably rename it to trio/_tests/. But that can wait for another PR...)


pytestmark = pytest.mark.skipif(
not hasattr(os, "pipe2"), reason="pipes require os.pipe2()"
)
Member

Now that we don't use pipe2 this looks a little odd :-). Maybe skipif(os.name != "posix", reason=...) would be a bit clearer. (I think it has the same effect.)

self._pipe = pipefd
self._closed = False

if set_non_blocking:
Member

I don't think there's any reason for this to be configurable? We can just do it unconditionally.

pass
except OSError as e:
# already closed from somewhere else
if e.errno != 9:
Member

Whoops, you should never hard-code errno values like this – they're arbitrary, and can be different on different platforms. (Though 9 does seem to be EBADF on both Linux and macOS... weird!) Instead, use the stdlib errno module, e.g. e.errno != errno.EBADF.

Except... in this case, I actually think we shouldn't be filtering out EBADF. Instead we should follow the rule that once you give an fd to one of these pipe objects, the pipe object "takes ownership" of it, and is the only object that should be used to close it. Closing the fileno somewhere else is a bug.

In fact, this probably explains some of your weird test failures. I bet what's happening is:

  1. The test allocates the fd, and wraps it in a trio pipe object
  2. The test closes the fd by hand
  3. Later, some other fd is opened, and is assigned the same numerical value as the earlier pipe fd.
  4. Finally, the trio pipe object gets GCed, and its __del__ method closes whatever random fd was opened in step 3.
  5. Chaos ensues.

File descriptors are a weird kind of global state with no guard-rails. They're very C-ish. It's important to know exactly who is responsible for managing them.
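Steps 1–3 of that failure mode can be reproduced in miniature: POSIX hands out the lowest available fd number, so a closed fd's number is recycled almost immediately (a sketch, assuming nothing else touches the process's fd table in between):

```python
import os

# Step 1: allocate an fd (in the real bug, it then gets wrapped in a
# trio pipe object).
r, w = os.pipe()
# Step 2: the test closes the fd by hand.
os.close(r)
# Step 3: some other fd is opened and gets the same numerical value,
# because POSIX always allocates the lowest free fd number.
r2, w2 = os.pipe()
assert r2 == r  # the old number is recycled immediately
# Step 4 (not shown): a stale __del__ on the wrapper would now close
# *this* unrelated fd. Step 5: chaos ensues.
os.close(w)
os.close(r2)
os.close(w2)
```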


if max_bytes < 1:
await _core.checkpoint()
raise ValueError("max_bytes must be integer >= 1")
Member

You can save a bit of repetition by making it if not isinstance(max_bytes, int) or max_bytes < 1

@Fuyukai Fuyukai (Member, Author) Aug 25, 2018

Can't, must be different errors for the test.

Member

Oh yeah, good point :-)

@@ -0,0 +1 @@
Add support for pipes (os.pipe).
Member

No newsfragment for now, since the API's not public yet.

@njsmith njsmith left a comment (Member)

Another pass.

def nonblock_pipe(p: int):
import fcntl
flags = fcntl.fcntl(p, fcntl.F_GETFL)
fcntl.fcntl(p, fcntl.F_SETFL, flags | os.O_NONBLOCK)
Member

What's the point of this? You seem to only call it on fds that are about to be passed to PipeSendStream or PipeReceiveStream, which also set their inputs to non-blocking, so it seems redundant...?

Member Author

I clearly wasn't thinking clearly when I wrote this, whoops.

try:
data = os.read(self._pipe, max_bytes)
if data == b'':
await self.aclose()
Member

Huh, I'm surprised this passes check_one_way_stream! You shouldn't close here; after receive_some returns b"", calling it again should give b"" again, not ClosedResourceError. And we should fix check_one_way_stream so that it tests this corner case. Well done finding that omission :-).

except BlockingIOError:
await _core.wait_readable(self._pipe)
else:
await _core.checkpoint()
Member

Doing a full checkpoint here unfortunately doesn't quite work right – the problem is that this should only raise Cancelled if it did nothing. If we do a full checkpoint here, then we could read some data out of the pipe, and then drop it on the floor when Cancelled is raised.

This is why the operations in trio/_socket.py split the checkpoint into two pieces: before we attempt the operation, we do a await checkpoint_if_cancelled(), and then after the operation succeeds, we do a await cancel_shielded_checkpoint(). That way we always do a full checkpoint one way or another, but if the operation succeeds we're guaranteed not to raise Cancelled.
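Here's a synchronous stdlib sketch of that loop shape; the comments mark where trio's two half-checkpoints would go (the trio calls themselves are left as comments so the sketch stays runnable without trio):

```python
import os
import select

def receive_some_sketch(fd: int, max_bytes: int) -> bytes:
    """Sketch of the receive loop described above, for a non-blocking
    pipe fd; blocking select stands in for trio's wait_readable."""
    # await checkpoint_if_cancelled()  <- before the attempt: may raise
    # Cancelled, but nothing has been read yet so nothing is lost
    while True:
        try:
            data = os.read(fd, max_bytes)
        except BlockingIOError:
            # await wait_readable(fd)  <- in trio; here we just block
            select.select([fd], [], [])
        else:
            # await cancel_shielded_checkpoint()  <- after success: never
            # raises Cancelled, so the data we just read can't be dropped
            return data

r, w = os.pipe()
os.set_blocking(r, False)
os.write(w, b"hello")
assert receive_some_sketch(r, 10) == b"hello"
os.close(r)
os.close(w)
```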



async def test_pipe_fully():
await check_one_way_stream(make_pipe, None)
Member

We should fill in the second argument here – it lets check_one_way_stream do much more thorough testing.

Something like:

async def make_clogged_pipe():
    s, r = make_pipe()
    try:
        while True:
            # We want to totally fill up the pipe buffer.
            # This requires working around a weird feature that POSIX pipes have.
            # If you do a write of <= PIPE_BUF bytes, then it's guaranteed
            # to either complete entirely, or not at all. So if we tried to write
            # PIPE_BUF bytes, and the buffer's free space is only
            # PIPE_BUF/2, then the write will raise BlockingIOError... even
            # though a smaller write could still succeed! To avoid this,
            # make sure to write >PIPE_BUF bytes each time, which disables
            # the special behavior.
            # For details, search for PIPE_BUF here:
            #   http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html
            os.write(s.fileno(), b"x" * select.PIPE_BUF * 2)
    except BlockingIOError:
        pass
    return s, r

except BlockingIOError:
pass

await self.wait_send_all_might_not_block()
@njsmith njsmith (Member) Aug 25, 2018

I think the way I'd structure send_all is:

  • Just do a checkpoint as the very first thing, given how this is structured it doesn't seem worth the hassle of avoiding this.

  • Structure the main loop like:

while True:
    <try to write some data>
    <check if we're done; if so, break out of the loop>
    <wait for the pipe to be writable>
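A stdlib-only sketch of that loop structure for a non-blocking pipe fd (blocking select stands in for trio's wait_send_all_might_not_block):

```python
import os
import select

def send_all_sketch(fd: int, data: bytes) -> None:
    """Sketch of the suggested send_all loop for a non-blocking
    pipe fd."""
    total_sent = 0
    with memoryview(data) as view:
        while True:
            # <try to write some data>
            try:
                with view[total_sent:] as remaining:
                    total_sent += os.write(fd, remaining)
            except BlockingIOError:
                pass  # buffer full; fall through and wait
            # <check if we're done; if so, break out of the loop>
            if total_sent == len(data):
                break
            # <wait for the pipe to be writable>
            select.select([], [fd], [])

r, w = os.pipe()
os.set_blocking(w, False)
send_all_sketch(w, b"ping")
assert os.read(r, 4) == b"ping"
os.close(r)
os.close(w)
```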

# also doesn't checkpoint so we have to do that
# ourselves here too
await _core.checkpoint()
raise BrokenStreamError from e
Member

Maybe put this inside wait_send_all_might_not_block?

@njsmith njsmith left a comment (Member)

OK, did a close read as requested. A few trivial little comments, and one substantive one: where'd __del__ go? Before there were bugs because __del__ was closing the fds and also your tests were closing the same fds directly, but we should fix that by not closing the fds directly, not by removing __del__ :-).

self._pipe = pipefd
self._closed = False

import fcntl
Member

Can we move this import to the top of the file, like other imports?

Member Author

The import was originally there because windows tests would import test_unix_pipes and crash when fcntl was imported. I rearranged some stuff in the test so now this won't happen.


if max_bytes < 1:
await _core.checkpoint()
raise ValueError("max_bytes must be integer >= 1")
Member

Oh yeah, good point :-)

await _core.cancel_shielded_checkpoint()
break

return data
Member

I guess technically you could move the checkpoint stuff out of the loop but in practice it doesn't really matter.

# the special behavior.
# For details, search for PIPE_BUF here:
# http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html
buf_size = getattr(select, "PIPE_BUF", 8192)
Member

Please add a comment on this with a link to https://bitbucket.org/pypy/pypy/issues/2876/selectpipe_buf-is-missing-on-pypy3 , to improve our chances of noticing later when it becomes unnecessary.

@Fuyukai Fuyukai (Member, Author) commented Aug 26, 2018

Well, if __del__ doesn't close the fds, what else can it do?

@njsmith njsmith (Member) commented Aug 26, 2018

As a general rule in Python, any OS resource should be owned by some object whose __del__ will make sure it's eventually closed. So, here that's the Pipe*Stream classes, and they should have a __del__ that closes the fds. And then to avoid double-closes, once you pass an fd into a Pipe*Stream, its __del__ or aclose should be the only thing that closes the fds.

I guess testing __del__ to make sure it actually closes things is a little tricky... I guess you could do something like:

# Maybe this should move somewhere else?
from trio._core.tests.tutil import gc_collect_harder

async def test_pipe_del():
    s, r = make_pipe()
    s_fd = s.fileno()
    r_fd = r.fileno()
    del s, r
    gc_collect_harder()
    with pytest.raises(OSError) as excinfo:
        os.close(s_fd)
    assert excinfo.value.errno == errno.EBADF
    with pytest.raises(OSError) as excinfo:
        os.close(r_fd)
    assert excinfo.value.errno == errno.EBADF

@njsmith njsmith left a comment (Member)

One minor comment, and otherwise looks good – feel free to merge after addressing that :-)

Thanks for patience with all my nitpicking!


async def __aexit__(self, exc_type, exc_val, exc_tb):
await self.aclose()
return False
Member

You don't need to define __aenter__ and __aexit__ here – trio.abc.AsyncResource provides generic implementations, and SendStream and ReceiveStream inherit from AsyncResource.

Member Author

Oh, yeah, whoops.
