Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design: higher-level stream abstractions #8

Closed
njsmith opened this issue Jan 22, 2017 · 4 comments
Closed

Design: higher-level stream abstractions #8

njsmith opened this issue Jan 22, 2017 · 4 comments

Comments

@njsmith
Copy link
Member

njsmith commented Jan 22, 2017

There's a gesture towards moving beyond concrete objects like sockets and pipes in the trio._streams interfaces. I'm not sure how much of this belongs in trio proper (as opposed to a library on top), but I think even a simple bit of convention might go a long way. What should this look like?

Prior art: Twisted endpoints. I like the flexibility and power. I'm not as big a fan of the strings and plugin architecture, but hey.

@njsmith
Copy link
Member Author

njsmith commented May 1, 2017

The SSL branch (#107) has a set of ABCs for streams that I'm pretty happy with. One place that I realized it's lacking, though, while trying to write tests for SSL-over-streams, is in specifying exception handling. In particular, there are some pretty generic things that can happen to streams:

  • recv but the other end has closed the connection: stdlib sockets return b""
  • sendall but the other end has closed the connection: stdlib sockets on Linux & MacOS raise BrokenPipeError (which is a bit confusing, since there are no pipes involved, but, you know, Unix), and on Windows raise ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine
  • recv but the connection has already been closed (i.e., locally someone called close and resources have been freed): stdlib sockets on Linux & MacOS raise OSError: [Errno 9] Bad file descriptor, and on Windows raise OSError: [WinError 10038] An operation was attempted on something that is not a socket
  • sendall but the connection has already been closed (in the same sense): stdlib sockets on Linux & MacOS raise OSError: [Errno 9] Bad file descriptor, and on Windows raise OSError: [WinError 10038] An operation was attempted on something that is not a socket

(Notes: The last two cases are due to python sockets setting their internal fileno to -1 when closing. In the second case, the WinError 10053 is WSAECONNABORTED.)

There's also ECONNRESET / ConnectionResetError which can happen when the remote side has closed the connection, from both recv and sendall, depending apparently on whether there's unACKed data sitting in the TCP send buffer (some details). And WSAENETRESET, and a few other WSA things that might or might not be related. Twisted does from errno import WSAECONNRESET as ECONNABORTED, which is odd given that WSAECONNABORTED exists and is something else, but seems to be specifically for accept.

And SSL wrappers add their own exciting ways of wanting to report these same failure modes. I'm guessing when we get to subprocesses then pipes/named pipes will have their own weirdness.

So there's at the very least a temptation to collapse these down into:

  • recv but the connection is gone (somehow, network broken, whatever): return b""
  • sendall but the connection is gone (somehow, network broken, whatever): raise BrokenStream
  • recv after local close: raise ClosedStream
  • sendall after local close: raise ClosedStream

And of course we'd use exception chaining to preserve the original cause as well, so at least for debugging the information is there. The question is whether this would be making it hard to reliably/programmatically access information that turns out to be important, like if for some reason an application calling sendall needs to respond differently depending on whether that gives ECONNABORTED vs ECONNRESET.

TCP distinguishes between FIN and RST as ways of ending a connection. IIUC the difference is that FIN is a regular "eof" packet (e.g. sent by shutdown(SHUT_WR)) and gets ACKed; the orderly way to shut down a connection is for both sides to send FIN and then ACK them. OTOH RST just means "no go away" and is sent when e.g. unexpected data arrives; it doesn't get ACKed. When we call close I'm not entirely sure what happens – maybe sending a FIN plus if any new packets arrive sending a RST?

Maybe the thing to do is to allow recv to raise BrokenStreamError, so recv() == b"" indicates FIN, and everything else is an error? (Oh also I decided between sentences that probably these exceptions should have Error in the name.)

@njsmith
Copy link
Member Author

njsmith commented May 1, 2017

Ugh, here's a problem with the above: so far I've been trying to keep the Stream API a subset of the socket API, to keep things simple.

And I think it'd even be reasonable to say that our sockets convert OSErrors into the special trio exceptions in some cases.

But... sometimes sockets aren't streams at all. It'd be really weird to raise BrokenStreamError for a UDP sendto!

Ideas:

  • only do exception type conversion in socket methods if self.type == SOCK_STREAM? (Problem: ughh having method signatures change depending on the setting of an attribute really smells bad. Counter-argument: everything about sockets changes depending on the setting of self.type, that's just the BSD socket API for you.)

  • make the stream view on sockets be a wrapper around the low-level socket object (maybe: the socket API + getsockopt + setsockopt + underlying socket object?), and have the high-level functions like create_connection and ... whatever we do for listening ... return a pre-wrapped socket object, so actual sockets become a low-level thing and most people just use the high-level objects?

  • do the exception type conversion in any case, but it's OK because UDP sockets never give ECONNRESET or ECONNABORTED? (Problem: is that even true?)

...I guess if we go with the "SocketStream" approach (the 2nd option above) then it might also make sense to move our other weirdness there, like the "improved" default socket options (see #72).

@njsmith
Copy link
Member Author

njsmith commented May 2, 2017

I guess we should probably also add recv_into to the RecvStream API. It can have a trivial default implementation based on recv. (Should recv also have a trivial default implementation based on recv_into? Probably... allocating a bytearray(max_bytes) and then truncating it again is basically equivalent to what socket.socket.recv does.)

(If we're breaking from the socket API, should we also rename the methods to send_all and receive_some?)

@njsmith
Copy link
Member Author

njsmith commented Jun 12, 2017

#107 is merged with a decent Stream abstraction; comments above are all out-of-date.

@njsmith njsmith closed this as completed Jun 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant