Subprocess support #4
Looked at this some more today. We want to provide our own version of the …

- Communication: we should expose pipes as …
- Waiting: this we definitely want to implement ourselves. I think we might need to hack at …
- What else: there are other attributes, in particular …
Might be worth checking out some alternative subprocess APIs like delegator.py and pexpect (which relies on ptyprocess for a lot of the heavy lifting). From a quick skim, my tentative conclusion is that the main advantage of …

And …
There's another set of issues that it looks like I didn't write down yet, regarding how to implement pipes (i.e., when the user passes …). What …

Alternatively, there's a cute hack we might be able to use: we already have perfectly good code for wrapping sockets. And on Unix, using a …

What about Windows? Well, it turns out that on Windows, a little-known fact is that you can use a native socket object as the stdin/stdout/stderr for a subprocess, if and only if you create that socket without passing …
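For the Unix half of that hack, here's a minimal sketch of what "use a socket instead of a pipe" can look like with nothing but the stdlib (the child command is just an illustration, and this is plain blocking code, not trio's implementation):

```python
import socket
import subprocess

# One end stays in the parent (and could be wrapped by trio's existing
# socket machinery); the other end becomes the child's stdout.
parent_end, child_end = socket.socketpair()

proc = subprocess.Popen(["echo", "hello"], stdout=child_end.fileno())
child_end.close()              # the parent doesn't need the child's copy

print(parent_end.recv(1024))   # b'hello\n'
proc.wait()
parent_end.close()
```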
Well, this is promising:

In [1]: import socket, msvcrt
In [2]: s = socket.socket()
In [3]: s.fileno()
Out[3]: 472
In [4]: msvcrt.open_osfhandle(s.fileno(), 0)
Out[4]: 6
In [5]: msvcrt.get_osfhandle(6)
Out[5]: 472
It did occur to me that at least the Unix part of this approach is the same code we need for #174. The Windows part is different though.
@buhman Have you had any time to work on this?
Nope :(
@buhman, if so, maybe remove your self-assignment, so that someone else could take it?
Probably it's best to leave the assignment field empty, given that we're all volunteers here and stuff comes up. And if you're wondering if something is available or what to work on, then asking in the gitter chat is probably a good idea.
I appear to need the low-level messy stuff (i.e. waiting for child processes sanely) for the asyncio test suite.

The disadvantage of running a new thread with a blocking … The disadvantage of SIGCHLD is that it's a messy signal that only works in the main process. On the positive side, IMHO these days we can assume that libraries don't steal it from us – and, if required, we could periodically check that the handler is still present.

My take: we could implement both, as depending on the situation the users find themselves in, one may work but the other may not (… and what about Windows?). We do this by implementing a singleton which hides the whole mess from our users. A task that wants to wait for a child process should be able to call …

The next level up is forking a process and talking to it. IMHO the interface should be …

Unless somebody has better ideas, I'll start working on the first part of this.
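As a rough sketch of the thread-based option being weighed here: park a blocking waitpid() in a worker thread and hand the result back to the waiting task. This uses the present-day trio.to_thread.run_sync spelling and is purely illustrative; it is not the interface being proposed above.

```python
import os
import trio

async def wait_for_child(pid: int) -> int:
    # Block a worker thread in waitpid() until the child exits, then
    # return its status to the trio task that asked for it.
    def blocking_wait():
        _, status = os.waitpid(pid, 0)   # blocks until the child exits
        return status

    status = await trio.to_thread.run_sync(blocking_wait)
    return os.WEXITSTATUS(status)        # exit code for a normal exit
```

A SIGCHLD-based watcher would replace the worker thread with a signal handler plus a WNOHANG scan, which is exactly the messier machinery being debated in the rest of this thread.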
Is it possible to skip those parts of the asyncio test suite for now? I don't want to stop you making things better :-), but I also don't want you to get stuck on some distraction just because of the test suite – probably it's a higher priority to get trio-asyncio solid for network apps, and no one can blame trio-asyncio for not handling subprocesses when trio itself doesn't. Plus it's an easy-to-document limitation in the meantime.

Re …

On Windows and MacOS this is much nicer, but they each need their own new low-level primitives exposed. For Windows it's …

We usually model our APIs after the regular synchronous stdlib, not asyncio. So the default here would be that we're implementing an async version of …
Oh, ok, now that I've checked the asyncio docs I see that in this case they actually did the same thing and …
OK. I have implemented a rudimentary waitpid implementation and pushed that to the "sigchld" branch. Implementing further refinements, like a thread-based solution or a kqueue or …, should be straightforward. There are some tests, which happen to pass.

I still need to do some refactoring; I don't plan to do the …
@smurfix I have to admit that looking at https://github.com/python-trio/trio/compare/sigchld in its current state is mostly just reminding me why I don't want to support SIGCHLD. That's a ton of code and complexities exposed in the public API.
Well, there's public and "public". We can probably do without the top-level scan and verify methods; they're mostly for circumventing problems you really should fix by using a different child-watcher class in the first place. The most scratchy part I see is the choice of sync vs. async watcher and the fact that there are different ways of using them. I suspect that can't really be helped even when we add some other ways of dealing with child processes.

The best way to do it under Linux that I can think of, other than threads which have their own messy trade-offs, is by using a signalfd on SIGCHLD, but that still requires most of the scaffolding I added for dealing with SIGCHLD directly.

I don't see the internals of the watcher classes as public interface per se. You don't actually use any of that; all you do is …

In fact, we should hide the whole mess (adding Windows will not make it look any nicer) behind the …
I'd really rather not have any visible API beyond …

I'm not happy with my idea of calling …
Also, …
Ok. Fair enough. A couple of the comments in asyncio's Windows handling of this stuff led me to believe that some sort of global state-keeper was needed there too. If that's wrong, so much the better.

Well, we can't do it all inside the … I'll implement a thread-based solution instead, and put that in hazmat. OK?

NB, I do wonder whether one thread which simply waits on all processes might not be a good idea.
Asyncio does its best but it's not a terribly reliable source for this kind of thing. On Windows there's the equivalent of …
Oh, I see, because you want to wait on an arbitrary pid that some other code spawned. If someone starts a process by calling …

But I guess you are thinking in terms of trio-asyncio using some hybrid of asyncio's process handling code and trio's process handling code? I suspect that's not going to work satisfactorily, and we'll eventually want to implement the process protocol handling stuff on top of …
Oh well. I found out yesterday that SIGCHLD handling is not only fraught with problems, it cannot work reliably at all. The problem is that while the main thread blocks, waiting for the Trio thread, it does not process SIGCHLD. I also cannot replace the blocking queues with Trio-ish ones because that opens the window for a lot of very interesting race conditions within asyncio.

Now, what do we do about cancellations vs. subprocesses we started? (A not-canceled subprocess is easy: close its stdin and wait for it to die on its own.)
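To make the "close its stdin and wait for it to die" policy concrete, here's a small sketch using a hypothetical proc object with .stdin and .wait(); this is not an existing trio API:

```python
import trio

async def run_and_reap(proc):
    # Normal case: just wait for the child to exit.
    # On cancellation: close the child's stdin and, inside a shielded
    # scope, still wait for it to exit on its own before re-raising.
    try:
        return await proc.wait()
    except trio.Cancelled:
        with trio.CancelScope(shield=True):
            await proc.stdin.aclose()
            await proc.wait()
        raise
```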
@njsmith I made a Python socat version which worked on Python 2.7 but failed on Python 3. With this trick everything works well (using a TCP connection as the stdin/stdout of a process).
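This is the same fd-passing mechanism as the socketpair sketch earlier in the thread, just pointed at a TCP connection. A stdlib-only sketch of the Unix-flavoured case (host/port and the child command are placeholders, not from the original report):

```python
import socket
import subprocess

# Connect to some server; the child then talks to it directly, using the
# TCP connection as its stdin and stdout (socat-style).
sock = socket.create_connection(("127.0.0.1", 12345))   # placeholder address

proc = subprocess.Popen(
    ["cat"],                 # placeholder child: echoes its input back
    stdin=sock.fileno(),
    stdout=sock.fileno(),
)
proc.wait()
sock.close()
```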
@njsmith |
@diorcety Ha, that's clever! Am I understanding right, though, that using this trick makes Python stop setting …?
@njsmith Indeed.
Regarding the …
So does trio.subprocess settle with a reliable internal implementation for …?
@imrn I'm not sure I understand the question, but as of right now trio doesn't have subprocess support yet (which is why this issue is still open).
What about the way curio does it? Do they solve the problems that trio is running into?
There are no "problems trio is running into". There's an initial subprocess implementation in https://github.com/smurfix/trio/tree/subprocess (I just merged it up to the current Trio master) which works well enough, except that @njsmith would like to see a few changes (which I'd love to have time to work on, but …).
It is based on waiting in a separate thread, similar to curio, right? And there are Windows supplements. Did you ever think about using kqueue for BSD/macOS?
@smurfix Ah, I hadn't seen the updates after Feb 2, so I had the impression from earlier comments that there were some technical challenges preventing it from being in trio. Good to hear it's working.
The only problem trio is running into here is that it's complicated to get this right when you take into account differences between Windows/Linux/kqueue platforms, handling cancellation of …
Yeah, there are some tricky questions here! One constraint is that we need to make it possible for people to build complicated subprocess control schemes, like sending signals at arbitrary times, starting to …

The stdlib behavior is not too unreasonable: you can explicitly call …

My criticism of this way of handling …

I suspect we'll also want to make subprocesses implement the …
@nosklo I think we're more likely to spell it something like …
There's a lot of chatter in this thread, which is interesting and has good stuff in it, but probably makes it difficult for folks to see the big picture. So here's a quick overview of what needs to be done (and then I'll edit the first comment to link down to this):

Trio is designed in layers, and this will need work across multiple layers. Ultimately, we want something like the …
This is straightforward in theory, but the I/O part will need special handling on Unix vs. Windows, and the waiting part will need totally different implementations for Linux vs. MacOS/the BSDs vs. Windows. And for extra fun, in order to write these implementations, we'll need to expose some more fundamental primitives inside Trio's core I/O loop. Here's one possible plan, from the bottom up. This is a pretty complete version; some of these steps can possibly be avoided/deferred with cleverness.
WaitForSingleObject
Would it be acceptable to implement each platform separately? I personally only need Linux support. I'm happy to contribute to Windows and Mac, but I'd like to play with something on Linux as soon as possible. Would you accept a contribution that only works on Linux but has the right cross-platform API?
Regarding Step 1, am I right in thinking we don't really want a new thread waiting for every subprocess? I'm imagining one thread calling …
I want to be a little careful about going too far down the path of supporting one platform at a time, because that's how a lot of projects end up with e.g. Windows as the forever-slightly-broken second-class-citizen. But of course things have to be implemented in some order, and if you want to start with Linux then sure, that's fine.
I was trying to keep it simple :-). I think one thread per handle would be a fine first version that's relatively simple to implement, and then we could always optimize it further in the future. But if someone wanted to go ahead and implement the fancy thing from the beginning, that'd be fine too! Note that a single …
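A sketch of that "one thread per handle" first version on Windows, calling the Win32 wait directly through ctypes and using the present-day trio.to_thread.run_sync (illustrative only, Windows-specific):

```python
import ctypes
import trio

INFINITE = 0xFFFFFFFF   # standard Win32 "no timeout" value

async def wait_for_process_handle(handle: int) -> None:
    # Park one worker thread in WaitForSingleObject until the process
    # handle is signalled, i.e. until the process has exited.
    kernel32 = ctypes.windll.kernel32
    await trio.to_thread.run_sync(kernel32.WaitForSingleObject, handle, INFINITE)
```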
Edit: if you're just coming to this issue, then this comment has a good overview of what there is to do: #4 (comment)
Original description
Lots of annoying and fiddly details, but important!
I think for waiting, the best plan is to just give up on SIGCHLD (seriously, SIGCHLD is the worst) and park a thread in waitpid for each child process. Threads are lighter weight than processes so one thread-per-process shouldn't be a big deal. At least on Linux - if we're feeling ambitious we can do better on kqueue platforms. On Windows, it depends on what the state of our WaitFor{Multiple,Single}Object-fu is.
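For the "do better on kqueue platforms" aside, here's what that could look like on the BSDs/macOS with the stdlib select module. It's written as plain blocking code for clarity; a real implementation would feed the kqueue into the event loop rather than blocking on it:

```python
import os
import select

def wait_for_child_kqueue(pid: int) -> int:
    # Register interest in the child's exit, block until the NOTE_EXIT
    # event fires, then reap the child with waitpid.
    kq = select.kqueue()
    event = select.kevent(
        pid,
        filter=select.KQ_FILTER_PROC,
        flags=select.KQ_EV_ADD | select.KQ_EV_ONESHOT,
        fflags=select.KQ_NOTE_EXIT,
    )
    kq.control([event], 0)    # add the event, don't wait yet
    kq.control([], 1, None)   # block until the child exits
    _, status = os.waitpid(pid, 0)
    kq.close()
    return os.WEXITSTATUS(status)
```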