wurlitzer.pipes hang if output from C is very large #48
Comments
I believe this might turn out to be the same as #20. I can reproduce it with this script:

```python
import ctypes

import wurlitzer

libc = ctypes.CDLL(None)


def main():
    sz = 64000
    while True:
        sz = sz + 100
        print(sz)
        buf = b'1' * sz
        with wurlitzer.pipes(stderr=False) as (stdout, _):
            libc.printf(b"%s\n", buf)
        print(len(stdout.read()))


if __name__ == "__main__":
    main()
```

and you're right, it occurs when the buffer exceeds 64k (65536 bytes). The block occurs in

It should work if you use

```python
import io

with pipes(stdout=io.StringIO(), stderr=io.StringIO()) as (stdout, stderr):
    result = c_function(args)

# seek(0) resets the cursor to the beginning, so we can read what was written
stdout.seek(0)
lines = stdout.readlines()
...
```
Should be fixed by #49, which does the above by default.

On further investigation, it is related to #20, but not quite the same. Both are due to hangs on full pipes, but #20 is a hang on the
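The full-pipe hang being discussed can be reproduced without wurlitzer at all. A minimal sketch (assumes a default Linux configuration, where a pipe's kernel buffer is 64 KiB): a pipe with no reader accepts writes until its buffer is full, then blocks forever. Putting the write end in non-blocking mode turns that hang into a `BlockingIOError`, so we can measure the capacity instead of deadlocking.

```python
import os

# Create a pipe and never read from it.
r, w = os.pipe()
# Non-blocking writes raise BlockingIOError instead of hanging
# once the kernel buffer is full.
os.set_blocking(w, False)

written = 0
try:
    while True:
        # Writes of <= PIPE_BUF (4096 on Linux) are atomic, so each
        # call either writes the whole chunk or raises.
        written += os.write(w, b"x" * 4096)
except BlockingIOError:
    pass

print(written)  # 65536 on a default Linux configuration
```

A blocking write at this point would be the hang the thread is describing: nothing in the writing process can proceed until something drains the pipe.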
Excellent. I'll need to wait for the 3.0.0 release to check it out.

Just released 3.0, let me know how it goes!
I tried running it this way, but I'm still seeing the same behavior when it gets too many bytes. Is there a limit that you programmed in that might be configurable?
Do you have an example you can extract? I've tested with code that produces multiple megabytes of output and it works. It's still possible you are hitting #20 (i.e. your program holds the GIL while producing low-level output), in which case there is an upper limit of the pipe buffer size. #51 should try to bump this to the system max pipe size, but I don't know on how many systems this works. Can you tell me what you get for
Extracting an example could be difficult, unfortunately. However, I estimate the size of the output when not using wurlitzer to be about 3.2M. Meanwhile, Also
Python fcntl doesn't bother to expose it, but you can still use it as hardcoded. If your C code does hold the GIL the whole time it is writing all of this data, I don't think there is much we can really do, unless we introduce intermediate files to store an arbitrary-length buffer. The code has to yield some time to give the Python thread a chance to consume the pipe.
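Using the constant "as hardcoded" can be sketched as follows. This is an assumption-laden illustration, not wurlitzer's own code: `F_SETPIPE_SZ` (1031) and `F_GETPIPE_SZ` (1032) are Linux-specific values from the kernel's `fcntl.h`, and CPython's `fcntl` module only started exposing them in Python 3.10.

```python
import fcntl
import os

# Linux-only fcntl commands; fall back to hardcoded values
# (from <fcntl.h>) on Pythons older than 3.10.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)

r, w = os.pipe()
print(fcntl.fcntl(w, F_GETPIPE_SZ))  # default capacity: 65536 on Linux

# Request a 1 MiB buffer. Unprivileged processes can grow a pipe
# up to /proc/sys/fs/pipe-max-size (1 MiB by default); beyond that
# the kernel refuses with EPERM.
fcntl.fcntl(w, F_SETPIPE_SZ, 1 << 20)
print(fcntl.fcntl(w, F_GETPIPE_SZ))
```

The EPERM behavior is what surfaces later in the thread as a `PermissionError` when the requested size exceeds the sysctl ceiling.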
Well, this is interesting:

This also fails with a 1 instead. I've also tried setting the bufsize argument to pipes, using StringIO, etc. I can see that in ff1ec5f you are ignoring this OSError. Have you seen it work? Probably I'm doing something wrong. I was confused by why you're setting it for pipe_in, but not understanding the underlying os.pipe stuff is what led me here in the first place. :) Thanks for all your help!
Indeed, it wasn't working and I was suppressing the error telling us so. #52 fixes it to actually set the size, and warns on failure.

It should actually do what it was claiming in 3.0.1
Well, other than seeing the warning about the OSError, I'm not seeing much difference. I replicated some of the code in _setup_pipes in the IDE and found that exceeding the 1MB value led to a PermissionError?

Curious, I attached a sys.audithook and found this:

I don't entirely know what that means, but is it possible that my installation of python 3.8 does not permit raising the pipe size above
I think it's actually your linux kernel that prevents exceeding the limit in /proc/sys/fs/pipe-max-size. If you can raise it:

```shell
echo 16777216 > /proc/sys/fs/pipe-max-size  # 16MB
cat /proc/sys/fs/pipe-max-size
```

then it should handle up to 16MB before the thread needs to start consuming things
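That ceiling can also be inspected from Python. A small sketch, assuming a Linux system where the sysctl is exposed at its usual procfs path:

```python
# /proc/sys/fs/pipe-max-size is the largest capacity an unprivileged
# process may request with F_SETPIPE_SZ; 1048576 (1 MiB) by default.
with open("/proc/sys/fs/pipe-max-size") as f:
    pipe_max = int(f.read())

print(pipe_max)
```

If this number is smaller than the output your C code produces while holding the GIL, the pipe will still fill up and the hang will recur.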
I think you're right, and no I don't have permissions (and my case wouldn't be very safe or portable if it were necessary anyway). I will have to come at this issue from another angle, probably. Thanks for all your help and I hope that wurlitzer became more robust because of all this!
It absolutely did, thanks for your testing! You should be able to do arbitrary size with files. I'll think about adding an option to use files in wurlitzer, but in the meantime you can do it without wurlitzer (it's simpler with files because you don't need the read thread). The below context manager copies out the

```python
import ctypes
import os
import sys
from contextlib import contextmanager

libc = ctypes.CDLL(None)

# simplified linux-only version from wurlitzer
c_stdout_p = ctypes.c_void_p.in_dll(libc, 'stdout')
c_stderr_p = ctypes.c_void_p.in_dll(libc, 'stderr')


@contextmanager
def capture_to_file(stdout="./stdout", stderr="./stderr"):
    stdout_f = stderr_f = None
    if stdout:
        stdout_f = open(stdout, mode="wb")
        real_stdout = sys.__stdout__.fileno()
        save_stdout = os.dup(real_stdout)
        os.dup2(stdout_f.fileno(), real_stdout)
    if stderr:
        stderr_f = open(stderr, mode="wb")
        real_stderr = sys.__stderr__.fileno()
        save_stderr = os.dup(real_stderr)
        os.dup2(stderr_f.fileno(), real_stderr)
    try:
        yield stdout, stderr  # filenames, unlike wurlitzer's pipes
    finally:
        # flush to capture the last of it
        libc.fflush(c_stdout_p)
        libc.fflush(c_stderr_p)
        if stdout:
            os.dup2(save_stdout, real_stdout)
            stdout_f.close()
        if stderr:
            os.dup2(save_stderr, real_stderr)
            stderr_f.close()
```

```python
...
with capture_to_file() as (stdout_fname, stderr_fname):
    produce_output()

with open(stdout_fname) as f:
    stdout = f.read()
os.remove(stdout_fname)
```

Full example in this gist.
Thanks, I'll give that a shot when I get a chance!
Bug description

The following code works for most C function calls, but hangs when the C side tries to output thousands of lines. Unfortunately, I am not able to quantify the number of kB at which it hangs:

Naturally, when the `c_function` is called without the context manager, all lines are printed to stdout and the application completes normally. I have also tried changing the arguments to pipes, and I've tried `sys_pipes()`, etc.

Expected behaviour

I know that pipes filling up can cause this, so I suspect that the stdout buffer is exceeding 64k. If you have any suggestions as to how to increase that size, I'm open to that as well; however, adding that as an argument to `pipes` would be ideal.

Actual behaviour

How to reproduce

Try creating a C function that outputs many kB of text and determine at which point it hangs.

Your personal set up