Fails to capture long output from C stdio #16

marcom · 2023-07-10T23:55:31Z

This currently fails:

using IOCapture, Test
function test_puts(str)
    cap = IOCapture.capture() do
        @ccall puts(str::Cstring)::Cint
    end
    @test cap.value == length(str) + 1
end
test_puts("Hello World\n"^100_000)

For Julia-only code using e.g. print this works fine.

Using Libc.flush_cstdio() doesn't seem to help:

function test_puts_flush(str)
    cap = IOCapture.capture() do
        ret = @ccall puts(str::Cstring)::Cint
        Libc.flush_cstdio()
        ret
    end
    @test cap.value == length(str) + 1
end
test_puts_flush("Hello World\n"^100_000)

Related: JuliaIO/Suppressor.jl#46 and JuliaIO/Suppressor.jl#47

The text was updated successfully, but these errors were encountered:

mortenpi · 2023-07-11T00:46:23Z

Okay, interesting. Probably not too surprising that directly calling into C would not work properly (IOCapture just uses redirect_stdour/err functions).

I also see an issue with a short string, where we don't actually return the capture. With

function test_puts(str)
    IOCapture.capture() do
        @ccall puts(str::Cstring)::Cint
    end
end

I get

julia> test_puts("Hello World\n"^10)
(value = 121, output = "", error = false, backtrace = Ptr{Nothing}[])

value is fine, but output is empty. With a longer string, or multiple calls, it eventually shows up. It seems like there is some ~4KB buffer somewhere.

I can also replicate the issue with the long string. I guess the -1 from puts is because it runs into some error and returns the EOF? Could we extract the C error somehow?

Would be great to get this working if possible, but I don't have the bandwidth currently to go down the IO rabbit hole.

marcom · 2023-07-11T12:02:07Z

Here is a bit more systematic testing script and output, I think I can confirm your results:

Test script

using IOCapture, Test

nbytes = [10^0, 10^1, 10^2, 10^3, 10^4, 10^5, 10^6]

function test_print(str)
    cap = IOCapture.capture() do
        print(str)
    end
    # save in a bool to avoid printing long strings when test fails
    cap_output_equals_str = cap.output == str
    @test cap_output_equals_str
end

function test_puts(str)
    cap = IOCapture.capture() do
        @ccall puts(str::Cstring)::Cint
    end
    @test cap.value == length(str) + 1
    # save in a bool to avoid printing long strings when test fails
    cap_output_equals_str = cap.output == str * "\n"
    @test cap_output_equals_str
end

function test_puts_flush(str)
    cap = IOCapture.capture() do
        ret = @ccall puts(str::Cstring)::Cint
        Libc.flush_cstdio()
        return ret
    end
    @test cap.value == length(str) + 1
    # save in a bool to avoid printing long strings when test fails
    cap_output_equals_str = cap.output == str * "\n"
    @test cap_output_equals_str
end

function run_testset(fn::Function)
    @testset verbose=true "$fn" begin
        for (i, n) in enumerate(nbytes)
            # print different letters for each test
            c = collect('A':'Z')[i]
            @testset "$n" begin
                fn(c^n)
            end
        end
    end
end

@testset "IOCapture cstdio" verbose=true begin
    run_testset(test_print)
    run_testset(test_puts)
    run_testset(test_puts_flush)
end

Test script output

Test Summary:     | Pass  Fail  Total  Time
IOCapture cstdio  |   22    13     35  3.1s
  test_print      |    7            7  0.2s
    1             |    1            1  0.1s
    10            |    1            1  0.0s
    100           |    1            1  0.0s
    1000          |    1            1  0.0s
    10000         |    1            1  0.0s
    100000        |    1            1  0.0s
    1000000       |    1            1  0.0s
  test_puts       |    5     9     14  2.1s
    1             |    1     1      2  2.0s
    10            |    1     1      2  0.0s
    100           |    1     1      2  0.0s
    1000          |    1     1      2  0.0s
    10000         |    1     1      2  0.0s
    100000        |          2      2  0.0s
    1000000       |          2      2  0.0s
  test_puts_flush |   10     4     14  0.1s
    1             |    2            2  0.0s
    10            |    2            2  0.0s
    100           |    2            2  0.0s
    1000          |    2            2  0.0s
    10000         |    2            2  0.0s
    100000        |          2      2  0.0s
    1000000       |          2      2  0.0s

For the the smaller sizes (10k bytes and below), puts still reports that it has written all bytes, and adding Libc.flush_cstdio() allows the output to be captured completely. At larger sizes, puts cannot write all bytes anymore because the pipe buffer is full.

So I guess a first step would be to:

add a disclaimer to the README that it will fail with cstdio

Second step:

add a Libc.flush_cstdio() to the IOCapture code, perhaps making it optional if there is a large performance impact

Third step maybe (I'm not sure if this is correct):

let the capture code and the function to be captured run on different threads, so IOCapture can keep on reading from the pipe while the cstdio code is writing to it

What do you think?

Edit: looks like the code to copy from the pipe to the IOBuffer is already running on another thread

IOCapture.jl/src/IOCapture.jl

Line 108 in a23f2a4

buffer_redirect_task = @async write(output, pipe)

so there must be something else going on.

mortenpi · 2023-07-12T23:22:04Z

Regarding the async stuff -- those should be tasks, not threads. So it's still all single-threaded. But regardless, those tasks should be taking care of keeping the buffer empty.

add a disclaimer to the README that it will fail with cstdio

Yeah, happy to take a PR for this. The edge cases should definitely be documented.

add a Libc.flush_cstdio() to the IOCapture code, perhaps making it optional if there is a large performance impact

This also sounds good to me, and would happily accept a PR here too. I wouldn't worry too much about performance here -- I generally assume that the IOCapture calls are not very performance critical, and we can add the option to disable it later once it becomes a problem for someone.

mortenpi added bug Something isn't working help wanted Extra attention is needed labels Jul 11, 2023

marcom mentioned this issue Jul 11, 2023

Suppressor.jl don't capture all stdout from CBindinged function JuliaIO/Suppressor.jl#46

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fails to capture long output from C stdio #16

Fails to capture long output from C stdio #16

marcom commented Jul 10, 2023

mortenpi commented Jul 11, 2023

marcom commented Jul 11, 2023 •

edited

Loading

mortenpi commented Jul 12, 2023

Fails to capture long output from C stdio #16

Fails to capture long output from C stdio #16

Comments

marcom commented Jul 10, 2023

mortenpi commented Jul 11, 2023

marcom commented Jul 11, 2023 • edited Loading

mortenpi commented Jul 12, 2023

marcom commented Jul 11, 2023 •

edited

Loading