Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SnoopPrecompile segfault regression with IOCapture.jl #48837

Closed
mortenpi opened this issue Feb 28, 2023 · 8 comments · Fixed by #48875 or #48882
Closed

SnoopPrecompile segfault regression with IOCapture.jl #48837

mortenpi opened this issue Feb 28, 2023 · 8 comments · Fixed by #48875 or #48882
Milestone

Comments

@mortenpi
Copy link
Contributor

I don't know if this is a SnoopPrecompile (v1.0.3) issue or a Julia one, but adding the following call to IOCapture.jl:

using SnoopPrecompile
@precompile_all_calls begin
    IOCapture.capture() do
        println("...")
    end
end

Leads to a segfault with 1.10-DEV after e06a591 (#44527 and #47407, cc @timholy @vchuravy; 328dd57 is still good; thanks to @inkydragon for bisecting!).

Full segfault stacktrace
$ julia --project -e'using Pkg; Pkg.test()'
     Testing IOCapture
      Status `/tmp/jl_xDO7rM/Project.toml`
  [b5f81e59] IOCapture v0.2.2 `~/juliadocs/clones/IOCapture.jl`
  [66db9d55] SnoopPrecompile v1.0.3
  [56ddb016] Logging `@stdlib/Logging`
  [9a3f8284] Random `@stdlib/Random`
  [8dfed614] Test `@stdlib/Test`
      Status `/tmp/jl_xDO7rM/Manifest.toml`
  [b5f81e59] IOCapture v0.2.2 `~/juliadocs/clones/IOCapture.jl`
  [21216c6a] Preferences v1.3.0
  [66db9d55] SnoopPrecompile v1.0.3
  [2a0f44e3] Base64 `@stdlib/Base64`
  [ade2ca70] Dates `@stdlib/Dates`
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`
  [56ddb016] Logging `@stdlib/Logging`
  [d6f4376e] Markdown `@stdlib/Markdown`
  [de0858da] Printf `@stdlib/Printf`
  [9a3f8284] Random `@stdlib/Random`
  [ea8e919c] SHA v0.7.0 `@stdlib/SHA`
  [9e88b42a] Serialization `@stdlib/Serialization`
  [fa267f1f] TOML v1.0.3 `@stdlib/TOML`
  [8dfed614] Test `@stdlib/Test`
  [4ec0a83e] Unicode `@stdlib/Unicode`
     Testing Running tests...
[ Info: Running IOCapture.capture

[794841] signal (11.128): Segmentation fault
in expression starting at /home/mortenpi/juliadocs/clones/IOCapture.jl/test/runtests.jl:3
ijl_subtype_env at /home/mortenpi/juliadocs/precompile/julia-bad/src/subtype.c:1867
ijl_isa at /home/mortenpi/juliadocs/precompile/julia-bad/src/subtype.c:2124
jl_tuple1_isa at /home/mortenpi/juliadocs/precompile/julia-bad/src/subtype.c:2029
jl_typemap_entry_assoc_exact at /home/mortenpi/juliadocs/precompile/julia-bad/src/typemap.c:974
jl_typemap_assoc_exact at /home/mortenpi/juliadocs/precompile/julia-bad/src/julia_internal.h:1441 [inlined]
jl_lookup_generic_ at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:2646 [inlined]
ijl_apply_generic at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:2702
sroa_pass! at ./compiler/ssair/passes.jl:967
run_passes at ./compiler/optimize.jl:558
run_passes at ./compiler/optimize.jl:573 [inlined]
optimize at ./compiler/optimize.jl:522 [inlined]
_typeinf at ./compiler/typeinfer.jl:271
typeinf at ./compiler/typeinfer.jl:215
typeinf_ext at ./compiler/typeinfer.jl:1064
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1097
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1093
jfptr_typeinf_ext_toplevel_15062 at /home/mortenpi/juliadocs/precompile/julia-bad/usr/lib/julia/sys.so (unknown line)
jl_apply at /home/mortenpi/juliadocs/precompile/julia-bad/src/julia.h:1875 [inlined]
jl_type_infer at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:312
jl_generate_fptr_impl at /home/mortenpi/juliadocs/precompile/julia-bad/src/jitlayers.cpp:421
jl_compile_method_internal at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:2157 [inlined]
jl_compile_method_internal at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:2098
_jl_invoke at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:2516 [inlined]
ijl_apply_generic at /home/mortenpi/juliadocs/precompile/julia-bad/src/gf.c:2706
capture at /home/mortenpi/juliadocs/clones/IOCapture.jl/src/IOCapture.jl:72
unknown function (ip: 0x7fb3810fc3c2)
jl_apply at /home/mortenpi/juliadocs/precompile/julia-bad/src/julia.h:1875 [inlined]
do_call at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:126
eval_value at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:226
eval_stmt_value at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:177 [inlined]
eval_body at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:624
jl_interpret_toplevel_thunk at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:762
jl_toplevel_eval_flex at /home/mortenpi/juliadocs/precompile/julia-bad/src/toplevel.c:912
jl_toplevel_eval_flex at /home/mortenpi/juliadocs/precompile/julia-bad/src/toplevel.c:856
ijl_toplevel_eval_in at /home/mortenpi/juliadocs/precompile/julia-bad/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1522
_include at ./loading.jl:1582
include at ./client.jl:478
unknown function (ip: 0x7fb3810fb082)
jl_apply at /home/mortenpi/juliadocs/precompile/julia-bad/src/julia.h:1875 [inlined]
do_call at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:126
eval_value at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:226
eval_stmt_value at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:177 [inlined]
eval_body at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:624
jl_interpret_toplevel_thunk at /home/mortenpi/juliadocs/precompile/julia-bad/src/interpreter.c:762
jl_toplevel_eval_flex at /home/mortenpi/juliadocs/precompile/julia-bad/src/toplevel.c:912
jl_toplevel_eval_flex at /home/mortenpi/juliadocs/precompile/julia-bad/src/toplevel.c:856
ijl_toplevel_eval_in at /home/mortenpi/juliadocs/precompile/julia-bad/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
exec_options at ./client.jl:280
_start at ./client.jl:522
jfptr__start_33934 at /home/mortenpi/juliadocs/precompile/julia-bad/usr/lib/julia/sys.so (unknown line)
jl_apply at /home/mortenpi/juliadocs/precompile/julia-bad/src/julia.h:1875 [inlined]
true_main at /home/mortenpi/juliadocs/precompile/julia-bad/src/jlapi.c:573
jl_repl_entrypoint at /home/mortenpi/juliadocs/precompile/julia-bad/src/jlapi.c:717
main at /home/mortenpi/juliadocs/precompile/julia-bad/cli/loader_exe.c:58
unknown function (ip: 0x7fb38da29d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/mortenpi/juliadocs/precompile/julia-bad/usr/bin/julia (unknown line)
Allocations: 2976 (Pool: 2963; Big: 13); GC: 0
ERROR: Package IOCapture errored during testing (received signal: 11)
Stacktrace:
 [1] pkgerror(msg::String)
   @ Pkg.Types ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/Types.jl:68
 [2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
   @ Pkg.Operations ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1878
 [3] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Cmd, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::Base.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
   @ Pkg.API ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:435
 [4] kwcall(::NamedTuple{(:io,), Tuple{Base.TTY}}, ::typeof(Pkg.API.test), ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec})
   @ Pkg.API ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:414
 [5] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Pkg.API ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:156
 [6] test(pkgs::Vector{Pkg.Types.PackageSpec})
   @ Pkg.API ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:145
 [7] test(; name::Nothing, uuid::Nothing, version::Nothing, url::Nothing, rev::Nothing, path::Nothing, mode::Pkg.Types.PackageMode, subdir::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Pkg.API ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:171
 [8] test()
   @ Pkg.API ~/juliadocs/precompile/julia-bad/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:162
 [9] top-level scope
   @ none:1

It can be reproduced by running a simple snippet in the test environment (I can't replicate it in the normal environment..).

IOCapture.capture() do
    println("println")
end

I have pushed a branch with an MWE and doing julia --project -e'using Pkg; Pkg.test()' with Julia master should reproduce it.

While the error is different, I think it's related to our precompile woes in Documenter where a SnoopPrecompile leads to a strange error after e06a591 that also traces back to IOCapture usage: JuliaDocs/Documenter.jl#2004

@gbaraldi
Copy link
Member

gbaraldi commented Mar 1, 2023

Does this on 1.9 as well?

@mortenpi
Copy link
Contributor Author

mortenpi commented Mar 1, 2023

Yep, checked, the release-1.9 branch is affected as well.

@gbaraldi gbaraldi added this to the 1.9 milestone Mar 1, 2023
@timholy
Copy link
Member

timholy commented Mar 2, 2023

I got an rr trace of this one. On a debug build it fails at an assert; the key part of the stack trace seems to be

#4  0x00007fe8f9d051be in jl_array_ptr_ref (a=0x7fe8e7abd1b0 <jl_system_image_data+38649520>, i=45)
    at /home/tim/src/julia-branch/src/julia.h:1022
#5  0x00007fe8f9d0b2ce in lookup_root (m=0x7fe8e7abd0e0 <jl_system_image_data+38649312>, key=0, index=45)
    at /home/tim/src/julia-branch/src/method.c:1249
#6  0x00007fe8f9d499bf in jl_decode_value (s=0x7fffde1b9510) at /home/tim/src/julia-branch/src/ircode.c:659
#7  0x00007fe8f9d493ec in jl_decode_value_expr (s=0x7fffde1b9510, tag=9 '\t')
    at /home/tim/src/julia-branch/src/ircode.c:557
#8  0x00007fe8f9d49ac2 in jl_decode_value (s=0x7fffde1b9510) at /home/tim/src/julia-branch/src/ircode.c:681
#9  0x00007fe8f9d4901f in jl_decode_value_array (s=0x7fffde1b9510, tag=22 '\026')
    at /home/tim/src/julia-branch/src/ircode.c:500
#10 0x00007fe8f9d49aab in jl_decode_value (s=0x7fffde1b9510) at /home/tim/src/julia-branch/src/ircode.c:676
#11 0x00007fe8f9d4a9ca in ijl_uncompress_ir (m=0x7fe8e7abd0e0 <jl_system_image_data+38649312>, metadata=0x0,
    data=0x7fe8deb2e5f0) at /home/tim/src/julia-branch/src/ircode.c:895
#12 0x00007fe8e4fda539 in codeinst_to_ir () at compiler/ssair/irinterp.jl:125
#13 julia_abstract_call_method_with_const_args_19652 (interp=..., result=...,
    f=0x7fe8e5b4f870 <jl_system_image_data+5694832>, arginfo=..., si=..., match=..., sv=..., invokecall=...)
    at compiler/abstractinterpretation.jl:977
#14 0x00007fe8e50080c2 in julia_abstract_call_method_with_const_args_19678 (interp=..., result=...,
    f=0x7fe8e5b4f870 <jl_system_image_data+5694832>, arginfo=..., si=..., match=..., sv=...)

where m is

(rr) call jl_(m)
(::Type{Base.IOContext{IO_t} where IO_t<:IO})(IO, Pair{A, B} where B where A)
(rr) call jl_(m->roots)
Array{Any, (44,)}[
  :unwrapcontext,
  :ImmutableDict,
  :IOContext,
  Base.ImmutableDict{Symbol, Any},
  Base.ImmutableDict{Symbol, Any}(parent=#<null>, key=#<null>, value=#<null>),
  Base.IOContext{Base.GenericIOBuffer{Array{UInt8, 1}}},
  :io,
  :dict,
  Base.GenericIOBuffer{Array{UInt8, 1}},
  Tuple{Symbol},
  Tuple{Expr, Expr},
  Tuple{Expr},
  Tuple{Base.OneTo{Int64}},
  NTuple{5, Expr},
  Base.Dict{Symbol, Any},
  Tuple{Symbol, Any},
  Tuple{Int64},
  Tuple{Symbol, Symbol, Expr},
  Tuple{Symbol, Expr},
  Tuple{Expr, Expr, Expr},
  NTuple{4, Expr},
  NTuple{4, Symbol},
  Tuple{Symbol, Symbol, Symbol},
  Base.IOContext{Core.CoreSTDOUT},
  Core.CoreSTDOUT,
  Char,
  get_have_color() from get_have_color(),
  :get_have_color,
  Base.ImmutableDict{Symbol, Any}(parent=Base.ImmutableDict{Symbol, Any}(parent=#<null>, key=#<null>, value=#<null>), key=:color, value=true),
  Base.IOContext{Base.TTY},
  Tuple{String, Int64},
  Symbol("ttyhascolor.jl"),
  Base.TTY,
  Base.Dict{String, String},
  Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}},
  Tuple{Int64, Int64},
  Base.Dict{String, Base.VersionNumber},
  Base.Set{String},
  Pkg.REPLMode.Option,
  Pkg.REPLMode.PackageIdentifier,
  Pkg.REPLMode.Rev,
  Pkg.REPLMode.Subdir,
  Pkg.REPLMode.VersionToken,
  Array{Int64, 2}]
(rr) call jl_(m->root_blocks)
#<null>

The puzzling part is why it's referencing a root that isn't there.

@timholy
Copy link
Member

timholy commented Mar 3, 2023

OK, I have a clearer understanding of what's happening: during the call to jl_module_run_initializer(Base) (yes, I'm mixing C & Julia notation here), Base's __init__ forces type-inference on

(::Type{Base.IOContext{IO_t} where IO_t<:IO})(Base.IOStream, Pair{Symbol, Bool}) from (::Type{Base.IOContext{IO_t} where IO_t<:IO})(IO, Pair{A, B} where B where A)

During jl_compress_ir, this ends up adding Base.IOContext{Base.PipeEndpoint} (twice) as a new root. However, jl_precompile_toplevel_module has not yet been set (it's still NULL) and hence the new roots get added without a key. This prevents them from being added to the cache file. But the compressed CodeInstance contains references to them (they are beyond-the-end of nroots_sysimg).

So the fundamental issue seems to be that roots that get added by sysimg __init__ functions risk a crash in packages that compile new CodeInstances that reference those roots.

I'm not sure of the right way to fix it. I had high hopes for

diff --git a/src/init.c b/src/init.c
index 0651d3b274..cac7ac40e5 100644
--- a/src/init.c
+++ b/src/init.c
@@ -864,8 +864,13 @@ static NOINLINE void _finish_julia_init(JL_IMAGE_SEARCH rel, jl_ptls_t ptls, jl_
         jl_module_init_order = NULL;
         int i, l = jl_array_len(init_order);
         for (i = 0; i < l; i++) {
-            jl_value_t *mod = jl_array_ptr_ref(init_order, i);
-            jl_module_run_initializer((jl_module_t*)mod);
+            jl_module_t *mod = (jl_module_t*)jl_array_ptr_ref(init_order, i);
+            // #48837: it is possible to have the __init__ methods add roots,
+            // which get referenced when compressing CodeInstances.
+            // We need to make sure they get marked with the correct key.
+            jl_precompile_toplevel_module = mod;
+            jl_module_run_initializer(mod);
+            jl_precompile_toplevel_module = NULL;
         }
         JL_GC_POP();
     }

but that's only a partial fix: the problem seems to be that we don't recreate the same roots in the test process, and so in a debug build we fail at

assert(key == 0);
: since we used the sysimg's key after writing the sysimg, they aren't cached.

However, there may be an easy solution. I'm testing it locally, but if it appears promising, there's a PR coming.

@mortenpi
Copy link
Contributor Author

mortenpi commented Mar 4, 2023

I didn't test it, but I think #48875 wasn't supposed to fix it?

@mortenpi mortenpi reopened this Mar 4, 2023
@vtjnash vtjnash closed this as completed Mar 4, 2023
@vtjnash
Copy link
Member

vtjnash commented Mar 4, 2023

That isn't what triggered the close (though it was a close mistake to happen)

@timholy
Copy link
Member

timholy commented Mar 5, 2023

It was #48882, and that indeed fixes it.

@mortenpi
Copy link
Contributor Author

mortenpi commented Mar 6, 2023

Ah, sorry, I didn't notice the other PR. In any case, I can also confirm that it fixed the Documenter issue 🎉 Thank you for a speedy fix!

KristofferC pushed a commit that referenced this issue Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants