AddressSanitizer: heap-use-after-free ./src/julia.h:1235:13 in jl_is_array_type(void*) #42498

Closed
DilumAluthge opened this issue Oct 4, 2021 · 9 comments
Labels
bug Indicates an unexpected problem or unintended behavior ci Continuous integration compiler:codegen Generation of LLVM IR and native code needs more info Clarification or a reproducible example is required

Comments

@DilumAluthge
Member

Seen in the asan job in CI. The full CI log is here.

Precompilation complete. Summary:
Total ─────── 731.294125 seconds
Generation ── 434.749219 seconds 59.4493%
Execution ─── 296.544906 seconds 40.5507%
=================================================================
==13821==ERROR: AddressSanitizer: heap-use-after-free on address 0x6110072336b8 at pc 0x7f070ee894b2 bp 0x7ffc97c9de50 sp 0x7ffc97c9de48
READ of size 8 at 0x6110072336b8 thread T0
    #0 0x7f070ee894b1 in jl_is_array_type(void*) /cache/build/amdci7-2/julialang/julia-master/src/julia.h:1235:13
    #1 0x7f070ee88e34 in dereferenceable_size(_jl_value_t*) /cache/build/amdci7-2/julialang/julia-master/src/cgutils.cpp:320:9
    #2 0x7f070ee88dd0 in maybe_mark_load_dereferenceable(llvm::Instruction*, bool, _jl_value_t*) /cache/build/amdci7-2/julialang/julia-master/src/cgutils.cpp:384:19
    #3 0x7f070eeb7ef3 in emit_varinfo(jl_codectx_t&, jl_varinfo_t&, _jl_sym_t*, _jl_value_t*) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:4069:9
    #4 0x7f070ee9c86e in emit_local(jl_codectx_t&, _jl_value_t*) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:4109:12
    #5 0x7f070ee93b29 in emit_expr(jl_codectx_t&, _jl_value_t*, long) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:4529:16
    #6 0x7f070eea070f in emit_call(jl_codectx_t&, jl_expr_t*, _jl_value_t*) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:3793:19
    #7 0x7f070ee95297 in emit_expr(jl_codectx_t&, _jl_value_t*, long) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:4632:26
    #8 0x7f070ef21ee2 in emit_ssaval_assign(jl_codectx_t&, long, _jl_value_t*) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:4274:16
    #9 0x7f070ef1d740 in emit_stmtpos(jl_codectx_t&, _jl_value_t*, int) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:4516:9
    #10 0x7f070ee601a3 in emit_function(_jl_method_instance_t*, _jl_code_info_t*, _jl_value_t*, jl_codegen_params_t&, bool) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:7359:13
    #11 0x7f070ee4f6ba in jl_emit_code(_jl_method_instance_t*, _jl_code_info_t*, _jl_value_t*, jl_codegen_params_t&) /cache/build/amdci7-2/julialang/julia-master/src/codegen.cpp:7721:30
    #12 0x7f070f21c3fb in ijl_create_native /cache/build/amdci7-2/julialang/julia-master/src/aotcompile.cpp:321:50
    #13 0x7f070f1203ca in jl_precompile /cache/build/amdci7-2/julialang/julia-master/src/precompile.c:401:25
    #14 0x7f070f11e68b in jl_write_compiler_output /cache/build/amdci7-2/julialang/julia-master/src/precompile.c:33:23
    #15 0x7f070f081703 in ijl_atexit_hook /cache/build/amdci7-2/julialang/julia-master/src/init.c:211:9
    #16 0x7f070f16d307 in jl_repl_entrypoint /cache/build/amdci7-2/julialang/julia-master/src/jlapi.c:691:5
    #17 0x7f0712a66af9 in jl_load_repl /cache/build/amdci7-2/julialang/julia-master/cli/loader_lib.c:225:12
    #18 0x4f7196 in main /cache/build/amdci7-2/julialang/julia-master/cli/loader_exe.c:59:15
    #19 0x7f0712ac409a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #20 0x41f319 in _start (/cache/build/amdci7-2/julialang/julia-master/tmp/test-asan/asan/usr/bin/julia-debug+0x41f319)
 
0x6110072336b8 is located 56 bytes inside of 192-byte region [0x611007233680,0x611007233740)
freed by thread T0 here:
    #0 0x4b13e4 in free /workspace/srcdir/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:127
    #1 0x7f070f149844 in jl_free_aligned /cache/build/amdci7-2/julialang/julia-master/src/gc.c:255:5
 
previously allocated by thread T0 here:
    #0 0x4b2374 in posix_memalign /workspace/srcdir/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:226
    #1 0x7f070f130f5d in jl_malloc_aligned /cache/build/amdci7-2/julialang/julia-master/src/gc.c:235:9
 
SUMMARY: AddressSanitizer: heap-use-after-free /cache/build/amdci7-2/julialang/julia-master/src/julia.h:1235:13 in jl_is_array_type(void*)
Shadow bytes around the buggy address:
  0x0c2280e3e680: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2280e3e690: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2280e3e6a0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c2280e3e6b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2280e3e6c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2280e3e6d0: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd
  0x0c2280e3e6e0: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x0c2280e3e6f0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c2280e3e700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2280e3e710: 00 04 fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2280e3e720: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==13821==ABORTING
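
The shadow dump in the report can be cross-checked against the faulting address. The following is a minimal sketch, assuming the usual ASan shadow mapping for x86-64 Linux, `shadow = (addr >> 3) + 0x7fff8000`. The offset is an assumption about this build, and the function names are illustrative only.

```python
# Assumed default ASan shadow mapping on x86-64 Linux (not confirmed for this build).
SHADOW_SCALE = 3            # one shadow byte describes 8 application bytes
SHADOW_OFFSET = 0x7FFF8000  # assumption: default shadow offset on this platform


def shadow_address(addr):
    """Return the shadow-memory address that describes `addr`."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET


def locate_in_dump(addr, row_width=16):
    """Return (row_start, column) of the shadow byte in a 16-byte-wide dump."""
    shadow = shadow_address(addr)
    return shadow - shadow % row_width, shadow % row_width


fault = 0x6110072336B8          # faulting address from the report
region_start = 0x611007233680   # start of the freed region
region_end = 0x611007233740     # end of the freed region (exclusive)

row, col = locate_in_dump(fault)
print(hex(row), col)                                 # row and column of the marked byte
print(fault - region_start)                          # offset of the read inside the region
print((region_end - region_start) >> SHADOW_SCALE)   # shadow bytes covering the region
```

Under that assumption the read maps to column 7 of the row marked `=>`, which is where the dump shows `[fd]`. The offset inside the region comes out as 56 bytes, and the 192-byte region is covered by 24 shadow bytes, matching the run of `fd` entries.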
@DilumAluthge DilumAluthge added the bug Indicates an unexpected problem or unintended behavior label Oct 4, 2021
@DilumAluthge DilumAluthge changed the title SUMMARY: AddressSanitizer: heap-use-after-free ./src/julia.h:1235:13 in jl_is_array_type(void*) AddressSanitizer: heap-use-after-free ./src/julia.h:1235:13 in jl_is_array_type(void*) Oct 4, 2021
@DilumAluthge
Member Author

DilumAluthge commented Oct 4, 2021

This failure seems to be non-deterministic.

The asan CI job does not run under rr, so unfortunately I do not have an rr trace.

@maleadt
Member

maleadt commented Oct 5, 2021

We should probably bump malloc_context_size a little to get some more context. Maybe we can even get rid of the option altogether, as well as fast_unwind_on_malloc; I added those when running ASAN jobs on a memory-constrained system, while our CI machines have plenty of memory.
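
For anyone reproducing this locally, here is a rough sketch of adjusting the two options in an `ASAN_OPTIONS` string. The starting value is hypothetical (the actual CI setting is not shown in this issue), the helper names are made up for illustration, and the parser assumes no option value contains a colon. The two-frame "freed by" and "previously allocated by" stacks in the report are what a small `malloc_context_size` would be expected to produce.

```python
def parse_asan_options(value):
    """Split a colon-separated ASAN_OPTIONS string into an ordered dict."""
    options = {}
    for item in value.split(":"):
        if item:
            key, _, val = item.partition("=")
            options[key] = val
    return options


def format_asan_options(options):
    """Join options back into the name=value:name=value form."""
    return ":".join(f"{key}={val}" for key, val in options.items())


def relax_malloc_tracking(value, context_size=None):
    """Drop fast_unwind_on_malloc; bump or drop malloc_context_size."""
    options = parse_asan_options(value)
    options.pop("fast_unwind_on_malloc", None)
    if context_size is None:
        options.pop("malloc_context_size", None)   # fall back to ASan's default
    else:
        options["malloc_context_size"] = str(context_size)
    return format_asan_options(options)


# Hypothetical starting value; the real CI setting is not shown in this issue.
current = "detect_leaks=0:fast_unwind_on_malloc=0:malloc_context_size=2"

print(relax_malloc_tracking(current))                   # both options removed
print(relax_malloc_tracking(current, context_size=10))  # deeper alloc/free stacks
```

The resulting string would be exported as `ASAN_OPTIONS` before launching the ASan build of `julia-debug`.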

@JeffBezanson
Member

It looks like compute_va_type is one place in codegen where we construct a type that might end up not rooted?
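
To illustrate what "not rooted" means here, below is a toy mark-and-sweep model. It is not Julia's GC and says nothing about `compute_va_type` itself; `ToyHeap` and its methods are hypothetical names used only to show why a freshly constructed object that no root refers to can be freed by a later collection. In Julia's C runtime the analogous rooting step is done with the `JL_GC_PUSH*` macros.

```python
class ToyHeap:
    """Toy mark-and-sweep heap; only meant to illustrate rooting."""

    def __init__(self):
        self.live = {}        # handle -> list of handles it references
        self.roots = []       # handles the collector treats as reachable
        self.next_handle = 0

    def alloc(self, refs=()):
        handle = self.next_handle
        self.next_handle += 1
        self.live[handle] = list(refs)
        return handle

    def collect(self):
        # Mark everything reachable from the roots.
        marked, stack = set(), list(self.roots)
        while stack:
            handle = stack.pop()
            if handle not in marked:
                marked.add(handle)
                stack.extend(self.live[handle])
        # Sweep: anything left unmarked is freed.
        for handle in list(self.live):
            if handle not in marked:
                del self.live[handle]

    def read(self, handle):
        if handle not in self.live:
            raise RuntimeError(f"use-after-free: handle {handle} was collected")
        return self.live[handle]


heap = ToyHeap()
rooted_type = heap.alloc()
heap.roots.append(rooted_type)   # rooted before anything else can allocate
unrooted_type = heap.alloc()     # freshly constructed, never rooted

heap.collect()                   # stands in for a GC triggered by a later allocation

heap.read(rooted_type)           # still valid
try:
    heap.read(unrooted_type)     # the toy equivalent of the ASan report
except RuntimeError as err:
    print(err)
```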

@JeffBezanson JeffBezanson added the compiler:codegen Generation of LLVM IR and native code label Oct 5, 2021
@vtjnash
Member

vtjnash commented Oct 5, 2021

It is currently supposed to generate only concrete types (since we're going to make a stack allocation of it, and we add a tag to the output to mark it as such), but it looks like it might currently easily generate some invalid Tuple tags (e.g. with Type{T} instead of DataType, or with Any)?

@DilumAluthge
Member Author

DilumAluthge commented Oct 5, 2021

> We should probably bump malloc_context_size a little to get some more context. Maybe we can even get rid of the option altogether, as well as fast_unwind_on_malloc; I added those when running ASAN jobs on a memory-constrained system, while our CI machines have plenty of memory.

@maleadt or @tkf Can you make a PR to make those changes to the Buildkite ASAN job?

@tkf
Member

tkf commented Oct 5, 2021

Do we have some memory requirement-based scheduling or a max memory cap on the buildkite runner? In the worst-case scenario, would the job just be OOM-killed so that the CI server can recover? I was a bit worried that it could accidentally break the CI pipeline. If it's a bit dangerous to try it out on the CI pipeline, I can check the memory consumption and the build time on an identical machine.

@DilumAluthge
Member Author

> Do we have some memory requirement-based scheduling or a max memory cap on the buildkite runner?

Currently, I don't believe that the Buildkite agents have any memory caps. @staticfloat

I guess in that case, maybe we should keep the current restrictions on malloc_context_size and fast_unwind_on_malloc?

@tkf
Member

tkf commented Oct 6, 2021

So I tried removing fast_unwind_on_malloc and malloc_context_size (tkf@9e541a2) and checked the memory usage with: mprof run --multiprocess contrib/asan/build.sh ./tmp/sanitizers -j32

[Image: mprof memory usage plot]

It looks like the peak usage is about 5G, which is no problem on the CI machines.

@vtjnash vtjnash added needs more info Clarification or a reproducible example is required ci Continuous integration labels Nov 8, 2021
@vtjnash
Member

vtjnash commented Sep 28, 2022

Fixed by #44724

@vtjnash vtjnash closed this as completed Sep 28, 2022
vtjnash added a commit that referenced this issue Sep 28, 2022
Issue noted in #42498. This should be the same as
Core.Compiler.tuple_tfunc. Otherwise we might accidentally constant-fold
something like:

   code_llvm((x...) -> x isa Tuple{Type{Tuple{Any}},Int}, (Type{Tuple{Any}}, Int))

to return true. This is rarely a compile-sig in practice, so it does not
usually affect code, but is observable there in the IR.