
Segmentation faults with NTuple{2,NTuple{N,Core.VecElement{T}}}, but not NTuple{2N,Core.VecElement{T}} #30426

Closed
chriselrod opened this issue Dec 17, 2018 · 5 comments

Comments

@chriselrod (Contributor) commented Dec 17, 2018

The segfaults occur when N * sizeof(T) == 64.

I am showing results on a (4-day-old) master below, but I see the same behaviour on Julia 1.0.3:

julia> versioninfo()
Julia Version 1.2.0-DEV.12
Commit 77a7d92e91 (2018-12-13 21:20 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libimf
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

julia> bigvec() = ntuple(i -> Core.VecElement(1.0), Val(16))
bigvec (generic function with 1 method)

julia> twovecs() = (ntuple(i -> Core.VecElement(1.0), Val(8)),ntuple(i -> Core.VecElement(1.0), Val(8)))
twovecs (generic function with 1 method)

julia> bigvec()
(VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0))

julia> twovecs()
((VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0)), (VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0)))

julia> bigvec() |> typeof |> Base.datatype_alignment
16

julia> twovecs() |> typeof |> Base.datatype_alignment
16

julia> bigvec() |> typeof |> sizeof
128

julia> twovecs() |> typeof |> sizeof
128

julia> using BenchmarkTools

julia> @benchmark bigvec()
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  4607182418800017408
  --------------
  minimum time:     0.015 ns (0.00% GC)
  median time:      0.017 ns (0.00% GC)
  mean time:        0.018 ns (0.00% GC)
  maximum time:     2.991 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark twovecs()

signal (11): Segmentation fault
in expression starting at no file:0
#_run#18 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:336
unknown function (ip: 0x7f3f05708efb)
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
inner at ./none:0
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
jl_apply at /home/chriselrod/Documents/languages/jdev/src/julia.h:1571 [inlined]
jl_f__apply at /home/chriselrod/Documents/languages/jdev/src/builtins.c:556
jl_f__apply_latest at /home/chriselrod/Documents/languages/jdev/src/builtins.c:594
#invokelatest#1 at ./essentials.jl:746 [inlined]
#invokelatest at ./none:0 [inlined]
#run_result#16 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:32 [inlined]
#run_result at ./none:0 [inlined]
#run#18 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:46
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
#run at ./none:0 [inlined]
#run at ./none:0 [inlined]
#warmup#21 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:79 [inlined]
warmup at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:79
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
do_call at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:323
eval_value at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:411
eval_stmt_value at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:362 [inlined]
eval_body at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:759
jl_interpret_toplevel_thunk_callback at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:885
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f3f162af98f)
unknown function (ip: 0x7)
jl_interpret_toplevel_thunk at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:894
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:764
jl_toplevel_eval_in at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:793
eval at ./boot.jl:328
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
eval_user_input at /home/chriselrod/Documents/languages/jdev/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl:85
run_backend at /home/chriselrod/.julia/packages/Revise/gStbk/src/Revise.jl:771
#58 at ./task.jl:259
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
jl_apply at /home/chriselrod/Documents/languages/jdev/src/julia.h:1571 [inlined]
start_task at /home/chriselrod/Documents/languages/jdev/src/task.c:572
unknown function (ip: 0xffffffffffffffff)
Allocations: 25427851 (Pool: 25422691; Big: 5160); GC: 64
Segmentation fault (core dumped)

When I use a tuple of 2 8-length (64-total-byte) vectors I get a segmentation fault.

When I use a tuple of 2 4-length (32-total-byte) vectors I get no segmentation fault.

It is the number of bytes that matters: using Float32, NTuple{32,...} does not segfault (although the reported allocation count is garbage), while NTuple{2,NTuple{16,...}} does segfault.
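
For concreteness, here is a minimal sketch of the Float32 case just described (the function names are mine, for illustration only):

# Float32 is 4 bytes, so the 64-byte boundary is reached at 16 lanes.
bigvec32() = ntuple(i -> Core.VecElement(1.0f0), Val(32))     # flat 128-byte tuple: no segfault
twovecs32() = (ntuple(i -> Core.VecElement(1.0f0), Val(16)),
               ntuple(i -> Core.VecElement(1.0f0), Val(16)))  # two 64-byte inner vectors: segfaults when benchmarked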

I can reproduce the segmentation faults on both Ryzen and Skylake-X. I can try Haswell later.

This is obviously of most interest on Skylake-X (and other AVX-512) architectures; because tuples of 32-byte vectors don't cause segfaults, on other architectures I would simply not construct these tuples.

The workaround -- concatenating and then sub-setting larger vectors -- is a little awkward.
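
A rough sketch of that workaround, assuming Float64 and 8-lane vectors (the type aliases and helper names here are mine, not from any package):

const Vec8  = NTuple{8,Core.VecElement{Float64}}
const Vec16 = NTuple{16,Core.VecElement{Float64}}

# Carry one flat 16-lane vector instead of a tuple of two 8-lane vectors...
concat(a::Vec8, b::Vec8) = ntuple(i -> i <= 8 ? a[i] : b[i-8], Val(16))

# ...and slice the halves back out wherever they are needed separately.
firsthalf(v::Vec16)  = ntuple(i -> v[i],     Val(8))
secondhalf(v::Vec16) = ntuple(i -> v[i + 8], Val(8))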

@vtjnash (Member) commented Dec 18, 2018

Sounds similar to #21959, although I thought we were handling that at the codegen level in most cases now.

@chriselrod (Contributor, Author) commented Jun 25, 2019

@vtjnash, could it at all be related to this:

julia> @noinline function foo(a)
           v = ntuple(Val(8)) do w Core.VecElement(Float64(w)) end
           a, (v, (a,(1<<10,1<<20)))
       end
foo (generic function with 1 method)

julia> foo(1)
(4607182418800017408, ((VecElement{Float64}(3.0), VecElement{Float64}(4.0), VecElement{Float64}(5.0), VecElement{Float64}(6.0), VecElement{Float64}(7.0), VecElement{Float64}(8.0), VecElement{Float64}(5.0e-324), VecElement{Float64}(5.06e-321)), (1048576, (14690820581056367, 140697086305440))))

julia> reinterpret(Float64, 4607182418800017408)
1.0

julia> reinterpret.(Float64,(1,1<<10))
(5.0e-324, 5.06e-321)

julia> 1<<20
1048576

?
If not, I'll file a new issue.

The memory layout Julia uses for constructing the tuple looks different from the one it uses to deconstruct it.

The first 1::Int disappears and is replaced with a 1::Float64, the first element of the LLVM vector.
That vector is shifted by 16 bytes, so its last two elements instead come from the following tuple.

The last two elements of the tuple are filled with junk.

Any idea about the cause or fix?
Right now I need to @inline more code than I'd like, which lets the compiler avoid constructing the tuples, leaving the memory uncorrupted.
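
For what it's worth, one way to look at the layout codegen actually picks for foo (a diagnostic sketch, not part of the original report) is to dump the IR and check the aggregate types and align annotations on the loads/stores:

using InteractiveUtils   # loaded automatically in the REPL; needed in a script

@code_llvm foo(1)        # inspect the returned aggregate's struct/vector types and alignments
@code_native foo(1)      # and the corresponding vector moves/spills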

@vtjnash (Member) commented Jun 25, 2019

Seems like a new issue: it looks like we're specifying the wrong sort of vector to LLVM for the alignment Julia expects.
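
As a rough illustration of that mismatch (my own sketch, not from the thread): Julia reports 16-byte alignment for the nested tuple type, while each inner vector is 64 bytes wide, the natural alignment LLVM would typically assume for a full 8-lane double vector.

T = NTuple{2,NTuple{8,Core.VecElement{Float64}}}
Base.datatype_alignment(T)                   # 16, matching the REPL output above
sizeof(NTuple{8,Core.VecElement{Float64}})   # 64 bytes per inner vector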

@KristofferC (Member) commented

Code in first post seems to work now. Can this be closed? @chriselrod

@chriselrod (Contributor, Author) commented

Yes.
