Corrupted memory? #27955

Closed
PetrKryslUCSD opened this issue Jul 6, 2018 · 27 comments · Fixed by #28251
Labels: bug, compiler:codegen, GC

Comments

@PetrKryslUCSD

With

julia> versioninfo()
Julia Version 0.7.0-beta.156
Commit 6745e4fdca (2018-07-06 02:21 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6650U CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

execute

using FinEtools
using Test
for btu in [:SEC :MIN :HR :DY :YR :WK]
    @show btu
    t = 0.333*phun(base_time_units = btu, "s")
    v = 2.0*phun(base_time_units = btu, "m/s")
end

Prints

btu = :SEC
btu = nothing
ERROR: The only valid entries for the time base units are: :SEC|:MIN|:HR|:DY|:YR|:WK
Stacktrace:
 [1] error at ./error.jl:33 [inlined]
 [2] #physunitdict#1(::Symbol, ::Symbol, ::Function) at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/src/PhysicalUnitModule.jl:143
 [3] #phun at ./none:0 [inlined]
 [4] macro expansion at ./show.jl:549 [inlined]
 [5] top-level scope at ./REPL[3]:2 [inlined]
 [6] top-level scope at ./<missing>:0
@KristofferC
Member

KristofferC commented Jul 6, 2018

Locally, I get

btu = :SEC

signal (11): Segmentation fault: 11
in expression starting at no file:0
sig_match_fast at /Users/kristoffer/julia/src/gf.c:2021 [inlined]
jl_lookup_generic_ at /Users/kristoffer/julia/src/gf.c:2092
jl_apply_generic at /Users/kristoffer/julia/src/gf.c:2148
#sprint#335 at ./strings/io.jl:99
unknown function (ip: 0x12c95f173)
#sprint at ./none:0 [inlined]

Putting the whole thing in a function

julia> f()
btu = :SEC
btu = Core.Compiler.VarState(Core.Compiler.Const(5280, false), false)
ERROR: The only valid entries for the time base units are: SEC|MIN|HR|DY|YR
Stacktrace:
 [1] error at ./error.jl:33 [inlined]
 [2] #physunitdict#1(::Symbol, ::Symbol, ::Function) at /Users/kristoffer/.julia/dev/FinETools/src/PhysicalUnitModule.jl:144
 [3] #physunitdict at ./none:0 [inlined]
 [4] #phun#2 at /Users/kristoffer/.julia/dev/FinETools/src/PhysicalUnitModule.jl:315 [inlined]
 [5] #phun at ./none:0 [inlined]
 [6] macro expansion at ./show.jl:549 [inlined]
 [7] f() at ./REPL[3]:3
 [8] top-level scope at none:0

Running it a second time works

julia> f()
btu = :SEC
btu = :MIN
btu = :HR
btu = :DY
btu = :YR
btu = :WK

@PetrKryslUCSD
Author

On 64-bit Windows:

PetrKrysl@Firebolt MINGW64 ~/Documents/Work-in-progress
$ export JULIA_NUM_THREADS=4; ~/AppData/Local/Julia-0.7.0-beta/bin/julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-beta.108 (2018-07-01 22:57 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 3ac99765de (4 days old master)
|__/                   |  x86_64-w64-mingw32

julia> using FinEtools

julia> using Test

julia> for btu in [:SEC :MIN :HR :DY :YR :WK]
           @show btu
               t = 0.333*phun(base_time_units = btu, "s")
                   v = 2.0*phun(base_time_units = btu, "m/s")
                   end
btu = :SEC

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6b593244 -- sig_match_fast at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2022 [inlined]
jl_lookup_generic_ at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2092 [inlined]
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2148
in expression starting at no file:0
sig_match_fast at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2022 [inlined]
jl_lookup_generic_ at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2092 [inlined]
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2148
#sprint#335 at .\strings\io.jl:99
unknown function (ip: 000000001A4C416A)
jl_invoke at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:42
#sprint at .\none:0 [inlined]
#repr#337 at .\strings\io.jl:193 [inlined]
repr at .\strings\io.jl:193 [inlined]
top-level scope at .\REPL[3]:2 [inlined]
top-level scope at .\<missing>:0
jl_fptr_trampoline at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:1813
jl_toplevel_eval_flex at /home/Administrator/buildbot/worker/package_win64/build/src\toplevel.c:808
jl_toplevel_eval_in at /home/Administrator/buildbot/worker/package_win64/build/src\builtins.c:633
eval at .\boot.jl:319
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2151
eval_user_input at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\REPL\src\REPL.jl:85
macro expansion at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\REPL\src\REPL.jl:116 [inlined]
#28 at .\task.jl:257
jl_fptr_trampoline at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:1813
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2151
jl_apply at /home/Administrator/buildbot/worker/package_win64/build/src\julia.h:1532 [inlined]
start_task at /home/Administrator/buildbot/worker/package_win64/build/src\task.c:268
Allocations: 9094599 (Pool: 9093321; Big: 1278); GC: 20

PetrKrysl@Firebolt MINGW64 ~/Documents/Work-in-progress
$

@PetrKryslUCSD
Author

Still get segmentation violation with 0.7.0-beta.209 on Linux:

signal (11): Segmentation fault
in expression starting at no file:0
sig_match_fast at /buildworker/worker/package_linux64/build/src/gf.c:2030 [inlined]
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2100 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2156
#sprint#334 at ./strings/io.jl:99
#sprint at ./none:0 [inlined]
#repr#336 at ./strings/io.jl:193 [inlined]
repr at ./strings/io.jl:193 [inlined]
top-level scope at ./REPL[4]:2 [inlined]
top-level scope at ./<missing>:0
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1821
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:815
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/builtins.c:633
eval at ./boot.jl:319
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:85
macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:116 [inlined]
#28 at ./task.jl:257
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1821
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1533 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 23259682 (Pool: 23255774; Big: 3908); GC: 53
Segmentation fault (core dumped)

@PetrKryslUCSD
Author

PetrKryslUCSD commented Jul 10, 2018

In the module PhysicalUnitModule, function phun, there is a line

val  =  eval(Meta.parse(ostr));

I eliminated the eval, but kept the parse. Since the code still crashes with a segmentation fault, it appears that the crashes and errors that keep popping up are due to the parse alone.

EDIT: Further experiments showed that the code can be made to run even with the eval in place (see the comment below, where running using FinEtools twice results in a successful run). So the above is a red herring.

@PetrKryslUCSD
Author

Still fails on Windows 64 bit: Version 0.7.0-beta.212

@vchuravy added the bug label Jul 10, 2018
@PetrKryslUCSD
Author

PetrKryslUCSD commented Jul 10, 2018

When I do using FinEtools TWICE, the code

for btu in [:SEC :MIN :HR :DY :YR :WK]
    @show btu
    t = 0.333*phun(base_time_units = btu, "s")
    v = 2.0*phun(base_time_units = btu, "m/s")
end

runs fine. When I run it ONCE, I get a segmentation violation.

EDIT: This appears to be quite reproducible. All the FinEtools tests pass on Windows when using FinEtools is done twice or more, while a hard crash occurs otherwise.

@KristofferC
Member

Seems like some world age issue then..?

@PetrKryslUCSD
Author

I am not familiar with this subject. Why do you think this might have something to do with the crashes?
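
For context on the term: "world age" refers to Julia's rule that running code only sees the methods that existed when it started executing. The following is a minimal sketch of that mechanism, not a reproduction of this issue; the names define_and_call and newly_defined are hypothetical and do not come from FinEtools.

```julia
# Hypothetical example illustrating world age; not taken from FinEtools.
function define_and_call()
    # Define a new method at run time. The definition expression evaluates
    # to the function object, which we keep in a local variable.
    f = @eval newly_defined() = 42
    # Calling f() directly at this point would throw a MethodError
    # ("method too new"), because this function is still running in the
    # world age from before the definition. Base.invokelatest performs
    # the call in the latest world age instead.
    return Base.invokelatest(f)
end

println(define_and_call())
```

A world age problem normally surfaces as a MethodError rather than as memory corruption, so the hard crashes reported above would be unusual for it.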

@PetrKryslUCSD
Author

PetrKryslUCSD commented Jul 11, 2018

0.7.0-beta.233 still fails on Windows 64.
The error stack is different now. Perhaps it will be helpful to reproduce it here?

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6b59f7e4 -- jl_f_nfields at /home/Administrator/buildbot/worker/package_win64/build/src\builtins.c:857
in expression starting at no file:0
jl_f_nfields at /home/Administrator/buildbot/worker/package_win64/build/src\builtins.c:857
show_default at .\show.jl:322
show at .\show.jl:316
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2159
show_default at .\show.jl:320
show at .\show.jl:316
jl_fptr_trampoline at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:1821
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2159
#sprint#334 at .\strings\io.jl:99
#sprint at .\none:0 [inlined]
#repr#336 at .\strings\io.jl:193 [inlined]
repr at .\strings\io.jl:193 [inlined]
top-level scope at .\REPL[3]:2 [inlined]
top-level scope at .\<missing>:0
jl_fptr_trampoline at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:1821
jl_toplevel_eval_flex at /home/Administrator/buildbot/worker/package_win64/build/src\toplevel.c:815
jl_toplevel_eval_in at /home/Administrator/buildbot/worker/package_win64/build/src\builtins.c:633
eval at .\boot.jl:319
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2159
eval_user_input at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\REPL\src\REPL.jl:85
macro expansion at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\REPL\src\REPL.jl:116 [inlined]
#28 at .\task.jl:257
jl_fptr_trampoline at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:1821
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2159
jl_apply at /home/Administrator/buildbot/worker/package_win64/build/src\julia.h:1533 [inlined]
start_task at /home/Administrator/buildbot/worker/package_win64/build/src\task.c:268
Allocations: 11707878 (Pool: 11706227; Big: 1651); GC: 26

@PetrKryslUCSD
Author

The trick of using FinEtools TWICE to prevent the hard crash also works on Linux, Version 0.7.0-beta.272.

@PetrKryslUCSD
Author

@vchuravy , @KristofferC : Is there anything I can do to help find and fix this bug? How is this sort of bug dealt with (what tools are available etc)?

@vchuravy
Member

No, there is not much you can do without getting into the entrails of the runtime. One thing you could do is minimise the MWE, e.g. reduce FinEtools until only a couple of lines remain that still show the problem, but that can be somewhat tedious.

I am taking a look at it right now, but I don't have much time and can't promise any quick results.
The tools that I would use are gdb and rr. rr lets you capture a trace of what is going on; you can then rerun that trace and use gdb to figure out where the error is triggered, but that leaves you with figuring out the runtime...

@vchuravy
Member

OK, I could reproduce this on the current beta. It looks to me like a missing GC root somewhere: the call to jl_apply_generic has an argument that is getting corrupted, which from the stacktrace looks like it should have been btu. That is probably also why Kristoffer saw btu = Core.Compiler.VarState(Core.Compiler.Const(5280, false), false); I saw a different instance of that, so we are putting different things into that memory location. It is curious that this reproduces so well.

@PetrKryslUCSD is FinEtools doing any funny business with pointers?

@PetrKryslUCSD
Author

@vchuravy : No, not to my knowledge. I don't use pointers in the library itself, and I use a very small number of external packages.

@PetrKryslUCSD
Author

@vchuravy : I turned off all @inbounds just to check that the memory was not getting corrupted by writing out of bounds: no, the code is clean.

@vchuravy
Member

vchuravy commented Jul 13, 2018

@Keno, @vtjnash I didn't manage to track this down, but it is easily reproducible and it looks like a missing root somewhere. I won't have any more time to look at this, and you guys are more efficient.

@PetrKryslUCSD
Author

I could no longer reproduce the crash today with Version 0.7.0-beta.279 (x86_64-w64-mingw32), and with Version 0.7.0-beta2.0 (x86_64-pc-linux-gnu).

@PetrKryslUCSD
Author

PetrKryslUCSD commented Jul 16, 2018

With

julia> versioninfo()
Julia Version 0.7.0-beta.279
Commit b0f531e5f0* (2018-07-12 15:33 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6650U CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

I get a different corruption of memory with this code:

module mmMeasurement_1
using FinEtools
using Test
function test()
    W = 1.1;
    L = 12.;
    t =  3.32;
    nl, nt, nw = 5, 3, 4;

    println("New segmentation fault?")
    for orientation in [:a :b :ca :cb]
        fens,fes  = T4block(L,W,t, nl,nw,nt, orientation)
        geom  =  NodalField(fens.xyz)

        femm  =  FEMMBase(IntegData(fes, TetRule(5)))
        V = integratefunction(femm, geom, (x) ->  1.0)
        @test abs(V - W*L*t)/V < 1.0e-5
    end

end
end
using .mmMeasurement_1
mmMeasurement_1.test()

The printout says:

ERROR: Unknown orientation
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] T4blockx(::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Symbol) at C:\Users\PetrKrysl\Documents\Work-in-progress\FinEtools\src\MeshTetrahedronModule.jl:84
 [3] test() at C:\Users\PetrKrysl\Documents\Work-in-progress\FinEtools\src\MeshTetrahedronModule.jl:32
 [4] top-level scope at none:0

@PetrKryslUCSD
Author

PetrKryslUCSD commented Jul 16, 2018

@Keno, @vtjnash The above is reproducible with

julia> versioninfo()
Julia Version 0.7.0-beta2.12
Commit a878341b2b* (2018-07-15 15:57 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6650U CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

EDIT: @Keno, @vtjnash Bug still reproducible with beta2.43 on Windows.

@PetrKryslUCSD
Author

By way of explanation: With earlier versions of Julia this test passed, as all the orientations are in fact recognized as valid. Correspondingly I think the argument is getting corrupted and an unrecognized orientation is encountered inside T4block.

@PetrKryslUCSD
Author

It is noteworthy that this problem also has something to do with a Symbol as an argument … Coincidence?

@PetrKryslUCSD
Author

With Linux 64-bit (Version 0.7.0-beta2.26) I get

orientation = :a
orientation = :b
orientation = :ca

signal (11): Segmentation fault
in expression starting at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/test_debug.jl:64
subtype at /buildworker/worker/package_linux64/build/src/subtype.c:873
exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1057 [inlined]
forall_exists_subtype at /buildworker/worker/package_linux64/build/src/subtype.c:1085
jl_subtype_env at /buildworker/worker/package_linux64/build/src/subtype.c:1140
jl_isa at /buildworker/worker/package_linux64/build/src/subtype.c:1257
jl_tuple_isa at /buildworker/worker/package_linux64/build/src/subtype.c:1182
jl_typemap_entry_assoc_exact at /buildworker/worker/package_linux64/build/src/typemap.c:797
jl_typemap_assoc_exact at /buildworker/worker/package_linux64/build/src/julia_internal.h:879 [inlined]
jl_typemap_level_assoc_exact at /buildworker/worker/package_linux64/build/src/typemap.c:833
jl_typemap_assoc_exact at /buildworker/worker/package_linux64/build/src/julia_internal.h:882 [inlined]
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2110 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2156
#sprint#335 at ./strings/io.jl:101
#sprint at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/test_debug.jl:0 [inlined]
#repr#337 at ./strings/io.jl:195 [inlined]
repr at ./strings/io.jl:195 [inlined]
T4block at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/src/MeshTetrahedronModule.jl:32 [inlined]
test at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/test_debug.jl:53
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1821
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:324
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:428
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:363 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:675
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:792
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f3662ea1d6f)
unknown function (ip: (nil))
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:801
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:821
jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:855
jl_load at /buildworker/worker/package_linux64/build/src/toplevel.c:855
include at ./boot.jl:317 [inlined]
include_relative at ./loading.jl:1034
include at ./sysimg.jl:29
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
include at ./client.jl:401
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
macro expansion at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/runtests.jl:104 [inlined]
macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/Test/src/Test.jl:1080 [inlined]
macro expansion at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/runtests.jl:104 [inlined]
top-level scope at ./util.jl:156 [inlined]
top-level scope at ./<missing>:0
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1821
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:815
jl_parse_eval_all at /buildworker/worker/package_linux64/build/src/ast.c:855
jl_load at /buildworker/worker/package_linux64/build/src/toplevel.c:855
include at ./boot.jl:317 [inlined]
include_relative at ./loading.jl:1034
include at ./sysimg.jl:29
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
include at ./client.jl:401
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1821
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:324
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:428
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:363 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:675
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:792
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f366225064f)
unknown function (ip: 0xffffffffffffffff)
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:801
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:821
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:769
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/builtins.c:633
eval at ./boot.jl:319
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
macro expansion at ./logging.jl:305 [inlined]
exec_options at ./client.jl:224
_start at ./client.jl:435
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2159
jl_apply at /buildworker/worker/package_linux64/build/ui/../src/julia.h:1533 [inlined]
true_main at /buildworker/worker/package_linux64/build/ui/repl.c:112
main at /buildworker/worker/package_linux64/build/ui/repl.c:243
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/julia-299300a409/bin/julia (unknown line)
Allocations: 16936302 (Pool: 16933386; Big: 2916); GC: 38

@PetrKryslUCSD
Author

The bug occurs with the newest Linux Julia as well:

              _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-beta2.47 (2018-07-19 14:45 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit b012ab3a44 (1 day old master)
|__/                   |  x86_64-pc-linux-gnu

(v0.7) pkg> test FinEtools
   Testing FinEtools
 Resolving package versions...
success?
mr = FinEtools.DeforModelRedModule.DeforModelRed2DAxisymm
material.mr = FinEtools.DeforModelRedModule.DeforModelRed2DAxisymm
failure?
New segmentation fault?
Debug: Error During Test at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/runtests.jl:104
  Got exception LoadError("/mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/test_debug.jl", 64, ErrorException("Unknown orientation")) outside of a @test
  LoadError: Unknown orientation
  Stacktrace:
   [1] error(::String) at ./error.jl:33
   [2] T4blockx(::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Symbol) at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/src/MeshTetrahedronModule.jl:84
   [3] test() at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/src/MeshTetrahedronModule.jl:32
   [4] top-level scope at none:0
   [5] include at ./boot.jl:317 [inlined]
   [6] include_relative(::Module, ::String) at ./loading.jl:1034
   [7] include(::Module, ::String) at ./sysimg.jl:29
   [8] include(::String) at ./client.jl:401
   [9] macro expansion at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/runtests.jl:104 [inlined]
   [10] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v0.7/Test/src/Test.jl:1080 [inlined]
   [11] macro expansion at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/runtests.jl:104 [inlined]
   [12] top-level scope at ./util.jl:156 [inlined]
   [13] top-level scope at ./<missing>:0
   [14] include at ./boot.jl:317 [inlined]
   [15] include_relative(::Module, ::String) at ./loading.jl:1034
   [16] include(::Module, ::String) at ./sysimg.jl:29
   [17] include(::String) at ./client.jl:401
   [18] top-level scope at none:0
   [19] eval(::Module, ::Any) at ./boot.jl:319
   [20] macro expansion at ./logging.jl:305 [inlined]
   [21] exec_options(::Base.JLOptions) at ./client.jl:224
   [22] _start() at ./client.jl:435
  in expression starting at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/test_debug.jl:64
Test Summary: | Pass  Error  Total
Debug         |    2      1      3
ERROR: LoadError: Some tests did not pass: 2 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /mnt/c/Users/PetrKrysl/Documents/Work-in-progress/FinEtools/test/runtests.jl:104
ERROR: Package FinEtools errored during testing

@Keno self-assigned this Jul 23, 2018
@Keno
Member

Keno commented Jul 23, 2018

It's the array itself that's getting dropped:

julia> @noinline use(x) = x
use (generic function with 1 method)

julia> function foo()
           for btu in [:SEC :MIN :HR :DY :YR :WK]
               ccall(:jl_, Cvoid, (Any,), btu)
               GC.gc()
               # Allocate a bunch of garbage
               for i = 1:100000
                   use([:A :B :C :D :E :F])
               end
           end
       end
foo (generic function with 1 method)

julia> foo()
:SEC
:B
:C
:D
:E
:F

@Keno
Member

Keno commented Jul 23, 2018

FWIW, this is the bug that @vtjnash asked me a few weeks ago how the GC placement logic handled. The answer is that it doesn't, apparently. I'll play with marking the array pointer derived, which doesn't quite have the correct semantics currently, but I think it can potentially do double duty.
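
A variant of the reproducer above can record what each iteration observes while keeping the array explicitly rooted. This is only a sketch under assumptions: observed_units is a hypothetical name, and whether GC.@preserve would have avoided the crash on the affected betas was not tested in this thread. On a build that contains the fix, the check should hold with or without the GC.@preserve block.

```julia
# Hypothetical check, modeled on the reproducer above.
# `use` returns its argument and is never inlined, so the garbage
# allocations in the inner loop are not optimized away.
@noinline use(x) = x

function observed_units()
    units = [:SEC :MIN :HR :DY :YR :WK]
    seen = Symbol[]
    # GC.@preserve keeps `units` rooted for the duration of the block,
    # even if the compiler only needs its data pointer inside the loop.
    GC.@preserve units begin
        for btu in units
            push!(seen, btu)
            GC.gc()
            # Allocate a bunch of garbage, as in the reproducer
            for i in 1:1000
                use([:A :B :C :D :E :F])
            end
        end
    end
    return seen
end

# Every iteration should have observed the symbol stored in the array.
println(observed_units() == [:SEC, :MIN, :HR, :DY, :YR, :WK])
```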

@JeffBezanson JeffBezanson added compiler:codegen Generation of LLVM IR and native code GC Garbage collector labels Jul 23, 2018
Keno added a commit that referenced this issue Jul 24, 2018
The array data pointer is somewhat special. It points to a chunk
of memory that is effectively managed by the GC, but is not itself
a GC-tracked value. However, it is also not quite an interior pointer
into the array, since it may be an external allocation (or at the
more immediate IR level it is derived using a load rather than
a gep). We could try to make Derived do both, but the semantics
turn out to be rather different, so add a new kind of AS `Loaded`,
that handles precisely this situation: It roots the object that it
was loaded from while it is live.

Fixes #27955
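
The commit message concerns compiler-internal address spaces, but a similar hazard exists at user level whenever a raw data pointer outlives the last visible use of its array. The following is a rough analogy and not the compiler change itself; sum_via_pointer is a hypothetical name chosen for illustration.

```julia
# Hypothetical user-level analogy: a raw data pointer does not keep
# its array alive, so the owner has to be rooted explicitly.
function sum_via_pointer(v::Vector{Float64})
    total = 0.0
    # Without GC.@preserve, `v` could in principle be collected once the
    # compiler sees no further use of it, leaving `p` dangling.
    GC.@preserve v begin
        p = pointer(v)                # raw pointer to the array data
        for i in 1:length(v)
            total += unsafe_load(p, i)
        end
    end
    return total
end

println(sum_via_pointer([1.0, 2.0, 3.5]))
```

The Loaded address space described in the commit message gives the compiler the same guarantee automatically: the object a data pointer was loaded from stays rooted while the pointer is live.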
Keno added a commit that referenced this issue Jul 28, 2018
Keno added a commit that referenced this issue Jul 28, 2018
@PetrKryslUCSD
Author

This bug is around no more, it appears. Thanks a bunch!

@PetrKryslUCSD
Author

@Keno : rc2 does not display this bug anymore.

KristofferC pushed a commit that referenced this issue Feb 11, 2019