Regression from v1.0.2, causes crash on linux with v1.0.3 #30612

Closed
richardreeve opened this issue Jan 5, 2019 · 17 comments
Labels
regression Regression in behavior compared to a previous version

Comments

@richardreeve

My package now crashes on Julia v1.0.3, when it didn't on v1.0.2. Any suggestions? I don't know how to do a bisection to identify what happened...

Working on 1.0.2:

julia> versioninfo()
Julia Version 1.0.2
Commit d789231e99 (2018-11-08 20:11 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2687W v2 @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, ivybridge)

(v1.0) pkg> add Phylo

(v1.0) pkg> status
    Status `~/.julia/environments/v1.0/Project.toml`
  [aea672f4] Phylo v0.3.2

julia> using Phylo

julia> parsenewick("((,),(,));")
BinaryTree{DataFrames.DataFrame,Dict{String,Any}} with 4 tips, 7 nodes and 6 branches.
Leaf names are Node 1, Node 2, Node 4 and Node 5

Crashing on v1.0.3:

julia> versioninfo()
Julia Version 1.0.3
Commit 099e826241 (2018-12-18 01:34 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2687W v2 @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, ivybridge)

(v1.0) pkg> add Phylo
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
status
  Updating `~/.julia/environments/v1.0/Project.toml`
 [no changes]
  Updating `~/.julia/environments/v1.0/Manifest.toml`
 [no changes]

(v1.0) pkg> status
    Status `~/.julia/environments/v1.0/Project.toml`
  [aea672f4] Phylo v0.3.2

julia> using Phylo

julia> parsenewick("((,),(,));")
signal (11): Segmentation fault
in expression starting at no file:0
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1191
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
emit_invoke at /buildworker/worker/package_linux64/build/src/codegen.cpp:3094
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:3893
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:3615
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:3801 [inlined]
emit_function at /buildworker/worker/package_linux64/build/src/codegen.cpp:6262
jl_compile_linfo at /buildworker/worker/package_linux64/build/src/codegen.cpp:1159
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1796
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2184
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:324
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:430
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:363 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:682
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:806
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f6949c40fff)
unknown function (ip: 0xffffffffffffffff)
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:815
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:805
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/builtins.c:622
eval at ./boot.jl:319
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2184
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:85
macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:117 [inlined]
#28 at ./task.jl:259
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2184
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1537 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 38676146 (Pool: 38667798; Big: 8348); GC: 83
Segmentation fault (core dumped)
@KristofferC added the regression label Jan 5, 2019
@richardreeve
Author

The code works on v1.0.3 on MacOS and on Windows, and on Julia v1.0.2, v0.7 and v0.6 on all platforms.

@richardreeve
Author

Just seen there's a v1.1.0-rc1 too, and the crash is fixed there, sorry... Is there going to be another point release before v1.1.0 to fix this?

julia> versioninfo()
Julia Version 1.1.0-rc1.0
Commit ba87aa3962 (2018-12-31 23:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2687W v2 @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, ivybridge)

(v1.1) pkg> add Phylo


(v1.1) pkg> status
    Status `~/.julia/environments/v1.1/Project.toml`
  [aea672f4] Phylo v0.3.2

julia> using Phylo

julia> parsenewick("((,),(,));")
BinaryTree{DataFrames.DataFrame,Dict{String,Any}} with 4 tips, 7 nodes and 6 branches.
Leaf names are Node 1, Node 2, Node 4 and Node 5

@ararslan
Member

ararslan commented Jan 6, 2019

I ran git bisect between v1.0.2 and v1.0.3 with @assert parsenewick("((,),(,));") isa Phylo.BinaryTree as the success condition. It identified #30113, but the chosen commit was one of the broken interim commits and the PR was not squashed on merge, so it's unclear whether that PR is actually to blame. I've restarted the bisect with git bisect skip for that commit, so we'll see.
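For anyone wanting to repeat a bisect like the one described above, here is a minimal sketch of the exit-code handling a `git bisect run` script needs. The function name `bisect_verdict` and the wrapper in the comments are hypothetical, not what was actually run. The one subtle point is that a segfaulting process usually exits with status 139, which `git bisect run` treats as "abort" rather than "bad", so failures are mapped to 1.

```shell
# Sketch of a helper for `git bisect run`. Names are illustrative assumptions.
#
# bisect_verdict BUILD_STATUS TEST_STATUS
# Prints the exit code to hand back to `git bisect run`:
#   0   -> commit is good
#   125 -> commit cannot be tested (e.g. a broken interim commit): skip it
#   1   -> commit is bad
# Any status above 127 (such as 139 = 128 + SIGSEGV) would abort the bisect,
# so every test failure is collapsed to 1.
bisect_verdict() {
    if [ "$1" -ne 0 ]; then
        echo 125
    elif [ "$2" -eq 0 ]; then
        echo 0
    else
        echo 1
    fi
}

# Possible wrapper, assuming a Julia source checkout where `make` produces
# ./julia and Phylo is already installed in the active depot:
#
#   make >/dev/null 2>&1; build=$?
#   ./julia -e 'using Phylo; @assert parsenewick("((,),(,));") isa Phylo.BinaryTree'; t=$?
#   exit "$(bisect_verdict "$build" "$t")"
#
# Saved as bisect-test.sh, it would be driven with:
#   git bisect start v1.0.3 v1.0.2
#   git bisect run ./bisect-test.sh
```

Returning 125 for commits that fail to build has the same effect as running git bisect skip by hand on each broken interim commit.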

@ararslan
Member

ararslan commented Jan 6, 2019

So, interesting result: I cannot reproduce the failure at all with a source build on this machine (git bisect ended on the commit I had marked as git bisect bad, but that commit is unrelated). However, I can reproduce it with the official binaries. I'm not sure what to make of that.

@staticfloat, ideas?

@staticfloat
Member

Most likely it's sensitive to local compiler flags; good things to check are whether your local source build is using the same sysimg multiversioning flags, whether you're setting the same MARCH, etc.
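As a rough illustration of pinning those settings in a source build, the sketch below writes a Make.user file. MARCH is the variable named above; JULIA_CPU_TARGET is Julia's build variable for the sysimg CPU targets. The helper name `write_make_user` and the values in the usage comment are placeholders, since the buildbot settings are not given in this thread.

```shell
# Sketch only: write a Make.user that fixes the architecture-related build
# settings, so a local source build can be compared against another build.
#
# write_make_user DIR MARCH CPU_TARGET
#   DIR         Julia source checkout (Make.user is read from its top level)
#   MARCH       value for the MARCH build variable
#   CPU_TARGET  value for JULIA_CPU_TARGET (sysimg multiversioning targets)
write_make_user() {
    {
        printf 'MARCH=%s\n' "$2"
        printf 'JULIA_CPU_TARGET=%s\n' "$3"
    } > "$1/Make.user"
}

# Usage with placeholder values (NOT the actual buildbot configuration):
#   write_make_user /path/to/julia x86-64 generic
#   make -C /path/to/julia
```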

@ararslan
Member

ararslan commented Jan 7, 2019

It was just a plain make build on Anubis. I can try again later with particular options set to mirror those on the buildbots, but I didn't think we were setting anything too special there.

@richardreeve
Author

Just wondering if there are any thoughts about this? It would be nice if it were resolved, or at the very least if I could be certain that the next binary release won't break in the same way...

@ararslan
Member

Do the 1.1.0-rc2 binaries work for you?

@richardreeve
Author

Yes they do... but so did 1.1.0-rc1 - I guess my concern is whether this was compiled using the same compiler flags as the official 1.0.3 release, since it was only that release (and not the source itself) that seemed to be broken.

@richardreeve
Author

Okay, the 1.1.0 release works, so does that mean we should close this and hope that whatever went wrong with the 1.0.3 release won't happen again :)?

@ararslan
Member

We'll still be making 1.0.x releases, so this is still a problem. I don't have time to investigate further at the moment though, unfortunately. Hopefully this can be resolved for 1.0.4 if the issue can be identified.

@richardreeve
Author

Fair enough... I'll keep my fingers crossed someone has time. I'm working mostly on MacOS at the moment, so checking things on linux is inconvenient, but I'll try to keep looking into it.

@maleadt
Member

maleadt commented Jan 23, 2019

> if the issue can be identified

I'm running a reduce job but there's a lot of code involved so this might take a while.
EDIT: after reducing 40 out of 70KLOC, the segfault has become nondeterministic so I'm not sure this will end anywhere 🙁

@maleadt
Member

maleadt commented Feb 14, 2019

I've managed to reduce this example, but I'm not sure I like the result...
This is what's left of the entire codebase with all its dependencies:

# depot/packages/DataFrames/lyCjP/src/abstractdataframe/io.jl
using WeakRefStrings
# depot/packages/DataFrames/lyCjP/src/DataFrames.jl
module DataFrames
if VERSION >= v"1.1.0DEV.792" end
include("abstractdataframe/io.jl")
end
# depot/packages/DataValues/cAl6R/src/DataValues.jl
module DataValues
include("scalar/core.jl")
end
# depot/packages/DataValues/cAl6R/src/scalar/core.jl
for b in (:!, )
  @eval begin
      import .$b
      $b(a) = c
  end
end
# depot/packages/IterableTables/xvpnQ/src/IterableTables.jl
module IterableTables
using Requires, TableTraitsUtils
end
# depot/packages/Phylo/g085o/src/newick.jl
using Tokenize
function parsenewick(::Tokenize.Lexers.Lexer, ::c) where c
  "Unexpected $token.kind token '$(untokenize(token))' "
end
parsenewick(::String, ::Type{c}) where c = parsenewick(a, c)
parsenewick(b) = parsenewick(b, NamedTree)
# depot/packages/Phylo/g085o/src/Phylo.jl
module Phylo
include("Tree.jl")
include("newick.jl")
export parsenewick
include("trim.jl")
if VERSION < v"0.7.0" end
end
# depot/packages/Phylo/g085o/src/Tree.jl
using DataFrames
struct BinaryTree end
NamedTree = BinaryTree
# depot/packages/Phylo/g085o/src/trim.jl
using IterableTables
# depot/packages/Requires/9Jse8/src/require.jl
 
# depot/packages/Requires/9Jse8/src/Requires.jl
module Requires
include("require.jl")
end
# depot/packages/TableTraitsUtils/p4RrX/src/TableTraitsUtils.jl
module TableTraitsUtils
using DataValues
end
# depot/packages/Tokenize/P2B32/src/lexer.jl
module Lexers
struct Lexer end
end
# depot/packages/Tokenize/P2B32/src/Tokenize.jl
module Tokenize
include("token.jl")
include("lexer.jl")
import .Tokens: untokenize
export untokenize
end
# depot/packages/Tokenize/P2B32/src/token.jl
module Tokens
include("token_kinds.jl")
function a()
  for b in instances(Kind)
    if string(b) end
  end
end
a()
struct c e::Kind end
function untokenize(d::c)
  if string(d.e) end
end
end
# depot/packages/Tokenize/P2B32/src/token_kinds.jl
@enum(Kind, end_keywords)
# depot/packages/WeakRefStrings/RmyGQ/src/WeakRefStrings.jl
module WeakRefStrings
struct a <: AbstractString end
Base.thisind(::a, c) = b
end
# main.jl
using Phylo
parsenewick("")

Nothing particularly exciting, really, but creduce doesn't manage to reduce this any further. This includes, e.g., the empty require.jl -- removing it and the include from Requires.jl breaks the repro.

Now for what makes this repro annoying: the segfault is nondeterministic, and requires a couple of runs before triggering. Worse, the segfault only happens when piping the output of julia to a process, even if it's just tee (you can say I "selected" such a repro by testing against julia ... | grep Segfault).
The issue is also precompile-related, and only reproduces when starting with an empty cache (i.e., removing .julia/compiled). And to top it all off, it doesn't reproduce when disabling ASLR.

I tried running against 1.0.3 + ASAN, but it really only reproduces with the binaries. To try it yourself:

$ git clone https://github.com/maleadt/creduce_julia -b julia/30612 .
$ while true;
  do
    echo try;
    rm -rf depot/compiled/v1.0;
    PATH=/path/to/julia-1.0.3/bin:$PATH JULIA_DEPOT_PATH=$(pwd)/depot julia main.jl |& grep Segmentation;
  done
try
signal (11): Segmentation fault

Verified on cyclops. All code up at https://github.com/maleadt/creduce_julia/tree/julia/30612
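The loop above runs forever; a bounded variant that reports how many attempts were needed might look like the sketch below. The function name `try_repro` and the CACHE_DIR variable are made up for illustration. It keeps the two conditions described above: the compiled cache is removed before every run, and the output is piped into another process.

```shell
# Sketch only: bounded retry loop for a nondeterministic, cache-dependent crash.
#
# try_repro MAX CMD [ARGS...]
#   Runs CMD up to MAX times, deleting the precompile cache before each run.
#   Prints the attempt number and returns 0 on the first run whose output
#   mentions "Segmentation"; returns 1 if no run does.
#   CACHE_DIR (assumed name) defaults to the cache path used in the loop above.
try_repro() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        rm -rf "${CACHE_DIR:-depot/compiled/v1.0}"
        # The pipe matters: the crash was only observed with piped output.
        if "$@" 2>&1 | grep -q Segmentation; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage, with the same environment as the loop above:
#   export JULIA_DEPOT_PATH=$(pwd)/depot PATH=/path/to/julia-1.0.3/bin:$PATH
#   try_repro 50 julia main.jl
#
# Control run with ASLR disabled (util-linux setarch), which per the
# description above should not crash:
#   try_repro 50 setarch "$(uname -m)" -R julia main.jl
```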

@richardreeve
Author

Thanks so much for looking into this. I'm feeling a bit dispirited that it seems to be so complicated, and it feels like my only option at the moment is to pray that the 1.0.4 binary release won't be afflicted and the problem will silently disappear... What I don't understand is what differs between the official binary releases and the nightlies (presumably the compiler flags, or just the platform they are compiled on?) that might have made this show up. Presumably there's a script somewhere that builds both, so the two could be compared?

@vtjnash added this to the 1.x milestone Oct 22, 2020
@vtjnash
Member

vtjnash commented Oct 22, 2020

@maleadt Since this was only an issue in v1.0.x (which isn't likely to see more releases), and not on master (which should soon have another LTS release and has likely already advanced significantly in the various areas where this may have failed), did your analysis indicate whether we can close it?

@maleadt
Member

maleadt commented Oct 23, 2020

I did not investigate this again with 1.0.4 or 1.0.5, but with another LTS coming up and this issue being fixed on 1.1+ I think we can close this. Maybe @richardreeve can elaborate whether this is still an actual issue with Phylo.jl on any version of Julia.

@vtjnash closed this as completed Mar 30, 2021