Assigning value on remote worker: SIGSEGV (regression, parallel) #12468

Closed
bermanmaxim opened this issue Aug 5, 2015 · 6 comments

@bermanmaxim

I ran into this crash while testing the assignment of globals on remote workers:

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+6478 (2015-08-03 04:22 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 4d08b6d* (2 days old master)
|__/                   |  x86_64-linux-gnu

julia> addprocs(1);

julia> a = [1,2,3];

julia> @spawnat 2 global b = a;

julia> @everywhere println(b)

signal (11): Segmentation fault
unknown function (ip: 0x7f0ca8b36e86)
jl_f_isa at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
typeinf_uncached at ./inference.jl:1481
unknown function (ip: 0x7f0ca59facc0)
jl_apply_generic at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
typeinf at ./inference.jl:1340
unknown function (ip: 0x7f0ca59f5416)
jl_apply_generic at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
typeinf_ext at ./inference.jl:1284
jl_apply_generic at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
unknown function (ip: 0x7f0ca8b4a69d)
unknown function (ip: 0x7f0ca8b4b107)
jl_apply_generic at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
sync_end at ./task.jl:406
anonymous at multi.jl:421
unknown function (ip: 0x7f0ca8bb4ceb)
jl_toplevel_eval_in at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
eval_user_input at REPL.jl:63
jlcall_eval_user_input_21091 at  (unknown line)
jl_apply_generic at /home/maxim/julia/usr/bin/../lib/libjulia.so (unknown line)
anonymous at task.jl:92
unknown function (ip: 0x7f0ca8ba7580)
unknown function (ip: (nil))
fish: “julia” terminated by signal SIGSEGV (Address boundary error)

A git bisect narrowed the commit introducing this behavior down to 4d08b6d; before that commit there is no SIGSEGV, only the expected output:

julia> @everywhere println(b)
exception on 1: ERROR: UndefVarError: b not defined
 in eval at sysimg.jl:14
 in anonymous at multi.jl:1321
 in run_work_thunk at multi.jl:584
 in remotecall_fetch at multi.jl:657
 in remotecall_fetch at multi.jl:672
 in anonymous at task.jl:365
    From worker 2:  [1,2,3]

@amitmurthy
Contributor

I get

  | | |_| | | | (_| |  |  Version 0.4.0-dev+6496 (2015-08-05 14:55 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 2171f80 (0 days old master)
|__/                   |  x86_64-linux-gnu

julia> addprocs(1);

julia> a = [1,2,3];

julia> @spawnat 2 global b = a;

julia> @everywhere println(b)
        From worker 2:  [1,2,3]
ERROR: UndefVarError(:b)
 in eval at sysimg.jl:14
 in anonymous at multi.jl:1350
 in run_work_thunk at multi.jl:643
 in remotecall_fetch at multi.jl:716
 in anonymous at task.jl:447
 in remotecall_fetch at multi.jl:717
 in anonymous at task.jl:447
 in sync_end at ./task.jl:413
 in anonymous at multi.jl:422

Could you do a git pull; make clean; make and try again?

@amitmurthy
Contributor

I don't have a Make.user

@amitmurthy
Contributor

Also, can you print the output of versioninfo()?

Mine is

julia> versioninfo()
Julia Version 0.4.0-dev+6496
Commit 2171f80 (2015-08-05 14:55 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

@amitmurthy
Contributor

The underlying issue may be the same as #12381.

@bermanmaxim
Author

You were right: I no longer see this problem after make clean and rebuilding. I guess make alone wasn't enough (the bug seemed strangely consistent during my git bisect)... Thanks for the help!

julia> versioninfo()
Julia Version 0.4.0-dev+6511
Commit a2391b9 (2015-08-06 08:55 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
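
For anyone bisecting a similar crash: since a plain make was not enough here, a bisect helper that rebuilds from scratch at every step may help avoid chasing stale-build artifacts. Below is a minimal sketch, assuming a POSIX shell. The names bisect_step, BUILD_CMD, REPRO_CMD and repro.jl are hypothetical and not part of the Julia repository; only the make clean and make steps come from this thread.

```shell
# Hypothetical helper meant to be called from `git bisect run`.
# BUILD_CMD and REPRO_CMD are placeholder variables (assumptions for this sketch).
bisect_step() {
    build_cmd="${BUILD_CMD:-make clean && make}"   # full rebuild, as suggested above
    repro_cmd="${REPRO_CMD:-./julia repro.jl}"     # repro.jl is a hypothetical script

    # A failed build says nothing about the bug itself:
    # exit status 125 asks git bisect to skip this commit.
    sh -c "$build_cmd" >/dev/null 2>&1 || return 125

    # Status 0 marks the commit as good.
    if sh -c "$repro_cmd" >/dev/null 2>&1; then
        return 0
    fi

    # Any failure, including a crash, is reported as a plain "bad" (status 1).
    return 1
}
```

It could be driven with something like git bisect run sh -c '. ./bisect_step.sh; bisect_step', where bisect_step.sh is a hypothetical file holding the function. Collapsing every failure to status 1 is deliberate: a process killed by SIGSEGV usually shows up in the shell as status 139, and git bisect run aborts on statuses above 127 instead of marking the commit as bad.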

@amitmurthy
Contributor

Please reopen if you see it again.
