Testing ProbNumDiffEq errors on 1.8 with Unreachable reached, signal (4): Illegal instruction #45704

Open
KristofferC opened this issue Jun 16, 2022 · 11 comments
Labels
bug Indicates an unexpected problem or unintended behavior compiler:precompilation Precompilation of modules

Comments

@KristofferC
Member

(Running with assertions on).

https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/8b2e406_vs_742b9ab/ProbNumDiffEq.primary.log

Unreachable reached at 0x7fc58bdd4f00

signal (4): Illegal instruction
in expression starting at /home/kc/.julia/packages/ProbNumDiffEq/C4V7j/test/destats.jl:7
Dual at /home/kc/.julia/packages/ForwardDiff/wAaVJ/src/dual.jl:19
#ForwardColorJacCache#12 at /home/kc/.julia/packages/SparseDiffTools/HI65u/src/differentiation/compute_jacobian_ad.jl:41
Type##kw at /home/kc/.julia/packages/SparseDiffTools/HI65u/src/differentiation/compute_jacobian_ad.jl:19 [inlined]
build_jac_config at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/derivative_wrappers.jl:141 [inlined]
build_jac_config at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/derivative_wrappers.jl:123 [inlined]
alg_cache at /home/kc/.julia/packages/ProbNumDiffEq/C4V7j/src/caches.jl:195
#__init#563 at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:295
__init##kw at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:9 [inlined]
__init##kw at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:9 [inlined]
__init##kw at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:9 [inlined]
__init##kw at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:9 [inlined]
__init##kw at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:9 [inlined]
#__solve#562 at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:4
__solve##kw at /home/kc/.julia/packages/OrdinaryDiffEq/nrmdG/src/solve.jl:1 [inlined]
#solve_call#39 at /home/kc/.julia/packages/DiffEqBase/S7V8q/src/solve.jl:221 [inlined]
solve_call##kw at /home/kc/.julia/packages/DiffEqBase/S7V8q/src/solve.jl:207 [inlined]
#solve_up#41 at /home/kc/.julia/packages/DiffEqBase/S7V8q/src/solve.jl:248 [inlined]
solve_up##kw at /home/kc/.julia/packages/DiffEqBase/S7V8q/src/solve.jl:237 [inlined]
#solve#40 at /home/kc/.julia/packages/DiffEqBase/S7V8q/src/solve.jl:234 [inlined]
solve##kw at /home/kc/.julia/packages/DiffEqBase/S7V8q/src/solve.jl:226
...
@KristofferC KristofferC added the regression Regression in behavior compared to a previous version label Jun 16, 2022
@KristofferC KristofferC added this to the 1.8 milestone Jun 16, 2022
@JeffBezanson
Member

Are we sure this only happens with assertions on? After all, it's not an assertion failure 😄

@KristofferC
Member Author

Indeed, it happens even without assertions.

@JeffBezanson
Member

These are usually type inference or intersection bugs.

@JeffBezanson
Member

Running in a debug build I get instead:

Intrinsic name not mangled correctly for type arguments! Should be: llvm.powi.f64.i32
double (double, i32)* @llvm.powi.f64
in function julia__transdiff_ibm_element_104715

@vtjnash
Member

vtjnash commented Jun 24, 2022

That is our mistake then, and fixed on master by #44580. We were using llvmcall, which does not have a stable API across LLVM versions. We should notice this less often because of #44697 now.

@JeffBezanson
Member

👍
With that fixed, I get the illegal instruction, yay (?)

@vtjnash
Member

vtjnash commented Jun 27, 2022

(rr) p jl_gdblookup($rip)
Dual at /home/vtjnash/.julia/packages/ForwardDiff/wAaVJ/src/dual.jl:19
(rr) p jl_(jl_gdblookuplinfo($rip))
(::Type{ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEq.OrdinaryDiffEqTag, Float64}, Float64, 3}})(Float64, ForwardDiff.Partials{3, Float64}) from (::Type{ForwardDiff.Dual{T, V, N}})(V, ForwardDiff.Partials{N, V}) where {T, V, N}
(rr) p $rip
$4 = (void (*)()) 0x7f769beb14d0
(rr) watch *0x7f769beb14d0
Hardware watchpoint 1: *0x7f769beb14d0
(rr) b JuliaOJIT::OptSelLayerT::emit
(rr) rc
(rr) until 502
(rr) p jl_dump_llvm_module(&M)
define void @julia_Dual_100210({ double, [1 x [3 x double]] }* noalias nocapture noundef nonnull sret({ double, [1 x [3 x double]] }) align 8 dereferenceable(32) %0, double %1, [1 x [3 x double]] addrspace(11)* nocapture noundef nonnull readonly align 8 dereferenceable(24) %2) #0 !dbg !4 {
top:
  %3 = alloca { double, [1 x [3 x double]] }, align 8
  %4 = call {}*** @julia.get_pgcstack()
  %5 = bitcast {}*** %4 to {}**
  %current_task = getelementptr inbounds {}*, {}** %5, i64 -13
  %6 = bitcast {}** %current_task to i64*
  %world_age = getelementptr inbounds i64, i64* %6, i64 14
  %7 = getelementptr inbounds { double, [1 x [3 x double]] }, { double, [1 x [3 x double]] }* %3, i32 0, i32 0, !dbg !7
  store double %1, double* %7, align 8, !dbg !7, !tbaa !8
  %8 = getelementptr inbounds { double, [1 x [3 x double]] }, { double, [1 x [3 x double]] }* %3, i32 0, i32 1, !dbg !7
  %9 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]] addrspace(11)* %2, i32 0, i32 0, !dbg !7
  %10 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %8, i32 0, i32 0, !dbg !7
  %11 = bitcast [3 x double]* %10 to i8*, !dbg !7
  %12 = bitcast [3 x double] addrspace(11)* %9 to i8 addrspace(11)*, !dbg !7
  call void @llvm.memcpy.p0i8.p11i8.i64(i8* align 8 %11, i8 addrspace(11)* %12, i64 24, i1 false), !dbg !7, !tbaa !12
  call void @llvm.trap(), !dbg !7
  unreachable, !dbg !7

after_noret:                                      ; No predecessors!
  call void @llvm.trap(), !dbg !7
  unreachable, !dbg !7
}
julia> using ProbNumDiffEq
┌ Warning: Replacing module `OrdinaryDiffEq`
└ @ Base loading.jl:1196

julia> using OrdinaryDiffEq

julia> using ForwardDiff

julia> code_llvm(ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEq.OrdinaryDiffEqTag, Float64}, Float64, 3}, (Float64, ForwardDiff.Partials{3, Float64}), optimize=false)
;  @ /home/vtjnash/.julia/packages/ForwardDiff/wAaVJ/src/dual.jl:17 within `Dual`
define void @julia_Dual_1654({ double, [1 x [3 x double]] }* noalias nocapture noundef nonnull sret({ double, [1 x [3 x double]] }) align 8 dereferenceable(32) %0, double %1, [1 x [3 x double]]* nocapture noundef nonnull readonly align 8 dereferenceable(24) %2) #0 {
top:
  %3 = alloca { double, [1 x [3 x double]] }, align 8
  %4 = call {}*** @julia.get_pgcstack()
  %5 = bitcast {}*** %4 to {}**
  %current_task = getelementptr inbounds {}*, {}** %5, i64 -13
  %6 = bitcast {}** %current_task to i64*
  %world_age = getelementptr inbounds i64, i64* %6, i64 14
;  @ /home/vtjnash/.julia/packages/ForwardDiff/wAaVJ/src/dual.jl:19 within `Dual`
  %7 = getelementptr inbounds { double, [1 x [3 x double]] }, { double, [1 x [3 x double]] }* %3, i32 0, i32 0
  store double %1, double* %7, align 8
  %8 = getelementptr inbounds { double, [1 x [3 x double]] }, { double, [1 x [3 x double]] }* %3, i32 0, i32 1
  %9 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %2, i32 0, i32 0
  %10 = getelementptr inbounds [1 x [3 x double]], [1 x [3 x double]]* %8, i32 0, i32 0
  %11 = bitcast [3 x double]* %10 to i8*
  %12 = bitcast [3 x double]* %9 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %11, i8* %12, i64 24, i1 false)
  call void @llvm.trap()
  unreachable

after_noret:                                      ; No predecessors!
  call void @llvm.trap()
  unreachable
}

The @warn statement there is probably causing corruption of our datatype references. Since #43990 maybe it should have been changed to an error?

@KristofferC
Member Author

KristofferC commented Jun 28, 2022

module ProbNumDiffEq

using OrdinaryDiffEq
using TaylorIntegration

end

is enough to reproduce the problem.

...
require(TaylorIntegration [92b13dbe-c966-51a2-8445-caca9f8a7d42], RecursiveArrayTools) -> RecursiveArrayTools [731186ca-8d62-57ce-b412-fbd966d074cd]
require(TaylorIntegration [92b13dbe-c966-51a2-8445-caca9f8a7d42], DiffEqBase) -> DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e]
require(TaylorIntegration [92b13dbe-c966-51a2-8445-caca9f8a7d42], OrdinaryDiffEq) -> OrdinaryDiffEq [1dea7af3-3e70-54e6-95c3-0bf5283fa5ed]
ERROR: Replacing module `OrdinaryDiffEq`
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] macro expansion
    @ ./loading.jl:1120 [inlined]
  [3] macro expansion
    @ ./lock.jl:223 [inlined]
  [4] register_root_module(m::Module)

This load comes from a @require block:

https://github.com/PerezHz/TaylorIntegration.jl/blob/0223c5c02d82dda293f765d15e9aeeca1aa88139/src/TaylorIntegration.jl#L20-L22

https://github.com/PerezHz/TaylorIntegration.jl/blob/dc6cc36f5f7fff7822dca16505bbdc8eca097679/src/common.jl#L1

So the loading seems to go something like:

load OrdinaryDiffEq
    load DiffEqBase
load TaylorIntegration
    include "common.jl" from `@require DiffEqBase`
        load OrdinaryDiffEq
            error due to Replacing module `OrdinaryDiffEq`

@vtjnash, does the above ring any bells as to what the problem could be?

@vtjnash
Member

vtjnash commented Jun 29, 2022

Yeah, I see. TaylorIntegration demanded that we create a cycle in the dependency graph, and those are potentially impossible to handle correctly. In particular, once TaylorIntegration is loaded, it demands that OrdinaryDiffEq be loaded immediately after DiffEqBase finishes loading. But if we were loading DiffEqBase specifically to satisfy a dependency of OrdinaryDiffEq, that may be an impossible request.

@vtjnash
Member

vtjnash commented Jun 29, 2022

A closely related version of this is:

julia> using TaylorIntegration
julia> using OrdinaryDiffEq
<certain deadlock>

This might just generally be a danger of putting code in __init__ blocks?
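
The ordering hazard described in the two comments above can be sketched with a toy loader. This is only a model under stated assumptions: `ToyHook`, `ToyLoader`, `toy_require!` and `toy_load_order` are hypothetical names, not Base or Requires.jl APIs, and the real loader would hang or replace a module where this sketch throws an error. The dependency edge (OrdinaryDiffEq needs DiffEqBase) and the hook (TaylorIntegration asks for OrdinaryDiffEq once DiffEqBase is loaded) are taken from the load sequence reported earlier in the thread.

```julia
# Toy model of package loading. Every name below is hypothetical.

struct ToyHook
    owner::Symbol    # package that registered the hook
    trigger::Symbol  # package whose loading fires the hook
    wanted::Symbol   # package the hook body then tries to load
end

struct ToyCycleError <: Exception
    pkg::Symbol      # package that was requested while still loading
end

struct ToyLoader
    deps::Dict{Symbol,Vector{Symbol}}  # static dependencies
    hooks::Vector{ToyHook}             # stand-ins for `@require` blocks
    state::Dict{Symbol,Symbol}         # pkg => :loading or :loaded
end

toy_isloaded(l::ToyLoader, pkg::Symbol) = get(l.state, pkg, :unloaded) === :loaded

function toy_require!(l::ToyLoader, pkg::Symbol)
    toy_isloaded(l, pkg) && return nothing
    # A request for a package that is mid-load is the cycle.
    get(l.state, pkg, :unloaded) === :loading && throw(ToyCycleError(pkg))
    l.state[pkg] = :loading
    for dep in get(l.deps, pkg, Symbol[])
        toy_require!(l, dep)
    end
    l.state[pkg] = :loaded
    # A hook fires as soon as both its owner and its trigger are loaded.
    for h in l.hooks
        if (h.trigger === pkg || h.owner === pkg) &&
           toy_isloaded(l, h.trigger) && toy_isloaded(l, h.owner)
            toy_require!(l, h.wanted)
        end
    end
    return nothing
end

function toy_load_order(order::Vector{Symbol})
    deps = Dict(:OrdinaryDiffEq => [:DiffEqBase])
    hooks = [ToyHook(:TaylorIntegration, :DiffEqBase, :OrdinaryDiffEq)]
    l = ToyLoader(deps, hooks, Dict{Symbol,Symbol}())
    try
        foreach(pkg -> toy_require!(l, pkg), order)
        return :ok
    catch err
        err isa ToyCycleError || rethrow()
        return (:cycle, err.pkg)
    end
end
```

Calling `toy_load_order([:TaylorIntegration, :OrdinaryDiffEq])` hits the cycle, because the hook fires while OrdinaryDiffEq is still waiting on DiffEqBase. The reverse order completes in this toy model, which does not attempt to reproduce the precompilation case that produced the "Replacing module" warning.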

vtjnash added a commit that referenced this issue Jun 29, 2022
Does not explicitly close issue #45704, as perhaps the deserialized
module should still be valid after the replacement warning.
@vtjnash
Member

vtjnash commented Jun 30, 2022

It looks like the underlying issue causing the segfault is the ircode compressor assuming that modules are relocatable in the roots array, resulting in the ircode pointing to the wrong module after loading the later packages (due to #43990).
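
As a loose illustration of why a stale module reference is dangerous, the sketch below shows that two module objects can share a name while being distinct objects. It is not a model of the ircode compressor or the roots array; the function name `toy_replacement_demo` and the `registry` dictionary are hypothetical, and the only Base calls used are `Module` and `nameof`.

```julia
# Toy illustration only; this is not how Julia's serializer works.
function toy_replacement_demo()
    old_mod = Module(:ToyPkg)   # module the code was originally built against
    new_mod = Module(:ToyPkg)   # same name, different object ("replacement")

    # Hypothetical registry that resolves a module by name.
    registry = Dict{Symbol,Module}(:ToyPkg => old_mod)
    captured = registry[:ToyPkg]    # reference taken before the replacement
    registry[:ToyPkg] = new_mod     # the module is replaced
    resolved = registry[:ToyPkg]    # same lookup, performed afterwards

    return (same_name = nameof(captured) == nameof(resolved),
            same_module = captured === resolved)
end
```

The two lookups agree on the name but not on identity. One plausible reading of the trap above is that types reached through such a mismatched reference no longer match what the compiled code expects, although the thread itself does not spell out that step.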

vtjnash added a commit that referenced this issue Jun 30, 2022
Does not explicitly close issue #45704, as perhaps the deserialized
module should still be valid after the replacement warning.
KristofferC pushed a commit that referenced this issue Jul 4, 2022
Does not explicitly close issue #45704, as perhaps the deserialized
module should still be valid after the replacement warning.

(cherry picked from commit ad8893b)
KristofferC pushed a commit that referenced this issue Jul 4, 2022
Does not explicitly close issue #45704, as perhaps the deserialized
module should still be valid after the replacement warning.

(cherry picked from commit ad8893b)
@KristofferC KristofferC reopened this Jul 17, 2022
@KristofferC KristofferC removed this from the 1.8 milestone Jul 17, 2022
@vtjnash vtjnash added bug Indicates an unexpected problem or unintended behavior compiler:precompilation Precompilation of modules and removed regression Regression in behavior compared to a previous version labels Aug 10, 2022
pcjentsch pushed a commit to pcjentsch/julia that referenced this issue Aug 18, 2022
Does not explicitly close issue JuliaLang#45704, as perhaps the deserialized
module should still be valid after the replacement warning.