Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault in collect in HierarchialUtilities #44501

Closed
KristofferC opened this issue Mar 7, 2022 · 4 comments · Fixed by #45155
Closed

Segfault in collect in HierarchialUtilities #44501

KristofferC opened this issue Mar 7, 2022 · 4 comments · Fixed by #45155
Labels
bug Indicates an unexpected problem or unintended behavior compiler:codegen Generation of LLVM IR and native code
Milestone

Comments

@KristofferC
Copy link
Member

KristofferC commented Mar 7, 2022

Somewhat of an MWE:

import HierarchicalUtils: NodeType, children

mutable struct Leaf
    n::Int64
end
struct SingletonVertex{T}
    ch::T
end

NodeType(::Type{Nothing}) = LeafNode()
NodeType(::Type{<:Tuple}) = InnerNode()
NodeType(::Type{Leaf}) = LeafNode()
NodeType(::Type{<:SingletonVertex}) = InnerNode()

children(v::Tuple) = v
children(t::SingletonVertex) = (t.ch,)

_childsort(x::Tuple) = x
_children_sorted(n) = _childsort(children(n))

l2 = Leaf(2)
n3t = SingletonVertex(l2)
ts = (n3t, nothing)

chss1 = [_children_sorted(t) for t in ts if !(isnothing(t))] # boom
signal (11): Segmentation fault: 11
in expression starting at HierarchicalUtils.jl/test/utilities.jl:25
iterate at ./tuple.jl:68 [inlined]
iterate at ./iterators.jl:470 [inlined]
iterate at ./generator.jl:44 [inlined]
grow_to! at ./array.jl:882
grow_to! at ./array.jl:864 [inlined]
collect at ./array.jl:784
unknown function (ip: 0x111a3aae9)
_jl_invoke at /Users/kristoffercarlsson/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/kristoffercarlsson/julia/src/gf.c:2540
...

Tested on the backports-release-1.8 branch

@KristofferC KristofferC added this to the 1.8 milestone Mar 7, 2022
@maleadt
Copy link
Member

maleadt commented Mar 8, 2022

Reduced to the following (which only fails on -O0 anymore, but it looks like the same issue):

struct a
end
struct b{c}
    ch::c
end
children(d) = (d,)
e(i) = children(i)
f = a
g = b(f)
h = g, nothing
[e(d) for d in h if isnothing(d)] 

@maleadt maleadt added the bug Indicates an unexpected problem or unintended behavior label Mar 8, 2022
@vtjnash
Copy link
Member

vtjnash commented Mar 8, 2022

There does seem to be a lot of badness there of this sort:

   %value_phi19 = phi {} addrspace(10)* [ null, %L62 ], [ %119, %L73 ], [ addrspacecast ({}* inttoptr (i64 139967715850432 to {}*) to {} addrspace(10)*), %L68 ]                                  
   %123 = bitcast {} addrspace(10)* %value_phi19 to i64 addrspace(10)*, !dbg !25                                                                                                                  
   %124 = getelementptr inbounds i64, i64 addrspace(10)* %123, i64 -1, !dbg !25                                                                                                                   
   %125 = load atomic i64, i64 addrspace(10)* %124 unordered, align 8, !dbg !25, !tbaa !30, !range !38

Which is confusing to codegen, since it probably wants to union-unbox this from the phi node

│     %82  = φ (#31 => %79)::Union{Tuple{Nothing}, Tuple{b{DataType}}}                                                                                                                            

To make it applicable to the next phi node:

│     %82  = φ (#31 => %79)::Union{Tuple{Nothing}, Tuple{b{DataType}}}
...
│     %86  = φ (#32 => %82, #119 => %286)::Tuple{b{DataType}}

What likely happened here is that the optimizer generates code it knows is illegal (moving non-existent values into places they cannot go), but which it wants to be valid at runtime (since we promise we won't observe the value later, codegen is expected to simply discard the impossible)

julia> begin
       struct b{c}
           ch::c
       end
       e(i) = (i,)
       h = b(Int), nothing
       end
(b{DataType}(Int64), nothing)

julia> # collect(Base.Generator(e, Base.Iterators.Filter(isnothing, h)))

julia> code_llvm(Base.grow_to!, (Array{Tuple{Nothing}, 1}, Base.Generator{Base.Iterators.Filter{typeof(isnothing), typeof(h)}, typeof(e)}, Int64), raw=true, optimize=false)

@vtjnash vtjnash added the compiler:codegen Generation of LLVM IR and native code label Mar 8, 2022
@KristofferC
Copy link
Member Author

Bump. @vtjnash (or someone else) did you manage to get any further with this?

@vtjnash
Copy link
Member

vtjnash commented Apr 30, 2022

Yes, I wrote a fix for this

vtjnash added a commit that referenced this issue May 2, 2022
The optimization pass often uses values for phi values (and thus by
extension, also for pi, phic and upsilon values) that are invalid. We
make sure that these have a null pointer, so that we can detect that
case at runtime (at the cost of slightly worse code generation for
them), but it means we need to be very careful to check for that.

This is identical to #39747, which added the equivalent code to the
other side of the conditional there, but missed some additional
relevant, but rare, cases that are observed to be possible.

The `emit_isa_and_defined` is derived from the LLVM name for this
operation: `isa_and_nonnull<T>`.

Secondly, we also optimize `emit_unionmove` to change a bad IR case to a
better IR form.

Fix #44501
KristofferC pushed a commit that referenced this issue May 4, 2022
vtjnash added a commit that referenced this issue May 9, 2022
The optimization pass often uses values for phi values (and thus by
extension, also for pi, phic and upsilon values) that are invalid. We
make sure that these have a null pointer, so that we can detect that
case at runtime (at the cost of slightly worse code generation for
them), but it means we need to be very careful to check for that.

This is identical to #39747, which added the equivalent code to the
other side of the conditional there, but missed some additional
relevant, but rare, cases that are observed to be possible.

The `emit_isa_and_defined` is derived from the LLVM name for this
operation: `isa_and_nonnull<T>`.

Secondly, we also optimize `emit_unionmove` to change a bad IR case to a
better IR form.

Fix #44501
oscardssmith pushed a commit that referenced this issue May 11, 2022
The optimization pass often uses values for phi values (and thus by
extension, also for pi, phic and upsilon values) that are invalid. We
make sure that these have a null pointer, so that we can detect that
case at runtime (at the cost of slightly worse code generation for
them), but it means we need to be very careful to check for that.

This is identical to #39747, which added the equivalent code to the
other side of the conditional there, but missed some additional
relevant, but rare, cases that are observed to be possible.

The `emit_isa_and_defined` is derived from the LLVM name for this
operation: `isa_and_nonnull<T>`.

Secondly, we also optimize `emit_unionmove` to change a bad IR case to a
better IR form.

Fix #44501
KristofferC pushed a commit that referenced this issue May 16, 2022
The optimization pass often uses values for phi values (and thus by
extension, also for pi, phic and upsilon values) that are invalid. We
make sure that these have a null pointer, so that we can detect that
case at runtime (at the cost of slightly worse code generation for
them), but it means we need to be very careful to check for that.

This is identical to #39747, which added the equivalent code to the
other side of the conditional there, but missed some additional
relevant, but rare, cases that are observed to be possible.

The `emit_isa_and_defined` is derived from the LLVM name for this
operation: `isa_and_nonnull<T>`.

Secondly, we also optimize `emit_unionmove` to change a bad IR case to a
better IR form.

Fix #44501

(cherry picked from commit 72b80e2)
KristofferC pushed a commit that referenced this issue May 18, 2022
The optimization pass often uses values for phi values (and thus by
extension, also for pi, phic and upsilon values) that are invalid. We
make sure that these have a null pointer, so that we can detect that
case at runtime (at the cost of slightly worse code generation for
them), but it means we need to be very careful to check for that.

This is identical to #39747, which added the equivalent code to the
other side of the conditional there, but missed some additional
relevant, but rare, cases that are observed to be possible.

The `emit_isa_and_defined` is derived from the LLVM name for this
operation: `isa_and_nonnull<T>`.

Secondly, we also optimize `emit_unionmove` to change a bad IR case to a
better IR form.

Fix #44501

(cherry picked from commit 72b80e2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior compiler:codegen Generation of LLVM IR and native code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants