decode in Julia #4

Merged: vilterp merged 19 commits into pv-alloc-profile-flat-stacks on Dec 10, 2021

Owner

@vilterp vilterp commented Dec 7, 2021

based on JuliaLang#42768

also

  • delete more serialization code
  • start working on decoding in Julia

@vilterp changed the title from "try using unique_ptr" to "decode in Julia" on Dec 7, 2021
for i in 1:raw_results.num_allocs
raw_alloc = unsafe_load(raw_results.allocs, i)
push!(out, Alloc(
# unsafe_pointer_to_objref(convert(Ptr{Any}, raw_alloc.type)),
Owner Author

this seems to work some of the time, but segfault some of the time (on pointers that look fine…)

};
}

JL_DLLEXPORT void jl_free_alloc_profile() {
Owner Author

@NHDaly I wrote this in case people don't like the C++ destructors… 🤷‍♂️

Collaborator

Alas, too bad. But yeah, I think that makes sense. It's definitely worse, IMO, but it's more standard for julia's C/C++ style. 🤷

end

function decode(raw_results::RawAllocResults)::Vector{Alloc}
out = Vector{Alloc}()

Suggested change
- out = Vector{Alloc}()
+ out = sizehint!(Vector{Alloc}(), raw_results.num_allocs)

:) This is why i tend to prefer the list-comprehension style, since it does things like this for you. But doing it explicitly works too 👍

Comment on lines -257 to +76
- static jl_bt_element_t bt_data[JL_MAX_BT_SIZE];
+ jl_bt_element_t *bt_data = (jl_bt_element_t*) malloc(JL_MAX_BT_SIZE);
Collaborator

Oh... weird... i don't know why i thought this would work before.. we were using the same bt_data pointer all the time and trying to free it? That should have been crashing. I guess it probably was when you tried it? It was a silly mistake. :)

I had misunderstood how this worked when you explained it; i thought you'd only be using this temporarily, and then reading or copying from it. 👍 Good catch.

Owner Author

It was originally a malloc, then we changed it to a static buffer during our session (when we were thinking of using one big buffer for all the backtraces), then we forgot to change it back 😛

Comment on lines +76 to 79
jl_bt_element_t *bt_data = (jl_bt_element_t*) malloc(JL_MAX_BT_SIZE);

// TODO: tune the number of frames that are skipped
size_t bt_size = rec_backtrace(bt_data, JL_MAX_BT_SIZE, 1);
Collaborator

Though it would probably be nice to not be allocating such a large backtrace every time when we only need a small one.. Maybe we could still record the backtrace onto a single static large one, and then malloc and copy only the bt_size elements?

Maybe something like this?:

Suggested change
- jl_bt_element_t *bt_data = (jl_bt_element_t*) malloc(JL_MAX_BT_SIZE);
- // TODO: tune the number of frames that are skipped
- size_t bt_size = rec_backtrace(bt_data, JL_MAX_BT_SIZE, 1);
+ static jl_bt_element_t bt_data_buffer[JL_MAX_BT_SIZE];
+ // TODO: tune the number of frames that are skipped
+ size_t bt_size = rec_backtrace(bt_data_buffer, JL_MAX_BT_SIZE, 1);
+ jl_bt_element_t *bt_data = (jl_bt_element_t*) malloc(bt_size * sizeof(jl_bt_element_t));
+ std::copy(bt_data_buffer, bt_data_buffer + bt_size, bt_data);

Something like that?

Collaborator

This way you're only allocating ~800 bytes (assuming ~100 frames) instead of 80,000 bytes?
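
The suggestion above (record into one large scratch buffer, then copy only the frames that were written) could look roughly like the sketch below. It is a standalone illustration, not the runtime's code: `bt_element_t`, `MAX_BT_SIZE`, `fake_rec_backtrace`, and `OwnedBacktrace` are hypothetical stand-ins, and the fake recorder always reports 100 frames. With 8-byte elements that comes to about 800 bytes per stored backtrace.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-ins for the runtime's element type and size limit.
using bt_element_t = std::uintptr_t;
constexpr std::size_t MAX_BT_SIZE = 80000;

// Hypothetical recorder: writes up to `max` frames and returns how many it wrote.
// For illustration it always pretends the stack is 100 frames deep.
std::size_t fake_rec_backtrace(bt_element_t *out, std::size_t max) {
    std::size_t n = std::min<std::size_t>(100, max);
    for (std::size_t i = 0; i < n; i++)
        out[i] = 0x1000 + i;
    return n;
}

struct OwnedBacktrace {
    bt_element_t *data;  // malloc'd; the caller releases it with free()
    std::size_t size;    // number of elements, not bytes
};

// Record into a single large scratch buffer, then allocate and copy only the
// `n` elements that were actually written.
// Assumption: single-threaded use; a shared static buffer is not thread-safe.
OwnedBacktrace record_compact_backtrace() {
    static bt_element_t scratch[MAX_BT_SIZE];
    std::size_t n = fake_rec_backtrace(scratch, MAX_BT_SIZE);
    assert(n <= MAX_BT_SIZE);
    auto *copy = static_cast<bt_element_t *>(std::malloc(n * sizeof(bt_element_t)));
    if (copy == nullptr && n > 0)
        return OwnedBacktrace{nullptr, 0};  // allocation failed
    std::copy(scratch, scratch + n, copy);
    return OwnedBacktrace{copy, n};
}
```

Note that the allocation size is `n * sizeof(bt_element_t)`, i.e. a byte count derived from the element count, which is the detail the original `malloc(JL_MAX_BT_SIZE)` call leaves out.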

kind of annoying, but it doesn't seem we can reliably load them from
the addresses without segfaulting
but I think the problem is that the strings are being GC'd

cuz they're not rooted
this works, but all the strings leak
ugh C is hard
Comment on lines 104 to 118
// package up type names
results.num_type_names = g_alloc_profile.type_name_by_address.size();
// TODO: free this malloc
results.type_names = (TypeNamePair*) malloc(sizeof(TypeNamePair) * results.num_type_names);
int i = 0;
for (auto type_addr_name : g_alloc_profile.type_name_by_address) {
jl_printf(JL_STDERR, "type name: %s\n", type_addr_name.second.c_str());
// TODO: free this malloc
char *name = (char *) malloc(type_addr_name.second.length() + 1);
memcpy(name, type_addr_name.second.c_str(), type_addr_name.second.length() + 1);
results.type_names[i++] = TypeNamePair{
type_addr_name.first,
name,
};
}
Collaborator

Why do we compute the type names in C++? Can't we do this on the julia side, now? If we just store the type's pointer?

Or is this the thing that was segfaulting when you tried it, and you didn't know why?

Collaborator

If we need to go with this approach, one option is we can construct julia strings here, that will be managed by the GC, so they never need to be freed? That way they can be passed directly to the julia side.

Owner Author

@vilterp vilterp Dec 8, 2021

yeah this is the thing that was segfaulting :/ I tried just passing the pointer to the type back to the Julia side, then unsafe_pointer_to_objref it in decode. But, sadly, that would segfault sometimes, on pointers that looked fine…

I can show you, so you can sanity check me… I would very much prefer to just pass back type pointers and not do this.

And yeah, I initially tried Julia strings, but that didn't work either (I think when I accessed them they were either garbage or segfaulted). This seemed to be because they were getting GC'd, because while I could construct them here, I couldn't figure out how to tell the GC to root them.

Anyway yeah. Tim's PR passes back type pointers and unsafe_pointer_to_objref, but I've never been able to build it, so I'm not sure if that PR has the same segfault problem…

Collaborator

🤔 oh, i see you were trying that in a previous commit.. I think maybe you just weren't using the right string constructor, and you were creating like a view over the char* instead of copying the data out of the char*. I think we should be able to get that to work.....


yeah, weird. it really shouldn't segfault. i think that should be fine. I'd be interested to work through it with you! 👍

Owner Author

yeah, probably worth another try before we dig deeper into string stuff
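
If the C-string approach from the diff above is kept, the two `TODO: free this malloc` notes could be addressed by pairing the packing code with one release function. The sketch below is only an illustration under that assumption: `TypeNameTable`, `pack_type_names`, and `free_type_names` are hypothetical names, and `TypeNamePair` is a simplified mirror of the struct in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Simplified, hypothetical mirror of the TypeNamePair struct in the diff.
struct TypeNamePair {
    std::size_t addr;
    char *name;  // owned, NUL-terminated, malloc'd
};

struct TypeNameTable {
    TypeNamePair *pairs;
    std::size_t count;
};

// Copy each std::string into its own malloc'd buffer so the table stays valid
// after the source map is cleared or destroyed.
// Sketch only: allocation failure is checked with assert rather than handled.
TypeNameTable pack_type_names(const std::map<std::size_t, std::string> &names) {
    TypeNameTable table{nullptr, 0};
    if (names.empty())
        return table;
    table.pairs = static_cast<TypeNamePair *>(
        std::malloc(sizeof(TypeNamePair) * names.size()));
    assert(table.pairs != nullptr);
    for (const auto &entry : names) {
        std::size_t len = entry.second.length() + 1;  // include the trailing NUL
        char *copy = static_cast<char *>(std::malloc(len));
        assert(copy != nullptr);
        std::memcpy(copy, entry.second.c_str(), len);
        table.pairs[table.count++] = TypeNamePair{entry.first, copy};
    }
    return table;
}

// Paired release function: frees every name first, then the array itself.
void free_type_names(TypeNameTable &table) {
    for (std::size_t i = 0; i < table.count; i++)
        std::free(table.pairs[i].name);
    std::free(table.pairs);
    table.pairs = nullptr;
    table.count = 0;
}
```

The caller on the other side of the FFI boundary would have to copy the names out before the release function runs; otherwise it ends up holding dangling pointers.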

like the Profile does, and the C++ side did before

wonder if we can move this into stacktraces.jl or something to
dedup with Profile
Owner Author

vilterp commented Dec 8, 2021

Added caching of stack frame lookups; now it's reasonably fast.

That was a lot easier to write in Julia than it was in C++ 😂
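
For comparison, the same caching idea in C++ might be sketched as a small memoizing wrapper around the slow lookup. The PR's actual cache lives on the Julia side and is not shown here; `FrameLookupCache` and the string-valued lookup are hypothetical and only illustrate the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Memoizes an expensive "instruction pointer -> frame description" lookup.
class FrameLookupCache {
public:
    using Lookup = std::function<std::string(std::uintptr_t)>;

    explicit FrameLookupCache(Lookup slow_lookup)
        : slow_lookup_(std::move(slow_lookup)) {
        assert(slow_lookup_ != nullptr);
    }

    // Returns the cached description; the slow lookup runs only on a miss.
    const std::string &get(std::uintptr_t ip) {
        auto it = cache_.find(ip);
        if (it == cache_.end()) {
            misses_++;
            it = cache_.emplace(ip, slow_lookup_(ip)).first;
        }
        return it->second;
    }

    std::size_t misses() const { return misses_; }

private:
    Lookup slow_lookup_;
    std::unordered_map<std::uintptr_t, std::string> cache_;
    std::size_t misses_ = 0;
};
```

Because many allocations share the same call sites, each distinct instruction pointer is resolved once and every later occurrence is a hash-map hit.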

Owner Author

vilterp commented Dec 9, 2021

Now I'm seeing segfaults while just looking up stack trace frames (don't know why I wasn't seeing that before). I'm afraid that the objects pointed to by backtrace entries are getting GC'd before we can decode the backtrace.

It seems like we might need to call GC.@preserve, like the CPU profiler and Tim's memory profiler do: https://github.com/JuliaLang/julia/blob/c041759ab381360d01c1a9faa5ae226437c31470/stdlib/Profile/src/Profile.jl#L525

Maaaybe it'd even help with the types segfaulting? Idk. I'll have to try it out.

also change Julia representation of raw backtrace elements,
though I'm not sure it was breaking anything
Owner Author

@vilterp vilterp left a comment

Welp, GC.@preserve fixed the segfaults during the stack trace decoding phase (caused by stuff that was pointed to by the stack traces being GC'd), but I'm still seeing segfaults while loading types; see below…


# loading anything below this seems to segfault
# TODO: find out what's going on
TYPE_PTR_THRESHOLD = 0x0000000100000000
Owner Author

@vilterp vilterp Dec 9, 2021

Temporarily "fixed" the segfaults with this threshold check, which is hit for 2 out of ~65 types returned by my test workload (https://github.com/vilterp/HeapSnapshotParser.jl)

vilterp pushed a commit that referenced this pull request Dec 9, 2021
…43226)

In order to allow `Argument`s to be printed nicely.

> before
```julia
julia> code_typed((Float64,)) do x
           sin(x)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = invoke Main.sin(_2::Float64)::Float64
└──      return %1
) => Float64

julia> code_typed((Bool,Any,Any)) do c, x, y
           z = c ? x : y
           z
       end
1-element Vector{Any}:
 CodeInfo(
1 ─      goto #3 if not c
2 ─      goto #4
3 ─      nothing::Nothing
4 ┄ %4 = φ (#2 => _3, #3 => _4)::Any
└──      return %4
) => Any
```

> after
```julia
julia> code_typed((Float64,)) do x
           sin(x)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = invoke Main.sin(x::Float64)::Float64
└──      return %1
) => Float64

julia> code_typed((Bool,Any,Any)) do c, x, y
           z = c ? x : y
           z
       end
1-element Vector{Any}:
 CodeInfo(
1 ─      goto #3 if not c
2 ─      goto #4
3 ─      nothing::Nothing
4 ┄ %4 = φ (#2 => x, #3 => y)::Any
└──      return %4
) => Any
```
Owner Author

vilterp commented Dec 9, 2021

Seen while testing this on a larger code base: got stuck running GC finalizers, seemingly forever, while decoding the response. CPU pegged. Will have to investigate…

Looks like there's a loop from

finalizer -> GC -> alloc -> finalizer -> GC

Owner Author

vilterp commented Dec 10, 2021

note from pairing sesh: let's use jl_backtrace_from_here because it handles interpreter frames better. We'll have to turn allocation tracking off, call it, then turn tracking back on again. It also manages allocating memory for each stack trace, which is better than our malloc(MAX_BT_SIZE) strategy.
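
One way to express the "tracking off, call, tracking back on" step is a small RAII guard, so the previous state is restored on every exit path. The sketch below is a standalone illustration, not the runtime's code: `g_tracking_enabled`, `g_recorded`, `on_alloc`, `TrackingPause`, and `capture_backtrace_without_tracking` are all hypothetical names, and a plain `bool` flag assumes single-threaded use.

```cpp
#include <cassert>
#include <vector>

// Hypothetical global flag standing in for the profiler's "tracking enabled" state.
static bool g_tracking_enabled = true;
// Hypothetical log of the allocation sizes the profiler has recorded.
static std::vector<int> g_recorded;

// RAII guard: disables tracking on construction and restores the previous
// state on destruction, including on early returns.
class TrackingPause {
public:
    TrackingPause() : previous_(g_tracking_enabled) { g_tracking_enabled = false; }
    ~TrackingPause() { g_tracking_enabled = previous_; }
    TrackingPause(const TrackingPause &) = delete;
    TrackingPause &operator=(const TrackingPause &) = delete;

private:
    bool previous_;
};

// Hypothetical allocation hook: records only while tracking is enabled.
void on_alloc(int size) {
    if (!g_tracking_enabled)
        return;
    g_recorded.push_back(size);
}

// Hypothetical backtrace capture that allocates internally. Pausing tracking
// keeps that internal allocation from re-entering the profiler.
int capture_backtrace_without_tracking() {
    TrackingPause pause;
    on_alloc(64);  // ignored: tracking is paused inside this scope
    assert(!g_tracking_enabled);
    return 1;
}
```

A guard like this might also help with re-entrancy loops such as the finalizer -> GC -> alloc cycle mentioned earlier, though that would need to be checked against the real code paths.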

@vilterp vilterp merged commit 10d79a8 into pv-alloc-profile-flat-stacks Dec 10, 2021
vilterp pushed a commit that referenced this pull request Dec 18, 2021
…43226)

@NHDaly NHDaly assigned NHDaly and unassigned NHDaly Dec 20, 2021
vilterp pushed a commit that referenced this pull request Feb 3, 2022
…43226)

vilterp pushed a commit that referenced this pull request Nov 17, 2022
This commit implements a simple optimization within `sroa_mutables!` to
eliminate an `isdefined` call by checking load-forwardability of the field.
This optimization may be especially useful to eliminate the extra allocation
of `Core.Box` involved with a capturing closure, e.g.:
```julia
julia> callit(f, args...) = f(args...);

julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;

julia> code_typed(isdefined_elim)
```
```diff
diff --git a/_master.jl b/_pr.jl
index 3aa40ba20e5..11eccf65f32 100644
--- a/_master.jl
+++ b/_pr.jl
@@ -1,15 +1,8 @@
 1-element Vector{Any}:
  CodeInfo(
-1 ─ %1  = Core.Box::Type{Core.Box}
-│   %2  = %new(%1)::Core.Box
-│   %3  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
-│         Core.setfield!(%2, :contents, %3)::Vector{Any}
-│   %5  = Core.isdefined(%2, :contents)::Bool
-└──       goto #3 if not %5
-2 ─       goto #4
-3 ─       $(Expr(:throw_undef_if_not, :arr, false))::Any
-4 ┄ %9  = Core.getfield(%2, :contents)::Any
-│         Core.typeassert(%9, Vector{Any})::Vector{Any}
-│   %11 = π (%9, Vector{Any})
-└──       return %11
+1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
+└──      goto #3 if not true
+2 ─      goto #4
+3 ─      $(Expr(:throw_undef_if_not, :arr, false))::Any
+4 ┄      return %1
 ) => Vector{Any}
```
vilterp pushed a commit that referenced this pull request Nov 17, 2022
Follows up JuliaLang#44708 -- in that PR I missed the most obvious optimization
opportunity, i.e. we can safely eliminate `isdefined` checks when all
fields are defined at the allocation site.
This change allows us to eliminate capturing closure constructions when
the body and callsite of the capturing closure are available within an optimized
frame, e.g.:
```julia
function abmult(r::Int, x0)
    if r < 0
        r = -r
    end
    f = x -> x * r
    return @inline f(x0)
end
```
```diff
diff --git a/_master.jl b/_pr.jl
index ea06d865b75..c38f221090f 100644
--- a/_master.jl
+++ b/_pr.jl
@@ -1,24 +1,19 @@
 julia> @code_typed abmult(-3, 3)
 CodeInfo(
-1 ── %1  = Core.Box::Type{Core.Box}
-│    %2  = %new(%1, r@_2)::Core.Box
-│    %3  = Core.isdefined(%2, :contents)::Bool
-└───       goto #3 if not %3
+1 ──       goto #3 if not true
 2 ──       goto #4
 3 ──       $(Expr(:throw_undef_if_not, :r, false))::Any
-4 ┄─ %7  = (r@_2 < 0)::Any
-└───       goto #9 if not %7
-5 ── %9  = Core.isdefined(%2, :contents)::Bool
-└───       goto #7 if not %9
+4 ┄─ %4  = (r@_2 < 0)::Any
+└───       goto #9 if not %4
+5 ──       goto #7 if not true
 6 ──       goto #8
 7 ──       $(Expr(:throw_undef_if_not, :r, false))::Any
-8 ┄─ %13 = -r@_2::Any
-9 ┄─ %14 = φ (#4 => r@_2, #8 => %13)::Any
-│    %15 = Core.isdefined(%2, :contents)::Bool
-└───       goto #11 if not %15
+8 ┄─ %9  = -r@_2::Any
+9 ┄─ %10 = φ (#4 => r@_2, #8 => %9)::Any
+└───       goto #11 if not true
 10 ─       goto #12
 11 ─       $(Expr(:throw_undef_if_not, :r, false))::Any
-12 ┄ %19 = (x0 * %14)::Any
+12 ┄ %14 = (x0 * %10)::Any
 └───       goto #13
-13 ─       return %19
+13 ─       return %14
 ) => Any
```
vilterp pushed a commit that referenced this pull request Nov 17, 2022
Currently the optimizer handles an abstract callsite only when there is a
single dispatch candidate (in most cases), and so inlining and static dispatch
are prohibited when the callsite is union-split (in other words, union-splitting
happens only when all the dispatch candidates are concrete).

However, there are certain patterns of code (most notably our Julia-level compiler code)
that inherently need to deal with abstract callsites.
The following example is taken from `Core.Compiler` utility:
```julia
julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name
isType (generic function with 1 method)

julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (x isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = π (x, DataType)
│   %4 = Base.getfield(%3, :name)::Core.TypeName
│   %5 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %6 = (%4 === %5)::Bool
└──      goto #4
3 ─      goto #4
4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool
└──      return %9
) => Bool

julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (isa)(x, Nothing)::Bool
└──      goto #3 if not %1
2 ─      goto #4
3 ─ %4 = Main.isType(x)::Bool
└──      goto #4
4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool
└──      return %6
) => Bool
```
(note that this is a limitation of the inlining algorithm, and so
user-provided hints such as callsite inlining annotations don't help here)

This commit enables inlining and static dispatch for abstract union-split callsite.
The core idea here is that we can simulate our dispatch semantics by
generating `isa` checks in order of the specialities of dispatch candidates:
```julia
julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1  = (isa)(x, Nothing)::Bool
└──       goto #3 if not %1
2 ─       goto #9
3 ─ %4  = (isa)(x, Type)::Bool
└──       goto #8 if not %4
4 ─ %6  = π (x, Type)
│   %7  = (%6 isa Main.DataType)::Bool
└──       goto #6 if not %7
5 ─ %9  = π (%6, DataType)
│   %10 = Base.getfield(%9, :name)::Core.TypeName
│   %11 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %12 = (%10 === %11)::Bool
└──       goto #7
6 ─       goto #7
7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool
└──       goto #9
8 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool
└──       return %19
) => Bool
```

Inlining/static-dispatch of abstract union-split callsites will improve
the performance in such situations (and so this commit will improve the
latency of our JIT compilation). In particular, this commit helps us avoid
excessive specializations of `Core.Compiler` code by statically resolving
`@nospecialize`d callsites, and as a result, the number of precompiled
statements is now reduced from `2005` ([`master`](f782430)) to `1912` (this commit).

Also, as a side effect, the implementation of our inlining algorithm
is now much simpler, since we no longer need the previous special
handling for abstract callsites.

One possible drawback would be increased code size.
This change does seem to increase the size of the sysimage,
but I think these numbers are in an acceptable range:
> [`master`](f782430)
```
❯ du -shk usr/lib/julia/*
17604	usr/lib/julia/corecompiler.ji
194072	usr/lib/julia/sys-o.a
169424	usr/lib/julia/sys.dylib
23784	usr/lib/julia/sys.dylib.dSYM
103772	usr/lib/julia/sys.ji
```

> this commit
```
❯ du -shk usr/lib/julia/*
17512	usr/lib/julia/corecompiler.ji
195588	usr/lib/julia/sys-o.a
170908	usr/lib/julia/sys.dylib
23776	usr/lib/julia/sys.dylib.dSYM
105360	usr/lib/julia/sys.ji
```
vilterp pushed a commit that referenced this pull request Nov 17, 2022
…Lang#45790)

Currently the `@nospecialize`-d `push!(::Vector{Any}, ...)` can only
take a single item and we will end up with runtime dispatch when we try
to call it with multiple items:
```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.append!(a, iter)::Vector{Any}
└──      return %1
) => Vector{Any}
```

This commit adds a new specialization that can take an arbitrary number
of items. Our compiler should still be able to optimize the single-input
case as before via the dispatch mechanism.
```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arraylen(a)::Int64
│         $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000002, 0x0000000000000002))::Nothing
└──       goto #7 if not true
2 ┄ %4  = φ (#1 => 1, #6 => %14)::Int64
│   %5  = φ (#1 => 1, #6 => %15)::Int64
│   %6  = Base.getfield(x, %4, true)::Any
│   %7  = Base.add_int(%1, %4)::Int64
│         Base.arrayset(true, a, %6, %7)::Vector{Any}
│   %9  = (%5 === 2)::Bool
└──       goto #4 if not %9
3 ─       goto #5
4 ─ %12 = Base.add_int(%5, 1)::Int64
└──       goto #5
5 ┄ %14 = φ (#4 => %12)::Int64
│   %15 = φ (#4 => %12)::Int64
│   %16 = φ (#3 => true, #4 => false)::Bool
│   %17 = Base.not_int(%16)::Bool
└──       goto #7 if not %17
6 ─       goto #2
7 ┄       return a
) => Vector{Any}
```

This commit also adds the equivalent implementations for `pushfirst!`.
vilterp pushed a commit that referenced this pull request Nov 17, 2022
When calling `jl_error()` or `jl_errorf()`, we must check to see if we
are so early in the bringup process that it is dangerous to attempt to
construct a backtrace because the data structures used to provide line
information are not properly set up.

This can be easily triggered by running:

```
julia -C invalid
```

On an `i686-linux-gnu` build, this will hit the "Invalid CPU Name"
branch in `jitlayers.cpp`, which calls `jl_errorf()`.  This in turn
calls `jl_throw()`, which will eventually call `jl_DI_for_fptr` as part
of the backtrace printing process, which fails as the object maps are
not fully initialized. See the `gdb` stack trace below for details:

```
$ gdb -batch -ex 'r' -ex 'bt' --args ./julia -C invalid
...
fatal: error thrown and no exception handler available.
ErrorException("Invalid CPU name "invalid".")

Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
1277    /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h: No such file or directory.
 #0  0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
 #1  std::map<unsigned int, JITDebugInfoRegistry::ObjectInfo, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__x=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_map.h:1258
 #2  jl_DI_for_fptr (fptr=4155049385, symsize=symsize@entry=0xffffcfa8, slide=slide@entry=0xffffcfa0, Section=Section@entry=0xffffcfb8, context=context@entry=0xffffcf94) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1181
 #3  0xf75c056a in jl_getFunctionInfo_impl (frames_out=0xffffd03c, pointer=4155049385, skipC=0, noInline=0) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1210
 #4  0xf7a6ca98 in jl_print_native_codeloc (ip=4155049385) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:636
 #5  0xf7a6cd54 in jl_print_bt_entry_codeloc (bt_entry=0xf0798018) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:657
 #6  jlbacktrace () at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:1090
 #7  0xf7a3cd2b in ijl_no_exc_handler (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:605
 #8  0xf7a3d10a in throw_internal (ct=ct@entry=0xf070c010, exception=<optimized out>, exception@entry=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:638
 #9  0xf7a3d330 in ijl_throw (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:654
 #10 0xf7a905aa in ijl_errorf (fmt=fmt@entry=0xf7647cd4 "Invalid CPU name \"%s\".") at /cache/build/default-amdci5-4/julialang/julia-master/src/rtutils.c:77
 #11 0xf75a4b22 in (anonymous namespace)::createTargetMachine () at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:823
 #12 JuliaOJIT::JuliaOJIT (this=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:1044
 #13 0xf7531793 in jl_init_llvm () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8585
 #14 0xf75318a8 in jl_init_codegen_impl () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8648
 #15 0xf7a51a52 in jl_restore_system_image_from_stream (f=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2131
 #16 0xf7a55c03 in ijl_restore_system_image_data (buf=0xe859c1c0 <jl_system_image_data> "8'\031\003", len=125161105) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2184
 #17 0xf7a55cf9 in jl_load_sysimg_so () at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:424
 #18 ijl_restore_system_image (fname=0x80a0900 "/build/bk_download/julia-d78fdad601/lib/julia/sys.so") at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2157
 #19 0xf7a3bdfc in _finish_julia_init (rel=rel@entry=JL_IMAGE_JULIA_HOME, ct=<optimized out>, ptls=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:741
 #20 0xf7a3c8ac in julia_init (rel=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:728
 #21 0xf7a7f61d in jl_repl_entrypoint (argc=<optimized out>, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/src/jlapi.c:705
 #22 0x080490a7 in main (argc=3, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/cli/loader_exe.c:59
```

To prevent this, we simply avoid calling `jl_errorf` this early in the
process, punting the problem to a later PR that can update guard
conditions within `jl_error*`.
vilterp pushed a commit that referenced this pull request Nov 18, 2022
(cherry picked from commit 21ab24e)
2 participants