Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch2 #3

Merged
merged 17 commits into from
Jul 5, 2021
Merged

Patch2 #3

merged 17 commits into from
Jul 5, 2021

Conversation

N5N3
Copy link
Owner

@N5N3 N5N3 commented Jul 5, 2021

No description provided.

steffzahn and others added 17 commits July 2, 2021 08:17
This seems to be a fairly arbitrary case for throwing exceptions, when
the user might often use this value in arithmetic afterwards, which is
not checked. It leads to awkward complexity in the API however, where it
may be unclear which function to reach for, with no particular
justification for why a particular usage is "safe". And it inhibits
optimization and performance due to the additional checks and error
cases (and is not even entirely type-stable).
Move some copyto! definitions to other place.
These definitions need @simd.
@N5N3 N5N3 merged commit b69943a into master Jul 5, 2021
N5N3 pushed a commit that referenced this pull request Oct 5, 2021
…tional` (JuliaLang#42434)

Otherwise, in rare cases, we may see this sort of weird behavior:
```julia
julia> @eval edgecase(_) = $(Core.Compiler.InterConditional(2, Int, Any))
edgecase (generic function with 1 method)

julia> code_typed((Any,)) do x
           edgecase(x) ? x : nothing
       end
1-element Vector{Any}:
 CodeInfo(
1 ─     goto #3 if not $(QuoteNode(Core.InterConditional(2, Int64, Any)))
2 ─     return x
3 ─     return Main.nothing
) => Any

julia> code_typed((Any,)) do x
           edgecase(x) ? x : nothing
       end
1-element Vector{Any}:
 CodeInfo(
1 ─      goto #3 if not $(QuoteNode(Core.InterConditional(2, Int64, Any)))
2 ─ %2 = π (x, Int64)
└──      return %2
3 ─      return Main.nothing
) => Union{Nothing, Int64}
```
N5N3 added a commit that referenced this pull request Oct 29, 2021
commit c054dbc
Author: Shuhei Kadowaki <[email protected]>
Date:   Fri Oct 29 01:31:55 2021 +0900

    optimizer: eliminate allocations (JuliaLang#42833)

commit 6a9737d
Author: Jeff Bezanson <[email protected]>
Date:   Thu Oct 28 12:23:53 2021 -0400

    fix JuliaLang#42659, move `jl_coverage_visit_line` to runtime library (JuliaLang#42810)

commit c762f10
Author: Marc Ittel <[email protected]>
Date:   Thu Oct 28 12:19:13 2021 +0200

    change `julia` to `julia-repl` in docstrings (JuliaLang#42824)

    Co-authored-by: Michael Abbott <[email protected]>

commit 9f52ec0
Author: Dilum Aluthge <[email protected]>
Date:   Thu Oct 28 05:30:11 2021 -0400

    CI (Buildkite): Update all rootfs images to the latest versions (JuliaLang#42802)

    * CI (Buildkite): Update all rootfs images to the latest versions

    * Re-sign all of the signed pipelines

commit 404e584
Author: DilumAluthgeBot <[email protected]>
Date:   Wed Oct 27 21:11:04 2021 -0400

    🤖 Bump the Statistics stdlib from 74897fe to 5256d57 (JuliaLang#42826)

    Co-authored-by: Dilum Aluthge <[email protected]>

commit c74814e
Author: Jeff Bezanson <[email protected]>
Date:   Wed Oct 27 16:34:46 2021 -0400

    reset `RandomDevice` file from `__init__` (JuliaLang#42537)

    This prevents us from seeing an invalid `IOStream` object from a saved
    system image, and also ensures the files are opened once for all
    threads.

commit 05ed348
Author: Jeff Bezanson <[email protected]>
Date:   Wed Oct 27 15:24:17 2021 -0400

    only visit nonfunction_mt once when traversing method tables (JuliaLang#42821)

commit d71b77d
Author: DilumAluthgeBot <[email protected]>
Date:   Tue Oct 26 20:39:08 2021 -0400

    🤖 Bump the Downloads stdlib from 5f1509d to dbb0625 (JuliaLang#42811)

    Co-authored-by: Dilum Aluthge <[email protected]>

commit b4fddc1
Author: DilumAluthgeBot <[email protected]>
Date:   Tue Oct 26 14:46:20 2021 -0400

    🤖 Bump the Pkg stdlib from bc32103f to 26918395 (JuliaLang#42806)

    Co-authored-by: Dilum Aluthge <[email protected]>

commit 6a386de
Author: Dilum Aluthge <[email protected]>
Date:   Tue Oct 26 12:15:51 2021 -0400

    CI (Buildkite): make sure to hit ignore any unencrypted repo keys, regardless of where they are located in the repository (JuliaLang#42803)

commit 021a6b5
Author: Shuhei Kadowaki <[email protected]>
Date:   Wed Oct 27 01:08:33 2021 +0900

    optimizer: clean up inlining test code (JuliaLang#42804)

commit 16eb196
Merge: 21ebabf 1510eaa
Author: Shuhei Kadowaki <[email protected]>
Date:   Tue Oct 26 23:25:41 2021 +0900

    Merge pull request JuliaLang#42766 from JuliaLang/avi/42754

    optimizer: fix JuliaLang#42754, inline union-split const-prop'ed sources

commit 21ebabf
Author: Kristoffer Carlsson <[email protected]>
Date:   Tue Oct 26 16:11:32 2021 +0200

    simplify code loading test now that TOML files are parsed with a real TOML parser (JuliaLang#42328)

commit 1510eaa
Author: Shuhei Kadowaki <[email protected]>
Date:   Mon Oct 25 01:35:12 2021 +0900

    optimizer: fix JuliaLang#42754, inline union-split const-prop'ed sources

    This commit complements JuliaLang#39754 and JuliaLang#39305: implements a logic to use
    constant-prop'ed results for inlining at union-split callsite.
    Currently it works only for cases when constant-prop' succeeded for all
    (union-split) signatures.

    > example
    ```julia
    julia> mutable struct X
               # NOTE in order to confuse `fieldtype_tfunc`, we need to have at least two fields with different types
               a::Union{Nothing, Int}
               b::Symbol
           end;

    julia> code_typed((X, Union{Nothing,Int})) do x, a
               # this `setproperty` call would be union-split and constant-prop will happen for
               # each signature: inlining would fail if we don't use constant-prop'ed source
               # since the approximated inlining cost of `convert(fieldtype(X, sym), a)` would
               # end up very high if we don't propagate `sym::Const(:a)`
               x.a = a
               x
           end |> only |> first
    ```

    > before this commit
    ```julia
    CodeInfo(
    1 ─ %1 = Base.setproperty!::typeof(setproperty!)
    │   %2 = (isa)(a, Nothing)::Bool
    └──      goto #3 if not %2
    2 ─ %4 = π (a, Nothing)
    │        invoke %1(_2::X, 🅰️:Symbol, %4::Nothing)::Any
    └──      goto #6
    3 ─ %7 = (isa)(a, Int64)::Bool
    └──      goto #5 if not %7
    4 ─ %9 = π (a, Int64)
    │        invoke %1(_2::X, 🅰️:Symbol, %9::Int64)::Any
    └──      goto #6
    5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
    └──      unreachable
    6 ┄      return x
    )
    ```

    > after this commit
    ```julia
    CodeInfo(
    1 ─ %1 = (isa)(a, Nothing)::Bool
    └──      goto #3 if not %1
    2 ─      Base.setfield!(x, :a, nothing)::Nothing
    └──      goto #6
    3 ─ %5 = (isa)(a, Int64)::Bool
    └──      goto #5 if not %5
    4 ─ %7 = π (a, Int64)
    │        Base.setfield!(x, :a, %7)::Int64
    └──      goto #6
    5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
    └──      unreachable
    6 ┄      return x
    )
    ```

commit 4c3ae20
Author: Chris Foster <[email protected]>
Date:   Tue Oct 26 21:48:32 2021 +1000

    Make Base.ifelse a generic function (JuliaLang#37343)

    Allow user code to directly extend `Base.ifelse` rather than needing a
    special package for it.

commit 2e388e3
Author: Shuhei Kadowaki <[email protected]>
Date:   Mon Oct 25 01:30:09 2021 +0900

    optimizer: eliminate excessive specialization in inlining code

    This commit includes several code quality improvements in inlining code:
    - eliminate excessive specializations around:
      * `item::Pair{Any, Any}` constructions
      * iterations on `Vector{Pair{Any, Any}}`
    - replace `Pair{Any, Any}` with new, more explicit data type `InliningCase`
    - remove dead code
N5N3 pushed a commit that referenced this pull request Oct 29, 2021
This commit complements JuliaLang#39754 and JuliaLang#39305: implements a logic to use
constant-prop'ed results for inlining at union-split callsite.
Currently it works only for cases when constant-prop' succeeded for all
(union-split) signatures.

> example
```julia
julia> mutable struct X
           # NOTE in order to confuse `fieldtype_tfunc`, we need to have at least two fields with different types
           a::Union{Nothing, Int}
           b::Symbol
       end;

julia> code_typed((X, Union{Nothing,Int})) do x, a
           # this `setproperty` call would be union-split and constant-prop will happen for
           # each signature: inlining would fail if we don't use constant-prop'ed source
           # since the approximated inlining cost of `convert(fieldtype(X, sym), a)` would
           # end up very high if we don't propagate `sym::Const(:a)`
           x.a = a
           x
       end |> only |> first
```

> before this commit
```julia
CodeInfo(
1 ─ %1 = Base.setproperty!::typeof(setproperty!)
│   %2 = (isa)(a, Nothing)::Bool
└──      goto #3 if not %2
2 ─ %4 = π (a, Nothing)
│        invoke %1(_2::X, 🅰️:Symbol, %4::Nothing)::Any
└──      goto #6
3 ─ %7 = (isa)(a, Int64)::Bool
└──      goto #5 if not %7
4 ─ %9 = π (a, Int64)
│        invoke %1(_2::X, 🅰️:Symbol, %9::Int64)::Any
└──      goto #6
5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──      unreachable
6 ┄      return x
)
```

> after this commit
```julia
CodeInfo(
1 ─ %1 = (isa)(a, Nothing)::Bool
└──      goto #3 if not %1
2 ─      Base.setfield!(x, :a, nothing)::Nothing
└──      goto #6
3 ─ %5 = (isa)(a, Int64)::Bool
└──      goto #5 if not %5
4 ─ %7 = π (a, Int64)
│        Base.setfield!(x, :a, %7)::Int64
└──      goto #6
5 ─      Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──      unreachable
6 ┄      return x
)
```
N5N3 pushed a commit that referenced this pull request Nov 28, 2021
…43226)

In order to allow `Argument`s to be printed nicely.

> before
```julia
julia> code_typed((Float64,)) do x
           sin(x)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = invoke Main.sin(_2::Float64)::Float64
└──      return %1
) => Float64

julia> code_typed((Bool,Any,Any)) do c, x, y
           z = c ? x : y
           z
       end
1-element Vector{Any}:
 CodeInfo(
1 ─      goto #3 if not c
2 ─      goto #4
3 ─      nothing::Nothing
4 ┄ %4 = φ (#2 => _3, #3 => _4)::Any
└──      return %4
) => Any
```

> after
```julia
julia> code_typed((Float64,)) do x
           sin(x)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = invoke Main.sin(x::Float64)::Float64
└──      return %1
) => Float64

julia> code_typed((Bool,Any,Any)) do c, x, y
           z = c ? x : y
           z
       end
1-element Vector{Any}:
 CodeInfo(
1 ─      goto #3 if not c
2 ─      goto #4
3 ─      nothing::Nothing
4 ┄ %4 = φ (#2 => x, #3 => y)::Any
└──      return %4
) => Any
```
N5N3 pushed a commit that referenced this pull request Mar 25, 2022
This commit implements a simple optimization within `sroa_mutables!` to
eliminate `isdefined` call by checking load-forwardability of the field.
This optimization may be especially useful to eliminate extra allocation
of `Core.Box` involved with a capturing closure, e.g.:
```julia
julia> callit(f, args...) = f(args...);

julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;

julia> code_typed(isdefined_elim)
```
```diff
diff --git a/_master.jl b/_pr.jl
index 3aa40ba20e5..11eccf65f32 100644
--- a/_master.jl
+++ b/_pr.jl
@@ -1,15 +1,8 @@
 1-element Vector{Any}:
  CodeInfo(
-1 ─ %1  = Core.Box::Type{Core.Box}
-│   %2  = %new(%1)::Core.Box
-│   %3  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
-│         Core.setfield!(%2, :contents, %3)::Vector{Any}
-│   %5  = Core.isdefined(%2, :contents)::Bool
-└──       goto #3 if not %5
-2 ─       goto #4
-3 ─       $(Expr(:throw_undef_if_not, :arr, false))::Any
-4 ┄ %9  = Core.getfield(%2, :contents)::Any
-│         Core.typeassert(%9, Vector{Any})::Vector{Any}
-│   %11 = π (%9, Vector{Any})
-└──       return %11
+1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
+└──      goto #3 if not true
+2 ─      goto #4
+3 ─      $(Expr(:throw_undef_if_not, :arr, false))::Any
+4 ┄      return %1
 ) => Vector{Any}
```
N5N3 pushed a commit that referenced this pull request Mar 25, 2022
In JuliaLang#44635, we observe that occasionally a call to
`view(::SubArray, ::Colon, ...)` dispatches to the
wrong function. The post-inlining IR is in relevant part:

```
│   │ %8   = (isa)(I, Tuple{Colon, UnitRange{Int64}, SubArray{Int64, 2, UnitRange{Int64}, Tuple{Matrix{Int64}}, false}})::Bool
└───│        goto #3 if not %8
2 ──│ %10  = π (I, Tuple{Colon, UnitRange{Int64}, SubArray{Int64, 2, UnitRange{Int64}, Tuple{Matrix{Int64}}, false}})
│   │ @ indices.jl:324 within `to_indices` @ multidimensional.jl:859
│   │┌ @ multidimensional.jl:864 within `uncolon`
│   ││┌ @ indices.jl:351 within `Slice` @ indices.jl:351
│   │││ %11  = %new(Base.Slice{Base.OneTo{Int64}}, %7)::Base.Slice{Base.OneTo{Int64}}
│   │└└
│   │┌ @ essentials.jl:251 within `tail`
│   ││ %12  = Core.getfield(%10, 2)::UnitRange{Int64}
│   ││ %13  = Core.getfield(%10, 3)::SubArray{Int64, 2, UnitRange{Int64}, Tuple{Matrix{Int64}}, false}
│   │└
│   │ @ indices.jl:324 within `to_indices`
└───│        goto #5
    │ @ indices.jl:324 within `to_indices` @ indices.jl:333
    │┌ @ tuple.jl:29 within `getindex`
3 ──││ %15  = Base.getfield(I, 1, true)::Function
│   │└
│   │        invoke Base.to_index(A::SubArray{Int64, 3, Array{Int64, 3}, Tuple{Vector{Int64}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, false}, %15::Function)::Union{}
```

Here we expect the `isa` at `%8` to always be [1]. However,
we seemingly observe the result that the branch is not taken
and we instead end up in the fallback `to_index`, which (correctly)
complains that the colon should have been dereferenced to
an index.

After some investigation of the relevant rr trace, what turns out
to happen here is that the va tuple we compute in codegen gets
garbage collected before the call to `emit_isa`, causing a use-after-free
read, which happens to make `emit_isa` think that the isa condition
is impossible, causing it to fold the branch away.

The fix is to simply add the relevant GC root. It's a bit unfortunate that this
wasn't caught by the GC verifier. It would have in principle been capable of doing
so, but it is currently disabled for C++ sources. It would be worth revisiting
this in the future to see if it can't be made to work.

Fixes JuliaLang#44635.

[1] The specialization heuristics decided to widen `Colon` to `Function`,
which doesn't make much sense here, but regardless, it shouldn't
crash.
N5N3 pushed a commit that referenced this pull request Mar 28, 2022
Follows up JuliaLang#44708 -- in that PR I missed the most obvious optimization
opportunity, i.e. we can safely eliminate `isdefined` checks when all
fields are defined at allocation site.
This change allows us to eliminate capturing closure constructions when
the body and callsite of capture closure is available within a optimized
frame, e.g.:
```julia
function abmult(r::Int, x0)
    if r < 0
        r = -r
    end
    f = x -> x * r
    return @inline f(x0)
end
```
```diff
diff --git a/_master.jl b/_pr.jl
index ea06d865b75..c38f221090f 100644
--- a/_master.jl
+++ b/_pr.jl
@@ -1,24 +1,19 @@
 julia> @code_typed abmult(-3, 3)
 CodeInfo(
-1 ── %1  = Core.Box::Type{Core.Box}
-│    %2  = %new(%1, r@_2)::Core.Box
-│    %3  = Core.isdefined(%2, :contents)::Bool
-└───       goto #3 if not %3
+1 ──       goto #3 if not true
 2 ──       goto #4
 3 ──       $(Expr(:throw_undef_if_not, :r, false))::Any
-4 ┄─ %7  = (r@_2 < 0)::Any
-└───       goto #9 if not %7
-5 ── %9  = Core.isdefined(%2, :contents)::Bool
-└───       goto #7 if not %9
+4 ┄─ %4  = (r@_2 < 0)::Any
+└───       goto #9 if not %4
+5 ──       goto #7 if not true
 6 ──       goto #8
 7 ──       $(Expr(:throw_undef_if_not, :r, false))::Any
-8 ┄─ %13 = -r@_2::Any
-9 ┄─ %14 = φ (#4 => r@_2, #8 => %13)::Any
-│    %15 = Core.isdefined(%2, :contents)::Bool
-└───       goto #11 if not %15
+8 ┄─ %9  = -r@_2::Any
+9 ┄─ %10 = φ (#4 => r@_2, #8 => %9)::Any
+└───       goto #11 if not true
 10 ─       goto #12
 11 ─       $(Expr(:throw_undef_if_not, :r, false))::Any
-12 ┄ %19 = (x0 * %14)::Any
+12 ┄ %14 = (x0 * %10)::Any
 └───       goto #13
-13 ─       return %19
+13 ─       return %14
 ) => Any
```
N5N3 pushed a commit that referenced this pull request Apr 3, 2022
Currently the optimizer handles abstract callsite only when there is a
single dispatch candidate (in most cases), and so inlining and static-dispatch
are prohibited when the callsite is union-split (in other word, union-split
happens only when all the dispatch candidates are concrete).

However, there are certain patterns of code (most notably our Julia-level compiler code)
that inherently need to deal with abstract callsite.
The following example is taken from `Core.Compiler` utility:
```julia
julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name
isType (generic function with 1 method)

julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (x isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = π (x, DataType)
│   %4 = Base.getfield(%3, :name)::Core.TypeName
│   %5 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %6 = (%4 === %5)::Bool
└──      goto #4
3 ─      goto #4
4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool
└──      return %9
) => Bool

julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (isa)(x, Nothing)::Bool
└──      goto #3 if not %1
2 ─      goto #4
3 ─ %4 = Main.isType(x)::Bool
└──      goto #4
4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool
└──      return %6
) => Bool
```
(note that this is a limitation of the inlining algorithm, and so any
user-provided hints like callsite inlining annotation doesn't help here)

This commit enables inlining and static dispatch for abstract union-split callsite.
The core idea here is that we can simulate our dispatch semantics by
generating `isa` checks in order of the specialities of dispatch candidates:
```julia
julia> code_typed((Union{Type,Nothing},)) do x # union-split, unsuccessful inlining
                  isType(x)
              end |> only
CodeInfo(
1 ─ %1  = (isa)(x, Nothing)::Bool
└──       goto #3 if not %1
2 ─       goto #9
3 ─ %4  = (isa)(x, Type)::Bool
└──       goto #8 if not %4
4 ─ %6  = π (x, Type)
│   %7  = (%6 isa Main.DataType)::Bool
└──       goto #6 if not %7
5 ─ %9  = π (%6, DataType)
│   %10 = Base.getfield(%9, :name)::Core.TypeName
│   %11 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %12 = (%10 === %11)::Bool
└──       goto #7
6 ─       goto #7
7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool
└──       goto #9
8 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool
└──       return %19
) => Bool
```

Inlining/static-dispatch of abstract union-split callsite will improve
the performance in such situations (and so this commit will improve the
latency of our JIT compilation). Especially, this commit helps us avoid
excessive specializations of `Core.Compiler` code by statically-resolving
`@nospecialize`d callsites, and as the result, the # of precompiled
statements is now reduced from  `2005` ([`master`](f782430)) to `1912` (this commit).

And also, as a side effect, the implementation of our inlining algorithm
gets much simplified now since we no longer need the previous special
handlings for abstract callsites.

One possible drawback would be increased code size.
This change seems to certainly increase the size of sysimage,
but I think these numbers are in an acceptable range:
> [`master`](f782430)
```
❯ du -shk usr/lib/julia/*
17604	usr/lib/julia/corecompiler.ji
194072	usr/lib/julia/sys-o.a
169424	usr/lib/julia/sys.dylib
23784	usr/lib/julia/sys.dylib.dSYM
103772	usr/lib/julia/sys.ji
```

> this commit
```
❯ du -shk usr/lib/julia/*
17512	usr/lib/julia/corecompiler.ji
195588	usr/lib/julia/sys-o.a
170908	usr/lib/julia/sys.dylib
23776	usr/lib/julia/sys.dylib.dSYM
105360	usr/lib/julia/sys.ji
```
N5N3 pushed a commit that referenced this pull request Jun 27, 2022
…Lang#45790)

Currently the `@nospecialize`-d `push!(::Vector{Any}, ...)` can only
take a single item and we will end up with runtime dispatch when we try
to call it with multiple items:
```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.append!(a, iter)::Vector{Any}
└──      return %1
) => Vector{Any}
```

This commit adds a new specialization that it can take arbitrary-length
items. Our compiler should still be able to optimize the single-input 
case as before via the dispatch mechanism.
```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arraylen(a)::Int64
│         $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000002, 0x0000000000000002))::Nothing
└──       goto #7 if not true
2 ┄ %4  = φ (#1 => 1, #6 => %14)::Int64
│   %5  = φ (#1 => 1, #6 => %15)::Int64
│   %6  = Base.getfield(x, %4, true)::Any
│   %7  = Base.add_int(%1, %4)::Int64
│         Base.arrayset(true, a, %6, %7)::Vector{Any}
│   %9  = (%5 === 2)::Bool
└──       goto #4 if not %9
3 ─       goto #5
4 ─ %12 = Base.add_int(%5, 1)::Int64
└──       goto #5
5 ┄ %14 = φ (#4 => %12)::Int64
│   %15 = φ (#4 => %12)::Int64
│   %16 = φ (#3 => true, #4 => false)::Bool
│   %17 = Base.not_int(%16)::Bool
└──       goto #7 if not %17
6 ─       goto #2
7 ┄       return a
) => Vector{Any}
```

This commit also adds the equivalent implementations for `pushfirst!`.
N5N3 pushed a commit that referenced this pull request Jun 29, 2022
When calling `jl_error()` or `jl_errorf()`, we must check to see if we
are so early in the bringup process that it is dangerous to attempt to
construct a backtrace because the data structures used to provide line
information are not properly setup.

This can be easily triggered by running:

```
julia -C invalid
```

On an `i686-linux-gnu` build, this will hit the "Invalid CPU Name"
branch in `jitlayers.cpp`, which calls `jl_errorf()`.  This in turn
calls `jl_throw()`, which will eventually call `jl_DI_for_fptr` as part
of the backtrace printing process, which fails as the object maps are
not fully initialized.  See the below `gdb` stacktrace for details:

```
$ gdb -batch -ex 'r' -ex 'bt' --args ./julia -C invalid
...
fatal: error thrown and no exception handler available.
ErrorException("Invalid CPU name "invalid".")

Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
1277    /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h: No such file or directory.
 #0  0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
 #1  std::map<unsigned int, JITDebugInfoRegistry::ObjectInfo, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__x=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_map.h:1258
 #2  jl_DI_for_fptr (fptr=4155049385, symsize=symsize@entry=0xffffcfa8, slide=slide@entry=0xffffcfa0, Section=Section@entry=0xffffcfb8, context=context@entry=0xffffcf94) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1181
 #3  0xf75c056a in jl_getFunctionInfo_impl (frames_out=0xffffd03c, pointer=4155049385, skipC=0, noInline=0) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1210
 #4  0xf7a6ca98 in jl_print_native_codeloc (ip=4155049385) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:636
 #5  0xf7a6cd54 in jl_print_bt_entry_codeloc (bt_entry=0xf0798018) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:657
 #6  jlbacktrace () at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:1090
 #7  0xf7a3cd2b in ijl_no_exc_handler (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:605
 #8  0xf7a3d10a in throw_internal (ct=ct@entry=0xf070c010, exception=<optimized out>, exception@entry=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:638
 #9  0xf7a3d330 in ijl_throw (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:654
 #10 0xf7a905aa in ijl_errorf (fmt=fmt@entry=0xf7647cd4 "Invalid CPU name \"%s\".") at /cache/build/default-amdci5-4/julialang/julia-master/src/rtutils.c:77
 #11 0xf75a4b22 in (anonymous namespace)::createTargetMachine () at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:823
 #12 JuliaOJIT::JuliaOJIT (this=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:1044
 #13 0xf7531793 in jl_init_llvm () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8585
 #14 0xf75318a8 in jl_init_codegen_impl () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8648
 #15 0xf7a51a52 in jl_restore_system_image_from_stream (f=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2131
 #16 0xf7a55c03 in ijl_restore_system_image_data (buf=0xe859c1c0 <jl_system_image_data> "8'\031\003", len=125161105) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2184
 #17 0xf7a55cf9 in jl_load_sysimg_so () at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:424
 #18 ijl_restore_system_image (fname=0x80a0900 "/build/bk_download/julia-d78fdad601/lib/julia/sys.so") at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2157
 #19 0xf7a3bdfc in _finish_julia_init (rel=rel@entry=JL_IMAGE_JULIA_HOME, ct=<optimized out>, ptls=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:741
 #20 0xf7a3c8ac in julia_init (rel=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:728
 #21 0xf7a7f61d in jl_repl_entrypoint (argc=<optimized out>, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/src/jlapi.c:705
 #22 0x080490a7 in main (argc=3, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/cli/loader_exe.c:59
```

To prevent this, we simply avoid calling `jl_errorf` this early in the
process, punting the problem to a later PR that can update guard
conditions within `jl_error*`.
N5N3 pushed a commit that referenced this pull request Sep 13, 2022
…cts (JuliaLang#46712)

Currently our inference isn't able to propagate `isa`-based type
constraint for cases like `isa(Type{<:...}, DataType)` since
`typeintersect` may return `Type` object itself when taking `Type`
object and `iskindtype`-object.

This case happens in the following kind of situation (motivated by the
discussion at <JuliaLang#46553 (comment)>):
```julia
julia> function isa_kindtype(T::Type{<:AbstractVector})
           if isa(T, DataType)
               # `T` here should be inferred as `DataType` rather than `Type{<:AbstractVector}`
               return T.name.name # should be inferred as ::Symbol
           end
           return nothing
       end
isa_kindtype (generic function with 1 method)

julia> only(code_typed(isa_kindtype; optimize=false))
CodeInfo(
1 ─ %1 = (T isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = Base.getproperty(T, :name)::Any
│   %4 = Base.getproperty(%3, :name)::Any
└──      return %4
3 ─      return Main.nothing
) => Any
```

This commit improves the situation by adding a special casing for
abstract interpretation, rather than changing the behavior of
`typeintersect`.
N5N3 pushed a commit that referenced this pull request Jul 14, 2023
This commit improves SROA pass by extending the `unswitchtupleunion`
optimization to handle the general parametric types, e.g.:
```julia
julia> struct A{T}
           x::T
       end;

julia> function foo(a1, a2, c)
           t = c ? A(a1) : A(a2)
           return getfield(t, :x)
       end;

julia> only(Base.code_ircode(foo, (Int,Float64,Bool); optimize_until="SROA"))
```

> Before
```
2 1 ─      goto #3 if not _4                                          │
  2 ─ %2 = %new(A{Int64}, _2)::A{Int64}                               │╻ A
  └──      goto #4                                                    │
  3 ─ %4 = %new(A{Float64}, _3)::A{Float64}                           │╻ A
  4 ┄ %5 = φ (#2 => %2, #3 => %4)::Union{A{Float64}, A{Int64}}        │
3 │   %6 = Main.getfield(%5, :x)::Union{Float64, Int64}               │
  └──      return %6                                                  │
   => Union{Float64, Int64}
```

> After
```
julia> only(Base.code_ircode(foo, (Int,Float64,Bool); optimize_until="SROA"))
2 1 ─      goto #3 if not _4                                           │
  2 ─      nothing::A{Int64}                                           │╻ A
  └──      goto #4                                                     │
  3 ─      nothing::A{Float64}                                         │╻ A
  4 ┄ %8 = φ (#2 => _2, #3 => _3)::Union{Float64, Int64}               │
  │        nothing::Union{A{Float64}, A{Int64}}
3 │   %6 = %8::Union{Float64, Int64}                                   │
  └──      return %6                                                   │
   => Union{Float64, Int64}
```
N5N3 pushed a commit that referenced this pull request Feb 12, 2024
Fixes: JuliaLang#33147
Replaces/Closes: JuliaLang#40445

The difference here, compared to past implementations, is that we use
the zero-cost `isiterable` check on every intermediate step, instead of
wrapping the call in a try/catch and then trying to re-approximate the
`isiterable` afterwards. Some samples:

```julia
julia> Dict(i for i in 1:3)                                                        
ERROR: ArgumentError: AbstractDict(kv): kv needs to be an iterator of 2-tuples or pairs                                                                               
Stacktrace:                                                                                                                                                           
 [1] _throw_dict_kv_error()                                                        
   @ Base ./dict.jl:118                                                                                                                                               
 [2] grow_to!                                                                      
   @ ./dict.jl:132 [inlined]                                                       
 [3] dict_with_eltype                                                                                                                                                 
   @ ./abstractdict.jl:592 [inlined]                                                                                                                                  
 [4] Dict(kv::Base.Generator{UnitRange{Int64}, typeof(identity)})                                                                                                     
   @ Base ./dict.jl:120                                                            
 [5] top-level scope                                                                                                                                                  
   @ REPL[1]:1                                                                     
                                                                                                                                                                      
julia> Dict(i => error("$i") for i in 1:3)                                         
ERROR: 1                                                                                                                                                              
Stacktrace:                                                                                                                                                           
 [1] error(s::String)                                                                                                                                                 
   @ Base ./error.jl:35                                                                                                                                               
 [2] (::var"#3#4")(i::Int64)                                                       
   @ Main ./none:0                                                                 
 [3] iterate                                                                                                                                                          
   @ ./generator.jl:48 [inlined]                                                   
 [4] grow_to!                                                                      
   @ ./dict.jl:124 [inlined]                                                       
 [5] dict_with_eltype                                                              
   @ ./abstractdict.jl:592 [inlined]                                               
 [6] Dict(kv::Base.Generator{UnitRange{Int64}, var"#3#4"})                                                                                                            
   @ Base ./dict.jl:120                                                                                                                                               
 [7] top-level scope                                                               
   @ REPL[2]:1                                                                                                                                                        
```

The other unrelated change here is that `dest = empty(dest, typeof(k),
typeof(v))` is made conditional, so we do not unconditionally construct
an empty Dict in order to discard it and allocate an exact duplicate of
it, but only do so if inference wasn't precise originally.

Co-authored-by: Curtis Vogt <[email protected]>
N5N3 pushed a commit that referenced this pull request Mar 9, 2024
For example, we seek to eliminate the gc frame from this function, as
observed here:

```julia
julia> code_llvm((BitSet,), raw=true) do x; r = x.bits; GC.safepoint(); @inbounds r[1]; end
; Function Signature: var"#3"(Base.BitSet)
;  @ REPL[1]:1 within `#3`
define swiftcc i64 @"julia_#3_494"(ptr nonnull swiftself %pgcstack, ptr noundef nonnull align 8 dereferenceable(16) %"x::BitSet") #0 !dbg !5 {
top:
  call void @llvm.dbg.declare(metadata ptr %"x::BitSet", metadata !21, metadata !DIExpression()), !dbg !22
  %ptls_field = getelementptr inbounds ptr, ptr %pgcstack, i64 2
  %ptls_load = load ptr, ptr %ptls_field, align 8, !tbaa !23
  %0 = getelementptr inbounds ptr, ptr %ptls_load, i64 2
  %safepoint = load ptr, ptr %0, align 8, !tbaa !27
  fence syncscope("singlethread") seq_cst
  %1 = load volatile i64, ptr %safepoint, align 8, !dbg !22
  fence syncscope("singlethread") seq_cst
; ┌ @ Base.jl:49 within `getproperty`
   %"x::BitSet.bits" = load atomic ptr, ptr %"x::BitSet" unordered, align 8, !dbg !29, !tbaa !27, !alias.scope !33, !noalias !36, !nonnull !11, !dereferenceable !41, !align !42
; └
; ┌ @ gcutils.jl:253 within `safepoint`
   %ptls_load4 = load ptr, ptr %ptls_field, align 8, !dbg !43, !tbaa !23
   %2 = getelementptr inbounds ptr, ptr %ptls_load4, i64 2, !dbg !43
   %safepoint5 = load ptr, ptr %2, align 8, !dbg !43, !tbaa !27
   fence syncscope("singlethread") seq_cst, !dbg !43
   %3 = load volatile i64, ptr %safepoint5, align 8, !dbg !43
   fence syncscope("singlethread") seq_cst, !dbg !43
; └
; ┌ @ essentials.jl:892 within `getindex`
   %4 = load ptr, ptr %"x::BitSet.bits", align 8, !dbg !46, !tbaa !49, !alias.scope !52, !noalias !53
   %5 = load i64, ptr %4, align 8, !dbg !46, !tbaa !54, !alias.scope !57, !noalias !58
   ret i64 %5, !dbg !46
; └
}
```
N5N3 pushed a commit that referenced this pull request Jul 29, 2024
The functions `toms`, `tons`, and `days` uses `sum` over a vector of
`Period`s to obtain the conversion of a `CompoundPeriod`. However, the
compiler cannot infer the return type because those functions can return
either `Int` or `Float` depending on the type of the `Period`. This PR
forces the result of those functions to be `Float64`, fixing the type
stability.

Before this PR we had:

```julia
julia> using Dates

julia> p = Dates.Second(1) + Dates.Minute(1) + Dates.Year(1)
1 year, 1 minute, 1 second

julia> @code_warntype Dates.tons(p)
MethodInstance for Dates.tons(::Dates.CompoundPeriod)
  from tons(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:458
Arguments
  #self#::Core.Const(Dates.tons)
  c::Dates.CompoundPeriod
Body::Any
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto #3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.tons::Core.Const(Dates.tons)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
└──       return %11


julia> @code_warntype Dates.toms(p)
MethodInstance for Dates.toms(::Dates.CompoundPeriod)
  from toms(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:454
Arguments
  #self#::Core.Const(Dates.toms)
  c::Dates.CompoundPeriod
Body::Any
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto #3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.toms::Core.Const(Dates.toms)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
└──       return %11


julia> @code_warntype Dates.days(p)
MethodInstance for Dates.days(::Dates.CompoundPeriod)
  from days(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:468
Arguments
  #self#::Core.Const(Dates.days)
  c::Dates.CompoundPeriod
Body::Any
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto #3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.days::Core.Const(Dates.days)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
└──       return %11
```

After this PR we have:

```julia
julia> using Dates

julia> p = Dates.Second(1) + Dates.Minute(1) + Dates.Year(1)
1 year, 1 minute, 1 second

julia> @code_warntype Dates.tons(p)
MethodInstance for Dates.tons(::Dates.CompoundPeriod)
  from tons(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:458
Arguments
  #self#::Core.Const(Dates.tons)
  c::Dates.CompoundPeriod
Body::Float64
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto #3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.tons::Core.Const(Dates.tons)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
│   %12 = Dates.Float64::Core.Const(Float64)
│   %13 = Core.typeassert(%11, %12)::Float64
└──       return %13


julia> @code_warntype Dates.toms(p)
MethodInstance for Dates.toms(::Dates.CompoundPeriod)
  from toms(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:454
Arguments
  #self#::Core.Const(Dates.toms)
  c::Dates.CompoundPeriod
Body::Float64
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto #3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.toms::Core.Const(Dates.toms)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
│   %12 = Dates.Float64::Core.Const(Float64)
│   %13 = Core.typeassert(%11, %12)::Float64
└──       return %13


julia> @code_warntype Dates.days(p)
MethodInstance for Dates.days(::Dates.CompoundPeriod)
  from days(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:468
Arguments
  #self#::Core.Const(Dates.days)
  c::Dates.CompoundPeriod
Body::Float64
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto #3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.days::Core.Const(Dates.days)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
│   %12 = Dates.Float64::Core.Const(Float64)
│   %13 = Core.typeassert(%11, %12)::Float64
└──       return %13
```
N5N3 pushed a commit that referenced this pull request Oct 23, 2024
Rebase and extension of @alexfanqi's initial work on porting Julia to
RISC-V. Requires LLVM 19.

Tested on a VisionFive2, built with:

```make
MARCH := rv64gc_zba_zbb
MCPU := sifive-u74

USE_BINARYBUILDER:=0

DEPS_GIT = llvm
override LLVM_VER=19.1.1
override LLVM_BRANCH=julia-release/19.x
override LLVM_SHA1=julia-release/19.x
```

```julia-repl
❯ ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.1374 (2024-10-14)
 _/ |\__'_|_|_|\__'_|  |  riscv/25092a3982* (fork: 1 commits, 0 days)
|__/                   |

julia> versioninfo(; verbose=true)
Julia Version 1.12.0-DEV.1374
Commit 25092a3* (2024-10-14 09:57 UTC)
Platform Info:
  OS: Linux (riscv64-unknown-linux-gnu)
  uname: Linux 6.11.3-1-riscv64 #1 SMP Debian 6.11.3-1 (2024-10-10) riscv64 unknown
  CPU: unknown:
              speed         user         nice          sys         idle          irq
       #1  1500 MHz        922 s          0 s        265 s     160953 s          0 s
       #2  1500 MHz        457 s          0 s        280 s     161521 s          0 s
       #3  1500 MHz        452 s          0 s        270 s     160911 s          0 s
       #4  1500 MHz        638 s         15 s        301 s     161340 s          0 s
  Memory: 7.760246276855469 GB (7474.08203125 MB free)
  Uptime: 16260.13 sec
  Load Avg:  0.25  0.23  0.1
  WORD_SIZE: 64
  LLVM: libLLVM-19.1.1 (ORCJIT, sifive-u74)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
Environment:
  HOME = /home/tim
  PATH = /home/tim/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/games
  TERM = xterm-256color


julia> ccall(:jl_dump_host_cpu, Nothing, ())
CPU: sifive-u74
Features: +zbb,+d,+i,+f,+c,+a,+zba,+m,-zvbc,-zksed,-zvfhmin,-zbkc,-zkne,-zksh,-zfh,-zfhmin,-zknh,-v,-zihintpause,-zicboz,-zbs,-zvknha,-zvksed,-zfa,-ztso,-zbc,-zvknhb,-zihintntl,-zknd,-zvbb,-zbkx,-zkt,-zvkt,-zicond,-zvksh,-zvfh,-zvkg,-zvkb,-zbkb,-zvkned


julia> @code_native debuginfo=:none 1+2.
	.text
	.attribute	4, 16
	.attribute	5, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zmmul1p0_zba1p0_zbb1p0"
	.file	"+"
	.globl	"julia_+_3003"
	.p2align	1
	.type	"julia_+_3003",@function
"julia_+_3003":
	addi	sp, sp, -16
	sd	ra, 8(sp)
	sd	s0, 0(sp)
	addi	s0, sp, 16
	fcvt.d.l	fa5, a0
	ld	ra, 8(sp)
	ld	s0, 0(sp)
	fadd.d	fa0, fa5, fa0
	addi	sp, sp, 16
	ret
.Lfunc_end0:
	.size	"julia_+_3003", .Lfunc_end0-"julia_+_3003"

	.type	".L+Core.Float64#3005",@object
	.section	.data.rel.ro,"aw",@progbits
	.p2align	3, 0x0
".L+Core.Float64#3005":
	.quad	".L+Core.Float64#3005.jit"
	.size	".L+Core.Float64#3005", 8

.set ".L+Core.Float64#3005.jit", 272467692544
	.size	".L+Core.Float64#3005.jit", 8
	.section	".note.GNU-stack","",@progbits
```

Lots of bugs guaranteed, but with this we at least have a functional
build and REPL for further development by whoever is interested.

Also requires Linux 6.4+, since the fallback processor detection
used here relies on LLVM's `sys::getHostCPUFeatures`, which for
RISC-V is implemented using hwprobe introduced in 6.4. We could
probably add a fallback that parses `/proc/cpuinfo`, either by building
a CPU database much like how we've done for AArch64, or by parsing the
actual ISA string contained there. That would probably also be a good
place to add support for profiles, which are supposedly the way forward
to package RISC-V binaries. That can happen in follow-up PRs though.
For now, on older kernels, use the `-C` arg to Julia to specify an ISA.

Co-authored-by: Alex Fan <[email protected]>
N5N3 pushed a commit that referenced this pull request Oct 23, 2024
E.g. this allows `finalizer` inlining in the following case:
```julia
mutable struct ForeignBuffer{T}
    const ptr::Ptr{T}
end
const foreign_buffer_finalized = Ref(false)
function foreign_alloc(::Type{T}, length) where T
    ptr = Libc.malloc(sizeof(T) * length)
    ptr = Base.unsafe_convert(Ptr{T}, ptr)
    obj = ForeignBuffer{T}(ptr)
    return finalizer(obj) do obj
        Base.@assume_effects :notaskstate :nothrow
        foreign_buffer_finalized[] = true
        Libc.free(obj.ptr)
    end
end
function f_EA_finalizer(N::Int)
    workspace = foreign_alloc(Float64, N)
    GC.@preserve workspace begin
        (;ptr) = workspace
        Base.@assume_effects :nothrow @noinline println(devnull, "ptr = ", ptr)
    end
end
```
```julia
julia> @code_typed f_EA_finalizer(42)
CodeInfo(
1 ── %1  = Base.mul_int(8, N)::Int64
│    %2  = Core.lshr_int(%1, 63)::Int64
│    %3  = Core.trunc_int(Core.UInt8, %2)::UInt8
│    %4  = Core.eq_int(%3, 0x01)::Bool
└───       goto #3 if not %4
2 ──       invoke Core.throw_inexacterror(:convert::Symbol, UInt64::Type, %1::Int64)::Union{}
└───       unreachable
3 ──       goto #4
4 ── %9  = Core.bitcast(Core.UInt64, %1)::UInt64
└───       goto #5
5 ──       goto #6
6 ──       goto #7
7 ──       goto #8
8 ── %14 = $(Expr(:foreigncall, :(:malloc), Ptr{Nothing}, svec(UInt64), 0, :(:ccall), :(%9), :(%9)))::Ptr{Nothing}
└───       goto #9
9 ── %16 = Base.bitcast(Ptr{Float64}, %14)::Ptr{Float64}
│    %17 = %new(ForeignBuffer{Float64}, %16)::ForeignBuffer{Float64}
└───       goto #10
10 ─ %19 = $(Expr(:gc_preserve_begin, :(%17)))
│    %20 = Base.getfield(%17, :ptr)::Ptr{Float64}
│          invoke Main.println(Main.devnull::Base.DevNull, "ptr = "::String, %20::Ptr{Float64})::Nothing
│          $(Expr(:gc_preserve_end, :(%19)))
│    %23 = Main.foreign_buffer_finalized::Base.RefValue{Bool}
│          Base.setfield!(%23, :x, true)::Bool
│    %25 = Base.getfield(%17, :ptr)::Ptr{Float64}
│    %26 = Base.bitcast(Ptr{Nothing}, %25)::Ptr{Nothing}
│          $(Expr(:foreigncall, :(:free), Nothing, svec(Ptr{Nothing}), 0, :(:ccall), :(%26), :(%25)))::Nothing
└───       return nothing
) => Nothing
```

However, this is still a WIP. Before merging, I want to improve EA's
precision a bit and at least fix the test case that is currently marked
as `broken`. I also need to check its impact on compiler performance.

Additionally, I believe this feature is not yet practical. In
particular, there is still significant room for improvement in the
following areas:
- EA's interprocedural capabilities: currently EA is performed ad-hoc
for limited frames because of latency reasons, which significantly
reduces its precision in the presence of interprocedural calls.
- Relaxing the `:nothrow` check for finalizer inlining: the current
algorithm requires `:nothrow`-ness on all paths from the allocation of
the mutable struct to its last use, which is not practical for
real-world cases. Even when `:nothrow` cannot be guaranteed, auxiliary
optimizations such as inserting a `finalize` call after the last use
might still be possible (JuliaLang#55990).
N5N3 pushed a commit that referenced this pull request Nov 21, 2024
This PR introduces a new, toplevel-only, syntax form `:worldinc` that
semantically represents the effect of raising the current task's world
age to the latest world for the remainder of the current toplevel
evaluation (that context being an entry to `eval` or a module
expression). For detailed motivation on why this is desirable, see
JuliaLang#55145, which I won't repeat here, but the gist is that we never really
defined when world-age increments and worse are inconsistent about it.
This is something we need to figure out now, because the bindings
partition work will make world age even more observable via bindings.

Having created a mechanism for world age increments, the big question is
one of policy, i.e. when should these world age increments be inserted.

Several reasonable options exist:
1. After world-age affecting syntax constructs (as proprosed in JuliaLang#55145)
2. Option 1 + some reasonable additional cases that people rely on
3. Before any top level `call` expression
4. Before any expression at toplevel whatsover

As an example, case, consider `a == a` at toplevel. Depending on the
semantics that could either be the same as in local scope, or each of
the four world age dependent lookups (three binding lookups, one method
lookup) could (potentially) occur in a different world age.

The general tradeoff here is between the risk of exposing the user to
confusing world age errors and our ability to optimize top-level code
(in general, any `:worldinc` statement will require us to fully
pessimize or recompile all following code).

This PR basically implements option 2 with the following semantics:

1. The interpreter explicit raises the world age only at `:worldinc`
exprs or after `:module` exprs.
2. The frontend inserts `:worldinc` after all struct definitions, method
definitions, `using` and `import.
3. The `@eval` macro inserts a worldinc following the call to `eval` if
at toplevel
4. A literal (syntactic) call to `include` gains an implicit `worldinc`.

Of these the fourth is probably the most questionable, but is necessary
to make this non-breaking for most code patterns. Perhaps it would have
been better to make `include` a macro from the beginning (esp because it
already has semantics that look a little like reaching into the calling
module), but that ship has sailed.

Unfortunately, I don't see any good intermediate options between this PR
and option #3 above. I think option #3 is closest to what we have right
now, but if we were to choose it and actually fix the soundness issues,
I expect that we would be destroying all performance of global-scope
code. For this reason, I would like to try to make the version in this
PR work, even if the semantics are a little ugly.

The biggest pattern that this PR does not catch is:
```
eval(:(f() = 1))
f()
```

We could apply the same `include` special case to eval, but given the
existence of `@eval` which allows addressing this at the macro level, I
decided not to. We can decide which way we want to go on this based on
what the package ecosystem looks like.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants