allow `@nospecialize`-d `push!` to take arbitrary items #45790

aviatesk · 2022-06-23T09:08:16Z

Currently the @nospecialize-d push!(::Vector{Any}, ...) can only
take a single item and we will end up with runtime dispatch when we try
to call it with multiple items:

julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.append!(a, iter)::Vector{Any}
└──      return %1
) => Vector{Any}

This commit extends it so that it can take an arbitrary number of items.
Our compiler should still be able to optimize the single-input case as
before by unrolling the loop using the constant item-count information:

julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arraylen(a)::Int64
│         $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000002, 0x0000000000000002))::Nothing
└──       goto #7 if not true
2 ┄ %4  = φ (#1 => 1, #6 => %14)::Int64
│   %5  = φ (#1 => 1, #6 => %15)::Int64
│   %6  = Base.getfield(x, %4, true)::Any
│   %7  = Base.add_int(%1, %4)::Int64
│         Base.arrayset(true, a, %6, %7)::Vector{Any}
│   %9  = (%5 === 2)::Bool
└──       goto #4 if not %9
3 ─       goto #5
4 ─ %12 = Base.add_int(%5, 1)::Int64
└──       goto #5
5 ┄ %14 = φ (#4 => %12)::Int64
│   %15 = φ (#4 => %12)::Int64
│   %16 = φ (#3 => true, #4 => false)::Bool
│   %17 = Base.not_int(%16)::Bool
└──       goto #7 if not %17
6 ─       goto #2
7 ┄       return a
) => Vector{Any}
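
For readers who want to see the shape of such a method, here is a minimal sketch. It assumes a hypothetical function name, `push_any!`, and uses the public `resize!` rather than whatever internal growth routine Base calls, so it is an illustration of the idea and not the code in this PR:

```julia
# Hypothetical stand-in for a variadic, non-specialized push!; not the method defined in Base.
function push_any!(a::Vector{Any}, @nospecialize items...)
    n = length(a)
    nitems = length(items)
    resize!(a, n + nitems)        # grow the vector once for all items
    for i in 1:nitems
        a[n + i] = items[i]       # store each item after the old end
    end
    return a
end

v = push_any!(Any[1], 2, "three", :four)
```

Because the items are stored into a `Vector{Any}`, nothing in the loop depends on their concrete types, which is why a single compiled body can plausibly serve every call.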

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-06-23T09:53:54Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

vtjnash · 2022-06-23T19:26:46Z

test/compiler/inline.jl

+    @test count(iscall((src, push!)), src.code) == 0
+    @test count(src.code) do @nospecialize x
+        isa(x, Core.GotoNode) || isa(x, Core.GotoIfNot)
+    end == 0 # the loop should be optimized away for a single item


this seems to be failing on many CI bots
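
A check of this kind can be tried locally with a small helper. The sketch below assumes a hypothetical name, `count_branches`, and only counts control-flow statements in the optimized IR; the result for `push!` depends on the Julia version, so no expected value is given:

```julia
# `count_branches` is a hypothetical helper, not part of the Julia test suite.
function count_branches(f, argtypes)
    # code_typed returns a vector of `CodeInfo => return type` pairs
    src = first(only(Base.code_typed(f, argtypes)))
    n = count(src.code) do @nospecialize x
        isa(x, Core.GotoNode) || isa(x, Core.GotoIfNot)
    end
    return n
end

count_branches(push!, (Vector{Any}, Any))
```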

It seems like push!(::Vector{Any}, ::Any) produces the following IR:

julia> code_typed((Vector{Any},Any)) do xs, x
           push!(xs, x)
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.arraylen(xs)::Int64
│        $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
└──      goto #3 if not true
2 ─ %4 = Base.add_int(%1, 1)::Int64
└──      Base.arrayset(true, xs, x, %4)::Vector{Any}
3 ┄      goto #4
4 ─      return xs
) => Vector{Any}

So the loop can't actually be fully optimized away at the Julia level, ~~but I confirmed LLVM can further optimize it down the road and the performance doesn't regress.~~

EDIT: A nanosoldier result suggested that the loop can be left unoptimized. For now, I'd like to specialize the single-arg case and optimize it manually.
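
A rough sketch of that idea, using a hypothetical function name (`push_items!`) and public array operations instead of Base internals: a dedicated single-item method sits next to the variadic one, and dispatch picks the more specific method whenever exactly one item is passed.

```julia
# Hypothetical names; a stand-in for the two-method layout, not the code merged into Base.

# Single-item method: straight-line code, no loop to optimize away.
function push_items!(a::Vector{Any}, @nospecialize item)
    resize!(a, length(a) + 1)
    a[end] = item
    return a
end

# Variadic method: handles any other number of items.
function push_items!(a::Vector{Any}, @nospecialize items...)
    for i in 1:length(items)
        push_items!(a, items[i])  # two-argument call, so this hits the single-item method
    end
    return a
end

# The two call shapes resolve to different methods.
which(push_items!, (Vector{Any}, Int)) != which(push_items!, (Vector{Any}, Int, Int))
```

The trade-off is a second method to maintain, in exchange for not relying on the optimizer to remove the loop in the common one-item case.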

aviatesk · 2022-06-24T23:15:24Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-06-25T00:00:55Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk · 2022-06-25T00:03:16Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-06-25T00:48:42Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk · 2022-06-25T01:34:53Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-06-25T02:20:12Z

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

aviatesk · 2022-06-26T02:23:30Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2022-06-26T03:08:50Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

…Lang#45790)

Currently the `@nospecialize`-d `push!(::Vector{Any}, ...)` can only take a single item and we will end up with runtime dispatch when we try to call it with multiple items:

```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.append!(a, iter)::Vector{Any}
└──      return %1
) => Vector{Any}
```

This commit adds a new specialization that can take an arbitrary number of items. Our compiler should still be able to optimize the single-input case as before via the dispatch mechanism.

```julia
julia> code_typed(push!, (Vector{Any}, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing
│   %2 = Base.arraylen(a)::Int64
│        Base.arrayset(true, a, item, %2)::Vector{Any}
└──      return a
) => Vector{Any}

julia> code_typed(push!, (Vector{Any}, Any, Any))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1  = Base.arraylen(a)::Int64
│         $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000002, 0x0000000000000002))::Nothing
└──       goto #7 if not true
2 ┄ %4  = φ (#1 => 1, #6 => %14)::Int64
│   %5  = φ (#1 => 1, #6 => %15)::Int64
│   %6  = Base.getfield(x, %4, true)::Any
│   %7  = Base.add_int(%1, %4)::Int64
│         Base.arrayset(true, a, %6, %7)::Vector{Any}
│   %9  = (%5 === 2)::Bool
└──       goto #4 if not %9
3 ─       goto #5
4 ─ %12 = Base.add_int(%5, 1)::Int64
└──       goto #5
5 ┄ %14 = φ (#4 => %12)::Int64
│   %15 = φ (#4 => %12)::Int64
│   %16 = φ (#3 => true, #4 => false)::Bool
│   %17 = Base.not_int(%16)::Bool
└──       goto #7 if not %17
6 ─       goto #2
7 ┄       return a
) => Vector{Any}
```

This commit also adds the equivalent implementations for `pushfirst!`.

aviatesk requested a review from vtjnash June 23, 2022 09:08

vtjnash approved these changes Jun 23, 2022

vtjnash reviewed Jun 23, 2022

aviatesk force-pushed the avi/anypush! branch from fafee99 to 65a82e8 Compare June 24, 2022 23:11

aviatesk force-pushed the avi/anypush! branch from 65a82e8 to d79e61f Compare June 25, 2022 01:34

add @nospecialize-d pushfirst! version

c3222b1

For `Vector{Any}` to avoid runtime dispatch.

aviatesk merged commit c686e4a into master Jun 27, 2022

aviatesk deleted the avi/anypush! branch June 27, 2022 00:34
