Inference improvements #2691
Conversation
@nalimilan - if you are OK with this, I propose that you review this PR as is and then we merge it. Next, I would make a series of small PRs, each focusing on one area of the code base; otherwise there is a risk that one big PR will be very hard to review. The point is that these code transformations seem simple but can be quite tricky to get right, as our code base is complex. OK?
Maybe let's wait a bit for Tim to comment at #2597 (comment) in case he has simpler solutions to suggest?
OK. But let me comment on timings. Running this PR:

main:

(This is before generating new precompile statements adjusted to this PR.) In general this is a much better improvement than what @timholy reported in #2563:
After the additional changes, this is the timing of the grouping tests on this PR (first and second run in a clean session):

and this is the timing of `select`; this PR:

vs
I have decided to add some more

(here we have a uniform improvement)

(here the first run is a bit slower, but subsequent runs are faster because we specialize more)
@nalimilan - I would stop with the changes at this point; otherwise it is days of work. I have also investigated whether we can improve

In the original case (the topmost example) we currently have:

so we spend ~0.015 seconds on inference, but the majority of this time is spent in

The problem is that on the 0.22 branch the same function is much faster:

But the major reason for this is the addition of threading support (note that the major cost is not inference but code generation, which I do not know how to reduce). So my proposal is:
Thank you!
Timings are impressive. It's a shame that we cannot achieve the same with `@nospecialize`.
```diff
 function _add_multicol_res(res::AbstractDataFrame, newdf::DataFrame, df::AbstractDataFrame,
                            colnames::AbstractVector{Symbol},
-                           allow_resizing_newdf::Ref{Bool}, @nospecialize(fun),
+                           allow_resizing_newdf::Ref{Bool}, wfun::Ref{Any},
```
For functions like this one that don't call `fun` directly but just pass it to other functions, the compiler should only specialize on `Function`, not on the particular passed function. Are you sure this change makes a difference? (Same for the other `_add_multicol_res` methods.)
> the compiler should only specialize on `Function`, not on the particular passed function.

No, I am not sure, but are you sure of the opposite? I.e., is the behavior you describe documented and guaranteed by the compiler?
Also note the following. This is on the 0.22 branch:
julia> using MethodAnalysis
julia> using DataFrames
julia> combine(DataFrame(), x->(a=1,));
julia> combine(DataFrame(), x->(a=1,));
julia> combine(DataFrame(), x->(a=1,));
julia> methodinstances(DataFrames._add_multicol_res)
11-element Vector{Core.MethodInstance}:
MethodInstance for _add_multicol_res(::AbstractDataFrame, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Type, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::AbstractDataFrame, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Function, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::AbstractMatrix{T} where T, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Type, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::AbstractMatrix{T} where T, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Function, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::NamedTuple{var"#s267", var"#s266"} where {var"#s267", var"#s266"<:Tuple{Vararg{AbstractVector{T} where T, N} where N}}, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Type, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::NamedTuple{var"#s267", var"#s266"} where {var"#s267", var"#s266"<:Tuple{Vararg{AbstractVector{T} where T, N} where N}}, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Function, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::NamedTuple{(:a,), Tuple{Int64}}, ::DataFrame, ::DataFrame, ::Vector{Symbol}, ::Base.RefValue{Bool}, ::Any, ::Nothing, ::Bool, ::Nothing)
MethodInstance for _add_multicol_res(::NamedTuple, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Type, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::NamedTuple, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Function, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::DataFrameRow, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Type, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::DataFrameRow, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Function, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
and this is on this PR:
julia> using MethodAnalysis
julia> using DataFrames
julia> combine(DataFrame(), x->(a=1,));
julia> combine(DataFrame(), x->(a=1,));
julia> combine(DataFrame(), x->(a=1,));
julia> methodinstances(DataFrames._add_multicol_res)
6-element Vector{Core.MethodInstance}:
MethodInstance for _add_multicol_res(::AbstractDataFrame, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Base.RefValue{Any}, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::AbstractMatrix{T} where T, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Base.RefValue{Any}, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::NamedTuple{var"#s281", var"#s280"} where {var"#s281", var"#s280"<:Tuple{Vararg{AbstractVector{T} where T, N} where N}}, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Base.RefValue{Any}, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::NamedTuple, ::DataFrame, ::DataFrame, ::Vector{Symbol}, ::Base.RefValue{Bool}, ::Base.RefValue{Any}, ::Nothing, ::Bool, ::Nothing)
MethodInstance for _add_multicol_res(::NamedTuple, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Base.RefValue{Any}, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
MethodInstance for _add_multicol_res(::DataFrameRow, ::DataFrame, ::DataFrame, ::AbstractVector{Symbol}, ::Base.RefValue{Bool}, ::Base.RefValue{Any}, ::Union{Nothing, Int64, AsTable, AbstractVector{Int64}}, ::Bool, ::Union{Nothing, Type{AsTable}, AbstractVector{Symbol}})
so as you can see we generate roughly half as many method instances for `_add_multicol_res`. Under Julia 1.6 the method is indeed not specialized for a specific function (but what will the rules be in Julia 1.7, or what were the rules in e.g. Julia 1.2?), but since we allow `Base.Callable`, it is specialized both for `Function` and for `Type`.
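As a side note, this heuristic can be poked at in isolation with a tiny standalone experiment (hypothetical functions `f` and `passthrough`, not DataFrames code): define a method that merely forwards a `Base.Callable`, call it once with a function and once with a type, and list the compiled instances.

```julia
# Toy check of Julia's non-specialization heuristic (assumed setup, not DataFrames code).
using MethodAnalysis

passthrough(x) = x
# `callable` is only passed along, never called, which is the case the heuristic targets.
f(callable::Base.Callable) = passthrough(callable)

f(sin)   # called with a Function
f(Int)   # called with a Type

# Inspect which method instances were actually compiled for `f`.
methodinstances(f)
```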
AFAIK that's documented behavior. But yeah, if both `Function` and `Type` are passed then we will get two different method instances. I hadn't realized `Type` was used there in practice.
No - `Type` is not passed. These method instances get generated even if a `Type` is never passed; they get generated just because one may be passed.
```diff
                     copycols, keeprows)
 end

-function _manipulate(df::AbstractDataFrame, @nospecialize(normalized_cs), copycols::Bool, keeprows::Bool)
+function _manipulate(df::AbstractDataFrame, normalized_cs::Vector{Any}, copycols::Bool, keeprows::Bool)
```
Suggested change:

```diff
-function _manipulate(df::AbstractDataFrame, normalized_cs::Vector{Any}, copycols::Bool, keeprows::Bool)
+function _manipulate(df::AbstractDataFrame, normalized_cs::AbstractVector, copycols::Bool, keeprows::Bool)
```
I prefer `Vector{Any}` to make sure that in the future we have only one method instance of `_manipulate` with respect to this argument (so that we do not accidentally introduce compilation latency).
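For illustration, a minimal standalone comparison (hypothetical functions, not the real `_manipulate`) of what the two annotations imply for the number of method instances:

```julia
# Toy functions illustrating the trade-off discussed above (not DataFrames code).
using MethodAnalysis

take_any(v::Vector{Any}) = length(v)          # at most one instance for this argument
take_abstract(v::AbstractVector) = length(v)  # one instance per concrete vector type

take_any(Any[1, "a", :b])   # always hits the single Vector{Any} instance
take_abstract([1, 2])       # compiles take_abstract(::Vector{Int64})
take_abstract(["a", "b"])   # compiles take_abstract(::Vector{String})

methodinstances(take_abstract)  # two instances here, vs. one for take_any
methodinstances(take_any)
```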
src/abstractdataframe/subset.jl
```julia
conditions = Any[]

# subset allows a transformation specification without a target column name or a column
for (i, a) in enumerate(args)
```
Why not use a comprehension?
I wanted to eliminate the `_process_subset_pair` function, which would get specialized on the types of the arguments passed. However, I can do additional timings if this is a significant compilation overhead (probably not, as it is a short function).
But you can use a comprehension without `_process_subset_pair`, right?
But it would be very ugly then, as the whole code:
if a isa ColumnIndex
a => Symbol(:x, i)
elseif a isa Pair{<:Any, <:Base.Callable}
first(a) => last(a) => Symbol(:x, i)
else
throw(ArgumentError("condition specifier $a is not supported by `subset`"))
end
would have to go into the body of the comprehension.
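For reference, a sketch of what the comprehension variant under discussion could look like; it is just the snippet above rearranged (relying on the same `args`, `ColumnIndex`, etc. from the surrounding DataFrames code), not the code that was actually merged:

```julia
# Sketch only - the if/elseif/else from above moved into the comprehension body.
conditions = Any[if a isa ColumnIndex
                     a => Symbol(:x, i)
                 elseif a isa Pair{<:Any, <:Base.Callable}
                     first(a) => last(a) => Symbol(:x, i)
                 else
                     throw(ArgumentError("condition specifier $a is not supported by `subset`"))
                 end
                 for (i, a) in enumerate(args)]
```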
Matter of taste I guess. :-)
Changed - please have a look at how it reads now.
src/abstractdataframe/selection.jl
```diff
-                       @nospecialize(fun))
+                       wfun::Ref{Any})
+    fun = only(wfun)
```
It just occurred to me that this syntax works: `(fun,)::Ref`. That avoids adding one line for each argument and playing with argument names. The `Any` doesn't add anything AFAICT.
I want to use `Ref{Any}` to ensure that exactly one method instance is generated (same comment as in several places above). This is mostly for future development.

Regarding `(fun,)::Ref{Any}` - this is smart indeed. Do you feel it will be more readable than `fun = only(wfun)`? (I wanted to keep a consistent style, as sometimes we just pass `wfun` through without unwrapping.) But I am OK with both - just wondering which approach is better.
I find it a bit more readable, especially since it allows keeping `fun` as the name everywhere.
OK - I will change it then. But when the `Ref{Any}` is only passed around and not accessed, I will keep `wfun` to avoid an unwrap-rewrap operation.
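A minimal sketch of the two unwrapping styles being compared, using hypothetical function names (`process1`, `process2`) rather than the actual DataFrames internals; both take the user function wrapped in a `Ref{Any}` so that only one method instance is generated for that argument:

```julia
# Style 1: keep the wrapper name and unwrap explicitly in the body.
process1(wfun::Ref{Any}) = only(wfun)(1)

# Style 2: destructure in the signature, so `fun` stays the name everywhere.
process2((fun,)::Ref{Any}) = fun(1)

process1(Ref{Any}(x -> x + 1))  # returns 2
process2(Ref{Any}(x -> x + 1))  # returns 2
```

As noted above, when the wrapper is only forwarded to another function there is no need to unwrap it at all.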
@nalimilan - we should think about whether functions like:

could be split into parts, as described in #2563, to reduce their compile cost. I have looked at them, but I feel that you know this part of the code better. All these functions are quite long, but we have to specialize them (so
Maybe we could avoid specializing

The same reasoning applies to

For
src/abstractdataframe/selection.jl
```diff
-function _gen_colnames(@nospecialize(res), newname::Union{AbstractVector{Symbol},
-                                                          Type{AsTable}, Nothing})
+function _gen_colnames(res, newname::Union{AbstractVector{Symbol},
+                                           Type{AsTable}, Nothing})
```
So here you think it's better to specialize? :-D
The benchmarks show that it is minimally better not to specialize given our test set :) - these things are super strange. However, the difference is almost imperceptible, so I reverted `@nospecialize`, as in practical workloads not specializing here might be better.
But there is still a loop in this function (over batches), so I was afraid to do this.
This would indeed be a good change. In general, passing around

Agreed. In summary: I have a feeling that it would be best to merge this PR and then do these actions as a separate PR, or even several PRs. Do you think the same?
@nalimilan - do you have any more comments on this PR? Of course there is no rush, but I am asking because I would separate the method-splitting into a different PR (I can do it later) and here just concentrate on de-specialization and merge this one.
I've already approved. :-)
This is a big change, so I wanted to double-check. Thank you!
Fixes #2516
There is still more work of a similar kind to do, but at least I feel I have a handle on what kinds of things need to be done. In particular, I think that instead of `@nospecialize` it is better to use `Ref{Any}`, and to use a sentinel of the target type instead of `nothing` (a sketch of this pattern follows at the end of this description).

Timings:
This PR:
main:
0.22.6 release (which shows there is still more work to do, as we significantly increased the complexity of the methods by adding multi-threading):
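To make the last point of the description concrete, here is a minimal sketch using hypothetical names (`outer`, `inner`, `NO_NAME`), not DataFrames internals: the user function is wrapped in a `Ref{Any}` at the API boundary so that inner functions which only pass it along compile a single method instance, and a sentinel of the target type replaces `nothing` so the argument's type stays `Symbol` rather than `Union{Nothing, Symbol}`.

```julia
const NO_NAME = Symbol("")   # hypothetical sentinel of the target type, used instead of `nothing`

function outer(fun, name::Symbol=NO_NAME)
    # Wrap once at the boundary; everything below sees only a Ref{Any}.
    return inner(Ref{Any}(fun), name)
end

function inner(wfun::Ref{Any}, name::Symbol)
    fun = only(wfun)   # unwrap where the function is actually called
    return name === NO_NAME ? fun(1) : name => fun(1)
end

outer(x -> x + 1)       # 2
outer(x -> x + 1, :a)   # :a => 2
```

The sentinel and call sites here are purely illustrative; the actual names and boundaries in DataFrames.jl may differ.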