-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Despecialization part 2 #2709
Despecialization part 2 #2709
Conversation
Sounds reasonable, but can you check that performance isn't affected for large tables? |
@assert f isa Base.Callable | ||
@assert incols isa Union{Nothing, AbstractVector, Tuple, NamedTuple} | ||
@assert first isa Union{NamedTuple, DataFrameRow, AbstractDataFrame} | ||
@assert firstmulticol isa Union{Val{true}, Val{false}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A despecialized Val
is quite paradoxical. How about making this a Bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point - I will change it. The problem was that for Val
compiler genterates methos for Val{true}
, Val{false}
and Val{T} where T
and this last method is not needed by us.
Yes. I will do some timings (these always take time) and report here (probably today in the evening or tomorrow). |
Co-authored-by: Milan Bouchet-Valat <[email protected]>
This is super strange.
with this PR wrapping
now if I pass
which is quite puzzling. Therefore I decided to go for a cleaner design and introduce types that signal if we have a single or multi column aggregation. With them the timing is:
so this is comparable to passing around Now I am benchmarking the performance on larger split-apply-combine operations. |
The performance benchmarks show that this PR does not regress against Here are the details:
in summary - this PR looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmarks look good. Any failure to specialize in a place where it matters would would have dramatic effects that would be hard to miss.
Thank you! |
I changed
grouped_reduce!
by extracting out the part of code that does have less arguments (this is a relatively clean change and gives an improvement in compilation - but only a small one).Further despecialization of
combine_with_first!
andcombine_rows_with_first!
(but notcombine_tables_with_first!
as it contains a hot loop). This second change is heavy handed, but gives the following benefits:this PR
main
by despecializing we incur a cost of having a loop over threads that is not specialized and will use dynamic dispatch but I assume that this loop is small (up to number of available threads). Is this correct @nalimilan?