n-arg map performance #17321
Comments
julia> @benchmark map(f, As...)
BenchmarkTools.Trial:
samples: 165
evals/sample: 1
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 7.63 mb
allocs estimate: 10
minimum time: 27.52 ms (0.00% GC)
median time: 30.26 ms (0.00% GC)
mean time: 30.37 ms (2.57% GC)
maximum time: 39.38 ms (7.84% GC)
julia> @benchmark broadcast(f, As...)
BenchmarkTools.Trial:
samples: 887
evals/sample: 1
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 7.63 mb
allocs estimate: 83
minimum time: 4.57 ms (0.00% GC)
median time: 4.89 ms (0.00% GC)
mean time: 5.63 ms (13.98% GC)
maximum time: 9.60 ms (26.82% GC)
Again I wonder whether
No. I'm convinced we can optimize In any case, performance hacks should be directed at Also noting that
Got it. Thank you for the resources, Jeff!
This regressed quite a lot
Bumping this. Anything we can do here?
Is there anything to do here? I just ran the above on my machine, and saw 2-4 ms on everything posted above.
While working on `map` for `NullableArray`s (JuliaStats/NullableArrays.jl#128 (comment)), I came across some evidence that the current n-arg `map` implementation in Base might be leaving performance on the table. Here's a simple benchmark for the current Base implementation:

Here's the same benchmark for an alternative implementation (`mymap` in https://gist.github.com/davidagold/d7088aae22f23d383e5bf1f26aa1a045) that (1) avoids using `ith_all` to index into the `As` and (2) avoids `zip`ing the `As` together in the construction of a `Generator(f, As...)` object:

In this case, `mymap` is 5x faster. However, its implementation involves the use of a macro and generated functions in place of `ith_all`. Is the speedup here worth introducing such changes into the Base implementation?

cc @nalimilan
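For readers who want to see the general idea without opening the gist, here is a minimal sketch of an n-arg map that indexes each array directly rather than zipping them into a generator. It is not the `mymap` from the gist and not Base's implementation: the name `direct_map!` is hypothetical, it assumes Julia 1.x syntax, and it swaps the macro and generated functions mentioned above for `ntuple` with a `Val` length, which may or may not perform comparably.

```julia
# Hypothetical helper (not from Base or the gist): write f applied elementwise
# over N equally-shaped arrays into a preallocated destination.
function direct_map!(f, dest::AbstractArray, As::Vararg{AbstractArray,N}) where {N}
    # Every input must share the destination's axes, since one index is used for all.
    for A in As
        axes(A) == axes(dest) ||
            throw(DimensionMismatch("all arrays must have the same axes as dest"))
    end
    # The axes check above makes the @inbounds accesses safe.
    @inbounds for i in eachindex(dest, As...)
        # Build the argument tuple by indexing each array directly (no zip);
        # Val(N) lets the compiler know the tuple length statically.
        args = ntuple(k -> As[k][i], Val(N))
        dest[i] = f(args...)
    end
    return dest
end

# Example usage with made-up inputs
xs, ys, zs = [1, 2, 3], [10, 20, 30], [100, 200, 300]
result = direct_map!(+, zeros(Int, 3), xs, ys, zs)
```

Taking a preallocated `dest` sidesteps the question of how to pick the output element type, which a full replacement for `map` would still have to answer.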