map on tuples is prohibitively slow #15695
Is that a case for this |
No. |
+1. I had one map on tuples in JuliaParser and it slowed down the whole thing by about 2-3 orders of magnitude. |
Oh I see, constructing the tuple is the last action, not the call to map... |
We could probably get the equivalent of the generated |
So stack allocated |
Yes |
Similarly
function map2(f, ts::Tuple...)
    ([f([t[i] for t in ts]...) for i in 1:length(ts[1])]...)
end
julia> function foo(f, n)
           t = ntuple(x->x, n)
           @time f(+, t,t)
           @time f(+, t,t,t)
           @time f(+, t,t,t,t)
       end
julia> foo(map, 2)
0.003090 seconds (725 allocations: 35.893 KB)
1.004426 seconds (1.73 M allocations: 66.629 MB, 1.60% gc time)
0.002223 seconds (162 allocations: 7.375 KB)
(4,8)
julia> foo(map, 2)
0.000001 seconds (1 allocation: 32 bytes) # this has a special case
0.000133 seconds (86 allocations: 3.516 KB)
0.000146 seconds (117 allocations: 4.375 KB)
(4,8)
julia> foo(map2, 2)
0.024458 seconds (23.03 k allocations: 997.798 KB)
0.000011 seconds (4 allocations: 352 bytes)
0.000006 seconds (4 allocations: 352 bytes)
(4,8)
julia> foo(map2, 2)
0.000023 seconds (4 allocations: 320 bytes)
0.000006 seconds (4 allocations: 352 bytes)
0.000005 seconds (4 allocations: 352 bytes)
(4,8)
julia> foo(map, 3)
0.008495 seconds (4.66 k allocations: 232.674 KB)
0.450469 seconds (699.39 k allocations: 27.190 MB, 2.75% gc time)
0.000150 seconds (186 allocations: 7.500 KB)
(4,8,12)
julia> foo(map, 3)
0.000039 seconds (11 allocations: 672 bytes)
0.000144 seconds (132 allocations: 5.500 KB)
0.000132 seconds (182 allocations: 7.125 KB)
(4,8,12)
julia> foo(map2, 3)
0.025736 seconds (23.19 k allocations: 1007.829 KB)
0.000010 seconds (5 allocations: 480 bytes)
0.000005 seconds (5 allocations: 480 bytes)
(4,8,12)
julia> foo(map2, 3)
0.000017 seconds (5 allocations: 432 bytes)
0.000006 seconds (5 allocations: 480 bytes)
0.000005 seconds (5 allocations: 480 bytes)
(4,8,12) |
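The `map2` definition above splats a comprehension with `(x...)`, which the Julia version used in the thread accepted. As a hedged sketch, assuming Julia 1.x (where the splat needs a trailing comma), the same approach could be written as follows. It is only a transcription of the snippet above, not the fix that was eventually merged, and it assumes at least one tuple argument and equal lengths.

```julia
# Sketch of the thread's `map2`, assuming Julia 1.x syntax.
# For each index i, gather the i-th element of every tuple and call f on them.
function map2(f, ts::Tuple...)
    n = length(ts[1])  # assumes ts is non-empty and all tuples share this length
    return ([f([t[i] for t in ts]...) for i in 1:n]...,)
end

t = ntuple(x -> x, 2)   # (1, 2)
map2(+, t, t, t, t)     # same call shape as `f(+, t,t,t,t)` in `foo`
```

Going through a temporary array keeps the method generic over the number and length of the tuples, at the cost of a few small allocations per call.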
Out of curiosity --- how did this come up? Was there an interesting use case? |
In ComputeFramework there's a type called the arguments can themselves be thunks. The scheduler runs a thunk when all input thunks are computed: JuliaParallel/Dagger.jl@93c7e79 (it was using map to look up results, no particular reason) sometimes the I picked tuples because I could do things like this: Do you think tl;dr: I didn't need to use |
There's a tension between efficiency of compilation and efficiency of production. Currently there's an easy test case:
The
However, for those of us who work on core array code, we use tuples whose length tends to match the dimensionality of an array, not the contents of the array. So our tuples are typically of length 2-4. In that case, this is informative:
julia> sum3(t::NTuple{3}) = t[1] + t[2] + t[3]
sum3 (generic function with 1 method)
julia> function tuple1(n)
           s = 0
           for i = 1:n
               t = ntuple(identity, 3)
               s += sum3(t)
           end
           s
end
tuple1 (generic function with 1 method)
julia> function tuple2(n)
           s = 0
           for i = 1:n
               t = ntuple(identity, Val{3})
               s += sum3(t)
           end
           s
end
tuple2 (generic function with 1 method)
julia> tuple1(1)
6
julia> tuple2(1)
6
julia> @time tuple1(10^6)
0.145727 seconds (2.00 M allocations: 30.515 MB, 12.31% gc time)
6000000
julia> @time tuple2(10^6)
0.005430 seconds (7 allocations: 208 bytes)
6000000
In other words, there is currently a tension between the needs of users of small tuples and those hoping to use large tuples. |
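The benchmark above passes the tuple length either as a runtime integer or as `Val{3}`. As a small sketch, assuming Julia 1.x, the second form is spelled `Val(3)` (an instance rather than the type). The snippet only shows that both calls build the same tuple. It makes no claim about timings, which depend on the Julia version and will not match the numbers quoted in the thread.

```julia
# Sketch assuming Julia 1.x: the two ways of passing the length to ntuple.
sum3(t::NTuple{3}) = t[1] + t[2] + t[3]

t_runtime = ntuple(identity, 3)       # length given as a plain Int value
t_static  = ntuple(identity, Val(3))  # length carried in the type of the argument

sum3(t_runtime)
sum3(t_static)
```

With `Val(3)` the length is part of the argument's type, so the compiler can specialize on it without having to know the integer's value.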
Just to be clear, I didn't need to use large tuples in my case. I just happened to because I didn't know this would be a problem. I agree that for smaller tuples, there should be specific methods (or generated ones) so that core array code dealing with small tuples is really fast, but if a user does (maybe inadvertently) use a bigger tuple, something like |
Not a bad plan. I was hoping to get away from "breaks" in performance at n=5 (for example), but maybe the alternative is worse. |
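One way to picture the plan discussed here is a set of hand-written methods for short tuples plus a generic fallback for longer ones. The sketch below assumes Julia 1.x. The name `tmap` and the cutoff at three elements are hypothetical choices for illustration; this is not `Base.map`, nor the change that was eventually merged.

```julia
# Hypothetical `tmap`: specialized methods for small tuples, generic fallback otherwise.
tmap(f, t::Tuple{}) = ()
tmap(f, t::Tuple{Any}) = (f(t[1]),)
tmap(f, t::Tuple{Any,Any}) = (f(t[1]), f(t[2]))
tmap(f, t::Tuple{Any,Any,Any}) = (f(t[1]), f(t[2]), f(t[3]))

# Fallback for longer tuples: fill a temporary array, then splat it into a tuple.
# This avoids compiling a specialized method body per tuple length, but the
# result type is not inferable, which is the performance "break" mentioned above.
function tmap(f, t::Tuple)
    out = Vector{Any}(undef, length(t))
    for i in eachindex(out)
        out[i] = f(t[i])
    end
    return (out...,)
end

tmap(x -> 2x, (1, 2))                # dispatches to the 2-element method
tmap(x -> 2x, ntuple(identity, 10))  # dispatches to the generic fallback
```

Dispatch picks the most specific signature, so short tuples never reach the array-based path.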
I should acknowledge that one valid criticism of my benchmark above is that in
julia> ntuple2{N}(f, ::Type{Val{N}}) = ([f(i) for i = 1:N]...)::NTuple{N,typeof(f(1))}
ntuple2 (generic function with 1 method)
and verified that the return type is predictable, switched |
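The `ntuple2{N}(...)` definition above uses the parametric-method syntax of the Julia version used in the thread. As a sketch assuming Julia 1.x, the same definition can be written with a `where` clause and a trailing comma in the splat. The signature still takes the type `Val{N}`, as in the original, so it is called with `Val{3}` rather than `Val(3)`.

```julia
# Sketch of the thread's `ntuple2`, assuming Julia 1.x syntax.
# The type assertion makes the return type predictable from N and f(1),
# even though the elements pass through a temporary array.
ntuple2(f, ::Type{Val{N}}) where {N} =
    ([f(i) for i = 1:N]...,)::NTuple{N,typeof(f(1))}

ntuple2(identity, Val{3})
```

The assertion throws a `TypeError` if `f` returns values of differing types, so this form only suits functions with a uniform return type.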
#16460 fixes this... |
Two alternative implementations (for tuples with > 3 elements) would be: