
Regression of reduce (formerly reducedim) in Julia 0.7 #498

Open
wsshin opened this issue Sep 17, 2018 · 7 comments
Labels
performance runtime performance

Comments

@wsshin
Contributor

wsshin commented Sep 17, 2018

I observe a significant regression in reduce in Julia 0.7 compared to reducedim in Julia 0.6. Below, I add the columns of an m×2 matrix to create an m×1 matrix for different values of m. In Julia 0.6,

julia> VERSION
v"0.6.3-pre.0"

julia> m = 1; A = @SMatrix rand(m, 2); @btime reducedim(+, $A, Val{2});
  1.894 ns (0 allocations: 0 bytes)

julia> m = 2; A = @SMatrix rand(m, 2); @btime reducedim(+, $A, Val{2});
  2.261 ns (0 allocations: 0 bytes)

julia> m = 3; A = @SMatrix rand(m, 2); @btime reducedim(+, $A, Val{2});
  2.261 ns (0 allocations: 0 bytes)

julia> m = 4; A = @SMatrix rand(m, 2); @btime reducedim(+, $A, Val{2});
  2.264 ns (0 allocations: 0 bytes)

julia> m = 5; A = @SMatrix rand(m, 2); @btime reducedim(+, $A, Val{2});
  2.634 ns (0 allocations: 0 bytes)

julia> m = 10; A = @SMatrix rand(m, 2); @btime reducedim(+, $A, Val{2});
  4.208 ns (0 allocations: 0 bytes)

On the other hand, in Julia 0.7:

julia> VERSION
v"0.7.1-pre.0"

julia> m = 1; A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(2));
  8.015 ns (0 allocations: 0 bytes)

julia> m = 2; A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(2));
  8.205 ns (0 allocations: 0 bytes)

julia> m = 3; A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(2));
  210.229 ns (3 allocations: 160 bytes)

julia> m = 4; A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(2));
  55.512 ns (2 allocations: 128 bytes)

julia> m = 5; A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(2));
  44.934 ns (2 allocations: 144 bytes)

julia> m = 10; A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(2));
  57.598 ns (2 allocations: 272 bytes)

Observations:

  • 0.6 does not allocate at all.
  • 0.7 starts allocating for m ≥ 3, and even for m ≤ 2, where it does not allocate, it is 3–4 times slower than 0.6.
  • Something is odd about m = 3 in 0.7: the code runs significantly slower for m = 3 than for m > 3, probably because it uses one extra allocation for some reason.

Any idea why this regression occurs?
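For comparison, summing the two columns by hand avoids the generic reduction path. A minimal sketch (colsum2 is a name I made up for illustration; it assumes an m×2 SMatrix input as above):

```julia
using StaticArrays

# Hypothetical helper: add the two columns of an m×2 SMatrix directly,
# bypassing the generic reduce machinery.
colsum2(A::SMatrix{M,2}) where {M} = A[:, 1] + A[:, 2]

A = @SMatrix rand(4, 2)

# This should agree with the public API, up to shape:
# colsum2 returns an SVector, reduce an m×1 SMatrix.
colsum2(A) == vec(reduce(+, A, dims=Val(2)))
```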

@wsshin
Contributor Author

wsshin commented Sep 17, 2018

Maybe related to #439?

@tkoolen
Contributor

tkoolen commented Sep 17, 2018

See also #494.

@nlw0

nlw0 commented Dec 27, 2018

When I reproduced your tests today, I only saw allocations for m ≥ 6 in the m×2 case, but they appear sooner in my use case (4×4 matrices):

julia> VERSION
v"1.2.0-DEV.66"

julia> A = randn(4,4);

julia> for m = 1:10
           display(m)
           A = @SMatrix rand(m, 2); @btime reduce(+, $A, dims=Val(1));
           A = @SMatrix rand(m, m); @btime reduce(+, $A, dims=Val(1));
       end
1
1.814 ns (0 allocations: 0 bytes)
1.693 ns (0 allocations: 0 bytes)
2
1.695 ns (0 allocations: 0 bytes)
2.024 ns (0 allocations: 0 bytes)
3
1.754 ns (0 allocations: 0 bytes)
2.029 ns (0 allocations: 0 bytes)
4
2.031 ns (0 allocations: 0 bytes)
36.787 ns (2 allocations: 192 bytes)
5
2.032 ns (0 allocations: 0 bytes)
41.070 ns (2 allocations: 256 bytes)
6
37.657 ns (2 allocations: 144 bytes)
53.786 ns (2 allocations: 368 bytes)
7
32.522 ns (2 allocations: 160 bytes)
65.344 ns (2 allocations: 464 bytes)
8
34.810 ns (2 allocations: 176 bytes)
78.683 ns (2 allocations: 624 bytes)
9
35.368 ns (2 allocations: 192 bytes)
92.158 ns (2 allocations: 752 bytes)
10
38.982 ns (2 allocations: 208 bytes)
107.778 ns (2 allocations: 912 bytes)

Are there any open plans to solve this? This looks like a problem I could work around myself in my own project, but it would certainly be better to have it solved in the library. Is there some way a newbie could help?

@andyferris
Member

Yes, this is annoying. The problem might relate to how keyword-argument functions are lowered; I think there was an issue or comment somewhere about automatically adding @propagate_inbounds to the helper functions to fix certain inlining and bounds-checking performance issues.

PS - Personally, I don't love the dims keyword appearing everywhere in method signatures, since it reflects poor separation of concerns. I would rather we write things like sum.(splitdims(A, 1)) (or use the new eachrow, eachcol, and eachslice functions).
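For plain Base Arrays, the slice-based style suggested above already works in Julia ≥ 1.1 via eachcol/eachrow; a minimal sketch (not involving StaticArrays):

```julia
# Column-wise and row-wise sums via broadcasting over slices,
# as an alternative to the dims keyword.
A = [1 2; 3 4; 5 6]          # 3×2 matrix

colsums = sum.(eachcol(A))   # like sum(A, dims=1): [9, 12]
rowsums = sum.(eachrow(A))   # like sum(A, dims=2): [3, 7, 11]
```

Note that, unlike sum(A, dims=...), these return plain Vectors rather than 1×n or m×1 matrices.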

@tkoolen
Contributor

tkoolen commented Jan 2, 2019

Completely agree, I wish the dims kwarg API would just go away.

@c42f c42f added the performance runtime performance label Jul 31, 2019
@c42f
Member

c42f commented Jul 31, 2019

Probably the same root cause as #540.

@mateuszbaran
Collaborator

One possible workaround is to call the StaticArrays._reduce method directly, like this:

f_(A) = StaticArrays._reduce(+, A, Val(2), NamedTuple())

That method was added in #659.
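A sketch of that workaround in use; note that _reduce is an internal, underscore-prefixed function, so its signature may change between StaticArrays versions:

```julia
using StaticArrays

# Internal-API workaround from the comment above; the trailing NamedTuple()
# stands in for the (empty) keyword arguments of reduce.
f_(A) = StaticArrays._reduce(+, A, Val(2), NamedTuple())

A = @SMatrix rand(3, 2)

# Should agree with the public API, without the allocation overhead:
f_(A) == reduce(+, A, dims=Val(2))
```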
