Change README rule of thumb #506

Open
ChrisRackauckas opened this issue Sep 24, 2018 · 7 comments
Labels: documentation (documentation improvements)

Comments

@ChrisRackauckas (Member)

The README says:

> Note that in the current implementation, working with large StaticArrays puts a lot of stress on the compiler, and becomes slower than Base.Array as the size increases. A very rough rule of thumb is that you should consider using a normal Array for arrays larger than 100 elements. For example, the performance crossover point for a matrix multiply microbenchmark seems to be about 11x11 in julia 0.5 with default optimizations.

In Julia v1.0 there were some pretty big changes to tuple optimizations, for example JuliaLang/julia#27398. Since SVectors are backed by tuples, this seems to have had a pretty big effect on large static arrays. For example, 11x11 used to be over the tuple inference limit, which was counteracted by some of the nice generated functions in StaticArrays.jl, and that made the two seem to match. But without that inference limit, 11x11 seems to be much better than before.
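For reference, a rough sketch of how one could re-check that crossover on a current Julia (assuming BenchmarkTools is available; edit the `11`s to probe other sizes):

```julia
using StaticArrays, BenchmarkTools

# 11×11 was the old crossover size quoted in the README.
A  = rand(SMatrix{11,11,Float64})
B  = rand(SMatrix{11,11,Float64})
Ad = Matrix(A)
Bd = Matrix(B)

@btime $A * $B     # StaticArrays generated/unrolled multiply
@btime $Ad * $Bd   # Base.Array multiply (BLAS)
```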

@dpsanders mentioned in Slack

> Chris Rackauckas [6:43 PM]
>
> > For example, the performance crossover point for a matrix multiply microbenchmark seems to be about 11x11 in julia 0.5 with default optimizations.
>
> Is probably much higher now.
>
> David Sanders [6:47 PM]
>
> Yes, my global optimization code in N dimensions now is super fast
> (N-dimensional `IntervalBox`es just wrap an `SVector` of `Interval`s)
> Something like a factor of 10 faster than with Julia 0.6

I think it might be worth re-evaluating the rule of thumb in the README, since the definition of "small" may have shifted. In fact, "small" seems to have shifted high enough that the trade-off is now runtime vs. compilation time, since compilation time can grow quite quickly for much larger static arrays. Once again quoting @dpsanders:

> If I use SVectors of big dimension (e.g. 100, 200) there seems to be a huge compile-time penalty, but the calculations themselves are blindingly fast. Is this normal?

So a more nuanced discussion is probably needed in Julia v1.0 land, and I think this is important since that 11x11 number is something I and others copy around as a handy heuristic which may now be out of date!
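For concreteness, a rough sketch of how to see that compile-time/run-time split at the REPL (assuming only StaticArrays; the dimension 200 just mirrors the quote above):

```julia
using StaticArrays

v = rand(SVector{200, Float64})
f(x) = sum(x .* x)

@time f(v)   # first call: dominated by compiling the size-200 methods
@time f(v)   # second call: run time only, and very fast
```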

@andyferris (Member)

Yes... it's hard to condense down all the nuances into a sentence or two for new users. I'm definitely open to suggestions!

Out of curiosity, how high-dimensional are these `IntervalBox`es? (I've been playing with my own interval boxes with a similar-sounding implementation, for implementing R-trees for 2D and 3D geometries, but that seems orthogonal to large static arrays.)

@dpsanders (Contributor)

An `IntervalBox` wraps an `SVector{N, Interval{Float64}}` to represent a Cartesian product of intervals living in N dimensions, e.g.

```julia
julia> using IntervalArithmetic

julia> X = IntervalBox(-10..10, 20)
[-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10] × [-10, 10]
```

so that `X` is a subset of R^20.

I can do global optimization of a function R^N -> R with, say, N=200, except that compiling the code takes ages. I'm not sure up to which value of N the calculation would run in a reasonable time if the compile-time penalty were less significant. I'm happy to give more details.

@andyferris (Member)

OK, yep.

Also note #494. The long-term goal here has always been to use loops and so on for larger arrays, which would solve the compilation-time issue. For N = 200 I would start to wonder whether it makes sense to keep this data on the stack rather than the heap.

PS: random thought, what about the syntax `-10..10 ⊗ -10..10 ⊗ -10..10 ⊗ ...`?

@dpsanders (Contributor)

```julia
julia> (-10..10) × (-10..10) × (-10..10)
[-10, 10] × [-10, 10] × [-10, 10]
```

already works (but you don't want to do that for 50 of them) ;) Slight abuse of the `×` operator, maybe.

I also wonder about heap vs stack, but it doesn't seem to be something that's easy to play with?

@andyferris (Member)

> I also wonder about heap vs stack, but it doesn't seem to be something that's easy to play with?

Try comparing `MVector` vs `SVector`.

@ChrisRackauckas (Member, Author)

`MVector` is still backed by a tuple? Its storage is still stack-allocated, and it's just the wrapper type that gets allocated each time, though?

> I also wonder about heap vs stack, but it doesn't seem to be something that's easy to play with?

DiffEq has both in-place and out-of-place routes, each optimized in its own right. That would probably give a good "full program" perspective on where the cutoff should be. We could make a test that is a PDE discretization with N points and find the switching point N.
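Something along these lines, maybe (just a sketch of the kind of test I mean, using a 1D heat-equation stencil; the names `f_static`/`f_inplace!` and the grid size `N` are placeholders, and DifferentialEquations + BenchmarkTools are assumed):

```julia
using DifferentialEquations, StaticArrays, BenchmarkTools

const N = 8  # grid size; sweep this to find the switching point

# Out-of-place RHS returning an SVector (1D heat equation, zero Dirichlet ends)
function f_static(u, p, t)
    SVector{N}(ntuple(i -> begin
        left  = i == 1 ? 0.0 : u[i-1]
        right = i == N ? 0.0 : u[i+1]
        left - 2u[i] + right
    end, N))
end

# In-place RHS mutating du, same discretization on a plain Vector
function f_inplace!(du, u, p, t)
    for i in 1:N
        left  = i == 1 ? 0.0 : u[i-1]
        right = i == N ? 0.0 : u[i+1]
        du[i] = left - 2u[i] + right
    end
    return nothing
end

u0s   = rand(SVector{N, Float64})
u0v   = Vector(u0s)
tspan = (0.0, 1.0)

prob_static  = ODEProblem(f_static,   u0s, tspan)  # out-of-place problem
prob_inplace = ODEProblem(f_inplace!, u0v, tspan)  # in-place problem

@btime solve($prob_static,  Tsit5(), save_everystep = false)
@btime solve($prob_inplace, Tsit5(), save_everystep = false)
```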

@KristofferC (Contributor)

`MVector` is stored on the heap (that's why you can mutate it).
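For anyone who wants to check this at the REPL, a rough sketch (assuming BenchmarkTools; exact allocation counts depend on the Julia version and on whether the compiler can elide the wrapper):

```julia
using StaticArrays, BenchmarkTools

sv = rand(SVector{8, Float64})
mv = MVector{8, Float64}(sv)

@btime $sv + $sv                 # SVector arithmetic: expect 0 allocations
@btime MVector{8, Float64}($sv)  # building a fresh MVector: expect a heap allocation
@btime $mv .= $sv .+ $sv         # mutating an existing MVector in place: expect 0 allocations
```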
