
Performance #10

Closed
ChrisRackauckas opened this issue Mar 31, 2017 · 2 comments

Comments

@ChrisRackauckas (Member)

As came up in #9 (comment), the performance could use improvement. The basic design simply uses callbacks on ODE solvers and, for pure Gillespie models, the order-0 Discrete ODE solver. This works, gives interpolations, event handling, etc., and most importantly it makes jumps compose naturally with ODEs/SDEs/PDEs/etc. So this is a good long-term design and I don't want to build an alternative. Instead, I want to find out how to make this design faster.
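For concreteness, here is a minimal sketch of that design in the present-day DiffEqJump/JumpProcesses API (the 2017 API differed; `ConstantRateJump`, `Direct`, and `SSAStepper` are the modern names): a jump is just a rate function plus an `affect!` callback attached to a discrete problem.

```julia
using JumpProcesses  # formerly DiffEqJump

# A pure birth process: the rate depends on the state, and the jump's
# affect! mutates the integrator -- exactly the callback design described above.
rate(u, p, t) = p.λ * u[1]
affect!(integrator) = (integrator.u[1] += 1)
jump = ConstantRateJump(rate, affect!)

# Pure Gillespie model: no drift term, so it rides on a discrete (order-0) problem.
prob = DiscreteProblem([1], (0.0, 10.0), (λ = 2.0,))
jump_prob = JumpProblem(prob, Direct(), jump)
sol = solve(jump_prob, SSAStepper())
```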

The good thing is, we have @sdwfrost's Gillespie.jl as a benchmark target. For very large systems we should already be faster because we use a sparse representation of the jump stoichiometry versus their dense matrix, so let's not focus there. Instead, we should focus on the performance of the small test cases versus Gillespie.jl. The test case from that comment is a good one.
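To illustrate the scaling claim (hypothetical names, not either library's actual internals): a dense stoichiometry matrix pays for every species on every jump, while an `affect!` function only touches the entries that change.

```julia
# Dense representation: column j of the stoichiometry matrix is added to the
# full state vector on every jump -- O(num_species) work per jump.
dense_update!(u, ν::Matrix, j) = (u .+= @view(ν[:, j]))

# Sparse/functional representation: the affect! for an SIR infection jump
# mutates only the two species that change -- O(1) work per jump.
infection_affect!(u) = (u[1] -= 1; u[2] += 1)
```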

There are two areas that likely need optimization. First, we are definitely hitting splatting penalties:

JuliaLang/julia#13359
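A generic sketch of that class of problem (not the actual DiffEqJump call site): when the splatted container's element type is abstract, the call cannot be inferred, so it falls back to dynamic dispatch and allocates.

```julia
using BenchmarkTools

f(x, y, z) = x + y + z

args = Any[1, 2, 3]          # abstract element type: f(args...) is not inferable
call_splat(g, a) = g(a...)   # splatted call -> dynamic dispatch + allocation

@btime call_splat($f, $args) # slow, allocating path
@btime f(1, 2, 3)            # fully inlined direct call
```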

I don't think we should work around what is hopefully a bug that will be patched before 1.0. The other thing is that we should check the generated code in the OrdinaryDiffEq solvers (and equivalently the StochasticDiffEq solvers, which handle this almost identically) for the event handling. Essentially, a benchmark of jumps is just a benchmark of the differential equation solvers' event handling. In theory, nothing should open a large performance gap versus a tight loop here, other than the fact that tstops, the structure holding the next timepoint to hit, is a binary heap. It has to be a binary heap so that event handling can push timepoints into tstops in arbitrary order. However, I wouldn't expect it to be that bad...
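For reference, a minimal sketch of why a heap is needed there (using DataStructures.jl; illustrative, not the integrator's actual internals):

```julia
using DataStructures

# Events can push future stop times in any order; the integrator must always
# step to the earliest remaining one, which is what a min-heap's
# O(log n) push!/pop! provides.
tstops = BinaryMinHeap{Float64}()
push!(tstops, 5.0)
push!(tstops, 1.0)
push!(tstops, 3.0)
next_t = pop!(tstops)  # 1.0, regardless of insertion order
```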

So the aim is: don't lose any generality, but get within 2x of Gillespie.jl, likely closer to 1.5x. We may need to use generated functions for the rate equations to do so, and the splatting recursions may need to be checked. This should be doable.

@ChrisRackauckas (Member, Author)

I'm going to try a Val-type format.
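A minimal sketch of what that means, in current Julia syntax (the names here are hypothetical, not the eventual DiffEqJump internals): encoding an index as `Val{I}` moves it into the type domain, so indexing a heterogeneous tuple of functions resolves at compile time instead of paying dynamic dispatch.

```julia
# Two stand-in "affect!" functions; each closure has its own concrete type.
affects = (u -> u + 1, u -> 2u)

# With I in the type domain, affs[I] has a concrete type and the call inlines.
apply_affect(affs, ::Val{I}, u) where {I} = affs[I](u)

apply_affect(affects, Val(1), 10)  # statically resolved call to the first function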

@ChrisRackauckas (Member, Author)

The performance goals were hit. We actually allocate less than Gillespie.jl and scale better because of the sparse format. Branch prediction was improved by compiling away some more branches, i.e. cheating, and so now even on the simple test problems we are <2x and closer to 1.5x, matching and then surpassing Gillespie.jl as the problem gets harder.

There are two spots that show up in the profiler. One is the binary heap push/pop, which will be improved by the compiler in the near future once it knows not to discard the larger array. The second is this line:

https://github.com/JuliaDiffEq/DiffEqJump.jl/blob/master/src/aggregators/direct.jl#L18

Because affects! is a tuple of functions, this incurs a fixed ~100ns dynamic-dispatch cost on each use, and that accounts for the entire remaining difference. For small problems like the SIR test problem, it causes the full 1.5-2x gap, which can be verified by hardcoding the call to affects![1] (that makes our timing == Gillespie.jl). For larger problems the other calculations grow in complexity while this cost stays constant, which is why other effects then dominate. This has to stay because it allows arbitrary functions to act on the integrator, giving far more functionality than just allowing increments and decrements, at a cost of <100ns per use.
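A sketch of the dispatch effect being described (illustrative names, not the real direct.jl code): indexing a heterogeneous tuple of functions with a runtime integer yields an abstractly typed callee, so the call dispatches dynamically, while a literal index resolves statically.

```julia
affects! = (u -> u + 1, u -> 2u)  # heterogeneous tuple: each closure has its own type

# i is known only at runtime -> affs[i] is abstractly typed -> dynamic dispatch
run_dynamic(affs, i, u) = affs[i](u)

# A literal index is resolved at compile time -> direct, inlinable call
run_static(affs, u) = affs[1](u)
```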

In total: we hit the goal, quantified exactly where the remaining difference comes from, and established why it either has to stay or will fix itself in the future. Case closed.
