
Performance #10

Closed
ChrisRackauckas opened this issue Mar 31, 2017 · 2 comments

Comments

@ChrisRackauckas (Member)

As came up in #9 (comment), the performance could use improvement. The basic design simply uses callbacks on ODE solvers and, for pure Gillespie models, the order-0 Discrete ODE solver. This works, gives interpolations, event handling, etc., and most importantly it makes jumps compose naturally with ODEs/SDEs/PDEs/etc. So this is a good long-term design and I don't want to build an alternative. Instead, I want to find out how to make this design faster.
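For concreteness, here is a minimal sketch of that design in the present-day DiffEqJump/JumpProcesses API (the 2017 API differed; `ConstantRateJump`, `Direct`, and `SSAStepper` are the modern names): a jump is just a rate function plus an `affect!` callback attached to a discrete problem.

```julia
using JumpProcesses  # formerly DiffEqJump

# A pure birth process: the rate depends on the state, and the jump's
# affect! mutates the integrator -- exactly the callback design described above.
rate(u, p, t) = p.λ * u[1]
affect!(integrator) = (integrator.u[1] += 1)
jump = ConstantRateJump(rate, affect!)

# Pure Gillespie model: no drift term, so it rides on a discrete (order-0) problem.
prob = DiscreteProblem([1], (0.0, 10.0), (λ = 2.0,))
jump_prob = JumpProblem(prob, Direct(), jump)
sol = solve(jump_prob, SSAStepper())
```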

The good thing is, we have @sdwfrost's Gillespie.jl as a benchmark target. For very large systems we should already be faster because we use a sparse representation of the jump stoichiometry versus their dense matrix, so let's not focus there. Instead, we should focus on the performance of the small test cases versus Gillespie.jl. The test case from that comment is a good one.
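To illustrate the scaling claim (hypothetical names, not either library's actual internals): a dense stoichiometry matrix pays for every species on every jump, while an `affect!` function only touches the entries that change.

```julia
# Dense representation: column j of the stoichiometry matrix is added to the
# full state vector on every jump -- O(num_species) work per jump.
dense_update!(u, ν::Matrix, j) = (u .+= @view(ν[:, j]))

# Sparse/functional representation: the affect! for an SIR infection jump
# mutates only the two species that change -- O(1) work per jump.
infection_affect!(u) = (u[1] -= 1; u[2] += 1)
```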

There are two areas that likely need optimization. First, we are definitely hitting splatting penalties:

JuliaLang/julia#13359
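A generic sketch of that class of problem (not the actual DiffEqJump call site): when the splatted container's element type is abstract, the call cannot be inferred, so it falls back to dynamic dispatch and allocates.

```julia
using BenchmarkTools

f(x, y, z) = x + y + z

args = Any[1, 2, 3]          # abstract element type: f(args...) is not inferable
call_splat(g, a) = g(a...)   # splatted call -> dynamic dispatch + allocation

@btime call_splat($f, $args) # slow, allocating path
@btime f(1, 2, 3)            # fully inlined direct call
```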

I don't think we should work around what is hopefully a bug that will be patched before 1.0. The other thing is that we should check the generated code in the OrdinaryDiffEq solvers (and equivalently the StochasticDiffEq solvers, which handle this almost identically) for the event handling. Essentially, a benchmark of jumps is just a benchmark of the differential equation solvers' event handling. In theory, nothing should open a large performance gap versus a tight loop here, other than the fact that tstops, the structure holding the next timepoint to hit, is a binary heap. It has to be a binary heap so that event handling can push timepoints into tstops in arbitrary order. However, I wouldn't expect it to be that bad...
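For reference, a minimal sketch of why a heap is needed there (using DataStructures.jl; illustrative, not the integrator's actual internals):

```julia
using DataStructures

# Events can push future stop times in any order; the integrator must always
# step to the earliest remaining one, which is what a min-heap's
# O(log n) push!/pop! provides.
tstops = BinaryMinHeap{Float64}()
push!(tstops, 5.0)
push!(tstops, 1.0)
push!(tstops, 3.0)
next_t = pop!(tstops)  # 1.0, regardless of insertion order
```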

So the aim is: don't lose any generality, but get within 2x of Gillespie.jl, likely closer to 1.5x. We may need to use generated functions for the rate equations to do so, and the splatting recursions may need to be checked. This should be doable.

@ChrisRackauckas (Member, Author)

I'm going to try a Val-type format.
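A minimal sketch of what that means, in current Julia syntax (the names here are hypothetical, not the eventual DiffEqJump internals): encoding an index as `Val{I}` moves it into the type domain, so indexing a heterogeneous tuple of functions resolves at compile time instead of paying dynamic dispatch.

```julia
# Two stand-in "affect!" functions; each closure has its own concrete type.
affects = (u -> u + 1, u -> 2u)

# With I in the type domain, affs[I] has a concrete type and the call inlines.
apply_affect(affs, ::Val{I}, u) where {I} = affs[I](u)

apply_affect(affects, Val(1), 10)  # statically resolved call to the first function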

@ChrisRackauckas (Member, Author)

The performance goals were hit. We actually allocate less than Gillespie.jl and scale better because of the sparse format. Branch prediction was improved by compiling away some more branches, i.e. cheating, and so now even on the simple test problems we are <2x and closer to 1.5x, matching and then surpassing Gillespie.jl as the problem gets harder.

There are two spots that show up in the profiler. One is the binary heap push/pop, which will be improved by the compiler in the near future once it knows not to discard the larger array. The second is this line:

https://github.com/JuliaDiffEq/DiffEqJump.jl/blob/master/src/aggregators/direct.jl#L18

Because affects! is a tuple of functions, this incurs a fixed ~100ns dynamic-dispatch cost on each use, and that accounts for the entire remaining difference. For small problems like the SIR test problem, it causes the full 1.5-2x gap, which can be verified by hardcoding the call to affects![1] (that makes our timing == Gillespie.jl). For larger problems the other calculations grow in complexity while this cost stays constant, which is why other effects then dominate. This has to stay because it allows arbitrary functions to act on the integrator, giving far more functionality than just allowing increments and decrements, at a cost of <100ns per use.
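A sketch of the dispatch effect being described (illustrative names, not the real direct.jl code): indexing a heterogeneous tuple of functions with a runtime integer yields an abstractly typed callee, so the call dispatches dynamically, while a literal index resolves statically.

```julia
affects! = (u -> u + 1, u -> 2u)  # heterogeneous tuple: each closure has its own type

# i is known only at runtime -> affs[i] is abstractly typed -> dynamic dispatch
run_dynamic(affs, i, u) = affs[i](u)

# A literal index is resolved at compile time -> direct, inlinable call
run_static(affs, u) = affs[1](u)
```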

In total: we hit the goal, quantified exactly where the remaining difference comes from, and established why it either has to stay or will fix itself in the future. Case closed.
