Enable multithreading in IDAKLU #2947
Conversation
Sample performance (based on the DAE solver example):

```python
import pybamm
import numpy as np

# construct model
model = pybamm.lithium_ion.DFN()
geometry = model.default_geometry
param = model.default_parameter_values
param.process_model(model)
param.process_geometry(geometry)
n = 500  # controls the mesh resolution (increases the number of grid points / states)
var_pts = {"x_n": n, "x_s": n, "x_p": n, "r_n": round(n / 10), "r_p": round(n / 10)}
mesh = pybamm.Mesh(geometry, model.default_submesh_types, var_pts)
disc = pybamm.Discretisation(mesh, model.default_spatial_methods)
disc.process_model(model)
t_eval = np.linspace(0, 3600, 100)

# solve using IDAKLU, first with 1 thread, then with 4
options = {"num_threads": 1}
for _ in range(5):
    klu_sol = pybamm.IDAKLUSolver(atol=1e-8, rtol=1e-8, options=options).solve(model, t_eval)
    print(f"Solve time: {klu_sol.solve_time.value*1000} msecs [{options['num_threads']} threads]")

options = {"num_threads": 4}
for _ in range(5):
    klu_sol = pybamm.IDAKLUSolver(atol=1e-8, rtol=1e-8, options=options).solve(model, t_eval)
    print(f"Solve time: {klu_sol.solve_time.value*1000} msecs [{options['num_threads']} threads]")
```

Output:

```
Solve time: 7708.164755254984 msecs [1 threads]
Solve time: 7715.674336999655 msecs [1 threads]
Solve time: 7623.938228935003 msecs [1 threads]
Solve time: 7630.4152719676495 msecs [1 threads]
Solve time: 7748.058792203665 msecs [1 threads]
Solve time: 4278.668664395809 msecs [4 threads]
Solve time: 4256.6283363848925 msecs [4 threads]
Solve time: 4241.460246965289 msecs [4 threads]
Solve time: 4225.298915058374 msecs [4 threads]
Solve time: 4225.85211135447 msecs [4 threads]
```
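For reference, a quick summary of the timings above (a convenience calculation using values copied from the printed output; not part of the PR itself):

```python
# Mean solve times and speed-up from the figures printed above (rounded to 0.1 ms).
import statistics

t1 = [7708.2, 7715.7, 7623.9, 7630.4, 7748.1]  # 1 thread, msecs
t4 = [4278.7, 4256.6, 4241.5, 4225.3, 4225.9]  # 4 threads, msecs
print(f"mean (1 thread):  {statistics.mean(t1):.0f} msecs")  # ~7685 msecs
print(f"mean (4 threads): {statistics.mean(t4):.0f} msecs")  # ~4246 msecs
print(f"speed-up: {statistics.mean(t1) / statistics.mean(t4):.2f}x")  # ~1.81x
```

i.e. roughly a 1.8x speed-up with 4 threads on this example.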
Codecov Report: patch and project coverage have no change.

Additional details and impacted files:

```
@@           Coverage Diff            @@
##           develop    #2947   +/-   ##
========================================
  Coverage    99.71%   99.71%
========================================
  Files          273      273
  Lines        19002    19002
========================================
  Hits         18947    18947
  Misses          55       55
```

☔ View full report in Codecov by Sentry.
@martinjrobins I've implemented OpenMP as it provides an immediate speed-up while I look further into MPI.
Thanks @jsbrittain, this is great! Did you check that `num_threads=1` with `N_VNew_OpenMP` gives the same performance as using `N_VNew_Serial`?
@martinjrobins Comparing average 1-thread completion times: 7599 msecs on the feature branch (OpenMP) vs 7697 msecs on develop (Serial). I may have mentioned some overhead/slowdown previously, but that was because I had missed some vector definitions and was mixing the OpenMP and Serial vector types. Now that those are fixed there is no slow-down (OpenMP is even fractionally quicker).
Description
Enable multithreading in IDAKLU by replacing Serial vectors with OpenMP (OMP) vectors, and exposing a `num_threads` parameter to the user. Serial vectors have been replaced with OMP vectors in the C solver, with the default `num_threads=1` being equivalent to the Serial vector implementation. OMP vectors were introduced in the SUNDIALS CVODE solver in v2.8.0 (currently v6.5.1), released in 2015 ([changelog](https://sundials.readthedocs.io/en/latest/History_link.html)).

User implementation example:
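Since the example code itself was not reproduced here, the following is a minimal sketch of the intended usage (assuming a model that has already been parameterised and discretised, as in the benchmark script above); the only new user-facing piece is the `num_threads` entry in the solver options:

```python
import pybamm

# Assumes `model` and `t_eval` have been prepared as in the benchmark script above,
# and that SUNDIALS has been built with OpenMP enabled (see the note at the end).
solver = pybamm.IDAKLUSolver(
    atol=1e-8,
    rtol=1e-8,
    options={"num_threads": 4},  # default is 1, equivalent to the Serial implementation
)
solution = solver.solve(model, t_eval)
print(f"Solve time: {solution.solve_time.value * 1000} msecs")
```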
(Partially) Fixes #2645
Implements shared-memory multithreading only, not distributed-memory (MPI) parallelism.
Type of change
Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s. If necessary, also add to the list of breaking changes.
Key checklist:
- `$ pre-commit run` (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
- `$ python run-tests.py --all`
- `$ python run-tests.py --doctest`

You can run unit and doctests together at once, using `$ python run-tests.py --quick`.

Further checks:
**NOTE:** This requires recompilation of SUNDIALS with `omp` enabled; while this is now the default behaviour, it will not be compatible with previous installs.