Excessive RAM consumption when running multiple simulations (Solver Issue) #1442
I have never done much with memory profiling. Do you know a workflow for this? |
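A minimal sketch (not from the thread) of one possible workflow: either decorate the script's entry point with memory_profiler's `@profile` and run it under `mprof`, or use the standard library's `tracemalloc`, as below. The `run_simulations` function here is only a placeholder for the PyBaMM loop being profiled.

```python
# Minimal memory-profiling sketch using only the standard library (tracemalloc).
import tracemalloc

def run_simulations():
    # placeholder for the loop that repeatedly calls sim.solve()
    return [list(range(10_000)) for _ in range(100)]

tracemalloc.start()
run_simulations()
current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")

# The top allocation sites help pinpoint which attribute keeps growing.
for stat in tracemalloc.take_snapshot().statistics("lineno")[:5]:
    print(stat)
tracemalloc.stop()
```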
Ok, I think I understand what is going on (but I'm not 100% sure).

**What's happening**
**Workaround**

I don't think it's worth doing anything about this at this stage. Depending on your use case, some possible fixes are:
`solver.models_set_up = {}`

Let me know if either of these fixes the excessive RAM consumption issue for you |
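The list of fixes above appears to have lost its second entry in this extract. Judging from the follow-up replies (the question about initialising a new solver every time, and the workaround mentioned in the first post), the other option was most likely to construct a fresh CasadiSolver inside the loop. A hedged sketch of that option, reusing the same model and experiment as the other examples in this thread:

```python
import pybamm as pb

model = pb.lithium_ion.SPM()
para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])

for i in range(1000):
    # A fresh solver each iteration, so nothing accumulates in models_set_up,
    # integrators or integrator_specs between solves.
    solver = pb.CasadiSolver(mode="safe")
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
```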
I quickly tried your solution:

```python
import sys
import pybamm as pb

def workaround_sulzer_01():
    model = pb.lithium_ion.SPM()
    para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
    exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])
    solver = pb.CasadiSolver(mode="safe")
    for i in range(1000):
        sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
        sim.solve()
        solver.models_set_up = {}
        print(f"cycle {i:>4}, solver_size: {sys.getsizeof(solver):>4}B")
```

Unfortunately, the RAM consumption continues to increase with this workaround. Do you think it costs me a lot of computing time to initialise a new solver every time? With that solution, the issue gets solved (see the second code example in the first post). |
What about defining the simulation outside the for loop? Or is that not possible for your use case? |
I am updating the model every iteration, based on data from a PV system. It is a kind of lifetime simulation for PV batteries based on real data, and I feed my data in on an hourly basis as an experiment. But even if I define the simulation outside, this would not solve the issue with the solver: RAM consumption is certainly increasing due to the solver and not the simulation. If it helps, I can check which variable in the solver is causing the increase. |
It would still be good to check what happens if you put the simulation outside the for loop; the simulation creating a new model each time might be the problem. Yes, it would be great to pinpoint the variable in the solver that is causing issues. |
There are a couple more storage variables in the casadi solver
|
Ah yeah I forgot about those. How did you generate that plot? |
I used |
Any update on this @huegi ? |
```python
import pybamm as pb

def workaround_RAM_loss():
    model = pb.lithium_ion.SPM()
    para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
    exp = pb.Experiment(["Rest for 5 minutes"])
    solver = pb.CasadiSolver(mode="safe")
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    for i in range(100):
        sim.solve()
        sim.set_up_experiment(
            pb.lithium_ion.SPM(), pb.Experiment(["Rest for 5 minutes"])
        )
        sim.solver.models_set_up = {}
        sim.solver.integrators = {}
        sim.solver.integrator_specs = {}
        print(f"cycle {i:>4}")
```

Running the code above without resetting the three solver variables results in steadily growing memory consumption; running it with those three variables reset works totally fine. Then I tried to find which of the variables is causing the problem. Perhaps one of you can explain why the memory is only released when all three variables are deleted.

PS: If you select a longer experiment, the memory consumption increases in smaller steps, as in @martinjrobins' picture, so you could probably also free up memory there. |
It could be that the three variables reference the memory in the others, so deleting one has no effect as the memory is still referenced somewhere (this is a guess!). I agree that there is still a (small) memory leak somewhere. This may be specific to the casadi solver, as I also tried the scipy solver and did not see a memory increase over time. |
All three dictionaries have the model as their key, so the model's memory only gets released when all three are deleted.
I know nothing about memory leaks, but plausible. This |
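A small, PyBaMM-independent sketch (added for illustration, not from the thread) of the point above: an object used as a dictionary key stays alive for as long as any dictionary still holds it, so clearing only one or two of the three is not enough.

```python
import gc
import weakref

class Model:
    pass

model = Model()
model_ref = weakref.ref(model)  # lets us check whether the object has been collected

# Stand-ins for the three solver dictionaries, all keyed on the model
models_set_up = {model: "set-up data"}
integrators = {model: "integrator"}
integrator_specs = {model: "spec"}

del model
models_set_up.clear()
integrators.clear()
gc.collect()
print(model_ref() is None)  # False: integrator_specs still references the model

integrator_specs.clear()
gc.collect()
print(model_ref() is None)  # True: the model can now be garbage collected
```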
Thank you for your idea. Funny how Python deals with unallocated memory in this case; I have never seen anything like that. But I'm also relatively new to the game ;)

```python
import pybamm as pb

def workaround_RAM_loss():
    sim = pb.Simulation(
        model=pb.lithium_ion.SPM(),
        experiment=pb.Experiment(["Discharge at 0.2C until 3.4V"]),
        parameter_values=pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020),
        solver=pb.CasadiSolver(mode="safe"),
    )
    for i in range(100):
        sim = myCycleFunc(sim)
        print(f"cycle: {i}")

def myCycleFunc(sim: pb.Simulation):
    sim.solver.models_set_up = {}
    sim.solver.integrators = {}
    sim.solver.integrator_specs = {}
    sim.set_up_experiment(
        pb.lithium_ion.SPM(), pb.Experiment(["Discharge at 0.2C until 3.4V"])
    )
    sim.solve()
    # --> normally I would analyse the results here
    return sim
```

Do you think that we have now found all the factors that have led to an increase in memory? The latter issue with |
Can you try replacing https://github.com/pybamm-team/PyBaMM/blob/develop/pybamm/solvers/base_solver.py#L400-L402 with

```python
_, event_eval, _ = process(
    event.expression, f"event_{n}", use_jacobian=False
)
```

? |
Also, I will stress again that you're inducing some (probably unnecessary) overheads by running

And feel free to add a |
I'm using

```python
import pybamm as pb

def workaround_RAM_loss():
    model = pb.lithium_ion.SPM()
    exp_charge = pb.Experiment(["Charge at 0.2C for 30 min or until 4V"])
    exp_discharge = pb.Experiment(["Discharge at 0.2C for 30 min or until 3V"])
    sim = pb.Simulation(
        model=model,
        experiment=pb.Experiment(["Discharge at 0.2C for 120 min or until 3V"]),
        parameter_values=pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020),
        solver=pb.CasadiSolver(mode="safe"),
    )
    sim.solve()
    for i in range(1000):
        model.set_initial_conditions_from(solution=sim.solution, inplace=True)
        sim.reset()  # new proposal to fix the issue
        if i % 2 == 0:
            sim.set_up_experiment(model, exp_charge)
        else:
            sim.set_up_experiment(model, exp_discharge)
        sim.solve()
        last_voltage = sim.solution["Terminal voltage [V]"].entries[-1]
        print(f"cycle: {i}; last_voltage: {last_voltage}")
```

In the following you can find the different memory profiles for 100 cycles. As you can see, it needs the `reset` method below:

```python
def reset(self):
    self.solver.models_set_up = {}
    self.solver.integrators = {}
    self.solver.integrator_specs = {}
    self.model.events = []
```

On the last image you can see what's possible with this reset method (at 1000 cycles). There is still some memory leakage, but I think little enough to run all applications without problems. |
Removing the model's events is not the way to go; it will almost certainly break something else. |
I encounter a similar problem when solving with the parameter "starting_solution": memory increases to 7.2 GB by loop 574. |
@chuckliu1979 can you post your code? |
Sure:

```python
import time
import numpy
import pybamm


def constant_t_eval():
    return numpy.linspace(0, 5760, 576)


def constant_var_pts():
    var = pybamm.standard_spatial_vars
    return {var.x_n: 20, var.x_s: 20, var.x_p: 20, var.r_n: 10, var.r_p: 10}


class PVExperiment(pybamm.Simulation):
    def __init__(
        self,
        model=None,
        experiment=None,
        parameter_values=None,
        var_pts=None,
        solver=None,
        options=None,
        power_supply=None,
        electrodes=None,
        cells=None,
    ):
        self._t_eval = None
        options = options or {"thermal": "lumped"}
        model = model or pybamm.lithium_ion.DFN(options=options)
        var_pts = var_pts or constant_var_pts()
        solver = solver or pybamm.CasadiSolver(mode="safe")
        solver.max_step = 100000
        parameter_values = parameter_values or pybamm.ParameterValues(
            chemistry=pybamm.parameter_sets.Chen2020
        )
        parameter_values.update(
            {
                "Number of electrodes connected in parallel to make a cell": electrodes or 4,
                "Number of cells connected in series to make a battery": cells or 1,
            },
            check_already_exists=False,
        )
        if experiment is None:
            parameter_values.update(
                {
                    "Power function [W]": power_supply or 39.620,
                },
                check_already_exists=False,
            )
        super().__init__(
            model=model,
            parameter_values=parameter_values,
            experiment=experiment,
            var_pts=var_pts,
            solver=solver,
        )

    @property
    def t_eval(self):
        if self._t_eval is None and self.solution is not None:
            self._t_eval = self.solution["Time [s]"].entries
        return self._t_eval

    def solve(
        self,
        t_eval=None,
        solver=None,
        check_model=True,
        save_at_cycles=None,
        starting_solution=None,
        **kwargs,
    ):
        self._t_eval, t_eval = None, None
        return super().solve(
            t_eval=t_eval,
            solver=solver,
            check_model=check_model,
            save_at_cycles=save_at_cycles,
            starting_solution=starting_solution,
            **kwargs,
        )


if __name__ == '__main__':
    solution = None
    parameter_values = None
    for i in range(0, 574):
        print(f"loop {i}:")
        parameter_values = parameter_values or pybamm.ParameterValues(
            chemistry=pybamm.parameter_sets.Chen2020
        )
        vmin = parameter_values["Lower voltage cut-off [V]"]
        sim = PVExperiment(
            experiment=pybamm.Experiment(
                [tuple([f"Discharge at 39.620 W for 10 seconds or until {vmin} V"])]
            ),
            parameter_values=parameter_values,
        )
        solution = sim.solve(starting_solution=solution)
        if solution is not None:
            print(solution["Terminal voltage [V]"].entries)
    print("Sleep...")
    time.sleep(600)
```
|
Hi - I've been looking into this issue today, and a major (but apparently not the only) problem appears to have been caused by a failure to release both the Simulation and ElectrodeSOHSolver instances for garbage collection, due to the use of lru_cache decorators in simulation.py and full_battery_models/lithium_ion/electrode_soh.py. Removing the decorators allows garbage collection to occur as normal. The lru_cache containers can instead be retained by wrapping the functions during class construction, which makes the containers local to the instance (so they are destroyed when the instance goes out of scope or is marked for deletion). Here's a nice explanation of the issue that I found while searching for a suitable solution: https://rednafi.github.io/reflections/dont-wrap-instance-methods-with-functoolslru_cache-decorator-in-python.html.

As a demonstration, here are some memory_profiler results for @huegi's original minimal example (slightly updated due to the deprecation of the 'chemistry' argument in ParameterValues). Code:

```python
import pybamm as pb
import gc

solver = pb.CasadiSolver(mode="safe")
para = pb.ParameterValues('Chen2020')
model = pb.lithium_ion.SPM()
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])

for i in range(30):
    print(i)
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
    del sim
    gc.collect()
```

Note that memory consumption increases much more slowly, and drops from a peak of ~310 MB to ~180 MB. This does not entirely resolve the problem, but I can confirm that the destructors are now being called on each loop. There is also one other use of lru_cache as an instance decorator in the repository (expression_tree/symbol.py). |
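Since the fix is described above only in prose, here is a minimal, PyBaMM-independent sketch of the pattern: applying functools.lru_cache at class level stores `self` in a cache rooted at the class, which keeps every instance alive; wrapping the method inside `__init__` instead makes the cache an instance attribute that is released together with the instance. The class and method names here are illustrative, not PyBaMM's.

```python
import functools

class BadExample:
    # Class-level decorator: the cache belongs to the class and keeps `self`
    # in its keys, so instances are never released while their results are cached.
    @functools.lru_cache(maxsize=None)
    def expensive(self, x):
        return x ** 2

class GoodExample:
    def __init__(self):
        # Per-instance wrapping: the cache lives on the instance and can be
        # garbage collected along with it.
        self.expensive = functools.lru_cache(maxsize=None)(self._expensive)

    def _expensive(self, x):
        return x ** 2
```

The per-instance version does create a reference cycle (instance → cache → bound method → instance), but Python's cyclic garbage collector can break that, unlike a class-level cache that outlives every instance.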
Thanks @jsbrittain, nice catch. You would think the lru_cache docs would put some sort of warning on that decorator |
@martinjrobins I remember implementing |
@jsbrittain's solution of wrapping the functions during class construction doesn't seem to have a downside (apart from an extra line of code), so it sounds like we should do this for all the instances of |
Nice catch @jsbrittain, happy to revert the change to get rid of |
@martinjrobins @tinosulzer - yes, as you say this is not the only issue... The other issue arising from @huegi's minimal sample comes from the solver (as was originally identified), due to an accumulation of items in the

@chuckliu1979's code sample on the other hand produces a growing

However, since having the ability to iteratively run simulations seems desirable, one option would be to provide a user-adjustable limit on the length / history of these lists, which could default so that only the current and previous value sets are retained, for example.

Summarising:
|
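As a generic illustration of the user-adjustable history limit floated above (this is not an existing PyBaMM API, just a sketch of the idea), a bounded container such as `collections.deque(maxlen=...)` keeps only the most recent entries and lets older ones be garbage collected:

```python
from collections import deque

# Hypothetical history buffer: keep only the current and previous solutions.
history = deque(maxlen=2)

for cycle in range(1000):
    solution = {"cycle": cycle}  # stand-in for a PyBaMM Solution object
    history.append(solution)     # entries older than maxlen are dropped automatically

print(list(history))  # only the last two cycles remain in memory
```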
Summary point 3 (above) is also likely to be resolvable by forwarding |
I agree points 1 and 2 should be resolved as you suggest. To give some more context, solving a model gives the outputs

Another bug is that |
Problem description
When using the same solver object for multiple simulations, memory consumption increases with every simulation. The issue is triggered by the solver object. (If this is desired behaviour, please close the issue.)
Link to Slack Discussion
This minimal example triggers the Issue
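The code block itself did not survive in this extract. The following is a reconstruction based on @jsbrittain's later comment, which reproduces "@huegi's original minimal example"; it may differ in detail from the verbatim original.

```python
import pybamm as pb

model = pb.lithium_ion.SPM()
para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])
solver = pb.CasadiSolver(mode="safe")  # one solver object shared by every simulation

for i in range(1000):
    # Reusing the same solver makes its internal dictionaries grow each iteration.
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
```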
This minimal example is a workaround for the Issue