Electroneutrality breaks when combining Widom + cpH methods #4595

pm-blanco · 2022-10-16T13:56:23Z

@davidbbeyer and I have observed that when combining the Widom insertion method with the constant pH method (and possibly also with the Reaction ensemble) leads to the destruction of an additional ion, ultimately breaking the electroneutrality of the system.

Below I provide a minimal example script that produces the bug. This only happens in the 4.2.0 version of espresso but it does not happen in the previous version 4.1.4, so it could be related to one of the newer changes we introduced.

import espressomd
from espressomd import version
import numpy as np

pH=7
pKa=7
exclusion_range=1
seed=12345
kT=1
BOX_LENGTH=10
N_SALT=2
N_acid = 2
bug=False

types = {"H": 0, "Na": 1, "Cl": 2, "Ha": 3, "a": 4}
charges = {"H": 1.0, "Na": 1.0, "Cl": -1.0, "Ha": 0.0, "a": -1.0}

system = espressomd.System(box_l=[BOX_LENGTH,]*3)
system.part.add(type=[types["Na"]]*N_SALT, pos=np.random.rand(N_SALT, 3) * BOX_LENGTH,
                q=[charges["Na"]]*N_SALT)
system.part.add(type=[types["Cl"]]*N_SALT, pos=np.random.rand(N_SALT, 3) * BOX_LENGTH,
                q=[charges["Cl"]]*N_SALT)
if bug:

    system.part.add(type=[types["Ha"]]*N_acid, pos=np.random.rand(N_acid, 3) * BOX_LENGTH,
                q=[charges["Ha"]]*N_acid)
else:
    system.part.add(type=[types["a"]]*N_acid, pos=np.random.rand(N_acid, 3) * BOX_LENGTH,
                q=[charges["a"]]*N_acid)
    system.part.add(type=[types["H"]]*N_acid, pos=np.random.rand(N_acid, 3) * BOX_LENGTH,
                q=[charges["H"]]*N_acid)

if version.friendly() == "4.2" or version.friendly() == "4.3":
    from espressomd import reaction_methods
    RE = espressomd.reaction_methods.ConstantpHEnsemble(
        constant_pH=pH,
        kT=kT,
        exclusion_range=exclusion_range,
        seed=seed)
    widom = espressomd.reaction_methods.WidomInsertion(
        kT=kT,
        seed=seed)

elif version.friendly() == "4.1":

    from espressomd import reaction_ensemble
    RE = espressomd.reaction_ensemble.ConstantpHEnsemble(
        temperature=kT,
        exclusion_radius=exclusion_range,
        seed=seed)
    RE.constant_pH=pH
    widom = espressomd.reaction_ensemble.WidomInsertion(
        temperature=kT,
        seed=seed)

RE.add_reaction(
        gamma=10**(-pKa),
        reactant_types=[types["Ha"]],
        reactant_coefficients = [1],
        product_types=[types["a"], types["H"]],
        product_coefficients = [1, 1],
        default_charges={types["Ha"]: charges["Ha"],
                         types["a"]: charges["a"],
                         types["H"]: charges["H"]}
        )

widom.add_reaction(
        reactant_types=[],
        reactant_coefficients=[],
        product_types=[types["Na"], types["Cl"]],
        product_coefficients=[1, 1],
        default_charges={types["Na"]: charges["Na"],
                         types["Cl"]: charges["Cl"]}
        )


system.setup_type_map(type_list=list(types.values()))

if version.friendly() == "4.2" or version.friendly() == "4.3":
    widom.calculate_particle_insertion_potential_energy(reaction_id=0)

elif version.friendly() == "4.1":
    widom.measure_excess_chemical_potential(0)

for step in range(2000):
    print("step ", step)
    RE.reaction(reaction_steps=100)
    charge = 0.0
    for p in system.part:
        charge += p.q
    a = system.number_of_particles(type = types["a"])
    H = system.number_of_particles(type = types["H"])
    Ha = system.number_of_particles(type = types["Ha"])
    Na = system.number_of_particles(type = types["Na"])
    Cl = system.number_of_particles(type = types["Cl"])
    print("Before widom: a:", a, "H:", H, "Ha:", Ha, "Na:", Na, "Cl:", Cl)
    print("total charge ", charge)


    # Measure the chemical potential
    for j in range(50):
        charge = 0.0

        if version.friendly() == "4.2" or version.friendly() == "4.3":
            widom.calculate_particle_insertion_potential_energy(reaction_id=0)

        elif version.friendly() == "4.1":
            widom.measure_excess_chemical_potential(0)
        charge = 0.0
        for p in system.part:
            charge += p.q
        if charge != 0:
            a = system.number_of_particles(type = types["a"])
            H = system.number_of_particles(type = types["H"])
            Na = system.number_of_particles(type = types["Na"])
            Ha = system.number_of_particles(type = types["Ha"])
            Cl = system.number_of_particles(type = types["Cl"])
            print("After widom: a:", a, "H:", H, "Ha:", Ha, "Na:", Na, "Cl:", Cl)
            print("total charge ", charge)
            exit()

jngrad · 2022-10-17T15:51:05Z

I can reproduce it on 4.2.0 with 2 MPI ranks, but not with 1 MPI rank. The bug also doesn't always happen. Could not reproduce it on 4.3-dev.

jngrad · 2022-10-17T17:31:14Z

I cannot bisect the bug due to how randomly it happens. UBSAN doesn't catch any undefined behavior on 1 MPI rank. On 2 MPI ranks it catches a null pointer dereference in the boost::mpi code when runtime error messages are reduced on the head node after setting the box size (no idea why).

pm-blanco · 2022-10-18T12:06:55Z

I can reproduce it on 4.2.0 with 2 MPI ranks, but not with 1 MPI rank. The bug also doesn't always happen. Could not reproduce it on 4.3-dev.

@jngrad I did some tests on my work station too. I can reproduce it with 4.2.0 with 1, 2 and 4 MPI ranks (although not always happens, sometimes it actually runs). It never seem to happen in the devel version with any number of ranks. I am very confused about this bug, I will try to dig further into it and see if I can pin point its origin

pm-blanco · 2022-10-24T09:52:09Z

@davidbbeyer @jngrad I have make some progress in improving the reproducibility of this bug. I have updated the example script, it now produces the bug with both espresso 4.2.0 and espresso devel. The bug can be switched on and off with bug=True or bug=False. Some observations about the bug:

The bug always happens if one starts with a system with only "HA" particles if one performs first widom insertion and then cpH and widom again.
The bug never happens if one starts with a system with "A" and "H" particles and does the same procedure than in 1).
The bug never happens in 4.1.4

pm-blanco · 2022-10-24T13:10:09Z

Ok, I finally have pinpointed the origin of the bug. The problem is on the bookkeeping of "empty ids" done with the attribute ReactionMethods::ReactionAlgorithm::m_empty_p_ids_smaller_than_max_seen_particle. Since this list is not shared between different instances of ReactionMethods, it can contain ids that have been actually created by other reactions (or even manually by the user...). This problem actually also is present in espressov4.1.4 but the bug does not show up for reasons that I do not understand yet.
@jngrad @davidbbeyer I think that this bookkeeping as actually implemented is quite dangerous since it assumes that particles are not being created/destroyed outside the given instance of the reaction_methods. In my opinion, it would be more robust to check for the holes in the particle id list every time that the user calls the reaction function. I guess that such implementation would be slightly slower, but I do not think that it would significantly damage the performance. What do you think? If you agree with this solution, I can propose a bugfix myself.

jngrad · 2022-10-24T21:09:59Z

@pm-blanco Good find! I completely forgot about that bookkeeping feature.

Ideally this container of "free" ids should have been a shared pointer global variable in particle_node.cpp, and all reaction methods should have stored a copy of that pointer, such that all reactions would agree on which particle ids are available. But our plan is to remove all globals. Also, such a global variable would only fix the bug from concurrent reactions, and not the other bug you just described where the user creates a new particle using one of the freed ids.

Just for future reference, and for readers of the bugfix release changelog, here is an illustration of the second variant of the bug. Consider a system of 10 particles with id 0 to 9. A reaction deletes particle with id 5 and marks it as "available". Then the user creates a particle with id 5. This particle id will be later used to create a new particle by the reaction algorithm. But in ESPResSo, the function that creates a particle doesn't throw an error if the particle already exists, and instead silently moves it (terrible design decision!). If the user-defined particle has an electric charge that differs from the one that will be created, the system will throw a runtime error about the system not being charge-neutral. However if the reaction is uncharged, there is now way to notify the user that something went wrong.

Here is a MWE derived from a sample:

diff --git a/samples/reaction_ensemble_complex_reaction.py b/samples/reaction_ensemble_complex_reaction.py
index cd727a587..308dc0fca 100644
--- a/samples/reaction_ensemble_complex_reaction.py
+++ b/samples/reaction_ensemble_complex_reaction.py
@@ -100,7 +100,13 @@ numbers = {type_A: [], type_B: [], type_C: [], type_D: [], type_E: []}
 RE.set_non_interacting_type(type=max(types) + 1)
 
 # warmup
-RE.reaction(reaction_steps=200)
+RE.reaction(reaction_steps=2)
+p = system.part.add(pos=np.random.random(3) * system.box_l, type=55, id=52)
+print(p.type, p.pos)
+RE.reaction(reaction_steps=2)
+p = system.part.by_id(52)
+print(p.type, p.pos)
+exit(0)
 
 for i in range(200):
     RE.reaction(reaction_steps=10)

Here is the output, with printfs to show when an id is added or taken from m_empty_p_ids_smaller_than_max_seen_particle. We see that the particle with type 55 is accidentally reacted into a particle of type 2.

$ ./pypresso ../samples/reaction_ensemble_complex_reaction.py
Mark pid 242 as available
Mark pid 52 as available
55 [19.67757918 18.69700346 31.817899  ]
Take pid 52
Take pid 242
Mark pid 52 as available
Mark pid 242 as available
Take pid 52
Take pid 242
Mark pid 193 as available
Mark pid 47 as available
2 [32.15424818 23.28252943 33.68358226]

Computing the free id when needed as proposed by @pm-blanco is probably the best solution, even though it comes with a performance penalty. @RudolfWeeber do you have any objection?

Description of changes: - particle state tracking is now fully encapsulated - helps with separation of concerns and reducing code duplication - particle creation and particle moves are now carried out by 2 independent functions - this makes the reaction methods bugs throw an error instead of silently reacting particles of the wrong type (#4595)

Fixes #4595 Description of changes: - Reaction methods now rebuild the list of free particle ids every time `ReactionAlgorithm::do_reaction()` and `WidomInsertion::calculate_particle_insertion_potential_energy()` are called - Added a new test that checks that the reaction methods do the bookkeeping of empty ids by setting up two different instances of the reaction methods (constant pH and Widom insertion) with competing reactions. I benchmarked my implementation against the current python branch using the `mc_acid_base_reservoir.py` script and I get the following timings: 1.564e-04 (my PR) vs 1.473e-04 (current python branch). That means that my implementation comes with a 6% performance loss in its current state.

Fixes espressomd#4595 Description of changes: - Reaction methods now rebuild the list of free particle ids every time `ReactionAlgorithm::do_reaction()` and `WidomInsertion::calculate_particle_insertion_potential_energy()` are called - Added a new test that checks that the reaction methods do the bookkeeping of empty ids by setting up two different instances of the reaction methods (constant pH and Widom insertion) with competing reactions. I benchmarked my implementation against the current python branch using the `mc_acid_base_reservoir.py` script and I get the following timings: 1.564e-04 (my PR) vs 1.473e-04 (current python branch). That means that my implementation comes with a 6% performance loss in its current state.

jngrad added the Bug label Oct 17, 2022

jngrad changed the title ~~Bug: Electroneutrality breaking when combining Widom + cpH methods~~ Electroneutrality breaks when combining Widom + cpH methods Oct 17, 2022

jngrad added this to the Espresso 4.3.0 milestone Oct 24, 2022

jngrad mentioned this issue Nov 8, 2022

Refactor particle management #4605

Merged

pm-blanco mentioned this issue Nov 23, 2022

Allow multiple concurrent reaction methods #4609

Merged

kodiakhq bot closed this as completed in #4609 Dec 12, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Electroneutrality breaks when combining Widom + cpH methods #4595

Electroneutrality breaks when combining Widom + cpH methods #4595

pm-blanco commented Oct 16, 2022 •

edited

Loading

jngrad commented Oct 17, 2022

jngrad commented Oct 17, 2022

pm-blanco commented Oct 18, 2022 •

edited

Loading

pm-blanco commented Oct 24, 2022 •

edited

Loading

pm-blanco commented Oct 24, 2022 •

edited

Loading

jngrad commented Oct 24, 2022

Electroneutrality breaks when combining Widom + cpH methods #4595

Electroneutrality breaks when combining Widom + cpH methods #4595

Comments

pm-blanco commented Oct 16, 2022 • edited Loading

jngrad commented Oct 17, 2022

jngrad commented Oct 17, 2022

pm-blanco commented Oct 18, 2022 • edited Loading

pm-blanco commented Oct 24, 2022 • edited Loading

pm-blanco commented Oct 24, 2022 • edited Loading

jngrad commented Oct 24, 2022

pm-blanco commented Oct 16, 2022 •

edited

Loading

pm-blanco commented Oct 18, 2022 •

edited

Loading

pm-blanco commented Oct 24, 2022 •

edited

Loading

pm-blanco commented Oct 24, 2022 •

edited

Loading