Proceedings 2023 ESPResSo meetings
- next refactoring milestone in #4816
- prototype supports multiple ESPResSo systems in the same simulation script
- prototype available in jngrad/multiverse
- long-range algorithms are not yet ready for multi-system simulations
- For next time:
- organize the community meeting fully remotely and outside the summer school
- defer the BBQ by 30 min to improve the poster session experience
- switch from hybrid back to onsite due to low online participation
- some lectures might still be recorded for the YouTube channel
- circa November
- ESPResSo 4.2.1 was benchmarked on HPC Vega up to 1024 cores
- benchmark report due end of September
- the FFT algorithms on CPU and GPU lose a lot of performance when the simulation box is not cubic
- relevant to ELC simulations, where a gap space is needed in the z-direction
- relevant to P3M simulations that use more than 1 MPI rank
- the box geometry and cell structure are now part of the System class in the core
- most MPI callbacks were removed from the core
- script interface of non-bonded interactions needs a rewrite
- we have a CIP pool test account and a Zoom workstation
- ESPResSo is in `~/Documents/espresso/build`
- can run `pypresso` or `ipypresso` from any location
- `tutorial` alias: moves to the tutorial folder and starts Jupyter
- `soltut` alias: moves to the solutions and starts Jupyter
- simulation is still too slow
- ELC part is complete
- ELCIC is still a work in progress
- 50% complete
- LB/EK refactor: `lb` and `ekcontainer` are now members of the `System` class, the `system.actors` list was removed
- the LB collide-then-stream ("push scheme") was switched to stream-then-collide ("pull scheme"), which can be coalesced into a single kernel, but then boundary conditions must be completely rewritten
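As a toy illustration of the two update orders (plain numpy on a 1D D1Q3 lattice, not the actual waLBerla kernels): in the push scheme each cell collides locally and then scatters the post-collision populations to its neighbors, while in the pull scheme each cell gathers pre-collision populations from its neighbors and collides them in the same pass, which is what allows fusing streaming and collision into one kernel.

```python
# Toy D1Q3 lattice-Boltzmann update illustrating push vs. pull ordering.
# Conceptual sketch only, not the ESPResSo/waLBerla implementation.
import numpy as np

c = np.array([-1, 0, 1])             # discrete velocities
w = np.array([1 / 6, 2 / 3, 1 / 6])  # lattice weights
tau = 1.0                            # relaxation time


def equilibrium(rho, u):
    # first-order equilibrium, sufficient to show the update order
    return w[:, None] * rho * (1.0 + 3.0 * c[:, None] * u)


def push_step(f):
    """Collide-then-stream: collide locally, then push results to neighbors."""
    rho = f.sum(axis=0)
    u = (c[:, None] * f).sum(axis=0) / rho
    f_post = f + (equilibrium(rho, u) - f) / tau                   # collide
    return np.stack([np.roll(f_post[i], c[i]) for i in range(3)])  # stream


def pull_step(f):
    """Stream-then-collide: pull neighbor populations and collide in one pass."""
    f_in = np.stack([np.roll(f[i], c[i]) for i in range(3)])       # stream
    rho = f_in.sum(axis=0)
    u = (c[:, None] * f_in).sum(axis=0) / rho
    return f_in + (equilibrium(rho, u) - f_in) / tau               # collide
```

In the pull form each lattice site is read from its neighbors and written exactly once, so both steps can live in a single fused kernel; the price, as noted above, is that boundary conditions have to be reformulated.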
- ghost force reduction: remove synchronization barrier by computing the forces twice, instead of accumulating the forces on ghost particles and communicating them at every time step; add a particle flag to switch this behavior, since not all algorithms can avoid a ghost force reduction
- GCMC tutorial: pretty much done
- split non-bonded interactions into 3 kernels for central forces, non-symmetric forces and Coulomb-based forces: the goal is to speed up short-range force calculations using a new memory layout
- P3M and other long-range interactions are being rewritten to take particle property arrays instead of an array of particles
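As a hedged illustration of the intended data layout (the name `coulomb_forces_soa` is made up for this sketch, and the brute-force all-pairs sum stands in for the real mesh-based P3M kernels): the kernel receives flat position and charge arrays and returns a force array, instead of walking over full particle objects.

```python
# Illustrative struct-of-arrays kernel: flat property arrays in, forces out.
import numpy as np


def coulomb_forces_soa(pos: np.ndarray, q: np.ndarray, prefactor: float = 1.0):
    """Forces on N particles from positions pos (N, 3) and charges q (N,)."""
    d = pos[:, None, :] - pos[None, :, :]       # pairwise separation vectors
    r2 = np.einsum("ijk,ijk->ij", d, d)         # squared distances
    np.fill_diagonal(r2, np.inf)                # exclude self-interaction
    coeff = prefactor * q[:, None] * q[None, :] / r2 ** 1.5
    return np.einsum("ij,ijk->ik", coeff, d)    # sum over partners j
```

Contiguous position/charge arrays keep memory access linear and vectorizable, which is the point of replacing the particle-range interface with property arrays.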
- investigate zero-centered LB (PDFs with subtracted background rho_0) for faster kernels
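A minimal sketch of what "zero-centered" storage means here (D1Q3 weights for concreteness, not the waLBerla code): only the deviation of each population from its share of the background density rho_0 is stored, and rho_0 is added back when macroscopic quantities are computed.

```python
# Store f_i - w_i * rho_0 instead of f_i; the stored values stay close to zero.
import numpy as np

w = np.array([1 / 6, 2 / 3, 1 / 6])  # D1Q3 lattice weights
rho_0 = 1.0                          # constant background density


def to_zero_centered(f):
    # subtract the background contribution from each population
    return f - w[:, None] * rho_0


def density(f_centered):
    # add the background back only when a macroscopic quantity is needed
    return rho_0 + f_centered.sum(axis=0)
```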
- fine-grained events for reaction methods: reduce the number of calls to `on_particle_change` in the reaction methods, and split `on_particle_change` into more specialized events to avoid reconstructing the cell system upon trivial particle changes
- center of mass virtual sites: issues with implementation in the core
- approx. 90 global variables remain, which are an issue for machine learning and parallel tempering
- electrostatics and magnetostatics actors will soon be members of the System class instead of going to the actors list (#4749)
- `lb` will follow next (depends on another PR)
- in total, 8 global variables will be removed
Here is the proposed new syntax:

```python
system.magnetostatics.solver = espressomd.magnetostatics.DipolarDirectSumCpu(...)
system.electrostatics.solver = espressomd.electrostatics.P3M(...)
system.electrostatics.extension = espressomd.electrostatic_extensions.ICC(...)
system.ekcontainer.solver = espressomd.electrokinetics.EKFFT(...)
system.ekcontainer.add(espressomd.electrokinetics.EKSpecies(...))
system.lb.set(espressomd.lb.LBFluidWalberla(...))
system.electrostatics.clear()
system.magnetostatics.clear()
system.ekcontainer.clear()
system.ekcontainer.solver = None
system.lb.clear()
```
- instead of communicating all particles to MPI rank 0 and then extracting the quantity of interest, extract the quantity on each rank and communicate only the result buffer
- for reductions like weighted sums, do a partial reduction on each rank and a final reduction across all ranks
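A hedged sketch of the proposed pattern using mpi4py for illustration (the function and variable names are not the ESPResSo core API): each rank first reduces its own particles to a small partial result, and only those partial results are combined across ranks.

```python
# Partial reduction per rank, followed by a global reduction of small buffers.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD


def weighted_mean_position(local_pos, local_weights):
    """Weighted mean position over all ranks from per-rank particle data."""
    local_pos = np.asarray(local_pos, dtype=float).reshape(-1, 3)
    local_weights = np.asarray(local_weights, dtype=float)
    # partial reduction on this rank: a (3,) vector and a scalar
    partial = np.empty(4)
    partial[:3] = local_weights @ local_pos
    partial[3] = local_weights.sum()
    # final reduction across all ranks: only 4 doubles are communicated
    total = np.empty(4)
    comm.Allreduce(partial, total, op=MPI.SUM)
    return total[:3] / total[3]
```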
- advection-diffusion-reaction tutorial: started
- capacitor plate tutorial: ELCICC setup is in progress
- the goal is to calculate the force on an ion between two charged plates with ICC, and then measure the capacitance (see the sketch below)
- tracking ticket: #2628
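A back-of-the-envelope sketch of that measurement for an ideal parallel-plate setup (the helper name and symbols are hypothetical, not tied to the tutorial script): the force on the test ion gives the field between the plates, and together with the plate charge and the gap width this yields the capacitance.

```python
def capacitance_from_force(force_z, ion_charge, plate_charge, gap):
    """Ideal parallel-plate estimate: C = Q / U with U = E * d and E = F / q."""
    E = force_z / ion_charge   # field acting on the test ion
    U = E * gap                # potential difference across the gap
    return plate_charge / U    # capacitance C = Q / U
```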
- the ESPResSo global variables are causing a lot of issues with the integration of new features
- GPU LB for example is blocked because the GPU primary context expires before the device memory is deallocated, causing a runtime crash at the end of a simulation
- machine-learning projects are stalled because the state of an ESPResSo system cannot be reset, and restarting a Python kernel is not feasible on clusters where kernels are tied to allocated slurm jobs
- a C++ `System` class will be introduced to collect some of the global state of ESPResSo into the `espressomd.system.System` class (#4741)
- swimmers
- per-particle gamma for Langevin/Brownian/lattice-Boltzmann was added (#4743)
- per-particle anisotropic gamma with rotation is unclear in LB: how to do point-coupling of torques in LB?
- swimmers will be re-implemented as virtual sites (RudolfWeeber/espresso#61)
- pre-requisite for GPU LB
- observables
- currently, with particle-based observables, the entire particle information has to be communicated to MPI rank 0, where the relevant property is then extracted, which leads to a communication bottleneck
- the relevant property could be evaluated on the local MPI ranks, and the result could be sent in a much more compact buffer to MPI rank 0
- center of mass virtual site
- virtual site that tracks the center of a collection of particles, to help redistribute forces e.g. from umbrella sampling
- new feature: particle dipole field calculation (#4626)
- consider running coding days over 3 days to maximize productivity outside the lecture period
- everyone: clean up the ICP wiki pages, especially those about ESPResSo, so that the resources we provide for getting started with development are easier to find
- the waLBerla implementations of EK/LB are now part of the development branch of ESPResSo
- still missing: forces on boundaries, EK thermalization, GPU LB
- GPU LB:
- currently each MPI rank manages its pdfs field on device memory
- one could instantiate the entire field once on the device and only communicate particle data with MPI rank 0
- particle coupling needs to be rewritten as GPU kernels
- LB pressure tensor: awaiting review
- LB shear profile regression is no longer affecting the current waLBerla LB implementation
- EK thermalization: RNG and kernels mostly ready, new RNG class was needed in the codegen script
- removing particle ranges from P3M: goal is to rewrite kernels to take position and charge iterators, and return forces
- parallel observables: currently the observables are only calculated on MPI rank 0 and require a full copy of all particles; the parallel version should run the kernel in parallel on the local particles and reduce the results to MPI rank 0
- remove code duplication in the Langevin and LB friction coupling, such that anisotropic particles can be coupled in LB
- implement a virtual particle type that tracks the center of mass of a collection of particles to help distribute a force on all particles from that collection, e.g. for umbrella sampling (see the sketch after this list)
- Lees-Edwards particle coupling: the position and velocity offsets need to be interpolated in the shear boundary
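Regarding the center-of-mass virtual site above, a plain-numpy sketch of the intended behavior (not the ESPResSo implementation): the site sits at the mass-weighted center of the collection, and a force applied to the site is handed back to the real particles in proportion to their masses.

```python
# Center-of-mass tracking and mass-weighted force redistribution.
import numpy as np


def center_of_mass(pos, mass):
    """Mass-weighted center of a collection: pos (N, 3), mass (N,)."""
    return mass @ pos / mass.sum()


def redistribute_force(force_on_site, mass):
    """Split a force applied to the virtual site among the real particles."""
    return np.outer(mass / mass.sum(), np.asarray(force_on_site, dtype=float))
```

Distributing the force in proportion to mass means the collection's center of mass accelerates as if the whole force acted on a single particle of the total mass, which is the behavior wanted for e.g. an umbrella-sampling bias force.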
- main objectives for ESPResSo:
- use high-performance code
- write GPU-enabled lattice-based algorithms
- write demonstrator using batteries physics
- release notes and PR
- tasks:
- checkpointing (Mariano, Ingo, David)
- Brownian and Langevin dynamics (Christoph, Chinmay)
- reaction method bugfixes (Pablo, Keerthi, David)
- active matter tutorial (Somesh)
- visualizer: arrow materials, collision detection (Ingo)
- LB GPU (Alex)
- investigate Lees-Edwards effects on cluster analysis, fractal dimension, polymer observables (JN)
- branch: jngrad/espresso@walberla_cuda
- `LBWalberlaImpl` class rewritten to instantiate CUDA fields instead of CPU fields
- all LB setters and getters pass the tests, macroscopic accessors are 100% CUDA code
- particle coupling and LB boundaries are implemented
- VTK writers not yet implemented
- needs performance improvements
- checkpointing now works in parallel
- thermalized electrokinetics is not ready yet
- GPU LB is not ready yet (custom interface needs to be written)
- MC methods were rewritten as Python code by Pablo
- overhead from MPI callbacks in the hot loop (`make_reaction_attempt`)
- priority is to be able to write the acceptance probability in Python
- the custom GPU detection code was replaced with native CUDA support from CMake 3.22 (#4642)
- specify the CUDA compiler and toolkit with `CUDACXX=/path/to/nvcc cmake .. -D CUDAToolkit_ROOT=/path/to/toolkit`
- bugs with high severity:
- broken checkpointing: LB boundaries, particle director and dipole moment
- broken reactions when using concurrent reactions that delete particles
- broken configurational moves in reactions
- broken Brownian integration of fixed particles
- draft release notes: 4.2.1
- contributions to community codes: LAMMPS, ESPResSo, waLBerla, kinetic lattice models
- deliverables:
- filesystem-based environments containing dependencies tailored for HPC (EESSI)
- waLBerla-ESPResSo integration including CUDA kernels
- refactoring electrostatics algorithms (e.g. replacing MPI FFT by shared memory FFT)
- ESPResSo is still missing the CUDA setters and getters (populations, density, etc.)
- ekin checkpointing is a work in progress
- planned for mid-February