One of our researchers is trying to run multiple LAMMPS jobs on a single node of our cluster, where each job uses one core:
mpirun -np 1 lmp_mpi < Project.txt7 > output_7_1.txt
However, as the user submits multiple jobs that are similar:
…
mpirun -np 1 lmp_mpi < Project.txt7 > output_7_1.txt
mpirun -np 1 lmp_mpi < Project.txt7 > output_8_1.txt
…
and the number of LAMMPS jobs running on a node increases, each instance uses less than 100% of a CPU, which ultimately slows down all the simulations.
Here’s the CPU usage of each of the simulations on one of the compute nodes, as reported by the top command:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22037 username 20 0 598632 101664 1196 R 33.2 0.1 8082:11 lmp_mpi
22332 username 20 0 597292 100796 1196 R 33.2 0.1 8076:26 lmp_mpi
22345 username 20 0 596560 101572 1196 R 33.2 0.1 8084:15 lmp_mpi
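The three identical 33.2% figures above add up to roughly 100%, which would be consistent with all three processes sharing a single core. That is only a hypothesis at this point. A minimal sketch for checking it, assuming a Linux /proc filesystem and that pgrep is available; the function names allowed_cpus and report_affinity are hypothetical helpers, not part of LAMMPS, Slurm, or Open MPI:

```shell
#!/usr/bin/env bash
# Sketch: show which cores each process with a given name is allowed to run on.
# Relies on the Linux-specific /proc/<pid>/status file.

# Print the Cpus_allowed_list value for one PID (for example "0" or "0-23").
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$1/status"
}

# Print "PID<tab>allowed cores" for every process whose name is exactly $1.
report_affinity() {
    local pid
    for pid in $(pgrep -x "$1"); do
        printf '%s\t%s\n' "$pid" "$(allowed_cpus "$pid")"
    done
}

# Example, run on the compute node itself:
#   report_affinity lmp_mpi
```

If every lmp_mpi process reports the same single core, the processes are competing for that core rather than being slowed down by I/O.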
I have tried multiple MPI and Slurm options, such as --exclusive, --cpus-per-task, and --ntasks-per-node, but still see the same results.
Is this slowdown caused by the amount of I/O LAMMPS performs? If so, can we reduce the verbosity of LAMMPS?
How can we get past this error?
System information:
CentOS 7
Slurm Scheduler
LAMMPS version - lammps-16Feb16
MPI Version - openmpi-1.8/gcc
Each of our compute nodes has either 20 or 24 cores, and each core can run one process.
Here’s the complete job script of one of the simulations:
#######################
#!/usr/bin/env bash
#SBATCH --job-name=Sim-8
#SBATCH --partition=debug.q
#SBATCH --mem=1G
#SBATCH --export=ALL
#SBATCH --nodelist=mrcd08
module load lammps16
mpirun -np 1 lmp_mpi < Project.txt7 > output_7_1.txt
#########################
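One thing that may be worth testing, sketched here under the assumption that the slowdown comes from process binding rather than I/O: as far as I understand, the Open MPI 1.8 series binds each rank to a core by default when a job has two or fewer ranks, so several independent `mpirun -np 1` launches could end up pinned to the same core. The wrapper below launches the same command with binding disabled. The function name run_unbound is hypothetical; `--bind-to none` and `--report-bindings` are standard Open MPI mpirun options.

```shell
#!/usr/bin/env bash
# Sketch: launch one single-rank LAMMPS run without pinning it to a core.
# Assumption: Open MPI's default core binding is what makes the jobs collide.

run_unbound() {
    local input=$1 output=$2
    # --bind-to none     : let the kernel scheduler place the process freely
    # --report-bindings  : print the resulting binding to stderr for checking
    mpirun --bind-to none --report-bindings -np 1 lmp_mpi < "$input" > "$output"
}

# Example, replacing the mpirun line in the job script above:
#   run_unbound Project.txt7 output_7_1.txt
```

If the per-process CPU usage returns to about 100% with binding disabled, the I/O volume of LAMMPS is unlikely to be the cause.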
Sarvani Chadalapaka
HPC Administrator
University of California Merced, Office of Information Technology