maddyscientist edited this page Nov 24, 2021 · 2 revisions

The MILC NERSC RHMC benchmark runs a 2+1+1 flavor HISQ-improved staggered fermion simulation. There are four benchmarks (small, medium, large and x-large), each with a lattice volume 16x larger than the previous one. We can thus measure strong scaling by running the same benchmark on different process counts, or weak scaling by running different benchmarks with the same local volume per process, e.g., running the large benchmark on 16x more GPUs than the medium benchmark.

| Benchmark | Volume |
|---|---|
| Small | 18^3 x 36 |
| Medium | 36^3 x 72 |
| Large | 72^3 x 144 |
| X-Large | 144^3 x 288 |
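
The 16x volume step and the matching GPU counts for weak scaling can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the benchmark itself: the lattice dimensions are copied from the table above, and `volume` and `matched_gpu_count` are hypothetical helper names introduced only for illustration.

```python
# Lattice dimensions (x, y, z, t) copied from the benchmark table.
BENCHMARKS = {
    "small": (18, 18, 18, 36),
    "medium": (36, 36, 36, 72),
    "large": (72, 72, 72, 144),
    "x-large": (144, 144, 144, 288),
}


def volume(name):
    """Total number of lattice sites for the named benchmark."""
    x, y, z, t = BENCHMARKS[name]
    return x * y * z * t


def matched_gpu_count(base, base_gpus, target):
    """Hypothetical helper: GPU count for `target` that gives the same
    local volume per GPU as `base` run on `base_gpus` GPUs."""
    scaled = volume(target) * base_gpus
    if scaled % volume(base) != 0:
        raise ValueError("no integer GPU count gives the same local volume")
    return scaled // volume(base)


if __name__ == "__main__":
    for name in BENCHMARKS:
        print(name, volume(name))
    # Weak-scaling partner of the medium benchmark on 8 GPUs.
    print(matched_gpu_count("medium", 8, "large"))
```

For example, the medium benchmark on 8 GPUs has the same local volume per GPU as the large benchmark on 128 GPUs.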

Medium

The medium benchmark is suitable for scaling up to 16 GPUs.

| Machine | Nodes | MPI processes | GPU | #GPU | Time (s) |
|---|---|---|---|---|---|
| Selene | 1 | 1 | NVIDIA A100-80 | 1 | 2260 |
| Selene | 1 | 2 | NVIDIA A100-80 | 2 | 1319 |
| Selene | 1 | 4 | NVIDIA A100-80 | 4 | 700 |
| Selene | 1 | 8 | NVIDIA A100-80 | 8 | 394 |
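
To read these timings as strong-scaling figures, one can compute the speedup and parallel efficiency relative to the single-GPU run. This is a minimal sketch using only the times in the table above; the function name `strong_scaling` is an assumption made for illustration and is not part of MILC or QUDA.

```python
# Wall-clock times in seconds on Selene, copied from the medium table.
MEDIUM_TIMES = {1: 2260, 2: 1319, 4: 700, 8: 394}


def strong_scaling(times):
    """Return {gpus: (speedup, efficiency)} relative to the smallest GPU count.

    speedup    = T(base) / T(gpus)
    efficiency = speedup / (gpus / base_gpus)
    """
    base_gpus = min(times)
    base_time = times[base_gpus]
    result = {}
    for gpus, seconds in sorted(times.items()):
        speedup = base_time / seconds
        efficiency = speedup / (gpus / base_gpus)
        result[gpus] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for gpus, (speedup, eff) in strong_scaling(MEDIUM_TIMES).items():
        print(f"{gpus:2d} GPUs: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

With these numbers, the 8-GPU run is about 5.7x faster than the single-GPU run, which corresponds to roughly 72% parallel efficiency.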

Large

The large benchmark is suitable for scaling up to 512 GPUs.

| Machine | Nodes | MPI processes | GPU | #GPU | Time (s) |
|---|---|---|---|---|---|
| Selene | 4 | 32 | NVIDIA A100-80 | 32 | 1913 |
| Selene | 8 | 64 | NVIDIA A100-80 | 64 | 1015 |
| Selene | 16 | 128 | NVIDIA A100-80 | 128 | 651 |
| Selene | 32 | 256 | NVIDIA A100-80 | 256 | 433 |
| Selene | 64 | 512 | NVIDIA A100-80 | 512 | 320 |
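
Because the large lattice has 16x the sites of the medium one, the medium runs on 2, 4 and 8 GPUs have the same local volume per GPU as the large runs on 32, 64 and 128 GPUs. The sketch below pairs those runs and reports the ratio of their times as a rough weak-scaling efficiency. The times are copied from the two tables; the function name is hypothetical, and the ratio is only indicative, since the algorithmic cost of the simulation may itself change with the lattice volume.

```python
# Wall-clock times in seconds on Selene, copied from the tables.
MEDIUM_TIMES = {2: 1319, 4: 700, 8: 394}
LARGE_TIMES = {32: 1913, 64: 1015, 128: 651}

# Each benchmark is 16x larger in volume than the previous one.
VOLUME_RATIO = 16


def weak_scaling_efficiency(small_times, large_times, ratio):
    """Pair runs with equal local volume per GPU and return
    {(small_gpus, large_gpus): T_small / T_large}."""
    pairs = {}
    for gpus, t_small in sorted(small_times.items()):
        t_large = large_times.get(gpus * ratio)
        if t_large is not None:  # skip runs without a measured partner
            pairs[(gpus, gpus * ratio)] = t_small / t_large
    return pairs


if __name__ == "__main__":
    result = weak_scaling_efficiency(MEDIUM_TIMES, LARGE_TIMES, VOLUME_RATIO)
    for (small_gpus, large_gpus), eff in result.items():
        print(f"medium on {small_gpus} vs large on {large_gpus} GPUs: {eff:.2f}")
```

For the three matched pairs this gives ratios of about 0.69, 0.69 and 0.61.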