Merge branch 'main' into feature/Cerebras_R_2.4.0_updates
wcarnold1010 authored Dec 20, 2024
2 parents 8f87063 + 334b606 commit 9262374
Showing 3 changed files with 206 additions and 0 deletions.
58 changes: 58 additions & 0 deletions docs/polaris/applications-and-libraries/applications/amber.md
@@ -0,0 +1,58 @@
# Amber on Polaris

## What is Amber?

Amber is a suite of biomolecular simulation programs. Amber is distributed in two parts: AmberTools24 and Amber24. Please visit the Amber [website](https://ambermd.org/) for additional information on capabilities and licensing.

## Using Amber at ALCF

ALCF offers assistance with building binaries and provides compilation instructions for Amber. For questions, contact us at [email protected].

## Building Amber

The following build instructions can be applied to both free and licensed versions of Amber.

1. Download AmberTools24 and Amber24 from the Amber [website](https://ambermd.org/GetAmber.php). Note that they are two separate downloads. Copy the tarballs `AmberTools24.tar.bz2` and `Amber24.tar.bz2` into a home or project directory (e.g. `$HOME/Amber`).

``` bash
$ cd Amber
$ tar -xf AmberTools24.tar.bz2
$ tar -xf Amber24.tar.bz2
$ cd amber24_src
```

2. Download and install the [bzip2](https://sourceware.org/bzip2/downloads.html) library. Add `-fPIC` to the `CFLAGS` variable in the Makefile, then build and install to a local directory (e.g. `make install PREFIX=$HOME/bzip2`).

3. Download the [FFTW3](https://www.fftw.org/download.html) library, extract the tarball, and rename the extracted directory to `fftw`. Then build and install it to a local directory (e.g. `./configure --prefix=$HOME/fftw ; make ; make install`).

4. Update the user environment to include the newly installed `bzip2` and `fftw`, and load the GNU programming environment.

```bash
$ export PATH="/PATH-TO/bzip2/lib:$PATH"
$ export PATH="/PATH-TO/bzip2:$PATH"
$ export PATH="/PATH-TO/bzip2/include:$PATH"
$ export PATH="/PATH-TO/fftw:$PATH"
$ export PATH="/PATH-TO/fftw/lib:$PATH"
$ export PATH="/PATH-TO/fftw/include:$PATH"

$ module use /soft/modulefiles
$ module load PrgEnv-gnu/8.5.0
$ module load cudatoolkit-standalone/12.4.0
```

5. Build the Amber binaries by first editing `run_cmake` to set the following options:
* -DMPI=TRUE
* -DCUDA=TRUE
* -DCOMPILER=MANUAL
* -DDISABLE_TOOLS=FEW
* -DCMAKE_C_COMPILER=gcc-12
* -DCMAKE_CXX_COMPILER=g++-12
* -DCMAKE_Fortran_COMPILER=gfortran-12

``` bash
$ cd build
$ ./run_cmake
$ make install -j 8
```
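
The option changes in step 5 can also be scripted rather than edited by hand. The following is a minimal sketch, assuming that `run_cmake` already contains each option in the form `-DNAME=VALUE`; the helper name `set_cmake_opt` is hypothetical and not part of Amber. Options that are missing are reported rather than added, because where they belong depends on the layout of the script.

``` bash
#!/bin/bash
# Hypothetical helper (not part of Amber): replace the value of an existing
# -DNAME=... option in a file. Assumes values contain no '|' or '&' characters.
# Usage: set_cmake_opt FILE NAME VALUE
set_cmake_opt() {
    local file=$1 name=$2 value=$3 tmp status
    if ! grep -q -- "-D${name}=" "$file"; then
        echo "option -D${name} not found in ${file}; add it by hand" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Rewrite every occurrence of -DNAME=<old value>; writing back with cat
    # keeps the original file's permissions (run_cmake must stay executable).
    sed "s|-D${name}=[^[:space:]]*|-D${name}=${value}|g" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example, run from the directory that holds run_cmake:
#   set_cmake_opt run_cmake MPI TRUE
#   set_cmake_opt run_cmake CUDA TRUE
#   set_cmake_opt run_cmake COMPILER MANUAL
```

Review the edited `run_cmake` before running it, since the sketch only substitutes values and does not validate them.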

All Amber binaries will be installed to the `amber24` folder.
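
Step 2 above asks for `-fPIC` in the bzip2 `CFLAGS`. A minimal sketch of doing that non-interactively follows, assuming the Makefile defines the flags on a line that starts with `CFLAGS=`; the helper name `add_fpic` is hypothetical. It is written to be safe to run more than once.

``` bash
#!/bin/bash
# Hypothetical helper: prepend -fPIC to the CFLAGS definition of a Makefile.
# Does nothing if the CFLAGS line already contains -fPIC.
# Usage: add_fpic path/to/Makefile
add_fpic() {
    local makefile=$1 tmp status
    if grep -q '^CFLAGS[[:space:]]*=.*-fPIC' "$makefile"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    sed 's|^\(CFLAGS[[:space:]]*=\)|\1-fPIC |' "$makefile" > "$tmp" \
        && cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example, run inside the extracted bzip2 source directory:
#   add_fpic Makefile
#   make install PREFIX=$HOME/bzip2
```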
146 changes: 146 additions & 0 deletions docs/polaris/applications-and-libraries/applications/namd.md
@@ -0,0 +1,146 @@
# NAMD on Polaris

## What is NAMD?

NAMD, recipient of a 2002 Gordon Bell Award, a 2012 Sidney Fernbach Award, and a 2020 Gordon Bell Prize, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 50,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.

## Using NAMD at ALCF

ALCF offers assistance with building binaries and provides compilation instructions for NAMD. For questions, contact us at [email protected].

## Running NAMD on Polaris

Prebuilt NAMD release binaries can be found in the directory `/soft/applications/namd`.
* `Linux-x86_64-netlrts-smp-CUDA` supports GPU-resident runs.
* `Linux-x86_64-ofi-smp-CUDA` supports general GPU-offload runs.
* `Linux-x86_64-ofi-smp-CUDA-memopt` supports memory-optimized input files and parallel I/O for the largest simulations (~100 million atoms or more).

NAMD supports two types of parallelized simulations: single-instance strong-scaling and multiple-copy weak-scaling (e.g. replica exchange). For more details on functionality, please visit the NAMD [website](https://tcbg.illinois.edu/Research/namd).
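
To avoid mistyping the build names listed above in job scripts, the mapping from run type to prebuilt build can be kept in a small helper. This is only a sketch: the function name `namd_build_for` and the run-type keywords are hypothetical, and the layout beneath `/soft/applications/namd` should be checked on the system before composing a full path.

``` bash
#!/bin/bash
# Hypothetical helper: print the prebuilt NAMD build name for a run type.
# The keywords gpu-resident, gpu-offload, and memopt are made up for this sketch.
namd_build_for() {
    case "$1" in
        gpu-resident) echo "Linux-x86_64-netlrts-smp-CUDA" ;;
        gpu-offload)  echo "Linux-x86_64-ofi-smp-CUDA" ;;
        memopt)       echo "Linux-x86_64-ofi-smp-CUDA-memopt" ;;
        *) echo "unknown run type: $1" >&2; return 1 ;;
    esac
}

# Example:
#   ls /soft/applications/namd/"$(namd_build_for gpu-offload)"
```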

### GPU-resident runs

A sample PBS script follows for GPU-resident runs on Polaris.

``` bash
#!/bin/sh -l
#PBS -l select=1:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:30:00
#PBS -q debug
#PBS -A PROJECT
#PBS -l filesystems=home:eagle

EXE=/PATH-TO/namd3

cd ${PBS_O_WORKDIR}

mpiexec -n 1 --ppn 1 --depth=16 --cpu-bind=depth $EXE +p 15 +devices 3,2,1,0 stmv.namd > stmv.output
```

Measured performance with the above submission script, for a ~1,000,000-atom system run under NPT conditions with a 2 fs timestep, was `16 CPUs 0.00381933 s/step 45.2435 ns/day`.
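
The `ns/day` figure follows from the `s/step` figure and the timestep: there are 86,400 seconds in a day, and each step advances the simulation by the timestep in femtoseconds (1 fs = 1e-6 ns). A small sketch of that conversion, with the hypothetical helper name `ns_per_day`, is useful when comparing runs that use different timesteps.

``` bash
#!/bin/bash
# Hypothetical helper: convert NAMD's seconds-per-step into ns/day.
# Usage: ns_per_day SECONDS_PER_STEP TIMESTEP_FS
ns_per_day() {
    awk -v s="$1" -v dt="$2" \
        'BEGIN { printf "%.4f\n", 86400 / s * dt * 1e-6 }'
}

# Values from the GPU-resident benchmark line above (2 fs timestep).
ns_per_day 0.00381933 2    # prints 45.2435
```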

Note that the GPU-resident version currently runs only on a single node, and some important features are not yet implemented in it. Users are strongly encouraged to confirm in advance that the GPU-resident version fully supports the planned simulation.

### Multiple-copy GPU-resident runs

A sample PBS script for multiple-copy GPU-resident runs follows.

``` bash
#!/bin/sh -l
#PBS -l select=4:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:30:00
#PBS -q debug-scaling
#PBS -A PROJECT
#PBS -l filesystems=home:eagle

EXE=/PATH-TO/namd3
CHARMRUN=/PATH-TO/charmrun

cd ${PBS_O_WORKDIR}

$CHARMRUN ++mpiexec ++np 16 ++ppn 8 $EXE +replicas 16 init.conf --source rest2_remd.namd +pemap 0-31 +setcpuaffinity +devices 0,1,2,3 +stdout rest2_output/%d/job0.%d.log +devicesperreplica 1
```

This sample script launches a solute-tempering replica-exchange simulation with 16 replicas on 4 Polaris nodes. Each node accommodates 4 replicas, and each replica uses 8 CPU cores and binds to 1 GPU device.
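
The layout described above can be checked with simple integer arithmetic before changing the node or replica count. The sketch below assumes 32 usable cores per node, taken from `+pemap 0-31` in the sample script; the helper name `replica_layout` is hypothetical.

``` bash
#!/bin/bash
# Hypothetical helper: report replicas per node and cores per replica,
# and fail if the replicas or cores do not divide evenly.
# Usage: replica_layout NODES REPLICAS CORES_PER_NODE
replica_layout() {
    local nodes=$1 replicas=$2 cores_per_node=$3 per_node
    if (( replicas % nodes != 0 )); then
        echo "$replicas replicas do not divide evenly over $nodes nodes" >&2
        return 1
    fi
    per_node=$(( replicas / nodes ))
    if (( cores_per_node % per_node != 0 )); then
        echo "$cores_per_node cores do not divide evenly over $per_node replicas" >&2
        return 1
    fi
    echo "$per_node replicas/node, $(( cores_per_node / per_node )) cores/replica"
}

# Values from the sample script: 4 nodes, 16 replicas, cores 0-31 per node.
replica_layout 4 16 32    # prints "4 replicas/node, 8 cores/replica"
```

The result matches `++ppn 8` in the `charmrun` line and the four entries in `+devices 0,1,2,3`, one GPU per replica on each node.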

### GPU-offload run on multiple nodes

A sample PBS script for a GPU-offload run follows.

``` bash
#!/bin/sh -l
#PBS -l select=64:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:30:00
#PBS -q prod
#PBS -A PROJECT
#PBS -l filesystems=home:eagle

EXE=/PATH-TO/namd3

cd ${PBS_O_WORKDIR}

aprun -N 4 -n 256 --cc core --cpus-per-pe 8 $EXE +p 6 +setcpuaffinity +devices 3,2,1,0 stmv.namd > stmv_64nodes.output
```

Measured performance with the above submission script, for a ~1,000,000-atom system run under NPT conditions with a 2 fs timestep, was `1536 CPUs 0.00151797 s/step 113.724 ns/day`.
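
The launch arguments in the script above are tied together by simple products, which is worth re-checking whenever the node count changes. The following sketch assumes that the `CPUs` value NAMD reports is the total number of worker threads (ranks times `+p`); the helper name `launch_counts` is hypothetical.

``` bash
#!/bin/bash
# Hypothetical helper: derive the total rank count and worker-thread count
# from the node count, ranks per node, and NAMD's +p value.
# Usage: launch_counts NODES RANKS_PER_NODE WORKERS_PER_RANK
launch_counts() {
    local nodes=$1 ranks_per_node=$2 workers_per_rank=$3 ranks
    ranks=$(( nodes * ranks_per_node ))
    echo "ranks=$ranks workers=$(( ranks * workers_per_rank ))"
}

# Values from the 64-node script: select=64, -N 4, +p 6.
launch_counts 64 4 6    # prints "ranks=256 workers=1536"
```

Here 256 matches the `-n 256` argument and 1536 matches the CPU count in the reported benchmark line.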

### Multiple-copy GPU-offload run

A sample PBS script for multiple-copy GPU-offload runs follows.

``` bash
#!/bin/sh -l
#PBS -l select=4:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:30:00
#PBS -q debug-scaling
#PBS -A PROJECT
#PBS -l filesystems=home:eagle

EXE=/PATH-TO/namd3

cd ${PBS_O_WORKDIR}

aprun -N 4 -n 16 --cc=core --cpus-per-pe 8 $EXE +replicas 16 init.conf --source rest2_remd.namd +setcpuaffinity +stdout rest2_output/%d/job0.%d.log +devicesperreplica 1
```


## Building NAMD

We recommend using the NAMD binaries provided.

The following instructions are for the GPU-offload version, built on top of the generic Slingshot-11 Charm++.

1. Switch to the GNU programming environment: `module swap PrgEnv-nvhpc PrgEnv-gnu`
2. Download the NAMD source [code](https://www.ks.uiuc.edu/Development/Download/download.cgi?PackageName=NAMD), then build Charm++ and NAMD:
``` bash
$ tar -xzf NAMD_3.0_Source.tar.gz
$ cd NAMD_3.0_Source
$ tar xvf charm-8.0.0.tar
$ cd charm-8.0.0
$ ./buildold charm++ ofi-crayshasta cxi slurmpmi2cray smp --with-production -j8 -DCMK_OBJID_COLLECTION_BITS=8 -DCMK_OBJID_HOME_BITS=20
$ cd ..
$ wget http://www.ks.uiuc.edu/Research/namd/libraries/fftw-linux-x86_64.tar.gz
$ wget http://www.ks.uiuc.edu/Research/namd/libraries/tcl8.6.13-linux-x86_64.tar.gz
$ wget http://www.ks.uiuc.edu/Research/namd/libraries/tcl8.6.13-linux-x86_64-threaded.tar.gz
$ tar xzf fftw-linux-x86_64.tar.gz
$ tar xzf tcl8.6.13-linux-x86_64.tar.gz
$ tar xzf tcl8.6.13-linux-x86_64-threaded.tar.gz
$ mv linux-x86_64 fftw
$ mv tcl8.6.13-linux-x86_64 tcl
$ mv tcl8.6.13-linux-x86_64-threaded tcl-threaded
$ ./config Linux-x86_64-g++ --charm-base ./charm-8.0.0 --charm-arch ofi-crayshasta-cxi-slurmpmi2cray-smp --with-cuda --cuda-prefix /soft/compilers/cudatoolkit/cuda-12.2.2
$ cd Linux-x86_64-g++
$ make -j8
```

The resulting NAMD binary is `namd3`. To build a memory-optimized version, add the flag `--with-memopt` to the `config` arguments.

To build the GPU-resident version of NAMD instead, replace the Charm++ build and NAMD `config` commands above with the following.

``` bash
$ ./build charm++ netlrts-linux-x86_64 gcc smp -j8 --with-production
$ ./config Linux-x86_64-g++ --charm-base ./charm-8.0.0 --charm-arch netlrts-linux-x86_64-smp-gcc --with-cuda --with-single-node-cuda --cuda-prefix /soft/compilers/cudatoolkit/cuda-12.2.2
```
2 changes: 2 additions & 0 deletions mkdocs.yml
Expand Up @@ -130,8 +130,10 @@ nav:
- Running Jobs: polaris/running-jobs.md
- Applications and Libraries:
- Applications:
- Amber: polaris/applications-and-libraries/applications/amber.md
- GROMACS: polaris/applications-and-libraries/applications/gromacs.md
- LAMMPS: polaris/applications-and-libraries/applications/lammps.md
- NAMD: polaris/applications-and-libraries/applications/namd.md
- NekRS: polaris/applications-and-libraries/applications/nekrs.md
- OpenMM: polaris/applications-and-libraries/applications/openmm.md
- QMCPACK: polaris/applications-and-libraries/applications/QMCPACK.md