Although NeSI has BEAST2 pre-installed on the cluster, you can still install your own copy of BEAST2 for package development purposes.
Learn Bash (https://docs.nesi.org.nz/Getting_Started/Cheat_Sheets/Bash-Reference_Sheet/) before you start.
You need to apply for a NeSI account and log in to NeSI, e.g. Mahuika: https://docs.nesi.org.nz
Login Troubleshooting https://docs.nesi.org.nz/General/FAQs/Login_Troubleshooting/
- JupyterHub https://docs.nesi.org.nz/Scientific_Computing/Interactive_computing_using_Jupyter/Jupyter_on_NeSI/#jupyter-term
- Terminal https://docs.nesi.org.nz/Scientific_Computing/Terminal_Setup/Standard_Terminal_Setup/
Make sure you can access your NeSI home folder before the next step.
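For example, once you are logged in, a quick sanity check in the terminal (a minimal sketch):
# confirm you are in your home folder; should print something like /home/YOUR_USERNAME
pwd
# confirm you can list and create files there
ls -la
touch test.txt && rm test.txt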
More training: https://docs.nesi.org.nz/Scientific_Computing/Training/Introduction_to_computing_on_the_NeSI_HPC_YouTube_Recordings/
You need to install BEAST 2 and its packages in your NeSI home folder. Here are the instructions:
i. Download the latest Linux x86 version and upload the .tgz file to your NeSI home folder.
https://github.com/CompEvol/beast2/releases/latest
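Alternatively, you may be able to download the release directly on the cluster instead of uploading it; the exact asset URL below is an assumption based on the GitHub release naming for v2.7.7:
# download the Linux x86 release directly into your home folder (URL is an example for v2.7.7)
cd ~
wget https://github.com/CompEvol/beast2/releases/download/v2.7.7/BEAST.v2.7.7.Linux.x86.tgz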
ii. Go to your NeSI home folder, and extract everything into a subfolder "beast" under your home folder.
# Check your location
pwd
# Go to your NeSI home folder, if you are in a different path
cd ~
# extract BEAST under your home folder
tar -xvzf BEAST.v2.7.7.Linux.x86.tgz
# check if it is there
ls
Mac may automatically extract the .tgz file into a .tar file. In this case, use the command below without the -z flag:
tar -xvf BEAST.v2.7.7.Linux.x86.tar
iii. List or install BEAST 2 packages using the Package Manager from the command line.
# more commands on the website
YOUR_BEAST/bin/packagemanager -list
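To install a package, use the -add option with a package name as reported by -list; the package name BDSKY below is only an example:
# install a package by name (BDSKY is an example name taken from -list)
YOUR_BEAST/bin/packagemanager -add BDSKY
# remove it again if needed
YOUR_BEAST/bin/packagemanager -del BDSKY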
The installed packages are stored under the path ~/.beast/2.7/, where 2.7 is the major version of BEAST that you are using. Use the command ls -la ~/.beast/2.7/ to see how many packages (subfolders) are installed.
Please note: never run jobs directly in the terminal on the login node. You need to submit jobs to the compute nodes using Slurm.
Your workspace will be located within a subfolder of your project directory, such as /nesi/nobackup/nesi???/YOUR_NAME. Since this folder is shared with other project members, please exercise extreme caution when using bash commands, especially when deleting files.
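For example, a minimal sketch for setting up your own workspace, where the project code nesi??? and YOUR_NAME are placeholders as above:
# create and enter your personal workspace inside the shared project folder
mkdir -p /nesi/nobackup/nesi???/YOUR_NAME
cd /nesi/nobackup/nesi???/YOUR_NAME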
Here is an example template to run a BEAST2 XML file, where the placeholders in capital letters should be replaced with your specific details, and file paths must be adjusted according to your environment settings:
#!/bin/sh
#SBATCH -J PREFIX-FILE # The job name
#SBATCH -A nesi??? # The account code
#SBATCH --time=72:00:00 # The walltime
#SBATCH --mem=1G # in total
#SBATCH --cpus-per-task=2 # OpenMP Threads
#SBATCH --ntasks=1 # do not use MPI
#SBATCH --hint=multithread # A multithreaded job, also a Shared-Memory Processing (SMP) job
#SBATCH -D ./ # The initial directory
#SBATCH -o FILE_out.txt # The output file
#SBATCH -e FILE_err.txt # The error file
# sacct -j JOBID --format="ReqMem,MaxRSS,CPUTime,AveCPU,Elapsed"
# Whenever Slurm mentions CPUs it is referring to logical CPUs (2 logical CPUs = 1 physical CPU)
# Total mem = mem-per-cpu * task / 2
module load beagle-lib/4.0.0-GCC-11.3.0
module load Java/17
# beast 2.7.x
srun /home/???/beast/bin/beast -beagle_SSE -seed SEED ../FILE
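A note on the module load lines: if these exact versions are not available, you can search the module system first; the module spider command is assumed to be available on NeSI's Lmod-based setup:
# search for available versions of the required modules
module spider beagle-lib
module spider Java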
After replacing the placeholders with your values, save the template to a file such as test.sl. Then use Slurm to submit the job:
sbatch test.sl
Check your job:
squeue --me
More Slurm commands: https://docs.nesi.org.nz/Getting_Started/Cheat_Sheets/Slurm-Reference_Sheet/
Parallel Execution https://docs.nesi.org.nz/Getting_Started/Next_Steps/Parallel_Execution/
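After a job finishes, you can check its resource usage with sacct, as noted in the comment inside the template above (replace JOBID with your job's ID):
# report requested memory, peak memory, CPU time and elapsed time for a finished job
sacct -j JOBID --format="ReqMem,MaxRSS,CPUTime,AveCPU,Elapsed"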
How do you run 100 XMLs at once? You can use or modify the following bash script to submit many jobs at the same time:
#!/usr/bin/env bash
DIR=$1              # output directory for the generated .sl scripts
TEMPLATE=mytemplate # Slurm template containing the FILE, PREFIX and SEED placeholders
mkdir -p "$DIR"
# 1. create job scripts ready to submit, one per XML file
for file in *.xml; do
  seed=$(( ( RANDOM % 10000 ) + 1 ))
  base=${file##*-}  # keep the part after the last -
  stem=${base%.*}   # strip the .xml extension
  sed "s/FILE/$file/g;s/PREFIX/$stem/g;s/SEED/$seed/g" ./${TEMPLATE} > "$DIR/${stem}.sl"
  echo "saved $DIR/${stem}.sl created from $file with seed $seed using template ${TEMPLATE}"
done
# 2. submit all job scripts
cd "$DIR" || exit 1
echo "$PWD"
for tmpfl in *.sl; do
  sbatch "$tmpfl"
  # rm -f "$tmpfl"
  echo "submitted job $tmpfl"
  sleep 1
done
cd ..
This script consists of two parts. The first part generates a Slurm script ${stem}.sl for each XML file and saves it in the directory $DIR. The second part navigates to that directory and submits all the jobs.
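To run it, assuming you saved the script as submitXMLs.sh (a name chosen here for illustration) in the folder containing your XML files and the template file mytemplate:
# create the output directory, make the script executable, and run it
mkdir -p sl
chmod +x submitXMLs.sh
./submitXMLs.sh sl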
Please note:
- the logger section in each XML has to use different log file names and tree file names, otherwise the jobs will overwrite each other's output (a quick check is sketched after this list).
- use the sleep 1 command between each job submission to prevent overwhelming the cluster with too many simultaneous requests.
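A quick way to check the first point (a sketch, assuming the log and tree file names are set via fileName attributes in the XMLs):
# list the output file names used by the loggers in every XML;
# each name should appear only once across all files
grep -H 'fileName=' *.xml | sort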