ThetaGPU is the GPU portion of Theta, and is distinct from the KNL nodes that make up most of Theta.
These instructions assume you are setting up a development environment for the gccy3 project, developing DES analysis code in CosmoSIS running on GPUs.
They also assume you want to share the installed versions of cosmosis and gpuintegration, as well as all the other dependencies of y3_cluster_cpp.
Each project (e.g. gccy3) has a limited quota available on ThetaGPU. To check a project quota, from the login node use the following (substitute your project name, as needed):
sbank-list-allocations -p gccy3 -r all
This doesn't seem to work from a compute node.
To do development of y3_cluster_cpp software using CUDA, you need to be on a ThetaGPU compute node.
Because you won't be running MPI programs during development, a single GPU is enough.
These instructions will get you there.
First, log into theta.alcf.anl.gov:
ssh theta.alcf.anl.gov
If you have set up two-factor authentication, this requires using the numeric code generated by the MobilePass+ app on your phone.
Then activate the module that makes the Cobalt commands (e.g. qsub) talk to the GPU nodes, rather than the KNL nodes:
module load cobalt/cobalt-gpu
The ALCF documentation for ThetaGPU contains some instructions that do not work. To get an interactive session on a GPU node, do this:
qsub -I -q single-gpu -n 1 -t 60 -A gccy3 --attrs=pubnet
Specifying -t 0 is supposed to give you the maximum time allowed, but it fails with this message:
No handlers could be found for logger "Proxy"
<Fault 1001: "Walltime less than the 'single-gpu' queue min walltime of 00:05:00\n">
When the prompt returns, you are on a compute node with 1 GPU allocated.
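To verify the allocation, a quick optional check (nvidia-smi comes with the NVIDIA driver on the compute nodes; expect a single A100 listed):
hostname     # should report a thetagpu compute node, not a login node
nvidia-smi   # should list exactly one GPU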
We use an environment module that provides conda.
We create our own conda environment to contain the necessary packages.
Some are Python, some are C++, and some use Fortran.
Note that the ucx module used is specific to running on ThetaGPU, and was recommended during the May 2021 ALCF Software Performance Workshop.
Note that the conda create command below generates a warning that a newer version of conda exists and suggests upgrading.
Because this version of conda is provided by an environment module, you cannot update it.
All the packages we need are installed into the environment we create, so this should not be an actual impediment.
module load nvhpc-byo-compiler/21.7
module load conda/2021-09-22
#export CUDA_HOME # because it is set, but not exported
#conda create -p /grand/gccy3/cosmosis-2 -c conda-forge astropy cffi cfitsio click cmake configparser cudatoolkit cxx-compiler cython emcee==2.2.1 fftw fitsio fortran-compiler future gsl jinja2 kombine matplotlib minuit2 mpi4py numba numpy nvcc_linux-64 openblas pybind11 pyccl pycparser pytest pyyaml sacc scikit-learn scipy six ucx pytest-runner
conda create -p /grand/gccy3/cosmosis-ompi -c conda-forge astropy cffi cfitsio click cmake configparser cudatoolkit cython emcee==2.2.1 fftw fitsio future gsl jinja2 kombine matplotlib minuit2 mpi4py numba numpy nvcc_linux-64 openblas pybind11 pycparser pytest pyyaml scipy six ucx pytest-runner
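As an optional sanity check, the prefix passed to conda create should now contain its own Python interpreter (the path below matches the uncommented command above; the commented-out variant uses /grand/gccy3/cosmosis-2):
ls /grand/gccy3/cosmosis-ompi/bin/python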
I set up my working environment under /grand/gccy3.
The conda environment installation is under /grand/gccy3/cosmosis-2.
The top level for the software installation stack (as opposed to conda itself) is /grand/gccy3/topdir.
Note that this working environment is the one to use after all the underlying products have been built; before that, not everything described here is available. Separate instructions for the working environment to use during the build steps are given below, in the sections on installing the underlying products.
module load conda/2021-09-22
conda activate base
export CUDA_HOME # because it is set, but not exported
export OMPI_MCA_opal_cuda_support=true
export OMPI_MCA_pml="ucx"
export OMPI_MCA_osc="ucx"
export Y3GCC_DIR=/grand/gccy3/topdir
export Y3_CLUSTER_CPP_DIR=${Y3GCC_DIR}/y3_cluster_cpp
export Y3_CLUSTER_WORK_DIR=${Y3GCC_DIR}/y3_cluster_cpp
export LD_LIBRARY_PATH=${Y3GCC_DIR}/cuba/lib:$LD_LIBRARY_PATH
export http_proxy=http://theta-proxy.tmi.alcf.anl.gov:3128
export https_proxy=https://theta-proxy.tmi.alcf.anl.gov:3128
export HTTP_PROXY=http://theta-proxy.tmi.alcf.anl.gov:3128
export HTTPS_PROXY=https://theta-proxy.tmi.alcf.anl.gov:3128
export PAGANI_DIR=/grand/gccy3/topdir/gpuintegration
cd ${Y3GCC_DIR}/cosmosis
source config/setup-conda-cosmosis /grand/gccy3/cosmosis-2
This should result in a shell in which nvcc picks up the GCC 9.4.0 compiler that is part of the conda environment, rather than the GCC 9.3.0-17ubuntu1~20.04 that comes from the OS.
It should also make python be Python 3.9.7, rather than the system Python 2.7.18.
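A quick, optional spot check of the resulting shell (the conda compiler packages usually expose the compilers through CC/CXX, so the exact executable names may vary):
python --version          # expect Python 3.9.7 from the conda environment
which nvcc                # should resolve inside the CUDA/nvhpc stack
${CXX:-g++} --version     # expect GCC 9.4.x from the conda environment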
We use a conda environment because it handles binary compatibility at installation time.
Installing Python libraries with pip would require more care to ensure that any Fortran or C++ code is compiled with the right compiler and flags for binary compatibility.
The following is done to build the software.
Most of it needs to be done only once; the only software we generally modify is y3_cluster_cpp itself.
Do the setup below before going through these build steps.
module load conda/2021-06-28
conda activate /grand/gccy3/cosmosis-2
export CUDA_HOME # because it is set, but not exported
export OMPI_MCA_opal_cuda_support=true
export OMPI_MCA_pml="ucx"
export OMPI_MCA_osc="ucx"
export Y3GCC_DIR=/grand/gccy3/topdir
export Y3_CLUSTER_CPP_DIR=${Y3GCC_DIR}/y3_cluster_cpp
export Y3_CLUSTER_WORK_DIR=${Y3GCC_DIR}/y3_cluster_cpp
export LD_LIBRARY_PATH=${Y3GCC_DIR}/cuba/lib:$LD_LIBRARY_PATH
export http_proxy=http://theta-proxy.tmi.alcf.anl.gov:3128
export https_proxy=https://theta-proxy.tmi.alcf.anl.gov:3128
export HTTP_PROXY=http://theta-proxy.tmi.alcf.anl.gov:3128
export HTTPS_PROXY=https://theta-proxy.tmi.alcf.anl.gov:3128
export PAGANI_DIR=/grand/gccy3/topdir/gpuintegration
Note that we have to use the HTTP protocol; ssh and https do not work from ThetaGPU compute nodes.
You may be asked for a username and password for y3_cluster_cpp.
mkdir -p ${Y3GCC_DIR}
cd ${Y3GCC_DIR}   # the clones below land under this directory
# Clone repositories
git clone http://github.com/marcpaterno/cuba.git
git clone http://bitbucket.org/mpaterno/y3_cluster_cpp.git
git clone http://bitbucket.org/mpaterno/cubacpp.git
git clone http://github.com/marcpaterno/gpuintegration.git
git clone -b better-exceptions http://bitbucket.org/mpaterno/cosmosis.git
cd cosmosis/
git clone -b develop http://bitbucket.org/mpaterno/cosmosis-standard-library.git
cd ${Y3GCC_DIR}
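As an optional check that all the clones succeeded, the top-level directory should now contain the five repositories:
ls ${Y3GCC_DIR}
# expect: cosmosis  cuba  cubacpp  gpuintegration  y3_cluster_cpp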
cluster_toolkit is not available from conda, so I have to build and install it myself.
wget https://github.com/marcpaterno/cluster_toolkit/archive/master.tar.gz
tar xf master.tar.gz
cd cluster_toolkit-master/
python3 setup.py install # This will install into the environment
cd ../
rm -r cluster_toolkit-master/
rm master.tar.gz
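A quick, optional check that the install landed in the active environment (assuming the package exposes the module name cluster_toolkit):
python3 -c "import cluster_toolkit; print(cluster_toolkit.__file__)"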
Note that LD_LIBRARY_PATH above is already set to include the directory into which we will place the CUBA dynamic library.
cd ${Y3GCC_DIR}/cuba
# We do not set CC, because it is already set by the environment modules
./makesharedlib.sh
mkdir include
mkdir lib
mv cuba.h include/
mv libcuba.so lib/
cd ${Y3GCC_DIR}
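# Optional sanity check: the header and shared library should now be where LD_LIBRARY_PATH expects them.
ls ${Y3GCC_DIR}/cuba/include/cuba.h ${Y3GCC_DIR}/cuba/lib/libcuba.so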
cd ${Y3GCC_DIR}/cosmosis
conda deactivate  # deactivate the current environment before sourcing the cosmosis setup
source config/setup-conda-cosmosis /grand/gccy3/cosmosis-2
make
mkdir -p ${PAGANI_DIR}/cudaPagani/build
cd ${PAGANI_DIR}/cudaPagani/build
cmake ../ -DPAGANI_TARGET_ARCH="80-real" -DPAGANI_DIR=${PAGANI_DIR} -DCMAKE_BUILD_TYPE="Release"
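# These notes stop at the configure step; assuming the default Makefile generator,
# the build itself would be something like:
cmake --build . -j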
cd ${Y3_CLUSTER_CPP_DIR}
cmake -DUSE_CUDA=On -DY3GCC_TARGET_ARCH="80-real" -DPAGANI_DIR=${PAGANI_DIR} -DCMAKE_MODULE_PATH="${Y3_CLUSTER_CPP_DIR}/cmake;$Y3GCC_DIR/cubacpp/cmake/modules" -DCUBACPP_DIR=${Y3GCC_DIR}/cubacpp -DCUBA_DIR=${Y3GCC_DIR}/cuba -DCMAKE_BUILD_TYPE=Release .
Sometimes this command produces a warning that PAGANI_DIR was defined and not used.
The warning is misleading; PAGANI_DIR is used in multiple CMakeLists.txt files.
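The notes above likewise end at the configure step for y3_cluster_cpp. Assuming the default Makefile generator and an in-source build (the trailing . in the cmake command above), the build itself would look something like this:
cd ${Y3_CLUSTER_CPP_DIR}
cmake --build . -j
ctest   # optional: run the registered tests, if any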