Building and Running MPICH (CH4)
This page describes how to build and run MPICH (CH4 device) to test the libfabric GNI provider. It assumes you are building MPICH on a Cray XC system such as jupiter or edison/cori, and that you have already built and installed a copy of libfabric.
MPICH can be built to use the Cray PMI or SLURM PMI. Differences in the build procedure using the two PMIs are highlighted below.
First, if you don't already have a clone of MPICH:
% git clone git@github.com:pmodels/mpich.git
Next, configure and build/install MPICH. Note that you will need libtool 2.4.4 or newer to keep MPICH's configury happy.
If you intend to use Cray PMI, you'll need to apply these two patches: patch0, patch1
After applying the patches, the following steps can be used to configure MPICH CH4:
% module load PrgEnv-gnu
% ./autogen.sh
% ./configure CFLAGS="-DMPIDI_CH3_HAS_NO_DYNAMIC_PROCESS" LDFLAGS="-Wl,-rpath -Wl,<path-to-ofi-libfabric-install>/lib" --with-pmi=cray --with-pm=none --prefix=<path-to-mpich-install> --with-libfabric=<path-to-ofi-libfabric-install> --with-device=ch4:ofi
% make -j install
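The LDFLAGS value in the configure line above embeds an rpath so the MPICH libraries can resolve libfabric at run time without relying on LD_LIBRARY_PATH. A minimal sketch of how that flag is assembled (the prefix path here is a placeholder, not a real install location):

```shell
# Placeholder: substitute your actual libfabric install prefix
OFI_PREFIX=/path/to/ofi-libfabric-install

# Embed the libfabric lib directory as an rpath in the MPICH libraries
LDFLAGS="-Wl,-rpath -Wl,${OFI_PREFIX}/lib"
echo "$LDFLAGS"
```

Passing this as LDFLAGS to configure means the installed MPICH libraries record where to find libfabric's shared library.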
The following has been used to configure MPICH CH4 on Cray XC systems with SLURM PMI installed:
% ./autogen.sh
% module load PrgEnv-gnu
% export MPID_NO_PMI=yes
% export MPID_NO_PM=yes
% export USE_PMI2_API=yes
For SLURM PMI on Cori, set these environment variables as follows:
% export LDFLAGS="-L/usr/lib64/slurmpmi"
% export LIBS="-lpmi"
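Collected into one place, the SLURM PMI environment for Cori looks like the sketch below (the values are exactly those from the steps above); the loop at the end simply echoes each variable so you can sanity-check the environment before running ./configure:

```shell
# SLURM PMI build environment for MPICH CH4 on Cori;
# set these before running ./configure
export MPID_NO_PMI=yes
export MPID_NO_PM=yes
export USE_PMI2_API=yes
export LDFLAGS="-L/usr/lib64/slurmpmi"
export LIBS="-lpmi"

# Sanity check: print each variable before configuring
for v in MPID_NO_PMI MPID_NO_PM USE_PMI2_API LDFLAGS LIBS; do
  eval "echo \"$v=\$$v\""
done
```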
No patches are needed to use MPICH/CH4 with SLURM PMI.
The configure line needs to include the base of your libfabric install, as well as specify the CH4 OFI netmod and use of the FI_MR_BASIC memory registration model:
% ./configure --prefix=mpich_install_dir --with-libfabric=path_to_libfabric_install --with-device=ch4:ofi --with-ch4-netmod-ofi-args=mr-basic
% make -j 8 install
Note that if you want to run multi-threaded MPI tests which use MPI_THREAD_MULTIPLE, you will need to configure MPICH as follows:
% ./configure --prefix=mpich_install_dir --with-ofi=your_libfabric_install_dir --enable-threads=multiple --with-device=ch4:ofi --with-ch4-netmod-ofi-args=mr-basic
First you will need to build an MPI app using MPICH's compiler wrapper:
% export PATH=mpich_install_dir/bin:${PATH}
% mpicc -o my_app my_app.c
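If you don't have an MPI application on hand, a minimal rank/size hello program is enough to exercise the stack. The sketch below writes one out as my_app.c (the file name matches the mpicc line above; the program contents are just an illustrative example, not part of this wiki's test suite):

```shell
# Write a minimal MPI test program to use as my_app.c
cat > my_app.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
```

Compile it with `mpicc -o my_app my_app.c` as shown above.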
On Tiger and NERSC edison/cori, the application can be launched using srun:
% srun -n 2 -N 2 ./my_app
IMPORTANT NOTE: If you are running on Cori and using the SLURM PMI library, you will need to set LD_LIBRARY_PATH (the MPICH compiler scripts apparently do not embed an rpath for it):
% export LD_LIBRARY_PATH=/usr/lib64/slurmpmi:$LD_LIBRARY_PATH
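One caveat with the export above: if LD_LIBRARY_PATH was previously unset, the result ends in a trailing colon, which the dynamic loader treats as an empty entry (the current directory). A slightly more careful sketch of the same prepend:

```shell
# Prepend the SLURM PMI directory; add the separating colon only
# when LD_LIBRARY_PATH already had a value
PMI_LIB=/usr/lib64/slurmpmi
export LD_LIBRARY_PATH="${PMI_LIB}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
echo "$LD_LIBRARY_PATH"
```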
If you'd like to double-check against the sockets provider, do the following:
% export MPIR_CVAR_OFI_USE_PROVIDER=sockets
% srun -n 2 -N 2 ./my_app
This will force the OFI netmod to use the sockets provider.
OSU provides a relatively simple set of MPI benchmark tests which are useful for testing the GNI libfabric provider.
% wget http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.0.tar.gz
% tar -zxvf osu-micro-benchmarks-5.0.tar.gz
% cd osu-micro-benchmarks-5.0
% ./configure CC=mpicc
% make
In the mpi/pt2pt and mpi/collective subdirectories there are a number of tests. To test, for example, MPICH send/recv message latency, osu_latency can be used:
% cd mpi/pt2pt
% srun -n 2 -N 2 ./osu_latency
The MPICH CH4 OFI netmod is under active development. Expect surprises, hangs, etc. when using it, especially with the GNI provider.