Build and Run on Corona
git clone -b clang11_hip git@github.com:SCOREC/omega_h.git
Create envCoronaClang11.sh (the file sourced in the steps below) with the following contents:
module load opt
module load rocm/3.9.0
module unload intel
source envCoronaClang11.sh
mkdir build-omega-clang11
cd !$
cmake ../omega_h \
-DCMAKE_INSTALL_PREFIX=$PWD/install \
-DBUILD_SHARED_LIBS=OFF \
-DOmega_h_USE_HIP=OFF \
-DOmega_h_USE_Kokkos=OFF \
-DOmega_h_USE_MPI=OFF \
-DCMAKE_CXX_COMPILER=/opt/rocm-3.6.0/llvm/bin/clang++ \
-DOmega_h_CXX_WARNINGS=OFF \
-DBUILD_TESTING=ON
make -j4
ctest # all tests should pass
Allocate an mi60 node, then build. SLURM will automatically spawn a shell on the allocated node.
Building on a login (front-end) node results in runtime errors.
salloc -N 1 -p mi60 -t 30
#wait for allocation
source envCoronaClang11.sh
mkdir build-omega-rocm39
cd !$
srcpath=/path/to/omegah/source
hipcc=`which hipcc`
rocm=${hipcc%%bin/hipcc}
export HIP_PATH=$rocm/hip
export CMAKE_PREFIX_PATH=$rocm:$CMAKE_PREFIX_PATH
cmake $srcpath \
-DCMAKE_INSTALL_PREFIX=$PWD/install \
-DBUILD_SHARED_LIBS=OFF \
-DOmega_h_USE_HIP=ON \
-DOmega_h_USE_Kokkos=OFF \
-DCMAKE_CXX_FLAGS="-O2 --amdgpu-target=gfx906" \
-DHIP_PATH=${rocm}/hip \
-DOmega_h_USE_MPI=OFF \
-DCMAKE_CXX_COMPILER=hipcc \
-DOmega_h_CXX_WARNINGS=OFF \
-DBUILD_TESTING=ON
make
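The rocm=${hipcc%%bin/hipcc} line in the steps above derives the ROCm install root by stripping the trailing bin/hipcc from the compiler path. A minimal sketch of that expansion follows; rocm_root_from_hipcc is a hypothetical helper name (not part of Omega_h or ROCm) and the example path is an assumption.

```shell
# Hypothetical helper: strip the trailing "bin/hipcc" from a hipcc path
# to get the ROCm install root (same expansion as in the build steps).
rocm_root_from_hipcc() {
  hipcc_path=$1
  printf '%s\n' "${hipcc_path%%bin/hipcc}"
}

# Example with an assumed install location. The result keeps its trailing
# slash, so "$rocm/hip" contains a double slash, which is harmless.
rocm=$(rocm_root_from_hipcc /opt/rocm-3.9.0/bin/hipcc)
echo "$rocm"
```

If which hipcc prints nothing (for example, the rocm module is not loaded), the derived root is empty and HIP_PATH becomes /hip, so it may be worth checking that $rocm is non-empty before running cmake.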
Run tests
ctest
As of 790805, the run_unit_mesh test fails with the following errors; all other tests pass:
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
/g/g19/smith516/develop/omega_h/src/Omega_h_qr.hpp:32: Vector<max_m> Omega_h::householder_vector(Omega_h::Int, Matrix<max_m, max_n>, Omega_h::Int) [max_m = 72, max_n = 4]: Device-side assertion `norm_x > 0.0' failed.
:0:rocdevice.cpp :2180: 675783340969 us: Device::callbackQueue aborting with status: 0x1016
Aborted (core dumped)
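Because the same assertion is printed many times, counting the occurrences in a captured log can be easier than reading the raw output. The following is a minimal sketch; count_device_assertions and the log file name are hypothetical, while the ctest options shown in the comment (-R and --output-on-failure) are standard ctest flags.

```shell
# Count device-side assertion failures in a captured test log.
# A log could be captured with, for example:
#   ctest -R run_unit_mesh --output-on-failure > run_unit_mesh.log 2>&1
count_device_assertions() {
  grep -c 'Device-side assertion' "$1"
}

# Example usage (assumes the log file above exists):
#   count_device_assertions run_unit_mesh.log
```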
To run the fun3d delta case, omega_h needs to be rebuilt with libmeshb support.
libmeshb is here: https://github.com/LoicMarechal/libMeshb
Installation requires CMake and a C (and Fortran) compiler.
After installing libmeshb:
- append the path to the libmeshb install directory to your CMAKE_PREFIX_PATH,
- check out the run_ugawg_delta branch, and
- rebuild Omega_h with -DOmega_h_USE_libMeshb=on passed to cmake.
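The libmeshb rebuild steps can be sketched as follows. The helper name append_cmake_prefix and the libmeshb install path are placeholders, not taken from the Omega_h or libmeshb documentation; srcpath is the source directory variable used in the HIP build.

```shell
# Hypothetical helper: append a directory to CMAKE_PREFIX_PATH without
# leaving a stray leading colon when the variable starts out empty.
append_cmake_prefix() {
  export CMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH:+$CMAKE_PREFIX_PATH:}$1"
}

# Example usage; the install path is a placeholder:
#   append_cmake_prefix /path/to/libmeshb/install
#   git -C "$srcpath" checkout run_ugawg_delta
#   cmake "$srcpath" -DOmega_h_USE_libMeshb=on   # plus the other options from the HIP build
#   make
```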
Clone the repo with the fun3d delta case. The repo is 527MB.
module load git-lfs
git lfs install
git clone git@github.com:UGAWG/parallel-adapt-results.git
Create a file named rocProf.txt with the following contents:
pmc : Wavefronts VALUInsts SALUInsts SFetchInsts
# Perf counters group 2
pmc : TCC_HIT[0], TCC_MISS[0]
# Filter by dispatches range, GPU index and kernel names
# supported range formats: "3:9", "3:", "3"
gpu : 0
Create a file named runDeltaProf.sh with the following contents; edit the paths for bin and delta.
#!/bin/bash -ex
bin=/path/to/omegah/build/directory
delta=/path/to/parallel-adapt/delta-wing/fun3d-fv-lp2
mesh=$delta/delta50k.meshb
mach=$delta/delta50k-mach.solb
prof="rocprof --hip-trace -i rocProf.txt"
$prof $bin/ugawg_hsc $mesh $mach $delta/delta50k-metric.solb 50k &> 50k.log
#uncomment the following line to run a larger case
#$prof $bin/ugawg_hsc $mesh $mach $delta/scaled-metric/delta500k-metric.solb 500k &> 500k.log
Make it executable:
chmod +x runDeltaProf.sh
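Before using an allocation for the profiling run, it may help to confirm that the files referenced by runDeltaProf.sh exist. The following is a minimal sketch of such a check; check_inputs is a hypothetical helper and is not part of the original script.

```shell
# Hypothetical helper: report any missing input files on stderr and
# return non-zero if at least one is absent.
check_inputs() {
  missing=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage with the variables defined in runDeltaProf.sh:
#   check_inputs "$bin/ugawg_hsc" "$mesh" "$mach" \
#     "$delta/delta50k-metric.solb" rocProf.txt || exit 1
```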
Allocate an mi60 node and run:
salloc -N 1 -p mi60 -t 30
cd /path/to/run/dir
./runDeltaProf.sh
This should produce rocProf.json, which can be viewed in a Chrome-based browser following the instructions here:
https://aras-p.info/blog/2017/01/23/Chrome-Tracing-as-Profiler-Frontend/
Create a file named runDeltaOshTime.sh with the following contents; edit the paths for bin and delta.
#!/bin/bash
bin=/path/to/omegah/build/directory
delta=/path/to/parallel-adapt/delta-wing/fun3d-fv-lp2
mesh=$delta/delta50k.meshb
mach=$delta/delta50k-mach.solb
for case in 50k 500k; do
  metric=$delta/scaled-metric/delta${case}-metric.solb
  [ "$case" == "50k" ] && metric=$delta/delta${case}-metric.solb
  for opt in time pool timePool; do
    arg=""
    [ "$opt" == "time" ] && arg="--osh-time" && export HIP_LAUNCH_BLOCKING=1
    [ "$opt" == "pool" ] && arg="--osh-pool" && unset HIP_LAUNCH_BLOCKING
    [ "$opt" == "timePool" ] && arg="--osh-time --osh-pool" && export HIP_LAUNCH_BLOCKING=1
    echo $case $arg
    $bin/ugawg_hsc $arg $mesh $mach $metric $case &> ${case}-${opt}.log
  done
done
Make it executable:
chmod +x runDeltaOshTime.sh
Allocate an mi60 node and run:
salloc -N 1 -p mi60 -t 30
cd /path/to/run/dir
./runDeltaOshTime.sh
Runs using --osh-time append a top-down and a bottom-up tree of functions, sorted by time in descending order, to the end of their output.
The --osh-pool argument enables use of an internal memory pool instead of device runtime allocation calls.
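The option handling in runDeltaOshTime.sh can also be written as a case statement. The sketch below is an alternative phrasing of the same mapping, not the script's actual code, and set_osh_mode is a hypothetical helper name. The script sets HIP_LAUNCH_BLOCKING=1 whenever --osh-time is used, presumably so that kernel launches complete before the host-side timers stop; that rationale is an assumption and is not stated in the source.

```shell
# Hypothetical helper: map a run mode to ugawg_hsc arguments and the
# HIP_LAUNCH_BLOCKING setting, mirroring the three modes in the script.
set_osh_mode() {
  case "$1" in
    time)     arg="--osh-time";            export HIP_LAUNCH_BLOCKING=1 ;;
    pool)     arg="--osh-pool";            unset HIP_LAUNCH_BLOCKING ;;
    timePool) arg="--osh-time --osh-pool"; export HIP_LAUNCH_BLOCKING=1 ;;
    *)        echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Example usage inside the inner loop of runDeltaOshTime.sh:
#   set_osh_mode "$opt" || exit 1
#   $bin/ugawg_hsc $arg $mesh $mach $metric $case > "${case}-${opt}.log" 2>&1
```

Unlike the chained tests in the script, the case form rejects an unrecognized mode instead of silently running with an empty argument list.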
https://github.com/RadeonOpenCompute/ROCm/issues/1212 was resolved with ROCm 3.9.