WONTFIX: ATS-2 XL issues with Trilinos #5238

Closed
jjellio opened this issue May 23, 2019 · 12 comments
Labels
ATDM Env Issue Issue with ATDM build or test caused (at least partly) by the env, not a bug in Trilinos client: ATDM Any issue primarily impacting the ATDM project impacting: configure or build The issue is primarily related to configuring or building type: bug The primary issue is a bug in Trilinos code or tests

Comments

@jjellio
Contributor

jjellio commented May 23, 2019

Bug Report

It is not obvious who to mark for this.
@crtrott @mhoemmen @mwglass @bartlettroscoe

Description

This issue encompasses several problems:
1. CMake w/Power9+XL generates bogus optimization flags -O -DNDEBUG
2. Need lower optimization settings for specific files (needed feature)
3. Kokkos does not set the proper architecture flags for Power9 w/XL. It should set at least -qarch=pwr9 -qtune=pwr9 and, on machines where the build node matches the compute nodes, -qcache=auto.
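
As a rough illustration of item 3, the flags named there could be assembled by hand until Kokkos sets them itself. This is only a sketch: the function name `xl_power9_flags` and its `native` argument are hypothetical, and the flags are simply the ones listed above.

```shell
# Hypothetical helper: print the XL architecture flags for Power9 from item 3.
# Pass "native" when the build node has the same CPU as the compute nodes,
# which is the only case where -qcache=auto is appropriate.
xl_power9_flags() {
  flags="-qarch=pwr9 -qtune=pwr9"
  if [ "${1:-}" = "native" ]; then
    flags="${flags} -qcache=auto"
  fi
  printf '%s\n' "${flags}"
}

xl_power9_flags native
```

The output could then be appended to CMAKE_CXX_FLAGS in a configure script like the one below.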

Steps to Reproduce

Lassen:

Environment:

module load git
module swap xl/2019.02.07
module swap cuda/9.2.148 cuda/10.1.105
module load hdf5-parallel/1.10.4
module load netcdf-c/4.6.3
module load lapack/3.8.0-P9-xl-2018.11.26
module load cmake/3.12.1

export WORKSPACE="/usr/workspace/wsa/jjellio/"

export NVCC_WRAPPER_DEFAULT_COMPILER=$(command -v xlC)
export OMPI_CXX="$HOME/src/Trilinos-jje/packages/kokkos/bin/nvcc_wrapper"
export LLNL_USE_OMPI_VARS="Y"

Configure

# jje: tokenize_env is just a function I wrote that creates a standard flat token that
#  spells out compiler/mpi/cuda/host vendor and version. e.g.,
#  mutrino_hsw_intel-1.2.3_cray-mpich-1.2.3_cuda-6.6.6
# so replace this with whatever
BUILD="lassen_xl-16_cuda-10.1_spectrum" #$(tokenize_env)
# if you provide TRILINOS_PATH, the script will use it; otherwise, it looks in the parent directory
# e.g., if you build inside a git worktree
TRILINOS_PATH=${TRILINOS_PATH:-"$(cd ..; realpath $PWD)"}
# grab the Trilinos version
TRILINOS_VERSION=`grep "Trilinos_VERSION " ${TRILINOS_PATH}/Version.cmake | cut -f2 -d' ' | tr -d ')'`
# grab the git sha, repo, and branch
TRILINOS_SHA=`git --git-dir=${TRILINOS_PATH}/.git rev-parse HEAD`
TRILINOS_REPO=$(basename -s .git `git --git-dir=${TRILINOS_PATH}/.git config --get remote.origin.url`)
TRILINOS_BRANCH=$(git --git-dir=${TRILINOS_PATH}/.git rev-parse --abbrev-ref HEAD)

TRILINOS_DIR="${TRILINOS_PATH}"
TRILINOS_INSTALL=${TRILINOS_PATH}/install-${TRILINOS_REPO}-${TRILINOS_BRANCH}/${TRILINOS_SHA}/${BUILD}

echo "$TRILINOS_INSTALL"

KOKKOS_ARCH="Power9;Volta70"
ARCH="Power9;Volta70"


ARGS=(
  -D "BUILD_SHARED_LIBS:BOOL=OFF"
  -D "CMAKE_BUILD_TYPE:STRING=Release"
  -D "CMAKE_BUILD_WITH_INSTALL_RPATH:BOOL=OFF"
  -D "CMAKE_INSTALL_PREFIX=$TRILINOS_INSTALL"
  -D "CMAKE_CXX_FLAGS=-O3 -qmaxmem=-1 -qarch=pwr9 -qtune=pwr9 -qcache=auto -qstrict=nolibrary -qhot=fastmath"
  -D "KOKKOS_ARCH:STRING=${KOKKOS_ARCH}"
  -D "Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=ON"
  -D "Trilinos_ENABLE_Fortran:BOOL=OFF"
  -D "MueLu_ENABLE_Kokkos_Refactor_Use_By_Default:BOOL=ON"
  -D "Trilinos_ENABLE_TESTS:BOOL=OFF"
  -D "Trilinos_ENABLE_EXAMPLES:BOOL=OFF"
  -D "Trilinos_CXX11_FLAGS=-std=c++11 --expt-extended-lambda"
  -D "Kokkos_ENABLE_Cuda_Lambda:BOOL=ON"
  -D "Kokkos_ENABLE_Cuda_UVM:BOOL=ON"
  -D "Kokkos_ENABLE_Profiling:BOOL=ON"

  -D "Tpetra_INST_SERIAL:BOOL=ON"
  -D "Tpetra_INST_CUDA:BOOL=ON"

  -D "MueLu_ENABLE_Kokkos_Refactor=ON"
  -D "Xpetra_ENABLE_Kokkos_Refactor=ON"

  -D "Trilinos_ENABLE_Panzer=ON"
  -D "Panzer_ENABLE_TESTS:BOOL=OFF"
  -D "Panzer_ENABLE_EXAMPLES:BOOL=OFF"
  -D "Trilinos_ENABLE_PanzerMiniEM=ON"
  -D "PanzerMiniEM_ENABLE_EXAMPLES=ON"
  -D "PanzerMiniEM_ENABLE_TESTS=ON"

  -D "Trilinos_ENABLE_SEACASExodiff:BOOL=OFF"
  -D "Trilinos_ENABLE_SEACASEpu:BOOL=OFF"
  -D "Trilinos_ENABLE_SEACASNemspread:BOOL=OFF"
  -D "Trilinos_ENABLE_SEACASNemslice:BOOL=OFF"
  -D "Trilinos_ENABLE_SEACASAprepro:BOOL=OFF"
  -D "Teuchos_KOKKOS_PROFILING:BOOL=ON"

  # pulled these from CMakeCache.txt on waterman, look for the comment"
  # //Set in /ascldap/users/jhu/software/trilinos/Trilinos/cmake/std/atdm/ATDMDevEnvSettings.cmake"
  -D "Panzer_ENABLE_FADTYPE:STRING=Sacado::Fad::DFad<RealType>"
  -D "Phalanx_KOKKOS_DEVICE_TYPE:STRING=CUDA"
  -D "Sacado_ENABLE_HIERARCHICAL_DFAD:BOOL=ON"

  -D "TPL_FIND_SHARED_LIBS:BOOL=ON"
  -D "TPL_ENABLE_CUSPARSE:BOOL=ON"

  -D "TPL_ENABLE_HDF5:BOOL=ON"
    -D "HDF5_INCLUDE_DIRS:PATH=${HDF5}/include"
    -D "HDF5_LIBRARY_DIRS:PATH=${HDF5}/lib"
  -D "TPL_ENABLE_Netcdf:BOOL=ON"
    -D "Netcdf_LIBRARY_NAMES=netcdf;hdf5;z"
    -D "Netcdf_INCLUDE_DIRS=${NETCDF}/include"
    -D "Netcdf_LIBRARY_DIRS=${NETCDF}/lib;${HDF5}/lib"

  -D "TPL_ENABLE_BLAS:BOOL=ON"
    -D "BLAS_INCLUDE_DIRS:PATH=${LAPACK_INC}"
    -D "BLAS_LIBRARY_DIRS:PATH=${LAPACK_DIR}"
  -D "TPL_ENABLE_LAPACK:BOOL=ON"
    -D "LAPACK_INCLUDE_DIRS:PATH=${LAPACK_INC}"
    -D "LAPACK_LIBRARY_DIRS:PATH=${LAPACK_DIR}"

  -D "F77_BLAS_MANGLE:STRING=(name,NAME) name"
  -D "F77_FUNC:STRING=(name,NAME) name"
  -D "F77_FUNC_:STRING=(name,NAME) name"

  -D "TPL_ENABLE_MPI:BOOL=ON"
  -D "TPL_ENABLE_CUDA:BOOL=ON"
  -D "TPL_ENABLE_DLlib:BOOL=ON"

  # HDF5 dependency, for linking SEACAS executables
  #-D "Trilinos_EXTRA_LINK_FLAGS:STRING=-lz -ldl -lm"
  # we could avoid this by marking these libraries as needed libs w/a TPL (Lapack/Blas/etc.)
  -D "Trilinos_EXTRA_LINK_FLAGS:STRING=-lz -llapack -lblas -lcusparse -lcublas -lcufft -lcudart -lxlopt -lxl -lxlf90 -lxlfmath -lm  -L/usr/tce/packages/xl/xl-2019.02.07/xlf/16.1.1/lib/"
)

echo cmake ${KEEP_TEMP_FILES} "${ARGS[@]}" ${TRILINOS_DIR} | tee configure.txt

cmake ${KEEP_TEMP_FILES} "${ARGS[@]}" ${TRILINOS_DIR} | tee configure.log

#### Now, fix the compiler flags for certain packages:
sed -i -e "s/-O3/-O2 -qinline=noauto:level=1/g" packages/belos/src/CMakeFiles/belos.dir/flags.make
sed -i -e "s/-O3/-O2 -qinline=noauto:level=1/g" packages/belos/tpetra/src/CMakeFiles/belostpetra.dir/flags.make
sed -i -e "s/-O3/-O2 -qinline=noauto:level=1/g" packages/muelu/src/Interface/CMakeFiles/muelu-interface.dir/flags.make
# I end up with -O -D.. which is saying optimization without a level... makes no sense
find -name "flags.make" -print0 | xargs -0 sed -i -e 's/-O -DNDEBUG/-DNDEBUG/g'
find -name "link.txt" -print0 | xargs -0 sed -i -e 's/-O -DNDEBUG/-DNDEBUG/g'

Note the hackery at the end: the find commands strip the odd -O optimization flag.

The subpackage optimization flags have to be set this way because <Package>_CXX_FLAGS cannot be used: the CMake route ends up with multiple optimization flags on the command line, which is not good (-O3 turns on things we explicitly want off).

The hack above will not work with Ninja, because the flags all live in a single build.ninja file and a substitution there would affect the entire build.
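
The per-target substitution above can be wrapped in a small function so it is applied the same way to each file. A sketch only, assuming the Makefile generator (one flags.make per target); the function name `lower_opt` is hypothetical, and it writes through a temporary file rather than relying on `sed -i`.

```shell
# Hypothetical helper: in each flags.make given as an argument, replace -O3
# with the lower optimization setting used in the script above.
lower_opt() {
  for f in "$@"; do
    if [ -f "$f" ]; then
      # Write to a temporary file first, then move it over the original.
      sed -e 's/-O3/-O2 -qinline=noauto:level=1/g' "$f" > "$f.tmp" \
        && mv "$f.tmp" "$f"
    else
      echo "skipping missing file: $f" >&2
    fi
  done
}
```

From the build directory this would be called as, for example, `lower_opt packages/belos/src/CMakeFiles/belos.dir/flags.make`.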

@jjellio jjellio added the type: bug The primary issue is a bug in Trilinos code or tests label May 23, 2019
@mhoemmen
Contributor

@trilinos/framework

@jhux2
Member

jhux2 commented May 23, 2019

@pwxy

@jhux2
Member

jhux2 commented May 23, 2019

  1. CMake w/Power9+XL generates bogus optimization flags -O -DNDEBUG

This is valid according to the XL documentation. It's actually nvcc_wrapper that doesn't like it. @nmhamster fixed this, see kokkos/nvcc_wrapper#28.

@jjellio
Contributor Author

jjellio commented May 23, 2019

Generating the optimization flag -O just seems weird to me. Legal or not, it would be saner to specify a level.

The good news is that my binary generated with the love shown above is actually running.

@trilinos trilinos deleted a comment from jjellio May 23, 2019
@bartlettroscoe bartlettroscoe added the client: ATDM Any issue primarily impacting the ATDM project label Jun 3, 2019
@bartlettroscoe
Member

@jjellio, can we reproduce this on 'vortex'?

@bartlettroscoe bartlettroscoe added the ATDM Env Issue Issue with ATDM build or test caused (at least partly) by the env, not a bug in Trilinos label Jun 3, 2019
@sebrowne
Contributor

I can confirm that you can replicate the -O thing on Vortex. I found it valid, because it just says "use the default optimization level for XL", which is -O2 incidentally. We're turning it up to -O3 (without the redundant warning from nvcc_wrapper) like this:

-D CMAKE_C_FLAGS="$EXTRA_C_FLAGS -O3"
-D CMAKE_CXX_FLAGS="$EXTRA_CXX_FLAGS -ccbin xlC -O3"
-D CMAKE_Fortran_FLAGS="$EXTRA_F_FLAGS -O3"
-D CMAKE_C_FLAGS_RELEASE_OVERRIDE="-DNDEBUG"
-D CMAKE_CXX_FLAGS_RELEASE_OVERRIDE="-DNDEBUG"
-D CMAKE_Fortran_FLAGS_RELEASE_OVERRIDE="-DNDEBUG"
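
To confirm that an override like this took effect, one could check a generated flag string for a level-less -O. A minimal sketch; the function name `has_bare_O` is hypothetical, and it assumes the flags are separated by single spaces.

```shell
# Hypothetical check: succeed if the flag string contains a bare -O
# (an optimization flag without a level), fail otherwise.
has_bare_O() {
  # Pad with spaces so -O matches only as a whole token, not inside -O3.
  case " $1 " in
    *" -O "*) return 0 ;;
    *) return 1 ;;
  esac
}

if has_bare_O "-O -DNDEBUG"; then
  echo "bare -O present"
fi
```

With the Makefile generator, the string to test could be taken from the CXX_FLAGS line of a target's flags.make.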

@bartlettroscoe
Member

Does anyone want to try contributing a new 'ats2' env for the ATDM Trilinos builds, as described at:

?

For an example that works with SPARC, see:

  • Trilinos/cmake/std/atdm/cee-rhel6/

and:

  • Trilinos/cmake/std/atdm/waterman/

If I am the only person who can add new platforms and envs with this system then this will not be sustainable. (NOTE: @fryeguy52 has added new systems as has @bathmatt with the 'sems-rhel7' env so it can be done.)

I will review any PR that anyone wants to contribute.

@jjellio
Contributor Author

jjellio commented Jun 22, 2019

Ross,

I had looked at adding one, but the systems currently do not provide the needed ATDM TPLs. SPARC built their own. HDF5 in particular is a mess on these machines (they have a version installed in the system path /usr/lib and include, which given the push for shared libraries is a royally stupid thing to do).

It also isn't clear if OpenMP is really working on these machines. I know XL tends to die if OMP is turned on.

@bartlettroscoe
Member

@jjellio said:

I had looked at adding one, but the systems currently do not provide the needed ATDM TPLs.

Yea, that is a problem. That is why we are working on a Spack build of these TPLs (see CDOFA-41). I have a meeting set up with Greg Becker at LLNL next week to see if we can fast-track a Spack-based build that we have been working on for these TPLs on the ATS-2 systems (see CDOFA-51).

In the meantime, I think we can extend the ATDM Trilinos configuration setup to allow disabling various TPLs on systems like this until we can get them installed and working. We can add support for env vars ATDM_CONFIG_<tpl-name>_DISABLE, which are undefined by default but, if set to TRUE, will disable the TPL in the file ATDMDevEnvSettings.cmake. We can then set ATDM_CONFIG_HDF5_DISABLE=TRUE, ATDM_CONFIG_NETCDF_DISABLE=TRUE, ... in the file cmake/std/atdm/ats2/environment.sh. If you have a working BLAS and LAPACK, then you can test a good bit of Trilinos (just not SEACAS and packages downstream from SEACAS). That would let us get a basic sense of whether the compiler and MPI work on new systems by running a good bit of the native Trilinos test suite (just not any of the ATDM APPs). And it would allow us to reproduce problems like those called out in this issue and #5404.
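
The intended semantics of the proposed variables might look roughly like the following. This is purely a sketch: the proposal above handles the disable inside the CMake settings file, whereas this shell function (hypothetical name `atdm_tpl_disable_args`) only translates each variable into a TPL_ENABLE option to show the idea. The TPL list is an illustrative assumption.

```shell
# Hypothetical sketch: for every ATDM_CONFIG_<tpl-name>_DISABLE set to TRUE,
# print a matching option that switches the TPL off. Unset means "keep enabled".
atdm_tpl_disable_args() {
  # Each pair is <env var part>:<TPL name as used in the configure options>.
  for pair in HDF5:HDF5 NETCDF:Netcdf; do
    var_part=${pair%%:*}
    tpl_name=${pair#*:}
    # Indirect lookup of ATDM_CONFIG_<var_part>_DISABLE.
    eval "val=\${ATDM_CONFIG_${var_part}_DISABLE:-}"
    if [ "$val" = "TRUE" ]; then
      printf '%s\n' "-D TPL_ENABLE_${tpl_name}:BOOL=OFF"
    fi
  done
}

ATDM_CONFIG_HDF5_DISABLE=TRUE
atdm_tpl_disable_args
```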

They have a version installed in the system path /usr/lib and include, which given the push for shared libraries is a royally stupid thing to do

Have we reported that to them yet? I am sure they can delete some unusable TPLs on ATS-2 if we ask them and explain why this is a problem.

It also isn't clear if OpenMP is really working on these machines. I know XL tends to die if OMP is turned on.

An ATDM Trilinos env does not have to support all of the Kokkos backends. For now, the cmake/std/atdm/ats2/environment.sh can just error out if the user tries to use openmp. All we care about for ATS-2 is the CUDA backend, right?
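
Erroring out on openmp could be as simple as the check below. A sketch under assumptions: the function name `ats2_check_backend` and the build-name strings are hypothetical, and the real environment script may detect the backend differently.

```shell
# Hypothetical check for an ats2 environment script: reject any build name
# that asks for the OpenMP backend. Matching is case-insensitive.
ats2_check_backend() {
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *openmp*)
      echo "ERROR: the openmp backend is not supported in the ats2 env" >&2
      return 1
      ;;
  esac
  return 0
}
```

A sourced script would call something like `ats2_check_backend "$build_name" || return 1` before setting up anything else, so CUDA builds pass through and OpenMP builds stop early.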

@jhux2
Member

jhux2 commented Jun 28, 2019

Adding @ikarlin, as a Lassen POC.

@bartlettroscoe bartlettroscoe added the impacting: configure or build The issue is primarily related to configuring or building label Feb 11, 2020
@bartlettroscoe
Member

All: This raises the general question: what level of support will Trilinos provide for the XL compiler on ATS-2 (which is the officially supported compiler on that system)? EMPIRE does not use XL, but SPARC does.

@bartlettroscoe bartlettroscoe changed the title Trilinos: Lassen/IBM build issues/bugs/challenges Trilinos: ATS-2 XL issues with Trilinos Nov 20, 2020
@bartlettroscoe
Member

Actually, I am going to go ahead and close this issue. SPARC has been using Trilinos with CUDA+XL for a long time. There are some new problems with the CUDA+XL build of Trilinos for SPARC but those should be separate GitHub issues.

Closing this as "worksforme".

@bartlettroscoe bartlettroscoe changed the title Trilinos: ATS-2 XL issues with Trilinos WONTFIX: ATS-2 XL issues with Trilinos Nov 20, 2020

5 participants