Replace bash env files with modules (#238)
* Pass machine name to build scripts.

* Use modules environment instead of shell scripts.

* Leave conda activation to the user.

* Remove set_machine script.

* Rename env to modulefiles

* Minor fix.

* Minor fix

* Take out *module purge* from modulefiles and put it in devbuild.sh

* Activate conda directly in singularity modulefile.

* Minor fixes.

* Add Gaea modulefiles.

* Restore odin env files.

* Bug fixes in singularity modulefiles.

* Move activation of Lmod to devbuild.sh

* Don't do 'module purge' on cray systems

* Put Lmod initialization code in separate script.

* Go back to using modulefile for odin.

* Optionally pass machine name to lmod-setup.sh

* Modify odin wflow modulefile.

* Allow unknown platforms in devbuild.sh

* Update documentation.

* Move cmake init out of lmod-setup.sh on odin

* Also update markup language build documentation.

* Lmod setup script for both bash and tcsh login shells.

* Some fixes for tcsh login shell.

* Add singularity platform to lmod-setup
danielabdi-noaa authored May 1, 2022
1 parent d8340a7 commit 74d9249
Showing 44 changed files with 706 additions and 532 deletions.
49 changes: 33 additions & 16 deletions devbuild.sh
@@ -3,7 +3,7 @@
# usage instructions
usage () {
cat << EOF_USAGE
Usage: $0 [OPTIONS]...
Usage: $0 --platform=PLATFORM [OPTIONS]...
OPTIONS
-h, --help
@@ -93,8 +93,6 @@ BUILD_JOBS=4
CLEAN=false
CONTINUE=false
VERBOSE=false
# detect PLATFORM (MACHINE)
source ${SRC_DIR}/env/detect_machine.sh

# process required arguments
if [[ ("$1" == "--help") || ("$1" == "-h") ]]; then
@@ -138,17 +136,32 @@ while :; do
shift
done

# check if PLATFORM is set
if [ -z "${PLATFORM}" ] ; then
  printf "\nERROR: Please set PLATFORM.\n\n"
  usage
  exit 1
fi

# set PLATFORM (MACHINE)
MACHINE="${PLATFORM}"
printf "PLATFORM(MACHINE)=${PLATFORM}\n" >&2

set -eu

# automatically determine compiler
if [ -z "${COMPILER}" ] ; then
case ${PLATFORM} in
jet|hera) COMPILER=intel ;;
jet|hera|gaea) COMPILER=intel ;;
orion) COMPILER=intel ;;
wcoss_dell_p3) COMPILER=intel ;;
cheyenne) COMPILER=intel ;;
macos) COMPILER=gccgfortran ;;
*) printf "ERROR: Unknown platform ${PLATFORM}\n" >&2; usage >&2; exit 1 ;;
macos|singularity) COMPILER=gnu ;;
odin) COMPILER=intel ;;
*)
COMPILER=intel
printf "WARNING: Setting default COMPILER=intel for new platform ${PLATFORM}\n" >&2;
;;
esac
fi
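The default-compiler logic above can be exercised outside the build script. Below is a minimal standalone sketch of the same case statement; the function name `pick_compiler` is hypothetical, introduced only for this illustration:

```shell
#!/bin/sh
# Sketch of devbuild.sh's compiler auto-detection: known platforms get an
# explicit default; unknown platforms fall back to intel with a warning.
# The function name pick_compiler is hypothetical, not part of the app.
pick_compiler () {
  platform="$1"
  case ${platform} in
    jet|hera|gaea)     echo intel ;;
    orion)             echo intel ;;
    wcoss_dell_p3)     echo intel ;;
    cheyenne)          echo intel ;;
    macos|singularity) echo gnu ;;
    odin)              echo intel ;;
    *)
      printf "WARNING: Setting default COMPILER=intel for new platform %s\n" "${platform}" >&2
      echo intel
      ;;
  esac
}

pick_compiler hera         # intel
pick_compiler singularity  # gnu
pick_compiler mysterybox   # intel (with a warning on stderr)
```

Note that because unknown platforms now warn instead of erroring out, `devbuild.sh` can proceed on machines it has never seen, provided a matching modulefile exists.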

@@ -159,18 +172,19 @@ if [ "${VERBOSE}" = true ] ; then
settings
fi

# set ENV_FILE for this platform/compiler combination
ENV_FILE="${SRC_DIR}/env/build_${PLATFORM}_${COMPILER}.env"
if [ ! -f "${ENV_FILE}" ]; then
printf "ERROR: environment file does not exist for platform/compiler\n" >&2
printf " ENV_FILE=${ENV_FILE}\n" >&2
# set MODULE_FILE for this platform/compiler combination
MODULE_FILE="build_${PLATFORM}_${COMPILER}"
if [ ! -f "${SRC_DIR}/modulefiles/${MODULE_FILE}" ]; then
printf "ERROR: module file does not exist for platform/compiler\n" >&2
printf " MODULE_FILE=${MODULE_FILE}\n" >&2
printf " PLATFORM=${PLATFORM}\n" >&2
printf " COMPILER=${COMPILER}\n\n" >&2
printf "Please make sure PLATFORM and COMPILER are set correctly\n" >&2
usage >&2
exit 64
fi

printf "ENV_FILE=${ENV_FILE}\n" >&2
printf "MODULE_FILE=${MODULE_FILE}\n" >&2
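The modulefile existence check above can be sketched as a self-contained snippet; the directory and modulefile here are throwaway stand-ins created just for the demonstration, and the function name `check_module_file` is hypothetical:

```shell
#!/bin/sh
# Sketch of the MODULE_FILE existence check in devbuild.sh, using a
# temporary directory in place of ${SRC_DIR}/modulefiles.
SRC_DIR=$(mktemp -d)
mkdir -p "${SRC_DIR}/modulefiles"
touch "${SRC_DIR}/modulefiles/build_hera_intel"   # pretend this modulefile exists

check_module_file () {
  PLATFORM="$1"; COMPILER="$2"
  MODULE_FILE="build_${PLATFORM}_${COMPILER}"
  if [ ! -f "${SRC_DIR}/modulefiles/${MODULE_FILE}" ]; then
    printf "ERROR: module file does not exist for platform/compiler\n" >&2
    return 64   # devbuild.sh exits with status 64 at this point
  fi
  printf "MODULE_FILE=%s\n" "${MODULE_FILE}"
}

check_module_file hera intel                          # MODULE_FILE=build_hera_intel
check_module_file hera gnu 2>/dev/null || echo "missing modulefile"
rm -rf "${SRC_DIR}"
```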

# if build directory already exists then exit
if [ "${CLEAN}" = true ]; then
@@ -228,10 +242,13 @@ if [ "${VERBOSE}" = true ]; then
MAKE_SETTINGS="${MAKE_SETTINGS} VERBOSE=1"
fi

# source the environment file for this platform/compiler combination, then build the code
printf "... Source ENV_FILE and create BUILD directory ...\n"
module use ${SRC_DIR}/env
. ${ENV_FILE}
# Before we go on to load modules, we first need to activate Lmod on some systems
source ${SRC_DIR}/etc/lmod-setup.sh

# source the module file for this platform/compiler combination, then build the code
printf "... Load MODULE_FILE and create BUILD directory ...\n"
module use ${SRC_DIR}/modulefiles
module load ${MODULE_FILE}
module list
mkdir -p ${BUILD_DIR}
cd ${BUILD_DIR}
36 changes: 32 additions & 4 deletions docs/INSTALL
@@ -12,10 +12,38 @@ git clone https://github.com/ufs-community/ufs-srweather-app.git
cd ufs-srweather-app/
./manage_externals/checkout_externals

# Prior to building, you must set up the environment so cmake can find the appropriate compilers
# and libraries. For instructions specific to supported platforms, see the "build_[machine]_[compiler].env
# files in the "env" directory. These files give instructions assuming a bash or ksh login shell, for
# csh and tcsh users you will have to modify the commands for setting envronment variables.
# We can build ufs-srweather-app binaries in two ways.

# Method 1
# ========

# This is the simplest way to build the binaries

./devbuild.sh --platform=PLATFORM

# If compiler auto-detection fails, specify it using

./devbuild.sh --platform=PLATFORM --compiler=COMPILER

# Method 2
# ========

# The above instructions should work at least on Tier-1 systems, if not on all supported machines.
# However, if the build fails for some reason, we can build directly with cmake.

# First, we need to make sure that there is a modulefile "build_[PLATFORM]_[COMPILER]" in the
# "modulefiles" directory. Also, on some systems (e.g. Gaea/Odin) that ship with the Cray
# module tool, we may need to swap it for Lmod. Assuming your login shell is bash, run

source etc/lmod-setup.sh PLATFORM

# and if your login shell is csh/tcsh, source etc/lmod-setup.csh instead.

# From here on, we can assume Lmod is loaded and ready to go. Then we load the specific
# module for a given PLATFORM and COMPILER as follows

module use modulefiles
module load build_[PLATFORM]_[COMPILER]

# Supported CMake flags:
# -DCMAKE_INSTALL_PREFIX Location where the bin/ include/ lib/ and share/ directories containing
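Once the build modulefile is loaded, Method 2 finishes with an out-of-source CMake configure and build. The snippet below only assembles and prints the commands (a dry run), since exact flags vary by platform; the install prefix and job count shown are assumptions for illustration, not project defaults:

```shell
#!/bin/sh
# Dry-run sketch of the Method-2 CMake build step: assemble the commands a
# typical build would run, without executing them. Nothing is configured here.
BUILD_DIR=build
INSTALL_PREFIX=../install   # hypothetical; pick any writable location
CMAKE_FLAGS="-DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX}"

echo "mkdir -p ${BUILD_DIR} && cd ${BUILD_DIR}"
echo "cmake .. ${CMAKE_FLAGS}"
echo "make -j4"
```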
19 changes: 14 additions & 5 deletions docs/RUNTIME
@@ -1,13 +1,22 @@
# Users should load the appropriate python environment for the workflow.
# The workflow requires Python 3, with the packages 'PyYAML', 'Jinja2', and 'f90nml' available.

# For users' convenience, the python environment for the workflow is put in 'ufs-srweather-app/env/wflow_[machine].env'.
# When generating a workflow experiment or running a workflow, users can use this file for a specific machine.
# For users' convenience, the python environment for the workflow can be activated by loading the wflow_[PLATFORM] modulefile

# For example, on Hera:

cd ufs-srweather-app/env
source wflow_hera.env
module load wflow_hera

# Due to an older version of Lmod, inconsistencies with TCL modulefiles, etc., you may have to
# activate conda manually, following the instructions that the previous module command prints.
# Hera is one of those systems, so execute:

conda activate regional_workflow

# After that, we can set up an experiment in the directory

cd regional_workflow/ush

# Once we prepare the experiment file config.sh, we can generate the workflow using

cd ../regional_workflow/ush
./generate_FV3LAM_wflow.sh
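The package requirement stated at the top of this file (Python 3 with PyYAML, Jinja2, and f90nml) can be verified with a short diagnostic loop; this loop is not part of the app, and it only reports status without failing:

```shell
#!/bin/sh
# Check that the Python packages the workflow needs are importable.
# PyYAML imports as "yaml"; the others import under their own names.
# Prints "<pkg>: OK" or "<pkg>: MISSING" for each; purely a diagnostic.
for pkg in yaml jinja2 f90nml ; do
  if python3 -c "import ${pkg}" 2>/dev/null ; then
    echo "${pkg}: OK"
  else
    echo "${pkg}: MISSING"
  fi
done
```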
50 changes: 31 additions & 19 deletions docs/UsersGuide/source/BuildRunSRW.rst
@@ -96,7 +96,9 @@ The cloned repository contains the configuration files and sub-directories shown
+--------------------------------+--------------------------------------------------------+
| ufs_srweather_app.settings.in | SRW App configuration summary |
+--------------------------------+--------------------------------------------------------+
| env | Contains build and workflow environment files |
| modulefiles | Contains build and workflow module files |
+--------------------------------+--------------------------------------------------------+
| etc | Contains Lmod startup scripts |
+--------------------------------+--------------------------------------------------------+
| docs | Contains release notes, documentation, and User's Guide|
+--------------------------------+--------------------------------------------------------+
@@ -123,38 +125,48 @@ Run the executable that pulls in SRW App components from external repositories:
.. _SetUpBuild:
Build with ``devbuild.sh``
==========================

Set up the Build Environment
============================
On Level-1 systems, for which a modulefile is provided under the ``modulefiles`` directory, we can build the SRW App binaries with:

Before building the SRW App, the build environment must be set up for the user's specific platform. There is a set of common modules required to build the SRW App. These are located in the ``env/srw_common`` file. To load the set of common modules, run:
.. code-block:: console
./devbuild.sh --platform=hera
If compiler auto-detection fails for some reason, specify it using

.. code-block:: console
module use <path/to/env/directory>
./devbuild.sh --platform=hera --compiler=intel
If this method doesn't work, we will have to set up the environment manually and build the SRW App binaries with CMake.

where ``<path/to/env/directory>`` is the full path to the ``env`` directory.
.. _SetUpBuild:

Set up the Build/Run Environment
================================

We need to set up our environment to run a workflow or to build the SRW App with CMake. Note that ``devbuild.sh`` does not prepare the environment for workflow runs, so this step is necessary even when the binaries were built successfully with ``devbuild.sh``.

Then, users must set up the platform-specific elements of the build environment. For Level 1 systems, scripts for loading the proper modules and/or setting the correct environment variables can be found in the ``env`` directory of the SRW App in files named ``build_<platform>_<compiler>.env``. Here is a sample directory listing of these build files:
The build environment must be set up for the user's specific platform. First, we need to make sure ``Lmod`` is the tool used for loading modulefiles. That is the case on most systems; however, on some systems such as Gaea/Odin, the default modulefile loader is from Cray, and we need to swap it for ``Lmod``. For example, on Gaea, assuming a ``bash`` login shell, run:

.. code-block:: console
$ ls -l env/
-rw-rw-r-- 1 user ral 1228 Oct 9 10:09 build_cheyenne_intel.env
-rw-rw-r-- 1 user ral 1134 Oct 9 10:09 build_hera_intel.env
-rw-rw-r-- 1 user ral 1228 Oct 9 10:09 build_jet_intel.env
...
source etc/lmod-setup.sh gaea
On Level 1 systems, the commands in the ``build_<platform>_<compiler>.env`` files can be directly copy-pasted into the command line, or the file can be sourced from the ``ufs-srweather-app/env`` directory. For example, on Hera, run:
or, if your login shell is ``csh`` or ``tcsh``, source ``etc/lmod-setup.csh`` instead. If you execute the above command on systems that don't need it, it will simply do a ``module purge``. From here on, we can assume ``Lmod`` is ready to load the modulefiles needed by the SRW App.
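The behavior just described, swapping the Cray module tool on machines that need it and doing a plain ``module purge`` elsewhere, can be sketched as follows. The ``module`` command is stubbed out here so the logic can be followed (and run) anywhere; the real ``etc/lmod-setup.sh`` differs in detail, and the function name ``lmod_setup`` is hypothetical:

```shell
#!/bin/sh
# Illustrative sketch of etc/lmod-setup.sh's branching, with a stubbed
# "module" command so it runs on any machine. Not the real script.
module () { echo "module $*"; }   # stub: just echo what would be done

lmod_setup () {
  # machine name argument is optional and case-insensitive
  machine=$(echo "${1:-}" | tr '[:upper:]' '[:lower:]')
  case ${machine} in
    gaea|odin)
      # Cray systems: swap the Cray module tool for Lmod (details vary)
      echo "swap craype module environment for Lmod on ${machine}"
      ;;
    *)
      module purge   # systems that don't need the swap: just start clean
      ;;
  esac
}

lmod_setup hera   # prints: module purge
lmod_setup gaea   # prints the swap message
```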

.. code-block::
The modulefiles needed for building and running the SRW App are located in the ``modulefiles`` directory. To load the necessary modulefile for a specific ``<platform>`` using ``<compiler>``, run:

.. code-block:: console
source env/build_hera_intel.env
module use <path/to/modulefiles/directory>
module load build_<platform>_<compiler>
from the main ``ufs-srweather-app`` directory to source the appropriate file.
where ``<path/to/modulefiles/directory>`` is the full path to the ``modulefiles`` directory. This works on Level 1 systems, for which a modulefile is provided in the ``modulefiles`` directory.

On Level 2-4 systems, users will need to modify certain environment variables, such as the path to NCEP libraries, so that the SRW App can find and load the appropriate modules. For systems with Lmod installed, one of the current ``build_<platform>_<compiler>.env`` files can be copied and used as a template. To check whether Lmod is installed, run ``echo $LMOD_PKG``, and see if it outputs a path to the Lmod package. On systems without Lmod, users can modify or set the required environment variables with the ``export`` or ``setenv`` commands depending on whether they are using a bash or csh/tcsh shell, respectively:
On Level 2-4 systems, users will need to modify certain environment variables, such as the path to NCEP libraries, so that the SRW App can find and load the appropriate modules. For systems with Lmod installed, one of the current ``build_<platform>_<compiler>`` modulefiles can be copied and used as a template. To check whether Lmod is installed, run ``echo $LMOD_PKG``, and see if it outputs a path to the Lmod package. On systems without Lmod, users can modify or set the required environment variables with the ``export`` or ``setenv`` commands depending on whether they are using a bash or csh/tcsh shell, respectively:

.. code-block::
@@ -599,7 +611,7 @@ The workflow requires Python 3 with the packages 'PyYAML', 'Jinja2', and 'f90nml

.. code-block:: console
source ../../env/wflow_<platform>.env
module load wflow_<platform>
This command will activate the ``regional_workflow`` conda environment. The user should see ``(regional_workflow)`` in front of the Terminal prompt at this point. If this is not the case, activate the regional workflow from the ``ush`` directory by running:

18 changes: 0 additions & 18 deletions env/build_gaea_intel.env

This file was deleted.

22 changes: 0 additions & 22 deletions env/build_jet_intel.env

This file was deleted.

85 changes: 0 additions & 85 deletions env/build_macos_gnu.env

This file was deleted.
