Hera gnu build is broken #66

Closed
RussTreadon-NOAA opened this issue Jan 30, 2025 · 13 comments · Fixed by #67

@RussTreadon-NOAA
Contributor

Attempts to build GSI-utils develop on Hera using the gnu compiler fail with the error message

Lmod has detected the following error: The following module(s) are unknown: "openmpi/4.1.5"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "openmpi/4.1.5"

Also make sure that all modulefiles written in TCL start with the string #%Module

Executing this command requires loading "openmpi/4.1.5" which failed while processing the following module(s):

    Module fullname      Module Filename
    ---------------      ---------------
    stack-openmpi/4.1.5  /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/gcc/9.2.0/stack-openmpi/4.1.5.lua
    gsiutils_hera.gnu    /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/modulefiles/gsiutils_hera.gnu.lua
While processing the following module(s):
    Module fullname      Module Filename
    ---------------      ---------------
    stack-openmpi/4.1.5  /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/gcc/9.2.0/stack-openmpi/4.1.5.lua
    gsiutils_hera.gnu    /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/modulefiles/gsiutils_hera.gnu.lua

The following prepend_path needs to be added to gsiutils_hera.gnu.lua

@@ -2,6 +2,7 @@ help([[
 ]])

 prepend_path("MODULEPATH", "/scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/scratch1/NCEPDEV/jcsda/jedipara/spack-stack/modulefiles")

 local python_ver=os.getenv("python_ver") or "3.11.6"
 local stack_intel_ver=os.getenv("stack_gcc_ver") or "9.2.0"

An additional issue with the move of spack-stack/1.6.0 from /scratch1/NCEPDEV/nems/role.epic/spack-stack to /contrib/spack-stack is that there is no gnu build of spack-stack in /contrib/spack-stack.

This issue is opened to document these problems.
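
A quick way to spot a relocated stack is to check whether each MODULEPATH directory still exists. The helper below is a minimal sketch, not part of GSI-utils or Lmod; the function name `missing_modulepaths` is hypothetical. It takes a colon-separated list (defaulting to the current `MODULEPATH`) and prints every entry that is not a directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption: not provided by Lmod or GSI-utils).
# Prints each entry of a colon-separated path list that is not an
# existing directory. Defaults to the current MODULEPATH.
missing_modulepaths() {
  local path_list="${1:-${MODULEPATH:-}}"
  local IFS=':'
  local dir
  for dir in $path_list; do
    # Skip empty fields such as the one produced by "a::b".
    [ -n "$dir" ] || continue
    # Report the entry only when the directory is absent.
    [ -d "$dir" ] || printf '%s\n' "$dir"
  done
}
```

Passing the path from a modulefile's `prepend_path("MODULEPATH", ...)` line as the argument shows whether that location is still present on the machine before a build is attempted.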

@DavidHuber-NOAA
Collaborator

Opened issue JCSDA/spack-stack#1483 to request the GNU install of the gsi-addon environment.

@RatkoVasic-NOAA

Spack-stack GNU installation on Hera:
/contrib/spack-stack/spack-stack-1.6.0/envs/gnu-fms-2024.01/install/modulefiles/Core

@RatkoVasic-NOAA

Also, can you point me to your modulefile? I see it is looking for old openmpi (4.1.5). New openmpi is here:
module use /scratch4/NCEPDEV/stmp/role.epic/installs/openmpi/modulefiles
module load openmpi/4.1.6

@RussTreadon-NOAA
Contributor Author

> Also, can you point me to your modulefile? I see it is looking for old openmpi (4.1.5). New openmpi is here:
> module use /scratch4/NCEPDEV/stmp/role.epic/installs/openmpi/modulefiles
> module load openmpi/4.1.6

modulefiles/gsiutils_hera.gnu.lua

@RatkoVasic-NOAA

@RussTreadon-NOAA and @DavidHuber-NOAA
I found what GSI used for gsi-addon with the GNU compiler on Hera. It was with old [email protected] and [email protected]
I'm now in the process of reinstalling the whole spack-stack with old [email protected] / [email protected] on the /scratch4 disk. Once that is done, I'll install gsi-addon.

@RussTreadon-NOAA
Contributor Author

Thank you @RatkoVasic-NOAA

@RatkoVasic-NOAA

@RussTreadon-NOAA and @DavidHuber-NOAA new installation of [email protected]/[email protected] is done. Please change -prepend_path line in https://github.com/NOAA-EMC/GSI-utils/blob/develop/modulefiles/gsiutils_hera.gnu.lua

from:
prepend_path("MODULEPATH", "/contrib/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")
to:
prepend_path("MODULEPATH", "/scratch4/NCEPDEV/stmp/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")
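
The path swap can also be scripted. The following is a sketch only: `swap_modulepath` is a hypothetical helper, and it assumes the modulefile contains the `prepend_path("MODULEPATH", "<old>")` line exactly as written and that neither path contains a `|` character.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption: not part of GSI-utils).
# Rewrites the MODULEPATH prepend_path line in a Lua modulefile,
# replacing the old stack location with the new one.
swap_modulepath() {
  local file="$1" old="$2" new="$3" tmp
  tmp=$(mktemp) || return 1
  # "|" is the sed delimiter so the slashes in the paths need no escaping.
  sed "s|prepend_path(\"MODULEPATH\", \"$old\")|prepend_path(\"MODULEPATH\", \"$new\")|" \
    "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through the original file to keep its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

It would be called with the modulefile, the /contrib path, and the /scratch4 path as its three arguments. Running `git diff` afterwards is a reasonable way to confirm that only the one line changed.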

@RussTreadon-NOAA
Contributor Author

RussTreadon-NOAA commented Jan 31, 2025

Thank you @RatkoVasic-NOAA .

The suggested change was made to modulefiles/gsiutils_hera.gnu.lua

diff --git a/modulefiles/gsiutils_hera.gnu.lua b/modulefiles/gsiutils_hera.gnu.lua
index 3eee0ad..8baa086 100644
--- a/modulefiles/gsiutils_hera.gnu.lua
+++ b/modulefiles/gsiutils_hera.gnu.lua
@@ -1,7 +1,7 @@
 help([[
 ]])

-prepend_path("MODULEPATH", "/contrib/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/scratch4/NCEPDEV/stmp/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core")

 local python_ver=os.getenv("python_ver") or "3.11.6"
 local stack_intel_ver=os.getenv("stack_gcc_ver") or "9.2.0"

COMPILER was set to gnu in the shell environment and ush/build.sh was executed. The build ran to completion successfully.

Hera(hfe03):/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop$ more ush/build_gnu.log

Currently Loaded Modules:
  1) stack-gcc/9.2.0          8) sqlite/3.43.2           15) zstd/1.5.2            22) py-numpy/1.23.4  29) sfcio/1.4.1           36) gsi-ncdiag/1.1.2
  2) gnu/9.2.0                9) util-linux-uuid/2.38.1  16) c-blosc/1.21.5        23) bufr/11.7.0      30) nemsio/2.5.4          37) gsiutils_common
  3) openmpi/4.1.6_gnu9.2.0  10) python/3.11.6           17) pkg-config/0.27.1     24) bacio/2.4.1      31) wrf-io/1.2.0          38) prod_util/2.1.1
  4) stack-openmpi/4.1.6     11) nghttp2/1.57.0          18) hdf5/1.14.0           25) w3emc/2.10.0     32) ncio/1.1.2            39) openblas/0.3.24
  5) gettext/0.19.8.1        12) curl/8.4.0              19) netcdf-c/4.9.2        26) sp/2.5.0         33) crtm-fix/2.4.0.1_emc  40) gsiutils_hera.gnu
  6) libxcrypt/4.4.35        13) cmake/3.23.1            20) netcdf-fortran/4.6.1  27) ip/4.3.0         34) git-lfs/2.10.0
  7) zlib/1.2.13             14) snappy/1.1.10           21) py-setuptools/63.4.3  28) sigio/2.3.2      35) crtm/2.4.0.1

+ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/install -DBUILD_UTIL_ALL=ON /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop
-- The C compiler identification is GNU 9.2.0
-- The Fortran compiler identification is GNU 9.2.0

...

-- Configuring done
-- Generating done
-- Build files have been written to: /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/build
+ make -j 8 VERBOSE=

...

[100%] Linking Fortran executable calcstats.x
[100%] Built target calcstats_aero.x
[100%] Built target calcstats.x
+ make install
[  5%] Built target calc_increment_ens_aero.x
[ 10%] Built target cov_calc.x
[ 11%] Built target adderrspec.x

...

-- Installing: /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/install/bin/fov_util.x
-- Installing: /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/install/bin/makeoneobbufr.x
-- Set runtime path of "/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/install/bin/makeoneobbufr.x" to ""
-- Installing: /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/install/bin/zero_biascoeff.x
+ set +x
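
To confirm from a build log which compilers CMake picked up, the identification lines can be pulled out directly. This is a small sketch; `log_compilers` is a hypothetical helper, and it assumes the log contains CMake's standard "compiler identification is" lines, as the log above does.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption: not part of GSI-utils).
# Prints "<language> <compiler id> <version>" for every
# "-- The <language> compiler identification is ..." line in a log.
log_compilers() {
  sed -n 's/^-- The \(.*\) compiler identification is \(.*\)$/\1 \2/p' "$1"
}
```

For the log shown above, `log_compilers ush/build_gnu.log` would be expected to list the C and Fortran compilers as GNU 9.2.0.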

@RussTreadon-NOAA
Contributor Author

A closer look at modulefiles/gsiutils_hera.gnu.lua shows inconsistencies in this file. The file sets

local stack_intel_ver=os.getenv("stack_gcc_ver") or "9.2.0"
local stack_impi_ver=os.getenv("stack_openmpi_ver") or "4.1.5"

but does not reference these intel variables. Instead, we load

load(pathJoin("stack-gcc", stack_gcc_ver))
load(pathJoin("stack-openmpi", stack_openmpi_ver))

We should replace the intel variables with gnu variables

local stack_gcc_ver=os.getenv("stack_gcc_ver") or "9.2.0"
local stack_openmpi_ver=os.getenv("stack_openmpi_ver") or "4.1.5"

When this is done, the gnu build fails with

Lmod has detected the following error:  The following module(s) are unknown: "stack-openmpi/4.1.5"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "stack-openmpi/4.1.5"

Also make sure that all modulefiles written in TCL start with the string #%Module

Executing this command requires loading "stack-openmpi/4.1.5" which failed while processing the following module(s):

    Module fullname    Module Filename
    ---------------    ---------------
    gsiutils_hera.gnu  /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi-utils/develop/modulefiles/gsiutils_hera.gnu.lua

@RatkoVasic-NOAA mentioned this above. We should use 4.1.6. Replacing 4.1.5 with 4.1.6 works.
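
This kind of mismatch, a `local` that is declared but never referenced, can be caught with a simple text scan. The sketch below is a heuristic, not a Lua parser; `unused_locals` is a hypothetical helper name. It lists every `local NAME=` whose name appears only once in the file as a whole word. A name that also occurs inside a string, such as an `os.getenv("...")` argument, counts as a use, so the check can miss cases.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption: not part of GSI-utils or Lmod).
# Heuristic: print names declared with "local NAME =" that occur only
# once in the file as a whole word, i.e. only at their declaration.
unused_locals() {
  local file="$1" name count
  # Extract the identifier from each "local NAME = ..." line.
  sed -n 's/^[[:space:]]*local[[:space:]]\{1,\}\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*=.*/\1/p' "$file" |
  while read -r name; do
    # Count whole-word occurrences of the name anywhere in the file.
    count=$(grep -ow -- "$name" "$file" | wc -l | tr -d ' ')
    [ "$count" -gt 1 ] || printf '%s\n' "$name"
  done
}
```

Run against the four lines quoted above (the two `local` declarations and the two `load` calls), it would flag `stack_intel_ver` and `stack_impi_ver`, since neither name is used after it is set.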

@RussTreadon-NOAA
Contributor Author

@RatkoVasic-NOAA , will the path to the gnu spack-stack remain

/scratch4/NCEPDEV/stmp/role.epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev-rocky8/install/modulefiles/Core

or will it move to a /contrib/spack-stack location?

If the role.epic path is the official location, I'll open a PR to get the updated modulefiles/gsiutils_hera.gnu.lua merged into develop. If the role.epic path is temporary, I won't open a PR until we have the official path.

@RatkoVasic-NOAA

@RussTreadon-NOAA yes. I will not remove it.

@RussTreadon-NOAA
Contributor Author

Thank you @RatkoVasic-NOAA . We can open a PR to get the updated modulefiles/gsiutils_hera.gnu.lua into develop. Thank you for acting quickly so we can close this issue.

@RussTreadon-NOAA
Contributor Author

Work for this issue will be done in bugfix/hera_gnu.
