Install ESMF 8.6.1 and MAPL 2.46.2 -> 2.46.3 #5

Open
junwang-noaa opened this issue Aug 12, 2024 · 30 comments

@junwang-noaa

junwang-noaa commented Aug 12, 2024

Install ESMF 8.6.1 and MAPL 2.46.2 after the netCDF build with zstd is available on WCOSS2.

MAPL 2.46.2 has issues when running with the UFS weather model. MAPL 2.46.3 has the fix; please install 2.46.3 with ESMF 8.6.1.

8/30/2024:
To clarify, MAPL 2.46.3 needs to be installed with ESMF 8.6.1 in both spack-stack 1.6.0 and HPC-stack on Acorn for UFS weather model testing.

@junwang-noaa
Author

Corresponding UFS weather model issues are:

ufs-community/ufs-weather-model#2345

ufs-community/ufs-weather-model#2346

@edwardhartnett
Contributor

There is a build problem that needs to be resolved by the teams:

MAPL 2.46.2/ESMF 8.6.1 (Hang)

This happens on all machines, not just WCOSS2.

@junwang-noaa changed the title from "Install ESMF 8.6.1 and MAPL 2.46.2" to "Install ESMF 8.6.1 and MAPL 2.46.2 -> 2.46.3" on Aug 20, 2024
@edwardhartnett
Contributor

Is there a new release of MAPL now? @Hang-Lei-NOAA can you try installing it?

@junwang-noaa mentioned this issue on Aug 26, 2024
@Hang-Lei-NOAA

Hang-Lei-NOAA commented Aug 26, 2024 via email

@edwardhartnett
Contributor

@DusanJovic-NOAA will test installation provided by @AlexanderRichert-NOAA on acorn.

@DusanJovic-NOAA

> @DusanJovic-NOAA will test installation provided by @AlexanderRichert-NOAA on acorn.

I do not have any information about Alex's installation on Acorn. I looked at linked ufs-weather-model issues. Where is it?

@Hang-Lei-NOAA

Hang-Lei-NOAA commented Aug 30, 2024 via email

@junwang-noaa
Author

@edwardhartnett @AlexanderRichert-NOAA Can you provide details on the installation either in this issue or in ufs-weather-model issue #2345? When is it installed and how to load the module? Without this information, we can't test ufs-weather-model.

@junwang-noaa
Author

@Hang-Lei-NOAA I think Ed said we need to test the library Alex installed. Also, the MAPL version is 2.46.3.

@Hang-Lei-NOAA

Hang-Lei-NOAA commented Aug 30, 2024 via email

@junwang-noaa
Author

@edwardhartnett As discussed in ufs-weather-model issue #2345, MAPL 2.46.2 does not work. But the library at /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core is still using MAPL 2.46.2. Are you going to ask Alex to install a new spack-stack version?

@Hang-Lei-NOAA I assume your installation is for WCOSS2 testing since it uses hpc-stack; is that correct? Also, is your testing working? Can you list the module file location and the test log? Thanks.

@Hang-Lei-NOAA

Hang-Lei-NOAA commented Aug 30, 2024 via email

@Hang-Lei-NOAA

Hang-Lei-NOAA commented Aug 30, 2024 via email

@DusanJovic-NOAA

Compilation fails with this error:

Force 32-bit build for GOCART
CMake Error at GOCART/CMakeLists.txt:63 (find_package):
  By not providing "FindGFTL_SHARED.cmake" in CMAKE_MODULE_PATH this project
  has asked CMake to find a package configuration file provided by
  "GFTL_SHARED", but CMake did not find one.

  Could not find a package configuration file provided by "GFTL_SHARED" with
  any of the following names:

    GFTL_SHAREDConfig.cmake
    gftl_shared-config.cmake

  Add the installation prefix of "GFTL_SHARED" to CMAKE_PREFIX_PATH or set
  "GFTL_SHARED_DIR" to a directory containing one of the above files.  If
  "GFTL_SHARED" provides a separate development package or SDK, be sure it
  has been installed.


-- Configuring incomplete, errors occurred!
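The error above means CMake found neither of the two config file names under any prefix it searched. As a rough diagnostic, the sketch below walks every entry of the `CMAKE_PREFIX_PATH` environment variable and lists any matching file. It is a brute-force search, broader than CMake's own procedure (which checks only specific subdirectories of each prefix), and the function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Iterable, List

# File names copied from the CMake error message above.
CONFIG_NAMES = ("GFTL_SHAREDConfig.cmake", "gftl_shared-config.cmake")


def prefixes_from_env(value: str) -> List[str]:
    """Split a CMAKE_PREFIX_PATH-style environment value into prefixes."""
    return [p for p in value.split(os.pathsep) if p]


def find_package_configs(prefixes: Iterable[str],
                         names: Iterable[str] = CONFIG_NAMES) -> List[Path]:
    """Return every file under the given prefixes whose name is in `names`.

    This walks the whole tree under each prefix, so it can report files
    that CMake itself would not consider.
    """
    wanted = set(names)
    hits: List[Path] = []
    for prefix in prefixes:
        # os.walk silently yields nothing for a missing directory.
        for dirpath, _dirnames, filenames in os.walk(prefix):
            hits.extend(Path(dirpath) / f for f in filenames if f in wanted)
    return sorted(hits)


if __name__ == "__main__":
    found = find_package_configs(
        prefixes_from_env(os.environ.get("CMAKE_PREFIX_PATH", "")))
    for path in found:
        print(path)
    if not found:
        print("no GFTL_SHARED config file found under CMAKE_PREFIX_PATH")
```

Running it after loading the stack's modules shows whether the gftl-shared install prefix is visible to the build at all.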

In current spack-stack, gftl-shared module is:

$ ll /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/unified-env/install/modulefiles/intel/2022.0.2.262/gftl-shared/
total 4
-rw-r--r-- 1 alexander.richert nceplibs 1182 Jan  6  2024 1.6.1.lua

in ue-esmf-8.6.1-mapl-2.46.3 stack it is:

$ ll /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/intel/2022.0.2.262/gftl-shared/
total 4
-rw-r--r-- 1 alexander.richert nceplibs 1219 Aug 30 19:40 main.lua
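The two listings above compare one module by hand. A minimal sketch of doing the same comparison for every module in two environments follows, assuming the `<root>/<module>/<version>.lua` layout visible in those listings. The helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Set, Tuple


def module_versions(root: Path) -> Dict[str, Set[str]]:
    """Map module name -> versions found as <root>/<name>/<version>.lua."""
    found: Dict[str, Set[str]] = {}
    for lua in Path(root).glob("*/*.lua"):
        # "1.6.1.lua" -> "1.6.1", "main.lua" -> "main"
        found.setdefault(lua.parent.name, set()).add(lua.stem)
    return found


def diff_modules(old_root: Path,
                 new_root: Path) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Return modules whose version sets differ between two environments."""
    old, new = module_versions(old_root), module_versions(new_root)
    return {
        name: (old.get(name, set()), new.get(name, set()))
        for name in sorted(set(old) | set(new))
        if old.get(name, set()) != new.get(name, set())
    }
```

Called with the two `.../modulefiles/intel/2022.0.2.262` directories quoted above, it would be expected to flag `gftl-shared` (1.6.1 versus main) along with any other module whose version differs.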

@AlexanderRichert-NOAA

According to the MAPL Spack recipe, versions 2.45.x and up require gftl-shared v1.8.0 and up. I can use v1.8.0 or v1.9.0, or I can chance it with 1.6.1 but no promises it wouldn't break anything.
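The constraint described here can be written as a small check. This is a sketch of the rule as stated in this comment only, not a copy of the Spack recipe, and the function names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.46.3' into (2, 46, 3). Raises ValueError for names like 'main'."""
    return tuple(int(part) for part in text.split("."))


def gftl_shared_ok(mapl_version: str, gftl_shared_version: str) -> bool:
    """Check the rule quoted above: MAPL >= 2.45 needs gftl-shared >= 1.8.0.

    For older MAPL versions this returns True only because the thread states
    no constraint for them; other requirements may still apply.
    """
    if parse_version(mapl_version) >= (2, 45):
        return parse_version(gftl_shared_version) >= (1, 8, 0)
    return True
```

Under this rule, MAPL 2.46.3 with gftl-shared 1.6.1 is rejected, which matches the "no promises" caveat above.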

@DusanJovic-NOAA

Whatever, we just need to have exactly the same module version and the same name of the modules on all RDHPCS platforms and Acorn, because we use ufs_common.lua on all of them.

@DusanJovic-NOAA

I also see that the current name of mapl module is mapl/2.46.2-esmf-8.6.1 while the new one is just mapl/2.46.3. If we are changing the naming on Acorn, the new name must be used on all other machines.
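Because the same module name has to resolve on every platform, a naming check can catch this kind of drift early. The sketch below assumes the `mapl/<mapl-version>-esmf-<esmf-version>` pattern implied by `mapl/2.46.2-esmf-8.6.1`, with three-part numeric versions. The regular expression and function name are hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the existing module name mapl/2.46.2-esmf-8.6.1.
MAPL_MODULE_RE = re.compile(r"^mapl/(\d+\.\d+\.\d+)-esmf-(\d+\.\d+\.\d+)$")


def parse_mapl_module(name: str) -> Optional[Tuple[str, str]]:
    """Return (mapl_version, esmf_version), or None if the name does not
    follow the mapl/<version>-esmf-<version> convention."""
    match = MAPL_MODULE_RE.match(name)
    return match.groups() if match else None
```

With this check, `mapl/2.46.3-esmf-8.6.1` is accepted and the bare `mapl/2.46.3` is rejected.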

@AlexanderRichert-NOAA

Okay, I installed with [email protected], and I updated the module file to follow the mapl/xxx-esmf-xxx pattern.

@DusanJovic-NOAA

Thanks.

I ran cpld_control_p8 test and it failed. I see these messages in the stderr file:

pe=00000 FAIL at line=01088    MAPL_CapGridComp.F90                     <status=41>
pe=00000 FAIL at line=01088    MAPL_CapGridComp.F90                     <status=41>
pe=00000 FAIL at line=01560    MAPL_EsmfRegridder.F90                   <destination masking with this regrid type is unsupported>
pe=00000 FAIL at line=01382    MAPL_EsmfRegridder.F90                   <status=1>
pe=00000 FAIL at line=00977    MAPL_AbstractRegridder.F90               <status=1>
pe=00000 FAIL at line=00097    NewRegridderManager.F90                  <status=1>
pe=00000 FAIL at line=01101    GriddedIO.F90                            <status=1>
pe=00000 FAIL at line=04539    ExtDataGridCompMod.F90                   <status=1>
pe=00000 FAIL at line=01468    ExtDataGridCompMod.F90                   <status=1>
pe=00000 FAIL at line=01838    MAPL_Generic.F90                         <status=1>
pe=00000 FAIL at line=01241    MAPL_CapGridComp.F90                     <status=1>
pe=00000 FAIL at line=01204    MAPL_CapGridComp.F90                     <status=1>
pe=00000 FAIL at line=01164    MAPL_CapGridComp.F90                     <status=1>
pe=00000 FAIL at line=00832    MAPL_CapGridComp.F90                     <status=1>
pe=00000 FAIL at line=00972    MAPL_CapGridComp.F90                     <status=1>
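In a trace like the one above, most entries carry only a status code, and the one entry with a text message is usually the most informative. A minimal sketch for pulling that entry out of a stderr file follows, assuming the `pe=... FAIL at line=... <file> <detail>` layout shown in these lines. The class and function names are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class MaplFailure(NamedTuple):
    pe: int       # processing element that reported the failure
    line: int     # source line number
    source: str   # Fortran source file name
    detail: str   # text between the angle brackets


_FAIL_RE = re.compile(
    r"pe=(?P<pe>\d+)\s+FAIL at line=(?P<line>\d+)\s+"
    r"(?P<source>\S+)\s+<(?P<detail>[^>]*)>"
)


def parse_failures(text: str) -> List[MaplFailure]:
    """Collect every 'FAIL at line=' entry, in the order it appears."""
    failures = []
    for raw in text.splitlines():
        m = _FAIL_RE.search(raw)
        if m:
            failures.append(MaplFailure(int(m["pe"]), int(m["line"]),
                                        m["source"], m["detail"]))
    return failures


def first_described_failure(
        failures: List[MaplFailure]) -> Optional[MaplFailure]:
    """Return the first entry whose detail is a message, not 'status=N'."""
    for failure in failures:
        if not failure.detail.startswith("status="):
            return failure
    return None
```

Applied to the trace above, this picks out the `MAPL_EsmfRegridder.F90` entry at line 1560, "destination masking with this regrid type is unsupported".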

@DusanJovic-NOAA

With updated GOCART (head of current develop branch), ufs-weather-model is still failing, this time with the error in SU2G_GridCompMod.F90:

pe=00136 FAIL at line=00193    SU2G_GridCompMod.F90                     <status=41>
pe=00136 FAIL at line=04713    MAPL_Generic.F90                         <status=41>
pe=00136 FAIL at line=04900    MAPL_Generic.F90                         <status=41>
pe=00136 FAIL at line=01338    GOCART2G_GridCompMod.F90                 <status=41>
pe=00136 FAIL at line=01316    GOCART2G_GridCompMod.F90                 <status=41>
pe=00136 FAIL at line=00188    GOCART2G_GridCompMod.F90                 <status=41>

This is probably due to how GOCART is configured in our regression tests.

@Hang-Lei-NOAA

Hang-Lei-NOAA commented Sep 5, 2024 via email

@edwardhartnett
Contributor

@DusanJovic-NOAA and @Hang-Lei-NOAA is there an install of these versions that is working anywhere? That is, is there a successful case of these software packages working together?

@edwardhartnett
Contributor

OK, as a data point, I installed spack-stack-1.8.0 and it installs the correct versions of netCDF (4.9.2), MAPL (2.46.3), and ESMF (8.6.1). Netcdf-c is installed with zstd, only one copy of the netCDF library is installed, and all other applications use that one. So all that is good.

@edwardhartnett
Contributor

Email from Ed:

All,

There is a current issue on WCOSS2-requests: Install ESMF 8.6.1 and MAPL 2.46.2 -> 2.46.3.

Hang has installed the requested versions of ESMF and MAPL, all built and ESMF passed unit testing (MAPL has no tests). All are using netcdf-c-4.9.2.

When the UFS regression tests are run, there are failures with GOCART cases. See the issue for the exact description. This does not seem to be an installation issue, but a software issue. ESMF-8.6.1 and MAPL-2.46.3 are installed correctly. We have tested both with hpc-stack and spack-stack installs, with the same results. On Orion, Brian has apparently encountered the same problems with this combination of software versions.

Hang has experimented and has found when the older MAPL version is used, the regression tests pass.

I'm not sure there is anything further our group can do on this issue. We have installed the software as requested, but cannot fix it, unfortunately. We understand that Brian and Dusan are following up with the MAPL team.

Please let us know if there is anything else we can do to help move this forward.

Thanks,
Ed & Hang

Reply from Jun:

We need a bug fix from MAPL 2.46.3. At Monday's model infrastructure meeting, Barry agreed to take a look at the GOCART failure. Dusan transferred the test case to Hera, and I just tagged Barry.

@DusanJovic-NOAA

@AlexanderRichert-NOAA The ue-esmf-8.6.1-mapl-2.46.3 environment on Acorn does not have g2/3.5.1 and g2tmpl/1.13.0. Can you please add them?

@AlexanderRichert-NOAA

Will do

@AlexanderRichert-NOAA

AlexanderRichert-NOAA commented Sep 17, 2024

@DusanJovic-NOAA, please try /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/upp-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core

@edwardhartnett
Contributor

@DusanJovic-NOAA did you find the versions you need?

@RatkoVasic-NOAA

For your tests, you can find new installations (esmf-8.6.1-mapl-2.46.3) on Orion and Hercules:

Hercules: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core
Orion: /work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core

Included are new g2, g2tmpl and fms.

@edwardhartnett
Contributor

One way I think we went astray here is biting off too much at once.

Can we update ESMF to 8.6.1 and get that all resolved before we upgrade MAPL?
