
Move to Intel oneapi (LLVM) compilers for C, C++ #912

Open
11 of 12 tasks
climbfuji opened this issue Dec 18, 2023 · 24 comments
Assignees: climbfuji
Labels: Epic (For planning and administration), INFRA (JEDI Infrastructure), NAVY (United States Naval Research Lab), NOAA-EMC, OAR-EPIC (NOAA Oceanic and Atmospheric Research and Earth Prediction Innovation Center)

Comments

@climbfuji
Collaborator

climbfuji commented Dec 18, 2023

Is your feature request related to a problem? Please describe.

From @michalakes:

With its 2024 release of oneAPI, Intel is deprecating its "classic" versions of Intel Fortran (ifort) and Intel C/C++ in favor of its new LLVM-based versions. The C/C++ classic compilers are already deprecated in the 2024 release of oneAPI, and classic Fortran will be deprecated by the end of 2024.

The announcements are here:
https://community.intel.com/t5/Intel-oneAPI-Data-Parallel-C/DEPRECATION-NOTICE-Intel-Fortran-Compiler-Classic-ifort/td-p/1545790
https://community.intel.com/t5/Intel-oneAPI-Data-Parallel-C/REMOVAL-NOTICE-Intel-C-Compiler-Classic/td-p/1545804

Based on what I have found so far, we need the 2024 oneAPI release, because Python 3.9 and later don't build with the oneAPI compilers up to and including the latest 2023 release:

While I was at it, I also came across an issue building bison with the oneAPI compilers; adding it here as a reference/reminder:

Describe the solution you'd like

Note that we need [email protected] or later because of bugs in previous releases.

  1. Start with the NEPTUNE standalone environment on Nautilus
  • Have Intel oneAPI compilers installed on NAVY Nautilus
  • Nautilus: Switch site config to use this compiler's icx and icpx (but not yet ifx); see the sketch after this list
  • Make necessary updates to packages and spack configs so that neptune-dev builds
  • Nautilus: Build NEPTUNE standalone and run tests
  2. For NOAA/UFS
  • Have Intel oneAPI compilers installed on an RDHPCS platform of choice
  • Locally, switch site config to use this compiler's icx and icpx (but not yet ifx; keep using ifort)
  • Make necessary updates to packages and spack configs so that unified-dev builds
  • Build ufs-weather-model and run tests: CURRENTLY FAILING, see Build / test UFS WM with Intel oneAPI v2024.2.1 (LLVM)-based stacks #1390
  3. For JCSDA/jedi-bundle
  • Have Intel oneAPI compilers installed on a platform of choice
  • Locally, switch site config to use this compiler's icx and icpx (but not yet ifx; keep using ifort)
  • Make necessary updates to packages and spack configs so that unified-dev builds
  • Build jedi-bundle and run tests (@stiggy87)
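
As an illustration (not part of the original checklist), here is a rough sketch of what such a mixed site compiler entry could look like, with icx/icpx for C/C++ and classic ifort for Fortran. The file path, install prefix, compiler version, and OS label are placeholders for the actual site config and oneAPI installation, and the snippet assumes the file already has a top-level compilers: key.

# Hypothetical example: append a mixed icx/icpx/ifort entry to the site compilers.yaml
# (match the indentation of the entries already present in that file).
cat >> envs/unified-env/site/compilers.yaml <<'EOF'
- compiler:
    spec: oneapi@2024.2.1
    paths:
      cc: /opt/intel/oneapi/compiler/2024.2/bin/icx
      cxx: /opt/intel/oneapi/compiler/2024.2/bin/icpx
      f77: /opt/intel/oneapi/compiler/2024.2/bin/ifort   # keep ifort for now, not ifx
      fc: /opt/intel/oneapi/compiler/2024.2/bin/ifort
    flags: {}
    operating_system: rhel8
    modules: []
    environment: {}
    extra_rpaths: []
EOF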

Additional context

Related issue for JEDI: https://github.com/JCSDA-internal/jedi-bundle/issues/38

@climbfuji added the INFRA, NOAA-EMC, and OAR-EPIC labels Dec 18, 2023
@climbfuji self-assigned this Dec 18, 2023
@climbfuji
Collaborator Author

I started working on this on AWS ParallelCluster - PR to come

@climbfuji moved this from Todo to In Progress in spack-stack-1.7.0 (2024 Q1) Jan 25, 2024
@climbfuji added the Epic label Feb 2, 2024
@climbfuji
Collaborator Author

This is additional information from a duplicate epic/issue that got closed (#969)

This is in addition to the wgrib2 issue I posted (#967).
Here is a list of packages that don't build with the oneAPI 2024.0.2 compilers, updated after switching to ifx instead of ifort (the Python meson builder doesn't accept spack compiler flags and trips over the deprecation messages from ifort; a possible workaround is sketched at the end of this comment):

ai-env-1.0.0-juvu3nik2oqc3tyl3xcsgo7nglecfk2c
ewok-env-1.0.0-cx4jzlczhau3qk45vza3omszgy62a5bd
fms-2023.04-tn6g4hdumwnbwhlykds5acbd7toocgfk
gdal-3.8.3-p6iatrs32e6q7hrkmolzxgvfu6nctxhq
gfsio-1.4.1-mcg3mb3qmgrsaan5tvfjsnkquukg4zd7
global-workflow-env-1.0.0-upb53g4iru5anqukigiw6l3vpc2ey22p
gmao-swell-env-1.0.0-cquu3az5m4hg2b3n2pcijqbhqekf4ccl
grib-util-1.3.0-fbhkry6znyjhgsp34etba7f77utvsohe
gsi-env-1.0.0-dtodfhlos6a5juvha7tug5n7ju4onmz2
jedi-base-env-1.0.0-m5cg44j276uvy6q6vagwgseuv2hqhhv5
jedi-fv3-env-1.0.0-x5s5xtq3bjlngkfyzryaghhxwadae2q3
jedi-geos-env-1.0.0-2wipr3k7iqzb5y5tnho4zczo4c24gvds
jedi-mpas-env-1.0.0-rs5xmdtqjlpj4xtpslqdcm6hg5ovlbi2
jedi-neptune-env-1.0.0-4z7kdwdn6pkxzldlfqppuhpy6fmllmjx
jedi-tools-env-1.0.0-pmzcvsyysygcebfixepaasxlkh3gnfnb
jedi-ufs-env-1.0.0-vt7pgktu5u4jjpy3uhd7jhkdsfeq6fhx
jedi-um-env-1.0.0-4dlsyqaozn5sibyndryflco7vsq243bj
landsfcutil-2.4.1-z7io4g7ccdfhjfewmzdyfrhxjdlxfxri
mapl-2.40.3-f2pmar2sy3sfsivampru23kl5sm6b6ld
met-11.1.0-wfkyd4kqjzkavjqcftvvhvasg3um7efz
metplus-5.1.0-4z353ifekeymyxf2iggimmjcbomai3g5
py-cartopy-0.21.1-2bwaccqbxmjq4o7g4uwwfn34wvolhv2c
py-networkx-3.1-ow75t7mfx6f4ukptwjtv3j3g3dops7na
py-pandas-1.5.3-qpvhapy7fywnusiugkb5tjpj5dynt7na
py-torch-2.1.2-h4bubyoloaxtkwzf7tdn5jui2n662rcw
py-xarray-2023.7.0-tjvaimmrsrc62lpcvallnz4szjo25hie
sfcio-1.4.1-mjvs2ifapl6ulr3aeuibec3o4qwo6quv
sigio-2.3.2-hiv4tdfwdmwt6ubl54b4no6i25i4jysh
soca-env-1.0.0-iklom4kp3topbddcuyzibwalbzdmedwm
ufs-pyenv-1.0.0-pnhke3ht6rtcndfcl7jzkso54exzjxnz
ufs-srw-app-env-1.0.0-arz4cg54i7pprqqedvsn4obfpu364ury
ufs-weather-model-env-1.0.0-lvj3mp2kufgr2wwo26b2q6lj5vrc2ffe
w3nco-2.4.1-cfrrf6gpqqhcquijv4wf5i3a3u3pasph
wgrib2-2.0.8-qlozqyttr5xmme4ox42xoyhosjy3xe3s

Note that some of these failures look like weird bison/flex parsing errors - maybe bison/flex need to be fixed instead?
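
(Aside, not from the original report: if the failure mode really is the deprecation remark in ifort's output, one avenue worth testing is disabling that diagnostic. The remark ID below is an assumption to be verified against the installed compiler, and whether the flag can be injected through spack's meson builder is exactly the open question above.)

# Hypothetical check outside of spack: compile a trivial file with the remark disabled.
printf 'program t\nend program t\n' > /tmp/t.f90
ifort -c /tmp/t.f90 -o /tmp/t.o                       # prints the ifort deprecation remark
ifort -diag-disable=10448 -c /tmp/t.f90 -o /tmp/t.o   # assumed remark ID; remark suppressed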

@jkbk2004

@BrianCurtis-NOAA @zach1221 @FernandoAndrade-NOAA FYI: the LLVM option might be available along with the 1.8 release.

@climbfuji
Collaborator Author

Moving this out to spack-stack-1.9.0. We've come a long way: with the exception of wgrib2, all packages in the unified environment and the NEPTUNE standalone environment build with [email protected]: (still using ifort instead of ifx, as described above).

@ulmononian
Collaborator

ulmononian commented Sep 19, 2024

FYI for Hercules:

module load intel-oneapi-compilers/2024.1.0

module load intel-oneapi-mpi/2021.12.0

@climbfuji I guess we can check off the "Have Intel oneAPI compilers installed on Hercules" box? I had submitted that request a long time ago and they were indeed installed.
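
For completeness, a quick (hypothetical) sanity check of which front ends that module actually provides:

# After "module load intel-oneapi-compilers/2024.1.0", confirm the available front ends:
icx --version
icpx --version
ifx --version
ifort --version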

@climbfuji
Collaborator Author

Thanks for checking, @ulmononian!

Unfortunately, we need [email protected] or later (the latest is 2024.2.1) to work properly :-)

@ulmononian
Collaborator

@climbfuji oh dang, spoke too soon... should've read the bolded note in the issue description.

Do we have anything tracking the existence/installation of the LLVM compilers on specific machines, so that work can go forward to update those once Narwhal/Hercules testing is completed? If not, it might be useful to have an issue tracking that and/or the progress of switching to the new compilers on each machine as we approach the deprecation dates.

@climbfuji
Collaborator Author

There is one for installing the very latest version of the classic compilers (oneapi 2023.2.?) on the NOAA machines, but none for the LLVM-based oneAPI compilers. These are still relatively new, and we don't need full coverage across all platforms until we've established a version that works for all applications.

@climbfuji
Collaborator Author

I will also note that it is easier and faster to install oneAPI ourselves and apply the fixes for the known bugs, rather than waiting for the sysadmins to do this.
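
For example, a sketch of that route using spack's builtin intel-oneapi-compilers package (the version and the 2024-style compiler/latest/bin layout are assumptions to verify):

# Self-install the compilers with spack and register them for the site config:
spack install intel-oneapi-compilers@2024.2.1
spack compiler find "$(spack location -i intel-oneapi-compilers@2024.2.1)/compiler/latest/bin"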

@climbfuji
Collaborator Author

@srherbener @stiggy87 @eap We've come very far with the push to the Intel LLVM compilers for C/C++ for the spack-stack-1.9.0 release. All that is needed is to build and test jedi-bundle with a combination of icx, icpx, and ifort from Intel oneAPI 2024.2.0 or 2024.2.1.

We know from CI that the unified environment builds, but we don't know if JEDI runs successfully or not. Can someone please take care of this and report back, so that we can close this issue as completed? It's perhaps the most important deliverable for spack-stack-1.9.0. Thanks very much!
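
For anyone picking this up, a rough sketch of the requested build/test sequence, assuming the unified environment and its JEDI meta-module are already loaded (repository location, build directory, and job count are placeholders):

# Configure, build, and test jedi-bundle with the mixed icx/icpx/ifort toolchain:
export CC=icx CXX=icpx FC=ifort
cd /path/to/jedi-bundle && mkdir -p build && cd build
ecbuild ..
make -j8
ctest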

@climbfuji
Collaborator Author

I will also throw in that I recently contributed code changes to JEDI so that the components used in neptune-bundle (oops, ufo, ioda) build even with icx, icpx, ifx from Intel oneAPI 2025.0.0.

@stiggy87
Contributor

stiggy87 commented Jan 9, 2025

I tried building JEDI bundle and I am getting the following errors:

/home/ubuntu/jedi/jedi-bundle/fv3-jedi/src/fv3jedi/FieldMetadata/fields_metadata_mod.f90(12): error #6405: The same named entity from different modules and/or program units cannot be referenced.   [F_C_STRING]
use string_f_c_mod, only: f_c_string, c_f_string
--------------------------^
/home/ubuntu/jedi/jedi-bundle/fv3-jedi/src/fv3jedi/Utilities/fv3jedi_constants_mod.f90(12): error #6405: The same named entity from different modules and/or program units cannot be referenced.   [F_C_STRING]
use string_f_c_mod, only: f_c_string
--------------------------^
/home/ubuntu/jedi/jedi-bundle/fv3-jedi/src/fv3jedi/Utilities/fv3jedi_constants_mod.f90(50): error #6285: There is no matching specific subroutine for this generic subroutine call.   [F_C_STRING]
    call f_c_string(constant_name, constant_name_c)
---------^

This is using ifort instead of ifx, since I installed 2024.2.1.

@climbfuji
Collaborator Author

This is an error in the JEDI code, I believe. Are you at the head of the JEDI branches? I vaguely remember having seen this issue and fixing it in neptune-bundle (which uses only oops, ufo, and ioda), but I can't remember the details.

@stiggy87
Contributor

stiggy87 commented Jan 9, 2025

fv3-jedi is pointed to the head of develop. Looks like there's some work around these fields. I'll ping the devs about what I'm seeing and whether they know about it.

@climbfuji
Collaborator Author

Et voila. Not a workaround, but a bug fix (it's a bug in fv3-jedi, apparently): https://github.com/JCSDA-internal/ioda/pull/1351

@stiggy87
Contributor

Talked with Francois H and filed the issue here: https://github.com/JCSDA-internal/fv3-jedi/issues/1331

Also getting a segfault inside Rocky8:

/usr/bin/ar: Relink `/opt/spack-stack/envs/unified-env-oneapi/install/oneapi/2024.2.1/intel-oneapi-runtime-2024.2.1-fgyyxgh/lib/libimf.so' with `/lib64/libm.so.6' for IFUNC symbol `cosf'
Error running link command: Segmentation fault

Talked to @srherbener, and he thought this looks similar to the tar issue with Intel.

@climbfuji
Collaborator Author

Looks similar to the libirc.so issue?

@srherbener
Collaborator

Similar to what we have found in #1355. Here is the error message from tar:

tar: Relink `/apps/spack-managed/gcc-11.3.1/intel-oneapi-compilers-2023.1.0-sb753366rvywq75zeg4ml5k5c72xgj72/compiler/2023.1.0/linux/compiler/lib/intel64_lin/libimf.so' with `/usr/lib64/libm.so.6' for IFUNC symbol `sincosf'
CMake Error at crtm/test/CMakeLists.txt:106 (message):
  Failed to untar the file.

@climbfuji
Collaborator Author

@srherbener @stiggy87 Gentle reminder about this issue once the tar problem is fixed/worked around and the fv3-jedi compile errors are resolved. Thanks very much!

@stiggy87
Contributor

Did a few more tests around jedi-bundle and running ctests. Everything up through skylab builds for me, so we can go ahead and mark this done!

@github-project-automation bot moved this from In Progress to Done in spack-stack-1.9.0 (2024 Q4) Jan 23, 2025
@rickgrubin-noaa
Collaborator

rickgrubin-noaa commented Jan 23, 2025

Given this condition in the issue description:

  1. If successful, proceed to a system shared between the UFS and JEDI repeat with jedi-bundle/ufs-weather-model
  • Have Intel oneAPI compilers installed on a RDHPCS platform of choice
  • Locally, switch site config to use this compiler's icx and icpx (but not yet ifx; keep using ifort)
  • Make necessary updates to packages and spack configs so that unified-dev builds
  • Build ufs-weather-model and run tests

Please point to the host / env location that satisfies "shared between the UFS and JEDI"

AND

successfully runs ufs-weather-model tests. Re-opening the issue until the above is otherwise demonstrated.

See also: Build / test UFS WM with Intel oneAPI v2024.2.1 (LLVM)-based stacks #1390

@climbfuji
Collaborator Author

@rickgrubin-noaa I asked @stiggy87 to run tests with JEDI and then mark the issue as resolved. I know from conversations with Dusan and my own tests that the UFS runs with icx, icpx, and ifort (and the issue clearly states ifort above, not ifx). This issue is not about ifx, which is where you are seeing the error in #1390.

unified-dev builds for me on RDHPCS and NRL platforms, and for @stiggy87 on JCSDA platforms.

I think we should remove the SHARED between UFS and JEDI part; it's good enough if each organization can build its own version of the unified environment and run the tests, and then close this as completed.
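
For reference, a per-organization test setup could look roughly like the following; the module path and meta-module names are placeholders patterned after the environments listed earlier in this thread:

# Use a locally built unified environment to run the application tests:
module use /path/to/spack-stack-1.9.0/envs/unified-env/install/modulefiles/Core
module load stack-oneapi/2024.2.1 stack-intel-oneapi-mpi/2021.13
module load jedi-fv3-env                  # for JEDI testing
# or: module load ufs-weather-model-env   # for UFS testing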

@rickgrubin-noaa
Collaborator

> @rickgrubin-noaa I asked @stiggy87 to run tests with JEDI and then mark the issue as resolved. I know from conversation with Dusan and my own tests that ufs runs with icx, icpx, ifort (and the issue clearly states ifort above, not ifx). This issue is not about ifx which is where you are seeing the error in #1390.

Understood re: what's documented in #1390, so I should update that issue, as this also fails with [email protected] and [email protected] and ifort.

> I think we should remove the SHARED between UFS and JEDI part, it's good enough if each organization can build their own version of the unified environment and run the tests. And then close this as completed.

Agree. Thanks for the clarification and distinction.

@climbfuji
Collaborator Author

Hmm, so if it also fails with ifort, then you are right: we need to keep it open. I didn't see that, sorry. But I'll update the description so that we can have separate environments, not necessarily one joint environment, for testing.
