Compile issue with GCC 10.1.0 #3954

Closed
ndkeen opened this issue Nov 20, 2020 · 8 comments · Fixed by #4822
Comments

@ndkeen
Contributor

ndkeen commented Nov 20, 2020

I wanted to try a newer GNU compiler, but hit an error in this file:

cime/src/share/timing/perf_utils.F90

  282 |    call MPI_BCAST(vec,lsize,MPI_INTEGER,0,comm,ierr)
      |                  1
......
  314 |    call MPI_BCAST(vec,lsize,MPI_LOGICAL,0,comm,ierr)
      |                  2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/LOGICAL(4)).

This is on Cori. I created a simple tester:

module boilerplate
  implicit none
  private
#include <mpif.h>
  save
  
  interface shr_mpi_bcast ; module procedure &
       shr_mpi_bcastl0, &
       shr_mpi_bcasti0
  end interface
  
contains
  
  SUBROUTINE shr_mpi_bcasti0(vec,comm)
    IMPLICIT none
    integer, intent(inout):: vec
    integer, intent(in)   :: comm
    integer               :: ierr,lsize
    lsize = 1
    call MPI_BCAST(vec,lsize,MPI_INTEGER,0,comm,ierr)
  END SUBROUTINE shr_mpi_bcasti0
  
  SUBROUTINE shr_mpi_bcastl0(vec,comm)
    IMPLICIT none
    logical, intent(inout):: vec
    integer, intent(in)   :: comm
    integer               :: ierr, lsize
    lsize = 1
    call MPI_BCAST(vec,lsize,MPI_LOGICAL,0,comm,ierr)
  END SUBROUTINE shr_mpi_bcastl0
end module boilerplate


program mpitest

#include <mpif.h>
  integer::rank,nprocs,ierr

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  
  print*, "rank ", rank, " out of ", nprocs
  
  call MPI_FINALIZE(ierr)
end program mpitest

module swap PrgEnv-intel PrgEnv-gnu
module swap gcc gcc/10.1.0
Then build with ftn:

cori10% ftn module_interface.F90
module_interface.F90:7:6:

    7 |   save
      |      1
Warning: Legacy Extension: Blanket SAVE statement at (1) follows previous SAVE statement
module_interface.F90:22:19:

   22 |     call MPI_BCAST(vec,lsize,MPI_INTEGER,0,comm,ierr)
      |                   1
......
   31 |     call MPI_BCAST(vec,lsize,MPI_LOGICAL,0,comm,ierr)
      |                   2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/LOGICAL(4)).



@ambrad
Member

ambrad commented Nov 23, 2020

I googled a bit, and it seems adding the Fortran flag -fallow-argument-mismatch might fix this.

@whannah1
Contributor

@ndkeen I've used the flag suggested by @ambrad with good results. I've got E3SM building and running locally on my Mac with GNU, which also required the -fallow-invalid-boz flag to fix a similar issue.

@ndkeen
Contributor Author

ndkeen commented Nov 23, 2020

Thanks Andrew! That does de-escalate that sort of error to a warning and lets the build continue (though there are quite a few warnings that could be cleaned up).

I then ran into another error on this line:
integer, parameter :: gen_hash_key_offset = z'000053db'

We might need -fallow-invalid-boz to work around this oddity. Just now reading Walter's comment...
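
For reference, a minimal sketch of a source-level alternative to the flag, which is not taken from this thread: since Fortran 2008 a BOZ literal may appear as the argument of the INT intrinsic, so wrapping the constant in int() should compile with GCC 10 without -fallow-invalid-boz. The program name boz_demo and the self-check are illustrative assumptions; only the parameter name and its value come from the line quoted above.

```fortran
program boz_demo
  implicit none
  ! Wrapping the BOZ literal in INT() is the standard-conforming form.
  ! The bare assignment "= z'000053db'" is the form GCC 10 rejects.
  integer, parameter :: gen_hash_key_offset = int(z'000053db')

  ! Self-check (illustrative): hexadecimal 53DB is 21467 in decimal.
  if (gen_hash_key_offset /= 21467) error stop "unexpected BOZ value"
  print *, gen_hash_key_offset
end program boz_demo
```

This keeps the value unchanged and avoids a per-file compiler flag, at the cost of touching the source.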

@ndkeen
Contributor Author

ndkeen commented Nov 25, 2020

I still get an error at link time when using GCC 10.1 on Cori. Even if I build with Intel but have the gcc/10.1.0 module loaded, the build will not complete.

@ndkeen
Contributor Author

ndkeen commented Dec 10, 2020

Noting the link error here, although the output isn't very helpful. As I noted above, I still get this error even if I use the Intel compiler while the gcc/10.1.0 module is loaded.

[ 76%] Linking Fortran static library libatm.a

...

/usr/bin/ranlib libatm.a
make[2]: Leaving directory '/global/cscratch1/sd/ndk/e3sm_scratch/cori-haswell/m49-nov19/f-ne4.m49-nov19.gnu.mpt.n001p032t32x1.2d.gnu10.DEBUG/bld/cmake-bld'
[ 76%] Built target atm
make[1]: Leaving directory '/global/cscratch1/sd/ndk/e3sm_scratch/cori-haswell/m49-nov19/f-ne4.m49-nov19.gnu.mpt.n001p032t32x1.2d.gnu10.DEBUG/bld/cmake-bld'
gmake: *** [Makefile:87: all] Error 2

/global/cscratch1/sd/ndk/e3sm_scratch/cori-haswell/m49-nov19/f-ne4.m49-nov19.gnu.mpt.n001p032t32x1.2d.gnu10.DEBUG/bld/e3sm.bldlog.201210-132354

@rljacob
Member

rljacob commented Dec 21, 2020

This is a problem with GNU 10, right? It should allow both instances of MPI_BCAST by default without any special flags.

@rljacob
Member

rljacob commented Dec 21, 2020

This discussion suggests it's an MPI problem: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91731

@ndkeen
Contributor Author

ndkeen commented Sep 5, 2021

I created a branch, ndk/machinefiles/cori-test-gnu9, that allows building with gnu9 or gnu10.
Using --compiler=gnu works as it does now in master (i.e., 8.3), but --compiler=gnu9 or --compiler=gnu10 will build with those versions instead.

To complete the build with gnu10, you will need a change to components/elm/src/external_models/sbetr/src/betr/betr_math/ODEMod.F90
that is described here:
https://www.gitmemory.com/issue/E3SM-Project/E3SM/4151/794906780

There is a PR for that change: #4364

I was hoping we could use this branch to test and fix any issues with gnu9/gnu10 before then deciding how to proceed.
Note that the new machine, Perlmutter, only has GNU version 9 or 10.

ndkeen added a commit that referenced this issue Mar 2, 2022
Add flags for GNU builds to allow using GNU v10 and higher versions

Add -fallow-argument-mismatch for all GNU builds (including gnugpu).

Only add -fallow-invalid-boz for 2 specific files for gnu/gnugpu builds.

This PR adds Depends.gnugpu.cmake which is similar to Depends.gnu.cmake

Fixes #3954
And is needed first for #4809 and #4818

[BFB]