ISSUES about pgigpu+openmpi with WCYCL1850 and CMPASO-NYF #5563
Comments
@lulu1599 On your second question: it looks like your files don't have the correct flags for building the GPU-enabled ocean. You will need a -DMPAS_OPENACC flag for preprocessing, and the Fortran flags should have -acc -Minfo=accel (the -ta flag is somewhat optional, but if you know the architecture it can sometimes help). It looks like the latter are enabled in the LDFLAGS but not on the compile line. With the -Minfo flag on, you should see the compiler generate additional output for acceleration, and that's a good way to see whether it's actually generating GPU instructions. On the first, we have seen that in the past, but I think it disappeared with later compiler versions. Which version of NVIDIA/PGI are you using?
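If it helps, one way to confirm the ocean code is really being built for the GPU is to look for the code-generation messages that -Minfo=accel writes into the build log. A minimal sketch, assuming a standard CIME case layout with ocean build logs named ocn.bldlog.* in the build directory (the exact message wording varies by compiler version):

```bash
# Search the ocean build log for the OpenACC code-generation messages emitted
# by -Minfo=accel; zgrep also handles logs that CIME has gzipped.
cd <your_case_build_dir>            # hypothetical path: wherever ocn.bldlog.* lives
zgrep -E "Generating (Tesla|NVIDIA GPU) code" ocn.bldlog.* | head
# No matches usually means -acc/-Minfo=accel never reached the compile line
# for the MPAS-Ocean sources.
```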
Thanks a lot!
1. I'm trying the flags you mentioned to see if they work.
2. My PGI version is 21.9-0, my CUDA version is 11.4. Maybe I should try a newer PGI?
Thanks again,
Jingyu
Hi! I have added the flags
string(APPEND CPPDEFS " -DMPAS_OPENACC")
string(APPEND FFLAGS " -acc -Minfo=accel")
When I set the number of processes to 8 (-n 8) with --ngpu-per-node 8, e3sm.exe was submitted to the GPU successfully. However, when I set the number of processes to 16 (-n 16) with --ngpu-per-node 8, I can't see my processes on the GPU. Here's my running info; perhaps you can spot any clues?
![image](https://user-images.githubusercontent.com/51312558/228414079-d70b171e-4d7f-4541-b660-d800cc02818b.png)
![image](https://user-images.githubusercontent.com/51312558/228414141-cc5b3f93-9aae-4dab-bc5e-9e423e14e9c9.png)
![image](https://user-images.githubusercontent.com/51312558/228414154-92f0eb3c-450d-448d-b52b-fe490ff30690.png)
![image](https://user-images.githubusercontent.com/51312558/228414172-75c6fa31-2d68-4d52-9b29-f4b36f11b7b7.png)
Since the first one works, it seems you have a GPU-enabled executable, so I suspect this is an issue with how your job launcher/scheduler is allocating resources and whether it supports multiple ranks per GPU.
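For the multiple-ranks-per-GPU part, a common pattern is a small per-rank wrapper that maps each local MPI rank to a device before exec'ing the model. This is only a minimal sketch, not the E3SM-supported mechanism; it assumes OpenMPI (which exports OMPI_COMM_WORLD_LOCAL_RANK) and 8 GPUs per node, and the script name is hypothetical:

```bash
#!/usr/bin/env bash
# gpu_bind.sh (hypothetical): give each local MPI rank one visible GPU so that,
# e.g., 16 ranks on a node share 8 devices two-to-one.
NGPUS_PER_NODE=8
LOCAL_RANK=${OMPI_COMM_WORLD_LOCAL_RANK:-0}    # exported by OpenMPI's mpirun
export CUDA_VISIBLE_DEVICES=$(( LOCAL_RANK % NGPUS_PER_NODE ))
exec "$@"
```

It would sit in front of the executable in the launch line (e.g., mpirun -n 16 ./gpu_bind.sh ./e3sm.exe, or wired into the mpirun template in config_machines.xml). Note that several ranks sharing one GPU will serialize on it unless the NVIDIA Multi-Process Service (MPS) is running, so oversubscribing the devices is not automatically faster.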
OK, thanks! Your answers are very helpful!
Hi! I'm new to GPU runs, and thanks for your guidance!
In v2.1, only mpas-o/mpas-si have GPU capability. EAM does use Kokkos in part of the code, but it's in "CPU mode".
OK, thanks for your reply.
Here's another question: when I look at the mpas.part.* files in this path, I find that the smallest partition file is for 8 parts. I know this is related to the number of ocean processes. So how can I run with fewer than 8 ocean processes, like 1, 2, or 4?
To create a new partition, use the gpmetis command-line tool from METIS (you'll need METIS on your machine; many of our supported machines have it available as a module). If you have the module loaded or METIS installed, just run gpmetis on the ocean graph file with the desired number of partitions (a sketch follows below). That's for the ocean only; the sea ice does some additional load balancing and the process is a bit more involved.
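For reference, a minimal sketch of creating a smaller ocean partition, assuming METIS is installed or loadable as a module and the mesh's graph file is named mpas-o.graph.info (the actual file name and where the case expects the partition file depend on your mesh and setup):

```bash
# Hypothetical example: build a 4-way partition for a 4-task ocean run.
module load metis                  # or use a local METIS install
gpmetis mpas-o.graph.info 4        # writes mpas-o.graph.info.part.4
# Rename/place the new *.part.4 file following the convention of the existing
# partition files, then set the ocean task count to 4 in the case.
```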
That's really helpful!
This issue was moved to a discussion. You can continue the conversation there.
Regarding E3SM maint-2.1, I tried to run it on a CPU platform with intel+impi, and it worked well with the case WCYCL1850+ne30pg2_EC30to60E2r2. However, when I tried to switch to a GPU platform using PGI+openmpi, I encountered the following issues:
1. When I compiled with "--compset WCYCL1850 --res ne30pg2_EC30to60E2r2" on the GPU platform, the build failed. During "./case.build" I encountered an nvlink multiple-definition error, as follows:
"nvlink error : Multiple definition of 'mpas_vector_reconstruction_mpas_reconstruct_1d_gpu_647_gpu' in '../../mpas-framework/src/libocn.a:mpas_vector_reconstruction.f90.o', first defined in '../../mpas-framework/src/libice.a:mpas_vector_reconstruction.f90.o'"
I searched the forum but could not find any relevant solutions. It seems to point to a compatibility issue between the PGI compiler and the source code (perhaps it's my compile parameters?), but I am not sure. Do you know what the reason might be?
2. I then tried another case that only activates the ocean model, "--compset CMPASO-NYF --res T62_oEC60to30v3", to avoid the aforementioned nvlink problem. However, when I ran "./case.submit", I found that the process was not submitted to the GPU but was still running on the CPU. Did I miss something?
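As a general check (not E3SM-specific), whether e3sm.exe is actually using a GPU can be seen from nvidia-smi on the compute node while the job is running:

```bash
# The "Processes" table at the bottom of the output should list e3sm.exe
# against one or more GPUs if the OpenACC offload is active.
nvidia-smi
# Refresh every few seconds to watch utilization as the model runs:
watch -n 5 nvidia-smi
```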
The config_machine.xml, cmake files, running scripts, and log files are attached. Hope an expert can help me, thanks!
config_machine.xml.txt
create_case.sh.txt
Depends.pgigpu.cmake.txt
pgigpu_lu-gpu.cmake.txt
preview_run.log