MPI processes (Starting PEs : 1) does not match the expected number 24 #454
Comments
Hi @YueZhang720, this looks like an issue where ESMF is running single-threaded in every process instead of across all of them. That would explain why you are seeing the same output printed multiple times. Check your ESMF build. Was
@YueZhang720, were you able to resolve this issue?
It didn't work, so I tried GCHP 14.5.0 with [email protected]. When I use mpirun -np 6 ./gchp, here is the error message:
When I use srun -n 48 -N 2 -m plane=24 --mpi=pmi2 ./gchp, the error is as follows:
Is there a problem in how my Slurm and ESMF setups work together? I have tried many times without success.
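For reference, the core counts in the two commands above can be sanity-checked before launching. The following is only a minimal sketch: `check_core_counts` and its parameter names are hypothetical, not part of GCHP or Slurm, and the checks assume that the launched task count should equal the expected count from the run configuration (24 in the issue title), that GCHP needs a core count that is a multiple of 6, and that tasks should divide evenly across nodes.

```python
def check_core_counts(expected_total, launched_tasks, nodes):
    """Return a list of problems with a launch layout (empty if none found).

    Hypothetical helper: the three checks are assumptions about what a
    consistent GCHP launch looks like, not an official validation routine.
    """
    problems = []
    if launched_tasks != expected_total:
        problems.append(
            f"launched {launched_tasks} tasks but the run expects {expected_total}"
        )
    if launched_tasks % 6 != 0:
        problems.append(f"{launched_tasks} tasks is not a multiple of 6")
    if launched_tasks % nodes != 0:
        problems.append(
            f"{launched_tasks} tasks do not divide evenly across {nodes} nodes"
        )
    return problems


# Layouts from the two commands in this comment, checked against the
# expected count of 24 quoted in the issue title.
print(check_core_counts(24, 48, 2))  # srun -n 48 -N 2 -m plane=24
print(check_core_counts(24, 6, 1))   # mpirun -np 6
print(check_core_counts(24, 24, 1))  # a layout matching the expected count
```

A check like this only catches inconsistent numbers. It says nothing about whether the launcher and the MPI library are actually talking to each other.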
Hi @YueZhang720, this still looks like an MPI issue. Do you have a system administrator on your cluster who can help look into the MPI configuration?
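One way to narrow this down, independent of GCHP, is to have every launched process report its rank and communicator size (the values returned by MPI_Comm_rank and MPI_Comm_size) and then look at the pattern. A possible cause of this kind of symptom is a mismatch between the process-management interface requested from the launcher and the one the MPI library was built against, in which case each process starts as its own one-rank job. The sketch below classifies such reports. The `diagnose` function, its return strings, and the sample reports are all hypothetical and only illustrate the reasoning.

```python
def diagnose(reports, expected):
    """Classify (rank, size) pairs collected from every launched process.

    Hypothetical helper: `reports` holds one (rank, size) tuple per process,
    `expected` is the number of MPI processes the run was configured for.
    """
    sizes = {size for _, size in reports}
    ranks = sorted(rank for rank, _ in reports)

    if sizes == {expected} and ranks == list(range(expected)):
        return "ok: one job with the expected number of ranks"
    if sizes == {1} and len(reports) > 1:
        # Every process believes it is alone: the launcher started the
        # processes, but they never joined a common MPI job.
        return "singletons: launcher and MPI library are not cooperating"
    return "mismatch: the job started with an unexpected number of ranks"


# Made-up sample data: 24 processes that each report rank 0 of size 1.
print(diagnose([(0, 1)] * 24, expected=24))
```

If the result is the singleton pattern, the problem sits between the launcher and the MPI installation rather than in GCHP or ESMF, which is something a cluster administrator is best placed to confirm.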
This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent this issue from being closed.
Your name
Yue Zhang
Your affiliation
HKUST(GZ)
What happened? What did you expect to happen?
After submitting the Slurm job, the GCHP log shows the following errors:
What are the steps to reproduce the bug?
I have tried GCHP 13.3.4 and GCHP 14.4.3, and both simulations report the same errors. I also tried two different ESMF versions, [email protected] and [email protected], and neither worked. What do you think is causing this issue, and what should I do to solve it?
Please attach any relevant configuration and log files.
ExtData.txt
GCHP_log.txt
run_sh.txt
setCommonRunSettings.txt
What GCHP version were you using?
14.4.3
What environment were you running GCHP on?
Local cluster
What compiler and version were you using?
gcc 10.2.0
What MPI library and version were you using?
openmpi 5.0.5
Will you be addressing this bug yourself?
Yes, but I will need some help
Additional information
No response