Switch Step attributes from cores to ntasks and cpus_per_task #413
Conversation
Force-pushed from 0ee98a2 to a6fe7b3
raise ValueError('Unsupported step configuration in which '
                 'ntasks > 1 and '
                 'cpus_per_task != min_cpus_per_task')
To preserve general sanity, I'm enforcing a rule that cpus_per_task == min_cpus_per_task if ntasks > 1. If you're doing MPI jobs, you will always know that cpus_per_task = min_cpus_per_task = openmp_threads, since you're using cpus_per_task for OpenMP threading. (This will be enforced in a future update.) If you are running a multithreaded or multiprocessing job on a single node, you will always have ntasks = 1, and the number of threads or processes will be controlled by cpus_per_task and min_cpus_per_task.
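As a minimal sketch of the rule described above (not the actual compass implementation; the function name and signature are made up for illustration):

```python
# Hypothetical illustration of the rule above, not the actual compass code

def check_step_resources(ntasks, cpus_per_task, min_cpus_per_task):
    """Reject step configurations that vary the number of CPUs per task
    while also using more than one MPI task"""
    if ntasks > 1 and cpus_per_task != min_cpus_per_task:
        raise ValueError('Unsupported step configuration in which '
                         'ntasks > 1 and '
                         'cpus_per_task != min_cpus_per_task')
```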
Should cpus_per_task = min_cpus_per_task = openmp_threads be cpus_per_task = min_cpus_per_task * openmp_threads?
No, if you're running with threads, the number of OpenMP threads is the same thing as the number of CPUs per MPI task. That is just by definition.
In addition, what I'm also saying here is that things get too confusing if cpus_per_task is different from min_cpus_per_task. That is, it gets very confusing if a developer allows for some number of MPI tasks in a range between min_tasks and ntasks, and at the same time also allows for a range of threads between min_cpus_per_task and cpus_per_task. I don't want to deal with that complicated (maybe even impossible) optimization problem. So I suggest we also require that the number of OpenMP threads is always fixed for steps with multiple MPI tasks, so that cpus_per_task = openmp_threads and min_cpus_per_task = cpus_per_task.
Does that make things any clearer?
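To make the two supported patterns concrete, here is a rough sketch; the attribute names follow this PR, but the step objects and values are hypothetical rather than taken from an actual compass test case:

```python
from types import SimpleNamespace

# Hypothetical step configurations; not actual compass test cases.

# MPI (optionally with OpenMP) step: the number of tasks may vary between
# min_tasks and ntasks, but the CPUs per task (OpenMP threads) are fixed.
forward_step = SimpleNamespace(
    ntasks=64,            # maximum number of MPI tasks
    min_tasks=16,         # the step can fall back to fewer tasks
    cpus_per_task=4,      # = openmp_threads
    min_cpus_per_task=4)  # must equal cpus_per_task when ntasks > 1

# Single-node multithreaded or multiprocessing step: always one task; the
# number of threads or processes may vary between min_cpus_per_task and
# cpus_per_task.
init_step = SimpleNamespace(
    ntasks=1,
    min_tasks=1,
    cpus_per_task=8,
    min_cpus_per_task=1)
```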
Yes, that makes sense, thanks.
Testing
I have run the ocean test suite (results linked here).
I also set up (but did not try to run) all the tests in compass, to make sure that at least through the setup stage there were no references to missing attributes.
I'm still working on updating the documentation.
Force-pushed from 7c214f1 to a892190
@altheaden, it would be great if you could run a few tests, maybe even a full test suite.
@matthewhoffman and @trhille, if you are willing to do some testing on your favorite machines to make sure I haven't broken tests on the landice side, that would be much appreciated.
@mark-petersen, similarly on the ocean side. I ran a set of tests myself.
Obviously, I'm also keen for feedback from all of you.
I reran tests after finding a few more issues.
@xylar, in the cosine bell QU90 test, we're seeing that it's using 128 threads when it should be using just one. It looks like the thread count isn't being set correctly.
@mark-petersen and @matthewhoffman, please hold off on reviewing until @altheaden and I figure out the issue above.
Force-pushed from 517d32b to 267e8d5
Okay, issue figured out. I had forgotten to actually use the environment variable in question.
Force-pushed from 6a2e411 to 5fa2549
Force-pushed from abaec21 to bf8ca45
@trhille, I was able to run the tests you mentioned.
It could be that a recent rebase took care of these, or it could be that they're specific to Cori. I will test further. I haven't investigated any of the calving tests yet.
@trhille,
I see the same failures with this branch but no other failures (i.e. not the others you mentioned in #413 (comment)), and the baseline comparison passes for all tests (even these).
Baseline:
This branch:
I believe permissions should be such that you can take a look. I did need to copy the files over first.
@xylar, I'm wondering if the issue is with my spack packages or MPAS-Tools, because the test execution failures seem to arise from the MPAS Cell Culler.
Okay, I am no longer getting the test execution errors in those tests. It's possible I accidentally had an old branch of MPAS-Tools checked out when I ran the test before, but I'm not sure.
@trhille, I think the more likely thing is just that your conda environment wasn't up to date on one or the other branch, since the version changed from the 1.1.0 release to 1.2.0-alpha.1.
@matthewhoffman and @mark-petersen, could you make this PR a priority (relative to other reviews -- I know these are very busy times)? This is holding up other work that @altheaden and I would like to do before summer is over and she has to resume classes.
Thanks @xylar and @altheaden for your work on this. I can see how specifying both ntasks and cpus_per_task would be needed to match the srun format.
I tested this compass PR on cori with intel and badger with gnu, running today's E3SM master. I ran the nightly test suite on both and everything was successful.
I also tried some extra test cases on both:
cori10:pr$ compass setup -p /global/homes/m/mpeterse/repos/E3SM/master/components/mpas-ocean/ -w $n/ocean_model_220816_c4b7eacb_co_intel-nersc_openmp_debug_master -n 196 200 168 183
Setting up test cases:
ocean/soma/32km/default
ocean/soma/32km/three_layer
ocean/internal_wave/default
ocean/planar_convergence/horizontal_advection
target cores: 272
minimum cores: 27
These appear to run correctly, except that ocean/soma/32km/three_layer fails in forward mode both before and after this PR, so that is unrelated.
I do get a warning on cori that appears to be related to this PR:
pwd
/global/cscratch1/sd/mpeterse/runs/n/ocean_model_220816_c4b7eacb_co_intel-nersc_openmp_debug_master/ocean/soma/32km/three_layer
(dev_compass_1.2.0-alpha.1) nid00961:three_layer$ compass run
ocean/soma/32km/three_layer
/global/u2/m/mpeterse/repos/compass/pr/compass/parallel.py:93: UserWarning: Slurm found 64 cpus per node but config from mache was 32
warnings.warn(f'Slurm found {cpus_per_node} cpus per node but '
compass calling: compass.ocean.tests.soma.soma_test_case.SomaTestCase.run()
inherited from: compass.testcase.TestCase.run()
in /global/u2/m/mpeterse/repos/compass/pr/compass/testcase.py
Running steps: initial_state, forward
* step: initial_state
It looks like the warning is just telling us that we are not taking advantage of all the cores available on the node, so it would be expected.
Thanks a bunch @mark-petersen! It's worth figuring out what's wrong with 3-layer SOMA, but not here. The warning about CPUs per node should be fixed separately.
I tested the landice integration suite on Badger including baseline comparison and all tests pass. I think that is sufficient for MALI interests.
Thank you very much @trhille, @matthewhoffman and @mark-petersen!!! This will be a big deal for allowing @altheaden's work to move forward. Sorry it's so disruptive.
This merge is a vital step in @altheaden's work on parallel tasks. Up to now, we have been able to get by in compass without differentiating between "tasks" and "CPUs per task". However, there is an important distinction between these concepts in Slurm and other job managers. For task parallelism to function properly, we need to specify both --cpus-per-task (-c for short) and --ntasks (-n for short) for each Slurm job step we launch with srun.

In anticipation of this need, this merge replaces the cores attribute of the Step class with ntasks and cpus_per_task, and the min_cores attribute with min_tasks and min_cpus_per_task.

Nearly every test case is affected (thus, 96 files are changed).
I made a separate commit for each land-ice test group, but I ran out of steam for the ocean test groups and switched (based on the experience gained from landice) to bulk search-and-replace in the ocean test groups.
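For reference, here is a rough sketch of how the new attributes map onto the srun options mentioned above; this is illustrative only and not the actual compass launch code:

```python
# Illustrative only; not the actual compass command construction.

def build_srun_command(ntasks, cpus_per_task, args):
    """Assemble the srun command line for one Slurm job step."""
    return ['srun', '-n', str(ntasks), '-c', str(cpus_per_task)] + list(args)

# For example, a step with ntasks=64 and cpus_per_task=4:
print(' '.join(build_srun_command(64, 4, ['./ocean_model'])))
# srun -n 64 -c 4 ./ocean_model
```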