-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enable asyncio using pio #325
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your work on this @jedwards4b ! This is a really exciting new feature!
I have a few comments below. Most of them are minor & easy. The main one that might be more involved is my long comment, particularly the second paragraph starting "Part of what I'm wondering about here...". Let me know if you'd like to talk about this.
cesm/driver/ensemble_driver.F90
Outdated
@@ -53,6 +62,14 @@ subroutine SetServices(ensemble_driver, rc) | |||
specRoutine=SetModelServices, rc=rc) | |||
if (chkerr(rc,__LINE__,u_FILE_u)) return | |||
|
|||
! ModifyCplLists is a NUOPC specialization which happens after Advertize but before Realize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should be changed to say "PostChildrenAdvertise is a..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cesm/driver/ensemble_driver.F90
Outdated
if(pio_asyncio_stride == 0 .or. modulo(n,pio_asyncio_rootpe+1) .ne. 0) then | ||
petList(petcnt) = currentpet | ||
petcnt = petcnt+1 | ||
if (currentpet == localPet) comp_task=.true. | ||
else | ||
asyncio_petlist(iopetcnt) = currentpet | ||
iopetcnt = iopetcnt + 1 | ||
if (currentpet == localPet) asyncio_task=.true. | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does one of these blocks (I'm thinking the first one) apply when not using asyncio? If so, I think this would be more clear and robust if there was an explicit check on whether asyncio is being used, or at least a comment here explaining this.
Part of what I'm wondering about here is whether there is an assumption here - and maybe elsewhere - that, if you're not using asyncio, then pio_asyncio_ntasks, pio_asyncio_stride and pio_asyncio_rootpe are at their default values. I'm imagining a scenario where someone first enables asyncio and tweaks those settings, but then wants to try rerunning with asyncio disabled. Do they need to explicitly reset those three variables to their defaults in that situation? If so, that seems error-prone. Can this (and maybe some other code) be written to examine pio_async_interface and ignore those three variables if that is false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you may be right here. I'll create a few scenarios to test this and put in a fix if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's difficult to get to the pio_async_interface flags at this point in the fortran and I am thinking about implementing something in the buildnml instead - would that be acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that sounds like a good plan - thanks. Then can you just add a comment here saying which of these blocks (if either) applies to the case without asyncio?
cesm/driver/esm.F90
Outdated
! call driver_pio_init(driver, rc=rc) | ||
! if (chkerr(rc,__LINE__,u_FILE_u)) return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to go ahead and remove these commented-out lines, along with the two comment lines above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cesm/driver/esm.F90
Outdated
! call driver_pio_component_init(driver, size(comps), rc) | ||
! if (chkerr(rc,__LINE__,u_FILE_u)) return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to go ahead and remove these commented-out lines, along with the two comment lines above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cesm/driver/esm_time_mod.F90
Outdated
call NUOPC_CompAttributeGet(ensemble_driver, name="glc_avg_period", value=glc_avg_period, rc=rc) | ||
if (ChkErr(rc,__LINE__,u_FILE_u)) return | ||
read(cvalue,*) glc_avg_period |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you don't use glc_avg_period, so you could remove these lines. (I see you just moved this from below, so this is a pre-existing thing.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
call NUOPC_CompAttributeGet(gcomp(i), name="pio_typename", value=cval, rc=rc) | ||
call NUOPC_CompAttributeGet(gcomp(i), name="pio_async_interface", value=cval, rc=rc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a glance, it looks like this block of code from here to line 354 duplicates the above block of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for spotting that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this looks great now. I'm happy with how you handled the asyncio settings in buildnml. Thanks!
private InitializeIPDv03p3 ! realize connected Fields with transfer action "provide" | ||
private InitializeIPDv03p4 ! optionally modify the decomp/distr of transferred Grid/Mesh | ||
private InitializeIPDv03p5 ! realize all Fields with transfer action "accept" | ||
private AdvertiseFields ! advertise fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All UWM test pass.
Retested prealpha against alpha12b
@fischer-ncar these all look like failures in alpha12b - can I go ahead and merge? |
Yes, those are expected failures. You can go ahead and merge. |
Description of changes
Allows IO tasks to be independent of compute tasks in cesm
Specific notes
(testing in progress)
Contributors other than yourself, if any:
Depends on share (ESCOMP/CESM_share#37) and cime (ESMCI/cime#4340).
CMEPS Issues Fixed (include github issue #):
Are changes expected to change answers? (specify if bfb, different at roundoff, more substantial)
Any User Interface Changes (namelist or namelist defaults changes)?
Testing performed
Testing performed if application target is CESM:
Testing performed if application target is UFS-coupled:
Testing performed if application target is UFS-HAFS:
Hashes used for testing: