Add split-explicit AB2 time stepping capability to the ocean component #5989
Tangentially, we need to discuss if we wish to include this PR in our planned high res GPU simulations |
@sarats , I will run this code on GPU and report here. Thanks! |
@sarats , After modifying some OpenACC directives, this code runs well on GPUs, as tested on PM-GPU for both the standalone MPAS-O and the ocean-ice coupled (GMPAS-IAF) case. This PR does not address the issue of running simulations on PM-GPU. I've referenced @grnydawn's fix for PM-GPU here: https://github.com/E3SM-Project/E3SM/tree/ykim/mpas/gcase. |
@hyungyukang can you comment here on the exact tests you did for performance and quality? Do you have scaling plots for particular resolutions and machines? If information is in slide format, you can just upload them. |
@mark-petersen , I've run the standalone MPAS-O on RRS18to6v3 mesh with a real-world configuration using atmospheric forcing to check performance and quality.
@hyungyukang, thank you for these results. They look fantastic! I tested this branch with both optimized and debug, using gnu on perlmutter and intel on chrysalis. Using the default flags (including 'split-explicit' timestepping) everything passes and compares bfb against the master branch point, as expected. Using the new AB2 method ( nightly suite output:
@mark-petersen , Thanks for testing! I don't think we need to include previous two steps, but I'm going to look into it today. For QU240 restart tests, AB2 passes |
That sounds very odd because the difference between the two is just the initial condition (i.e. the T and S fields being interpolated from). Are you sure you ran the PHC restart test with |
@xylar , Another odd thing is that AB2 passes the baroclinic channel restart tests on both Compy and Perlmutter... |
The method obviously doesn't require storing multiple time levels in restart files or no restart tests would be passing. This is obviously a different AB2 method than the one I used when I was a grad student (involving times n and n-1). |
The restart tests work and are bfb for active ocean cells. For some reason, with AB2 the deeper land edges get the dummy value of -1e34, but only for restart runs and not for the very first forward run. I'm looking into it now.
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_split_ab2.F
d7d2896 to 54a3220
@mark-petersen and @xylar, Thanks for reviewing and for your suggestions! Based on your findings and suggestions, I added a fix.
code change:
    if ( config_do_restart ) then
       block => domain % blocklist
       call mpas_pool_get_subpool(block%structs, 'state', statePool)
       call mpas_pool_get_subpool(statePool, 'tracers', tracersPool)
       call mpas_pool_get_subpool(block%structs, 'mesh', meshPool)
       call mpas_pool_get_dimension(block % dimensions, 'nVertLevels', nVertLevels)
       call mpas_pool_get_dimension(block % dimensions, 'nEdges', nEdges)
       call mpas_pool_get_array(statePool, 'normalVelocity', &
                                           normalVelocity, 1)
       call mpas_pool_get_array(meshPool, 'minLevelEdgeBot', minLevelEdgeBot)
       call mpas_pool_get_array(meshPool, 'maxLevelEdgeTop', maxLevelEdgeTop)
       ! normalVelocity=0 on inactive cells
       do iEdge = 1, nEdges
          ! at top
          do k = 1, minLevelEdgeBot(iEdge)-1
             normalVelocity(k,iEdge) = 0.0_RKIND
          enddo
          ! at bottom
          do k = maxLevelEdgeTop(iEdge)+1, nVertLevels
             normalVelocity(k,iEdge) = 0.0_RKIND
          enddo
       enddo ! edge loop
    end if
nightly suite output:
00:15 PASS ocean_baroclinic_channel_10km_default
And I double checked I'm using |
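For readers less familiar with the mesh arrays, the masking logic in the code change above can be sketched in plain Python. This is only an illustration of the loop bounds, not part of the PR: the function name `zero_inactive_levels` and the nested-list layout are hypothetical, and the sample values are made up (the -1e34 entries stand in for the dummy value mentioned earlier in this thread).

```python
def zero_inactive_levels(normal_velocity, min_level_edge_bot, max_level_edge_top):
    """Zero the velocity on levels outside [minLevelEdgeBot, maxLevelEdgeTop].

    normal_velocity[i_edge][k] uses 0-based Python indexing, while the two
    level arrays hold 1-based level numbers, as in the Fortran snippet.
    """
    for i_edge, column in enumerate(normal_velocity):
        first_kept = min_level_edge_bot[i_edge]   # 1-based
        last_kept = max_level_edge_top[i_edge]    # 1-based; 0 means no kept level
        n_vert_levels = len(column)
        for k in range(1, n_vert_levels + 1):     # k is 1-based, like the Fortran loops
            if k < first_kept or k > last_kept:
                column[k - 1] = 0.0
    return normal_velocity


# Made-up example with 3 edges and 4 vertical levels.
DUMMY = -1e34
velocity = [
    [DUMMY, DUMMY, DUMMY, DUMMY],   # edge with no kept level
    [0.1, 0.2, DUMMY, DUMMY],       # shallow edge: levels 3 and 4 are zeroed
    [DUMMY, 0.3, 0.4, 0.5],         # level 1 is zeroed
]
masked = zero_inactive_levels(velocity, [1, 1, 2], [0, 2, 4])
```

After the call, every dummy value in the example has been replaced by 0.0 and the remaining entries are untouched, which mirrors the two inner loops ("at top" and "at bottom") of the Fortran fix.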
Attached are slides about this AB2 implementation. |
Thanks! I tested with the nightly suite with gnu and intel, and everything passes. |
The second-order Adams-Bashforth (AB2) time stepping method for the baroclinic system (config_time_integrator = 'split_explicit_ab2') is implemented. AB2 is a multistep method that executes the time stepping procedure (Stages 1 to 3) once per time step, whereas the default predictor-corrector scheme executes it twice per time step. Therefore, the AB2 method can theoretically reduce model runtime by up to half. In practice, the AB2 method can provide a speedup of 1.5x to 1.8x. Due to its high sensitivity to time step size, the predictor-corrector scheme is only applied to the layer thickness equation. All subroutines and the barotropic system advance (Stage 2) in the AB2 code are the same as in the split-explicit code (mpas_ocn_time_integration_split.F).
- Update Registry.xml and some codes to write restart variables used in AB2 time stepping
- Revised OpenACC directives to fix issues when running on GPUs
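The cost argument in the description (one pass per step instead of two) can be illustrated with a minimal sketch on a scalar test equation dy/dt = -y. This is an assumption-laden illustration, not the MPAS-Ocean code: it uses the textbook AB2 formula with a forward Euler start-up step and a Heun-type predictor-corrector, both of which may differ from the variants used in this PR, and all function names are hypothetical.

```python
import math


def make_counted_rhs():
    """Return the tendency f(y) = -y together with a call counter."""
    calls = {"n": 0}

    def rhs(y):
        calls["n"] += 1
        return -y

    return rhs, calls


def predictor_corrector(y0, dt, nsteps, rhs):
    """Heun-type predictor-corrector: two tendency evaluations per step."""
    y = y0
    for _ in range(nsteps):
        f0 = rhs(y)
        y_pred = y + dt * f0              # predictor (forward Euler)
        f1 = rhs(y_pred)
        y = y + 0.5 * dt * (f0 + f1)      # corrector (average of both tendencies)
    return y


def ab2(y0, dt, nsteps, rhs):
    """Textbook AB2: one new tendency per step, reusing the previous one."""
    y = y0
    f_prev = None
    for _ in range(nsteps):
        f_now = rhs(y)
        if f_prev is None:
            y = y + dt * f_now            # start-up step: forward Euler
        else:
            y = y + dt * (1.5 * f_now - 0.5 * f_prev)
        f_prev = f_now                    # state carried from step to step
    return y


rhs_pc, calls_pc = make_counted_rhs()
rhs_ab, calls_ab = make_counted_rhs()
y_pc = predictor_corrector(1.0, 0.01, 100, rhs_pc)
y_ab = ab2(1.0, 0.01, 100, rhs_ab)
print(calls_pc["n"], calls_ab["n"])  # 200 100
```

Both results stay close to the exact value exp(-1), while the AB2 run needs half as many tendency evaluations. In this sketch the carried-over tendency `f_prev` is extra state that a multistep method must keep between steps; the restart variables mentioned in the list above play a comparable role, though the exact quantities stored by the PR are not shown here.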
Testing of this PR in conjunction with PR #6035 is being undertaken by the coupled model group. The run with these two PRs is 20231106.v3b01-AB2.piControl.chrysalis and the corresponding case without AB2 for an apples-to-apples comparison is 20231105.v3b01.piControl.chrysalis. Both runs have gone more than 200 years |
Thanks @jonbob for the B-case comparisons. I spent an hour carefully comparing the two simulations. Here are my conclusions. In all screenshots we have MPAS-Analysis from years 151-200 with:
For example, Antarctic bottom temperature is very warm in both, but identical by eye between the control and AB2. The same holds for salinity. In summary, the only visible differences with AB2 are the temperature trends. The temperature drifts are smaller than in the control, so they are preferred. In addition, the spatial distributions of temperature look nearly identical by eye, so there are no problems that stand out. |
I am approving again based on the B-case simulations and results above. I feel comfortable with AB2 as the default baroclinic time stepping scheme for V3.
@hyungyukang and @mark-petersen I'm glad to hear the 200 year comparison turned out so well (similar to control). Great work!
!> \details
!> This module contains the routine for the split explicit time
!> integration scheme, where the second-order Adams Bashforth (AB2)
!> time stepping method is applied to the baroclinic system.
@hyungyukang can we modify this comment to clarify that the AB2 stepping is for momentum only? The tracers use forward Euler.
@vanroekel , Sure I will change it. Thanks a lot!
@hyungyukang this is excellent, great work! I have one tiny request: when I looked through the code, it appeared that AB2 is applied to momentum only, and I believe you confirmed that. Could we clarify that in the header of the AB2 file?
ada4136 to b2067b7
approving based on visual inspection and testing from @hyungyukang @mark-petersen and @jonbob
Add split-explicit AB2 time stepping capability to the ocean component
The second-order Adams-Bashforth (AB2) time stepping method is applied to the baroclinic system of MPAS-Ocean to enable faster computation of the baroclinic system. AB2 is a multistep method that executes the time stepping procedure (Stages 1 to 3) once per time step, whereas the default predictor-corrector scheme executes it twice per time step. In practice, the AB2 method can provide a speedup of 1.5x to 1.8x. Due to its high sensitivity to time step size, the predictor-corrector scheme is only applied to the layer thickness equation. All subroutines and the barotropic system advance (Stage 2) in the AB2 code are the same as used in the split-explicit code (mpas_ocn_time_integration_split.F). [NML] [non-BFB]
passes sanity testing with expected DIFFs -- merged to next |
Update weights for ocean barotropic subcycling in split-explicit solver This PR updates the weights config_btr_gam1_velWt1, config_btr_gam2_SSHWt1 in the MPAS-Ocean barotropic solver based on recent analysis of this scheme. This update applies to "split explicit" time stepping schemes, i.e. config_time_integrator = 'split_explicit' and the new config_time_integrator = 'split_explicit_ab2' in #5989. The new weights allow for a barotropic time step config_btr_dt to be 25% longer in the tests of EC30to60, thus speeding up the barotropic subcycling. We expect that values of config_btr_dt can be increased by 25% for all meshes. [NML] [non-BFB]
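As a rough illustration of why a longer barotropic time step speeds up the subcycling, the arithmetic can be sketched as follows. The step sizes are hypothetical round numbers, not values from any E3SM configuration, and the helper `barotropic_subcycles` is an assumption for illustration; the model's actual subcycle count logic may differ.

```python
import math


def barotropic_subcycles(dt_baroclinic_s, dt_barotropic_s):
    """Number of barotropic subcycles needed to span one baroclinic step."""
    return math.ceil(dt_baroclinic_s / dt_barotropic_s)


dt_baroclinic = 1800.0            # hypothetical baroclinic step, in seconds
btr_dt_old = 60.0                 # hypothetical config_btr_dt, in seconds
btr_dt_new = 1.25 * btr_dt_old    # 25% longer barotropic step

n_old = barotropic_subcycles(dt_baroclinic, btr_dt_old)
n_new = barotropic_subcycles(dt_baroclinic, btr_dt_new)
reduction = 1.0 - n_new / n_old   # fraction of subcycles saved per baroclinic step
print(n_old, n_new)  # 30 24
```

With these made-up numbers, a 25% longer barotropic step cuts the subcycle count per baroclinic step from 30 to 24, that is, by about 20%. The overall model speedup would be smaller, since the barotropic solver is only part of the total runtime.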
@mark-petersen , @sarats , @xylar , @cbegeman , @vanroekel , and @jonbob , thank you so much for your review, testing, suggestions, comments, and approval of this PR! |
merged to master -- expected DIFFs will be blessed with PR #6035 |
This merge updates the E3SM-Project submodule from [894b5b2](https://github.com/E3SM-Project/E3SM/tree/894b5b2) to [5d5f15c](https://github.com/E3SM-Project/E3SM/tree/5d5f15c).
This update includes the following MPAS-Ocean and MPAS-Frameworks PRs (check mark indicates bit-for-bit with previous PR in the list):
- [ ] (ocn) E3SM-Project/E3SM#5945
- [ ] (ocn) E3SM-Project/E3SM#5946
- [ ] (ocn) E3SM-Project/E3SM#5947
- [ ] (ocn) E3SM-Project/E3SM#5999
- [ ] (ocn) E3SM-Project/E3SM#6037
- [ ] (ocn) E3SM-Project/E3SM#5989
- [ ] (ocn) E3SM-Project/E3SM#6035
- [ ] (ocn) E3SM-Project/E3SM#6077
It appears that after the merge nightly OpenACC test
stopped working with either
@amametjanov , Thanks for reporting it. I was able to run the AB2 code on both pm-gpu and Frontier-gpu, but I had to make some modifications for a macro. I also found that I made a mistake in |
The second-order Adams-Bashforth (AB2) time stepping method is applied to the baroclinic system of MPAS-Ocean to enable faster computation of the baroclinic system. AB2 is a multistep method that executes the time stepping procedure (Stages 1 to 3) once per time step, whereas the default predictor-corrector scheme executes it twice per time step. In practice, the AB2 method can provide a speedup of 1.5x to 1.8x. Due to its high sensitivity to time step size, the predictor-corrector scheme is only applied to the layer thickness equation.
All subroutines and the barotropic system advance (Stage 2) in the AB2 code are the same as used in the split-explicit code (mpas_ocn_time_integration_split.F).
To enable this option, users need to set config_time_integrator = 'split_explicit_ab2'.
Results of G-case (100 years) and WCYCL1850 (30 years) and more details of the method can be found here: https://acme-climate.atlassian.net/wiki/spaces/ImPACTS/pages/3826385232/AB2+time+stepping+for+the+baroclinic+system
[NML]
[non-BFB]