Add in-line self attraction and loading for global tidal simulation #4472
Conversation
I performed a harmonic analysis comparison against benchmark TPXO8 tidal data from a single-layer barotropic run on the Icosahedron 9 (16 km) uniform mesh. This shows the serial version of SAL in this pull request gives the same tidal results as the serial version built previously. I have also attached a comparison with the scalar SAL approximation to show the improvement due to the full SAL calculation. This was built on Cori with these settings:

[Attached: build settings, and M2, K1, N2, O1, and S2 constituent comparisons with complex RMSE values in deep water (<1000m) between 66N and 66S]

We do expect a slight difference in results between the previous serial SAL calculation and the one in this PR due to changing from N = Nlat - 1 to N = Nlat/2 - 1. Currently, there is an error when using the parallel SAL. When that is resolved, I will share those harmonic analysis results as well.
Here are the results for the parallel SAL implementation on the Icosahedron 9 mesh using the same settings as before.

[Attached: RMSE values in deep water (<1000m) between 66N and 66S, in meters]
@@ -829,7 +829,7 @@ subroutine ocn_vel_vmix_tend_implicit_spatially_variable_mannings(meshPool, forc
       ! averaging the two bottomDepth cells.
       implicitCd = max(0.0025_RKIND, &
                    min(0.1_RKIND, &
-                    0.16_RKIND/log(250.0_RKIND*(bottomDepth(cell1)+bottomDepth(cell2)))**2 ))
+                    0.16_RKIND/(log(250.0_RKIND*(bottomDepth(cell1)+bottomDepth(cell2))))**2 ))
Note: this parenthesis correction was added to this PR because config_use_implicit_bottom_roughness
is only used in the tidal simulations exercised by this PR.
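For context (this reading is my own interpretation and is not stated in the PR): the expression has the form of a log-layer bottom drag law,

$$C_d = \min\left(0.1,\ \max\left(0.0025,\ \left[\frac{\kappa}{\ln(\bar d / z_0)}\right]^2\right)\right),$$

with von Karman constant $\kappa = 0.4$ (so $\kappa^2 = 0.16$), $\bar d = (d_1 + d_2)/2$ the edge-averaged bottomDepth, and an implied roughness length $z_0 = 1/500\ \mathrm{m}$, since $250\,(d_1 + d_2) = \bar d / z_0$. The added parentheses only make the intended grouping of the squared logarithm explicit.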
call associatedLegendrePolynomials(n, m, startIdx, endIdx, l, pmnm2, pmnm1, pmn)

! Compute local integral contribution
do iCell = endIdx, startIdx, -1
@sbrus89 we typically place OMP pragmas on the outer loop, e.g.
!$omp parallel
!$omp do schedule(runtime)
do iCell = 1, nCellsAll
...
end do
!$omp end do
!$omp end parallel
with private clauses for internal indices as needed. That said, I think the !$omp would go on the outermost loop m here, provided the m iterations are independent, which they appear to be.
OpenMP is particularly important for the parallel implementation, but it could also be used for the serial version now that we have found that bash error.
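To make the suggested placement concrete, here is a minimal, self-contained toy (not MPAS code; loop bounds and variable names are made up for illustration) with the pragma on the outermost loop and the inner index plus per-iteration work declared private:

program omp_outer_loop_toy
   use omp_lib
   implicit none
   integer, parameter :: nOrder = 64, nCellsAll = 1000
   integer :: m, iCell
   real(8) :: coeff(0:nOrder), localSum

   coeff = 0.0d0
   !$omp parallel do schedule(runtime) private(iCell, localSum)
   do m = 0, nOrder
      localSum = 0.0d0
      do iCell = 1, nCellsAll
         ! stand-in for the local integral contribution
         localSum = localSum + dble(m) * dble(iCell)
      end do
      coeff(m) = localSum
   end do
   !$omp end parallel do

   print *, 'coeff(nOrder) =', coeff(nOrder)
end program omp_outer_loop_toy

Each m iteration writes a distinct coeff(m), so no reduction or critical section is needed in this pattern.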
BTW, OMP is not needed for init mode. It is much less important for init subroutines in forward mode.
@sbrus89 and/or @knbarton, could you conduct these three tests after the OpenMP commit, in optimized mode, with both the parallel and serial full SAL on:
For all three, use a bfb comparison of the last restart file, with python or ncdiff. Simple ncdiff commands:
The max absolute difference should be exactly zero. You can use compass for this, but if you already have SAL runs set up, it might be easier to just copy your settings from there.
Merged into master locally and retested. Passes
SMS_D_Ln9.T62_oQU120_ais20.MPAS_LISIO_TEST.chrysalis_gnu
SMS_D_Ln9.T62_oQU120_ais20.MPAS_LISIO_TEST.chrysalis_intel
PET_Ln3.T62_oEC60to30v3wLI.GMPAS-DIB-IAF-ISMF.chrysalis_gnu
Also compiled with gnu and intel on grizzly with SHTNS off (default). Passes the pr test suite for both, with a bfb match to the compass reference. Compiled gnu with SHTNS=true but turned off via config flags; that also passes the pr suite with both debug and optimized gnu.
Here are the results I got for BFB testing of the spherical harmonic transform (SHT) only (via init mode, Compy: Intel 19.0.5, DEBUG=false):

The non-BFB behavior for the parallel SHT is due to the reproducibility of the global sums. For a forward mode run on the icos9 mesh, the parallel version is BFB for restarts, but not across partitions due to the differences in the SHT above. The max difference after two days between a 32-node and a 64-node run was 8.83346729096957e-11 for layerThickness and 1.7656587997239e-11 for normalVelocity. The serial version is BFB for restarts and across partitions for this case.
Approving, based on the lengthy review and testing by @sbrus89, @knbarton, and myself described above. The only remaining issue is that the parallel version of the SHT is non-bfb across partitions due to the operation order of the global sums (see previous comment), but the difference is at machine precision. I think that result is sufficient for merging the PR.
@philipwjones is working on implementing a reproducible sum in MPAS, which would solve the problem of non-BFB results on different processor counts.
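As a small illustration of the underlying issue (a toy program, not the actual MPAS global-sum code): summing the same floating-point values in a different order generally changes the last bits of the result, which is why a partition-dependent summation order breaks bit-for-bit reproducibility.

program sum_order_toy
   implicit none
   integer, parameter :: n = 1000000
   integer :: i
   real(8), allocatable :: x(:)
   real(8) :: forwardSum, reverseSum

   allocate(x(n))
   do i = 1, n
      x(i) = 1.0d0 / dble(i)**2
   end do

   ! sum the same values front-to-back and back-to-front
   forwardSum = 0.0d0
   do i = 1, n
      forwardSum = forwardSum + x(i)
   end do

   reverseSum = 0.0d0
   do i = n, 1, -1
      reverseSum = reverseSum + x(i)
   end do

   ! typically nonzero, at the level of machine precision
   print *, 'forward - reverse =', forwardSum - reverseSum
end program sum_order_toy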
@mark-petersen - I re-requested your review on this PR because it would be more helpful once the reproducible sum has been added. As-is, it really isn't ready to be merged.
@jonbob and @mark-petersen: Now that #4700 has been submitted, I'll work on testing this with the reproducible sum implemented by @philipwjones.
I retested with today's commits using gnu debug on badger with SAL and tidal potential on, and bfb flag first on. The full nightly suite passes.
@jonbob this is ready to merge when the repo is open.
@philipwjones and @matthewhoffman - can you please review after the changes have been made? Thanks
This comment is to confirm that the issue I experienced in my previous comment (#4472 (comment)) has been addressed by @knbarton's commit (badff01). Previously, I experienced a bound-check error during coupled MALI - Sea Level Model simulations. After incorporating this fix, coupled MALI - Sea Level Model simulations compiled in DEBUG mode do not experience any errors, and the simulation results (i.e., MALI thickness data interpolated between the MALI mesh and the global Gaussian grid) are correct. Thanks again, @knbarton, @mark-petersen, @sbrus89!
Thanks for pointing this out and helping to test the fix @hollyhan! |
@philipwjones can you check the acc changes I just made in 03a5551?
Changes for acc look fine. Approving based on all the other discussion/testing.
@sbrus89 - the scripts that update the bld files to be consistent with Registry changes only came up with a missing blank line in namelist_definition_mpaso.xml. Is it OK to push to your branch? Or I can tell you where the line is if you prefer to handle it.
@jonbob, thanks for catching that. Sure, it's totally fine to push to my branch.
This PR adds a 1D gravitationally consistent sea-level model (Kendall et al., 2005; Gomez et al., 2010; Han et al., 2022; https://github.com/GomezGroup/1DSeaLevelModel_FWTW) as a submodule within MALI, adding the capability of regional sea-level projection associated with ice sheet changes solved by MALI. In this new addition, the sea-level model computes changes in the height of the sea-surface equipotential and the viscoelastic solid Earth associated with surface (ice and ocean) loading changes, enabling MALI to take into account the gravitational, rotational, and deformational effects of the loading changes. The implemented requirements and modified/added code in the PR are as follows:

1. MALI optionally builds the SLM from the top-level directory (mpas-albany-landice): modified `Makefile` in MALI's top-level directory such that `SLM=True` in the make command builds the SLM.
2. MALI initializes the SLM at t=0 of a simulation: added wrapper subroutine `slmodel_init` for initializing the SLM in `mpas_li_bedtopo.F`, in which the sea-level model sub-modules are called.
3. MALI calls the SL solver at every coupling time interval: added wrapper subroutine `slmodel_solve` for calling the SLM in `mpas_li_bedtopo.F`, in which the sea-level model sub-modules are called.
4. MALI calls the SLM at every coupling time interval (i.e., SLM timestep): 1) modified `Registry.xml` to include a namelist value for the coupling time, 2) modified `mpas_li_core.F` to add an alarm to the land ice core clock object for the coupling time interval, and 3) modified `mpas_li_bedtopo.F` to use `mpas_timekeeping` and check/reset the alarm for the coupling time interval (see the sketch after this list).
5. MALI accommodates the SLM running on a single processor while MALI runs on multiple processors: implemented MPI scatter, gather, and halo updates in `mpas_li_bedtopo.F`. MALI data (ice thickness) is gathered on the head node, after which the SLM is called.
6. Interpolation between the MALI and SLM native grids: scripfiles `mpas_to_grid.nc` and `grid_to_mpas.nc` for the regional, unstructured MALI mesh and the global Gaussian grid are generated using the MPAS-mesh tool `from_mpas.py` and `ncremap`, respectively. Subroutines `interpolate_init` and `interpolate` and the MPI scatter/gather functions are copied from the ocean SAL PR #4472 (E3SM-Project#4472).
7. MALI and the SLM exchange model data: on the head node, MALI provides thickness as input to the SLM through the SLM subroutine `sl_solver`, which calculates and outputs total bed topography; this is used to update the bedrock topography in `mpas_li_bedtopo.F`.
8. The SLM namelist file is included so that SLM parameters can be accessed and modified from the MALI top-level directory, and so that the SLM does not need to be recompiled when the parameters are modified.
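For item 4, a hedged sketch of the alarm check (not the actual `mpas_li_bedtopo.F` code; the alarm ID, subroutine name, and error handling here are illustrative assumptions, while `mpas_is_alarm_ringing` and `mpas_reset_clock_alarm` are the usual `mpas_timekeeping` routines):

subroutine check_slm_coupling_alarm(clock, ierr)
   use mpas_derived_types, only : MPAS_Clock_type
   use mpas_timekeeping
   implicit none
   type (MPAS_Clock_type), intent(inout) :: clock
   integer, intent(out) :: ierr

   ierr = 0
   ! Run the sea-level model only when the coupling-interval alarm rings
   if (mpas_is_alarm_ringing(clock, 'slmCouplingInterval', ierr=ierr)) then
      call mpas_reset_clock_alarm(clock, 'slmCouplingInterval', ierr=ierr)
      ! ... gather ice thickness on the head rank, call the SLM solver,
      ! ... then scatter the updated bedrock topography back to all ranks
   end if
end subroutine check_slm_coupling_alarm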
* origin/hollyhan/mali/add_SLmodel: (43 commits)
  Change the variable name 'topoChange' to 'bedTopographyChange'
  Add a check for consistency in coupling interval prescribed in MALI and SLM settings
  Modify the src/Makefile such that 'make clean' removes files in the SLM submodule
  Fix the while loop in subroutine 'interpolate'
  Fix the file name for the 'gridToMpasFile'
  Update the SeaLevelModel submodule
  Change variable name `unit_num` in MALI to avoid conflict with the one in SLM
  Update the SeaLevelModel submodule
  Clean up and add an address to the SLM Github repo
  Update SLM submodule URL and hash
  Remove white trailing spaces
  deallocate MPI variables across routines
  Change a variable name to camelCase
  Add an error statement for when compile flag and namelist config have a conflict
  Update sea-level model submodule
  Change the variable name 'UpliftDiff' to 'TopoChange'
  Update sea-level model submodule
  Change 'err' to 'err_tmp' in calling routines
  Clean up and reorginize the code
  Include a wrapper subroutine that reads in the SLM namelist
  ...
Is this ready?
As far as I know, it is. |
Add in-line self attraction and loading for global tidal simulation

Self attraction and loading (SAL) forcing involves a spherical harmonic transform (SHT) of the SSH (Ray, 1998). The spherical harmonic coefficients are scaled by the load Love numbers, then an inverse transformation is performed to compute the SAL field. The SAL forcing is incorporated in the pressure gradient term along with the tidal potential.

Two different approaches are implemented for performing the SHT:
1. SSH is gathered onto a single MPI rank and interpolated to a Gaussian mesh. A very fast library, SHTns, is used to compute the forward/inverse SHT. The SAL field is then interpolated back to the unstructured mesh and scattered back to the other MPI ranks. This work was done by @knbarton.
2. The SHT integration is performed on each subdomain and the SH coefficients are globally summed with an all-reduce.

Approach 1) performs better at low core counts, but 2) scales better for larger runs.

This PR includes an init mode option for verification testing of the forward/inverse SHT.

This work was done as a part of the tides task of the ICoM project.

[NML]
[BFB] for all standard E3SM tests
Passes:
with expected NML DIFFs.
merged to next |
Re-merge of #4472 to pick up missed changes in the mpas-ocean build-namelist file
Missed changes that needed to go in the mpas-ocean build-namelist file. Retesting passes:
Still with expected NML DIFFs. Re-merged to next
merged to master and expected NML DIFFs blessed |
Self attraction and loading (SAL) forcing involves a spherical harmonic transform (SHT) of the SSH (Ray, 1998). The spherical harmonic coefficients are scaled by the load Love numbers, then an inverse transformation is performed to compute the SAL field. The SAL forcing is incorporated in the pressure gradient term along with the tidal potential.
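For reference, the degree-wise Love-number scaling described above is commonly written as (following Ray, 1998; the exact normalization convention used in the PR's code may differ):

$$\eta_{\mathrm{SAL}}(\theta,\lambda) = \sum_{n,m} \frac{3\,\rho_{\mathrm{ocean}}}{(2n+1)\,\rho_{\mathrm{earth}}} \left(1 + k'_n - h'_n\right) \hat{\eta}_{nm}\, Y_{nm}(\theta,\lambda),$$

where $\hat{\eta}_{nm}$ are the spherical-harmonic coefficients of the SSH and $k'_n$, $h'_n$ are the load Love numbers of degree $n$.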
Two different approaches are implemented for performing the SHT:
1. SSH is gathered onto a single MPI rank and interpolated to a Gaussian mesh. A very fast library, SHTns, is used to compute the forward/inverse SHT. The SAL field is then interpolated back to the unstructured mesh and scattered back to the other MPI ranks. This work was done by @knbarton.
2. The SHT integration is performed on each subdomain and the SH coefficients are globally summed with an all-reduce.

Approach 1) performs better at low core counts, but 2) scales better for larger runs.
This PR includes an init mode option for verification testing of the forward/inverse SHT.
This work was done as a part of the tides task of the ICoM project.
[NML]
[BFB] (for all standard E3SM tests, and for tests with SAL on when config_parallel_self_attraction_loading_bfb = .true.)