
Adds submesoscale parameterization for mpas-ocean #5099

Merged (28 commits), Aug 29, 2022

Conversation

@vanroekel (Contributor) commented Jul 27, 2022

Adds the Fox-Kemper et al. 2011 parameterization for submesoscale eddies.
There is also a substantial eddy-parameterization reorganization, because
the buoyancy gradient that was calculated within GM is also needed by this
parameterization, and there are cases where we will run with the
submesoscale parameterization but not GM, so that calculation is now separate.
The density-threshold MLD is also removed from the analysis member, as it
is now required by numerous calculations in the forward model.

The contribution of the submesoscale eddy velocity to the AMOC is also added
to the MOC streamfunction computation.

The description of the new parameterization is as follows. It appears in the file
components/mpas-ocean/src/shared/mpas_ocn_submesoscale_eddies.F.

  !>  This routine implements the Fox-Kemper et al 2011 submesoscale
  !>  parameterization (https://doi.org/10.1016/j.ocemod.2010.09.002).
  !>  The transport velocity from the submesoscales is given by a
  !>  stream function of the form
  !>
  !>  Psi = C_e * (Delta_s/L_f) * H^2 * int(bGrad_H) / sqrt(f^2 + tau^-2) * mu(z)
  !>
  !>  where C_e is a specified efficiency, Delta_s is taken as
  !>  min(dcEdge, dsMax), H is the mixed layer depth, L_f is the subgrid
  !>  frontal width scaled to the coarse grid, f is the
  !>  Coriolis parameter, and tau is a time scale specified to prevent
  !>  the singularity near the equator.  The horizontal buoyancy gradient
  !>  bGrad_H is integrated over the mixed layer depth, and mu(z) is a
  !>  non-dimensional shape function that is zero below the mixed layer depth.
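
For illustration only, here is a minimal single-column sketch of that streamfunction magnitude (a hedged example, not the MPAS-Ocean implementation; all parameter values are placeholders rather than model defaults, and int(bGrad_H) is interpreted as the mixed-layer-average gradient for dimensional consistency):

program submeso_psi_sketch
   implicit none
   ! Illustrative evaluation of the Fox-Kemper et al 2011 streamfunction
   ! magnitude described above; all values below are assumed placeholders.
   real, parameter :: Ce    = 0.06      ! efficiency C_e (assumed)
   real, parameter :: ds    = 30.0e3    ! Delta_s = min(dcEdge, dsMax) [m]
   real, parameter :: Lf    = 5.0e3     ! subgrid frontal width L_f [m]
   real, parameter :: H     = 100.0     ! mixed layer depth [m]
   real, parameter :: gradB = 1.0e-8    ! ML-average horizontal buoyancy gradient [s^-2]
   real, parameter :: f     = 1.0e-4    ! Coriolis parameter [s^-1]
   real, parameter :: tau   = 86400.0   ! equatorial regularization time scale [s]
   real :: psi0

   ! Depth-independent magnitude; in the model, the shape function mu(z),
   ! zero below the mixed layer, multiplies this at each vertical level.
   psi0 = Ce * (ds / Lf) * H**2 * gradB / sqrt(f**2 + 1.0/tau**2)
   print '(a, es12.4, a)', 'Psi magnitude = ', psi0, ' m^2/s'
end program submeso_psi_sketch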

[NML]
[BFB] - stealth feature not turned on by default

@vanroekel added the mpas-ocean, BFB (PR leaves answers BFB), and NML Stealth (PR has feature which, if turned on, could change climate; fka FCC) labels Jul 27, 2022
@vanroekel changed the title from "Adds submesoscale parameterization" to "Adds submesoscale parameterization for mpas-ocean" Jul 27, 2022
@vanroekel (Contributor, Author)

This PR has been tested in a fully coupled configuration for 200+ years on chrysalis. The simulation page describing the run is here

https://acme-climate.atlassian.net/wiki/spaces/NGDO/pages/3475243009/20220715.submeso.piControl.ne30pg2+EC30to60E2r2.chrysalis

A direct link to the diagnostics is here:
https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.vanroekel/E3SMv2/20220715.submeso.piControl.ne30pg2_EC30to60E2r2.chrysalis/

@vanroekel (Contributor, Author)

There are a lot of improvements to note here; a few to call out, starting of course with AMOC:

[image: AMOC, submesoscale run]

Here is the baseline for comparison:

[image: AMOC, baseline]

@vanroekel (Contributor, Author)

Annual-average MLD, first for the submesoscale run; focus in particular on the tropics:

[image: annual-average MLD, submesoscale run]

And the baseline:

[image: annual-average MLD, baseline]

@vanroekel (Contributor, Author) commented Jul 27, 2022

SST

[image: SST, submesoscale run]

[image: SST, baseline]

The RMS difference is reduced by more than 0.1.

@vanroekel (Contributor, Author)

Ice thickness

[image: ice thickness, submesoscale run]

[image: ice thickness, baseline]

The central Arctic is influenced the most, in a positive way.

@vanroekel (Contributor, Author)

FYI to @xylar, @ytakano3, @maltrud. I'd also appreciate a look from you all if you have time.

(Review thread on the changed MOC streamfunction lines:)

mocStreamValLatAndDepthRegionLocal(1, k - 1, iTransect) &
+ sumTransport(k - 1, iTransect)
mocStreamValLatAndDepthRegionLocal(1, k + 1, iTransect) &
- sumTransport(k + 1, iTransect)

@vanroekel (Contributor, Author):

Note that this change came from a discussion with Gokhan about how to calculate AMOC. If preferred I can make this one change its own PR.

Member:

Sorry for being nosy, but how do the old and new ways of computing AMOC affect the AMOC curves you show above?

@vanroekel (Contributor, Author):

No problem at all. Thanks for looking. The change ends up being an increase of about 0.5-1 Sv on average, but not in every month; some months go up and some go down.

Contributor:

So summing from the bottom up? Could you explain a bit more about the k index change? I don't fully follow how the sign change and the index change relate to each other. Does the sign of the resulting MOC somehow remain the same?

Contributor:

I think it would be better to make this its own PR and explain the change further. We would want to make the corresponding change in MPAS-Analysis and do some tests to make sure the two are consistent.

! Edge MLD index: the shallower (minimum) of the two adjacent cells' indices
do iEdge = 1, nEdgesAll
   cell1 = cellsOnEdge(1,iEdge)
   cell2 = cellsOnEdge(2,iEdge)
   indMLDedge(iEdge) = min(indMLD(cell1), indMLD(cell2))
end do

@vanroekel (Contributor, Author):

@philipwjones I think I could move this to a calculation in the submesoscale eddy code to remove it from ocn_diagnostics_variables, if you prefer that approach.

@vanroekel (Contributor, Author)

Here is relevant PACE data for the runs -- https://pace.ornl.gov/search/20220715.submeso.piControl.ne30pg2_EC30to60E2r2.chrysalis

@ambrad (Member) commented Jul 27, 2022

@vanroekel the CI build is failing because you've accidentally included submodule updates to the land component in this PR.

@vanroekel (Contributor, Author)

Thanks @ambrad, I think I got all the unintended commits removed from the PR. The CI passes now.

@ytakano3

@vanroekel Thank you Luke for sharing; the feature looks good. A minor comment (or rather a question) from the user side, and apologies if I missed the information: I see the submesoscale parameterization settings are in the default Registry. Does this mean this feature will be turned on by default when setting up a case (for me, a G-case)?

@vanroekel (Contributor, Author)

@ytakano3 Yes, for now we will leave this off by default. But all you have to do is add

config_submesoscale_enable = .true.

to your user_nl_mpaso file.

@ytakano3

@vanroekel Thank you Luke, that is straightforward.

@rljacob rljacob removed the request for review from jonbob August 2, 2022 22:49
@mark-petersen (Contributor)

@vanroekel can you change these lines so that this compiles with OpenMP? I do not have permissions to push:

diff --git a/components/mpas-ocean/src/shared/mpas_ocn_submesoscale_eddies.F b/components/mpas-ocean/src/shared/mpas_ocn_submesoscale_eddies.F
index 8d0a1de912..d8ef8d3d1d 100644
--- a/components/mpas-ocean/src/shared/mpas_ocn_submesoscale_eddies.F
+++ b/components/mpas-ocean/src/shared/mpas_ocn_submesoscale_eddies.F
@@ -92,8 +92,8 @@ module ocn_submesoscale_eddies

      !$omp parallel
      !$omp do schedule(runtime) &
-     !$omp private(cell1,cell2,bvfML,zEdge,streamFunction,gradBuoyML,hML,bvfAv)
-     !$omp private(bvfML,zMLD,Lf,ds,mu,k)
+     !$omp private(cell1,cell2,bvfML,zEdge,streamFunction,gradBuoyML,hML,bvfAv, &
+     !$omp         zMLD,hAv,Lf,ds,mu,k)
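
(For context on the compile failure: in the original, the first !$omp private line has no trailing &, so it ends the directive, leaving the second !$omp private line as a stray clause that is not valid on its own. The fix merges the two into a single continued private list, drops the duplicated bvfML, and adds the missing hAv.)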

@amametjanov (Member)

A 5-day master pre- vs. post-merge comparison on Summit+pgigpu with the streams.ocean mod shows 1 diff in mpaso.hist.am.highFrequencyOutput.0001-01-01_00.00.00.nc:

 vertVelocityTop   (nVertLevelsP1,nCells,Time)  t_index =      1     1
   11845131 14344760  (    37, 50226,     1) (    38, 90495,     1) (    11,158145,     1) (     9,224472,     1)
            14344760   2.596361748874187E-02  -1.668974943459034E-02 3.3E-02 -1.517195254564285E-02 7.6E-01 -5.016618873924017E-03
            14344760   2.320441044867039E-02  -1.528023649007082E-02          1.738854497671127E-02          5.016617942601442E-03
            14344760  (     2, 51552,     1) (    52, 48032,     1)
          avg abs field values:    2.286154311150312E-03    rms diff: 5.4E-03   avg rel diff(npos):  7.6E-01
                                   2.700790762901306E-03                        avg decimal digits(ndif):  0.1 worst: -0.3
 RMS vertVelocityTop                  5.3651E-03            NORMALIZED  2.1517E+00

Run-dir on Summit:

/gpfs/alpine/cli115/proj-shared/azamat/e3sm_scratch/SMS.T62_oEC60to30v3.CMPASO-NYF.summit_pgigpu.C.20220830-submeso/run

@mark-petersen (Contributor)

Thanks. Do you know how to interpret these lines? Are there only differences in six locations, and lines 2 and 3 here are the diffs at those index locations? Why does the second line have six values and the third have four?

   11845131 14344760  (    37, 50226,     1) (    38, 90495,     1) (    11,158145,     1) (     9,224472,     1)
            14344760   2.596361748874187E-02  -1.668974943459034E-02 3.3E-02 -1.517195254564285E-02 7.6E-01 -5.016618873924017E-03
            14344760   2.320441044867039E-02  -1.528023649007082E-02          1.738854497671127E-02          5.016617942601442E-03
            14344760  (     2, 51552,     1) (    52, 48032,     1)

@jonbob (Contributor) commented Aug 31, 2022

@mark-petersen - the cprnc output is difficult to read, but it means there are differences in 11845131 of 14344760 values. Lines 2 and 3 are for the extrema in values: line 2 has the max and min values from file1 (I think), as well as the max diff (3.3e-2), the file1 value at the max-diff location, the max relative diff, and the file1 value at that location. Line 3 does not repeat the max-diff values but otherwise reports the same information for file2.
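
Following that reading, a hedged annotation of the output quoted above (interpretive, not official cprnc documentation):

 11845131 14344760 ...                                  <- differing values / total values
          14344760  2.596e-02  -1.669e-02  3.3e-02 ...  <- file1 max, file1 min, max abs diff,
                                                           file1 value there, max rel diff, file1 value there
          14344760  2.320e-02  -1.528e-02  ...          <- the same for file2, without repeating the diffs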

@xylar (Contributor) commented Aug 31, 2022

That's really strange. It seems like vertVelocityTop is utter garbage but everything else is fine. It's hard to see how that could be possible.

xylar added a commit to xylar/compass that referenced this pull request Aug 31, 2022
As of E3SM-Project/E3SM#5099, this
variable is no longer part of the AM and is instead part of the
main MPAS-Ocean code.
@mark-petersen (Contributor)

I agree. 11.3M of 14.3M values of the array mismatch, so this is not confined to inconsequential cells like land cells, as we thought. But vertVelocityTop should immediately change all the other variables on the next step. I am also puzzled.

@vanroekel (Contributor, Author) commented Sep 1, 2022

Just a note that I don't think vertVelocityTop actually feeds back to the other variables in the default configuration. Advection is by vertAleTransportTop which is recomputed elsewhere. I think it would change things if remapping is used instead.

What is especially odd to me is that this PR did not change the calculation of vertVelocityTop. So I don't understand why this PR is causing the diff.

@xylar (Contributor) commented Sep 1, 2022

Thanks @vanroekel, I think that at least helps us understand why vertVelocityTop could be non-BFB without affecting other variables. But I agree, there's no clear reason that this variable should be affected.

I think we're in a pinch in terms of investigating further, because those of us involved in this PR don't seem to have RSA tokens for Summit or Ascent. Am I incorrect about that?

@amametjanov (Member)

I tried to narrow it down: CPU runs with ibm and pgi on Summit are BFB. The diff was reported for pgigpu, so the culprit is likely in the !$acc pragmas. On the other hand, we don't run nightly pgigpu tests with baseline comparisons, so this diff might be from elsewhere.

@jonbob (Contributor) commented Sep 6, 2022

That's a good hint, @amametjanov. I haven't done much with !$acc pragmas, but this loop in mpas_ocn_diagnostics.F (lines 1873-1919) looks suspicious to me; it looks like div_hu and div_huTransport are missing from the private clause?

#ifdef MPAS_OPENACC
      !$acc parallel loop gang vector &
      !$acc          present(divergence, kineticEnergyCell, &
      !$acc                  areaCell, nEdgesOnCell, edgesOnCell, edgeSignOnCell, dcEdge, dvEdge, &
      !$acc                  maxLevelCell, normalVelocity, divergence, layerThicknessEdgeFlux, &
      !$acc                  normalTransportVelocity, vertVelocityTop, vertTransportVelocityTop, &
      !$acc                  invAreaCell, minLevelCell) &
      !$acc          private(invAreaCell1, i, iEdge, edgeSignOnCell_temp, dcEdge_temp, &
      !$acc                  dvEdge_temp, k, r_tmp)
#else
      !$omp parallel
      !$omp do schedule(runtime) &
      !$omp private(i, iEdge, invAreaCell1, edgeSignOnCell_temp, dcEdge_temp, dvEdge_temp, &
      !$omp         k, r_tmp, div_hu, div_huTransport)
#endif
      do iCell = 1, nCells
         divergence(:, iCell) = 0.0_RKIND
         kineticEnergyCell(:, iCell) = 0.0_RKIND
         div_hu(:) = 0.0_RKIND
         div_huTransport(:) = 0.0_RKIND
         invAreaCell1 = invAreaCell(iCell)
         do i = 1, nEdgesOnCell(iCell)
            iEdge = edgesOnCell(i, iCell)
            edgeSignOnCell_temp = edgeSignOnCell(i, iCell)
            dcEdge_temp = dcEdge(iEdge)
            dvEdge_temp = dvEdge(iEdge)
            do k = minLevelCell(iCell), maxLevelCell(iCell)
               r_tmp = dvEdge_temp * normalVelocity(k, iEdge) * invAreaCell1

               divergence(k, iCell) = divergence(k, iCell) - edgeSignOnCell_temp * r_tmp
               div_hu(k) = div_hu(k) - layerThicknessEdgeFlux(k, iEdge) * edgeSignOnCell_temp * r_tmp
               kineticEnergyCell(k, iCell) = kineticEnergyCell(k, iCell) &
                                              + 0.25 * r_tmp * dcEdge_temp * normalVelocity(k,iEdge)
               ! Compute vertical velocity from the horizontal total transport
               div_huTransport(k) = div_huTransport(k) &
                                    - layerThicknessEdgeFlux(k, iEdge) * edgeSignOnCell_temp &
                                    * dvEdge_temp * normalTransportVelocity(k, iEdge) * invAreaCell1
            end do
         end do
         ! Vertical velocity at bottom (maxLevelCell(iCell)+1) is zero, initialized above.
         vertVelocityTop(1:minLevelCell(iCell)-1, iCell) = 0.0_RKIND
         vertVelocityTop(maxLevelCell(iCell)+1, iCell) = 0.0_RKIND
         do k = maxLevelCell(iCell), 1, -1
            vertVelocityTop(k,iCell) = vertVelocityTop(k+1,iCell) - div_hu(k)
            vertTransportVelocityTop(k,iCell) = vertTransportVelocityTop(k+1,iCell) - div_huTransport(k)
         end do
      end do

(Review thread on these changed !$acc directive lines:)

!$acc vertTransportVelocityTop, vertGMBolusVelocityTop, invAreaCell, &
!$acc minLevelCell) &
!$acc normalTransportVelocity, vertVelocityTop, vertTransportVelocityTop, &
!$acc invAreaCell, minLevelCell) &

Contributor:

Yes, these lines were changed in this PR.

(Review thread on these !$acc directive lines, including the private clause:)

!$acc vertTransportVelocityTop, vertGMBolusVelocityTop, invAreaCell, &
!$acc minLevelCell) &
!$acc normalTransportVelocity, vertVelocityTop, vertTransportVelocityTop, &
!$acc invAreaCell, minLevelCell) &
!$acc private(invAreaCell1, i, iEdge, edgeSignOnCell_temp, dcEdge_temp, &
!$acc dvEdge_temp, k, r_tmp)

@mark-petersen (Contributor) commented Sep 6, 2022

But the private lines were not changed - div_hu and div_huTransport should be private and listed here, but they were not listed before this PR either. Regardless, they should be added.

Since they are not listed as private, this calculation could simply be wrong for OpenACC, but wrong in different ways before and after this PR.

@philipwjones (Contributor)

Yes, the present() clause is missing both a maxLevelCell and layerThickEdgeFlux and the div_huGMBolus needs to be added to the private clause. The latter is likely causing some race conditions. The former are just potentially leading to unnecessary data motion. Similar changes are needed in the sister routine diagnostic_solve_MLEvel and the original diagnostic_solve_vortVel. Not sure how the original evaded detection...
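
A minimal sketch of the private-clause part of that fix (the actual change was handled separately, under issue #5176; only the clause for the loop quoted above is shown, in the same diff style as earlier in this thread):

diff --git a/components/mpas-ocean/src/shared/mpas_ocn_diagnostics.F b/components/mpas-ocean/src/shared/mpas_ocn_diagnostics.F
--- a/components/mpas-ocean/src/shared/mpas_ocn_diagnostics.F
+++ b/components/mpas-ocean/src/shared/mpas_ocn_diagnostics.F
       !$acc          private(invAreaCell1, i, iEdge, edgeSignOnCell_temp, dcEdge_temp, &
-      !$acc                  dvEdge_temp, k, r_tmp)
+      !$acc                  dvEdge_temp, k, r_tmp, div_hu, div_huTransport)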

jonbob added a commit that referenced this pull request Sep 6, 2022
…5167)

Add normalGMBolusVelocity to the gm package

This PR is needed to put the normalGMBolusVelocity in the gm package, so
it is not allocated when GM is off. Otherwise, extra memory is allocated
unnecessarily for high-resolution simulations.

This reverts commit 2c6f496, which was a small change in #5099 to make
that PR BFB on intel-19 optimized in MPAS-Ocean standalone.

This PR is BFB with gcc-9 debug, gcc-9 optimized, and intel-19 debug.
For intel-19 optimized there is a mismatch with master of 1e-12 on the
first timestep for the MPAS-Ocean standalone compass test
ocean_global_ocean_QU240_PHC_performance_test on chrysalis. It appears
that intel-19 conducts some optimization that changes the order of
operations for the lines involved.

We expect this PR to be bfb for all E3SM tests.

[BFB]
@jonbob (Contributor) commented Sep 6, 2022

@philipwjones - can I make an issue from this and assign you or @amametjanov?

@philipwjones (Contributor)

@jonbob - yes, and go ahead and assign me since I need to fix some device_resident code in the same routines

@jonbob (Contributor) commented Sep 6, 2022

moving conversation to E3SM Issue #5176, since this PR has already been merged

jonbob added a commit that referenced this pull request Sep 7, 2022
Add normalGMBolusVelocity to the gm package (same commit message as above)
jonbob added a commit that referenced this pull request Sep 30, 2022
…on' into next (PR #5172)

Enables and modifies the submesoscale eddy parameterization

This enables the Fox-Kemper et al 2011 submesoscale eddy
parameterization as the default for all configurations.

It also fixes a unit error in the submesoscale eddy code that did not
influence the long control run, but was introduced in the first PR
(#5099). The variable gradBuoyEddy was not the buoyancy but the density
gradient. Here the name is changed to gradDensityEddy for clarity.

[NML]
[CC]
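
(For reference, and not stated in the PR itself: buoyancy is defined from density as b = -g * (rho - rho_0) / rho_0, so a density gradient must be scaled by -g/rho_0 to give a buoyancy gradient; conflating the two produces exactly the kind of unit error fixed here.)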

jonbob added a commit that referenced this pull request Oct 3, 2022
Enables and modifies the submesoscale eddy parameterization (PR #5172; same commit message as above)
@rljacob restored the vanroekel/ocean/add-submesoscale-eddies branch October 14, 2022 18:58
@rljacob deleted the vanroekel/ocean/add-submesoscale-eddies branch November 15, 2024 04:17