Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return bottom flux #377

Closed

Conversation

mnlevy1981
Copy link
Collaborator

Create a new array bottom_fluxes in the marbl_interface_class that represents per-tracer flux into the bottom of the column.

Opening this as a draft PR so I can see CI test results (and better share code snippets), but it's not ready for review / merging yet.

Fixes #360

Currently this array is just set to zero, but eventually terms that are added
to the kmt level of interior_tendencies(:,:) will be returned to the GCM as a
flux instead
Instead, these terms are converted to fluxes and stored in bottom_fluxes;
sed_denitrif affects no3, alk, and alk_alt_co2 while other_remin affects o2.
For testing in the stand-alone test suite, I then convert these bottom_fluxes
back to tendencies and add them to the interior_tendencies array. This change
led to an update in the Jint_Ntot conservation computation as well (since
sed_denitrif is not included in the tendency, it does not need to be added back
into the total budget).

I have verified that this introduces round-off level changes in Jint_Ntot and
some of the tendency diagnostics, as well as a larger-than-roundoff change in
o2_consumption (since this term no longer includes sed_denitrif and
other_remin). I suspect the CI will fail because of this change to
o2_consumption, but I'll wait until I have properly computed the rest of the
bottom_flux terms before updating the baselines.
@mnlevy1981 mnlevy1981 changed the base branch from stable to development April 30, 2021 21:08
@mnlevy1981
Copy link
Collaborator Author

The CI failure in 75a1998 is expected:

Test Results:
    1. CodeConsistency.py: PASS
    2. pylint: PASS
    3. yaml_to_json.py: PASS
    4. JSON is unchanged: PASS
    5. MARBL_generate_settings_file.py: PASS
    6. MARBL_generate_diagnostics_file.py: PASS
    7. make clean: PASS
    8. bld_lib.py --no_pause: PASS
    9. bld_exe.py --no_pause: PASS
    10. get_put.py: PASS
    11. marbl_utils.py: PASS
    12. init.py: PASS
    13. init-twice.py: PASS
    14. gen_settings_file.py: PASS
    15. call_compute_subroutines.py: PASS
    16. call_compute_subroutines.py -n test_2inst.nml: PASS
    17. call_compute_subroutines.py -n test_5inst.nml: PASS
    18. netCDF Comparison (2 inst vs 1 inst): PASS
    19. netCDF Comparison (5 inst vs 1 inst): PASS
*** 20. netCDF Comparison (1 inst vs baseline): FAIL ***
    21. requested_diags.py: PASS
    22. requested_forcings.py: PASS
    23. requested_forcings.py -n test_with_PAR.nml: PASS
    24. requested_restoring.py: PASS
    25. requested_tracers.py: PASS
    26. init.py --mpitasks 2: PASS

Details from the failing test:

Comparing /home/travis/build/marbl-ecosys/MARBL/tests/regression_tests/call_compute_subroutines/history_1inst.nc to the baseline /home/travis/build/marbl-ecosys/MARBL/tests/input_files/baselines/call_compute_subroutines.history.nc
Variable: O2_CONSUMPTION
... Max relative error (0.28912135819328016) exceeds 1e-11
Differences found between files!

The change in O2_CONSUMPTION should be exactly offset by bottom_flux(o2_ind)/dz(kmt), and I'll verify that through some off-line testing.

This only modifies the stand-alone driver; as with the interior tendencies and
the surface fluxes, it is up to the GCM to add these fields to output.
@mnlevy1981
Copy link
Collaborator Author

After adding bottom fluxes to driver output in 963df13, I figured the easiest way to test this would be to write a python script... I don't think it's necessary to show the full script, but the important lines are

    o2_diff = (ds.O2_CONSUMPTION - ds_baseline.O2_CONSUMPTION).isel(num_cols=colid, num_levels=kmt).values
    btf_term = (ds.BTF_O2.isel(num_cols=colid)/(100*ds_grid.delta_z.isel(zt=kmt))).values

And the output does show that the changes in O2_CONSUMPTION are offset by the new bottom flux term:

$ ./compare_baselines_with_BTF.py
Column 1:
8.775770868915916e-10 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
8.775770868915922e-10 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 6.203854594147708e-25
Relative difference: 7.069298739466837e-16
Column 2:
3.9312510354777264e-10 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
3.931251035477717e-10 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by -9.305781891221561e-25
Relative difference: 2.3671299052747266e-15
Column 3:
1.7650138939060229e-12 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
1.7650138939060304e-12 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 7.472090494253424e-27
Relative difference: 4.2334457083040226e-15
Column 4:
7.045025998423119e-10 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
7.045025998423124e-10 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 5.169878828456423e-25
Relative difference: 7.338338892735941e-16
Column 5:
3.0675974132705846e-06 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
3.067597413270584e-06 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by -4.235164736271502e-22
Relative difference: 1.3806129572120386e-16

This commit results in large changes in PON_REMIN_NH4 and PON_REMIN_DONr, but I
have confirmed that those changes are offset (within machine precision) by the
bottom flux values. This also results in an update to the Ntot conservation
check.
@mnlevy1981
Copy link
Collaborator Author

mnlevy1981 commented May 3, 2021

For 838c8de, I expect PON_REMIN_NH4 and PON_REMIN_DONr to also fail the baseline comparison test and this proves true in the continuous integration:

Variable: O2_CONSUMPTION
... Max relative error (0.28912135819328016) exceeds 1e-11
Variable: PON_REMIN_DONr
... Max relative error (0.9158504210899284) exceeds 1e-11
Variable: PON_REMIN_NH4
... Max relative error (0.9158504210899285) exceeds 1e-11

As before, these are the only test failures and I have added those variables to my script to verify the changes are offset by the new bottom_flux terms. All five columns show similar results, but I'll just include the first column for brevity:

Column 1:
8.775770868915916e-10 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
8.775770868915922e-10 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 6.203854594147708e-25
Relative difference: 7.069298739466837e-16
----
1.4386126332468396e-09 is the difference between new PON_REMIN_NH4 and baseline (level k = kmt)
1.4386126332468396e-09 is BTF_NH4/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
4.964926484338693e-13 is the difference between new PON_REMIN_DONr and baseline (level k = kmt)
4.964926484338694e-13 is BTF_DONr/delta_z(kmt) in the new model
Those terms differ by 1.0097419586828951e-28
Relative difference: 2.033750070354543e-16

mnlevy1981 added 2 commits May 3, 2021 14:18
This commit results in large changes in CaCO3_REMIN and CaCO3_ALT_CO2_REMIN
that are offset (within machine precision) by the bottom flux values. This also
results in an update to the Ctot conservation check.
Rather than change the kmt layer of the term we are integrating, I adjust the
kmt level of the terms being summed for the comparison check. I think it looks
a little cleaner, and it avoids dividing by delta_z(kmt) when adding a term to
work (which is then multiplied by delta_z).
@mnlevy1981
Copy link
Collaborator Author

mnlevy1981 commented May 3, 2021

For 5fa8cec, we add CaCO3_REMIN_zint, CaCO3_REMIN, and CaCO3_ALT_CO2_REMIN to the expected failures:

Variable: CaCO3_ALT_CO2_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN_zint
... Max relative error (0.14192812863534043) exceeds 1e-11
Variable: O2_CONSUMPTION
... Max relative error (0.28912135819328016) exceeds 1e-11
Variable: PON_REMIN_DONr
... Max relative error (0.9158504210899284) exceeds 1e-11
Variable: PON_REMIN_NH4
... Max relative error (0.9158504210899285) exceeds 1e-11

From my tool's output (switching to column 2, because that's the only one with non-zero changes in CaCO3_REMIN values):

Column 2:
dz(kmt) = 24973.5938 cm
3.9312510354777264e-10 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
3.931251035477717e-10 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 9.305781891221561e-25
Relative difference: 2.3671299052747266e-15
----
2.6486009975829807e-10 is the difference between new PON_REMIN_NH4 and baseline (level k = kmt)
2.6486009975829807e-10 is BTF_NH4/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
9.140827026985593e-14 is the difference between new PON_REMIN_DONr and baseline (level k = kmt)
9.140827026985592e-14 is BTF_DONr/delta_z(kmt) in the new model
Those terms differ by 1.262177448353619e-29
Relative difference: 1.3808131853139903e-16
----
1.4158226217575558e-09 is the difference between new CaCO3_REMIN and baseline (level k = kmt)
1.4158226217575558e-09 is BTF_DIC/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
1.4158226217575558e-09 is the difference between new CaCO3_ALT_CO2_REMIN and baseline (level k = kmt)
1.4158226217575558e-09 is BTF_DIC_ALT_CO2/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0

I also verified that the absolute difference in CaCO3_REMIN_zint is equal to the absolute difference in CaCO3_REMIN times dz(kmt)

So these failures are all still expected

This commit results in a large change in SiO2_REMIN that is offset (within
machine precision) by the bottom flux value. This also results in an update to
the Sitot conservation check.
@mnlevy1981
Copy link
Collaborator Author

Continuing the pattern, 227b0cc adds SiO2_REMIN to the list of variables that no longer match the baseline value:

Variable: CaCO3_ALT_CO2_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN_zint
... Max relative error (0.14192812863534043) exceeds 1e-11
Variable: O2_CONSUMPTION
... Max relative error (0.28912135819328016) exceeds 1e-11
Variable: PON_REMIN_DONr
... Max relative error (0.9158504210899284) exceeds 1e-11
Variable: PON_REMIN_NH4
... Max relative error (0.9158504210899285) exceeds 1e-11
Variable: SiO2_REMIN
... Max relative error (0.9858322353849616) exceeds 1e-11

But I am convinced this change is being offset by the SiO3 bottom flux appropriately:

Column 2:
dz(kmt) = 24973.5938 cm
3.9312510354777264e-10 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
3.931251035477717e-10 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 9.305781891221561e-25
Relative difference: 2.3671299052747266e-15
----
2.6486009975829807e-10 is the difference between new PON_REMIN_NH4 and baseline (level k = kmt)
2.6486009975829807e-10 is BTF_NH4/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
9.140827026985593e-14 is the difference between new PON_REMIN_DONr and baseline (level k = kmt)
9.140827026985592e-14 is BTF_DONr/delta_z(kmt) in the new model
Those terms differ by 1.262177448353619e-29
Relative difference: 1.3808131853139903e-16
----
1.4158226217575558e-09 is the difference between new CaCO3_REMIN and baseline (level k = kmt)
1.4158226217575558e-09 is BTF_DIC/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
1.4158226217575558e-09 is the difference between new CaCO3_ALT_CO2_REMIN and baseline (level k = kmt)
1.4158226217575558e-09 is BTF_DIC_ALT_CO2/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
3.3671487754632125e-08 is the difference between new SiO2_REMIN and baseline (level k = kmt)
3.367148775463212e-08 is BTF_SiO3/delta_z(kmt) in the new model
Those terms differ by 6.617444900424222e-24
Relative difference: 1.9652962615273427e-16

mnlevy1981 added 4 commits May 4, 2021 09:31
This commit removes the bottom flux terms from POC%remin, which affects the
bottom flux for the above three variables. This commit has been harder to
verify because BTF_O2 and BTF_DIC were already non-zero -- the changes to
O2_CONSUMPTION and POC_REMIN_DIC were not completely offset by bottom fluxes. I
was able to compare POC_REMIN_DIC to output from the previous commit, but I'm
not sure how to convince myself that O2_CONSUMPTION is still correct.
This commit results in large changes to POP_REMIN_PO4 and POP_REMIN_DOPr that
are offset (within machine precision) by the bottom flux values. This also
results in an update to the Ptot conservation check.
Despite only using the variable when POC%to_floor is positive, it was computed
in one (POC%to_floor > 0) block and used in another, which raised a warning in
some versions of gfortran. Computing it at the top of the (k == kmt) block
should avoid that warning.
I had missed the relationship between lig_loss and POC_remin in previous
commits
@mnlevy1981
Copy link
Collaborator Author

mnlevy1981 commented May 4, 2021

I think the bulk of this PR is done, but it looks like there is some trouble-shooting to do. Here's a full list of the errors from Travis:

Variable: CaCO3_ALT_CO2_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN_zint
... Max relative error (0.14192812863534043) exceeds 1e-11
Variable: J_DIC_ALT_CO2
... Max relative error (1.4131487444454425) exceeds 1e-11
Variable: J_Lig
... Max absolute error (8.480668048918108e-16) exceeds 1e-16
... Max relative error (230.8942306654167) exceeds 1e-11
Variable: J_O2
... Max relative error (252546.9925676971) exceeds 1e-11
Variable: Lig_deg
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: Lig_loss
... Max relative error (0.8248745958619461) exceeds 1e-11
Variable: Lig_prod
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: O2_CONSUMPTION
... Max relative error (0.7282683802851332) exceeds 1e-11
Variable: POC_REMIN_DIC
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: POC_REMIN_DIC_zint
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: POC_REMIN_DIC_zint_100m
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: POC_REMIN_DOCr
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: POC_REMIN_DOCr_zint
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: POC_REMIN_DOCr_zint_100m
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: PON_REMIN_DONr
... Max relative error (0.9158504210899284) exceeds 1e-11
Variable: PON_REMIN_NH4
... Max relative error (0.9158504210899285) exceeds 1e-11
Variable: POP_REMIN_DOPr
... Max relative error (0.9165604402026947) exceeds 1e-11
Variable: POP_REMIN_PO4
... Max relative error (0.9167673446036031) exceeds 1e-11
Variable: SiO2_REMIN
... Max relative error (0.9858322353849616) exceeds 1e-11

I don't know what to make of J_DIC_ALT_CO2, J_Lig and J_O2 failing here; when I compare local output to the baseline, I get

Variable: J_DIC_ALT_CO2 ...
... variable is double ...
... Biggest (absolute) diff = 5.21683231820356e-06
... Biggest (relative) diff = 5.232944571080894e-43

Variable: J_Lig ...
... variable is double ...
... Biggest (absolute) diff = 4.906766438974732e-10
... Biggest (relative) diff = 4.921921049445709e-47

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 0.006159340023692336
... Biggest (relative) diff = 6.178363223589508e-40

I'll look through the changes and see if anything catches my eye. Once I get to the bottom of this, I'll comment on whether the rest of the differences in the variables are expected.

mnlevy1981 added 3 commits May 4, 2021 15:23
This was causing a problem in my netcdf comparison tool, because it was
treating the max value of some fields as 1e37 and reporting a tiny relative
error when in fact the fields were changing quite a bit.
Still need to figure out why J_O2 is off by so much
With this commit, J_O2 is no longer significantly different from the baseline
@mnlevy1981
Copy link
Collaborator Author

As of 4a5ab9d the variables that fail are all expected:

Variable: CaCO3_ALT_CO2_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN
... Max relative error (0.9118842274591255) exceeds 1e-11
Variable: CaCO3_REMIN_zint
... Max relative error (0.14192812863534043) exceeds 1e-11
Variable: Lig_deg
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: Lig_loss
... Max relative error (0.8248745958619461) exceeds 1e-11
Variable: Lig_prod
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: O2_CONSUMPTION
... Max relative error (0.7282683802851332) exceeds 1e-11
Variable: POC_REMIN_DIC
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: POC_REMIN_DIC_zint
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: POC_REMIN_DIC_zint_100m
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: POC_REMIN_DOCr
... Max relative error (0.9145364598112757) exceeds 1e-11
Variable: POC_REMIN_DOCr_zint
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: POC_REMIN_DOCr_zint_100m
... Max relative error (0.6864365018998254) exceeds 1e-11
Variable: PON_REMIN_DONr
... Max relative error (0.9158504210899284) exceeds 1e-11
Variable: PON_REMIN_NH4
... Max relative error (0.9158504210899285) exceeds 1e-11
Variable: POP_REMIN_DOPr
... Max relative error (0.9165604402026947) exceeds 1e-11
Variable: POP_REMIN_PO4
... Max relative error (0.9167673446036031) exceeds 1e-11
Variable: SiO2_REMIN
... Max relative error (0.9858322353849616) exceeds 1e-11

I've updated my script to verify these changes are all reflected in BTF terms, although I don't compare the zint terms:

Column 2:
dz(kmt) = 24973.5938 cm
1.4158226217575558e-09 is the difference between new CaCO3_ALT_CO2_REMIN and baseline (level k = kmt)
1.4158226217575554e-09 is offset due to changes in BTF in the new model
Those terms differ by 4.1359030627651384e-25
Relative difference: 2.9212014267938197e-16
----
1.4158226217575558e-09 is the difference between new CaCO3_REMIN and baseline (level k = kmt)
1.4158226217575554e-09 is offset due to changes in BTF in the new model
Those terms differ by 4.1359030627651384e-25
Relative difference: 2.9212014267938197e-16
----
1.7905913114418874e-13 is the difference between new Lig_deg and baseline (level k = kmt)
1.7905913114418874e-13 is offset due to changes in BTF in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
1.7905913114418874e-13 is the difference between new Lig_loss and baseline (level k = kmt)
1.7905913114418874e-13 is offset due to changes in BTF in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
1.9048843738743489e-13 is the difference between new Lig_prod and baseline (level k = kmt)
1.9048843738743489e-13 is offset due to changes in BTF in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
1.903741443250024e-09 is the difference between new POC_REMIN_DIC and baseline (level k = kmt)
1.9037414432500237e-09 is offset due to changes in BTF in the new model
Those terms differ by 4.1359030627651384e-25
Relative difference: 2.1725130150575587e-16
----
1.8523135218240513e-09 is the difference between new O2_CONSUMPTION and baseline (level k = kmt)
1.8523135218240513e-09 is BTF_O2/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
1.142930624324609e-12 is the difference between new POC_REMIN_DOCr and baseline (level k = kmt)
1.142930624324609e-12 is BTF_DOCr/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
9.140827026985593e-14 is the difference between new PON_REMIN_DONr and baseline (level k = kmt)
9.140827026985592e-14 is BTF_DONr/delta_z(kmt) in the new model
Those terms differ by 1.262177448353619e-29
Relative difference: 1.3808131853139903e-16
----
2.6486009975829807e-10 is the difference between new PON_REMIN_NH4 and baseline (level k = kmt)
2.6486009975829807e-10 is BTF_NH4/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
3.016577883876736e-15 is the difference between new POP_REMIN_DOPr and baseline (level k = kmt)
3.0165778838767356e-15 is BTF_DOPr/delta_z(kmt) in the new model
Those terms differ by 3.944304526105059e-31
Relative difference: 1.3075427447727823e-16
----
1.6755749443653546e-11 is the difference between new POP_REMIN_PO4 and baseline (level k = kmt)
1.6755749443653546e-11 is BTF_PO4/delta_z(kmt) in the new model
Those terms differ by 0.0
Relative difference: 0.0
----
3.3671487754632125e-08 is the difference between new SiO2_REMIN and baseline (level k = kmt)
3.367148775463212e-08 is BTF_SiO3/delta_z(kmt) in the new model
Those terms differ by 6.617444900424222e-24
Relative difference: 1.9652962615273427e-16

So I think this is ready for review. @klindsay28, I'll send you a calendar invite -- besides going over the Fortran changes, I can also walk you through the python script that generated the output above; while the comparison is clear in most variables, the terms marked "offset due to changes in BTF in the new model" require multiplying by some scale factors and / or taking the difference of terms.

@mnlevy1981 mnlevy1981 requested a review from klindsay28 May 5, 2021 15:07
@mnlevy1981 mnlevy1981 marked this pull request as ready for review May 5, 2021 15:07
@klindsay28
Copy link
Collaborator

It doesn't seem like ciso tracers are being handled.

Unfortunately we don't have input data to run the call_compute_subroutines test
with CISO enabled, so this will need to be tested in POP... but I think I
updated the correct bottom flux terms and accounted for them correctly when
doing the conservation checks in the diagnostics.
@mnlevy1981
Copy link
Collaborator Author

It doesn't seem like ciso tracers are being handled.

Good catch! I forgot about them because the datasets for our stand-alone test doesn't include CISO data. I think e6e3930 handles those tracers; I'll test it out in POP tomorrow. The POP test will abort if I botched the conservation checks, but I don't have a good way to test for correctness... maybe I should run POP for a day with nstep output? I could compare results from before and after pulling down that last commit.

@klindsay28
Copy link
Collaborator

In subroutine compute_denitrif, denitrif and sed_denitrif are downscaled if they would consume too much N. I think bottom_fluxes(no3_ind) and bottom_fluxes(alk_ind) need to be modified in that subroutine also, by the same magnitude that sed_denitrif is modified by. (I haven't worked out the sign of the modification to bottom_fluxes.

@mnlevy1981
Copy link
Collaborator Author

In subroutine compute_denitrif, denitrif and sed_denitrif are downscaled if they would consume too much N. I think bottom_fluxes(no3_ind) and bottom_fluxes(alk_ind) need to be modified in that subroutine also, by the same magnitude that sed_denitrif is modified by. (I haven't worked out the sign of the modification to bottom_fluxes.

Good find! I think this will be tricky to implement, though, because the oxygen bottom flux includes sed_denitrif among other terms:

           bottom_fluxes(o2_ind) = o2_consumption_scalef(k) * &
                                   min(max((O2_loc - parm_o2_min) / parm_o2_min_delta, c0), c1) / &
                                   parm_Remin_D_C_O2 * (delta_z(k) * (sed_denitrif * denitrif_C_N + other_remin) - &
                                                        (POC%to_floor - POC%sed_loss(k)) * (c1 - POCremin_refract))

Maybe the best solution is to introduce a compute_bottom_fluxes() subroutine that gets called after both compute_particulate_terms() and compute_denitrif()? I'll try to implement that and see how it goes

@klindsay28
Copy link
Collaborator

Introducing a compute_bottom_fluxes subroutine sounds like a good idea to me.

I'm struck by how complicated it has become to implement an idea that initially seemed so simple. I guess I should know by now that things are rarely as simple as they initially seem.

@mnlevy1981
Copy link
Collaborator Author

Creating compute_bottom_fluxes() helped a little bit. Here are some error metrics when bottom_fluxes were output from compute_particulate_terms()

Variable: J_NO3 ...
... variable is double ...
... Biggest (absolute) diff = 4.133616588247961e-05
... Biggest (relative) diff = 0.7424219282023071

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 4.194405067486895e-05
... Biggest (relative) diff = 0.02263273937134894

Variable: J_ALK ...
... variable is double ...
... Biggest (absolute) diff = 4.13361658824796e-05
... Biggest (relative) diff = 0.2495357306469912

Variable: J_ALK_ALT_CO2 ...
... variable is double ...
... Biggest (absolute) diff = 4.13361658824796e-05
... Biggest (relative) diff = 0.2495357306469912

And the same variables when we compute bottom_fluxes after compute_denitrif()

Variable: J_NO3 ...
... variable is double ...
... Biggest (absolute) diff = 9.412193541883236e-06
... Biggest (relative) diff = 0.1690485493464708

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 9.229926513434878e-09
... Biggest (relative) diff = 4.980408850221943e-06

Variable: J_ALK ...
... variable is double ...
... Biggest (absolute) diff = 9.412193541883238e-06
... Biggest (relative) diff = 0.05681897540139824

Variable: J_ALK_ALT_CO2 ...
... variable is double ...
... Biggest (absolute) diff = 9.412193541883238e-06
... Biggest (relative) diff = 0.05681897540139824

I think the issue is that previously we scaled denitrif (and sed_denitrif) based on a value of denitrif that included POC%to_floor - POC%sed_loss(kmt) inPOC_remin(kmt) and now that term is sent to bottom_fluxes instead:

denitrif(k) = work * ((DOC_remin(k) + DOCr_remin(k) + POC_remin(k) * (c1 - POCremin_refract) &
         - other_remin(k)) / denitrif_C_N  - sed_denitrif(k))

! scale down denitrif if computed rate would consume all NO3 in 10 days
if (NO3_loc(k) < ((c10*spd)*(denitrif(k)+sed_denitrif(k)))) then
  work = NO3_loc(k) / ((c10*spd)*(denitrif(k)+sed_denitrif(k)))
  denitrif(k) = denitrif(k) * work
  sed_denitrif(k) = sed_denitrif(k) * work
end if

Is it reasonable to chalk the bigger-than-roundoff differences up to changes in denitrif? For testing purposes I could try to get back to roundoff by adding bottom_fluxes(docr_ind) into denitrif(kmt) to compute the scale factor and then pulling it back out before returning (and then applying the scale factor to the appropriate bottom fluxes) but that seems too kludgy to leave as a permanent fix.

For ciso, set bottom_fluxes in marbl_ciso_interior_tendency_compute() before
setting the tendencies. For non-ciso, introduce compute_bottom_fluxes() which
gets called before compute_local_tendencies() (ciso doesn't have quite the
modularity of the non-iso module, but I think this split fits the existing code
in both places)
@mnlevy1981
Copy link
Collaborator Author

Okay, these are the differences between 3c145c7 (introduction of compute_bottom_fluxes()) and a baseline using development in some of the J_ diagnostics after the first timestep.

Variable: J_NO3 ...
... variable is double ...
... Biggest (absolute) diff = 9.412193541883236e-06
... Biggest (relative) diff = 0.1690485493464708

Variable: J_NH4 ...
... variable is double ...
... Biggest (absolute) diff = 6.776263578034403e-21
... Biggest (relative) diff = 3.668181257713404e-17

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 9.229926513434878e-09
... Biggest (relative) diff = 4.980408850221943e-06

Variable: J_DIC ...
... variable is double ...
... Biggest (absolute) diff = 8.131516293641283e-20
... Biggest (relative) diff = 5.313875699579526e-17

Variable: J_DIC_ALT_CO2 ...
... variable is double ...
... Biggest (absolute) diff = 8.131516293641283e-20
... Biggest (relative) diff = 5.313875699579526e-17

Variable: J_ALK ...
... variable is double ...
... Biggest (absolute) diff = 9.412193541883238e-06
... Biggest (relative) diff = 0.05681897540139824

Variable: J_ALK_ALT_CO2 ...
... variable is double ...
... Biggest (absolute) diff = 9.412193541883238e-06
... Biggest (relative) diff = 0.05681897540139824

Variable: J_DI13C ...
... variable is double ...
... Biggest (absolute) diff = 4.065758146820642e-20
... Biggest (relative) diff = 2.629129888880259e-17

Variable: J_DI14C ...
... variable is double ...
... Biggest (absolute) diff = 2.710505431213761e-20
... Biggest (relative) diff = 2.004481232770391e-17

These are not the only J_ terms that differ; they're the ones that differ by more than round-off plus NH4 (because I wanted to include both NO3 and NH4 when looking at changes due to denitrif) and all the DIC tracers (to capture the carbon isotope output)

The right-hand side of conservation integrals is now to_floor instead of
sed_loss (this branch had changed it to sed_loss + (to_floor - sed_loss) and
this commit just does the obvious algebraic cancellation).

Some additional clean-up renamed variables involved in the vertical integral
computation (giving work and work2 more descriptive names) and also cleaning up
comments in interior_tendency_mod.
The call_compute_subroutine test did not deallocate the global memory allocated
for the bottom fluxes
@klindsay28
Copy link
Collaborator

Regarding the hypothesis that the remaining differences are due to the downscaling of denitrif and sed_denitrif in compute_denitrif...
This hypothesis can be tested by commenting out the downscaling, rerunning the control and experiment cases, and seeing if the remaining differences go away.

@klindsay28
Copy link
Collaborator

I suggest adding a column where the downscaling of denitrif and sed_denitrif occurs to the dataset used in testing with the offline MARBL driver. This will increase code coverage of testing done with the offline driver. We still might be missing some code blocks with the offline testing so we shouldn't rely completely on the offline testing. But the increased coverage will catch a wider variety of unintended changes sooner in the development/testing process.

@klindsay28
Copy link
Collaborator

I'm not immediately seeing a robust way to modify the code in compute_denitrif to account for bottom_flux terms. In particular, in MOM6, bottom_flux will be a boundary condition for the vertical mixing tri-diagonal solver. We don't know a priori in MARBL how that flux will be distributed across the column.

This block of code was added to avoid total depletion of N at the sea floor, which can happen without the downscaling. I'm not sure how to include this safety rail with the loss term due tobottom_flux occurring outside of MARBL.

@mnlevy1981
Copy link
Collaborator Author

This hypothesis can be tested by commenting out the downscaling, rerunning the control and experiment cases, and seeing if the remaining differences go away.

This is significantly easier than the tests I was thinking of, I've started running it now

I suggest adding a column where the downscaling of denitrif and sed_denitrif occurs to the dataset used in testing with the offline MARBL driver.

I definitely think we need a column like that in the test. I'd also like to include the carbon isotopes in testing -- maybe it makes the most sense to keep the current test as is and also introduce a second initial condition file that contains a few different columns as well as CISO initial conditions / forcing fields? As helpful as that would be for this PR, I'd rather finish this work and then add the new test columns... I think there was some testing to make sure I got the data out of POP correctly that would side-track me from the testing I'm doing right now.

I'm not immediately seeing a robust way to modify the code in compute_denitrif to account for bottom_flux terms. In particular, in MOM6, bottom_flux will be a boundary condition for the vertical mixing tri-diagonal solver. We don't know a priori in MARBL how that flux will be distributed across the column.

This block of code was added to avoid total depletion of N at the sea floor, which can happen without the downscaling. I'm not sure how to include this safety rail with the loss term due tobottom_flux occurring outside of MARBL.

One possibility would be to scale the bottom fluxes of NO3 and NH4 by the same factor as the tendencies (e.g. if the scaling term is returned by compute_denitrif() it could be applied in compute_bottom_fluxes()). I think that would be effective in the case of POP, where the bottom flux is added to the tendency term anyway, but I'm having trouble picturing how that would effect the solution in MOM if there is a vanishingly thin bottom layer.

@klindsay28
Copy link
Collaborator

As helpful as that would be for this PR, I'd rather finish this work and then add the new test columns... I think there was some testing to make sure I got the data out of POP correctly that would side-track me from the testing I'm doing right now.

Agreed

There is a POC_remin term in denitrif that needs to be reflected in
bottom_fluxes(no3_ind) that I was missing. It is weighted by

min(max(((parm_o2_min + parm_o2_min_delta) - O2_loc) / parm_o2_min_delta, c0),
c1)

So I assume it wasn't caught in the stand-alone test due to O2_loc(kmt) being
laeger than (parm_o2_min + parm_o2_min_delta) in every column?
@mnlevy1981
Copy link
Collaborator Author

The good news is that testing with the threshold check commented out helped me find a bug (well, the good part is that 55a06ab fixes said bug). With the threshold check uncommented, though, there are still some bigger-than-roundoff errors:

Variable: J_NO3 ...
... variable is double ...
... Biggest (absolute) diff = 2.317351310062709e-07
... Biggest (relative) diff = 0.004162099680048253

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 9.229926513434878e-09
... Biggest (relative) diff = 4.980408850221943e-06

The J_NO3 term is several orders of magnitude closer to the baseline compared to the previous commit, which is excellent. If we leave the threshold check in place, then I think it makes sense to apply the same weighting to the bottom fluxes that gets applied to the tendencies; in the morning I'll test the idea I floated previously about returning the scaling factor from compute_denitrif() and passing it to compute_bottom_fluxes().

This term is then passed to compute_bottom_fluxes and used to scale the POC
terms in bottom_fluxes(o2_ind) and bottom_fluxes(no3_ind) to account for the
change to denitrif if NO3 concentrations in the bottom level are below a
certain threshold.
@mnlevy1981
Copy link
Collaborator Author

In 7678ac2, I scaled the POC contribution to bottom flux for NO3 and O2 by the same factor that denitrif is scaled when NO3 concentrations are low. This didn't appear to have much of an impact on the error term (one downside to the L-infinity norm is that unless you reduce the largest error it's hard to gauge improvement):

Variable: J_NO3 ...
... variable is double ...
... Biggest (absolute) diff = 2.317351310062709e-07
... Biggest (relative) diff = 0.004162099680048253

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 9.229926513434878e-09
... Biggest (relative) diff = 4.980408850221943e-06

but it let me do one additional test with the following diff in marbl_interior_tendency_mod.F90 (that is NOT in the code base):

410c410
<          dissolved_organic_matter%DOCr_remin(:), &
---
>          dissolved_organic_matter%DOCr_remin(:), c1 / domain%delta_z(kmt), POC, &
3356c3356
<        POC_remin, other_remin, sed_denitrif, denitrif, denitrif_coeff_kmt)
---
>        dzr_kmt, POC, POC_remin, other_remin, sed_denitrif, denitrif, denitrif_coeff_kmt)
3367a3368,3369
>     real(r8),                      intent(in)    :: dzr_kmt
>     type(column_sinking_particle_type), intent(in)    :: POC
3378c3380
<     real(r8) :: work
---
>     real(r8) :: work, work2
3391a3394,3398
>         work2 = (c10*spd)*(denitrif(k)+sed_denitrif(k))
>         if ((k == kmt) .and. (POC%to_floor > c0)) then
>           work2 = work2 + (c10*spd) * work * (POC%to_floor - POC%sed_loss(k)) * (c1 - POCremin_refract) * dzr_kmt / &
>                           denitrif_C_N
>         end if
3393,3394c3400,3401
<         if (NO3_loc(k) < ((c10*spd)*(denitrif(k)+sed_denitrif(k)))) then
<           work = NO3_loc(k) / ((c10*spd)*(denitrif(k)+sed_denitrif(k)))
---
>         if (NO3_loc(k) < work2) then
>           work = NO3_loc(k) / work2

For context, the first block is adding arguments (POC and dzr_loc(kmt)) to the compute_denitrif() call, the second and third blocks are adding the same to the compute_denitrif() declaration, the fourth block defines a temporary scalar to store the denominator of the NO3 threshold, the fifth block adds work * (POC%to_floor - POC%sed_loss(k)) * (c1 - POCremin_refract) * dzr_kmt / denitrif_C_N (the term removed from POC_remin(kmt)) back to denitrif(kmt) in this denominator, and the last block replaces the denominator of the scale factor with the temporary scalar.

Basically, I added the POC terms that were removed from denitrif back in to the denominator of the NO3 threshold check so denitrif_coeff_kmt would match (within roundoff) the value computed before introducing bottom_fluxes. And, as expected, this patch gets us back to roundoff-level differences:

Variable: J_NO3 ...
... variable is double ...
... Biggest (absolute) diff = 2.541098841762901e-21
... Biggest (relative) diff = 4.563963448419116e-17

Variable: J_O2 ...
... variable is double ...
... Biggest (absolute) diff = 1.084202172485504e-19
... Biggest (relative) diff = 5.850285034682431e-17

Variable: J_DIC ...
... variable is double ...
... Biggest (absolute) diff = 8.131516293641283e-20
... Biggest (relative) diff = 5.313875699579526e-17

Variable: J_ALK ...
... variable is double ...
... Biggest (absolute) diff = 1.355252715606881e-20
... Biggest (relative) diff = 8.181309741250619e-17

So I am convinced that everything removed from interior_tendencies in this PR is being properly accounted for in bottom_fluxes. There is still a little more code clean-up to do, and @klindsay28 I'm happy to review these latest changes with you if you'd like, otherwise I'll merge / tag this later today or tomorrow.

For the call_compute_subroutines test, I had been adding bottom_flux / dz to
the tendency to ensure the tendencies were within round-off of the previous
values. That was just for testing purposes, and now that this PR is done with
testing it makes sense to generate fresh baselines where the bottom fluxes and
tendencies are output separately.
Previous commit included the wrong baseline file (it still had BTF added to J_)
There were two lines that I added while testing out some mods, and then
commented out because I wasn't sure if I needed them... I don't need them, so
rather than leave as comments I am removing them.
Small cleanup to give the variable a more accurate / descriptive name
@mnlevy1981
Copy link
Collaborator Author

Superseded by #381, which is a much better approach to the problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

change interface so that interior_tendency_compute returns vertical tracer flux at sea floor
2 participants