Dynlakes: fix some subtle issues #3

billsacks · 2020-09-09T22:37:09Z

Description of changes

Fix a variety of subtle issues with dynamic lakes - particularly the accounting of total water and energy.

Specific notes

This branch contains the following commits:

a9fa875: This avoids counting lake water in the begwb and endwb terms. That matters because these terms are used to calculate gridcell total water store (TWS), which in turn influences the methane code. Because the methane code was tuned around old values of TWS, changing TWS would lead to unintentional – and potentially large – changes in methane terms. Eventually we'd like to remove methane's dependence on TWS, but for now this workaround is needed to avoid changing behavior too much. See "Subtract dynbal baselines from begwb and endwb" (ESCOMP/CTSM#659, comment) for more details.
52105c4: This minor fix is needed for the sake of water tracers / water isotopes. It shouldn't have any impact outside of that (because the tracer_ratio of bulk water is 1).
de3e12c and acf0984: This one is especially subtle; it is needed for backwards compatibility with old restart files. The main changes are in de3e12c; acf0984 is just a minor tweak on top of that. The problem is that, on existing initial conditions files, there can be already-existing DYNBAL_BASELINE variables (for LIQUID, ICE & HEAT). But these pre-existing variables will have baseline values of 0 for lake. Before this commit, when you started up from an old initial conditions file, the code would use these 0 values for lake baselines (because baselines are only reset if the user explicitly asks them to be reset with a namelist flag). This commit adds some code to detect if the initial conditions file is old, and if so, recomputes dynbal baselines for lake using the new definition. Note that some even older initial conditions files didn't have the DYNBAL_BASELINE variables at all; those would have been okay before this change: the problem is with initial conditions files that are somewhat but not very old - so have DYNBAL_BASELINE variables on them that use the old definition (where lake baseline values were 0).
8088c3c: Minor fix for a pre-existing issue
a31875d: I'm not sure if this is actually needed, but I thought it would be good to group together the lake water content and the roughly equal-but-opposite baselines, so that these can cancel to near zero before adding the smaller terms. In principle, this should help maintain precision in these smaller terms. I thought this might help resolve some of the larger-than-expected answer changes I was seeing in testing, but I don't think it actually does... but I still thought this would be good to keep in place. I have double and triple checked these changes, but it would be good to have an extra set of eyes on them to make sure I did this reordering correctly. In particular, I think there were some subtleties about when a term should accumulate on top of an existing value vs. setting the initial value of a variable.
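
To illustrate the precision concern behind a31875d, here is a minimal Python sketch. It is not the CTSM Fortran code, and all names and magnitudes are made up for illustration. It sums the same terms in two orders: accumulating the small terms on top of a large value before the baseline is added, versus letting the large value and its baseline cancel first.

```python
# Hypothetical magnitudes, chosen to make the effect obvious in double precision.
lake_water = 1.0e16            # large lake water content
lake_baseline = -1.0e16        # roughly equal-but-opposite baseline
small_terms = [0.1, 0.2, 0.3]  # smaller terms added to the same total


def sum_small_terms_first(big, baseline, smalls):
    """Accumulate small terms on top of the big value, then add the baseline."""
    total = big
    for term in smalls:
        # Each small term is below the spacing of doubles near 1e16, so it is lost.
        total += term
    return total + baseline


def sum_cancel_first(big, baseline, smalls):
    """Let the big value and its baseline cancel, then accumulate small terms."""
    total = big + baseline
    for term in smalls:
        total += term
    return total


lossy = sum_small_terms_first(lake_water, lake_baseline, small_terms)
precise = sum_cancel_first(lake_water, lake_baseline, small_terms)
print(lossy, precise)
```

With these made-up values, the first ordering loses the small terms entirely and returns 0.0, while the second keeps them and returns approximately 0.6. Real magnitudes in the model are less extreme, so the effect there would be a loss of some trailing digits rather than a total loss.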

We have changed the definition of total column water for lake columns, so the baseline values for lakes are incorrect on old initial conditions files. This commit adds some code to check if we're using an old initial conditions file, and if so, resets the dynbal baseline values for lakes to use the new definition.

I went back and forth about whether we should do this, but I actually feel that it's best if we do reset the lake baselines in a branch or continue run, if using an older restart file. If we didn't do this, we'd want to add some logic for writing out the issue-fixed metadata for any further restart files written from these runs, to note that this issue isn't actually fixed yet on these restart files.
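
The reset logic described in these commit notes could be sketched roughly as follows. This is an illustrative Python sketch, not the actual CTSM Fortran implementation: the `issues_fixed` attribute, the `LAKE_BASELINE_ISSUE` tag, and the `Column` fields are all hypothetical names, and using the current column water as the recomputed baseline is an assumption made to keep the example short.

```python
from dataclasses import dataclass

# Hypothetical tag that a file would carry once it uses the new lake definition.
LAKE_BASELINE_ISSUE = "lake_baseline_definition"


@dataclass
class Column:
    is_lake: bool
    water: float     # total column water under the new definition
    baseline: float  # dynbal baseline as read from the file


def file_predates_fix(file_attrs):
    """Treat a file as old if it does not record the (hypothetical) issue as fixed."""
    return LAKE_BASELINE_ISSUE not in file_attrs.get("issues_fixed", [])


def initialize_baselines(columns, file_attrs, reset_all=False):
    """Reset every baseline if requested; otherwise reset only lakes on old files."""
    old_file = file_predates_fix(file_attrs)
    for col in columns:
        if reset_all or (old_file and col.is_lake):
            col.baseline = col.water  # assumed recomputation, for illustration
    return columns


# Old file: the lake baseline of 0 is replaced, the non-lake baseline is kept.
old_cols = initialize_baselines(
    [Column(True, 500.0, 0.0), Column(False, 20.0, 18.0)], file_attrs={}
)
# New file: the lake baseline read from the file is kept as-is.
new_cols = initialize_baselines(
    [Column(True, 500.0, 480.0)],
    file_attrs={"issues_fixed": [LAKE_BASELINE_ISSUE]},
)
```

Here `reset_all` plays the role of an explicit user request to reset baselines, so the old-file check only matters when that request is absent.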

billsacks · 2020-09-09T22:51:39Z

@Ivanderkelen I'd like your review of this. With the changes here, I am satisfied with the results of the full CTSM test suite (except one single-point test that I want to look into a bit more to convince myself that the level of answer changes is acceptable), so I think the dynamic lakes code (without the mksurfdata_map changes for now) is ready to come to master once you give your okay to this final set of changes. However, I have NOT run any dynamic lakes runs with these changes, and that seems important to do. I do plan to soon create a single-point dynamic lakes test and run it before and after these changes to verify for myself that I haven't broken things, but I'd like your help testing this, too.

So what I'd like from you is:

Can you please look over these changes (either as a batch or commit-by-commit) and give your okay or make any comments on these? See my above comment for notes on the individual commits on this branch.
Can you please run a bit of testing with a dynamic lakes case to verify that the dynbal adjustments still look correct with dynamic lakes? Depending on what your initial conditions were for your earlier testing, it's possible that de3e12c would lead to significant changes – due to a correction in the treatment of dynbal lake baselines on old initial conditions files. (If so, I think you could set the namelist flag reset_dynbal_baselines = .true. when running with old code in order to fix the issue in old code; I think you should not need that flag with this branch.) Other than that, I expect small but not significant answer changes from this branch, and it would be great if you could verify that.

Please let me know if you'd like any help with how to do this, or if you'd like to talk more about any of this.

Ivanderkelen · 2020-09-17T15:30:57Z

Apologies for my late answer. I had a detailed look at your commits, and they all look fine to me.

I cannot comment on a31875d, as I don't have enough experience with the precision of accumulating large values on top of existing or initial values, but the reordering seems alright.

In addition, I performed some tests with dynamic lakes, running similar cases to my earlier testing a year ago.
Running with the https://github.com/billsacks/ctsm/tree/dynlakes_avoid_tws_changes branch led to a small increase in dynbal fluxes, although they stay within the same order of magnitude as the fluxes simulated with https://github.com/billsacks/ctsm/tree/dynlakes_master_notools. Setting reset_dynbal_baselines = .true. did not change values in either of the simulations.
If you want, I can share the simulated dynbal fluxes.

billsacks · 2020-09-23T21:04:27Z

@Ivanderkelen thanks for looking at this, and now it's my turn to apologize for a delay in responding!

Based on what you described from your testing, my guess is that the finidat files in your runs did NOT have the three DYNBAL_BASELINE_* variables (DYNBAL_BASELINE_LIQ, DYNBAL_BASELINE_ICE, DYNBAL_BASELINE_HEAT). Can you confirm if that's true? If so: The case with dynlakes_master_notools would have used baseline values calculated from cold start initial conditions, which I think have lake ice fraction = 0, whereas the case with dynlakes_avoid_tws_changes would have used baseline values calculated from the spunup initial conditions, which could have a non-zero lake ice fraction as well as different lake temperatures. I believe that the differences in the dynbal fluxes (when differencing the results of the new branch vs. the old branch) would then be almost exactly equal and opposite for liquid vs. ice dynbal fluxes. Does this seem like what you're seeing? I'm not sure what to expect for the dynbal energy/heat fluxes... my intuition is that they should actually be smaller in magnitude in the run from the new branch, but I'm not positive.
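
One way to check the "equal and opposite" expectation is sketched below in Python. The function name, the tolerance, and the sample numbers are hypothetical placeholders, not values from any run. In practice the inputs would be the new-branch-minus-old-branch differences in the liquid and ice dynbal fluxes at each grid cell.

```python
def nearly_equal_and_opposite(liq_diffs, ice_diffs, rel_tol=1e-6):
    """Return True if each liquid diff is cancelled by the matching ice diff.

    The residual (liq + ice) is compared against the larger of the two
    magnitudes, so the check is relative rather than absolute.
    """
    if len(liq_diffs) != len(ice_diffs):
        raise ValueError("inputs must have the same length")
    for d_liq, d_ice in zip(liq_diffs, ice_diffs):
        scale = max(abs(d_liq), abs(d_ice))
        if scale == 0.0:
            continue  # both diffs are exactly zero: nothing to compare
        if abs(d_liq + d_ice) > rel_tol * scale:
            return False
    return True


# Made-up differences (new branch minus old branch), for illustration only.
liq = [2.5e-7, -1.0e-8, 0.0]
ice = [-2.5e-7, 1.0e-8, 0.0]
print(nearly_equal_and_opposite(liq, ice))
```

A relative tolerance is used because the flux differences can span several orders of magnitude across grid cells; the value of `rel_tol` is an arbitrary choice here.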

If this isn't clear or doesn't sound like what you're seeing, then I'd be interested in looking at the results myself if it's easy for you to share them. I just want to be sure that the differences you're seeing make sense, and that I haven't introduced a bug. For example, I especially want to make sure that I didn't introduce any issues with a31875d.

I don't understand why setting reset_dynbal_baselines = .true. leads to the same results with the older code, though that may not be too important to figure out.

billsacks · 2020-09-23T21:47:00Z

> If this isn't clear or doesn't sound like what you're seeing, then I'd be interested in looking at the results myself if it's easy for you to share them.

If it's about as easy or easier for you to share your run setup – including the necessary input files – then I'd be happy to reproduce this myself. (This would also let me look at the differences that arise from the individual different commits.)

Ivanderkelen · 2020-09-25T14:39:29Z

You are right about the finidat files in my runs: they did not include the DYNBAL_BASELINE_* variables. For reference, they used this file: /glade/p/cesmdata/cseg/inputdata/lnd/clm2/initdata_map/clmi.I2000Clm50BgcCrop.2011-01-01.1.9x2.5_gx1v7_gl4_simyr2000_c190312.nc

As I am getting very confused about the different branches and commits, and the values I get do not match my reasoning or yours, I would rather share my run setup script and namelist settings.

I use the script /glade/u/home/ivanderk/for_Bill/setup_test_dynlakes_master_notools.sh to set up and run the case. You would only need to update the SCRIPTSDIR variable to point to your clm5 repository. The script uses the namelist settings, including the input files, in /glade/u/home/ivanderk/for_Bill/nl_clm_dynlakes_master_test.sh. The case is set to run for 1968-1970, because in 1970 two large reservoirs appear (the Aswan dam and one in Russia), causing large fluxes.

Please let me know if you need any additional information, and thank you for having a close look at this!

billsacks · 2020-09-25T23:31:35Z

Thanks a lot @Ivanderkelen. I have run your test case on each commit in this PR to examine the incremental diffs. Everything looks as I expected. In particular, in my testing, I did see differences in the original (dynlakes_master_notools) branch when setting reset_dynbal_baselines. And the dynbal fluxes in this PR differ only at roundoff level from the dynbal fluxes in dynlakes_master_notools when setting reset_dynbal_baselines = .true. in the latter. I also spot-checked a few grid cells for their diffs in QFLX_LIQ_DYNBAL and QFLX_ICE_DYNBAL before and after all of the changes in this PR. They look reasonable based on my expectations. (The differences arise because, before this PR, the lake baselines had 0 ice.)

I'm happy to share more details if you're interested.

Ideally we would do year-2000 tests to have more crop cover and thus potentially be more useful tests. However, there are problems running a year-2000 ciso test with crop. These problems exist even with an SMS test on master: I tried tests like SMS_Ly1_P72x1.f10_f10_mg37.I2000Clm45BgcCrop.cheyenne_gnu.clm-ciso--clm-cropMonthOutput, in both debug & non-debug, intel & gnu versions.

Debug tests fail like this (from SMS_D_Ly1_P72x1.f10_f10_mg37.I2000Clm45BgcCrop.cheyenne_gnu.clm-ciso--clm-cropMonthOutput):

    30:Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
    30:
    30:Backtrace for this error:
    13:
    13:Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
    13:
    13:Backtrace for this error:
    13:#0 0x2b9d1acc4aff in ???
    30:#0 0x2b9d1acc4aff in ???
    13:#1 0xf63fff in cisofluxcalc
    13: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:1555
    30:#1 0xf63fff in cisofluxcalc
    30: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:1555
    30:#2 0xf6b489 in __cncisofluxmod_MOD_cisoflux1
    30: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:153
    13:#2 0xf6b489 in __cncisofluxmod_MOD_cisoflux1
    13: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:153
    13:#3 0xe45657 in __cndrivermod_MOD_cndrivernoleaching
    13: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNDriverMod.F90:559
    30:#3 0xe45657 in __cndrivermod_MOD_cndrivernoleaching
    30: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNDriverMod.F90:559

An intel test dies in the same place.

Non-debug versions die like this (both for gnu and intel):

    30: set_curr_delta ERROR: found unexpected non-zero delta mid-year
    30: Dribbler name: hrv_xsmrpool_to_atm_c_13
    30: i, delta = 2 NaN
    30: Start of time step date (yr, mon, day, tod) = 2000 1 15 57600
    30: This indicates that some non-zero flux was generated at a time step
    30: other than the first time step of the year, which this dribbler was told not to expect.
    30: If this non-zero mid-year delta is expected, then you can suppress this error
    30: by setting allows_non_annual_delta to .true. when constructing this dribbler.
    30:iam = 30: local gridcell index = 2
    30:iam = 30: global gridcell index = 103
    30:iam = 30: gridcell longitude = 285.0000000
    30:iam = 30: gridcell latitude = -10.0000000
    30: ENDRUN:
    30: ERROR: set_curr_delta: found unexpected non-zero delta mid-year: ERROR in /glade/work/sacks/ctsm_code/ctsm/src/utils/AnnualFluxDribbler.F90 at line 276

So there is some issue with year-2000 ciso tests with crop. This issue exists on master, for clm45 and clm50 tests. (For example, for clm50, I tried SMS_D_Ly1_P72x1.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_gnu.clm-ciso--clm-cropMonthOutput.)

billsacks added 6 commits August 28, 2020 11:10

Do not add lake water to begwb and endwb, and thus also TWS

a9fa875

See ESCOMP#659 (comment) for details.

Use tracer ratio when counting lake water

52105c4

This is needed for water tracer masses to be counted correctly

Add missing 'OMP END PARALLEL DO' statements

8088c3c

These seem to have been missing for a while (forever?).

Reorder accumulations to avoid loss of precision

a31875d

billsacks mentioned this pull request Sep 9, 2020

Dynamic lakes - without tools changes ESCOMP/CTSM#1109

Merged

billsacks merged commit 2cb54b5 into dynlakes_master_notools Sep 28, 2020

billsacks deleted the dynlakes_avoid_tws_changes branch September 29, 2020 01:55

billsacks pushed a commit that referenced this pull request Apr 19, 2021

Merge pull request #3 from ESCOMP/master

eb91e9a

update branch

billsacks pushed a commit that referenced this pull request Jun 11, 2021

Merge pull request #3 from rgknox/snow_occlusion_ctsm

810a09f

Snow occlusion updates fates

billsacks pushed a commit that referenced this pull request Dec 3, 2021

Merge pull request #3 from billsacks/cssi_1

7d914fc

Run fsurdat_modifier via an appropriate python version on cheyenne

billsacks pushed a commit that referenced this pull request Jan 30, 2022

Merge pull request #3 from adrifoster/subsetdata_update

717ce37

Updating flags wording to avoid confusions

slevis-lmwg pushed a commit that referenced this pull request Jul 9, 2024

Merge pull request #3 from wwieder/PLUMBERcsv

51879ca

updating input data file paths

billsacks pushed a commit that referenced this pull request Aug 24, 2024

Merge pull request #3 from glemieux/fates-landuse-v2-fhm-chars

0676cf3

Update `fates_harvest_mode` to use characters for namelist option select
