Dynlakes: fix some subtle issues #3
This is needed for water tracer masses to be counted correctly.
We have changed the definition of total column water for lake columns, so the baseline values for lakes are incorrect on old initial conditions files. This commit adds some code to check if we're using an old initial conditions file, and if so, resets the dynbal baseline values for lakes to use the new definition.
I went back and forth about whether we should do this, but I actually feel that it's best if we do reset the lake baselines in a branch or continue run, if using an older restart file. If we didn't do this, we'd want to add some logic for writing out the issue-fixed metadata for any further restart files written from these runs, to note that this issue isn't actually fixed yet on these restart files.
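The check-and-reset logic described in this commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the CTSM Fortran: the names `RestartFile`, `Column`, `LAKE_BASELINE_FIX` and `reset_lake_baselines_if_old` are all hypothetical, and it assumes that a lake baseline is simply the column's total water under the new definition and that the issue-fixed metadata is a set of tags stored on the file.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical tag: recorded in a restart file's metadata once the file has
# been written with the new definition of total column water for lakes.
LAKE_BASELINE_FIX = "lake_dynbal_baseline"


@dataclass
class Column:
    is_lake: bool
    total_water: float  # total column water under the NEW definition
    baseline: float     # dynbal baseline as read from the restart file


@dataclass
class RestartFile:
    columns: List[Column]
    fixed_issues: Set[str] = field(default_factory=set)


def reset_lake_baselines_if_old(restart: RestartFile) -> bool:
    """Reset lake baselines if the file predates the fix. Return True if reset."""
    if LAKE_BASELINE_FIX in restart.fixed_issues:
        return False  # file already uses the new definition; leave it alone
    for col in restart.columns:
        if col.is_lake:
            # Assumption: the baseline equals the current total column water.
            col.baseline = col.total_water
    # Mark the issue as fixed so restart files written later carry the tag.
    restart.fixed_issues.add(LAKE_BASELINE_FIX)
    return True
```

Because the reset also happens in a branch or continue run, the tag can be written unconditionally on every later restart file, which is the simplification argued for above.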
These seem to have been missing for a while (forever?).
@Ivanderkelen I'd like your review of this. With the changes here, I am satisfied with the results of the full CTSM test suite (except one single-point test that I want to look into a bit more to convince myself that the level of answer changes is acceptable), so I think the dynamic lakes code (without the mksurfdata_map changes for now) is ready to come to master once you give your okay to this final set of changes. However, I have NOT run any dynamic lakes runs with these changes, and that seems important to do. I do plan to soon create a single-point dynamic lakes test and run it before and after these changes to verify for myself that I haven't broken things, but I'd like your help testing this, too. So what I'd like from you is:
Please let me know if you'd like any help with how to do this, or if you'd like to talk more about any of this.
Apologies for my late answer. I had a detailed look at your commits, and they all look fine to me. I cannot comment on a31875d, as I don't have enough experience with the precision of accumulating large values on top of existing or initial values, but the reordering seems all right. In addition, I performed some tests with dynamic lakes, running cases similar to my earlier testing a year ago.
@Ivanderkelen thanks for looking at this, and now it's my turn to apologize for a delay in responding! Based on what you described from your testing, my guess is that the finidat files in your runs did NOT have the three DYNBAL_BASELINE_* variables ( If this isn't clear or doesn't sound like what you're seeing, then I'd be interested in looking at the results myself if it's easy for you to share them. I just want to be sure that the differences you're seeing make sense, and that I haven't introduced a bug. For example, I especially want to make sure that I didn't introduce any issues with a31875d. I don't understand why setting reset_dynbal_baselines = .true. leads to the same results with the older code, though that may not be too important to figure out.
If it's about as easy or easier for you to share your run setup – including the necessary input files – then I'd be happy to reproduce this myself. (This would also let me look at the differences that arise from the individual different commits.)
You are right about the finidat files in my runs: they did not include the DYNBAL_BASELINE_* variables. For reference, they used this file: As I am getting very confused about the different branches and commits, and the values I get do not match my reasoning or yours, I would rather share my run setup script and namelist settings. I use the script Please let me know if you need any additional information, and thank you for having a close look at this!
Thanks a lot @Ivanderkelen. I have run your test case on each commit in this PR to examine the incremental diffs. Everything looks as I expected. In particular, in my testing, I did see differences in the original ( I'm happy to share more details if you're interested.
Snow occlusion updates fates
Run fsurdat_modifier via an appropriate python version on cheyenne
Updating flags wording to avoid confusions
Ideally we would do year-2000 tests to have more crop cover and thus potentially be more useful tests. However, there are problems running a year-2000 ciso test with crop. These problems exist even with an SMS test on master: I tried tests like SMS_Ly1_P72x1.f10_f10_mg37.I2000Clm45BgcCrop.cheyenne_gnu.clm-ciso--clm-cropMonthOutput, with both debug & non-debug, intel & gnu versions.
Debug tests fail like this (from SMS_D_Ly1_P72x1.f10_f10_mg37.I2000Clm45BgcCrop.cheyenne_gnu.clm-ciso--clm-cropMonthOutput):
    30:Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
    30:
    30:Backtrace for this error:
    13:
    13:Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
    13:
    13:Backtrace for this error:
    13:#0 0x2b9d1acc4aff in ???
    30:#0 0x2b9d1acc4aff in ???
    13:#1 0xf63fff in cisofluxcalc
    13: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:1555
    30:#1 0xf63fff in cisofluxcalc
    30: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:1555
    30:#2 0xf6b489 in __cncisofluxmod_MOD_cisoflux1
    30: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:153
    13:#2 0xf6b489 in __cncisofluxmod_MOD_cisoflux1
    13: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNCIsoFluxMod.F90:153
    13:#3 0xe45657 in __cndrivermod_MOD_cndrivernoleaching
    13: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNDriverMod.F90:559
    30:#3 0xe45657 in __cndrivermod_MOD_cndrivernoleaching
    30: at /glade/work/sacks/ctsm_code/ctsm/src/biogeochem/CNDriverMod.F90:559
An intel test dies in the same place.
Non-debug versions die like this (both for gnu and intel):
    30: set_curr_delta ERROR: found unexpected non-zero delta mid-year
    30: Dribbler name: hrv_xsmrpool_to_atm_c_13
    30: i, delta = 2 NaN
    30: Start of time step date (yr, mon, day, tod) = 2000 1 15 57600
    30: This indicates that some non-zero flux was generated at a time step
    30: other than the first time step of the year, which this dribbler was told not to expect.
    30: If this non-zero mid-year delta is expected, then you can suppress this error
    30: by setting allows_non_annual_delta to .true. when constructing this dribbler.
    30:iam = 30: local gridcell index = 2
    30:iam = 30: global gridcell index = 103
    30:iam = 30: gridcell longitude = 285.0000000
    30:iam = 30: gridcell latitude = -10.0000000
    30: ENDRUN:
    30: ERROR: set_curr_delta: found unexpected non-zero delta mid-year: ERROR in /glade/work/sacks/ctsm_code/ctsm/src/utils/AnnualFluxDribbler.F90 at line 276
So there is some issue with year-2000 ciso tests with crop. This issue exists on master, for clm45 and clm50 tests. (e.g., for clm50, I tried SMS_D_Ly1_P72x1.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_gnu.clm-ciso--clm-cropMonthOutput.)
updating input data file paths
Update `fates_harvest_mode` to use characters for namelist option select
Description of changes
Fix a variety of subtle issues with dynamic lakes - particularly the accounting of total water and energy.
Specific notes
This branch contains the following commits:
a9fa875: This avoids counting lake water in the begwb and endwb terms. That matters because these terms are used to calculate gridcell total water store (TWS), which in turn influences the methane code. Because the methane code was tuned around old values of TWS, changing TWS would lead to unintentional – and potentially large – changes in methane terms. Eventually we'd like to remove methane's dependence on TWS, but for now this workaround is needed to avoid changing behavior too much. See ESCOMP/CTSM#659 (comment), "Subtract dynbal baselines from begwb and endwb", for more details.
52105c4: This minor fix is needed for the sake of water tracers / water isotopes. It shouldn't have any impact outside of that (because the tracer_ratio of bulk water is 1).
de3e12c and acf0984: This one is especially subtle; it is needed for backwards compatibility with old restart files. The main changes are in de3e12c; acf0984 is just a minor tweak on top of that. The problem is that existing initial conditions files can already contain DYNBAL_BASELINE variables (for LIQUID, ICE & HEAT), but these pre-existing variables will have baseline values of 0 for lakes. Before this commit, when you started up from an old initial conditions file, the code would use these 0 values for lake baselines (because baselines are only reset if the user explicitly asks for them to be reset with a namelist flag). This commit adds some code to detect whether the initial conditions file is old, and if so, recomputes dynbal baselines for lakes using the new definition. Note that some even older initial conditions files didn't have the DYNBAL_BASELINE variables at all; those would have been okay before this change. The problem is with initial conditions files that are somewhat but not very old, and so have DYNBAL_BASELINE variables that use the old definition (where lake baseline values were 0).
8088c3c: Minor fix for a pre-existing issue
a31875d: I'm not sure if this is actually needed, but I thought it would be good to group together the lake water content and the roughly equal-but-opposite baselines, so that these can cancel to near zero before adding the smaller terms. In principle, this should help maintain precision in these smaller terms. I thought this might help resolve some of the larger-than-expected answer changes I was seeing in testing, but I don't think it actually does... but I still thought this would be good to keep in place. I have double and triple checked these changes, but it would be good to have an extra set of eyes on them to make sure I did this reordering correctly. In particular, I think there were some subtleties about when a term should accumulate on top of an existing value vs. setting the initial value of a variable.
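The precision argument behind a31875d can be illustrated with a small, self-contained Python sketch. This is not the CTSM code: the function names and the numbers are hypothetical, and the magnitudes are deliberately exaggerated so that the effect shows up in double precision. It compares adding small terms onto a large water content before subtracting the baseline against cancelling the large terms first.

```python
def total_naive(lake_water, baseline, small_terms):
    """Accumulate the small terms on top of the large value, then subtract the baseline."""
    total = lake_water
    for term in small_terms:
        total += term  # each small term can be lost to rounding here
    return total - baseline


def total_grouped(lake_water, baseline, small_terms):
    """Cancel the large, roughly equal terms first, then add the small terms."""
    total = lake_water - baseline  # near-zero result keeps full precision
    for term in small_terms:
        total += term
    return total


# Exaggerated, hypothetical magnitudes: near 1e16 the spacing between adjacent
# doubles is 2.0, so a contribution of 0.25 vanishes when added directly.
lake_water = 1.0e16
baseline = 1.0e16 - 4.0
small_terms = [0.25, 0.25]

print(total_naive(lake_water, baseline, small_terms))    # 4.0
print(total_grouped(lake_water, baseline, small_terms))  # 4.5
```

With realistic magnitudes the difference would be far smaller, which is consistent with the reordering not explaining the larger-than-expected answer changes seen in testing.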