MIMICS-related refactor before introducing MIMICS #1340

slevis-lmwg · 2021-04-14T22:51:30Z

Description of changes

Refactor aspects of the CTSM affecting our introduction of MIMICS before engaging in the actual introduction of MIMICS which has started in #1318 .

Specific notes

Introducing MIMICS initially appears to just require:

addition of SoilBiogeochemDecompCascadeMIMICSMod following SoilBiogeochemDecompCascadeBGCMod as a template
minor code mods in clm_varpar and SoilBiogeochemDecompCascadeConType

However, the hooks between the CTSM and MIMICS do not have 1-to-1 correspondence with the hooks between CTSM and BGC (or CTSM and CN). To avoid hacking things together just to make them work, we will try generalizing the hooks between the models.

Contributors other than yourself, if any:
At this stage @wwieder @ekluzek

CTSM Issues Fixed (include github issue #):
Note, #1356 becomes fully broken

Are answers expected to change (and if so in what way)? No bit-for-bit

Any User Interface Changes (namelist or namelist defaults changes)?
A couple of parameters that are currently set in the namelist will be moved to the params file.

Testing as of now:
PASS ERP_D_Ld3_P36x2.f10_f10_mg37.I2000Clm50BgcCru.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033
PASS ERP_P36x2_D_Ld5.f10_f10_mg37.I2000Clm50Cn.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033
The first (ie the BGC test) defaults to use_vertsoilc = .true. and the second (the CN test) defaults to use_vertsoilc = .false. so I repeated the second with use_vertsoilc = .true. with & without my code mods in *CascadeCN and both FAIL. This tells me that the CN implementation was already broken with use_vertsoilc = .true.

...thereby adding flexibility because as we add MIMICS to the list of decomp models (currently CN and BGC) we will need different values for these parameters

I ran these tests: ERP_D_Ld3_P36x2.f10_f10_mg37.I2000Clm50BgcCru.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033 --> PASS ERP_P36x2_D_Ld5.f10_f10_mg37.I2000Clm50Cn.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033 --> PASS The first (BGC test) defaults to use_vertsoilc = .true. and the second (CN test) defaults to use_vertsoilc = .false. so I repeated the second with use_vertsoilc = .true. with & without my code mods in *CascadeCN and both FAIL.

src/biogeochem/CNC14DecayMod.F90

src/biogeochem/CNCStateUpdate1Mod.F90

src/main/clm_varpar.F90

src/soilbiogeochem/SoilBiogeochemDecompCascadeBGCMod.F90

src/soilbiogeochem/SoilBiogeochemNStateUpdate1Mod.F90

slevis-lmwg · 2021-04-14T23:34:47Z

Next I will look into the params file refactor. Right off the bat I am aware of the following bfb mods:

two parameters that we decided to move from the namelist to the params file
two params that need the suffix _bgc to make clear that they are the ones introduced with DecompCascadeBGC
I am also removing k_frag from DecompCascadeBGC but not from the params file because it's used in DecompCascadeCN

Successfully repeated the two tests listed in this PRs intro. New params file clm50_params.c210418.nc currently located in /glade/scratch/slevis/temp_work/param_files In this file I also renamed k_frag -> k_frag_cn & tau_cwd -> tau_cwd_bgc

bld/CLMBuildNamelist.pm

ekluzek · 2021-04-22T22:27:17Z

Linking this to #1356 where we plan to remove the "CN" option to BGC. @slevisconsulting don't plan on keeping the "CN" version around or putting any work into it.

slevis-lmwg · 2021-04-23T20:08:58Z

Linking this to #1356 where we plan to remove the "CN" option to BGC. @slevisconsulting don't plan on keeping the "CN" version around or putting any work into it.

Thank you @ekluzek and did you resolve whether I should actually remove "CN" things in this PR or whether someone else should do it? Of course @wwieder has say in this decision, too :-)

ekluzek · 2021-04-23T20:15:29Z

@slevisconsulting good question. I put the issue as a "next" so we'll discuss it next week at the Thursday Software meeting and likely decide. I think the important thing for you to do is to NOT put any work into the CN version of soil-BGC. Or at least the minimum amount possible. If it does get in your way in the next week -- let us know and we can talk about it sooner.

Even a straight forward removal of the CN version is likely to end up in the peeling onion effect where you end up touching more files than you think you would. So it will take some work to do. So I think it's worth deciding who's going to do that.

This commit covers roughly half of the refactor. This refactor should make the code more flexible to handle varying numbers of litter pools instead of expecting three. This should accommodate the introduction of an alternate DecompCascade model such as MIMICS that uses two litter pools.

bld/CLMBuildNamelist.pm

src/biogeochem/CNCIsoFluxMod.F90

src/biogeochem/CNCStateUpdate1Mod.F90

src/soilbiogeochem/SoilBiogeochemDecompCascadeBGCMod.F90

bld/namelist_files/namelist_defaults_ctsm.xml

Partly clean-up and partly getting the cheyenne test-suite to pass, except for Clm45Cn, Clm50Cn, and Clm45Fates tests. Maintaining CN is not a priority, but how about Clm45Fates? Also not a priority?

slevis-lmwg · 2021-05-28T23:35:42Z

There are some minor changes that I suggest. The clm45 paramsfile needs to have some values changed on it. I ask for some comments. And I also ask for an additional index to better explain the i_litr_min index that's outside the loop.

Thank you @ekluzek
I am running the test suites next.

slevis-lmwg · 2021-05-29T02:36:14Z

Cheyenne test suite OK. Expected tests fail (now this includes CN tests).

Izumi test suite FAILed. Investigating 2 failures:
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.izumi_pgi.clm-crop BASELINE ctsm5.1.dev042: DIFF
var with max rms diff: 3.4E-19 LITR1C_vr

SMS.f10_f10_mg37.I2000Clm50BgcCrop.izumi_pgi.clm-crop BASELINE ctsm5.1.dev042: DIFF
var with max rms diff: 5.7E-05 CONC_O2_SAT
for this variable, avg rel diff: 7.3E-06

@ekluzek both failures are pgi. The same tests with gnu and intel passed on izumi. Similar tests (BgcCrop-crop*) passed on cheyenne.

I reran the test w the larger rms diff, but w my last commit that used dev033: Also DIFF from baseline.
I ran the same test with gnu & intel instead of pgi and both PASS.

Additional info about the two failing tests:
@ekluzek you showed roundoff differences for the same tests in tag ctsm5.1.dev020 (see ChangeLog)! Coincidence?

What’s a reasonable path fwd in this case?
@ekluzek recommended that I run the same tests on cheyenne w gnu & intel (first for baseline and then for the branch) to confirm bfb. If this test passes, then we will ignore the two pgi failures.

slevis-lmwg · 2021-06-01T22:59:38Z

The tests that @ekluzek recommended this morning PASS.
Tests listed here just for the record.

Baseline tests:
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_intel.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_intel.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel2
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_gnu.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_gnu.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu2
Branch tests:
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_intel.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_intel.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel2
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_gnu.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_gnu.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu2

I will push some updates to the ChangeLog. Then the PR is ready for merge. As always pls let me know if I'm missing something.

ekluzek

@slevisconsulting addressed everything I saw. So this is ready to merge.

ekluzek · 2021-06-03T17:54:58Z

@slevisconsulting I noticed you didn't enter the new paramfiles into svn inputdata. You should be able to do that looking at your permissions. It looks like when you ran on izumi, you copied the files over rather than letting them be populated from svn. I actually see running on izumi as a good way to catch if I forget to add files to inputdata. So I never copy by hand.

So could you enter those into svn? I use the rimport script on cheyenne that's at ...

/glade/p/cesmdata/cseg/inputdata/rimport

Thanks.

ekluzek · 2021-06-03T18:06:29Z

@slevisconsulting I also noticed that CN tests are breaking. Which is OK, since we view them as deprecated (see #1356). But, we should at least mark the CN tests as expected fails. And likely we should just remove those tests. And we are going to want to eventually remove the CN option as a possibility.

I did wonder if keeping CN around would be good just to show that our infrastructure is flexible to support more than type of decomposition. But, I'm not sure about that as it would require you to put effort into something we plan to remove.

ekluzek · 2021-06-03T18:10:54Z

@slevisconsulting it looks to me like some BGC and FATES tests are failing. Can you comment on that?

ERP_D_Ld10.f19_g17.I1850Clm51BgcCrop.cheyenne_intel.clm-ADspinup
ERS_D_Ld5.f09_g17.I2000Clm45Fates.cheyenne_intel.clm-FatesColdDef
ERS_D_Ld5.f09_g17.I2000Clm50Fates.cheyenne_intel.clm-FatesColdDef
SMS_Ld1.f09_g17.I2000Clm50BgcCru.cheyenne_intel.clm-af_bias_v7
SMS_Ld2_D.f09_g17.I1850Clm50BgcCropCmip6.cheyenne_intel.clm-basic_interp
SMS_Ld5.f19_g17.I2000Clm45Fates.cheyenne_intel.clm-FatesColdDef
SMS_Ld5.f19_g17.I2000Clm50Fates.cheyenne_intel.clm-FatesColdDef
SMS_Lm1.f19_g17.I1850Clm50BgcCropCmip6waccm.cheyenne_intel.clm-basic
SMS_N2_D_Lh12.f09_g17.I2000Clm50Sp.cheyenne_intel.clm-pauseResume
SSP_D_Ld10.f19_g17.I1850Clm51Bgc.cheyenne_intel.clm-rtmColdSSP
SSP_D_Ld4.f09_g17.I1850Clm50BgcCrop.cheyenne_intel.clm-ciso_rtmColdSSP
SSP_Ld10.f19_g17.I1850Clm50Bgc.cheyenne_intel.clm-rtmColdSSP

ekluzek · 2021-06-03T18:15:09Z

Looks like everything is good on izumi, with just the CN test failing, but that's expected.

slevis-lmwg · 2021-06-03T19:08:26Z

I reran that list of unexpected failures and they passed the second time around. The second time I ran them as individual tests. I used -g to generate baselines for all these, so I think we should be fine.

slevis-lmwg · 2021-06-03T21:15:55Z

@slevisconsulting I noticed you didn't enter the new paramfiles into svn inputdata. You should be able to do that looking at your permissions. It looks like when you ran on izumi, you copied the files over rather than letting them be populated from svn. I actually see running on izumi as a good way to catch if I forget to add files to inputdata. So I never copy by hand.

So could you enter those into svn? I use the rimport script on cheyenne that's at ...

/glade/p/cesmdata/cseg/inputdata/rimport

@ekluzek
I see a new ctsm51_params file dated c210601 in the paramdata directory.

I ran rimport on the clm45 and clm50 files but will wait for your response on what to do about the ctsm51 file.

slevis-lmwg · 2021-06-03T21:23:02Z

@slevisconsulting I also noticed that CN tests are breaking. Which is OK, since we view them as deprecated (see #1356). But, we should at least mark the CN tests as expected fails. And likely we should just remove those tests. And we are going to want to eventually remove the CN option as a possibility.

I did wonder if keeping CN around would be good just to show that our infrastructure is flexible to support more than type of decomposition. But, I'm not sure about that as it would require you to put effort into something we plan to remove.

@ekluzek and I agreed on the phone that:

@ekluzek will update the expected failures file
we see no need to maintain CN since MIMICS will show that our infrastructure has the flexibility to support more than one type of decomp.

slevis-lmwg · 2021-06-03T21:27:54Z

I see a new ctsm51_params file dated c210601 in the paramdata directory.

I ran rimport on the clm45 and clm50 files but will wait for your response on what to do about the ctsm51 file.

@ekluzek gave me the go-ahead to run rimport on my ctsm51 file, as well. They're all done now.

ekluzek · 2021-06-03T22:47:21Z

Note, we did notice an increase in memory based on some of the tests. There were some arrays that added a dimension to them, so I suppose that makes sense.

Make certain history fields descriptive that have been labeled by number since #1340

I used existing infrastructure to add descriptive strings to certain history fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the change to a bunch of other history fields that needed it. Some variable names for pools were also changed to use terms consistent with the new names as well.

I used existing infrastructure to add descriptive strings to certain history fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the change to a bunch of other history fields that needed it. Some variable names for pools was also changed to use terms consistent with the new names as well.

slevis-lmwg added 2 commits April 13, 2021 17:33

Derive instead of hardwiring i_litr2, i_litr3, i_cwd

dfe3f71

...thereby adding flexibility because as we add MIMICS to the list of decomp models (currently CN and BGC) we will need different values for these parameters

slevis-lmwg self-assigned this Apr 14, 2021

slevis-lmwg added PR status: work in progress next this should get some attention in the next week or two. Normally each Thursday SE meeting. code health improving internal code structure to make easier to maintain (sustainability) labels Apr 14, 2021