Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIMICS-related refactor before introducing MIMICS #1340

Merged
merged 20 commits into from
Jun 3, 2021

Conversation

slevis-lmwg
Copy link
Contributor

@slevis-lmwg slevis-lmwg commented Apr 14, 2021

Description of changes

Refactor aspects of the CTSM affecting our introduction of MIMICS before engaging in the actual introduction of MIMICS which has started in #1318 .

Specific notes

Introducing MIMICS initially appears to just require:

  • addition of SoilBiogeochemDecompCascadeMIMICSMod following SoilBiogeochemDecompCascadeBGCMod as a template
  • minor code mods in clm_varpar and SoilBiogeochemDecompCascadeConType

However, the hooks between the CTSM and MIMICS do not have 1-to-1 correspondence with the hooks between CTSM and BGC (or CTSM and CN). To avoid hacking things together just to make them work, we will try generalizing the hooks between the models.

Contributors other than yourself, if any:
At this stage @wwieder @ekluzek

CTSM Issues Fixed (include github issue #):
Note, #1356 becomes fully broken

Are answers expected to change (and if so in what way)? No bit-for-bit

Any User Interface Changes (namelist or namelist defaults changes)?
A couple of parameters that are currently set in the namelist will be moved to the params file.

Testing as of now:
PASS ERP_D_Ld3_P36x2.f10_f10_mg37.I2000Clm50BgcCru.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033
PASS ERP_P36x2_D_Ld5.f10_f10_mg37.I2000Clm50Cn.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033
The first (ie the BGC test) defaults to use_vertsoilc = .true. and the second (the CN test) defaults to use_vertsoilc = .false. so I repeated the second with use_vertsoilc = .true. with & without my code mods in *CascadeCN and both FAIL. This tells me that the CN implementation was already broken with use_vertsoilc = .true.

...thereby adding flexibility because as we add MIMICS to the list of
decomp models (currently CN and BGC) we will need different values
for these parameters
I ran these tests:
ERP_D_Ld3_P36x2.f10_f10_mg37.I2000Clm50BgcCru.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033  --> PASS
ERP_P36x2_D_Ld5.f10_f10_mg37.I2000Clm50Cn.cheyenne_intel.clm-default -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev033  --> PASS
The first (BGC test) defaults to use_vertsoilc = .true. and
the second (CN test) defaults to use_vertsoilc = .false. so I repeated
the second with use_vertsoilc = .true. with & without my code mods in
*CascadeCN and both FAIL.
@slevis-lmwg slevis-lmwg self-assigned this Apr 14, 2021
@slevis-lmwg slevis-lmwg added PR status: work in progress next this should get some attention in the next week or two. Normally each Thursday SE meeting. code health improving internal code structure to make easier to maintain (sustainability) labels Apr 14, 2021
src/main/clm_varpar.F90 Outdated Show resolved Hide resolved
@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Apr 14, 2021

Next I will look into the params file refactor. Right off the bat I am aware of the following bfb mods:

  • two parameters that we decided to move from the namelist to the params file
  • two params that need the suffix _bgc to make clear that they are the ones introduced with DecompCascadeBGC
  • I am also removing k_frag from DecompCascadeBGC but not from the params file because it's used in DecompCascadeCN

Successfully repeated the two tests listed in this PRs intro.
New params file clm50_params.c210418.nc currently located in
/glade/scratch/slevis/temp_work/param_files
In this file I also renamed k_frag -> k_frag_cn & tau_cwd -> tau_cwd_bgc
bld/CLMBuildNamelist.pm Outdated Show resolved Hide resolved
@billsacks billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Apr 22, 2021
@ekluzek
Copy link
Collaborator

ekluzek commented Apr 22, 2021

Linking this to #1356 where we plan to remove the "CN" option to BGC. @slevisconsulting don't plan on keeping the "CN" version around or putting any work into it.

@slevis-lmwg
Copy link
Contributor Author

Linking this to #1356 where we plan to remove the "CN" option to BGC. @slevisconsulting don't plan on keeping the "CN" version around or putting any work into it.

Thank you @ekluzek and did you resolve whether I should actually remove "CN" things in this PR or whether someone else should do it? Of course @wwieder has say in this decision, too :-)

@ekluzek
Copy link
Collaborator

ekluzek commented Apr 23, 2021

@slevisconsulting good question. I put the issue as a "next" so we'll discuss it next week at the Thursday Software meeting and likely decide. I think the important thing for you to do is to NOT put any work into the CN version of soil-BGC. Or at least the minimum amount possible. If it does get in your way in the next week -- let us know and we can talk about it sooner.

Even a straight forward removal of the CN version is likely to end up in the peeling onion effect where you end up touching more files than you think you would. So it will take some work to do. So I think it's worth deciding who's going to do that.

This commit covers roughly half of the refactor. This refactor should
make the code more flexible to handle varying numbers of litter pools
instead of expecting three. This should accommodate the introduction of
an alternate DecompCascade model such as MIMICS that uses two litter
pools.
Partly clean-up and partly getting the cheyenne test-suite to pass,
except for Clm45Cn, Clm50Cn, and Clm45Fates tests.
Maintaining CN is not a priority, but how about Clm45Fates? Also not a
priority?
@slevis-lmwg
Copy link
Contributor Author

There are some minor changes that I suggest. The clm45 paramsfile needs to have some values changed on it. I ask for some comments. And I also ask for an additional index to better explain the i_litr_min index that's outside the loop.

Thank you @ekluzek
I am running the test suites next.

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented May 29, 2021

Cheyenne test suite OK. Expected tests fail (now this includes CN tests).

Izumi test suite FAILed. Investigating 2 failures:
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.izumi_pgi.clm-crop BASELINE ctsm5.1.dev042: DIFF
var with max rms diff: 3.4E-19 LITR1C_vr

SMS.f10_f10_mg37.I2000Clm50BgcCrop.izumi_pgi.clm-crop BASELINE ctsm5.1.dev042: DIFF
var with max rms diff: 5.7E-05 CONC_O2_SAT
for this variable, avg rel diff: 7.3E-06

@ekluzek both failures are pgi. The same tests with gnu and intel passed on izumi. Similar tests (BgcCrop-crop*) passed on cheyenne.

I reran the test w the larger rms diff, but w my last commit that used dev033: Also DIFF from baseline.
I ran the same test with gnu & intel instead of pgi and both PASS.

Additional info about the two failing tests:
@ekluzek you showed roundoff differences for the same tests in tag ctsm5.1.dev020 (see ChangeLog)! Coincidence?

What’s a reasonable path fwd in this case?
@ekluzek recommended that I run the same tests on cheyenne w gnu & intel (first for baseline and then for the branch) to confirm bfb. If this test passes, then we will ignore the two pgi failures.

@slevis-lmwg
Copy link
Contributor Author

The tests that @ekluzek recommended this morning PASS.
Tests listed here just for the record.

Baseline tests:
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_intel.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_intel.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel2
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_gnu.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_gnu.clm-crop -g /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu2
Branch tests:
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_intel.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_intel.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.intel2
SMS_D.f10_f10_mg37.I2000Clm51BgcCrop.cheyenne_gnu.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu1
SMS.f10_f10_mg37.I2000Clm50BgcCrop.cheyenne_gnu.clm-crop -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev042.gnu2

I will push some updates to the ChangeLog. Then the PR is ready for merge. As always pls let me know if I'm missing something.

@ekluzek ekluzek added PR status: ready PR: this is ready to merge in, with all tests satisfactory and reviews complete and removed PR status: work in progress labels Jun 3, 2021
Copy link
Collaborator

@ekluzek ekluzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slevisconsulting addressed everything I saw. So this is ready to merge.

@ekluzek
Copy link
Collaborator

ekluzek commented Jun 3, 2021

@slevisconsulting I noticed you didn't enter the new paramfiles into svn inputdata. You should be able to do that looking at your permissions. It looks like when you ran on izumi, you copied the files over rather than letting them be populated from svn. I actually see running on izumi as a good way to catch if I forget to add files to inputdata. So I never copy by hand.

So could you enter those into svn? I use the rimport script on cheyenne that's at ...

/glade/p/cesmdata/cseg/inputdata/rimport

Thanks.

@ekluzek
Copy link
Collaborator

ekluzek commented Jun 3, 2021

@slevisconsulting I also noticed that CN tests are breaking. Which is OK, since we view them as deprecated (see #1356). But, we should at least mark the CN tests as expected fails. And likely we should just remove those tests. And we are going to want to eventually remove the CN option as a possibility.

I did wonder if keeping CN around would be good just to show that our infrastructure is flexible to support more than type of decomposition. But, I'm not sure about that as it would require you to put effort into something we plan to remove.

@ekluzek
Copy link
Collaborator

ekluzek commented Jun 3, 2021

@slevisconsulting it looks to me like some BGC and FATES tests are failing. Can you comment on that?

ERP_D_Ld10.f19_g17.I1850Clm51BgcCrop.cheyenne_intel.clm-ADspinup
ERS_D_Ld5.f09_g17.I2000Clm45Fates.cheyenne_intel.clm-FatesColdDef
ERS_D_Ld5.f09_g17.I2000Clm50Fates.cheyenne_intel.clm-FatesColdDef
SMS_Ld1.f09_g17.I2000Clm50BgcCru.cheyenne_intel.clm-af_bias_v7
SMS_Ld2_D.f09_g17.I1850Clm50BgcCropCmip6.cheyenne_intel.clm-basic_interp
SMS_Ld5.f19_g17.I2000Clm45Fates.cheyenne_intel.clm-FatesColdDef
SMS_Ld5.f19_g17.I2000Clm50Fates.cheyenne_intel.clm-FatesColdDef
SMS_Lm1.f19_g17.I1850Clm50BgcCropCmip6waccm.cheyenne_intel.clm-basic
SMS_N2_D_Lh12.f09_g17.I2000Clm50Sp.cheyenne_intel.clm-pauseResume
SSP_D_Ld10.f19_g17.I1850Clm51Bgc.cheyenne_intel.clm-rtmColdSSP
SSP_D_Ld4.f09_g17.I1850Clm50BgcCrop.cheyenne_intel.clm-ciso_rtmColdSSP
SSP_Ld10.f19_g17.I1850Clm50Bgc.cheyenne_intel.clm-rtmColdSSP

@ekluzek
Copy link
Collaborator

ekluzek commented Jun 3, 2021

Looks like everything is good on izumi, with just the CN test failing, but that's expected.

@slevis-lmwg
Copy link
Contributor Author

I reran that list of unexpected failures and they passed the second time around. The second time I ran them as individual tests. I used -g to generate baselines for all these, so I think we should be fine.

@slevis-lmwg
Copy link
Contributor Author

@slevisconsulting I noticed you didn't enter the new paramfiles into svn inputdata. You should be able to do that looking at your permissions. It looks like when you ran on izumi, you copied the files over rather than letting them be populated from svn. I actually see running on izumi as a good way to catch if I forget to add files to inputdata. So I never copy by hand.

So could you enter those into svn? I use the rimport script on cheyenne that's at ...

/glade/p/cesmdata/cseg/inputdata/rimport

@ekluzek
I see a new ctsm51_params file dated c210601 in the paramdata directory.

I ran rimport on the clm45 and clm50 files but will wait for your response on what to do about the ctsm51 file.

@slevis-lmwg
Copy link
Contributor Author

@slevisconsulting I also noticed that CN tests are breaking. Which is OK, since we view them as deprecated (see #1356). But, we should at least mark the CN tests as expected fails. And likely we should just remove those tests. And we are going to want to eventually remove the CN option as a possibility.

I did wonder if keeping CN around would be good just to show that our infrastructure is flexible to support more than type of decomposition. But, I'm not sure about that as it would require you to put effort into something we plan to remove.

@ekluzek and I agreed on the phone that:

  • @ekluzek will update the expected failures file
  • we see no need to maintain CN since MIMICS will show that our infrastructure has the flexibility to support more than one type of decomp.

@slevis-lmwg
Copy link
Contributor Author

I see a new ctsm51_params file dated c210601 in the paramdata directory.

I ran rimport on the clm45 and clm50 files but will wait for your response on what to do about the ctsm51 file.

@ekluzek gave me the go-ahead to run rimport on my ctsm51 file, as well. They're all done now.

@ekluzek
Copy link
Collaborator

ekluzek commented Jun 3, 2021

Note, we did notice an increase in memory based on some of the tests. There were some arrays that added a dimension to them, so I suppose that makes sense.

@ekluzek ekluzek merged commit 1e3cc27 into ESCOMP:master Jun 3, 2021
@ekluzek ekluzek deleted the premimics_refactor branch June 3, 2021 23:16
ekluzek added a commit that referenced this pull request Jul 10, 2021
Make certain history fields descriptive that have been labeled by number since #1340
slevis-lmwg added a commit to slevis-lmwg/ctsm that referenced this pull request Jul 11, 2021
 I used existing infrastructure to add descriptive strings to certain history
 fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the
 change to a bunch of other history fields that needed it.

 Some variable names for pools were also changed to use terms consistent with
 the new names as well.
ekluzek added a commit to RonnyMeier/ctsm that referenced this pull request Jul 16, 2021
 I used existing infrastructure to add descriptive strings to certain history
 fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the
 change to a bunch of other history fields that needed it.

 Some variable names for pools was also changed to use terms consistent with the new
 names as well.
ekluzek added a commit to negin513/ctsm that referenced this pull request Jul 26, 2021
 I used existing infrastructure to add descriptive strings to certain history
 fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the
 change to a bunch of other history fields that needed it.

 Some variable names for pools was also changed to use terms consistent with the new
 names as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code health improving internal code structure to make easier to maintain (sustainability) PR status: ready PR: this is ready to merge in, with all tests satisfactory and reviews complete
Projects
Status: Done (non release/external)
Development

Successfully merging this pull request may close these issues.

4 participants