-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge to clm > r17x-ish and stable cime tag #46
Comments
Testing for r159 - massive failures for ed tests related to pio. Expected because of problems with ed/pio in clm trunk. Going to continue merging to current r174. If problems aren't fixed, then backtrack to r159-ish or fix on head....? |
TODO(bja): Check CNWoodProducts change for cnveg nitrogen state intent from Stef's branch is in new CNProducts? |
Testing on yellowstone with c3a1f92 / r175 just to see where things stand. |
The clm-ed parameter file needs to be merged with the standard clm parameter file. Merged clm_params_ed.c151027.nc with clm_params.c160225.nc to generate clm_params_ed.c160225.nc. The resulting parameter set needs science review. slatop had minor conflicts. TODO(bja, 20160412) add clm_params_ed.c160225.nc to input data repo! |
Ben -- what are the differences in the variable lists between these PFT parameter files? |
The major difference seemed to be the addition of flexcn/luna/fun parameters. You can run see the files and run ncdump on yellowstone |
But these PFT parameters ought not to really apply to this model, since it currently has neither LUNA nor FUN nor FlexCN in it. So are the new variables even read when ED is active? |
Maybe an alternate solution is to have two possible sets of PFT files: one for the CN veg model and one for the ED veg model? It seems like this would be required during interfacing anyway (since, e.g. CLM and ACME non-demographic veg models will definitely have different PFT parameters), and would reduce any confusion about which PFT parameters are in use in a given model configuration? |
All the parameters are required to be in the file even, if they aren't used. I think we will definitely want to stop reusing the clm parameter file and create a separate file for ED. |
seems like the parameter file strategy might be a thing we could discuss On 13 April 2016 at 12:38, Ben Andre [email protected] wrote:
Dr Rosie A. Fisher Terrestrial Sciences Section |
Failing restart tests with the r175 based ed:
Bisecting through the history:
ed-clm at r153, 56dfb03, introduced the restart failure at f10 and f45. From the clm short log:
r153 is a very small delta. The CanopyHydrology change:
When this is reverted, i.e. '>' back to '>=', the failing ED tests pass. Have not yet investigated why.... |
All the variables related to the frac_sno already appear to be on the restart file. Restart is not bit for bit at a single point in the antarctic.... ED interacts through frac_sno_eff. |
As a sanity check, I tried the other r153, 56dfb03, delta from SnowHydrology:
This does NOT allow the exact restart tests to pass. Testing done with: |
@bandre-ucar so did this isolation of the code that breaks restart allow you to solve the issue? If not, should we reach out to someone with specific knowledge of the snow code to get this sorted out? thanks- |
The problem isn't with snow, it's with ED, either it's interaction with snow or this is just highlighting the generally poor restart capability in ED (see #14 and #43). I have a work around, so I'm going to use it, and move on. This workaround is in the CLM code, and will be unacceptable to put back into mainline CLM, so this will have to be resolved at some point. Further discussion and work on this issue should take place in #74. |
Merging from r175-r180 results in runtime failures.
This seems to happen for all ERS_D tests. There is no useful information in the logs and no core files. Testing intermediate clm trunk tags: 175 - 06f4619 - runs 7107a26 has the following error in the cesm.log file:
There is no useful information in the stack trace and no core files. The last line in the land log file is:
r178 is:
There was also a problem with this merge. Need to investigate further if the runtime failure is a result of the merge or r178. |
Reran the failing case in debug mode in the debugger. Error is:
From the stack, the variable is |
this sounds like a conflict with the phenology refactoring. i deleted any interaction between fates and the accum machinery in that so sounds like the issue is there. Charlie Sent from a mobile device.
|
Yep, it looks like I didn't handle the merge conflicts correctly in TemperatureType.F90. Running the full test suite on the fixed version of the code. |
I've merged the fix from @rosiealice for the ed r153 restart problem into my branch for the clm trunk merge. The current status of the branch is: based on clm-ed master at c3a1f92, up to date with clm trunk r180. All tests pass. Next step is a series of merges to bring it up to the current clm-ed master. |
Nice job Ben! On 22 June 2016 at 14:04, Ben Andre [email protected] wrote:
Dr Rosie A. Fisher Terrestrial Sciences Section |
clm-ed master commits that need to be merged: c3a1f92 - done, all tests pass |
I'm looking through the commits to eebdf4d as you suggested. a couple things: eebdf4d#diff-495103cd15f8049be0952b4736d75bf5R2407 -- the logic here ought to be related to use_vertsoilc rather than use_century_decomp. if in SP mode, then none of this ought to be on http://github.com/NGEET/ed-clm/commit/eebdf4dc5a4a55edefd09a74b95b4597f77d4181#diff-86b7b164f9607a4d86d626947a92af4bR95 -- only the cnveg_carbonstate_type and cnveg_carbonflux_type are actually used in EDBGCDynMod.F90, so you could delete the occurrences of cnveg_state_inst (and actually cnveg_nitrogenstate_type and cnveg_nitrogenflux_type too) from EDBGCDynMod.F90, but agreed that this could be interfaced in a better way. |
@bandre-ucar I am confused about which options are triggering the error. this is happening in SP mode or in ED mode? |
@ckoven The default conditions used by all the ed tests appear to be ed mode with sp. You can recreate it with:
The snippet of the lnd_in above came from that test. |
@bandre-ucar ok so I think the issue is the scripting logic. "ed" ought to be a value for bgc_mode as ED and SP are mutually exclusive possibilities. The old logic may have allowed that since in both cases the soil biogeochemistry is off, but now the soil biogeochemistry is on (or a subset of it anyway) whenever ED is on, so the mutual incompatibility of those is crashing the model. I've made a new commit, f47cd81, which I think sets the switches the way I think they ought to be set. I tried running the test on yellowstone but am running into other issues -- could you try testing the new commit? thanks |
@ckoven ok, testing. |
@ckoven You have a typo in the xml file you modified:
Even fixing that, your branch still has the same runtime error with decomp_depth_efolding described above. |
Hi @bandre-ucar, I have a build here that I think is passing the restart read phase. Lawrencium has been very troublesome the last week due to IO problems, so its been hard to test. But I'm curious if the restart reads will work for you as well, and if there is any difference with how you implemented your fix. |
@ekluzek, not sure how much background @bandre-ucar has given you on this, but the baseline tag that has the problem ought to be 853f006. I've been trying to test it via Ben's suggested fix of: Erik, I am trying to fix a scripting issue that has come up in the Ben’s recent merges of the FATES code to the CLM trunk. Basically the issue is that when use_ed is set to .true., the perl logic is somehow setting bgc_mode to “sp”. This doesn’t logically make sense, since ED ought to be mutually exclusive with SP. I think this used to work because ED was effectively a biophysics model only, but now that ED is connected to the soil biogeochemistry code this creates errors. So what I’d like is to define a new bgc_mode option of “ed”. so that whenever ed is on, that is the BGC mode as well. I tried to do that in the xml file components/clm/bld/namelist_files/namelist_defaults_clm4_5.xml and I also did the same in the testing scripts components/clm/bld/test_build_namelist/t/input/namelist_defaults_clm4_5_test.xml and components/clm/bld/test_build_namelist/t/input/namelist_definition_clm4_5_test.xml and but none of this seems to work; the perl script is still receiving a value of “sp” in components/clm/bld/CLMBuildNamelist.pm. I don’t understand the syntax of the setup_cmdl_bgc subroutine in that script so I don’t get where that info actually comes from. Any chance you could help me get this sorted out? Thanks, |
@ckoven I have a branch on my NGEET fork that is basically functional. The clarification I want to make is that when ed is on, you also by default want the "bgc" settings right? So you get methane, nitrif-denitrif, century pools, and vertical carbon right? So you also need "use_cn" on, which also turns on some things you don't need like (lightning, and pop-density namelists). But, since we don't have switches for above and below ground processes, it seems that getting the "bgc" switches in addition to "use_ed=T" is what makes sense. I just want to confirm that's correct? Later at some point we need a way to distinguish the above and below ground processes. |
@ekluzek thanks. Of the things you mention, we really only want methane, century pools, and vertical carbon when ED is on. Not the nitrif-denitrif option or the fang-fire-model-specific things (lightning, and pop-density) since ED doesn't have an N cycle yet and has its own fire model. I can't see your branch for whatever reason, but unless you changed the fortran, the nitrogen-cycle-specific things should still be off even though the bgc is flagged? |
I haven't pushed my branch to the upstream main repository, so it's just on my fork right now. But, I'll try to get something that works as you've outlined and then push the branch upstream and let you know it's there. And I'm just messing with the namelist generation. Also I thought that methane requires nitrif_denitrif to be on? |
As of 6ab0d89 the merge branch is up to date with clm-r181. |
Testing of the c0654db changes into the r16x branch causes a restart failure in the coupler:
Failing fields from TestStatus.log
Note: all tests in the I poked at this a bit. Since the merged change incorporates three different bug fixes, I think the sanest way to debug the issues is going to be to back them out one at a time to narrow down the problem. For now I'm going to mark this test as an expected fail. |
In that last commit we modified the argument in the ice_mask to setFilters(). Yet, it looks like the second argument to setFilters is no longer the ice_mask and now is a "glc_behavior". EDIT: I am trying to determine if we really need that call to setFilters (as some of the comments imply), that seems the most likely culprit to mess up something in the coupler. |
also, @bandre-ucar, I'm glad that we caught a new error, but where did the ERS_D_Ld5.f45_g37.ICLM45ED come from? |
@rgknox The setFilters call was one of the conflicts, and I resolved it the same way you did in the prototype merge you made last week. I need to look more closely at it. |
@rgknox It looks like it's been part of the standard test suite since January.
Line 770. |
wow, I think one of my wires short-circuited, yeah, its been there the whole time. Its been so agreeable up until now. I looked through setfilters, and it doesn't seem like ED is changing any of the verctors that are dictating what happens in setFilters, so it does not appear to be needed (but I'm probably wrong). ED does change frac_veg_nosno_alb_patch, but that is not called during that setFilters(). |
These seem to be the largest differences: Nothing is jumping out at me. Just to be sure, these are not b4b regressions your showing right? The hard-coded stomatal slope parameters that we no longer use do not match what is in the PFT file. To get b4b regressions, we need to either retroactively change the hard-coded stomatal slope or change the PFT file to match. The differences are minor and the parameter is the quintessential tuning parameter (ie no physical analogue), so whether we use the value of 9 that is hard-coded or the 8 in the file is insignificant until someone starts optimizing (see Chonggang's experiments). |
Nope, no baselines, it is just an exact restart issue with info that is being handed to the coupler. |
Yellowstone is down on Tuesday 8/2. I checked test results and all gnu and intel passed. All pgi tests fail with a compiler error. Documenting for Wednesday when I have access to pgi again.
|
yellowstone is back up. The pgi problem was a quick simple fix. Re-running tests. |
Update merge branch to more recent clm trunk tags: r182 - done, all ed and clm_short tests pass. |
closing. new clm tags will be integrated periodically. |
Summary of Issue:
Merge to a more recent clm trunk tag based on clm4_5_x > r17x-ish. This will include critical bugfixes for clm, new but stable cime user interface scripts and a better starting point for clm5 - ED integration.
Code review and testing needed for:
Significant user interfaces changes
The text was updated successfully, but these errors were encountered: