ed-clm fails bit for bit restart for f19 and f09 grids #14

bandre-ucar · 2016-02-12T00:10:02Z

Notes from Stef Muszala as of 2016-02-10

Current state of restarts – notes that I compiled while working on this.

From what I've been able to track down, the issue that I'm seeing in f19 and f09 restart errors have to with the code in ed_clm_link. First to recreate the problem, use the branch: https://svn-ccsm-models.cgd.ucar.edu/clm2/branches/ed4x5fix/

I've been using this to debug and therefore haven't made a branch_tag to go with my latest fix attempts. A bulk of the mods are simply there to help in debugging (when compared to https://svn-ccsm-models.cgd.ucar.edu/clm2/branch_tags/ed4x5fix_tags/ed4x5fix_n10_r120

Recreate what I've been seeing:

You can recreate the problem by running an f19 case for 5 days and comparing that with a case that attempts to restart from the 4th day of the above run and running for one more day. You should see differences at time-step 145.

See: /glade/u/home/muszala/NCAR_rfisher_NGEE/svn/edHighRes/cime/scripts/chkf19

and

/glade/scratch/muszala/chkf19/run

4+1 is the restart run I've been testing...baseline is in the 'save' directory.

ED restart

When ED is restarted this roughly happens:

call EDRest
call ed_clm_link

Start CLM for current timestep
call surfRad with a use_ed branch ( where one can see the first differences in surface radiation in elai_profile around line 528 )
call EDPhotoSynthesis
call EDAccumulateFlux
end CLM for current timestep

start ED for current timestep
call EDPhysiology which calls EDCohortDynamics
call ed_clm_link (Note 1)
end ED for current timestep

Note 1: now see problems in currentCohort%bstore and ED_bstore and then elai_profile on ranks 10, 35,36,37,68 and 69 for one and/or two cohort(s) depending on the MPI rank. At this point everything should be more in-synch with the rest of the run and we also see that some of the data that was set up by the call to ed_clm_link in EDRest have been overwritten during this time-step, indicating that there is unnecessary code being called.

By the end of time-step 145, there are differences at the end of the time-step and by the end of the run we can see errors on the order of :

>>cprnc chkf19.clm2.h0.0001-01-05-00000.nc ../save/chkf19.clm2.h0.0001-01-05-00000.nc | grep RMS
RMS ED_bstore                        3.3816E-12            NORMALIZED  2.7871E-07
RMS GPP                              1.7497E-09            NORMALIZED  9.3335E-05
RMS H2OSOI                           2.5714E-08            NORMALIZED  2.4811E-07
RMS NPP                              1.2314E-09            NORMALIZED  1.3515E-04
RMS PFTbiomass                       6.1993E-13            NORMALIZED  2.1156E-07

in the clm2.r.0001-01-05 file (ed fields only)

>>cprnc -m chkf19.clm2.r.0001-01-05-00000.nc ../save/chkf19.clm2.r.0001-01-05-00000.nc | grep RMS | grep ed_
RMS ed_bstore                        1.8381E-10            NORMALIZED  8.3040E-06
RMS ed_gpp_acc                       9.8746E-13            NORMALIZED  1.5586E-05
RMS ed_npp_acc                       5.0423E-13            NORMALIZED  1.6529E-05
RMS ed_resp_clm                      7.5660E-14            NORMALIZED  1.0302E-04
RMS ed_cwd_ag                        4.6912E-20            NORMALIZED  2.0821E-11
RMS ed_cwd_bg                        3.1274E-20            NORMALIZED  2.0821E-11
RMS ed_leaf_litter                   8.3800E-17            NORMALIZED  1.2463E-08
RMS ed_root_litter                   5.2713E-15            NORMALIZED  2.4462E-07
RMS ed_f_sun                         2.3372E-10            NORMALIZED  9.7288E-08
RMS ed_fabd_sun_z                    5.1857E-13            NORMALIZED  5.9807E-09
RMS ed_fabi_sun_z                    2.4104E-12            NORMALIZED  8.0762E-08
RMS ed_fabd_sha_z                    2.5355E-09            NORMALIZED  3.1893E-04
RMS ed_fabi_sha_z                    1.2666E-08            NORMALIZED  1.9667E-04
RMS ed_water_memory                  4.3388E-09            NORMALIZED  1.0575E-06
RMS ed_old_stock                     1.8204E-07            NORMALIZED  3.2317E-06

I believe that ed_clm_link should be broken apart in order to handle setting weights and land-units (particular so that ED plays well with dynamic land units) and in order to handle the actual merging of ED and CLM data. As the module stands now, these functions seem to be integrated too tightly and the call during a restart messes up the calling order from where ed_clm_link is called during a regular run. I didn't have any luck splitting this out and this seems to be partially a science issue as well as a design problem.
When I tried to take out the call to ed_clm_link at restart, I get errors in dynSubgridDriverMod, spawned from the call to reweightMod which calls subgridWeightsMod::check_weights. There are two check_weight routines in CLM (one in subgridWeightsMod.F90 and the other in subgridRestMod.F90). There is a note in the subgridRestMod.F90::check_weights which states that weights should not be checked in an ED case. I don't know what this means w.r.t. the call in subgridWeightsMod::check_weights and why it is called in this case.
At one point I tried calling ED during every time-step but that messes up some of the averaging although that would be an easier way to debug this problem if one were willing to make the ED code general enough to handle this situation (or call ED at the end of the day or on it's own frequency, etc...).
There is a further issue in that we call ed_clm_link normally in a run at the end of the ED calculations which happens after the call to CLM. This happens in one time-step. ED is called during the first time-step of the day, so at f19, it is called at TS 1, 49, 97, 145. When ED is restarted, ed_clm_link is called after the ED dynamic data structure is created and before CLM is started on that particular time-step, in effect calling ed_clm_link twice during a restart time-step. This is incorrect, but again, splitting this out is causing problems as it relates to science assumptions made in both CLM and ED.
Removing this call from ed_clm_link on restart does not help nor hinder differences.

This is one call that can be separated out for example:

760       if (.not. calledFromEdRest) then
761
762          call this%ed_update_history_variables( bounds, ed_allsites_inst(begg:endg), &
763             firstsoilpatch, ed_Phenology_inst, canopystate_inst)
764
765       endif

There is something happening at ts 4. Below, 2+1 means run for 3 days, dumping restart files every day, then restart at day 2 and run for 1 day. Compare. Anything restarting at day 4 or 5 is not BFB.

in my own test start with:

1+1 == 2 ?  BFB ... clm2.r. file is differs in ED_GDD0
2+1 == 3 ?  BFB ... clm2.r. file is differs in ED_GDD0
1+2 == 3 ?  BFB ... clm2.r. file is differs in ED_GDD0
3+1 == 4 ?  BFB ... clm2.r. file is differs in ED_GDD0

4+2 == 6 ? !BFB ts5 and ts6
5+1 == 6 ? !BFB ts6 is incorrect
3+3 == 6 ? BFB ts 4, 5 & 6 all check out
4+1 == 5 ? !BFB ts5 incorrect in clm2.r. and clm2.h0

I don't know what's going on with ED_GDD0, it looks like this came in during some other tag and while it differs in the restart file comparisons, does not effect runs it seems.
I've put some barriers in the code for easier debugging at 180 p

The text was updated successfully, but these errors were encountered:

bandre-ucar · 2016-02-23T17:05:49Z

Note from Bill Sacks 2016-02-23

Hi Ben & Rosie,

I noticed a bug in ed code in clm_driver on the trunk... not sure if this has been fixed on the ed branch:

   if (use_ed) then
      call ed_phenology_inst%accumulateAndExtract(bounds_proc, &
           temperature_inst%t_ref2m_patch(bounds_proc%begp:bounds_proc%endp), &
           patch%gridcell(bounds_proc%begp:bounds_proc%endp), &
           grc%latdeg(bounds_proc%begg:bounds_proc%endg), &
           mon, day, sec)
   endif

But mon, day and sec are never set.

Rosie commented:

Oh. That's interesting, & might explain why Stef kept seeing strange things with GDD0 (the phenology growing degree days 'counter').

Corrects namelist build bugs associated with use_ed_spit_fire, use_vertsoilc, use_century_decomp, use_lch4 and use_nitrif_denitrif. It also ensure no-megan with ED compsets. It also adds an edNoFire testmod which turns off ed_spit_fire for regression tests. Note*: As part of #13, we intend to change main regression tests to include use_century_decomp = .true. and use_vertsoilc = .true.. To enable baselines, these two options were set to false in components/clm/cimetest/testmods_dirs/clm/edTest/user_nl_clm, and should be changed to true in the next commit. Regression tests with true passed on lawrencium. Fixes: #16 and #13 User interface changes?: changes default name list, but underlying UI changes, user protocols do not change Code review: Ben Andre, Gautam Bisht, Erik Kluzek, discussions with Charles Koven, Chonggang Xu and Rosie Fisher Unit tests: test_build_namelist - pass, new tests added. Test suite: 'ed' on lrc and yellowstone intel, pgi, gnu. Test baseline: c3a1f92 Test namelist changes: added edNoFire where use_ed_spit_fire = .false. Test answer changes: none Test summary: no baselines for new edNoFire tests. f09 and f19 tests fail restart as expected (#14)

Adds the machine environment files for the wolf and conejo machines at LANL. Fixes: none, feature addition only User interface changes?: yes, "wolf" and "conejo" machines are now available. Changes were designed with minimal to no impact on other machine setups. Code review: Chonggang Xu, Ryan Knox, bandre Test suite: ed : lawrencium_intel, conejo_intel, yellowstone_{intel,pgi,gnu} Test baseline: lrc: a066165; conejo: none; yellowstone: 8740a1a Test namelist changes: None Test answer changes: None Test summary: All test functionality and restarts pass except f09 and f19 restarts (github issue #14)

Merge remote-tracking branch 'ryan/rgknox-cime-eddimachine' Summary: This PR contains the addition of machine configuration files for eddi, a GNU/Linux system. Eddi does not have a batch system, no batch directives were applied. Eddi is a workstation and is only intended for single site and small grids. Eddi is used by a limited number of researchers on the NGEET project who understand its usage and limited support. Fixes: None User interface changes?: new MACH=eddi configuration case Code review: knox Test suite: ed - lawrencium-lr3-intel; ed - yellowstone intel, pgi, gnu Test baseline: 534d152 Test namelist changes: None Test answer changes: None Test summary: All lrc-intel expected PASS were passes. yellowstone passed except for expected failures in #14, #43.

Merge branch 'rgknox-gnuissues-lawrencium' The set of commits in this pull request is designed to aid in the checking of uninitialized variables in the model. The changes have identified a few instances where variables were uninitialized (yet not consequential to model predictions), yet nonetheless prevented compiler initialization checks from completing. As well, some debug flags that aid in checking uninitialized variables were added to lawrencium and eddi. Fixes: none, yet was motivated by #43, which was believed to have something to do with variable initialization differences between two compilers. User interface changes?: No Code review: rgknox Test suite: ed - lawrencium-lr3; ed - yellowstone intel, gnu, pgi Test baseline: 763a722 Test namelist changes: none Test answer changes: B4B Test summary: ok - all tests pass except: expected fails fail for f09 and f19, #14; gnu baseline issue, #43.

@bandre-ucar

Merge remote-tracking branch 'ryan/rgknox/phenology/gdd0-accum' Summary: Bill Sacks identified that the month, day and second arguments are uninitialized in clm_driv when it calls ED's accumulateAndExtract subroutine. See issue #23 for discussion. This PR issues a call to retrieve the current time prior to calling the accumulateAndExtract, and corrects inconsistent ordering of the time variables in calling function and called function. Fixes: #23 User interface changes?: No Code review: Ryan Knox, Bill Sacks (identified original issue), discussion with @bandre-ucar , @rosiealice and @xuchongang Test suite: ed - yellowstone intel, pgi, gnu Test baseline: 89b8709 Test namelist changes: None Test answer changes: bit for bit Test summary: pass except for f09 and f19 restarts, #14. Interestingly, no failures associated with #43.

@ckoven

Merge remote-tracking branch 'koven/ckoven/fastfluxoutput' Changes history diagnostics to (1) report carbon productivity fluxes on the half-hourly rather than daily timestep, (2) correct a subgrid averaging error for carbon fluxes and states, and (3) add multiple new diagnostics, such as autotrophic respiration components and patch and cohort numbers. Fixes: #47 and #52 User interface changes?: Yes: some new variables(AR, GROWTH_RESP, MAINT_RESP, ED_NPATCHES, ED_NCOHORTS), and one renamed variable (NPP_hifreq -> NPP_COLUMN) Code review: @ckoven, @rgknox, changes made after discussions with @rosiealice Test suite: ed - Lawrencium-LR3, intel; yellowstone intel, pgi, gnu Test baseline: None Test namelist changes: None Test answer changes: climate changing Test summary: Pass except for expected restart failures at f09 and f19, #14

@ckoven

Merge remote-tracking branch 'knox/rgknox-interf-surfrad-clean' This commit introduces several new features: 1) first pass at the clm_fates API (class, where munging between CLM and FATES occurs). 2) first pass at the fates API (class, where fates presents to the clm_fates API is publically accessible bits, which includes the head of its state structure sites(:) and boundary condition vectors, although the latter is not yet coded) 3) some starter work on removing ED sun/shade calculations from CLM code. Although, this still needs more work because some boundary conditions for this procedure should now be defined in the fates(nc)%fatesbc public and populated in the clm_fates munging API, yet this functionality is not yet introduced yet. There are sill many only partially introduced features. For instance: 1) the fates API is still being accessed by CLM code, and not only through the clm_fates API. 2) the fates API still has access to CLM types 3) the fates(nc)%sites(:) vector is still allocated using the bounds_clump type, and it propogates well into the FATES/ED code. Addresses: #40, Replaces: #45 User interface changes?: No Code review: self, @ckoven, @bishtgautam, @bandre-ucar , @rosiealice, some earlier discussion with @billsacks Test suite: ed - lawrencium-lr3, intel; yellowstone, intel, gnu, pgi Test baseline: f881721 Test namelist changes: none Test answer changes: bit for bit Test summary: pass except for f09 and f19 restarts, #14, and answer change in gnu, #43.

Merge PR #58, branch 'ckoven/fusebug' The model was exhibiting instability when trying to run for global (4x5 resolution) long-term runs. some of these instabilities were coming from old patches that had extremely small areas. This was a fairly uncommon occurrence in single-point code and was not being frequently encountered, but when it was encountered, there were a series of issues that could crash the code. One issue is that it was trying to deallocate something twice. Another issue is that when it merged patches, it would still allow a patch to exist even when its size was below the minimum threshold if it was the oldest patch on the column; that is now changes so that if a tiny patch is the oldest it merges it into the younger neighbor, whereas if it is neither the oldest nor the youngest patch, it fuses it into the older neighbor. It will still allow a patch to be smaller than the minimum threshold if it is the youngest patch, since that is a situation that can and may occur transiently. In that case, however, it now bypasses a section of the canopy construction code for the tiny youngest patch that was particularly vulnerable to minimum patch size numerical issues (and, worse yet, responded by going into an infinite loop). Fixes: #54 User interface changes?: No Code review: ckoven, rgknox Test suite: ed, lawrencium-lr2 intel; yellowstone intel, gnu, pgi Test baseline: 0471ef9 Test namelist changes: none Test answer changes: should be climate changing, but only after many years Test summary: passes except for expected failures: f19 and f09 restarts #14, and gnu restarts #43.

Merge branch 'rgknox-eddriver-clean' ed_driver performed a variety of operations. Along with calling the main ED dynamics routines, it also performed timing calculations and the mapping and prepping of FATES/ED state variables into CLM compliant formats for boundary conditions and IO (vi clm_ed_inst%ed_clm_link). These commits migrate the latter functions to live in the clm_fates interface, where CLM calls are appropriate and the mapping functions and calls are supposed to live. changes were also added to the udata% timing structure, which was updated in ed_driver. Two notable changes are the migration of the n_sub global to be included with the udata%, and removing udata%cohort_index from regulating the cohort_fusion process. We have found that udata%cohort_index is not threadsafe (along with the rest of that structure), and an alternative means using .not.associated(currentcohort, nextcohort) was implemented instead to avoid self fusion. Note: These changes were also built off of PR #59, if those changes aren't up to snuff, this branch needs to update first. Fixes: n/a User interface changes?: No Code review: rgknox Test suite: ed - lawrencium-lr2 intel; yellowstone gnu, intel, pgi Test baseline: 1aaba89 Test namelist changes: none Test answer changes: b4b Test summary: pass except for expected fails - #14 f19 f09 restart, and #43 gnu f10 restart

Merge branch 'rgknox-init-cleanups' Three things were done in this branch. 1) Several functions previously called directly from clm_instInit() were moved to within the clm_fates%init() wrapper, including fates(nc)%init(). 2) the fates(nc) class is also now allocated within the clm_fates%init(). 3) The routine clm_fates%fates2hlm_inst was renamed to clm_fates%fates2hlm. Fixes: #62 User interface changes?: no Code review: rgknox Test suite: ed - lawrencium-lr3 intel; yellowstone gnu, intel, pgi Test baseline: 94118a5 Test namelist changes: none Test answer changes: b4b Test summary: expected fails #14, f09 and f19 restart, #43 yellowstone gnu f10 restarts.

@rgknox

Merge branch 'ckoven/phenology_internal' replace externally-dependant phenology into FATES internal code FATES phenology was going through the interface so that it could leverage CLM's time-accumulation infrastructure to track growing degree days. This was deemed overkill and so instead GDD are now tracked at the FATES site level and kept internal to FATES. Also changes the definition of GDD to be based on daily mean temperatures, which allows both the structural simplicity of calculating GDD at daily timestep, and gives the correct definition of GDD. However this requires it to be answer-changing with respect to the old code. Fixes: #63 User interface changes?: No Code review: ckoven, discussed changes with @rgknox and @rosiealice. Test suite: ed - lawrencium intel; yellowstone gnu, intel, pgi Test baseline: 1fc6811 - Should in principle be answer-changing. In practice tests were bit for bit with baseline. These changes probably do not have adequate test coverage. Test namelist changes: none Test answer changes: climate changing Test summary: expected failures #14 f09 and f19 restart. #43 f10 gnu restart on yellowstone.

@billsacks

Merge branch 'rgknox-fatescolumns' FATES sites are converted from alignment with gridcells, to alignment with columns in the host land model. Sites are still connected to the fates(nc)% structure, although syntactically they are now just called "sites(:)" instead of "ed_allsites_inst(:)". eg clm_fates%fates(nc)%sites Some other notable design constructs: 1) sites is allocated for each column on the natural vegetation land-unit. @billsacks and I have discussed other ways to filter this further, and have decided that allowing FATES sites to exist on all columns, even ones covered by ice or ones that have no weighting, while not ideal, is not a liability to getting correct results, and is not an immense performance hit. Future commits and issues should be brought up to strategize how to allocate and filter FATES sites for only active columns. 2) sites is not sparse, and is allocated from 1 to clm_fates%fates(nc)%nsites. As mentioned above, since it is anchored in the fates(nc)% structure, the vector space is separated by thread. 3) two mapping vectors have also been created: clm_fates%f2hmap(nc)%fcolumn(1:nsites) this vector returns the column index on each clump, associated with the FATES site index clm_fates%f2hmap(nc)%hsites(bounds_clump:begc:bounds_clump:endc) this vector returns the FATES site index associated with a given HLM column index. Zero's imply no FATES site, and this is sparse. This is almost always called from within a filtered loop on the HLM side, so there is no need to check if the column is non-zero, although there is a check in the code with a fail call. 4) restarts and history writes appear to be working correctly. Note that FATES uses the cohort level memory space and the column level memory space in the HLM IO machinery, and not any patch level space. The cohort vector space that is allocated is max_number_of_patches_per_col * max_number_of_cohorts_per_patch. This vector is incredibly sparse, and is also something that needs to be addressed, still. There are here: 1) currently the cohort IO vector space is allocated for all columns, and not just columns on naturally vegetated land units. 2) there are various variables that use the cohort IO vector space to hold their information, which is forcing us to use a very large max number of cohorts. 5) We are still using a variable that maps the FATES patch to its associated HLM patch index: currentpatch%clm_pno. This seems inconsistent with the library design. I wanted to remove it in this pass, but held off. Alternatively, there were indeed several variables at the site level that pointed towards the CLM/ALM gridcell, these have been removed. 6) Cosmetic improvements to the code and updated annotation is still needed in various places. Fixes: #66, #30 User interface changes?: Yes - people will need to update their analysis codes to use the new IO variables. For history output, the only variables that were changed were all of the "scpf" variables, that had been indexed by gridcell, they are no column variables. For restarts, we now have ed_io_numPatchesPerCol (instead of ed_io_numPatchesPerGCell, or something like that), and the variable that indicated whether or not the column has a patch has been removed, as that information is redundant with ed_io_numPatchesPerCol. Code review: Discussion with @ckoven @rosiealice, @bandre-ucar, @billsacks and Mariana Vertenstein. Test suite: ed - lawrencium-lr3 intel; yellowstone gnu, intel, pgi Test baseline: none, output vectors have changed, regression tests should not work. however see in #66 that the RSC tool was used to evaluate science output, and results were identical in the 1x1_brazil. Test namelist changes: none Test answer changes: See baseline explanation Test summary: Pass, expected failures #14 f09 and f19 restart. #43 f10 gnu restart on yellowstone.

bandre-ucar · 2016-06-15T20:27:50Z

Note from Sean Swenson regarding #74 but probably also relevant for this general restart issue:

I think that the ed restart value has to do with the fact that the ed_clm_link 
is called during the restart from EDRestVectorMod.  This seems to occur 
prior to the running of the biogeophysics.  In a normal timestep, the biogeophysics 
occurs prior to the ED calls.  But during a restart, the ed_clm_link routine is 
called, so calculates various things based on what's on the restart, not on 
what the value is after  biogeophysics.  Not sure what the proper fix is though.

rosiealice · 2016-06-15T22:48:03Z

OK - I have something that I think fixes the snow issue. It involves moving
the elai_profile and esai_profile calculations to SurfaceRadiation, such
that they use the updated snow variables. This needs a new variable
(layer_height_profile) to track the height of the 'iv' layers to compare
against snow depth in surface radiation.

There is -lot- of code in surface radiation that I just made redundant, and
some that already was (the smooth_leaf_distribution=1 option that isn't
used, primarily). I kept that out of this commit though.

There are two updates. One for the main fix, and a second to clean up a
couple of minor issues.

Ben, can you let me know if this works for you?

cross fingers,
-Rosie

On 15 June 2016 at 14:27, Ben Andre [email protected] wrote:

Note from Sean Swenson regarding #74
#74 but probably also relevant
for this general restart issue:

I think that the ed restart value has to do with the fact that the ed_clm_link
is called during the restart from EDRestVectorMod. This seems to occur
prior to the running of the biogeophysics. In a normal timestep, the biogeophysics
occurs prior to the ED calls. But during a restart, the ed_clm_link routine is
called, so calculates various things based on what's on the restart, not on
what the value is after biogeophysics. Not sure what the proper fix is though.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#14 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AMWsQ5ll0332MfpG0fZOPL7GxuSNl7rvks5qMGBIgaJpZM4HYqkz
.

Dr Rosie A. Fisher

Terrestrial Sciences Section
Climate and Global Dynamics
National Center for Atmospheric Research
1850 Table Mesa Drive
Boulder, Colorado, 80305
USA.
+1 303-497-1706

http://www.cgd.ucar.edu/staff/rfisher/

rosiealice · 2016-06-15T22:53:52Z

there's also a good change my changes will mess up the history outputting
of ELAI for the canopy. That should be an issue to fix somewhere.

On 15 June 2016 at 16:47, rosie fisher [email protected] wrote:

OK - I have something that I think fixes the snow issue. It involves
moving the elai_profile and esai_profile calculations to SurfaceRadiation,
such that they use the updated snow variables. This needs a new variable
(layer_height_profile) to track the height of the 'iv' layers to compare
against snow depth in surface radiation.

There is -lot- of code in surface radiation that I just made redundant,
and some that already was (the smooth_leaf_distribution=1 option that isn't
used, primarily). I kept that out of this commit though.

There are two updates. One for the main fix, and a second to clean up a
couple of minor issues.

Ben, can you let me know if this works for you?

cross fingers,
-Rosie

On 15 June 2016 at 14:27, Ben Andre [email protected] wrote:

Note from Sean Swenson regarding #74
#74 but probably also relevant
for this general restart issue:

I think that the ed restart value has to do with the fact that the ed_clm_link
is called during the restart from EDRestVectorMod. This seems to occur
prior to the running of the biogeophysics. In a normal timestep, the biogeophysics
occurs prior to the ED calls. But during a restart, the ed_clm_link routine is
called, so calculates various things based on what's on the restart, not on
what the value is after biogeophysics. Not sure what the proper fix is though.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#14 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AMWsQ5ll0332MfpG0fZOPL7GxuSNl7rvks5qMGBIgaJpZM4HYqkz
.

Dr Rosie A. Fisher

Terrestrial Sciences Section
Climate and Global Dynamics
National Center for Atmospheric Research
1850 Table Mesa Drive
Boulder, Colorado, 80305
USA.
+1 303-497-1706

http://www.cgd.ucar.edu/staff/rfisher/

Dr Rosie A. Fisher

Terrestrial Sciences Section
Climate and Global Dynamics
National Center for Atmospheric Research
1850 Table Mesa Drive
Boulder, Colorado, 80305
USA.
+1 303-497-1706

http://www.cgd.ucar.edu/staff/rfisher/

Merge branch 'rgknox-bbfix-careafix-icefiltfix' This PR bundles 3 fixes that address: #73, #69, #44 The fix to 73 is the only one that would be expected to have b4b regressions. I performed baseline simulation comparsisons between f1a14d6 and 18613d1, and tests confirmed b4b on all expected passes. One extra step was necessary, in that I needed to update the parameter file values of BB_slope to match what was previously hard coded (a value of 9). The current value in the default parameter file is 8, we can certaintly change this going forward. Not changing this now. The other two fixes, #69 and #44 are not supposed to generate b4b results, and they don't. 1x1 brazil simulations were also run on eddi to make sure that the non-b4b changes continue to generate very similar projections of forest composition and structure, as well as flux variables. They did. Fixes: #73 Fixes: #69 Fixes: #44 User interface changes?: no Code review: rgknox Test suite: ed - lawrencium-lr3 intel, eddi (PC) gnu (visualizations); ed - yellowstone gnu, intel, pgi Test baseline: 18613d1 Test namelist changes: none Test answer changes: yes, see above Test summary: pass except for #14 known failures in f09 and f19 restart, and gnu f10 restart #43.

@bishtgautam

Merge branch 'rgknox-bcs' New structures were created in FatesMod: bc_in(s) and bc_out(s), these structures hold boundary conditions, where for FATES input they are filled during the interface, and for FATES output they write to their corresponding host's variable. The new structures were prototyped with the calculation of the sunshade fractions and btran. Wrappers were created in the interface to call edbtran and the sunshade fractions. Some of the boundary condition arrays needed dimension info, some of these dimensions are dictated by the host, and so a scheme was implemented to pass these "control parameters" from the host to FATES as well. (see line ~230 of components/clm/src/ED/main/FatesInterfaceMod.F90, and line 211 of components/clm/src/utils/clmfates_interfaceMod.F90) Fixes: none User interface changes?: no Code review: code discussion and review with @bishtgautam and conferencing with @rosiealice @ckoven @bandre-ucar @mvertens and D Lawrence Test suite: ed - lawrencium-lr3 intel, yellowstone intel, gnu, pgi Test baseline: answer changing, (hydrologic calculations of suction were moved from FATES to the host) Test namelist changes: none Test answer changes: answer changing Test summary: PASS except for known failures: #14, f09 and f19 restarts

@ckoven

Merge branch 'ekluzek/lnd/clm-bldnmledbgcopt' The -ed_mode option is removed from CLM build-namelist and "ed" is added as a valid option to the "-bgc" option. It doesn't fill the cnfire namelists, but does the ndep namelist, because it looks like that is required in soilbiogeochem. It also turns on the default soilbiogeochem settings (methane, vertical-soil-carbon, nitrif-denitrf [I understood that was required for methane]). The new nitrogen things are default off for this version -- so that will need to be dealt with in a future clm update (use_ed will need to be passed down as an attribute to check for: use_fun, use_flexibleCN, use_luna). Only the ED namelists were changed, other namelists should be identical. I think probably the setup_cmdl_ed_mode should be merged into setup_cmdl_bgc. But, I left it separate for now. Closes: #82 User interface changes?:yes ed is an option to "-bgc" in CLM buildnamelist CLM_BLDNML_OPTS ed_mode option is removed. Code review: @ckoven, @ekluzek, @bandre-ucar Test suite: (ekluzek) build-namelist_test.pl (in components/clm/bld/unit_testers) Test baseline: ed-clm/master Test namelist changes: -bgc ed, option now by default turns on soil biogeochem settings Test answer changes: climate-changing for ED because of different namelist settings Test summary: all PASS Test suite: ed - yellowstone intel, gnu, pgi clm-short - yellowstone intel, gnu, pgi Test baseline: none Test namelist changes: new defaults for ed as bgc mode Test summary: all tests pass, except for expected fails: #14 f09 and f19 and #81 ERP 15x2 restart with pe layout change in clm-short.

@bandre-ucar

Merge remote-tracking branch 'ryan/rgknox/phenology/gdd0-accum' Summary: Bill Sacks identified that the month, day and second arguments are uninitialized in clm_driv when it calls ED's accumulateAndExtract subroutine. See issue #23 for discussion. This PR issues a call to retrieve the current time prior to calling the accumulateAndExtract, and corrects inconsistent ordering of the time variables in calling function and called function. Fixes: #23 User interface changes?: No Code review: Ryan Knox, Bill Sacks (identified original issue), discussion with @bandre-ucar , @rosiealice and @xuchongang Test suite: ed - yellowstone intel, pgi, gnu Test baseline: 89b8709 Test namelist changes: None Test answer changes: bit for bit Test summary: pass except for f09 and f19 restarts, #14. Interestingly, no failures associated with #43.

Merge branch 'rgknox-eddriver-clean' ed_driver performed a variety of operations. Along with calling the main ED dynamics routines, it also performed timing calculations and the mapping and prepping of FATES/ED state variables into CLM compliant formats for boundary conditions and IO (vi clm_ed_inst%ed_clm_link). These commits migrate the latter functions to live in the clm_fates interface, where CLM calls are appropriate and the mapping functions and calls are supposed to live. changes were also added to the udata% timing structure, which was updated in ed_driver. Two notable changes are the migration of the n_sub global to be included with the udata%, and removing udata%cohort_index from regulating the cohort_fusion process. We have found that udata%cohort_index is not threadsafe (along with the rest of that structure), and an alternative means using .not.associated(currentcohort, nextcohort) was implemented instead to avoid self fusion. Note: These changes were also built off of PR #59, if those changes aren't up to snuff, this branch needs to update first. Fixes: n/a User interface changes?: No Code review: rgknox Test suite: ed - lawrencium-lr2 intel; yellowstone gnu, intel, pgi Test baseline: 1aaba89 Test namelist changes: none Test answer changes: b4b Test summary: pass except for expected fails - #14 f19 f09 restart, and #43 gnu f10 restart

@bishtgautam

Merge branch 'rgknox-bcs' New structures were created in FatesMod: bc_in(s) and bc_out(s), these structures hold boundary conditions, where for FATES input they are filled during the interface, and for FATES output they write to their corresponding host's variable. The new structures were prototyped with the calculation of the sunshade fractions and btran. Wrappers were created in the interface to call edbtran and the sunshade fractions. Some of the boundary condition arrays needed dimension info, some of these dimensions are dictated by the host, and so a scheme was implemented to pass these "control parameters" from the host to FATES as well. (see line ~230 of components/clm/src/ED/main/FatesInterfaceMod.F90, and line 211 of components/clm/src/utils/clmfates_interfaceMod.F90) Fixes: none User interface changes?: no Code review: code discussion and review with @bishtgautam and conferencing with @rosiealice @ckoven @bandre-ucar @mvertens and D Lawrence Test suite: ed - lawrencium-lr3 intel, yellowstone intel, gnu, pgi Test baseline: answer changing, (hydrologic calculations of suction were moved from FATES to the host) Test namelist changes: none Test answer changes: answer changing Test summary: PASS except for known failures: #14, f09 and f19 restarts

@bandre-ucar

Merge remote-tracking branch 'ryan/rgknox/phenology/gdd0-accum' Summary: Bill Sacks identified that the month, day and second arguments are uninitialized in clm_driv when it calls ED's accumulateAndExtract subroutine. See issue #23 for discussion. This PR issues a call to retrieve the current time prior to calling the accumulateAndExtract, and corrects inconsistent ordering of the time variables in calling function and called function. Fixes: #23 User interface changes?: No Code review: Ryan Knox, Bill Sacks (identified original issue), discussion with @bandre-ucar , @rosiealice and @xuchongang Test suite: ed - yellowstone intel, pgi, gnu Test baseline: 89b8709 Test namelist changes: None Test answer changes: bit for bit Test summary: pass except for f09 and f19 restarts, #14. Interestingly, no failures associated with #43.

Merge branch 'rgknox-eddriver-clean' ed_driver performed a variety of operations. Along with calling the main ED dynamics routines, it also performed timing calculations and the mapping and prepping of FATES/ED state variables into CLM compliant formats for boundary conditions and IO (vi clm_ed_inst%ed_clm_link). These commits migrate the latter functions to live in the clm_fates interface, where CLM calls are appropriate and the mapping functions and calls are supposed to live. changes were also added to the udata% timing structure, which was updated in ed_driver. Two notable changes are the migration of the n_sub global to be included with the udata%, and removing udata%cohort_index from regulating the cohort_fusion process. We have found that udata%cohort_index is not threadsafe (along with the rest of that structure), and an alternative means using .not.associated(currentcohort, nextcohort) was implemented instead to avoid self fusion. Note: These changes were also built off of PR #59, if those changes aren't up to snuff, this branch needs to update first. Fixes: n/a User interface changes?: No Code review: rgknox Test suite: ed - lawrencium-lr2 intel; yellowstone gnu, intel, pgi Test baseline: 1aaba89 Test namelist changes: none Test answer changes: b4b Test summary: pass except for expected fails - #14 f19 f09 restart, and #43 gnu f10 restart

@bishtgautam

Merge branch 'rgknox-bcs' New structures were created in FatesMod: bc_in(s) and bc_out(s), these structures hold boundary conditions, where for FATES input they are filled during the interface, and for FATES output they write to their corresponding host's variable. The new structures were prototyped with the calculation of the sunshade fractions and btran. Wrappers were created in the interface to call edbtran and the sunshade fractions. Some of the boundary condition arrays needed dimension info, some of these dimensions are dictated by the host, and so a scheme was implemented to pass these "control parameters" from the host to FATES as well. (see line ~230 of components/clm/src/ED/main/FatesInterfaceMod.F90, and line 211 of components/clm/src/utils/clmfates_interfaceMod.F90) Fixes: none User interface changes?: no Code review: code discussion and review with @bishtgautam and conferencing with @rosiealice @ckoven @bandre-ucar @mvertens and D Lawrence Test suite: ed - lawrencium-lr3 intel, yellowstone intel, gnu, pgi Test baseline: answer changing, (hydrologic calculations of suction were moved from FATES to the host) Test namelist changes: none Test answer changes: answer changing Test summary: PASS except for known failures: #14, f09 and f19 restarts

Update maintenance respiration variable names

bandre-ucar added the bug - unknown label Feb 12, 2016

bandre-ucar mentioned this issue Feb 25, 2016

unset date values when accumulating ed variables #23

Closed

This was referenced Apr 1, 2016

incorrect direct call to setFilters #44

Closed

Bugfix: clm history files always instantaneous with ED on #41

Merged

bandre-ucar mentioned this issue Apr 8, 2016

merge to clm > r17x-ish and stable cime tag #46

Closed

bandre-ucar mentioned this issue Jun 10, 2016

interaction with snow bugfix causes restart failures with ED #74

Closed

bandre-ucar mentioned this issue Aug 3, 2016

ed r182 reintegration #90

Merged

rgknox closed this as completed in #90 Aug 4, 2016

ekluzek mentioned this issue Jul 11, 2017

Running clm-fates with 1PFT at f45 and larger PE layout fails in driver... #242

Closed

jkshuman mentioned this issue Sep 11, 2017

Hanging case #267

Closed

jkshuman mentioned this issue May 5, 2018

Balance Check failure in fire runs #378

Closed

pollybuotte mentioned this issue Sep 17, 2019

Passive Crown Fire development #572

Closed

4 tasks

glemieux pushed a commit that referenced this issue Mar 7, 2023

Merge pull request #14 from glemieux/respiration-agnostic_vars

fd9f5f6

Update maintenance respiration variable names

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ed-clm fails bit for bit restart for f19 and f09 grids #14

ed-clm fails bit for bit restart for f19 and f09 grids #14

bandre-ucar commented Feb 12, 2016

bandre-ucar commented Feb 23, 2016

bandre-ucar commented Jun 15, 2016

rosiealice commented Jun 15, 2016

rosiealice commented Jun 15, 2016

ed-clm fails bit for bit restart for f19 and f09 grids #14

ed-clm fails bit for bit restart for f19 and f09 grids #14

Comments

bandre-ucar commented Feb 12, 2016

bandre-ucar commented Feb 23, 2016

bandre-ucar commented Jun 15, 2016

rosiealice commented Jun 15, 2016

rosiealice commented Jun 15, 2016