Numerical Overflows in carbon accounting #168

rgknox · 2017-01-04T21:26:01Z

Summary of Issue:

There is a check on the total FATES carbon balance that occurs once each day when vegetation dynamics is called. This happens in ChecksBalancesMod.F90: FATES_BGC_Carbon_BalanceCheck().

This routine updates sites(s)%cbal_err_fates, which is the variable that blows up. The bad value comes from sites(s)%fates_to_bgc_this_ts, which in turn picks it up from currentPatch%root_litter_out(ft). By "blowing up" I mean that the variable suddenly takes a value on the order of E+248.

This only occurs the first time a new patch is created.

currentPatch%root_litter_out() is zeroed and set in cwd_out(), via:

    do ft = 1,numpft_ed
       currentPatch%leaf_litter_out(ft) = max(0.0_r8,currentPatch%leaf_litter(ft)* SF_val_max_decomp(dg_sf) * &
            currentPatch%fragmentation_scaler )
       currentPatch%root_litter_out(ft) = max(0.0_r8,currentPatch%root_litter(ft)* SF_val_max_decomp(dg_sf) * &
            currentPatch%fragmentation_scaler )
       if ( currentPatch%leaf_litter_out(ft)<0.0_r8.or.currentPatch%root_litter_out(ft)<0.0_r8)then
          write(iulog,*) 'root or leaf out is negative?',SF_val_max_decomp(dg_sf),currentPatch%fragmentation_scaler
       endif
    enddo

The order of operations seems fine. cwd_out is called from vegetation_dynamics, and the balance checks are called from EDBGCDynSummary(), which is called immediately after vegetation_dynamics in clm_drv().

ed_ecosystem_dynamics -> ed_integrate_state_variables -> non_canopy_derivs() -> cwd_out()
EDBGCDynSummary() -> wrap_bgc_summary() -> SummarizeNetFluxes()
EDBGCDynSummary() -> wrap_bgc_summary() -> FATES_BGC_Carbon_BalanceCheck()



ckoven · 2017-01-04T21:52:54Z

Does it work to just zero root_litter_out at the new-patch creation step here?
https://github.com/NGEET/ed-clm/blob/master/components/clm/src/ED/biogeochem/EDPatchDynamicsMod.F90#L868
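
A minimal sketch of what zeroing at patch creation could look like, assuming a stripped-down patch type. The names patch_sketch_type, numpft_sketch and zero_litter_out are hypothetical and are not the FATES source; only the two field names come from this thread.

```fortran
! Sketch only: zero the litter out-fluxes as soon as a patch is created,
! so that a balance check running before cwd_out() cannot read
! uninitialized memory. All type and routine names are hypothetical.
program zero_new_patch_sketch
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)
  integer, parameter :: numpft_sketch = 2   ! assumed number of PFTs

  type :: patch_sketch_type
     real(r8) :: leaf_litter_out(numpft_sketch)
     real(r8) :: root_litter_out(numpft_sketch)
  end type patch_sketch_type

  type(patch_sketch_type) :: new_patch

  ! Call this at the point where the new patch is set up.
  call zero_litter_out(new_patch)
  print *, 'root_litter_out after creation:', new_patch%root_litter_out

contains

  subroutine zero_litter_out(p)
    type(patch_sketch_type), intent(inout) :: p
    p%leaf_litter_out(:) = 0.0_r8
    p%root_litter_out(:) = 0.0_r8
  end subroutine zero_litter_out

end program zero_new_patch_sketch
```

This would only mask the symptom if the bad value is produced upstream rather than left over from allocation.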

rgknox · 2017-01-05T06:32:28Z

That seems like a good idea; I will look into that too.
I've also tracked down some NaNs being generated in patch%root_litter during cohort termination. It's not yet clear why the NaNs are occurring, as the components used to calculate the value all seem very reasonable at the time of failure.

At the end of EDCohortDynamicsMod.F90:terminate_cohorts(), I added a NaN catcher for root_litter:

             currentPatch%root_litter(currentCohort%pft) = currentPatch%root_litter(currentCohort%pft) + currentCohort%n* &
                  (currentCohort%br+currentCohort%bstore)/currentPatch%area 

             if(currentPatch%root_litter(currentCohort%pft).ne.currentPatch%root_litter(currentCohort%pft)) then
                print*,'ROOT LITTER PROB',currentCohort%n,currentCohort%br,currentCohort%bstore,currentPatch%area 
                stop
             end if

And it does catch a NaN, but the output is not unusual:

ROOT LITTER PROB   1.5980184922696298E-006   5.3619791069715643E-002  0.10930903672984571       0.36255274778408553
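
One possible reading of this output, sketched below under assumptions: the check tests the accumulated value, while the print statement shows only the operands of the increment. If the accumulator already holds a NaN before the addition, the check trips even though every printed number is finite. The sketch uses ieee_is_nan from the intrinsic ieee_arithmetic module instead of the x .ne. x comparison, since some compilers may optimize the self-comparison away under aggressive floating-point flags. The program and variable names are hypothetical, and the numbers are rounded from the output above.

```fortran
! Sketch only: a NaN that is already in the accumulator survives the
! addition of a perfectly finite increment.
program nan_check_sketch
  use, intrinsic :: ieee_arithmetic, only : ieee_is_nan, ieee_value, &
                                            ieee_quiet_nan
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)
  real(r8) :: root_litter   ! stands in for currentPatch%root_litter(pft)
  real(r8) :: increment     ! stands in for n*(br+bstore)/area

  ! Assume the accumulator was contaminated earlier in the time step.
  root_litter = ieee_value(root_litter, ieee_quiet_nan)

  ! Finite operands, rounded from the printed values.
  increment = 1.598e-6_r8 * (5.362e-2_r8 + 0.1093_r8) / 0.3626_r8

  root_litter = root_litter + increment

  if (ieee_is_nan(increment)) then
     print *, 'increment is NaN'
  else
     print *, 'increment is finite:', increment
  end if

  if (ieee_is_nan(root_litter)) then
     print *, 'accumulator is NaN after the update'
  end if
end program nan_check_sketch
```

Printing the accumulator before the update, as well as the operands, would distinguish the two cases.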

UPDATE: This was a red herring. I tracked the NaN back to earlier in the code, to the variable currentPatch%canopy_mortality_root_litter(p) in subroutine mortality_litter_fluxes().

rgknox · 2017-01-06T21:29:14Z

The problem was traced all the way back to cohort%npp_acc. This value was not being copied correctly when cohorts are copied, an operation used in various places, including when spawning newly disturbed patches.
PR with fix coming soon.
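
To illustrate the failure mode (this is not the actual fix), here is a minimal sketch of a field-by-field copy routine, assuming a stripped-down cohort type. The names cohort_sketch_type and copy_cohort_fields are hypothetical; only the field names n, br, bstore and npp_acc appear in this thread. If the npp_acc line were missing from such a routine, the copy would carry an undefined value that could later surface as a NaN or an enormous number.

```fortran
! Sketch only: every field of the donor must be assigned in the copy,
! otherwise intent(out) leaves the missing field undefined.
program copy_cohort_sketch
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)

  type :: cohort_sketch_type
     real(r8) :: n
     real(r8) :: br
     real(r8) :: bstore
     real(r8) :: npp_acc
  end type cohort_sketch_type

  type(cohort_sketch_type) :: donor, clone

  donor = cohort_sketch_type(n=1.0_r8, br=0.05_r8, bstore=0.1_r8, &
                             npp_acc=0.02_r8)

  call copy_cohort_fields(donor, clone)

  ! Simple guard: fail loudly if the accumulator did not come across.
  if (clone%npp_acc /= donor%npp_acc) stop 'npp_acc was not copied'
  print *, 'copied npp_acc =', clone%npp_acc

contains

  subroutine copy_cohort_fields(src, dst)
    type(cohort_sketch_type), intent(in)  :: src
    type(cohort_sketch_type), intent(out) :: dst
    dst%n       = src%n
    dst%br      = src%br
    dst%bstore  = src%bstore
    dst%npp_acc = src%npp_acc   ! omitting a line like this is the bug class
  end subroutine copy_cohort_fields

end program copy_cohort_sketch
```

For a plain type like this one, intrinsic assignment (clone = donor) copies every component and avoids the omission entirely; a real cohort type with pointer components may need the explicit routine.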

@jenniferholm

Merge branch 'rgknox-leafnpp-diag-fix'

This pull request mostly deals with some bug fixes to carbon accounting. @jenniferholm noticed that the diagnostics for leaf_npp were periodically showing negative values. Through #164 we identified that this occurred because daily carbon balances during the allocation sequences were sometimes negative due to high respiration and low gpp. The model was correctly using storage carbon to "pay" the negative daily carbon balance and allow maintenance respiration to occur; it was not, however, correctly diagnosing this flow. The accounting of this process is a little complicated because storage can be used to pay for maintenance respiration as well as maintenance turnover demand.

During the process of verifying that carbon accounting errors were low, I triggered spurious values of output variables that triggered netcdf write errors. This led to the identification that cohort%npp_accum was not being properly copied during copying of cohorts during patch fission.

A fix was needed to lawrencium machine files for its lr2 partition for serial runs. While these changes are unrelated to carbon accounting, they are trivial and simple, so I bundled them here.

Fixes: #164, #169, #168 and possibly #154
User interface changes?: no
Code review: requesting @jenniferholm and @serbinsh for evaluation in their science algorithms.

Testing:

rgknox:
Test suite: lawrencium lr3 (baseline) and lawrencium-lr2 (non-baseline) edTest, Rapid Science Check tool (single site multi-decadal analysis)
Test baseline: 5c5928f
Test namelist changes: none
Test answer changes:
Test summary: all PASS

andre:
Test suite: ed - yellowstone gnu, intel, pgi; hobart nag
Test baseline: 30f84d7
Test namelist changes: none
Test answer changes: bit for bit
Test summary: all tests pass

Test suite: clm_short - yellowstone gnu, intel, pgi
Test baseline: clm4_5_12_r195
Test namelist changes: none
Test answer changes: bit for bit
Test summary: all tests pass


rgknox mentioned this issue Jan 6, 2017

Fixes to carbon accounting and lr2 machine files #174

Merged

rgknox closed this as completed Mar 1, 2017
