Simple FatesHydro tests fail due to water balance error very quickly #508

Closed · ekluzek opened this issue Mar 25, 2019 · 25 comments
@ekluzek
Collaborator

ekluzek commented Mar 25, 2019

A simple test with FatesHydro fails with a balance check error in FatesPlantHydro

SMS_Ld5.f45_f45_mg37.I2000Clm50FatesCruGs.cheyenne_intel.clm-FatesHydro

I tried increasing the value of thresh_break to ridiculously large values and it still fails almost immediately. This is with sci.1.23.0_api.7.1.0 and fates_next_api clm5.0.dev008-92-g1090150.

SMS_D_Mmpi-serial_Ld3.1x1_brazil.I2000Clm50FatesCruGs.cheyenne_intel.clm-FatesHydro does pass.

@ckoven
Contributor

ckoven commented Mar 25, 2019

@ekluzek where in FatesPlantHydro is it failing?

@ekluzek
Collaborator Author

ekluzek commented Mar 25, 2019

On this check...

    else if (abs(we_local) > thresh_break) then
       write(fates_log(),*)'EDPlantHydraulics water balance error exceeds threshold of = ', thresh_break
       call endrun(msg=errMsg(sourcefile, __LINE__))

A test with 1x1_brazil works, but the f45 case above fails.
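
For reference, a minimal stand-alone sketch of this check (hypothetical values; we_local here follows the definition we_tot_inner*cCohort%n/AREA noted further down in the thread, with the default thresh_break of 1.e-4_r8):

    program balance_check_sketch
       implicit none
       integer, parameter :: r8 = selected_real_kind(12)
       real(r8), parameter :: thresh_break = 1.e-4_r8  ! kg/m2, default threshold
       real(r8), parameter :: AREA = 10000.0_r8        ! notional patch area, m2
       real(r8) :: we_tot_inner  ! cohort-level balance error, kg per plant
       real(r8) :: ccohort_n     ! number of plants in the cohort
       real(r8) :: we_local      ! balance error per unit ground area, kg/m2

       we_tot_inner = 2.0e-3_r8
       ccohort_n    = 600.0_r8
       we_local     = we_tot_inner * ccohort_n / AREA

       if (abs(we_local) > thresh_break) then
          print *, 'water balance error exceeds threshold: ', we_local
          stop 1
       end if
    end program balance_check_sketch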

@ekluzek
Collaborator Author

ekluzek commented Mar 25, 2019

This may not be the same thing, but this test fails after six and a half months with generation of NaNs: SMS_Mmpi-serial_Lm9.1x1_brazil.I2000Clm50FatesCruGs.cheyenne_intel.clm-FatesHydro.

@pollybuotte

This may be a different problem, but I also have a single-point FATES-Hydro case that fails after about 6 months, as described in #504.

@rgknox
Contributor

rgknox commented Mar 25, 2019

One factor that might play a role here is the initialization of the water table.

In E3SM the initial condition is set to 70% saturation if fates_hydro is turned on:
https://github.com/E3SM-Project/E3SM/blob/master/components/clm/src/data_types/ColumnDataType.F90#L1480

Here is the corresponding line for CTSM:
https://github.com/ESCOMP/ctsm/blob/fates_next_api/src/biogeophys/WaterStateType.F90#L759

@ckoven
Contributor

ckoven commented Mar 25, 2019

@xuchongang @bchristo have either of you tried to run FATES-hydro on a 4x5 or other global grid yet? Is this something that we expect to work across all possible soil conditions (including frozen, desert-dry, etc.), or have you only worked on a subset of ecosystems?

@bchristo

bchristo commented Mar 25, 2019 via email

@ckoven
Contributor

ckoven commented Mar 25, 2019

@bchristo thanks. Are there any soil conditions where the current numerics in fates-hydro are known to fail, rather than just being still untested?

@bchristo

bchristo commented Mar 25, 2019 via email

@ckoven
Contributor

ckoven commented Mar 25, 2019

@bchristo thanks

@JunyanDing
Contributor

I think both this FATES-Hydro failure and the issue posted by Polly are caused by the same problem: the dying of trees. I have been trying to get FATES-Hydro running for California over the last few weeks and encountered both issues. Looking at the output, the seedlings were all dying from the beginning, and by the time the model aborted there were no trees left.

Looking at the code, we_local is calculated as we_local = we_tot_inner*cCohort%n/AREA

As for Polly's case, the error message says "floating divide by zero", and the corresponding code in FatesPlantHydraulicsMod.F90 is
r_out_shell(nshell) = (pi_const*l_aroot/(area*dz))**(-0.5_r8)

Notice that both lines have an area term in the denominator. The area is the area of tree cover. In this case, once the area approaches zero, no matter how large the threshold is set, we_local will always exceed it, because it becomes infinite.

I think these issues might not be bugs in the code, and may be fixed by finding the right parameter settings.
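
To illustrate the blow-up, here is a small hypothetical sketch of the r_out_shell expression above, showing it diverging (and finally hitting a divide by zero) as the root length in a layer goes to zero; all values are illustrative:

    program r_out_shell_sketch
       implicit none
       integer, parameter :: r8 = selected_real_kind(12)
       real(r8), parameter :: pi_const = 3.14159265358979_r8
       real(r8) :: l_aroot, area, dz, r_out_shell
       integer :: i

       area = 10000.0_r8   ! m2, notional area
       dz   = 0.1_r8       ! m, soil layer thickness
       l_aroot = 1.e3_r8   ! m of absorbing root in the layer
       do i = 1, 4
          r_out_shell = (pi_const * l_aroot / (area * dz))**(-0.5_r8)
          print *, 'l_aroot =', l_aroot, '  r_out_shell =', r_out_shell
          l_aroot = l_aroot / 1.e3_r8  ! roots dying back: length heads to zero
       end do
       ! at l_aroot = 0 the base is zero and 0**(-0.5) is a divide by zero
    end program r_out_shell_sketch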

@rosiealice
Contributor

Fwiw, I get the same thing as @ekluzek with 4x5 tests. I suspect they might be different issues, since the 4x5 fails almost immediately (need to check if it's the actual first time step or not) vs. failing after six months etc...

@rgknox
Contributor

rgknox commented Apr 9, 2019

Upgraded this to high priority. I will put some time into tracking this down once I integrate the new default PFT file.

@jenniferholm
Contributor

Hi all,
As Charlie suspected on the call today, there was a water balance error at the very beginning of the simulation when running ELM-FATES Hydro at a single boreal site in Alaska, using 3 PFTs. Does this balance error, where errh2o is greater than 1e-4 (mm), look the same as the previous errors?
This could be a PFT parameter issue, as I just used the default tropical hydraulics parameter values and didn't update anything.
Any thoughts on where in the code or output I could start investigating?

Here is the error from the lnd.log:

WARNING: plant hydraulics water balance error exceeds 1.0e-7 and is ajusted for
error
WARNING: plant hydraulics water balance error exceeds threshold of
1.000000000000000E-007
WARNING: plant hydraulics water balance error exceeds threshold of
1.000000000000000E-007
WARNING: plant hydraulics water balance error exceeds threshold of
1.000000000000000E-007
WARNING: plant hydraulics water balance error exceeds threshold of
1.000000000000000E-007
WARNING: plant hydraulics water balance error exceeds threshold of
1.000000000000000E-007
WARNING: plant hydraulics water balance error exceeds threshold of
1.000000000000000E-007
WARNING: water balance error nstep= 148 local indexc= 1
errh2o= 1.482251206177461E-003
clm model is stopping - error is greater than 1e-4 (mm)
nstep = 148
errh2o = 1.482251206177461E-003
forc_rain = 0.000000000000000E+000
forc_snow = 6.081371973583696E-007
endwb = 5191.20871506863
begwb = 5191.20989310156
qflx_evap_tot = 2.085897121661234E-006
qflx_irrig = 0.000000000000000E+000
qflx_surf = 0.000000000000000E+000
qflx_h2osfc_surf = 0.000000000000000E+000
qflx_qrgwl = 0.000000000000000E+000
qflx_drain = 1.757046644113158E-010
qflx_drain_perched = 0.000000000000000E+000
qflx_flood = 0.000000000000000E+000
qflx_snwcp_ice = 0.000000000000000E+000
qflx_glcice_melt = 0.000000000000000E+000
qflx_glcice_frz = 0.000000000000000E+000
qflx_lateral = 0.000000000000000E+000
total_plant_stored_h2o_col = 1.02148851606293
clm model is stopping
local column index = 1
global column index = 1
global landunit index = 1
global gridcell index = 1
gridcell longitude = 212.500000000000
gridcell latitude = 65.4973821989528
column type = 1
landunit type = 1
ENDRUN:
ERROR in BalanceCheckMod.F90 at line 413

@jenniferholm
Contributor

jenniferholm commented Jul 11, 2019

@rgknox -- updating that this water balance error still occurs with the new default PFT parameter file.
It fails on day 2 for a boreal site, and also fails at tropical ZF2.

After increasing the threshold from "real(r8), parameter :: thresh_break = 1.e-4_r8" to "1.e-2_r8", the case runs successfully at ZF2 but still crashes in the boreal forest.

Same location that @ekluzek found:
    else if (abs(we_local) > thresh_break) then
       write(fates_log(),*)'EDPlantHydraulics water balance error exceeds threshold of = ', thresh_break
       call endrun(msg=errMsg(sourcefile, __LINE__))

@JunyanDing - You seemed to be onto something with the area in the denominator when calculating we_local. But I think AREA is the total land area (10000.0_r8, the notional area of the simulated forest in m2), not just the tree coverage area. Have you looked into this more?

@JunyanDing
Contributor

In my case, there are two places where Hydro crashed: one is the water balance error, the other is the "floating divide by zero". The causes are more complicated. For the water balance error, maybe try increasing maxiter to 5 or 10 (the original value is 1)? You can also try deeper roots.

For the area issue, I was wrong about AREA being zero. In fact, looking more carefully at the code, the denominator is actually the fine root length. I think this error occurs when trees trim all their leaves, which results in zero fine root biomass. I haven't started dealing with this problem, as currently I can only run static-stand cases on lr.
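
To make the maxiter suggestion concrete, here is a hypothetical sketch of the sub-stepping idea: the model timestep is divided into maxiter sub-steps so the water-content update is integrated on a finer time grid. f_update() is a placeholder, not the actual FATES solver:

    program substep_sketch
       implicit none
       integer, parameter :: r8 = selected_real_kind(12)
       integer, parameter :: maxiter = 10        ! original value is 1; try 5 or 10
       real(r8), parameter :: dtime = 1800.0_r8  ! model timestep, s
       real(r8) :: theta, dt_sub
       integer :: k

       theta  = 0.30_r8                  ! volumetric water content, m3/m3
       dt_sub = dtime / real(maxiter, r8)
       do k = 1, maxiter
          theta = theta + dt_sub * f_update(theta)
       end do
       print *, 'theta after sub-stepping:', theta
    contains
       pure function f_update(th) result(dthdt)
          real(r8), intent(in) :: th
          real(r8) :: dthdt
          dthdt = -1.0e-6_r8 * th       ! placeholder drawdown rate, 1/s
       end function f_update
    end program substep_sketch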

@jenniferholm
Contributor

@JunyanDing - thanks for these comments. My crash is related to the water balance error only (I think).

Thanks for the suggestion about fine root length. Could you point me to where the calculations for this are? I couldn't find them in FatesPlantHydraulicsMod.F90; are they somewhere else?

I know the subroutine trim_canopy is in EDPhysiologyMod. I'm going to look into setting allom_fmode = 2, and also lowering fates_allom_l2fr to something below 1.0, which would decouple the fine roots from canopy trimming. I think this might help. See PR #354.

I'm not sure where to adjust fine root "length", just fine root biomass. Suggestions?

Thanks for your help.

@JunyanDing
Contributor

Thanks for the suggestion @jenniferholm!
Fine root length is calculated from fine root biomass (subroutine UpdateSizeDepRhizVolLenCon in FatesPlantHydraulicsMod.F90: https://github.com/NGEET/fates/blob/master/biogeophys/FatesPlantHydraulicsMod.F90#L1742). Fine root biomass is calculated in the bfineroot subroutine in FatesAllometryMod.F90.

You can adjust fine root length via the l2fr parameter. But remember, if you decrease l2fr, you will reduce the root conductance, which is calculated from fine root density. If your problem is caused by zero fine root length, I think changing allom_fmode to 2 should do the trick.
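
As a rough sketch of that chain (hypothetical numbers and a simple specific-root-length scaling, not the exact UpdateSizeDepRhizVolLenCon code):

    program root_length_sketch
       implicit none
       integer, parameter :: r8 = selected_real_kind(12)
       real(r8), parameter :: srl = 25.0_r8  ! specific root length, m per g (illustrative)
       real(r8) :: b_fineroot                ! fine root biomass, g per plant
       real(r8) :: l_aroot                   ! absorbing root length, m per plant

       b_fineroot = 500.0_r8
       l_aroot = b_fineroot * srl
       print *, 'fine root length (m):', l_aroot
       ! if trimming drives b_fineroot to zero, l_aroot is zero too, which
       ! feeds the divide by zero in r_out_shell discussed earlier
    end program root_length_sketch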

@jenniferholm
Contributor

@JunyanDing - In case you are curious about my latest tests (and in case it helps you): I decoupled the fine root biomass from the canopy trimming, so I changed fmode = 2 and increased l2fr to get a higher fine root biomass. However, the water balance error still occurs very quickly. I checked, and the fine root biomass is high (not crazy high, but not zero).

I've started looking into other hydraulic outputs, and I found that the mean individual root uptake rate (kg/s) was negative, which doesn't make sense to me. I think (but am not sure) that root water uptake is an important part of the mass balance equation, correct? Maybe this is the culprit, because water uptake cannot be negative. Or is this some sort of "water potential" uptake?

Summary of some other outputs -
Root fraction of conductivity looks good (46%)
Root water potential looks good (-1.2 MPa)
Root water content looks good, right? (35.2 m3/m3)

@xuchongang -- do you have insight on how we can stop this water balance error that occurs so early in the simulation? We've tried a couple of approaches, but with no success.
In particular, do you know why the mean individual root uptake rate (kg/s) is negative?

Also, by making the fine root biomass higher in my case, and hence the fine root length, am I causing the soil volume to become smaller, which could potentially lead to these errors?

Thanks!

@rosiealice
Contributor

@jenniferholm, is negative root uptake happening in some sort of redistributive sense? Like, the roots are wetter than some soil layers. That happened a lot with the hydro scheme in CLM5...

@JunyanDing
Contributor

@jenniferholm Thanks for keeping me updated. Have you tried increasing maxiter? The parameter is at https://github.com/NGEET/fates/blob/master/biogeophys/FatesPlantHydraulicsMod.F90#L2293

Yes, you are totally right. I just looked at my root water uptake, and the values are all positive :)
From my experience, given the root model in FATES, the default root distribution parameters put most of the roots in the upper soil. This has two consequences: first, fine roots have to compete with evaporation for water; second, most water is absorbed from the top layer, which means the water content there changes very fast. So you need to solve the partial differential equation with a finer time step; increasing maxiter will do that. Also, adjust roota and rootb in the parameter file (I use 0.5, 2).
By the way, I am running static stands. Are you running the prognostic mode?
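
For anyone following along, the CLM/FATES root profile is, as I understand it, the two-parameter Zeng (2001) form, with cumulative root fraction Y(z) = 1 - 0.5*(exp(-roota*z) + exp(-rootb*z)); smaller coefficients push more roots into deeper layers. A small sketch with illustrative uniform layers:

    program root_profile_sketch
       implicit none
       integer, parameter :: r8 = selected_real_kind(12)
       real(r8), parameter :: dz = 0.2_r8  ! m, uniform layers for illustration
       real(r8) :: roota, rootb, ztop, zbot, frac
       integer :: i

       roota = 6.0_r8   ! try 0.5_r8 / 2.0_r8 for a deeper profile
       rootb = 2.0_r8
       do i = 1, 5
          ztop = real(i-1, r8) * dz
          zbot = real(i, r8) * dz
          frac = 0.5_r8 * ( exp(-roota*ztop) + exp(-rootb*ztop) &
                          - exp(-roota*zbot) - exp(-rootb*zbot) )
          print '(a,i2,a,f6.3)', ' layer ', i, ' root fraction = ', frac
       end do
    end program root_profile_sketch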

@jenniferholm
Contributor

jenniferholm commented Jul 15, 2019

@JunyanDing -- Here are some updates. I still haven't solved the balance error issue, but I'm going to list my attempts and parameter/code changes.

  • I am running in prognostic mode
  • I did increase maxiter to 10
  • The roota and rootb values for the PFT I want to simulate were already high: 6 and 2, respectively. After increasing roota and rootb to 8 and 4, there was still the same water balance error on day 2.
  • I returned the root parameters to their default values, and only kept maxiter higher.
  • But now on day 1, catch_nan was found to be true and there are NaNs. This was found in FatesPlantHydraulicsMod.F90 at line 3483, and the run ends.
  • This is using the fates_param_default file, with all PFTs set to zero except for PFT 6.

I'm really starting to hit a wall. @rgknox or @xuchongang, do you have any thoughts on why my case is getting NaNs, or why there is the water balance error? Both occur very early in the run.

Thanks.

@xuchongang
Contributor

xuchongang commented Jul 22, 2019

@jenniferholm, sorry I am late to this thread. Are you using the most recent master code, which has new code to avoid very low soil moisture? Do you have any diagnostics or figures to show? Can we discuss this on the next NGEE Tropics modeling call?

@jenniferholm
Contributor

@xuchongang - no worries, there has been a lot going on with getting ready for the NGT Phase 2 review.

Yes, it sounds good to go over this at our next FATES modeling call.

I'm still investigating the "NaN" issue that leads to a model crash. Hopefully more will be known by our next call. I'm also going to make a separate issue out of this, as it's not related to the water balance error anymore.

@rgknox
Contributor

rgknox commented May 7, 2020

Fates hydro tests are now passing

@rgknox rgknox closed this as completed May 7, 2020