Fates hydro: transpiration error without leaves #618

Closed
mariuslam opened this issue Mar 10, 2020 · 20 comments

Comments

@mariuslam

Hey @rgknox,

I have been running FATES-Hydro at three sites with two different PFT configurations and two different parameter files (12 simulations). Most of them crash with the same "ERROR: transpiration with no leaves" after the first 5, 6, 7 or 8 months. Oddly, one of the 12 simulations failed with a different error: a water balance error.
I attached the log files if you want to have a look. Do not hesitate to ask for more information.

Cheers, Marius
log_files.zip

@jkshuman
Contributor

@mariuslam I know that you changed a few parameters based on literature (basic allometry). Did you update any of the hydro parameters? It would be useful to know if those are at the default value. I know that Hydro can be quite sensitive, and since it is failing in the first year in the growing season there may be a basic problem with the way the vegetation is parameterized?
tagging @pollybuotte @JunyanDing @xuchongang for any ideas they might have about FATES-Hydro parameterization and parameters to be aware of.

@jkshuman
Contributor

@mariuslam tagging issue #508 "Simple FatesHydro tests fail due to water balance error very quickly"
please read this thread - there may be information here to help you parameterize and fix your problem, but we should at least connect these issues as they seem quite similar.

@mariuslam
Author

I basically decreased SLA top /max and DBH repro/mH in half of my simulations. But whether I changed the parameters or not, the error is the same, with a lag of a few days.

@jenniferholm
Contributor

Hi @mariuslam -- Also check out this pull request. This is where updates were made when transpiration was zero when using Hydro
#555

I've run into this issue before, and here is the line in the code that is generating the endrun -

if(bc_in(s)%qflx_transp_pa(ifp) > 1.e-10_r8 .and. gscan_patch<nearzero)then

@rgknox -- This seems to be the opposite of the NaN issues, right? I think we decided there can be times when the HLM has a small transpiration flux, but then on the next time step FATES could drop all leaves (deciduous, etc) and have zero conductance. If the flux differences are small, didn't we decide they are allowed to be out of sync?

I believe your calculation of transpiration when there is no leaf area is here -

if(ccohort%g_sb_laweight>nearzero) then

Should this check (starting at line 2506) be moved up earlier?

But I might be confused with the transpiration/conductances changes that were made for the NaN issue, and this looks different from the NaN issue.......
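
For readers following along, the guard quoted above can be restated as a small Python sketch. This only illustrates the condition and is not the FATES code: the function name is made up, and the value used for nearzero (1.0e-30) is an assumption about the FATES constant.

```python
# Illustrative restatement of the quoted Fortran condition; not FATES code.
NEARZERO = 1.0e-30    # ASSUMED stand-in for the FATES `nearzero` constant
TRANSP_TOL = 1.0e-10  # threshold taken from the quoted condition


def transpiration_without_leaves(qflx_transp, gscan_patch,
                                 transp_tol=TRANSP_TOL, nearzero=NEARZERO):
    """Return True when the host model requests a non-trivial transpiration
    flux while the patch has effectively zero canopy conductance."""
    return qflx_transp > transp_tol and gscan_patch < nearzero


if __name__ == "__main__":
    # Leaves dropped (zero conductance) but the host model still asks for flux.
    print(transpiration_without_leaves(1.0e-6, 0.0))
    # Same patch, but the requested flux is below the threshold.
    print(transpiration_without_leaves(1.0e-12, 0.0))
```

A flux below the 1.e-10 threshold passes the guard even with zero conductance, which matches the idea that small out-of-sync fluxes are tolerated.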

@JunyanDing
Contributor

Try setting fmode to 2 to decouple the root biomass from the leaf biomass, so that you do not get zero roots when the leaves are off.

@rgknox
Contributor

rgknox commented Mar 11, 2020

@mariuslam, could you provide the branches and/or git hashes that you are using for both CTSM and FATES?

@mariuslam
Author

Yes, I am using release-clm5.0.30-5-g5220caaa and sci.1.33.0_api.8.1.0.

@mariuslam
Author

So I tried with fates_allom_fmode = 2. I no longer get the leaf transpiration error, but I do get the error about the water balance exceeding a threshold. If I comment out this endrun, it crashes later due to a carbon balance error (ERROR in EDMainMod.F90 at line 699).

@rgknox
Contributor

rgknox commented Mar 12, 2020

@mariuslam
Perhaps try the refactored version of hydraulics? This is not integrated yet, but testing has been promising so far. Must use these two branches together:
https://github.com/rgknox/fates/tree/hydro-diagnostics-wmat
https://github.com/rgknox/ctsm/tree/fates-init-hydro-highersm-merge5.030

@mariuslam
Author

@rgknox,
I tried the branches you recommended. The good news is that they handle the transpiration and water balance errors, but I encountered other problems:
Deciduous PFT: the endrun occurs at line 3200 in FatesPlantHydraulicsMod.F90: "Gracefully quit if too many iterations have been used". This occurs as soon as the second simulation year starts (day 1).
Evergreen PFT: the endrun occurs at line 4293 in FatesPlantHydraulicsMod.F90: "If debug mode, calculate error on the forward solution".
(I did not use fmode=2)
saga.zip

@rgknox
Contributor

rgknox commented Mar 13, 2020

I looked at your error logs. This one is easy: there is almost no flow happening at that time, and we are being unnecessarily strict on the solver. The absolute values are tiny.

Tri-diagonal solve produced solution with
 non-negligable error.
 Compartment:            4
 Error in forward solution:   2.646977960169689E-023
 Estimated delta theta:   3.012237654218378E-020
 Rel Error:   8.787414088867874E-004
 ENDRUN:
 ERROR in FatesPlantHydraulicsMod.F90 at line 4293                                                                                   

In this line, we just need to uncomment the absolute error part of the logic. But let's decrease the allowable absolute error so it matches the max_wb_step_err:

https://github.com/rgknox/fates/blob/hydro-diagnostics-wmat/biogeophys/FatesPlantHydraulicsMod.F90#L4286

https://github.com/rgknox/fates/blob/hydro-diagnostics-wmat/biogeophys/FatesPlantHydraulicsMod.F90#L3071

For the second error, I see that the solver just isn't converging. In this case it is because of NaNs. I need to think more on this, but I feel like I would need more information. I see this is after a restart read... Perhaps that is the actual problem here.
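
To make the tolerance point concrete, here is a minimal Python sketch using the numbers from the log above. The function name and both tolerance values are hypothetical (the thread does not state the actual tolerances or the value of max_wb_step_err). The sketch only shows why also requiring the absolute error to exceed a threshold lets a near-zero-flow step pass.

```python
# Values copied from the log above.
ERR = 2.646977960169689e-23     # "Error in forward solution"
DTHETA = 3.012237654218378e-20  # "Estimated delta theta"

# HYPOTHETICAL tolerances, chosen only for illustration.
REL_TOL = 1.0e-4
ABS_TOL = 1.0e-10


def solution_is_unacceptable(abs_err, delta_theta, rel_tol, abs_tol=None):
    """Flag a solve as failed.

    With abs_tol=None only the relative error is checked (the strict
    behaviour). With abs_tol set, the solve fails only if BOTH the relative
    and the absolute error exceed their tolerances.
    """
    if delta_theta != 0.0:
        rel_err = abs(abs_err) / abs(delta_theta)
    else:
        rel_err = float("inf")
    if abs_tol is None:
        return rel_err > rel_tol
    return rel_err > rel_tol and abs(abs_err) > abs_tol


if __name__ == "__main__":
    print("relative error:", ERR / DTHETA)
    print("strict check fails:", solution_is_unacceptable(ERR, DTHETA, REL_TOL))
    print("combined check fails:",
          solution_is_unacceptable(ERR, DTHETA, REL_TOL, ABS_TOL))
```

The ratio of the two logged values reproduces the "Rel Error" line, so the relative check trips even though the absolute error is many orders of magnitude below any plausible water balance tolerance.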

@jenniferholm
Contributor

jenniferholm commented Mar 14, 2020 via email

@rgknox
Contributor

rgknox commented Mar 16, 2020

Thanks @jenniferholm
@mariuslam I will start using your case and see if I can reproduce your errors

@mariuslam
Author

Hi @rgknox ,

It seems that decreasing the allowable error and uncommenting the code solves both errors.
My simulations are still running after 20 years.

@mariuslam
Author

Hi @rgknox , @ekluzek,

I figured out that the second error is linked to the setup.
It works fine if STOP_OPTION is date, but it stops after one year if STOP_OPTION is nyears:
#./xmlchange STOP_OPTION="date"
#./xmlchange STOP_DATE=20190101

@rgknox
Contributor

rgknox commented Mar 19, 2020

Hi @mariuslam,

I ran your Merrelva test. I ran mine slightly differently, as I didn't want to have any data dependencies on global driver data. Note below that I tell the datm that we don't have ZBOT (I assume it will just use some reasonable value instead), and that we are using TDEW instead of RH:

cat > user_nl_clm << EOF
fsurdat="/home/rgknox/CTSM/cime/scripts/Merrelva/surfdata_1x1_Merrelva_era_simyr2000_deciduous_SD1.nc"
fates_paramfile="/home/rgknox/CTSM/cime/scripts/Merrelva/param_file_deciduous_changed.nc"
use_fates_spitfire = .false.
use_bedrock= .true.
use_fates_planthydro = .true.
hist_fincl1='QROOTSINK','FATES_LWP_COL_SCPF','FATES_AWP_COL_SCPF','H2OVEG','FATES_LTH_COL_SCPF'
EOF

cat >> user_nl_datm <<EOF
taxmode = "cycle", "cycle", "cycle"
EOF


./case.setup

# HERE WE NEED TO MODIFY THE STREAM FILE (DANGER ZONE - USERS BEWARE CHANGING)
./preview_namelists
cp run/datm.streams.txt.CLM1PT.CLM_USRDAT user_datm.streams.txt.CLM1PT.CLM_USRDAT
# Drop the ZBOT entry, and replace the RH entry with TDEW
sed -i '/ZBOT/d' user_datm.streams.txt.CLM1PT.CLM_USRDAT
sed -i '/RH/c\TDEW tdew' user_datm.streams.txt.CLM1PT.CLM_USRDAT


./case.build

My simulations also run for the whole period, although I didn't need to modify any error tolerances. I will run some diagnostics and see if my results look normal-ish.

@rgknox
Contributor

rgknox commented Mar 19, 2020

Here are results from my simulation.
Merrelva-v1_plots.pdf

@mariuslam is Merrelva a forest? I didn't output structural diagnostics, so I can't tell exactly from my simulation, but it doesn't seem to be generating much of a forest. The difference between leaf water potentials (LWP) and absorbing root water potentials (AWP) is also very small, which seems indicative of short vegetation with small hydraulic compartment volumes and low storage.

The water potentials don't seem to be going too low. Here are the stomatal p50 and vulnerability:

fates_hydr_p50_gs = -1.5 ;
fates_hydr_avuln_gs = 2.5 ;

I'm curious what a run without hydro looks like.
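
As a rough illustration of what these two parameters imply, the sketch below assumes a sigmoidal vulnerability curve of the form 1/(1 + (psi/p50)^a). That functional form and the function name are assumptions made for illustration and are not confirmed in this thread; psi is presumably in MPa.

```python
def stomatal_fraction(psi, p50=-1.5, avuln=2.5):
    """Fraction of maximum stomatal conductance at leaf water potential psi.

    ASSUMED form: 1 / (1 + (psi / p50) ** avuln). The defaults are the
    fates_hydr_p50_gs and fates_hydr_avuln_gs values quoted above.
    """
    ratio = psi / p50
    if ratio <= 0.0:
        # Zero or positive potential: treat the stomata as fully open.
        return 1.0
    return 1.0 / (1.0 + ratio ** avuln)


if __name__ == "__main__":
    for psi in (-0.5, -1.0, -1.5, -3.0):
        print(f"psi = {psi:5.2f}  fraction = {stomatal_fraction(psi):.3f}")
```

Under this assumed form, the fraction is exactly one half when the leaf water potential equals p50, and it drops towards zero as the potential becomes much more negative.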

@rgknox
Contributor

rgknox commented Mar 19, 2020

I should also remember to generate btran values for both the hydro and non-hydro runs to see how they compare.

@JunyanDing
Contributor

Hi @rgknox, I am running your Hydro along a transect of Sierra 4 cz sites with 18 instances. The model crashes in the 1D solve after running for about 5 years. I have gone through all the land logs, but none of them recorded any errors. I have attached the domain and surface data, and my run script.
ModelFiles.zip

--------------error message -----------------
Could not find a stable solution for hydro 1D solve

error code: 3
error diag: 0.228344513182446 0.272759139838090
0.279194780828520 -4.076654767989318E-002 0.361990124008440
0.361994440504256 0.361998936437905 0.362004429094216
0.362010871213673
layer: 18
wb_step_err = 0.000000000000000E+000
leaf water: 0.159557326395469 kg/plant
stem_water: 0.142111148964755 kg/plant
troot_water: 0.202021547277022
aroot_water: 0.641763205831147
LWP: -9.82443628933751
dbh: 1.90451715696591
pft: 1
z nodes: 1.25000000000000 0.600000000000000
-0.709405867382884 -7.46000000000000 -7.46000000000000
-7.46000000000000 -7.46000000000000 -7.46000000000000
-7.46000000000000
psi_z: 1.224999999999987E-002 5.880000000000329E-003 -6.952177500352263E-003
-7.310799999999995E-002 -7.310800000000001E-002 -7.310800000000001E-002
-7.310800000000001E-002 -7.310800000000001E-002 -7.310800000000001E-002
vol, theta, H, kmax-
flux: 0.000000000000000E+000
l: 7.035008018980105E-004 0.226804754116827 -9.81218628933751
-9.82443628933751
4.744871447689270E-005
s: 5.175092774797346E-004 0.274605992875789 -6.15591779626325
-6.16179779626325
9.021984973985981E-005
t: 7.392335517233187E-004 0.273285143519345 -6.21463252490059
7.773771584216823E-005
a: 6.680418991953484E-006 0.618236802186109 -1.31563466413993
in: 1.554602229700830E-004
out: 7.613410243813055E-006
r1: 4.663661504895142E-005 0.361979528207703 -0.103203243196368
139.373895723908
r2: 5.900116205819873E-004 0.361987777167604 -0.103197062543882
139.373895723909
r3: 7.464386342284710E-003 0.361996019997515 -0.103190887892927
139.373895723908
r4: 9.443384083168961E-002 0.362004209386969 -0.103184754667887
139.373895723909
r5: 1.19470642130446 0.362010988886661 -0.103179678393222
kmax_aroot_radial_out: 8.005424765868619E-006
surf area of root: 8.005424765868618E-002
aroot_frac_plant: 1.247351249448511E-004 127.410292303819
1021446.78461781
kmax_upper_shell: 397.062986628970 397.062986628970
397.062986628970 397.062986628970 397.062986628970
kmax_lower_shell: 214.755755084092 214.755755084092
214.755755084092 214.755755084092 214.755755084092

tree lai: 1.55157121520306 m2/m2 crown
area and area to volume ratios

a: 6.680418991953484E-006
8.005424765868618E-002
r1: 4.663661504895142E-005
0.284741751750651
r2: 5.900116205819873E-004

r3: 7.464386342284710E-003

r4: 9.443384083168961E-002

r5: 1.19470642130446
inner shell kmaxs: 5524.39442143184 7515.38383865023
8883.26729328434 11420.3270402958 12670.9067244132
11708.5623429026 10749.5440608916 9273.16104196980
7627.75081194441 6056.64132759669 4689.39496709676
3566.84516025725 2946.47912457158 2244.59626509577
1599.12313677473 1070.46498472574 673.173664753773
397.062986628970 219.417061772469
ENDRUN:
ERROR in FatesPlantHydraulicsMod.F90 at line 3200

ERROR: Unknown error submitted to shr_abort_abort.
Image PC Routine Line Source
cesm.exe 000000000110FF0E Unknown Unknown Unknown
cesm.exe 0000000000DAD832 shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 000000000079BAD0 fatesglobals_mp_f 65 FatesGlobals.F90
cesm.exe 00000000007F9475 fatesplanthydraul 3200 FatesPlantHydraulicsMod.F90
cesm.exe 00000000007EF8B2 fatesplanthydraul 2414 FatesPlantHydraulicsMod.F90
cesm.exe 00000000007EE012 fatesplanthydraul 259 FatesPlantHydraulicsMod.F90
cesm.exe 000000000053A5F3 clmfatesinterface 2344 clmfates_interfaceMod.F90
cesm.exe 0000000000701489 canopyfluxesmod_m 1280 CanopyFluxesMod.F90
cesm.exe 0000000000506DCF clm_driver_mp_clm 568 clm_driver.F90
cesm.exe 00000000004F639C lnd_comp_mct_mp_l 456 lnd_comp_mct.F90
cesm.exe 000000000042EBBD component_mod_mp_ 728 component_mod.F90
cesm.exe 0000000000414272 cime_comp_mod_mp_ 2717 cime_comp_mod.F90
cesm.exe 000000000042E867 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000411F5E Unknown Unknown Unknown
libc.so.6 00002B6CE682E545 Unknown Unknown Unknown
cesm.exe 0000000000411E69 Unknown Unknown Unknown
max rss=233.2 MB

MPI_ABORT was invoked on rank 79 in communicator MPI_COMM_WORLD
with errorcode 1001.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

@rgknox
Contributor

rgknox commented Aug 13, 2020

@mariuslam feel free to re-open if you encounter more problems. I believe we made progress on this via #611.

@rgknox closed this as completed Aug 13, 2020