Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface: endrun and cleaning of fates<->hlm globals #180

Merged
merged 15 commits into from
Mar 6, 2017
Merged

Interface: endrun and cleaning of fates<->hlm globals #180

merged 15 commits into from
Mar 6, 2017

Conversation

rgknox
Copy link
Contributor

@rgknox rgknox commented Jan 25, 2017

In this change-set roughly two objectives are tackled. 1) a wrapper was created to call the CIME shutdown/end-run utitlities. The wrapper is essentially the same wrapper that CLM uses, but we need our own to break dependence. 2) We have various globals in FATES that are dictated by the HLM, and vice versa. I tried to wrangle these variables into FatesInterfaceMod, and give them some functions to set those variables and protect them. Moreover, the new global fates_maxElementsPerSite and fates_maxElementsPerPatch, are calculate based on the maximum requirements for array allocation for FATES restarts. Previously, this was a hard-coded and misleading value. While FATES asks the HLM to allocate a "cohort" array for restart variables, it is really a multi-purpose array, and may be larger than the needs we have to store cohorts.

Fixes: #178 and #177 and #144, and #186
Addresses: #141

User interface changes?: No

Code review:

Test suite: edTest, clm_short_45
Test baseline: cdb9db7
Test namelist changes: None
Test answer changes: b4b with baseline
Test summary: all PASS

rgknox and others added 9 commits January 19, 2017 17:19
…ls. Fix was trivial, both branches added a subroutine there, both subroutines are still relevant and orthogonal.
… did some light cleaning of some other calls to the cime shares, mostly just for readability.
…d having it a dependent variable that goes on to set HLM allocations. This pass also simplifies how HLM cohort<->grid,lu and column mappings are calculated. This is done by removing ed_cohort_vector. In this phase we have not removed that memory structure, but we are bypassing it.
…n this phase. The difference is that scalars dictated by the hlm going into fates are now living in FatesInterface mod where their wrappers are. Likewise, the varialbes that are dictated by fates and are destined for the HLM are also in this module, as are the wrapper call functions. The wrapper call functions to set fates_maxElementsPerPatch and fates_maxElementsPerSite are called during Initialize1(), and fates_maxElementsPerSite is now used during decompInit instead of the ed_vec_cohort structure.
…ning. Combined clm_fates%init and clm_fates%init_allocate. Created a trivial cohort dimensioning scheme for non-fates runs.
@rgknox
Copy link
Contributor Author

rgknox commented Feb 2, 2017

Some strangeness occurred. Although this branch was based off ofPR #176, when PR 176 was merged into master, this branch generated a long list of conflicts, as of b84f6b0. Then I simply merged master into this branch on 75807f1, which has 0 diffs, and the conflicts are gone. Maybe git was confused.

end subroutine SetFatesTime
if (present (msg)) then
write(fates_log(),*)'ENDRUN:', msg
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never cared for this convention in clm. It shifts the burden of understanding why endrun was called from the developer who called endrun onto the user. The original developer understands why they are inserting the endrun call and should be required to provide that information to the user in the message string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. But, just to double check, is your suggestion that we remove the optional use of providing a message, and make it mandatory? Or do you want us to require more structure in the message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think simply requiring a message is good enough. If you can think of additional structure that is easy to maintain, then that's great too.

@rgknox
Copy link
Contributor Author

rgknox commented Feb 6, 2017

Testing conflict resolutions and changes that make messages in endrun() mandatory.

@rgknox
Copy link
Contributor Author

rgknox commented Feb 6, 2017

All tests on e1b5b9b PASS, including regression tests.

@rgknox
Copy link
Contributor Author

rgknox commented Feb 6, 2017

@bandre-ucar : could you sign off on, or suggest improvements to the HLM side changes that I am proposing in this PR? Feel free to bring in others to check too if need be.

@bandre-ucar
Copy link
Contributor

No major concerns from the host side. Testing.

@bandre-ucar
Copy link
Contributor

Runtime failure on nag:

Runtime Error:  /home/andre/work/ed/pr180/ed-clm/components/clm/src/ED/biogeochem/EDPhysiologyMod.F90, line 432:  Subscript 1 of CURRENTSITE%WATER_MEMORY (value 0) is out of range (1:10)
Program terminated by fatal error
e/home/andre/work/ed/pr180/ed-clm/components/clm/src/ED/biogeochem/EDPhysiologyMod.F90, line 432: Error occurred in EDPHYSIOLOGYMOD:PHENOLOGY
/home/andre/work/ed/pr180/ed-clm/components/clm/src/ED/main/EDMainMod.F90, line 86: Callend by EDMAINMOD:ED_ECOSYSTEM_DYNAMICS
/home/andre/work/ed/pr180/ed-clm/components/clm/src/utils/clmfates_interfaceMod.F90, line 515: Called by CLMFATESINTERFACEMOD:DYNAMICS_DRIV
/home/andre/work/ed/pr180/ed-clm/components/clm/src/main/clm_driver.F90, line d821: Called by CLM_DRIVER:CLM_DRV
/home/andre/work/ed/pr180/ed-clm/components/clm/src/cpl/lnd_comp_mct.F90, line 436: Called by LND_COMP_MCT:LND_RUN_MCTr
/home/andre/work/ed/pr180/ed-clm/cime/driver_cpl/driver/component_mod.F90, line 1079: Called by COMPONENT_MOD:COMPONENT_RUN
/home/andre/work/ed/pr180/ed-clm/cime/driver_cpl/driver/cesm_comp_mod.F90, line 2509: Cal led by CESM_COMP_MOD:CESM_RUN
/home/andre/work/ed/pr180/ed-clm/cime/driver_cpl/driver/cesm_driver.F90, line 93: Called by CESM_DRIVER
=

@rgknox
Copy link
Contributor Author

rgknox commented Mar 1, 2017

I think I know the problem, I solved this in another branch I based on this one. I will submit the fix tomorrow.

@rgknox
Copy link
Contributor Author

rgknox commented Mar 1, 2017

Applied the fix, this should also address #186 now.
Re-ran EDTest on lawrencium and b4b regressions passed against cdb9db7.
Although I have to admit, I don't know why the compiler didn't catch the index overflow on the previous test, perhaps I used the wrong tag in my last test.

@bandre-ucar
Copy link
Contributor

@rgknox Thanks. Running tests.

@bandre-ucar
Copy link
Contributor

nag tests passed. Running on yellowstone.

@bandre-ucar
Copy link
Contributor

PR fails to compile under pgi:

PGF90-S-0087-Non-constant expression where constant expression required (/glade/
p/work/andre/ed/pr180/ed-clm/components/clm/src/ED/biogeophys/EDAccumulateFluxes
Mod.F90: 48)
PGF90-S-0081-Illegal selector - KIND parameter has unknown value for data type  
(/glade/p/work/andre/ed/pr180/ed-clm/components/clm/src/ED/biogeophys/EDAccumula
teFluxesMod.F90: 48)
PGF90-S-0087-Non-constant expression where constant expression required (/glade/
p/work/andre/ed/pr180/ed-clm/components/clm/src/ED/biogeophys/EDAccumulateFluxes
Mod.F90: 57)
PGF90-S-0081-Illegal selector - KIND parameter has unknown value for data type  
(/glade/p/work/andre/ed/pr180/ed-clm/components/clm/src/ED/biogeophys/EDAccumula
teFluxesMod.F90: 57)
  0 inform,   0 warnings,   4 severes, 0 fatal for accumulatefluxes_ed

@rgknox
Copy link
Contributor Author

rgknox commented Mar 2, 2017

@bandre-ucar ,

I am wondering if PGI didn't like the "use" call to IEEE_ARITHMATIC and therefor dropped the definition of r8 => fates_r8?

Or perhaps there is a circular dependency and it was not able to compile FatesConstantsMod before EDAccumulateFluxesMod?

I'm removing the use IEEE_Arithmetic, which is no-longer needed and will give it a try on yellowstone PGI. Do you have any foresight if this may be the issue?

@bandre-ucar
Copy link
Contributor

@rgknox I looked at it for a second and didn't understand what the problem was. PGI has been the most buggy compiler. If removing ieee doesn't work, you might try using fates_r8 directly or moving the use statement to the module level. I also think the use statement for endrun needs to be updated.

@rgknox
Copy link
Contributor Author

rgknox commented Mar 2, 2017

thanks @bandre-ucar, I just caught that rogue endrun() too. I will check back in with the test results.

@rgknox
Copy link
Contributor Author

rgknox commented Mar 2, 2017

I did a couple of changes, 1) moved the definition of r8 => fates_r8 to the module level and 2) removed the definition of IEE_ARITHMETIC. pgi now compiles on yellowstone.

I'd be more than happy to fully test the PGI @bandre-ucar , but I don't have a PGI baseline created for master, so your probably a step ahead of me here. Feel free to get the test started if you like, but otherwise I have a PGI baseline for master in the works.

@bandre-ucar
Copy link
Contributor

Thanks. I'll start another round of testing.

FYI - I keep all the ed-clm-fates baselines on yellowstone in the standard location:

/glade/p/cesmdata/cseg/ccsm_baselines

Named: ed-clm-{CHANGESET_ID}

@rgknox
Copy link
Contributor Author

rgknox commented Mar 2, 2017

great to know, thanks

@bandre-ucar
Copy link
Contributor

Yellowstone ed and clm_short testing passed.
rerunning nag, but no problems expected. Should be merging this evening.

@bandre-ucar bandre-ucar merged commit d12288a into NGEET:master Mar 6, 2017
@bandre-ucar bandre-ucar removed their assignment Apr 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants