Review mode implementation in data models #1877

Closed
apcraig opened this issue Sep 7, 2017 · 2 comments
@apcraig
Collaborator

apcraig commented Sep 7, 2017

The implementation of mode support in the data models needs to be reviewed. It looks like in the latest refactor, only datm, desp, dice, and docn have a "select case" for the mode. Different modes should be supported in all data models, at least NULL and not NULL, to provide clearer paths for extensions.

@jedwards4b
Contributor

Can you please elaborate? Has the refactor limited or removed options that were previously available?

@apcraig
Collaborator Author

apcraig commented Sep 8, 2017

I think the answer is probably not, but I'm not sure. I think we need some input from @mvertens.

billsacks added a commit that referenced this issue Oct 13, 2017
refactor CPLHIST mode and add DATM CPLHIST topo capability

This PR has several features:
1) unifies the CPLHIST naming convention across the components by
   adding the following new xml variables in DATM, DLND and DROF
      `DATM_CPLHIST_DOMAIN_FILE, DLND_CPLHIST_DOMAIN_FILE, DROF_CPLHIST_DOMAIN_FILE`
   These new xml variables are the full pathnames of the domain files
   for datm, dlnd and drof when the corresponding data model mode
   is set to CPLHIST.
   NOTE: if any of these xml variables is set to 'null',
   then domain information is read in from the first coupler
   history file in the target stream.
   NOTE: it is assumed that when `DXXX_CPLHIST_DOMAIN_FILE='null'`,
   the first coupler stream file that is pointed
   to contains the domain information for that stream.
   NOTE: This is the default mode that should be used when the data model
   `DXXX_MODE` is CPLHIST.
2) puts in new capability for DATM CPLHIST mode to read in topo data
3) replaces % formatting in the data models with new .format()
4) addresses issue #1877
5) replaces `xxx_mode` (e.g. `atm_mode`) variables in dxxx_comp_mod.F90 and dxxx_shr_mod.F90 with `datamode` to reduce confusion.
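
The 'null' fallback described in item 1 can be sketched as follows. This is only an illustration under stated assumptions: the function name `resolve_domain_source` and its arguments are hypothetical and are not taken from the CIME code.

```python
def resolve_domain_source(domain_file, stream_files):
    """Return the file that domain information would be read from.

    Hypothetical helper: `domain_file` stands for the value of a
    DXXX_CPLHIST_DOMAIN_FILE variable and `stream_files` for the ordered
    list of coupler history files in the target stream.
    """
    if domain_file == "null":
        # Fall back to the first coupler history file, which is assumed
        # to contain the domain information for the stream.
        if not stream_files:
            raise ValueError("no coupler history files in the stream")
        return stream_files[0]
    # Otherwise the variable holds the full pathname of the domain file.
    return domain_file
```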

Additional testing:
Verified that using this CIME branch to replace cime/ in cesm2_0_alpha07d,
the following cases ran successfully on cheyenne with data that was already there:

```
1850_DATM%CPLHIST_CLM50%BGC_SICE_SOCN_MOSART_CISM2%NOEVOLVE_SWAV (f09_f09_mg17)
   the following xmlchanges needed to be made:
   ./xmlchange DATM_CPLHIST_DOMAIN_FILE=null
   ./xmlchange DATM_CPLHIST_DIR=/glade/p/cesm/bgcwg_dev/forcing/b.e20.B1850.f09_g17.pi_control.all.179.cplhist/cpl/hist.mon
   ./xmlchange DATM_CPLHIST_CASE=b.e20.B1850.f09_g17.pi_control.all.179.cplhist
   ./xmlchange DATM_CPLHIST_YR_ALIGN=1
   ./xmlchange DATM_CPLHIST_YR_START=37
   ./xmlchange DATM_CPLHIST_YR_END=37
```
```
1850_DATM%CPLHIST_SLND_CICE_POP2%ECO_DROF%CPLHIST_SGLC_WW3 (f09_g17)
   the following xmlchange commands needed to be run:
   ./xmlchange DATM_CPLHIST_DOMAIN_FILE=null
   ./xmlchange DATM_CPLHIST_DIR=/glade/p/cesm/bgcwg_dev/forcing/b.e20.B1850.f09_g17.pi_control.all.179.cplhist/cpl/hist.mon
   ./xmlchange DATM_CPLHIST_CASE=b.e20.B1850.f09_g17.pi_control.all.179.cplhist
   ./xmlchange DATM_CPLHIST_YR_ALIGN=1
   ./xmlchange DATM_CPLHIST_YR_START=37
   ./xmlchange DATM_CPLHIST_YR_END=37
   ./xmlchange DROF_CPLHIST_DOMAIN_FILE=null
   ./xmlchange DROF_CPLHIST_DIR=/glade/p/cesm/bgcwg_dev/forcing/b.e20.B1850.f09_g17.pi_control.all.179.cplhist/cpl/hist.mon
   ./xmlchange DROF_CPLHIST_CASE=b.e20.B1850.f09_g17.pi_control.all.179.cplhist
   ./xmlchange DROF_CPLHIST_YR_ALIGN=1
   ./xmlchange DROF_CPLHIST_YR_START=37
   ./xmlchange DROF_CPLHIST_YR_END=37

```
```
1850_SATM_DLND%SCPL_SICE_SOCN_SROF_CISM2%EVOLVE_SWAV (f09_g16_gl20)
    no xmlchange commands were needed since in this case we have
      DLND_CPLHIST_DOMAIN_FILE" value="$LND_DOMAIN_PATH/$LND_DOMAIN_FILE"
    which is obtained from the config_grids.xml file

```
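The xmlchange settings listed for the DATM and DROF cases above follow one pattern per component. A minimal sketch of generating them, assuming a hypothetical helper `cplhist_xmlchange_commands` (not part of CIME) that only builds the command strings and does not run them:

```python
def cplhist_xmlchange_commands(component, cplhist_dir, cplhist_case,
                               yr_align, yr_start, yr_end,
                               domain_file="null"):
    """Build the ./xmlchange command lines for one data component.

    `component` is a prefix such as "DATM" or "DROF". The helper name and
    signature are illustrative assumptions.
    """
    settings = [
        ("CPLHIST_DOMAIN_FILE", domain_file),
        ("CPLHIST_DIR", cplhist_dir),
        ("CPLHIST_CASE", cplhist_case),
        ("CPLHIST_YR_ALIGN", yr_align),
        ("CPLHIST_YR_START", yr_start),
        ("CPLHIST_YR_END", yr_end),
    ]
    return ["./xmlchange {}_{}={}".format(component, name, value)
            for name, value in settings]


case = "b.e20.B1850.f09_g17.pi_control.all.179.cplhist"
hist_dir = "/glade/p/cesm/bgcwg_dev/forcing/" + case + "/cpl/hist.mon"
commands = cplhist_xmlchange_commands("DROF", hist_dir, case, 1, 37, 37)
```

Printing `commands` one per line gives the six DROF settings used in the second test case; calling the helper with "DATM" gives the DATM settings.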
Test suite: scripts_regression_tests
Test baseline: did not compare against baselines
Test namelist changes: None
Test status: should be bfb

Fixes #1877 
Fixes #1407 

User interface changes?: added new xml variables discussed above
Update gh-pages html (Y/N)?: Y
Code review: Jim Edwards, Bill Sacks
jgfouca added a commit that referenced this issue Nov 7, 2017
…r (PR #1877)

Document time spent before timing library is initialized

The GPTL-based timing library cannot be used until after MPI_Init
is called. MPI_Init can be expensive in some cases and on some
systems, and capturing this cost in the performance timing
output would be useful. Here the Fortran system_clock (called
via shr_sys_irtc) is used to measure walltimes before the timing
library is initialized, and these times are recorded with the
GPTL-based performance data via the command t_startstop_valsf.

t_startstop_valsf is a new perf_mod command based on the GPTL
command GPTLstartstop_val, which has been backported (and
modified slightly and renamed GPTLstartstop_vals) into gptl.c
from a more recent version of Jim Rosinski's GPTL library.
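
The idea of measuring an interval with a plain clock and handing it to the timing library afterwards can be sketched in Python. Everything here is a toy stand-in: `TimerRegistry` and `start_stop_vals` are hypothetical names that only mimic the role of t_startstop_valsf, and `time.perf_counter` stands in for system_clock.

```python
import time


class TimerRegistry:
    """Toy timing library that cannot be used before initialize()."""

    def __init__(self):
        self.initialized = False
        self.timers = {}  # name -> (accumulated walltime, call count)

    def initialize(self):
        self.initialized = True

    def start_stop_vals(self, name, walltime, callcount=1):
        """Add an externally measured interval to a named timer."""
        if not self.initialized:
            raise RuntimeError("timing library not initialized")
        total, calls = self.timers.get(name, (0.0, 0))
        self.timers[name] = (total + walltime, calls + callcount)


registry = TimerRegistry()

t0 = time.perf_counter()
# ... expensive startup work happens here (MPI_Init in the real code) ...
t1 = time.perf_counter()

# Only now can the library be used, so the earlier interval is recorded
# after the fact rather than with a start/stop pair.
registry.initialize()
registry.start_stop_vals("pre_init", t1 - t0)
```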

Other minor clean-ups and augmentations are also included:

a) Eliminate adding negative time to timing events

On Cori-KNL MPI_WTIME does not always return monotonically increasing
values, leading to negative time increments being added to profile
timers (for fine grain timings). This is a rare event, and the timing
library does generate warnings when it occurs. However, it still adds
in the negative values. This modification sets negative increments to
zero, decreasing the impact of this erroneous MPI_WTIME behavior.
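
A minimal sketch of the clamping described in (a), assuming a hypothetical `accumulate` function rather than the actual gptl.c code:

```python
def accumulate(total, start, stop):
    """Add the interval stop - start to total, treating a negative
    interval (a non-monotonic clock reading) as zero."""
    delta = stop - start
    if delta < 0.0:
        # The clock went backwards; do not subtract time from the timer.
        delta = 0.0
    return total + delta
```

With this rule a backwards clock reading leaves the accumulated time unchanged instead of reducing it.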

b) Force output of checkpoint timing file after day 1

Performance of the first simulated day typically differs from that
of the other days. It is useful to capture this as
a checkpoint timing file. This also increases the odds of
capturing the cost of the initialization even when the
initialization is very expensive and the job exceeds the
wallclock limit and is killed before final timing data
is output.

c) Improve attribution of time spent in component_init_cc

Move calls to t_set_prefix/t_unset_prefix to include everything in
component_init_cc, and also add a few more timers internally, to
better capture where time is being spent.

d) Apply 'CPL:' prefix more consistently in cime_comp_mod.F90

The timer names in cime_comp_mod.F90 were also modified
to use the 'CPL:' prefix more consistently.

[BFB]

* origin/worleyph/cime/driver_instrumentation_update:
  Change callcount default for t_startstop_valsf
  Capture time spent before timing library is initialized.
  Improve attribution of time spent in component_init_cc
  Force output of checkpoint timing file after timestep 1
  Eliminate adding negative time to timing events
jgfouca added a commit that referenced this issue Feb 23, 2018
…r (PR #1877)
jgfouca added a commit that referenced this issue Mar 13, 2018
…r (PR #1877)