Commit
Implementation of multiple MCT drivers as an option for multi-instance simulations.
If multi-instance is enabled, N drivers are run, each with one instance.

Also (changes not directly related to multi-driver):
  Changed interface of check_lockedfiles (check_lockedfiles.py) to take a case instead of a caseroot.
  Use case.get_env instead of EnvBuild in check_lockedfiles.py
  Changed check_case (case_submit.py) to not take a caseroot input.
  Cleaned up memleak testing in _check_for_memleak (system_tests_common.py)
  Fixed bad format in build_xcpl_nml (buildnml.py)

Test suite: scripts_regression_tests.py
Test baseline: NA
Test namelist changes: NA
Test status: bit for bit

Fixes: #1704
Fixes: #1714

User interface changes?: new --multi-driver option to create_newcase and _C# modifier to tests

Update gh-pages html (Y/N)?: Y

Code review: @gold2718
Showing 41 changed files with 779 additions and 386 deletions.
@@ -1,95 +1,109 @@
.. _multi-instance:

**TODO: Need to update PE elements and explain + and - values**


Multi-instance component functionality
======================================

The CIME coupling infrastructure is capable of running multiple component instances under one model executable.
One caveat: If N multiple instances of any one active component are used, the same number of multiple instances of ALL active components are required.
More details are discussed below.

The primary motivation for this development was to be able to run an ensemble Kalman-Filter for data assimilation and parameter estimation (UQ, for example).
However, it also provides the ability to run a set of experiments within a single model executable where each instance can have a different namelist, and to have all the output go to one directory.

An F compset is used in the following example. Using the multiple-instance code involves the following steps:
The CIME coupling infrastructure is capable of running multiple
component instances (ensembles) under one model executable. There are
two modes of ensemble capability, single driver in which all component
instances are handled by a single driver/coupler component or
multi-driver in which each instance includes a separate driver/coupler
component. In the multi-driver mode the entire model is duplicated
for each instance while in the single driver mode only active
components need be duplicated. In most cases the multi-driver mode
will give better performance and should be used.

The primary motivation for this development was to be able to run an
ensemble Kalman-Filter for data assimilation and parameter estimation
(UQ, for example). However, it also provides the ability to run a set
of experiments within a single model executable where each instance
can have a different namelist, and to have all the output go to one
directory.

An F compset is used in the following example. Using the
multiple-instance code involves the following steps:

1. Create the case.
   ::

      > create_newcase --case Fmulti --compset F --res ne30_g16
      > create_newcase --case Fmulti --compset F2000_DEV --res f19_f19_mg17
      > cd Fmulti

2. Assume this is the out-of-the-box pe-layout:
2. Assume this is the out-of-the-box pe-layout:
   ::

      NTASKS(ATM)=128, NTHRDS(ATM)=1, ROOTPE(ATM)=0, NINST(ATM)=1
      NTASKS(LND)=128, NTHRDS(LND)=1, ROOTPE(LND)=0, NINST(LND)=1
      NTASKS(ICE)=128, NTHRDS(ICE)=1, ROOTPE(ICE)=0, NINST(ICE)=1
      NTASKS(OCN)=128, NTHRDS(OCN)=1, ROOTPE(OCN)=0, NINST(OCN)=1
      NTASKS(GLC)=128, NTHRDS(GLC)=1, ROOTPE(GLC)=0, NINST(GLC)=1
      NTASKS(WAV)=128, NTHRDS(WAV)=1, ROOTPE(WAV)=0, NINST(WAV)=1
      NTASKS(CPL)=128, NTHRDS(CPL)=1, ROOTPE(CPL)=0

The atm, lnd and rof are active components in this compset. The ocn is a prescribed data component, cice is a mixed prescribed/active component (ice-coverage is prescribed), and glc and wav are stub components.

Let's say we want to run two instances of CAM in this experiment.
We will also have to run two instances of CLM, CICE and RTM.
However, we can run either one or two instances of DOCN, and we can ignore glc and wav since they do not do anything in this compset as stub components.

To run two instances of CAM, CLM, CICE, RTM and DOCN, invoke the following commands in your **$CASEROOT** directory:
      Comp  NTASKS  NTHRDS  ROOTPE
      CPL :    144/     1;      0
      ATM :    144/     1;      0
      LND :    144/     1;      0
      ICE :    144/     1;      0
      OCN :    144/     1;      0
      ROF :    144/     1;      0
      GLC :    144/     1;      0
      WAV :    144/     1;      0
      ESP :      1/     1;      0

The atm, lnd, rof and glc are active components in this compset. The ocn is
a prescribed data component, cice is a mixed prescribed/active
component (ice-coverage is prescribed), and wav and esp are stub
components.

Let's say we want to run two instances of CAM in this experiment. We
will also have to run two instances of CLM, CICE, RTM and GLC. However, we
can run either one or two instances of DOCN, and we can ignore the
stub components since they do not do anything in this compset.

To run two instances of CAM, CLM, CICE, RTM, GLC and DOCN, invoke the following :ref:`xmlchange<modifying-an-xml-file>` commands in your **$CASEROOT** directory:
::

   > ./xmlchange NINST_ATM=2
   > ./xmlchange NINST_LND=2
   > ./xmlchange NINST_ICE=2
   > ./xmlchange NINST_ROF=2
   > ./xmlchange NINST_GLC=2
   > ./xmlchange NINST_OCN=2

As a result, you will have two instances of CAM, CLM and CICE (prescribed), RTM, and DOCN, each running concurrently on 64 MPI tasks.
As a result, you will have two instances of CAM, CLM and CICE (prescribed), RTM, GLC, and DOCN, each running concurrently on 72 MPI tasks and all using the same driver/coupler component. In this single driver/coupler mode the number of tasks for each component instance is NTASKS_COMPONENT/NINST_COMPONENT and the total number of tasks is the same as for the single instance case.

Now consider the multi-driver mode.
To use this mode, change
::

   > ./xmlchange MULTI_DRIVER=TRUE

**TODO: put in reference to xmlchange.**
This configuration will run each component instance on the original 144 tasks but will generate two copies of the model (in the same executable) for a total of 288 tasks.

3. Set up the case
   ::

      > ./case.setup

A new **user_nl_xxx_NNNN** file (where NNNN is the number of the component instances) is generated when **case.setup** is called.
A new **user_nl_xxx_NNNN** file is generated for each component instance when case.setup is called (where xxx is the component type and NNNN is the number of the component instance).
When calling **case.setup** with the **env_mach_pes.xml** file specifically, these files are created in **$CASEROOT**:
::

   user_nl_cam_0001, user_nl_cam_0002
   user_nl_cice_0001, user_nl_cice_0002
   user_nl_clm_0001, user_nl_clm_0002
   user_nl_rtm_0001, user_nl_rtm_0002
   user_nl_docn_0001, user_nl_docn_0002
   user_nl_cam_0001 user_nl_clm_0001 user_nl_docn_0001 user_nl_cice_0001
   user_nl_cism_0001 user_nl_mosart_0001
   user_nl_cam_0002 user_nl_clm_0002 user_nl_docn_0002 user_nl_cice_0002
   user_nl_cism_0002 user_nl_mosart_0002
   user_nl_cpl

Also, **case.setup** creates the following ``*_in_*`` files and ``*txt*`` files in **$CASEROOT/CaseDocs**:
::

   atm_in_0001, atm_in_0002
   docn.streams.txt.prescribed_0001, docn.streams.txt.prescribed_0002
   docn_in_0001, docn_in_0002
   docn_ocn_in_0001, docn_ocn_in_0002
   drv_flds_in, drv_in
   ice_in_0001, ice_in_0002
   lnd_in_0001, lnd_in_0002
   rof_in_0001, rof_in_0002

The namelist for each component instance can be modified by changing the corresponding **user_nl_xxx_NNNN** file.
Modifying **user_nl_cam_0002** will result in your namelist changes being active ONLY for the second instance of CAM.
The namelist for each component instance can be modified by changing the corresponding **user_nl_xxx_NNNN** file.
Modifying **user_nl_cam_0002** will result in your namelist changes being active ONLY for the second instance of CAM.
To change the DOCN stream txt file for instance 0002, copy **docn.streams.txt.prescribed_0002** to your **$CASEROOT** directory with the name **user_docn.streams.txt.prescribed_0002** and modify it accordingly.

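The per-instance naming pattern described in step 3 can be illustrated with a short sketch. This is not CIME code: the helper name ``user_nl_filenames`` is hypothetical, and the sketch only reproduces the ``user_nl_xxx_NNNN`` pattern listed in the text for the multi-instance case. It does not cover files without an instance suffix, such as ``user_nl_cpl``.

```python
def user_nl_filenames(components, ninst):
    """Build per-instance namelist file names such as user_nl_cam_0001.

    Hypothetical helper for illustration only; it mirrors the
    user_nl_xxx_NNNN pattern shown in the documentation text.
    """
    if ninst < 1:
        raise ValueError("ninst must be a positive integer")
    return [
        "user_nl_{}_{:04d}".format(comp, inst)
        for inst in range(1, ninst + 1)   # instance numbers start at 1
        for comp in components
    ]


# Component names taken from the file listing in the text above.
names = user_nl_filenames(["cam", "clm", "docn", "cice", "cism", "mosart"], 2)
for name in names:
    print(name)
```

Each instance number is zero-padded to four digits, so the second CAM instance maps to ``user_nl_cam_0002``, the file the text says to edit for changes that apply only to that instance.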
Also keep these important points in mind:

#. Note that these changes can be made at create_newcase time with the option --ninst #, where # is a positive integer. Use the additional logical option --multi-driver to invoke the multi-driver mode.

#. **Multiple component instances can differ ONLY in namelist settings; they ALL use the same model executable.**

#. Multiple-instance implementation supports only one coupler component.
#. Calling **case.setup** with ``--clean`` *DOES NOT* remove the **user_nl_xxx_NNNN** (where xxx is the component name) files created by **case.setup**.

#. A special variable NINST_LAYOUT is provided for some experimental compsets; its value should be
   'concurrent' for all but a few special cases, and it cannot be used if MULTI_DRIVER=TRUE.

#. In **create_test** these options can be invoked with testname modifiers _N# for the single driver mode and _C# for the multi-driver mode. These options are mutually exclusive; they cannot be combined.

#. Calling **case.setup** with ``--clean`` *DOES NOT* remove the **user_nl_xxx_NNNN** files created by **case.setup**.
#. In create_newcase you may use --ninst # to set the number of instances and --multi-driver for multi-driver mode.

#. Multiple instances generally should run concurrently, which is the default setting in **env_mach_pes.xml**.
   The serial setting is only for EXPERT USERS in upcoming development code implementations.
#. In multi-driver mode you will always get 1 instance of each component for each driver/coupler. If you change a case using xmlchange MULTI_DRIVER=TRUE, you will get a number of driver/couplers equal to the maximum NINST value over all components.
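The task arithmetic for the two modes can be checked with a small calculation. The following is a minimal sketch, not part of CIME: the function names are hypothetical, and it assumes the simple layout used in the example, where every component shares the same NTASKS with ROOTPE=0 and NTASKS divides evenly by NINST.

```python
def single_driver_tasks(ntasks, ninst):
    """Single driver mode: instances split the component's tasks.

    Returns (tasks_per_instance, total_tasks). Assumes ntasks divides
    evenly by ninst, which is a simplification made for this sketch.
    """
    if ninst < 1 or ntasks % ninst != 0:
        raise ValueError("sketch assumes ninst >= 1 and ntasks divisible by ninst")
    return ntasks // ninst, ntasks


def multi_driver_tasks(ntasks, ninst_by_component):
    """Multi-driver mode: one full copy of the model per driver/coupler.

    The number of drivers is the maximum NINST over all components, as
    stated in the text. Returns (number_of_drivers, total_tasks).
    """
    n_drivers = max(ninst_by_component.values())
    return n_drivers, ntasks * n_drivers


# Values from the documentation example: 144 tasks, two instances.
ninst = {"ATM": 2, "LND": 2, "ICE": 2, "ROF": 2, "GLC": 2, "OCN": 2}
single = single_driver_tasks(144, 2)
multi = multi_driver_tasks(144, ninst)
print("single driver:", single)
print("multi driver:", multi)
```

With the example values, the single driver case gives 72 tasks per instance on an unchanged total of 144, and the multi-driver case gives two drivers on a total of 288 tasks, matching the numbers quoted in the text.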
@@ -0,0 +1,34 @@
"""
Implementation of CIME MCC test: Compares ensemble methods
This does two runs: In the first we run a three member ensemble using the
MULTI_DRIVER capability, then we run a second single instance case and compare
"""
from CIME.XML.standard_module_setup import *
from CIME.SystemTests.system_tests_compare_two import SystemTestsCompareTwo
from CIME.case_setup import case_setup

logger = logging.getLogger(__name__)


class MCC(SystemTestsCompareTwo):

    def __init__(self, case):
        self._comp_classes = []
        self._test_instances = 3
        SystemTestsCompareTwo.__init__(self, case,
                                       separate_builds = True,
                                       run_two_suffix = 'single_instance',
                                       run_two_description = 'single instance',
                                       run_one_description = 'multi driver')

    def _case_one_setup(self):
        # The multicoupler case will increase the number of tasks by the
        # number of requested couplers.
        self._case.set_value("MULTI_DRIVER",True)
        self._case.set_value("NINST", self._test_instances)
        case_setup(self._case, test_mode=False, reset=True)

    def _case_two_setup(self):
        self._case.set_value("NINST", 1)
        case_setup(self._case, test_mode=True, reset=True)