Merge pull request #1636 from ESMCI/mvertens/new_component_matches
New match attribute that can be 'last' or 'first' for values match in component.py

Currently there is confusion as to how matches are found for multiple
<value> elements in a <values> node.

    component.py currently uses a matching algorithm that picks the
    last match when multiple matches are found. This matching
    algorithm is used any time a Component object is instantiated
    (currently this occurs for config_component.xml). If the match
    attribute DOES NOT appear, the last match is used by default,
    which keeps things backwards compatible.

    namelist_definition_<component>.xml uses the entry_id.py
    matching algorithm which picks the first match in case of
    multiple matches being found. So for setting namelists the first
    match is picked.

This PR adds a new, optional match attribute to the <values> element of an <entry> in
EITHER a config_component.xml, config_compset.xml or namelist_definition_<component>.xml file.

<entry id="<name>">
   <values match="last"> will pick the last best match
   <values match="first"> will pick the first best match
      <value>...</value>
      <value>...</value>
   </values>
</entry>
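
As a rough illustration of how the attribute could be read, here is a minimal sketch using the standard-library xml.etree.ElementTree rather than CIME's own XML classes. The entry id EXAMPLE_MODE and the helper read_match_type are hypothetical names chosen for this example; only the match attribute and its two legal values come from this PR.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry, loosely modeled on the DATM_MODE block touched by this PR.
ENTRY_XML = """
<entry id="EXAMPLE_MODE">
  <values match="first">
    <value compset="%NYF">CORE2_NYF</value>
    <value compset="%IAF">CORE2_IAF</value>
  </values>
</entry>
"""


def read_match_type(entry, default):
    """Return the match attribute of the entry's <values> child, or default.

    Hypothetical helper: the caller supplies the default, mirroring the
    different defaults used by component.py and entry_id.py.
    """
    values = entry.find("values")
    if values is None:
        return default
    match_type = values.get("match", default)
    if match_type not in ("first", "last"):
        raise ValueError(
            "match attribute can only have a value of 'last' or 'first', "
            "value is %s" % match_type
        )
    return match_type


entry = ET.fromstring(ENTRY_XML)
print(read_match_type(entry, default="last"))
```

Passing the default in as an argument keeps one helper usable for both callers, since the two modules only disagree on what to assume when the attribute is absent.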

As a result, how matches are determined in component.py is now more
flexible and transparent: the match attribute can be 'first' or
'last'. Making this explicit keeps developers from wrongly assuming
a 'first' or 'last' match.
This capability has been added to the _get_value_match routine in BOTH
entry_id.py AND component.py. However, the default values differ:

    the default "match" value in entry_id.py is "first"
    the default "match" value in component.py is "last"
    Having these default values differ preserves backwards compatibility when the
    "match" attribute is not there. Going forward, it would be good to always
    have a "match" attribute.
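
The difference between the two settings only shows up when several <value> nodes tie for the best score. The following is a minimal standalone sketch of that tie-breaking, not the code from either module; pick_best and the (score, text) pairs are hypothetical stand-ins for the scored matches that _get_value_match builds.

```python
def pick_best(scored_values, match_type):
    """Pick the text of the best-scoring candidate.

    scored_values is a list of (score, text) pairs in document order.
    With "last" a tie replaces the current best (>=); with "first" only a
    strictly better score does (>).
    """
    if match_type not in ("first", "last"):
        raise ValueError(
            "match attribute can only have a value of 'last' or 'first'"
        )
    best_score = -1
    best_text = None
    for score, text in scored_values:
        if match_type == "last":
            is_better = score >= best_score
        else:
            is_better = score > best_score
        if is_better:
            best_score = score
            best_text = text
    return best_text


# Two candidates tie on the highest score of 2.
candidates = [(1, "a"), (2, "b"), (2, "c"), (1, "d")]
print(pick_best(candidates, "first"))  # b
print(pick_best(candidates, "last"))   # c
```

When no two candidates tie on the top score, both settings return the same value, which is why the differing defaults only matter for entries with overlapping matches.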

The new match="last" attribute has been added to the config_component.xml
files of all of the data components, and to config_component_cesm.xml
and config_component_acme.xml.

Test suite: scripts_regression_tests; also verified that running
the prealpha and prebeta tests on cheyenne, with namelist
comparisons only, resulted in namelists identical to those of
cesm2_0_alpha06m
Test baseline: cesm2_0_alpha06m for cesm
Test namelist changes: None
Test status: bit for bit

Fixes ESMCI/CIME issue 1617

User interface changes?: New optional match attribute on <values> elements, which are children of <entry> nodes.

Code review: gold2718
goldy authored Jun 5, 2017
2 parents 3b238a8 + 726b4ee commit 0d1bbd8
Showing 13 changed files with 226 additions and 165 deletions.
2 changes: 2 additions & 0 deletions config/xml_schemas/config_compsets.xsd
@@ -5,6 +5,7 @@
<xs:attribute name="id" type="xs:NCName"/>
<xs:attribute name="compset" type="xs:string"/>
<xs:attribute name="grid" type="xs:string"/>
<xs:attribute name="match" type="xs:string"/>

<!-- simple elements -->
<xs:element name="help" type="xs:string"/>
@@ -59,6 +60,7 @@
<xs:sequence>
<xs:element maxOccurs="unbounded" ref="value"/>
</xs:sequence>
<xs:attribute ref="match"/>
</xs:complexType>
</xs:element>

34 changes: 28 additions & 6 deletions scripts/lib/CIME/XML/component.py
@@ -21,7 +21,6 @@ def __init__(self, infile):
cimeroot = get_cime_root()
if cimeroot in os.path.abspath(infile):
schema = files.get_schema("CONFIG_CPL_FILE")

EntryID.__init__(self, infile, schema=schema)

#pylint: disable=arguments-differ
@@ -41,6 +40,11 @@ def get_valid_model_components(self):
return components

def _get_value_match(self, node, attributes=None, exact_match=False):
"""
return the best match for the node <values> entries
Note that a component object uses a different matching algorithm than an entryid object
For a component object the _get_value_match used is below and is not the one in entry_id.py
"""
match_value = None
match_max = 0
match_count = 0
@@ -50,6 +54,11 @@ def _get_value_match(self, node, attributes=None, exact_match=False):
values = self.get_optional_node("values", root=node)
if values is None:
return

# determine match_type if there is a tie
# ASSUME a default of "last" if "match" attribute is not there
match_type = values.get("match", default="last")

# use the default_value if present
val_node = self.get_optional_node("default_value", root=node)
if val_node is None:
@@ -58,6 +67,7 @@ def _get_value_match(self, node, attributes=None, exact_match=False):
value = val_node.text
if value is not None and len(value) > 0 and value != "UNSET":
match_values.append(value)

for valnode in self.get_nodes("value", root=node):
# loop through all the keys in valnode (value nodes) attributes
for key,value in valnode.attrib.iteritems():
@@ -70,6 +80,8 @@ def _get_value_match(self, node, attributes=None, exact_match=False):
else:
match_count = 0
break

# a match is found
if match_count > 0:
# append the current result
if values.get("modifier") == "additive":
@@ -82,11 +94,21 @@ def _get_value_match(self, node, attributes=None, exact_match=False):
del match_values[:]
match_values.append(valnode.text)

# take the *last* best match
elif match_count >= match_max:
del match_values[:]
match_max = match_count
match_value = valnode.text
else:
if match_type == "last":
# take the *last* best match
if match_count >= match_max:
del match_values[:]
match_max = match_count
match_value = valnode.text
elif match_type == "first":
# take the *first* best match
if match_count > match_max:
del match_values[:]
match_max = match_count
match_value = valnode.text
else:
expect(False, "match attribute can only have a value of 'last' or 'first'")

if len(match_values) > 0:
match_value = " ".join(match_values)
28 changes: 26 additions & 2 deletions scripts/lib/CIME/XML/entry_id.py
@@ -67,6 +67,15 @@ def _get_value_match(self, node, attributes=None, exact_match=False):
'''
Note that the component class has a specific version of this function
'''
# if there is a <values> element - check to see if there is a match attribute
# if there is NOT a match attribute, then set the default to "first"
# this is different than the component class _get_value_match where the default is "last"
values_node = self.get_optional_node("values", root=node)
if values_node is not None:
match_type = values_node.get("match", default="first")
else:
match_type = "first"

# Store nodes that match the attributes and their scores.
matches = []
nodes = self.get_nodes("value", root=node)
@@ -97,8 +106,23 @@ def _get_value_match(self, node, attributes=None, exact_match=False):
if not matches:
return None

# Get maximum score using custom `key` function, extract the node.
_, mnode = max(matches, key=lambda x: x[0])
# Get maximum score using either a "last" or "first" match in case of a tie
max_score = -1
mnode = None
for score,node in matches:
if match_type == "last":
# take the *last* best match
if score >= max_score:
max_score = score
mnode = node
elif match_type == "first":
# take the *first* best match
if score > max_score:
max_score = score
mnode = node
else:
expect(False,
"match attribute can only have a value of 'last' or 'first', value is %s" %match_type)

return mnode.text

92 changes: 47 additions & 45 deletions src/components/data_comps/datm/cime_config/config_component.xml
@@ -4,6 +4,35 @@

<entry_id>

<!-- NOTE that the description block determines what DATM% values can appear in the compset name
For DATM this is determined by the DATM_MODE values in combination with the TIME prefix -->
<description>
<desc compset="^1850_DATM%QIA">QIAN atm input data for 1948-1972:</desc>
<desc compset="^2000_DATM%WISOQIA">QIAN atm input data with water isotopes for 2000-2004:</desc>
<desc compset="^2000_DATM%QIA">QIAN atm input data for 1972-2004:</desc>
<desc compset="^2003_DATM%QIA">QIAN atm input data for 2002-2003:</desc>
<desc compset="^HIST_DATM%QIA">QIAN atm input data for 1948-1972:</desc>
<desc compset="^20TR_DATM%QIA">QIAN atm input data for 1948-1972:</desc>
<desc compset="^4804_DATM%QIA">QIAN atm input data for 1948-2004:</desc>
<desc compset="^RCP[2468]_DATM%QIA">QIAN atm input data for 1972-2004:</desc>
<desc compset="^1850_DATM%CRU">CRUNCEP atm input data for 1901-1920:</desc>
<desc compset="^2000_DATM%CRU">CRUNCEP atm input data for 1991-2010:</desc>
<desc compset="^2003_DATM%CRU">CRUNCEP atm input data for 2002-2003:</desc>
<desc compset="^HIST_DATM%CRU">CRUNCEP atm input data for 1901-1920:</desc>
<desc compset="^20TR_DATM%CRU">CRUNCEP atm input data for 1901-1920:</desc>
<desc compset="^RCP[2468]_DATM%CRU">CRUNCEP atm input data for 1991-2010:</desc>
<desc compset="^1850_DATM%GSW">GSWP3 atm input data for 1901-1920:</desc>
<desc compset="^2000_DATM%GSW">GSWP3 atm input data for 1991-2010:</desc>
<desc compset="^2003_DATM%GSW">GSWP3 atm input data for 2002-2003:</desc>
<desc compset="^HIST_DATM%GSW">GSWP3 atm input data for 1901-1920:</desc>
<desc compset="^20TR_DATM%GSW">GSWP3 atm input data for 1901-1920:</desc>
<desc compset="^RCP[2468]_DATM%GSW">GSWP3 atm input data for 1991-2010:</desc>
<desc compset="^1850_DATM%PCLHIST">CPL history input data:</desc>
<desc compset="^2000_DATM%1PT">single point tower site atm input data:</desc>
<desc compset="_DATM%NYF">COREv2 datm normal year forcing: (requires additional user-supplied data)</desc>
<desc compset="_DATM%IAF">COREv2 datm interannual year forcing: (requires additional user-supplied data)</desc>
</description>

<entry id="COMP_ATM">
<type>char</type>
<valid_values>datm</valid_values>
@@ -22,7 +51,7 @@
<desc>Mode for data atmosphere component.
CORE2_NYF (CORE2 normal year forcing) is the DATM mode used in C and G compsets.
CLM_QIAN, CLMCRUNCEP, CLMGSWP3 and CLM1PT are modes using observational data for I compsets.</desc>
<values>
<values match="last">
<value compset="%NYF">CORE2_NYF</value>
<value compset="%IAF">CORE2_IAF</value>
<value compset="%WISOQIA">CLM_QIAN_WISO</value>
@@ -38,7 +67,7 @@
<type>char</type>
<valid_values>none,clim_1850,clim_2000,trans_1850-2000,rcp2.6,rcp4.5,rcp6.0,rcp8.5,cplhist</valid_values>
<default_value>none</default_value>
<values>
<values match="last">
<value compset="^1850_">clim_1850</value>
<value compset="^2000_">clim_2000</value>
<value compset="^2003_">clim_2000</value>
@@ -61,14 +90,14 @@
</entry>

<entry id="DATM_TOPO">
<type>char</type>
<valid_values>none,observed</valid_values>
<default_value>observed</default_value>
<values>
<!-- Only needed for compsets with active land; for other compsets, turn it off -->
<value compset="_SLND">none</value>
<value compset="_DLND">none</value>
</values>
<type>char</type>
<valid_values>none,observed</valid_values>
<default_value>observed</default_value>
<values match="last">
<!-- Only needed for compsets with active land; for other compsets, turn it off -->
<value compset="_SLND">none</value>
<value compset="_DLND">none</value>
</values>
<group>run_component_datm</group>
<file>env_run.xml</file>
<desc>DATM surface topography forcing</desc>
@@ -78,7 +107,7 @@
<type>char</type>
<valid_values>none,20tr,rcp2.6,rcp4.5,rcp6.0,rcp8.5</valid_values>
<default_value>none</default_value>
<values>
<values match="last">
<value compset="^RCP8">rcp8.5</value>
<value compset="^RCP6">rcp6.0</value>
<value compset="^RCP4">rcp4.5</value>
@@ -104,7 +133,7 @@
<type>char</type>
<valid_values></valid_values>
<default_value>UNSET</default_value>
<values>
<values match="last">
<value compset="1850_DATM%CPLHIST">b40.1850.track1.1deg.006a</value>
</values>
<group>run_component_datm</group>
@@ -116,7 +145,7 @@
<type>integer</type>
<valid_values></valid_values>
<default_value>1</default_value>
<values>
<values match="last">
<value compset="1850_DATM%PCLHIST">1</value>
</values>
<group>run_component_datm</group>
@@ -128,7 +157,7 @@
<type>integer</type>
<valid_values></valid_values>
<default_value>-999</default_value>
<values>
<values match="last">
<value compset="1850_DATM%PCLHIST">960</value>
</values>
<group>run_component_datm</group>
@@ -140,7 +169,7 @@
<type>integer</type>
<valid_values></valid_values>
<default_value>-999</default_value>
<values>
<values match="last">
<value compset="1850_DATM%PCLHIST">1030</value>
</values>
<group>run_component_datm</group>
@@ -152,7 +181,7 @@
<type>integer</type>
<valid_values></valid_values>
<default_value>1</default_value>
<values>
<values match="last">
<value compset="2000.*_DATM%1PT">1</value>
<value compset="1850.*_DATM%QIA">1</value>
<value compset="1850.*_DATM%CRU">1</value>
@@ -185,7 +214,7 @@
<type>integer</type>
<valid_values></valid_values>
<default_value>2004</default_value>
<values>
<values match="last">
<value compset="2000.*_DATM%1PT">1972</value>
<value compset="1850.*_DATM%QIA">1948</value>
<value compset="1850.*_DATM%CRU">1901</value>
@@ -219,7 +248,7 @@
<type>integer</type>
<valid_values></valid_values>
<default_value>2004</default_value>
<values>
<values match="last">
<value compset="2000.*_DATM%1PT">2004</value>
<value compset="1850.*_DATM%QIA">1972</value>
<value compset="1850.*_DATM%CRU">1920</value>
@@ -249,33 +278,6 @@
<desc>ending year to loop data over</desc>
</entry>

<description>
<desc compset="^1850_DATM%QIA" >QIAN atm input data for 1948-1972:</desc>
<desc compset="^2000_DATM%WISOQIA" >QIAN atm input data with water isotopes for 2000-2004:</desc>
<desc compset="^2000_DATM%QIA" >QIAN atm input data for 1972-2004:</desc>
<desc compset="^2003_DATM%QIA" >QIAN atm input data for 2002-2003:</desc>
<desc compset="^HIST_DATM%QIA" >QIAN atm input data for 1948-1972:</desc>
<desc compset="^20TR_DATM%QIA" >QIAN atm input data for 1948-1972:</desc>
<desc compset="^4804_DATM%QIA" >QIAN atm input data for 1948-2004:</desc>
<desc compset="^RCP[2468]_DATM%QIA" >QIAN atm input data for 1972-2004:</desc>
<desc compset="^1850_DATM%CRU" >CRUNCEP atm input data for 1901-1920:</desc>
<desc compset="^2000_DATM%CRU" >CRUNCEP atm input data for 1991-2010:</desc>
<desc compset="^2003_DATM%CRU" >CRUNCEP atm input data for 2002-2003:</desc>
<desc compset="^HIST_DATM%CRU" >CRUNCEP atm input data for 1901-1920:</desc>
<desc compset="^20TR_DATM%CRU" >CRUNCEP atm input data for 1901-1920:</desc>
<desc compset="^RCP[2468]_DATM%CRU" >CRUNCEP atm input data for 1991-2010:</desc>
<desc compset="^1850_DATM%GSW" >GSWP3 atm input data for 1901-1920:</desc>
<desc compset="^2000_DATM%GSW" >GSWP3 atm input data for 1991-2010:</desc>
<desc compset="^2003_DATM%GSW" >GSWP3 atm input data for 2002-2003:</desc>
<desc compset="^HIST_DATM%GSW" >GSWP3 atm input data for 1901-1920:</desc>
<desc compset="^20TR_DATM%GSW" >GSWP3 atm input data for 1901-1920:</desc>
<desc compset="^RCP[2468]_DATM%GSW" >GSWP3 atm input data for 1991-2010:</desc>
<desc compset="^1850_DATM%PCLHIST" >CPL history input data:</desc>
<desc compset="^2000_DATM%1PT" >single point tower site atm input data:</desc>
<desc compset="_DATM%NYF" >COREv2 datm normal year forcing: (requires additional user-supplied data)</desc>
<desc compset="_DATM%IAF" >COREv2 datm interannual year forcing: (requires additional user-supplied data)</desc>
</description>

<help>
=========================================
DATM naming conventions in compset name
12 changes: 8 additions & 4 deletions src/components/data_comps/desp/cime_config/config_component.xml
@@ -4,6 +4,13 @@

<entry_id>

<!-- NOTE that the description block determines what DESP% values can appear in the compset name
For DESP this is determined by the DESP_MODE values -->
<description>
<desc compset="_DESP%NOOP">no modification of any model data</desc>
<desc compset="_DESP%TEST">test modification of any model data</desc>
</description>

<entry id="COMP_ESP">
<type>char</type>
<valid_values>desp</valid_values>
@@ -21,15 +28,12 @@
<file>env_run.xml</file>
<desc>Mode for external system processing component.
The default is NOOP, do not modify any model data.</desc>
<values>
<values match="last">
<value compset="%NOOP" >NOCHANGE</value>
<value compset="%TEST" >DATATEST</value>
</values>
</entry>

<description>
</description>

<help>
=========================================
DESP naming conventions in compset name
18 changes: 10 additions & 8 deletions src/components/data_comps/dice/cime_config/config_component.xml
@@ -4,6 +4,15 @@

<entry_id>

<!-- NOTE that the description block determines what DICE% values can appear in the compset name
For DICE this is determined by the DICE_MODE values -->
<description>
<desc compset="DICE%SSMI">dice mode is ssmi:</desc>
<desc compset="DICE%SIAF">dice mode is ssmi_iaf:</desc>
<desc compset="DICE%PRES">dice mode is prescribed:</desc>
<desc compset="DICE%NULL">dice mode is null:</desc>
</description>

<entry id="COMP_ICE">
<type>char</type>
<valid_values>dice</valid_values>
@@ -17,7 +26,7 @@
<type>char</type>
<valid_values>prescribed,ssmi,ssmi_iaf,copyall,null</valid_values>
<default_value>ssmi</default_value>
<values>
<values match="last">
<value compset="DICE%SSMI">ssmi</value>
<value compset="DICE%SIAF">ssmi_iaf</value>
<value compset="DICE%PRES">prescribed</value>
@@ -43,13 +52,6 @@
same observational data sets and are consistent with each other.</desc>
</entry>

<description>
<desc compset="DICE%SSMI">dice mode is ssmi:</desc>
<desc compset="DICE%SIAF">dice mode is ssmi_iaf:</desc>
<desc compset="DICE%PRES">dice mode is prescribed:</desc>
<desc compset="DICE%NULL">dice mode is null:</desc>
</description>

<help>
=========================================
DICE naming conventions