Update CIME to 5.3.0alpha29 #1643

Closed
wants to merge 871 commits into from

jgfouca
Member

@jgfouca jgfouca commented Jul 14, 2017

New user interface features:
option --skip-preview-namelist to case.submit to skip calling preview_namelist during case.run
option -M MAIL_USER, --mail-user MAIL_USER to case.submit to set an email for batch system notification
option -m {never,all,begin,end,fail}, --mail-type {never,all,begin,end,fail} to case.submit for when the batch system should send email.
manage_case renamed to the more accurate query_config; usage/arguments made simpler and output more informative.
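
A minimal sketch of how the three new case.submit options listed above could be declared with argparse. The option names and the mail-type choices come from the list; the parser function, help strings and example address are assumptions for illustration, not CIME's actual code.

```python
import argparse


def build_submit_parser():
    """Hypothetical parser mirroring the new case.submit options."""
    parser = argparse.ArgumentParser(prog="case.submit")
    parser.add_argument(
        "--skip-preview-namelist",
        action="store_true",
        help="Skip calling preview_namelist during case.run",
    )
    parser.add_argument(
        "-M", "--mail-user",
        help="Email address for batch system notification",
    )
    parser.add_argument(
        "-m", "--mail-type",
        choices=("never", "all", "begin", "end", "fail"),
        help="When the batch system should send email",
    )
    return parser


# Example invocation with an illustrative address.
args = build_submit_parser().parse_args(
    ["--skip-preview-namelist", "-M", "user@example.com", "-m", "fail"]
)
```

With no arguments, all three options fall back to argparse defaults (False, None, None), so existing submit commands would keep working unchanged in this sketch.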

User interface bug fixes:
xmlquery options --file and --listall now work together.
xmlquery options clarified and help message improved.
Improve error message for passing incorrect grid alias to create_newcase.

Removed user features:
Remove long term archiving support. "case.lt_archive" no longer in caseroot.

Other changes to case control system:
Improve ERR test.
New ERS2 test.
New case.test --reset option to manually reset a test to initial conditions.
TOTAL_CORES was inaccurate, so remove it.
Only create rest directories if they are going to be populated.
get_timing.py should use the largest ncpl value, not assume it is the atm value.
Improve comment for BASELINE status in TestStatus
Better handling of fails in the build (don't print a python stacktrace)
Add a new directory testmod that creates cpl history files on a daily basis and thereby permits answer changes to be compared more easily.
Add ability for create_test to infer non-default machine from testname
Change all python2 style strings to python3 style strings
Improve handling of hist comparisons in TestStatus.log
Downscale build parallelism on small machines (so we don't overload them)
Bring bless_test_results behavior into an exact match with create_test -g

Data model changes:
Use %CPLHIST for DATM coupler hist forcing
Add support for data model to read multiple time slices at once
New aquaplanet capability: adds aquaplanet capability to DOCN where the input is from a file rather than analytic; adds new A compsets to the driver config_compsets.xml that have different DOCN functionality:
    ADSOM - DOCN SOM
    ADSOMAQP - DOCN aquaplanet SOM
    ADAQP3 - DOCN analytic aquaplanet (mode 3)
    ADAQFILE - DOCN aquaplanet from file

Coupler/driver changes:
Map SMB from lnd to glc using bilinear mapping with a conservation correction

Other:
User's guide rst files now in cime/doc directory

Fixes #1424
Fixes #1426
Fixes #1573
Fixes #1576

jgfouca and others added 30 commits June 1, 2017 14:33
This will bring bless_test_results behavior into an exact match
with create_test -g.
merge of bjsmith changes with ESMCI/cime master
…include_logs

hist_utils.generate_baseline should also try to copy cpl.log file

This will bring bless_test_results behavior into an exact match
with create_test -g.

Test suite: scripts_regression_tests --fast
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #1627

User interface changes?: None

Code review: @jedwards4b (optional)
…/acme_merge_06_01_2017

* jgfouca/cime53010-with-acmesplit-06012017: (202 commits)
  Update error msg based on github feedback
  Improve error from check_input_data when cannot connect to SVN repo
  Remove duplicate testmods
  Remove submodule hacks from CIME
  Make jenkins submodule handling more robust
  Fix expression evaluation for batch submit args
  In ACME provenance code, handle failure to retrieve job_id
  Set up chama in cime
  fates-interface: testmod for fates
  fates-interface: added machine configuration and tests to cime
  Allocate send/recv buffs even if nprocs eq 0
  Allocate send/recv buffs always when useswapm
  Forgot to put MPICH_CPUMASK_DISPLAY back in there for edison.
  Revert my previous change to clean up the XML. Go back to using ifort/icc/icpc for SFC (serial compilers) -- this change was causing certain OCN tests to fail. And in an effort to keep this PR minimal, revert change to clean up HAVE_SLASHPROC and Catamount to be like original.
  Adds test mods for RTM=NULL
  Re-enable CPUMASK DISPLAY env for edison. Clean up -DHAVE_SLASHPROC for NERSC machines, as we should be able to set it once in the OS=CNL section
  Add fix to bug in docn that was in this cime version
  Remove  -DCMAKE_SYSTEM_NAME=Catamount This is very old name for Cray OS -- it's not CLE (or CNL)
  Remove MPIFC/SFC settings for specific compiler entries at NERSC. As NERSC machines currently use the setting for OS=CNL, this entry already sets those. This cleans up XML lines, but also changes the serial compiler for Intel to be ftn instead of ifort, which may be preferable anyway.
  Only turn on MPICH_CPUMASK_DISPLAY when DEBUG=TRUE. This option can write a lot of data to the log files.
  ...
Since the directory will vary depending on user etc.
… component object rather than entryid object in compsets.py
Fixes for format errors introduced in ESMCI/CIME issue 1608

This PR also fixes a critical error for ERP tests that was recently discovered in CESM testing.

Test suite: scripts_regression_tests
Test baseline: NA
Test namelist changes: NA
Test status: bit for bit
Fixes ESMCI/CIME issue 1638
User interface changes?: None
Code review: gold2718
merge bjsmith changes with ESMCI/CIME master
Move test PRE.f19_f19_mg16.ADESP_TEST from cime_config to mct driver
testlist.

Test suite: scripts_regression_tests
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes

User interface changes?:

Code review:
New match attribute that can be 'last' or 'first' for values match in component.py

Currently there is confusion as to how matches are found for multiple
<value> elements in a <values> node.

    component.py is currently using a matching algorithm that picks the
    last match in case of multiple matches that are found. This
    matching algorithm is used anytime a Component object is
    instantiated (currently occurs in config_component.xml). By default
    if the match attribute DOES NOT appear, then the last match will
    be used, to make things backwards compatible.

    namelist_definition_<component>.xml uses the entry_id.py
    matching algorithm which picks the first match in case of
    multiple matches being found. So for setting namelists the first
    match is picked.

This PR adds a new, optional attribute to the <values> element of an <entry> in
EITHER a config_component.xml, config_compset.xml or namelist_definition_<component>.xml file.

<entry id="<name>">
   <values match="last">      (or match="first")
      <value>...</value>
      <value>...</value>
   </values>
</entry>

match="last" will pick the last best match
match="first" will pick the first best match

As a result, there is new flexibility and transparency in how matches
are determined in component.py, thanks to a match attribute that can
be 'first' or 'last'. Making this explicit keeps developers from
wrongly assuming a 'first' or 'last' match.
This capability has been added to the _get_value_match routine in BOTH
entry_id.py AND component.py. However, the default values differ:

    the default "match" value in entry_id.py is "first"
    the default "match" value in component.py is "last"
    Having these default values differ preserves backwards compatibility when the
    "match" attribute is not there. Moving forward, it would be good to always
    have a "match" attribute.

The new match="last" attribute has been added to the config_component.xml
of all of the data components and to config_component_cesm.xml
and config_component_acme.xml.
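
The first/last behavior described in this commit message can be sketched as follows. This is a simplified illustration, not the actual _get_value_match code in entry_id.py or component.py: the function name, the (attributes, value) pair layout, the exact-equality comparison and the scoring by number of matched attributes are all assumptions made for the example.

```python
def get_value_match(candidates, attributes, match="last"):
    """Return the value of the best-scoring candidate, or None.

    candidates: list of (required_attributes, value) pairs in file order.
    attributes: dict describing the case being matched.
    match: "first" or "last"; only used to break ties between
           equally good matches.
    """
    if match not in ("first", "last"):
        raise ValueError("match must be 'first' or 'last'")

    best_score = -1
    best_value = None
    for required, value in candidates:
        # A candidate applies only if every attribute it names agrees.
        if any(attributes.get(k) != v for k, v in required.items()):
            continue
        score = len(required)
        # A strictly better match always wins; on a tie, "last" lets
        # the later candidate replace the earlier one.
        if score > best_score or (score == best_score and match == "last"):
            best_score = score
            best_value = value
    return best_value


# Hypothetical <value> elements: one default and two equally good matches.
candidates = [
    ({}, "default"),
    ({"compset": "DATM"}, "a"),
    ({"compset": "DATM"}, "b"),
]
case = {"compset": "DATM"}
first = get_value_match(candidates, case, match="first")
last = get_value_match(candidates, case, match="last")
```

Here `first` is "a" and `last` is "b": both candidates match equally well, so only the match setting decides between them, which is why differing defaults in the two modules can produce different results for the same XML.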

Test suite: scripts_regression_tests; also verified that
running the prealpha and prebeta tests on
cheyenne, with just namelist comparisons, resulted in identical
namelists when compared to cesm2_0_alpha06m
Test baseline: cesm2_0_alpha06m for cesm
Test namelist changes: None
Test status: bit for bit

Fixes ESMCI/CIME issue 1617

User interface changes?: New match attribute on <values> elements that are children of <entry> nodes.

Code review: gold2718
merge of bjsmith changes to cime users guide
Test suite:
Test baseline:
Test namelist changes:
Test status: [bit for bit, roundoff, climate changing]

Fixes [CIME Github issue #]

User interface changes?:

Code review:
@ndkeen
Contributor

ndkeen commented Aug 2, 2017

Well, I obviously still haven't fixed it yet. :)

@rljacob
Member

rljacob commented Aug 2, 2017

If it's not a new failure created by this branch, it doesn't need to be fixed (on this branch).

@cameronsmith1
Contributor

cameronsmith1 commented Aug 2, 2017

The issue with ne4 on Cori is partially documented by #1673.

Has anyone confirmed that this is the reason for the failure of this run_acme on Cori? The easiest way to test it that I can think of is to redo the run_acme run with an ne30 resolution (e.g., ne30_oECv3_ICG for the A_WCYCL case).

@mfdeakin-sandia, the changes to run_acme in this PR look fine to me. However, could you change the version number?

@mfdeakin-sandia
Contributor

mfdeakin-sandia commented Aug 2, 2017

@cameronsmith1 It does look like the same error, now that I've tried it on master. I'll also update the version number.
Also, do you have a recommended "small" (smaller than ne30) grid to use so I can fully test this? The grid you suggest doesn't seem to be supported.

@cameronsmith1
Contributor

@mfdeakin-sandia : the ne30 is standard. There are some grids in between ne4 and ne30, but they are not well supported -- use at your own risk.

Which compset are you using? The grid I mentioned is the standard grid used by the coupled simulation team. However, it forces a particular ocean initial condition, so it will cause problems if you don't have an active ocean. If you are using an F compset (atm/land only), you can use ne30_oECv3.

@mfdeakin-sandia
Contributor

I was using the AWCYCL_1850 and AWCYCL_2000 cases, which I thought were supposed to work with it. I ended up successfully running a different compset and grid on Cori through the run_acme script, so I think this PR is almost ready (once I update my commit to include the version increase).

@cameronsmith1
Contributor

Yes, those compsets should have worked. Perhaps something else was going on.

@mfdeakin-sandia
Contributor

I don't know; it seems unrelated to this PR, since I got the same error on master on Cori.

@cameronsmith1
Contributor

OK. It seems to me that you should push the version# fix, and let this PR go forward. Then take up your error in a new issue/PR (Can you give me your run_acme script that is giving you a problem?).

@mfdeakin-sandia
Contributor

Done, I believe this is ready

@cameronsmith1
Contributor

Thanks, @mfdeakin-sandia

@cameronsmith1
Contributor

What is still needed for this PR?

There is at least one other PR for run_acme that I think is waiting for this PR, so it would be nice to get this PR onto master. Of course, run_acme is only a small piece of this PR, so if this one will take a while it might make sense to have the other run_acme PR go first.

@rljacob
Member

rljacob commented Aug 15, 2017

We are waiting to do it after the beta2 tag, which is waiting for some ocean PRs.

@amametjanov
Member

Getting errors on xmlquery --value CIMEROOT in a testcase dir, e.g.:

env_case.xml:106: element schema: Schemas validity error : Element 'schema', attribute 'version': The attribute 'version' is not allowed.

and the env_case.xml lines are

    <entry id="CONFIG_CPL_FILE" value="$CIMEROOT/src/drivers/mct/cime_config/config_component.xml">
      <type>char</type>
      <desc>file containing all non-component specific case configuration variables (for documentation only - DO NOT EDIT)</desc>
      <schema version="2.0">$CIMEROOT/config/xml_schemas/entry_id.xsd</schema>
      <schema version="3.0">$CIMEROOT/config/xml_schemas/entry_id_version3.xsd</schema>
    </entry>
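
To make the structure of the offending entry concrete, here is a small sketch that parses an abbreviated copy of it (the <desc> element is omitted) with Python's standard ElementTree and maps each <schema> element's version attribute to its path. The helper name is hypothetical, and this is not how CIME validates the file; it only illustrates the version attribute that the error message says the validating schema does not allow.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the env_case.xml entry quoted above.
ENTRY_XML = """
<entry id="CONFIG_CPL_FILE" value="$CIMEROOT/src/drivers/mct/cime_config/config_component.xml">
  <type>char</type>
  <schema version="2.0">$CIMEROOT/config/xml_schemas/entry_id.xsd</schema>
  <schema version="3.0">$CIMEROOT/config/xml_schemas/entry_id_version3.xsd</schema>
</entry>
"""


def schemas_by_version(entry_xml):
    """Map each <schema> element's version attribute to its path text."""
    entry = ET.fromstring(entry_xml)
    return {s.get("version"): s.text for s in entry.findall("schema")}


schemas = schemas_by_version(ENTRY_XML)
```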

@mfdeakin-sandia
Contributor

mfdeakin-sandia commented Sep 12, 2017

A few issues after merging it to a branch from next:
Edison no longer builds - it looks like an issue in fates
/global/u2/m/mdeakin/ACME/acme_wcycl2000_cime5/components/clm/src/external_models/fates/main/FatesInterfaceMod.F90(28): error #6580: Name in only-list does not exist. [FATESREPORTPARAMS]
use EDParamsMod , only : FatesReportParams

I get a crash in acme.exe on Cori. The useful part of the backtrace is below:
20: acme.exe 0000000001745B4B mpas_log_mp_mpas_ 828 mpas_log.f90
20: acme.exe 00000000025938E4 ocn_comp_mct_mp_o 404 ocn_comp_mct.f90
20: acme.exe 0000000000427E91 component_mod_mp_ 227 component_mod.F90
20: acme.exe 0000000000418916 cime_comp_mod_mp_ 1191 cime_comp_mod.F90
20: acme.exe 00000000004251CD MAIN__ 63 cime_driver.F90

I'm still building on anvil, bebop, and skybridge, and will comment when they finish.

I should note that @jgfouca might want to delete the last commit on the branch: that was me merging master in a while back, rather than rebasing, because other people might have been working on the branch.

@cameronsmith1
Contributor

@mfdeakin-sandia, could you update script_ver and the version log in run_acme, since the changes in #1716 ended up going first? Thanks.

@mfdeakin-sandia
Contributor

@cameronsmith1 This branch is being abandoned in favor of a new one with a more recent version of CIME, so I'll do it there.

@cameronsmith1
Contributor

OK. Can you ping me on the new PR so I can keep track of run_acme, since there are multiple sets of changes going on?

@mfdeakin-sandia
Contributor

I was planning on it :)

@jgfouca
Member Author

jgfouca commented Sep 15, 2017

Replaced by #1788

@jgfouca jgfouca closed this Sep 15, 2017
@jgfouca jgfouca deleted the jgfouca/cime/update_to_cime_5.3.0.29 branch January 26, 2018 17:30