This is a Cylc-driven workflow for MPAS versus WRF ensemble twin experiments. It currently supports case study ensemble forecast analysis with WRF / MPAS and ensemble DA cycling experiments with WRF-GSI, with development pending to integrate MPAS-JEDI ensemble DA. The purpose of this workflow framework is to produce offline reforecast analysis for the state-of-the-science MPAS system versus the legacy WRF model. The code structure is based around defining case studies and simulation configurations in a self-contained, transferable way to facilitate experiment replication. The repository is also designed to run with an embedded Cylc installation using Micromamba, with an automated build procedure for the Cylc environment and its configuration for this repository.
The repository is structured as follows:
Ensemble-DA-Cycling-Template/ # Root directory of the repository
├── cylc # Directory of embedded Cylc installation and wrappers
├── cylc-run # Directory of cylc workflow installations and run directories
├── cylc-src # Installable workflow settings for configuring experiments
├── settings # HPC environments / configurations for executables
│ ├── shared # Shared executable / driver configuration files
│ ├── sites # Root directory of HPC system configuration sub-directories
│ └── template_archive # Templates for workflow development
└── src # Workflow core source code
  ├── downloads # Utility scripts for downloading data
  └── drivers # Task execution run scripts
Cylc can be run on an HPC system in a centralized or a distributed fashion. This build procedure will create an embedded Cylc installation, with the configuration of Cylc inferred from the config_workflow.sh file at the root of the repository. One should edit this file as in the following to define the local HPC system parameters for the workflow, and source it to define the Cylc environment when running the workflow.
In the workflow configuration one should define the full path of the repository clone, e.g.,
export HOME="/expanse/nfs/cw3e/cwp168/Ensemble-DA-Cycling-Template"
NOTE: when sourcing this configuration, the Unix ${HOME} variable will be reset in the shell to the repository path. This handles Cylc's usual dependence on a ${HOME} directory for writing workflow logs, keeping them shareable within a self-contained repository / project folder directory structure.
One should also define a site-specific configuration to source for HPC global variables throughout the workflow and for the software environment to load for the model and DA executables.
export SITE="expanse-cwp168"
The settings for the local HPC system parameters are sourced in the config_workflow.sh file as
source ${HOME}/settings/sites/${SITE}/config.sh
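Taken together, a minimal sketch of the corresponding portion of config_workflow.sh is as follows; the clone path and site name are the example values above and should be replaced for the local system:
# Site-selection portion of config_workflow.sh (illustrative values)
export HOME="/expanse/nfs/cw3e/cwp168/Ensemble-DA-Cycling-Template"  # full path of the repository clone
export SITE="expanse-cwp168"                                         # sub-directory name under settings/sites
source ${HOME}/settings/sites/${SITE}/config.sh                      # local HPC system parameters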
New "sites" can be defined by creating a new directory containing a config.sh
file
edited to set local paths / computing environment. Example configuration variables for
the workflow include:
export SOFT_ROOT= # Root directory of software stack executables
export DATA_ROOT= # Root directory of simulation forcing data
export GRIB_ROOT= # Root directory of GRIB data
export WORK_ROOT= # Root directory of simulation IO
export MOD_ENV= # The full path to a module load / shared library path definition file for WRF / MPAS
export GSI_ENV= # The full path to a module load / shared library path definition file for GSI
In the example site configuration
source ${HOME}/settings/sites/expanse-cwp168/config.sh
this defines the ${MOD_ENV} variable to be
export MOD_ENV="${HOME}/settings/sites/expanse-cwp168/${MOD_STACK}.sh"
where the ${MOD_STACK} variable is the name of the software stack used to compile the WRF and MPAS model environments. Similarly, the example defines the ${GSI_ENV} variable as
export GSI_ENV="${HOME}/settings/sites/expanse-cwp168/${GSI_STACK}.sh"
where ongoing support for GSI is provided by the NOAA HPC Stack and the NOAA EMC GSI GitHub.
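As a rough sketch of how these settings fit together, a new site's config.sh might look like the following; all paths and stack names below are illustrative placeholders rather than values shipped with the repository:
# Hypothetical settings/sites/my-hpc/config.sh -- illustrative placeholder values only
export SOFT_ROOT="/path/to/software"          # root directory of software stack executables
export DATA_ROOT="/path/to/forcing_data"      # root directory of simulation forcing data
export GRIB_ROOT="/path/to/grib_data"         # root directory of GRIB data
export WORK_ROOT="/path/to/simulation_io"     # root directory of simulation IO
export MOD_STACK="my_model_stack"             # name of the stack used to compile WRF / MPAS
export GSI_STACK="my_gsi_stack"               # name of the stack used to compile GSI
export MOD_ENV="${HOME}/settings/sites/my-hpc/${MOD_STACK}.sh"
export GSI_ENV="${HOME}/settings/sites/my-hpc/${GSI_STACK}.sh"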
The Cylc build script
${HOME}/cylc/cylc_build.sh
sources the config_workflow.sh configuration file to configure the local installation of the Cylc executable in a self-contained Micromamba environment. The Cylc installation will be built at
${HOME}/cylc/Micromamba/envs/${CYLC_ENV_NAME}
with the Cylc software environment defined by the .yml definition file
${HOME}/scripts/environments/${CYLC_ENV_NAME}.yml
sourced for the build. Sourcing config_workflow.sh sets the ${PATH} to pick up the Cylc wrapper script
${HOME}/cylc/cylc
configured to match the self-contained Micromamba environment. Cylc command Bash auto-completion is configured by default by sourcing the config_workflow.sh file. Additionally, the Cylc global configuration file
${HOME}/cylc/global.cylc
is configured so that workflow definitions will source the global variables in config_workflow.sh, and so that task job scripts will inherit these variables as well.
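A minimal sketch of the build steps, assuming config_workflow.sh has already been edited for the local site and using the example clone path above:
# Build and verify the embedded Cylc installation (illustrative clone path)
cd /expanse/nfs/cw3e/cwp168/Ensemble-DA-Cycling-Template
source config_workflow.sh    # resets ${HOME} to the repository and sets ${PATH} for the Cylc wrapper
./cylc/cylc_build.sh         # builds the Micromamba environment and the embedded Cylc install
cylc version                 # verify the wrapper resolves to the embedded installation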
The Cylc workflow manager uses the cylc-run directory
${HOME}/cylc-run
to install workflows, run workflows and manage their progress, with automated logging of job status and task execution within the associated run directories. Job execution such as MPAS / WRF simulation IO will not be performed in the cylc-run directory, as this run directory only encompasses the execution of the workflow prior to calling the task driving script. Task driving scripts will have work directories nested in the directory structure at ${WORK_ROOT} defined in config_workflow.sh.
It is assumed that there is a suitable installation of MPAS and / or WRF available on the HPC system. Basic examples for building WPS, WRF, WRFDA, MPAS and their dependencies from source can be found in the template archive in this repository. This repository also includes a build example for compiling GSI against the HPC Stack, which provides the underlying dependencies. The build is loosely integrated with the repository in that it references a site configuration for code deployment, with example HPC Stack environment configuration and HPC Stack YAML definition files for the build on Expanse.
Paths to the executables' compilation directories should be set in the config_workflow.sh file. This repository assumes that the executable compilation directories are "clean" builds, in that no simulations have been run within these compilation directories. Safety checks are performed in the workflow, but sourcing compilation directories that have also been used as run directories may produce unpredictable results.
The code is currently designed around executing case studies in which the full experiment configuration is transferable as a self-contained directory structure, in order to facilitate experiment replication. Example templates for a variety of simulation configurations are archived for a generic Atmospheric River case study, described in the following.
At the path
${HOME}/cylc-src/valid_date_2022-12-28T00
templates are provided for running an ensemble forecast case study with the specified valid date of
2022-12-28T00
at which to end the forecast verification. Forecast initialization date times are
cycled with the Cylc workflow manager and the forecast length is dynamically defined by the length of
forecast time to the valid date of 2022-12-28T00. Cylc and the source task driving scripts at
${HOME}/src/drivers
can be configured for a variety of additional types of experiment execution, such as with a fixed forecast length, with the configuration options commented in the code.
At the path
${HOME}/cylc-src/valid_date_2021-01-29T00
templates are provided additionally for running 3D-VAR and 3D-EnVAR case studies with WRF-GSI data assimilation cycling such that:
- A first 6 hour cold start forecast is generated at 2021-01-21T18 to 2021-01-22T00;
- A first GSI / WRFDA initial / boundary condition update is performed at 2021-01-22T00;
- 6-hourly cycling is performed until a final analysis at 2021-01-29T00;
- Extended forecasts are generated every 24 hours at 00Z starting from 2021-01-23T00 to 2021-01-28T00, each running until 2021-01-29T00 for event verification.
All of the above cycling settings can be revised using the workflow parameters in order to generate extended forecasts from a DA cycling experiment for a target event verification date. The template name
valid_date_2021-01-29T00/D3envar_NAM_lag06_b0.00_v06_h0300
includes tunable parameter settings for hybridization (b0.00), vertical (v06) and horizontal (h0300) localization radii, and obs / background error settings (NAM), all of which can be changed to new settings. New configurations should have the same directory structure as the examples, with names defined to reflect the particular tunable settings, as in the sketch below.
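For example, a minimal sketch of starting a new configuration from the example template; the copied directory name and the revised hybridization weight are hypothetical:
# Copy the example DA template to a new configuration reflecting revised tunable settings
cd ${HOME}/cylc-src/valid_date_2021-01-29T00
cp -r D3envar_NAM_lag06_b0.00_v06_h0300 D3envar_NAM_lag06_b0.50_v06_h0300   # hypothetical new name
# then edit the copied flow.cylc / namelist / fix files to match the revised settings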
Finally, for downscaling ensemble forecasts for the background error covariance calculation in EnVAR, there is a template available to generate ensemble forecasts offline for WRF
EnsembleBackground/WRF_9_WestCoast_lag06
including a tunable lag setting. The lag setting defines how many hours in the past the ensemble forecast is
initialized beyond the usual 6-hourly cycling zero hour at which the control member is initialized. For example,
for an analysis at 2021-01-22T00, the lag06 workflow generates a 12-hour ensemble forecast initialized at 2021-01-21T12, lagged by 6 hours relative to the control member initialized at 2021-01-21T18. The lag is a tunable setting that can be used to increase the spread of the ensemble background to tune the analysis.
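The timing implied by this example can be sketched with GNU date as follows; this is purely illustrative and not part of the workflow scripts:
# Illustrative lag arithmetic for an analysis at 2021-01-22T00 (not part of the workflow)
ANL="2021-01-22 00:00"   # analysis (zero hour) of the 6-hourly cycle
LAG_HRS=6                # tunable lag relative to the control member
date -u -d "${ANL} 6 hours ago" "+Control init:  %Y-%m-%dT%H"                       # 2021-01-21T18
date -u -d "${ANL} $(( 6 + LAG_HRS )) hours ago" "+Ensemble init: %Y-%m-%dT%H"      # 2021-01-21T12
echo "Ensemble forecast length: $(( 6 + LAG_HRS )) hours up to the analysis time"   # 12 hours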
Cylc will search for installable workflows at the simulation settings root directory
${HOME}/cylc-src
which is configured in the repository as the default source workflow directory. A common directory naming structure must be observed with the definition of case studies and configurations nested at this simulation settings root. Experiment configurations are assumed to take on a naming structure
configuration.sub_configuration
where the experiment short name corresponds to the name of the static file for MPAS, e.g., the experiment
short name MPAS_60-10_CONUS
would have an associated MPAS static file named
MPAS_60-10_CONUS.static.nc
Optionally, sub-configurations of MPAS with e.g., different physics suites for a given static configuration,
can be denoted with a .sub_configuration
naming convention as above. For example, different
sub-configurations of experiments which both use the MPAS_60-10_CONUS.static.nc
file but differ using
the convection permitting or mesoscale reference physics suites can be denoted as
MPAS_60-10_CONUS.convection_permitting
MPAS_60-10_CONUS.mesoscale_reference
Experiment configurations are assumed to have a nested directory structure as
${HOME}/cylc-src/case_study/configuration.sub_configuration
where, for example, the directories
${HOME}/cylc-src/valid_date_2022-12-28T00/MPAS_60-10_CONUS.convection_permitting
${HOME}/cylc-src/valid_date_2022-12-28T00/MPAS_60-10_CONUS.mesoscale_reference
${HOME}/cylc-src/valid_date_2022-12-28T00/MPAS_240-U
${HOME}/cylc-src/valid_date_2022-12-28T00/WRF_9-3_WestCoast
are all valid case study / configuration / sub-configuration directory structures to install the workflows for the valid_date_2022-12-28T00 valid date case study.
Every experiment configuration uses a Cylc workflow configuration file
${HOME}/cylc-src/case_study/configuration.sub_configuration/flow.cylc
to define the dependency graph between workflow tasks and the switches that configure the execution of the tasks' driving scripts. Parameters defined in the flow.cylc file include HPC job scheduler parameters, job cycling settings, settings for propagating namelist templates and settings for linking simulation data and executables.
Each experiment configuration directory has a nested sub-directory for namelists and streamlists for MPAS and WRF, a sub-directory for static files that are unique to the experiment configuration and (for DA experiments) a GSI fix file directory:
${HOME}/cylc-src/case_study/configuration.sub_configuration/namelist
${HOME}/cylc-src/case_study/configuration.sub_configuration/static
${HOME}/cylc-src/case_study/configuration.sub_configuration/fix
Namelist and streamlist files within the above namelist directory are templated to propagate Cylc workflow parameters, e.g., for ISO date time cycling, in order to dynamically set simulation start and run parameters depending on the cycle point. Simulation configurations that are held static for the course of an experiment, such as (currently) the physics suite, are defined in the namelist template itself. Static files for MPAS, or respectively the geogrid files for WRF, should be stored in the static file directory. Additionally, if explicit zeta levels are specified for MPAS, the static file directory should contain only one definition file for the explicit vertical level heights, with naming convention
*.ZETA_LIST.txt
Workflow scripts will source the *.ZETA_LIST.txt file in the static directory for the zeta level definitions at init_atmosphere runtime.
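A simple sanity check that a configuration's static directory contains exactly one zeta level definition might look like the following sketch; the configuration path is illustrative:
# Illustrative check (not part of the workflow): exactly one *.ZETA_LIST.txt per static directory
CFG_STATIC="${HOME}/cylc-src/valid_date_2022-12-28T00/MPAS_240-U/static"   # example configuration
N_ZETA=$(ls "${CFG_STATIC}"/*.ZETA_LIST.txt 2>/dev/null | wc -l)
if [ "${N_ZETA}" -ne 1 ]; then
  echo "Expected exactly one *.ZETA_LIST.txt in ${CFG_STATIC}, found ${N_ZETA}"
fi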
GSI must be able to source each of the following files to define the parameters of its analysis:
- anavinfo
- atms_beamwidth.txt
- berror_stats
- convinfo.txt
- errtable
- lightinfo.txt
- ozinfo.txt
- pcpinfo.txt
- satangbias.txt
- satinfo.txt
Each of these files is either defined uniquely or otherwise linked to the corresponding North American Mesoscale (NAM) or Rapid Refresh (RAP) fix files respectively, defining alternative settings for the GSI optimization. Note: the supplied RAP fix file rap_anavinfo_arw_netcdf is not compatible with the current GSI but is included for reference.
Additionally, the workflow requires the following file
satlist.txt
defined in two columns giving:
- Source observation file name to untar;
- GSI observation file name code;
which determines which satellites are to be assimilated in the GSI driver script. For example, we have the collection
1bamua amsuabufr
1bhrs4 hirs4bufr
1bmhs mhsbufr
atms atmsbufr
eshrs3 hirs3bufrears
esmhs mhsbufrears
geoimr gimgrbufr
goesfv gsnd1bufr
gome gomebufr
gpsro gpsrobufr
mtiasi iasibufr
osbuv8 sbuvbufr
satwnd satwndbufr
ssmisu ssmirrbufr
of satellites available for the 2021-01-29T00 event, and lines can be removed to suppress their assimilation. If
no satellites are to be assimilated, the satlist.txt
file is still required but should be empty.
Shared files that are independent of a simulation configuration, such as mesh partitions and ungrib variable tables, are kept in the directories
${HOME}/settings/shared/meshes
${HOME}/settings/shared/variable_tables
respectively. The underlying mesh name to source, e.g., x6.999426, is defined in the experiment's flow.cylc file.
Initial / boundary condition data can be obtained from, e.g., the GEFS AWS Bucket. An automated data download and formatting script
${HOME}/src/downloads/download_GEFS_AWS.py
is included for anonymous downloads from the GEFS bucket above, and can be configured within the script to download arbitrary forecast initialization dates and forecast hours from the public GEFS data. This script requires the AWS Command Line Interface, which is installable from, e.g., conda-forge.
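A minimal sketch of providing the AWS CLI from conda-forge with Micromamba and running the script is given below; the environment name is a hypothetical placeholder, and the script's own date / forecast hour settings should be edited before running:
# Illustrative: provide the AWS CLI from conda-forge and run the GEFS download script
micromamba create -n gefs-download -c conda-forge python awscli   # hypothetical environment name
micromamba run -n gefs-download python ${HOME}/src/downloads/download_GEFS_AWS.py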
Assuming that the global configuration config_workflow.sh has been set for the local HPC system, the embedded Cylc installation has been built in the repository, and the experiment configuration directory
${HOME}/cylc-src/case_study/configuration.sub_configuration
has been set up including the necessary static files, Cylc can install the workflow
${HOME}/cylc-src/case_study/configuration.sub_configuration/flow.cylc
and run the experiment. The experiment above can be installed as
cylc install case_study/configuration.sub_configuration
and run and managed through any of the Cylc command interfaces, as sketched below.
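A minimal sketch of installing, running and monitoring one of the example experiments with the standard Cylc command line interfaces:
# Illustrative: install, run and monitor an example experiment with the Cylc CLI
source config_workflow.sh                          # from the repository root; sets ${PATH} for the Cylc wrapper
cylc install valid_date_2022-12-28T00/MPAS_240-U   # installs the workflow from ${HOME}/cylc-src
cylc play valid_date_2022-12-28T00/MPAS_240-U      # starts the Cylc scheduler for the experiment
cylc tui valid_date_2022-12-28T00/MPAS_240-U       # terminal UI to monitor / control task progress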
Generic Slurm and PBS scheduler configurations are provided to submit tasks to the HPC job scheduler, but modifications are necessary for specific systems. HPC system account information should be set in the repository configuration file config_workflow.sh.
The cylc-run directory contains logs for all experiments, their tasks' execution and the status of the Cylc workflow manager. Task execution will produce outputs in the above discussed case study / configuration / sub-configuration naming convention. In MPAS and WRF ensemble forecast experiments the experiment is broken over multiple tasks, including preprocessing input data and running the simulation. Task outputs are organized with the hierarchy
${WORK_ROOT}/case_study/configuration.sub_configuration/cycle_date/task_name/ensemble_member
which is generated by the task driving script.
See the GitHub issues page for ongoing issues / debugging efforts with this template.
If you encounter bugs, please post a detailed issue on the GitHub page, with the steps and parameter settings that reproduce the issue, and please include any related error messages / logs that may be useful for solving it. Limited support and debugging will be performed for issues that do not adhere to this request.
If you encounter a bug and have a solution, please follow the same steps as above to post the issue and then submit a pull request from your branch that fixes the issue, with a detailed explanation of the solution.