Add support for the CREATE-IP project #1652

Open
bouweandela opened this issue Jul 1, 2022 · 2 comments

@bouweandela
Member

The CREATE-IP project is the follow-up to the ana4MIPs project. I think it would be useful to add support for it in ESMValCore, as this would allow, for example, automatically downloading reanalysis datasets that require no further CMORization.

Example integration of CREATE-IP in ESMValCore

A config-developer.yml entry could look something like this:

'CREATE-IP':
  cmor_strict: false
  input_dir:
    default: '{project}/{product}/{dataset}/{realm}/{frequency}/{latestversion}'
    ESGF: '{project}/{product}/{dataset}/{realm}/{frequency}/{latestversion}'
  input_file:
    default: '{short_name}_{mip}_{product}*.nc'
    ESGF: '{short_name}_{mip}_{product}*.nc'
  output_file: '{project}_{product}_{dataset}_{frequency}_{short_name}'
  cmor_type: 'CMIP5'

We may also need to add {model} to these templates.

A difficulty is that there are apparently 3 different DRS entries used in this project:

{
    '%(root)s/%(project)s/%(product)s/%(institute)s/%(source_id)s/%(experiment)s/%(time_frequency)s/%(realm)s/%(variable)s': 7,
    '%(root)s/%(project)s/%(product)s/%(institute)s/%(model)s/%(source_id)s/%(time_frequency)s/%(realm)s/%(variable)s': 23,
    '%(root)s/%(project)s/%(product)s/%(institute)s/%(model)s/%(experiment)s/%(time_frequency)s/%(realm)s/%(variable)s': 80,
}
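
To see where the three templates actually diverge, here is a minimal sketch that extracts the facet names from each one and lists those that are not common to all three. The `drs_facets` helper and the `varying` dict are hypothetical names used only for this illustration; they are not part of ESMValCore.

```python
import re

# DRS templates and record counts, copied from the dict above.
DRS_COUNTS = {
    '%(root)s/%(project)s/%(product)s/%(institute)s/%(source_id)s/%(experiment)s/%(time_frequency)s/%(realm)s/%(variable)s': 7,
    '%(root)s/%(project)s/%(product)s/%(institute)s/%(model)s/%(source_id)s/%(time_frequency)s/%(realm)s/%(variable)s': 23,
    '%(root)s/%(project)s/%(product)s/%(institute)s/%(model)s/%(experiment)s/%(time_frequency)s/%(realm)s/%(variable)s': 80,
}


def drs_facets(template):
    """Return the facet names of a '%(name)s'-style template, in order."""
    return re.findall(r'%\((\w+)\)s', template)


# Facets that appear in every template.
shared = set.intersection(*(set(drs_facets(t)) for t in DRS_COUNTS))

# For each template (keyed by its record count), the facets that are
# not shared by all three templates.
varying = {
    count: [name for name in drs_facets(template) if name not in shared]
    for template, count in DRS_COUNTS.items()
}

for count, names in varying.items():
    print(count, names)
# 7 ['source_id', 'experiment']
# 23 ['model', 'source_id']
# 80 ['model', 'experiment']
```

So the templates agree on everything except which two of model, source_id and experiment they include at the fifth and sixth path levels.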

Data finding

from esmvalcore.esgf import find_files
from esmvalcore.esgf.facets import FACETS, DATASET_MAP

FACETS['CREATE-IP'] = {
    'dataset': 'source_id',
    'frequency': 'time_frequency',
    'model': 'model',
    'product': 'product',
    'realm': 'realm',
    'short_name': 'variable',
}
DATASET_MAP['CREATE-IP'] = {}

find_files(project='CREATE-IP', short_name='tas', dataset='CREATE-MRE', frequency='mon', model='JRA-55')
# Result:
# [ESGFFile:CREATE-IP/MREreanalysis/JMA/JRA-55/CREATE-MRE/atmos/mon/v20200609/tas_Amon_MREreanalysis_JRA-55_198001-201512.nc on hosts ['esgf.nccs.nasa.gov']]

find_files(project='CREATE-IP', short_name='tas', dataset='MERRA2', frequency='mon', product='MRE2reanalysis')
# Result:
# [ESGFFile:CREATE-IP/MRE2reanalysis/NASA-GMAO/GEOS-5/MERRA2/atmos/mon/v20200613/tas_Amon_MRE2reanalysis_MERRA2_198001-201712.nc on hosts ['esgf.nccs.nasa.gov']]

find_files(project='CREATE-IP', short_name='snd', dataset='CFSR', realm='landIce')
# Result:
# [ESGFFile:CREATE-IP/reanalysis/NOAA-NCEP/CFSR/landIce/mon/v20200607/snd_LImon_reanalysis_CFSR_197901-201912.nc on hosts ['esgf.nccs.nasa.gov']]
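
As a quick sanity check, the file names returned above can be compared with the proposed input_file template. This is only a sketch: it assumes glob-style matching as done by Python's `fnmatch` module, which may differ from how ESMValCore looks up files, and the `mip` and `product` values are read off the file names rather than taken from any configuration. The `matches` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Proposed input_file template from the config-developer.yml sketch.
INPUT_FILE = '{short_name}_{mip}_{product}*.nc'

# (facets, file name) pairs based on the find_files results above.
CASES = [
    ({'short_name': 'tas', 'mip': 'Amon', 'product': 'MREreanalysis'},
     'tas_Amon_MREreanalysis_JRA-55_198001-201512.nc'),
    ({'short_name': 'tas', 'mip': 'Amon', 'product': 'MRE2reanalysis'},
     'tas_Amon_MRE2reanalysis_MERRA2_198001-201712.nc'),
    ({'short_name': 'snd', 'mip': 'LImon', 'product': 'reanalysis'},
     'snd_LImon_reanalysis_CFSR_197901-201912.nc'),
]


def matches(facets, filename):
    """Fill the template with facets and glob-match it against filename."""
    pattern = INPUT_FILE.format(**facets)
    return fnmatchcase(filename, pattern)


results = [matches(facets, filename) for facets, filename in CASES]
print(results)
# [True, True, True]
```

Under these assumptions the file name template covers all three examples, so the open question is mostly the directory layout.
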
@valeriupredoi
Contributor

Yes, let's!

A difficulty is that there are apparently 3 different DRS entries used in this project:

What do the numbers (the values in the dict) denote? Also, I'd imagine source_id is equivalent to model? That would make it two DRSs, or actually one, because we could map source_id to model (somehow).

@bouweandela
Member Author

what are the numbers (values in the dict) denoting?

I think it is the number of records on ESGF with the same dataset_id that use that DRS. Here is the code to get this info with esgf-pyclient:

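from pyesgf.search import SearchConnection

# Query all federated ESGF index nodes through the DKRZ search endpoint.
conn = SearchConnection('https://esgf-data.dkrz.de/esg-search',
                        distrib=True)

# Request facet counts for the DRS template stored with each record.
ctx = conn.new_context(project='CREATE-IP', facets='directory_format_template_')
# Maps each DRS template to the number of records that use it.
dict(ctx.facet_counts)['directory_format_template_']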

Also, I'd imagine source_id is equivalent to model?

It seems this can be different. For example, the ERA5 dataset is produced using the CY41R2 IFS model, if I understand it correctly.
