ESGF_Node|LUCIDexample
Wiki Reorganisation |
---|
This page has been classified for reorganisation. It has been given the category MOVE. |
The content of this page will be revised and moved to one or more other pages in the new wiki structure. |
This is an example of how to tweak the publisher so that it can be used by a CMIP5-related project.
These are notes I made while configuring the node and publishing data for the LUCID project. They might be incomplete and/or there might be better/easier ways to achieve the same goal. Feel free to correct or comment anything in here, thanks. --estani
- add handler
- lucid project configuration
- lucid model/project
- thredds_root to new lucid root
- thredds url too?
-
Create directory to hold project catalogs at /esg/content/thredds/lucid (make sure the user publishing has write access to it)
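The directory step can be sketched in Python as follows. This is only an illustration: the helper name `ensure_writable_dir` is hypothetical, the sketch uses Python 3 (`exist_ok`), and a temporary directory stands in for /esg/content/thredds so that it is safe to run anywhere.

```python
import os
import tempfile


def ensure_writable_dir(path):
    """Create `path` if it does not exist and report whether the
    current user may write into it (hypothetical helper)."""
    os.makedirs(path, exist_ok=True)
    # Write plus execute permission is needed to create files in a directory.
    return os.access(path, os.W_OK | os.X_OK)


# A temporary directory stands in for /esg/content/thredds (assumption).
base = tempfile.mkdtemp()
catalog_dir = os.path.join(base, "lucid")
print(ensure_writable_dir(catalog_dir))
```

Run it as the user that will publish, since the check applies to the calling user only.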
-
Add a catalog reference to /esg/content/thredds/catalog.xml that points to the main location of the lucid catalog
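As a rough sketch of what such a reference could look like, the snippet below appends a `catalogRef` element to a minimal THREDDS catalog using the standard library. The sample catalog, its name, the title and the `lucid/catalog.xml` href are assumptions for illustration; check the href against your own TDS layout before editing the real file.

```python
import xml.etree.ElementTree as ET

THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"
# Keep the usual prefixes when the tree is written back out.
ET.register_namespace("", THREDDS_NS)
ET.register_namespace("xlink", XLINK_NS)


def add_catalog_ref(catalog_xml, title, href):
    """Return catalog_xml with one extra <catalogRef> pointing at href."""
    root = ET.fromstring(catalog_xml)
    ref = ET.SubElement(root, "{%s}catalogRef" % THREDDS_NS)
    ref.set("{%s}title" % XLINK_NS, title)
    ref.set("{%s}href" % XLINK_NS, href)
    ref.set("name", "")
    return ET.tostring(root, encoding="unicode")


# Minimal stand-in for /esg/content/thredds/catalog.xml (hypothetical content).
SAMPLE = (
    '<catalog xmlns="%s" xmlns:xlink="%s" name="root catalog"></catalog>'
    % (THREDDS_NS, XLINK_NS)
)
updated = add_catalog_ref(SAMPLE, "LUCID catalog", "lucid/catalog.xml")
print(updated)
```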
-
Add the model name and project to /usr/local/cdat/lib/python2.6/site-packages/esgcet-2.8.5-py2.6.egg/esgcet/config/etc/esgcet_models_table.txt (or any other file pointed at by the config file (esg.ini) in use)
lucid | MPI-ESM-LR | http://www.ileaps.org/index.php?option=com_content&task=view&id=99 | LUCID
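The entry above is a pipe-separated record. A minimal sketch for checking such a line before adding it is shown below; the field names (project, model, url, description) are my reading of the example line, not something taken from the publisher's code.

```python
from collections import namedtuple

# Field names are an assumption based on the example entry.
ModelEntry = namedtuple("ModelEntry", "project model url description")


def parse_model_line(line):
    """Split one models-table line into its four pipe-separated fields."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 4 or not all(fields):
        raise ValueError("expected 4 non-empty fields, got: %r" % (fields,))
    return ModelEntry(*fields)


entry = parse_model_line(
    "lucid | MPI-ESM-LR | "
    "http://www.ileaps.org/index.php?option=com_content&task=view&id=99 | LUCID"
)
print(entry.project, entry.model)
```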
-
Copy a valid esg.ini that will be used for this project
-
Alter the following values in the esg.ini:
thredds_root = /esg/content/thredds/lucid
thredds_url = http://cmip2.dkrz.de/thredds/lucid
thredds_root_catalog_name = LUCID catalog
thredds_dataset_roots =
    esg_dataroot | /esg/data
    ... #don't delete anything you have had here! If you do, those catalogs will get erased too!
    lucid | /gpfs_750/projects/LUCID/data/lucid
project_options =
cmip5 | CMIP5 / IPCC Fifth Assessment Report | 1
ipcc4 | IPCC Fourth Assessment Report | 2
test | Test Project | 3
lucid | Land-Use and Climate, Identification of robust impacts | 4
Don't remove anything from thredds_dataset_roots, just add what you need. If you do remove entries, the publisher will drop the catalogs of all missing entries when the TDS is restarted.
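Because dropping a root has such drastic consequences, it may be worth comparing the copied esg.ini against the original before using it. The following is a minimal sketch, assuming the option lives in the `[DEFAULT]` section and that every entry has the form `name | path`; the two sample files and the function names are hypothetical.

```python
import configparser


def dataset_roots(ini_text, section="DEFAULT"):
    """Return {root_name: path} parsed from thredds_dataset_roots."""
    parser = configparser.RawConfigParser()  # raw: leave %(...)s untouched
    parser.read_string(ini_text)
    roots = {}
    for line in parser.get(section, "thredds_dataset_roots").splitlines():
        if "|" not in line:
            continue  # skip blank lines
        name, path = (part.strip() for part in line.split("|", 1))
        roots[name] = path
    return roots


def missing_roots(old_ini_text, new_ini_text):
    """Names present in the old file but absent from the new one."""
    old = dataset_roots(old_ini_text)
    new = dataset_roots(new_ini_text)
    return sorted(name for name in old if name not in new)


OLD_INI = """[DEFAULT]
thredds_dataset_roots =
    esg_dataroot | /esg/data
"""

NEW_INI = """[DEFAULT]
thredds_dataset_roots =
    esg_dataroot | /esg/data
    lucid | /gpfs_750/projects/LUCID/data/lucid
"""

# An empty list means no existing root was dropped.
print(missing_roots(OLD_INI, NEW_INI))
```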
-
Add the following lucid project description:
#------------------------------------------------------------------------------------------
# Project-specific configuration
# LUCID
[project:lucid]
# LUCID experiments
# project | experiment_name | experiment_description
experiment_options =
lucid | L2A26 | model run without landuse change (after yr 2005) and with atmospheric CO2 from RCP2.6 scenario
lucid | L2A85 | model run without landuse change (after yr 2005) and with atmospheric CO2 from RCP8.5 scenario
# Define the categories to be used for this project:
# name | category_type | is_mandatory | is_thredds_property | display_order
categories =
project | enum | true | true | 0
experiment | enum | true | true | 1
product | enum | true | true | 2
model | string | true | true | 3
time_frequency | enum | true | true | 4
realm | enum | true | true | 5
cmor_table | enum | true | true | 6
ensemble | string | true | true | 7
institute | enum | true | true | 8
forcing | string | false | true | 9
title | string | false | true | 10
creator | enum | false | false | 11
publisher | enum | false | false | 12
creation_time | string | false | true | 13
format | fixed | false | true | 14
source | text | false | false | 15
drs_id | string | false | true | 16
description | text | false | false | 99
category_defaults =
product | requested
# Enumerated values
realm_options = atmos, ocean, land, landIce, seaIce, aerosol, atmosChem, ocnBgchem
time_frequency_options = yr, mon, day, 6hr, 3hr, subhr, monClim, fx
cmor_table_options = 3hr, 6hrLev, 6hrPlev, Amon, LImon, Lmon, OImon, Oclim, Omon, Oyr, aero, cf3hr, cfDay, cfMon, cfOff, cfSites, day, fx, grids
institute_options = BCC, CAWCR, CCCMA, CMCC, CNRM-CERFACS, CSIRO-QCCCE, EC-EARTH, GFDL, GISS, INM, IPSL, LASG, MIROC, MOHC, MPI-M, MRI, NCAR, NCC, NIMR, PCMDI
product_options = output1, output2, output
# Class name of the LUCID project handler.
handler = esgcet.config.lucid_handler:LUCIDHandler
# Format of generated dataset IDs
parent_id = wdcc.lucid
dataset_id = lucid.%(product)s.%(institute)s.%(model)s.%(experiment)s.%(time_frequency)s.%(realm)s.%(cmor_table)s.%(ensemble)s
# Directory format. This is used to determine field values by matching directory names.
#directory_format = /data/publish_test/cmip5_test #not used
dataset_name_format = lucid.%(product)s.%(institute)s.%(model)s.%(experiment)s.%(time_frequency)s.%(realm)s.%(cmor_table)s.%(ensemble)s.v%(version)s
# Exclude these variables from THREDDS catalogs. They are still added to the database.
thredds_exclude_variables = a, a_bnds, alev1, alevel, alevhalf, alt40, b, b_bnds, basin, bnds, bounds_lat, bounds_lon, dbze, depth, depth0m, depth100m, depth_bnds, geo_region, height, height10m, height2m, lat, lat_bnds, latitude, latitude_bnds, layer, lev, lev_bnds, location, lon, lon_bnds, longitude, longitude_bnds, olayer100m, olevel, oline, p0, p220, p500, p560, p700, p840, plev, plev3, plev7, plev8, plev_bnds, plevs, pressure1, region, rho, scatratio, sdepth, sdepth1, sza5, tau, tau_bnds, time, time1, time2, time_bnds, vegtype
# Maps
maps = institute_map, las_time_delta_map
institute_map = map(model : institute)
MPI-ESM-LR | MPI-M
las_time_delta_map = map(time_frequency : las_time_delta)
yr | 1 year
mon | 1 month
day | 1 day
6hr | 6 hours
3hr | 3 hours
subhr | 1 minute
monClim | 1 month
fx | fixed
# Set true if files follow the IPCC standard of one variable per file.
# If set, the THREDDS metadata is organized as per-variable datasets.
# Otherwise, the datasets are assumed to be per-time.
variable_per_file = true
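The `dataset_id` template above uses Python-style `%(name)s` placeholders. As a sketch of how the fields combine, the snippet below fills the template with plain string formatting and derives the institute from the `institute_map` entry. It only illustrates the format and does not reproduce the publisher's own substitution code; the ensemble value `r1i1p1` and the other example field values are assumptions.

```python
DATASET_ID = (
    "lucid.%(product)s.%(institute)s.%(model)s.%(experiment)s."
    "%(time_frequency)s.%(realm)s.%(cmor_table)s.%(ensemble)s"
)

# Mirrors institute_map = map(model : institute) from the configuration.
INSTITUTE_MAP = {"MPI-ESM-LR": "MPI-M"}


def build_dataset_id(fields):
    """Fill the dataset_id template, looking up the institute if absent."""
    values = dict(fields)
    if "institute" not in values:
        values["institute"] = INSTITUTE_MAP[values["model"]]
    return DATASET_ID % values


example = {
    "product": "output1",
    "model": "MPI-ESM-LR",
    "experiment": "L2A26",
    "time_frequency": "mon",
    "realm": "land",
    "cmor_table": "Lmon",
    "ensemble": "r1i1p1",  # hypothetical ensemble member
}
print(build_dataset_id(example))
# lucid.output1.MPI-M.MPI-ESM-LR.L2A26.mon.land.Lmon.r1i1p1
```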
-
Create the lucid handler by copying the ipcc5 one
cp /usr/local/cdat/lib/python2.6/site-packages/esgcet-2.8.5-py2.6.egg/esgcet/config/ipcc5_handler.py /usr/local/cdat/lib/python2.6/site-packages/esgcet-2.8.5-py2.6.egg/esgcet/config/lucid_handler.py
-
Then alter the file slightly (mostly replacing cmip5 with lucid, but beware: there is a cmip5_product that must remain unchanged, which is why the dot in the first pattern is escaped!)
sed -e 's#cmip5\.#lucid.#' -e 's#IPCC5#LUCID#' -e 's#CMIP5#LUCID#' -i /usr/local/cdat/lib/python2.6/site-packages/esgcet-2.8.5-py2.6.egg/esgcet/config/lucid_handler.py
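In a sed pattern, as in Python's `re` module, an unescaped `.` matches any character, including the underscore in `cmip5_product`. The short sketch below shows the difference with Python regular expressions; `count=1` mimics sed replacing only the first match per line when no `g` flag is given. The two sample lines are made up for the demonstration.

```python
import re


def rename(line, pattern):
    """Replace the first match of pattern in line with 'lucid.'."""
    return re.sub(pattern, "lucid.", line, count=1)


UNESCAPED = "cmip5."    # '.' matches any single character
ESCAPED = r"cmip5\."    # '\.' matches a literal dot only

# The identifier that must survive is damaged by the unescaped pattern.
print(rename("cmip5_product", UNESCAPED))  # lucid.product
print(rename("cmip5_product", ESCAPED))    # cmip5_product

# Dotted names are rewritten identically by both patterns.
print(rename("cmip5.output1", UNESCAPED))  # lucid.output1
print(rename("cmip5.output1", ESCAPED))    # lucid.output1
```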
- I published the GeoMIP data by following the steps shown in http://esgf.org/wiki/ESGF_Node/LUCIDexample, using "geomip" instead of "lucid" and "GeoMIP" instead of "LUCID". One point needs care: at the 8th step, after running sed on geomip_handler.py as
sed -e 's#cmip5\.#geomip.#' -e 's#IPCC5#GeoMIP#' -e 's#CMIP5#GeoMIP#' -i /usr/local/cdat/lib/python2.6/site-packages/esgcet-2.8.5-py2.6.egg/esgcet/config/geomip_handler.py
we need to change "result = (project_id[:5]=="GeoMIP")" to "result = (project_id[:6]=="GeoMIP")" in line 144 of the geomip_handler.py file. Otherwise, publishing the GeoMIP data with "--project geomip" fails with the error "project_id must be GeoMIP", because project_id[:5] is "GeoMI", not "GeoMIP".
Regards,
Qizhong Wu 2012/04/16
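The slice problem described in this comment is easy to reproduce. The sketch below compares the hard-coded five-character slice with a check that derives the length from the expected name; `matches_project` is a hypothetical helper, not a function from the handler.

```python
def matches_project(project_id, expected):
    """True if project_id begins with the expected project name."""
    return project_id[:len(expected)] == expected


project_id = "GeoMIP"

# The slice copied from the CMIP5 handler keeps only five characters.
print(project_id[:5])                          # GeoMI
print(project_id[:5] == "GeoMIP")              # False

# Deriving the slice length from the name works for any project.
print(matches_project(project_id, "GeoMIP"))   # True
print(matches_project("LUCID", "LUCID"))       # True
```

"LUCID" happens to have five characters, like "CMIP5", which is why the original steps work for LUCID without this change.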
-
Alter the __init__.py to register the new handler (this can certainly be achieved more simply, but you'll have to find out how. Feel free to correct this entry if you do!)
from ipcc4_handler import IPCC4Handler
from ipcc5_handler import IPCC5Handler
from tamip_handler import TAMIPHandler
from obs4mips_handler import Obs4mipsHandler
from lucid_handler import LUCIDHandler
builtinProjectHandlers = {
    'basic_builtin' : BasicHandler,
    'ipcc4_builtin' : IPCC4Handler,
    'ipcc5_builtin' : IPCC5Handler,
    'lucid_builtin' : LUCIDHandler,
    'tamip_builtin' : TAMIPHandler,
    'obs4mips_builtin' : Obs4mipsHandler,
    }
builtinFormatHandlers = {
    'netcdf_builtin' : CdunifFormatHandler,
    }
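The `handler` option in the project section names the class in `module:Class` notation (esgcet.config.lucid_handler:LUCIDHandler). The following generic sketch shows how a string in that notation can be turned into a class with the standard library. It is not the publisher's actual loading mechanism, and `resolve_handler` is a hypothetical name; it is demonstrated on a standard-library class because esgcet may not be installed where this runs.

```python
import importlib


def resolve_handler(spec):
    """Resolve a 'package.module:ClassName' string to the object it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError("expected 'module:Class', got %r" % spec)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Stand-in for 'esgcet.config.lucid_handler:LUCIDHandler'.
print(resolve_handler("collections:OrderedDict"))
```

Trying this against the real handler string on the node is a quick way to confirm that lucid_handler.py imports cleanly after the edits.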
-
Now add the created project and model names to the database by pointing to the created esg.ini which holds information on the project
esginitialize -c -i lucid.esg.ini
-
If you use a map file, start ingesting the data into the database
esgpublish --map test.dataset.map -i lucid.esg.ini
-
If everything looks fine, proceed to create the TDS catalogs
esgpublish --map test.dataset.map --project lucid -i lucid.esg.ini --noscan --thredds
-
And finally try to publish to the gateway
esgpublish --map test.dataset.map --project lucid -i lucid.esg.ini --publish
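The four commands above have to run in order, and there is little point in continuing once one of them fails. A minimal sketch of a wrapper is shown below. It uses only the flags that appear in the steps above; the function names are hypothetical, and the commands are printed rather than executed so that the sketch is harmless on a machine without the publisher.

```python
import subprocess


def publication_commands(ini, map_file, project):
    """Return the initialise / scan / THREDDS / publish commands in order."""
    return [
        ["esginitialize", "-c", "-i", ini],
        ["esgpublish", "--map", map_file, "-i", ini],
        ["esgpublish", "--map", map_file, "--project", project,
         "-i", ini, "--noscan", "--thredds"],
        ["esgpublish", "--map", map_file, "--project", project,
         "-i", ini, "--publish"],
    ]


def run_all(commands, runner=subprocess.check_call):
    """Run each command in turn; check_call raises on the first failure."""
    for command in commands:
        runner(command)


commands = publication_commands("lucid.esg.ini", "test.dataset.map", "lucid")
for command in commands:
    print(" ".join(command))
# On a configured node, run_all(commands) would execute them for real.
```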
The configuration is very tricky, so check the FAQ and the Publisher documentation if anything fails.