This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

Central functions to open and normalize gridded datasets #644

Open
forman opened this issue May 10, 2018 · 0 comments


forman commented May 10, 2018

Expected behavior

Implement and use a single function open_gridded_dataset to open gridded datasets, whether read from a single file or from multiple files concatenated along the time dimension. open_gridded_dataset should

  • detect file wildcards (including recursive ones) and expand them into a file list
  • detect remote file access
  • detect an appropriate internal chunking based on a configurable strategy. For example
    • "spatial": optimal for spatial analysis and visualization, that is chunking in spatial dimension taken from external NetCDF/HDF chunking or GeoTIFF tiling
    • "time": optimal for time series analysis, that is chunking mostly along the time dimension
    • "cube": optimal for spatio-temporal analyses
  • open the dataset
  • perform dataset normalization
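
The first three steps above can be sketched without opening any data. The following is a minimal illustration, not Cate code: the helper names (is_remote, expand_paths, chunks_for_strategy), the set of remote URL schemes, and the default cube edge of 64 are all assumptions made for this example. It also assumes that "time" means chunks spanning the full time axis with small spatial tiles, which is only one reading of the strategy description.

```python
import glob
from urllib.parse import urlparse

# Hypothetical helpers for illustration only; none of these names exist in Cate.
REMOTE_SCHEMES = {"http", "https", "ftp", "s3"}  # assumed set of remote schemes


def is_remote(path):
    """Return True if the path looks like a URL with a remote scheme."""
    return urlparse(path).scheme.lower() in REMOTE_SCHEMES


def expand_paths(paths):
    """Expand local wildcards ('**' matches recursively); keep other paths as-is."""
    expanded = []
    for path in paths:
        has_wildcard = any(ch in path for ch in "*?[")
        if is_remote(path) or not has_wildcard:
            expanded.append(path)
        else:
            expanded.extend(sorted(glob.glob(path, recursive=True)))
    return expanded


def chunks_for_strategy(dim_sizes, strategy, file_chunks=None, cube_edge=64):
    """Return a mapping from dimension name to chunk size.

    dim_sizes:   dimension sizes, e.g. {"time": 120, "lat": 720, "lon": 1440}
    file_chunks: optional on-disk chunk sizes (NetCDF/HDF chunking, GeoTIFF tiling)
    cube_edge:   assumed default edge length for "time" tiles and "cube" chunks
    """
    file_chunks = file_chunks or {}
    if strategy == "spatial":
        # One time step per chunk; spatial chunk sizes follow the file if known.
        return {dim: 1 if dim == "time" else min(size, file_chunks.get(dim, size))
                for dim, size in dim_sizes.items()}
    if strategy == "time":
        # Whole time axis per chunk, small spatial tiles.
        return {dim: size if dim == "time" else min(size, cube_edge)
                for dim, size in dim_sizes.items()}
    if strategy == "cube":
        # Roughly equal extent along every dimension.
        return {dim: min(size, cube_edge) for dim, size in dim_sizes.items()}
    raise ValueError(f"unknown chunking strategy: {strategy!r}")
```

A mapping of this shape could then be passed as the chunks argument of xr.open_dataset() when the dataset is opened.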

The normalization step should optionally use a separate normalize function that can be configured to

  • rename spatial 1D longitude and latitude coordinates so in the end we have lon and lat coordinate variables
  • detect a 0-360 degree longitude range and fix it to -180 to +180 degrees by rearranging variable grids
  • ensure variables have a time dimension and that a time coordinate variable exists, provided the attributes time_coverage_start and time_coverage_end are present
  • ensure a coordinate variable named time has datatype np.datetime64
  • ensure global spatio-temporal CF attributes are set
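
The logic behind the first three normalization rules can be illustrated on plain Python values. This is a sketch under stated assumptions, not the proposed normalize function: the helper names, the alias sets, and the choice of the coverage midpoint as the time value are all hypothetical, and a real implementation would apply the same ideas to xarray datasets. The time helper also assumes ISO 8601 strings without a trailing "Z".

```python
from datetime import datetime

# Hypothetical helpers for illustration only; none of these names exist in Cate.
LON_ALIASES = {"longitude", "long"}  # assumed alternative names
LAT_ALIASES = {"latitude"}           # assumed alternative names


def coord_renames(names):
    """Map recognised longitude/latitude names to 'lon' and 'lat'."""
    renames = {}
    for name in names:
        lowered = name.lower()
        if lowered in LON_ALIASES:
            renames[name] = "lon"
        elif lowered in LAT_ALIASES:
            renames[name] = "lat"
    return renames


def to_minus180_180(lons, row):
    """Rearrange one grid row from a 0..360 to a -180..+180 longitude axis."""
    if not lons or max(lons) <= 180:
        return list(lons), list(row)  # already in range, nothing to do
    shifted = [lon - 360 if lon > 180 else lon for lon in lons]
    # Sort the longitudes and move the data values along with them.
    order = sorted(range(len(shifted)), key=shifted.__getitem__)
    return [shifted[i] for i in order], [row[i] for i in order]


def time_from_coverage(attrs):
    """Derive a single time value from the coverage attributes, if present.

    Using the midpoint of the coverage interval is an assumption.
    """
    start = attrs.get("time_coverage_start")
    end = attrs.get("time_coverage_end")
    if start is None or end is None:
        return None
    t0 = datetime.fromisoformat(start)
    t1 = datetime.fromisoformat(end)
    return t0 + (t1 - t0) / 2
```

For example, a row sampled at longitudes 0, 90, 180 and 270 is reordered so that the value at 270 moves to the front and is relabelled as -90.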

Actual behavior

There are many places in Cate's code where xr.open_dataset() is called without proper parameterization, e.g. without appropriate chunking. This has a major impact on performance and also on data compatibility, because normalization is missing.

This is also related to #634, #623.

Specifications

Cate master as of 2018-05-10
