[Bug]: Getting data CF-compliant for regridding (with xesmf) #718
Comments
I see in xesmf docs that:
Updating this to False doesn't fix the issue.
The following seems to be working as a work-around:
I think this might indicate that something changed with how xcdat passes information to xesmf (or xesmf changed).
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/xesmf/frontend.py:75, in _get_lon_lat(ds)
74 try:
---> 75 lon = ds.cf['longitude']
76 lat = ds.cf['latitude']
I've isolated the code block in xESMF where the
Also your workaround above might indicate that xCDAT is dropping or not copying over attributes to the lat/lon (which is a regression in behavior from previous versions, I think?).
Looking at the CF info via cf_xarray, both
ngrid.cf
Coordinates:
CF Axes: * X: ['lon']
* Y: ['lat']
Z, T: n/a
CF Coordinates: * longitude: ['lon']
* latitude: ['lat']
vertical, time: n/a
Cell Measures: area, volume: n/a
Standard Names: n/a
Bounds: n/a
Grid Mappings: n/a
Data Variables:
Cell Measures: area, volume: n/a
Standard Names: n/a
Bounds: X: ['lon_bnds']
Y: ['lat_bnds']
lat: ['lat_bnds']
latitude: ['lat_bnds']
lon: ['lon_bnds']
longitude: ['lon_bnds']
Grid Mappings: n/a
ds.cf
Coordinates:
CF Axes: X: ['lon', 'nlon']
Y: ['lat', 'nlat']
* T: ['time']
Z: n/a
CF Coordinates: longitude: ['lon']
latitude: ['lat']
* time: ['time']
vertical: n/a
Cell Measures: area, volume: n/a
Standard Names: latitude: ['lat']
longitude: ['lon']
* time: ['time']
Bounds: n/a
Grid Mappings: n/a
Data Variables:
Cell Measures: area, volume: n/a
Standard Names: surface_downward_mass_flux_of_carbon_dioxide_expressed_as_carbon: ['fgco2']
Bounds: T: ['time_bnds']
X: ['lon_bnds', 'nlon_bnds']
Y: ['lat_bnds']
lat: ['lat_bnds']
latitude: ['lat_bnds']
lon: ['lon_bnds']
longitude: ['lon_bnds']
nlon: ['nlon_bnds']
time: ['time_bnds']
Grid Mappings: n/a
@tomvothecoder – thank you for digging into this...before you get too far – can you reproduce the error? Maybe the issue is on my end (weird environment and old/new cf-xarray or something)?
@pochedls I can reproduce your issue with the latest
I found the root cause of this issue. The dataset has two X and Y axes (
<xarray.Dataset> Size: 2GB
Dimensions: (time: 3420, nlat: 384, nlon: 320, d2: 2, vertices: 4, bnds: 2)
Coordinates:
lat (nlat, nlon) float64 983kB dask.array<chunksize=(384, 320), meta=np.ndarray>
lon (nlat, nlon) float64 983kB dask.array<chunksize=(384, 320), meta=np.ndarray>
* nlat (nlat) int32 2kB 1 2 3 4 5 6 7 8 ... 378 379 380 381 382 383 384
* nlon (nlon) int32 1kB 1 2 3 4 5 6 7 8 ... 314 315 316 317 318 319 320
* time (time) object 27kB 2015-01-15 13:00:00.000007 ... 2299-12-15 1...
The input grid that is extracted (code here) from the dataset consists of
ds.nlat.attrs
{'long_name': 'cell index along second dimension', 'units': '1'}
ds.nlon.attrs
{'long_name': 'cell index along first dimension', 'units': '1', 'bounds': 'nlon_bnds'}
Meanwhile,
ds.lat.attrs
{'axis': 'Y', 'bounds': 'lat_bnds', 'standard_name': 'latitude', 'title': 'Latitude', 'type': 'double', 'units': 'degrees_north', 'valid_max': np.float64(90.0), 'valid_min': np.float64(-90.0)}
ds.lon.attrs
{'axis': 'X', 'bounds': 'lon_bnds', 'standard_name': 'longitude', 'title': 'Longitude', 'type': 'double', 'units': 'degrees_east', 'valid_max': np.float64(360.0), 'valid_min': np.float64(0.0)}
Are
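To make the attribute dumps above easier to reason about, here is a minimal pure-Python sketch of an attribute-based lookup for `longitude` and `latitude`. It is a simplified model, not cf_xarray's or xESMF's actual implementation: the `CF_CRITERIA` table, the `find_cf_coordinate` helper, and the `index_only_grid` example are all hypothetical, and the attribute dictionaries are trimmed copies of the ones printed in this thread. It only shows that variables carrying `standard_name`/`units` metadata can be found, while a grid holding only the index coordinates would raise a `KeyError` like the one in the traceback.

```python
# Simplified model of a CF-style coordinate lookup (NOT cf_xarray's real code).
# A variable counts as "longitude"/"latitude" if one of its attributes says so.
CF_CRITERIA = {
    "longitude": {
        "standard_name": {"longitude"},
        "units": {"degrees_east", "degree_east"},
    },
    "latitude": {
        "standard_name": {"latitude"},
        "units": {"degrees_north", "degree_north"},
    },
}


def find_cf_coordinate(attrs_by_name, key):
    """Return the variable names whose attributes identify them as `key`."""
    criteria = CF_CRITERIA[key]
    matches = [
        name
        for name, attrs in attrs_by_name.items()
        if any(attrs.get(attr) in allowed for attr, allowed in criteria.items())
    ]
    if not matches:
        # Mirrors the KeyError seen in the traceback when nothing matches.
        raise KeyError(key)
    return matches


# Attributes trimmed from the output shown in this thread.
dataset_attrs = {
    "lat": {"axis": "Y", "bounds": "lat_bnds",
            "standard_name": "latitude", "units": "degrees_north"},
    "lon": {"axis": "X", "bounds": "lon_bnds",
            "standard_name": "longitude", "units": "degrees_east"},
    "nlat": {"long_name": "cell index along second dimension", "units": "1"},
    "nlon": {"long_name": "cell index along first dimension", "units": "1",
             "bounds": "nlon_bnds"},
}

# Hypothetical grid that carries only the index coordinates.
index_only_grid = {name: dataset_attrs[name] for name in ("nlat", "nlon")}

print(find_cf_coordinate(dataset_attrs, "longitude"))  # ['lon']

try:
    find_cf_coordinate(index_only_grid, "longitude")
except KeyError as err:
    print("lookup failed for", err)
```

Under these assumptions, the full dataset resolves `longitude` to `lon`, which matches the `ds.cf` report earlier in the thread, while the index-only grid cannot be resolved.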
Got it. I think xcdat used to be able to sort this out since this data structure is what is used for unstructured grids (which can be regridded with ESMF). I'll see if I can find examples in which this did work in the past (I know we tested regridding sea ice or sea surface temperature).
It is hard to trace, but I think that an example dataset that we used to be able to regrid, but that is now problematic, is this one:
I think @jasonb5 was able to get this working by resolving this issue.
What happened?
In regridding some ocean data, many (not all) datasets are returning:
I'm having trouble getting the datasets to be CF-compliant on xcdat 0.6.1 / 0.7.1 / 0.7.3. I think xcdat used to better handle ocean grids (though I'm not sure when this might have changed).
What did you expect to happen? Are there any possible answers you came across?
Ideally xcdat could figure out the input grid automatically. If not, this issue is still useful for sorting out what we need to specify in the metadata to make the dataset CF-compliant for xesmf.
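As a rough illustration of what "specifying the metadata" could look like, the sketch below fills in the `standard_name` and `units` attributes that attribute-based lookups commonly rely on, without overwriting anything already present. The `CF_DEFAULTS` table and the `with_cf_attrs` helper are hypothetical names introduced here, and the thread does not establish that these attributes alone are enough for the dataset in this issue.

```python
# Hypothetical helper: add CF attributes that identify latitude/longitude,
# leaving any attribute that is already set untouched.
CF_DEFAULTS = {
    "latitude": {"standard_name": "latitude", "units": "degrees_north"},
    "longitude": {"standard_name": "longitude", "units": "degrees_east"},
}


def with_cf_attrs(attrs, kind):
    """Return a copy of `attrs` with missing CF attributes for `kind` added."""
    updated = dict(attrs)  # copy so the caller's dictionary is not mutated
    for name, value in CF_DEFAULTS[kind].items():
        updated.setdefault(name, value)
    return updated


# Illustrative input: a latitude variable that lost its CF metadata.
bare_lat = {"title": "Latitude"}
fixed_lat = with_cf_attrs(bare_lat, "latitude")
print(fixed_lat)
```

With xarray, the attribute dictionary of a variable is available as `ds["lat"].attrs`, so the same idea could be applied as `ds["lat"].attrs = with_cf_attrs(ds["lat"].attrs, "latitude")`, assuming the coordinate is named `lat`.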
Minimal Complete Verifiable Example (MCVE)
Relevant log output
Anything else we need to know?
Note the same code works with
p = '/p/css03/esgf_publish/CMIP6/ScenarioMIP/BCC/BCC-CSM2-MR/ssp585/r1i1p1f1/Omon/fgco2/gn/v20190319/'
(though I now think this may be because the grid is rectilinear).
Environment
xcdat 0.7.3
xesmf 0.8.8