combine_by_coords with loaded cftime variable demotes dtype #168
Comments
@jsignell not sure if this is a bug with the loading of cftime variables, the implementation of
Okay so this happens with
In [11]: ds1 = open_virtual_dataset('air1.nc', loadable_variables=['time', 'lat', 'lon'], cftime_variables=['time'])
In [12]: ds2 = open_virtual_dataset('air2.nc', loadable_variables=['time', 'lat', 'lon'], cftime_variables=['time'])
In [13]: xr.concat([ds1, ds2], coords='minimal', compat='override', dim='time')
Out[13]:
<xarray.Dataset> Size: 8MB
Dimensions: (time: 2920, lat: 25, lon: 53)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
* time (time) float32 12kB 1.867e+06 1.867e+06 ... 1.885e+06 1.885e+06
Data variables:
air (time, lat, lon) int16 8MB ManifestArray<shape=(2920, 25, 53), d...
Attributes:
Conventions: COARDS
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
title: 4x daily NMC reanalysis (1948)
but only because I didn't pass
In [8]: ds1 = open_virtual_dataset('air1.nc', loadable_variables=['time', 'lat', 'lon'], cftime_variables=['time'], indexes={})
In [9]: ds2 = open_virtual_dataset('air2.nc', loadable_variables=['time', 'lat', 'lon'], cftime_variables=['time'], indexes={})
In [10]: xr.concat([ds1, ds2], coords='minimal', compat='override', dim='time')
Out[10]:
<xarray.Dataset> Size: 8MB
Dimensions: (time: 2920, lat: 25, lon: 53)
Coordinates:
* lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
* lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
time (time) datetime64[ns] 23kB 2013-01-01T00:02:06.757437440 ... 201...
Data variables:
air (time, lat, lon) int16 8MB ManifestArray<shape=(2920, 25, 53), d...
Attributes:
Conventions: COARDS
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
title: 4x daily NMC reanalysis (1948)
This almost certainly does just link back to #18 then.
Ran into this same issue:
from virtualizarr import open_virtual_dataset
cf_time_var = open_virtual_dataset('https://climate.northwestknowledge.net/TERRACLIMATE-DATA/TerraClimate_tmax_1958.nc', loadable_variables=['time','lat','lon','crs'], cftime_variables=['time'], indexes={})
no_cf_time_var = open_virtual_dataset('https://climate.northwestknowledge.net/TERRACLIMATE-DATA/TerraClimate_tmax_1958.nc', loadable_variables=['time','lat','lon','crs'], indexes={})
print(cf_time_var.xindexes)
Indexes:
lat PandasIndex
lon PandasIndex
crs PandasIndex
print(no_cf_time_var.xindexes)
Indexes:
lat PandasIndex
lon PandasIndex
time PandasIndex
crs PandasIndex
It seems like specifying
I realized that xr.combine_by_coords should actually already work fine - if you are willing to load the relevant coordinates into memory (and therefore also have those values saved into the resultant references on-disk).
The only issue with this result is that somehow the dtype of time has been changed from datetime64[ns] to float32.
This approach doesn't solve the original issue, but it might also be fine in a lot of cases. xr.combine_by_coords can only auto-order along dimensions that have a 1D coordinate, and 1D variables are small, so if they are split across many files it's likely that you wanted to include them in loadable_variables anyway.
Originally posted by @TomNicholas in #18 (comment)
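For reference, here is a minimal sketch of the auto-ordering behaviour described above, using plain in-memory xarray datasets rather than virtual ones. It does not reproduce the dtype demotion; it only shows what the combined time coordinate is expected to look like when things work. The make_ds helper and the toy coordinate values are assumptions made up for this example and are not part of virtualizarr.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_ds(start, periods=4):
    """Hypothetical helper: small dataset with a 1D datetime `time` coordinate."""
    time = pd.date_range(start, periods=periods, freq=pd.Timedelta(hours=6))
    air = np.zeros((periods, 2, 3), dtype="int16")
    return xr.Dataset(
        {"air": (("time", "lat", "lon"), air)},
        coords={"time": time, "lat": [75.0, 72.5], "lon": [200.0, 202.5, 205.0]},
    )


ds1 = make_ds("2013-01-01")
ds2 = make_ds("2013-01-02")

# Pass the datasets out of order: combine_by_coords orders them using the
# values of the in-memory 1D `time` coordinate.
combined = xr.combine_by_coords([ds2, ds1])

# Checks one might run on the combined result: time is ordered and is still
# a datetime dtype (i.e. it has not been demoted to a float).
time_is_sorted = bool(combined.indexes["time"].is_monotonic_increasing)
time_is_datetime = bool(np.issubdtype(combined["time"].dtype, np.datetime64))
print(combined["time"].dtype, time_is_sorted, time_is_datetime)
```

The same two checks could be applied to the result of combining virtual datasets to detect the demotion reported here.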