Invisible differences between arrays using IntervalIndex #4579
Comments
Thanks for the clear issue @gerritholl. I agree: it's confusing if those two look the same.

Currently, one way of discriminating them:

In [6]: da1.indexes['x']
Out[6]:
IntervalIndex([(0.0, 0.6666666666666666], (0.6666666666666666, 1.3333333333333333], (1.3333333333333333, 2.0]],
              closed='right',
              name='x',
              dtype='interval[float64]')

In [7]: da2.indexes['x']
Out[7]:
Index([                 (0.0, 0.6666666666666666],
       (0.6666666666666666, 1.3333333333333333],
                        (1.3333333333333333, 2.0]],
      dtype='object', name='x')

One option is to push the dtype:

In [8]: da1
Out[8]:
<xarray.DataArray (x: 3)>
array([0, 1, 2])
Coordinates:
  * x        (x) object (0.0, 0.6666666666666666] ... (1.3333333333333333, 2.0]

Could be:

  * x        (x) interval[float64] (0.0, 0.6666666666666666] ... (1.3333333333333333, 2.0]

What are others' thoughts?
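The check above can be wrapped in a small helper. This is only a sketch: `describe_index` is a hypothetical name, not part of xarray, and the two arrays are rebuilt here on the assumption that the coordinate came from `pd.interval_range(0, 2, periods=3)`, which matches the intervals printed above but is not confirmed anywhere in this thread.

```python
import pandas as pd
import xarray as xr


def describe_index(da, dim):
    """Return (index class name, index dtype as a string) for one dimension.

    Hypothetical helper, not an xarray API: it only reads ``da.indexes``.
    """
    index = da.indexes[dim]
    return type(index).__name__, str(index.dtype)


# Assumed reconstruction of the two arrays discussed in this issue.
intervals = pd.interval_range(0, 2, periods=3)
da1 = xr.DataArray([0, 1, 2], dims="x", coords={"x": intervals})
da2 = xr.DataArray([0, 1, 2], dims="x", coords={"x": intervals.to_numpy()})

print(describe_index(da1, "x"))
print(describe_index(da2, "x"))
```

Going by the output shown above, the first call should report an IntervalIndex and the second a plain object-dtype Index, although other xarray or pandas versions may behave differently.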
Perhaps Xarray has been too clever so far regarding how it handles pandas objects passed directly as coordinate data? Expanding on @max-sixty's suggestion, we could:
What happened:
I have two DataArrays that each have a coordinate constructed with pandas.interval_range. In one case I pass the interval_range directly, in the other case I call .to_numpy() first. The two DataArrays look identical but aren't. This can lead to hard-to-find bugs, because behaviour is not identical: the former supports indexing whereas the latter doesn't.

What you expected to happen:
I expect two arrays that appear identical to behave identically. If they don't behave identically then there should be some way to tell the difference (apart from equals, which tells me they are different but not how).

Minimal Complete Verifiable Example:
Results in:
Additional context
I suppose this happens because under the hood xarray does something clever to support pandas-style indexing even though the coordinate variable appears like a numpy array with an object dtype, and that this cleverness is lost if the object is already converted to a numpy array. But there is, as far as I can see, no way to tell the difference once the objects have been created.
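The guess above can be illustrated with pandas alone. The sketch below does not reproduce xarray's internals. It assumes the converted coordinate ends up as an object-dtype `pd.Index` (as the `da2.indexes['x']` output in the comments suggests) and that the coordinate was built with `pd.interval_range(0, 2, periods=3)`. Under those assumptions it shows that the object-dtype index no longer does interval-aware lookup.

```python
import pandas as pd

# Assumed stand-in for the coordinate in this issue.
intervals = pd.interval_range(0, 2, periods=3)

# An IntervalIndex resolves a scalar to the interval that contains it.
pos = intervals.get_loc(0.5)  # 0.5 lies in the first interval

# Forcing object dtype keeps the same Interval objects but drops the
# interval-aware lookup: only exact Interval keys are found.
as_objects = pd.Index(intervals.to_numpy(), dtype=object)

try:
    as_objects.get_loc(0.5)
    scalar_lookup_works = True
except KeyError:
    scalar_lookup_works = False

# Looking up the Interval object itself still succeeds.
exact_pos = as_objects.get_loc(intervals[0])

print(type(intervals).__name__, pos)
print(type(as_objects).__name__, scalar_lookup_works, exact_pos)
```

Both indexes hold the same elements and print the same way element by element, which is consistent with the two DataArrays looking identical while only one of them supports indexing by a value inside an interval.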
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.12.14-lp150.12.82-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.1
pandas: 1.1.4
numpy: 1.19.4
scipy: 1.5.3
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.8.1
h5py: 3.1.0
Nio: None
zarr: 2.5.0
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.7
cfgrib: None
iris: None
bottleneck: None
dask: 2.30.0
distributed: 2.30.1
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20201009
pip: 20.2.4
conda: installed
pytest: 6.1.2
IPython: 7.19.0
sphinx: 3.3.0