Advanced interpolation returning unexpected shape #9839

Closed
5 tasks done
vianneylan opened this issue Nov 29, 2024 · 7 comments · Fixed by #9887

Comments

@vianneylan

vianneylan commented Nov 29, 2024

What happened?

Hi!

I have been using Xarray for quite some time, and recently I started using fresh NC files from https://cds.climate.copernicus.eu/ (until now I was using NC4 files that I have had for quite some time). I have an issue with Xarray's interpolation with advanced indexing: the shape returned by interp() is not what I expect.

Basically, I am just doing advanced interpolation as described here: https://docs.xarray.dev/en/stable/user-guide/interpolation.html#advanced-interpolation
It used to work perfectly with my older datasets.

thanks

Vianney

What did you expect to happen?

I would expect res.shape to be (100,) or similar (and I do get this result with older datasets), but I don't understand why I get (705, 715, 2)! This does not correspond to anything, as can be seen from the description of the dataset below:

Minimal Complete Verifiable Example

dataset.zip

# -*- coding: utf-8 -*-

import dask
import numpy as np
import xarray as xa
import datetime as dt

# open dataset (unzip dataset.zip which is attached to bug report)
dataset = xa.open_mfdataset("dataset.nc", decode_times=True, parallel=True)
start_date = dt.datetime(2023, 1, 1)

np.random.seed(0)

# points generation
N = 100
access_date = np.linspace(0, 5000, N) # seconds since start_date
longitudes = -180 + np.random.rand(N)*360
latitudes = -90 + np.random.rand(N)*180

dates_folded = np.datetime64(start_date) + \
    access_date.astype("timedelta64[s]").astype("timedelta64[ns]")

arguments = {
    "valid_time": xa.DataArray(dates_folded, dims="z"),
    "longitude": xa.DataArray(longitudes, dims="z"),
    "latitude": xa.DataArray(latitudes, dims="z"),
    "method": "linear",
    "kwargs": {"bounds_error": False},
}
with dask.config.set(**{"array.slicing.split_large_chunks": True}):
    data = dataset['tcc'].interp(**arguments)
    
res = np.array(data).T
print(res.shape)

Output:

(705, 715, 2)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

(705, 715, 2)

Anything else we need to know?

The expected output is (100,).
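For readers unfamiliar with why (100,) is the expected shape, here is a minimal sketch on a synthetic, in-memory array (not the attached dataset; the grid sizes and variable names are made up for illustration). It contrasts interp() with plain arrays, which interpolates on the outer product of the new coordinates, with interp() given DataArray indexers that share a dimension "z", which interpolates point by point.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a gridded field; sizes and names are hypothetical.
lat = np.linspace(-90.0, 90.0, 19)
lon = np.linspace(-180.0, 180.0, 37)
field = xr.DataArray(
    np.random.default_rng(0).random((lat.size, lon.size)),
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
)

n_points = 100
rng = np.random.default_rng(1)
point_lat = rng.uniform(-90.0, 90.0, n_points)
point_lon = rng.uniform(-180.0, 180.0, n_points)

# Plain arrays: every new latitude is combined with every new longitude.
orthogonal = field.interp(latitude=point_lat, longitude=point_lon)

# DataArrays sharing the dimension "z": one value per (lat, lon) pair.
pointwise = field.interp(
    latitude=xr.DataArray(point_lat, dims="z"),
    longitude=xr.DataArray(point_lon, dims="z"),
)

print(orthogonal.shape)
print(pointwise.shape)
```

The pointwise call is the pattern used in the example above, so a one-dimensional result along "z" is what the report expects.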

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:27:10) [MSC v.1938 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: fr
LOCALE: ('fr_FR', 'cp1252')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.11.0
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.14.0
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: 3.11.0
zarr: None
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.7.0
distributed: 2024.7.0
matplotlib: 3.9.1
cartopy: 0.23.0
seaborn: None
numbagg: None
fsspec: 2024.6.1
cupy: 13.2.0
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 70.3.0
pip: 24.0
conda: 24.5.0
pytest: 8.2.2
mypy: None
IPython: 8.26.0
sphinx: 7.4.0

@vianneylan added the "bug" and "needs triage" labels Nov 29, 2024
@max-sixty
Collaborator

Respectfully, can you minimize the example, including:

the example is self-contained, including all data and the text of any traceback.

@vianneylan
Author

OK, I updated it. I'm not quite sure what you mean by minimizing?

@max-sixty
Collaborator

It needs to be self-contained — i.e. not requiring someone to download a file.

@max-sixty added the "needs mcve" and "plan to close" labels and removed the "bug" and "needs triage" labels Dec 1, 2024
@vianneylan
Author

The difficulty is that the problem seems to depend on the dataset file: for some older datasets I use, the advanced interpolation works, but not for these newer datasets (which look quite similar and are read properly by open_mfdataset). This is why I attached the dataset file.

If I had been able to tell that this is a pure interpolation problem that I can reproduce on every kind of dataset file I use, I would probably have managed to create a truly self-contained example.

@vianneylan
Author

vianneylan commented Dec 4, 2024

Besides, advanced indexing with .sel appears to work properly, so this really looks related to advanced-indexing interpolation only, regardless of the 'method' and 'kwargs' arguments.
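As a point of comparison, a minimal sketch of the .sel counterpart on a synthetic array (hypothetical names and sizes, not the attached dataset): DataArray indexers sharing the dimension "z" select one grid value per point, so the result is one-dimensional along "z".

```python
import numpy as np
import xarray as xr

# Synthetic grid; sizes and names are hypothetical.
lat = np.linspace(-90.0, 90.0, 19)
lon = np.linspace(-180.0, 180.0, 37)
field = xr.DataArray(
    np.arange(lat.size * lon.size, dtype=float).reshape(lat.size, lon.size),
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
)

n_points = 100
rng = np.random.default_rng(0)

# Pointwise nearest-neighbour selection along the shared dimension "z".
nearest = field.sel(
    latitude=xr.DataArray(rng.uniform(-90.0, 90.0, n_points), dims="z"),
    longitude=xr.DataArray(rng.uniform(-180.0, 180.0, n_points), dims="z"),
    method="nearest",
)

print(nearest.shape)
```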

@vianneylan
Author

vianneylan commented Dec 5, 2024

OK, I think I understand what is happening.
The dataset has some 'index-less' coordinates (there is no star * before them in the coordinates section), and these seem to confuse interp: when I remove these coordinates with drop_vars, the interpolation shape is OK.
So maybe something should be done to ensure that .sel and .interp behave the same way with advanced indexing (with .sel the index-less coordinates were fine, but not with .interp).
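The workaround described above can be sketched on a self-contained, in-memory array. Everything here is an assumption for illustration: the grid is synthetic, and "expver" is a hypothetical name for a string coordinate that is attached to valid_time but is not a dimension coordinate. The sketch only shows the drop_vars step followed by pointwise interpolation; it does not claim to reproduce the (705, 715, 2) result, which may depend on the xarray version.

```python
import numpy as np
import xarray as xr

# Synthetic grid; sizes are hypothetical.
times = np.datetime64("2023-01-01T00:00:00", "ns") + np.arange(4) * np.timedelta64(1, "h")
lat = np.linspace(-90.0, 90.0, 7)
lon = np.linspace(-180.0, 180.0, 13)

tcc = xr.DataArray(
    np.random.default_rng(0).random((times.size, lat.size, lon.size)),
    coords={
        "valid_time": times,
        "latitude": lat,
        "longitude": lon,
        # Hypothetical string coordinate without an index (no "*" in the repr).
        "expver": ("valid_time", np.array(["0001"] * times.size)),
    },
    dims=("valid_time", "latitude", "longitude"),
    name="tcc",
)

# Coordinates that are not dimension coordinates.
non_dim_coords = [name for name in tcc.coords if name not in tcc.dims]

n_points = 100
rng = np.random.default_rng(1)
seconds = np.linspace(0, 5000, n_points).astype("timedelta64[s]")
point_time = np.datetime64("2023-01-01T00:00:00", "ns") + seconds
point_lat = rng.uniform(-90.0, 90.0, n_points)
point_lon = rng.uniform(-180.0, 180.0, n_points)

# Drop the non-dimension coordinates, then interpolate point by point along "z".
result = tcc.drop_vars(non_dim_coords).interp(
    valid_time=xr.DataArray(point_time, dims="z"),
    latitude=xr.DataArray(point_lat, dims="z"),
    longitude=xr.DataArray(point_lon, dims="z"),
)

print(non_dim_coords)
print(result.shape)
```

Dropping the coordinates discards their values from the result, so this is only suitable when they are not needed downstream.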

@dcherian
Contributor

dcherian commented Dec 5, 2024

Yeah, I believe the string non-dim coord is tripping it up.

@dcherian removed the "needs mcve" and "plan to close" labels Dec 5, 2024
dcherian added a commit to dcherian/xarray that referenced this issue Dec 13, 2024
dcherian added a commit to dcherian/xarray that referenced this issue Dec 13, 2024
dcherian added a commit to dcherian/xarray that referenced this issue Dec 13, 2024