Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolation difference between float and da with dims #287

Closed
veenstrajelmer opened this issue Mar 17, 2023 · 3 comments
Closed

Interpolation difference between float and da with dims #287

veenstrajelmer opened this issue Mar 17, 2023 · 3 comments

Comments

@veenstrajelmer
Copy link
Collaborator

veenstrajelmer commented Mar 17, 2023

  • da.interp() with floats gives different results than with da with a new dim. Latter is used in dfmt.interp_regularnc_to_plipoints().
  • The results do seem the same with method='nearest', but that is not a desireable method.
  • Not reproducable with some of the example data of xarray-data.
  • Occurs with xarray 2023.1.0/2023.2.0, but not in not yet in xarray 2022.6.0/2022.12.1.dev15+gd6d24507. No .interp() related changes in https://docs.xarray.dev/en/stable/whats-new.html

Example code, interp_with_da has one extra nan compared to interp_with_floats:

import xarray as xr
file_nc = r'p:\1204257-dcsmzuno\data\CMEMS\nc\DCSM_allAvailableTimes\so_2012-01-06_12-00-00_2012-01-12_12-00-00.nc'
ds = xr.open_dataset(file_nc)

x_xr = xr.DataArray([-9.5],dims=('plipoints'))
y_xr = xr.DataArray([43],dims=('plipoints'))

interp_with_floats = ds.interp(longitude=x_xr.values[0], latitude=y_xr.values[0], method='linear').load().so.isel(time=0)
interp_with_da = ds.interp(longitude=x_xr, latitude=y_xr, method='linear').load().so.isel(time=0).isel(plipoints=0)
print(interp_with_floats.to_numpy())
print(interp_with_da.to_numpy())
print(xr.__version__)

assert (interp_with_floats.isnull()==interp_with_da.isnull()).all()
@veenstrajelmer
Copy link
Collaborator Author

veenstrajelmer commented Mar 31, 2023

This seems to be an issue with scipy introduced in 1.10.0 (pydata/xarray#7701 (comment)). Temporarily set scipy<1.10.0 in requirements.txt, environment.yml and setup.cfg with commit 74eb1db.

Testcase in:

def test_xarray_interp_to_newdim(): #this one fails with scipy>=1.10.0: https://github.com/pydata/xarray/issues/7701
ds = xr.Dataset()
so_np = np.array([[[35.819576, 35.82568 , 35.82873 ],
[35.819576, 35.824154, 35.831783],
[35.822628, 35.824154, 35.82873 ]],
[[35.802788, 35.80584 , 35.815 ],
[35.815 , 35.810417, 35.821102],
[35.824154, 35.813473, 35.81805 ]],
[[35.786003, 35.789055, np.nan],
[35.807365, 35.796684, np.nan],
[35.824154, 35.80584 , np.nan]],
[[35.776848, np.nan, np.nan],
[35.792107, np.nan, np.nan],
[35.822628, np.nan, np.nan]],
[[35.781425, np.nan, np.nan],
[35.792107, np.nan, np.nan],
[35.789055, np.nan, np.nan]]])
ds['so'] = xr.DataArray(so_np,dims=('depth','latitude','longitude'))
ds['longitude'] = xr.DataArray([-9.6, -9.5, -9.4], dims=('longitude'))
ds['latitude'] = xr.DataArray([42.9, 43.0, 43.1], dims=('latitude'))
x_xr = xr.DataArray([-9.5],dims=('plipoints'))
y_xr = xr.DataArray([43],dims=('plipoints'))
interp_with_floats = ds.interp(longitude=x_xr[0], latitude=y_xr[0], method='linear').so #selecting one value from the da drops the new plipoints dimension
interp_with_da_existing = ds.interp(longitude=x_xr.values, latitude=y_xr.values, method='linear').so.isel(longitude=0,latitude=0) #using the DataArray values keeps lat/lon dimenions, gives the same interp result
interp_with_da_newdim = ds.interp(longitude=x_xr, latitude=y_xr, method='linear').so.isel(plipoints=0) #using the DataArray introduces a plipoints dimension, which gives different interp result
print(interp_with_floats.to_numpy())
print(interp_with_da_existing.to_numpy())
print(interp_with_da_newdim.to_numpy())
print(xr.__version__)
assert (interp_with_floats.isnull()==interp_with_da_existing.isnull()).all() #success
assert (interp_with_floats.isnull()==interp_with_da_newdim.isnull()).all() #fails with scipy>=1.10.0

@veenstrajelmer
Copy link
Collaborator Author

Also reproduced with scipy only, but there the scipy version does not matter: scipy/scipy#19318

@veenstrajelmer
Copy link
Collaborator Author

veenstrajelmer commented Oct 4, 2023

Can be solved by combining linear and nearest interpolation like this, this will result in an array with as much valid data as possible, but nans where there were only nans:

interp_with_da_lin = ds.interp(longitude=x_xr, latitude=y_xr, method='linear').load().so.isel(time=0).isel(plipoints=0)
interp_with_da_near = ds.interp(longitude=x_xr, latitude=y_xr, method='nearest').load().so.isel(time=0).isel(plipoints=0)
interp_with_da = interp_with_da_lin.combine_first(interp_with_da_near)

This was implemented in dfm_tools in #561

veenstrajelmer added a commit that referenced this issue Apr 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant