to_xarray() result is incorrect when one of multi-index levels is not sorted #4186
Comments
It seems the problem here is in Lines 4642 to 4643 in 732750a
Then in Lines 4588 to 4589 in 732750a
Besides the performance improvement it provides, #4184 also seems to have the nice side effect of fixing this issue.
Hi @pzhlobi @Li9htmare -- thanks for raising this issue. Could you kindly clarify for me exactly what behavior you think xarray should have? The results are indeed reordered currently, but as far as I can tell the pairing between coordinates and values remains consistent. When I test this myself, I see the same behavior (documented in the first post) either with or without my changes from #4184.
Hi @shoyer, sorry for confusing you; I should have run your code in the first place. Your code removes the problematic Using pzhlobi's example
Using the same
Thanks for clarifying! This raises an interesting question for #4184: do we want to keep @fujiisoup's fix from #3953 or not? If we remove @fujiisoup's fix, then the output we see is:
This is also correct -- coordinates match up with values -- but the order of the result is different from what is currently on master.
I verified that #4184 fixes the tests added for #3953 even after removing the call to The main question is what behavior we want to have: Should I think it's better not to sort (but of course it's better to sort than to get the wrong order).
I agree that it's better not to sort.
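One point in favour of not sorting is that callers who do want sorted coordinates can sort explicitly afterwards. The following is a minimal sketch of that, assuming recent pandas and xarray releases; the index, level names and values are invented for illustration and do not come from this thread.

```python
import pandas as pd
import xarray as xr

# Hypothetical data: the first level is deliberately stored unsorted ("b" before "a").
index = pd.MultiIndex(
    levels=[["b", "a"], [1, 2]],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]],
    names=["lev1", "lev2"],
)
series = pd.Series([10, 11, 20, 21], index=index, name="value")

# Convert to a DataArray with one dimension per index level.
da = xr.DataArray.from_series(series)

# Sorting is an explicit, separate step; values stay attached to their labels.
da_sorted = da.sortby("lev1")
sorted_labels = list(da_sorted["lev1"].values)
value_at_a_1 = da_sorted.sel(lev1="a", lev2=1).item()
```

The reverse is not possible: if the conversion always sorts, the original level order cannot be recovered from the result alone.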
Actually, I realize now that this is basically the same issue as #2619 If I remove the use of
So maybe it is more consistent to keep calling
The sorting seems to be a separate matter, caused by
Hi @shoyer, without
I agree it will be better if we can maintain the order from
@Li9htmare I'm not sure I follow your example. #4184 does remove the use of Is there something specific that you are worried about going wrong with your latest example? For what it's worth, here's what
I think this is doing the right thing already?
Sorry @shoyer, I didn't notice you had pushed new commits to #4184 and thought you meant to just remove the
Let me see if I can rewrite the helper functions to avoid passing around a
This was a good suggestion. Done in 96b544b
The intention of the variables used in constructing the Dataset looks a lot clearer now. Many thanks Stephan!
What happened:
to_xarray() sorts the values of each multi-index level and returns the result for the sorted values, but it does not sort the levels themselves (or it assumes they are already sorted). As a result, the data end up in a completely incorrect order relative to the displayed coordinates.
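For context, a pandas MultiIndex stores each level's unique labels separately from the integer codes that point into them, and the stored labels need not be sorted. The pandas-only sketch below illustrates that distinction with a hypothetical index (not the one from this report). It only shows how a sorted label list can disagree with the stored one; it is not a reproduction of xarray's internals.

```python
import pandas as pd

# Hypothetical index whose first level is stored as ["b", "a"] rather than sorted.
index = pd.MultiIndex(
    levels=[["b", "a"], [1, 2]],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]],
    names=["lev1", "lev2"],
)

# Order in which the labels are stored on the level itself.
stored_level_order = list(index.levels[0])

# Order a sort would produce; codes do not point into this list.
sorted_level_order = sorted(index.levels[0])

# Labels per row, decoded by looking each code up in the stored level.
row_labels = list(index.get_level_values("lev1"))
```

Code that lays out data by the integer codes but labels the axis with the sorted list would attach values to the wrong coordinates, which matches the symptom described above.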
What you expected to happen:
to_xarray() should account for the order of the levels in the original index.
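One way to state the expectation as a check is that every value in the original Series should be found at the same labels in the converted DataArray. The sketch below is illustrative only: the data and the helper `pairing_is_consistent` are hypothetical and are not the reporter's original example.

```python
import pandas as pd
import xarray as xr

# Hypothetical Series whose first index level is stored unsorted.
index = pd.MultiIndex(
    levels=[["b", "a"], ["foo", "bar"]],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]],
    names=["lev1", "lev2"],
)
series = pd.Series([0, 1, 2, 3], index=index, name="C1")


def pairing_is_consistent(series, data_array):
    """Return True if every Series value sits at the same labels in the DataArray."""
    level_names = list(series.index.names)
    return all(
        data_array.sel(dict(zip(level_names, labels))).item() == value
        for labels, value in series.items()
    )


da = xr.DataArray.from_series(series)
consistent = pairing_is_consistent(series, da)
```

On an xarray version that includes the fix discussed in this thread, `consistent` is expected to be true regardless of the order in which the coordinates are displayed.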
Minimal Complete Verifiable Example:
Anything else we need to know?:
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.7-100.fc30.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.3
xarray: 0.15.1
pandas: 1.0.5
numpy: 1.19.0
scipy: 1.5.0
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.1.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.19.0
distributed: 2.19.0
matplotlib: 3.2.2
cartopy: None
seaborn: 0.10.1
numbagg: installed
setuptools: 46.3.0.post20200513
pip: 20.1
conda: None
pytest: 5.4.3
IPython: 7.15.0
sphinx: None