-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pandas=2.0
support
#7724
pandas=2.0
support
#7724
Conversation
It seems this fixes the failing tests unrelated to the Regarding the failing doctests, I decided to cast the values of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here's some comments for the individual changes to make reviewing easier
xarray/tests/test_dataarray.py
Outdated
pytest.param( | ||
np.array([0.0, 0.111, 0.222, 0.333], dtype="float16"), | ||
slice(1, 3), | ||
id="float16", | ||
marks=[ | ||
pytest.mark.skipif( | ||
has_pandas_version_two, reason="not supported for pandas >= 2.0" | ||
) | ||
], | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the main change here: we skip the float16
check with pandas>=2.0
. Another option would be to change the code to explicitly cast to float32
/ float64
, then remove the skipif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicit cast sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to which precision should we cast? float64
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, IIUC these arrays are converted to pandas Indexes, which used to upcast to float64
always.
Perhaps we should upcast for backwards compatibility and raise a DeprecationWarning
asking the user to cast explicitly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this turns out not to be as simple as I thought it would be: the cast to PandasIndex
(and thus pandas.Index
) happens in multiple places. First is the DataArray
/ Dataset
constructor, which through several layers calls safe_cast_to_index
, where the actual cast to PandasIndex
happens. The second cast happens when selecting using a array of values. In that case, get_indexer_nd
calls index.get_indexer
(pandas.Index.get_indexer
), which appears to create a index from the indexer values.
Should both casts emit the DeprecationWarning
? I think the second might be a implementation detail and that we should just cast the indexer values from float16
to float64
(not sure, though).
cc @benbovy
[np.array(["a"]), np.array(["b"]), np.array(["a", "b"])], | ||
[np.array([1], dtype="int64"), np.array([2], dtype="int64"), pd.Index([1, 2])], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, the idea is to avoid the different default precision on windows by explicitly setting the int
precision on construction
(all except 0d)
this allows us to check for expected warnings
@Illviljan, @headtr1ck, I just noticed that the CI version of Otherwise this should be ready for a final review, and the failing datetime-related tests will be fixed by #7731. |
I think this is because of #7270 The pre-commit hook of mypy should be disabled anyway because it takes too long to run (should be actually much faster since mypy >=1). I think we could check if the newest mypy still segfaults or not... |
it didn't for me when I ran the hook, but that might just have been luck. To confirm, we'd need a PR that unpins The |
Locally it always works, it segfaults in CI, which makes it impossible to debug. |
Is there anything I can do to help out on this? It sounds like the blocker is mypy? |
no, As far as I can tell, the only thing left is another round of reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I think we can double check that the only failures are cftimeindex, restore the pin, then merge, and then remove the pin in #7731 |
* main: (34 commits) Update whats-new.rst Fix binning by unsorted array (pydata#7762) Bump codecov/codecov-action from 3.1.1 to 3.1.2 (pydata#7760) Fix typing errors using mypy 1.2 (pydata#7752) [skip-ci] dev whats-new Add whats-new for v2023.04.0 (pydata#7757) remove the `black` hook (pydata#7756) reword the what's new entry for the `pandas` 2.0 dtype changes (pydata#7755) restructure the contributing guide (pydata#7681) Continue to use nanosecond-precision Timestamps in precision-sensitive areas (pydata#7731) minor doc updates to clarify extensions using accessors (pydata#7751) align: Avoid reindexing when join="exact" (pydata#7736) `pandas=2.0` support (pydata#7724) Clarify vectorized indexing documentation (pydata#7747) Avoid recasting a CFTimeIndex (pydata#7735) fix typo (pydata#7746) [pre-commit.ci] pre-commit autoupdate (pydata#7745) Bump pypa/gh-action-pypi-publish from 1.8.4 to 1.8.5 (pydata#7743) preserve boolean dtype in encoding (pydata#7720) [skip-ci] Add alignment benchmarks (pydata#7738) ...
As mentioned in #7716 (comment), this tries to unpin
pandas
.