-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Ellipsis typehint to reductions #7048
Conversation
Excellent @headtr1ck ! Do we need to run |
Could any dev that uses linux rerun the generate_reductions and |
Turns out that the buildin |
Here's the diff from pytest-accept (it is weird that it's slightly different on windows...) commit 83615a94a6b7c0ae0cf0e0240d7705d9ce6c21e5
Author: Maximilian Roos <[email protected]>
Date: Sat Sep 17 12:36:45 2022 -0700
pytest-accept
diff --git a/xarray/core/_reductions.py b/xarray/core/_reductions.py
index a7cf7ec2..d0c2a9d7 100644
--- a/xarray/core/_reductions.py
+++ b/xarray/core/_reductions.py
@@ -97,7 +97,7 @@ def count(
<xarray.Dataset>
Dimensions: ()
Data variables:
- da int32 5
+ da int64 5
"""
return self.reduce(
duck_array_ops.count,
@@ -4400,7 +4400,7 @@ def count(
>>> da.groupby("labels").count()
<xarray.DataArray (labels: 3)>
- array([1, 2, 2], dtype=int64)
+ array([1, 2, 2])
Coordinates:
* labels (labels) object 'a' 'b' 'c'
"""
@@ -5485,7 +5485,7 @@ def count(
>>> da.resample(time="3M").count()
<xarray.DataArray (time: 3)>
- array([1, 3, 1], dtype=int64)
+ array([1, 3, 1])
Coordinates:
* time (time) datetime64[ns] 2001-01-31 2001-04-30 2001-07-31
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if Dims should include ellipsis
as well? The few times it's missing might be issues with the functions?
Is it just me that this example crashes the second time I run it? import numpy as np
import pandas as pd
import xarray as xr
da = xr.DataArray(
np.array([1, 2, 3, 1, 2, np.nan]),
dims="time",
coords=dict(
time=("time", pd.date_range("01-01-2001", freq="M", periods=6)),
labels=("time", np.array(["a", "b", "c", "c", "b", "a"])),
),
)
da.groupby("labels").count() Traceback (most recent call last):
File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
exec(code, globals, locals)
File "g:\program\dropbox\python\xarray_groupby_windows_diff.py", line 34, in <module>
da.groupby("labels").count()
File "c:\users\j.w\documents\github\xarray\xarray\core\_reductions.py", line 4384, in count
return self._flox_reduce(
File "c:\users\j.w\documents\github\xarray\xarray\core\groupby.py", line 738, in _flox_reduce
result = xarray_reduce(
File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\flox\xarray.py", line 240, in xarray_reduce
ds, *by = xr.broadcast(ds, *by, exclude=exclude_dims)
File "c:\users\j.w\documents\github\xarray\xarray\core\alignment.py", line 1046, in broadcast
args = align(*args, join="outer", copy=False, exclude=exclude)
File "c:\users\j.w\documents\github\xarray\xarray\core\alignment.py", line 765, in align
aligner.align()
File "c:\users\j.w\documents\github\xarray\xarray\core\alignment.py", line 549, in align
self.find_matching_indexes()
File "c:\users\j.w\documents\github\xarray\xarray\core\alignment.py", line 256, in find_matching_indexes
obj_indexes, obj_index_vars = self._normalize_indexes(obj.xindexes)
File "c:\users\j.w\documents\github\xarray\xarray\core\alignment.py", line 205, in _normalize_indexes
pd_idx = safe_cast_to_index(data)
File "c:\users\j.w\documents\github\xarray\xarray\core\utils.py", line 140, in safe_cast_to_index
index = pd.Index(np.asarray(array), **kwargs)
File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\pandas\core\indexes\base.py", line 483, in __new__
data = sanitize_array(data, None, dtype=dtype, copy=copy)
File "C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\pandas\core\construction.py", line 524, in sanitize_array
raise ValueError("index must be specified when data is not list-like")
ValueError: index must be specified when data is not list-like |
Could you specify what you mean by "second time I run it"? |
Just running that script file several times without restarting the console. It might be a Spyder bug though since I can't reproduce it in a stand alone ipython console. This for example works (the first time): import numpy as np
import pandas as pd
import xarray as xr
da = xr.DataArray(
np.array([1, 2, 3, 1, 2, np.nan]),
dims="time",
coords=dict(
time=("time", pd.date_range("01-01-2001", freq="M", periods=6)),
labels=("time", np.array(["a", "b", "c", "c", "b", "a"])),
),
)
da.groupby("labels").count()
da = xr.DataArray(
np.array([1, 2, 3, 1, 2, np.nan]),
dims="time",
coords=dict(
time=("time", pd.date_range("01-01-2001", freq="M", periods=6)),
labels=("time", np.array(["a", "b", "c", "c", "b", "a"])),
),
)
da.groupby("labels").count() |
For me this works in a python terminal, python script and jupyter notebook (I don't use spyder but vscode). |
This PR adds the ellipsis typehint to reductions (only where they behave differently from None to reduce overhead).
Follow up on #7017 (comment)
Additionally I was changing a lot of "one or more dimensions" typehints to
str | Iterable[Hashable]
(See #6142).Some code changes were necessary to support this fully. Before several things were not working with actual hashable dimensions that are not strings.