-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix mypy errors in xarray.py, xrutils.py, cache.py #144
Conversation
for more information, see https://pre-commit.ci
flox/xarray.py
Outdated
@@ -19,7 +19,10 @@ | |||
from .xrutils import _contains_cftime_datetimes, _to_pytimedelta, datetime_to_numeric | |||
|
|||
if TYPE_CHECKING: | |||
from xarray import DataArray, Dataset, Resample | |||
from xarray import DataArray, Dataset # TODO: Use T_DataArray, T_Dataset? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we should explicitly say that xarray.types
(is that right?) is public somewhere on the xarray docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's xarray.core.types
so I suppose it's technically private at the moment. Maybe for the better? I don't think .types
has settled enough yet to start recommending to the larger audience. Doesn't stop us from using it early though! :)
I mainly wrote the ToDo because I had issues with mypy, but this was the solution:
# This errors if obj: T_Dataset | T_DataArray.
if isinstance(obj, xr.DataArray):
ds = obj._to_temp_dataset()
else:
ds = obj
# This passes if obj: T_Dataset | T_DataArray.
if isinstance(obj, xr.Dataset):
ds = obj
else:
ds = obj._to_temp_dataset()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great this would be a good issue to open over at xarray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other reason this is fine is that I'd like to move the contents of this file over to xarray in the long term.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
@headtr1ck, do you know why flake8 passes |
It seems that flake8 does not support it yet fully and you have to convince it to expose it using this config in your setup.cfg:
|
flox/xarray.py
Outdated
@@ -19,7 +19,10 @@ | |||
from .xrutils import _contains_cftime_datetimes, _to_pytimedelta, datetime_to_numeric | |||
|
|||
if TYPE_CHECKING: | |||
from xarray import DataArray, Dataset, Resample | |||
from xarray import DataArray, Dataset # TODO: Use T_DataArray, T_Dataset? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great this would be a good issue to open over at xarray
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 thanks
flox/xarray.py
Outdated
expected_groups = _convert_expected_groups_to_index(expected_groups, isbin, sort=sort) | ||
group_shape = tuple(len(e) for e in expected_groups) | ||
expected_groups = _convert_expected_groups_to_index(expected_groups, isbins, sort=sort) | ||
# TODO: _convert_expected_groups_to_index can return None which is not good |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected_groups
cannot have a None
element at this stage see:
expected_groups[idx] = _get_expected_groups(b_.data, sort=sort, raise_if_dask=True)
This may be complicated from a typing perspective, so the comment should say that not describe a logic bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_get_expected_groups
can also return None
so it's not so easy to untangle this.
And even if expected_groups
was narrowed properly it doesn't matter because _convert_expected_groups_to_index
still has the None in it's return type. Example:
def test2(a: tuple[str | int, ...]) -> tuple[str | int, ...]:
return a
b: tuple[int, ...] = (1, 2)
reveal_type(test2(a=b)) # note: Revealed type is "builtins.tuple[Union[builtins.str, builtins.int], ...]"
There may not be logic bug here but this part of the code is really hard to understand and could do with a little simplification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Agreed. One simplification would be to remove the raise_if_dask
kwarg. It's only set to False in one place, we can explicitly skip it there.
Typing _convert_expected_groups_to_index
is hard because it handles some very flexible user input but happy to hear suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved similar stuff inside the loop which simplified it a little e73f6e8
group_names
can probably be replaced by group_sizes
as well.
Co-authored-by: Deepak Cherian <[email protected]>
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
[pull] main from xarray-contrib:main
I think I'll stop here. core.py can be done in a different PR. Here's what left in core.py:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 👏 👏
nice work!
if isinstance(obj, xr.DataArray): | ||
ds = obj._to_temp_dataset() | ||
else: | ||
if isinstance(obj, xr.Dataset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this rearrangement was weird. Is it a mypy bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the error you get if you isinstance with DataArray:
# obj: Union[T_Dataset, T_DataArray]
if isinstance(obj, xr.DataArray):
ds = obj._to_temp_dataset() # -> xr.Dataset
else:
ds = obj # error: Incompatible types in assignment (expression has type "Union[T_Dataset, T_DataArray]", variable has type "Dataset")
My understanding is that mypy always uses the typing from the first time it is defined (ds: xr.Dataset
narrower typing). It is similar to the typing issues when importing optional modules
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
* main: Update ci-additional.yaml (#167) Refactor before redoing cohorts (#164) Fix mypy errors in core.py (#150) Add link to numpy_groupies (#160) Bump codecov/codecov-action from 3.1.0 to 3.1.1 (#159) Use math.prod instead of np.prod (#157) Remove None output from _get_expected_groups (#152) Fix mypy errors in xarray.py, xrutils.py, cache.py (#144) Raise error if multiple by's are used with Ellipsis (#149) pre-commit autoupdate (#148) Add mypy ignores (#146) Get pre commit bot to update (#145) Remove duplicate examples headers (#147) Add ci additional (#143) Bump mamba-org/provision-with-micromamba from 12 to 13 (#141) Add ASV benchmark CI workflow (#139) Fix func count for dtype O with numpy and numba (#138)
* main: (29 commits) Major fix to subset_to_blocks (#173) Performance improvements for cohorts detection (#172) Remove split_out (#170) Deprecate resample_reduce (#169) More efficient cohorts. (#165) Allow specifying output dtype (#131) Add a dtype check for numpy arrays in assert_equal (#158) Update ci-additional.yaml (#167) Refactor before redoing cohorts (#164) Fix mypy errors in core.py (#150) Add link to numpy_groupies (#160) Bump codecov/codecov-action from 3.1.0 to 3.1.1 (#159) Use math.prod instead of np.prod (#157) Remove None output from _get_expected_groups (#152) Fix mypy errors in xarray.py, xrutils.py, cache.py (#144) Raise error if multiple by's are used with Ellipsis (#149) pre-commit autoupdate (#148) Add mypy ignores (#146) Get pre commit bot to update (#145) Remove duplicate examples headers (#147) ...
Fixes some of #96