
Fix mypy errors in xarray.py, xrutils.py, cache.py #144

Merged on Sep 23, 2022 (49 commits)

Conversation

Contributor

@Illviljan Illviljan commented Sep 19, 2022

Fixes some of #96

@Illviljan Illviljan changed the title Dim typing Fix mypy errors in xarray.py Sep 19, 2022
flox/xarray.py Outdated
@@ -19,7 +19,10 @@
 from .xrutils import _contains_cftime_datetimes, _to_pytimedelta, datetime_to_numeric

 if TYPE_CHECKING:
-    from xarray import DataArray, Dataset, Resample
+    from xarray import DataArray, Dataset  # TODO: Use T_DataArray, T_Dataset?
Collaborator

Sure, we should explicitly say somewhere in the xarray docs that xarray.types (is that right?) is public.

Contributor Author

It's xarray.core.types, so I suppose it's technically private at the moment. Maybe for the better? I don't think .types has settled enough yet to start recommending it to a larger audience. Doesn't stop us from using it early though! :)

I mainly wrote the TODO because I had issues with mypy, but this was the solution:

# This errors if obj: T_Dataset | T_DataArray.
if isinstance(obj, xr.DataArray):
    ds = obj._to_temp_dataset()
else:
    ds = obj

# This passes if obj: T_Dataset | T_DataArray.
if isinstance(obj, xr.Dataset):
    ds = obj
else:
    ds = obj._to_temp_dataset()

Collaborator

Great, this would be a good issue to open over at xarray.

Collaborator

The other reason this is fine is that I'd like to move the contents of this file over to xarray in the long term.
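
For readers unfamiliar with the `if TYPE_CHECKING:` import discussed in this thread, here is a minimal sketch of the pattern. It assumes nothing about flox itself: `Fraction` stands in for `DataArray`/`Dataset`, and `halve` is a hypothetical helper. The annotations are written as strings so the name only has to exist for the type checker.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; this import never runs at runtime.
    from fractions import Fraction


def halve(x: "Fraction") -> "Fraction":
    """Hypothetical helper; Fraction stands in for DataArray/Dataset."""
    return x / 2
```

Because the annotations stay unevaluated strings, the module imports cleanly even if the annotated type is expensive or optional to import.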

@Illviljan
Contributor Author

@headtr1ck, do you know why flake8 passes ellipsis on xarray but not here?

@headtr1ck

@headtr1ck, do you know why flake8 passes ellipsis on xarray but not here?

It seems that flake8 does not fully support it yet, so you have to tell it that the name exists with this config in your setup.cfg:

[flake8]
builtins =
    ellipsis
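
As a rough illustration of why the setting matters: to my knowledge, `ellipsis` is a name that exists in the type stubs for builtins but not at runtime, so a linter that checks names in annotations may report it as undefined unless told otherwise. The sketch below uses a hypothetical helper `normalize_dim` (not a flox or xarray function) and keeps the annotation as a string so the code still runs. It only illustrates the flake8 point and is not a claim about how mypy narrows the type.

```python
from typing import Hashable, Optional, Sequence, Tuple


def normalize_dim(
    dim: "str | Sequence[Hashable] | ellipsis | None",
) -> Optional[Tuple[Hashable, ...]]:
    """Hypothetical helper: return a tuple of dims, or None meaning all dims."""
    # `...` is the runtime Ellipsis singleton; `ellipsis` is only its stub-level type name.
    if dim is None or dim is ...:
        return None
    if isinstance(dim, str):
        return (dim,)
    return tuple(dim)
```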

Collaborator

@dcherian dcherian left a comment


👍 thanks

flox/xarray.py Outdated
expected_groups = _convert_expected_groups_to_index(expected_groups, isbin, sort=sort)
group_shape = tuple(len(e) for e in expected_groups)
expected_groups = _convert_expected_groups_to_index(expected_groups, isbins, sort=sort)
# TODO: _convert_expected_groups_to_index can return None which is not good
Collaborator

@dcherian dcherian Sep 20, 2022

expected_groups cannot have a None element at this stage; see:

expected_groups[idx] = _get_expected_groups(b_.data, sort=sort, raise_if_dask=True)

This may be complicated from a typing perspective, so the comment should say that rather than describe a logic bug.

Contributor Author

@Illviljan Illviljan Sep 20, 2022

_get_expected_groups can also return None, so it's not so easy to untangle this.
Even if expected_groups were narrowed properly, it wouldn't matter, because _convert_expected_groups_to_index still has None in its return type. Example:

def test2(a: tuple[str | int, ...]) -> tuple[str | int, ...]:
    return a


b: tuple[int, ...] = (1, 2)
reveal_type(test2(a=b))  # note: Revealed type is "builtins.tuple[Union[builtins.str, builtins.int], ...]"

There may not be a logic bug here, but this part of the code is really hard to understand and could do with a little simplification.
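
One common way to stop a declared return type from widening the result is `typing.overload`, which lets the checker pick a return type based on the argument type. The sketch below is only an illustration under assumptions: `convert_groups` is a hypothetical stand-in, far simpler than `_convert_expected_groups_to_index`, and whether overloads are practical for the real function's flexible input is an open question.

```python
from typing import Optional, Sequence, Tuple, overload


@overload
def convert_groups(groups: None) -> None: ...
@overload
def convert_groups(groups: Sequence[int]) -> Tuple[int, ...]: ...


def convert_groups(groups: Optional[Sequence[int]]) -> Optional[Tuple[int, ...]]:
    """Hypothetical stand-in, not flox's _convert_expected_groups_to_index."""
    if groups is None:
        return None
    # With the overloads above, a checker sees a non-None result for non-None input.
    return tuple(sorted(groups))
```

With this shape, a caller that passes a `Sequence[int]` no longer has to handle `None` in the result.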

Collaborator

👍 Agreed. One simplification would be to remove the raise_if_dask kwarg. It's only set to False in one place; we can explicitly skip it there.

Typing _convert_expected_groups_to_index is hard because it handles some very flexible user input, but I'm happy to hear suggestions.

Contributor Author

I moved similar stuff inside the loop, which simplified it a little (e73f6e8).
group_names can probably be replaced by group_sizes as well.

@Illviljan Illviljan changed the title Fix mypy errors in xarray.py Fix mypy errors in xarray.py, xrutils.py Sep 21, 2022
@Illviljan Illviljan changed the title Fix mypy errors in xarray.py, xrutils.py Fix mypy errors in xarray.py, xrutils.py, cache.py Sep 21, 2022
@Illviljan
Contributor Author

I think I'll stop here. core.py can be done in a different PR.

Here's what's left in core.py:

flox/core.py:557: error: Incompatible default for argument "axis" (default has type "None", argument has type "Union[int, Sequence[int]]")  [assignment]
flox/core.py:612: error: Incompatible types in assignment (expression has type "Tuple[None]", variable has type "Optional[Mapping[Union[str, Callable[..., Any]], Any]]")  [assignment]
flox/core.py:668: error: No overload variant of "zip" matches argument types "Union[Sequence[str], Sequence[Callable[..., Any]]]", "None", "Any", "Any"  [call-overload]
flox/core.py:668: note: Possible overload variants:
flox/core.py:668: note:     def [_T_co, _T1] __new__(cls, Iterable[_T1], *, strict: bool = ...) -> zip[Tuple[_T1]]
flox/core.py:668: note:     def [_T_co, _T1, _T2] __new__(cls, Iterable[_T1], Iterable[_T2], *, strict: bool = ...) -> zip[Tuple[_T1, _T2]]
flox/core.py:668: note:     def [_T_co, _T1, _T2, _T3] __new__(cls, Iterable[_T1], Iterable[_T2], Iterable[_T3], *, strict: bool = ...) -> zip[Tuple[_T1, _T2, _T3]]
flox/core.py:668: note:     def [_T_co, _T1, _T2, _T3, _T4] __new__(cls, Iterable[_T1], Iterable[_T2], Iterable[_T3], Iterable[_T4], *, strict: bool = ...) -> zip[Tuple[_T1, _T2, _T3, _T4]]
flox/core.py:668: note:     def [_T_co, _T1, _T2, _T3, _T4, _T5] __new__(cls, Iterable[_T1], Iterable[_T2], Iterable[_T3], Iterable[_T4], Iterable[_T5], *, strict: bool = ...) -> zip[Tuple[_T1, _T2, _T3, _T4, _T5]]
flox/core.py:668: note:     def [_T_co] __new__(cls, Iterable[Any], Iterable[Any], Iterable[Any], Iterable[Any], Iterable[Any], Iterable[Any], *iterables: Iterable[Any], strict: bool = ...) -> zip[Tuple[Any, ...]]
flox/core.py:672: error: Argument 1 to "is_nanlen" has incompatible type "None"; expected "Union[str, Callable[..., Any]]"  [arg-type]
flox/core.py:809: error: Unsupported left operand type for + ("Sequence[Any]")  [operator]
flox/core.py:811: error: Unsupported left operand type for + ("Sequence[Any]")  [operator]
flox/core.py:816: error: Incompatible return value type (got "Dict[str, Any]", expected "Dict[Union[str, Callable[..., Any]], Any]")  [return-value]
flox/core.py:816: note: Perhaps you need a type annotation for "results"? Suggestion: "Dict[Union[str, Callable[..., Any]], Any]"
flox/core.py:902: error: Incompatible types in assignment (expression has type "Dict[Union[str, Callable[..., Any]], Any]", variable has type "Dict[str, object]")  [assignment]
flox/core.py:918: error: "object" has no attribute "append"  [attr-defined]
flox/core.py:921: error: "object" has no attribute "append"  [attr-defined]
flox/core.py:928: error: Argument "fill_value" to "chunk_reduce" has incompatible type "Tuple[int]"; expected "Optional[Mapping[Union[str, Callable[..., Any]], Any]]"  [arg-type]
flox/core.py:944: error: "object" has no attribute "append"  [attr-defined]
flox/core.py:957: error: Argument "fill_value" to "chunk_reduce" has incompatible type "Tuple[Any]"; expected "Optional[Mapping[Union[str, Callable[..., Any]], Any]]"  [arg-type]
flox/core.py:962: error: "object" has no attribute "append"  [attr-defined]
flox/core.py:964: error: Incompatible return value type (got "Dict[str, object]", expected "Dict[Union[str, Callable[..., Any]], Any]")  [return-value]
flox/core.py:964: note: Perhaps you need a type annotation for "results"? Suggestion: "Dict[Union[str, Callable[..., Any]], Any]"
flox/core.py:1182: error: Argument 1 to "partial" has incompatible type "object"; expected "Callable[..., Any]"  [arg-type]
flox/core.py:1235: error: Item "None" of "Optional[Any]" has no attribute "to_numpy"  [union-attr]
flox/core.py:1252: error: Incompatible types in assignment (expression has type "Array", variable has type "Dict[Any, Any]")  [assignment]
flox/core.py:1312: error: Incompatible return value type (got "Tuple[Any, ...]", expected "Tuple[Optional[Any]]")  [return-value]
flox/core.py:1468: error: Argument 1 to "_validate_reindex" has incompatible type "Optional[bool]"; expected "bool"  [arg-type]
flox/core.py:1482: error: Incompatible types in assignment (expression has type "Tuple[bool, ...]", variable has type "bool")  [assignment]
flox/core.py:1498: error: Argument 2 to "_convert_expected_groups_to_index" has incompatible type "bool"; expected "Sequence[bool]"  [arg-type]
flox/core.py:1502: error: Argument 1 to "any" has incompatible type "bool"; expected "Iterable[object]"  [arg-type]
flox/core.py:1566: error: Argument 4 to "_initialize_aggregation" has incompatible type "Optional[int]"; expected "int"  [arg-type]
flox/core.py:1578: error: Item "str" of "Union[str, Aggregation]" has no attribute "name"  [union-attr]
flox/core.py:1590: error: Item "ndarray[Any, dtype[Any]]" of "Union[ndarray[Any, dtype[Any]], Any, ndarray[Any, Any]]" has no attribute "chunks"  [union-attr]
flox/core.py:1590: error: Item "ndarray[Any, Any]" of "Union[ndarray[Any, dtype[Any]], Any, ndarray[Any, Any]]" has no attribute "chunks"  [union-attr]
flox/core.py:1603: error: Item "ndarray[Any, dtype[Any]]" of "Union[ndarray[Any, dtype[Any]], Any, ndarray[Any, Any]]" has no attribute "chunks"  [union-attr]
flox/core.py:1603: error: Item "ndarray[Any, Any]" of "Union[ndarray[Any, dtype[Any]], Any, ndarray[Any, Any]]" has no attribute "chunks"  [union-attr]
flox/core.py:1634: error: Incompatible types in assignment (expression has type "List[Union[ndarray[Any, Any], Any]]", variable has type "Tuple[Any]")  [assignment]
Found 30 errors in 1 file (checked 10 source files)
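
Two of the error kinds above recur and have fairly standard fixes. The sketch below shows them on hypothetical functions (`normalize_axis` and `collect` are not flox code): a `None` default needs `Optional` in the annotation, and a dict whose values are appended to needs a value type more specific than `object`.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple, Union


def normalize_axis(
    axis: Optional[Union[int, Sequence[int]]] = None, ndim: int = 1
) -> Tuple[int, ...]:
    """Hypothetical helper: Optional[...] makes the None default type-correct."""
    if axis is None:
        return tuple(range(ndim))
    if isinstance(axis, int):
        return (axis,)
    return tuple(axis)


def collect(values: Sequence[Any]) -> Dict[str, List[Any]]:
    """Hypothetical helper: annotate the dict so .append is known to exist."""
    results: Dict[str, List[Any]] = {"intermediates": []}
    for value in values:
        results["intermediates"].append(value)
    return results
```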

@Illviljan Illviljan marked this pull request as ready for review September 21, 2022 21:49
Collaborator

@dcherian dcherian left a comment

👏 👏 👏

nice work!

-    if isinstance(obj, xr.DataArray):
-        ds = obj._to_temp_dataset()
-    else:
+    if isinstance(obj, xr.Dataset):
Collaborator

This rearrangement was weird. Is it a mypy bug?

Contributor Author

This is the error you get if you check isinstance against DataArray:

    # obj: Union[T_Dataset, T_DataArray]
    if isinstance(obj, xr.DataArray):
        ds = obj._to_temp_dataset()  # -> xr.Dataset
    else:
        ds = obj  # error: Incompatible types in assignment (expression has type "Union[T_Dataset, T_DataArray]", variable has type "Dataset")

My understanding is that mypy fixes a variable's type at its first assignment (here ds: xr.Dataset, the narrower type), so the later assignment of the wider type fails. It is similar to the typing issues you hit when importing optional modules.
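
A minimal sketch of the first-assignment behaviour described above, under loud assumptions: `Dataset` and `DataArray` here are hypothetical stand-in classes, and `obj` is a plain `Union` rather than xarray's TypeVars. Declaring the variable's type before branching means neither branch sets it by inference. Whether the same annotation resolves the `T_Dataset | T_DataArray` case was not checked here.

```python
from typing import Union


class Dataset:
    """Hypothetical stand-in for xr.Dataset."""


class DataArray:
    """Hypothetical stand-in for xr.DataArray."""

    def _to_temp_dataset(self) -> Dataset:
        return Dataset()


def as_dataset(obj: Union[Dataset, DataArray]) -> Dataset:
    ds: Dataset  # declared up front, so neither branch sets the type by inference
    if isinstance(obj, DataArray):
        ds = obj._to_temp_dataset()
    else:
        ds = obj
    return ds
```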

@Illviljan Illviljan merged commit 2b54c5e into xarray-contrib:main Sep 23, 2022
dcherian added a commit that referenced this pull request Oct 9, 2022
* main:
  Update ci-additional.yaml (#167)
  Refactor before redoing cohorts (#164)
  Fix mypy errors in core.py (#150)
  Add link to numpy_groupies (#160)
  Bump codecov/codecov-action from 3.1.0 to 3.1.1 (#159)
  Use math.prod instead of np.prod (#157)
  Remove None output from _get_expected_groups (#152)
  Fix mypy errors in xarray.py, xrutils.py, cache.py (#144)
  Raise error if multiple by's are used with Ellipsis (#149)
  pre-commit autoupdate (#148)
  Add mypy ignores (#146)
  Get pre commit bot to update (#145)
  Remove duplicate examples headers (#147)
  Add ci additional (#143)
  Bump mamba-org/provision-with-micromamba from 12 to 13 (#141)
  Add ASV benchmark CI workflow (#139)
  Fix func count for dtype O with numpy and numba (#138)
dcherian added a commit that referenced this pull request Oct 17, 2022
* main: (29 commits)
  Major fix to subset_to_blocks (#173)
  Performance improvements for cohorts detection (#172)
  Remove split_out (#170)
  Deprecate resample_reduce (#169)
  More efficient cohorts. (#165)
  Allow specifying output dtype (#131)
  Add a dtype check for numpy arrays in assert_equal (#158)
  Update ci-additional.yaml (#167)
  Refactor before redoing cohorts (#164)
  Fix mypy errors in core.py (#150)
  Add link to numpy_groupies (#160)
  Bump codecov/codecov-action from 3.1.0 to 3.1.1 (#159)
  Use math.prod instead of np.prod (#157)
  Remove None output from _get_expected_groups (#152)
  Fix mypy errors in xarray.py, xrutils.py, cache.py (#144)
  Raise error if multiple by's are used with Ellipsis (#149)
  pre-commit autoupdate (#148)
  Add mypy ignores (#146)
  Get pre commit bot to update (#145)
  Remove duplicate examples headers (#147)
  ...

3 participants