-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: avoid inherit_names for DatetimeIndexOpsMixin #48015
Conversation
Over multiple PRs, my goal is to remove The alternative is to keep Adding little helper methods as done in #36742 and here has the advantage that we can easily express the type of properties, it is more difficult for the implementation and the annotations to get out of sync, and it might also be more readable. Mypy doesn't seem to support properties with decorators but it seems that it just ignores them, which should be fine in most/all cases ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM cc @jbrockmendel
pandas/core/indexes/datetimelike.py
Outdated
return self._data.inferred_freq | ||
|
||
@cache_readonly | ||
def _resolution_obj(self) -> Resolution | None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is None possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DatetimeLikeArrayMixin._resolution_obj
can theoretically be None
but that is never the case when running the tests!
I could make sense to simplify DatetimeLikeArrayMixin._resolution_obj
pandas/pandas/core/arrays/datetimelike.py
Line 968 in 926b9ce
def _resolution_obj(self) -> Resolution | None: |
@property
def _resolution_obj(self) -> Resolution:
freqstr = self.freqstr
return Resolution.get_reso_from_freqstr(freqstr) # the KeyError is never triggered in the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mh, in test pandas/tests/arrays/test_datetimes.py(82)test_fields(), self.freq
is None but _resolution_obj
is not called during this test but it would return None
in this case. I came across this when testing whether freqstr
can be None
(and in this case it is None
): if freqstr
can be None
, _resolution_obj
can be None
.
One argument for removing None
from at least _resolution_obj
(but not freqstr
) is that places that call it, do not check for None and would anyways error there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah, looks like it is only relevant for TDA, and we could re-use the logic (minus self.tz) from DTA._resolution_obj or that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm not following: do you suggest moving _resolution_obj
(or freqstr
) to a different class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DatetimeLikeArrayMixin._resolution_obj
is only ever called for PeriodArray
in the tests (other arrays might implement their own version?) and DatetimeIndexOpsMixin._resolution_obj
is never called at least in the tests (I guess sub-classes implement it).
Cannot make DatetimeIndexOpsMixin._resolution_obj
abstract as DatetimeIndex
and TimedeltaIndex
do not "implement" it (DatetimeIndex
uses inherit_names
to replace it but it didn't see anything for TimedeltaIndex
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be fine leaving it as Resolution | None
for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we don't need TDA._resolution_obj to exist, then we can just move it. otherwise, i think something like
@property
def _resolution_obj(self) -> Resolution:
tz = getattr(self, "tz", None) # i.e. None for TimedeltaArray
return get_resolution(self.asi8, tz, reso=self._reso)
works for DTA/TDA, and the current DatetimeLikeArrayMixin._resolution_obj becomes PeriodArray._resolution_obj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is TimedeltaIndex
supposed to have _resolution_obj
(and resolution
):
>>> pd.TimedeltaIndex([1])._resolution_obj # None
>>> pd.TimedeltaIndex([1]).resolution
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pandas/_libs/properties.pyx", line 37, in pandas._libs.properties.CachedProperty.__get__
File "python3.10/site-packages/pandas/core/indexes/extension.py", line 59, in cached
return getattr(self._data, name)
File "python3.10/site-packages/pandas/core/arrays/datetimelike.py", line 937, in resolution
return self._resolution_obj.attrname # type: ignore[union-attr]
AttributeError: 'NoneType' object has no attribute 'attrname'
If it is not supposed to have that, a good solution (for at least for the index classes) would be to make DatetimeIndexOpsMixin._resolution_obj -> Resolution
abstract (all sub-classes except TimedeltaIndex implement it and return Resolution
).
Not a hill I plan to die on, but it isn't obvious to me that this is worthwhile. we implemented inherit_names in the first place for boilerplate-reduction, which is still a good goal. |
I think I addressed all comments but now we also have some small implementation changes. If you could please have another look @jbrockmendel @mroeschke |
Generally looks okay to me. I don't have a strong opinion in which direction |
@@ -393,7 +392,7 @@ def is_full(self) -> bool: | |||
if not self.is_monotonic_increasing: | |||
raise ValueError("Index is not monotonic") | |||
values = self.asi8 | |||
return ((values[1:] - values[:-1]) < 2).all() | |||
return ((values[1:] - values[:-1]) < 2).all().item() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed? seems like the kind of thing that might be fragile across numpy versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is annotated as returning bool, but it currently return numpy.bool_ (which does not seem to inherit from bool):
>>> type((np.random.rand(3) > 0.5).all())
<class 'numpy.bool_'>
>>> isinstance((np.random.rand(3) > 0.5).all(), bool)
False
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
Opened #49804 |
Similar to #36742
xref #32100