-
Notifications
You must be signed in to change notification settings - Fork 928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Indices are immutable by definition and should take advantage of that #13357
Labels
0 - Backlog
In queue waiting for assignment
Performance
Performance related issue
Python
Affects Python cuDF API.
Comments
Here's a (possibly too magic) way of implementing this without too much code duplication: from functools import cached_property
class Immutable(type):
__immutable_properties__ = ()
def __init__(cls, name, bases, dict):
super().__init__(name, bases, dict)
for prop in cls.__immutable_properties__:
cprop = cached_property(lambda self: getattr(super(cls, self), prop))
cprop.__set_name__(cls, prop)
setattr(cls, prop, cprop)
class Index(SingleColumnFrame, metaclass=Immutable):
__immutable_properties__ = ("is_monotonic_increasing", ...)
... |
Related to #9593 (the immutability is also mentioned there). |
3 tasks
rapids-bot bot
pushed a commit
that referenced
this issue
Jun 30, 2023
…3625) Much of the index-specific search code for label-based lookup only worked in the case where the index was sorted in ascending order and/or the requested slice had the same step sign as the index. To fix this, handle ascending and descending sorted indices specially, as well as refactoring to remove unused codepaths. The core idea is to push `find_first` and `find_last` to being generic column methods (in doing so we remove the `closest` argument, but that was only used to produce pandas-incompatible index behaviour). Lookup in indices then uses `get_slice_bound` (that can be specialised for index types) that uses first (if applicable) `searchsorted` and then `find_first/last`. While we're here, since we now check the sortedness properties of Index objects, turn them into `@cached_property` (partially addressing the request of #13357). - Closes #8693 - Closes #12833 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Ashwin Srinath (https://github.com/shwina) URL: #13625
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
0 - Backlog
In queue waiting for assignment
Performance
Performance related issue
Python
Affects Python cuDF API.
Is your feature request related to a problem? Please describe.
Indices advertise a bunch of properties (for example
is_monotonic_increasing
). These are implemented by inheriting fromFrame
orSingleColumnFrame
. But those latter classes are mutable, and so their implementation of the property must recompute every time.For an index, this is not the case, and so we lose some performance (for example, every time we do a slice
.loc
we'll run a libcudf kernel to check if the index is sorted).Describe the solution you'd like
Index properties (especially where they return scalars, and therefore the memory footprint is negligible) should use
functools.cached_property
to provide their (immutable) properties.Describe alternatives you've considered
None
The text was updated successfully, but these errors were encountered: