Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Indices are immutable by definition and should take advantage of that #13357

Open
wence- opened this issue May 16, 2023 · 2 comments
Open
Labels
0 - Backlog In queue waiting for assignment Performance Performance related issue Python Affects Python cuDF API.

Comments

@wence-
Copy link
Contributor

wence- commented May 16, 2023

Is your feature request related to a problem? Please describe.

Indices advertise a bunch of properties (for example is_monotonic_increasing). These are implemented by inheriting from Frame or SingleColumnFrame. But those latter classes are mutable, and so their implementation of the property must recompute every time.

For an index, this is not the case, and so we lose some performance (for example, every time we do a slice .loc we'll run a libcudf kernel to check if the index is sorted).

Describe the solution you'd like

Index properties (especially where they return scalars, and therefore the memory footprint is negligible) should use functools.cached_property to provide their (immutable) properties.

Describe alternatives you've considered

None

@wence- wence- added feature request New feature or request Needs Triage Need team to review and classify labels May 16, 2023
@wence-
Copy link
Contributor Author

wence- commented May 16, 2023

Here's a (possibly too magic) way of implementing this without too much code duplication:

from functools import cached_property

class Immutable(type):
    __immutable_properties__ = ()
    def __init__(cls, name, bases, dict):
        super().__init__(name, bases, dict)
        for prop in cls.__immutable_properties__:
            cprop = cached_property(lambda self: getattr(super(cls, self), prop))
            cprop.__set_name__(cls, prop)
            setattr(cls, prop, cprop)


class Index(SingleColumnFrame, metaclass=Immutable):
    __immutable_properties__ = ("is_monotonic_increasing", ...)
    ...

@vyasr
Copy link
Contributor

vyasr commented Jun 2, 2023

Related to #9593 (the immutability is also mentioned there).

@GregoryKimball GregoryKimball added 0 - Backlog In queue waiting for assignment Python Affects Python cuDF API. Performance Performance related issue and removed feature request New feature or request Needs Triage Need team to review and classify labels Jun 7, 2023
rapids-bot bot pushed a commit that referenced this issue Jun 30, 2023
…3625)

Much of the index-specific search code for label-based lookup only worked in the case where the index was sorted in ascending order and/or the requested slice had the same step sign as the index. To fix this, handle ascending and descending sorted indices specially, as well as refactoring to remove unused codepaths.

The core idea is to push `find_first` and `find_last` to being generic column methods (in doing so we remove the `closest` argument, but that was only used to produce pandas-incompatible index behaviour).

Lookup in indices then uses `get_slice_bound` (that can be specialised for index types) that uses first (if applicable) `searchsorted` and then `find_first/last`.

While we're here, since we now check the sortedness properties of Index objects, turn them into `@cached_property` (partially addressing the request of #13357).

- Closes #8693
- Closes #12833

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)

URL: #13625
@vyasr vyasr added this to cuDF Python Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 - Backlog In queue waiting for assignment Performance Performance related issue Python Affects Python cuDF API.
Projects
Status: Todo
Development

No branches or pull requests

3 participants