[FEA]: Preserve RangeIndex in more setops cases #14013

Closed
mroeschke opened this issue Aug 31, 2023 · 0 comments · Fixed by #14053
Labels
feature request New feature or request Python Affects Python cuDF API.

Is your feature request related to a problem? Please describe.
Sometimes the result of `difference`, `intersection`, or `union` could still be represented as a `RangeIndex`, but an `Int64Index` is returned instead:

In [3]: from cudf import *

In [4]: idx = Index(range(4))[::-1]
   ...: other = Index(range(3, 4))
   ...:
   ...: result = idx.difference(other)

In [5]: idx
Out[5]: RangeIndex(start=3, stop=-1, step=-1)

In [6]: other
Out[6]: RangeIndex(start=3, stop=4, step=1)

In [7]: result
Out[7]: Int64Index([0, 1, 2], dtype='int64')

Describe the solution you'd like
In the case above, `RangeIndex(0, 3, 1)` could have been returned, since the result `[0, 1, 2]` is evenly spaced.
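
To illustrate the idea, here is a minimal pure-Python sketch of checking whether a sorted result is evenly spaced and can therefore be described by `start`, `stop`, and `step`. The helper name `try_as_range` is hypothetical, Python's built-in `range` stands in for `RangeIndex`, and this is not how cuDF or pandas actually implement it.

```python
from typing import Optional, Sequence


def try_as_range(values: Sequence[int]) -> Optional[range]:
    """Return an equivalent range if `values` are evenly spaced, else None.

    Hypothetical helper for illustration only; not a cuDF or pandas API.
    """
    n = len(values)
    if n == 0:
        return range(0)
    if n == 1:
        return range(values[0], values[0] + 1)
    step = values[1] - values[0]
    if step == 0:
        return None  # repeated values cannot be expressed as a range
    for prev, cur in zip(values, values[1:]):
        if cur - prev != step:
            return None  # spacing is not constant
    return range(values[0], values[-1] + step, step)


# Mirror the example from the issue, using plain ranges and sets.
idx = range(3, -1, -1)    # 3, 2, 1, 0
other = range(3, 4)       # 3
# Assumption: the difference is returned sorted, as in the output shown above.
diff = sorted(set(idx) - set(other))
result = try_as_range(diff)
print(result)
```

Here `diff` is `[0, 1, 2]`, so the helper can return a range with start 0, stop 3, and step 1 instead of materializing the values.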

Describe alternatives you've considered
Casting directly to a RangeIndex


@mroeschke added the "feature request" and "Python" labels, and added then removed the "Needs Triage" label, on Aug 31, 2023
@galipremsagar self-assigned this on Sep 7, 2023
rapids-bot bot pushed a commit that referenced this issue Sep 8, 2023
This PR fixes `Index.difference` in the following ways:

- [x] Fixes `name` preservation by correctly evaluating the names of the two input objects; closes #14019
- [x] Fixes `is_mixed_with_object_dtype` handling, which resolves incorrect results for `CategoricalIndex`; closes #14022
- [x] Raises errors for invalid input types; the error messages exactly match the pandas error messages for parity.
- [x] Introduces `Range._try_reconstruct_range_index`, which tries to reconstruct a `RangeIndex` from an `Int..Index` to save memory, on par with pandas; closes #14013
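
The memory argument in the last item can be sketched with simple arithmetic: a range is fully described by three integers, while a materialized 64-bit integer index needs one 8-byte slot per element. The snippet below is only an illustration using Python's standard `array` module; the names `materialized_nbytes` and `RANGE_DESCRIPTION_NBYTES` are hypothetical, and cuDF's actual memory layout is not shown in this issue.

```python
from array import array


def materialized_nbytes(r: range) -> int:
    """Bytes needed to store every value of `r` as a signed 64-bit integer."""
    buf = array("q", r)  # 'q' is the signed long long type code
    return len(buf) * buf.itemsize


# Assumption: start, stop, and step are each stored as one 64-bit integer.
RANGE_DESCRIPTION_NBYTES = 3 * 8

small = materialized_nbytes(range(0, 3))
large = materialized_nbytes(range(1_000_000))
print(small, large, RANGE_DESCRIPTION_NBYTES)
```

The saving is negligible for the three-element example in this issue, but it grows linearly with the length of the index.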

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #14053