[BUG] reindexing a DataFrame with a CategoricalIndex incorrectly matches label and drops Index name #13900

Closed
mroeschke opened this issue Aug 17, 2023 · 1 comment · Fixed by #13917
Assignees: galipremsagar
Labels: bug (Something isn't working), Python (Affects Python cuDF API.)

Comments

@mroeschke
Contributor

Describe the bug
Reindexing a DataFrame that has a CategoricalIndex fails to match a label that exists in the index (the row comes back as <NA>) and drops the index name.

Steps/Code to reproduce bug

In [53]: import cudf; import numpy as np

In [54]: cudf.__version__
Out[54]: '23.10.00'

In [55]: df3 = cudf.DataFrame(
    ...:     {"A": np.arange(3), "B": cudf.Series(list("abc")).astype("category")}
    ...: )
    ...: df3 = df3.set_index("B")
    ...: df3
Out[55]: 
   A
B   
a  0
b  1
c  2

In [56]: df3.reindex(["a", "e"])
Out[56]: 
      A
a  <NA>
e  <NA>

In [57]: df3.to_pandas().reindex(["a", "e"])
Out[57]: 
     A
B     
a  0.0
e  NaN

Expected behavior

  1. Reindexing by an existing category of a CategoricalIndex should return that row's value, not <NA> (pandas returns 0.0 for "a" above).
  2. The existing CategoricalIndex name ("B" above) should be preserved in the result.
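
The two expectations above can be written as checks against pandas, which the report uses as the reference. This is a minimal sketch that runs on pandas alone (no GPU needed); the variable names are illustrative, and the same assertions would presumably apply to the cuDF result once the behavior matches.

```python
import numpy as np
import pandas as pd

# Same frame as in the report, built with pandas instead of cuDF.
pdf = pd.DataFrame(
    {"A": np.arange(3), "B": pd.Series(list("abc")).astype("category")}
).set_index("B")

result = pdf.reindex(["a", "e"])

# 1. "a" is an existing category, so its row keeps its value.
assert result["A"].iloc[0] == 0
# "e" is not in the index, so its row is filled with NaN.
assert np.isnan(result["A"].iloc[1])
# 2. The index name is carried over to the result.
assert result.index.name == "B"
```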

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: conda

@mroeschke added the bug (Something isn't working) and Needs Triage (Need team to review and classify) labels on Aug 17, 2023
@galipremsagar self-assigned this on Aug 18, 2023
@galipremsagar added the Python (Affects Python cuDF API.) label and removed the Needs Triage label on Aug 18, 2023
@galipremsagar
Contributor

This also happens for any Index object.

rapids-bot bot pushed a commit that referenced this issue Aug 18, 2023
Fixes: #13900 

This PR fixes an issue in the `reindex` API where the `name` of the index being reindexed was lost. The behavior now matches pandas: the new index's name is used if it exists; otherwise the old name is preserved.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #13917
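
The naming rule described in the commit message can be observed in pandas, which is the behavior the fix targets. The sketch below uses a plain pandas Index rather than a CategoricalIndex, and the name "C" is an arbitrary example; it illustrates the pandas side only and is not the code from the PR.

```python
import pandas as pd

pdf = pd.DataFrame({"A": [0, 1, 2]}, index=pd.Index(["a", "b", "c"], name="B"))

# Plain list: it carries no name of its own, so the old index name is kept.
kept = pdf.reindex(["a", "e"])

# Named Index: its name replaces the old one.
renamed = pdf.reindex(pd.Index(["a", "e"], name="C"))

print(kept.index.name, renamed.index.name)
```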
2 participants