[BUG] DataFrame loc indexing is incorrect with repeated column labels. #13269

Closed
Tracked by #12793
wence- opened this issue May 2, 2023 · 2 comments
Assignees
Labels
bug (Something isn't working), improvement (Improvement / enhancement to an existing function), Python (Affects Python cuDF API.)

Comments

@wence-
Contributor

wence- commented May 2, 2023

Describe the bug

This is basically #13266 but for loc. I will fix it separately because the two go through different code paths.

import cudf
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(4).reshape(2, 2))
cdf = cudf.from_pandas(df)

df.loc[:, [0, 1, 0]]
#    0  1  0
# 0  0  1  0
# 1  2  3  2

cdf.loc[:, [0, 1, 0]]
#    0  1
# 0  0  1
# 1  2  3

This is because ColumnAccessor.select_by_label uniquifies input label arguments.
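
To illustrate the effect without needing a GPU, here is a minimal pure-Python sketch of how keying a selection by label drops repeats. Both functions and the dict-of-lists layout are hypothetical stand-ins for illustration, not the actual ColumnAccessor code, and the real uniquifying step may work differently.

```python
# Hypothetical stand-ins for illustration; not cuDF's ColumnAccessor code.

def select_uniquified(columns, labels):
    """Select through a dict keyed by label, so a repeated label collapses."""
    return list({label: columns[label] for label in labels}.items())


def select_preserving_repeats(columns, labels):
    """Emit one (label, column) pair per requested label, keeping repeats."""
    return [(label, columns[label]) for label in labels]


# Same data as np.arange(4).reshape(2, 2), stored column-wise.
columns = {0: [0, 2], 1: [1, 3]}
labels = [0, 1, 0]

# Labels that survive each selection strategy.
print([label for label, _ in select_uniquified(columns, labels)])          # [0, 1]
print([label for label, _ in select_preserving_repeats(columns, labels)])  # [0, 1, 0]
```

The first result has the shape of the cdf output above, and the second has the shape of the pandas output.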

Expected behavior

This should match pandas.

wence- added the bug, Python, and improvement labels May 2, 2023
wence- self-assigned this May 2, 2023
@wence-
Contributor Author

wence- commented May 2, 2023

This is a consequence of #13273, and how it will be fixed depends on what we do there.

@mroeschke
Contributor

This was fixed by #16514, so closing.

Projects
Status: Done
2 participants