[BUG] DataFrame slicing produces incorrect results in empty cases #10292

Closed
brandon-b-miller opened this issue Feb 15, 2022 · 0 comments · Fixed by #10310
Labels
bug (Something isn't working), Python (Affects Python cuDF API.)

Comments

@brandon-b-miller
Contributor

Describe the bug
While investigating a review suggestion, I encountered several bugs in the way DataFrames handle slicing, all of which lead to incorrect results.

Steps/Code to reproduce bug

  1. A slice with step > 1 fails to slice an empty (index-only) DataFrame:
>>> import cudf
>>> df = cudf.DataFrame(index=range(10))  # setup not shown in the report; inferred from the output below
>>> df[::3]
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

But the same slice works on a non-empty DataFrame:

>>> df['a'] = 42
>>> df[::3]
    a
0  42
3  42
6  42
9  42
  2. Out-of-bounds slicing sometimes leads to a libcudf error:
>>> df = cudf.DataFrame(index=[1,2,3])
>>> slc = slice(3, -5, 2)
>>> df[slc]
RuntimeError: cuDF failure at: /cudf/cpp/src/filling/sequence.cu:128: size must be >= 0
  3. Slicing with a negative step does not produce the correct result in empty cases:
# cudf
>>> df = cudf.DataFrame(index=range(10))
>>> df[::-1]
Empty DataFrame
Columns: []
Index: []
# pandas
>>> df.to_pandas()[::-1]
Empty DataFrame
Columns: []
Index: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

The above works as expected for nonempty cases.

Expected behavior
Slicing a DataFrame that has only an index should simply gather the index as specified. In general, I think indexing in cuDF should follow pandas, which probably follows NumPy under the hood.
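
As a rough illustration of the "gather the index as specified" behavior, here is a minimal pure-Python sketch that normalizes a slice with the built-in `slice.indices` and then gathers the selected labels. The helper name `gather_index` is hypothetical and is not part of cuDF or pandas; this only models the expected slice semantics and says nothing about how the actual fix is implemented.

```python
def gather_index(index_labels, slc):
    """Return the index labels a slice should select (hypothetical helper)."""
    labels = list(index_labels)
    # slice.indices clamps start/stop to the valid range for a sequence of
    # this length and fills in defaults according to the sign of the step.
    start, stop, step = slc.indices(len(labels))
    # range() yields no positions (never a negative count) when the
    # normalized bounds do not overlap in the direction of the step.
    positions = range(start, stop, step)
    return [labels[i] for i in positions]


# The three cases from the reproduction steps:
print(gather_index(range(10), slice(None, None, 3)))   # [0, 3, 6, 9]
print(gather_index(range(10), slice(None, None, -1)))  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(gather_index([1, 2, 3], slice(3, -5, 2)))        # []
```

Under these rules the out-of-bounds slice simply selects nothing, so no negative size should ever need to be passed on to a sequence-generation step.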

Environment overview (please complete the following information)

  • Environment location: [Bare-metal]
  • Method of cuDF install: [source]

Environment details
22.04

brandon-b-miller added the bug and Python labels on Feb 15, 2022
brandon-b-miller self-assigned this on Feb 15, 2022