[BUG] In-place updates with `loc` or `iloc` don't work correctly when the LHS has more than one column #7377
magnatelee added the bug (Something isn't working) and Needs Triage (Need team to review and classify) labels on Feb 12, 2021
magnatelee changed the title on Feb 12, 2021
from: [BUG] In-place updates with `loc` or `iloc` don't work when the LHS and RHS have the same dimensionality
to: [BUG] In-place updates with `loc` or `iloc` don't work correctly when the LHS has more than one column
shwina added the Python (Affects Python cuDF API.) label and removed the Needs Triage (Need team to review and classify) label on Feb 12, 2021
rapids-bot bot pushed a commit that referenced this issue on May 4, 2022
…as more than one column (#9918)

Fixes: #7377

This PR enables `setitem` with a scalar value, a dataframe, or an array/list iterable in both `DataframeLocIndexer` and `DataFrameIlocIndexer`. Only the following cases are currently supported in cudf:

- Scalar value: follows the original code path and assigns column values via the specified key (row label).
- Dataframe: checks for column alignment between the LHS and RHS, then uses a scatter map of the indices to assign column values accordingly. NA is substituted for columns not found in the RHS.
- All other cases (array, list, range value, etc.): the value is first converted to a cupy array, followed by special handling:
  * If 2d array: if the inner dimension is 1, it's broadcastable to all columns of the dataframe.
  * Otherwise the value must be a 1d array (scalar values are handled in case 1 above). There are 2 subcases:
    * If the key on the column axis is a scalar, the user is indexing a single column; therefore the 1d value should be assigned along the columns.
    * Otherwise, the key on the column axis is a 1d array. In this case, the key on the row axis can be a scalar or 1d, and in both cases the ith element in value corresponds to the ith row in the indexed object. If the key is 1d, a broadcast will happen.

Authors:
- Sheilah Kirui (https://github.com/skirui-source)
- Michael Wang (https://github.com/isVoid)
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Ashwin Srinath (https://github.com/shwina)
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Michael Wang (https://github.com/isVoid)

URL: #9918
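The case analysis in the commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from #9918: the toy "frame" is a dict of lists, `assign_cells` is a hypothetical helper name, the row key is always a list of positions, and a frame-like value is assumed to be already row-aligned with the selection.

```python
from numbers import Number


def assign_cells(table, rows, cols, value):
    """Write `value` into the selected cells of a toy frame (hypothetical helper).

    table: dict mapping column name -> list of cell values.
    rows:  list of row positions (scalar row keys are not modelled here).
    cols:  a single column name (scalar key) or a list of column names.
    """
    col_list = cols if isinstance(cols, list) else [cols]

    if isinstance(value, Number):
        # Case 1: a scalar is written to every selected cell.
        grid = [[value] * len(col_list) for _ in rows]
    elif isinstance(value, dict):
        # Case 2: frame-like value; columns missing from the RHS become None
        # (standing in for NA). Rows are assumed to be aligned already.
        grid = [[value[c][i] if c in value else None for c in col_list]
                for i in range(len(rows))]
    elif value and isinstance(value[0], list):
        # Case 3, 2-D value: an inner dimension of 1 broadcasts across columns.
        if all(len(r) == 1 for r in value):
            grid = [r * len(col_list) for r in value]
        else:
            grid = [list(r) for r in value]
    else:
        # Case 3, 1-D value: the ith element goes to the ith selected row.
        # With a scalar column key this fills one column; with a list of
        # columns the element is broadcast across them.
        grid = [[v] * len(col_list) for v in value]

    if len(grid) != len(rows) or any(len(r) != len(col_list) for r in grid):
        raise ValueError("shape mismatch between key and value")

    for i, row in enumerate(rows):
        for j, col in enumerate(col_list):
            table[col][row] = grid[i][j]
    return table
```

For example, `assign_cells({"a": [1, 2, 3], "b": [4, 5, 6]}, [0, 2], ["a", "b"], [[7], [8]])` writes 7 to row 0 and 8 to row 2 in both columns, which is the "inner dimension is 1" broadcast described above.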
Describe the bug
In-place updates with `loc` seem to work only when the RHS is broadcasted. The following enumerates three cases that work just fine in Pandas but crash in cuDF. The same issue exists in `iloc`.
Steps/Code to reproduce bug
(Stacktraces are elided for brevity.)
Bug 1:
Bug 2:
Bug 3 (note that the index of the RHS is set to [0, 2] to make it align with the LHS):
Expected behavior
The examples above should work rather than crashing.
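The original reproducers are not preserved in this copy of the issue. As a rough illustration of the kind of assignment being discussed, here is a minimal pandas sketch with made-up data: a multi-column LHS selection receiving a same-shaped 2-D list, then a DataFrame whose index is `[0, 2]` so it aligns with the selected rows, then the positional equivalent with `iloc`. The column names and values are assumptions chosen for the example, not the ones from the report.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# RHS is a 2-D list with the same shape as the 2x2 selection on the LHS.
df.loc[[0, 2], ["a", "b"]] = [[10, 40], [30, 60]]
after_list = df.copy()

# RHS is a DataFrame; its index [0, 2] lines up with the selected rows.
rhs = pd.DataFrame({"a": [-1, -3], "b": [-4, -6]}, index=[0, 2])
df.loc[[0, 2], ["a", "b"]] = rhs
after_frame = df.copy()

# Positional equivalent with iloc (rows 0 and 2, first two columns).
df.iloc[[0, 2], [0, 1]] = [[100, 400], [300, 600]]
```

In pandas each of these updates only the selected cells and leaves row 1 and column `c` untouched, which is the behavior the report expects from cuDF as well.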
Environment overview (please complete the following information)