Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG FIX] In-place updates with loc or iloc don't work correctly when the LHS has more than one column #9918

Merged
merged 100 commits into from
May 4, 2022

Conversation

skirui-source
Copy link
Contributor

@skirui-source skirui-source commented Dec 16, 2021

Fixes: #7377

This PR enables to setitem using a scalar value, dataframe or array/list iterable in both DataframeLocIndexer and DataFrameIlocIndexer . Only the following cases are currently supported in cudf:

  • Scalar value: follows the original code path, assigns column- values via specified key (row-label)
  • Dataframe : checks for column-alignment in LHS and RHS, then uses a scatter map of the indices to assign column-values accordingly. Substitute NA for columns not found in the RHS
  • All other cases (array, list, range value, etc) : first conversion to cupy array followed by special handling:
    • If 2d array: If the inner dimension is 1, it's broadcastable to all columns of the dataframe.
    • Otherwise the value must be a 1d array (scalar values are handled in case 1 above), there are 2 subcases:
      • If the key on column axis is a scalar, meaning the user is indexing a single column; Therefore 1d value should assign along the columns.
      • Otherwise, the key on column axis is a 1d array. In this case, the key on row axis can be a scalar or 1d and in both cases of row key, the ith element in value corresponds to the ith row in the indexed object. If the key is 1d, a broadcast will happen.

@skirui-source skirui-source added bug Something isn't working Python Affects Python cuDF API. non-breaking Non-breaking change labels Dec 16, 2021
@skirui-source skirui-source self-assigned this Dec 16, 2021
@github-actions
Copy link

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

@skirui-source skirui-source changed the base branch from branch-22.02 to branch-22.04 January 19, 2022 18:10
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
Copy link
Contributor

@shwina shwina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me! There's a lot that can generally be improved in the indexing code but very much out of scope for this PR.

Appreciate the attention to testing here 👍 Nice work @skirui-source and @isVoid!

@skirui-source skirui-source requested a review from isVoid May 3, 2022 21:22
@galipremsagar galipremsagar self-requested a review May 4, 2022 19:24
Copy link
Contributor

@galipremsagar galipremsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and pushed a commit to internalize the shape_mismatch_error to _shape_mismatch_error. Other than that this PR looks good to go. 🎉

@galipremsagar galipremsagar dismissed isVoid’s stale review May 4, 2022 19:39

Dismissing since the review has been addressed.

@galipremsagar galipremsagar added the 5 - Ready to Merge Testing and reviews complete, ready to merge label May 4, 2022
@skirui-source
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 1a457ef into rapidsai:branch-22.06 May 4, 2022
@skirui-source skirui-source deleted the i-loc.bug branch May 4, 2022 21:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] In-place updates with loc or iloc don't work correctly when the LHS has more than one column
5 participants