[BUG] DataFrame.update() should modify the memory inplace as opposed to replacing the buffers #7187

Closed
kkraus14 opened this issue Jan 21, 2021 · 2 comments · Fixed by #7201
Labels: bug (Something isn't working), Python (Affects Python cuDF API)

Comments

@kkraus14
Collaborator

#6883 implemented DataFrame.update(), but the implementation replaces the Columns underneath the DataFrame. The pandas implementation instead mutates the memory underneath the DataFrame in place, which can have propagating effects, since anything else that references the same memory sees the updated values:

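import pandas as pd

pdf = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
pdf2 = pd.DataFrame({'c': [7, 8, 9], 'b': [14, 15, 16]})

# Before update: record the address of the underlying block's memory.
# Note: `_data` is a private pandas attribute (newer releases expose it as `_mgr`).
initial_pointer = pdf._data.blocks[0].values.__array_interface__['data']

pdf.update(pdf2)

# After update: read the address of the same block again.
final_pointer = pdf._data.blocks[0].values.__array_interface__['data']

# Does not raise: pandas wrote into the existing memory instead of reallocating.
assert initial_pointer == final_pointer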

Not high priority but wanted to raise an issue to have something to point to in case this comes back.
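
To illustrate the "propagating effects" mentioned above without depending on pandas or cuDF internals, here is a minimal sketch in plain Python. The `Frame` class and its two update methods are hypothetical stand-ins, not cuDF or pandas code; they only contrast rebinding a column to a new buffer with writing into the existing one.

```python
from array import array


class Frame:
    """Hypothetical stand-in for a DataFrame: maps column names to buffers."""

    def __init__(self, columns):
        self.columns = {name: array("q", vals) for name, vals in columns.items()}

    def update_by_replacing(self, other):
        # Rebind each shared column to a brand-new buffer.
        for name, vals in other.columns.items():
            if name in self.columns:
                self.columns[name] = array("q", vals)

    def update_inplace(self, other):
        # Write the new values into the existing buffer, element by element.
        for name, vals in other.columns.items():
            if name in self.columns:
                target = self.columns[name]
                for i, value in enumerate(vals):
                    target[i] = value


right = Frame({"c": [7, 8, 9], "b": [14, 15, 16]})

# In-place update: a second reference to the buffer observes the new values.
left = Frame({"a": [1, 2, 3], "b": [4, 5, 6]})
alias = left.columns["b"]
address_before = alias.buffer_info()[0]
left.update_inplace(right)
inplace_same_buffer = left.columns["b"].buffer_info()[0] == address_before
alias_after_inplace = list(alias)

# Replacing update: the old reference is left pointing at stale data.
left2 = Frame({"a": [1, 2, 3], "b": [4, 5, 6]})
alias2 = left2.columns["b"]
left2.update_by_replacing(right)
replaced_same_buffer = left2.columns["b"] is alias2
alias_after_replace = list(alias2)
```

With the in-place variant, `alias` ends up holding the updated values and the buffer address is unchanged; with the replacing variant, `alias2` still holds the original values because the frame now points at a different buffer.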

@kkraus14 added the bug and Python labels on Jan 21, 2021
@kkraus14
Collaborator Author

cc @skirui-source @galipremsagar @isVoid for visibility

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

rapids-bot bot pushed a commit that referenced this issue Mar 31, 2021
Fixes: #7187 

This PR:

- [x] Fixes inplace manipulation of columns.
- [x] Introduces `Series.update`
- [x] Fixes incorrect dtype handling in `Frame.where`

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Keith Kraus (https://github.com/kkraus14)

URL: #7201