[REVIEW] Fix inplace update of data and add Series.update #7201
Before you dive deep into this, can you explain the situation this is handling?
So the issue here is we are not correctly handling the dtype situation, where we need to compare the
>>> import cudf
>>> s = cudf.Series([False, True, True])
>>> other = cudf.Series([3.0, 4.5, 7.0])
>>> s.where([True, True, True], other)
0 False
1 True
2 True
dtype: bool
>>> s.where([False, False, False], other)
0 True
1 True
2 True
dtype: bool
>>> s.to_pandas().where([False, False, False], other.to_pandas())
0 3
1 4.5
2 7
dtype: object
>>> s.to_pandas().where([False, True, False], other.to_pandas())
0 3
1 True
2 7
dtype: object
We should perform the correct dtype calculation for numeric types and raise an error when there could be a situation of
Okay, that sounds good. What about the situation where
Yup, that's when the common dtype finding logic should kick in and result in final dtype as
>>> s
0 3
1 4
2 5
dtype: int8
>>> other = pd.Series([10, 11, 12], dtype='int32')
>>> other
0 10
1 11
2 12
dtype: int32
>>> s.where([True, False, True], other)
0 3
1 11
2 5
dtype: int32
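To make the int8/int32 example above concrete, here is a minimal pure-Python sketch of the idea. The helper names `common_signed_int_dtype` and `where` are hypothetical illustrations, not cudf or pandas APIs, and the sketch only covers signed integer dtypes; it assumes "common dtype" means the wider of the two.

```python
# Hypothetical illustration only: not how cudf or pandas resolve dtypes internally.
# Bit widths of the signed integer dtypes this sketch knows about.
SIGNED_INT_BITS = {"int8": 8, "int16": 16, "int32": 32, "int64": 64}


def common_signed_int_dtype(left: str, right: str) -> str:
    """Return whichever of the two signed integer dtype names is wider."""
    return left if SIGNED_INT_BITS[left] >= SIGNED_INT_BITS[right] else right


def where(values, cond, other):
    """Keep values[i] where cond[i] is true, otherwise take other[i]."""
    return [v if c else o for v, c, o in zip(values, cond, other)]


# Same data as the transcript: an int8 series and an int32 replacement series.
result_dtype = common_signed_int_dtype("int8", "int32")
result = where([3, 4, 5], [True, False, True], [10, 11, 12])
print(result, result_dtype)
```

Because the result dtype is the wider of the two, every replacement value from `other` fits without truncation.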
Is that the expected behavior for end users? I.e., if I did something like:
The output would be
The dtype selection is also based on
>>> s
0 3
1 11
2 5
dtype: int8
>>> other = pd.Series([10, 10000, 12120], dtype='int32')
>>> s.where([True, False, True], other, inplace=True)
>>> s
0 3
1 16
2 5
dtype: int8
>>> s.where([True, False, True], other, inplace=False)
0 3
1 10000
2 5
dtype: int32
>>> import cudf
>>> s = cudf.Series([10, 10000, 12120], dtype='int32')
>>> s
0 10
1 10000
2 12120
dtype: int32
>>> s.astype('int8')
0 10
1 16
2 88
dtype: int8
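The difference between the `inplace=True` and `inplace=False` results above comes down to the int8 cast. The following pure-Python sketch reproduces the numbers from the transcripts, assuming the cast wraps around as two's complement truncation to 8 bits. The helper names `wrap_to_int8` and `where` are hypothetical and are not cudf or pandas APIs.

```python
# Hypothetical illustration only: models the int8 cast as two's complement wrap-around.
def wrap_to_int8(value: int) -> int:
    """Keep the low 8 bits of value and reinterpret them as a signed integer."""
    return (value + 128) % 256 - 128


def where(values, cond, other):
    """Keep values[i] where cond[i] is true, otherwise take other[i]."""
    return [v if c else o for v, c, o in zip(values, cond, other)]


s = [3, 11, 5]                 # stored as int8 in the transcript
other = [10, 10000, 12120]     # stored as int32 in the transcript
cond = [True, False, True]

# Out of place: the result may take the wider dtype, so values are preserved.
out_of_place = where(s, cond, other)

# In place: the original int8 storage is kept, so replacements are cast first.
inplace_result = where(s, cond, [wrap_to_int8(v) for v in other])

print(out_of_place)    # [3, 10000, 5]
print(inplace_result)  # [3, 16, 5]
```

The value 16 appears because 10000 modulo 256 is 16, which matches the `astype('int8')` output shown above.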
Sounds like a good path forward to me.
LGTM
rerun tests
@gpucibot merge
Fixes: #7187
This PR:
Series.update
Frame.where