-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix type promotion edge cases in numerical binops #12074
Fix type promotion edge cases in numerical binops #12074
Conversation
This is a breaking change because it changes the behaviour of user-facing API, though I hope very much no-one was relying on it. FWIW, pandas gets this case right. |
2afc3e1
to
c566964
Compare
Try and do everything following numpy using types rather than values by first attempting to use the dtype of the passed in operand and subsequently (if it does not have one) using result_type. This way we avoid problems with min_scalar_type wanting to pick unsigned int types for bare Python integers.
c566964
to
d670394
Compare
f052481
to
f023aea
Compare
No idea how to handle the pandas weirdness here.
Requesting some careful eyes from reviewers here (assuming at least the test suite all passes). This kind of type promotion/casting and all the edge cases is really hard to think through for me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought I would have a look and added a few comments/questions in the hope they are useful
Codecov ReportBase: 88.07% // Head: 88.11% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## branch-22.12 #12074 +/- ##
================================================
+ Coverage 88.07% 88.11% +0.04%
================================================
Files 135 135
Lines 22133 22124 -9
================================================
+ Hits 19494 19495 +1
+ Misses 2639 2629 -10
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Now that binop with cudf.Scalar matches pandas behaviour with numpy-dtype-enabled scalars, we need to manually promote here.
I think this is now ready for another look (though happy to retarget to 23.0x) |
@gpucibot merge |
Description
The type normalisation applied before heading into libcudf previously had slightly unexpected consequences for large int64 values. If not providing a
cudf.Scalar
, a bareint64
scalar would be cast touint64
and then normal numpy type promotion would unify tofloat64
. This is lossy, since int64 to float64 is neither surjective nor injective.To avoid this, try very hard to maintain the dtype of the object coming in, and match pandas behaviour by applying numpy type promotion rules via
numpy.result_type
.null
values #5938Checklist