Addition & integration of the integer power operator #11025
`cudf::table::select` is declared `nodiscard`, so the test will fail to build at line 103.
Can one of the admins verify this patch?
Thank you very much for your work on this, @AtlantaPepsi! I'll take a look at this very soon. In the meantime, I'll request permission to run CI tests.
Great start. Next steps:
- Copy the test from this commit and add some related tests that cover the problematic values noted in [BUG] pow operator not acting as expected. #10178
- Add Python bindings by implementing `NumericalColumn.__pow__`. It should check `is_integer_dtype` for the inputs, and if both inputs are integral, it should call through `IntPow`.
Happy to help with anything, just drop a comment here or ping me on Slack.
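The dispatch rule suggested in the second step above can be sketched in plain Python. This is only an illustration under stated assumptions: `INTEGER_DTYPES`, the string-based `is_integer_dtype`, and `select_pow_operator` are hypothetical stand-ins written for this example, not the cuDF implementation or its API.

```python
# Hypothetical stand-ins: these names only illustrate the dispatch rule and
# are not part of cuDF. Dtypes are represented by their names as strings.
INTEGER_DTYPES = {
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
}


def is_integer_dtype(dtype_name: str) -> bool:
    """Toy stand-in for a real dtype check, operating on dtype names."""
    return dtype_name in INTEGER_DTYPES


def select_pow_operator(lhs_dtype: str, rhs_dtype: str) -> str:
    """Return the name of the binary operator to use for lhs ** rhs."""
    # Only when both operands are integral is the integer operator chosen.
    if is_integer_dtype(lhs_dtype) and is_integer_dtype(rhs_dtype):
        return "INT_POW"
    # Any other combination keeps the default floating-point power.
    return "POW"
```

With this rule, a mixed pair such as `("int64", "float64")` still selects `"POW"`, so only purely integral inputs change behavior.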
ok to test
Co-authored-by: Bradley Dice <[email protected]>
@bdice quite some cases in BINARY_TEST suites are failing on my device, even the ones beyond |
I have some comments attached -- apologies for the delay @AtlantaPepsi. I'll work through these proposed changes right now and push some commits to address them.
@AtlantaPepsi Thanks for your hard work on this! I applied some changes and I think this PR is ready to approve. It fixes the core problem where incorrect power results appear in Python for integer types by dispatching to `IntPow` for only those types, without altering the default behavior of `Pow` in C++ code (which casts to floating point types). Once this gets a second C++ approval, one approval from Python and Java code owners, and CI passes, it will be ready to merge.
To reiterate my other comment: before closing #10178, we should verify whether decimal types are affected as well, and implement the same fix in a second PR.
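Why a round trip through floating point can give wrong integer powers can be seen even on the CPU. The sketch below is an illustration only: `pow_via_float` is a hypothetical helper, and it shows the 53-bit significand limit of double precision rather than the exact GPU behavior reported in #10178.

```python
def pow_via_float(base: int, exponent: int) -> int:
    """Compute base ** exponent by casting to float and back to int."""
    return int(float(base) ** exponent)


# 3 ** 34 is an odd number larger than 2 ** 53, so no double can hold it
# exactly; the round trip through float must therefore change the value.
exact = 3 ** 34
lossy = pow_via_float(3, 34)

print(exact, lossy, exact == lossy)
```

The exact value still fits in a signed 64-bit integer, so an integer-only computation can represent it while the floating-point path cannot.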
Codecov Report
@@ Coverage Diff @@
## branch-22.08 #11025 +/- ##
===============================================
Coverage ? 86.37%
===============================================
Files ? 144
Lines ? 22826
Branches ? 0
===============================================
Hits ? 19715
Misses ? 3111
Partials ? 0
Jake's question aside, this looks good to me.
Java changes are OK as the minimal, necessary change, but the Java code for `BinaryOperable.pow` was not updated to leverage the new `INT_POW` operator as was done for the Python API. There needs to be a followup issue if it's not addressed here.
@jlowe See my assessment in #10178 (comment) and let me know if you agree. I think Spark/Java should not use the integer power operator because the expected return type of pow is always a floating type. That’s not true for pandas, which is what motivated this fix.
One small non-blocking suggestion, otherwise LGTM.
Agreed it's a non-issue for Spark, but it is an inconsistency with respect to the cudf APIs across language bindings. Given the cudf Java API is only used by Spark for now it's essentially a non-issue, but it could be if someone in the future used the cudf Java API outside of Spark. Given Spark will always call it with double, having it perform this behavior won't break Spark, but it will produce better results for anything that might call it with an INT in the future. Yeah, yeah, YAGNI applies here, so I'm fine if we decide not to follow up. On the flip side, it's also not hard to avoid this theoretical future surprise.
@gpucibot merge
Fixes a compile warning that was introduced in PR #11025: [link to log containing the warning](https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/10790/consoleFull)

When a templated variable `y` is unsigned, the comparison `(y<0)` results in a compile warning:

```
/cudf/cpp/src/binaryop/compiled/operation.cuh(226): warning #186-D: pointless comparison of unsigned integer with zero
          detected during:
            instantiation of "auto cudf::binops::compiled::ops::IntPow::operator()(TypeLhs, TypeRhs)->TypeLhs [with TypeLhs=uint8_t, TypeRhs=uint8_t, <unnamed>=(void *)nullptr]"
```

Adding an `if constexpr` around the comparison removes the warning.

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Bradley Dice (https://github.com/bdice)

URL: #11339
Partial fix for #10178 (still need to investigate whether decimal types are also affected).
This implements the `INT_POW` binary operator, which can be dispatched for integral types. This uses an exponentiation-by-squaring algorithm to compute powers. Unlike `POW`, this does not cast the data to floating-point types, which can suffer from precision loss when computing powers and casting back to an integral type. The cuDF Python layer has been updated to dispatch integral data to this operator, which fixes the problems seen for specific values of base and exponent (like `3**1 == 2`) noted in #10178.
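For readers unfamiliar with exponentiation by squaring, here is a minimal sketch of the general technique in Python. It is not the code added by this PR: the function name `int_pow` and the choice to raise on negative exponents are assumptions made for this example, and Python integers are unbounded, whereas fixed-width integer columns can overflow.

```python
def int_pow(base: int, exponent: int) -> int:
    """Integer power by repeated squaring, using integer arithmetic only."""
    if exponent < 0:
        # Assumption for this sketch: reject negative exponents, since the
        # result would generally not be an integer.
        raise ValueError("negative exponents are not supported")
    result = 1
    while exponent > 0:
        # If the lowest bit of the exponent is set, fold the current power
        # of the base into the result.
        if exponent & 1:
            result *= base
        # Square the base and move on to the next bit of the exponent.
        base *= base
        exponent >>= 1
    return result
```

The loop runs once per bit of the exponent, so it needs on the order of log2(exponent) multiplications, and `int_pow(3, 1)` evaluates to `3` because no floating-point conversion is involved.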