Addition & integration of the integer power operator #11025

AtlantaPepsi · 2022-06-01T23:18:55Z

Partial fix for #10178 (still need to investigate whether decimal types are also affected).

This implements the INT_POW binary operator, which can be dispatched for integral types. It uses an exponentiation-by-squaring algorithm to compute powers. Unlike POW, it does not cast the data to floating-point types, which can suffer from precision loss when computing powers and casting back to an integral type. The cuDF Python layer has been updated to dispatch integral data to this operator, which fixes the problems seen for specific values of base and exponent (like 3**1 == 2) noted in #10178.
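
The exponentiation-by-squaring idea described above can be sketched in plain Python as follows. This is a minimal illustration, not the CUDA code in this PR: the function name `int_pow_sketch` is hypothetical, Python integers are arbitrary precision (so fixed-width overflow is not modeled), and negative exponents are simply rejected here.

```python
def int_pow_sketch(base: int, exponent: int) -> int:
    """Compute base**exponent using only integer multiplications.

    Hypothetical helper for illustration; not the libcudf implementation.
    """
    if exponent < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    result = 1
    while exponent > 0:
        if exponent & 1:   # lowest bit set: fold the current power into the result
            result *= base
        base *= base       # square the base for the next bit of the exponent
        exponent >>= 1
    return result


# A float round trip cannot represent every large integer exactly:
# 3**39 is odd and larger than 2**53, so the nearest double is a different integer.
exact = int_pow_sketch(3, 39)
via_float = int(float(3) ** 39)
print(exact == 3 ** 39)    # True
print(via_float == exact)  # False
```

The loop performs a number of multiplications proportional to the bit length of the exponent, and every intermediate value stays an integer, which is the property the description relies on.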

`cudf::table::select` is declared `nodiscard`, test will fail to build with line 103.

GPUtester · 2022-06-01T23:18:56Z

Can one of the admins verify this patch?

AtlantaPepsi · 2022-06-01T23:19:16Z

@bdice

bdice · 2022-06-01T23:27:12Z

Thank you very much for your work on this, @AtlantaPepsi! I'll take a look at this very soon. In the meantime, I'll request permission to run CI tests.

bdice

Great start. Next steps:

Copy the test from this commit and add some related tests that cover the problematic values noted in [BUG] pow operator not acting as expected. #10178
Add Python bindings by implementing NumericalColumn.__pow__. It should check is_integer_dtype for the inputs, and if both inputs are integral, it should call through IntPow.

Happy to help with anything, just drop a comment here or ping me on Slack.
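
The dispatch rule suggested in the second step could look roughly like the sketch below. It does not use cuDF: the dtype names, the `INTEGER_DTYPES` set, and the `choose_pow_operator` function are hypothetical stand-ins for `is_integer_dtype` and the real `NumericalColumn.__pow__`; only the operator names `INT_POW` and `POW` come from this PR.

```python
# Hypothetical set of integral dtype names, standing in for is_integer_dtype.
INTEGER_DTYPES = frozenset(
    {"int8", "int16", "int32", "int64", "uint8", "uint16", "uint32", "uint64"}
)


def choose_pow_operator(lhs_dtype: str, rhs_dtype: str) -> str:
    """Return the name of the binary operator a power operation would use."""
    if lhs_dtype in INTEGER_DTYPES and rhs_dtype in INTEGER_DTYPES:
        return "INT_POW"  # both operands integral: stay in integer arithmetic
    return "POW"          # otherwise keep the floating-point path


print(choose_pow_operator("int32", "int64"))    # INT_POW
print(choose_pow_operator("int32", "float64"))  # POW
```

Requiring both operands to be integral means mixed integer/float inputs keep the existing floating-point behavior.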

cpp/include/cudf/binaryop.hpp

cpp/src/binaryop/compiled/IntPow.cu

jjacobelli · 2022-06-02T07:19:01Z

ok to test

python/cudf/cudf/core/column/numerical.py

python/cudf/cudf/_lib/cpp/binaryop.pxd

python/cudf/cudf/_lib/binaryop.pyx

cpp/src/binaryop/compiled/util.cpp

cpp/src/binaryop/compiled/binary_ops.cu

Co-authored-by: Bradley Dice <[email protected]>

Add test.

AtlantaPepsi · 2022-07-02T16:40:07Z

@bdice Quite a few cases in the BINARY_TEST suites are failing on my device, even ones beyond BinaryOperationCompiledTest_FloatOps, so please double-check the correctness of INT_POW if you would. Also, as we discussed on Slack, negative exponents still produce wrong results. Please let me know whether undefined behavior is allowed for negative exponents.
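
For context on the negative-exponent question: an integer raised to a negative power is generally a fraction, so there is no exact result in an integral type. The sketch below shows one possible convention, returning 0, which matches the title of a later commit in this PR ("Return 0 for negative exponents"). The function name `int_pow_or_zero` is hypothetical, and the sketch leans on Python's built-in integer power rather than the CUDA operator.

```python
def int_pow_or_zero(base: int, exponent: int) -> int:
    """Integer power that returns 0 for any negative exponent.

    Hypothetical illustration of one convention; not the libcudf code.
    """
    if exponent < 0:
        return 0
    return base ** exponent  # exact for Python ints when exponent >= 0


print(2 ** -1)                 # 0.5: plain Python switches to float here
print(int_pow_or_zero(2, -1))  # 0
print(int_pow_or_zero(2, 5))   # 32
```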

bdice

I have some comments attached -- apologies for the delay @AtlantaPepsi. I'll work through these proposed changes right now and push some commits to address them.

cpp/src/binaryop/compiled/operation.cuh

python/cudf/cudf/core/column/numerical.py

bdice

@AtlantaPepsi Thanks for your hard work on this! I applied some changes and I think this PR is ready to approve. It fixes the core problem where incorrect power results appear in Python for integer types by dispatching to IntPow for only those types, without altering the default behavior of Pow in C++ code (which casts to floating point types). Once this gets a second C++ approval, one approval from Python and Java code owners, and CI passes, it will be ready to merge.

To reiterate my other comment: before closing #10178, we should verify whether decimal types are affected as well, and implement the same fix in a second PR.

codecov · 2022-07-08T07:56:35Z

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.08@bc5e769). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 9b79193 differs from pull request most recent head 5c447d1. Consider uploading reports for the commit 5c447d1 to get more accurate results

@@               Coverage Diff               @@
##             branch-22.08   #11025   +/-   ##
===============================================
  Coverage                ?   86.37%           
===============================================
  Files                   ?      144           
  Lines                   ?    22826           
  Branches                ?        0           
===============================================
  Hits                    ?    19715           
  Misses                  ?     3111           
  Partials                ?        0

cpp/src/binaryop/compiled/operation.cuh

hyperbolic2346

Jake's question aside, this looks good to me.

jlowe

Java changes are OK as the minimal, necessary change, but the Java code for BinaryOperable.pow was not updated to leverage the new INT_POW operator as was done for the Python API. There needs to be a followup issue if it's not addressed here.

bdice · 2022-07-12T21:16:04Z

> Java changes are OK as the minimal, necessary change, but the Java code for BinaryOperable.pow was not updated to leverage the new INT_POW operator as was done for the Python API. There needs to be a followup issue if it's not addressed here.

@jlowe See my assessment in #10178 (comment) and let me know if you agree. I think Spark/Java should not use the integer power operator because the expected return type of pow is always a floating type. That’s not true for pandas, which is what motivated this fix.

python/cudf/cudf/tests/test_binops.py

brandon-b-miller

one small non-blocking suggestion otherwise lgtm.

jlowe · 2022-07-13T21:58:26Z

> I think Spark/Java should not use the integer power operator because the expected return type of pow is always a floating type.

Agreed it's a non-issue for Spark, but it is an inconsistency with respect to the cudf APIs across language bindings. Given the cudf Java API is only used by Spark for now it's essentially a non-issue, but it could be if someone in the future used the cudf Java API outside of Spark. Given Spark will always call it with double, having it perform this behavior won't break Spark, but it will produce better results for anything that might call it with an INT in the future.

Yeah, yeah, YAGNI applies here, so I'm fine if we decide not to follow up. On the flip side, it's also not hard to avoid this theoretical future surprise.

bdice · 2022-07-18T17:58:17Z

@gpucibot merge

bdice · 2022-07-18T20:32:19Z

@gpucibot merge

Fixes a compile warning that was introduced in PR #11025: [link to log containing the warning](https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/10790/consoleFull)

When a templated variable `y` is unsigned, the comparison `(y<0)` results in a compile warning:

```
/cudf/cpp/src/binaryop/compiled/operation.cuh(226): warning #186-D: pointless comparison of unsigned integer with zero
          detected during:
            instantiation of "auto cudf::binops::compiled::ops::IntPow::operator()(TypeLhs, TypeRhs)->TypeLhs [with TypeLhs=uint8_t, TypeRhs=uint8_t, <unnamed>=(void *)nullptr]"
```

Adding an `if constexpr` around the comparison removes the warning.

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Bradley Dice (https://github.com/bdice)

URL: #11339

bdice and others added 4 commits February 1, 2022 12:07

Add test.

ae67768

Catching ignored return value

4e16f50

`cudf::table::select` is declared `nodiscard`, test will fail to build with line 103.

Merge branch 'rapidsai:branch-22.06' into branch-22.06

45f470e

Adding ipow operator beneath Cython level

0e03fa7

github-actions bot added the Python (Affects Python cuDF API) and libcudf (Affects libcudf (C++/CUDA) code) labels Jun 1, 2022

bdice reviewed Jun 1, 2022

cpp/include/cudf/binaryop.hpp (outdated, resolved)

cpp/src/binaryop/compiled/IntPow.cu (outdated, resolved)

AtlantaPepsi added 2 commits June 8, 2022 15:53

fixing libudf errors

b9d7fae

still some bugs left w/ negative exp/base

227a9f0

github-actions bot added the CMake (CMake build issue) label Jun 10, 2022

bdice reviewed Jun 15, 2022

python/cudf/cudf/core/column/numerical.py (outdated, resolved)

bdice reviewed Jun 15, 2022

python/cudf/cudf/core/column/numerical.py (outdated, resolved)

bdice reviewed Jun 15, 2022

python/cudf/cudf/_lib/cpp/binaryop.pxd (outdated, resolved)

python/cudf/cudf/_lib/binaryop.pyx (outdated, resolved)

cpp/src/binaryop/compiled/util.cpp (outdated, resolved)

cpp/src/binaryop/compiled/binary_ops.cu (outdated, resolved)

AtlantaPepsi and others added 3 commits June 24, 2022 15:36

Apply suggestions from code review: Naming/Copyright

0ac0be6

Co-authored-by: Bradley Dice <[email protected]>

Merge new test from bdice/pow-integer-incorrect

b466b32

Add test.

final for python Numerical.intpow

502274f

AtlantaPepsi marked this pull request as ready for review July 2, 2022 16:33

AtlantaPepsi requested review from a team as code owners July 2, 2022 16:33

AtlantaPepsi requested review from trxcllnt, brandon-b-miller and jrhemstad July 2, 2022 16:33

bdice added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Jul 8, 2022

AtlantaPepsi requested a review from a team as a code owner July 8, 2022 06:03

github-actions bot added the Java (Affects Java cuDF API) label Jul 8, 2022

bdice reviewed Jul 8, 2022

bdice added 6 commits July 8, 2022 01:05

Return 0 for negative exponents.

1d38991

Use TypeLhs as return type.

c2a51b9

Add comment.

22abe50

Add comment, use integer division.

0f4a6de

Use INT_POW instead of __int_pow__.

c202657

Merge branch 'branch-22.08' of github.com:AtlantaPepsi/cudf into AtlantaPepsi/branch-22.08

1d60f76

bdice approved these changes Jul 8, 2022

jrhemstad reviewed Jul 12, 2022

cpp/src/binaryop/compiled/operation.cuh (resolved)

hyperbolic2346 approved these changes Jul 12, 2022

jlowe approved these changes Jul 12, 2022

brandon-b-miller reviewed Jul 13, 2022

python/cudf/cudf/tests/test_binops.py (resolved)

brandon-b-miller approved these changes Jul 13, 2022

Mark test as xfail.

0f55784

bdice added 2 commits July 18, 2022 12:54

Add note about negative exponents.

c35a535

clang-format.

5c447d1

rapids-bot bot merged commit ae1b581 into rapidsai:branch-22.08 Jul 18, 2022

jbrennan333 mentioned this pull request Jul 18, 2022

Workaround for nvcomp zstd overwriting blocks for orc due to underestimate of sizes #11288

Merged

davidwendt mentioned this pull request Jul 25, 2022

Fix unsigned-compare compile warning in IntPow binops #11339

Merged
