[BUG] pow operator not acting as expected. #10178

bdice · 2022-01-31T20:51:22Z

Describe the bug
__pow__ / __rpow__ binary operations do not give the expected results in the following cases.

Steps/Code to reproduce bug
Powers of 3 ** 1.

>>> 3 ** cudf.Series([1])
0    2
dtype: int64
>>> cudf.Series([3]) ** 1
0    2
dtype: int64
>>> pow(cudf.Series([3]), 1)
0    2
dtype: int64
>>> cudf.Series([3]) ** cudf.Series([1])
0    2
dtype: int64
>>> cudf.Series([-3]) ** 1
0   -2
dtype: int64
>>> 3 ** cudf.Series([1, 2, 3, 4])
0     2
1     9
2    27
3    81
dtype: int64

This happens for other integers as well:

for i in range(1000):
    if i != (cudf.Series([i]) ** 1)[0]:
        print(i)

gives

Also, the power does not have to be 1. There are a lot of integers that fail, always off-by-one in the result.

>>> print((cudf.Series([8]) ** 7)[0], 8 ** 7)
2097151 2097152

It gives the wrong results for Series with dtypes int8, int16, int32, int64, uint8, uint16, uint32, uint64, bool (for the exponent, 3 ** cudf.Series([1], dtype=bool) gives 2).

Additionally, pandas raises when computing integers to negative integer powers. cudf gives meaningless results.

>>> cudf.Series([0, 1, 2, 3]) ** -1
0    9223372036854775807
1                      1
2                      0
3                      0
dtype: int64
>>> pd.Series([0, 1, 2, 3]) ** -1
(... traceback ...)
ValueError: Integers to negative integer powers are not allowed.

Expected behavior

>>> 3 ** 1
3

Additional context
Three is a magic number.

Using a floating type on either side gives expected results:

>>> 3 ** cudf.Series([1, 2, 3, 4], dtype=float)
0     3.0
1     9.0
2    27.0
3    81.0
dtype: float64

The text was updated successfully, but these errors were encountered:

vyasr · 2022-01-31T20:58:03Z

This appears to be some sort of precision issue that manifests for specific numbers, most likely in the libcudf layer (not in cuDF Python). I can reproduce on my machine, and moreover there are other apparently unrelated numbers conforming to no specific pattern that exhibit the same behavior of x**1 == x-1.

>>> x = cudf.Series([1, 2]) 
>>> for i in range(100):
...     if (i ** x)[0] != i:
...             print(f"{i}**1 returns {(i**x)[0]}")
... 
3**1 returns 2
84**1 returns 83
86**1 returns 85
95**1 returns 94

bdice · 2022-02-03T17:01:30Z

This comment summarizes some Slack discussions, analysis, and plans for fixing this bug.

C++'s std::pow always returns double. Current our libcudf implementation of Pow always casts to double but then machinery in binary_ops.cuh casts the result to the common type of the inputs, i.e. calling Pow on integer types yields a double but it is cast back to an integer.

We want the Pow struct to conform to STL C++ conventions. NumPy, Pandas, and CuPy all return integer types when computing powers of integer bases to integer exponents. For instance, cupy handles powers of integers with exponentiation by squaring.

We need to change the behavior of Pow and introduce a new struct IntPow:

Pow should always return double (it will fail if the requested output type is not double).
cuDF Python will dispatch to IntPow, which will implement exponentiation-by-squaring. This algorithm is already implemented in libcudf for fixed-point powers.

The only piece that still requires some investigation is the choice of conventions for float values, like pow(float, float) -> (float or double?) across the PyData, STL, and Spark ecosystems. It appears Pandas and C++ STL would return float but it looks like Spark/Java always upcast pow arguments to double and return double. We may want to allow Pow to perform its math in float precision instead of double to align with STL conventions -- currently libcudf casts float arguments to double, then downcasts the result to the common type of the inputs, float.

This commit adds a failing C++ test and should be included in a PR to fix this bug: bdice@ae67768

github-actions · 2022-03-14T01:29:23Z

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

bdice · 2022-05-25T19:49:41Z

Issue #9346 is also relevant here, since that ipow implementation could be reused as a utility function. However, the lookup table suggested for that implementation will diverge from the runtime computation needed for this integer pow function.

bdice · 2022-06-01T22:32:49Z

This problem may also affect decimal types, since they use integers to represent the fixed point value. That should be verified before closing this issue.

Partial fix for #10178 (still need to investigate whether decimal types are also affected). This implements the `INT_POW` binary operator, which can be dispatched for integral types. This uses an [exponentiation-by-squaring](https://en.wikipedia.org/wiki/Exponentiation_by_squaring) algorithm to compute powers. Unlike `POW`, this does not cast the data to floating-point types which can suffer from precision loss when computing powers and casting back to an integral type. The cuDF Python layer has been updated to dispatch integral data to this operator, which fixes the problems seen for specific values of base and exponent (like `3**1 == 2`) noted in #10178. Authors: - Tim (https://github.com/AtlantaPepsi) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) - Mike Wilson (https://github.com/hyperbolic2346) - Jason Lowe (https://github.com/jlowe) - https://github.com/brandon-b-miller URL: #11025

github-actions · 2022-08-30T23:03:32Z

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

bdice · 2022-08-31T22:23:35Z

This problem may also affect decimal types, since they use integers to represent the fixed point value. That should be verified before closing this issue.

A separate issue affects decimal types. I filed #11636 and will close this issue.

bdice added bug Something isn't working Needs Triage Need team to review and classify 4 - Needs cuDF (Python) Reviewer and removed Needs Triage Need team to review and classify labels Jan 31, 2022

bdice added Python Affects Python cuDF API. and removed 4 - Needs cuDF (Python) Reviewer labels Jan 31, 2022

github-actions bot added the inactive-30d label Mar 14, 2022

bdice self-assigned this Mar 28, 2022

github-actions bot removed the inactive-30d label May 25, 2022

AtlantaPepsi mentioned this issue Jun 1, 2022

Addition & integration of the integer power operator #11025

Merged

gmarkall mentioned this issue Jul 4, 2022

CUDA: Numerical differences between CPython / CPU target and the CUDA target in math.pow numba/numba#8218

Open

github-actions bot added the inactive-90d label Aug 30, 2022

bdice mentioned this issue Aug 31, 2022

[BUG] Binary ops fail unexpectedly for Decimal inputs #11636

Closed

bdice closed this as completed Aug 31, 2022

This was referenced Sep 29, 2023

Fix inaccuracy in decimal128 rounding. #14233

Merged

Consolidate and optimize integer power implementations in libcudf #14243

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] pow operator not acting as expected. #10178

[BUG] pow operator not acting as expected. #10178

bdice commented Jan 31, 2022 •

edited

Loading

vyasr commented Jan 31, 2022 •

edited

Loading

bdice commented Feb 3, 2022 •

edited

Loading

github-actions bot commented Mar 14, 2022

bdice commented May 25, 2022 •

edited

Loading

bdice commented Jun 1, 2022

github-actions bot commented Aug 30, 2022

bdice commented Aug 31, 2022

[BUG] pow operator not acting as expected. #10178

[BUG] pow operator not acting as expected. #10178

Comments

bdice commented Jan 31, 2022 • edited Loading

vyasr commented Jan 31, 2022 • edited Loading

bdice commented Feb 3, 2022 • edited Loading

github-actions bot commented Mar 14, 2022

bdice commented May 25, 2022 • edited Loading

bdice commented Jun 1, 2022

github-actions bot commented Aug 30, 2022

bdice commented Aug 31, 2022

bdice commented Jan 31, 2022 •

edited

Loading

vyasr commented Jan 31, 2022 •

edited

Loading

bdice commented Feb 3, 2022 •

edited

Loading

bdice commented May 25, 2022 •

edited

Loading