
[BUG] HALF_UP and HALF_EVEN rounding can produce the wrong result for decimal128 #14210

Closed · revans2 opened this issue Sep 27, 2023 · 5 comments

Labels: 1 - On Deck (to be worked on next), bug (something isn't working), libcudf (affects libcudf C++/CUDA code), Spark (functionality that helps Spark RAPIDS)
revans2 commented Sep 27, 2023

Describe the bug
I did some extra testing in Spark of half-even rounding (bround) and half-up rounding (round), specifically for decimal values. Decimal128 values showed a number of differences between the GPU and the CPU at various rounding levels.

+-----+---------------------------------------+---+---+----+-----+------+---+---+----+-----+------+-----+----------+
|id   |div_num                                |b0 |b1 |b2  |b3   |b4    |r0 |r1 |r2  |r3   |r4    |count|_in_column|
+-----+---------------------------------------+---+---+----+-----+------+---+---+----+-----+------+-----+----------+
|2    |0.5000000000000000000000000000000000000|0  |0.5|0.50|0.500|0.5000|1  |0.5|0.50|0.500|0.5000|1    |CPU       |
|2    |0.5000000000000000000000000000000000000|1  |0.5|0.50|0.500|0.5000|1  |0.5|0.50|0.500|0.5000|1    |GPU       |
|4    |0.2500000000000000000000000000000000000|0  |0.2|0.25|0.250|0.2500|0  |0.3|0.25|0.250|0.2500|1    |CPU       |
|4    |0.2500000000000000000000000000000000000|0  |0.2|0.25|0.250|0.2500|0  |0.2|0.25|0.250|0.2500|1    |GPU       |
|8    |0.1250000000000000000000000000000000000|0  |0.1|0.12|0.125|0.1250|0  |0.1|0.13|0.125|0.1250|1    |CPU       |
|8    |0.1250000000000000000000000000000000000|0  |0.1|0.13|0.125|0.1250|0  |0.1|0.13|0.125|0.1250|1    |GPU       |
|16   |0.0625000000000000000000000000000000000|0  |0.1|0.06|0.062|0.0625|0  |0.1|0.06|0.063|0.0625|1    |CPU       |
|16   |0.0625000000000000000000000000000000000|0  |0.1|0.06|0.063|0.0625|0  |0.1|0.06|0.063|0.0625|1    |GPU       |
|20   |0.0500000000000000000000000000000000000|0  |0.0|0.05|0.050|0.0500|0  |0.1|0.05|0.050|0.0500|1    |CPU       |
|20   |0.0500000000000000000000000000000000000|0  |0.0|0.05|0.050|0.0500|0  |0.0|0.05|0.050|0.0500|1    |GPU       |
|32   |0.0312500000000000000000000000000000000|0  |0.0|0.03|0.031|0.0312|0  |0.0|0.03|0.031|0.0313|1    |CPU       |
|32   |0.0312500000000000000000000000000000000|0  |0.0|0.03|0.031|0.0313|0  |0.0|0.03|0.031|0.0313|1    |GPU       |
|40   |0.0250000000000000000000000000000000000|0  |0.0|0.02|0.025|0.0250|0  |0.0|0.03|0.025|0.0250|1    |CPU       |
|40   |0.0250000000000000000000000000000000000|0  |0.0|0.03|0.025|0.0250|0  |0.0|0.03|0.025|0.0250|1    |GPU       |
|80   |0.0125000000000000000000000000000000000|0  |0.0|0.01|0.012|0.0125|0  |0.0|0.01|0.013|0.0125|1    |CPU       |
|80   |0.0125000000000000000000000000000000000|0  |0.0|0.01|0.013|0.0125|0  |0.0|0.01|0.013|0.0125|1    |GPU       |
|160  |0.0062500000000000000000000000000000000|0  |0.0|0.01|0.006|0.0062|0  |0.0|0.01|0.006|0.0063|1    |CPU       |
|160  |0.0062500000000000000000000000000000000|0  |0.0|0.01|0.006|0.0063|0  |0.0|0.01|0.006|0.0063|1    |GPU       |
|200  |0.0050000000000000000000000000000000000|0  |0.0|0.00|0.005|0.0050|0  |0.0|0.01|0.005|0.0050|1    |CPU       |
|200  |0.0050000000000000000000000000000000000|0  |0.0|0.01|0.005|0.0050|0  |0.0|0.01|0.005|0.0050|1    |GPU       |
|400  |0.0025000000000000000000000000000000000|0  |0.0|0.00|0.002|0.0025|0  |0.0|0.00|0.003|0.0025|1    |CPU       |
|400  |0.0025000000000000000000000000000000000|0  |0.0|0.00|0.003|0.0025|0  |0.0|0.00|0.003|0.0025|1    |GPU       |
|800  |0.0012500000000000000000000000000000000|0  |0.0|0.00|0.001|0.0012|0  |0.0|0.00|0.001|0.0013|1    |CPU       |
|800  |0.0012500000000000000000000000000000000|0  |0.0|0.00|0.001|0.0013|0  |0.0|0.00|0.001|0.0013|1    |GPU       |
|2000 |0.0005000000000000000000000000000000000|0  |0.0|0.00|0.000|0.0005|0  |0.0|0.00|0.001|0.0005|1    |CPU       |
|2000 |0.0005000000000000000000000000000000000|0  |0.0|0.00|0.001|0.0005|0  |0.0|0.00|0.001|0.0005|1    |GPU       |
|4000 |0.0002500000000000000000000000000000000|0  |0.0|0.00|0.000|0.0002|0  |0.0|0.00|0.000|0.0003|1    |CPU       |
|4000 |0.0002500000000000000000000000000000000|0  |0.0|0.00|0.000|0.0003|0  |0.0|0.00|0.000|0.0003|1    |GPU       |
|20000|0.0000500000000000000000000000000000000|0  |0.0|0.00|0.000|0.0000|0  |0.0|0.00|0.000|0.0001|1    |CPU       |
|20000|0.0000500000000000000000000000000000000|0  |0.0|0.00|0.000|0.0001|0  |0.0|0.00|0.000|0.0001|1    |GPU       |
+-----+---------------------------------------+---+---+----+-----+------+---+---+----+-----+------+-----+----------+

Here we are rounding 1/id, where the result is a decimal with a scale of -37. The b columns are the results of half-even rounding (bround) and the r columns are the results of half-up rounding (round); the digit in the column name is the number of decimal places rounded to.

Steps/Code to reproduce bug
I wrote some unit tests:

diff --git a/cpp/tests/round/round_tests.cpp b/cpp/tests/round/round_tests.cpp
index d802c0c270..ece7abc45d 100644
--- a/cpp/tests/round/round_tests.cpp
+++ b/cpp/tests/round/round_tests.cpp
@@ -703,4 +703,84 @@ TEST_F(RoundTests, BoolTestHalfUp)
   EXPECT_THROW(cudf::round(input, -2, cudf::rounding_method::HALF_UP), cudf::logic_error);
 }
 
+// Use __uint128_t for demonstration.
+constexpr __uint128_t operator""_uint128_t(const char* s)
+{
+  __uint128_t ret = 0;
+  for (int i = 0; s[i] != '\0'; ++i)
+  {
+    ret *= 10;
+    if ('0' <= s[i] && s[i] <= '9') {
+      ret += s[i] - '0';
+    }
+  }
+  return ret;
+}
+
+TEST_F(RoundTests, HalfEvenErrorsA)
+{
+  using namespace numeric;
+  using RepType    = cudf::device_storage_type_t<decimal128>;
+  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;
+
+  {
+    // 0.5 at scale -37 should round HALF_EVEN to 0, because 0 is an even number
+    auto const input    = fp_wrapper{{5000000000000000000000000000000000000_uint128_t}, scale_type{-37}};
+    auto const expected = fp_wrapper{{0}, scale_type{0}};
+    auto const result   = cudf::round(input, 0, cudf::rounding_method::HALF_EVEN);
+
+    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+  }
+}
+
+TEST_F(RoundTests, HalfEvenErrorsB)
+{
+  using namespace numeric;
+  using RepType    = cudf::device_storage_type_t<decimal128>;
+  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;
+
+  {
+    // 0.125 at scale -37 should round HALF_EVEN to 0.12, because 2 is an even number
+    auto const input    = fp_wrapper{{1250000000000000000000000000000000000_uint128_t}, scale_type{-37}};
+    auto const expected = fp_wrapper{{12}, scale_type{-2}};
+    auto const result   = cudf::round(input, 2, cudf::rounding_method::HALF_EVEN);
+
+    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+  }
+}
+
+TEST_F(RoundTests, HalfEvenErrorsC)
+{
+  using namespace numeric;
+  using RepType    = cudf::device_storage_type_t<decimal128>;
+  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;
+
+  {
+    // 0.0625 at scale -37 should round HALF_EVEN to 0.062, because 2 is an even number
+    auto const input    = fp_wrapper{{0625000000000000000000000000000000000_uint128_t}, scale_type{-37}};
+    auto const expected = fp_wrapper{{62}, scale_type{-3}};
+    auto const result   = cudf::round(input, 3, cudf::rounding_method::HALF_EVEN);
+
+    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+  }
+}
+
+TEST_F(RoundTests, HalfUpErrorsA)
+{
+  using namespace numeric;
+  using RepType    = cudf::device_storage_type_t<decimal128>;
+  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;
+
+  {
+    // 0.25 at scale -37 should round HALF_UP to 0.3
+    auto const input    = fp_wrapper{{2500000000000000000000000000000000000_uint128_t}, scale_type{-37}};
+    auto const expected = fp_wrapper{{3}, scale_type{-1}};
+    auto const result   = cudf::round(input, 1, cudf::rounding_method::HALF_UP);
+
+    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+  }
+}
+
+
+
 CUDF_TEST_PROGRAM_MAIN()

But they don't cover all of the error cases from the earlier testing I did in Spark.

Expected behavior
We get the correct answer when rounding decimal128 numbers.


GregoryKimball commented Sep 27, 2023

Thanks @revans2 for raising this. Is it fair to say that cudf::rounding_method::HALF_EVEN is not working correctly for some decimal values? Is this only an issue with scale_type{-37}?

> But they don't cover the error cases from the previous test I did in Spark.

Do you think the unit tests you've proposed cover the root cause of all the error cases above?


revans2 commented Sep 27, 2023

@GregoryKimball I am running tests right now to get a better idea. I have not exhaustively tested all combinations of decimal scale. I have seen it happen with scale -37 and scale -36. I will let you know when my tests are done if there are more that fail.


revans2 commented Sep 28, 2023

I ran through a fairly large, exhaustive set of tests for various scales at different rounding indexes. For HALF_UP, the largest scale at which I saw a wrong answer was -21, rounding to 0 decimal places:

+---+-------------------------+-------+-----+----------+
|id |div_num                  |rounded|count|_in_column|
+---+-------------------------+-------+-----+----------+
|2  |0.50000000000000000000000|1      |1    |CPU       |
|2  |0.50000000000000000000000|0      |1    |GPU       |
+---+-------------------------+-------+-----+----------+

For HALF_EVEN it was scale -22, rounding to 0 decimal places:

+---+--------------------------+-------+-----+----------+
|id |div_num                   |rounded|count|_in_column|
+---+--------------------------+-------+-----+----------+
|2  |0.500000000000000000000000|0      |1    |CPU       |
|2  |0.500000000000000000000000|1      |1    |GPU       |
+---+--------------------------+-------+-----+----------+

Beyond those, I have a list of 743 values at different scales and rounding targets that produce incorrect answers. I suspect it has something to do either with overflow, or with the fact that a double is being used to calculate powers of 10:

T const n = std::pow(10, std::abs(decimal_places));

Type const n = std::pow(10, scale_movement);

I had to write some custom code to do 256-bit decimal math, because Spark/Java will go over 128 bits for some operations, and I ran into issues with pow being inaccurate, so I wrote a lookup table, which was simple enough since it only had to cover the powers of 10 we could run into. Not that you have to do the same; it's just a point of reference.

https://github.com/NVIDIA/spark-rapids-jni/blob/f32fe74027b25fdb64c997ca7515490a8c210072/src/main/cpp/src/decimal_utils.cu#L247-L510

I'll try to clean up the log file I have with the 743 failures for reference.


revans2 commented Sep 28, 2023

out.tsv.gz is a tab-separated values file containing the failed test cases.


bdice commented Sep 28, 2023

I looked briefly at the rounding functors themselves, but nothing jumped out at me. Based on the variation with precision (it seems that larger scale values work but -21 to -37 fail), my working hypothesis is that this is due to some stage of the process not being done in integral math. There is a reference here to std::pow, which could introduce a floating-point component:

Type const n = std::pow(10, scale_movement);
