Fix inaccuracy in decimal128 rounding. #14233

Merged 3 commits on Oct 3, 2023
5 changes: 4 additions & 1 deletion cpp/src/round/round.cu
@@ -271,7 +271,10 @@ std::unique_ptr<column> round_with(column_view const& input,
                   out_view.template end<Type>(),
                   static_cast<Type>(0));
     } else {
-      Type const n = std::pow(10, scale_movement);
+      Type n = 10;
+      for (int i = 1; i < scale_movement; ++i) {
Contributor Author

This should use exponentiation-by-squaring for efficiency, and we should have a common implementation of that in libcudf. (We already have two implementations of exponentiation-by-squaring.) I have started work on this and will push to this PR when it's ready, but that might not happen today.
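
For readers unfamiliar with the technique mentioned above, here is a minimal sketch of integer exponentiation by squaring. The name `int_pow` is hypothetical and is not an existing libcudf function; the sketch assumes a non-negative exponent and a result that fits in `T`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not libcudf's API): computes base^exponent for a
// non-negative exponent using O(log exponent) multiplications.
template <typename T>
constexpr T int_pow(T base, int exponent)
{
  assert(exponent >= 0);
  T result = 1;
  while (exponent > 0) {
    // Multiply in the current power of the base when the low bit is set.
    if (exponent & 1) { result *= base; }
    exponent >>= 1;
    // Square only if more bits remain, so the last squaring cannot overflow needlessly.
    if (exponent > 0) { base *= base; }
  }
  return result;
}
```

For example, `int_pow<std::int64_t>(10, 18)` needs five squarings and two multiplications into the result, where a simple loop needs eighteen multiplications.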

Contributor

Awesome, Bradley.
Would you point to the other two implementations? I was trying to look for them myself earlier today.

Contributor Author

One is in fixed point code and has a base that is known as a template parameter:

CUDF_HOST_DEVICE inline Rep ipow(T exponent)

The other is in binary ops code and its base and exponent are both runtime parameters:

I am not sure how best to refactor this, but I have drafted some work locally (not yet pushed) that would add a file cudf/detail/utilities/intpow.hpp to centralize this logic and expose both a "constexpr base" and a "runtime base" form of the function. I'll push this soon so it can be evaluated, but I am seeing some hangups locally with include order and cuda_runtime.h macro conflicts (__forceinline__) with CCCL (resolved in libcudacxx 2.2.0).
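
To illustrate the two forms described above, here is a rough sketch of what a shared header could expose. This is not the drafted intpow.hpp, which is not shown in this thread; the names `ipow_runtime_base` and `ipow_fixed_base` are hypothetical, and the sketch omits the `CUDF_HOST_DEVICE` annotation that device code would need.

```cpp
#include <cassert>

// Hypothetical "runtime base" form: base and exponent are both runtime values.
// Recursive squaring; assumes exponent >= 0 and a result that fits in T.
template <typename T>
constexpr T ipow_runtime_base(T base, int exponent)
{
  assert(exponent >= 0);
  if (exponent == 0) { return T{1}; }
  T const half = ipow_runtime_base(base, exponent / 2);
  return (exponent % 2 == 0) ? half * half : half * half * base;
}

// Hypothetical "constexpr base" form: the base is a template parameter,
// so only the exponent is supplied at the call site.
template <typename T, int Base>
constexpr T ipow_fixed_base(int exponent)
{
  return ipow_runtime_base<T>(static_cast<T>(Base), exponent);
}
```

Having the fixed-base form forward to the runtime-base form keeps a single implementation of the arithmetic, which is the point of centralizing the logic.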

Contributor Author

Note that we do not have an intpow AST operator, because one has not been requested (to the best of my knowledge). It would go somewhere in here, but would need to have a different name like INTPOW to disambiguate it from the operators that are expected to return floating point values:

struct operator_functor<ast_operator::POW, false> {

Contributor Author

The topic of integer powers was heavily discussed and analyzed in #10178.

Contributor Author (@bdice, Sep 29, 2023)

There are two more places where I think this bug might reoccur:

I'd love help writing some tests that fail for these cases.

Contributor Author

I'm starting work on a follow-up PR to fix these additional rescaling issues in #14242. I have a checklist there. This PR should be limited in scope to fixing only the rounding issues, to minimize friction for this fix. I'd like to target refactoring requests to #14242 (aiming for 23.10) or a subsequent release (probably 23.12).

Contributor

See also #9346

Contributor Author

I opened #14243 to track future work on this.

+        n *= 10;
+      }
       thrust::transform(rmm::exec_policy(stream),
                         input.begin<Type>(),
                         input.end<Type>(),
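
The hunk above replaces `std::pow` with an integer loop. As a sketch of why that matters for a 128-bit representation: `std::pow` with integer arguments returns a `double`, which has a 53-bit significand, so no `double` can hold 10^23 or any larger power of ten exactly. The helper names below are hypothetical, and `__int128_t` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cmath>

// Power of ten computed in double precision and then converted,
// mirroring the shape of the removed std::pow line.
inline __int128_t pow10_via_double(int exponent)
{
  return static_cast<__int128_t>(std::pow(10.0, exponent));
}

// Power of ten by repeated integer multiplication, which stays exact
// as long as the result fits in 128 bits.
inline __int128_t pow10_via_integer(int exponent)
{
  assert(exponent >= 0);
  __int128_t n = 1;
  for (int i = 0; i < exponent; ++i) { n *= 10; }
  return n;
}
```

For exponents of 23 and above the two functions disagree, and a divisor that is off by even a small amount moves the tie point used by HALF_UP and HALF_EVEN rounding.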
79 changes: 79 additions & 0 deletions cpp/tests/round/round_tests.cpp
@@ -703,4 +703,83 @@ TEST_F(RoundTests, BoolTestHalfUp)
  EXPECT_THROW(cudf::round(input, -2, cudf::rounding_method::HALF_UP), cudf::logic_error);
}

// Use __uint128_t for demonstration.
constexpr __uint128_t operator""_uint128_t(const char* s)
{
  __uint128_t ret = 0;
  for (int i = 0; s[i] != '\0'; ++i) {
    ret *= 10;
    if ('0' <= s[i] && s[i] <= '9') { ret += s[i] - '0'; }
  }
  return ret;
}

TEST_F(RoundTests, HalfEvenErrorsA)
{
  using namespace numeric;
  using RepType = cudf::device_storage_type_t<decimal128>;
  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;

  {
    // 0.5 at scale -37 should round HALF_EVEN to 0, because 0 is an even number
    auto const input =
      fp_wrapper{{5000000000000000000000000000000000000_uint128_t}, scale_type{-37}};
    auto const expected = fp_wrapper{{0}, scale_type{0}};
    auto const result = cudf::round(input, 0, cudf::rounding_method::HALF_EVEN);

    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
  }
}

TEST_F(RoundTests, HalfEvenErrorsB)
{
  using namespace numeric;
  using RepType = cudf::device_storage_type_t<decimal128>;
  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;

  {
    // 0.125 at scale -37 should round HALF_EVEN to 0.12, because 2 is an even number
    auto const input =
      fp_wrapper{{1250000000000000000000000000000000000_uint128_t}, scale_type{-37}};
    auto const expected = fp_wrapper{{12}, scale_type{-2}};
    auto const result = cudf::round(input, 2, cudf::rounding_method::HALF_EVEN);

    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
  }
}

TEST_F(RoundTests, HalfEvenErrorsC)
{
  using namespace numeric;
  using RepType = cudf::device_storage_type_t<decimal128>;
  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;

  {
    // 0.0625 at scale -37 should round HALF_EVEN to 0.062, because 2 is an even number
    auto const input =
      fp_wrapper{{0625000000000000000000000000000000000_uint128_t}, scale_type{-37}};
    auto const expected = fp_wrapper{{62}, scale_type{-3}};
    auto const result = cudf::round(input, 3, cudf::rounding_method::HALF_EVEN);

    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
  }
}

TEST_F(RoundTests, HalfUpErrorsA)
{
  using namespace numeric;
  using RepType = cudf::device_storage_type_t<decimal128>;
  using fp_wrapper = cudf::test::fixed_point_column_wrapper<RepType>;

  {
    // 0.25 at scale -37 should round HALF_UP to 0.3
    auto const input =
      fp_wrapper{{2500000000000000000000000000000000000_uint128_t}, scale_type{-37}};
    auto const expected = fp_wrapper{{3}, scale_type{-1}};
    auto const result = cudf::round(input, 1, cudf::rounding_method::HALF_UP);

    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
  }
}

CUDF_TEST_PROGRAM_MAIN()
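
The four tests above all place the input exactly on a rounding tie, which is where an inexact power of ten does the most damage. The following standalone sketch reproduces the expected values on a plain 128-bit integer. It is not libcudf's implementation: `rounding`, `pow10`, and `round_drop_digits` are hypothetical names, the sketch handles only non-negative values, and `__int128_t` is a GCC/Clang extension.

```cpp
#include <cassert>

enum class rounding { half_up, half_even };

// Exact integer power of ten (assumes the result fits in 128 bits).
inline __int128_t pow10(int exponent)
{
  __int128_t n = 1;
  for (int i = 0; i < exponent; ++i) { n *= 10; }
  return n;
}

// Hypothetical helper: drops `digits` trailing decimal digits from a
// non-negative `value`, rounding ties according to `method`.
inline __int128_t round_drop_digits(__int128_t value, int digits, rounding method)
{
  assert(value >= 0 && digits >= 1);
  __int128_t const n = pow10(digits);
  __int128_t const quotient = value / n;
  __int128_t const remainder = value % n;
  __int128_t const half = n / 2;
  if (remainder > half) { return quotient + 1; }
  if (remainder < half) { return quotient; }
  // Exact tie: HALF_UP always rounds up, HALF_EVEN keeps an even quotient.
  if (method == rounding::half_up) { return quotient + 1; }
  return (quotient % 2 == 0) ? quotient : quotient + 1;
}
```

With scale -37, 0.5 is stored as `5 * pow10(36)`, and rounding to 0 decimal places drops 37 digits. The other cases follow the same pattern: 0.125 drops 35 digits, 0.0625 drops 34, and 0.25 drops 36.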