Handle Decimal-128 Multiplication For Newer Spark Versions #1623
Conversation
Signed-off-by: Raza Jafri <[email protected]>
Related to NVIDIA/spark-rapids#9859
@tgravescs I think I have answered your previous comment. Do you have any other comments?
build
Some minor nits, looks good overall and I like the comments added.
src/main/cpp/src/decimal_utils.cu
@@ -968,6 +974,7 @@ namespace cudf::jni {
 std::unique_ptr<cudf::table> multiply_decimal128(cudf::column_view const& a,
                                                  cudf::column_view const& b,
                                                  int32_t product_scale,
+                                                 bool const& cast_interim_result,
Suggested change:
- bool const& cast_interim_result,
+ bool const cast_interim_result,

No real advantage to passing it by reference (effectively a pointer) when it is const and smaller than a pointer.
build
build
CI failing because of
but passes locally even after doing a clean build.
build
-  public static Table multiply128(ColumnView a, ColumnView b, int productScale) {
-    return new Table(multiply128(a.getNativeView(), b.getNativeView(), productScale));
+  public static Table multiply128(ColumnView a, ColumnView b, int productScale, boolean interimCast) {
+    return new Table(multiply128(a.getNativeView(), b.getNativeView(), productScale, interimCast));
If the interimCast is applied to Spark versions 3.2.4, 3.3.3, 3.4.1, 3.5.0 and 4.0.0, then please add such a clarification to the docs of this function.
I thought the docs were pretty clear; if there is anything else you want me to add, please share and I will add it.
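For context, here is a minimal usage sketch of the new multiply128 overload discussed above. It is an illustration only: the DecimalUtils class name is inferred from the test file referenced below, while the column construction via ColumnVector.decimalFromBigInt, the negated cudf scale convention, and the layout of the returned table are assumptions about the surrounding cudf Java API rather than anything confirmed in this PR.

```java
import java.math.BigInteger;
import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Table;
import com.nvidia.spark.rapids.jni.DecimalUtils;

public class Multiply128UsageSketch {
  public static void main(String[] args) {
    // DECIMAL128 inputs with decimal scale 10; cudf encodes the decimal scale as a
    // negative exponent, hence -10 (assumed convention, not taken from this PR).
    try (ColumnVector a = ColumnVector.decimalFromBigInt(-10,
             new BigInteger("-85334448647530481077706777111312637916"));
         ColumnVector b = ColumnVector.decimalFromBigInt(-10,
             new BigInteger("-120000000000"));
         // productScale -6 requests a Decimal(38, 6) product; interimCast = true mimics
         // the older Spark behaviour, false the behaviour of 3.4.2+/3.5.1+/4.0.0.
         Table result = DecimalUtils.multiply128(a, b, -6, true)) {
      // The returned table is expected to carry an overflow indicator alongside the
      // product column (layout assumed; check the library's javadoc).
      System.out.println("columns returned: " + result.getNumberOfColumns());
    }
  }
}
```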
src/test/java/com/nvidia/spark/rapids/jni/DecimalUtilsTest.java
build
 * WARNING: With interimCast set to true, this method has a bug which we match with Spark versions before 3.4.2,
 * 3.5.1 and 4.0.0. Consider the following example using Decimal with a precision of 38 and scale of 10:
 * -8533444864753048107770677711.1312637916 * -12.0000000000 = 102401338377036577293248132533.575166
 * while the actual answer based on Java BigDecimal is 102401338377036577293248132533.575165
 *
 * @param a factor input, must match row count of the other factor input
 * @param b factor input, must match row count of the other factor input
 * @param productScale scale to use for the product type
 * @param interimCast whether to cast the result of the multiplication to 38 precision before casting it again to the
 *                    final precision
Nit: Sorry, I think these lines are longer than usual, so we may need to rewrap them a little bit. As a convention, a line typically should not exceed 100 characters; the lines above are up to 120.
Yeah, good point. I see the file has other instances where we are doing this. To keep this PR focused, I don't want to make changes in those other places, and since CI has passed, I would really appreciate it if we can do that as a follow-on.
Thanks @ttnghia, I will file a follow-on for the line length! Appreciate your help.
This PR adds a new method for multiplying two 128-bit decimal numbers without casting the interim result before it is cast to the scale expected for the final answer.
When multiplying two decimals with a precision of 38, earlier versions of Spark cap the intermediate result at precision 38, which effectively causes the answer to be cast twice: once right after the multiplication and once before reporting the final answer. This double rounding makes the answer inaccurate.
For more details, please look at the linked Spark issue, which explains the problem in greater depth.
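To make the double rounding concrete, here is a small, self-contained BigDecimal sketch that reproduces the two answers quoted in the javadoc warning above. It is not code from this PR: the HALF_UP rounding mode and the intermediate rounding to 38 significant digits are assumptions chosen to mirror the interim cast described in this description, and the class name is hypothetical.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class DoubleRoundingDemo {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("-8533444864753048107770677711.1312637916"); // Decimal(38, 10)
    BigDecimal b = new BigDecimal("-12.0000000000");                           // Decimal(38, 10)

    // Exact product, scale 20: 102401338377036577293248132533.5751654992...
    BigDecimal exact = a.multiply(b);

    // Rounding once, straight to the final Decimal(38, 6) type: ...575165
    BigDecimal singleRounded = exact.setScale(6, RoundingMode.HALF_UP);

    // Older Spark first squeezes the product into 38 total digits of precision
    // (leaving only 8 fractional digits here) and then rounds again to the final
    // scale, which bumps the last digit: ...575166
    BigDecimal interim = exact.round(new MathContext(38, RoundingMode.HALF_UP));
    BigDecimal doubleRounded = interim.setScale(6, RoundingMode.HALF_UP);

    System.out.println("single rounding: " + singleRounded.toPlainString());
    System.out.println("double rounding: " + doubleRounded.toPlainString());
  }
}
```

Running this prints 102401338377036577293248132533.575165 for the single rounding and 102401338377036577293248132533.575166 for the double rounding, matching the numbers in the method documentation.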
Changes Made:
- Added a multiply128 method that takes in an interimCast boolean.

Tests: