Fix the overflow of container type when casting floats to decimal #5766
Conversation
build

I assume this should go into the 22.06 release.
The change looks reasonable to me but I have not worked on the decimal implementation and am not an expert in this area.
```diff
 withResource(checked.castTo(containerType)) { container =>
-  container.round(dt.scale, cudf.RoundMode.HALF_UP)
+  withResource(container.round(dt.scale, cudf.RoundMode.HALF_UP)) { rd =>
+    rd.castTo(targetType)
+  }
 }
```
Nit: doesn't this cast end up creating a redundant copy of the column in many (all?) cases? Wondering if we should check whether the resulting type from `round` is already the desired cudf type. I would expect cudf's `round` to deterministically produce a result type based on the input cast type, which makes me wonder why the cast is necessary here.
@jlowe The cast here is for cases where the underlying cuDF decimal type got promoted along with precision + 1 (e.g., from DECIMAL32 to DECIMAL64). For cases like that, I think we need to convert back to the original cuDF decimal type, to keep it aligned with the target precision.
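For illustration only (not code from this PR), here is roughly how a decimal precision maps onto cudf's storage types, which shows how precision + 1 can cross the DECIMAL32/DECIMAL64 boundary:

```scala
import ai.rapids.cudf.DType

// Illustrative mapping: cudf stores decimals of up to 9 digits in DECIMAL32,
// up to 18 in DECIMAL64, and up to 38 in DECIMAL128. cudf encodes the scale
// as a negative power-of-ten exponent.
def cudfDecimalTypeFor(precision: Int, scale: Int): DType = {
  if (precision <= 9) DType.create(DType.DTypeEnum.DECIMAL32, -scale)
  else if (precision <= 18) DType.create(DType.DTypeEnum.DECIMAL64, -scale)
  else DType.create(DType.DTypeEnum.DECIMAL128, -scale)
}

// A target of precision 9 (DECIMAL32) promoted to a container of precision 10
// lands in DECIMAL64, so the result must be cast back to DECIMAL32 afterwards.
```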
We are pre-casting to the promoted precision but asking cudf to round to the originally desired precision. Do we know for certain that cudf does not already put the result in the appropriate type as part of the round implementation? Even if it does not, it is not always the case that the promotion cast actually did anything (e.g., it was DECIMAL32 before and after, because precision + 1 didn't actually need DECIMAL64).
This isn't must-fix for proper functionality; it's more of a performance optimization for what may be a common case.
> Do we know for certain that cudf does not already put the result in the appropriate type as part of the round implementation?

I'm not sure if I understand your question correctly. In cudf, the floating-point value is just shifted (multiplied/divided) by the `scale` value and then rounded toward zero to an integer for internal storage.
https://github.com/rapidsai/cudf/blob/fae22213537a003c74e2da9feac1bda38f562f6f/cpp/include/cudf/fixed_point/fixed_point.hpp#L230
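As a plain-Scala illustration of that construction path (assumed to mirror the cudf behavior just described, not cudf code):

```scala
// Illustrative: cudf-style fixed-point construction from a double.
// scale follows cudf's convention of a power-of-ten exponent, so a scale
// of -2 means two fractional digits.
def toFixedPoint(value: Double, scale: Int): Long = {
  val shifted = value * math.pow(10, -scale)
  shifted.toLong // truncation toward zero, as in cudf's constructor
}

toFixedPoint(1.5, 0)  // 1, not 2: the fractional part is simply dropped
toFixedPoint(2.5, -1) // 25, i.e. 2.5 survives intact at scale -1
```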
Maybe cudf can fix the accuracy problem simply by not rounding toward zero. That simple fix would avoid this kind of issue on the plugin side, but I'm not sure it is the desired behavior in cudf. I'll file an issue.
Filed: rapidsai/cudf#11079
Force-pushed from 729e04e to e3675f8.

Conducted a force push, rebasing onto 22.06.
build
I'm OK with it as-is, but it would be good to have a followup for what appears to be an unnecessary post-round cast in some (all?) cases.
Regarding the reason behind this bug, I repost an explanation from my cudf issue (rapidsai/cudf#11079 (comment)):

(reposting this message from Slack) There is a long history behind this, and truncation is definitely the desired and intentional behaviour. Here are the main motivations for truncation:
Note that at one point we actually did have rounding functionality baked into floating-point construction for fixed-point. However, it wasn't actually to address the desire to be more "accurate", as @ttnghia is pointing out; it was to try and address inherent issues in floating point (such as 1.001 not being representable by floating point, meaning if you construct a fixed-point with scale -3, you end up with 1, missing the .001).

@harrism and I had many discussions about this and both agreed that "ideally" fixed-point should be able to avoid this at the end of the day. However, it ended up presenting too many issues/complications and not actually comprehensively fixing all cases. And note, trying to be "better" was not consistent with what CNL and other C++ fixed-point libraries do. So we decided at a certain point to just accept the inherent flaws of floating point and follow CNL. This was done in the following small PR: rapidsai/cudf#6544. If you take a look at the unit tests, it is very clear how the behaviour changed by looking at the before and after.

Furthermore, a bunch of "opposition research" was done at one point into what other C++ fixed-point libraries do. Our goal is not to be similar to Julia or Go or Python, but to be a generic C++ backend that provides all the tools to do whatever you want in the front-end language (like cuDF or Spark RAPIDS).
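To make the 1.001 example above concrete (plain Scala, assumed to mirror the behaviour described in the repost):

```scala
// 1.001 has no exact binary floating-point representation; shifting by 10^3
// yields a value just under 1001, and truncation toward zero drops to 1000.
val shifted = 1.001 * 1000.0  // ~1000.9999999999999
val stored = shifted.toLong   // 1000: the fixed-point reads back as 1.000
// Half-up rounding (math.round(shifted) == 1001) would recover 1.001, which
// is what the plugin arranges by rounding explicitly after the cast.
```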
* Correct the error message for test_mod_pmod_by_zero (#5781)
* Update the error string for test_cast_neg_to_decimal_err on 330 [databricks] (#5784)
* Throw an exception when attempting to read columnar encrypted Parquet files on the GPU [databricks] (#5761)
* Fix rounds over decimal in Spark 330+ (#5786): passes the datatype of round-like functions directly to GPU overrides, so as to adapt to different Spark versions.
* Fix the overflow of container type when casting floats to decimal (#5766): fixes #5765, the potential overflow when casting float/double to decimal; the overflow occurs on the container decimal for the HALF_UP round.

Co-authored-by: Liangcai Li <[email protected]>
Co-authored-by: Alfred Xu <[email protected]>
Fixes #5765
This PR fixes the potential overflow when casting float/double to decimal. The overflow occurs on the container decimal used for the HALF_UP round. The precision of the container decimal should increase along with the scale to handle corner cases like #5765.
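A hypothetical worked example of the overflow (the values are illustrative, not taken from #5765):

```scala
// Target Spark type: DECIMAL(8, 2). The HALF_UP container rounds at one
// extra fractional digit, i.e. scale 3.
// 999999.99 fits precision 8 at scale 2 (8 digits: 99999999), but at scale 3
// it needs 9 digits (999999990), so a container of DECIMAL(8, 3) overflows.
// Growing the precision together with the scale, DECIMAL(9, 3), fits.
val target = (8, 2)            // (precision, scale)
val overflowing = (8, 3)       // scale + 1 only: 999999.990 does not fit
val container = (8 + 1, 2 + 1) // precision + 1 and scale + 1: fits
```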
Signed-off-by: sperlingxx [email protected]