-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] test_decimal_round fails with DATAGEN_SEED=3 #9847
Comments
The -0.0 vs 0.0 is not an actual mismatch in the test (it's using approx float comparison already), that's an artifact of the difftool used to produce the output when the test fails which is just looking at raw text and no longer doing the floating point approx comparison logic. The real issue is more along the lines of #9350. In this specific case with DATAGEN_SEED=3, the problem is only with single-precision floating point value 6.121944898040965e-05f which gets rounded incorrectly up instead of down. I tracked this down to a problem in libcudf where the calculations for single-precision floating point rounding are forced to use single-precision floating point throughout which introduces enough error to cause the issue. Filed rapidsai/cudf#14528 |
@jlowe do you think that this might happen with double values too for rounding to larger numbers of digits? Right now we go up to 10, which exposed a problem for float. But should we try larger values for double? |
Yes, I'm sure that will happen at some point for specific values and rounding places, even for doubles. We'll won't be able to match Spark's behavior here if we stick with floating-point types, since Spark goes to BigDecimal and thus will have a lot more precision to wield. Fixing the cudf issue I filed for floats may give us enough additional precision to handle most (all?) of the FLOAT32 cases, but it won't do anything to improve the known limitations when rounding with doubles to high numbers of decimal places. We may want to do some experiments to know how many decimal places we can reliably round to and consider fallback to the CPU when we see rounding to a decimal place we cannot reliably match. |
I think for float32 trying to round anything more than 7 decimal places (inclusive) might result in some "strange" results (even if you use double precision). Consider a float32 value and its two closest neighbours:
Rounding to six digits produces |
Repro:
(part of) results:
I seen many mismatched are from #9349, but I believe there will be some from #9350
The text was updated successfully, but these errors were encountered: