-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] FLOAT32 rounding more inaccurate than necessary #14528
Comments
Looking at the implementation of our rounding code (specifically half_up_positive), I am concerned that it's not just a matter of |
Yes, totally agree that double will have similar issues with larger decimal place rounding. As such I don't see this so much as "solving" all the rounding issues as significantly improving the results for FLOAT32 with minimal effort. |
I think this might be possible to handle by using appropriately selected CUDA intrinsics for the division step. If we're rounding half-up we should use division with round to positive infinity, rather than round to nearest ties to even (which is the default): Edit: I misunderstood the definition of half_up rounding, which only breaks ties. Though one might still be able to get away with something like this
Produces:
But I haven't thought through all the consequences of this change. |
That said, I claim that |
I started looking into this, and noticed a problem with our current implementation: HALF_UP rounding is bugged for negative numbers: if you round (e.g.) -0.5f to zero decimal places, it should round up to zero, but instead results in -1 because it calls roundf(), which rounds away from zero. Do we want to change the rounding code to round up, or change the name/comment to ROUND_AWAY? |
I think the "UP" here refers to higher magnitude rather than higher value. FWIW Java's |
That's fine with me, I'll fix the code comment then in round.hpp (which should instead point to wikipedia's rounding half away from zero). |
The most accurate way to round would be to convert floating-point to fixed_point (to more decimal places than float/doubles precision), round as desired, then convert that back to floating. But as @harrism brings up this will still produce "unexpected" results (e.g. 1.151f to 2 decimal places -> fixed point 115 w/ scale -1 -> 1.1499999f -> string "1.14"). So I think the question is: do we want to do these conversions in this kernel, or should we block using floating point for the round kernel, and instead require users to explicitly choose the chain of operations above? |
I think licudf shouldn't support rounding binary float types to specified number of decimal places. Just like the C++ standard provides no function for this. https://en.cppreference.com/w/cpp/numeric/math/round
|
Describe the bug
When rounding single-precision floats to a specified number of decimal places, the result can be slightly inaccurate due to the intermediate computations being forced into FLOAT32 as well. round.cu has rounding functors for non-fixed-point types, but all of the intermediate results are in the same type as the input rather than the highest precision type, double. This means more error is introduced during the rounding computation than is necessary.
Steps/Code to reproduce bug
The following code demonstrates the problem:
Rounding the value to the tenth decimal place should round down to approximately 6.12194e-05 but instead the value is rounded up to approximately 6.12195e-05 as shown in the output when running the program:
Expected behavior
FLOAT32 rounding should use FLOAT64 for intermediate results during computation to try to avoid injecting errors beyond what is necessary when dealing with floating point numbers. When I manually performed the computations on this example input value for round.cu's half_up_positive logic but using double instead of float for the intermediate values, the answer came out rounded down as expected.
It seems that the functors for floating point rounding in round.cu should not be using whatever the input type is but rather
double
explicitly to avoid unnecessary additional error during the computation.The text was updated successfully, but these errors were encountered: