Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt changes from JNI for casting from float to decimal #10917

Merged
merged 19 commits into from
Jul 30, 2024

Conversation

ttnghia
Copy link
Collaborator

@ttnghia ttnghia commented May 28, 2024

This reimplements castFloatsToDecimal to just calling the corresponding function from spark-rapids-jni, which is designed specifically for casting floating point values to decimal for Spark.

Depends on:

Closes #9682, and closes #10809.

ttnghia added 3 commits May 27, 2024 08:37
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
@ttnghia ttnghia added bug Something isn't working SQL part of the SQL/Dataframe plugin task Work required that improves the product but is not user facing labels May 28, 2024
@ttnghia ttnghia self-assigned this May 28, 2024
@thirtiseven
Copy link
Collaborator

thirtiseven commented May 28, 2024

The results look really close.

The @approximate_float in integration test can even be removed under my tests.

CastOpSuite can pass with eps of 1e-13.

Related Spark UT still fails, also close:

- casting to fixed-precision decimals *** FAILED ***
  Incorrect evaluation: cast(10.03 as decimal(38,18)), actual: 10.030000000000002048, expected: 10.03 (RapidsTestsTrait.scala:350)

Performance test to cast 5000000 floats to 10 kinds of decimal types (in ms):

Type CPU 24.08 #10909 This PR
Double 146,524 468.33 660.33 370.66
Float 82,691 412.33 480.67 275.33

@ttnghia
Copy link
Collaborator Author

ttnghia commented May 28, 2024

Should that case be acceptable as the relative error is just 1.7e-16?
For 10.3, it is stored like 9.30000000000000071054 and it is very difficult to remove the trailing bonus digits if we are going to take 18 decimal digits.

@ttnghia
Copy link
Collaborator Author

ttnghia commented May 29, 2024

I've just updated my JNI code in NVIDIA/spark-rapids-jni#2078, adding some utilities functions by @pmattione-nvidia and it seems to fix all of our failed tests. @thirtiseven please run tests again on your machine.

@thirtiseven
Copy link
Collaborator

thirtiseven commented May 29, 2024

It passed Spark UT, personally I think it is good enough for this issue.

If set eps = 0 in com.nvidia.spark.rapids.CastOpSuite#cast float/double to decimal, we can still see some mismatch:
Some cases in float test:

cpu: -6.963900469355742E15	gpu: -6.963900469355743E15
cpu: 4.005061702373036E19	gpu: 4.005061702373037E19
cpu: 7.1726783043136778E18	gpu: 7.1726783043136788E18
cpu: 4.9245108190671776E17	gpu: 4.9245108190671782E17
cpu: -6.6190078985307568E16	gpu: -6.6190078985307576E16
cpu: -8.286135559736182E15	gpu: -8.286135559736183E15
cpu: 9.999999933815812E18	gpu: 9.999999933815814E18
cpu: 9.2205076355788718E18	gpu: 9.2205076355788728E18
cpu: 1.00000004091847872E17	gpu: 1.00000004091847888E17

Only diffs in last few digits.

Some cases in double test:

cpu: -1.0E16	                gpu: -1.0000000000000002E16
cpu: -3.953408452257507E34	gpu: -3.9534084522575073E34
cpu: -1.0E16	                gpu: -1.0000000000000002E16
cpu: 7.1726781626632929E18	gpu: 7.172678162663294E18
cpu: 4.9245106955378758E17	gpu: 4.9245106955378765E17
cpu: 1.0E29                     gpu: 1.0000000000000001E29
cpu: -6.619008101723092E16	gpu: -6.6190081017230928E16
cpu: 9.999999999999999E29	gpu: 1.0E30
cpu: -6.200692612954865E31	gpu: -6.200692612954866E31
cpu: -8.286135303307476E15	gpu: -8.286135303307477E15
cpu: -5.75278800663036E23	gpu: -5.7527880066303604E23
cpu: -2.786658992657616E33	gpu: -2.7866589926576164E33
cpu: -4.857510460413498E34	gpu: -4.8575104604134985E34

There some cases like -1.0E16 vs -1.0000000000000002 and 9.999999999999999E29 vs 1.0E30, not sure if they can be fix easily.

New performance:

Type CPU 24.08 #10909 This PR old This PR new
Double 146,524 468.33 660.33 370.66 439
Float 82,691 412.33 480.67 275.33 378.33

@thirtiseven
Copy link
Collaborator

And again some thoughts about the float => string => decimal solution. In Spark/Java double is converted to string with as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. doc. But the algorithm is complex and sometimes does not produce results only as many at low version jdk. So technically the best we can match now might be the only as many digits, which is implemented in jni ftos_converter.cuh. So we don't need to really convert it to string, and can get the digits in the string in int64. But it seems not to be easy to use in your approach.

@ttnghia
Copy link
Collaborator Author

ttnghia commented May 29, 2024

My solution in NVIDIA/spark-rapids-jni#2078 is adopted from the ongoing work of @pmattione-nvidia. Paul is working on even a better solution, so hopefully it will be able to fix the failures above. Let's wait for it.

@ttnghia ttnghia changed the base branch from branch-24.06 to branch-24.08 May 29, 2024 22:07
ttnghia added 9 commits June 7, 2024 19:50
Signed-off-by: Nghia Truong <[email protected]>
# Conflicts:
#	tests/src/test/spark330/scala/org/apache/spark/sql/rapids/utils/RapidsTestSettings.scala
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
@ttnghia
Copy link
Collaborator Author

ttnghia commented Jul 19, 2024

Alright, since we don't have any better solution in the meantime, I've made NVIDIA/spark-rapids-jni#2078 ready for review. With that:

ttnghia added 4 commits July 23, 2024 13:58
Signed-off-by: Nghia Truong <[email protected]>
This reverts commit c185206.

# Conflicts:
#	tests/src/test/scala/com/nvidia/spark/rapids/CastOpSuite.scala
@ttnghia ttnghia marked this pull request as ready for review July 24, 2024 21:35
@ttnghia ttnghia marked this pull request as draft July 24, 2024 22:16
Signed-off-by: Nghia Truong <[email protected]>
@ttnghia ttnghia marked this pull request as ready for review July 26, 2024 23:27
@ttnghia
Copy link
Collaborator Author

ttnghia commented Jul 29, 2024

build

Copy link
Collaborator

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@razajafri razajafri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ttnghia ttnghia merged commit c9f1ab9 into NVIDIA:branch-24.08 Jul 30, 2024
43 checks passed
@ttnghia ttnghia deleted the float_to_decimal branch July 30, 2024 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working SQL part of the SQL/Dataframe plugin task Work required that improves the product but is not user facing
Projects
None yet
4 participants