-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove floating point types from radix sort fast-path #7215
Remove floating point types from radix sort fast-path #7215
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Might suggest rewording the comment.
// A non-stable sort on a column of arithmetic type with no nulls will use a radix sort | ||
// if specifying only the `thrust::less` or `thrust::greater` comparators. | ||
// But this also requires making a copy of the input data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads a bit odd.
I reran the tests that we saw failing and this has fixed them. Thanks. |
Codecov Report
@@ Coverage Diff @@
## branch-0.18 #7215 +/- ##
===============================================
+ Coverage 82.09% 82.20% +0.11%
===============================================
Files 97 98 +1
Lines 16474 16692 +218
===============================================
+ Hits 13524 13722 +198
- Misses 2950 2970 +20
Continue to review full report at Codecov.
|
@gpucibot merge |
PR #7215 removed single floating point columns from radix sort fast-path but missed disabling the fast-path sort for floating-point in `cudf::sort()`. This PR fixes `cudf::sort` and adds a new test to the existing `RowOperatorTestForNAN.NANSortingNonNull` gtest. Authors: - David (@davidwendt) Approvers: - Ram (Ramakrishna Prabhu) (@rgsl888prabhu) - Conor Hoekstra (@codereport) URL: #7250
Closes #7212
Reference #7167 (comment)
Using radix sort for all fixed-width types causes an error in Spark when floating point columns contain NaN elements.
This PR removes floating-point column types from the radix fast-path. This means the original
relational_compare
row operator is used to handle sorting floating point columns since they could possibly contain NaN elements.The
NANSorting
gtest included null elements so it did not catch the fast-path output discrepancy. This PR adds aNANSortingNonNull
gtest to check for the desired NaN sorting behavior.