[FEA] Optimizations for groupby::scan #8522
Comments
This impacts more than just |
This issue has been labeled |
Not sure if this is still an issue or not, but I would love to have groupby::scan go faster for already sorted data, as this is a use case we commonly have in Spark. |
Oops, I completely forgot about this issue. Will work on this shortly. |
Still desired. |
Closes #8522

This PR gets rid of redundant rearranging processes in `groupby::scan` if input values are presorted. Instead of a short circuit in `sort_helper`, it adds an early exit in the scan functor to avoid materializing `sorted_values`/`grouped_values`, thus reducing memory footprint. This optimization brings a 1.6x speedup for presorted scan operations.

- Baseline
```
-----------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
-----------------------------------------------------------------------------------------
Groupby/BasicSumScan/1000000/manual_time          0.455 ms        0.472 ms         1388
Groupby/BasicSumScan/10000000/manual_time          8.80 ms         8.81 ms           61
Groupby/BasicSumScan/100000000/manual_time          543 ms          543 ms            1
Groupby/PreSortedSumScan/1000000/manual_time      0.217 ms        0.236 ms         3319
Groupby/PreSortedSumScan/10000000/manual_time      1.45 ms         1.47 ms          479
Groupby/PreSortedSumScan/100000000/manual_time     14.0 ms         14.0 ms           47
```
- After optimization
```
-----------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
-----------------------------------------------------------------------------------------
Groupby/BasicSumScan/1000000/manual_time          0.455 ms        0.472 ms         1393
Groupby/BasicSumScan/10000000/manual_time          8.81 ms         8.82 ms           60
Groupby/BasicSumScan/100000000/manual_time          546 ms          546 ms            1
Groupby/PreSortedSumScan/1000000/manual_time      0.129 ms        0.148 ms         5389
Groupby/PreSortedSumScan/10000000/manual_time     0.901 ms        0.921 ms          769
Groupby/PreSortedSumScan/100000000/manual_time     8.68 ms         8.70 ms           74
```

Authors:
- Yunsong Wang (https://github.com/PointKernel)

Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
- Karthikeyan (https://github.com/karthikeyann)

URL: #9754
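For readers unfamiliar with the operation being benchmarked, here is a minimal host-side sketch of what a grouped sum scan computes when the keys are already sorted. It is not the libcudf implementation (which runs on the GPU); the function name `presorted_sum_scan_sketch` and the use of `std::vector` are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: inclusive sum scan that restarts at every group
// boundary. Assumes `keys` is already sorted, so each group is contiguous and
// the values can be consumed in place with no reordering.
std::vector<double> presorted_sum_scan_sketch(const std::vector<int>& keys,
                                              const std::vector<double>& values) {
    assert(keys.size() == values.size());
    std::vector<double> out(values.size());
    double running = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        // A change in key marks the start of a new group: reset the sum.
        if (i == 0 || keys[i] != keys[i - 1]) { running = 0.0; }
        running += values[i];
        out[i] = running;
    }
    return out;
}
```

Because the input is consumed in its original order, no sorted or gathered copy of the values is needed, which is the kind of saving the PR description refers to.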
The `groupby` object has a parameter allowing a caller to indicate that their keys (and likewise values to aggregate) are already in sorted order such that all values belonging to a particular group are already contiguous. However, this information is not fully leveraged in the `groupby::scan` implementation.
Internally, the `groupby::scan` implementation calls `get_grouped_values()` to rearrange the values such that all values in the same group are contiguous:
cudf/cpp/src/groupby/sort/scan.cpp
Lines 74 to 75 in 6728c75
This in turn calls `sort_groupby_helper::grouped_values` to perform the grouping:
cudf/cpp/src/groupby/sort/sort_helper.cu
Lines 289 to 302 in 6728c75
Which in turn calls `key_sort_order` to get the indices of the sorted order of the keys:
cudf/cpp/src/groupby/sort/sort_helper.cu
Lines 107 to 157 in 6728c75
Then the values are gathered based on the sorted key order:
cudf/cpp/src/groupby/sort/sort_helper.cu
Lines 294 to 299 in 6728c75
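The two steps described above (compute a sort order for the keys, then gather the values through it) can be sketched on the host as follows. This is only an illustration under stated assumptions: `key_sort_order_sketch` and `gather_sketch` are hypothetical names, the data lives in `std::vector` rather than in device columns, and the real code referenced above is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for computing the key sort order:
// returns the indices that stably sort `keys` in ascending order.
std::vector<std::size_t> key_sort_order_sketch(const std::vector<int>& keys) {
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return order;
}

// Hypothetical stand-in for the gather step: out[i] = values[order[i]].
std::vector<double> gather_sketch(const std::vector<double>& values,
                                  const std::vector<std::size_t>& order) {
    std::vector<double> out;
    out.reserve(order.size());
    for (std::size_t idx : order) {
        assert(idx < values.size());
        out.push_back(values[idx]);
    }
    return out;
}
```

Note that when the keys are already sorted, the computed order is the identity permutation and the gather simply produces a copy of the input, so both the index buffer and the copy are wasted work.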
However, the steps of materializing the `key_sort_order` and performing the `gather` are completely redundant if the user has already specified that the inputs are already sorted.
I haven't fully thought through the solution, but my intuition is that `sort_groupby_helper` should short circuit when the keys are already sorted and just return a view to the input values.
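A rough host-side sketch of the short circuit proposed here might look like the following. Everything in it is an assumption for illustration: the function name `grouped_values_sketch`, the `keys_are_sorted` flag as a plain `bool`, and the caller-provided `scratch` buffer are hypothetical and do not mirror the actual `sort_groupby_helper` interface. Returning a reference to the input stands in for "return a view to the input values".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical illustration of the proposed short circuit.
// If the caller states the keys are sorted, return the input values untouched.
// Otherwise, sort-and-gather into `scratch` and return that instead.
const std::vector<double>& grouped_values_sketch(const std::vector<int>& keys,
                                                 const std::vector<double>& values,
                                                 bool keys_are_sorted,
                                                 std::vector<double>& scratch) {
    assert(keys.size() == values.size());
    if (keys_are_sorted) {
        return values;  // short circuit: no sort order, no gather, no copy
    }
    // Slow path: materialize the sort order of the keys...
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    // ...then gather the values through it.
    scratch.clear();
    scratch.reserve(order.size());
    for (std::size_t idx : order) { scratch.push_back(values[idx]); }
    return scratch;
}
```

On the fast path the scratch buffer is never touched, so the only memory in play is the caller's own input. The sketch trusts the flag and does not verify that the keys are actually sorted.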