Groupby hash aggregations use sort-based implementation if nested-type columns are used as values #14412
Comments
This code block is the piece in question: cudf/cpp/src/groupby/hash/groupby.cu Lines 654 to 656 in b446a6f
@ttnghia, you added this in #13676. Do you know whether this fallback is still required, and if so, why? We discussed this a bit here: #13676 (comment). Side note, I'm punching my past self: I asked this question on that PR and never submitted my review.
Because hash-based aggregations are implemented for plain types only, using the operator such as [...]. If we want to support hash-based aggregates for nested types, we need to rewrite such [...]
Great, that was helpful. I think we can do this. The rough plan would be to preprocess the table so that we have usable device comparators, pass the preprocessed table info through all the aggregation machinery, and use the device comparator where needed in [...]
Yes, that sounds good. Note that we only need to rework for [...]
To clarify, the reason we cannot use hash-based groupby for nested types is that there is currently no way to atomically update nested data on the device, because the hardware has no direct support for such operations. A possible solution is an atomic lock table, which CCCL is expected to support in the future (NVIDIA/cccl#990). We should backlog this until the atomic lock table becomes available.
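To make the lock-table idea concrete, here is a minimal host-side C++ sketch, not CUDA and not the CCCL API proposed in NVIDIA/cccl#990. Every name in it (`NestedValue`, `LockTable`, `MinAggregator`, `parallel_min_demo`) is hypothetical and is not part of cudf or CCCL. It assumes a MIN-style aggregation over a value too wide for a single atomic instruction, and guards each group slot with a lock chosen by slot index.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a nested value (e.g. a struct row) that is too
// wide to update with a single hardware atomic instruction.
struct NestedValue {
  int id = 0;
  std::string label;
};

// Ordering used by the MIN-style aggregation below.
inline bool less_than(NestedValue const& a, NestedValue const& b) {
  return a.id != b.id ? a.id < b.id : a.label < b.label;
}

// Hypothetical lock table: a fixed pool of spin locks. Each slot maps to one
// lock, so unrelated slots may share a lock but never share an update.
class LockTable {
 public:
  LockTable() {
    for (auto& l : locks_) l.store(false, std::memory_order_relaxed);
  }
  void lock(std::size_t slot) {
    // Spin until this thread flips the flag from false to true.
    while (locks_[slot % kNumLocks].exchange(true, std::memory_order_acquire)) {}
  }
  void unlock(std::size_t slot) {
    locks_[slot % kNumLocks].store(false, std::memory_order_release);
  }

 private:
  static constexpr std::size_t kNumLocks = 64;
  std::array<std::atomic<bool>, kNumLocks> locks_;
};

// Per-group MIN over nested values; has_value[g] records whether group g
// has received any value yet.
struct MinAggregator {
  std::vector<NestedValue> result;
  std::vector<char> has_value;
  LockTable locks;

  explicit MinAggregator(std::size_t num_groups)
      : result(num_groups), has_value(num_groups, 0) {}

  void update(std::size_t group, NestedValue const& v) {
    assert(group < result.size());
    locks.lock(group);  // the compare-and-replace below is not atomic by itself
    if (!has_value[group] || less_than(v, result[group])) {
      result[group] = v;
      has_value[group] = 1;
    }
    locks.unlock(group);
  }
};

// Runs four threads that all update group 0 and returns the final minimum.
inline NestedValue parallel_min_demo() {
  MinAggregator agg(1);
  std::vector<std::thread> workers;
  for (int t = 0; t < 4; ++t) {
    workers.emplace_back([&agg, t] {
      for (int i = 0; i < 1000; ++i) {
        agg.update(0, NestedValue{t * 1000 + i + 1, "row"});
      }
    });
  }
  for (auto& w : workers) w.join();
  return agg.result[0];
}
```

The lock replaces the single atomic instruction that plain types can rely on. A device-side version would also have to deal with contention and divergence between threads in a warp, which this sketch does not attempt to model.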
We should be able to use nested-type columns as values and still invoke a hash-based groupby. The hash-based implementation is generally faster, so we do not want to silently fall back to the sort-based one.
cudf/cpp/src/groupby/hash/groupby.cu
Lines 654 to 656 in abc0d41
Reference thread: #13795 (comment)