Explicitly disable groupby on unsupported key types. #9227
Fixes rapidsai#8905.

Attempting groupby aggregations with LIST keys leads to silent failures and bad results. For instance, attempting hash-based groupby aggregations with LIST keys only fails on DEBUG builds, thus:

```
/home/myth/dev/cudf/2/cpp/include/cudf/table/row_operators.cuh:447: unsigned int cudf::element_hasher_with_seed<hash_function, has_nulls>::operator()(cudf::column_device_view, signed int) const [with T = cudf::list_view; void *<anonymous> = (void *)nullptr; hash_function = default_hash; __nv_bool has_nulls = false]: block: [0,0,0], thread: [0,0,0] Assertion `false && "Unsupported type in hash."` failed.
```

In RELEASE builds, a copy of the input LIST column is returned, causing each output row to be interpreted as its own group.

This commit adds an explicit failure for unsupported LIST groupby keys.
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #9227 +/- ##
===============================================
Coverage ? 10.84%
===============================================
Files ? 116
Lines ? 19171
Branches ? 0
===============================================
Hits ? 2080
Misses ? 17091
Partials ? 0
Also, added exceptions for STRING and STRUCT types as keys.
The failures seem transient:
Rerun tests
1. No special handling for STRUCT, STRING.
2. Utility function for type checks.
CMake changes LGTM
Thanks for the reviews, all. I'll merge this in now.
@gpucibot merge
Fixes #8905.

Attempting groupby aggregations with LIST keys leads to silent failures and bad results. For instance, attempting hash-based groupby aggregations with LIST keys fails only on DEBUG builds, with an `Unsupported type in hash.` assertion. In RELEASE builds, a copy of the input LIST column is returned, causing each output row to be interpreted as its own group.

This commit adds an explicit failure for unsupported groupby key types, i.e. those that don't support equality comparisons (like LIST).
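
For readers skimming the thread, the kind of up-front check described above can be sketched roughly as follows. This is a standalone illustration, not the code merged in this PR: `key_type`, `supports_equality`, and `validate_groupby_keys` are hypothetical names, and the sketch assumes LIST is the only key type lacking equality comparison.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a column type tag; not cudf's actual type enum.
enum class key_type { INT32, FLOAT64, STRING, STRUCT, LIST };

// Assumption for this sketch: only LIST lacks equality comparison.
bool supports_equality(key_type t) { return t != key_type::LIST; }

// Reject unsupported key columns before any aggregation runs, so the caller
// gets an exception instead of a silently wrong one-group-per-row result.
void validate_groupby_keys(std::vector<key_type> const& keys)
{
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (!supports_equality(keys[i])) {
      throw std::invalid_argument("Unsupported groupby key type at key column " +
                                  std::to_string(i));
    }
  }
}
```

With a check like this, the failure is the same in DEBUG and RELEASE builds, since it no longer depends on a device-side assertion that is compiled out in release mode.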