-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] libcudf functions for segmented_has_nulls
/ has_null_elements(list_view)
/ segmented_bitmask_reduction
#9552
Comments
segmented_bitmask_reduction
segmented_has_nulls
/ has_null_elements(list_view)
/ segmented_bitmask_reduction
I have started on this, and have a draft of splitting out device-side code from |
… split host/device side code for segmented counts. (#9588) This PR is a first step towards #9552. I split the host and device code for `segmented_count_set_bits`, so that APIs with existing indices on the device can call the device function directly. Along the way, I discussed some pain points in the current design of bit counting functions with @jrhemstad and @mythrocks and made the following changes: - Fixed inconsistencies in the behavior of bit counting functions (`count_set_bits` returned zero when `bitmask==nullptr`, but `segmented_count_set_bits` returned segment lengths) by raising an error when `bitmask==nullptr` _(breaking change)_ - Moved all bit counting functions to detail namespaces _(breaking change)_ - Separated "validity" logic from pure bit counting through the introduction of new `valid_count`, `null_count`, `segmented_valid_count`, `segmented_null_count` detail functions. These functions treat all elements as valid when `bitmask==nullptr` and otherwise return the same way as the corresponding bit counting functions. - Split device-side code from host-side code to enable future work on segmented nullmask reductions over lists, with indices (offsets) already on the device. - Updated all calling code (and added tests) to use the new valid/null count detail functions. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Jake Hemstad (https://github.com/jrhemstad) - MithunR (https://github.com/mythrocks) URL: #9588
This issue has been labeled |
This issue has been labeled |
closes #9135 closes #9552 This PR adds support for numeric types to `simple_op`, `sum`, `prod`, `min`, `max`, `any`, `all`. Also, this PR adds `segmented_null_mask_reduction` to compute null mask reductions on segments. Authors: - Michael Wang (https://github.com/isVoid) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Robert (Bobby) Evans (https://github.com/revans2) - Bradley Dice (https://github.com/bdice) - Jake Hemstad (https://github.com/jrhemstad) URL: #9621
Is your feature request related to a problem? Please describe.
Following from libcudf discussions around null support and lists (or segmented columns more generally), we have identified a need for a few new APIs. Many reductions over a
list_view
produce results that should be null if any element in the list (an element in the corresponding segment of the child column) is null.cc: @isVoid @jrhemstad @nvdbaranec, @ttnghia @mythrocks
Describe the solution you'd like
First,
segmented_count_set_bits
should be refactored to separate the section acting on device-side data. This function will do the same thing as the existing host-side function but act on arrays of indices that are already on the device.We propose the addition of the following functions:
segmented_null_count
based onsegmented_count_unset_bits
in Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. #9588segmented_has_nulls
based onsegmented_null_count
(a segment has nulls if the count is greater than 0)has_null_elements(list_column_view)
based onsegmented_has_nulls
segmented_bitmask_reduction
to assemble a bitmask from the results ofsegmented_has_nulls
Additional context
This affects @bdice's work on #9215, @ttnghia's work on #9545, and @isVoid's plan for addressing #9135.
The text was updated successfully, but these errors were encountered: