cudf::row_bit_count() support. #7534
This looks good. I really appreciate the detailed, well formatted comments.
shmem_per_thread != 0
  ? std::min(max_block_size, shmem_limit_per_block / static_cast<int>(shmem_per_thread))
  : max_block_size;
auto const shared_mem_size = shmem_per_thread * block_size;
This looks like a cool utility. Maybe consider adding it to cudf/detail/utilities/cuda.cuh.
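For readers following the suggestion above, here is a rough sketch of what such a standalone utility might look like. The name `pick_block_size`, the `block_config` struct, and the signature are all hypothetical and not part of cudf. The quoted snippet is missing its left-hand side in this excerpt, so the sketch assumes the conditional expression is what initializes `block_size`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical result type (not a cudf type): the chosen block size and the
// total dynamic shared memory that block size requires.
struct block_config {
  int block_size;
  std::size_t shared_mem_size;
};

// Hypothetical helper (name and signature are assumptions, not cudf API).
// Caps the block size so that `shmem_per_thread` bytes for every thread
// still fit within `shmem_limit_per_block`.
inline block_config pick_block_size(int max_block_size,
                                    int shmem_limit_per_block,
                                    std::size_t shmem_per_thread)
{
  assert(max_block_size > 0 && shmem_limit_per_block >= 0);
  int const block_size =
    shmem_per_thread != 0
      ? std::min(max_block_size, shmem_limit_per_block / static_cast<int>(shmem_per_thread))
      : max_block_size;
  std::size_t const shared_mem_size = shmem_per_thread * block_size;
  return {block_size, shared_mem_size};
}
```

Note that if `shmem_per_thread` exceeds `shmem_limit_per_block`, this sketch yields a block size of 0, which a real utility would probably need to reject or handle.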
Some top-level doc suggestions
@gpucibot merge
Closes #7408
Some notes:
There are some limitations on what this computes, specifically regarding lists or strings embedded inside structs that have null masks. I've added some documentation for this. @jlowe @revans2 This could be made to handle that case properly but it would incur a fairly significant performance cost, and likely would require a large amount of temporary memory.
I made some modifications to the test::print() code for lists and structs to be a little more clear when displaying null masks.
The structure of flatten_functor and flatten_hierarchy will probably raise some eyebrows. These functions return 3 separate pieces of data and rather than trying to cram them awkwardly through as actual return values, they are passed by reference.
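To illustrate the pass-by-reference pattern described above, here is a minimal sketch of a recursive flatten that fills three outputs through reference parameters. Everything in it is hypothetical: `node` is a stand-in for a nested column, `flatten_sketch` is not the PR's `flatten_hierarchy`, and the three outputs chosen here (ids, depths, maximum depth) are assumptions, since the description does not say which three pieces of data the real functions produce.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a nested column (not a cudf type).
struct node {
  int id;
  std::vector<node> children;
};

// Hypothetical sketch: walks the hierarchy depth-first and appends to three
// caller-owned outputs instead of returning a tuple from every recursive call.
void flatten_sketch(node const& n,
                    int depth,
                    std::vector<int>& out_ids,     // assumed output 1: visit order
                    std::vector<int>& out_depths,  // assumed output 2: depth per node
                    int& out_max_depth)            // assumed output 3: deepest level seen
{
  assert(depth >= 0);
  out_ids.push_back(n.id);
  out_depths.push_back(depth);
  out_max_depth = std::max(out_max_depth, depth);
  for (auto const& child : n.children) {
    flatten_sketch(child, depth + 1, out_ids, out_depths, out_max_depth);
  }
}
```

One possible reason to prefer this shape in a recursive traversal is that every level appends to the same accumulators, so there are no intermediate results to merge on the way back up.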