-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use grid_1d utilities in copy_range.cuh #17409
Use grid_1d utilities in copy_range.cuh #17409
Conversation
@@ -67,15 +67,15 @@ CUDF_KERNEL void offset_bitmask_binop(Binop op, | |||
size_type source_size_bits, | |||
size_type* count_ptr) | |||
{ | |||
auto const tid = threadIdx.x + blockIdx.x * blockDim.x; | |||
auto const tid = cudf::detail::grid_1d::global_thread_id(); | |||
|
|||
auto const last_bit_index = source_size_bits - 1; | |||
auto const last_word_index = cudf::word_index(last_bit_index); | |||
|
|||
size_type thread_count = 0; | |||
|
|||
for (size_type destination_word_index = tid; destination_word_index < destination.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does tid
need to be a thread_index_type
? Or do we assume that it's sufficient to let this be size_type
because it's a nullmask and thus we only have to worry about a max of size_type
bits, leading to (2^31 / 32 = 2^26) as the max possible word index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I have thought about this before because I tried to refactor this kernel to use safe thread types, and gave up due to this possibility being a distraction.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with leaving this as-is and not worrying about that possibility, as long as we agree the status quo is sufficiently safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree. I was partly future-proofing against size_type
but mostly trying to keep the overflow-checking robots at bay.
/merge |
Description
Use the
grid_1d
utilities to manage thread and stride calculations in thecopy_range.cuh
kernels.Checklist