Use grid_1d utilities in copy_range.cuh #17409

davidwendt · 2024-11-21T22:09:17Z

Description

Use the grid_1d utilities to manage thread and stride calculations in the copy_range.cuh kernels.
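As a rough illustration of why the index calculation is centralized, the sketch below models the two ways of computing a global thread index on the host. It is not libcudf code: `launch_coords`, `narrow_thread_id`, and `wide_thread_id` are hypothetical names introduced here, and the assumption is that the grid_1d helpers widen to a 64-bit `thread_index_type` before multiplying, while the raw `threadIdx.x + blockIdx.x * blockDim.x` expression is evaluated in 32-bit unsigned arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side stand-ins for the CUDA built-ins threadIdx.x,
// blockIdx.x and blockDim.x, which are 32-bit unsigned values.
struct launch_coords {
  std::uint32_t thread_idx;
  std::uint32_t block_idx;
  std::uint32_t block_dim;
};

// The hand-written pattern: all operands are 32-bit unsigned, so the
// product wraps modulo 2^32 once block_idx * block_dim reaches 2^32.
constexpr std::uint32_t narrow_thread_id(launch_coords c)
{
  return c.thread_idx + c.block_idx * c.block_dim;
}

// Assumed shape of a safe helper: widen every operand to 64 bits first,
// so the product cannot wrap for any legal launch configuration.
constexpr std::int64_t wide_thread_id(launch_coords c)
{
  return static_cast<std::int64_t>(c.thread_idx) +
         static_cast<std::int64_t>(c.block_idx) * static_cast<std::int64_t>(c.block_dim);
}

// Small grids: both forms agree.
static_assert(narrow_thread_id(launch_coords{3u, 2u, 256u}) == 515u);
static_assert(wide_thread_id(launch_coords{3u, 2u, 256u}) == 515);

// Block 2^24 with 256 threads per block: the 32-bit product wraps to 0.
static_assert(narrow_thread_id(launch_coords{5u, 16777216u, 256u}) == 5u);
static_assert(wide_thread_id(launch_coords{5u, 16777216u, 256u}) == 4294967301LL);
```

The two forms only diverge for very large grids, which is why the difference rarely shows up in tests.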

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

bdice · 2024-11-22T22:02:00Z

cpp/include/cudf/detail/null_mask.cuh

@@ -67,15 +67,15 @@ CUDF_KERNEL void offset_bitmask_binop(Binop op,
                                      size_type source_size_bits,
                                      size_type* count_ptr)
 {
-  auto const tid = threadIdx.x + blockIdx.x * blockDim.x;
+  auto const tid = cudf::detail::grid_1d::global_thread_id();

  auto const last_bit_index  = source_size_bits - 1;
  auto const last_word_index = cudf::word_index(last_bit_index);

  size_type thread_count = 0;

  for (size_type destination_word_index = tid; destination_word_index < destination.size();


Does tid need to be a thread_index_type? Or do we assume that it's sufficient to let this be size_type because it's a nullmask and thus we only have to worry about a max of size_type bits, leading to (2^31 / 32 = 2^26) as the max possible word index?

(I have thought about this before because I tried to refactor this kernel to use safe thread types, and gave up due to this possibility being a distraction.)

I am okay with leaving this as-is and not worrying about that possibility, as long as we agree the status quo is sufficiently safe.

davidwendt · 2024-11-22

Yes, I agree. I was partly future-proofing against size_type but mostly trying to keep the overflow-checking robots at bay.
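The bound discussed in this thread can be checked with a few lines of host-side arithmetic. This is only a sketch under stated assumptions: `size_type` is taken to be a 32-bit signed integer and bitmask words to be 32 bits wide, and `max_word_count` and `max_safe_stride` are hypothetical helpers, not libcudf functions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using size_type = std::int32_t;          // assumption: mirrors cudf::size_type
constexpr std::int64_t bits_per_word = 32;  // assumption: 32-bit bitmask words

// Largest number of bitmask words needed for a column with the maximum
// number of rows (2^31 - 1 bits), rounded up to whole words.
constexpr std::int64_t max_word_count()
{
  std::int64_t const max_bits = std::numeric_limits<size_type>::max();
  return (max_bits + bits_per_word - 1) / bits_per_word;
}

// In a loop of the form `for (i = tid; i < words; i += stride)`, the last
// in-range index is at most words - 1, so `i += stride` stays representable
// in size_type as long as stride does not exceed this value.
constexpr std::int64_t max_safe_stride()
{
  return std::numeric_limits<size_type>::max() - (max_word_count() - 1);
}

static_assert(max_word_count() == (std::int64_t{1} << 26));
static_assert(max_safe_stride() == (std::int64_t{1} << 31) - (std::int64_t{1} << 26));
```

Under these assumptions the word index tops out at 2^26 - 1, leaving roughly 2^31 - 2^26 of headroom for the grid stride before a `size_type` loop counter could overflow.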

davidwendt · 2024-12-04T13:22:32Z

/merge

Use grid_1d utilities in copy_range.cuh

8696b92

davidwendt added the 2 - In Progress, libcudf, improvement, and non-breaking labels Nov 21, 2024

davidwendt self-assigned this Nov 21, 2024

davidwendt added the 3 - Ready for Review label and removed the 2 - In Progress label Nov 22, 2024

davidwendt marked this pull request as ready for review November 22, 2024 21:21

davidwendt requested a review from a team as a code owner November 22, 2024 21:21

davidwendt requested review from vyasr and mythrocks November 22, 2024 21:21

bdice reviewed Nov 22, 2024

bdice approved these changes Nov 25, 2024

GregoryKimball mentioned this pull request Nov 25, 2024

Prevent grid stride loop overflow in libcudf kernels #10368

Open

ttnghia approved these changes Dec 1, 2024

mythrocks approved these changes Dec 3, 2024

rapids-bot bot merged commit 1b01df3 into rapidsai:branch-25.02 Dec 4, 2024
104 checks passed

davidwendt deleted the blockdim-copyrange branch December 4, 2024 13:26
