Skip to content

Commit

Permalink
Fix stream usage in segmented_gather() (#9679)
Browse files Browse the repository at this point in the history
`detail::segmented_gather()` inadvertently uses `cuda_default_stream` in some parts of its implementation, while using the user-specified stream in others.

This applies to the calls to `copy_range_in_place()`, `allocate_like()`, and `make_lists_column()`. ~This might produce race conditions, which might explain NVIDIA/spark-rapids/issues/4060. It's a rare failure that's quite hard to reproduce.~ This might lead to over-synchronization, though bad output is unlikely.

The commit here should sort this out, by switching to the `detail` APIs corresponding to the calls above.

Authors:
  - MithunR (https://github.com/mythrocks)

Approvers:
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Nghia Truong (https://github.com/ttnghia)
  - Karthikeyan (https://github.com/karthikeyann)

URL: #9679
  • Loading branch information
mythrocks authored Dec 2, 2021
1 parent 1077dae commit 50acf07
Show file tree
Hide file tree
Showing 2 changed files with 13 additions and 10 deletions.
21 changes: 12 additions & 9 deletions cpp/src/lists/copying/segmented_gather.cu
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,8 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <cudf/detail/copy_range.cuh>
#include <cudf/detail/gather.cuh>
#include <cudf/detail/gather.hpp>
#include <cudf/detail/indexalator.cuh>
#include <cudf/detail/iterator.cuh>
#include <cudf/detail/null_mask.hpp>
Expand Down Expand Up @@ -88,22 +88,25 @@ std::unique_ptr<column> segmented_gather(lists_column_view const& value_column,
auto child = std::move(child_table->release().front());

// Create list offsets from gather_map.
auto output_offset = cudf::allocate_like(
gather_map.offsets(), gather_map.size() + 1, mask_allocation_policy::RETAIN, mr);
auto output_offset = cudf::detail::allocate_like(
gather_map.offsets(), gather_map.size() + 1, mask_allocation_policy::RETAIN, stream, mr);
auto output_offset_view = output_offset->mutable_view();
cudf::copy_range_in_place(gather_map.offsets(),
output_offset_view,
gather_map.offset(),
gather_map.offset() + output_offset_view.size(),
0);
cudf::detail::copy_range_in_place(gather_map.offsets(),
output_offset_view,
gather_map.offset(),
gather_map.offset() + output_offset_view.size(),
0,
stream);
// Assemble list column & return
auto null_mask = cudf::detail::copy_bitmask(value_column.parent(), stream, mr);
size_type null_count = value_column.null_count();
return make_lists_column(gather_map.size(),
std::move(output_offset),
std::move(child),
null_count,
std::move(null_mask));
std::move(null_mask),
stream,
mr);
}

} // namespace detail
Expand Down
2 changes: 1 addition & 1 deletion cpp/src/lists/extract.cu
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ std::unique_ptr<cudf::column> make_index_child(column_view const& indices,
// `segmented_gather()` on a null index should produce a null row.
if (not indices.nullable()) { return std::make_unique<column>(indices, stream); }

auto const d_indices = column_device_view::create(indices);
auto const d_indices = column_device_view::create(indices, stream);
// Replace null indices with MAX_SIZE_TYPE, so that gather() returns null for them.
auto const null_replaced_iter_begin =
cudf::detail::make_null_replacement_iterator(*d_indices, std::numeric_limits<size_type>::max());
Expand Down

0 comments on commit 50acf07

Please sign in to comment.