Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Construct column_view from rmm::device_uvector and device_span #9656

Closed
ttnghia opened this issue Nov 10, 2021 · 5 comments · Fixed by #10302
Closed

[FEA] Construct column_view from rmm::device_uvector and device_span #9656

ttnghia opened this issue Nov 10, 2021 · 5 comments · Fixed by #10302
Labels
feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change

Comments

@ttnghia
Copy link
Contributor

ttnghia commented Nov 10, 2021

Sometimes, we need to call some API that only accepts column_view input but we only have data stored in rmm::device_uvector (which can implicitly casted into device_span). As a result, we need to construct a column_view from rmm::device_uvector manually like this, in many places:

 auto gather_map = column_view(
      cudf::data_type{cudf::type_to_id<cudf::size_type>()}, output_size, indices.data());

This is lengthy and error-prone.

A better solution is to add a utility API to do that automatically. Internally, that utility may call type_dispatcher to construct the output column_view depending on the input type.

@ttnghia ttnghia added feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Nov 10, 2021
@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 10, 2021

Tag @jrhemstad and @davidwendt as this idea is from you guys and it's great if you can work on it 😄

@jrhemstad
Copy link
Contributor

A better solution is to add a utility API to do that automatically. Internally, that utility may call type_dispatcher to construct the output column_view depending on the input type.

There's no need for a type dispatch here because you already know the type from the device_span.

@davidwendt
Copy link
Contributor

Make sure you really need this. If you doing this for calling cudf::detail::gather then you can just pass the device_uvector directly to

std::unique_ptr<table> gather(
table_view const& source_table,
device_span<size_type const> const gather_map,
out_of_bounds_policy bounds_policy,
negative_index_policy neg_indices,
rmm::cuda_stream_view stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
which will automatically convert the device_uvector to a device_span.

Example:

auto gather_map = rmm::device_uvector<size_type>{static_cast<std::size_t>(size), stream};
auto const gather_map_end = thrust::copy_if(rmm::exec_policy(stream),
thrust::make_counting_iterator(size_type{0}),
thrust::make_counting_iterator(size_type{size}),
gather_map.begin(),
is_left);
gather_map.resize(thrust::distance(gather_map.begin(), gather_map_end), stream);
auto const scatter_src_lhs = cudf::detail::gather(table_view{std::vector<column_view>{lhs}},
gather_map,
out_of_bounds_policy::DONT_CHECK,
negative_index_policy::NOT_ALLOWED,
stream);

@jrhemstad
Copy link
Contributor

I think there are plenty of other places where people are constructing a column or column_view from a device_uvector that make this an important and useful thing.

@github-actions
Copy link

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

rapids-bot bot pushed a commit that referenced this issue Feb 17, 2022
This PR adds an implicit conversion operator from `column_view` to `device_span<T const>`. The immediate purpose of this PR is to make it possible to use the API `segmented_reduce(column_view data, device_span<size_type> offsets, ...)` in PR #9621.

This PR also resolves #9656 by adding a `column_view` constructor from `device_span<T const>`.

More broadly, this PR should make it easier to refactor instances where `column.data()` is used with counting iterators to build transform iterators, or other patterns that require a length (e.g. vector factories to copy to host).

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Jake Hemstad (https://github.com/jrhemstad)
  - David Wendt (https://github.com/davidwendt)

URL: #10302
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants