-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change make_strings_children to return uvector #15171
Change make_strings_children to return uvector #15171
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great with a non-blocking comment on the function return type.
@@ -84,18 +84,17 @@ auto make_strings_children(SizeAndExecuteFunction size_and_exec_fn, | |||
std::overflow_error); | |||
|
|||
// Now build the chars column | |||
std::unique_ptr<column> chars_column = | |||
create_chars_child_column(static_cast<size_type>(bytes), stream, mr); | |||
rmm::device_uvector<char> chars(bytes, stream, mr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preferably returning a unique pointer of uvector according to the developer guide,
cudf/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md
Lines 185 to 190 in 63e9040
libcudf functions typically take views as input (`column_view` or `table_view`) | |
and produce `unique_ptr`s to owning objects as output. For example, | |
```c++ | |
std::unique_ptr<table> sort(table_view const& input); | |
``` |
I also noticed #14202 changed the input argument from unique ptr to rvalue reference for make_strings_column
, which makes the ownership information obscure IMO.
for_each_fn(size_and_exec_fn); | ||
} | ||
|
||
return std::pair(std::move(offsets_column), std::move(chars_column)); | ||
return std::pair(std::move(offsets_column), std::move(chars)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return std::pair(std::move(offsets_column), std::move(chars)); | |
return std::pair(std::move(offsets_column), chars); |
nit: no need to move
if it's just a uvector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the std::move
the uvector is copied.
https://godbolt.org/z/K1sTfP1na
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought copy elision could happen automatically with C++17 based on multiple discussions at our cpp channel and also StackOverflow and then by taking a closer look at your sample code, I realized that returning a pair makes a difference (see here).
/merge |
Description
Changes the
cudf::strings::detail::make_strings_children
utility to return armm::device_uvector<char>
instead of a chars column. This further helps enable large strings support by not storing chars in a column.This is an internal utility and so is non-breaking for any public APIs.
Checklist