Enable round-tripping of large strings in cudf
#15944
I do not think we need to check cudf::strings::detail::is_large_strings_enabled()
when creating Arrow LargeString objects.
cpp/src/interop/to_arrow.cu
Outdated
} else {
  return std::make_shared<arrow::StringArray>(
    0, std::move(tmp_offset_buffer), std::move(tmp_data_buffer));
}
This should not be needed. I see no reason to create a separate empty LargeStringArray.
Co-authored-by: David Wendt <[email protected]>
Would it be possible to add gtests for this?
cpp/src/interop/from_arrow.cu
Outdated
auto chars_column = dispatch_to_cudf_column{}.operator()<int8_t>(
  *char_array, data_type(type_id::INT8), true, stream, mr);
This needs to be updated to use an rmm::device_buffer, since making this a column will trigger the size_type limit.
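To illustrate the limit mentioned here, below is a minimal host-only sketch, not the PR's code. It assumes only that cudf::size_type is a 32-bit signed integer; the helpers make_offsets and exceeds_size_type are hypothetical names introduced for this example. It builds 64-bit offsets from per-string byte lengths and checks whether the total character count still fits in size_type, which is the situation where a chars column could no longer represent the data and a raw byte buffer would be needed instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: stands in for cudf::size_type, a 32-bit signed integer.
using size_type = std::int32_t;

// Hypothetical helper: build 64-bit offsets from per-string byte lengths.
// offsets[i] is the starting byte of string i; offsets.back() is the total.
std::vector<std::int64_t> make_offsets(std::vector<std::int64_t> const& lengths)
{
  std::vector<std::int64_t> offsets(lengths.size() + 1, 0);
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    assert(lengths[i] >= 0);  // byte lengths are never negative
    offsets[i + 1] = offsets[i] + lengths[i];
  }
  return offsets;
}

// Hypothetical helper: true when the total number of bytes cannot be
// counted by size_type, so the characters cannot live in a column whose
// row count is a size_type.
bool exceeds_size_type(std::vector<std::int64_t> const& offsets)
{
  assert(!offsets.empty());  // offsets always hold at least the leading 0
  return offsets.back() >
         static_cast<std::int64_t>(std::numeric_limits<size_type>::max());
}
```

Two strings of 2,000,000,000 bytes each already exceed the limit here, even though each string on its own fits comfortably within 32 bits.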
Approving Python changes, some tiny C++ suggestions.
Co-authored-by: Lawrence Mitchell <[email protected]>
…into arrow_interop
Co-authored-by: David Wendt <[email protected]>
/merge
Description
Fixes: #15922
This PR adds support for round-tripping LargeStringArray in cudf using 64-bit offsets.
Checklist