[FEA] Replace nvstrdesc with string_view #5682
Comments
I would not recommend using cudf/cpp/src/strings/utilities.cu Lines 65 to 68 in ec4b7b6
This works but is kind of hacky -- it uses a string_view instance merely as a pair container for two values (pointer and length).
This implementation should probably have been removed or fixed as part of #4548 but was missed. I would recommend using the following API (which seems to be lacking documentation): cudf/cpp/include/cudf/strings/detail/strings_column_factories.cuh Lines 34 to 39 in ec4b7b6
This takes iterators over types like |
@davidwendt the API you suggested seems to create a strings column from pairs of a pointer and an integer. What |
Yes, my mistake. If the intention is to encode nullness into the data vector (instead of in a separate bitmask), then I would recommend using a The API I (wrongly) suggested converts a vector of pairs into a strings column. |
I don't believe At the very least, we could consolidate the string comparison code located here: cudf/cpp/include/cudf/strings/string_view.inl Lines 263 to 277 in ce826c5
cudf/cpp/src/io/utilities/block_utils.cuh Lines 196 to 228 in ce826c5
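To make the consolidation idea concrete, here is a minimal host-side sketch, not the code at either location referenced above. It assumes the duplicated helpers all reduce to one three-way byte comparison; `compare_bytes`, `is_lesser`, `is_greater` and `is_equal` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical helper: three-way comparison of two byte strings.
// Returns <0, 0, or >0. When one string is a prefix of the other,
// the shorter one orders first.
inline int compare_bytes(const char* a, std::size_t a_len, const char* b, std::size_t b_len)
{
  const std::size_t n = std::min(a_len, b_len);
  // Skip memcmp for empty inputs so null pointers are never dereferenced.
  const int cmp = (n == 0) ? 0 : std::memcmp(a, b, n);
  if (cmp != 0) { return cmp; }
  if (a_len == b_len) { return 0; }
  return a_len < b_len ? -1 : 1;
}

// Every relational helper becomes a one-liner over the single comparison.
inline bool is_lesser(const char* a, std::size_t a_len, const char* b, std::size_t b_len)
{
  return compare_bytes(a, a_len, b, b_len) < 0;
}

inline bool is_greater(const char* a, std::size_t a_len, const char* b, std::size_t b_len)
{
  return compare_bytes(a, a_len, b, b_len) > 0;
}

inline bool is_equal(const char* a, std::size_t a_len, const char* b, std::size_t b_len)
{
  return compare_bytes(a, a_len, b, b_len) == 0;
}
```

Keeping one comparison routine means the ordering rule (bytewise, shorter prefix first) is defined in a single place instead of once per caller.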
This PR adds column_device_view members to EncChunk, DictionaryChunk and StripeDictionary structures which are used in the ORC writer. The idea is to replace members in these structures which replicate the same information. Usage of nvstrdesc_s has also been eliminated in the ORC writer.

Fixes #7347, Addresses #5682, Addresses #7334

Authors:
- Kumar Aatish (@kaatish)

Approvers:
- Vukasin Milovanovic (@vuule)
- Devavret Makkar (@devavret)

URL: #7676
Fixes #5682.

- Structure `nvstrdesc_s` was replaced with `thrust::pair<const char*, size_type>`.
- `nvstrdesc_s` related logical functions such as `nvstr_is_lesser`, `nvstr_is_greater` etc. were removed.
- Include directives for headers included by source files residing in the same directory were made relative as per the developer guide.
- `make_column` function related to `column_buffer` was moved from a header file to an implementation file.

Authors:
- Kumar Aatish (https://github.com/kaatish)

Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
- https://github.com/nvdbaranec
- Devavret Makkar (https://github.com/devavret)
- Keith Kraus (https://github.com/kkraus14)

URL: #7841
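For readers unfamiliar with the pair representation, here is a minimal host-side sketch of how a pointer/size pair can stand in for a string descriptor while also encoding nullness in the data itself. It is not the code from the PR: `std::pair` replaces `thrust::pair` so it compiles without CUDA, the alias `string_pair` and the helpers `pair_to_view` and `count_nulls` are hypothetical, and treating a null pointer as a null row is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <utility>
#include <vector>

using size_type = std::int32_t;  // 32-bit signed row/size type, as in cudf

// Host-side stand-in for thrust::pair<const char*, size_type>.
using string_pair = std::pair<const char*, size_type>;

// Assumption of this sketch: a null pointer marks a null row, while a
// non-null pointer with size 0 is a valid empty string.
inline std::optional<std::string_view> pair_to_view(string_pair p)
{
  if (p.first == nullptr) { return std::nullopt; }
  return std::string_view{p.first, static_cast<std::size_t>(p.second)};
}

// Count rows that the pair encoding marks as null.
inline std::size_t count_nulls(const std::vector<string_pair>& rows)
{
  std::size_t nulls = 0;
  for (const auto& row : rows) {
    if (!pair_to_view(row).has_value()) { ++nulls; }
  }
  return nulls;
}
```

With this encoding no separate validity bitmask is needed to tell a null row from an empty string, at the cost of checking the pointer on every access.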
Is your feature request related to a problem? Please describe.
The struct `nvstrdesc_s` used in cuIO is similar to `string_view`. They both contain a pointer to the char array and the size of the string. (`string_view` contains only 2 additional members.) `string_view` has the necessary methods to perform operations expected of `nvstrdesc_s`.
cudf/cpp/src/io/parquet/parquet_gpu.h
Lines 39 to 42 in ec4b7b6
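The similarity can be sketched on the host as follows. This is only an illustration under stated assumptions: `str_desc` is a hypothetical struct mirroring the pointer-plus-size shape described above (its field names are assumed, not copied from `parquet_gpu.h`), and `std::string_view` stands in for the cudf `string_view` so the example builds without CUDA.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical descriptor with the shape described above:
// a pointer to the characters plus their size in bytes.
struct str_desc {
  const char* ptr;
  std::size_t count;
};

// Viewing the same bytes through a string view needs no copy,
// only the pointer and the size the descriptor already holds.
inline std::string_view desc_to_view(str_desc d)
{
  return std::string_view{d.ptr, d.count};
}

// Operations that the descriptor needed dedicated helpers for
// come with the view type, for example ordering.
inline bool desc_less(str_desc a, str_desc b)
{
  return desc_to_view(a) < desc_to_view(b);
}
```

Because the conversion is a plain field-by-field mapping, call sites can switch to the view type incrementally rather than all at once.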
Describe the solution you'd like
For all purposes, `nvstrdesc_s` can be replaced with `string_view`. Doing this will be a multi-step process.

- `stats_column_desc` members in favor of `column_device_view`
- `stats_column_desc` members in favor of `column_device_view`
- `nvstrdesc_s` and use `column_device_view` instead of members of `stats_column_desc` that replicate the same information
- `stringdata_to_nvstrdesc()` in all the writers.