Fix invalid use of std::exclusive_scan in Parquet writer #13434
Pull requests from external contributors require approval from a
/ok to test
cpp/src/io/parquet/writer_impl.cu
Outdated
@@ -1710,10 +1710,10 @@ auto convert_table_to_parquet_data(table_input_metadata& table_meta,
size_type const total_frags = [&]() {
if (frags_per_column.size() > 0) {
std::exclusive_scan(frags_per_column.data(),
- frags_per_column.data() + num_columns + 1,
+ frags_per_column.data() + num_columns,
Let's use frags_per_column.begin(), frags_per_column.end().
Also please avoid back_inserter if possible. So this should be better:
std::vector<size_type> frag_offsets(num_columns, 0);
...
std::exclusive_scan(frags_per_column.begin(), frags_per_column.end(), frag_offsets.begin(), 0);
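For reference, the suggestion above can be written as a standalone function. This is only a sketch under assumptions: `size_type` is aliased to `std::int32_t` as a stand-in for the project's own type, and the helper name `fragment_offsets` is hypothetical, not something from the writer code. The output vector is sized up front and the scan writes through a plain iterator, so the input range is exactly `[begin, end)` and nothing past the last column is read.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Assumption: stand-in for the project's size_type alias.
using size_type = std::int32_t;

// Hypothetical helper: returns, for each column, the index of its first fragment.
std::vector<size_type> fragment_offsets(std::vector<size_type> const& frags_per_column)
{
  // Pre-size the output so the scan can write through a plain iterator.
  std::vector<size_type> frag_offsets(frags_per_column.size(), 0);
  // Exclusive scan: element i receives the sum of inputs [0, i).
  std::exclusive_scan(frags_per_column.begin(),
                      frags_per_column.end(),
                      frag_offsets.begin(),
                      size_type{0});
  return frag_offsets;
}
```

With per-column counts `{3, 1, 4}` the offsets are `{0, 3, 4}`. Because the scan is exclusive, the total fragment count is not part of the output; it would be the last offset plus the last count.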
much cleaner. thanks!
AFAIK, the canonical way is to reserve the final size and use back_inserter, so we never reallocate and never default-initialize elements. It is unfortunately also the most verbose option 🤷‍♂️
Not a request to apply this here, just an FYI (or maybe an FMI, if @ttnghia knows of a back_inserter drawback).
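A sketch of the reserve-plus-back_inserter variant described in this comment, under the same assumptions as any standalone version of this code (`size_type` aliased to `std::int32_t`, hypothetical helper name `fragment_offsets_reserved`). The vector starts empty, `reserve` allocates the final capacity once, and each scanned value is appended, so no element is written twice and no reallocation should occur.

```cpp
#include <cstdint>
#include <iterator>
#include <numeric>
#include <vector>

// Assumption: stand-in for the project's size_type alias.
using size_type = std::int32_t;

// Hypothetical helper: same result as a pre-sized output, built by appending.
std::vector<size_type> fragment_offsets_reserved(std::vector<size_type> const& frags_per_column)
{
  std::vector<size_type> frag_offsets;
  // One allocation for the final size; size() stays 0 until elements are appended.
  frag_offsets.reserve(frags_per_column.size());
  // back_inserter calls push_back for every value the scan produces.
  std::exclusive_scan(frags_per_column.begin(),
                      frags_per_column.end(),
                      std::back_inserter(frag_offsets),
                      size_type{0});
  return frag_offsets;
}
```

The trade-off is verbosity: two statements to set up the output instead of one, in exchange for skipping the initial zero fill.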
I like that. I wouldn't mind changing to that just so it's in my muscle memory :)
Co-authored-by: Nghia Truong <[email protected]>
/ok to test
/ok to test
/merge
Description
Fixes #13431
Checklist