-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't unnecessarily read string offsets when doing concatenate overflow checking. #8968
Conversation
…n't read the offsets to compute size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q4, q47, q57 are back to wall clock times seen in 21.06 with this change as is (with the last commit). Thanks @nvdbaranec!
rerun tests |
: cudf::detail::get_value<offset_type>( | ||
scv.offsets(), scv.offset() + b.size(), stream) - | ||
cudf::detail::get_value<offset_type>(scv.offsets(), scv.offset(), stream)); | ||
return a + (scv.is_empty() ? 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not urgent but this feels ripe for refactoring to a lambda given we have dual nested ternary statements.
something like
auto computed_length = [](...)
{
....
};
return a + (scv.is_empty() ? 0 : computed_length);
Fixes: #8960
We were always reading string offsets (device->gpu memcpy) during the concatenation overflow checking which was unnecessary when dealing with an unsliced column, resulting in a performance degradation. This fixes that.