Don't unnecessarily read string offsets when doing concatenate overflow checking. #8968

Merged: 2 commits, Aug 6, 2021

@nvdbaranec nvdbaranec commented Aug 5, 2021

Fixes: #8960

The concatenation overflow check always read the string offsets, which requires a device-to-host memcpy. That read is unnecessary for an unsliced column, and it caused a performance regression. This PR skips the read in that case.
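
To illustrate the idea, here is a minimal host-only sketch, not the libcudf code. `MockStringsColumn`, `read_offset`, and `chars_in_view` are hypothetical names; `read_offset` stands in for a single-element device read such as `cudf::detail::get_value`, and the sketch assumes the total character count of the parent column is known on the host without a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side stand-in for a strings column view; not the libcudf API.
struct MockStringsColumn {
  std::vector<std::int32_t> offsets;  // offsets of the parent column (row count + 1 entries)
  std::size_t first        = 0;       // index of the first row of the view
  std::size_t rows         = 0;       // number of rows in the view
  std::int32_t total_chars = 0;       // assumed to be known on the host without a copy
  mutable int reads        = 0;       // counts simulated device-to-host copies

  // Stands in for a single-element device read (each call models one memcpy).
  std::int32_t read_offset(std::size_t i) const
  {
    assert(i < offsets.size());
    ++reads;
    return offsets[i];
  }
};

// Number of characters covered by the view, reading offsets only when the view is sliced.
std::int64_t chars_in_view(MockStringsColumn const& col)
{
  if (col.rows == 0) { return 0; }
  bool const unsliced = col.first == 0 && col.rows + 1 == col.offsets.size();
  if (unsliced) { return col.total_chars; }  // whole column: no offset read needed
  return col.read_offset(col.first + col.rows) - col.read_offset(col.first);
}
```

With offsets `{0, 3, 5, 9}`, a view over all three rows takes the unsliced branch and leaves `reads` at zero, while a view over the last two rows performs two simulated reads.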

@nvdbaranec added the labels libcudf (Affects libcudf (C++/CUDA) code), improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) on Aug 5, 2021
@nvdbaranec requested a review from a team as a code owner on August 5, 2021 16:31
@nvdbaranec requested review from robertmaynard and codereport and removed the request for a team on August 5, 2021 16:31
@nvdbaranec added the label 5 - DO NOT MERGE (Hold off on merging; see PR for details) on Aug 5, 2021

@abellina abellina left a comment

q4, q47, q57 are back to wall clock times seen in 21.06 with this change as is (with the last commit). Thanks @nvdbaranec!

@nvdbaranec removed the label 5 - DO NOT MERGE (Hold off on merging; see PR for details) on Aug 5, 2021
@nvdbaranec

rerun tests

- : cudf::detail::get_value<offset_type>(
- scv.offsets(), scv.offset() + b.size(), stream) -
- cudf::detail::get_value<offset_type>(scv.offsets(), scv.offset(), stream));
+ return a + (scv.is_empty() ? 0

Not urgent, but this feels ripe for refactoring into a lambda, given that we have doubly nested ternary expressions.

Something like:

auto computed_length = [](...)
{
....
};
return a + (scv.is_empty() ? 0 : computed_length(...));
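
For concreteness, a self-contained sketch of that shape follows. It is only an illustration under assumptions: `View` and `add_chars` are hypothetical host-side stand-ins rather than libcudf types, and the offsets are plain host memory, so no device reads are modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the column view in the snippet above; not the libcudf type.
struct View {
  std::vector<std::int32_t> offsets;  // parent offsets, kept on the host for illustration
  std::size_t offset = 0;             // first row of the view
  std::size_t size   = 0;             // rows in the view
  bool is_empty() const { return size == 0; }
};

// Adds the character count of `scv` to the running total `a`.
std::int64_t add_chars(std::int64_t a, View const& scv)
{
  // The lambda replaces the inner ternaries with early returns, one per case.
  auto computed_length = [&]() -> std::int64_t {
    assert(scv.offset + scv.size < scv.offsets.size());
    bool const unsliced = scv.offset == 0 && scv.size + 1 == scv.offsets.size();
    // Unsliced: the last offset is the total length (assumes offsets start at 0).
    if (unsliced) { return scv.offsets.back(); }
    // Sliced: difference between the offsets at the end and start of the view.
    return scv.offsets[scv.offset + scv.size] - scv.offsets[scv.offset];
  };
  return a + (scv.is_empty() ? 0 : computed_length());
}
```

Only the outer empty check stays a ternary, and the lambda is invoked only on the non-empty path, so the assertion never sees an empty view.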

Labels
improvement (Improvement / enhancement to an existing function), libcudf (Affects libcudf (C++/CUDA) code), non-breaking (Non-breaking change)
Development

Successfully merging this pull request may close these issues.

[BUG] 50% performance regression in concatenate for string columns observed in 21.08
6 participants