Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of slice_strings for long strings #13057

Merged
merged 17 commits into from
Apr 18, 2023

Conversation

davidwendt
Copy link
Contributor

@davidwendt davidwendt commented Apr 4, 2023

Description

Improves on performance for longer strings with cudf::strings::slice_strings() API. The cudf::string_view::substr was reworked to minimize counting characters and the gather version of make_strings_children is used to build the resulting strings column. This version is already optimized for small and large strings.

Additionally, the code was refactored so the common case of step==1 and start < stop can also make use of the gather approach. Common code was also grouped closer together to help navigate the source file better.

The slice.cpp benchmark was updated to better measure large strings with comparable slice boundaries. The benchmark showed performance improvement was up to 9x for larger strings with no significant degradation for smaller strings.

Reference #13048 and #12445

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. strings strings issues (C++ and Python) improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 4, 2023
@davidwendt davidwendt self-assigned this Apr 4, 2023
@davidwendt
Copy link
Contributor Author

From slice benchmarks

rows width baseline new x
4096 32 0.117 0.132 0.9
4096 64 0.142 0.137 1.0
4096 128 0.199 0.148 1.3
4096 256 0.330 0.176 1.9
4096 512 0.603 0.233 2.6
4096 1024 1.26 0.319 3.9
4096 2048 2.64 0.553 4.8
4096 4096 5.55 1.02 5.4
4096 8192 12.4 2.54 4.9
32768 32 0.119 0.135 0.9
32768 64 0.147 0.141 1.0
32768 128 0.210 0.156 1.3
32768 256 0.363 0.189 1.9
32768 512 0.669 0.271 2.5
32768 1024 1.50 0.390 3.8
32768 2048 3.56 0.850 4.2
32768 4096 7.46 1.63 4.6
32768 8192 14.8 3.14 4.7
262144 32 0.190 0.171 1.1
262144 64 0.324 0.199 1.6
262144 128 1.38 0.288 4.8
262144 256 7.38 1.47 5.0
262144 512 22.7 5.18 4.4
262144 1024 52.9 5.80 9.1
262144 2048 139 18.9 7.4
262144 4096 293 39.2 7.5
2097152 32 0.780 0.479 1.6
2097152 64 1.72 0.597 2.9
2097152 128 8.54 1.10 7.8
2097152 256 53.2 7.34 7.2
2097152 512 160 32.4 4.9
16777216 32 5.75 2.82 2.0
16777216 64 13.1 3.74 3.5

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Apr 11, 2023
@davidwendt davidwendt marked this pull request as ready for review April 11, 2023 16:35
@davidwendt davidwendt requested a review from a team as a code owner April 11, 2023 16:35
cpp/src/strings/slice.cu Outdated Show resolved Hide resolved
cpp/src/strings/slice.cu Outdated Show resolved Hide resolved
cpp/src/strings/slice.cu Outdated Show resolved Hide resolved
@davidwendt
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit feea040 into rapidsai:branch-23.06 Apr 18, 2023
@davidwendt davidwendt deleted the perf-strings-slice branch April 18, 2023 14:05
shwina pushed a commit to shwina/cudf that referenced this pull request Apr 18, 2023
Improves on performance for longer strings with `cudf::strings::slice_strings()` API. The `cudf::string_view::substr` was reworked to minimize counting characters and the gather version of `make_strings_children` is used to build the resulting strings column. This version is already optimized for small and large strings.

Additionally, the code was refactored so the common case of `step==1 and start < stop`  can also make use of the gather approach. Common code was also grouped closer together to help navigate the source file better.

The `slice.cpp` benchmark was updated to better measure large strings with comparable slice boundaries. The benchmark showed performance improvement was up to 9x for larger strings with no significant degradation for smaller strings.

Reference rapidsai#13048 and rapidsai#12445

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Elias Stehle (https://github.com/elstehle)

URL: rapidsai#13057
rapids-bot bot pushed a commit that referenced this pull request Apr 22, 2023
…13178)

Fixes bug where `stop` value is less than `start` value in calls to `cudf::strings::slice_strings` should result in an empty string. Optimization in #13057 introduced this bug. Additional gtest was added to check for this condition.

Close #13173

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Jason Lowe (https://github.com/jlowe)
  - Nghia Truong (https://github.com/ttnghia)
  - MithunR (https://github.com/mythrocks)

URL: #13178
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change strings strings issues (C++ and Python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants