Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement in cudf::strings::all_characters_of_type #13259

Merged
merged 21 commits into from
May 15, 2023

Conversation

davidwendt
Copy link
Contributor

Description

Improves performance for cudf::strings::all_characters_of_type() API which covers many cudf is_X functions. The solution improves performance for all string lengths as measured by the new benchmark included in this PR.
Additionally, the code was cleaned up to help with maintenance and clarity.

Reference #13048

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. strings strings issues (C++ and Python) improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 1, 2023
@davidwendt davidwendt self-assigned this May 1, 2023
@github-actions github-actions bot added the CMake CMake build issue label May 1, 2023
@davidwendt
Copy link
Contributor Author

Here is the diff from the nvbench_compare tool:

|  row_width  |  num_rows  |   Ref Time |   Cmp Time |          Diff |   %Diff |
|-------------|------------|------------|------------|---------------|---------|
|     32      |    4096    |  34.701 us |  32.349 us |     -2.352 us |  -6.78% |
|     64      |    4096    |  34.414 us |  32.179 us |     -2.235 us |  -6.50% |
|     128     |    4096    |  41.044 us |  32.793 us |     -8.251 us | -20.10% |
|     256     |    4096    |  60.614 us |  32.353 us |    -28.261 us | -46.63% |
|     512     |    4096    | 100.592 us |  33.944 us |    -66.648 us | -66.26% |
|    1024     |    4096    | 178.231 us |  34.122 us |   -144.109 us | -80.86% |
|    2048     |    4096    | 324.128 us |  33.723 us |   -290.405 us | -89.60% |
|    4096     |    4096    | 616.879 us |  32.786 us |   -584.093 us | -94.69% |
|     32      |   32768    |  34.873 us |  32.656 us |     -2.216 us |  -6.35% |
|     64      |   32768    |  36.433 us |  34.036 us |     -2.397 us |  -6.58% |
|     128     |   32768    |  45.096 us |  36.951 us |     -8.145 us | -18.06% |
|     256     |   32768    |  65.402 us |  35.736 us |    -29.666 us | -45.36% |
|     512     |   32768    | 113.118 us |  36.259 us |    -76.859 us | -67.95% |
|    1024     |   32768    | 199.634 us |  37.704 us |   -161.931 us | -81.11% |
|    2048     |   32768    | 365.835 us |  37.985 us |   -327.850 us | -89.62% |
|    4096     |   32768    | 688.672 us |  38.201 us |   -650.472 us | -94.45% |
|     32      |   262144   |  47.768 us |  42.404 us |     -5.365 us | -11.23% |
|     64      |   262144   |  61.138 us |  52.997 us |     -8.141 us | -13.32% |
|     128     |   262144   | 102.089 us |  63.134 us |    -38.955 us | -38.16% |
|     256     |   262144   | 305.667 us |  67.241 us |   -238.426 us | -78.00% |
|     512     |   262144   | 651.223 us |  67.515 us |   -583.708 us | -89.63% |
|    1024     |   262144   |   1.648 ms |  69.991 us |  -1578.019 us | -95.75% |
|    2048     |   262144   |   5.159 ms |  80.804 us |  -5077.801 us | -98.43% |
|    4096     |   262144   |  12.695 ms |  85.679 us | -12608.903 us | -99.33% |
|     32      |  2097152   | 360.893 us | 261.914 us |    -98.979 us | -27.43% |
|     64      |  2097152   | 417.313 us | 357.012 us |    -60.300 us | -14.45% |
|     128     |  2097152   | 788.544 us | 439.015 us |   -349.529 us | -44.33% |
|     256     |  2097152   |   2.518 ms | 488.347 us |  -2029.795 us | -80.61% |
|     512     |  2097152   |   5.052 ms | 551.573 us |  -4500.046 us | -89.08% |
|     32      |  16777216  |   2.042 ms |   1.893 ms |   -149.908 us |  -7.34% |
|     64      |  16777216  |   2.630 ms |   2.342 ms |   -288.120 us | -10.96% |

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels May 3, 2023
@davidwendt davidwendt marked this pull request as ready for review May 8, 2023 12:40
@davidwendt davidwendt requested a review from a team as a code owner May 8, 2023 12:40
@davidwendt davidwendt requested review from karthikeyann and ttnghia May 8, 2023 12:40
Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Just one small question: does the performance improvement come from bypassing the continuation char?

cpp/benchmarks/string/char_types.cpp Outdated Show resolved Hide resolved
cpp/benchmarks/string/char_types.cpp Show resolved Hide resolved
Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice speedups.

Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@davidwendt
Copy link
Contributor Author

Looks good.

Just one small question: does the performance improvement come from bypassing the continuation char?

That's basically the case here. Overall this changes from a character-based iteration to a byte-based iteration which is a bit faster here by limiting character counting. Iterating by characters is not always necessary but also not always slower.

@davidwendt
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 2825d5a into rapidsai:branch-23.06 May 15, 2023
@davidwendt davidwendt deleted the char-types-perf branch May 15, 2023 11:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change strings strings issues (C++ and Python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants