Use offsetalator in cudf::strings::split functions #14757

Merged (17 commits) on Feb 8, 2024


davidwendt
Contributor

Description

Uses the offsetalator in place of hardcoded offset-type arrays in the following strings split functions:

  • cudf::strings::split()
  • cudf::strings::rsplit()
  • cudf::strings::split_record()
  • cudf::strings::rsplit_record()
  • cudf::strings::split_re()
  • cudf::strings::rsplit_re()
  • cudf::strings::split_record_re()
  • cudf::strings::rsplit_record_re()
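
To illustrate the idea behind the change, here is a minimal host-side sketch of an offsets accessor that hides the stored offset width from the code that consumes it. This is not libcudf's implementation: `offsets_reader`, `offset_width`, and `row_length` are hypothetical names invented for this example, and the sketch assumes offsets are stored as either 32-bit or 64-bit integers and are always read back as 64-bit values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an offsets column whose element width is only
// known at run time. Not a libcudf type.
enum class offset_width { i32, i64 };

class offsets_reader {
 public:
  // The reader does not own the data; the vector must outlive the reader.
  explicit offsets_reader(std::vector<int32_t> const& v)
    : data_{v.data()}, size_{v.size()}, width_{offset_width::i32}
  {
  }
  explicit offsets_reader(std::vector<int64_t> const& v)
    : data_{v.data()}, size_{v.size()}, width_{offset_width::i64}
  {
  }

  // Always returns a 64-bit offset, whatever the storage width.
  int64_t operator[](std::size_t idx) const
  {
    assert(idx < size_);
    return width_ == offset_width::i32
             ? static_cast<int64_t>(static_cast<int32_t const*>(data_)[idx])
             : static_cast<int64_t const*>(data_)[idx];
  }

  std::size_t size() const { return size_; }

 private:
  void const* data_;
  std::size_t size_;
  offset_width width_;
};

// Byte length of one row, written once against the reader instead of
// against a hardcoded int32_t array.
int64_t row_length(offsets_reader const& offsets, std::size_t row)
{
  return offsets[row + 1] - offsets[row];
}
```

With this shape, a caller such as `row_length` does not change when the offsets are stored in a wider type. The cost is a width check on every read.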

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt added the labels 2 - In Progress, libcudf, strings, improvement, and non-breaking on Jan 12, 2024
@davidwendt self-assigned this on Jan 12, 2024
@davidwendt changed the base branch from branch-24.02 to branch-24.04 on January 23, 2024 18:47
@davidwendt added the label 3 - Ready for Review and removed 2 - In Progress on Jan 25, 2024
@davidwendt changed the title from "Use offsetalator in strings:split functions" to "Use offsetalator in cudf::strings::split functions" on Jan 26, 2024
@davidwendt marked this pull request as ready for review on January 31, 2024 18:17
@davidwendt requested a review from a team as a code owner on January 31, 2024 18:17
@davidwendt
Contributor Author

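Some performance degradation for long strings, growing with the number of rows (up to 26.69% for `record` at row_width 2048 and 262144 rows in the results below), to be addressed in a follow-on PR.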

# split

## [0] NVIDIA RTX A6000

|  row_width  |  num_rows  |   type    |   Ref Time |   Cmp Time |        Diff |   %Diff |
|-------------|------------|-----------|------------|------------|-------------|---------|
|     32      |    4096    |   split   | 353.942 us | 347.504 us |   -6.438 us |  -1.82% |
|     64      |    4096    |   split   | 413.765 us | 406.981 us |   -6.784 us |  -1.64% |
|     128     |    4096    |   split   | 497.669 us | 491.159 us |   -6.510 us |  -1.31% |
|     256     |    4096    |   split   | 608.428 us | 602.191 us |   -6.237 us |  -1.03% |
|     512     |    4096    |   split   | 844.322 us | 844.510 us |    0.189 us |   0.02% |
|    1024     |    4096    |   split   |   1.164 ms |   1.165 ms |    0.348 us |   0.03% |
|    2048     |    4096    |   split   |   1.669 ms |   1.680 ms |   10.412 us |   0.62% |
|     32      |   32768    |   split   | 469.089 us | 464.069 us |   -5.021 us |  -1.07% |
|     64      |   32768    |   split   | 445.271 us | 445.741 us |    0.470 us |   0.11% |
|     128     |   32768    |   split   | 557.610 us | 557.831 us |    0.221 us |   0.04% |
|     256     |   32768    |   split   | 732.601 us | 747.264 us |   14.663 us |   2.00% |
|     512     |   32768    |   split   |   1.075 ms |   1.109 ms |   34.260 us |   3.19% |
|    1024     |   32768    |   split   |   1.517 ms |   1.588 ms |   70.731 us |   4.66% |
|    2048     |   32768    |   split   |   2.234 ms |   2.392 ms |  157.516 us |   7.05% |
|     32      |   262144   |   split   | 650.987 us | 660.003 us |    9.016 us |   1.38% |
|     64      |   262144   |   split   | 767.996 us | 796.298 us |   28.302 us |   3.69% |
|     128     |   262144   |   split   |   1.797 ms |   1.874 ms |   76.762 us |   4.27% |
|     256     |   262144   |   split   |   2.037 ms |   2.187 ms |  150.323 us |   7.38% |
|     512     |   262144   |   split   |   2.535 ms |   2.849 ms |  313.964 us |  12.38% |
|    1024     |   262144   |   split   |   4.036 ms |   4.672 ms |  636.598 us |  15.77% |
|    2048     |   262144   |   split   |   7.722 ms |   9.088 ms |    1.366 ms |  17.69% |
|     32      |  2097152   |   split   |   1.926 ms |   2.099 ms |  172.113 us |   8.93% |
|     64      |  2097152   |   split   |   3.008 ms |   3.324 ms |  316.418 us |  10.52% |
|     128     |  2097152   |   split   |   8.832 ms |   9.411 ms |  579.038 us |   6.56% |
|     256     |  2097152   |   split   |   9.553 ms |  10.755 ms |    1.202 ms |  12.58% |
|     512     |  2097152   |   split   |  13.492 ms |  15.873 ms |    2.382 ms |  17.65% |
|     32      |  16777216  |   split   |  11.942 ms |  13.140 ms |    1.198 ms |  10.03% |
|     64      |  16777216  |   split   |  21.172 ms |  23.314 ms |    2.142 ms |  10.12% |
|     32      |    4096    |  record   | 266.254 us | 257.481 us |   -8.773 us |  -3.30% |
|     64      |    4096    |  record   | 274.727 us | 265.579 us |   -9.148 us |  -3.33% |
|     128     |    4096    |  record   | 285.355 us | 275.869 us |   -9.486 us |  -3.32% |
|     256     |    4096    |  record   | 310.321 us | 302.704 us |   -7.617 us |  -2.45% |
|     512     |    4096    |  record   | 281.551 us | 279.880 us |   -1.672 us |  -0.59% |
|    1024     |    4096    |  record   | 309.191 us | 310.298 us |    1.106 us |   0.36% |
|    2048     |    4096    |  record   | 375.990 us | 384.850 us |    8.860 us |   2.36% |
|     32      |   32768    |  record   | 277.552 us | 270.440 us |   -7.112 us |  -2.56% |
|     64      |   32768    |  record   | 298.828 us | 296.294 us |   -2.533 us |  -0.85% |
|     128     |   32768    |  record   | 351.183 us | 349.425 us |   -1.758 us |  -0.50% |
|     256     |   32768    |  record   | 445.231 us | 450.598 us |    5.367 us |   1.21% |
|     512     |   32768    |  record   | 410.295 us | 434.163 us |   23.868 us |   5.82% |
|    1024     |   32768    |  record   | 553.866 us | 619.496 us |   65.629 us |  11.85% |
|    2048     |   32768    |  record   | 836.477 us | 967.847 us |  131.369 us |  15.71% |
|     32      |   262144   |  record   | 417.883 us | 419.429 us |    1.546 us |   0.37% |
|     64      |   262144   |  record   | 579.503 us | 597.616 us |   18.113 us |   3.13% |
|     128     |   262144   |  record   |   1.151 ms |   1.220 ms |   69.452 us |   6.03% |
|     256     |   262144   |  record   |   3.584 ms |   3.711 ms |  127.473 us |   3.56% |
|     512     |   262144   |  record   |   1.451 ms |   1.737 ms |  286.181 us |  19.73% |
|    1024     |   262144   |  record   |   2.585 ms |   3.165 ms |  579.501 us |  22.42% |
|    2048     |   262144   |  record   |   5.181 ms |   6.565 ms |    1.383 ms |  26.69% |
|     32      |  2097152   |  record   |   1.435 ms |   1.582 ms |  147.143 us |  10.25% |
|     64      |  2097152   |  record   |   2.665 ms |   2.928 ms |  263.861 us |   9.90% |
|     128     |  2097152   |  record   |   6.408 ms |   6.940 ms |  531.913 us |   8.30% |
|     256     |  2097152   |  record   |  22.797 ms |  24.032 ms |    1.236 ms |   5.42% |
|     512     |  2097152   |  record   |   9.999 ms |  12.341 ms |    2.342 ms |  23.42% |
|     32      |  16777216  |  record   |   9.790 ms |  10.887 ms |    1.098 ms |  11.21% |
|     64      |  16777216  |  record   |  19.703 ms |  21.746 ms |    2.044 ms |  10.37% |
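
The `Diff` and `%Diff` columns appear to be computed as `Cmp Time - Ref Time` and that difference divided by `Ref Time`, which matches the rows above. A small sketch of that arithmetic follows; `percent_diff` and `round2` are hypothetical helpers written for this example and are not part of the benchmark tooling.

```cpp
#include <cmath>

// Hypothetical helper: relative change of cmp_time versus ref_time, in percent.
// Both times must be in the same unit. Positive means the compared run is slower.
double percent_diff(double ref_time, double cmp_time)
{
  return (cmp_time - ref_time) / ref_time * 100.0;
}

// Round to two decimal places, the precision used in the table.
double round2(double value) { return std::round(value * 100.0) / 100.0; }
```

For example, the `split` row with row_width 2048 and 262144 rows has a reference time of 7.722 ms and a compared time of 9.088 ms, so `percent_diff(7.722, 9.088)` is about 17.69, as listed in the table.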

@davidwendt
Contributor Author

/merge

@rapids-bot (bot) merged commit 47d28a0 into rapidsai:branch-24.04 on Feb 8, 2024
74 checks passed
@davidwendt deleted the split-offsetalator branch on February 8, 2024 14:03