Performance improvement for nvtext::minhash #13333

Merged (9 commits) on May 16, 2023

@davidwendt commented on May 10, 2023

Description

Improves performance of nvtext::minhash by minimizing character counting in the internal logic. The MinHash strings are expected to be very long (> 1 KB). The improvement is measured to be up to 2x.
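To illustrate why character counting matters for long UTF-8 strings, here is a minimal Python sketch. It is not the libcudf implementation: the function names are hypothetical, `zlib.crc32` stands in for the real hash function, and the windowing rules are assumptions made for the example. The first variant recounts characters from the start of the buffer for every window; the second finds all character boundaries in a single pass and reuses them.

```python
import zlib


def is_char_start(byte: int) -> bool:
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx;
    # every other byte begins a new character.
    return (byte & 0xC0) != 0x80


def byte_offset(data: bytes, char_index: int) -> int:
    # Counts characters from the start of the buffer: O(n) per call.
    count = 0
    for pos, byte in enumerate(data):
        if is_char_start(byte):
            if count == char_index:
                return pos
            count += 1
    return len(data)


def minhash_recount(data: bytes, width: int, seed: int = 0) -> int:
    # Hypothetical baseline: recounts characters for every window.
    n_chars = sum(is_char_start(b) for b in data)
    best = 0xFFFFFFFF
    for i in range(max(n_chars - width + 1, 1)):
        begin = byte_offset(data, i)
        end = byte_offset(data, i + width)
        # crc32 is only a stand-in hash, seeded via its start value.
        best = min(best, zlib.crc32(data[begin:end], seed))
    return best


def minhash_single_pass(data: bytes, width: int, seed: int = 0) -> int:
    # Hypothetical improvement: locate character starts once, then reuse them.
    starts = [pos for pos, b in enumerate(data) if is_char_start(b)]
    starts.append(len(data))  # sentinel marking the end of the last character
    n_chars = len(starts) - 1
    best = 0xFFFFFFFF
    for i in range(max(n_chars - width + 1, 1)):
        begin = starts[i]
        end = starts[min(i + width, n_chars)]
        best = min(best, zlib.crc32(data[begin:end], seed))
    return best
```

Both functions return the same value for the same input; the second simply avoids rescanning the buffer for each window, which is where the cost grows with string length.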

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt added the labels 2 - In Progress, libcudf, strings, improvement, non-breaking on May 10, 2023
@davidwendt self-assigned this on May 10, 2023
@davidwendt (author) commented:

Results from nvbench, comparing the reference (Ref) and comparison (Cmp) runs:

|  num_rows  |  row_width  |  hash_width  |  seed_count  |   Ref Time |   Cmp Time |          Diff |   %Diff |
|------------|-------------|--------------|--------------|------------|------------|---------------|---------|
|    1024    |     128     |      5       |      2       | 193.856 us | 181.411 us |    -12.445 us |  -6.42% |
|    4096    |     128     |      5       |      2       | 227.873 us | 182.397 us |    -45.476 us | -19.96% |
|    8192    |     128     |      5       |      2       | 280.639 us | 197.543 us |    -83.096 us | -29.61% |
|   16364    |     128     |      5       |      2       | 389.130 us | 223.790 us |   -165.340 us | -42.49% |
|   32768    |     128     |      5       |      2       | 610.014 us | 282.108 us |   -327.906 us | -53.75% |
|   262144   |     128     |      5       |      2       |   3.729 ms |   1.113 ms |  -2615.250 us | -70.14% |
|    1024    |     512     |      5       |      2       | 281.461 us | 206.475 us |    -74.986 us | -26.64% |
|    4096    |     512     |      5       |      2       | 362.247 us | 238.903 us |   -123.344 us | -34.05% |
|    8192    |     512     |      5       |      2       | 511.638 us | 292.133 us |   -219.505 us | -42.90% |
|   16364    |     512     |      5       |      2       | 814.275 us | 387.908 us |   -426.367 us | -52.36% |
|   32768    |     512     |      5       |      2       |   1.418 ms | 587.679 us |   -829.977 us | -58.55% |
|   262144   |     512     |      5       |      2       |   9.899 ms |   3.402 ms |  -6496.691 us | -65.63% |
|    1024    |    2048     |      5       |      2       | 624.283 us | 346.543 us |   -277.740 us | -44.49% |
|    4096    |    2048     |      5       |      2       | 900.445 us | 452.052 us |   -448.393 us | -49.80% |
|    8192    |    2048     |      5       |      2       |   1.417 ms | 665.112 us |   -751.827 us | -53.06% |
|   16364    |    2048     |      5       |      2       |   2.465 ms |   1.036 ms |  -1428.609 us | -57.96% |
|   32768    |    2048     |      5       |      2       |   4.518 ms |   1.812 ms |  -2706.182 us | -59.90% |
|   262144   |    2048     |      5       |      2       |  33.360 ms |  12.553 ms | -20807.037 us | -62.37% |
|    1024    |     128     |      10      |      2       | 196.147 us | 170.822 us |    -25.325 us | -12.91% |
|    4096    |     128     |      10      |      2       | 234.330 us | 186.371 us |    -47.959 us | -20.47% |
|    8192    |     128     |      10      |      2       | 292.730 us | 209.369 us |    -83.360 us | -28.48% |
|   16364    |     128     |      10      |      2       | 416.179 us | 241.465 us |   -174.713 us | -41.98% |
|   32768    |     128     |      10      |      2       | 661.388 us | 303.404 us |   -357.983 us | -54.13% |
|   262144   |     128     |      10      |      2       |   4.133 ms |   1.273 ms |  -2859.794 us | -69.19% |
|    1024    |     512     |      10      |      2       | 291.149 us | 219.325 us |    -71.824 us | -24.67% |
|    4096    |     512     |      10      |      2       | 385.945 us | 262.891 us |   -123.054 us | -31.88% |
|    8192    |     512     |      10      |      2       | 554.882 us | 330.057 us |   -224.825 us | -40.52% |
|   16364    |     512     |      10      |      2       | 898.362 us | 439.449 us |   -458.912 us | -51.08% |
|   32768    |     512     |      10      |      2       |   1.575 ms | 675.748 us |   -899.270 us | -57.10% |
|   262144   |     512     |      10      |      2       |  11.165 ms |   3.992 ms |  -7172.857 us | -64.24% |
|    1024    |    2048     |      10      |      2       | 678.483 us | 409.528 us |   -268.955 us | -39.64% |
|    4096    |    2048     |      10      |      2       | 997.831 us | 543.615 us |   -454.216 us | -45.52% |
|    8192    |    2048     |      10      |      2       |   1.582 ms | 822.555 us |   -759.219 us | -48.00% |
|   16364    |    2048     |      10      |      2       |   2.772 ms |   1.234 ms |  -1537.355 us | -55.46% |
|   32768    |    2048     |      10      |      2       |   5.093 ms |   2.149 ms |  -2944.546 us | -57.81% |
|   262144   |    2048     |      10      |      2       |  37.772 ms |  14.850 ms | -22921.371 us | -60.68% |
|    1024    |     128     |      25      |      2       | 207.323 us | 179.497 us |    -27.826 us | -13.42% |
|    4096    |     128     |      25      |      2       | 249.153 us | 204.989 us |    -44.164 us | -17.73% |
|    8192    |     128     |      25      |      2       | 316.093 us | 230.204 us |    -85.890 us | -27.17% |
|   16364    |     128     |      25      |      2       | 459.841 us | 279.298 us |   -180.543 us | -39.26% |
|   32768    |     128     |      25      |      2       | 748.611 us | 379.865 us |   -368.746 us | -49.26% |
|   262144   |     128     |      25      |      2       |   4.789 ms |   1.822 ms |  -2966.954 us | -61.95% |
|    1024    |     512     |      25      |      2       | 331.493 us | 269.785 us |    -61.707 us | -18.62% |
|    4096    |     512     |      25      |      2       | 446.998 us | 329.370 us |   -117.628 us | -26.32% |
|    8192    |     512     |      25      |      2       | 667.508 us | 456.063 us |   -211.445 us | -31.68% |
|   16364    |     512     |      25      |      2       |   1.109 ms | 643.706 us |   -464.991 us | -41.94% |
|   32768    |     512     |      25      |      2       |   1.996 ms |   1.051 ms |   -945.149 us | -47.36% |
|   262144   |     512     |      25      |      2       |  14.409 ms |   6.706 ms |  -7703.241 us | -53.46% |
|    1024    |    2048     |      25      |      2       | 828.190 us | 591.819 us |   -236.371 us | -28.54% |
|    4096    |    2048     |      25      |      2       |   1.252 ms | 836.633 us |   -414.874 us | -33.15% |
|    8192    |    2048     |      25      |      2       |   2.017 ms |   1.339 ms |   -677.821 us | -33.60% |
|   16364    |    2048     |      25      |      2       |   3.594 ms |   2.082 ms |  -1511.921 us | -42.06% |
|   32768    |    2048     |      25      |      2       |   6.667 ms |   3.701 ms |  -2966.655 us | -44.50% |
|   262144   |    2048     |      25      |      2       |  49.920 ms |  26.234 ms | -23686.493 us | -47.45% |
|    1024    |     128     |      5       |      26      | 266.604 us | 243.072 us |    -23.532 us |  -8.83% |
|    4096    |     128     |      5       |      26      | 369.774 us | 357.338 us |    -12.435 us |  -3.36% |
|    8192    |     128     |      5       |      26      | 546.339 us | 524.452 us |    -21.887 us |  -4.01% |
|   16364    |     128     |      5       |      26      | 865.386 us | 806.152 us |    -59.234 us |  -6.84% |
|   32768    |     128     |      5       |      26      |   1.491 ms |   1.402 ms |    -88.768 us |  -5.95% |
|   262144   |     128     |      5       |      26      |  10.441 ms |   9.727 ms |   -714.349 us |  -6.84% |
|    1024    |     512     |      5       |      26      | 567.700 us | 521.053 us |    -46.647 us |  -8.22% |
|    4096    |     512     |      5       |      26      | 959.818 us | 903.960 us |    -55.857 us |  -5.82% |
|    8192    |     512     |      5       |      26      |   1.588 ms |   1.579 ms |     -8.881 us |  -0.56% |
|   16364    |     512     |      5       |      26      |   2.732 ms |   2.632 ms |    -99.895 us |  -3.66% |
|   32768    |     512     |      5       |      26      |   4.990 ms |   4.842 ms |   -147.820 us |  -2.96% |
|   262144   |     512     |      5       |      26      |  36.764 ms |  35.780 ms |   -984.386 us |  -2.68% |
|    1024    |    2048     |      5       |      26      |   1.758 ms |   1.544 ms |   -214.086 us | -12.18% |
|    4096    |    2048     |      5       |      26      |   3.312 ms |   3.150 ms |   -162.507 us |  -4.91% |
|    8192    |    2048     |      5       |      26      |   5.698 ms |   5.581 ms |   -117.682 us |  -2.07% |
|   16364    |    2048     |      5       |      26      |  10.175 ms |   9.892 ms |   -283.592 us |  -2.79% |
|   32768    |    2048     |      5       |      26      |  19.019 ms |  18.590 ms |   -428.750 us |  -2.25% |
|   262144   |    2048     |      5       |      26      | 142.517 ms | 139.764 ms |  -2753.184 us |  -1.93% |
|    1024    |     128     |      10      |      26      | 277.937 us | 261.317 us |    -16.620 us |  -5.98% |
|    4096    |     128     |      10      |      26      | 384.665 us | 368.642 us |    -16.023 us |  -4.17% |
|    8192    |     128     |      10      |      26      | 566.374 us | 546.126 us |    -20.248 us |  -3.57% |
|   16364    |     128     |      10      |      26      | 907.391 us | 798.656 us |   -108.734 us | -11.98% |
|   32768    |     128     |      10      |      26      |   1.552 ms |   1.368 ms |   -183.853 us | -11.84% |
|   262144   |     128     |      10      |      26      |  10.882 ms |   9.374 ms |  -1507.620 us | -13.85% |
|    1024    |     512     |      10      |      26      | 616.924 us | 564.238 us |    -52.686 us |  -8.54% |
|    4096    |     512     |      10      |      26      |   1.042 ms | 970.627 us |    -70.978 us |  -6.81% |
|    8192    |     512     |      10      |      26      |   1.682 ms |   1.653 ms |    -29.064 us |  -1.73% |
|   16364    |     512     |      10      |      26      |   2.895 ms |   2.717 ms |   -177.754 us |  -6.14% |
|   32768    |     512     |      10      |      26      |   5.254 ms |   4.934 ms |   -319.588 us |  -6.08% |
|   262144   |     512     |      10      |      26      |  38.492 ms |  36.068 ms |  -2424.049 us |  -6.30% |
|    1024    |    2048     |      10      |      26      |   2.010 ms |   1.811 ms |   -198.338 us |  -9.87% |
|    4096    |    2048     |      10      |      26      |   3.630 ms |   3.447 ms |   -183.313 us |  -5.05% |
|    8192    |    2048     |      10      |      26      |   6.141 ms |   6.040 ms |   -101.531 us |  -1.65% |
|   16364    |    2048     |      10      |      26      |  10.837 ms |  10.367 ms |   -469.505 us |  -4.33% |
|   32768    |    2048     |      10      |      26      |  20.125 ms |  19.191 ms |   -933.884 us |  -4.64% |
|   262144   |    2048     |      10      |      26      | 149.540 ms | 142.846 ms |  -6693.143 us |  -4.48% |
|    1024    |     128     |      25      |      26      | 304.311 us | 277.693 us |    -26.618 us |  -8.75% |
|    4096    |     128     |      25      |      26      | 415.681 us | 385.801 us |    -29.880 us |  -7.19% |
|    8192    |     128     |      25      |      26      | 599.293 us | 551.448 us |    -47.844 us |  -7.98% |
|   16364    |     128     |      25      |      26      | 952.504 us | 837.496 us |   -115.008 us | -12.07% |
|   32768    |     128     |      25      |      26      |   1.664 ms |   1.422 ms |   -242.341 us | -14.56% |
|   262144   |     128     |      25      |      26      |  11.723 ms |   9.641 ms |  -2081.667 us | -17.76% |
|    1024    |     512     |      25      |      26      | 751.137 us | 706.851 us |    -44.286 us |  -5.90% |
|    4096    |     512     |      25      |      26      |   1.241 ms |   1.216 ms |    -24.526 us |  -1.98% |
|    8192    |     512     |      25      |      26      |   2.031 ms |   2.043 ms |     11.428 us |   0.56% |
|   16364    |     512     |      25      |      26      |   3.489 ms |   3.320 ms |   -168.836 us |  -4.84% |
|   32768    |     512     |      25      |      26      |   6.388 ms |   5.992 ms |   -396.116 us |  -6.20% |
|   262144   |     512     |      25      |      26      |  46.953 ms |  43.520 ms |  -3432.680 us |  -7.31% |
|    1024    |    2048     |      25      |      26      |   2.564 ms |   2.432 ms |   -132.059 us |  -5.15% |
|    4096    |    2048     |      25      |      26      |   4.600 ms |   4.524 ms |    -76.239 us |  -1.66% |
|    8192    |    2048     |      25      |      26      |   7.747 ms |   8.473 ms |    725.970 us |   9.37% |
|   16364    |    2048     |      25      |      26      |  13.620 ms |  13.193 ms |   -426.827 us |  -3.13% |
|   32768    |    2048     |      25      |      26      |  25.247 ms |  24.278 ms |   -968.509 us |  -3.84% |
|   262144   |    2048     |      25      |      26      | 187.823 ms | 179.514 ms |  -8309.640 us |  -4.42% |
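
As a reading aid for the table, the sketch below shows how the Diff and %Diff columns appear to relate to Ref Time and Cmp Time, and how a %Diff converts to a speedup factor. The formula `(Cmp - Ref) / Ref` is an assumption inferred from the rows above, and the helper names are hypothetical.

```python
def pct_diff(ref: float, cmp: float) -> float:
    # Assumed definition of %Diff: change relative to the reference time.
    # Negative values mean the comparison run was faster.
    return (cmp - ref) / ref * 100.0


def speedup(ref: float, cmp: float) -> float:
    # Speedup factor: how many times faster the comparison run is.
    return ref / cmp


if __name__ == "__main__":
    # Times in microseconds, taken from two rows of the table above
    # (num_rows 1024 and 32768, row_width 128, hash_width 5, seed_count 2).
    for ref, cmp in [(193.856, 181.411), (610.014, 282.108)]:
        print(f"{pct_diff(ref, cmp):.2f}%  {speedup(ref, cmp):.2f}x")
```

Under this definition, a %Diff of -50% corresponds to a 2x speedup.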

@davidwendt added the label 3 - Ready for Review and removed 2 - In Progress on May 11, 2023
@davidwendt marked this pull request as ready for review on May 11, 2023 17:36
@davidwendt requested a review from a team as a code owner on May 11, 2023 17:36
@davidwendt requested reviews from harrism and vuule on May 11, 2023 17:36
@davidwendt (author) commented:

/merge

@rapids-bot merged commit 4483b87 into rapidsai:branch-23.06 on May 16, 2023
@davidwendt deleted the minhash-perf branch on May 16, 2023 15:30