Add string scalar replace benchmark #7369
These are the results on a V100. I filed #7370 to improve the performance on long strings.
Here are the updated benchmarks using the custom arg generator, showing the combinations that are run.
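The benchmark names quoted later in this thread (`rows/width`) are consistent with a generator that multiplies the row count by 8 and the row width by 4, and skips any pair whose total size would overflow a 32-bit signed size. The following is only a sketch of that idea, reconstructed from the benchmark names rather than copied from the PR; the function name `replace_benchmark_args`, the constants, and the exact skip condition are assumptions.

```cpp
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper (not from the PR): enumerates (rows, max row width)
// pairs and skips pairs whose product exceeds the 32-bit signed maximum.
std::vector<std::pair<std::int64_t, std::int64_t>> replace_benchmark_args()
{
  constexpr std::int64_t min_rows   = 1 << 12;  // 4096
  constexpr std::int64_t max_rows   = 1 << 24;  // 16777216
  constexpr std::int64_t row_mult   = 8;
  constexpr std::int64_t min_width  = 1 << 5;   // 32
  constexpr std::int64_t max_width  = 1 << 13;  // 8192
  constexpr std::int64_t width_mult = 4;
  // Assumed limit: total characters must fit in a 32-bit signed integer.
  constexpr std::int64_t size_limit = std::numeric_limits<std::int32_t>::max();

  std::vector<std::pair<std::int64_t, std::int64_t>> args;
  for (std::int64_t rows = min_rows; rows <= max_rows; rows *= row_mult) {
    for (std::int64_t width = min_width; width <= max_width; width *= width_mult) {
      if (rows * width > size_limit) { continue; }  // skip oversized combinations
      args.emplace_back(rows, width);
    }
  }
  return args;
}
```

With Google Benchmark, pairs like these are typically registered through `Apply` with a function that calls `Args({rows, width})` for each combination, and `UseManualTime()` produces the `manual_time` suffix seen in the benchmark names.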
Codecov Report
@@ Coverage Diff @@
## branch-0.19 #7369 +/- ##
==============================================
Coverage ? 81.79%
==============================================
Files ? 100
Lines ? 16610
Branches ? 0
==============================================
Hits ? 13586
Misses ? 3024
Partials ? 0
@gpucibot merge
#7384)

Reference #7370

This PR simplifies the current `cudf::strings::replace` (non-regex) functions by refactoring to use the more efficient `make_strings_children` utility. This refactoring improves performance by about 2x on these APIs as measured by the gbenchmark PR #7369.

<details>
<summary>Baseline gbenchmark for replace-scalar</summary>

```
---------------------------------------------------------------------------------------------------------------------
Benchmark                                                       Time       CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------
StringReplaceScalar/replace_scalar/4096/32/manual_time         0.308 ms   0.316 ms   2345 bytes_per_second=224.631M/s
StringReplaceScalar/replace_scalar/4096/128/manual_time         1.01 ms    1.03 ms    684 bytes_per_second=269.171M/s
StringReplaceScalar/replace_scalar/4096/512/manual_time         7.35 ms    7.38 ms     95 bytes_per_second=149.028M/s
StringReplaceScalar/replace_scalar/4096/2048/manual_time        74.1 ms    74.2 ms      9 bytes_per_second=58.9153M/s
StringReplaceScalar/replace_scalar/4096/8192/manual_time        1170 ms    1170 ms      1 bytes_per_second=14.8457M/s
StringReplaceScalar/replace_scalar/32768/32/manual_time        0.314 ms   0.333 ms   2225 bytes_per_second=1.7147G/s
StringReplaceScalar/replace_scalar/32768/128/manual_time        1.16 ms    1.18 ms    604 bytes_per_second=1.83688G/s
StringReplaceScalar/replace_scalar/32768/512/manual_time        7.56 ms    7.58 ms     92 bytes_per_second=1.12604G/s
StringReplaceScalar/replace_scalar/32768/2048/manual_time       80.8 ms    80.9 ms      9 bytes_per_second=432.314M/s
StringReplaceScalar/replace_scalar/32768/8192/manual_time       1526 ms    1521 ms      1 bytes_per_second=91.3563M/s
StringReplaceScalar/replace_scalar/262144/32/manual_time       0.430 ms   0.449 ms   1622 bytes_per_second=10.0357G/s
StringReplaceScalar/replace_scalar/262144/128/manual_time       1.94 ms    1.96 ms    361 bytes_per_second=8.80298G/s
StringReplaceScalar/replace_scalar/262144/512/manual_time       18.1 ms    18.0 ms     39 bytes_per_second=3.77253G/s
StringReplaceScalar/replace_scalar/262144/2048/manual_time       227 ms     227 ms      3 bytes_per_second=1.20334G/s
StringReplaceScalar/replace_scalar/2097152/32/manual_time       2.48 ms    2.50 ms    282 bytes_per_second=13.9373G/s
StringReplaceScalar/replace_scalar/2097152/128/manual_time      11.8 ms    11.9 ms     59 bytes_per_second=11.5245G/s
StringReplaceScalar/replace_scalar/2097152/512/manual_time       101 ms     101 ms      7 bytes_per_second=5.42976G/s
StringReplaceScalar/replace_scalar/16777216/32/manual_time      22.2 ms    22.2 ms     31 bytes_per_second=12.4258G/s
```

</details>

<details>
<summary>gbenchmark results for refactored replace-scalar</summary>

```
---------------------------------------------------------------------------------------------------------------------
Benchmark                                                       Time       CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------
StringReplaceScalar/replace_scalar/4096/32/manual_time         0.144 ms   0.162 ms   4871 bytes_per_second=481.559M/s
StringReplaceScalar/replace_scalar/4096/128/manual_time        0.428 ms   0.446 ms   1633 bytes_per_second=634.055M/s
StringReplaceScalar/replace_scalar/4096/512/manual_time         2.65 ms    2.67 ms    263 bytes_per_second=413.561M/s
StringReplaceScalar/replace_scalar/4096/2048/manual_time        28.8 ms    28.8 ms     24 bytes_per_second=151.733M/s
StringReplaceScalar/replace_scalar/4096/8192/manual_time         479 ms     479 ms      2 bytes_per_second=36.2387M/s
StringReplaceScalar/replace_scalar/32768/32/manual_time        0.161 ms   0.178 ms   4347 bytes_per_second=3.35237G/s
StringReplaceScalar/replace_scalar/32768/128/manual_time       0.466 ms   0.484 ms   1502 bytes_per_second=4.57268G/s
StringReplaceScalar/replace_scalar/32768/512/manual_time        2.94 ms    2.96 ms    238 bytes_per_second=2.89405G/s
StringReplaceScalar/replace_scalar/32768/2048/manual_time       37.4 ms    37.4 ms     19 bytes_per_second=933.899M/s
StringReplaceScalar/replace_scalar/32768/8192/manual_time        567 ms     565 ms      1 bytes_per_second=245.929M/s
StringReplaceScalar/replace_scalar/262144/32/manual_time       0.316 ms   0.334 ms   2198 bytes_per_second=13.6601G/s
StringReplaceScalar/replace_scalar/262144/128/manual_time       1.39 ms    1.41 ms    498 bytes_per_second=12.237G/s
StringReplaceScalar/replace_scalar/262144/512/manual_time       12.8 ms    12.9 ms     54 bytes_per_second=5.30963G/s
StringReplaceScalar/replace_scalar/262144/2048/manual_time       157 ms     157 ms      4 bytes_per_second=1.73861G/s
StringReplaceScalar/replace_scalar/2097152/32/manual_time       1.84 ms    1.86 ms    379 bytes_per_second=18.7409G/s
StringReplaceScalar/replace_scalar/2097152/128/manual_time      9.50 ms    9.52 ms     74 bytes_per_second=14.3717G/s
StringReplaceScalar/replace_scalar/2097152/512/manual_time      84.7 ms    84.7 ms      8 bytes_per_second=6.44185G/s
StringReplaceScalar/replace_scalar/16777216/32/manual_time      14.0 ms    14.0 ms     50 bytes_per_second=19.6828G/s
```

</details>

Improvements for #7370 should base off of these changes.

Authors:
- David (@davidwendt)

Approvers:
- Jason Lowe (@jlowe)
- @nvdbaranec
- Mark Harris (@harrism)

URL: #7384
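To see how the "about 2x" figure varies by configuration, the ratio of baseline time to refactored time can be computed per benchmark. The sketch below uses a few `manual_time` values copied from the two result listings above; the `timing` struct and the function names are illustrative and not part of cudf or the PR.

```cpp
#include <vector>

// Illustrative record: one benchmark configuration with its two timings.
struct timing {
  const char* name;      // "rows/width" taken from the benchmark name
  double baseline_ms;    // manual_time before the refactor
  double refactored_ms;  // manual_time after the refactor
};

// Speedup is baseline time divided by refactored time (greater than 1 is faster).
double speedup(const timing& t) { return t.baseline_ms / t.refactored_ms; }

// A small sample of rows copied from the listings above.
std::vector<timing> sample_timings()
{
  return {
    {"4096/32", 0.308, 0.144},
    {"4096/8192", 1170.0, 479.0},
    {"262144/32", 0.430, 0.316},
    {"16777216/32", 22.2, 14.0},
  };
}
```

On this sample, the ratio is above 2 for the 4096-row cases and closer to 1.4 to 1.6 for the larger row counts, so "about 2x" is best read as a typical figure rather than a uniform one.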
… gbenchmark (#7403)

Reference #5698

This builds off of PR #7369 to add `cudf::strings::replace_slice` and the multi-column version of `cudf::strings::replace` to the current gbenchmark that only measures scalar strings replace. The current `replace_scalar_benchmark.cpp` is also renamed to `replace_benchmark.cpp` since it now handles more than the scalar replace.

Authors:
- David (@davidwendt)

Approvers:
- Jason Lowe (@jlowe)
- Keith Kraus (@kkraus14)
- @nvdbaranec
- Karthikeyan (@karthikeyann)

URL: #7403
Reference #5698
This creates a gbenchmark for the scalar form of `cudf::strings::replace`.