Add gbenchmark for strings find/contains functions #7392

Merged
merged 6 commits on Feb 19, 2021

davidwendt
Contributor

Reference #5698
This adds a gbenchmark for the cudf::strings::contains, cudf::strings::find, cudf::strings::find_multiple, cudf::strings::starts_with, and cudf::strings::ends_with functions.

This also changes starts_with and ends_with to use string_view::compare instead of string_view::find, since compare is more efficient when checking for a string at a specific position. This improved the performance of these two functions by 2-3x on average.
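To illustrate why compare suits a fixed-position check better than find, here is a minimal host-side sketch. It uses std::string_view as a stand-in; cudf's string_view is a separate device-side class whose signatures may differ, and the function names below are hypothetical, not taken from the libcudf source.

```cpp
#include <cassert>
#include <string_view>

// find-based check (illustrative): on a miss, find may scan the
// entire string before reporting that the target is not at position 0.
bool starts_with_find(std::string_view s, std::string_view target) {
  return s.find(target) == 0;
}

// compare-based check (illustrative): inspects at most target.size()
// characters at the one position that matters.
bool starts_with_compare(std::string_view s, std::string_view target) {
  return s.size() >= target.size() &&
         s.compare(0, target.size(), target) == 0;
}

// Same idea for the suffix: compare only the last target.size() characters.
bool ends_with_compare(std::string_view s, std::string_view target) {
  return s.size() >= target.size() &&
         s.compare(s.size() - target.size(), target.size(), target) == 0;
}

// Small usage example; both prefix checks agree on the result.
void demo() {
  assert(starts_with_find("hello world", "hello"));
  assert(starts_with_compare("hello world", "hello"));
  assert(!starts_with_compare("hello world", "world"));
  assert(ends_with_compare("hello world", "world"));
}
```

Both prefix checks return the same answer; the difference is only in how much of the string is examined when the target is absent or appears later in the string.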

@davidwendt added the 3 - Ready for Review, libcudf, strings, improvement, and non-breaking labels Feb 16, 2021
@davidwendt self-assigned this Feb 16, 2021
@davidwendt requested review from a team as code owners February 16, 2021 20:24
@github-actions bot added the CMake label Feb 16, 2021
Collaborator

@kkraus14 left a comment

cmake lgtm

@davidwendt changed the title from "Benchmark strings find" to "Add gbenchmark for strings find/contains functions" Feb 16, 2021
codecov bot commented Feb 17, 2021

Codecov Report

No coverage uploaded for pull request base (branch-0.19@7a6e60e).
The diff coverage is n/a.

@@              Coverage Diff               @@
##             branch-0.19    #7392   +/-   ##
==============================================
  Coverage               ?   81.80%           
==============================================
  Files                  ?      101           
  Lines                  ?    16695           
  Branches               ?        0           
==============================================
  Hits                   ?    13658           
  Misses                 ?     3037           
  Partials               ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7a6e60e...be4d0f6.

@karthikeyann added the 5 - Ready to Merge label and removed the 3 - Ready for Review label Feb 19, 2021
@kkraus14
Collaborator

@gpucibot merge

@rapids-bot bot merged commit 580f9a2 into rapidsai:branch-0.19 Feb 19, 2021
@davidwendt deleted the benchmark-strings-find branch February 19, 2021 22:36
Labels
5 - Ready to Merge: Testing and reviews complete, ready to merge
CMake: CMake build issue
improvement: Improvement / enhancement to an existing function
libcudf: Affects libcudf (C++/CUDA) code.
non-breaking: Non-breaking change
strings: strings issues (C++ and Python)
4 participants