Add gbenchmarks for strings extract function #7522

Merged

davidwendt
Contributor

Reference #5698
This creates a gbenchmark for the cudf::strings::extract function. The benchmark measures a range of row counts and string lengths. It also includes measurements for small, medium, and large numbers of regex instructions, because extract performance is affected by the number of instructions in the regex pattern.
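
The parameter sweep described above (row count, string length, and pattern size) can be illustrated with a rough CPU-only sketch. This is not the benchmark added in this PR: it assumes `std::regex` as a stand-in for `cudf::strings::extract`, uses `std::chrono` instead of Google Benchmark, and the helpers `make_input`, `make_pattern`, `extract_all`, and `time_extract` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: build `rows` identical strings of `length` characters,
// alternating a lowercase letter and a digit ("a0b1c2...").
std::vector<std::string> make_input(std::size_t rows, std::size_t length)
{
  std::string row;
  row.reserve(length);
  for (std::size_t i = 0; i < length; ++i) {
    auto const k = i / 2;
    row.push_back(i % 2 == 0 ? static_cast<char>('a' + k % 26)
                             : static_cast<char>('0' + k % 10));
  }
  return std::vector<std::string>(rows, row);
}

// Hypothetical helper: a pattern with `groups` capture groups.
// More groups means a longer compiled pattern (the "small, medium, large" axis).
std::string make_pattern(std::size_t groups)
{
  std::string pattern;
  for (std::size_t i = 0; i < groups; ++i) { pattern += "([a-z])\\d"; }
  return pattern;
}

// CPU stand-in for an extract call: count the capture groups found across all rows.
std::size_t extract_all(std::vector<std::string> const& input, std::regex const& re)
{
  std::size_t captured = 0;
  std::smatch match;
  for (auto const& row : input) {
    // match.size() counts the whole match plus one entry per capture group.
    if (std::regex_search(row, match, re)) { captured += match.size() - 1; }
  }
  return captured;
}

struct timing_result {
  std::size_t captured;  // total capture groups extracted
  double milliseconds;   // wall-clock time of the extract loop only
};

// Time one (rows, length, groups) combination.
// The pattern only matches when length >= 2 * groups.
timing_result time_extract(std::size_t rows, std::size_t length, std::size_t groups)
{
  auto const input = make_input(rows, length);
  std::regex const re(make_pattern(groups));  // compiled outside the timed region
  auto const start = std::chrono::steady_clock::now();
  auto const captured = extract_all(input, re);
  auto const stop = std::chrono::steady_clock::now();
  return {captured, std::chrono::duration<double, std::milli>(stop - start).count()};
}
```

Calling `time_extract` in nested loops over a few row counts, string lengths, and group counts gives one timing per combination, which is the shape of the measurement grid described in this PR. Absolute numbers from this sketch say nothing about GPU performance.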

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 5, 2021
@davidwendt davidwendt self-assigned this Mar 5, 2021
@davidwendt davidwendt requested review from a team as code owners March 5, 2021 21:08
@davidwendt davidwendt requested review from trxcllnt and jrhemstad March 5, 2021 21:08
@github-actions github-actions bot added the CMake CMake build issue label Mar 5, 2021
Collaborator

@kkraus14 kkraus14 left a comment

cmake lgtm

Contributor

@karthikeyann karthikeyann left a comment

LGTM

@codecov
codecov bot commented Mar 6, 2021

Codecov Report

Merging #7522 (9233ece) into branch-0.19 (7871e7a) will increase coverage by 0.41%.
The diff coverage is 93.75%.

@@               Coverage Diff               @@
##           branch-0.19    #7522      +/-   ##
===============================================
+ Coverage        81.86%   82.27%   +0.41%     
===============================================
  Files              101      101              
  Lines            16884    17261     +377     
===============================================
+ Hits             13822    14202     +380     
+ Misses            3062     3059       -3     
| Impacted Files | Coverage Δ |
| --- | --- |
| python/cudf/cudf/core/column/column.py | 87.80% <75.00%> (+0.04%) ⬆️ |
| python/cudf/cudf/core/column/decimal.py | 95.83% <100.00%> (+0.96%) ⬆️ |
| python/cudf/cudf/core/column/string.py | 86.76% <100.00%> (+0.26%) ⬆️ |
| python/cudf/cudf/utils/gpu_utils.py | 53.65% <0.00%> (-4.88%) ⬇️ |
| python/cudf/cudf/core/abc.py | 87.23% <0.00%> (-1.14%) ⬇️ |
| python/cudf/cudf/io/feather.py | 100.00% <0.00%> (ø) |
| python/cudf/cudf/comm/serialize.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/_fuzz_testing/io.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/column/struct.py | 100.00% <0.00%> (ø) |
| python/dask_cudf/dask_cudf/_version.py | 0.00% <0.00%> (ø) |

... and 43 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2dd15b0...9233ece.

@harrism
Member

harrism commented Mar 9, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 4897a25 into rapidsai:branch-0.19 Mar 9, 2021
@davidwendt davidwendt deleted the benchmark-strings-extract branch March 9, 2021 14:47
hyperbolic2346 pushed a commit to hyperbolic2346/cudf that referenced this pull request Mar 25, 2021
Reference rapidsai#5698
This creates a gbenchmark for the `cudf::strings::extract` function. The benchmark measures a range of row counts and string lengths. It also includes measurements for small, medium, and large numbers of regex instructions, because extract performance is affected by the number of instructions in the regex pattern.

Authors:
  - David (@davidwendt)

Approvers:
  - Keith Kraus (@kkraus14)
  - Karthikeyan (@karthikeyann)
  - Mark Harris (@harrism)

URL: rapidsai#7522
Labels
3 - Ready for Review Ready for review by team CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
4 participants