Add chars-tokenizer to nvtext tokenize_benchmark.cpp #8125

Merged


davidwendt
Contributor

This PR adds a benchmark to the current tokenize_benchmark.cpp to measure the nvtext::character_tokenize API.

PR #8085 added code for using the nvtext::character_tokenize function.
The benchmark was also useful while investigating #8094.
Also found and removed an unused variable in the code logic.
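
For readers unfamiliar with the operation being benchmarked: character tokenization splits each input string into one output entry per character. The block below is only a host-side sketch of that behavior with a simple wall-clock timing loop. It is not the libcudf implementation and not the benchmark code from this PR; the names `character_tokenize_host` and `time_character_tokenize` are hypothetical, and the input is assumed to be valid UTF-8.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical host-side reference, NOT the libcudf implementation:
// emit one output string per UTF-8 character of every input string.
// Assumes the input is valid UTF-8.
std::vector<std::string> character_tokenize_host(std::vector<std::string> const& input)
{
  std::vector<std::string> tokens;
  for (auto const& str : input) {
    std::size_t pos = 0;
    while (pos < str.size()) {
      auto const lead = static_cast<unsigned char>(str[pos]);
      // The lead byte determines how many bytes the character occupies.
      std::size_t len = 1;
      if ((lead & 0xE0) == 0xC0) {
        len = 2;
      } else if ((lead & 0xF0) == 0xE0) {
        len = 3;
      } else if ((lead & 0xF8) == 0xF0) {
        len = 4;
      }
      tokens.push_back(str.substr(pos, len));
      pos += len;
    }
  }
  return tokens;
}

// Hypothetical timing helper: average seconds per call over `iterations` runs.
double time_character_tokenize(std::vector<std::string> const& input, int iterations)
{
  std::size_t total_tokens = 0;  // accumulate the output size so the result is used
  auto const start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    total_tokens += character_tokenize_host(input).size();
  }
  auto const stop = std::chrono::steady_clock::now();
  (void)total_tokens;
  return std::chrono::duration<double>(stop - start).count() / iterations;
}
```

With this sketch, the strings `"ab"`, `""`, and `"hé"` produce four tokens: `a`, `b`, `h`, and `é`. The empty string contributes nothing.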

@davidwendt added the labels 3 - Ready for Review, libcudf, improvement, and non-breaking on Apr 30, 2021
@davidwendt self-assigned this on Apr 30, 2021
@davidwendt requested a review from a team as a code owner on April 30, 2021 17:27

codecov bot commented Apr 30, 2021

Codecov Report

Merging #8125 (10537f7) into branch-0.20 (51336df) will decrease coverage by 0.36%.
The diff coverage is 90.90%.

❗ Current head 10537f7 differs from pull request most recent head c6ca384. Consider uploading reports for the commit c6ca384 to get more accurate results

@@               Coverage Diff               @@
##           branch-0.20    #8125      +/-   ##
===============================================
- Coverage        82.88%   82.52%   -0.37%     
===============================================
  Files              103      103              
  Lines            17668    17500     -168     
===============================================
- Hits             14645    14441     -204     
- Misses            3023     3059      +36     

| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/core/column/__init__.py | 100.00% <ø> (ø) |
| python/cudf/cudf/core/groupby/groupby.py | 91.39% <ø> (-0.06%) ⬇️ |
| python/cudf/cudf/core/index.py | 92.62% <ø> (-0.46%) ⬇️ |
| python/cudf/cudf/core/indexing.py | 96.50% <ø> (+0.21%) ⬆️ |
| python/cudf/cudf/core/reshape.py | 91.08% <ø> (-0.03%) ⬇️ |
| python/cudf/cudf/core/scalar.py | 87.87% <ø> (+0.13%) ⬆️ |
| python/cudf/cudf/core/series.py | 91.40% <ø> (-0.32%) ⬇️ |
| python/cudf/cudf/core/tools/datetimes.py | 80.17% <ø> (-4.36%) ⬇️ |
| python/cudf/cudf/core/window/rolling.py | 88.88% <ø> (+0.65%) ⬆️ |
| python/cudf/cudf/io/orc.py | 86.80% <ø> (-0.10%) ⬇️ |

... and 71 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f686c01...c6ca384.

Member

harrism commented May 6, 2021

@gpucibot merge

@rapids-bot (bot) merged commit 611cabd into rapidsai:branch-0.20 on May 6, 2021
@davidwendt deleted the chars-tokenize-benchmark branch on May 6, 2021 11:27
Labels
3 - Ready for Review: Ready for review by team
improvement: Improvement / enhancement to an existing function
libcudf: Affects libcudf (C++/CUDA) code.
non-breaking: Non-breaking change
4 participants