Cleanup libcudf strings regex classes #10573

Merged: 16 commits, Apr 14, 2022

davidwendt
Contributor

Refactors some of the internal libcudf regex classes used for executing regex on strings. This is the first part of a set of changes to reduce the kernel launch memory size for the regex code. A follow-on PR will change the stack-based state management to a device memory approach. The changes here are isolated to ease the review process in the next PR. Mostly, code has been moved or refactored, along with general cleanup such as adding `const` qualifiers and removing some unnecessary pass-by-reference/pointer parameters.

None of the calling routines currently require changes and no behavior has changed.

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. strings strings issues (C++ and Python) improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 1, 2022
@davidwendt davidwendt self-assigned this Apr 1, 2022

codecov bot commented Apr 4, 2022

Codecov Report

Merging #10573 (cc9eee4) into branch-22.06 (bf4ffc9) will increase coverage by 0.02%.
The diff coverage is n/a.

❗ Current head cc9eee4 differs from pull request most recent head 3ee2f7b. Consider uploading reports for the commit 3ee2f7b to get more accurate results

@@               Coverage Diff                @@
##           branch-22.06   #10573      +/-   ##
================================================
+ Coverage         86.33%   86.36%   +0.02%     
================================================
  Files               140      140              
  Lines             22289    22289              
================================================
+ Hits              19244    19249       +5     
+ Misses             3045     3040       -5     
| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/core/column/numerical.py | 95.88% <0.00%> (-0.30%) | ⬇️ |
| python/cudf/cudf/core/column/string.py | 89.10% <0.00%> (+0.12%) | ⬆️ |
| python/cudf/cudf/core/groupby/groupby.py | 91.72% <0.00%> (+0.22%) | ⬆️ |
| python/cudf/cudf/core/tools/datetimes.py | 84.49% <0.00%> (+0.30%) | ⬆️ |
| python/cudf/cudf/core/column/lists.py | 92.70% <0.00%> (+1.28%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Apr 4, 2022
@davidwendt davidwendt marked this pull request as ready for review April 4, 2022 17:12
@davidwendt davidwendt requested a review from a team as a code owner April 4, 2022 17:12
@davidwendt davidwendt requested a review from jrhemstad April 7, 2022 18:12
@davidwendt davidwendt requested a review from devavret April 12, 2022 11:33
@davidwendt
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit ac27757 into rapidsai:branch-22.06 Apr 14, 2022
@davidwendt davidwendt deleted the regex-classes-cleanup branch April 14, 2022 12:10
rapids-bot bot pushed a commit that referenced this pull request May 6, 2022
All libcudf strings regex calls will now use global device memory for state data when evaluating regex on strings. Previously, separate templated kernels stored state data in fixed-size stack memory, with the kernel selected by the number of instructions resolved from the provided regex pattern. This required the CUDA driver to allocate a large amount of device memory when launching the kernel. That memory is managed by the launcher in the driver and so is not under the control of RMM.
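The templated, stack-based scheme described above can be pictured with a minimal host-side C++ sketch. This is not the libcudf code: the tier constants, `run_with_fixed_state`, and `dispatch_by_inst_count` are hypothetical names, and the thresholds are invented for illustration. The only detail taken from this PR is that there were 4 templated variants selected by instruction count.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical capacity tiers; the real thresholds are not stated in this PR.
constexpr std::size_t tier_small  = 16;
constexpr std::size_t tier_medium = 64;
constexpr std::size_t tier_large  = 256;
constexpr std::size_t tier_huge   = 1024;

// Stand-in for one templated kernel: the state array has a compile-time size,
// so each instantiation reserves MaxInsts slots even when fewer are needed.
template <std::size_t MaxInsts>
std::size_t run_with_fixed_state(std::size_t num_insts)
{
  assert(num_insts <= MaxInsts);  // the pattern must fit the chosen tier
  std::array<std::int32_t, MaxInsts> state{};
  for (std::size_t i = 0; i < num_insts; ++i) { state[i] = 1; }  // placeholder work
  return state.size();  // slots reserved, not slots used
}

// Picks the smallest instantiation that can hold the pattern's instructions.
// Four instantiations means four separately compiled code paths.
std::size_t dispatch_by_inst_count(std::size_t num_insts)
{
  if (num_insts <= tier_small) { return run_with_fixed_state<tier_small>(num_insts); }
  if (num_insts <= tier_medium) { return run_with_fixed_state<tier_medium>(num_insts); }
  if (num_insts <= tier_large) { return run_with_fixed_state<tier_large>(num_insts); }
  return run_with_fixed_state<tier_huge>(num_insts);
}
```

In this sketch a pattern with 17 instructions already lands in the 64-slot tier, which hints at why fixed-size tiers tend to over-reserve memory per thread.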

This has been changed to use global device memory, allocated through a memory resource, to hold and manage the state data per string per instruction. This is an internal change only and results in no behavior changes. Overall, performance on the current benchmarks has not changed, though much more memory may be required to execute any of the regex APIs, depending on the number of instructions in the pattern and the total number of strings in the column.
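To make the "per string per instruction" sizing concrete, here is a rough host-side sketch. It assumes one fixed-size state element per (string, instruction) pair; the element type `state_t`, the function `working_memory_bytes`, and the `state_buffer` struct are hypothetical, and a `std::vector` stands in for a device allocation obtained from a memory resource.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: one 4-byte state element per (string, instruction) pair.
using state_t = std::int32_t;

// Total bytes needed grows with both the row count and the pattern size.
std::size_t working_memory_bytes(std::size_t num_strings, std::size_t num_insts)
{
  return num_strings * num_insts * sizeof(state_t);
}

// Host stand-in for a single global buffer shared by all strings.
struct state_buffer {
  std::vector<state_t> data;     // would be device memory in the real code
  std::size_t insts_per_string;  // width of each string's slice

  state_buffer(std::size_t num_strings, std::size_t num_insts)
    : data(num_strings * num_insts), insts_per_string(num_insts)
  {
  }

  // Each string works only within its own contiguous slice of the buffer.
  state_t* slice(std::size_t string_idx)
  {
    return data.data() + string_idx * insts_per_string;
  }
};
```

Under these assumptions, a column of 1,000,000 strings and a pattern with 50 instructions would need 200,000,000 bytes of state, which illustrates why memory use can grow sharply with long patterns or large columns.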

Every effort has been made to not reduce performance from the stack-based approach. Additional optimizations here include copying the `reprog_device` class data to shared memory (when it fits). Further optimizations are expected in later PRs as well.
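The "when it fits" decision can be sketched on the host as a simple capacity check with a fallback. This is only an illustration of the idea, not how `reprog_device` is implemented: `staged_program`, `stage_program`, and the byte-buffer representation are hypothetical, and a plain array stands in for CUDA shared memory.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Where the program data ended up after staging.
struct staged_program {
  const unsigned char* data;  // pointer the matcher should read from
  bool in_fast_memory;        // true if the copy into the fast buffer happened
};

// Copies the program into the small, fast buffer only if it fits;
// otherwise keeps reading it from its original (slower, larger) location.
staged_program stage_program(const std::vector<unsigned char>& global_prog,
                             unsigned char* fast_mem,
                             std::size_t fast_capacity)
{
  if (global_prog.size() <= fast_capacity) {
    std::copy(global_prog.begin(), global_prog.end(), fast_mem);
    return {fast_mem, true};
  }
  return {global_prog.data(), false};
}
```

The fallback branch matters because the shared memory available per block is limited, so a large compiled pattern has to stay in global memory.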

Overall, the files that use regex also compile faster, since only a single kernel is generated instead of the 4 in the templated, stack-based implementation.

This PR is dependent on PR #10573.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Jake Hemstad (https://github.com/jrhemstad)

URL: #10600