Fix regex quantifier check to include capture groups #11373

Merged
17 commits merged on Aug 4, 2022

Conversation

davidwendt
Contributor

Description

Adds regex compile logic to check that a quantifier can be applied to the previous item, even if that item is within a capture group.
This prevents an infinite loop from occurring when the expression is evaluated.
Additional gtests are included to verify that this condition throws an error.

Closes #11311
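
The check described above can be illustrated with a small, self-contained sketch. This is not the code in `regcomp.cpp`: the `item_kind` enum and the functions `scan`, `can_quantify_previous`, `quantifiers_ok`, and `check_quantifiers` are hypothetical names for a toy scanner. It assumes that anchors (`^`, `$`) are the zero-width items a quantifier must not repeat, and it does not model character classes, counted quantifiers, alternation, or non-capturing groups. The point it shows is that the "previous item" lookup has to step back through any closing parentheses to find what is really being repeated.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical item kinds for a toy regex scanner (not libcudf's internal types).
enum class item_kind { literal, anchor, group_open, group_close, quantifier };

// Split a pattern into coarse items. An escaped character counts as a literal.
std::vector<item_kind> scan(std::string const& pattern)
{
  std::vector<item_kind> items;
  for (std::size_t i = 0; i < pattern.size(); ++i) {
    switch (pattern[i]) {
      case '\\': ++i; items.push_back(item_kind::literal); break;
      case '^':
      case '$': items.push_back(item_kind::anchor); break;
      case '(': items.push_back(item_kind::group_open); break;
      case ')': items.push_back(item_kind::group_close); break;
      case '*':
      case '+':
      case '?': items.push_back(item_kind::quantifier); break;
      default: items.push_back(item_kind::literal); break;
    }
  }
  return items;
}

// Find the item a quantifier at `pos` applies to, looking through any
// closing parentheses, and report whether it may be repeated.
bool can_quantify_previous(std::vector<item_kind> const& items, std::size_t pos)
{
  std::size_t j = pos;
  while (j > 0 && items[j - 1] == item_kind::group_close) { --j; }
  if (j == 0) { return false; }  // nothing precedes the quantifier
  auto const prev = items[j - 1];
  // Anchors and an opening parenthesis match no characters, so repeating them is rejected.
  return prev == item_kind::literal || prev == item_kind::quantifier;
}

// Non-throwing form: true if every quantifier has a repeatable previous item.
bool quantifiers_ok(std::string const& pattern)
{
  auto const items = scan(pattern);
  for (std::size_t i = 0; i < items.size(); ++i) {
    if (items[i] == item_kind::quantifier && !can_quantify_previous(items, i)) { return false; }
  }
  return true;
}

// Throwing form: reject the pattern at compile time instead of looping at evaluation time.
void check_quantifiers(std::string const& pattern)
{
  if (!quantifiers_ok(pattern)) {
    throw std::invalid_argument("invalid regex pattern: nothing to repeat");
  }
}
```

In this sketch, `check_quantifiers("(^)*")` throws `std::invalid_argument` because the lookup steps past the `)` and finds the anchor, while `check_quantifiers("(a)*")` returns normally. A check that only inspected the immediately preceding item would see `)` in both cases and could not tell them apart.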

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt added the labels bug, 2 - In Progress, libcudf, strings, and non-breaking on Jul 27, 2022
@davidwendt self-assigned this on Jul 27, 2022

codecov bot commented Jul 27, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@9429099).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.10   #11373   +/-   ##
===============================================
  Coverage                ?   86.47%           
===============================================
  Files                   ?      144           
  Lines                   ?    22856           
  Branches                ?        0           
===============================================
  Hits                    ?    19765           
  Misses                  ?     3091           
  Partials                ?        0           


@davidwendt added the label 3 - Ready for Review and removed the label 2 - In Progress on Jul 29, 2022
@davidwendt marked this pull request as ready for review on July 29, 2022 14:54
@davidwendt requested a review from a team as a code owner on July 29, 2022 14:54
@davidwendt requested review from upsj and elstehle on July 29, 2022 14:54
cpp/src/strings/regex/regcomp.cpp: 3 review threads (outdated, resolved)
@davidwendt requested a review from upsj on August 3, 2022 14:54
Contributor

@elstehle elstehle left a comment

Just adding a minor comment. Will re-parse now with the updates suggested by @upsj.

cpp/src/strings/regex/regcomp.cpp: 1 review thread (outdated, resolved)
@davidwendt requested a review from elstehle on August 3, 2022 16:51
Contributor

@elstehle elstehle left a comment

LGTM. Thanks for getting me to learn another digestible bit from the regex universe 💡

@davidwendt
Contributor Author

@gpucibot merge

Labels
3 - Ready for Review, bug, libcudf, non-breaking, strings
Development

Successfully merging this pull request may close these issues.

[BUG] regexp: hanging when attempting to repeat string anchor inside capture group
3 participants