Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add regex_program searching APIs and related java classes #12666

Merged
merged 12 commits into from
Feb 3, 2023

Conversation

cindyyuanjiang
Copy link
Contributor

@cindyyuanjiang cindyyuanjiang commented Jan 31, 2023

Signed-off-by: Cindy Jiang [email protected]

Description

This PR adds contains_re, matches_re, count_re, findall related regex_program java APIs and unit tests, and RegexProgram, RegexFlag, and CaptureGroups java classes to support these regex_program APIs.

Part of work for NVIDIA/spark-rapids#7295.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the Java Affects Java cuDF API. label Jan 31, 2023
@cindyyuanjiang cindyyuanjiang marked this pull request as ready for review January 31, 2023 22:04
@cindyyuanjiang cindyyuanjiang requested a review from a team as a code owner January 31, 2023 22:04
@cindyyuanjiang cindyyuanjiang added feature request New feature or request non-breaking Non-breaking change labels Feb 1, 2023
@codecov
Copy link

codecov bot commented Feb 1, 2023

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.04@3aab6b8). Click here to learn what that means.
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.04   #12666   +/-   ##
===============================================
  Coverage                ?   85.81%           
===============================================
  Files                   ?      158           
  Lines                   ?    25151           
  Branches                ?        0           
===============================================
  Hits                    ?    21584           
  Misses                  ?     3567           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new classes seem OK, but normally we do not add classes and APIs without the ability for the user to meaningfully wield it. There are no tests of these new APIs to prove they work and are useful, and there's nothing that can be meaningfully done with them as-is.

Should we at least add this API to some Table or ColumnView method to both provide a test surface for it and a meaningful way to start using it? We wouldn't have to put the new class in all potential places in one PR, but not putting it anywhere seems a bit less than ideal.

@davidwendt
Copy link
Contributor

Is there a way to test this through the Spark test suite before merging so we can catch any potential errors like those found during the end of the last release?

@cindyyuanjiang
Copy link
Contributor Author

The new classes seem OK, but normally we do not add classes and APIs without the ability for the user to meaningfully wield it. There are no tests of these new APIs to prove they work and are useful, and there's nothing that can be meaningfully done with them as-is.

Yes, thank you. I will update to include some usage of the added classes.

@cindyyuanjiang
Copy link
Contributor Author

Is there a way to test this through the Spark test suite before merging so we can catch any potential errors like those found during the end of the last release?

Yes, I will test the changes in the Spark plugin to avoid issues encountered before. Thanks!

@cindyyuanjiang cindyyuanjiang changed the title Add regex program related java classes Add regex_program searching APIs and related java classes Feb 2, 2023
Signed-off-by: Cindy Jiang <[email protected]>
Copy link
Contributor

@msadang msadang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@msadang msadang merged commit 17554ad into rapidsai:branch-23.04 Feb 3, 2023
@cindyyuanjiang cindyyuanjiang deleted the regex-program-classes branch February 3, 2023 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request Java Affects Java cuDF API. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants