Add `regex_program` java APIs and unit tests #12548
Signed-off-by: Cindy Jiang <[email protected]>
Pull requests from external contributors require approval from a
@cindyyuanjiang: I've added
Signed-off-by: Cindy Jiang <[email protected]>
…g/cudf into regex-program-cudf
A couple of minor nitpicks to start.
I have yet to go over the tests.
A couple more minor nitpicks.
Generally, LGTM. A couple of very minor nitpicks remain.
Signed-off-by: Cindy Jiang <[email protected]>
public CaptureGroups capture() {
  return capture;
}
}
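For readers without the diff at hand, the getter above sits on a small value class that pairs a regex pattern with its options. The following is a minimal, self-contained sketch of that shape only. The class name `RegexProgramSketch`, the enum constants, the `pattern()` accessor, and the constructor overloads are assumptions made for illustration and are not the code added by this PR.

```java
import java.util.Objects;

// Hypothetical stand-in for the capture-group setting; constant names are assumed.
enum CaptureGroups {
  EXTRACT,     // groups in the pattern are captured
  NON_CAPTURE  // groups in the pattern are treated as non-capturing
}

// Illustrative immutable holder for a pattern and its capture setting.
final class RegexProgramSketch {
  private final String pattern;
  private final CaptureGroups capture;

  // Assumed default: capture groups are extracted unless stated otherwise.
  RegexProgramSketch(String pattern) {
    this(pattern, CaptureGroups.EXTRACT);
  }

  RegexProgramSketch(String pattern, CaptureGroups capture) {
    this.pattern = Objects.requireNonNull(pattern, "pattern");
    this.capture = Objects.requireNonNull(capture, "capture");
  }

  String pattern() {
    return pattern;
  }

  CaptureGroups capture() {
    return capture;
  }
}
```

Keeping the fields final means one instance can be shared across calls without defensive copies, which is one plausible reason to expose plain getters like the one reviewed here.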
nit: can we add in native APIs to allow us to call the following APIs.
cudf/cpp/include/cudf/strings/regex/regex_program.hpp
Lines 93 to 113 in 22087b3
/**
 * @brief Return the number of instructions in this instance
 *
 * @return Number of instructions
 */
int32_t instructions_count() const;
/**
 * @brief Return the number of capture groups in this instance
 *
 * @return Number of groups
 */
int32_t groups_count() const;
/**
 * @brief Return the pattern used to create this instance
 *
 * @param num_strings Number of strings for computation
 * @return Size of the working memory in bytes
 */
std::size_t compute_working_memory_size(int32_t num_strings) const;
I am fine if it is a follow on issue that we do later.
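As a rough analogy for what a Java-side `groups_count()` would report, the sketch below counts capture groups with the JDK's own `java.util.regex` engine. It does not call libcudf and says nothing about how the JNI binding would be written. The class and method names are hypothetical, and the two engines may not agree on every pattern.

```java
import java.util.regex.Pattern;

final class GroupCountSketch {
  // Counts capturing groups with the JDK regex engine (an assumption-level
  // analogy; this is not the libcudf implementation).
  static int groupsCount(String pattern) {
    return Pattern.compile(pattern).matcher("").groupCount();
  }

  public static void main(String[] args) {
    // Two capturing groups.
    System.out.println(groupsCount("(\\d+)-(\\d+)"));
    // One capturing group; (?:...) is non-capturing.
    System.out.println(groupsCount("(?:\\d+)-(\\d+)"));
  }
}
```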
I can definitely add them in a follow up issue. I find the current PR a bit large to track things easily. Thank you for the comment!
Signed-off-by: Cindy Jiang <[email protected]>
Signed-off-by: Cindy Jiang <[email protected]>
Signed-off-by: Cindy Jiang <[email protected]>
Signed-off-by: Cindy Jiang <[email protected]>
…g/cudf into regex-program-cudf
Codecov Report
Base: 86.58% // Head: 85.73% // Decreases project coverage by
Additional details and impacted files
@@ Coverage Diff @@
## branch-23.02 #12548 +/- ##
================================================
- Coverage 86.58% 85.73% -0.85%
================================================
Files 155 155
Lines 24368 24889 +521
================================================
+ Hits 21098 21339 +241
- Misses 3270 3550 +280
/ok to test
The test failures appear to be totally unrelated. It failed in a boolean segmented sort test. Not sure if you just need to upmerge or what.
rerun tests
hmm the same tests failed again. I wonder if others are seeing these same failures or not.
@davidwendt it looks like
@revans2, the You can check out the docs below for information on how to rerun tests in GH Actions
Yes, this was a cmake error that is fixed by rapidsai/rapids-cmake#353
@cindyyuanjiang your branch is still out of date. Please upmerge again. Also, because this is deprecating some APIs that the spark plugin uses, we are going to need to put up a PR in the plugin that adds the warnings to an allow list so we don't break the build. You will need to add an annotation similar to
You can put this on each of the methods that call one of the deprecated methods. Sorry, I know this is a pain.
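The annotation the comment above refers to is not preserved on this page. Purely to illustrate per-method suppression, the sketch below uses Java's standard `@SuppressWarnings("deprecation")`. The plugin may use a different mechanism, and `LegacyStrings`, `PluginCaller`, and `matchesRe` are hypothetical names, not cudf or plugin APIs.

```java
import java.util.regex.Pattern;

// Hypothetical library class with a deprecated overload and its replacement.
final class LegacyStrings {
  /** @deprecated use {@link #matchesRe(String, Pattern)} instead. */
  @Deprecated
  static boolean matchesRe(String input, String pattern) {
    return matchesRe(input, Pattern.compile(pattern));
  }

  // Replacement overload that takes a precompiled program.
  static boolean matchesRe(String input, Pattern program) {
    return program.matcher(input).matches();
  }
}

// Hypothetical caller that has not migrated yet.
final class PluginCaller {
  // Silences the deprecation warning for this one method only, so other
  // uses of deprecated APIs elsewhere in the class are still reported.
  @SuppressWarnings("deprecation")
  static boolean stillUsesOldApi(String input) {
    return LegacyStrings.matchesRe(input, "a+b");
  }
}
```

Scoping the annotation to individual methods, as the comment suggests, keeps the allow list narrow and makes the remaining call sites easy to find when the deprecated overloads are removed.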
@revans2, a fully up-to-date PR branch is not required for a PR to merge. If a PR has all of the required reviews and all of the required GitHub checks are passing, then it can be merged by someone with The Update Branch button is a GitHub feature that we enabled (src) to complement the new Recently Updated check shown in the screenshot below. You can read about the Recently Updated check here: https://docs.rapids.ai/resources/recently-updated/ The Update Branch button and the Recently Updated check are separate but complementary features. At this time, the Recently Updated check is not required, but that may change in the future.
@ajschmidt8 you are right. I just got confused when I saw But it is not a blocker....
/merge
@revans2 Thank you very much! I will follow up with the spark plugin side.
@cindyyuanjiang it's OK I have a patch for the warnings almost ready to go. I realized I hit merge a bit too soon on this so I figured I would take the hit and make the patch
#12548 introduced a number of GPU resource leaks in ColumnVectorTest. This cleans them up by wrapping them in `try` blocks.

Authors:
- Jason Lowe (https://github.com/jlowe)

Approvers:
- Robert (Bobby) Evans (https://github.com/revans2)
- Jim Brennan (https://github.com/jbrennan333)
- Nghia Truong (https://github.com/ttnghia)

URL: #12625
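The kind of leak this commit message describes, and the `try` pattern that avoids it, can be sketched without a GPU. The example assumes the leaked objects implement `AutoCloseable`. `FakeColumn`, its `upper()` method, and the `OPEN` tracking list are hypothetical stand-ins invented for the illustration, not cudf classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a GPU-backed object that must be closed.
final class FakeColumn implements AutoCloseable {
  // Names of instances that are currently open; used only to observe leaks.
  static final List<String> OPEN = new ArrayList<>();
  private final String name;

  FakeColumn(String name) {
    this.name = name;
    OPEN.add(name);
  }

  // Returns a new resource that the caller owns and must close.
  FakeColumn upper() {
    return new FakeColumn(name + ".upper");
  }

  @Override
  public void close() {
    OPEN.remove(name);
  }
}

final class LeakDemo {
  // Leaks the intermediate result: nothing ever closes it.
  static void leaky() {
    FakeColumn input = new FakeColumn("leaky");
    input.upper();
    input.close();
  }

  // try-with-resources closes both objects, in reverse order of declaration,
  // even if the body throws.
  static void safe() {
    try (FakeColumn input = new FakeColumn("safe");
         FakeColumn result = input.upper()) {
      // test assertions on result would go here
    }
  }
}
```

After `LeakDemo.safe()` the tracking list is empty, while `LeakDemo.leaky()` leaves the intermediate result open. That is the situation a test would have to avoid for every value it creates.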
This reverts commit 20c945b.
* Revert "Fix leaks in ColumnVectorTest (#12625)"
  This reverts commit fb17ac7.
* Revert "Add `regex_program` java APIs and unit tests (#12548)"
  This reverts commit 20c945b.
---------
Co-authored-by: Cindy Jiang <[email protected]>
Co-authored-by: GALI PREM SAGAR <[email protected]>
Description
Adds a set of Java regex APIs that take a `regex_program` as a parameter, along with Java unit tests. This is part of the solution for NVIDIA/spark-rapids#7295.