Add JNI for strings::split_re and strings::split_record_re #10139
Don't you need Java-side changes and Java tests as well here? Typically Java bindings and JNI bindings go hand in hand. This is in draft mode, so I guess you might still have them brewing on your branch, but I wanted to put this out just in case.
Sure, I'll add them. This is still a draft WIP :)
# Conflicts: # cpp/include/cudf/strings/split/split_re.hpp
Looks good, just a few nits that are not required
@gpucibot merge
This PR adds Java bindings for the new strings APIs strings::split_re and strings::split_record_re, which allow splitting strings by regular expression delimiters.
In addition, the Java string split overloads with the default split pattern (an empty string) are removed in this PR. That is because, with the default empty pattern, Java's split API produces different results than cudf.
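To illustrate the Java side of that mismatch, here is a minimal sketch using only the JDK's String.split (Java 8 or later is assumed). The class name SplitDemo and its helper methods are hypothetical and are not part of the cudf Java bindings; cudf's own output for an empty pattern is not reproduced here, since the description only states that it differs.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of cudf: shows how the JDK itself
// treats an empty split pattern and a regular expression delimiter.
public class SplitDemo {
    // An empty pattern makes String.split break the input between every character.
    public static String[] splitOnEmpty(String s) {
        return s.split("");
    }

    // Split on a regular expression delimiter; limit follows String.split(regex, limit):
    // a negative limit keeps trailing empty strings, a positive limit caps the piece count.
    public static String[] splitOnRegex(String s, String regex, int limit) {
        return s.split(regex, limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnEmpty("a b")));               // [a,  , b]
        System.out.println(Arrays.toString(splitOnRegex("a1b22c", "\\d+", -1))); // [a, b, c]
        System.out.println(Arrays.toString(splitOnRegex("a1b22c", "\\d+", 2)));  // [a, b22c]
    }
}
```

With an empty pattern the JDK returns one piece per character, including the space. A caller who relied on the removed default-pattern overloads would therefore have to pass an explicit pattern instead.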
Finally, some cleanup has been performed automatically by the IntelliJ IDE.
Depends on #10128.
This is a breaking change, which is fixed by NVIDIA/spark-rapids#4714. Thus, it should be merged at the same time as NVIDIA/spark-rapids#4714.