Add experimental make_strings_children utility #15363
Updates the `replace_re()` and `replace_with_backrefs()` internal logic to support large strings. These functions use a regex-specific version of make-strings-children. Depends on #15363
Authors:
- David Wendt (https://github.com/davidwendt)
Approvers:
- Bradley Dice (https://github.com/bdice)
- Vukasin Milovanovic (https://github.com/vuule)
URL: #15524
The functor is passed by value. The members stored in it won't be passed back to the caller, so the scope of the caller's functor object won't matter. Do we need to change the functor members to functor arguments?
You are right. I had forgotten that the functor is passed by value, so there are 2 independent copies and therefore really 2 scopes for the data. I'm happy to change it back to member variables unless there is a more compelling reason for keeping them as parameters. Requiring a base class does not really help much here, in my opinion, since it would hide the members in a separate header file even though they are directly and frequently referenced by the functor logic. I'd like to keep them close to the logic that uses them. Also, I want to preserve the raw kernel introduced here for calling the functors. This allows functors to legally use shared memory, warp and block intrinsics, etc., which are considered UB in thrust lambdas/functors.
@davidwendt I’m not able to re-review until next week but please feel free to move forward with your plan described above. I proposed the alternative structure but I’m happy with your reasoning for rejecting that proposal.
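The by-value point in this exchange can be illustrated with a small host-side C++ sketch. It is not cudf code: `size_fn`, `for_each_row`, and `caller_visible_calls` are hypothetical names, and a plain loop stands in for the raw kernel that calls the functor. It only shows that a plain data member changed inside the copy never reaches the caller, while writes made through a pointer member do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functor (an assumption for illustration, not the cudf functor).
struct size_fn {
  int calls;     // plain member: each copy of the functor has its own
  int* d_sizes;  // pointer member: every copy points at the caller's buffer

  void operator()(std::size_t idx)
  {
    ++calls;                                    // changes this copy only
    d_sizes[idx] = static_cast<int>(idx) + 1;   // visible to the caller
  }
};

// Stand-in for a launcher that receives the functor by value.
template <typename Fn>
void for_each_row(std::size_t rows, Fn fn)
{
  for (std::size_t idx = 0; idx < rows; ++idx) { fn(idx); }
}

// Returns the call count as seen by the caller after the launch.
int caller_visible_calls(std::vector<int>& sizes)
{
  size_fn fn{0, sizes.data()};
  for_each_row(sizes.size(), fn);  // fn is copied here
  return fn.calls;                 // the caller's copy was never invoked
}
```

Under these assumptions, the caller's `calls` stays at its initial value while `sizes` is filled in, which is why output that must survive the call has to go through pointers (or be returned) rather than live in plain functor members.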
A clarification on the docs -
Looks good to me!
LGTM!
/merge
Description
Adds new `cudf::strings::detail::experimental::make_strings_children` which uses the offsetalator to build output columns. The current `d_offsets` member required by the given functors no longer stores sizes and offsets but is now split into `d_sizes` and `d_offsets`, where `d_sizes` is computed in the first pass and then `d_offsets` is set to an offsetalator for building output in `d_chars`.

Once all the uses of `make_strings_children` (~50 or so) are converted to use the experimental implementation, this will replace the old implementation and the 'experimental' namespace will be removed.

This PR includes 2 changes, `repeat_strings` and `concatenate` (per row), since each uses a different overloaded `make_strings_children` function, to verify the code does not break any current tests.

Checklist
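The two-pass scheme in the description above (sizes first, then offsets, then characters) can be sketched in plain host-side C++. This is a rough illustration under stated assumptions, not the cudf implementation: `repeat_fn`, `strings_children`, and `make_children_sketch` are hypothetical names, a plain 64-bit offsets array stands in for the offsetalator, and simple loops stand in for the kernel launches. Only the member names `d_sizes`, `d_offsets`, and `d_chars` come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical functor that repeats each input string `count` times.
struct repeat_fn {
  const std::vector<std::string>* input;
  int count;
  int* d_sizes;                // pass 1 output: bytes needed per row
  const long long* d_offsets;  // pass 2 input: start of each row in d_chars
  char* d_chars;               // null during pass 1, output buffer in pass 2

  void operator()(std::size_t idx) const
  {
    const std::string& s = (*input)[idx];
    if (d_chars == nullptr) {  // sizing pass
      d_sizes[idx] = static_cast<int>(s.size()) * count;
      return;
    }
    char* out = d_chars + d_offsets[idx];  // writing pass
    for (int i = 0; i < count; ++i) {
      std::memcpy(out, s.data(), s.size());
      out += s.size();
    }
  }
};

struct strings_children {
  std::vector<long long> offsets;  // rows + 1 entries
  std::vector<char> chars;
};

strings_children make_children_sketch(const std::vector<std::string>& input, int count)
{
  const std::size_t rows = input.size();
  std::vector<int> sizes(rows, 0);
  strings_children result;
  result.offsets.assign(rows + 1, 0);

  repeat_fn fn{&input, count, sizes.data(), nullptr, nullptr};
  for (std::size_t idx = 0; idx < rows; ++idx) { fn(idx); }  // pass 1: sizes

  // Running sum of sizes gives offsets; the last entry is the total byte count.
  for (std::size_t i = 0; i < rows; ++i) {
    result.offsets[i + 1] = result.offsets[i] + sizes[i];
  }

  result.chars.resize(static_cast<std::size_t>(result.offsets[rows]));
  fn.d_offsets = result.offsets.data();
  fn.d_chars   = result.chars.data();  // may be null if the output is empty;
                                       // pass 2 then just rewrites zero sizes
  for (std::size_t idx = 0; idx < rows; ++idx) { fn(idx); }  // pass 2: chars
  return result;
}
```

Keeping sizes in a narrow per-row array and offsets in a separate wide array is the design choice the sketch tries to mirror: the per-row size can stay small even when the accumulated offsets need 64 bits.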