Performance improvement in cudf::strings::all_characters_of_type #13259
Here is the diff from the
Looks good.
Just one small question: does the performance improvement come from bypassing the continuation char?
Nice speedups.
LGTM 👍
That's basically the case here. Overall, this changes from character-based iteration to byte-based iteration, which is a bit faster here because it limits character counting. Iterating by characters is not always necessary, but it is also not always slower.
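To make the character-based versus byte-based distinction concrete, here is a minimal host-side sketch. It is not the code from this PR: the helper names (`is_continuation_byte`, `bytes_in_char`, `all_digits_by_char`, `all_digits_by_byte`) are hypothetical, the only "type" checked is ASCII digits, and the input is assumed to be valid UTF-8. The character-based version decodes every code point before classifying it. The byte-based version walks the bytes once and skips UTF-8 continuation bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: true if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
constexpr bool is_continuation_byte(std::uint8_t b) { return (b & 0xC0) == 0x80; }

// Hypothetical helper: byte width of the character whose lead byte is b.
// Assumes b is a valid UTF-8 lead byte.
constexpr int bytes_in_char(std::uint8_t b)
{
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  return 4;
}

// Character-based sketch: decode each code point, then classify it.
bool all_digits_by_char(std::string const& s)
{
  std::size_t pos = 0;
  while (pos < s.size()) {
    auto const lead  = static_cast<unsigned char>(s[pos]);
    int const width  = bytes_in_char(lead);
    if (pos + width > s.size()) return false;  // truncated character
    // Strip the length marker bits from the lead byte, then append 6 bits
    // from each continuation byte.
    char32_t cp = (width == 1) ? lead : (lead & (0xFF >> (width + 1)));
    for (int i = 1; i < width; ++i) {
      cp = (cp << 6) | (static_cast<unsigned char>(s[pos + i]) & 0x3F);
    }
    if (cp < U'0' || cp > U'9') return false;
    pos += width;
  }
  return true;
}

// Byte-based sketch: a single pass over the bytes. Continuation bytes are
// skipped, and every other byte is classified directly. A multi-byte lead
// byte is always >= 0xC0, so it can never be mistaken for an ASCII digit.
bool all_digits_by_byte(std::string const& s)
{
  for (unsigned char b : s) {
    if (is_continuation_byte(b)) continue;
    if (b < '0' || b > '9') return false;
  }
  return true;
}
```

Both functions return the same answer for valid UTF-8 input. The byte-based one just does less work per character, which is the general idea described in the comment above. In this sketch an empty string yields `true`; that choice is for simplicity only and says nothing about how cudf treats empty strings.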
/merge
Description
Improves performance for the `cudf::strings::all_characters_of_type()` API, which covers many cudf `is_X` functions. The solution improves performance for all string lengths, as measured by the new benchmark included in this PR. Additionally, the code was cleaned up to help with maintenance and clarity.
Reference #13048
Checklist