Performance improvement for nvtext tokenize/token functions #13480
Merged: rapids-bot merged 31 commits into rapidsai:branch-23.08 from davidwendt:nvtext-perf-tokenize on Jun 29, 2023
Commits (31, all by davidwendt):
63bde3b Performance improvement for nvtext tokenize for long strings
6a6f668 Merge branch 'branch-23.08' into nvtext-perf-tokenize
1f99968 fix perf of normalize spaces
bd2c072 Merge branch 'branch-23.08' into nvtext-perf-tokenize
1dbe89a Merge branch 'branch-23.08' into nvtext-perf-tokenize
b06a871 Merge branch 'branch-23.08' into nvtext-perf-tokenize
764f154 Merge branch 'branch-23.08' into nvtext-perf-tokenize
05a9c55 update comments, style; name const value
d743154 Merge branch 'branch-23.08' into nvtext-perf-tokenize
a7d8826 Merge branch 'branch-23.08' into nvtext-perf-tokenize
408ad51 Merge branch 'branch-23.08' into nvtext-perf-tokenize
a68711d Merge branch 'branch-23.08' into nvtext-perf-tokenize
031b666 Merge branch 'branch-23.08' into nvtext-perf-tokenize
d023e40 Merge branch 'branch-23.08' into nvtext-perf-tokenize
51954c1 remove unneeded prefetch logic from normalize spaces
7c89985 Merge branch 'branch-23.08' into nvtext-perf-tokenize
50e75f1 Merge branch 'branch-23.08' into nvtext-perf-tokenize
d59e695 Merge branch 'branch-23.08' into nvtext-perf-tokenize
09b67cf Merge branch 'nvtext-perf-tokenize' of github.com:davidwendt/cudf int…
541440f Merge branch 'branch-23.08' into nvtext-perf-tokenize
e1e6709 Merge branch 'branch-23.08' into nvtext-perf-tokenize
12b75c5 Merge branch 'nvtext-perf-tokenize' of github.com:davidwendt/cudf int…
c5357c6 Merge branch 'branch-23.08' into nvtext-perf-tokenize
241d7d3 Merge branch 'branch-23.08' into nvtext-perf-tokenize
75a8c39 Merge branch 'branch-23.08' into nvtext-perf-tokenize
bf03f1a Merge branch 'nvtext-perf-tokenize' of github.com:davidwendt/cudf int…
58701e7 Merge branch 'branch-23.08' into nvtext-perf-tokenize
946ca1c Merge branch 'branch-23.08' into nvtext-perf-tokenize
95787ff fix next-token boundary check
4bdac42 Merge branch 'branch-23.08' into nvtext-perf-tokenize
48288fe Merge branch 'branch-23.08' into nvtext-perf-tokenize
Have we deprecated offset_type entirely? Has it been removed?
Not yet. I can look into doing that in a separate PR.