Expose stream parameter in public nvtext ngram APIs #14061

Merged: 15 commits into rapidsai:branch-23.10 from stream-nvtext-ngrams, Sep 22, 2023

Conversation

@davidwendt davidwendt commented Sep 7, 2023

Description

Add stream parameter to public APIs:

  • nvtext::generate_ngrams()
  • nvtext::generate_character_ngrams()
  • nvtext::hash_character_ngrams()
  • nvtext::ngrams_tokenize()
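As a rough illustration of the pattern, here is a host-only C++ sketch of word n-gram generation that takes a trailing stream argument. Everything in the `sketch` namespace is hypothetical: `stream_view` and `default_stream()` are stand-ins for a real CUDA stream handle (libcudf uses `rmm::cuda_stream_view`), the function operates on `std::vector<std::string>` rather than a cudf column, and skipping empty strings is an assumption of the sketch. It is not the libcudf implementation or its exact signature. The separator is a required argument here, in line with the review note that it is no longer optional.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a CUDA stream handle; the real API would take
// an rmm::cuda_stream_view instead.
struct stream_view {
  int id = 0;
};

// Hypothetical stand-in for "the default stream".
inline stream_view default_stream() { return stream_view{}; }

// Host-only model of word n-gram generation: every run of `ngrams` adjacent
// non-empty strings is joined with `separator`.
// The stream is the last parameter and is defaulted, so existing callers
// that do not care about streams keep compiling. It is unused here because
// nothing in this sketch runs on a device.
inline std::vector<std::string> generate_ngrams(std::vector<std::string> const& input,
                                                std::size_t ngrams,
                                                std::string const& separator,
                                                stream_view stream = default_stream())
{
  (void)stream;

  // Assumption of this sketch: empty strings do not contribute tokens.
  std::vector<std::string> tokens;
  for (auto const& s : input) {
    if (!s.empty()) { tokens.push_back(s); }
  }

  std::vector<std::string> output;
  if (ngrams == 0 || tokens.size() < ngrams) { return output; }

  // Slide a window of `ngrams` tokens across the input.
  for (std::size_t i = 0; i + ngrams <= tokens.size(); ++i) {
    std::string gram = tokens[i];
    for (std::size_t j = 1; j < ngrams; ++j) {
      gram += separator;
      gram += tokens[i + j];
    }
    output.push_back(gram);
  }
  return output;
}

}  // namespace sketch
```

With this sketch, `sketch::generate_ngrams({"a", "bb", "ccc"}, 2, "_")` produces `a_bb` and `bb_ccc`; passing an explicit stream as a fourth argument does not change the result.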

Also cleaned up some of the doxygen comments and fixed a spelling mistake in the jaccard.cu source that was bothering me.

Reference #13744

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added the 3 - Ready for Review, libcudf, strings, improvement, and non-breaking labels Sep 7, 2023
@davidwendt davidwendt requested a review from a team as a code owner September 7, 2023 22:02
@davidwendt davidwendt self-assigned this Sep 7, 2023

@vyasr vyasr left a comment

Looks good, but same request for tests as on #14060

@github-actions github-actions bot added the CMake label Sep 12, 2023
@davidwendt

@vyasr How do I run these tests locally?

@davidwendt davidwendt requested a review from vyasr September 13, 2023 17:00

vyasr commented Sep 18, 2023

> @vyasr How do I run these tests locally?

The easiest way to run the test with the appropriate configuration is to use ctest; ctest -R STREAM_TEXT_TEST will work for this one. If you run the test executable directly, it will run, but without the appropriate stream configuration. To run it correctly that way, you also need to set the appropriate environment variables:

GTEST_CUDF_STREAM_MODE=new_testing_default LD_PRELOAD=/path/to/build/cudf_identify_stream_usage_mode_testing /path/to/test/executable

Review thread on cpp/include/nvtext/generate_ngrams.hpp (outdated, resolved)
@davidwendt davidwendt requested a review from vyasr September 19, 2023 22:55

@vyasr vyasr left a comment

LGTM. However, I'm changing the labels to mark this PR as breaking, since the separator is no longer optional.

@vyasr vyasr removed the non-breaking label Sep 20, 2023
@vyasr vyasr added the breaking label Sep 20, 2023
@davidwendt

/merge

@rapids-bot rapids-bot bot merged commit f865c87 into rapidsai:branch-23.10 Sep 22, 2023
54 checks passed
@davidwendt davidwendt deleted the stream-nvtext-ngrams branch September 22, 2023 15:08
Labels: 3 - Ready for Review, breaking, CMake, improvement, libcudf, strings