fix: Pin nltk version for sentence tokenizer #8786

Merged: davidsbatista merged 4 commits into main from pin-nltk-version-splitter on Jan 31, 2025

Contributor

@Amnah199 Amnah199 commented Jan 29, 2025

Related Issues

Proposed Changes:

Pin nltk==3.9.1 for SentenceTokenizer and DocumentSplitter.
This PR is a suggestion, open to discussion.

How did you test it?

Ran the tests
Tested an example.

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@coveralls
Collaborator

coveralls commented Jan 29, 2025

Pull Request Test Coverage Report for Build 13075527333

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 91.359%

Totals Coverage Status
Change from base Build 13074273511: 0.0%
Covered Lines: 8871
Relevant Lines: 9710

💛 - Coveralls

@Amnah199 Amnah199 added the ignore-for-release-notes PRs with this flag won't be included in the release notes. label Jan 29, 2025
davidsbatista
davidsbatista previously approved these changes Jan 31, 2025
Contributor

@davidsbatista davidsbatista left a comment

LGTM

@davidsbatista davidsbatista marked this pull request as ready for review January 31, 2025 15:03
@davidsbatista davidsbatista requested a review from a team as a code owner January 31, 2025 15:03
@davidsbatista davidsbatista requested review from anakin87 and removed request for a team January 31, 2025 15:03
@anakin87
Member

anakin87 commented Jan 31, 2025

Sorry... I think NLTK is generally quite stable compared to other libraries that we usually pin to a specific version (e.g. transformers).

Can we avoid pinning a specific version?
Would nltk>=3.9.1 be acceptable?

@davidsbatista
Contributor

@anakin87 yeah, I think it should be OK - I will make this change

@davidsbatista davidsbatista dismissed their stale review January 31, 2025 15:14

don't pin to a particular version
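
For readers weighing the two options discussed above, here is a minimal sketch of how an exact pin (`nltk==3.9.1`) differs from a minimum version (`nltk>=3.9.1`). It is a simplified illustration, not how pip resolves dependencies: `parse_version` and `satisfies` are hypothetical helpers that handle only plain numeric dotted versions and only the `==` and `>=` operators, and the candidate version strings other than 3.9.1 are made up for the comparison rather than taken from NLTK's release history.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a purely numeric dotted version such as '3.9.1' into (3, 9, 1)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed: str, specifier: str) -> bool:
    """Check an installed version against an '==X.Y.Z' or '>=X.Y.Z' specifier.

    Hypothetical helper: real tools follow the full version specifier rules,
    which cover pre-releases, wildcards, and more operators.
    """
    for operator in ("==", ">="):
        if specifier.startswith(operator):
            required = parse_version(specifier[len(operator):])
            current = parse_version(installed)
            if operator == "==":
                # Exact pin: only this one version is accepted.
                return current == required
            # Minimum version: this version and anything newer is accepted.
            return current >= required
    raise ValueError(f"unsupported specifier: {specifier!r}")


if __name__ == "__main__":
    # Illustrative version strings only; not a list of actual NLTK releases.
    for candidate in ("3.9.0", "3.9.1", "3.9.2", "4.0.0"):
        print(
            candidate,
            "exact pin:", satisfies(candidate, "==3.9.1"),
            "minimum:", satisfies(candidate, ">=3.9.1"),
        )
```

The exact pin rejects every version other than 3.9.1, including newer ones, while the minimum-version form accepts 3.9.1 and anything later. That is the trade-off raised in the review: the pin gives reproducibility, the lower bound leaves room for upgrades.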

Review thread on pyproject.toml (outdated, resolved)
@davidsbatista davidsbatista merged commit 379711f into main Jan 31, 2025
18 checks passed
@davidsbatista davidsbatista deleted the pin-nltk-version-splitter branch January 31, 2025 16:01
Labels
ignore-for-release-notes PRs with this flag won't be included in the release notes. topic:build/distribution
4 participants