Rust-punkt and NLTK Punkt (with boundary realignment turned off) produce different results when using exactly the same model. NLTK Punkt correctly identifies abbreviations and doesn't split on them, while rust-punkt, with the same model, splits sentences at almost every period.
To test things, I loaded the JSON model from rust-punkt:
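Roughly along these lines (a minimal sketch, not the exact script). The JSON key names `abbrev_types`, `sentence_starters`, `collocations`, and `ortho_context`, and the shape of each value, are assumptions about the model layout. `load_model` and `build_tokenizer` are hypothetical helper names.

```python
import json


def load_model(json_text):
    """Parse a Punkt model stored as JSON into plain Python containers.

    Assumed (hypothetical) layout: "abbrev_types" and "sentence_starters"
    are lists of strings, "collocations" maps a first word to a list of
    second words, and "ortho_context" maps a token type to an integer flag.
    """
    raw = json.loads(json_text)
    return {
        "abbrev_types": set(raw.get("abbrev_types", [])),
        "sent_starters": set(raw.get("sentence_starters", [])),
        # Flatten {first: [second, ...]} into the (first, second) pairs NLTK uses.
        "collocations": {
            (first, second)
            for first, seconds in raw.get("collocations", {}).items()
            for second in seconds
        },
        "ortho_context": dict(raw.get("ortho_context", {})),
    }


def build_tokenizer(model):
    """Copy the parsed model into NLTK's PunktParameters (requires nltk)."""
    from nltk.tokenize.punkt import PunktParameters, PunktSentenceTokenizer

    params = PunktParameters()
    params.abbrev_types = model["abbrev_types"]
    params.sent_starters = model["sent_starters"]
    params.collocations = model["collocations"]
    for token_type, flags in model["ortho_context"].items():
        params.ortho_context[token_type] = flags
    # A non-string argument is used as ready-made parameters, not as training text.
    return PunktSentenceTokenizer(params)
```

With a model file at some path, the tokenizer would then be called as `build_tokenizer(load_model(open(path).read())).tokenize(text, realign_boundaries=False)`, so that NLTK's boundary realignment does not mask any difference between the two implementations.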
The output from NLTK Punkt:
While rust-punkt produced: