Releases: segment-any-text/wtpsplit

Release 2.1.4

25 Jan 16:43

markus583

Introduce optional hat weighting by @lsorber
Clarify LoRA adaptation
Rename treat_newline_as_space to split_on_input_newlines for clarity. treat_newline_as_space will be deprecated in a future release.

Contributors

lsorber

Release 2.1.2

14 Dec 11:06

markus583

Fixes #142: AssertionError when the input string consists only of newlines or whitespace, or is an empty string.

Release 2.1.1

27 Oct 14:19

markus583

Change default behaviour for newlines in SaT.split.
- The model still ignores newlines, but they are now used to split the text in a simple post-processing step.
Small bugfixes for LoRA training
Update Readme for advanced usage
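
The newline behaviour described above can be illustrated with a minimal sketch. This is not the wtpsplit implementation: split_with_newline_postprocessing and toy_model_split are hypothetical names, and the toy splitter merely stands in for a model that ignores newlines. The sketch only shows the idea of letting a model segment the text first and then breaking each segment at newlines afterwards.

```python
from typing import Callable, List


def split_with_newline_postprocessing(
    text: str,
    model_split: Callable[[str], List[str]],
) -> List[str]:
    """Split with the model first, then break each segment at newlines."""
    segments: List[str] = []
    for sentence in model_split(text):
        # keepends=True keeps each line break attached to the piece it ends,
        # so joining the output reproduces the input text exactly.
        # Note: str.splitlines also treats "\r\n" and a few other
        # characters as line boundaries, not only "\n".
        segments.extend(sentence.splitlines(keepends=True))
    return segments


def toy_model_split(text: str) -> List[str]:
    """Hypothetical stand-in for a model: splits after '. ' only."""
    parts = text.split(". ")
    head = [p + ". " for p in parts[:-1]]
    tail = [parts[-1]] if parts[-1] else []
    return head + tail


if __name__ == "__main__":
    sample = "First line\nsecond line. Third sentence."
    for piece in split_with_newline_postprocessing(sample, toy_model_split):
        print(repr(piece))
```

Because the newline handling runs after the model, it can be switched on or off without changing what the model predicts.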

Release 2.1.0

24 Sep 21:37

markus583

Adds ONNX support for SaT models.
- Including export scripts and an updated README.
- This results in 50% improved inference time on GPU.

Release 2.0.8

09 Sep 10:49

markus583

Fix splitting of short sequences into individual characters (#127)

Release 2.0.7

02 Sep 13:26

markus583

Allow numpy>=2.0
Fix adaptation code
Add some comments

Release 2.0.5

08 Jul 07:41

bminixhofer

Fixes potential CUDA device error when the input has exactly 511 tokens (#121).

Release 2.0.4

01 Jul 09:32

bminixhofer

Fix a speed issue with SaT (#118). Now it is (as expected) ~6x faster than WtP.

Release 2.0.3

26 Jun 08:05

bminixhofer

Implement SaT (https://arxiv.org/abs/2406.16678) and switch the default models to SaT 🚀

The previous WtP models are still available but SaT is strictly better in accuracy and speed. See the updated README for details: https://github.com/segment-any-text/wtpsplit.

SaT was implemented and developed by @markus583 @igorsterner.

Contributors

markus583 and igorsterner

Release 1.3.0

22 Jan 15:30

bminixhofer

Fix a bug affecting some hash embeddings of the canine-* models which reduced accuracy (please upgrade to this version!).
Add a guide on adapting to your custom data: https://github.com/bminixhofer/wtpsplit#advanced-usage.
