feat: consider using wtpsplit for better sentence split (especially for splitting Japanese, Chinese and Korean) #2
Split text first (using WtP)
Combine text back together (using
Report on Feature Implementation: Use of
#2 - Add contributor information - Set language to English - Enable automatic issue labeling, title formatting, and report closing with issues
⬆️ feat: bump version to 0.2.0 in pyproject.toml #2
segment-any-text: wtpsplit, which splits sentences with a machine-learning model instead of rule-based heuristics (wtp-bert-mini is good enough for splitting Chinese and Japanese).
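For reference, a minimal sketch of what the proposed splitting step could look like. It assumes the `wtpsplit` package is installed and uses its `WtP` class with the `wtp-bert-mini` model named above. The helper names `load_wtp_splitter` and `split_sentences` are hypothetical, chosen for illustration only, and are not part of this project or of wtpsplit.

```python
from typing import Callable, List


def load_wtp_splitter(model_name: str = "wtp-bert-mini") -> Callable[[str], List[str]]:
    """Return a function that splits text with a WtP model.

    wtpsplit is imported lazily, so the rest of this sketch can be
    exercised without the package or a model download.
    """
    from wtpsplit import WtP  # assumption: installed via `pip install wtpsplit`

    wtp = WtP(model_name)  # the model weights are fetched on first use
    return lambda text: list(wtp.split(text))


def split_sentences(text: str, splitter: Callable[[str], List[str]]) -> List[str]:
    """Split text with the given splitter and drop whitespace-only pieces.

    The splitter is passed in so a rule-based function and a WtP-based
    one can be swapped without changing the calling code.
    """
    return [piece.strip() for piece in splitter(text) if piece.strip()]
```

With wtpsplit available, `split_sentences("今日は晴れです。明日は雨でしょう。", load_wtp_splitter())` would run the text through the model. The exact sentence boundaries depend on the model, so no expected output is shown here.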