Language Request: Kurdish Sorani (Central Kurdish) #296

Open
makwanbarzan opened this issue Apr 29, 2022 · 1 comment

Comments

@makwanbarzan

There's already a trained data file for the Latin-script dialect of the Kurdish language. The Sorani dialect is the second most used dialect of the language, and it'd be amazing to have a trained data file for it in Tesseract.

The script is similar to Persian, apart from a few different letters such as ژ، گ، ڤ، چ، ۆ, so it shouldn't take much effort to develop.

Thank you and I'm looking forward to getting a response.

@stweil
Member

stweil commented Apr 30, 2022

All those characters are included in the script/Arabic model. Maybe that already works for Sorani text?
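To check this by hand, it can help to have the exact code points of the letters listed in the request. The snippet below is only a sketch: the code points are my reading of the five letters above, and `describe` is a hypothetical helper name. It does not inspect the script/Arabic model itself; it only reports each letter's code point, its Unicode name, and whether it falls in the basic Arabic block (U+0600 to U+06FF).

```python
import unicodedata

# Assumed code points for the five letters named in the request:
# ژ (U+0698), گ (U+06AF), ڤ (U+06A4), چ (U+0686), ۆ (U+06C6)
SORANI_LETTERS = ["\u0698", "\u06AF", "\u06A4", "\u0686", "\u06C6"]


def describe(letters):
    """Return (char, 'U+XXXX', unicode_name, in_arabic_block) for each letter."""
    rows = []
    for ch in letters:
        cp = ord(ch)
        in_arabic_block = 0x0600 <= cp <= 0x06FF
        rows.append((ch, f"U+{cp:04X}", unicodedata.name(ch, "UNKNOWN"), in_arabic_block))
    return rows


if __name__ == "__main__":
    for ch, code, name, in_block in describe(SORANI_LETTERS):
        print(f"{code}  {name}  in Arabic block: {in_block}")
```

The printed code points can then be compared against the model's character set, or used to build a small test page to run through the script/Arabic model.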
