Support for New Reiwa Era Character ㋿ in Japanese #32

Open
prateek4sep opened this issue Jul 25, 2019 · 1 comment

@prateek4sep

With the start of Japan's Reiwa era, a new character was introduced: ㋿ (U+32FF). Support for this character is required.

Current Behavior: Other characters are recognized instead: 砒後徘朔御菓
Expected Behavior: ㋿ should be recognized for the given input image.
Suggested Fix: Retrain and update the current jpn.traineddata file so that it includes the new character.
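
To confirm whether a given model knows the character at all, one option is to look for it in the model's unicharset (for instance one unpacked with `combine_tessdata -u`). The sketch below is only an illustration: the helper name `unicharset_has` is hypothetical, the sample entries are simplified placeholders rather than real jpn data, and it assumes the usual layout where the first line holds the entry count and every later line begins with the character followed by a space.

```python
REIWA = "\u32ff"  # ㋿, the character requested in this issue


def unicharset_has(text: str, char: str = REIWA) -> bool:
    """Return True if `char` is listed as an entry in unicharset-style text.

    Assumed layout: line 1 is the number of entries; each following line
    starts with the character, then a space, then its properties.
    """
    lines = text.splitlines()
    for line in lines[1:]:  # skip the count line
        entry = line.split(" ", 1)[0]
        if entry == char:
            return True
    return False


# Simplified placeholder data, not taken from the real jpn model.
sample = "3\nNULL 0 Common 0\n令 1 Han 1\n和 1 Han 2\n"

print(unicharset_has(sample))                           # False
print(unicharset_has(sample + "\u32ff 0 Common 3\n"))   # True
```

If the character is absent from the unicharset, the recognizer cannot output it regardless of the input image, which would be consistent with the behavior described above.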

Reference:
Wiki Page

Attached:
The input file I used.
The character in 6 different fonts for training.
Reiwa.docx
Reiwa

@stweil
Member

stweil commented Dec 17, 2019

We need an update for langdata_lstm. Do you want to send a pull request there?

I am transferring the issue to langdata_lstm.

@stweil stweil transferred this issue from tesseract-ocr/tessdata Dec 17, 2019