wrong Unicode mapping of some Romanian diacritics #1314

Closed
latrau opened this issue Feb 9, 2018 · 1 comment

@latrau

latrau commented Feb 9, 2018

Environment

Debian Linux

  • Tesseract Version: tesseract 4.00.00alpha

  • Platform: Linux 4.15.0 SMP PREEMPT 2018 x86_64 GNU/Linux

Current Behavior:

The Romanian diacritics ș, Ș, ț, Ț are mapped to the wrong Unicode code points (the cedilla forms instead of the comma-below forms), namely:
Ș -> Ş=U+015E
ș -> ş=U+015F
Ț -> Ţ=U+0162
ț -> ţ=U+0163

Expected Behavior:

Ș -> Ș=U+0218
ș -> ș=U+0219
Ț -> Ț=U+021A
ț -> ț=U+021B

Suggested Fix:

Edit the character mapping so that these letters use the comma-below code points listed above. Thanks.
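
Until the mapping is corrected, the output text could be post-processed. The following is only a minimal sketch of such a workaround, not a change to Tesseract itself; the function name `fix_romanian_diacritics` and the table name are made up for illustration, and it assumes the recognized text is already available as a Python string.

```python
# Workaround sketch: translate the cedilla code points reported above
# into the comma-below code points that Romanian text should use.
# Names here are hypothetical and not part of Tesseract.
CEDILLA_TO_COMMA_BELOW = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla -> S with comma below
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T with cedilla -> T with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})


def fix_romanian_diacritics(text: str) -> str:
    """Return text with the four cedilla letters replaced; all else unchanged."""
    return text.translate(CEDILLA_TO_COMMA_BELOW)
```

Note that this would also rewrite legitimate cedilla letters (for example Turkish ş), so it is only suitable for text known to be Romanian.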

@zdenop
Contributor

zdenop commented Feb 10, 2018

Why are you opening duplicate issues? (#1315)

@zdenop zdenop closed this as completed Feb 10, 2018