OCR fails if image is rotated #29

Open
robertknight opened this issue May 31, 2022 · 6 comments

Comments

@robertknight
Owner

OCR fails completely if the image is rotated by 90, 180 or 270 degrees. Tesseract has built-in orientation detection, which could be used to resolve this.

@robertknight
Owner Author

robertknight commented Jun 1, 2022

Tesseract's built-in orientation detection requires the library to be built with the legacy / non-LSTM text recognition engine. Leptonica also has some built-in orientation detection functionality. So there are some options:

  1. Compile Tesseract with the legacy engine included, so its orientation detection can be used. This increases the WASM binary size from 1.6 MB to 2.3 MB in my testing.
  2. Use Leptonica's orientation detection.
  3. Don't support orientation detection and leave it as a problem for the consumer.

@robertknight
Owner Author

Work in progress at #34.

@robertknight
Owner Author

#34 adds a partial solution in the form of orientation detection. However, the algorithm is simplistic, so any application would probably need user input to confirm actions that depend on it.
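
As a rough illustration of that confirmation step, an application might apply a detected orientation automatically only when the detector reports high confidence, and ask the user otherwise. The `OrientationGuess` shape, the `decideRotation` helper and the 0.9 threshold are all assumptions made for this sketch; they are not taken from #34.

```typescript
// Hypothetical result shape for an orientation detector (an assumption for
// this sketch): a rotation in degrees plus a confidence score in [0, 1].
interface OrientationGuess {
  rotation: 0 | 90 | 180 | 270;
  confidence: number;
}

type RotationDecision =
  | { kind: "apply"; rotation: number }
  | { kind: "ask-user"; suggested: number };

// Auto-apply only confident guesses; otherwise defer to the user.
// The default threshold is arbitrary and would need tuning per application.
function decideRotation(
  guess: OrientationGuess,
  minConfidence = 0.9
): RotationDecision {
  if (guess.confidence >= minConfidence) {
    return { kind: "apply", rotation: guess.rotation };
  }
  return { kind: "ask-user", suggested: guess.rotation };
}
```

With this shape, a low-confidence guess still carries a suggested rotation, so the UI can offer it as a default instead of asking the user to start from scratch.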

@robertknight
Owner Author

I posted a comment on Hacker News and someone responded with a test case where the word recognition works well, but the text is not output in the correct order, due to rotation of the image:

[Image: C1jn2Kz]

If you compare the text output of this image in the demo, vs a copy of this image rotated such that the text baselines are straight, you can see that the layout outputs are different.

@johanvaneck

Any updates on this?

@robertknight
Owner Author

No. Ensuring the input is correctly oriented is currently a problem that users of the library have to solve.
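
For the 90, 180 and 270 degree cases from the original report, one way a consumer could handle this is to rotate the pixel buffer before handing it to the OCR engine. The following is a minimal sketch, not part of the library: `RgbaImage` and `rotateClockwise` are hypothetical names, and the image is assumed to be row-major RGBA with 4 bytes per pixel, as in the browser's `ImageData`.

```typescript
// Minimal RGBA image container, shaped like the browser's ImageData
// (a plain object, so the sketch also runs outside a browser).
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8ClampedArray; // 4 bytes per pixel, row-major
}

// Rotate an image clockwise by 0, 90, 180 or 270 degrees.
function rotateClockwise(
  src: RgbaImage,
  degrees: 0 | 90 | 180 | 270
): RgbaImage {
  const { width: w, height: h, data } = src;
  const swap = degrees === 90 || degrees === 270;
  const outW = swap ? h : w;
  const outH = swap ? w : h;
  const out = new Uint8ClampedArray(data.length);

  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      // Map each source pixel (x, y) to its destination (dx, dy).
      let dx: number;
      let dy: number;
      switch (degrees) {
        case 90:
          dx = h - 1 - y;
          dy = x;
          break;
        case 180:
          dx = w - 1 - x;
          dy = h - 1 - y;
          break;
        case 270:
          dx = y;
          dy = w - 1 - x;
          break;
        default:
          dx = x;
          dy = y;
      }
      const s = (y * w + x) * 4;
      const d = (dy * outW + dx) * 4;
      out.set(data.subarray(s, s + 4), d);
    }
  }
  return { width: outW, height: outH, data: out };
}
```

If the image is known to be turned clockwise by `r` degrees relative to upright, calling `rotateClockwise(image, (360 - r) % 360)` would undo it. Whether a given detector reports clockwise or counter-clockwise angles is something to check against that detector's documentation.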
