OCR fails if image is rotated #29

robertknight · 2022-05-31T07:12:33Z

OCR fails completely if the image is rotated by 90, 180 or 270 degrees. Tesseract has built-in orientation detection, which could be used to fix this.

robertknight · 2022-06-01T08:33:34Z

Tesseract's built-in orientation detection requires the library to be built with the legacy (non-LSTM) text recognition engine. Leptonica also has some built-in orientation detection functionality. So there are a few options:

Compile Tesseract with the legacy engine included, so its orientation detection can be used. This increases the WASM binary size from 1.6 MB to 2.3 MB in my testing.
Use Leptonica's orientation detection.
Don't support orientation detection and leave it as a problem for the consumer.

robertknight · 2022-06-02T09:15:56Z

Work in progress at #34.

robertknight · 2022-06-05T08:41:22Z

#34 adds a partial solution in the form of orientation detection. However, the algorithm is simplistic, so an application would probably need user input to confirm any action that depends on the detected orientation.
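
To illustrate the kind of confirmation step this implies, here is a minimal sketch of how an application might decide between rotating automatically and asking the user. Everything in it is an assumption for illustration: the `OrientationEstimate` shape, the `decideOrientationAction` helper and the 0.9 threshold are hypothetical and not taken from #34 or from the library's API.

```typescript
// Hypothetical shape of an orientation estimate (an assumption, not the library's API).
interface OrientationEstimate {
  rotation: number; // detected rotation in degrees: 0, 90, 180 or 270
  confidence: number; // 0 (no confidence) to 1 (certain)
}

type OrientationAction = "keep" | "auto-rotate" | "ask-user";

// Decide what to do with a detected rotation.
// `autoThreshold` is an assumed cut-off; tune it for your own application.
function decideOrientationAction(
  estimate: OrientationEstimate,
  autoThreshold: number = 0.9,
): OrientationAction {
  // Nothing to correct if the image is reported as upright.
  if (estimate.rotation % 360 === 0) {
    return "keep";
  }
  // Only rotate without asking when the detector is confident enough.
  return estimate.confidence >= autoThreshold ? "auto-rotate" : "ask-user";
}
```

With a simplistic detector, most non-upright results would likely fall into the "ask-user" branch, which is the limitation described above.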

robertknight · 2022-07-09T16:49:12Z

I posted a comment on Hacker News and someone responded with a test case where the word recognition works well, but the text is not output in the correct order because the image is rotated:

If you compare the text output of this image in the demo, vs a copy of this image rotated such that the text baselines are straight, you can see that the layout outputs are different.

johanvaneck · 2024-05-09T12:52:20Z

Any updates on this?

robertknight · 2024-05-09T14:55:48Z

No. Ensuring the input is correctly oriented is currently a problem that users of the library have to solve.
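
For anyone handling this on the consumer side, a rough sketch of rotating an RGBA pixel buffer by a multiple of 90 degrees before passing it to OCR might look like the following. The `RgbaImage` type and the `rotateClockwise` / `rotate90` helpers are hypothetical names for illustration and are not part of the library. How you determine the required rotation (EXIF metadata, user input, some detector) is left open.

```typescript
// Minimal RGBA image type (hypothetical); a browser ImageData has the same fields.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8ClampedArray; // 4 bytes per pixel, row-major
}

// Rotate by a single 90 degree clockwise turn.
function rotate90(src: RgbaImage): RgbaImage {
  const { width, height, data } = src;
  const out = new Uint8ClampedArray(data.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Source pixel (x, y) moves to (height - 1 - y, x) in the output,
      // whose row length is the source height.
      const srcIdx = (y * width + x) * 4;
      const dstIdx = (x * height + (height - 1 - y)) * 4;
      out.set(data.subarray(srcIdx, srcIdx + 4), dstIdx);
    }
  }
  return { width: height, height: width, data: out };
}

// Rotate clockwise by any multiple of 90 degrees (negative values rotate counter-clockwise).
function rotateClockwise(image: RgbaImage, degrees: number): RgbaImage {
  const turns = (((degrees / 90) % 4) + 4) % 4;
  if (!Number.isInteger(turns)) {
    throw new Error("degrees must be a multiple of 90");
  }
  let current = image;
  for (let i = 0; i < turns; i++) {
    current = rotate90(current);
  }
  return current;
}
```

Note that this only covers rotations by multiples of 90 degrees. Slightly skewed images, like the Hacker News test case mentioned earlier, would need a different approach.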

robertknight mentioned this issue May 31, 2022

Demo app: Images rotated if added from iOS / macOS photo library #30

Open
