Ocr of rotated image #121

josef821 · 2024-09-30T12:50:35Z

Hi, thanks for your useful OCR engine.
It works well, but when I give it a rotated image it returns bad results.

Are there any tips for fixing this?

robertknight · 2024-09-30T19:28:34Z

Currently the recognition model and layout logic assume that the image is approximately upright (a small amount of rotation or skew is OK) and that the text is read left to right. To work with rotated or severely skewed images, they need to be rotated or de-skewed as a preprocessing step. Eventually this should be integrated into this library, but in the meantime you could try something like:

1. Call the OcrEngine::detect_words method to detect bounding boxes of connected areas (the white regions in the top-left image)
2. Infer the orientation from the positions and aspect ratios of the boxes (e.g. if most boxes are tall rather than wide, the text is probably rotated sideways, by about 90 degrees)
3. Use functions in the imageproc crate to rotate the image based on the inferred orientation
4. Perform OCR on the rotated image
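
The aspect-ratio check in the steps above could be sketched roughly as follows. This is a minimal illustration rather than code from the library: `BoxSize`, `Orientation`, `infer_orientation` and the 1.2 ratio threshold are all hypothetical, and the widths and heights would have to be extracted from whatever rectangles word detection returns.

```rust
/// Width and height of one detected word box.
/// Hypothetical stand-in for the rectangles returned by word detection.
#[derive(Clone, Copy, Debug)]
pub struct BoxSize {
    pub width: f32,
    pub height: f32,
}

/// Coarse guess at the direction the text runs in.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Orientation {
    /// Most boxes are wider than tall: upright or upside-down text.
    Horizontal,
    /// Most boxes are taller than wide: text rotated by about 90 degrees.
    Vertical,
    /// No boxes, or no clear majority.
    Unknown,
}

/// Vote on the orientation using the aspect ratio of each box.
/// Near-square boxes (short words, punctuation) are ignored.
pub fn infer_orientation(boxes: &[BoxSize]) -> Orientation {
    // Assumed threshold: a box must be at least 20% longer on one side to vote.
    const MIN_RATIO: f32 = 1.2;

    let mut wide = 0usize;
    let mut tall = 0usize;
    for b in boxes {
        if b.width <= 0.0 || b.height <= 0.0 {
            continue; // skip degenerate boxes
        }
        let ratio = b.width / b.height;
        if ratio >= MIN_RATIO {
            wide += 1;
        } else if ratio <= 1.0 / MIN_RATIO {
            tall += 1;
        }
    }

    if tall > wide {
        Orientation::Vertical
    } else if wide > tall {
        Orientation::Horizontal
    } else {
        Orientation::Unknown
    }
}
```

Note that aspect ratios alone cannot tell upright text from upside-down text, or a clockwise quarter turn from a counter-clockwise one, so a heuristic like this only narrows the candidates down.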

A more sophisticated approach would be to use an image classification model to infer the orientation of each word, or of a sample of words. If a suitable model were created in e.g. PyTorch and exported to ONNX, it could then be converted to RTen and used in the above preprocessing pipeline instead of heuristics.
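
If such a classifier existed, its per-word predictions would still need to be combined into one decision for the whole image. A minimal sketch of that step, assuming a hypothetical four-class output (`Rotation`) and a simple majority vote; the classifier itself is not shown and none of these names come from the library:

```rust
/// Rotation class predicted for a single word (hypothetical model output).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Rotation {
    Deg0,
    Deg90,
    Deg180,
    Deg270,
}

/// Return the most frequently predicted rotation, or `None` if there are
/// no predictions. Ties are resolved in favour of the smaller rotation.
pub fn majority_rotation(predictions: &[Rotation]) -> Option<Rotation> {
    const ALL: [Rotation; 4] = [
        Rotation::Deg0,
        Rotation::Deg90,
        Rotation::Deg180,
        Rotation::Deg270,
    ];

    // Count the votes for each class, indexed by enum discriminant.
    let mut counts = [0usize; 4];
    for p in predictions {
        counts[*p as usize] += 1;
    }

    // Keep the first class with the strictly highest count.
    let mut best: Option<(Rotation, usize)> = None;
    for (rotation, &count) in ALL.iter().zip(counts.iter()) {
        if count > 0 && best.map_or(true, |(_, c)| count > c) {
            best = Some((*rotation, count));
        }
    }
    best.map(|(rotation, _)| rotation)
}
```

Voting over a sample of words rather than trusting a single prediction should make the result less sensitive to individual misclassifications.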

josef821 · 2024-09-30T22:11:16Z

Thanks for the reply.
I will do that. I will check all the masks to see whether the lines are rotated or not.
Your layout analysis is not good enough. I read your replies to others: you want to create a model that clusters and sorts words to get line bounding boxes. How long do you think it will take to be able to publish the layout analysis model with its training code?

robertknight · 2024-10-01T06:32:00Z

> How long do you think it will take to be able to publish the layout analysis model with its training code?

I don't know. All the code that exists is in the ocrs-models repository, but for layout analysis that only includes some non-functional prototypes.

In the meantime, if you happen to be working with documents that have a predictable layout, you can always replace the find_text_lines step with custom code.
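
As an illustration of what such custom code might look like for a simple layout with horizontal, non-overlapping lines, here is a rough sketch. `WordRect` and `group_into_lines` are hypothetical names, and the half-height tolerance is an assumption; the real find_text_lines step works with the library's own rectangle types, which this sketch does not try to reproduce.

```rust
/// Axis-aligned word rectangle (hypothetical stand-in for the library's type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct WordRect {
    pub left: f32,
    pub top: f32,
    pub width: f32,
    pub height: f32,
}

impl WordRect {
    fn center_y(&self) -> f32 {
        self.top + self.height / 2.0
    }
}

/// Group words into lines by vertical position, then order each line
/// left to right. Only suitable for upright, single-column layouts.
pub fn group_into_lines(words: &[WordRect]) -> Vec<Vec<WordRect>> {
    // Process words from top to bottom.
    let mut sorted = words.to_vec();
    sorted.sort_by(|a, b| a.center_y().total_cmp(&b.center_y()));

    let mut lines: Vec<Vec<WordRect>> = Vec::new();
    for word in sorted {
        // A word joins the current line if its vertical center is within
        // half a word height (assumed tolerance) of the previous word.
        let same_line = lines
            .last()
            .and_then(|line| line.last())
            .map_or(false, |prev| {
                (word.center_y() - prev.center_y()).abs() < 0.5 * prev.height
            });

        match lines.last_mut() {
            Some(line) if same_line => line.push(word),
            _ => lines.push(vec![word]),
        }
    }

    // Reading order within a line: left to right.
    for line in &mut lines {
        line.sort_by(|a, b| a.left.total_cmp(&b.left));
    }
    lines
}
```

A grouping like this breaks down on multi-column pages or curved text, which is where a learned layout model would be expected to help.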

josef821 · 2024-10-05T09:42:56Z

The existing layout analysis does not work well for curved layouts or complex images. I am waiting for your layout analysis model.
Thanks

josef821 changed the title ~~ocr of rotated image~~ Ocr of rotated image Sep 30, 2024
