Make `--text-line-images` debug option apply recognition preprocessing #30

robertknight · 2024-02-27T00:51:23Z

Make the `--text-line-images` debug option apply the same preprocessing that is applied before lines are fed into the text recognition model. This includes:

- Resizing the image to 64px high, with a max width of 800px
- Converting the image from color to grayscale
- Extracting only the polygon containing the line's words, and masking off other pixels in black

This makes this option more useful for debugging recognition accuracy issues, as problems arising from the preprocessing become visible.
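The three preprocessing steps listed above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the ocrs implementation: the function names (`target_size`, `to_gray`, `in_polygon`, `mask_outside`) are hypothetical, and it assumes the width is scaled to preserve the aspect ratio before being capped, that grayscale conversion uses BT.601 luma weights, and that a pixel counts as inside the polygon when its center passes an even-odd test.

```rust
/// Target height and maximum width, taken from the PR description.
const TARGET_HEIGHT: usize = 64;
const MAX_WIDTH: usize = 800;

/// Compute the output size for a line image. The height is fixed at 64px;
/// the width is scaled by the same factor (an assumption) and capped at 800px.
fn target_size(width: usize, height: usize) -> (usize, usize) {
    let scaled = (width as f32 * TARGET_HEIGHT as f32 / height as f32).round() as usize;
    (scaled.clamp(1, MAX_WIDTH), TARGET_HEIGHT)
}

/// Convert one RGB pixel to gray using BT.601 luma weights (an assumption).
fn to_gray(rgb: [u8; 3]) -> u8 {
    let [r, g, b] = rgb;
    (0.299 * r as f32 + 0.587 * g as f32 + 0.114 * b as f32).round() as u8
}

/// Even-odd point-in-polygon test.
fn in_polygon(x: f32, y: f32, poly: &[(f32, f32)]) -> bool {
    if poly.is_empty() {
        return false;
    }
    let mut inside = false;
    let mut j = poly.len() - 1;
    for i in 0..poly.len() {
        let (xi, yi) = poly[i];
        let (xj, yj) = poly[j];
        // Toggle when a horizontal ray from (x, y) crosses the edge j -> i.
        if (yi > y) != (yj > y) && x < (xj - xi) * (y - yi) / (yj - yi) + xi {
            inside = !inside;
        }
        j = i;
    }
    inside
}

/// Set every pixel whose center lies outside `poly` to black (0).
/// `gray` is a row-major grayscale buffer with `width` pixels per row.
fn mask_outside(gray: &mut [u8], width: usize, poly: &[(f32, f32)]) {
    for (idx, px) in gray.iter_mut().enumerate() {
        let (x, y) = (idx % width, idx / width);
        if !in_polygon(x as f32 + 0.5, y as f32 + 0.5, poly) {
            *px = 0;
        }
    }
}

fn main() {
    // A 400x32 line scaled to 64px high would be 800px wide, which hits the cap.
    println!("{:?}", target_size(400, 32));

    // Mask a 4x4 image so that only the central 2x2 square survives.
    let mut gray = vec![200u8; 16];
    let square = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)];
    mask_outside(&mut gray, 4, &square);
    println!("{:?}", gray);
}
```

Saving the image after these steps, rather than the raw crop, is what lets a scaling, color, or masking problem show up in the debug output.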

In the process, functions in ocrs that return dynamic errors were changed to use `anyhow::Error` rather than `Box<dyn Error>` as the error type. This is more convenient to work with in ocrs-cli, which already uses anyhow.

Add `OcrEngine::prepare_recognition_input` method that returns an image with the same preprocessing applied as `OcrEngine::recognize_text` does before it feeds input into a model. This is useful for debugging scenarios where ocrs produces different / worse output than the PyTorch model training/evaluation tools.

`anyhow::Error` provides a better dynamic error type than `dyn Error` as it can capture context and be sent between threads. Using it here also enables propagating these errors in ocrs-cli which is already using anyhow.
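The "sent between threads" point can be illustrated with the standard library alone. In the sketch below, `Box<dyn Error + Send + Sync>` stands in for `anyhow::Error` (which is itself `Send + Sync`) purely to keep the example dependency-free; `LineError`, `recognize_line` and `run_in_thread` are hypothetical names, not ocrs APIs. Replacing the alias with a plain `Box<dyn Error>` should make the `thread::spawn` call fail to compile, because that type is not `Send`.

```rust
use std::error::Error;
use std::fmt;
use std::thread;

/// Hypothetical error type used only for this illustration.
#[derive(Debug)]
struct LineError(String);

impl fmt::Display for LineError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "line error: {}", self.0)
    }
}

impl Error for LineError {}

/// A dynamic error that may cross thread boundaries.
/// Stands in for `anyhow::Error` in this dependency-free sketch.
type SendError = Box<dyn Error + Send + Sync>;

/// Hypothetical fallible step: reject empty line images.
fn recognize_line(width: usize) -> Result<usize, SendError> {
    if width == 0 {
        return Err(Box::new(LineError("empty line image".to_string())));
    }
    Ok(width)
}

/// Run the step on a worker thread and hand its result, including any
/// error, back to the caller. This requires the error type to be `Send`.
fn run_in_thread(width: usize) -> Result<usize, SendError> {
    thread::spawn(move || recognize_line(width))
        .join()
        .expect("worker thread panicked")
}

fn main() {
    match run_in_thread(0) {
        Ok(w) => println!("ok: {w}"),
        Err(e) => println!("failed: {e}"),
    }
}
```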

Make the `--text-line-images` option save images with the same preprocessing applied as when preparing images to feed into the recognition model. This makes accuracy errors arising from preprocessing issues easier to debug. This preprocessing includes:

- Resizing the image to 64px high and a max width of 800px
- Extracting only the polygon containing the line's words, with other pixels masked off
- Converting the image to grayscale

robertknight added 3 commits February 27, 2024 00:46

Use anyhow::Result instead of Result<T, Box<dyn Error>> in ocrs

4090c71

`anyhow::Error` provides a better dynamic error type than `dyn Error` as it can capture context and be sent between threads. Using it here also enables propagating these errors in ocrs-cli which is already using anyhow.

robertknight merged commit 9d56a86 into main Feb 27, 2024
2 checks passed

robertknight deleted the expose-recognition-preprocessing branch February 27, 2024 01:01
