From 2c696aab725ed8384c007da71008faa4c642b745 Mon Sep 17 00:00:00 2001
From: Robert Knight
Date: Wed, 24 Jan 2024 08:17:20 +0000
Subject: [PATCH] Update Ocrs project link in README and reformat paragraphs

---
 README.md | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 2bfff93..b571e60 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,20 @@
 # ocrs-models
 
-This project contains PyTorch models for the
-[**ocrs**](https://github.com/robertknight/rten/tree/main/ocrs) OCR engine.
+This project contains tools for training PyTorch models for use with the
+[**Ocrs**](https://github.com/robertknight/ocrs/) OCR engine.
 
 The ocrs engine splits text detection and recognition into three phases, each
 of which corresponds to a different model in this repository:
 
-1. **Text detection**: This is a semantic segmentation model which classifies
-each pixel in a greyscale input image as text/non-text. Consumers then post-process clusters of text pixels to get oriented bounding boxes for words.
-2. **Layout analysis (VERY WIP)**: This is a graph model which takes word bounding boxes as input nodes and
-classifies each node's relation to nearby nodes (eg. start / middle / end of line)
-3. **Text recognition**: This is a CRNN model that takes a greyscale image of a text line as input and returns a sequence of characters.
+1. **Text detection**: This is a semantic segmentation model which classifies
+   each pixel in a greyscale input image as text/non-text. Consumers then
+   post-process clusters of text pixels to get oriented bounding boxes for
+   words.
+2. **Layout analysis (VERY WIP)**: This is a graph model which takes word
+   bounding boxes as input nodes and classifies each node's relation to nearby
+   nodes (eg. start / middle / end of line)
+3. **Text recognition**: This is a CRNN model that takes a greyscale image of a
+   text line as input and returns a sequence of characters.
 
All models can be exported to ONNX for downstream use.
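
The text detection step above says that consumers post-process clusters of text pixels into bounding boxes. A minimal sketch of that idea follows, under stated assumptions: the function name `text_boxes`, the list-of-rows input, and the 0.5 threshold are all hypothetical, and the boxes here are axis-aligned rather than oriented, which is a deliberate simplification. It is not how Ocrs itself implements this step.

```python
from collections import deque


def text_boxes(prob_map, threshold=0.5):
    """Return one (left, top, right, bottom) box per cluster of text pixels.

    `prob_map` is a list of rows of per-pixel text probabilities.
    Coordinates are inclusive. The name and the default threshold are
    assumptions made for this sketch.
    """
    height = len(prob_map)
    width = len(prob_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if seen[y][x] or prob_map[y][x] < threshold:
                continue
            # Flood-fill one 4-connected cluster, tracking its extent.
            left, top, right, bottom = x, y, x, y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                neighbours = ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1))
                for nx, ny in neighbours:
                    if (
                        0 <= nx < width
                        and 0 <= ny < height
                        and not seen[ny][nx]
                        and prob_map[ny][nx] >= threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

Calling `text_boxes` on the per-pixel output of a segmentation model would give one box per connected cluster. Producing oriented boxes, as the README describes, would additionally require fitting a rotated rectangle to each cluster.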