diff --git a/README.md b/README.md
index eb57f12dd7..4ca775a975 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,12 @@
 ## About
 
 This package contains an **OCR engine** - `libtesseract` and a **command line program** - `tesseract`.
+Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused
+on line recognition, but also still supports the legacy Tesseract OCR engine of
+Tesseract 3 which works by recognizing character patterns. Compatibility with
+Tesseract 3 is enabled by using the Legacy OCR Engine mode (--oem 0).
+It also needs traineddata files which support the legacy engine, for example
+those from the tessdata repository.
 
 The lead developer is Ray Smith. The maintainer is Zdenko Podobny.
 For a list of contributors see [AUTHORS](https://github.com/tesseract-ocr/tesseract/blob/master/AUTHORS)