LSTM vs BLSTM vs MDLSTM #630
Comments
In principle, Tesseract is probably as accurate as (or slightly more accurate than) ocropy/clstm. Tesseract has official trained models for ~100 languages; ocropy has official models for English and German only. Unlike ocropus, Tesseract works on Windows. BLSTM is implemented and used. 2D-LSTM is also implemented in the library, but I think (not sure) it's not used by the released traineddata. Using 2D-LSTM means a much longer training time, and for OCR of printed text the accuracy will not necessarily be better than with 1D-BLSTM. BTW, ocropy doesn't have 2D-LSTM support. |
Please see https://github.com/tesseract-ocr/tesseract/wiki/VGSLSpecs
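For readers following the link, here is a minimal sketch of how the direction of an LSTM layer can be read off a VGSL-style token. It assumes only the `L(f|r|b)(x|y)[s]<n>` form described on the VGSLSpecs page; the helper name `describe_lstm_token` and the example layer list are hypothetical and are not taken from any released traineddata.

```python
import re

# Matches only the L(f|r|b)(x|y)[s]<n> token form (assumption: other
# LSTM variants listed on the VGSLSpecs page are deliberately ignored).
LSTM_TOKEN = re.compile(r"^L([frb])([xy])(s?)(\d+)$")

DIRECTIONS = {"f": "forward", "r": "reverse", "b": "bidirectional"}


def describe_lstm_token(token):
    """Return a dict describing one LSTM token, or None if it is not one."""
    match = LSTM_TOKEN.match(token)
    if match is None:
        return None
    direction, axis, summarize, outputs = match.groups()
    return {
        "direction": DIRECTIONS[direction],
        "axis": axis,                     # 'x' or 'y' scan direction
        "summarizes": summarize == "s",   # 's' collapses the scanned dimension
        "outputs": int(outputs),
    }


# Illustrative layer list only; not the spec of any released model.
example_layers = "Lfys64 Lfx128 Lrx128 Lbx256".split()
for layer in example_layers:
    print(layer, describe_lstm_token(layer))
```

In this notation a 1D-BLSTM along the text line shows up either as a single `Lbx` token or as an `Lfx`/`Lrx` pair.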
|
This is not really the situation with the LSTM engine. The difference in accuracy between Latin script based langs and Arabic is due to
|
Also, the OCR stage is dependent on the layout analysis stage which is weaker for Arabic. |
Shree, Indic scripts are even more complex... |
I checked Arabic today.
With the default trained data it has about 80% accuracy.
What are you looking for?
|
@amitdo Thanks for clearing things up; improved pre-processing may make 1D-LSTM outperform the more complex MDLSTM. You were right. @Shreeshrii So Tesseract 4.x is capable of producing more sophisticated and complex structures. @roozgar I was looking for a method that achieves a recognition rate above 85% for Arabic. |
As I said, I got that result with the official traineddata.
I haven't started my own training yet...
|
@roozgar what operating system are you using? |
Please see Ray's comment with accuracy figures in
#40
I have found Hindi to have much greater accuracy with the LSTM engine.
|
@shree Ubuntu 16.04 LTS
|
This is what is used for most of the languages: I think it is 2D-LSTM. |
@amitdo Thanks. I had been told that the 4.x version of Tesseract would be the next big leap; now I believe it. |
1/ Please, how can I use BLSTM to segment a page into text lines? |
I understand that much of the new Tesseract 4.0 uses a customized implementation of Ocropus, relying basically on the new LSTM recognition engine.
But the main problem is that most of the decisions being taken focus on English (and other Latin-script languages), which can already reach recognition rates above 95% easily.
My concern is enabling other languages such as Arabic to reach the precision ceiling.
Methods such as BLSTM (bidirectional LSTM) and the two-dimensional LSTM, called MDLSTM, can achieve character-level accuracies of 92% and 96% without explicit segmentation of words. I repeat: without explicit segmentation.
So my question is: are there plans to upgrade the current LSTM to an MDLSTM (multi-dimensional LSTM)? This would radically help all languages pass that precision ceiling.
I am planning to test Tesseract 4.0 LSTM on Arabic and to post results in the future; I hope there will be a recognition improvement.
Thank you, Ray, for your hard work, and thanks to all contributors; you are appreciated.
More information about BLSTM and MDLSTM:
https://www.nist.gov/sites/default/files/documents/itl/iad/mig/OpenHaRT2013_WorkshopPres_A2IA.pdf
http://www.a2ialab.com/lib/exe/fetch.php?media=presentations:icdar2015_chinese_slides.pdf
https://goo.gl/0wUNfm
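A note on the "without explicit segmentation" point above: recurrent OCR models commonly get there by emitting one label per image column, including a special blank label, and collapsing that sequence afterwards (CTC-style decoding). The sketch below shows only the greedy collapse step. That the systems in the linked slides decode exactly this way is an assumption, and the function name and the `-` blank symbol are hypothetical.

```python
def ctc_greedy_decode(frame_labels, blank="-"):
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return "".join(decoded)


# One label per frame; no character boundaries are supplied to the decoder.
frames = list("--cc-aa--t-")
print(ctc_greedy_decode(frames))
```

The blank between two identical labels is what lets a doubled letter survive the collapse, so the network never has to be told where one character ends and the next begins.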