OSD with --psm 0 creates wrong result in latest version #1926
Comments
Could you please try the old version with |
Yes, it does change the result: the output becomes the same incorrect result. Command line : Result with tesseract v4.0.0-beta.1.20180608
This is the wrong orientation with a poor confidence value. How does this make sense if running without -l osd produces the correct result? |
As far as I know, Obviously at least the two tested versions of Tesseract come to the result that your image is upside down. |
I made the following tests:
Version = beta 1, beta 4 and latest compiled master branch: result = wrong
Version = 4.0.0-alpha.20170804: result = correct
Version = 4.0.0-alpha.20170804, only adding -l osd: result = wrong
Further tests with 3.05.02 tell the same story: adding -l osd causes an incorrect result. How does that make sense? I am sorry, but there seems to be some weird bug when the osd option is activated via the command line. |
If you give tesseract '-l <lang>' a traineddata that includes data for the legacy engine, other than osd, it will still be able to detect the orientation if the traineddata matches the language used in the input image. @CanadianHusky |
@amitdo the engine gives the following message with your suggestion and produces the wrong result
|
@stweil, your code is causing this... I think osd should be the default for psm 0, to enable both script and orientation detection, but |
I removed @stweil's change and tested it with a traineddata that contains the legacy data.
With best/fast traineddata tesseract segfaults, because they lack the legacy data needed by psm 0. |
@amitdo |
Just reverting my commit will produce segfaults again (see issue #1855 or @amitdo's comment above). I wrote the code which enforces Even then some questions still remain:
|
@CanadianHusky, what is the result with old Tesseract and |
In my opinion, yes. BUT beware: the result is wrong at the moment.
I'm not familiar enough with the detection engine logic to comment meaningfully, but for the average user it is clearly a bug. |
https://ai.google/research/pubs/pub35506 According to this paper, 'osd' is able to recognize only 30 Latin characters! If there are enough chars in the text osd will usually work quite well, but for short text it might fail. |
'OINZM6890' Is this upright or rotated 180 degrees? That's one reason why osd only has 30 Latin chars. A second reason is ambiguity across different scripts. |
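To make the ambiguity concrete, here is a minimal sketch of what that string looks like after a 180 degree rotation. The `ROTATED_180` table is a hand-written assumption for illustration only; it is not taken from Tesseract or from the osd traineddata.

```python
# Illustrative mapping only: the glyph each character resembles after a
# 180-degree rotation. This is an assumption for the example, not a table
# taken from Tesseract's osd traineddata.
ROTATED_180 = {
    "O": "O", "I": "I", "N": "N", "Z": "Z", "M": "W",
    "6": "9", "8": "8", "9": "6", "0": "0",
}


def rotate_180(text):
    """Return how `text` would read after the page is turned upside down.

    Rotating a line by 180 degrees reverses the reading order and replaces
    each glyph by its rotated look-alike. Returns None if some character
    has no look-alike in the table above.
    """
    rotated = []
    for ch in reversed(text):
        if ch not in ROTATED_180:
            return None
        rotated.append(ROTATED_180[ch])
    return "".join(rotated)


print(rotate_180("OINZM6890"))  # every glyph has a plausible rotated twin
print(rotate_180("TEXT"))       # None: no look-alike in this table
```

Every character of 'OINZM6890' maps to another plausible character, so shape alone cannot tell the two orientations apart, whereas a string with asymmetric glyphs has no valid upside-down reading in this table.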
While I understand why it is impossible to determine from those characters alone (carefully selected, some almost symmetrical or having a symmetric equivalent) whether the text is upright or rotated 180 degrees, it should still be possible to determine the orientation of the image I supplied at the top of the post, because there is a relation between the characters (words and chains of letters). I am sure the bright minds at Google have some good ideas on how to improve orientation detection. |
It is mostly one bright mind from Google who has worked on tesseract in recent years. Currently, he is on vacation from any tesseract work. |
Meanwhile, he let us play with his toy... :-) |
Understood. The abstract you linked to earlier shows how bounding boxes and blobs are detected. The chance of "offending" characters that disturb OSD appearing consecutively is extremely low in real content. Your 'OINZM6890' example is the extreme case, and we should live with the fact that OSD will fail on that sequence of chars. For anything else, I would have the algorithm look at characters in pairs and try to give them "meaning". |
From #1926 (comment):
|
The previous test was 180 degrees rotation. With 90 degrees rotation, I get:
For both -l osd and -l eng. |
After resizing your image 300%, tesseract with psm 0 works fine, with or without my change. |
I really doubt that we can solve this in tesseract. I would bother only if the image has a text-only part (e.g. a line, paragraph or page). Input images with tables, noise and graphics have always been a problem for fully automatic processing in tesseract. Without wide testing, I assume that the correct orientation result in 4.0 alpha is more coincidence than proof of a bug introduced at a later stage. I just checked the image with the leptonica orientation detection function (i.e. no connection to the OSD or eng tesseract data), and it claims that the image needs to be corrected by 180 degrees. |
ZIP File |
If you only need rotation detection, you can now use -l eng with --psm 0. You can also use a language other than eng. |
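For anyone scripting this, a minimal sketch of an orientation-only run might look like the following. The helper names `build_osd_command` and `run_osd` are made up for this example, and it assumes that `tesseract` is on the PATH and that a `--psm 0` run writes its result to `<output_base>.osd`.

```python
import subprocess
from pathlib import Path


def build_osd_command(image, output_base, lang=None):
    """Build the argv list for an orientation-only run (--psm 0).

    `lang` is optional: None leaves the language choice to Tesseract,
    while e.g. "eng" requests a specific traineddata.
    """
    cmd = ["tesseract", str(image), str(output_base), "--psm", "0"]
    if lang is not None:
        cmd += ["-l", lang]
    return cmd


def run_osd(image, output_base, lang=None):
    """Run Tesseract and return the text of the generated .osd file.

    Assumption: the result of a --psm 0 run lands in <output_base>.osd.
    """
    subprocess.run(build_osd_command(image, output_base, lang), check=True)
    return Path(f"{output_base}.osd").read_text()


print(build_osd_command("input.png", "output", lang="eng"))
```

Passing the arguments as a list avoids shell quoting problems with Windows paths such as the one in the original report.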
@CanadianHusky, please test with latest code and |
This warning was removed a few days ago. |
@CanadianHusky, the snapshots show that you used 4.0.0-rc1. The fix was added later, so please pull the latest code from Git master to get it and build again. |
@stweil it compiled fine without errors. The output still shows 4.0.0-rc1, but the result is correct this time. The engine outputs a warning message and returns the correct rotation and a confidence value that is > 1. Worth noting: all other rotation test files (more than 100) have been checked against this version. The engine, combined with the additional external pre/post processing that I have added, now makes zero mistakes in orientation detection, even with very little meaningful content. In my opinion it is a significant improvement and allows tesseract to be used for orientation detection when the content is not a "regular" block of text. Thank you very much to all who helped. I have closed the issue. |
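As a rough illustration of the kind of external post-processing mentioned here, the sketch below parses OSD output and only trusts the rotation when the confidence is high enough. It is not the processing actually used in this thread. The function names and the `min_confidence` threshold of 1.0 are assumptions, the field names follow the `Key: value` layout that Tesseract 4 prints for `--psm 0`, and the sample values are made up.

```python
def parse_osd(text):
    """Parse 'Key: value' lines from OSD output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def trusted_rotation(text, min_confidence=1.0):
    """Return the rotation in degrees, or None if the confidence is too low.

    `min_confidence` is a hypothetical threshold chosen for this example.
    """
    fields = parse_osd(text)
    confidence = float(fields["Orientation confidence"])
    if confidence <= min_confidence:
        return None
    return int(fields["Rotate"])


# Made-up sample values, for illustration only.
sample = """Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 0.52
Script: Latin
Script confidence: 1.25
"""

print(trusted_rotation(sample))  # confidence below threshold, so None
```

Returning None instead of a low-confidence angle lets the caller fall back to another check rather than rotating the page on a weak guess.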
For the attached file, the latest master version creates a nonsense result.
command line :
tesseract --psm 0 "C:\temp\input.png" "C:\output"
OSD result with master
OSD Result with 20180608 version
The file has no rotation!
The old version is correct; the current master branch code behaves wrongly and claims 180 degrees rotation, according to my findings.
Best Regards