OSD not working again with --psm 0 after latest 20181030 binary release #2062

CanadianHusky · 2018-11-18T08:54:22Z

Environment

Tesseract Version: 4.0.0.20181030 (regression against 4.0.0-rc1)
Platform: Windows 64-bit

Clean install of the binary release from:

https://github.com/UB-Mannheim/tesseract/wiki
https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v4.0.0.20181030.exe

Current Behavior:

Orientation is detected incorrectly for the supplied file with the command line shown.

WRONG Result :

Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 14.00
Script: Latin
Script confidence: nan

Expected Behavior:

Compare the same input against 4.0.0-rc1.

CORRECT Result :

Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.54
Script: Latin
Script confidence: 33.33

Based on tests on thousands of files, the orientation confidence value in the rc1 version is extremely accurate and makes sense. It is used as a threshold to decide whether the result can be trusted.
The result from the 20181030 release is badly wrong.
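
A minimal sketch of how such a threshold check could be scripted, assuming the `--psm 0` output format shown in this report. The function names and the `min_conf` default are hypothetical and not part of Tesseract. The NaN check reflects the observation above that the wrong result came with a `nan` script confidence.

```python
import math

def _to_float(value):
    # Windows builds may print "-nan(ind)", which float() cannot parse.
    return math.nan if "nan" in value.lower() else float(value)

def parse_osd(text):
    """Parse the 'Key: value' lines printed by `tesseract <image> - --psm 0`."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "orientation": int(fields["Orientation in degrees"]),
        "rotate": int(fields["Rotate"]),
        "orientation_conf": _to_float(fields["Orientation confidence"]),
        "script": fields["Script"],
        "script_conf": _to_float(fields["Script confidence"]),
    }

def is_trustworthy(osd, min_conf=0.5):
    # min_conf is a hypothetical threshold; tune it on your own files.
    # Any NaN confidence is treated as untrustworthy.
    if math.isnan(osd["orientation_conf"]) or math.isnan(osd["script_conf"]):
        return False
    return osd["orientation_conf"] >= min_conf
```

Note that a check like this can only reject low or invalid confidence values; it cannot catch a result that is wrong but reported with high, valid confidence.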

Input Image :

Suggested Fix:

Investigate what led to the regression in the OSD code.

Thank you kindly.

stweil · 2018-11-19T10:56:36Z

This could be related to the changed handling of the alpha channel in PNG images: the latest Tesseract code replaces the alpha channel by white.

@CanadianHusky, could you please try both versions with the same image in other formats (for example JPEG or TIFF) or with a PNG without alpha channel?
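
For reference, replacing transparency with white amounts to compositing each pixel over a white background. The sketch below shows that arithmetic for a single 8-bit RGBA pixel. It illustrates the general technique only and is not taken from the Tesseract or Leptonica sources; the function name is hypothetical.

```python
def flatten_on_white(pixel):
    """Composite one 8-bit (r, g, b, a) pixel over a white background."""
    r, g, b, a = pixel

    def blend(c):
        # "Over" blend with white (255): c*a/255 + 255*(1 - a/255),
        # computed in integers with rounding to nearest.
        return (c * a + 255 * (255 - a) + 127) // 255

    return (blend(r), blend(g), blend(b))
```

A fully opaque pixel keeps its colour, and a fully transparent pixel becomes pure white regardless of its stored colour values.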

CanadianHusky · 2018-12-06T11:59:13Z

Hello,

@stweil
I have tested the RC3, RC4 and final 4-20181030 builds.
I used BMP and JPG versions of the same image as input.
All of them show the same problem and fail to detect the orientation correctly, which used to work in RC1.
The problem must have been introduced somewhere between RC1 and RC3.
Thank you

CanadianHusky · 2019-03-15T09:46:19Z

Hello, I see a new pre-compiled release at https://digi.bib.uni-mannheim.de/tesseract/ for

tesseract-ocr-w64-setup-v4.1.0.20190314.exe

and tested that release against the issue mentioned above.

The result on the input image is still incorrect.
I am unsure whether the binary release I used is really a 4.1.0 release or an intermediate build.

thank you

stweil · 2019-03-15T10:27:34Z

That binary is based on latest Tesseract sources (Git master).

zdenop · 2019-05-09T16:17:40Z

@CanadianHusky: you can copy terminal output by selecting it with the left mouse button and then right-clicking in the terminal, which puts the selection on the clipboard. That is more useful than screenshots.

I ran a test with the latest code (5.0.0-alpha-50-g3f4dc) and the best tessdata:

> tesseract i2062.png - --dpi 175 -c min_characters_to_try=10 --psm 0 -l eng
Warning, detects only orientation with -l eng
Warning. Invalid resolution 0 dpi. Using 70 instead.
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 14.00
Script: Latin
Script confidence: -nan(ind)

But if I omit the language specification (eng should be used anyway), I get a different result:

> tesseract i2062.png - --dpi 175 -c min_characters_to_try=10 --psm 0
Warning. Invalid resolution 0 dpi. Using 70 instead.
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.28
Script: Greek
Script confidence: 4.36

Orientation detection is correct, but the script is wrong. It is quite strange that specifying the eng language causes a different result...

zdenop · 2019-05-09T16:29:33Z

And using tessdata (i.e. not fast, not best) gives the correct result:

tesseract i2062.png - --psm 0 --tessdata-dir tessdata -c min_characters_to_try=10 -l eng
Warning, detects only orientation with -l eng
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 174
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.54
Script: Latin
Script confidence: 33.33

It seems that the LSTM model is not able to detect the orientation correctly on this kind of image (too few characters), but the legacy engine works fine:

pi@raspberrypi:/usr/src/test $ tesseract i2062.png - --psm 0 --tessdata-dir tessdata --oem 0 --dpi 175 -c min_characters_to_try=10 -l eng
Warning, detects only orientation with -l eng
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.54
Script: Latin
Script confidence: 33.33
pi@raspberrypi:/usr/src/test $ tesseract i2062.png - --psm 0 --tessdata-dir tessdata --oem 1 --dpi 175 -c min_characters_to_try=10 -l eng
Warning, detects only orientation with -l eng
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 14.00
Script: Latin
Script confidence: nan
pi@raspberrypi:/usr/src/test $ tesseract i2062.png - --psm 0 --tessdata-dir tessdata --oem 2 --dpi 175 -c min_characters_to_try=10 -l eng
Warning, detects only orientation with -l eng
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.54
Script: Latin
Script confidence: 33.33
pi@raspberrypi:/usr/src/test $ tesseract i2062.png - --psm 0 --tessdata-dir tessdata --oem 3 --dpi 175 -c min_characters_to_try=10 -l eng
Warning, detects only orientation with -l eng
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 0.54
Script: Latin
Script confidence: 33.33
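
The four runs above can be repeated with a small script. This is a sketch only: it assumes `tesseract` is on the PATH and reuses exactly the options from the commands above. The helper names are hypothetical.

```python
import subprocess

def osd_command(image, oem, tessdata_dir="tessdata", lang="eng"):
    """Build the OSD-only command line used in the tests above."""
    return [
        "tesseract", image, "-",
        "--psm", "0",
        "--tessdata-dir", tessdata_dir,
        "--oem", str(oem),
        "--dpi", "175",
        "-c", "min_characters_to_try=10",
        "-l", lang,
    ]

def run_osd(image, oem, **kwargs):
    """Run tesseract and return its stdout (requires tesseract on PATH)."""
    result = subprocess.run(osd_command(image, oem, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout

def orientation_by_oem(image, oems=(0, 1, 2, 3)):
    """Map each --oem value to the reported 'Orientation in degrees'."""
    found = {}
    for oem in oems:
        for line in run_osd(image, oem).splitlines():
            if line.startswith("Orientation in degrees:"):
                found[oem] = int(line.split(":", 1)[1])
    return found
```

Calling `orientation_by_oem("i2062.png")` would then return one orientation per engine mode, which makes a disagreement between `--oem 1` and the other modes easy to spot.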

zdenop · 2019-05-09T17:27:47Z

More details that may shed some light on how it works:

If no language is specified, only osd.traineddata is used (according to the strace report). That is why script detection is not correct.
When a language is specified with -l eng, then:

first eng.traineddata is opened,
next the image is opened,
and then osd.traineddata is opened.

I am not sure whether we can or want to do anything about this.

CanadianHusky · 2019-05-09T19:10:58Z

As soon as I see a stable binary release that I can test, I will try the suggested command line options.
If using the --oem option with the right value detects the correct orientation with a reasonable confidence value, that is sufficient. It does not matter to me personally whether the detection is done with LSTM or legacy code. Of course it is very desirable that this sort of orientation detection works as fast as possible. I appreciate the information provided. Thank you @zdenop

zdenop · 2019-05-09T20:22:37Z

If my observation is correct, you do not need to wait for a stable release: just use the tessdata repository for OSD.

stweil · 2019-06-23T18:39:20Z

@zdenop, it is normal that only osd.traineddata is used if no explicit language was given. That file includes a selection of more than 1700 Unicode characters from different scripts which are used to detect the right script. It is only available for the legacy OCR engine. Therefore it won't work if you use --oem 1 or compile Tesseract without that engine.

My tests with latest Tesseract code all give the right orientation as long as I do not add --oem 1.

zdenop · 2019-06-24T08:26:09Z

So what is the status of this issue? Can it be closed?

stweil · 2019-06-24T08:59:42Z

@CanadianHusky, do you still have that problem?

CanadianHusky · 2019-06-25T11:06:49Z

Orientation detection still has problems for me. Here are my test results, after having adjusted the command line as recommended by @stweil

Test environment :
clean install from https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.0.20190623.exe

All 3 input images are at 0 degrees, but are detected with an incorrect result.
I admit that the input 3 image is of poor quality, and a higher preprocessing resolution does find the correct result. However, input 2 and input 4 are about as good as images are going to get, with clean and large enough letters, so I would have expected a correct result.

Am I still doing something wrong in the command line?

input2 image :

input 3 image :

input 4 image :

Also worth noting: adding -l eng (or -l deu) changes the orientation detection result. It is still incorrect, but reported with very high confidence.

Shreeshrii · 2019-07-27T02:06:55Z

Please see https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/tesseract-ocr/9HSpp7Ysduw/r8FPCHhBFAAJ

It might be related to this OSD issue.

amitdo · 2020-05-17T22:49:56Z

Reading the comments from @zdenop and @stweil, it seems that there is no regression in newer versions with the first image in this issue.

Nobody commented about the other images. It is not clear if the OP claims that there is a regression here too, or just complains about the wrong result.

amitdo · 2020-05-18T00:33:48Z

I tested the input2 image.

I got correct result with:

tesseract input2.png input2 --psm 0 -l eng --tessdata-dir $testadadir/tessdata -c min_characters_to_try=10

console:

Warning, detects only orientation with -l eng
Tesseract Open Source OCR Engine v5.0.0-alpha-580-g87841 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 225
Warning. Invalid resolution 0 dpi. Using 70 instead.

input2.osd

Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 1.36
Script: Latin
Script confidence: 29.17

I'm not going to bother testing more images.

CanadianHusky · 2020-05-18T07:45:57Z

Thank you for revisiting this issue. In the meantime I have discovered the source of the inconsistency.
The issue is not a regression in the code itself but depends on which traineddata file is used.
When I do a clean install from https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.0.20190623.exe or any recent release...

This data file is installed

Now observe these tests, in which only the -l argument changes. The expected result is 0 degrees and a meaningful confidence value.

C:\Program Files\Tesseract-OCR>tesseract --version
tesseract v5.0.0-alpha.20191030
 leptonica-1.78.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX
 Found SSE
 Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5

C:\Program Files\Tesseract-OCR>tesseract --tessdata-dir "C:\Program Files\Tesseract-OCR\tessdata" --psm 0 -l eng -c min_characters_to_try=10 "input2.png" stdout
Warning, detects only orientation with -l eng
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 50.00
Script: Latin
Script confidence: 2.00

WRONG 

C:\Program Files\Tesseract-OCR>tesseract --tessdata-dir "C:\Program Files\Tesseract-OCR\tessdata" --psm 0 -l eng_15040 -c min_characters_to_try=10 "input2.png" stdout
Warning, detects only orientation with -l eng_15040
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 50.00
Script: Latin
Script confidence: 2.00

WRONG

C:\Program Files\Tesseract-OCR>tesseract --tessdata-dir "C:\Program Files\Tesseract-OCR\tessdata" --psm 0 -l eng_22917 -c min_characters_to_try=10 "input2.png" stdout
Warning, detects only orientation with -l eng_22917
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 1.38
Script: Latin
Script confidence: 30.00

CORRECT!

Here are the traineddata files:

These are the files in tessdata. Clearly the source of the issue for me is that the original file installed with the binary distribution does not give the expected result. The file eng_22917 was downloaded separately from the traineddata repository.

I would be interested to know what size your eng.traineddata file is and where it is from.

The source for my trained data files are as follows:

https://github.com/tesseract-ocr/tessdata/blob/master/eng.traineddata
22917kb and the only file that works for orientation detection
probably because it has the legacy models that OSD code needs

https://github.com/tesseract-ocr/tessdata_fast/blob/master/eng.traineddata
4017kb, also part of the binary installation, does not work with --psm 0 for orientation detection purposes for me

https://github.com/tesseract-ocr/tessdata_best/blob/master/eng.traineddata
15040kb, does not work with --psm 0 for orientation detection purposes for me
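
To check which variant is installed, the file size can be compared against the sizes listed above. This is a rough heuristic sketch: the reference sizes are only the ones reported in this thread and vary between commits, it applies to the English files only, and the function names are hypothetical.

```python
from pathlib import Path

# Approximate eng.traineddata sizes reported in this thread, in KB.
# They change between commits, so treat the result as a hint only.
REPORTED_KB = {"tessdata": 22917, "tessdata_best": 15040, "tessdata_fast": 4017}

def guess_repo(size_kb):
    """Return the repository whose reported size is closest to size_kb."""
    return min(REPORTED_KB, key=lambda name: abs(REPORTED_KB[name] - size_kb))

def eng_traineddata_report(tessdata_dir):
    """Map each eng*.traineddata file name to (size in KB, guessed repository)."""
    report = {}
    for path in sorted(Path(tessdata_dir).glob("eng*.traineddata")):
        size_kb = path.stat().st_size // 1024
        report[path.name] = (size_kb, guess_repo(size_kb))
    return report
```

For example, `eng_traineddata_report(r"C:\Program Files\Tesseract-OCR\tessdata")` would list every English traineddata file in the installation directory together with its likely origin.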

It took me a very long time to understand and figure out this issue. I hope this information helps someone else. I have closed the issue.

I suppose the question now is whether it makes sense to add a note to the binary distribution, or elsewhere in the release notes from @stweil, that the included default traineddata file is the fast integer model. That is totally fine for most users when all they want to do is regular OCR. For anyone who is interested in OSD only, like me, the traineddata file from the tessdata repository linked above must be used, as far as I can see from my tests.
Thanks again for having this pinned and looked into. Much appreciated.

amitdo · 2020-05-18T21:43:13Z

"I would be interested to know what size your eng.traineddata file is and where it is from."

I used eng.traineddata from the tessdata repo.

https://github.com/tesseract-ocr/tessdata/blob/d87b3cbc7555/eng.traineddata

Size: 24.5 MB (24,530,234 bytes)

zdenop added the accuracy label Nov 30, 2018

stweil pinned this issue Jan 9, 2019

stweil unpinned this issue Jan 9, 2019

stweil pinned this issue Jan 9, 2019

zdenop added this to the 4.1.0 milestone Feb 16, 2019

zdenop mentioned this issue May 16, 2019

make check fails with unittest/baseapi_test.cc:190: undefined reference #2439

Closed

CanadianHusky mentioned this issue May 28, 2019

32bit version fails to launch UB-Mannheim/tesseract#7

Closed

amitdo mentioned this issue Nov 15, 2019

tesseract 4 --oem 0 baseline error with rotated pages #2086

Closed

amitdo mentioned this issue May 9, 2020

OSD mode (--psm 0) always returns dummy results when language (-l eng, -l pol, etc.) is specified #2931

Closed

amitdo added the OSD Orientation and Script Detection label May 14, 2020

amitdo unpinned this issue May 18, 2020

CanadianHusky closed this as completed May 18, 2020
