Tesseract outputs empty files for valid png image #1361
Comments
I suggest disabling OpenCL support when building Tesseract (any version) unless you want to improve the OpenCL code. And of course it should be possible to build Tesseract 4. |
I will try to rebuild it without OpenCL, thanks for the suggestion. Can you tell me whether the Windows version ships with OpenCL support, or is OpenCL still being developed and not intended for use? |
I could not install latest tesseract version 4 because it does not compile, it fails to build,
What error are you getting?
On Mon 5 Mar, 2018, 10:58 AM pager72 wrote:
I have Tesseract installed on Gentoo Linux with all its components enabled. Here's what I do to run it:
$ tesseract test.png out.txt
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:AMD CARRIZO (DRM 3.23.0 / 4.15.6-gentoo, LLVM 5.0.1) score is 0.318045
[DS] Device[2] 0:(null) score is 1.448047
[DS] Selected Device[1]: "AMD CARRIZO (DRM 3.23.0 / 4.15.6-gentoo, LLVM 5.0.1)" (OpenCL)
Tesseract Open Source OCR Engine v3.05.01 with Leptonica
Page 1
attached file: tiff image to be ocr'ed, tesseract --print-parameters and tesseract -v
attached files:
png image: <https://user-images.githubusercontent.com/37059493/36958719-a2075ba4-1ff2-11e8-914f-1f14b77d74aa.png>
tesseract --print-parameters: print-parameters.txt <https://github.com/tesseract-ocr/tesseract/files/1779595/print-parameters.txt>
tesseract -v: version-info.txt <https://github.com/tesseract-ocr/tesseract/files/1779596/version-info.txt>
I could not install the latest Tesseract version 4 because it fails to build; however, the version I am using is stable. I hope I get some pointers on how to remedy this.
|
I will rebuild version 4 and post the build log. However, I must say that disabling OpenCL has resolved the issue. After testing, I found Tesseract to be a poor OCR engine; I'm sure Acrobat is better. |
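One way to confirm that a rebuilt binary no longer goes through OpenCL is to scan its output for the `[DS]` device-selection lines shown in the report in this thread. The following is only a minimal sketch: the function name `selected_opencl_device` is hypothetical, the pattern is taken from the v3.05.01 output quoted here, and it is an assumption that a build without OpenCL prints no `[DS]` lines at all.

```python
import re

# Pattern for the line: [DS] Selected Device[n]: "<name>" (<backend>)
# It is modelled on the output quoted in this thread; other Tesseract
# versions may word it differently (assumption).
_SELECTED = re.compile(
    r'\[DS\] Selected Device\[(\d+)\]:\s*"([^"]*)"\s*\((\w+)\)'
)


def selected_opencl_device(log_text):
    """Return (index, name, backend) for the selected device, or None."""
    # Pasted logs often wrap the device name across lines, so collapse
    # all runs of whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    match = _SELECTED.search(flat)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)
```

Feeding it the captured output of a `tesseract` run should return the selected device while OpenCL is compiled in and, under the assumption above, `None` once it is disabled.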
Tesseract works quite well, especially for English. The problem could be related to using an incompatible traineddata file or --oem mode. Please provide a test input file on which you got poor results.
|
Strangely, disabling OpenCL for version 4 let the build complete without failure. However, here is what I found: from my previous experience with Acrobat, the box labeled "swanson" would be OCR'ed because most of its text is legible; not so with Tesseract. Don't get me wrong, I really like that it is open source and that it can write text-searchable PDFs, which is what I do. But maybe I should disable training support and rebuild for better results? Can better results be achieved? Resulting PDF: Please tell me if you achieve better results. |
Disabling training would not change the results. Using data from http://github.com/tesseract-ocr/tessdata_fast would (unless you already used that). |
Is there a guide on how to use this? |
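As a rough illustration of the suggestion above, the sketch below builds a Tesseract 4 command line that points at a local copy of the tessdata_fast models. It assumes the repository (or at least `eng.traineddata` from it) has already been downloaded; the helper names and the paths in the usage note are hypothetical. The options used (`--tessdata-dir`, `-l`, `--oem`, and the `pdf` config) are standard Tesseract 4 command-line options.

```python
import subprocess


def build_tesseract_command(image, output_base, tessdata_dir,
                            lang="eng", pdf=False):
    """Build a Tesseract 4 command line that uses models in tessdata_dir."""
    cmd = [
        "tesseract", image, output_base,
        "--tessdata-dir", tessdata_dir,  # directory holding <lang>.traineddata
        "-l", lang,                      # language model to load
        "--oem", "1",                    # LSTM engine, which tessdata_fast targets
    ]
    if pdf:
        cmd.append("pdf")                # config that writes a searchable PDF
    return cmd


def run_ocr(image, output_base, tessdata_dir, lang="eng", pdf=False):
    """Run Tesseract; raises CalledProcessError if it exits with an error."""
    cmd = build_tesseract_command(image, output_base, tessdata_dir, lang, pdf)
    subprocess.run(cmd, check=True)
```

For example, `run_ocr("test.png", "out", "/path/to/tessdata_fast", pdf=True)` would be expected to write `out.pdf`. Tesseract appends the extension to the output base itself, so the base name is given without `.txt` or `.pdf`.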
Then please close the issue. |
I have Tesseract installed on Gentoo Linux with all its components enabled. Here's what I do to run it
attached files:
png image
tesseract --print-parameters print-parameters.txt
tesseract -v version-info.txt
I could not install the latest Tesseract version 4 because it fails to build; however, the version I am using is stable. I hope I get some pointers on how to remedy this.