TessPdfRenderer not working with jpg files #31
Comments
Hi, I have made no intentional changes to TessPdfRenderer, so I would expect it to work the same as in the tess-two library. Based on your link, it probably has something to do with missing libjpeg support. Maybe I missed some parameter needed to enable libjpeg when compiling Leptonica. If that is the case, I expect that line …
@PabloRodrizHp I see. In that case you can try:
Btw, the difference between the jpgt / jpeg names is not relevant. I just wanted to use more explicit names for the libraries.
Sorry @Robyer. Maybe I failed to explain that I am not compiling the library myself; I am using the dependency. So I guess my approach for now will be to try the 5.0.0 Tesseract branch, whose dependency is …
I see. In that case you are not really using alexcohn's tess-two, because he doesn't provide a compiled library. His README is copied from the original rmtheis tess-two, so the dependency line that you copy into your build.gradle really pulls in the original tess-two library, which uses Tesseract 3.x. I just wanted to clarify this. I couldn't even make alexcohn's library work when I manually compiled his latest code. Anyway, I've reproduced the problem and found the cause. It comes from a change in the Tesseract code itself, but it's easy to work around. I have just committed the fix to the master branch. For now you need to compile this library yourself, as I don't have time to release a new version yet.
Hi @Robyer. I kind of understood that I was not really using alexcohn's tess-two when I saw it was Tesseract 3, but thanks for helping me confirm it. Also, thank you for the fix. I will try to compile it myself then. Btw, the tests were carried out on Android Oreo (API 26) with the master branch (Tesseract 4).
See discussion at tesseract-ocr/tesseract#3317
@PabloRodrizHp I committed a better fix for this issue and also released a new version of the library. You can use version 4.1.1 now.
Hi.
Having used the alexcohn/tess-two repository, the TessPdfRenderer works with jpg files, but when using the latest version of this library, it doesn't. The generated PDF apparently has the text in it, but not the image, and the workaround is to create a PNG out of the JPG file, which in some situations adds up to 2 seconds of processing. This situation is quite similar to the old issue (from 2015) found in the original rmtheis repository.
Here is the code that works with 'com.rmtheis:tess-two:9.1.0' but not with this library:
Am I missing something in my code, so that it works with the other library and not this one?