This repository has been archived by the owner on Jan 13, 2023. It is now read-only.
OCR Test failure #4
Comments
I've compared the outputs produced by:
and
For some reason the output of the first one (the older version) seems to be of better quality, with fewer misspelled words and fewer wrongly recognised characters. It might be due to some non-obvious tesseract configuration in the first case that changed with the fresh installation of the newer version.
jstuczyn added a commit that referenced this issue on Jun 15, 2017
The test is preventing building cogstack which is due to possible tesseract misconfiguration (#4)
This might be a consequence of resolving issue #3, as the build does not finish successfully because it fails at the test stage. The assertion
assertTrue(parsedString.contains("Father or mother"))
in testParseRequiringOCR inside PDFPreprocessorParserTest.java fails. Rather than being recognised as "Father or mother", the OCR'd document contains the string "Father er mother".
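To make the failure concrete, here is a minimal standalone sketch of the check, not the project's actual test. The class name OcrAssertionSketch and the helpers containsExpected and firstMismatch are hypothetical names introduced for illustration; only the two strings come from the report above. It shows that an exact substring check rejects the OCR output because of a single misrecognised character.

```java
public class OcrAssertionSketch {
    // The string the test expects to find in the parsed document.
    static final String EXPECTED = "Father or mother";

    // Hypothetical helper: mirrors parsedString.contains("Father or mother").
    public static boolean containsExpected(String parsedString) {
        return parsedString.contains(EXPECTED);
    }

    // Hypothetical helper: index of the first differing character,
    // or -1 if the two strings are identical.
    public static int firstMismatch(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : n;
    }

    public static void main(String[] args) {
        // The string reported in this issue as the actual OCR result.
        String ocrOutput = "Father er mother";

        System.out.println(containsExpected(ocrOutput));        // prints "false"
        System.out.println(firstMismatch(EXPECTED, ocrOutput)); // prints "7"
    }
}
```

The mismatch at index 7 is the 'o' of "or" read as 'e', so the assertion fails on one character even though the rest of the phrase was recognised.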