Add Google code for more unittests #1863
Some of the new test files, which are not yet activated, use the Abseil API (https://abseil.io/, https://github.com/abseil/abseil-cpp). We have to decide whether to replace those calls, add a minimal implementation to Tesseract, or require a full Abseil installation (perhaps as a Git submodule, about 10 MB). These tests are affected:
These tests depend neither on the Abseil API nor on unavailable TIFF images, so they might be the next candidates to be fixed and added to our test set:
Still missing (2019-07-06):
This allows using the class for unittests, too. Signed-off-by: Stefan Weil <[email protected]>
Signed-off-by: Stefan Weil <[email protected]>
They were provided by Jeff Breidenbach <[email protected]>. Signed-off-by: Stefan Weil <[email protected]>
We are already using Googletest and test (testdata) as submodules that are required only for testing, so I think it will be OK to add Abseil as a submodule too.
It is mentioned here: Also check
Try to extract an eng.traineddata file that contains the data for the legacy engine. https://github.com/tesseract-ocr/tesseract/blob/5fdaa479da2c/doc/combine_tessdata.1.asc
Try to replace with GENERIC_2D_ARRAY
https://www.google.com/search?q=%22StringPrintf%22+%22google%22+%22apache%22
https://github.com/tensorflow/models/search?q=%22unicodetext.h%22
https://www.google.com/search?q=%22subprocess.h%22+%22google%22+%22apache%22
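If no suitably licensed `StringPrintf` turns up, a small local replacement might be enough for the tests. The following is only a sketch under that assumption: the name and signature are chosen to mirror how such a helper is commonly called, and it is not the code from Google, Abseil, or Tesseract.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a printf-style string formatter.
// Assumption: callers only need "format into a std::string" behaviour.
std::string StringPrintf(const char* format, ...) {
  assert(format != nullptr);

  va_list args;
  va_start(args, format);
  // A va_list may only be consumed once, so keep a copy for the second pass.
  va_list args_copy;
  va_copy(args_copy, args);

  // First pass: a null buffer of size 0 makes vsnprintf report the length
  // of the formatted text (excluding the terminating NUL).
  const int needed = std::vsnprintf(nullptr, 0, format, args);
  va_end(args);
  if (needed < 0) {
    // Formatting error: return an empty string rather than garbage.
    va_end(args_copy);
    return std::string();
  }

  // Second pass: format into a string of exactly the required size.
  // The extra byte passed to vsnprintf covers the terminating NUL.
  std::string result(static_cast<std::size_t>(needed), '\0');
  std::vsnprintf(&result[0], result.size() + 1, format, args_copy);
  va_end(args_copy);
  return result;
}
```

A test could then call, for example, `StringPrintf("%s.%d", "page", 3)` wherever the original code used the Google helper. The two-pass approach avoids a fixed-size buffer at the cost of formatting twice, which should not matter in unit tests.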
I'm just preparing the |
Yes, the code is obviously using glog. Pull request #1980 includes a rudimentary implementation for |
Latest Tesseract now has 40 unit tests (with lots of subtests) which pass successfully:
Good work. @stweil and @Shreeshrii, thank you! How many tests are not activated?
Some additional changes to the Tesseract code base make integration of the new tests easier.
As an example, bitvector_test and some more tests have now been added to the Tesseract test set.