Skip to content

Commit

Permalink
Fix baseapi_test with locale de_DE.UTF-8
Browse files Browse the repository at this point in the history
The unittest failed with LANG=de_DE.UTF-8:

    $ unittest/baseapi_test
    Running main() from ../../../../unittest/../googletest/googletest/src/gtest_main.cc
    [==========] Running 12 tests from 2 test suites.
    [----------] Global test environment set-up.
    [----------] 10 tests from TesseractTest
    [ RUN      ] TesseractTest.ArraySizeTest
    [       OK ] TesseractTest.ArraySizeTest (0 ms)
    [ RUN      ] TesseractTest.BasicTesseractTest
    [       OK ] TesseractTest.BasicTesseractTest (1251 ms)
    [ RUN      ] TesseractTest.IteratesParagraphsEvenIfNotDetected
    [       OK ] TesseractTest.IteratesParagraphsEvenIfNotDetected (347 ms)
    [ RUN      ] TesseractTest.HOCRWorksWithoutSetInputName
    [       OK ] TesseractTest.HOCRWorksWithoutSetInputName (403 ms)
    [ RUN      ] TesseractTest.HOCRContainsBaseline
    [       OK ] TesseractTest.HOCRContainsBaseline (389 ms)
    [ RUN      ] TesseractTest.RickSnyderNotFuckSnyder
    [       OK ] TesseractTest.RickSnyderNotFuckSnyder (346 ms)
    [ RUN      ] TesseractTest.AdaptToWordStrTest
    Trying to adapt "136
    " to "1 3 6"
    Trying to adapt "256
    " to "2 5 6"
    Trying to adapt "410
    " to "4 1 0"
    Trying to adapt "432
    " to "4 3 2"
    Trying to adapt "540
    " to "5 4 0"
    Trying to adapt "692
    " to "6 9 2"
    Trying to adapt "779
    " to "7 7 9"
    Trying to adapt "793
    " to "7 9 3"
    Trying to adapt "808
    " to "8 0 8"
    Trying to adapt "815
    " to "8 1 5"
    Trying to adapt "12
    " to "1 2"
    Trying to adapt "12
    " to "1 2"
    [       OK ] TesseractTest.AdaptToWordStrTest (788 ms)
    [ RUN      ] TesseractTest.BasicLSTMTest
    [       OK ] TesseractTest.BasicLSTMTest (4525 ms)
    [ RUN      ] TesseractTest.LSTMGeometryTest
    [       OK ] TesseractTest.LSTMGeometryTest (615 ms)
    [ RUN      ] TesseractTest.InitConfigOnlyTest
    Error: unichar ? in normproto file is not in unichar set.
    Error: unichar 0.232621 in normproto file is not in unichar set.
    Error: unichar 0.000400 in normproto file is not in unichar set.
    Error: unichar 0.231864 in normproto file is not in unichar set.
    [...]
    Error: unichar ? in normproto file is not in unichar set.
    Error: unichar 0.233915 in normproto file is not in unichar set.
    Error: unichar 0.000400 in normproto file is not in unichar set.
    Error: unichar 0.221755 in normproto file is not in unichar set.
    Error: unichar 0.000400 in normproto file is not in unichar set.
    Error: unichar ? in normproto file is not in unichar set.
    baseapi_test(21845,0x1134c45c0) malloc: *** error for object 0x927f96c28005e0: pointer being freed was not allocated
    baseapi_test(21845,0x1134c45c0) malloc: *** set a breakpoint in malloc_error_break to debug
    [INFO]  Lang eng took 327ms in regular init
    [INFO]  Lang chi_tra took 1422ms in regular init
    Abort trap: 6

TesseractTest.InitConfigOnlyTest is fixed by using std::istringstream
instead of sscanf.

Signed-off-by: Stefan Weil <[email protected]>
  • Loading branch information
stweil committed May 16, 2019
1 parent 0dcc889 commit 36ed6da
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions src/classify/normmatch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@

#include <cstdio>
#include <cmath>
#include <sstream> // for std::istringstream

#include "classify.h"
#include "clusttool.h"
Expand Down Expand Up @@ -113,7 +114,7 @@ float Classify::ComputeNormMatch(CLASS_ID ClassId,
feature.Params[CharNormRx] * 8000.0 +
feature.Params[CharNormRy] *
feature.Params[CharNormRy] * 8000.0);
return (1.0 - NormEvidenceOf (Match));
return (1.0 - NormEvidenceOf(Match));
}

BestMatch = FLT_MAX;
Expand Down Expand Up @@ -209,7 +210,11 @@ NORM_PROTOS *Classify::ReadNormProtos(TFile *fp) {
const int kMaxLineSize = 100;
char line[kMaxLineSize];
while (fp->FGets(line, kMaxLineSize) != nullptr) {
if (sscanf(line, "%s %d", unichar, &NumProtos) != 2) continue;
std::istringstream stream(line);
stream >> unichar >> NumProtos;
if (stream.fail()) {
continue;
}
if (unicharset.contains_unichar(unichar)) {
unichar_id = unicharset.unichar_to_id(unichar);
Protos = NormProtos->Protos[unichar_id];
Expand Down

0 comments on commit 36ed6da

Please sign in to comment.