Compiler Warnings #1036

Closed
hotchkiss87 opened this issue Jul 13, 2017 · 36 comments
Comments

@hotchkiss87
Contributor

I am seeing a number of messages like:

In file included from ../ccutil/clst.h:24:0,
                 from ../ccstruct/blobbox.h:23,
                 from workingpartset.h:24,
                 from workingpartset.cpp:21:
../ccstruct/blobbox.h: In member function ‘void BLOBNBOX::set_reduced_box(TBOX)’:
../ccutil/host.h:79:25: warning: overflow in conversion from ‘int’ to ‘signed char:1’ changes value from ‘1’ to ‘-1’ [-Woverflow]
 #define TRUE            1
                         ^
../ccstruct/blobbox.h:236:17: note: in expansion of macro ‘TRUE’
       reduced = TRUE;
                 ^~~~

Would it be possible to switch the 'debug' flag from 'configure' to include '-Wall -pedantic', or to at least require all patches and changes to compile without any warnings and errors?

It's not unusual for contracts to include a provision that all code must compile without warnings, and it's generally a good practice to at the very least test enough to make sure that you're not getting compiler warnings before heading to production.

It's also much easier for the person who wrote the code to decide if the compiler warnings matter, or, for example, if simply using a cast will result in the same logic if certain bounds are impossible.

I can submit a one time cleanup once #995 is resolved, but in the future it'd be great if reducing compiler warnings would be encouraged.
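
For context, the -Woverflow message quoted above is what GCC reports when the constant 1 is stored in a one-bit signed bit-field, which can only hold 0 and -1. The following is a minimal sketch of that effect. The struct and function names are hypothetical stand-ins, not the real BLOBNBOX declaration, and the -1 result assumes a two's-complement compiler such as GCC or Clang (the result is only guaranteed by the standard from C++20 on).

```cpp
// Hypothetical stand-ins for a class with one-bit flags; not Tesseract's code.
struct SignedFlag {
  signed char reduced : 1;  // one signed bit: representable values are 0 and -1
  // Writing `reduced = 1;` with a literal is what triggers -Woverflow in GCC.
  void set(int value) { reduced = value; }
};

struct BoolFlag {
  bool reduced : 1;  // one bool bit: representable values are false and true
  void set(bool value) { reduced = value; }
};

// Returns the value read back after storing 1 in the signed one-bit field.
int StoredSignedValue() {
  SignedFlag flag{};
  flag.set(1);
  return flag.reduced;
}

// Returns the value read back after storing true in the bool one-bit field.
int StoredBoolValue() {
  BoolFlag flag{};
  flag.set(true);
  return flag.reduced;
}
```

Under those assumptions `StoredSignedValue()` yields -1 and `StoredBoolValue()` yields 1. A plain truth test such as `if (reduced)` behaves the same in both cases, but a comparison like `reduced == TRUE` would be false for the signed field.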


Environment

  • Tesseract Version: 4.0.0(alpha or dev)
  • Platform: Linux [hostname] 4.12.0 #1 SMP Tue Jul 11 14:56:49 EDT 2017 x86_64 Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz GenuineIntel GNU/Linux
@stweil
Member

stweil commented Jul 15, 2017

I addressed the reported and some more warnings in pull request #1039.

@amitdo
Collaborator

amitdo commented Jul 23, 2017

See also #584

@Shreeshrii
Collaborator

@stweil Can this be closed now?

@stweil
Member

stweil commented Apr 30, 2018

We still get lots of warnings, for example 1370 warnings with Visual Studio. This issue can be closed if we agree that all those warnings can be ignored.

If we care about warnings, we also have to decide which compiler warnings should be enabled for the GNU compiler and for Clang.

@Shreeshrii
Collaborator

Is there a way to list the unique warnings?

I looked at the console output, most seem related to type conversion, but there are some similar to

warning C4566: character represented by universal-character-name '\u2606' cannot be represented in the current code page (1252)

Some related discussion at
https://stackoverflow.com/questions/12040539/utf-8-compatibility-in-c

@Shreeshrii
Collaborator

C4018: '<': signed/unsigned mismatch
C4018: '>=': signed/unsigned mismatch
C4068: unknown pragma
C4146: unary minus operator applied to unsigned type, result still unsigned
C4244: '*=': conversion from 'FLOAT64' to 'FLOAT32', possible loss of data
C4244: '*=': conversion from 'const double' to 'float', possible loss of data
C4244: '*=': conversion from 'double' to 'FLOAT32', possible loss of data
C4244: '*=': conversion from 'double' to 'PRIORITY', possible loss of data
C4244: '*=': conversion from 'double' to 'float', possible loss of data
C4244: '+=': conversion from 'const int64_t' to 'int', possible loss of data
C4244: '+=': conversion from 'double' to 'float', possible loss of data
C4244: '+=': conversion from 'float' to 'int', possible loss of data
C4244: '-=': conversion from 'double' to 'float', possible loss of data
C4244: '-=': conversion from 'float' to 'int', possible loss of data
C4244: '/=': conversion from 'double' to 'float', possible loss of data
C4244: '=': conversion from 'FLOAT32' to 'int', possible loss of data
C4244: '=': conversion from 'FLOAT64' to 'FLOAT32', possible loss of data
C4244: '=': conversion from 'UNICHAR_ID' to 'float', possible loss of data
C4244: '=': conversion from 'const double' to 'float', possible loss of data
C4244: '=': conversion from 'const tesseract::XHeightConsistencyEnum' to 'float', possible loss of data
C4244: '=': conversion from 'double' to 'FLOAT32', possible loss of data
C4244: '=': conversion from 'double' to 'PRIORITY', possible loss of data
C4244: '=': conversion from 'double' to 'float', possible loss of data
C4244: '=': conversion from 'double' to 'int32_t', possible loss of data
C4244: '=': conversion from 'float' to 'int', possible loss of data
C4244: '=': conversion from 'float' to 'int32_t', possible loss of data
C4244: '=': conversion from 'int' to 'FLOAT32', possible loss of data
C4244: '=': conversion from 'int' to 'float', possible loss of data
C4244: '=': conversion from 'int16_t' to 'int8_t', possible loss of data
C4244: '=': conversion from 'int16_t' to 'uint8_t', possible loss of data
C4244: '=': conversion from 'int64_t' to 'int32_t', possible loss of data
C4244: 'argument': conversion from 'FLOAT32' to 'const int', possible loss of data
C4244: 'argument': conversion from 'FLOAT32' to 'int', possible loss of data
C4244: 'argument': conversion from 'const double' to 'float', possible loss of data
C4244: 'argument': conversion from 'const float' to 'int', possible loss of data
C4244: 'argument': conversion from 'double' to 'FLOAT32', possible loss of data
C4244: 'argument': conversion from 'double' to 'const int', possible loss of data
C4244: 'argument': conversion from 'double' to 'float', possible loss of data
C4244: 'argument': conversion from 'double' to 'int', possible loss of data
C4244: 'argument': conversion from 'double' to 'int16_t', possible loss of data
C4244: 'argument': conversion from 'double' to 'l_float32', possible loss of data
C4244: 'argument': conversion from 'float' to 'int', possible loss of data
C4244: 'argument': conversion from 'float' to 'int16_t', possible loss of data
C4244: 'argument': conversion from 'float' to 'int32_t', possible loss of data
C4244: 'argument': conversion from 'int' to 'float', possible loss of data
C4244: 'argument': conversion from 'int' to 'l_float32', possible loss of data
C4244: 'argument': conversion from 'int32_t' to 'float', possible loss of data
C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
C4244: 'initializing': conversion from 'double' to 'int', possible loss of data
C4244: 'initializing': conversion from 'float' to 'int', possible loss of data
C4244: 'initializing': conversion from 'int' to 'float', possible loss of data
C4244: 'initializing': conversion from 'int16_t' to 'uint8_t', possible loss of data
C4244: 'initializing': conversion from 'int32_t' to 'float', possible loss of data
C4244: 'initializing': conversion from 'int64_t' to 'int', possible loss of data
C4244: 'return': conversion from 'const double' to 'float', possible loss of data
C4244: 'return': conversion from 'double' to 'FLOAT32', possible loss of data
C4244: 'return': conversion from 'double' to 'PRIORITY', possible loss of data
C4244: 'return': conversion from 'double' to 'float', possible loss of data
C4244: 'return': conversion from 'float' to 'int', possible loss of data
C4251: 'tesseract::TessPDFRenderer::offsets_': class 'GenericVector' needs to have dll-interface to be used by clients of class 'tesseract::TessPDFRenderer'
C4251: 'tesseract::TessPDFRenderer::pages_': class 'GenericVector' needs to have dll-interface to be used by clients of class 'tesseract::TessPDFRenderer'
C4267: '=': conversion from 'size_t' to 'int16_t', possible loss of data
C4267: 'argument': conversion from 'size_t' to 'const char', possible loss of data
C4305: '*=': truncation from 'double' to 'FLOAT32'
C4305: '*=': truncation from 'double' to 'float'
C4305: '=': truncation from 'double' to 'FLOAT32'
C4305: '=': truncation from 'double' to 'float'
C4305: 'argument': truncation from 'double' to 'float'
C4305: 'initializing': truncation from 'double' to 'FLOAT32'
C4305: 'initializing': truncation from 'double' to 'float'
C4305: 'initializing': truncation from 'double' to 'l_float32'
C4305: 'initializing': truncation from 'int' to 'float'
C4566: character represented by universal-character-name '\u05BE' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u0640' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2000' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2001' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2002' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2003' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2004' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2005' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u200E' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u200F' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2010' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2011' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2012' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2015' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2028' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2029' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u202A' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u202C' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2032' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2212' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\u2606' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uE003' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uE006' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uE007' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uE008' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uE009' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uFB01' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uFB02' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uFE58' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uFE63' cannot be represented in the current code page (1252)
C4566: character represented by universal-character-name '\uFF0D' cannot be represented in the current code page (1252)
C4838: conversion from 'double' to 'FLOAT32' requires a narrowing conversion
C4996: 'ltoa': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _ltoa. See online help for details.
C4996: 'putenv': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _putenv. See online help for details.
C4996: 'strdup': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _strdup. See online help for details.
C4996: 'ultoa': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _ultoa. See online help for details.
C4996: 'unlink': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _unlink. See online help for details.
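
Most of the entries above fall into a few patterns. The sketch below shows, with hypothetical function names, how C4305, C4244 and C4018 typically arise and one common way to make the intent explicit. Whether a cast or a change of type is the right fix depends on the surrounding code, so this is an illustration rather than a recipe for the Tesseract sources.

```cpp
#include <cstddef>
#include <vector>

// C4305 pattern: `float scale = 0.1;` truncates a double literal.
// An `f` suffix makes the literal a float, so nothing is truncated.
float DefaultScale() {
  return 0.1f;
}

// C4244 pattern: `int n = value * 2.5;` silently narrows double to int.
// static_cast documents that dropping the fraction is intended.
int ScaledCount(float value) {
  return static_cast<int>(value * 2.5);
}

// C4018 pattern: `for (int i = 0; i < values.size(); ++i)` compares
// a signed int with an unsigned size. Using std::size_t for the index
// keeps both operands unsigned.
int CountPositive(const std::vector<int>& values) {
  int count = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] > 0) ++count;
  }
  return count;
}
```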

@amitdo
Collaborator

amitdo commented Apr 30, 2018

C4566: character represented by universal-character-name '\uFF0D' cannot be represented in the current code page (1252)

https://msdn.microsoft.com/en-us/library/mt708819.aspx

@amitdo
Collaborator

amitdo commented May 3, 2018

CC: @egorpugin

About C4566 ...

@stweil
Member

stweil commented May 3, 2018

Pull request #1554 adds /utf-8 for the MS VC compiler which fixes C4566.
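
As background, C4566 appears when a universal-character-name in an ordinary narrow literal has no representation in the execution character set (code page 1252 in the messages above). A small sketch of why UTF-8 helps: in a `u8` literal, which is standard since C++11 and does not depend on the code page, U+2606 is always stored as three bytes. The function name is hypothetical, and this is not a claim about how Tesseract's sources spell their literals.

```cpp
#include <cstddef>

// Number of bytes U+2606 (WHITE STAR) occupies in a UTF-8 literal,
// excluding the terminating null character.
constexpr std::size_t WhiteStarUtf8Length() {
  return sizeof(u8"\u2606") - 1;
}

// UTF-8 encodes code points from U+0800 to U+FFFF in three bytes.
static_assert(WhiteStarUtf8Length() == 3, "U+2606 is three bytes in UTF-8");
```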

@stweil
Member

stweil commented May 3, 2018

Warnings sorted by frequency:

      2 C4146
      4 C4267
      4 C4838
      6 C4068
     14 C4996
     20 C4251
     82 C4018
    416 C4305
   2054 C4244

The most common warning is for conversions from double to float (and its synonyms). This can also be a performance issue because such conversions possibly cost execution time.
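
The thread does not say how this table was produced. As a rough sketch of the same idea, the function below counts MSVC warning codes in a build log. The function name, and the assumption that each warning line contains the text `warning C<number>`, are illustrative and not taken from the thread.

```cpp
#include <istream>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Counts how often each MSVC warning code (for example "C4244") occurs in a log.
// Assumes every warning line contains "warning C<digits>".
std::map<std::string, int> CountWarningCodes(std::istream& log) {
  static const std::regex kWarning("warning (C[0-9]+)");
  std::map<std::string, int> counts;
  std::string line;
  while (std::getline(log, line)) {
    std::smatch match;
    if (std::regex_search(line, match, kWarning)) {
      ++counts[match[1].str()];  // first capture group is the code itself
    }
  }
  return counts;
}

// Convenience wrapper for log text that is already in memory.
std::map<std::string, int> CountWarningCodes(const std::string& text) {
  std::istringstream log(text);
  return CountWarningCodes(log);
}
```

The returned map is ordered by warning code, so producing a list ordered by frequency would need one more sorting step on the counts.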

@egorpugin
Contributor

Pull request #1554 adds /utf-8 for the MS VC compiler which fixes C4566.

True, I also build packages with cppan with /utf-8.

@amitdo
Collaborator

amitdo commented May 3, 2018

Some source files contain pragmas to suppress msvc's warnings.
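
For readers unfamiliar with the mechanism, a pragma of this kind usually looks like the sketch below: the warning is disabled for a limited region and the previous state is restored afterwards. The helper function is a hypothetical example, not code from the Tesseract sources, and the `_MSC_VER` guard keeps other compilers from seeing the MSVC-specific pragma.

```cpp
#if defined(_MSC_VER)
#pragma warning(push)            // remember the current warning state
#pragma warning(disable : 4244)  // conversion, possible loss of data
#endif

// Hypothetical helper that narrows on purpose.
inline float ToFloat(double value) {
  float result = value;  // MSVC may report C4244 here without the pragma
  return result;
}

#if defined(_MSC_VER)
#pragma warning(pop)             // restore the saved warning state
#endif
```

The drawback raised in this thread is that such pragmas hide the warnings instead of fixing the conversions that cause them.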

@stweil
Member

stweil commented May 3, 2018

I'm just testing a new pull request which removes all those pragma statements. Otherwise we get too few warnings. :-)

Compare old with current and planned.

@zdenop
Contributor

zdenop commented Dec 30, 2018

BTW: clang on Windows reports these warnings:

[89/245] Building CXX object CMakeFiles\libtesseract.dir\src\ccutil\globaloc.cpp.obj
..\src\ccutil\globaloc.cpp(33,13):  warning: unused variable 'global_crash_pixes' [-Wunused-variable]
static Pix* global_crash_pixes[kMaxNumThreadPixes];
            ^
1 warning generated.
[157/245] Building CXX object CMakeFiles\libtesseract.dir\src\lstm\lstmtrainer.cpp.obj
..\src\lstm\lstmtrainer.cpp(134,15):  warning: unused variable 'shape' [-Wunused-variable]
  StaticShape shape = network_->OutputShape(network_->InputShape());
              ^
1 warning generated.
[166/245] Building CXX object CMakeFiles\libtesseract.dir\src\textord\bbgrid.cpp.obj
..\src\textord\bbgrid.cpp(103,10):  warning: unused variable 'old_tright' [-Wunused-variable]
  ICOORD old_tright(tright());
         ^
1 warning generated.
[205/245] Building CXX object CMakeFiles\libtesseract.dir\src\textord\tablefind.cpp.obj
..\src\textord\tablefind.cpp(93,11):  warning: unused variable 'kRulingVerticalMargin' [-Wunused-const-variable]
const int kRulingVerticalMargin = 3;
          ^
1 warning generated.
[206/245] Building CXX object CMakeFiles\libtesseract.dir\src\textord\textlineprojection.cpp.obj
..\src\textord\textlineprojection.cpp(754,9):  warning: '#pragma optimize' is not supported [-Wignored-pragma-optimize]
#pragma optimize("g", off)
        ^
..\src\textord\textlineprojection.cpp(762,9):  warning: '#pragma optimize' is not supported [-Wignored-pragma-optimize]
#pragma optimize("", on)
        ^
2 warnings generated.
[208/245] Building CXX object CMakeFiles\libtesseract.dir\src\textord\strokewidth.cpp.obj
..\src\textord\strokewidth.cpp(1331,14):  warning: unused variable 'blob_box' [-Wunused-variable]
        TBOX blob_box(blob->bounding_box());
             ^
1 warning generated.
[210/245] Building RC object CMakeFiles\libtesseract.dir\vs2010\tesseract\libtesseract.rc.res
Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384
Copyright (C) Microsoft Corporation.  All rights reserved.

[222/245] Building CXX object CMakeFiles\libtesseract.dir\src\viewer\svutil.cpp.obj
..\src\viewer\svutil.cpp(92,10):  warning: unused variable 'newthread' [-Wunused-variable]
  HANDLE newthread = CreateThread(nullptr,        // default security attributes
         ^
1 warning generated.
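
Most of the messages above are -Wunused-variable. Where the variable is truly dead, deleting it is the simplest fix. Where the call must stay for its side effect, the result can be discarded explicitly. A small sketch with hypothetical names (`StartWorker` stands in for a call such as CreateThread):

```cpp
// Hypothetical stand-in for a call with a side effect whose result is unused.
static int g_workers_started = 0;

int StartWorker() {
  ++g_workers_started;
  return g_workers_started;  // pretend this is a handle
}

// Before: `int handle = StartWorker();` with `handle` never read
// triggers -Wunused-variable.
// After: keep the call and discard the result explicitly.
void LaunchWithoutHandle() {
  static_cast<void>(StartWorker());
}

// Reports how many workers were started so far.
int WorkersStarted() { return g_workers_started; }
```

For a real Windows thread handle, closing it with CloseHandle may be more appropriate than discarding it. That decision belongs to whoever knows the code.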

@stweil
Member

stweil commented Dec 12, 2019

Citing @db4 (from issue #2816):

I'm not sure it's a duplicate of #1036. The current behavior was introduced 01-Nov-2019 14:30:15 in 5e3772c and is specific to MSVC (while old #1036 mostly addresses gcc/clang). And it should be fixed before 4.1.1 release (in my case it's a real show-stopper: travis CI build job cannot complete in 1 hour)

Travis CI should work for 4.1.1, of course.

@stweil
Member

stweil commented Dec 12, 2019

@db4, this is the latest Travis build for 4.1: https://travis-ci.org/tesseract-ocr/tesseract/builds/613358145. We don't use MSVC there.

MSVC is used with Appveyor CI: https://ci.appveyor.com/project/zdenop/tesseract/builds/28928329.

@db4
Contributor

db4 commented Dec 12, 2019

@stweil

MSVC is used with Appveyor CI

Sure. Of course, I meant Appveyor CI. Here is my failing build:
https://ci.appveyor.com/project/db4/conan-tesseract/builds/29433440

@db4
Contributor

db4 commented Dec 12, 2019

And a successful build after /Wall is removed:
https://ci.appveyor.com/project/db4/conan-tesseract/builds/29491104

@stweil
Member

stweil commented Dec 12, 2019

@zdenop, what do you think? Can we remove /Wall for cmake in the 4.1 branch? That would be an easy solution. Fixing compiler warnings in that branch seems like wasted time to me; I prefer focusing on the master branch.

@zdenop
Contributor

zdenop commented Dec 12, 2019

IMO: Yes, we can remove it from 4.1.1 (e.g. as part of release process ;-) ) and keep it for master/devel.
To remove it completely: No. If we want to solve "all" warnings for all compilers (different compilers could report different problems), we should report them somewhere... At the moment we report them only for debug release...

@stweil
Member

stweil commented Dec 13, 2019

I sorted the compiler warnings by their frequency, and 99 % (12924) of all warnings are C4514 (unreferenced inline function has been removed). Is there a command line switch which only suppresses C4514? Then I'd suggest adding that to 4.1 and master.

@egorpugin
Contributor

/wd4514 iirc

@stweil
Member

stweil commented Dec 13, 2019

Thank you, @egorpugin. @db4, could you please test whether /Wall /Wd4514 works for you?

@db4
Contributor

db4 commented Dec 13, 2019

@stweil

could you please test whether /Wall /Wd4514 works for you?

No:

cl : Command line error D8021: invalid numeric argument '/Wd4514'

@stweil
Member

stweil commented Dec 13, 2019

It should be /Wall /wd4514 (lower case w). Could you please try that?

@db4
Contributor

db4 commented Dec 16, 2019

@stweil
/wd4514 does not change much. 1 hour is still not enough for CI build (20K+ warnings are generated):
https://ci.appveyor.com/project/db4/conan-tesseract/build/job/db42lnclq3e2tgnh

@stweil
Member

stweil commented Dec 16, 2019

@db4, that log lists 9557 warnings. The most frequent ones are C4820 (3864 times) and C4625 (1088 times). So /Wall /wd4514 /wd4820 /wd4625 might be enough.

@stweil
Member

stweil commented Dec 16, 2019

Maybe you need also /wd4458.
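
To make the mapping from warning codes to switches explicit: MSVC's `/wd` option takes the warning number (for example 4820), not the number of occurrences, and the `w` is lower case. The following is a minimal sketch that builds such a switch list from warning codes. The function name is hypothetical and the code only assembles a string; it does not invoke the compiler.

```cpp
#include <string>
#include <vector>

// Builds an MSVC option string such as "/Wall /wd4514" from warning codes.
// Each code is expected in the form "C4514"; the leading 'C' is dropped.
std::string SuppressionFlags(const std::string& base,
                             const std::vector<std::string>& codes) {
  std::string flags = base;
  for (const std::string& code : codes) {
    if (code.size() < 2 || code[0] != 'C') continue;  // skip malformed entries
    flags += " /wd" + code.substr(1);
  }
  return flags;
}
```

For example, passing the base `/Wall` and the codes C4514, C4820, C4625 and C4458 gives `/Wall /wd4514 /wd4820 /wd4625 /wd4458`.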

@db4
Contributor

db4 commented Dec 16, 2019

@stweil
OK, I can try these options as well. But I don't understand the purpose. If it's a workaround, why not just remove /Wall? The proper fix (in master branch?) should address source code inaccuracies that trigger warnings, not just selectively disable them.

@stweil
Member

stweil commented Dec 16, 2019

I'd disable those warnings in master, too. Not every warning indicates a problem which has to be fixed (for example C4514 is normal), and other warnings are very low priority or shown by gcc or clang compiler, too, so there is no harm when they are disabled for MSVC.

As there are still backports of fixes from master to 4.1, disabling all warnings would give us no indicator of any progress or potential new problems.

@db4
Contributor

db4 commented Dec 17, 2019

Well, /Wall /wd4514 /wd 4820 /wd1088 /wd4458 options make AppVeyor compile time reasonable (just two times slower than a Release build). But /Wall now looks like overkill even for a Debug build (it generates warnings even in Microsoft headers). Maybe just use /W4 instead? That's what Microsoft itself recommends:

/W4 displays level 1, level 2, and level 3 warnings, and all level 4 (informational) warnings that are not turned off by default. We recommend that you use this option to provide lint-like warnings. For a new project, it may be best to use /W4 in all compilations; this will ensure the fewest possible hard-to-find code defects.

@amitdo
Collaborator

amitdo commented Dec 17, 2019

https://docs.microsoft.com/en-us/cpp/build/reference/compiler-option-warning-level?view=vs-2019

MSVC's /Wall seems like Clang's -Weverything.

/W<n> should be used instead.

@stweil
Member

stweil commented Dec 17, 2019

@zdenop, can we replace /Wall by /W4 as suggested by @db4 for 4.1 and master?

@zdenop
Contributor

zdenop commented Dec 18, 2019

Yes.

@stweil
Member

stweil commented Dec 18, 2019

@db4, branch 4.1 and master now use /W4. Is this sufficient for your build to make it work?

@db4
Contributor

db4 commented Dec 18, 2019

@stweil

Is this sufficient for your build to make it work?

Yes, it is. Thank you very much.
