Optimization (/O2) option causes a run time error. #2898

Closed
korhun opened this issue Feb 26, 2020 · 12 comments


korhun commented Feb 26, 2020

I pulled the latest code today and built the project in Release x64 with Visual Studio 2019 on Windows 10. Everything works fine on many computers. However, on one computer an error occurs whenever I try to OCR any image. That computer runs 64-bit Windows 10 with 16 GB of RAM; the OS is up to date and there is nothing unusual about it. I cannot see the error message because I use your library from C#. The error can only be caught as an SEHException, which says nothing about the cause. I built the libtesseract project in debug mode to see the error details and debugged on that computer. No exception occurred; the OCR finished successfully in debug mode.

In the libtesseract project, I changed
Configuration Properties / C/C++ / Optimization / Optimization from Maximum Optimization (Favor Speed) (/O2) to Disabled (/Od),
and no error occurred; the OCR runs finished successfully.

Member

stweil commented Feb 26, 2020

Maybe you created code which uses AVX instructions, and the computer which crashes does not support AVX?
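
One way to check this on the affected machine is to query the CPU directly. The following is a minimal sketch, not part of Tesseract: `CpuFeatures` and `DetectCpuFeatures` are hypothetical names, and the code assumes an x86 build with MSVC, GCC, or Clang. On any other target it reports both features as absent.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>
#include <immintrin.h>
#endif

// Hypothetical helper type: which AVX levels are usable on this machine.
struct CpuFeatures {
    bool avx;
    bool avx2;
};

CpuFeatures DetectCpuFeatures() {
    CpuFeatures f = {false, false};
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4];  // EAX, EBX, ECX, EDX
    __cpuid(regs, 0);
    const int maxLeaf = regs[0];
    __cpuid(regs, 1);
    const bool osxsave = (regs[2] & (1 << 27)) != 0;  // OS uses XSAVE/XRSTOR
    const bool cpuAvx = (regs[2] & (1 << 28)) != 0;   // CPU implements AVX
    // The OS must also save and restore YMM state (XCR0 bits 1 and 2).
    const bool osAvx = osxsave && ((_xgetbv(0) & 0x6) == 0x6);
    f.avx = cpuAvx && osAvx;
    if (f.avx && maxLeaf >= 7) {
        __cpuidex(regs, 7, 0);
        f.avx2 = (regs[1] & (1 << 5)) != 0;           // CPU implements AVX2
    }
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.avx = __builtin_cpu_supports("avx") != 0;
    f.avx2 = __builtin_cpu_supports("avx2") != 0;
#endif
    return f;
}
```

Running this in a small console program on the crashing computer and comparing the result with the `/arch` setting used for the build would confirm or rule out this explanation. AVX and AVX2 are separate feature bits, so a CPU can report the first without the second.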

Author

korhun commented Feb 26, 2020

@stweil I'm not experienced in C++ and have no idea whether it uses AVX instructions. I followed the steps described in the help and some descriptions in the issues to build libtesseract.

However, that computer has this CPU. I guess it supports AVX.

Intel® Xeon® Processor E5-1620 v2
https://ark.intel.com/content/www/us/en/ark/products/75779/intel-xeon-processor-e5-1620-v2-10m-cache-3-70-ghz.html

Author

korhun commented Feb 26, 2020

Disabling optimization decreases performance drastically.
Multithreaded test measurements for one of my datasets:
/O2 --> 0.70 seconds
/Od --> 1.64 seconds
/Od + /Ot --> 1.63 seconds

Any help is much appreciated.

Collaborator

amitdo commented Feb 26, 2020

You can try other optimization options: /Od /Ob2, /O1, /Ox, /Ot.

https://docs.microsoft.com/en-us/cpp/build/reference/o-options-optimize-code?view=vs-2019

In general, we can't really help you here unless you reproduce the issue with the command line or submit C/C++ code that uses the API.

Author

korhun commented Feb 26, 2020

@amitdo thank you for your response. Performance has a great importance for our use. So I have to use /O2. Using only /Ot did not increase the performance like /O2 did (my previous comment)
Also, in my tests the remote computer only worked with the /Od option.

This is the code that generates the error. I've tried it with latest Tessdata best and fast of TUR + ENG languages.

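    // Method of the OCRNative wrapper class; `api` is presumably its tesseract::TessBaseAPI instance.
    const char* GetText2(const char* input) {
        char* outText;
        Pix* image = pixRead(input);   // returns NULL if the file cannot be read
        api->SetImage(image);
        outText = api->GetUTF8Text();  // caller owns the buffer and must release it with delete[]
        pixDestroy(&image);
        return outText;
    }

    // C entry point exported from the DLL for the C# caller.
    extern "C" __declspec(dllexport) const char* __stdcall OCR_GetText2(OCRNative* engine, const char* input) {
        return engine->GetText2(input);
    }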

To work around the issue, I did this:

  • Build two Tesseract50.dll files, one with optimization and one without.
  • In the C# code, try the optimized DLL first; if an error occurs, use the non-optimized one.
    This is a somewhat nasty solution, but I couldn't figure out what else to do.
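
The workaround itself lives in C#, but its control flow can be sketched in C++ as follows. This is an illustration under assumptions, not the actual code: `OcrBackend` and `RecognizeWithFallback` are hypothetical names, each backend stands for one build of the DLL, and a failed call (the SEHException on the C# side) is modeled as a `false` return value.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical: one backend per DLL build. Returns true and fills textOut
// on success, returns false if the call failed.
using OcrBackend =
    std::function<bool(const std::string& imagePath, std::string& textOut)>;

// Try the backends in order (optimized build first) and stop at the first
// one that succeeds. Returns false only if every backend failed.
bool RecognizeWithFallback(const std::vector<OcrBackend>& backends,
                           const std::string& imagePath,
                           std::string& textOut) {
    for (const auto& backend : backends) {
        if (backend(imagePath, textOut)) {
            return true;
        }
    }
    return false;
}
```

Ordering the list with the optimized build first keeps the fast path on every machine where it works and pays the slower /Od cost only where the optimized call fails.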

Now I realize I haven't tried building a debug DLL with optimization and debugging the C++ code to see what generates the error. I'll try it now.

Edit: I tried. The project could not be built in debug mode with any optimization enabled: cl : command line error D8016: '/O2' and '/RTC1' command-line options are incompatible

Collaborator

amitdo commented Feb 26, 2020

I'm not a Windows developer, but just by reading Microsoft's docs, it's clear to me that /RTC1 is not the 'debug mode'.

This one is probably the flag you want:
https://docs.microsoft.com/en-us/cpp/build/reference/debug-generate-debug-info?view=vs-2019

Member

stweil commented Feb 26, 2020

Collaborator

amitdo commented Feb 26, 2020

You said you run it remotely. There were some bug reports related to VM + AVX / AVX2.

Author

korhun commented Feb 26, 2020

I'm not a C++ developer, but when libtesseract is built in debug mode, RTC1 is active, and it does not let the project compile while an optimization is enabled. As far as I could understand from here, it enables run-time error checking, which is what I want the debug build for. However, I'm not sure, since in C# we can debug an optimized build and still see the exception. I don't have access to that computer right now, so I cannot test anything. (It's night here :) I'll give it a try tomorrow, if they let me.)

I'm not running Tesseract remotely; it runs on a remote machine, which is not a VM. I just debug it remotely.

Three extra notes:

  • This has been tested on more than 10 computers so far; only one had this issue.
  • On the computer where the error occurs, I made more than 100 attempts with the error-generating options; only 2 or 3 of them finished without an error and extracted the text.
  • I ran the same test with a tesseract40.dll built on 23.10.2018, and no error occurred.

I hope this makes sense. Thank you in advance.

Collaborator

amitdo commented Mar 6, 2020

> /O2 --> 0.70 seconds
> /Od --> 1.64 seconds
> /Od + /Ot --> 1.63 seconds

You said:

> Using only /Ot did not increase the performance like /O2 did

but you also added:

> (my previous comment)

which shows '/Od + /Ot'.

So, did you try just /Ot without /Od?

> Performance has a great importance for our use. So I have to use /O2.

Still, I suggest to check if the issue also occurs with /O1 (without /Od or any other /O).
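
Besides switching the flag for the whole project, MSVC also lets you turn optimization off for individual functions with `#pragma optimize`, which can help narrow down which code misbehaves in an optimized build. A minimal sketch, assuming MSVC; `SumOfSquares` is a hypothetical stand-in for a suspect function, and other compilers skip the guarded pragmas:

```cpp
#include <cstddef>
#include <vector>

#if defined(_MSC_VER)
#pragma optimize("", off)  // MSVC: compile the functions that follow without optimization
#endif

// Hypothetical stand-in for a function suspected of failing under /O2.
int SumOfSquares(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i] * values[i];
    }
    return total;
}

#if defined(_MSC_VER)
#pragma optimize("", on)   // MSVC: restore the optimization options from the command line
#endif
```

Applying the pragma to one source file or function at a time in an otherwise /O2 build, and re-testing on the affected computer, is a way to bisect toward the code that triggers the error.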

Author

korhun commented Mar 9, 2020

> So, did you try just /Ot without /Od?

Yes. With /Ot alone no issue occurred, but the performance was bad.

> Still, I suggest to check if the issue also occurs with /O1 (without /Od or any other /O).

I tried /O1 and the issue occurred.

I tried many other combinations but could not find one that avoids the issue while keeping the performance boost from optimization.

I don't have access to that computer anymore, so I cannot run any more tests. I've worked around the issue by using two different DLLs, one optimized and one not. The C# code tries the optimized DLL first; if that fails, it uses the non-optimized one. No performance or error issue has been reported since.

This might be a very special case of a single computer. I have no idea. So closing this issue might be the better option if the info I've provided here doesn't help much.

stweil closed this as completed Mar 12, 2020

Member

stweil commented Mar 12, 2020

Without a stack trace of the crash there is no chance to do more here.
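
Short of a full stack trace, the exception code and the faulting address already narrow things down. The following is a minimal sketch, assuming MSVC and the Windows SDK: `OcrCall`, `LogSehException`, and `GuardedOcrCall` are hypothetical names, and on other compilers the wrapper simply forwards the call.

```cpp
#include <cstdio>
#if defined(_MSC_VER)
#include <windows.h>
#endif

// Hypothetical signature of the native call to guard.
using OcrCall = const char* (*)(void* engine, const char* input);

#if defined(_MSC_VER)
// SEH filter: log the exception code and faulting address, then let the
// __except block handle the exception.
static int LogSehException(EXCEPTION_POINTERS* info) {
    std::fprintf(stderr, "SEH exception 0x%08lX at address %p\n",
                 info->ExceptionRecord->ExceptionCode,
                 info->ExceptionRecord->ExceptionAddress);
    return EXCEPTION_EXECUTE_HANDLER;
}
#endif

// Runs the call; on MSVC a structured exception is logged and turned into
// a nullptr result instead of propagating to the managed caller.
const char* GuardedOcrCall(OcrCall call, void* engine, const char* input) {
#if defined(_MSC_VER)
    __try {
        return call(engine, input);
    } __except (LogSehException(GetExceptionInformation())) {
        return nullptr;
    }
#else
    return call(engine, input);
#endif
}
```

A code of 0xC000001D (STATUS_ILLEGAL_INSTRUCTION) would point to an instruction the CPU does not support, such as AVX2 on a CPU without it, while 0xC0000005 is an access violation. The logged address can then be matched against the DLL's map or PDB file to find the function involved.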
