Cannot use cachePath with Scheduler #576

Closed
oleg-andreyev opened this issue Nov 3, 2021 · 4 comments

Comments

@oleg-andreyev

Describe the bug
Cannot use cachePath with Scheduler

To Reproduce
Steps to reproduce the behavior:

  1. Create Scheduler
  2. Create N workers with cachePath
  3. Start OCR
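
The steps above might look roughly like the following. This is a hypothetical sketch, not the reporter's actual code: it assumes the tesseract.js v2-style worker API (`load`, `loadLanguage`, `initialize`), and the worker count, the `lang-cache` directory, and the `reproduce` function name are made up for illustration.

```javascript
// Hypothetical reproduction sketch; not the reporter's actual code.
// Assumes the tesseract.js v2-style worker API (load / loadLanguage / initialize).
const path = require('path');

const WORKER_COUNT = 4;                        // assumed value for "N"
const CACHE_PATH = path.resolve('lang-cache'); // assumed cache directory shared by all workers

async function reproduce(imagePath) {
  // Required lazily so this file can be loaded without tesseract.js installed.
  const { createWorker, createScheduler } = require('tesseract.js');
  const scheduler = createScheduler();

  try {
    // Every worker gets the same cachePath and all of them initialize concurrently.
    const workers = await Promise.all(
      Array.from({ length: WORKER_COUNT }, async () => {
        // `await` is harmless if createWorker returns the worker synchronously.
        const worker = await createWorker({ cachePath: CACHE_PATH });
        await worker.load();
        await worker.loadLanguage('lav');
        await worker.initialize('lav');
        return worker;
      })
    );
    workers.forEach((worker) => scheduler.addWorker(worker));

    // The scheduler hands the job to whichever worker is idle.
    const { data } = await scheduler.addJob('recognize', imagePath);
    return data.text;
  } finally {
    await scheduler.terminate();
  }
}
```

Calling `reproduce('some-image.png')` with an image of your own would exercise the shared-`cachePath` setup described in the steps; whether it fails may depend on the tesseract.js version.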

Expected behavior
No errors and able to load languages

Desktop (please complete the following information):

  • OS: macOS
  • Browser: node
  • Version: v14.17.6

Additional context

Error opening data file ./lav.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Error opening data file ./lav.traineddata
Failed loading language 'lav'
Tesseract couldn't load any languages!
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'lav'
Tesseract couldn't load any languages!
@juemrami

I was trying to do something similar. I think it might have to do with some kind of race condition when reading the files containing our local models. I noticed that when I forced my workers to run non-concurrently, I had no issues opening my models. I was hoping the scheduler had support for 'initialize' or 'load' language jobs.
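
The non-concurrent workaround described above could be sketched as follows. This is a minimal illustration under assumptions, not a scheduler feature: `runSequentially` and `makeWorkerTask` are hypothetical helper names, and the worker calls assume the tesseract.js v2-style API. `createWorker` is passed in as a parameter so the helper does not depend on a particular tesseract.js version.

```javascript
// Run async tasks strictly one after another and collect their results in order.
async function runSequentially(tasks) {
  const results = [];
  for (const task of tasks) {
    // The next task does not start until the previous one has settled.
    results.push(await task());
  }
  return results;
}

// Hypothetical task factory: returns a function that fully initializes one worker.
// Assumes the tesseract.js v2-style API (load / loadLanguage / initialize).
function makeWorkerTask(createWorker, options, lang) {
  return async () => {
    const worker = await createWorker(options);
    await worker.load();
    await worker.loadLanguage(lang);
    await worker.initialize(lang);
    return worker;
  };
}
```

With tesseract.js installed, one would build N tasks via `makeWorkerTask(createWorker, { cachePath }, 'lav')`, await `runSequentially(tasks)`, and only then add the returned workers to the scheduler, so that no two workers read the language data at the same time.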

@Balearica
Member

Balearica commented Aug 29, 2022

Disregard, this is not the cause. I believe by default Tesseract both reads from and writes to the file you specify using cachePath. Therefore, I do think it makes sense that there could be issues when it is used simultaneously on multiple threads. We could consider adding a note to the documentation and/or checking whether cachePath arguments are being reused across workers and throwing a warning if detected.
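
The reuse check suggested above might look something like this in user code. It is only a sketch of the idea, not something tesseract.js provides: `findReusedCachePaths` and `warnOnReusedCachePaths` are hypothetical names, and the input is assumed to be the list of option objects that would be passed to each worker.

```javascript
const path = require('path');

// Return the resolved cachePath values that appear in more than one options object.
function findReusedCachePaths(optionsList) {
  const counts = new Map();
  for (const options of optionsList) {
    const cachePath = (options || {}).cachePath;
    if (cachePath === undefined) continue; // worker does not set cachePath
    // Resolve so that './cache' and 'cache' count as the same directory.
    const key = path.resolve(cachePath);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([p]) => p);
}

// Emit one warning per shared path; returns the number of shared paths found.
function warnOnReusedCachePaths(optionsList, warn = console.warn) {
  const reused = findReusedCachePaths(optionsList);
  for (const p of reused) {
    warn(`cachePath "${p}" is shared by multiple workers`);
  }
  return reused.length;
}
```

Running `warnOnReusedCachePaths` over the options before creating any workers would flag the configuration from this report, where every worker points at the same cache location.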

@Balearica
Member

Balearica commented Sep 18, 2022

I was able to reproduce this bug in version 2, but not the current version (v3). If this issue is still active, please confirm you still encounter this bug in the latest version and provide a reproducible example.

@Balearica
Member

Closing as stale.

3 participants