The acronym yaOCRa stands for "yet another OCR app", which describes what this app is all about. Try it out.
yaOCRa uses pytesseract, Flask, and clipboard.js, a third-party JavaScript library for copying browser text to the clipboard.
For a full list of dependencies, please see the project's requirements.txt file.
Installing and running yaOCRa on your local machine requires a few extra steps.
First, you need to have the Tesseract binary installed on your machine.
- For macOS users:
$ brew install tesseract
- For Ubuntu users:
$ sudo apt-get install tesseract-ocr
- For Windows users: Download UB Mannheim's unofficial installer and run it. Note that you also need to add the path to the Tesseract executable to your PATH environment variable.
To check if you correctly installed Tesseract:
- Open a terminal
- Run
tesseract -v
- You should now see the version and compatible libraries on your screen. Otherwise, an error occurred.
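The same check can be scripted. The following is a minimal Python sketch and not part of yaOCRa itself: the helper name tesseract_version is hypothetical, and it assumes the binary is installed under the name tesseract.

```python
import shutil
import subprocess


def tesseract_version(binary="tesseract"):
    """Return the first line printed by `<binary> -v`, or None if unavailable.

    `binary` is an assumed executable name; adjust it if yours differs.
    """
    path = shutil.which(binary)  # looks the executable up on PATH
    if path is None:
        return None
    result = subprocess.run(
        [path, "-v"], capture_output=True, text=True, check=False
    )
    # Read stdout first and fall back to stderr, in case the banner lands there.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    if version is None:
        print("Tesseract was not found on PATH.")
    else:
        print(version)
```

A return value of None usually means the binary is missing or, on Windows, that its folder has not been added to PATH.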
yaOCRa uses Google's reCAPTCHA v2 verification system, so sign up for an account and save the corresponding public and private keys.
- Clone this repo
- Run
pip install -r requirements.txt
- Create a .flaskenv file in the project root (see next section for details)
- Run
flask run
You should define the following environment variables in the .flaskenv file:
FLASK_APP=ocr_app.py
FLASK_ENV=development
FLASK_DEBUG=0
SECRET_KEY=REPLACE_WITH_APPROPRIATE_VALUE_HERE
RECAPTCHA_PUBLIC_KEY=REPLACE_WITH_APPROPRIATE_VALUE_HERE
RECAPTCHA_PRIVATE_KEY=REPLACE_WITH_APPROPRIATE_VALUE_HERE
TESTING=0
Notes:
- Set FLASK_ENV to "production" upon deployment.
- Set FLASK_DEBUG to 1 to enable debug mode, but set it to 0 in a production environment.
- Set TESTING to 1 in order to disable reCAPTCHA during testing.
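As an illustration of how these variables might be consumed, here is a minimal Python sketch. It assumes python-dotenv is installed, in which case flask run loads .flaskenv into the process environment. The helpers env_flag and load_settings are hypothetical and are not taken from yaOCRa's source.

```python
import os


def env_flag(name, default="0"):
    """Interpret a variable such as FLASK_DEBUG or TESTING as a boolean."""
    value = os.environ.get(name, default)
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings():
    """Collect the settings listed above from the process environment.

    Hypothetical helper: the real app may read its configuration differently.
    """
    return {
        "SECRET_KEY": os.environ.get("SECRET_KEY"),
        "RECAPTCHA_PUBLIC_KEY": os.environ.get("RECAPTCHA_PUBLIC_KEY"),
        "RECAPTCHA_PRIVATE_KEY": os.environ.get("RECAPTCHA_PRIVATE_KEY"),
        "DEBUG": env_flag("FLASK_DEBUG"),
        "TESTING": env_flag("TESTING"),
    }


if __name__ == "__main__":
    settings = load_settings()
    missing = [key for key, value in settings.items() if value is None]
    if missing:
        print("Missing variables:", ", ".join(missing))
```

Treating unset keys as None makes it easy to spot a placeholder that was never filled in before the app starts.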
The app is currently deployed as a Docker container on Heroku. Give it a try.
The following useful resources helped make yaOCRa possible: