Rules and tools to deterministically generate all prerequisites for the final training process. Adapted from https://github.com/ryanfb/ancientgreekocr-grctraining/
ryanfb/latinocr-lattraining
Source files for some automatically generated parts of the Latin (lat) training for Tesseract OCR. Specifically, this contains the Makefile and its prerequisites to build the following files needed for the lat training:

- training_text.txt
- lat.word.txt
- lat.freq.txt
- lat.unicharambigs
- lat.wordlist

# Dependencies

On a Mac with homebrew, install coreutils and gnu-sed (needed for gsed, gmktemp, gshuf).

# To build the training parts

Note that the build starts by downloading and unpacking a text corpus from which to generate the wordlists.

Make all of the parts with the command:

    make
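
Before running `make`, it can help to confirm that the GNU tools named under Dependencies are actually on the `PATH`. The following is only a minimal sketch: the `missing_tools` helper is hypothetical and not part of this repository, and the `brew install coreutils gnu-sed` hint assumes a Mac with Homebrew, where those formulae provide the `g`-prefixed commands.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the names of
# any given commands that cannot be found on PATH, separated by spaces.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the single leading space left by the concatenation above.
    printf '%s\n' "${missing# }"
}

# Check the three tools the README says are needed on a Mac.
result=$(missing_tools gsed gmktemp gshuf)
if [ -n "$result" ]; then
    # Assumption: Homebrew is the package manager in use.
    echo "Missing: $result (try: brew install coreutils gnu-sed)"
else
    echo "All GNU tools found; run: make"
fi
```

On systems where the GNU tools are installed without the `g` prefix, the names passed to `missing_tools` would need adjusting to match whatever the Makefile expects.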