Fine-tune training and OCR evaluation of Tesseract 5.0.0 Alpha for Balinese script, using tesstrain (the training workflow for Tesseract 4 as a Makefile). Certain file locations and scripts have been modified compared to the source repos.
OCR evaluation is done using the ISRI Analytic Tools for OCR Evaluation with UTF-8 support and the ocrevalUAtion tool.
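For readers unfamiliar with the metric, character error rate (CER) is commonly defined as the edit distance between the ground truth and the OCR output, divided by the length of the ground truth. The following is a minimal sketch of that definition only; it is not the implementation used by either evaluation tool, and the function names `levenshtein` and `cer` are chosen here for illustration. It counts Unicode code points, so for a script with combining marks such as Balinese the numbers may differ from tools that count characters differently.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by ground-truth length."""
    if not ref:
        raise ValueError("ground-truth text must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a CER of 8% means roughly 8 character edits are needed per 100 ground-truth characters.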
Replace-the-top-layer training was done using five Balinese Unicode fonts and synthetic training text, with jav1.traineddata as the Start_Model to continue from. The training was aborted at around 8% CER (character error rate) because the training and evaluation images were not shuffled across fonts.
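One way to avoid the problem described above is to pool the line images from all fonts and shuffle them before splitting into training and evaluation lists, so that both lists contain samples from every font. The sketch below illustrates the idea only; it is not the script used in this repository, and the function name `split_shuffled`, the `eval_fraction` parameter, and the font and file names in the usage example are hypothetical.

```python
import random


def split_shuffled(samples_by_font, eval_fraction=0.1, seed=0):
    """Pool samples from all fonts, shuffle them, and split into (train, eval)."""
    # Flatten in a fixed font order so the result depends only on the seed.
    pooled = [path
              for font in sorted(samples_by_font)
              for path in samples_by_font[font]]
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(pooled)
    n_eval = max(1, int(len(pooled) * eval_fraction))
    return pooled[n_eval:], pooled[:n_eval]


# Hypothetical usage: five placeholder fonts with ten line images each.
samples = {
    f"font{k}": [f"font{k}.line{n:03d}.tif" for n in range(10)]
    for k in range(1, 6)
}
train_list, eval_list = split_shuffled(samples)
```

With an unshuffled, font-by-font split, the evaluation list can end up containing only fonts that the training list never sees (or the reverse), which makes the reported error rate a poor guide to training progress.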
Replace-the-top-layer training was also done using five Balinese Unicode fonts and synthetic training text, with bali1.traineddata as the Start_Model to continue from.