This repository contains code for generating data and training OCR on Latin transliteration of ancient Akkadian texts, using the Kraken OCR framework.
- Kraken installed as follows:
Clone the Kraken Github repo and install its Conda environment from within the repo with:
conda env create -f environment.yml
Activating the kraken environment (conda activate kraken
) gives access to the kraken
and ketos
command-line tools.
Note: If you want to fine-tune the model, you may want to use Kraken with CUDA (GPU acceleration). In this case you should instead install it with:
conda env create -f environment_cuda.yml
Note: This also creates an environment with name kraken
. If you want both environments, edit the .yml
file to change the environment name (e.g. to kraken-cuda
).
Activate the Kraken Conda environment and then run:
kraken -i "IMAGE_FILENAME" OUTPUT_FILENAME binarize segment ocr -m MODEL_FILENAME
Using the following values:
- IMAGE_FILENAME: Use the filename of the image you want to perform OCR on.
- OUTPUT_FILENAME: Filename of text file to write output to
- MODEL_FILENAME: Should point to the model file you want to use in
models/
Note: In the output text, we use a single left curly brace { to indicate a superscript, which is significant in the common transliteration of Akkadian. We recommend to post-process by adding } after every character which is preceeded by {. For example, "{dAMAR.UTU" should be converted to "{d}AMAR.UTU".
The models/
folder contains fine-tuned OCR models. We recommend model.mlmodel
for general use. dillard.mlmodel
was fine-tuned on typewriter text in Dillard (1975) and might perform better in that use case.
In tests/
, we have provided a sample paragraph (test.png
, from Dietrich 2003) and OCR results of the default model in output.txt
. You may run OCR on it and verify that you get the same output.
For instructions on fine-tuning the existing model on new data, see Finetuning.md
.