Build a METS file (Page Collections file) to work easily with Aletheia.
See: Page Collections in the Aletheia User Guide.
You can also use the METS file for the OCR-D framework.
- Name and path of the image file (without file extension).
- If available, the matching PAGE XML file.
- Both files should have the same name and only differ in their file extension.
- The files should be stored in relevant folders:
- e.g. the image files in the folder
- and the PAGE XML files in the folder
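The naming rule above (same base name, different extension) can be checked before building the METS file. The following is a minimal sketch, not part of this project: the function name `find_unmatched` is hypothetical, and the `.xml` extension for PAGE files and the flat folder layout are assumptions.

```python
from pathlib import Path


def find_unmatched(image_dir, page_dir, image_ext="jpg", page_ext="xml"):
    """Compare base names in two folders.

    Returns a tuple: (image names without a PAGE file,
                      PAGE names without an image file).
    The extensions are assumptions; adjust them to your data.
    """
    image_stems = {p.stem for p in Path(image_dir).glob(f"*.{image_ext}")}
    page_stems = {p.stem for p in Path(page_dir).glob(f"*.{page_ext}")}
    return sorted(image_stems - page_stems), sorted(page_stems - image_stems)
```

Two empty lists mean that every image has a PAGE file with the same base name and vice versa.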
The stylesheet accepts the following parameters:
- imagefolder: name of the image file folder
- pagefolder: name of the PAGE file folder
- imageFormat: format of the image files
- noIMAGE=yes: indicates that no image files can be specified
- noPAGE=yes: indicates that no PAGE files can be specified or are available
- drive: the drive letter of the Windows file system
The link element contains the path to the image or PAGE file, without the file extension.
Note: See the example file in the example folder. Use only forward slashes to separate folders; do not use backslashes, even on Windows.
<?xml version="1.0" encoding="UTF-8"?>
<link>[Path to the Image or PAGE file]/[Name of the File without Extension]</link>
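A link entry of this shape can be derived from an ordinary file path. The sketch below is one possible way to do it and is not taken from this project: the helper name `link_element` and the example paths are hypothetical. It only produces the link element itself, because the surrounding elements of the input file are not shown here.

```python
from pathlib import PurePosixPath
from xml.sax.saxutils import escape


def link_element(file_path):
    """Build a <link> line from a file path.

    Backslashes become forward slashes and the file extension is
    removed, as the note on path separators requires.
    """
    posix = PurePosixPath(file_path.replace("\\", "/"))
    # escape() protects characters such as "&" inside the XML text
    return f"<link>{escape(str(posix.with_suffix('')))}</link>"


# Hypothetical Windows-style path, for illustration only
print(link_element("scans\\jpg\\0001.jpg"))
```

Because the extension is dropped, an image and its PAGE file in the same folder would yield the same link text.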
java -jar ../saxon9he.jar -xsl:../xsl/makeAletheia_mets.xsl -s:../example/example.xml imagefolder=jpg imageFormat=jpg pagefolder=page
A variant for when no PAGE files can be specified or are available:
java -jar ../saxon9he.jar -xsl:../xsl/makeAletheia_mets.xsl -s:../example/example.xml imagefolder=tiff imageFormat=tif noPage=yes
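If the transformation is started from a script, the same call can be assembled as an argument list. This is a minimal sketch: the function name `saxon_command` is hypothetical, the default paths are copied from the commands above, and the stylesheet parameters are passed through unchanged as `name=value` pairs.

```python
def saxon_command(source, jar="../saxon9he.jar",
                  xsl="../xsl/makeAletheia_mets.xsl", **params):
    """Return the Saxon call as a list suitable for subprocess.run.

    Keyword arguments are appended as name=value stylesheet
    parameters in the order they are given.
    """
    cmd = ["java", "-jar", jar, f"-xsl:{xsl}", f"-s:{source}"]
    cmd += [f"{name}={value}" for name, value in params.items()]
    return cmd


cmd = saxon_command("../example/example.xml",
                    imagefolder="jpg", imageFormat="jpg", pagefolder="page")
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when folder names contain spaces.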