ocr-xsl

XSLT functions to transform common OCR formats

Format Cheatsheet

Check out OCR-Format-Comparison for a concise comparison of hOCR, ALTO, and ABBYY.

Shared concepts

OCR formats

  • hocr
  • alto
  • abbyy

Box coordinates

  • left
  • right
  • top
  • bottom
  • width
  • height

Function reference

ocr.bbox.xsl

Retrieve the bounding box of an element

ocr:bbox($format, $element, $coord)

ocr:hocr-bbox($element, $coord)

ocr:abbyy-bbox($element, $coord)

ocr:alto-bbox($element, $coord)
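
To illustrate what the six box coordinates mean in each format, here is a minimal Python sketch (not the XSLT implementation in this repository). It assumes the usual encodings: hOCR keeps `bbox x0 y0 x1 y1` in the `title` attribute, ALTO uses the `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes, and ABBYY XML uses `l`, `t`, `r` and `b`. The function name `box_coord` and its helpers are hypothetical and exist only for this example.

```python
import re
import xml.etree.ElementTree as ET


def _hocr_box(el):
    # hOCR: title="bbox x0 y0 x1 y1; ..." (left, top, right, bottom)
    m = re.search(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)", el.get("title", ""))
    if m is None:
        raise ValueError("no bbox property in title attribute")
    left, top, right, bottom = (int(v) for v in m.groups())
    return left, top, right, bottom


def _alto_box(el):
    # ALTO: origin plus size, so right and bottom are derived
    left, top = float(el.get("HPOS")), float(el.get("VPOS"))
    return left, top, left + float(el.get("WIDTH")), top + float(el.get("HEIGHT"))


def _abbyy_box(el):
    # ABBYY XML: l, t, r, b attributes
    return tuple(int(el.get(k)) for k in ("l", "t", "r", "b"))


_READERS = {"hocr": _hocr_box, "alto": _alto_box, "abbyy": _abbyy_box}


def box_coord(fmt, el, coord):
    """Return one of left/right/top/bottom/width/height for an element."""
    left, top, right, bottom = _READERS[fmt](el)
    values = {
        "left": left,
        "top": top,
        "right": right,
        "bottom": bottom,
        "width": right - left,
        "height": bottom - top,
    }
    return values[coord]


if __name__ == "__main__":
    word = ET.fromstring('<span class="ocrx_word" title="bbox 10 20 110 50">hi</span>')
    print(box_coord("hocr", word, "width"))
```

Normalising every format to (left, top, right, bottom) first keeps the derived values (`width`, `height`) in one place, which is presumably why the library exposes a single `$coord` parameter.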

ocr.image.xsl

Retrieve the image for an element

ocr:image($format, $element)

  • $format: A valid OCR format
  • $element: The element that has an associated image

ocr:hocr-image($element)

  • $element: The element that has an associated image
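
As a rough illustration of where an hOCR image reference usually lives, the Python sketch below reads the `image` property from the `title` attribute of an `ocr_page` element. Whether `ocr:hocr-image` reads exactly this property is an assumption; the helper name `hocr_image_name` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def hocr_image_name(page_el):
    """Return the file name from an hOCR `image` property, or None."""
    title = page_el.get("title", "")
    # The property looks like: image "scan.png"  (quotes are optional)
    m = re.search(r'(?:^|;)\s*image\s+("([^"]*)"|[^;]+)', title)
    if m is None:
        return None
    # group(2) is set only when the name was quoted
    return m.group(2) if m.group(2) is not None else m.group(1).strip()


if __name__ == "__main__":
    page = ET.fromstring(
        """<div class="ocr_page" title='image "page-001.png"; bbox 0 0 2480 3508'/>"""
    )
    print(hocr_image_name(page))
```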