read and write AlternativeImage #34

Open
bertsky opened this issue Jul 17, 2019 · 1 comment


bertsky commented Jul 17, 2019

OCR-D processors are required to respect the AlternativeImage annotation of the METS/PAGE pair, cf. spec (ff.). That implies:

  • on the page level: do not just read @imageFilename, but prefer an AlternativeImage/@filename if it exists; regardless, write the result as a new PAGE, merely referencing the resulting image as an additional AlternativeImage in the PageType and as a mets:file (in one of the OCR-D-IMG-* fileGrps) in METS

  • on the region level: do not just read @imageFilename and cut the respective region from it, but instead:

    1. prefer an AlternativeImage/@filename (for the region) if it exists, or
    2. prefer an AlternativeImage/@filename (for the page) if it exists and cut the region from it, otherwise
    3. use @imageFilename and cut the respective region from it;

    regardless, write the result as a new PAGE, merely referencing the resulting image as an additional AlternativeImage in the RegionType and as a mets:file (in one of the OCR-D-IMG-* fileGrps) in METS

  • etc.
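
The lookup order above could be sketched roughly as follows. This is only an illustration on plain `xml.etree.ElementTree` elements, not the API planned for core. The helper names (`select_image`, `add_alternative_image`) are hypothetical. Picking the last AlternativeImage when several exist is an assumption, and so is inserting the new element directly after the existing AlternativeImage children. Cropping and the mets:file entry are left out.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit('}', 1)[-1]


def alternative_images(element):
    """Filenames of the direct AlternativeImage children, in document order."""
    return [child.get('filename') for child in element
            if local(child.tag) == 'AlternativeImage' and child.get('filename')]


def select_image(page, region=None):
    """Return (filename, needs_crop) following the fallback order above.

    Assumption: if several AlternativeImage entries exist, take the last one.
    """
    if region is not None:
        region_images = alternative_images(region)
        if region_images:
            # 1. the region has its own image: use it as is
            return region_images[-1], False
    page_images = alternative_images(page)
    if page_images:
        # 2. derived page image: crop only when a region was requested
        return page_images[-1], region is not None
    # 3. fall back to the original image
    return page.get('imageFilename'), region is not None


def add_alternative_image(element, filename):
    """Reference a result image as an additional AlternativeImage.

    Assumption: the new child goes right after any existing
    AlternativeImage children (or first, if there are none).
    """
    tag = element.tag
    namespace = tag[:tag.index('}') + 1] if tag.startswith('{') else ''
    new = ET.Element(namespace + 'AlternativeImage', {'filename': filename})
    index = 0
    for i, child in enumerate(element):
        if local(child.tag) == 'AlternativeImage':
            index = i + 1
    element.insert(index, new)
    return new
```

A processor would call `select_image(page, region)` before loading pixels, crop by the region coordinates only if `needs_crop` is true, and call `add_alternative_image` on the same Page or region element before serializing the new PAGE file.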


bertsky commented Jul 17, 2019

But mind that an effort is currently under way to incorporate a nice API for all that into core. (Because there is a lot more to it, cf. OCR-D/ocrd_tesserocr#33.) So I recommend waiting for the next release of ocrd first.
