separate preprocessing steps and use AlternativeImage in ocropy wrappers #10
…ers:
- move binarization from recognition into extra Processor (also allowing region and page level operation)
- move dewarping from recognition into extra Processor (operating on the line level; model-independent)
- move deskewing from binarization into extra Processor (operating on the region level, only annotating angle in PAGE)
- always dive down the PAGE hierarchy checking whether AlternativeImage is referenced: use it if present, otherwise create an ad-hoc image for the segment (page/region/line) from _relative_ coordinates into the next higher-level image by cropping (and rotating); also, pass down corrected coordinates:
  - offset coordinates if the image is larger than the segment (e.g. from rotation),
  - rotate coordinates if the region was rotated (has @orientation)

  new functions (all to be moved into ocrd core):
  - image_from_page (AlternativeImage, or crop via Border)
  - image_from_region (AlternativeImage, or crop and rotate via Coords and orientation)
  - image_from_line (AlternativeImage, or crop via rotated polygon mask, and optionally region segmentation)
  - save_image_file (save new AlternativeImage: add to METS and reference in PAGE)
- use polygon masks instead of rectangles when cropping lines (especially useful after rotation), and try to resegment regions to mask components from neighbouring lines (especially useful against ascenders and descenders when dewarping or with sensitive OCR like ocropy)
- move common ocropy functions into extra module (but with additions/improvements):
  - PIL.Image vs np.ndarray conversions
  - type and plausibility checks for line/region/page level (but mix absolute and relative error criteria)
  - local whitelevel estimation (but keeping exact size)
  - deskewing (but expanding image size with rotation)
  - binarization (but using larger whitelevel percentile, smaller whitelevel local range and zoom, and larger white point threshold)
  - borderclean (remove black components only in the margin)
  - black and white column separator search
  - gradmap for baseline search (but with smaller minimum size of boxmap and sticky top/bottom for line components that were chopped-off)
  - line seed search (but with horizontal merge to avoid splitting lines at large whitespace in the absence of true colseps)
  - line segmentation for regions/pages without/with colseps (but with larger scale estimate and tighter hscale for higher vertical variability of broken fonts)
  - denoising
- ocrd-tool: add default input and output file groups
- update README and setup
- version: 0.0.2 -> 0.0.3
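The coordinate correction described in this commit (offsetting coordinates into a cropped parent image, then rotating them when the region carries an orientation) can be illustrated with a minimal sketch. This is not the code from the PR: the helper names `to_relative` and `rotate_points` are hypothetical, plain tuples stand in for whatever coordinate type the wrappers use, and the rotation follows the usual mathematical convention applied to the raw coordinate values.

```python
import math

def to_relative(polygon, offset):
    """Shift absolute (x, y) points so they are relative to the
    top-left corner of the parent image crop (hypothetical helper)."""
    ox, oy = offset
    return [(x - ox, y - oy) for x, y in polygon]

def rotate_points(points, angle_deg, center):
    """Rotate points about `center` by `angle_deg` degrees.

    Positive angles turn counter-clockwise when the y axis points up;
    with image coordinates (y pointing down) the same arithmetic
    appears clockwise on screen.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy = center
    return [(cx + (x - cx) * cos_a - (y - cy) * sin_a,
             cy + (x - cx) * sin_a + (y - cy) * cos_a)
            for x, y in points]

# A line polygon in absolute page coordinates, inside a region whose
# cropped image starts at (100, 200) on the page.
line = [(120, 230), (320, 230), (320, 260), (120, 260)]
relative = to_relative(line, offset=(100, 200))

# If the region image was rotated, the relative coordinates have to be
# rotated about the same center before they can be used for cropping.
rotated = rotate_points(relative, 90, center=(0, 0))
```

Doing the offset first and the rotation second keeps every lower level working purely in the coordinate system of the image it was handed, which is the point of passing corrected coordinates down the hierarchy.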
The last commit maintains that functions in |
Please do not merge yet. I will push more changes soon that will further improve:
|
…dd clip:
- make all common functions for image extraction respect and recreate the full polygon coordinates (not just the bounding box):
  - use Numpy arrays for coordinates instead of dicts
  - rename rotate_polygon → rotate_coordinates
  - factor out coordinates_of_segment for shared offset/rotation calc
  - offer extra coordinates_for_segment for the reverse direction (to add segmentation on lower levels)
  - factor out image_from_polygon for shared background masking
- when masking a polygon from an image, fill with the background color (instead of white)
- when cropping a rectangle from an image, if the rectangle extends beyond the image (as happens with bad segmentation when segments extend beyond their parents in PAGE), fill with the background color (instead of black)
- in various processors: start introducing DPI-based zoom parameter
- when deskewing, make sure to also create a rotated AlternativeImage
- when deskewing, ignore detected angles if the drop in variance is too small (as happens on tiny regions)
- when binarizing, be robust against NaN results for threshold levels
- when binarizing, do not attempt borderclean (obsolete with clip)
- when binarizing, do not attempt deskewing on page level (yet)
- add new Processor clipping connected components from neighbouring segments (operating on the region or line level), which produces images with intruding foreground components clipped to white
- move re-segmentation from `image_from_line` or binarization/dewarping into extra Processor (operating on the line level), which instead of producing images creates shrunken, non-overlapping polygon outlines
- improve line segmentation (compute_line_labels) further:
  - use more robust state transitions from bottom to top line markers: project seed by delta from both bottom (up) and top (down), but stop short if they are closer to each other already (fill only)
  - horizontally blur bottom line markers just like top line markers
  - skip horizontally blurring the resulting seeds altogether (to avoid accidentally joining lines)
    - this obsoletes the large (6*hscale) horizontal blur of the gradient
    - this obsoletes the sticky option for compute_gradmap: do not extend the gradient from the bottom/top margins
  - make the old behaviour available with robust=False
  - fix hmerge relabelling
  - when spreading line seeds to the background, first make sure that connected components of the foreground remain in their majority label
  - when full_page=True,
    - add remove_hlines again, but with additional height threshold, and smaller width threshold default
    - when searching for black column separators,
      - reduce the vertical threshold (because vlines can be discontinuous)
      - keep only connected components that are properly contained in the detected region (i.e. avoid damaging neighbours)
  - make checks optional here as well
  - combine scale parameters with additional top-level zoom parameter (to be determined from DPI factor against implicit 300)
- improve docstrings
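The cropping behaviour described in this commit (fill with the background colour where a rectangle extends beyond its parent image) can be sketched as follows. This is an illustration only, not the PR's implementation: `crop_with_fill` and `estimate_background` are hypothetical names, nested lists stand in for image arrays to keep the example dependency-free, and taking the most frequent pixel value as the background is an assumption made here for simplicity.

```python
from collections import Counter

def estimate_background(image):
    """Return the most frequent pixel value as a crude background
    estimate (an assumption of this sketch, not the PR's method)."""
    counts = Counter(pixel for row in image for pixel in row)
    return counts.most_common(1)[0][0]

def crop_with_fill(image, box, fill=None):
    """Crop box = (left, top, right, bottom) from a 2-D image.

    Pixels of the box that fall outside the image are set to `fill`,
    which defaults to the estimated background instead of black.
    """
    if fill is None:
        fill = estimate_background(image)
    height, width = len(image), len(image[0])
    left, top, right, bottom = box
    return [[image[y][x] if 0 <= y < height and 0 <= x < width else fill
             for x in range(left, right)]
            for y in range(top, bottom)]

# A 3x3 "page" with one black pixel; the crop box overshoots the
# right and bottom edges by one pixel each.
page = [
    [255, 255, 255],
    [255,   0, 255],
    [255, 255, 255],
]
crop = crop_with_fill(page, (1, 1, 4, 4))
```

Padding with the background value matters for the later steps: a black margin would look like foreground to binarization and line segmentation, whereas a background-coloured margin is neutral.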
Besides the changes already announced, this last commit includes a new processor clip which can suppress neighbouring segments (lines vs. line, or regions vs. region – including Due to the nature of comparison between neighbours though, the segments must not have Line-level clipping can be seen as an alternative to resegmentation which does not depend on Ocropy segmentation. |
@finkf Please review and/or merge! |
I am OK to merge this. But wouldn't it make more sense to separate all the |
Splendid! Definitely, it does make sense to have a separate module with all Ocropy based processors. And that's exactly what we are planning to do. This is just the first step, so others can start experimenting already. Remember, I started this in your repo because of all your good work on the recognition processor. The idea is to restructure the repo OCR-D/ocropy into 2 packages, ocrolib (current Moreover, OCR-D/ocrd_ocropy can then become a simple OCR-D wrapper based on Based on this, I will re-commit all changes not strictly related to wrappers I made here to the new ocrolib. Finally, I will introduce the new processors into OCR-D/ocrd_ocropy, and also rewrite the segmentation processor (the only one I have not touched yet). |
OK. I'll merge then. |
thx! |
details see changelog, roughly:
AlternativeImage
– as reference implementation showing how to deal with relative coordinates and coordinate transforms (rotation, translation, offset)
I will provide more examples and images shortly.