Polygonalize segments by shrinking to children #162
After segmentation is done for the page, enter the lowest (most granular) hierarchy level which has been processed (i.e. `textequiv_level`) and project the hull of its constituent outlines upwards to the highest (least granular) hierarchy level on which segments have been added (i.e. `segmentation_level`). In effect, segments will have tight polygons instead of coarse bounding boxes, with fewer unnecessary overlaps between neighbours.
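To illustrate the idea of projecting hulls upwards, here is a minimal sketch in plain Python. It is not the code from this PR: the nested `dict` layout, the `polygon` and `children` keys and the function names are assumptions made for illustration, and the hull is computed with a textbook monotone-chain routine rather than whatever the processor actually uses.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain: return the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> int:
        # z-component of (a - o) x (b - o); the sign gives the turn direction
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # the last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


def shrink_to_children(segment: dict) -> List[Point]:
    """Bottom-up: replace a segment's polygon by the hull of its children's polygons."""
    children = segment.get("children", [])
    if not children:
        return segment["polygon"]  # lowest level: keep the outline as it is
    points: List[Point] = []
    for child in children:
        points.extend(shrink_to_children(child))
    segment["polygon"] = convex_hull(points)
    return segment["polygon"]


# hypothetical line with a coarse bounding box and two tighter word boxes
line = {
    "polygon": [(0, 0), (100, 0), (100, 20), (0, 20)],
    "children": [
        {"polygon": [(10, 5), (40, 5), (40, 15), (10, 15)]},
        {"polygon": [(50, 2), (90, 2), (90, 18), (50, 18)]},
    ],
}
hull = shrink_to_children(line)
```

In this toy example the line's coarse rectangle is replaced by a six-vertex polygon that hugs both words, which is the effect described above.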
Codecov Report
@@            Coverage Diff             @@
##           master     #162      +/-   ##
==========================================
- Coverage   40.58%   31.18%    -9.41%
==========================================
  Files          11       11
  Lines        1126     1209      +83
  Branches      236      277      +41
==========================================
- Hits          457      377      -80
- Misses        585      758     +173
+ Partials       84       74      -10
Instead of post-processing on the PAGE hierarchy, get the convex hull polygon from all constituent symbols/glyphs (i.e. the lowest level) on the Tesseract iterator hierarchy directly. Thus, segments will have tight polygons not only when processing across multiple levels, but also when annotating results for a single PAGE level. So the more granular results from Tesseract will at least be used implicitly. Also exposes this option as a parameter to the other processors.
Good news: I did find a solution for this. I emulated a |
The bad news is that I also made a big conceptual mistake in #158: My parameterization Here's an idea how to fix this (without throwing it all away): We could separate the question of whether to do only segmentation (
Opinions?
Oops. But good that you noticed before ocrd_all merge/documentation update.
Makes sense. If users want text recognition they will also explicitly provide the model(s) they want to use for recognition, so this is a reasonable convention. From my side, I'd be content with this `segmentation_level`/`textequiv_level`/`model` logic.
- recognize: do not attempt `Recognize()` (only `AnalyseLayout()`) or modify `TextEquiv` if `model` parameter is empty (just like for `textequiv_level==none`, except the latter still attempts maximal segmentation)
- segment-{region,line,word}: Apply single-level segmentation (i.e. `textequiv_level!=none`) without recognition (i.e. empty `model`)
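The convention in this commit message can be read as a small decision table. The following sketch encodes one reading of it and is not the processor's actual code: the `Plan` type, the `plan` function and the `LEVELS` list are hypothetical names introduced here, and "maximal segmentation" is assumed to mean going down to the deepest level.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    recognize: bool   # run Recognize() and write TextEquiv?
    segment_to: str   # deepest hierarchy level to segment down to


# hypothetical level names, ordered from coarse to fine
LEVELS = ["region", "line", "word", "glyph"]


def plan(model: str, textequiv_level: str) -> Plan:
    """Decide what to do from the two parameters (assumed reading of the commit message)."""
    if textequiv_level == "none":
        # no recognition, but still segment as deeply as possible
        return Plan(recognize=False, segment_to=LEVELS[-1])
    if not model:
        # layout analysis only, stopping at the requested level
        return Plan(recognize=False, segment_to=textequiv_level)
    # a model was given explicitly, so recognition is wanted
    return Plan(recognize=True, segment_to=textequiv_level)
```

For instance, `plan("", "word")` would segment down to words without touching `TextEquiv`, while `plan("deu", "line")` would also recognize text at the line level.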
See d1ffd70. I still kept the |
Tesseract's `PageIterator` / `ResultIterator` navigation has 2 bugs which make them hardly usable: they are not equipped to cleanly handle the case when
- a non-text block is entered at the PARA/TEXTLINE/WORD level, or
- an empty word (only rejections) is entered at the SYMBOL level.

In particular, `IsAtFinalElement` and `Empty` prematurely signal True at the lower (but not higher!) levels, the latter w.r.t. the current and the former w.r.t. the next segment. Using the API in a naïve / straightforward way would cause all follow-up segments to be lost in either case. This contains a workaround on the API level.
@@ -72,7 +73,7 @@ class TesserocrRecognize(Processor):

     def __init__(self, *args, **kwargs):
         kwargs['ocrd_tool'] = OCRD_TOOL['tools'][TOOL]
         kwargs['version'] = OCRD_TOOL['version']
Good idea! We should probably also do that for ocrd_calamari
This implements an idea I had a long time ago to at least get some polygons out of Tesseract, reducing the large overlap between bounding boxes. If you set `shrink_polygons`, then `segment*` and `recognize` will post-process their segmentation by projecting the convex hull upwards from `textequiv_level` to `segmentation_level`.

For example, `ocrd-tesserocr-segment` now yields: