improve ocropy processor (#8)
* fix requirements

* improve ocrd-cis-ocropy-recognize:

- add proper error handling
- use proper temporary files
- re-introduce binarization (was commented)
- add sanity checks (from ocropy CLI)
- make de-warping optional

* improve ocrd-cis-ocropy-recognize again:

- abolish temporary files altogether:
  keep converting between pillow and array
  formats in memory
- make logger available to all functions
- make binarization method and dewarping
  selectable via ocrd-tool parameters
- add binarization method from original
  ocropus-nlbin (including local whitelevel
  estimation and de-skewing)
- calculate OCR-GT distances while processing
  and show CER per input file
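
The last item above (OCR-GT distances and a per-file CER) can be illustrated with a minimal sketch. This is not the code from this commit: the function names `levenshtein` and `cer` are hypothetical, and it assumes CER is defined as the total edit distance divided by the total ground-truth length over all lines of one input file.

```python
def levenshtein(a, b):
    """Return the edit distance (insert/delete/substitute) between a and b."""
    # Keep the shorter sequence in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def cer(pairs):
    """Character error rate over (ocr_text, gt_text) line pairs of one file.

    Assumption: CER = sum of edit distances / sum of ground-truth lengths.
    """
    edits = 0
    length = 0
    for ocr_text, gt_text in pairs:
        edits += levenshtein(ocr_text, gt_text)
        length += len(gt_text)
    return edits / length if length else 0.0
```

Accumulating distances and lengths per file, rather than averaging per-line rates, keeps short lines from dominating the result.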
bertsky authored and finkf committed Apr 25, 2019
1 parent b6773e9 commit 12ea2f9
Showing 4 changed files with 248 additions and 70 deletions.
10 changes: 10 additions & 0 deletions ocrd_cis/ocrd-tool.json
@@ -56,6 +56,16 @@
],
"description": "Recognize text in lines with ocropy",
"parameters": {
+"dewarping": {
+"type": "boolean",
+"description": "enable line normalization",
+"default": true
+},
+"binarization": {
+"type": "string",
+"enum": ["none", "global", "otsu", "gauss-otsu", "ocropy"],
+"default": "none"
+},
"textequiv_level": {
"type": "string",
"enum": ["line", "word", "glyph"],
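
As a rough illustration of how the two new parameters behave, the sketch below applies the declared defaults and rejects values outside the `enum`. It is only a sketch: `PARAMETERS` mirrors the two entries added in this hunk, while `resolve_parameters` is a hypothetical helper and not the validation that OCR-D itself performs.

```python
# Mirrors the two parameters added to ocrd-tool.json in this commit.
PARAMETERS = {
    "dewarping": {"type": "boolean", "default": True},
    "binarization": {
        "type": "string",
        "enum": ["none", "global", "otsu", "gauss-otsu", "ocropy"],
        "default": "none",
    },
}

# Map the JSON type names used above to Python types.
PYTHON_TYPES = {"boolean": bool, "string": str}


def resolve_parameters(user, schema=PARAMETERS):
    """Hypothetical helper: merge user values over defaults and validate them."""
    resolved = {name: spec["default"] for name, spec in schema.items()}
    for name, value in user.items():
        if name not in schema:
            raise ValueError("unknown parameter: %s" % name)
        spec = schema[name]
        if not isinstance(value, PYTHON_TYPES[spec["type"]]):
            raise ValueError("%s must be of type %s" % (name, spec["type"]))
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError("%s must be one of %s" % (name, spec["enum"]))
        resolved[name] = value
    return resolved
```

With no user input, de-warping stays enabled and binarization stays off, which matches the defaults declared in the hunk above.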
2 changes: 1 addition & 1 deletion ocrd_cis/ocropy/ocrolib/common.py
@@ -139,7 +139,7 @@ def array2pil(a):
else:
raise OcropusException("bad image rank")
elif a.dtype==dtype('float32'):
-return PIL.Image.fromstring("F",(a.shape[1],a.shape[0]),a.tostring())
+return PIL.Image.frombytes("F",(a.shape[1],a.shape[0]),a.tostring())
else:
raise OcropusException("unknown image type")
