$ ocrd-tesserocr-binarize -I GT -O BIN -p '{"operation_level": "line"}'
10:46:23.324 INFO processor.TesserocrBinarize - No output file group for images specified, falling back to 'OCR-D-IMG-BIN'
10:46:23.442 INFO processor.TesserocrBinarize - INPUT FILE 0 / 00009
Traceback (most recent call last):
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/workspace.py", line 109, in download_file
f.url = self.resolver.download_to_directory(self.directory, f.url, subdir=f.fileGrp, basename=basename)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/resolver.py", line 77, in download_to_directory
raise FileNotFoundError("File path passed as 'url' to download_to_directory does not exist: %s" % url)
FileNotFoundError: File path passed as 'url' to download_to_directory does not exist: 00009.tif
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/kmw/Documents/Work/OCR-D/env/bin/ocrd-tesserocr-binarize", line 10, in <module>
sys.exit(ocrd_tesserocr_binarize())
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd_tesserocr/cli.py", line 45, in ocrd_tesserocr_binarize
return ocrd_cli_wrap_processor(TesserocrBinarize, *args, **kwargs)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/decorators.py", line 66, in ocrd_cli_wrap_processor
run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/processor/base.py", line 56, in run_processor
processor.process()
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd_tesserocr/binarize.py", line 82, in process
page, page_id)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/workspace.py", line 320, in image_from_page
page_image = self._resolve_image_as_pil(page.imageFilename)
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/workspace.py", line 237, in _resolve_image_as_pil
image_filename = self.download_file(f).local_filename
File "/home/kmw/Documents/Work/OCR-D/env/lib/python3.6/site-packages/ocrd/workspace.py", line 112, in download_file
raise Exception("No baseurl defined by workspace. Cannot retrieve '%s'" % f.url)
Exception: No baseurl defined by workspace. Cannot retrieve '00009.tif'
In addition, a file 00009_gt.xml is created in the GT directory.
Another flaw in the logic of download_to_directory. It SHOULD recognize that the source files are already in the workspace but does not, leading to copies of all input files...
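To make the expected behaviour concrete, here is a minimal sketch of the check described above. It is not the actual OCR-D implementation: the function name fetch_into_workspace and its parameters are hypothetical, and it only assumes that a local path which already resolves to somewhere inside the workspace directory should be reused rather than copied.

```python
import shutil
from pathlib import Path


def fetch_into_workspace(workspace_dir, url, subdir):
    """Hypothetical helper (NOT the ocrd.resolver API).

    Returns the workspace-relative path of a local file, copying it into
    `subdir` only if it does not already live inside the workspace.
    """
    workspace = Path(workspace_dir).resolve()
    src = Path(url)
    # Interpret relative paths as relative to the workspace, not the CWD.
    if not src.is_absolute():
        src = workspace / src
    src = src.resolve()
    if not src.exists():
        raise FileNotFoundError("Local file does not exist: %s" % url)
    try:
        # Already inside the workspace: reuse it, do not create a copy.
        return src.relative_to(workspace).as_posix()
    except ValueError:
        pass  # Outside the workspace: fall through and copy.
    dst_dir = workspace / subdir
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    shutil.copy2(str(src), str(dst))
    return dst.relative_to(workspace).as_posix()
```

With a check like this, a file referenced as GT/00009.tif inside the workspace would be returned unchanged, and only files from outside the workspace would be copied into the target file group directory.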
kba added a commit to kba/ocrd-core that referenced this issue on Jan 13, 2020
As shown in the Lobby, I encounter problems when trying to create a workspace from existing (local) files: running ocrd-tesserocr-binarize leads to the traceback above.
Attachment: 00009.zip