Restructure general workflow, cli, services, processors, docs, tests #40

Merged 68 commits on Mar 27, 2018
Commits
11bff34
Stubs for METS, PAGE, resolver and workspace, pylint, unittests
kba Mar 23, 2018
abeb696
OcrdMetsFile in its own file
kba Mar 23, 2018
f22758b
wip: ResolverCache
kba Mar 23, 2018
c54a433
model.ocrd_page: fix indexing off by one
kba Mar 23, 2018
1fc1ee7
recognize works
kba Mar 23, 2018
01045cf
gitignore libreoffice lock files
kba Mar 23, 2018
dbaa95c
wip: travis
kba Mar 23, 2018
3a05a34
typo: segent -> segment
kba Mar 23, 2018
e3e46ae
ResolverCache working
kba Mar 23, 2018
919cec9
'make test' to run all unit tests
kba Mar 23, 2018
f3cc1ae
add EXIF constants
kba Mar 23, 2018
db9eb66
OcrdMets: cache fileGrps
kba Mar 23, 2018
98ef09a
:memo: Update README
kba Mar 24, 2018
f018657
python3 compat (make PYTHON=python3 test)
kba Mar 24, 2018
18d12ee
create processor class, port exif to new api, extend page, test files…
kba Mar 24, 2018
6d81ea5
rename ocrd.log -> ocrd.utils to contain reusable static code
kba Mar 24, 2018
9641bbd
move to utils, export getLogger, coordinate_string_from_xywh
kba Mar 24, 2018
905aa1e
lazy logging
kba Mar 24, 2018
fef9b5c
pylint: stop complaining about lxml
kba Mar 24, 2018
ff1e92d
page tag constants
kba Mar 24, 2018
62684c1
OcrdPage: methods for listing/creating regions/lines
kba Mar 24, 2018
728564e
workspace: + save_mets method
kba Mar 24, 2018
8ecce18
utils: xywh_from_coordinate_string as opposite of coordinate_string_f…
kba Mar 24, 2018
fb09eba
WIP port segmenting to new api
kba Mar 24, 2018
cf724a8
pylint: stop complaining about tesserocr/cv2
kba Mar 24, 2018
9dbe659
MIMETYPE_PAGE = text/page+xml
kba Mar 24, 2018
cfb536a
processor: helpers for input/output of files
kba Mar 24, 2018
6cbb851
OcrdPage: prefer "X is not None" over "not X"
kba Mar 24, 2018
48b3cee
tests: assets module
kba Mar 24, 2018
b8a942b
mirror module structure in tests
kba Mar 25, 2018
98c099b
segment*/tesseract: use processor shortcuts
kba Mar 25, 2018
bb8832f
xsl namespace
kba Mar 25, 2018
50b1f60
workspace: output files are saved with file:// if no url
kba Mar 25, 2018
751f0c2
OcrdPage: typos
kba Mar 25, 2018
f323355
xml prettify
kba Mar 25, 2018
9244464
remove cruft from ocrd_xml_base
kba Mar 25, 2018
19e0c19
tests: run all with uniitest discover
kba Mar 25, 2018
e4c9fc4
test with pytest
kba Mar 25, 2018
93dfbba
test OcrdPage
kba Mar 25, 2018
3de7a6a
:fire: remove original characterizing/segmenting
kba Mar 25, 2018
468c698
:memo: docstrings in OcrdPage
kba Mar 25, 2018
83de2f7
:fire: remove initializing
kba Mar 25, 2018
02e01c9
start with cli
kba Mar 25, 2018
271d0bf
cli
kba Mar 25, 2018
26b9fff
run_process in ocrd.processor to flexibly create workspace and run pr…
kba Mar 25, 2018
3a9391c
Expose existing processors on web service, extend run-server script
kba Mar 25, 2018
4568bd8
rename binary to 'ocrd', merge run and run_server, update setup.py
kba Mar 25, 2018
3da3e46
CLI: ocrd process is now chainable
kba Mar 25, 2018
4bfd81b
:fire: remove ocrd.webservices
kba Mar 25, 2018
e712196
minimal repository web service
kba Mar 25, 2018
0c80139
optionally symlink instead of copy in resolver
kba Mar 25, 2018
74db476
:memo: docs, move code in in processor/__init__.py to processor/base.py
kba Mar 25, 2018
30a97c9
basic setup for documentation with sphinx
kba Mar 25, 2018
32c2ce7
move image manipulation to workspace.resolve_image_as_pil
kba Mar 25, 2018
b58e6f6
resolver: allow setting workspace directory explicitly (for testing)
kba Mar 26, 2018
fa92fa9
use xmllint --format to optionally canonicalize/pretty print XML
kba Mar 26, 2018
cabb273
canonical ID for mets:file: fileGrp@USE + 4-zero-padded index within grp
kba Mar 26, 2018
bfdf09a
page: helpers to work with TextLine
kba Mar 26, 2018
38c85a7
processor.add_output_file: pass on ID
kba Mar 26, 2018
9f3c82c
WIP recognition with tesseract3
kba Mar 26, 2018
80d33da
test assets
kba Mar 26, 2018
c8232b7
:bug: resolver: use hyper-verbose but uniqe filenames based on url
kba Mar 23, 2018
3693299
:art: remove obsolete pylint exceptions
kba Mar 26, 2018
4e0fa6f
rename tesseract3 -> tesserocr
kba Mar 26, 2018
38d29ab
make 'test-profile' to list most time-consuming lines
kba Mar 26, 2018
f911afa
workspace: remove hard-coded reference to INPUT fileGrp
kba Mar 26, 2018
1a2bd9b
:green_heart: travis add @alex-p's tesseract-ocr PPA
kba Mar 26, 2018
daade7f
properly skip recognize test, travis
kba Mar 24, 2018
typo: segent -> segment
kba committed Mar 23, 2018
commit 3a05a34f40afe6fe9ff660f8d7f25dc0b5ea4ccd
4 changes: 2 additions & 2 deletions ocrd/webservice/processor.py
@@ -17,12 +17,12 @@ def _characterize_exif():
     run_processor(ExifProcessor, request.args['mets_url'], resolver)
     return 'DONE', 200
 
-@app.route('/processor/segent_line/tesseract3', methods=['PUT'])
+@app.route('/processor/segment_line/tesseract3', methods=['PUT'])
 def _segment_line_tesseract3():
     run_processor(Tesseract3LineSegmenter, request.args['mets_url'], resolver)
     return 'DONE', 200
 
-@app.route('/processor/segent_region/tesseract3', methods=['PUT'])
+@app.route('/processor/segment_region/tesseract3', methods=['PUT'])
 def _segment_region_tesseract3():
     run_processor(Tesseract3RegionSegmenter, request.args['mets_url'], resolver)
     return 'DONE', 200
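
Why the typo matters to clients: a route path is matched literally, so a request to the correctly spelled URL would not have reached the handler before this commit. The sketch below illustrates that with a toy dictionary-based dispatcher. It is a stand-in for the web framework, not the project's actual code; `make_router`, `dispatch`, `BEFORE` and `AFTER` are hypothetical names introduced only for this illustration, and the 404 response for an unmatched path is an assumption about typical framework behaviour.

```python
# Toy stand-in for a web framework's routing table (NOT the project's code).
# Maps (method, path) to a handler; unmatched pairs are assumed to yield 404.

def make_router(paths):
    """Build a lookup of PUT routes whose handlers all answer ('DONE', 200)."""
    return {("PUT", path): (lambda: ("DONE", 200)) for path in paths}


def dispatch(router, method, path):
    """Return the handler's response, or a 404 tuple when nothing matches."""
    handler = router.get((method, path))
    if handler is None:
        return ("NOT FOUND", 404)
    return handler()


# Route paths as they appear before and after commit 3a05a34.
BEFORE = make_router([
    "/processor/segent_line/tesseract3",
    "/processor/segent_region/tesseract3",
])
AFTER = make_router([
    "/processor/segment_line/tesseract3",
    "/processor/segment_region/tesseract3",
])

if __name__ == "__main__":
    url = "/processor/segment_line/tesseract3"
    print(dispatch(BEFORE, "PUT", url))  # misspelled registration: no match
    print(dispatch(AFTER, "PUT", url))   # corrected registration: handler runs
```

A check of this shape could also serve as a small regression test for route spelling, under the same assumptions.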