#1240 backports for v2 #1275

Merged 46 commits on Sep 30, 2024
5663f48
processor CLI: delegate --resolve-resource, too
bertsky Sep 13, 2024
853bdb5
test_mets_server: fix arg vs kwarg
bertsky Aug 13, 2024
33c7386
mets_server: ClientSideOcrdMets needs OcrdMets-like kwargs (without d…
bertsky Aug 13, 2024
37f7cda
use up-to-date kwargs (avoiding old deprecations)
bertsky Aug 13, 2024
44946ba
hide/test expected deprecation warnings
bertsky Aug 13, 2024
d0962d6
improve output in case of assertion failures
bertsky Aug 13, 2024
061f023
allow "from ocrd_models import OcrdPage"
kba Aug 15, 2024
d2f92d1
ocrd_utils: forgot to export scale_coordinates at toplvl
bertsky Aug 16, 2024
c6c5c42
fix imports
bertsky Aug 16, 2024
245778c
Processor.zip_input_files: warning instead of exception for missing i…
bertsky Aug 20, 2024
1f7b57f
Processor.zip_input_files: more verbose log msg
bertsky Aug 21, 2024
35bdb39
tests report.is_valid: improve output on failure
bertsky Aug 21, 2024
e595996
fix --log-filename (6fc606027a): apply in ocrd_cli_wrap_processor
bertsky Aug 24, 2024
f21b8d2
fix exception
bertsky Aug 24, 2024
0cbd3ea
adapt to PIL.Image moved constants
bertsky Aug 24, 2024
8f8912c
cli.workspace: pass fileGrp as well, improve description
bertsky Aug 24, 2024
6dccfb3
OcrdMets.add_agent: does not have positional args
bertsky Aug 24, 2024
2d85f14
update pylintrc
bertsky Aug 24, 2024
ea68370
pylint: try ignoring generateds (again)
bertsky Aug 25, 2024
18ac2c0
ClientSideOcrdMets: use same logger name prefix as server
bertsky Aug 28, 2024
da37967
test_mets_server: use tmpdir to avoid side effects between suites
bertsky Aug 28, 2024
ccb416b
disableLogging: re-instate root logger, too
bertsky Aug 28, 2024
7e3cdf4
test-logging: also remove ocrd.log from tempdir
bertsky Aug 28, 2024
4f45b12
bashlib: re-add --log-filename, implement as stderr redirect
bertsky Aug 28, 2024
7b70c90
ocrd_utils.config: add reset_defaults()
bertsky Aug 29, 2024
48bb3c2
add test for OcrdEnvConfig.reset_defaults()
bertsky Aug 29, 2024
ed92403
Workspace.reload_mets: fix for METS server case
bertsky Sep 1, 2024
9c3c399
OcrdMetsServer.add_file: pass on 'force' kwarg, too
bertsky Sep 2, 2024
c077e95
test_mets_server: add test for force (overwrite)
bertsky Sep 2, 2024
4492168
PcGts.Page.id / make_xml_id: replace '/' with '_'
bertsky Sep 13, 2024
83d52d8
METS Server: also export+delegate physical_pages
bertsky Sep 15, 2024
4eccefc
ocrd.cli.workspace: consistently pass on --mets-server-url and --back…
bertsky Sep 13, 2024
083df27
ocrd.cli.workspace server: add 'reload' and 'save'
bertsky Sep 13, 2024
b2c0161
ocrd.cli.validate tasks: pass on --mets-server-url, too
bertsky Sep 12, 2024
203a06a
run_processor: be robust if ocrd_tool is missing steps
bertsky Sep 12, 2024
4fbdd00
lib.bash: fix errexit
bertsky Sep 12, 2024
c865079
tests: make sure ocrd_utils.config gets reset whenever changing it gl…
bertsky Sep 13, 2024
1a13cd3
ocrd.cli.workspace: assert non-server in cmds mutating METS
bertsky Sep 16, 2024
bba597e
OcrdPage: add PageType.get_ReadingOrderGroups()
bertsky Sep 7, 2024
fa0fada
update OcrdPage from generateds
bertsky Sep 7, 2024
9641d4a
OcrdMets.get_physical_pages: cover return_divs w/o for_fileIds for_pa…
bertsky Sep 27, 2024
19ce7d9
ocrd.cli.workspace: use physical_pages if possible, fix default outpu…
bertsky Sep 27, 2024
606915b
disableLogging: clearer comment
bertsky Sep 30, 2024
3b908a6
:memo: changelog
kba Sep 30, 2024
343a66a
:memo: changelog: remove spurious entries
kba Sep 30, 2024
f808b72
:memo: update changelog again
bertsky Sep 30, 2024
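
Commit 4492168 changes how `make_xml_id` treats `/`. The snippet below is only a minimal sketch of that idea, not the code from `ocrd_utils`: the function name `make_xml_id_sketch` and the exact cleanup rules are assumptions made for illustration.

```python
import re


def make_xml_id_sketch(idstr: str) -> str:
    """Turn an arbitrary string into something usable as an XML ID (sketch only)."""
    # Path separators and colons become underscores instead of being dropped,
    # so 'OCR-D-IMG/page_0001' stays readable as 'OCR-D-IMG_page_0001'.
    ret = idstr.replace('/', '_').replace(':', '_')
    # An XML ID has to start with a letter or underscore; prefix anything else.
    ret = re.sub(r'^([^a-zA-Z_])', r'id_\1', ret)
    # Drop whatever remains that is not a word character, '.' or '-'.
    ret = re.sub(r'[^\w.-]', '', ret)
    return ret


print(make_xml_id_sketch("OCR-D-IMG/page_0001"))
print(make_xml_id_sketch("0001/a b"))
```

In this sketch, replacing `/` rather than dropping it keeps `a/b` and `ab` distinct, which matters when IDs are derived from file paths.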
19 changes: 9 additions & 10 deletions .pylintrc
@@ -1,19 +1,22 @@
 [MASTER]
-extension-pkg-whitelist=lxml
-ignored-modules=cv2,tesserocr,ocrd.model
+extension-pkg-whitelist=lxml,pydantic
+ignored-modules=cv2,tesserocr,ocrd_models.ocrd_page_generateds
+ignore-paths=ocrd_page_generateds.py
+ignore-patterns=.*generateds.*

 [MESSAGES CONTROL]
-ignore-patterns='.*generateds.*'
 disable =
     fixme,
-    E501,
+    line-too-long,
+    consider-using-f-string,
+    logging-fstring-interpolation,
     trailing-whitespace,
     logging-not-lazy,
     inconsistent-return-statements,
+    disallowed-name,
     invalid-name,
-    line-too-long,
     missing-docstring,
     no-self-use,
     wrong-import-order,
     too-many-nested-blocks,
     superfluous-parens,
@@ -25,13 +28,9 @@ disable =
     ungrouped-imports,
     useless-object-inheritance,
     useless-import-alias,
-    bad-continuation,
     no-else-return,
     logging-not-lazy

-[FORMAT]
-no-space-check=empty-line
-
 [DESIGN]
 # Maximum number of arguments for function / method
 max-args=12
@@ -40,7 +39,7 @@ max-locals=30
 # Maximum number of return / yield for function / method body
 max-returns=12
 # Maximum number of branch for function / method body
-max-branchs=30
+max-branches=30
 # Maximum number of statements in function / method body
 max-statements=60
 # Maximum number of parents for a class (see R0901).
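
The `.pylintrc` hunk moves `ignore-patterns` into `[MASTER]` and drops the quotes around the regex. As a rough illustration of why the quotes could matter, here is a small sketch. It assumes the pattern is matched against the base name of each file from its start, and that quotes in the old value would have been read literally; neither assumption is confirmed by this PR. The helper `is_ignored` is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Pattern as written in the new [MASTER] section (no quotes).
UNQUOTED = re.compile(r".*generateds.*")
# Pattern as it would look if the old quotes were read literally (assumption).
QUOTED = re.compile(r"'.*generateds.*'")


def is_ignored(path: str, pattern: re.Pattern) -> bool:
    """Return True if the base name of ``path`` matches ``pattern`` from its start."""
    return pattern.match(PurePosixPath(path).name) is not None


print(is_ignored("src/ocrd_models/ocrd_page_generateds.py", UNQUOTED))
print(is_ignored("src/ocrd_models/ocrd_page.py", UNQUOTED))
print(is_ignored("src/ocrd_models/ocrd_page_generateds.py", QUOTED))
```

Under these assumptions only the unquoted pattern skips the generated module, since no file name contains literal quote characters.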
31 changes: 31 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,36 @@ Versioned according to [Semantic Versioning](http://semver.org/).

 ## Unreleased

+## [2.69.0] - 2024-09-30
+
+Fixed:
+- tests: ensure `ocrd_utils.config` gets reset whenever changing it globally
+- `ocrd.cli.workspace`: consistently pass on `--mets-server-url` and `--backup`
+- `ocrd.cli.workspace`: make `list-page` work w/ METS Server
+- `ocrd.cli.validate "tasks"`: pass on `--mets-server-url`
+- `lib.bash`: fix `errexit` handling
+- actually apply CLI `--log-filename`, and show in `--help`
+- adapt to Pillow changes
+- `ocrd workspace clone`: do pass on `--file-grp` (for download filtering)
+- `OcrdMetsServer.add_file`: pass on `force` kwarg
+- `Workspace.reload_mets`: handle ClientSideOcrdMets as well
+- `OcrdMets.get_physical_pages`: cover `return_divs` w/o `for_fileIds` and `for_pageIds`
+- `disableLogging`: also re-instate root logger to Python defaults
+
+Changed:
+- `run_processor`: be robust if `ocrd_tool` is missing `steps`
+- `PcGtsType.PageType.id` via `make_xml_id`: replace `/` with `_`
+- `ClientSideOcrdMets`: use same logger name prefix as METS Server
+- `Processor.zip_input_files`: when `--page-id` yields empty list, just log instead of raise
+
+Added:
+- `OcrdPage`: new `PageType.get_ReadingOrderGroups()` to retrieve recursive RO as dict
+- METS Server: export and delegate `physical_pages`
+- ocrd.cli.workspace `server`: add subcommands `reload` and `save`
+- processor CLI: delegate `--resolve-resource`, too
+- `OcrdConfig.reset_defaults` to reset config variables to their defaults
+- `ocrd_utils.scale_coordinates` for resizing images
+
 ## [2.68.0] - 2024-08-23

 Changed:
@@ -2164,6 +2194,7 @@ Fixed
 Initial Release

 <!-- link-labels -->
+[2.69.0]: ../../compare/v2.69.0..v2.68.0
 [2.68.0]: ../../compare/v2.68.0..v2.67.2
 [2.67.2]: ../../compare/v2.67.2..v2.67.1
 [2.67.1]: ../../compare/v2.67.1..v2.67.0
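
Two of the changelog entries concern resetting the global config between tests. The following is a minimal sketch of that pattern under stated assumptions, not the `ocrd_utils.config` implementation: the class `EnvConfigSketch`, its methods `get`/`set`, and the variable name `SKETCH_LOGLEVEL` are all hypothetical.

```python
import os


class EnvConfigSketch:
    """Hypothetical config object: overrides win, then the environment, then defaults."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)
        self._overrides = {}

    def get(self, name):
        if name in self._overrides:
            return self._overrides[name]
        return os.environ.get(name, self._defaults[name])

    def set(self, name, value):
        # A global change like this is what leaks between test suites.
        self._overrides[name] = value

    def reset_defaults(self):
        # Forget all programmatic overrides; env and defaults apply again.
        self._overrides.clear()


config = EnvConfigSketch({"SKETCH_LOGLEVEL": "INFO"})
config.set("SKETCH_LOGLEVEL", "DEBUG")
config.reset_defaults()
print(config.get("SKETCH_LOGLEVEL"))
```

A test that changes the shared object would call `reset_defaults()` in its teardown so later tests see the defaults again.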
2 changes: 1 addition & 1 deletion Makefile
@@ -273,7 +273,7 @@ test-logging: assets
 	cp src/ocrd_utils/ocrd_logging.conf $$tempdir; \
 	cd $$tempdir; \
 	$(PYTHON) -m pytest --continue-on-collection-errors -k TestLogging -k TestDecorators $(TESTDIR); \
-	rm -r $$tempdir/ocrd_logging.conf $$tempdir/.benchmarks; \
+	rm -r $$tempdir/ocrd_logging.conf $$tempdir/ocrd.log $$tempdir/.benchmarks; \
 	rm -rf $$tempdir/.coverage; \
 	rmdir $$tempdir

2 changes: 2 additions & 0 deletions src/ocrd/cli/ocrd_tool.py
@@ -29,6 +29,8 @@ def __init__(self, filename):
         self.filename = filename
         with codecs.open(filename, encoding='utf-8') as f:
             self.content = f.read()
+            # perhaps the validator should _always_ run (for default expansion)
+            # so validate command only for the report?
             self.json = loads(self.content)

 pass_ocrd_tool = click.make_pass_decorator(OcrdToolCtx)
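
The new comment in `ocrd_tool.py` mentions "default expansion" as a side effect of validation. To make the term concrete, here is a small sketch of filling in schema defaults. It is an assumption about what is meant, not the OCR-D validator: `expand_defaults` and the sample schema with `dpi` and `level` are hypothetical.

```python
import json

# Hypothetical parameter schema with one default value.
SCHEMA = json.loads("""
{
  "properties": {
    "dpi":   {"type": "number", "default": 300},
    "level": {"type": "string"}
  }
}
""")


def expand_defaults(params: dict, schema: dict) -> dict:
    """Return a copy of ``params`` with missing keys filled from schema defaults."""
    expanded = dict(params)
    for name, spec in schema.get("properties", {}).items():
        if name not in expanded and "default" in spec:
            expanded[name] = spec["default"]
    return expanded


print(expand_defaults({"level": "page"}, SCHEMA))
```

If loading the JSON skips this step, downstream code sees only the keys that were written out explicitly, which is presumably the concern behind the comment.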
9 changes: 6 additions & 3 deletions src/ocrd/cli/validate.py
@@ -102,16 +102,19 @@ def validate_page(page, **kwargs):
 @validate_cli.command('tasks')
 @click.option('--workspace', nargs=1, required=False, help='Workspace directory these tasks are to be run. If omitted, only validate syntax')
 @click.option('-M', '--mets-basename', nargs=1, default=DEFAULT_METS_BASENAME, help='Basename of the METS file, used in conjunction with --workspace')
+@click.option('-U', '--mets-server-url', help='TCP host URI or UDS path of METS server')
 @click.option('--overwrite', is_flag=True, default=False, help='When checking against a concrete workspace, simulate overwriting output or page range.')
 @click.option('-g', '--page-id', help="ID(s) of the pages to process")
 @click.argument('tasks', nargs=-1, required=True)
-def validate_process(tasks, workspace, mets_basename, overwrite, page_id):
+def validate_process(tasks, workspace, mets_basename, mets_server_url, overwrite, page_id):
     '''
     Validate a sequence of tasks passable to 'ocrd process'
     '''
     if workspace:
-        _inform_of_result(validate_tasks([ProcessorTask.parse(t) for t in tasks],
-            Workspace(Resolver(), directory=workspace, mets_basename=mets_basename), page_id=page_id, overwrite=overwrite))
+        _inform_of_result(validate_tasks(
+            [ProcessorTask.parse(t) for t in tasks],
+            Workspace(Resolver(), directory=workspace, mets_basename=mets_basename, mets_server_url=mets_server_url),
+            page_id=page_id, overwrite=overwrite))
     else:
         for t in [ProcessorTask.parse(t) for t in tasks]:
             _inform_of_result(t.validate())
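
The `validate.py` hunk follows a pattern that recurs in this PR: an option is only useful if it is forwarded all the way to the object that acts on it. The sketch below illustrates that with stand-ins. `WorkspaceSketch`, its `backend` property, and both helper functions are hypothetical, the default `"mets.xml"` is a placeholder, and the old command did not declare the option at all, so `open_without_forwarding` is only the closest equivalent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkspaceSketch:
    """Hypothetical stand-in for a workspace object, not ocrd.Workspace."""
    directory: str
    mets_basename: str = "mets.xml"
    mets_server_url: Optional[str] = None

    @property
    def backend(self) -> str:
        # With a server URL, METS access is delegated; otherwise the file is read.
        return "mets-server" if self.mets_server_url else "file"


def open_without_forwarding(directory, mets_basename, mets_server_url=None):
    # The URL is never handed on, so it silently has no effect.
    return WorkspaceSketch(directory, mets_basename=mets_basename)


def open_with_forwarding(directory, mets_basename, mets_server_url=None):
    # Every option received by the command is forwarded explicitly.
    return WorkspaceSketch(directory, mets_basename=mets_basename,
                           mets_server_url=mets_server_url)


print(open_without_forwarding("ws", "mets.xml", "/tmp/sketch.sock").backend)
print(open_with_forwarding("ws", "mets.xml", "/tmp/sketch.sock").backend)
```

Keeping the URL optional with a `None` default means callers that do not use a METS server are unaffected by the extra parameter.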