fix make_file_id #861
- if neither input fileGrp nor pageId is in input fileID, then also try pageId
- do not enforce the fallback counter's uniqueness
If I grok this correctly, the problem is that make_file_id is not aware of the overwrite_mode of the workspace and should behave differently. It should not try to avoid clashes because that would lead to inconsistent state, i.e. more files with different IDs instead of overwriting existing files.
One solution could be to add a new kwarg overwrite to the make_file_id method. Not ideal because that requires all the processors to be adapted and can lead to easy-to-miss inconsistencies across processors.
Since overwrite is a state of the OcrdWorkspace class and there is no link back from OcrdMets to the workspace it represents, we cannot go OcrdFile -> OcrdMets -> OcrdWorkspace - but maybe this should be possible. We could add a reference from OcrdMets to OcrdWorkspace in the latter's __init__.
Then we could check OcrdFile.mets.workspace.overwrite_mode and skip the loop to create unique IDs.
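To make the back-reference idea concrete, here is a minimal sketch using toy classes rather than the real OCR-D ones. ToyMets, ToyWorkspace, toy_make_file_id and the `_2`, `_3` counter suffix are all hypothetical names and conventions chosen for illustration; they only mirror the shape of the proposal (workspace registers itself on the METS object, ID generation skips the uniqueness loop in overwrite mode).

```python
class ToyMets:
    """Hypothetical stand-in for a METS object: just a set of file IDs."""

    def __init__(self, ids=()):
        self.ids = set(ids)
        self.workspace = None  # back-reference, filled in by ToyWorkspace


class ToyWorkspace:
    """Hypothetical stand-in for the workspace that owns the overwrite state."""

    def __init__(self, mets, overwrite_mode=False):
        self.mets = mets
        self.overwrite_mode = overwrite_mode
        mets.workspace = self  # the proposed link back from METS to workspace


def toy_make_file_id(mets, candidate):
    """Return the candidate unchanged in overwrite mode, else make it unique."""
    workspace = mets.workspace
    if workspace is not None and workspace.overwrite_mode:
        # A clash is intended here: the existing file is meant to be replaced.
        return candidate
    file_id, counter = candidate, 1
    while file_id in mets.ids:
        counter += 1
        file_id = "%s_%d" % (candidate, counter)
    return file_id
```

With overwrite mode on, the same ID comes back even though it already exists; with it off, the loop picks a fresh ID, which is exactly the behaviour that produces extra files when re-running a processor.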
Yes. But also, conversely, running against an existing grp without (Of course, that's nothing we can ensure at the level of
Yes. Possible but not a good design.
Yes, but should My best idea for the moment still is:
When IIUC you would replace mets_file = next(self.find_files(ID=ID), None) with mets_file = next(self.find_files(fileGrp=fileGrp, pageId=pageId) and if that is found and
No. Like I said, keep
    if kwargs['mimetype'] == MIMETYPE_PAGE:
        # only one PAGE per page
        file_ = next(self.mets.find_files(fileGrp=file_grp, pageId=kwargs['page_id'], mimetype=kwargs['mimetype']), None)
        if file_:
            kwargs['ID'] = file_.ID # let overwrite_mode decide what to do (re-use / raise)
    ret = self.mets.add_file(file_grp, **kwargs)
I'm not sure about
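The effect of that snippet can be tried out in isolation with a small model. The following is only a sketch under assumptions: SketchMets and sketch_add_file are hypothetical stand-ins, not the OCR-D classes, the value given for MIMETYPE_PAGE is assumed, and raising FileExistsError is just one way to model "raise" when overwriting is not allowed.

```python
MIMETYPE_PAGE = "application/vnd.prima.page+xml"  # assumed value of the constant


class SketchMets:
    """Hypothetical METS model: maps file ID -> (file_grp, page_id, mimetype)."""

    def __init__(self):
        self.files = {}

    def find_files(self, file_grp, page_id, mimetype):
        """Yield the IDs of all files matching group, page and MIME type."""
        for file_id, key in self.files.items():
            if key == (file_grp, page_id, mimetype):
                yield file_id

    def add_file(self, file_grp, file_id, page_id, mimetype, overwrite=False):
        """Add or replace an entry; refuse to replace unless overwrite is set."""
        if file_id in self.files and not overwrite:
            raise FileExistsError("file ID already exists: %s" % file_id)
        self.files[file_id] = (file_grp, page_id, mimetype)
        return file_id


def sketch_add_file(mets, file_grp, file_id, page_id, mimetype, overwrite=False):
    """Re-use the ID of an existing PAGE file for the same group and page."""
    if mimetype == MIMETYPE_PAGE:
        # only one PAGE per page
        existing = next(mets.find_files(file_grp, page_id, mimetype), None)
        if existing is not None:
            file_id = existing  # overwrite decides what happens next (re-use / raise)
    return mets.add_file(file_grp, file_id, page_id, mimetype, overwrite=overwrite)
```

In this model, a second PAGE result for the same page never creates a second entry: it either replaces the first one (overwrite set) or fails loudly (overwrite not set), regardless of which ID the caller proposed.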
You said re-review, which I did, but now you already merged without my suggestions…
I'll amend.
Done in https://github.com/OCR-D/core/tree/861-re-review. Many thanks for the documentation fixes.
First we need to have a test for --overwrite that fails…