simple types in PAGE model are broken #451

Closed
bertsky opened this issue Feb 28, 2020 · 3 comments · Fixed by #457

bertsky commented Feb 28, 2020

This is a regression from #437:

>>> regions[0].get_type()
'page-number'
>>> isinstance(regions[0].get_type(), str)
True
>>> isinstance(TextTypeSimpleType.PAGENUMBER, str)
False
>>> TextTypeSimpleType.PAGENUMBER == 'page-number'
False
>>> regions[0].set_type(TextTypeSimpleType.PAGENUMBER)
>>> regions[0].get_type()
<TextTypeSimpleType.PAGENUMBER: 'page-number'>
>>> str(TextTypeSimpleType.PAGENUMBER)
'TextTypeSimpleType.PAGENUMBER'

The latter is also what is used for XML serialization. This in turn produces invalid PAGE output when using ocrd-tesserocr-segment-region, which afterwards, even without validation, triggers a flood of error messages of the following form:

Warning: Value "TextTypeSimpleType.HEADING" near line 56 does not match xsd enumeration restriction on TextTypeSimpleType
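
For illustration, the behaviour shown above can be reproduced without OCR-D at all. The following is a minimal sketch, assuming the regenerated simple types are plain `enum.Enum` subclasses (the REPL output above is consistent with that). The class names `PlainTextType` and `StrTextType` and the helper `to_xml_value` are hypothetical and not part of the PAGE model; the helper only shows one possible way to obtain the schema value when serializing.

```python
from enum import Enum


class PlainTextType(Enum):
    """Hypothetical stand-in for the generated TextTypeSimpleType."""
    PAGENUMBER = 'page-number'
    HEADING = 'heading'


class StrTextType(str, Enum):
    """Same members, but mixing in str so members compare equal to strings."""
    PAGENUMBER = 'page-number'
    HEADING = 'heading'


def to_xml_value(value):
    """Return the text to write into an XML attribute.

    Hypothetical helper: unwraps Enum members to their .value instead of
    relying on str(), which yields 'ClassName.MEMBER' for a plain Enum.
    """
    return value.value if isinstance(value, Enum) else str(value)


# A plain Enum member is not a str and does not compare equal to its value.
print(isinstance(PlainTextType.PAGENUMBER, str))
print(PlainTextType.PAGENUMBER == 'page-number')

# str() on a plain Enum gives the member name, not the schema value.
print(str(PlainTextType.HEADING))

# With the str mixin, comparison against plain strings works again.
print(StrTextType.PAGENUMBER == 'page-number')

# Unwrapping .value yields the schema value for either flavour.
print(to_xml_value(PlainTextType.HEADING))
```

Either approach (a `str` mixin, or unwrapping `.value` at serialization time) would keep existing string comparisons or the XML output intact, but which one fits the generated code is a design decision this sketch does not settle.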

This is super-urgent. I recommend doing a revert release first and then starting a proper investigation.

bertsky commented Mar 10, 2020

@kba can you please revert 3a0a3a8 from #437 and make a new release, so we at least have a working master?

kba commented Mar 12, 2020

3a0a3a8 reverted as a quickfix in v2.4.3. That reopens the issue with @conf but you're mitigating that already so this is the least worst solution. Will need to revisit to properly fix and not introduce more regressions.

bertsky commented Mar 13, 2020

> 3a0a3a8 reverted as a quickfix in v2.4.3. That reopens the issue with @conf but you're mitigating that already so this is the least worst solution. Will need to revisit to properly fix and not introduce more regressions.

19afb8d is not a true revert of 3a0a3a8, and unfortunately it does not fix this!

The reason seems to be that you ran generateDS again with the bad new version (instead of the good old one): the commit shows no difference other than the date (the version and the code stay the same), whereas those are exactly the changes we want/need to revert.
