Multiple Shape elements in one TextLine #42
Former CR issue #22 Allow shape element usage (IMPACT)
Hi jlerouge, thanks for your post, and sorry for the delayed response.
In historical papers, something that belongs to the text line from a content point of view may be placed elsewhere (e.g. it may even be written on the outer border). I do not remember whether we discussed this specific scenario, but I also do not see a critical problem with it. This is the same as for "PageSpaceType", which also allows multiple sub-elements (maxOccurs="unbounded"). We will discuss this once more on the board. Perhaps we will extend the annotation to prevent misunderstanding but keep the possibility described above.
I think this is a mistake on our part and it should be fixed as proposed above.
Definitely. As stated above, it is inconsistent across the different types.
I agree, this issue is a bug in the schema. But because allowing multiple shapes for a text line is not in line with the described use cases/specifications, I think we can fix it in the next minor release. This bug brings me to the question of creating test cases. Usually, when I find a bug, I create a test case that can be run in a regression test. Would it be worthwhile to create test cases for ALTO as well?
@evelienket Excellent idea about the test cases. Perhaps we can use Schematron for this.
There are four levels where the Shape element can be used: PageSpaceType, BlockType, TextLine, StringType.
Proposed fix for PageSpace:
Proposed fix for TextLine:
Next step is to add XML files with correct and incorrect Shape elements, and a new version of the XSD.
Fix is included in version 4-0 (now in draft status for public review).
Fixed in v4.0.
Hello,
Considering this change between ALTO v3.0 and v3.1:
I guess this is related to the following entry in the v3.1 changelog:
I see a problem here: multiple Shape elements can be direct children of a TextLine. According to the schema, the following constructions are allowed:
Ex. 1: No Shape element in the TextLine ✅
Ex. 2: One Shape element at the beginning of the TextLine ✅
Ex. 3: One Shape element before each String element of the TextLine ❗
In the 3rd situation, which Shape element should be selected as the correct shape of the line?
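To make the ambiguity concrete, here is a minimal Python sketch that flags any TextLine with more than one direct Shape child. It is an illustration only: the function name `ambiguous_textlines`, the `SAMPLE` fragment and its IDs are hypothetical, and the fragment is deliberately simplified (no namespace, no coordinates, empty Shape elements), so it is not a valid ALTO file.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop a namespace prefix, e.g. '{uri}TextLine' -> 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def ambiguous_textlines(xml_text):
    """Return the IDs of TextLine elements with more than one direct Shape child."""
    root = ET.fromstring(xml_text.strip())
    flagged = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Only direct children count; Shape elements nested deeper are ignored.
        shape_count = sum(1 for child in elem if local_name(child.tag) == "Shape")
        if shape_count > 1:
            flagged.append(elem.get("ID", "?"))
    return flagged


# Hypothetical, simplified fragment mirroring Ex. 1 to Ex. 3 (assumption: not real ALTO).
SAMPLE = """
<TextBlock>
  <TextLine ID="ex1"><String CONTENT="a"/><String CONTENT="b"/></TextLine>
  <TextLine ID="ex2"><Shape/><String CONTENT="a"/><String CONTENT="b"/></TextLine>
  <TextLine ID="ex3"><Shape/><String CONTENT="a"/><Shape/><String CONTENT="b"/></TextLine>
</TextBlock>
"""

if __name__ == "__main__":
    print(ambiguous_textlines(SAMPLE))
```

A consumer that needs a single line outline has no rule for picking one Shape in the flagged case, which is the point of the question above.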
I suggest that TextLine can have at most one Shape child element, at the beginning of the sequence, like this: