Position for rotated text #59
This is a good question, and I am hoping one of my Board brethren with more experience using ALTO with rotated blocks can weigh in. I believe that HPOS,VPOS refer to the center of the block, and that the rotation is then applied when the rotation attribute is used.
Please allow me to weigh in. I have just finished tackling those issues for PAGE-XML within OCR-D, where with (That discussion aims to solve not just the particular problem of rotation but the wider issues of relative coordinates when using binary image data for segments at each step in the hierarchy – blocks, lines, words –, which is possible to represent in PAGE-XML via Let me start off my answer with a quote from the spec. In the
So this merely informs about the skew of the binary image data within the annotated region (being described by the bounding box with Therefore, yes @silviu22, your block D has its HPOS/VPOS at P4 and its WIDTH/HEIGHT is W2/H2 (referring to your drawing – your verbalization is somewhat unfortunate, because it describes W2/H2 as the width/height of the smallest rectangle of the rotated text; surely, you are referring to "rotated" as rotated in the image, but that's usually called "skewed", whereas "rotated" is the respective countermeasure). This is not a big deal. It only starts to get complicated when we extract and annotate a binary, cropped and deskewed image for the block (via It is in this detail that the comment by @artunit makes some sense, but happens to be wrong:
No, @cneud, as you can see, I answered myself here.
Thank you @bertsky and @artunit for the clarification. To me, skewed text was distorted text, like italic text. (I assumed skewed text is still written horizontally, but the top of the text is dragged left or right by a certain amount, the same way italic text leans to the right.) But if you prefer the term "skewed" instead of "rotated", that is fine with me. It will take a fair amount of calculation to draw this skewed text using the coordinates of the bounding rectangle. But this is fine as long as it's clear.
I am a little confused about the coordinates and width/height of rotated text.
I believe HPOS,VPOS are the (x,y) coordinates of the top-left corner of the text block. Also, WIDTH/HEIGHT seem to be the dimensions of the bounding rectangle containing the whole text.
This seems obvious for normal (horizontal) text. Is this the case for rotated text as well?
For example, when text is rotated by 90 degrees, the old width becomes the height and the old height becomes the width. To describe the question a little better, I came up with 4 cases:
Please take a look at this file: Text Position.pdf
In that file, (x,y) is HPOS,VPOS for a particular word. And W/H is width/height of that word.
I believe the answers are as follows:
Note that if HPOS,VPOS is always the top-left corner of the displayed text, then it has a different meaning for the program that is supposed to display such text.
Case D might be best to explain what I mean. To draw text at 45 degrees, you will typically tell the computer to draw text at 45 degrees starting from point P2 (the baseline). You will usually not tell it to display the text at point P4 (top-left corner). So I would have to do quite a lot of work to deduce point P2 that would draw text at 45 degrees that will have the top-left corner at point P4. It can be done, but it takes some work.
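To make the "some work" concrete, here is a minimal Python sketch of that deduction. It is only an illustration under several assumptions that are not stated in this thread: image coordinates with y pointing down, a positive angle meaning clockwise rotation on screen, and the unrotated text width `w`, height `h`, and `ascent` being known to the renderer (for example from font metrics). The function name `baseline_origin` and all of its parameters are hypothetical.

```python
import math

def baseline_origin(hpos, vpos, w, h, ascent, angle_deg):
    """Return the point where the rotated baseline starts (like P2),
    given the top-left corner (hpos, vpos) of the axis-aligned
    bounding box of the rotated text (like P4).

    Assumptions (hypothetical, not taken from the ALTO schema):
    y grows downward, positive angle_deg is clockwise on screen,
    and w, h, ascent describe the text box before rotation.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)

    def rot(x, y):
        # Rotate a point of the unrotated text box about its top-left corner.
        return x * c - y * s, x * s + y * c

    # Corners of the unrotated box: top-left, top-right, bottom-right, bottom-left.
    corners = [rot(0, 0), rot(w, 0), rot(w, h), rot(0, h)]
    min_x = min(x for x, _ in corners)
    min_y = min(y for _, y in corners)

    # The baseline starts at (0, ascent) in the unrotated box.
    bx, by = rot(0, ascent)

    # Shift so that the bounding box's top-left corner lands on (hpos, vpos).
    return hpos + bx - min_x, vpos + by - min_y
```

With an angle of 0 this reduces to `(hpos, vpos + ascent)`, which is the familiar horizontal case. Note that `w` and `h` here are the unrotated dimensions, not the bounding-box WIDTH/HEIGHT.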
So, to recap, can someone confirm that HPOS,VPOS for case D (text at 45 degrees) is point P4? Also, for case D (45 degrees), there are two possible pairs of values that can be considered width/height:
I believe the WIDTH/HEIGHT values that we are supposed to write are W2, H2 (smallest rectangle containing the rotated text).
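For reference, the relation between the two pairs can be sketched in a few lines of Python. This is a rough illustration, assuming the text occupies a plain rectangle of unrotated size `w` by `h` that is rotated by `angle_deg`; the function name `rotated_bbox_size` is made up for this example.

```python
import math

def rotated_bbox_size(w, h, angle_deg):
    """Width and height of the smallest axis-aligned rectangle that
    contains a w-by-h rectangle rotated by angle_deg (like W2, H2)."""
    a = math.radians(angle_deg)
    c, s = abs(math.cos(a)), abs(math.sin(a))
    return w * c + h * s, w * s + h * c
```

At 90 degrees this simply swaps width and height, which matches the observation above. At exactly 45 degrees both results are equal, so the unrotated `w` and `h` cannot be recovered from W2/H2 and the angle alone.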