TextMetrics.advances should define more details #4026
Comments
cc @whatwg/canvas |
See also #3994. |
Glyphs are an implementation detail of text drawing libraries. I'm worried about people trying to use glyphs as semantic information. For consistency with the rest of the Web Platform, these should be UTF-16 code units.
This is almost certainly a mistake. I've never heard of anyone wanting this information, and we shouldn't force web developers to subtract in order to get what they really want.
We already have to solve this problem for text selection with a mouse. We should use whatever we use for that. We should also have the invariant that the sum of all the advances is the width of the entire string. (So, e.g. combining marks get a 0 advance)
As far as I know, every text engine does Bidi reordering before looking up characters in fonts. In order to be implementable in existing engines, we should make this visual order. |
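The invariant proposed above (per-code-unit advances that sum to the width of the whole string, with combining marks contributing 0) can be sketched as follows. This is only an illustration under assumed inputs: `cumulativePositions`, `totalWidth`, and the sample numbers are hypothetical and not part of any canvas API.

```typescript
// Hypothetical helpers for illustration; not part of TextMetrics or any spec.

// Turn per-code-unit advances into the x offset at which each code unit starts.
function cumulativePositions(advances: number[]): number[] {
  const positions: number[] = [];
  let x = 0;
  for (const a of advances) {
    positions.push(x);
    x += a;
  }
  return positions;
}

// Under the proposed invariant, this equals the measured width of the string.
function totalWidth(advances: number[]): number {
  return advances.reduce((sum, a) => sum + a, 0);
}

// Made-up advances for "e", U+0301 (combining acute accent), "x".
// The combining mark gets a 0 advance, as suggested in the comment above.
const sampleAdvances = [10, 0, 8];
const sampleStarts = cumulativePositions(sampleAdvances);
const sampleWidth = totalWidth(sampleAdvances);
```

With this shape an author who wants cumulative offsets can derive them by summing, rather than having to subtract neighbouring entries to recover individual advances.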
I agree. We should fix Houdini Font Metrics API as well, if consensus. One edge case is when one code point is shaped to multiple glyphs. Maybe we can put the sum of advances of such glyphs?
Agree (I assume you meant to prefer each advance, not total from index 0, correct?)
I believe mouse selection is not defined in the spec, is that correct? Are there any definitions we can use for this purpose? @annevk @domenic
Good point, agree.
Agree it's the most feasible to implement. I don't know how we can present advances in the logical order. On the other hand, if the intention of |
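For the edge case raised in this comment, where one code point is shaped to multiple glyphs, summing the glyph advances could look roughly like the sketch below. It assumes the shaper reports, for each glyph, the code-unit index it came from (called `cluster` here, a name borrowed loosely from shaping libraries); the `ShapedGlyph` type and `advancesPerCodeUnit` function are hypothetical.

```typescript
// Hypothetical shaper output: each glyph knows its advance and the index of
// the code unit it was produced from. This is an assumed data model.
interface ShapedGlyph {
  advance: number;
  cluster: number;
}

// Sum glyph advances per source code unit. Code units that produced no glyph
// of their own keep an advance of 0.
function advancesPerCodeUnit(textLength: number, glyphs: ShapedGlyph[]): number[] {
  const result = new Array<number>(textLength).fill(0);
  for (const g of glyphs) {
    result[g.cluster] += g.advance;
  }
  return result;
}

// Made-up example: the code unit at index 1 is shaped to two glyphs.
const shapedExample: ShapedGlyph[] = [
  { advance: 6, cluster: 0 },
  { advance: 4, cluster: 1 },
  { advance: 3, cluster: 1 },
  { advance: 7, cluster: 2 },
];
const summedAdvances = advancesPerCodeUnit(3, shapedExample);
```

Because the sums are accumulated by `cluster` rather than by glyph order, the result does not depend on whether the glyph list is in visual or logical order, although it says nothing about where each piece is drawn.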
Does it have to be spec'ed? Advances are already platform / engine / font-specific. Similarly, the mapping from glyph index -> character index has to survive font shaping, which is lossy (consider the Contextual Glyph Substitution Subtable inside the MORX table in AAT fonts). This mapping definitely shouldn't be spec'ed.
If there is demand, perhaps we can work with the ECMA-402 to provide a UBA? Both visual and logical orders have value for different purposes; we should start with the easy one and open it up to the more complicated one if necessary. |
ISTM that an array of glyph advances in visual order, indexed by code units into the character string, is a fundamentally broken API, and if authors try to build things on top of it they'll end up with code that fails in peculiar ways when faced with complex-script text. It's not just about bidi reordering; what about Indic-style rearrangement of glyphs such as vowels that appear to the left of the base character? When the shaping engine reorders the glyphs corresponding to "hindi" into the visual order "ihndi", how will a client know that the first element of |
@jfkthame I'm not sure I understand why "an array of glyph advances in visual order, indexed by code units into the character string" is fundamentally broken with your example. Could you please clarify that? In the example that you gave, "hindi" (and let's assume that each glyph here has 10 logical units of advance and we end up with 5 glyphs), if there's a rearrangement to "ihndi", the returned advances would be

It seems some of those TextMetrics threads diverged a bit, with people guessing what the original intent was. It's totally my fault for not making it much clearer in the original spec what it should have been.

One of the original motivations for this API was to solve things like detecting cursor position, i.e., to answer the question "Where would the editing cursor have to be to sit at the left of the glyph associated with this character?" This is exactly what @litherum hints at when they said "We already have to solve this problem for text selection with a mouse. We should use whatever we use for that." We are actually using the same information (more precisely, text edit cursor selection, but they are the same).

Implementation-wise (and this was not in the spec text, which is all my fault), most of the issues that I've seen brought up here were addressed, but they didn't end up in the spec, as I wrongly assumed they were implementation details. For example, @annevk brought up on the other thread "What happens if multiple code points get rendered as a single glyph?". In this case, we return the same advance for both code points.

It's possible that we forgot some cases that should be addressed and the spec needs some clarification (and maybe even directly address some of the cases brought up here? Although I'd argue that WPT could be better for this, but oh well...). |
OK, I think that makes it clear that "advances" is the wrong name for this array. Those aren't the advances of the characters (or glyphs); they're positions. So if a client wants to use this information to draw an underline below the first character of the text ("h"), how should it go about this? Draw a line from positions[0] to positions[1]? I expect that's what most authors would instinctively write; but it'll be wrong. What if a single character results in multiple rendered glyphs? What if those glyphs are non-contiguous in the resulting visual order? Suppose the first "i" of "hindi" renders as two glyphs (let's call them If this array were renamed
This means a client cannot distinguish between a pair of spacing characters that happen to ligate and a base letter followed by a non-spacing mark, which ISTM is a useful distinction when dealing with issues such as caret positioning or selection highlighting. |
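The underline question above can be made concrete with a small sketch. It assumes the array holds the left-edge x position of each character, indexed by code unit in logical order, and uses illustrative numbers for "hindi" drawn as "ihndi" with 10 units per glyph. These values are an assumption made for this example and are not the values given in the earlier comment; `naiveSpan` and `explicitSpan` are hypothetical helpers.

```typescript
type Span = [number, number];

// Naive reading: treat consecutive entries as the two edges of character i.
function naiveSpan(positions: number[], i: number): Span {
  return [positions[i], positions[i + 1]];
}

// Alternative: carry an explicit width per character next to its position.
function explicitSpan(positions: number[], widths: number[], i: number): Span {
  return [positions[i], positions[i] + widths[i]];
}

// Assumed left edges for h, i, n, d, i when the first "i" is drawn leftmost.
const hindiPositions = [10, 0, 20, 30, 40];
const hindiWidths = [10, 10, 10, 10, 10];

// Underline for "h" (index 0) under each reading.
const naiveH = naiveSpan(hindiPositions, 0);
const explicitH = explicitSpan(hindiPositions, hindiWidths, 0);
```

Under these assumptions the naive reading runs backwards from 10 to 0 and covers the wrong glyph, while the explicit-width reading covers 10 to 20. Neither reading handles a character whose glyphs are non-contiguous, which would need more than one span per character.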
That's a good question. The actual algorithm for doing that would probably be "for LTR, draw from
I was not aware that we had non-contiguous multiple glyphs from the same character. I'm almost sure that Chrome doesn't handle this at all, not sure about Safari and Firefox. Do you have a real text example of this that we could test against?
Sorry. I misstated that. If two code points get rendered as a single glyph, there are two options: if they are separate Unicode graphemes, we return the linear interpolation of the advance of the glyph for each code point (e.g. |
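One possible reading of the "linear interpolation" case described above is that a ligature glyph's advance is divided evenly among the graphemes it covers. The sketch below shows that reading only; it is not taken from any browser implementation, and `interpolateLigature` and its parameters are hypothetical.

```typescript
// Hypothetical helper: spread one ligature glyph's advance evenly over the
// graphemes it represents and return the x position assigned to each.
function interpolateLigature(
  startX: number,
  glyphAdvance: number,
  graphemeCount: number,
): number[] {
  const step = glyphAdvance / graphemeCount;
  return Array.from({ length: graphemeCount }, (_, i) => startX + i * step);
}

// Made-up numbers: a two-grapheme ligature, 12 units wide, starting at x = 20.
const ligaturePositions = interpolateLigature(20, 12, 2);
```

An even split is a guess about where the caret should sit inside the ligature; some fonts carry explicit ligature caret data, which this sketch ignores.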
This occurs in scripts such as Malayalam: the sequence AFAIK, most (all?) browsers currently behave as if this entire cluster were a single glyph, for selection/cursor placement purposes; but it's not, it is three distinct glyphs (and none of them are zero-width, fwiw). I think a canvas API client should be able to determine things like this, as the point of using canvas is (at least in part) to give the author low-level control over exactly what/how they're drawing. Perhaps we should first be creating canvas APIs to draw and measure glyphs (as opposed to characters), along with APIs to access the text-shaping process (mapping from a string of characters, with associated font/style information, to an array of glyphs and positions). |
This is what the web author would need to implement text selection. (But only if the advances are layout advances, not paint advances.) EDIT: Yeah, UBA makes a mess of this. You're right, it's pretty broken. |
This reverts commit 7711a1f. As discussed in #3995, these changes were made prematurely without appropriate implementer sign-off. Since then, a plethora of issues around the changes here have been opened up (e.g. #3994, #4023, #4026, #4030, #4033, #4034). We revert these changes until a more complete and agreed-upon specification can replace them. Closes #3995.
TextMetrics.advances is a recent addition to the spec. It looks very useful, but needs a few more details.
The name "advance" suggests it's an advance of each glyph. Should it be so, or should the name change if authors want cumulative widths up to each index? Note this member is also defined in Font Metrics API.
Opinions appreciated: @litherum @FremyCompany @dbaron @jfkthame @fserb @domenic @eaenet