Language and direction metadata #36

xfq · 2021-09-19T04:04:45Z

I just read the three documents of this incubation experiment. Leveraging the power of the CSS layout engine sounds like a useful way to style text in Canvas.

I wonder if there is a way to associate language and direction metadata with FormattedText and/or FormattedTextRun? See string-meta for more information.


travisleithead · 2022-02-02T19:03:47Z

PR #39 includes lang metadata as recommended for string annotations in the link you provided. For 'dir', I'm confident that relying on CSS direction will fill that need without a separate value for 'dir'.

travisleithead · 2022-02-14T22:34:40Z

#39 has landed. I think we are covered from the dir/lang side of things.

aphillips · 2022-03-06T17:28:38Z

In reviewing the above thread, I find the text here, which says:

No additional work is needed from web developers to support bidi text. At format time, bidi analysis is done on the input text which creates internal bidi runs if necessary.

This is not correct: the bidi algorithm needs help from content authors in order to produce the correct results.

I do agree that CSS direction can be used as the attribute for associating the direction with a FormattedText paragraph and as the bearer of directional metadata within a paragraph. However, the document is bereft of examples and the statement about bidi is misleading. Every paragraph has a "base paragraph direction" necessary for computing directional runs and this should be called out. Bidi analysis proceeds from this base direction. Detection via "first strong" is often wrong.
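
To illustrate why "first strong" detection can go wrong, here is a minimal Python sketch of the heuristic. It is a simplification and not the full paragraph-level rules of the Unicode Bidirectional Algorithm (it does not skip isolated runs, for example), and the function name `first_strong_direction` is made up for this illustration.

```python
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Guess a base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found (e.g. only digits or punctuation).
    return default


# A Hebrew sentence that happens to start with a Latin-script word.
# The \u escapes spell the Hebrew words; the paragraph should be RTL overall.
hebrew_sentence = "HTML \u05d4\u05d9\u05d0 \u05e9\u05e4\u05ea \u05e1\u05d9\u05de\u05d5\u05df"

print(first_strong_direction(hebrew_sentence))
print(first_strong_direction("\u05e9\u05dc\u05d5\u05dd HTML"))
```

The first call reports `ltr` even though the sentence is Hebrew, because the leading "HTML" is the first strong character run. Only explicit metadata from the author can correct that.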

In addition, runs of text within a paragraph often need to be "spanned" with a direction in order to get the right results. This doesn't appear to be accounted for in FormattedText. We have some examples here.
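
As a plain-text analogue of that kind of spanning, the sketch below wraps a run in Unicode directional isolate controls (U+2066 LRI, U+2067 RLI, U+2068 FSI, each closed by U+2069 PDI). This is not the FormattedText API, only an illustration of what it means to span a run with a direction; the helper name `isolate_run` is hypothetical.

```python
# Unicode directional isolate controls (UAX #9).
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

_OPENERS = {"ltr": LRI, "rtl": RLI, "auto": FSI}


def isolate_run(text, direction):
    """Wrap text so the bidi algorithm treats it as an isolated run."""
    try:
        opener = _OPENERS[direction]
    except KeyError:
        raise ValueError(f"unknown direction: {direction!r}") from None
    return opener + text + PDI


# An RTL name followed by a version number inside an English sentence.
# Without isolation the digits can end up on the wrong side of the name.
name = "\u05e2\u05d1\u05e8\u05d9\u05ea"  # Hebrew word, spelled with escapes
sentence = "Installed " + isolate_run(name, "rtl") + " 2.0 today."
print([hex(ord(c)) for c in sentence if c in (LRI, RLI, FSI, PDI)])
```

In a markup-based or run-based format the same effect would normally come from direction metadata on the run rather than from control characters embedded in the string.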

I'll also paste a couple of screenshots to exemplify the need for direction in-line. These are using HTML to mark up the text, but you can imagine how FormattedText would need similar "spanning" within the text.

Here's the "badly styled" paragraph (no added direction):

Here's the fixed version:

xfq · 2022-03-11T04:52:14Z

In addition to Addison's comments above, the W3C Internationalization WG found some of the terminology in the documents here to be inaccurate. What would be the best way to engage you? Would you prefer PRs? Or issues?

travisleithead · 2022-03-26T06:56:05Z

@aphillips thanks so much for the review and your comments. I'm looking forward to making these explainers so much better as a result. Sounds like it would be good to add an example that shows why the bidi algorithm sometimes needs help from authors, and that emphasizes CSS direction (and maybe the unicode-bidi property?) as important components of that help.

I think the spanning you describe is fully possible with this proposal--i.e., I should be able to translate your above example into the Formatted Text input in roughly the same way (and it should produce the same result, given it's ultimately processed by the same layout/rendering pipeline).

I would like to know more about how the "base paragraph direction" is established. In HTML Canvas, for example, when a JavaScript string is rendered to the canvas with fillText(), how is this base paragraph direction chosen? Is it inherited into the Canvas from elsewhere in the DOM? For the HTML parser, how does it establish it? Does it ultimately derive from language or network hints? I'm very curious. This seems related to #49 as another default we need to think about.

@xfq I would welcome any help you can offer on improving the terminology. PRs will be the fastest way to suggest improvements. Looking forward to any help you can provide.

fantasai · 2023-11-10T02:49:28Z

"For the HTML parser, how does it establish it?"

See

The directionality of an element, as established by HTML, gets mapped to the direction property in CSS, which, when set on a block container, sets the base direction for the inline formatting context it contains.

Wrt specifying direction, btw, I think it would be better to have a dir property analogous to HTML's dir attribute, and parallel to the lang property, directly on the text object, rather than relying on direction and unicode-bidi in the styles. There are several reasons for this:

Matches best practices for HTML.
Much easier for authors to get it right (because for inlines, you have to set both direction and unicode-bidi correctly to get an effect).
Binds the content and directionality together separately from styling, so that devs don't accidentally overwrite directionality info when styling their content.
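
To make the second reason concrete, here is a rough sketch of what a single `dir` value would have to expand to in CSS terms, loosely modeled on how HTML's `dir` attribute is rendered. The function `dir_to_css` and its `inline` parameter are hypothetical and not part of the FormattedText proposal, and the sketch assumes only the `ltr` and `rtl` values.

```python
def dir_to_css(dir_value, inline=True):
    """Expand a dir-like value into the CSS declarations it implies."""
    if dir_value not in ("ltr", "rtl"):
        raise ValueError(f"unsupported dir value: {dir_value!r}")
    declarations = [f"direction: {dir_value}"]
    if inline:
        # On an inline box, direction has no bidi effect unless
        # unicode-bidi is also set to something other than normal.
        declarations.append("unicode-bidi: isolate")
    return "; ".join(declarations)


print(dir_to_css("rtl"))                # inline run
print(dir_to_css("rtl", inline=False))  # paragraph-level base direction
```

With a dedicated `dir` property, an author would supply one value and the implementation would do this expansion, instead of the author having to remember both declarations in the style string.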

xfq added the i18n-tracker label ("Group bringing to attention of Internationalization, or tracked by i18n but not needing response") Sep 19, 2021

w3cbot mentioned this issue Sep 28, 2021

Language and direction metadata w3c/i18n-activity#1418

Open

travisleithead closed this as completed Feb 14, 2022

aphillips reopened this Mar 9, 2022
