Language and direction metadata #36

Open
xfq opened this issue Sep 19, 2021 · 6 comments
Labels
i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.

Comments

@xfq
Contributor

xfq commented Sep 19, 2021

I just read the three documents of this incubation experiment. Leveraging the power of the CSS layout engine sounds like a useful way to style text in Canvas.

I wonder if there is a way to associate language and direction metadata with FormattedText and/or FormattedTextRun? See string-meta for more information.
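As a rough illustration of the idea behind string-meta, a string can travel together with its language and base direction instead of as a bare value. The field names (`value`, `lang`, `dir`) and the helper below are assumptions made for this sketch; they are not part of the FormattedText proposal.

```javascript
// Hypothetical shape for a localizable string: the text plus its metadata.
// Field names are illustrative only, not FormattedText API.
function makeLocalizedString(value, lang, dir) {
  const allowedDirs = ["ltr", "rtl", "auto"];
  if (!allowedDirs.includes(dir)) {
    throw new RangeError(`dir must be one of ${allowedDirs.join(", ")}`);
  }
  // Freeze so the text and its metadata cannot drift apart later.
  return Object.freeze({ value, lang, dir });
}

// "\u05E9\u05DC\u05D5\u05DD" is the Hebrew word "shalom".
const greeting = makeLocalizedString("\u05E9\u05DC\u05D5\u05DD", "he", "rtl");
console.log(greeting.lang, greeting.dir);
```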

@xfq xfq added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Sep 19, 2021
@travisleithead
Member

PR #39 includes lang metadata as recommended for string annotations in the link you provided. For 'dir', I'm confident that relying on CSS direction will fill that need without a separate value for 'dir'.

@travisleithead
Member

#39 has landed. I think we are covered from the dir/lang side of things.

@aphillips

In reviewing the above thread, I find the text here, which says:

> No additional work is needed from web developers to support bidi text. At format time, bidi analysis is done on the input text which creates internal bidi runs if necessary.

This is not correct: the bidi algorithm needs help from content authors in order to produce the correct results.

I do agree that CSS direction can be used as the attribute for associating the direction with a FormattedText paragraph and as the bearer of directional metadata within a paragraph. However, the document is bereft of examples and the statement about bidi is misleading. Every paragraph has a "base paragraph direction" necessary for computing directional runs and this should be called out. Bidi analysis proceeds from this base direction. Detection via "first strong" is often wrong.
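To make the "first strong" problem concrete, here is a minimal sketch of such a detector. It is an approximation, not the Unicode Bidirectional Algorithm: it assumes that letters in the Hebrew and Arabic scripts are strong RTL and that every other letter is strong LTR. The function name is made up for this example.

```javascript
// Approximation: treat letters in the Hebrew and Arabic scripts as strong RTL
// and every other letter as strong LTR. The real Unicode Bidirectional
// Algorithm uses per-character bidi classes, which JavaScript regular
// expressions do not expose.
const STRONG_RTL = /[\p{Script=Hebrew}\p{Script=Arabic}]/u;
const LETTER = /\p{L}/u;

function firstStrongDirection(text) {
  for (const ch of text) {
    if (!LETTER.test(ch)) continue; // digits, spaces, punctuation are skipped
    return STRONG_RTL.test(ch) ? "rtl" : "ltr";
  }
  return null; // no strong character found
}

// "HTML" followed by Hebrew words: a sentence whose intended base direction
// is RTL, but whose first strong character is the Latin letter "H".
const sentence =
  "HTML \u05D4\u05D9\u05D0 \u05E9\u05E4\u05EA \u05E1\u05D9\u05DE\u05D5\u05DF";
console.log(firstStrongDirection(sentence)); // "ltr"
```

The detector reports LTR for a paragraph that should be laid out RTL, which is why an explicit base direction supplied with the content is needed.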

In addition, runs of text within a paragraph often need to be "spanned" with a direction in order to get the right results. This doesn't appear to be accounted for in FormattedText. We have some examples here.

I'll also paste a couple of screenshots to exemplify the need for direction in-line. These are using HTML to mark up the text, but you can imagine how FormattedText would need similar "spanning" within the text.

Here's the "badly styled" paragraph (no added direction):

image

Here's the fixed version:

image
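Where markup is not available, the plain-text analogue of the "spanning" described above is to wrap a run in Unicode directional isolate characters. The sketch below shows only that idea; the helper name is hypothetical, and it does not describe how FormattedText itself would accept a directional run.

```javascript
// Unicode directional isolate controls (real code points).
const LRI = "\u2066"; // LEFT-TO-RIGHT ISOLATE
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

const OPENERS = new Map([
  ["ltr", LRI],
  ["rtl", RLI],
  ["auto", FSI],
]);

// Hypothetical helper: wrap a run so that it is treated as an isolated unit
// with the given direction ("ltr", "rtl", or "auto" for first-strong).
function isolateRun(text, dir) {
  const opener = OPENERS.get(dir);
  if (opener === undefined) throw new RangeError(`unknown dir: ${dir}`);
  return opener + text + PDI;
}

const title = "\u05E9\u05DC\u05D5\u05DD"; // a Hebrew run inside an English sentence
const message = "Title: " + isolateRun(title, "rtl") + " (3 reviews)";
console.log(JSON.stringify(message));
```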

@aphillips aphillips reopened this Mar 9, 2022
@xfq
Contributor Author

xfq commented Mar 11, 2022

In addition to Addison's comments above, the W3C Internationalization WG found some of the terminology in the documents here to be inaccurate. What would be the best way to engage you? Would you prefer PRs? Or issues?

@travisleithead
Member

@aphillips thanks so much for the review and your comments. I'm looking forward to making these explainers so much better as a result. Sounds like it would be good to add an example that shows why the bidi algorithm sometimes needs help from authors, and that emphasizes CSS direction (and maybe the unicode-bidi property?) as important components of that.

I think the spanning you describe is fully possible with this proposal--i.e., I should be able to translate your above example into the Formatted Text input in roughly the same way (and it should produce the same result, given it's ultimately processed by the same layout/rendering pipeline).

I would like to know more about how the "base paragraph direction" is established. In HTML Canvas, for example, when a JavaScript string is rendered to the canvas with fillText(), how is this base paragraph direction chosen? Is it inherited into the Canvas from elsewhere in the DOM? For the HTML parser, how does it establish it? Does it ultimately derive from language or network hints? I'm very curious. This seems related to #49 as another default we need to think about.
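For reference on the Canvas side: the 2D context has a `direction` attribute that takes `"ltr"`, `"rtl"`, or `"inherit"`, and `"inherit"` is the default, so unless script sets it the direction comes from the canvas element (or the document). Below is a minimal sketch of setting it explicitly around one `fillText()` call. The helper name is hypothetical, and `ctx` is assumed to be a `CanvasRenderingContext2D`.

```javascript
// Hypothetical helper: draw one string with an explicit base direction,
// then restore whatever direction the context had before.
function fillTextWithDirection(ctx, text, x, y, dir) {
  const previous = ctx.direction;
  ctx.direction = dir; // "ltr", "rtl", or "inherit"
  try {
    ctx.fillText(text, x, y);
  } finally {
    ctx.direction = previous;
  }
}
```

In a page this would be called as `fillTextWithDirection(canvas.getContext("2d"), text, 10, 40, "rtl")`.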

@xfq I would welcome any help you can offer on improving the terminology. PRs will be the fastest way to suggest improvements. Looking forward to any help you can provide.

@fantasai

> For the HTML parser, how does it establish it?

See

The directionality of an element, as established by HTML, gets mapped to the direction property in CSS, which, when set on a block container, sets the base direction for the inline formatting context it contains.
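A simplified model of that resolution, using plain objects in place of DOM elements. This is a sketch under stated assumptions: `dir="auto"` and `<bdi>` are left out, and the node shape (`dir`, `parent`) and function name are invented for the example.

```javascript
// Simplified model of HTML directionality for plain objects shaped like
// { dir?: "ltr" | "rtl", parent?: node }. dir="auto" and <bdi> are omitted.
function resolveDirectionality(node) {
  for (let current = node; current; current = current.parent) {
    if (current.dir === "ltr" || current.dir === "rtl") return current.dir;
  }
  return "ltr"; // the root falls back to ltr when nothing is specified
}

// The resolved value is what ends up as the CSS `direction` of a block,
// and so as the base direction of the inline formatting context inside it.
const rootNode = { dir: "rtl" };
const bodyNode = { parent: rootNode };
const paragraphNode = { parent: bodyNode };
const quoteNode = { dir: "ltr", parent: paragraphNode };
console.log(resolveDirectionality(paragraphNode), resolveDirectionality(quoteNode));
```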

Wrt specifying direction, btw, I think it would be better to have a dir property analogous to HTML's dir attribute, and parallel to the lang property, directly on the text object, rather than relying on direction and unicode-bidi in the styles. There are several reasons for this:

  • Matches best practices for HTML.
  • Much easier for authors to get it right (because for inlines, you have to set both direction and unicode-bidi correctly to get an effect).
  • Binds the content and directionality together separately from styling, so that devs don't accidentally overwrite directionality info when styling their content.
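To illustrate the second bullet: in HTML, a `dir` attribute of `ltr` or `rtl` on an inline element results in both a `direction` value and `unicode-bidi: isolate`. A `dir`-like property on the text object could perform the same expansion internally. The helper and the run object shape below are hypothetical and not part of the proposal.

```javascript
// Hypothetical expansion of a dir value into the two CSS declarations an
// author would otherwise have to write by hand for an inline run.
function inlineStyleForDir(dir) {
  if (dir !== "ltr" && dir !== "rtl") {
    throw new RangeError(`unsupported dir: ${dir}`);
  }
  return `direction: ${dir}; unicode-bidi: isolate`;
}

// Illustrative run object; the property names are assumptions.
const run = { text: "\u05E9\u05DC\u05D5\u05DD", lang: "he", dir: "rtl" };
console.log(inlineStyleForDir(run.dir));
```

Keeping `dir` as its own field also means later styling changes cannot silently drop the directionality, which is the point of the third bullet.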
