-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i18n: Implement a solution for interpolation of React elements in strings #9846
Comments
We've talked about this in private, but noting that auditing our current strings would help us prioritize this task. |
Agree 👍 I've also been thinking about a syntax like this for a while. Another benefit is that we could connect an
Shouldn't a regular JSX parser (as used by Babel and friends) be helpful here (and thus be feasible by a Babel plugin)? In particular, the task at hand shares some traits with shallow rendering, as provided by Enzyme or
That's reassuring, since the render prop syntax isn't exactly nice 😆
Bit verbose 🙁 but yeah, might be the tersest we can do. |
This is a bit confusing for me: <Interpolate count={ 2 }>
<Interpolate.PluralForm>
Check out this link to my <a href={ url }>website</a>.
</Interpolate.PluralForm>
<Interpolate.PluralForm>
Check out this link to my <a href={ url }>websites</a>.
</Interpolate.PluralForm>
</Interpolate> I'm likely just misunderstanding things, but something like this might be a bit clearer? <Interpolate count={ 2 }>
<Interpolate.SingularForm>
Check out this link to my <a href={ url }>website</a>.
</Interpolate.SingularForm>
<Interpolate.PluralForm>
Check out this link to my <a href={ url }>websites</a>.
</Interpolate.PluralForm>
</Interpolate> |
@nerrad I think this may have been a misunderstanding on my part around >2 plural forms. I believe you're correct that, at least as how we express in English, there would be only the singular and plural form to include in the code (corresponding to References: |
Yeah just singular + plural in the code needed.. |
I think we should also shoot for a goal where there is no html in the translation string (how feasible that is I don't know yet). Otherwise the string translation is inherently fragile. |
The tools for including markup in localized string exist in WordPress PHP, so it's assumed there will be a similar expectation in JavaScript. Our usage in Gutenberg thus far has been extremely limited, which is why I've personally temporarily shelved this as an issue. As one data point, @blowery had done a brief audit of component interpolation in Calypso and had found at least 800+ uses in the codebase. Embedding |
More prior art: WordPress/wordcamp.org@bbe3ce8 |
That commit's a bit messy. Here's the current file with relevant prior art: |
re: singular && plural please keep in mind that some languages use different rules and forms for pluralization, such as:
Here's a page that discusses this: https://developer.mozilla.org/en-US/docs/Mozilla/Localization/Localization_and_Plurals An example from that list is for Icelandic and Macedonian where there is one rule for any number that ends in 1, except for 11 (1, 21, 31, 41, 51, 61...) and another rule for everything else (0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13...). So a simple singular vs plural system is not going to work. Ultimately it would be better to simply use the count to pick the corresponding translation and then have a default to cover all other cases. I can't give a good real world example because I'm a dumb one language kinda guy, so please just use your imagination and pretend that these were valid in something other than english <Interpolate count={ 2 }>
<PluralRule count={ 0 }>
no things for you
</PluralRule>
<PluralRule count={ 1 }>
there is one thing
</PluralRule>
<PluralRule count={ 2 }>
(in Yoda voice) two things there are
</PluralRule>
<PluralRule>
there are %s things
</PluralRule>
</Interpolate> For the Icelandic and Macedonian example from the MDN docs, the (EDIT) |
@iandunn @coreymckrill Very interesting, thanks for sharing! Sounds like we should be able to provide a function like @aduth What do you think of that? Would this something that could be even added to Tannin directly? |
Incoming stream of consciousness: I like that the approach is not specifically tailored to the React, but rather to the generic problem of implementing substitution of non-string values. It does impose some challenge on the implementer to handle an array of parts; it just-so-happens to work pretty well† out of the box as a React children value, though if the goal is to be more generic, it's not immediately obvious what this might look like in other settings. † Pretty well: React may complain about missing
I'm not clear what's proposed to be included where. Functionally this sounds similar to a utility function like At which point, you could have (pending beautification): sprintf = ( str, ...args ) => tokenSplit( str ).reduce( ( result, part, i ) => result + ( i % 2 ? args[ ( i - 1 ) / 2 ] : part ), '' );
Interpolate = ( props ) => tokenSplit( Children.toArray( props.children ).map( ( child, i ) => typeof child === 'string' ? child : '%' + i + '$s' ).join( '' ) ).reduce( ( result, part, i ) => [ ...result, ( i % 2 ? cloneElement( Children.toArray( props.children )[ ( i - 1 ) / 2 ], { key: i } ) : part ) ], [] ); Two things I note:
At this point, I re-read your comment and perhaps you're suggesting that this be put more on the developer to handle? So in their own renders, e.g. function Accusation( props ) {
const { who, where, weapon } = props;
return <div>{ eprintf( __( 'I accuse %1$s in the %2$s with the %3$s!' ), <em>{ who }</em>, <em>{ where }</em>, <em>{ weapon }</em> ) }</div>;
} The benefit here being that no changes are required to string extraction. For both this and
As considered, it only works if we substitute the entirety of the link:
...but then we lose translation (or at least context of the translation of) "website". While I'm having difficulty coming up with a solution, I could imagine one or a sum of parts of:
I think there may be iterative improvements with element substitutions we could find with something like the utilities mentioned in #9846 (comment) without needing to solve all the problems, but I do think we should have some general idea of what a broader solution might look like to have a better sense of how it all fits together. |
I've been thinking/experimenting on this a bit over the weekend and I have some more thoughts to throw up here. So one thing I'm thinking of playing with is introducing new tokens to represent open/close type of elements. So for example, open would be eprintf( __( 'Check out this link to my %1$owebsite%1$c' ), 'a' ); Behind the scenes, the code would take care of creating the element attached to the token (in this case 'a') and adding the content between the const children = [
wp.createElement( '', { key: 0 }, 'Check out this link to my ' ),
wp.createElement( 'a', { key: 1 }, 'website' ),
]; Some benefits:
As an option, we could detect if the argument is an object and use that to provide not only the element but also the props to pass to the element on creation. So it might look something like this: eprintf(
__( 'Check out this link to my %1$owebsite%1$c' ),
{ el: 'a', props: { href: 'https://website.com' } }
); The main question I have here though is how would translators handle the introduction of what on the surface seems unfamiliar tokens. @swissspidy any thoughts on that? |
Ideally we‘d avoid using such placeholders for HTML tags. It‘s hard to grasp for translators for that use case and very easy to make errors. Currently on mobile so I don‘t have resources at hand to point to, but I‘d say from a localization perspective it‘s the last resort IMO. Perhaps @ocean90 can give more feedback from the meta side as well. |
I can definitely see that as a potential downside of it being easier for translators to make errors. But imo that's not a strong argument because any usage of tokens introduces the chance of human errors made by translators. I do agree though that there should be a greater awareness of what an html tag is. However, out of the options presented, this is the one that gets us closest to a solution for the immediate problem at hand with minimal extra modifications made elsewhere. As far as I can tell, other solutions (such as what was proposed by @aduth):
|
@swissspidy and I chatted briefly about this at WordCamp Europe this past weekend. @nerrad , to your third point, I think the original proposal here had some nice advantages from the perspective of usability, though certainly is much more technically challenging to implement. Your proposed solution reminds me quite a bit of the Interpolate-Components prior art I shared in the original comment, albeit using the traditional placeholder syntax instead of a custom |
Yes definitely inspired by that, but with the tweak of keeping syntax that would be at least somewhat familiar to translators. However, it'd be trickier doing nested replacement than if Interpolate-components was used. Also, as you pointed out:
Which I think nixes the idea (I though I had checked if they already had defined meanings, but I obviously didn't :) ). |
I like what the
(although they do allow for subbing in text representations of common white-listed elements provided they have no additional attributes). I'm beginning to think that the goal of ending up with strings presented to translators like this:
Is going to require using
It preserves some context for translators to grasp but still should allow developers to use an interface that is relatively simple. So basically, the issue here is will introducing a new placeholder format in translation strings be a no-go for translations in the wp ecosystem, or is it an acceptable change to limit the complexity of implementation? |
Should note however, I'm still working on an experiment implementing the |
Just an update, I should have something to share this coming week on this. Made some great progress over the weekend :). |
The first experimental proposal for how we could move forward with this is now up: #16374 |
If I may give some context of what we've been doing with i18n-calypso and why I think it's a good approach: By using
This contains two It would be possible to achieve the same with
This is understandable for someone versed in HTML but also confusing because it uses the One additional fact that I like about Last but not least, I think it's helpful to keep defining translatable text as a string. It makes whitespace handling very clear (translatable strings are hash lookups). What about leading whitespace, trailing whitespace, line breaks, indentation? |
With the introduction of As discussed in today's JavaScript chat (link requires registration), we can consider some usability improvements in future tasks. I have created one such task at #18614, proposing the introduction of a Separately, if there are specific changes we should make in Gutenberg to make use of interpolated elements in existing components, those should be opened as separate issues or pull requests. |
Problem: Most any
sprintf
implementation, including that of@wordpress/i18n
, is expected to return a string value. In a React application, this can be problematic if the string contains HTML markup, as the HTML will be escaped, at least without using dangerouslySetInnerHTML, which is discouraged. This can sometimes lead to developers implementing workarounds using concatenation of partial localized strings (e.g. #5767 (comment)), which is not advisable as it cannot be translated accurately in many languages. This is also noted in the i18n for WordPress Developers guide:https://codex.wordpress.org/I18n_for_WordPress_Developers#Best_Practices
Task: Find a way to allow for React elements to be inserted within localized strings safely.
Prior Art:
Additional Considerations:
JSXExpressionContainer
occursBrainstorming:
It would be nice if a developer could simply wrap their string and elements in a component which performs the localization:
There are, of course, complications:
Check out this link to my <a href="%s">website</a>
. We'd need to find a way to associate the JSX expression to the placeholder (and vice-versa, inject the variable value where the placeholder occurs when rendering into the UI).<WebsiteLink />
element?It might be simpler to associate placeholders via a known limited subset of expression values allowed. This could also be used for linting enforcement of best-practices:
Aside: It is be feasible for a custom Babel plugin to transform an element like as written in the first snippet (which has arguably nicer semantics) to this latter format.
To address pluralization, perhaps having a special child component type mapping to the index of the plural form corresponding to the count (
an advantage here is supporting >2 plural forms(correction below)):cc @mcsf @jorgefilipecosta @swissspidy
The text was updated successfully, but these errors were encountered: