Improve Locale.t() to handle context and plurals. #6158

Closed
phaux opened this issue Jan 29, 2020 · 2 comments
Labels
domain:i18n The issue reports a problem with internalization / translation mechanisms package:utils type:improvement This issue reports a possible enhancement of an existing feature.

Comments

phaux commented Jan 29, 2020

📝 Provide a description of the improvement

Support more functionality in the Locale.t() function.

  • We are lacking support for plurals. The API should be similar to the standard ngettext.
  • We decided to support passing context as a separate argument (similar to pgettext). Right now, it looks like context is supported only as part of the msgid, in square brackets.

This will require work in the Locale class and changes in the webpack plugin.

@phaux phaux added type:improvement This issue reports a possible enhancement of an existing feature. domain:accessibility This issue reports an accessibility problem. domain:dx This issue reports a developer experience problem or possible improvement. package:utils labels Jan 29, 2020
@mlewand mlewand added this to the backlog milestone Feb 3, 2020

phaux commented Feb 27, 2020

Summary of the new Locale API

Our new API is heavily inspired by Jed and gettext.js, which are themselves based on the original GNU gettext approach.

const { t, ct, ctq } = this.editor.locale;

// t behaves the same as before.
// It finds a localized message based on the English version and can also perform string interpolation.
console.log( t( 'Hello, World!' ) );
// Interpolation values go into the second argument.
console.log( t( 'Hello, %0!', [ 'World' ] ) );

// ct is used to pass a context with the message to make sure similar messages are unique.
console.log(
  ct( 'Welcome message', 'Hello, World!' )
);
// To perform interpolation, call the format() method on the returned object.
console.log(
  ct( 'Welcome message', 'Hello, %0!' )
    .format( 'World' )
);

// ctq is used for messages with plural forms.
console.log(
  ctq( 'Word count plugin', '# word', '# words', wordCount )
);

We use the following functions for getting messages:

  • t( 'id', [ ...values ] ) – Nothing changed here. It works the same way as the regular gettext function, but keeps the built-in placeholder interpolation so that backwards compatibility is not broken.
  • ct( 'ctx', 'id' ) (stands for "context translation") – Allows specifying the message context. Another common name for this function is pgettext().
  • ctq( 'ctx', 'singular id', 'plural id', quantity ) (stands for "context translation quantity") – Used for handling plural forms. Also known as npgettext(). It also replaces the pound sign (#) with the number provided and formats that number according to the language.
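
The pound sign replacement described for ctq could be sketched roughly as follows. This is only an illustration of the idea, not the actual implementation: formatQuantity is a hypothetical helper name, it only covers the English source strings (no translation lookup), and it assumes the standard Intl.NumberFormat is used to format the number for the given language.

```javascript
// Hypothetical helper (not part of the Locale API): picks the English singular
// or plural source string and replaces every "#" with the formatted quantity.
function formatQuantity( language, singular, plural, quantity ) {
  // English has two forms: singular for exactly 1, plural otherwise.
  const template = quantity === 1 ? singular : plural;

  // Intl.NumberFormat formats the number according to the given language.
  const formatted = new Intl.NumberFormat( language ).format( quantity );

  // A replacer function avoids any special handling of "$" in the replacement.
  return template.replace( /#/g, () => formatted );
}

console.log( formatQuantity( 'en', '# word', '# words', 1 ) );
console.log( formatQuantity( 'en', '# word', '# words', 1234 ) );
```

For a translated message, the choice of the form would depend on the plural rule of the target language rather than on the simple quantity === 1 check used here.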

The new functions (ct and ctq) don't return strings but rather an object with a few methods (similar to Jed):

  • toString() – It's there so that the object can be used as if it were a regular string.
  • format( ...values ) – Performs the string interpolation. Replaces the %0 to %9 placeholders with the given values.
  • formatToParts( ...values ) – Similar to Intl.ListFormat.formatToParts(). Does the string interpolation, but returns an array of all the literal and dynamic parts of the message.
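
To make the shape of the returned object more concrete, here is a minimal sketch of such a message object. It is an assumption-based illustration, not the real implementation: createMessage is a hypothetical name, the part types 'literal' and 'value' are made up for this example, and leaving a placeholder untouched when no value is given for it is also an assumption.

```javascript
// Hypothetical helper: wraps an already translated template in a message object.
function createMessage( template ) {
  // Splits the template into literal chunks and %0-%9 placeholders.
  function formatToParts( ...values ) {
    const parts = [];
    const placeholder = /%(\d)/g;
    let lastIndex = 0;
    let match;

    while ( ( match = placeholder.exec( template ) ) !== null ) {
      if ( match.index > lastIndex ) {
        parts.push( { type: 'literal', value: template.slice( lastIndex, match.index ) } );
      }

      const index = Number( match[ 1 ] );

      if ( index < values.length ) {
        parts.push( { type: 'value', value: values[ index ] } );
      } else {
        // Assumption: a placeholder without a matching value is kept verbatim.
        parts.push( { type: 'literal', value: match[ 0 ] } );
      }

      lastIndex = placeholder.lastIndex;
    }

    if ( lastIndex < template.length ) {
      parts.push( { type: 'literal', value: template.slice( lastIndex ) } );
    }

    return parts;
  }

  return {
    // Lets the object be used where a string is expected.
    toString() {
      return template;
    },
    // Joins all parts back into a single string.
    format( ...values ) {
      return formatToParts( ...values ).map( part => String( part.value ) ).join( '' );
    },
    formatToParts
  };
}

const message = createMessage( 'Hello, %0!' );

console.log( `${ message }` );
console.log( message.format( 'World' ) );
console.log( message.formatToParts( 'World' ) );
```

Building format() on top of formatToParts() keeps the placeholder parsing in one place. The parts array is useful when a dynamic value should be rendered as something other than plain text.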

Read more about the Jed API for comparison.

Changes to translations JSON format

{
  "pl": {
    "PLURAL_FORMS": "nplurals=3; plural=(n == 1) ? 0 : (n < 5) ? 1 : 2",
    "message id": ["localised message", "localised message plural 1", "localised message plural 2"],
    "message context|message id": "localised message"
  }
}

Things to note:

  • A special key exists for the "plural-forms" header.
  • Pipe character separates context and message ID.
  • For plural messages, the value is an array of all the plural forms.
  • The message IDs and contexts are lowercased.
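
As a rough illustration of how a lookup against this format might work, consider the sketch below. The names lookup and getPluralIndex are hypothetical, falling back to the message ID for a missing translation is an assumption, and evaluating the plural expression with new Function is just the shortest way to show the idea (it would only be acceptable for trusted translation files).

```javascript
// Sample data in the format described above.
const translations = {
  pl: {
    PLURAL_FORMS: 'nplurals=3; plural=(n == 1) ? 0 : (n < 5) ? 1 : 2',
    'message id': [ 'localised message', 'localised message plural 1', 'localised message plural 2' ],
    'message context|message id': 'localised message'
  }
};

// Hypothetical helper: evaluates the "plural=" expression for the quantity n.
function getPluralIndex( pluralForms, n ) {
  const match = /plural\s*=\s*([^;]+)/.exec( pluralForms );

  if ( !match ) {
    return 0;
  }

  // Number() also covers boolean expressions such as "n != 1".
  return Number( new Function( 'n', `return ( ${ match[ 1 ] } );` )( n ) );
}

// Hypothetical helper: finds a message by ID, optional context and quantity.
function lookup( language, id, { context, quantity = 1 } = {} ) {
  const messages = translations[ language ] || {};

  // The pipe character separates the context from the message ID; keys are lowercased.
  const key = ( context ? `${ context }|${ id }` : id ).toLowerCase();
  const entry = messages[ key ];

  if ( entry === undefined ) {
    // Assumption: fall back to the original message ID.
    return id;
  }

  if ( !Array.isArray( entry ) ) {
    return entry;
  }

  return entry[ getPluralIndex( messages.PLURAL_FORMS || '', quantity ) ];
}

console.log( lookup( 'pl', 'Message ID', { quantity: 3 } ) );
console.log( lookup( 'pl', 'Message ID', { context: 'Message context' } ) );
```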

Breaking changes 💔

There was functionality for providing the message context in the t function as part of the message ID. The syntax was t( 'message id [context: message context]' ).

We decided to remove that functionality because it wasn't used anywhere. A better way to give the translator more information is to use // translators: comments. They don't make the message unique, though.
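
If any external code still relied on the old syntax, migrating it could look roughly like the sketch below. parseLegacyMessageId is a hypothetical helper written only for this example; it splits an old-style message ID into the two arguments that ct expects.

```javascript
// Hypothetical migration helper: extracts the context from the old
// 'message id [context: message context]' syntax.
function parseLegacyMessageId( legacyId ) {
  const match = /^(.*?)\s*\[context:\s*(.*?)\s*\]$/.exec( legacyId );

  if ( !match ) {
    // No context suffix, so the whole string is the message ID.
    return { context: null, id: legacyId };
  }

  return { context: match[ 2 ], id: match[ 1 ] };
}

const { context, id } = parseLegacyMessageId( 'message id [context: message context]' );

// With the new API, this would become: ct( context, id )
console.log( context, id );
```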

@ma2ciek ma2ciek self-assigned this Mar 30, 2020
@ma2ciek ma2ciek added domain:i18n The issue reports a problem with internalization / translation mechanisms and removed domain:accessibility This issue reports an accessibility problem. domain:dx This issue reports a developer experience problem or possible improvement. labels Apr 10, 2020

ma2ciek commented Jul 15, 2020

The issue was closed by ckeditor/ckeditor5-dev#614.

@ma2ciek ma2ciek closed this as completed Jul 15, 2020
@ma2ciek ma2ciek removed this from the backlog milestone Jul 15, 2020