Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added tests for structurally invalid language tags in DisplayNames.prototype.of() #3844

Merged
merged 1 commit into from
Oct 2, 2023

Conversation

ben-allen
Copy link
Contributor

Previously no test coverage for cases where {type: 'language'}, created tests for structurally invalid tags.

fix #3840

@ben-allen ben-allen requested a review from a team as a code owner June 7, 2023 23:14
Comment on lines 30 to 32
assert.throws(RangeError, function() {
displayNames.of('en-u-hebrew');
}, 'invalid language id -- contains singleton');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on why you think this is invalid? It looks to me like a valid Unicode locale identifier with unicode_language_id "en" and Unicode locale extension sequence "-u-hebrew", which includes the attribute "hebrew" and no keywords.

Copy link
Contributor Author

@ben-allen ben-allen Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When type is "language", code has to be a valid unicode_language_id. See 12.5.1. Although the string in question is a valid unicode_locale_id, a unicode_language_id can only contain unicode_language_subtag, unicode_script_subtag, unicode_region_subtag, and unicode_variant_subtag, so strings with singletons are necessarily invalid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though now that I'm looking at it: doesn't the test in 12.5.1.1.a duplicate the test in IsStructurallyValidLanguageTag? Seems like all it does is save having to do the ASCII-lowercase at the start of IsStructurallyValidLanguageTag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When type is "language", code has to be a valid unicode_language_id. See 12.5.1. Although the string in question is a valid unicode_locale_id, a unicode_language_id can only contain unicode_language_subtag, unicode_script_subtag, unicode_region_subtag, and unicode_variant_subtag, so strings with singletons are necessarily invalid.

👍

Though now that I'm looking at it: doesn't the test in 12.5.1.1.a duplicate the test in IsStructurallyValidLanguageTag? Seems like all it does is save having to do the ASCII-lowercase at the start of IsStructurallyValidLanguageTag.

Let's see... CanonicalCodeForDisplayNames step 1.a ensures that the input is an isolated Unicode language identifier (i.e., a language subtag with optional script subtag or a script subtag, in either case optionally followed by a region subtag and/or any number of variant subtags but NOT by singleton subtags or any other subtags), and step 1.b ensures that it does not use backwards compatibility syntax (IsStructurallyValidLanguageTag step 3 rejecting e.g. "en_GB" and "root" and "Latn-DE" per the respective UTS #35 Part 1 Core, BCP 47 Conformance "backwards compatibility" bullet points) or contain duplicate variant subtags (IsStructurallyValidLanguageTag step 6 rejecting e.g. "de-1996-fonipa-1996") that would both pass the preceding syntax check. So I like the current state of ECMA-402 text, and also the coverage being added in this PR.

Comment on lines 34 to 36
assert.throws(RangeError, function() {
displayNames.of('enxx');
}, 'invalid language subtag (4 letters)');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert.throws(RangeError, function() {
displayNames.of('enxx');
}, 'invalid language subtag (4 letters)');
assert.throws(RangeError, function() {
displayNames.of('enxx');
}, 'invalid language id - starts with a script subtag (4 letters)');

But actually, how does this differ from "tries to use backwards compatibility feature (bare script subtag)" below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, fair! let me take that one out...

@gibson042
Copy link
Contributor

Thanks so much for working on these! ❤️

Copy link
Contributor

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few message consistency suggestions and assertion tweaks, but this looks good. Thanks again!

a. If code cannot be matched by the unicode_language_id Unicode locale nonterminal, throw a RangeError exception.
b. If IsStructurallyValidLanguageTag(code) is false, throw a RangeError exception.
c. Return CanonicalizeUnicodeLocaleId(code).
features: [Intl.DisplayNames-v2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Does this test actually use any features from Intl.DisplayNames-v2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops -- fixed!

@ben-allen ben-allen force-pushed the invalid-language-tag branch 2 times, most recently from 3e51ef9 to 5cd229f Compare August 21, 2023 22:36
@ptomato
Copy link
Contributor

ptomato commented Oct 2, 2023

All comments seem to have been addressed and we have an approval from Richard; I gave it a quick look additionally and it looks good. Merging now, sorry for the long wait.

@ptomato ptomato merged commit 63456ad into tc39:main Oct 2, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing Intl.DisplayNames tests for RangeError in .of function
4 participants