From dc9d83106cada9af507bf37dee3973de97b020fd Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Thu, 1 Jun 2017 14:24:01 -0700
Subject: [PATCH] IDNA: use proposed UTS46 flags to avoid breaking YouTube

Tests: w3c/web-platform-tests#5976.

Fixes #53 and fixes #267 by no longer breaking on hyphens in the 3rd
and 4th position of a domain label. This is known to break YouTube:
r3---sn-2gb7ln7k.googlevideo.com. This is fixed by setting the proposed
CheckHyphens flag to false.

Fixes #110 by clarifying that BIDI and CONTEXTJ checks are to be done
by setting the proposed CheckBidi and CheckJoiners flags to true.

Follow-up #313 is filed to remove the proposed bits once Unicode is
updated. #317 also tracks a potential cleanup.
---
 url.bs | 45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/url.bs b/url.bs
index 28606112..0fa72282 100644
--- a/url.bs
+++ b/url.bs
@@ -278,28 +278,35 @@ U+005C (\), or U+005D (]).

IDNA

-The domain to ASCII given a
-domain domain, runs these steps:
+The domain to ASCII algorithm, given a domain
+domain and optionally a boolean beStrict, runs these steps:

-  1. Let result be the result of running Unicode ToASCII with
-     domain_name set to domain, UseSTD3ASCIIRules set to false,
-     processing_option set to Nontransitional_Processing, and VerifyDnsLength set
-     to false.
+  1. If beStrict is not given, set it to false.
+
+  2. Let result be the result of running Unicode ToASCII
+     with domain_name set to domain, UseSTD3ASCIIRules set to
+     beStrict, CheckHyphens set to false, CheckBidi set to true,
+     CheckJoiners set to true, processing_option set to
+     Nontransitional_Processing, and VerifyDnsLength set to beStrict.
+
+     This and domain to Unicode below are based on a proposed revision. See
+     issue #313.

   3. If result is a failure value, validation error, return failure.

   4. Return result.

-The domain to Unicode given a
-domain domain, runs these steps:
+The domain to Unicode algorithm, given a domain
+domain, runs these steps:

   1. Let result be the result of running
-     Unicode ToUnicode with
-     domain_name set to domain,
-     UseSTD3ASCIIRules set to false.
+     Unicode ToUnicode with domain_name set to domain,
+     CheckHyphens set to false, CheckBidi set to true, CheckJoiners set to true,
+     and UseSTD3ASCIIRules set to false.

   2. Signify validation errors for any returned errors, and then, return result.
@@ -315,16 +322,16 @@ U+005C (\), or U+005D (]).

    A domain is a valid domain if these steps return success:

-  1. Let result be the result of running
-     Unicode ToASCII with
-     domain_name set to domain,
-     UseSTD3ASCIIRules set to true, processing_option set to
-     Nontransitional_Processing, and VerifyDnsLength set to true.
+  1. Let result be the result of running domain to ASCII with domain
+     and true.

-  2. If result is a failure value, return failure.
+  2. If result is failure, then return failure.

   3. Set result to the result of running
-     Unicode ToUnicode with
+     Unicode ToUnicode with domain_name set to result,
+     CheckHyphens set to false, CheckBidi set to true, CheckJoiners set to true,
+     and UseSTD3ASCIIRules set to true.
+
      domain_name set to result, UseSTD3ASCIIRules set to true.
@@ -3152,7 +3159,7 @@ spec: MEDIA-SOURCE; urlPrefix: https://w3c.github.io/media-source/#idl-def-
  type: interface; text: MediaSource
 spec: MEDIACAPTURE-STREAMS; urlPrefix: https://w3c.github.io/mediacapture-main/#idl-def-
  type: interface; text: MediaStream
-spec: UTS46; urlPrefix: http://www.unicode.org/reports/tr46/
+spec: UTS46; urlPrefix: http://www.unicode.org/reports/tr46/proposed.html
  type: abstract-op; text: ToASCII; url: #ToASCII
  type: abstract-op; text: ToUnicode; url: #ToUnicode
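
Reviewer's illustration (not part of the patch): a minimal Python sketch of why CheckHyphens matters for the YouTube host named in the commit message, and of the flag values the patched "domain to ASCII" steps pass to Unicode ToASCII. The function names `violates_check_hyphens` and `to_ascii_flags` are hypothetical, chosen only for this example. The leading/trailing hyphen rule is taken from the UTS46 validity criteria as I understand them; the commit message itself only mentions the 3rd and 4th positions. No real ToASCII processing is performed here.

```python
# Illustrative sketch only. These helpers are hypothetical and are not
# defined by the URL Standard or by UTS46.


def violates_check_hyphens(label: str) -> bool:
    """Return True if `label` would be rejected when CheckHyphens is true.

    Assumed rule (from the UTS46 validity criteria): a label must not have
    U+002D in both the 3rd and 4th positions, and must not begin or end
    with U+002D.
    """
    if label.startswith("-") or label.endswith("-"):
        return True
    # Positions 3 and 4 are indices 2 and 3; short labels yield a shorter
    # slice that can never equal "--".
    return label[2:4] == "--"


def to_ascii_flags(be_strict: bool = False) -> dict:
    """Flag values the patched "domain to ASCII" steps pass to ToASCII."""
    return {
        "UseSTD3ASCIIRules": be_strict,
        "CheckHyphens": False,
        "CheckBidi": True,
        "CheckJoiners": True,
        "processing_option": "Nontransitional_Processing",
        "VerifyDnsLength": be_strict,
    }


if __name__ == "__main__":
    label = "r3---sn-2gb7ln7k"  # first label of the host in the commit message
    print(violates_check_hyphens(label))     # True: hyphens at positions 3 and 4
    print(to_ascii_flags()["CheckHyphens"])  # False: the check is switched off
```

With CheckHyphens set to false the label above is never subjected to this check, which is the behavior change the patch is after. The "valid domain" steps call the same algorithm with beStrict set to true, which only tightens UseSTD3ASCIIRules and VerifyDnsLength.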