Editorial: let code point and co be defined by Infra #2471

annevk · 2017-03-27T13:30:48Z

domenic · 2017-03-28T05:06:26Z

source

@@ -2390,6 +2365,11 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
    <p>The following terms are defined in the WHATWG Infra standard: <ref spec=INFRA></p>

    <ul class="brief">
+     <li><dfn data-x-href="https://infra.spec.whatwg.org/#code-point">code point</dfn> and its synonym
+             <dfn data-x-href="https://infra.spec.whatwg.org/#code-point">character</dfn></li>


Nit: I believe we line up the <dfn>s in similar blocks.

This is how it's lined up later on in this <ul>...

It looks inconsistent: the "position variable" entry is not lined up, but the rest of them are.

domenic · 2017-03-28T05:06:58Z

source

@@ -2438,7 +2418,7 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

   <dd>

-    <p>The Unicode character set is used to represent textual data, and the WHATWG Encoding standard
+    <p>The Unicode Standard is used to represent textual data, and the WHATWG Encoding standard


I'm not sure the standard is used to represent the data. A character set seems more likely to be used to represent data than a standard.

domenic · 2017-03-28T05:07:43Z

source

-   <li><p>Replace any characters in <var>input</var> that have a Unicode code point greater
-   than U+FFFF (i.e. any characters that are not in the basic multilingual plane) with the
-   two-character string "<code data-x="">00</code>".</p></li>
+   <li><p>Replace any characters in <var>input</var> that have a code point greater than U+FFFF


Is the idea to not introduce new links? Seems OK, but maybe a missed opportunity.
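
For readers following the hunk above, the replacement step can be sketched roughly as follows. This is only an illustration in Python, where strings are already sequences of code points; the function name `replace_astral_with_00` is hypothetical and does not come from the specification.

```python
def replace_astral_with_00(input_string: str) -> str:
    """Replace each code point greater than U+FFFF with the string "00".

    Hypothetical helper: Python iterates strings by code point, so a
    character outside the basic multilingual plane is a single item here.
    """
    return "".join("00" if ord(ch) > 0xFFFF else ch for ch in input_string)
```

Code points up to and including U+FFFF pass through unchanged; only those above it are replaced.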

domenic · 2017-03-28T05:08:49Z

source

-  appropriate.</p>
+  <p>An image should not be used if characters would serve an identical purpose. Only when the text
+  cannot be directly represented using text, e.g., because of decorations or because the character
+  is not in the Unicode Standard (as in the case of gaiji), would an image be appropriate.</p>


Again, "character set" seems more accurate here.

domenic · 2017-03-28T05:12:07Z

source

@@ -72016,14 +71995,14 @@ END:VCARD</pre>

   <li>

-    <p>If and while <var>line</var> is longer than <var>maximum length</var>
-    Unicode code points long, run the following substeps:</p>
+    <p>If and while <var>line</var> is longer than <var>maximum length</var> code points long:</p>


You could link longer/long to string length here (as opposed to JavaScript string length).

Why do we say "If and while" rather than just "While"?
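
The distinction raised here, length in code points versus JavaScript string length in UTF-16 code units, can be illustrated with a small sketch. Both function names are hypothetical, and the code unit count is derived by encoding to UTF-16, which is an assumption about how one might emulate JavaScript's `length` in Python.

```python
def code_point_length(s: str) -> int:
    """Number of code points; Python's len() already counts code points."""
    return len(s)


def utf16_code_unit_length(s: str) -> int:
    """Number of UTF-16 code units, roughly what JavaScript's .length reports.

    "surrogatepass" lets lone surrogates through, since JavaScript strings
    may contain them.
    """
    return len(s.encode("utf-16-le", "surrogatepass")) // 2
```

The two measures differ only for strings containing code points above U+FFFF, each of which takes two code units.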

domenic · 2017-03-28T05:13:18Z

source

-   U+003B SEMICOLON character (;).</dd>
+   <span>ASCII digits</span>, representing a base-ten integer that corresponds to a code point that
+   is allowed according to the definition below. The digits must then be followed by a U+003B
+   SEMICOLON character (;).</dd>


In general I'd like to replace "character" by "code point" when possible, but I realize I'm just piling work onto this PR, so feel free to ignore this suggestion.

A dedicated commit might be better for that. Happy to make one though.
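
As a rough illustration of the syntax described in this hunk (ASCII digits forming a base-ten integer, followed by U+003B SEMICOLON), here is a minimal sketch. It assumes the reference begins with "&#", which is not shown in the quoted lines, and it deliberately omits the check that the code point is "allowed according to the definition below", since that definition is not part of this excerpt. The name `parse_decimal_reference` is hypothetical.

```python
import re
from typing import Optional

# [0-9] rather than \d: only ASCII digits are permitted, and \d would also
# match non-ASCII decimal digits in Python.
DECIMAL_REFERENCE = re.compile(r"&#([0-9]+);")


def parse_decimal_reference(text: str) -> Optional[int]:
    """Return the code point named by a decimal reference, or None.

    Assumption: the leading "&#" prefix. The allowed-code-point check
    from the specification is intentionally not implemented here.
    """
    match = DECIMAL_REFERENCE.fullmatch(text)
    return int(match.group(1)) if match else None
```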

domenic · 2017-03-28T05:15:21Z

source

@@ -100478,7 +100454,8 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {

    <dl class="switch">

-     <dt>A sequence of bytes starting with: 0x3C 0x21 0x2D 0x2D (ASCII '&lt;!--')</dt>
+     <dt>A sequence of bytes starting with: 0x3C 0x21 0x2D 0x2D (`<code
+     data-x="">&lt;!--</code>`)</dt>


Nice to fix this into using byte sequence notation
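
To make the equivalence in this hunk concrete, the four bytes 0x3C 0x21 0x2D 0x2D are exactly the byte sequence `<!--`. A minimal sketch, with the hypothetical helper name `starts_with_comment`:

```python
# The byte sequence from the hunk, written both ways.
COMMENT_START = bytes([0x3C, 0x21, 0x2D, 0x2D])


def starts_with_comment(data: bytes) -> bool:
    """True if the byte sequence begins with 0x3C 0x21 0x2D 0x2D."""
    return data.startswith(COMMENT_START)
```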

domenic · 2017-03-28T05:17:23Z

source

-     <dd>Append the Unicode character with code point <span data-x=""><var>b</var>+0x20</span> to <var>attribute name</var> (where <var>b</var>
-     is the value of the byte at <var>position</var>). (This converts the input to
-     lowercase.)</dd>
+     <dd>Append the code point <span data-x=""><var>b</var>+0x20</span> to <var>attribute name</var>


Any idea why this has a span data-x=""?

Not really.
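
The `b+0x20` step in the hunk above can be sketched as follows. The removed text notes that this converts the input to lowercase; the sketch assumes the step only applies when the byte is an ASCII uppercase letter (0x41 to 0x5A), a condition that is not visible in the quoted lines. The function name is hypothetical.

```python
def append_lowercased(attribute_name: str, b: int) -> str:
    """Append the code point b + 0x20 to attribute_name.

    Assumption: b is an ASCII uppercase letter byte (0x41..0x5A), so
    adding 0x20 yields the corresponding lowercase letter.
    """
    if not 0x41 <= b <= 0x5A:
        raise ValueError("expected an ASCII uppercase letter byte")
    return attribute_name + chr(b + 0x20)
```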

domenic · 2017-03-28T05:18:50Z

source

@@ -100879,9 +100861,9 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {
  U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF,
  U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE,
  U+FFFFF, U+10FFFE, and U+10FFFF are <span data-x="parse error">parse errors</span>. These are all
-  <span>control characters</span> or permanently undefined Unicode characters (noncharacters).</p>
+  <span>control characters</span> or permanently undefined characters (noncharacters).</p>


This "permanently undefined characters (noncharacters)" concept keeps coming up and probably deserves a term defined in Infra.

Opened an issue. Probably best done as cleanup afterward.
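
For context on what such a term would cover, a check for noncharacters might look like the sketch below. It relies on the Unicode definition (U+FDD0 to U+FDEF, plus every code point whose low 16 bits are FFFE or FFFF); the U+FDD0 range does not appear in the quoted hunk and is included here as an assumption about the intended term. The name `is_noncharacter` is hypothetical.

```python
def is_noncharacter(code_point: int) -> bool:
    """True for Unicode noncharacters.

    Covers U+FDD0..U+FDEF and the last two code points of each plane
    (U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFE, U+10FFFF).
    """
    if not 0 <= code_point <= 0x10FFFF:
        return False
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True
    # Low 16 bits equal to 0xFFFE or 0xFFFF.
    return (code_point & 0xFFFE) == 0xFFFE
```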

annevk · 2017-03-29T06:41:58Z

There are a lot of hits for <span data-x=""><var> in source. Did we do away with autolinking of <var> already or is that still a thing? If we did away with that, we should maybe do a separate PR for removing all those instances.

zcorpan · 2017-03-29T09:17:53Z

I think <var> doesn't autolink but <var data-x="foo"> does.

Editorial: let code point and co be defined by Infra

beed71d

See whatwg/infra#6 for context.

annevk mentioned this pull request Mar 27, 2017

Using generics for bytes / code units / code points whatwg/infra#17

Closed

domenic reviewed Mar 28, 2017


annevk mentioned this pull request Mar 28, 2017

Add permanently undefined characters (noncharacters) whatwg/infra#108

Closed

address review feedback

bf53344

domenic merged commit 59595d9 into master Mar 29, 2017

domenic deleted the annevk/infra-strings branch March 29, 2017 08:47
