Editorial: let code point and co be defined by Infra #2471

Merged
domenic merged 2 commits into master from annevk/infra-strings
Mar 29, 2017

Member

annevk commented Mar 27, 2017

See whatwg/infra#6 for context.

source Outdated
@@ -2390,6 +2365,11 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
<p>The following terms are defined in the WHATWG Infra standard: <ref spec=INFRA></p>

<ul class="brief">
<li><dfn data-x-href="https://infra.spec.whatwg.org/#code-point">code point</dfn> and its synonym
<dfn data-x-href="https://infra.spec.whatwg.org/#code-point">character</dfn></li>

Member

Nit: I believe we line up the <dfn>s in similar blocks.

Member Author

This is how it's lined up later on in this <ul>...

Member

It looks to be inconsistent. Position variable is not lined up, but the rest of them are.

source Outdated
@@ -2438,7 +2418,7 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

<dd>

-<p>The Unicode character set is used to represent textual data, and the WHATWG Encoding standard
+<p>The Unicode Standard is used to represent textual data, and the WHATWG Encoding standard

Member

I'm not sure the standard is used to represent the data. A character set seems more likely to be used to represent data than a standard.

-<li><p>Replace any characters in <var>input</var> that have a Unicode code point greater
-than U+FFFF (i.e. any characters that are not in the basic multilingual plane) with the
-two-character string "<code data-x="">00</code>".</p></li>
+<li><p>Replace any characters in <var>input</var> that have a code point greater than U+FFFF

Member

Is the idea to not introduce new links? Seems OK, but maybe a missed opportunity.
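
For readers following the thread, the replacement step quoted in the diff above can be sketched roughly as follows. This is only an illustration under stated assumptions: `replaceNonBmp` is a hypothetical name, not something from the HTML Standard's source, and the surrounding algorithm is not shown in this hunk.

```javascript
// Hypothetical helper (not from the spec source): replaces every code point
// above U+FFFF with the two-character string "00".
function replaceNonBmp(input) {
  let output = "";
  // for...of iterates a string by code point, so a surrogate pair is
  // visited once rather than as two separate code units.
  for (const ch of input) {
    output += ch.codePointAt(0) > 0xffff ? "00" : ch;
  }
  return output;
}
```

With this sketch, `replaceNonBmp("a\u{1F600}b")` gives `"a00b"`, while strings made only of code points at or below U+FFFF come back unchanged.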

source Outdated
-appropriate.</p>
+<p>An image should not be used if characters would serve an identical purpose. Only when the text
+cannot be directly represented using text, e.g., because of decorations or because the character
+is not in the Unicode Standard (as in the case of gaiji), would an image be appropriate.</p>

Member

Again, "character set" seems more accurate here.

source Outdated
@@ -72016,14 +71995,14 @@ END:VCARD</pre>

<li>

-<p>If and while <var>line</var> is longer than <var>maximum length</var>
-Unicode code points long, run the following substeps:</p>
+<p>If and while <var>line</var> is longer than <var>maximum length</var> code points long:</p>

Member

You could link longer/long to string length here (as opposed to JavaScript string length).

Member

Why do we say "If and while" rather than just "While"?
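
The distinction raised above, between a length counted in code points and JavaScript string length, can be illustrated with a minimal sketch. Both function names are hypothetical and introduced only for illustration; the substeps of the actual algorithm are not part of this hunk.

```javascript
// Hypothetical helper: counts code points rather than UTF-16 code units.
function codePointLength(string) {
  let count = 0;
  // String iteration yields one element per code point
  // (a lone surrogate counts as one element).
  for (const _codePoint of string) {
    count += 1;
  }
  return count;
}

// Hypothetical helper mirroring the "longer than maximum length code points" test.
function isLongerThan(line, maximumLength) {
  return codePointLength(line) > maximumLength;
}
```

For `"a\u{1F600}"`, the JavaScript `length` property is 3 but `codePointLength` returns 2, so the two notions disagree as soon as a line contains a code point above U+FFFF.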

-U+003B SEMICOLON character (;).</dd>
+<span>ASCII digits</span>, representing a base-ten integer that corresponds to a code point that
+is allowed according to the definition below. The digits must then be followed by a U+003B
+SEMICOLON character (;).</dd>

Member

In general I'd like to replace "character" by "code point" when possible, but I realize I'm just piling work onto this PR, so feel free to ignore this suggestion.

Member Author

A dedicated commit might be better for that. Happy to make one though.

@@ -100478,7 +100454,8 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {

<dl class="switch">

-<dt>A sequence of bytes starting with: 0x3C 0x21 0x2D 0x2D (ASCII '&lt;!--')</dt>
+<dt>A sequence of bytes starting with: 0x3C 0x21 0x2D 0x2D (`<code
+data-x="">&lt;!--</code>`)</dt>

Member

Nice to convert this to byte sequence notation.
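
As a rough illustration of the condition in the <dt> above, a check for a byte sequence starting with 0x3C 0x21 0x2D 0x2D might look like the sketch below. `startsWithBytes` and `COMMENT_START` are hypothetical names chosen for this example, and the rest of the switch is not shown in this hunk.

```javascript
// The four bytes 0x3C 0x21 0x2D 0x2D, i.e. `<!--` in byte sequence notation.
const COMMENT_START = [0x3c, 0x21, 0x2d, 0x2d];

// Hypothetical helper: true if `bytes` (an array or Uint8Array) begins with `prefix`.
function startsWithBytes(bytes, prefix) {
  if (bytes.length < prefix.length) {
    return false;
  }
  return prefix.every((byte, index) => bytes[index] === byte);
}
```

For example, `startsWithBytes(Uint8Array.of(0x3c, 0x21, 0x2d, 0x2d, 0x20), COMMENT_START)` is true, while a three-byte input is rejected because it is shorter than the prefix.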

-<dd>Append the Unicode character with code point <span data-x=""><var>b</var>+0x20</span> to <var>attribute name</var> (where <var>b</var>
-is the value of the byte at <var>position</var>). (This converts the input to
-lowercase.)</dd>
+<dd>Append the code point <span data-x=""><var>b</var>+0x20</span> to <var>attribute name</var>

Member

Any idea why this has a span data-x=""?

Member Author

Not really.
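
For context on the <dd> under discussion, the "append the code point b+0x20" step can be sketched as below. The matching <dt> is not part of this hunk, so the byte range 0x41 to 0x5A (ASCII uppercase letters) and the pass-through branch for other bytes are assumptions, and `appendByte` is a hypothetical name.

```javascript
// Hypothetical helper: appends the byte `b` to `attributeName`, lowercasing
// ASCII uppercase letters by adding 0x20 to the byte value.
function appendByte(attributeName, b) {
  // Assumption: the step applies to bytes 0x41 ('A') through 0x5A ('Z').
  if (b >= 0x41 && b <= 0x5a) {
    return attributeName + String.fromCodePoint(b + 0x20);
  }
  // Assumption: any other byte is appended as the code point of the same value.
  return attributeName + String.fromCodePoint(b);
}
```

Adding 0x20 works because each ASCII lowercase letter sits exactly 0x20 above its uppercase counterpart, so `appendByte("hre", 0x46)` yields `"href"`.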

@@ -100879,9 +100861,9 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {
U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF,
U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE,
U+FFFFF, U+10FFFE, and U+10FFFF are <span data-x="parse error">parse errors</span>. These are all
-<span>control characters</span> or permanently undefined Unicode characters (noncharacters).</p>
+<span>control characters</span> or permanently undefined characters (noncharacters).</p>

Member

This "permanently undefined characters (noncharacters)" concept keeps coming up and probably deserves a term defined in Infra.

Member Author

Opened an issue. Probably best done as cleanup afterward.
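
As a sketch of what such a shared term could cover, the check below follows Unicode's definition of noncharacters: U+FDD0 through U+FDEF, plus the last two code points of every plane (U+xFFFE and U+xFFFF), which is the pattern visible in the list quoted above. `isNoncharacter` is a hypothetical name, and this is not a proposed Infra definition.

```javascript
// Hypothetical helper: true if `codePoint` is a Unicode noncharacter.
function isNoncharacter(codePoint) {
  if (codePoint < 0 || codePoint > 0x10ffff) {
    return false; // outside the Unicode code space
  }
  if (codePoint >= 0xfdd0 && codePoint <= 0xfdef) {
    return true;
  }
  // The low 16 bits are FFFE or FFFF exactly when masking with 0xFFFE gives 0xFFFE.
  return (codePoint & 0xfffe) === 0xfffe;
}
```

Under this sketch, U+4FFFE and U+10FFFF from the quoted list are reported as noncharacters, while U+FFFD (the replacement character) is not. Control characters are a separate category and are not covered here.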

Member Author

annevk commented Mar 29, 2017

There are a lot of hits for <span data-x=""><var> in source. Did we do away with autolinking of <var> already or is that still a thing? If we did away with that, we should maybe do a separate PR for removing all those instances.

domenic merged commit 59595d9 into master on Mar 29, 2017
domenic deleted the annevk/infra-strings branch on March 29, 2017 08:47
Member

zcorpan commented Mar 29, 2017

I think <var> doesn't autolink but <var data-x="foo"> does.

3 participants