Extra spaces in markup #650
Comments
For convenience I will include the HTML below unprotected, so that we can see it in the issue:
<p>Say something <i>italic</i> and last (works).</p>
<p>Say something < i>italic</i> and last (fails).</p>
<p>Say something <i >italic</i> and last (works).</p>
<p>Say something <i>italic< /i> and last (fails).</p>
<p>Say something <i>italic</ i> and last (fails).</p>
<p>Say something <i>italic</i > and last (fails).</p>
Say something italic and last (works).
Say something < i>italic and last (fails).
Say something italic and last (works).
Say something italic< /i> and last (fails).
Say something italic and last (fails).
Say something italic and last (fails). |
Proposed change:
markup = "{" [s] "#" identifier *(s option) *(s attribute) [s] "}" ; open
/ "{" [s] "#" identifier *(s option) *(s attribute) [s] "/}" ; standalone
/ "{" [s] "/" identifier *(s option) *(s attribute) [s] "}" ; close
to:
markup = "{#" identifier *(s option) *(s attribute) [s] "}" ; open
/ "{#" identifier *(s option) *(s attribute) [s] "/}" ; standalone
/ "{/" identifier *(s option) *(s attribute) [s] "}" ; close |
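To make the difference between the two grammars concrete, here is a minimal sketch that turns each one into a regular expression and checks a few inputs. It is a simplification under stated assumptions: options and attributes are left out, `IDENT` is a stand-in character class rather than the spec's `identifier` rule, and the names `markup_regex`, `CURRENT` and `PROPOSED` are made up for this illustration.

```python
import re

# Simplified stand-ins (assumptions, not the spec's actual definitions).
OPT_S = r"[ \t\r\n]*"                   # '[s]': optional whitespace
IDENT = r"[A-Za-z_][A-Za-z0-9_.-]*"     # rough placeholder for 'identifier'


def markup_regex(allow_space_before_sigil: bool) -> re.Pattern:
    """Build a regex for open/standalone/close markup without options or attributes."""
    lead = OPT_S if allow_space_before_sigil else ""
    # "{" [s] "#" identifier [s] "}"  or  ... [s] "/}"  (no space between "/" and "}")
    open_or_standalone = rf"\{{{lead}#{IDENT}{OPT_S}/?\}}"
    # "{" [s] "/" identifier [s] "}"
    close = rf"\{{{lead}/{IDENT}{OPT_S}\}}"
    return re.compile(rf"(?:{open_or_standalone}|{close})")


CURRENT = markup_regex(allow_space_before_sigil=True)    # "{" [s] "#" ...
PROPOSED = markup_regex(allow_space_before_sigil=False)  # "{#" ...

for text in ["{#b}", "{ #b }", "{#b }", "{ /b}", "{# b}", "{#b /}", "{#b / }"]:
    print(f"{text!r:12} current={bool(CURRENT.fullmatch(text))} "
          f"proposed={bool(PROPOSED.fullmatch(text))}")
```

In this sketch the only inputs on which the two grammars disagree are the ones with whitespace between `{` and the sigil. Whitespace between the sigil and the identifier, or between `/` and `}`, is rejected by both.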
Note: this is not the same as the expressions. There detecting |
I think this is a non-starter? Our syntax is very consistent about optional whitespace: we are very lax when it's optional, especially inside expressions. The I think the thing that might be confusing here is that when you write the syntax using markup = "{" markup-identifier *(s option) *(s attribute) [s] "}" <- this doesn't work because of standalone
markup-identifier = ( "#" / "/" ) identifier
All of our other identifiers and sigil-introduced tokens are of the sigil-identifier flavor. For consistency, this should be too. There is some lookahead to find type, but it's consuming whitespace (so can be optimized). |
Agree, but I don't think this is optional, and the sigils should not be attached to the identifier. Right above your comment I explain why I think this is not at all the same.
I am not judging this based on the grammar, I am judging it as a user seeing the syntax.
What I see is These are spaces that I agree are optional:
There is no markup (that I know of) that allows these kinds of spaces. |
I think it's way too late to introduce this syntax change for consideration, and I do not think we should consider this for LDML 45. As an implementer, I'd like to note that I had no difficulty dealing with the current syntax. In a pattern, from the |
This is not about implementing. That is point 1 of 4, and in fact the weakest one. It is that it does not make sense semantically, as a user, and no other system does that. |
We agreed to the syntax for 45 in the F2F. I'm going to mark this for consideration during tech preview. |
BTW, if we are to be "loose" with the spaces, why not allow spaces between the Current:
Extra space:
The current syntax allows I think this is all based on a superficial (visual) similarity with the placeholders. |
I'd also just like to chime in that I agree with @mihnita - restricting the syntax here to ensure that there is no whitespace between |
I think some care should be exercised about how we discuss the whitespace here. If one just looks at the sigil, the spaces look weird. But notice that in the remainder of our syntax, the sigil is attached to something, e.g.
The counter argument would seem to be that markup is a fundamentally different type of expression, so it's not really a sigil, it's a different introducing sequence:
Looking at what HTML does is not really instructive, since any HTML would be produced from the markup syntax and we need to think about what other syntaxes' needs are as well. I'd be curious how markup is currently parsed by implementations? Attaching the sigil to the starter would require a one-character lookahead in each expression to check if the expression is markup. Attaching the sigil to the identifier would be more similar to seeking the next token (see list above). That doesn't mean that the lookahead is evil. I'm just curious if the difference in parsing is worth it. |
I parse markup together with expressions. Because the constructions are syntactically so similar, it's easier to have just one handler for the stuff between curly braces. As a user, I would find variance between the whitespace requirements of expressions and markup very confusing. I don't find the |
My parser does the lookahead after consuming the '{' -- if the next character is '#' or '/' a separate |
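The lookahead being discussed can be sketched roughly as follows. This is an illustration only, not any commenter's actual parser: the function name `classify_placeholder`, its `skip_space` flag, and the returned labels are hypothetical. With `skip_space=True` it mimics the current grammar, where whitespace may sit between `{` and the sigil. With `skip_space=False` it mimics the proposed grammar, where a single-character peek is enough.

```python
WHITESPACE = " \t\r\n"


def classify_placeholder(src: str, start: int, skip_space: bool = True) -> str:
    """Decide whether the placeholder starting at src[start] == '{' is markup.

    skip_space=True  -> current grammar: consume optional whitespace, then peek.
    skip_space=False -> proposed grammar: peek at the character right after '{'.
    """
    if src[start] != "{":
        raise ValueError("expected '{' at the start position")
    i = start + 1
    if skip_space:
        # The (possibly long) lookahead: walk over any run of whitespace.
        while i < len(src) and src[i] in WHITESPACE:
            i += 1
    nxt = src[i] if i < len(src) else ""
    if nxt == "#":
        return "markup-open-or-standalone"
    if nxt == "/":
        return "markup-close"
    return "expression"


print(classify_placeholder("{ #b}", 0))                    # whitespace skipped first
print(classify_placeholder("{ #b}", 0, skip_space=False))  # single-character peek
print(classify_placeholder("{/b}", 0))
print(classify_placeholder("{$x}", 0))
```

Either way the decision is made before any identifier is read. The two modes differ only in whether the parser has to walk over a run of whitespace first.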
This is not about how hard / easy it is to parse.
And the closing markup Same as there is one |
That's not our syntax, though? As you note in the first post on this issue, the
I'm very confused by this passage. Maybe you typo'd some of the sigils here? |
The current syntax permits space between the bracket and the opening sigil (either My suggestion would be: let's reject this for now. A future version could allow spaces between the starting sigil and the identifier (but no version would be able to disallow spaces between the bracket and the sigil). |
Our current grammar for markup is this:
That's what it looks like after adding options to close.
But the issue is independent.
I would like to make the case that the first space after { is not only unnecessary, but somewhat problematic.
It forces us into a (possibly big) lookahead. We see the {, and then we might have to "consume" a lot of spaces to know where we are (markup or expression).
I decided to strikethrough because it is the least compelling argument. It is not about the parsing, at all. [mihnita]
It is this whole marker that is standalone or close, not the attribute
The / at the end of the standalone is tied with the closing }, there is no [s] in between them.
This was true before PR Add options to close (spec) #649, but it is more visible now.
None of this works: < b> and < /b> and </ b>