Allow comment tags (<!, --, and >) to be nested. #10153
Comments
We cannot make a breaking change like this to how HTML is parsed, especially with so little justification. I hope you understand.
https://whatwg.org/working-mode#changes describes the process, but in this specific case I know that it's extremely unlikely you'll get implementer support, as making a breaking change to how HTML is parsed is not done lightly. I can tell you that WebKit would not want this change to be made, but maybe you can convince other implementers. I'll reopen for now given that you don't seem convinced.
Thank you, @annevk. I'm aware that this will be an uphill battle, considering how long this issue has been left unfixed.
@annevk, I know only that Google, Mozilla, and Apple are stakeholders in this, so do you know of any implementers I might have missed? Additionally, how do you suggest I contact them? I've always created issues in their respective public bug trackers - is that the standard way to propose things like this?
I can see your enthusiasm for this feature. This change would be a breaking change to how HTML is parsed. It would break correctly parsed pages that currently exist, and it would add a significant layer of complexity to parsing HTML. This change would need an exceptional use case to be considered. Easier debugging is not exceptional. Some compiled languages support this feature; again, that is not exceptional. As for filing bugs: Anne already said that WebKit would not accept such a change. For Mozilla it would get duplicated to an old invalid bug like 195133. I expect a similar response from Blink/Chrome developers. I suspect that this feels circular, but you are proposing a fundamental change that affects any page that uses comments. Such a change is not taken lightly.
@kbrosnan, thanks. Indeed, I recognise what you've stated.
We will not implement this for Gecko. As already mentioned, this would break existing pages that have this syntax and expect the current behavior. This alone is a sufficient blocker. But I think there's also another aspect here. When changing how HTML parsing works, we need to consider whether it introduces a new XSS problem. I believe this would: let's say a website allows untrusted input but uses an HTML-parser-based sanitizer to remove anything that can run scripts. Comments are considered safe. If the website's backend implements this change, but the user's browser does not (it takes a long time for all users to update even if all browsers were to ship a coordinated change), then the user is now vulnerable to XSS. What you could do instead is to use the
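The version-skew scenario described above can be sketched as follows. This is a minimal illustration under stated assumptions, not any browser's or sanitizer's actual code: `outside_comments` and its `nested` flag are hypothetical names, and the function only approximates comment handling by scanning for `<!--` and `-->`.

```python
def outside_comments(html: str, nested: bool) -> str:
    """Return the parts of `html` that lie outside comments.

    nested=False approximates today's rule: a comment ends at the first "-->".
    nested=True models the proposed (hypothetical) rule: "<!--" inside a
    comment opens another level, and each "-->" closes one level.
    """
    out = []
    depth = 0  # how many comments we are currently inside
    i = 0
    while i < len(html):
        if html.startswith("<!--", i) and (depth == 0 or nested):
            depth += 1
            i += 4
        elif depth > 0 and html.startswith("-->", i):
            depth -= 1
            i += 3
        else:
            if depth == 0:
                out.append(html[i])
            i += 1
    return "".join(out)


# Untrusted input that looks like one harmless comment under nested rules.
payload = "<!-- <!-- --> <script>alert(1)</script> -->"

seen_by_nested_sanitizer = outside_comments(payload, nested=True)
seen_by_current_browser = outside_comments(payload, nested=False)

print(repr(seen_by_nested_sanitizer))
print(repr(seen_by_current_browser))
```

With `nested=True` the whole payload is a single comment, so a sanitizer applying that rule would find nothing to remove. With `nested=False` the comment ends at the first `-->`, and the script element is left as live markup for a browser that has not adopted the change.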
The SGML-style comment parsing issue was solved in the spec in 2006, so that is not relevant.
Thanks for all of your input, and I apologize for taking so long, considering how much effort you put in to consider this proposal. It's quite evident to me that the manner in which comment tags are currently parsed by most renderers isn't equivalent to how other tags are, and that we consequently can't modify them without introducing XSS attacks to most of the web. Instead:
A change to allow nested comments would pose substantial risks to content authored in WordPress over the past seven years. It relies on the fact that comments cannot be nested for structure and security. I don't see the need for a new tag when
@dmsnell, thanks. Is all content inside it sanitised, like
@RokeJulianLockhart content inside of comments isn't sanitized either. a
I have learnt of something that is very relevant to this: The first editions of the HTML1 DTD actually included a (presumably nestable)
The WHATWG's HTML Living Standard is no longer SGML-compliant. However, the rationale given in this thread for not adding such a tag remains valid regardless.
@RokeJulianLockhart HTML was never SGML-compliant, though some HTML2/3/4 may have been. The DTDs, if I’m not wrong, came about later in an attempt to formalize the HTML rules as an SGML application, but that was never truly successful because those DTDs didn’t correspond to how actual HTML parsers worked. HTML5 is fully incompatible with SGML.
For a cursory examination I ran my parser against a list of 293,965 root-path documents from a list of domains sorted on rank (some day I hope to get a processing pipeline established for Common Crawl but today I rely on this somewhat lazier dataset). For every comment in the document, I then examined if it contained the text
404 (0.137%) of the results contained what looks like nested comments. Among the other metrics, there were no instances of
One example of breakage is where I found
Sites also contain the old IE conditional comment,
and in cases like this the result would be benign, thankfully. Another type of error is when something has cut off the comment closer. It’s not evident why, but this appeared.
But again here, because of the missing closer, this would nest and swallow the rest of the page. Like before, I see far fewer instances of actual nesting of comments in the wild than I do of errors caused by some other kind of improper processing or stitching-together of documents.
In any case, I wanted to share some data, because I realized that we’re all kind of spinning around some opinions but it might be helpful to know what kind of actual potential impact a change like this could have. If you want to convince people you might start simply by attempting to quantify what impact the change would have. Because many of these existing nested comments are the result of other errors, it may not be enough simply to ask how many pages would parse differently. Perhaps a nice metric would be like this:
What problem are you trying to solve?
The undermentioned correctly-used HTML comment tags:
<!-- -->
...cannot be nested like:
This means that using comments to temporarily remove well-described code in order to debug it is immensely difficult in HTML.
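A minimal sketch of the current behaviour, using Python's standard-library `html.parser` as a stand-in for a browser parser. That substitution is an assumption: `html.parser` is not a full implementation of the HTML Standard's tokenizer, although it treats this particular input the same way. `CommentRecorder` and `parse` are hypothetical helper names introduced for illustration.

```python
from html.parser import HTMLParser


class CommentRecorder(HTMLParser):
    """Record comment bodies and text nodes as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.text = []

    def handle_comment(self, data):
        self.comments.append(data)

    def handle_data(self, data):
        self.text.append(data)


def parse(markup):
    """Return (list of comment bodies, concatenated text) for `markup`."""
    recorder = CommentRecorder()
    recorder.feed(markup)
    recorder.close()  # flush any text still buffered by the parser
    return recorder.comments, "".join(recorder.text)


# An attempt to comment out a region that already contains a comment.
comments, text = parse("<!-- outer <!-- inner --> still outer -->")
print(comments)
print(repr(text))
```

The parser reports a single comment that ends at the first `-->`, and the remainder (` still outer -->`) comes back as ordinary text, which is why commenting out a block that already contains a comment leaks part of it onto the page.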
What solutions exist today?
Some IDE extensions automatically break the tags in non-standardized manners, like github.com/philsinatra/NestedCommentsVSCode/blob/9c25135847af99c89e66b14e7396a3bf2b0d7cf0/README.md?plain=1#L36-L42:
However, I could also use a custom tag. This may appear to be the easier solution, but choosing it immediately has significant disadvantages:
Choosing to add a new tag (for instance, <comment>) means that not only must I propose and successfully implement a new element, I must additionally deprecate the existing, extremely widely used <!-- --> element. This seems to me as if it would cause more disruption, because we shouldn't leave both in, lest they duplicate functionality.
Considering that modifying the parsing of the existing comment tag wouldn't affect HTML users - developers - in any manner I can foresee, modifying the engines to allow the comment tag to be nested sounds like a better idea.
I would like to eventually propose a <comment> tag. However, this seems to me to be a less disruptive modification to the specification.
How would you solve it?
I would permit the tags to be nested. I would like this support to be universal to avoid situations like stackoverflow.com/revisions/6698115/2:
Anything else?
stackoverflow.com/revisions/12102131/6 explains it well.
new.reddit.com/user/rokejulianlockhart/duplicates/1b69fcv, so that it might become more visible.