wbr element shouldn't be balanced #488
Comments
Hmm, yeah, I can reproduce. wbr is listed as a self-closing tag in bleach/bleach/_vendor/html5lib/html5parser.py (lines 964 to 965 in a06cd77)
and should have: token["selfClosingAcknowledged"] = True but I get
at https://github.com/mozilla/bleach/blob/master/bleach/sanitizer.py#L271, so I'm thinking one of these things might be going on:
but I'll need to find more time to look into it further.
OK, this is a bug in html5lib (v1.1 at least):
» python
Python 3.8.2 (default, Mar 26 2020, 12:39:19)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bleach._vendor.html5lib as html5lib
>>> html5lib.__version__
'1.1'
>>> html5lib.serialize(html5lib.parseFragment('<area>')) # this is correct
'<area>'
>>> html5lib.serialize(html5lib.parseFragment('<wbr>')) # should be <wbr>
'<wbr></wbr>'
>>> html5lib.serialize(html5lib.parseFragment('<keygen>')) # HTML 5.2 deprecates the tag
'<keygen></keygen>'
>>> html5lib.serialize(html5lib.parseFragment('<menuitem>')) # https://github.com/html5lib/html5lib-python/issues/203 mentions this but https://developer.mozilla.org/en-US/docs/Web/HTML/Element/menuitem shows non-void examples and says HTML 5.2 deprecates it
'<menuitem></menuitem>'
The upstream issue is html5lib/html5lib-python#203. Not sure what html5lib's position on deprecated elements is.
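To illustrate why membership in the void-element list decides the output seen in the session above, here is a minimal standard-library sketch. It is not html5lib's or bleach's actual code: `BalancingSerializer` and `balance` are hypothetical names, the serializer only handles isolated start tags, and the void set is taken from the HTML Living Standard rather than from html5lib's constants.

```python
from html.parser import HTMLParser

# Void elements per the HTML Living Standard (assumption: this is NOT
# the list html5lib 1.1 ships; it is used here only for illustration).
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


class BalancingSerializer(HTMLParser):
    """Toy serializer: re-emit each start tag and immediately add an
    end tag unless the element is in the supplied void set."""

    def __init__(self, void_elements):
        super().__init__(convert_charrefs=True)
        self.void_elements = void_elements
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes and text are ignored to keep the sketch short.
        self.out.append(f"<{tag}>")
        if tag not in self.void_elements:
            self.out.append(f"</{tag}>")


def balance(html, void_elements=VOID_ELEMENTS):
    """Hypothetical helper: serialize `html` with the toy serializer."""
    parser = BalancingSerializer(void_elements)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(balance("<area>"))                          # <area>
print(balance("<wbr>"))                           # <wbr>
# Dropping wbr from the void set reproduces the reported symptom.
print(balance("<wbr>", VOID_ELEMENTS - {"wbr"}))  # <wbr></wbr>
```

With `wbr` missing from the set, the serializer treats it like any other element and closes it, which matches the `<wbr></wbr>` output above. `keygen` and `menuitem` are not in the Living Standard list, so this sketch would balance them too.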
This is now addressed in html5lib:
Waiting on an html5lib release with this fix. Then we can update the vendored html5lib and test everything.
The <wbr> element is balanced by bleach.clean even though it is an empty element.
Using the list of empty tags from MDN:
The output includes <wbr></wbr> when it should just be <wbr> like the others.
keygen has the same problem, but that's deprecated, so I'm not sure if it's worth including.