The `PermitScrubber` class treats `ProcessingInstruction`s as if they were `Element`s. (Strictly speaking, it doesn't check the node type at all beyond special handling for CDATA nodes; it just checks the `name` field.) This leads to some odd behavior:
```ruby
require 'rails-html-sanitizer'

sanitizer = Rails::Html::SafeListSanitizer.new
# Sanitizer is using PermitScrubber with default properties
# The actual issue is within PermitScrubber, but for simplicity in showing the
# issue, these examples use a SafeListSanitizer
sanitizer.sanitize('<a href=example.html>Expected link</a><?a href=pi.html>not expected PI')
# => "<a href=\"example.html\">Expected link</a><?a href=pi.html>not expected PI"
```
However, this is because the PI has a white-listed `name`. So, for example:
I haven't figured out any way to do anything evil with this in modern browsers because modern browsers seem to correctly parse and ignore processing instructions. However, the contents of the PI are passed through to the final result unmodified, so:
There was apparently an issue in older versions of Internet Explorer where PIs were not parsed correctly, potentially allowing the script to be run. It does not appear to run in modern browsers, though. (Chrome just changes the PIs into comments.)
It's worth noting that, yes, HTML has processing instructions, because SGML has processing instructions. `<?a something>` is a valid HTML processing instruction. It doesn't do anything, but it's valid SGML and may be parsed as such. (Unlike XML PIs, SGML PIs end with just `>` and not `?>`.)
The solution is likely to update the `scrub` method in `PermitScrubber` to remove PIs. Alternatively, the `allowed_node?` method could be updated to determine if the node is really an element before checking it against the whitelist.
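To make the second option concrete, here is a minimal sketch of the difference between a name-only check and an element-aware check. It deliberately does not use the real library: `Node`, `ALLOWED_TAGS`, `allowed_by_name_only?`, and `allowed_element?` are hypothetical stand-ins invented for illustration, so the snippet runs without any gems. The actual `PermitScrubber` operates on parsed nodes and may need to handle more cases than this.

```ruby
require "set"

# Hypothetical stand-in for a parsed node: only a name and a node kind.
# This is an assumption for illustration, not the library's node class.
Node = Struct.new(:name, :kind) do
  def element?
    kind == :element
  end
end

# Illustrative allow-list only; not the sanitizer's real default list.
ALLOWED_TAGS = Set.new(%w[a b em strong]).freeze

# Behaviour described in this report: only the node's name is consulted,
# so any node kind with an allowed name passes.
def allowed_by_name_only?(node)
  ALLOWED_TAGS.include?(node.name)
end

# Proposed behaviour: the node must be an element before its name counts.
def allowed_element?(node)
  node.element? && ALLOWED_TAGS.include?(node.name)
end

link = Node.new("a", :element)
pi   = Node.new("a", :processing_instruction)

puts allowed_by_name_only?(pi) # the PI slips through on its name alone
puts allowed_element?(pi)      # the PI is rejected
puts allowed_element?(link)    # the real <a> element is still allowed
```

The design point is only that the node-type test comes first; whether the fix belongs in `scrub` or in `allowed_node?` is left open, as above.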
Some scrubbers want to allow comments through, but in v1.4.0 didn't
get the chance because only elements were passed through to
`keep_node?`.
This change allows comments and elements through, but still omits
other non-elements like processing instructions (see #115).