From a5bc63407949f2169bd6e872228ce221a0f542dc Mon Sep 17 00:00:00 2001
From: Chris Morgan
Date: Thu, 26 Aug 2021 00:10:44 +1000
Subject: [PATCH] escape_html: stop encoding / as &#x2F; (+ more)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The OWASP XSS cheat sheet is a very badly-written document that hasn’t
been maintained and still contains various errors that have lasted more
than a decade, in some cases despite them being pointed out. Also the
section being quoted here was being misapplied anyway (it’s only for
*text*, not for attribute values, and it therefore escapes *way* more
than is needed). The entire document urgently needs to be completely
rewritten, but they’re not doing it. Hence in part my removal of any
citation of it.

One recently exorcised ancient error is the recommendation to escape
slashes: . That was *always* spurious, and I want it gone partly under
the principle of least encoding but mostly because I’m fed up with URLs
being uglified in this way.

I’ve also changed the escaping of ' from &#x27; to &apos;, because the
reason for avoiding &apos; is invalid (it was an accidental omission in
an early HTML5 spec, long since reinstated, and all user agents always
supported it).
---
 src/utils.rs | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/src/utils.rs b/src/utils.rs
index 67a39fec..49d5f270 100644
--- a/src/utils.rs
+++ b/src/utils.rs
@@ -1,21 +1,28 @@
 use crate::errors::Error;
 
-/// Escape HTML following [OWASP](https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet)
+/// Escape text for inclusion in HTML or XML body text or quoted attribute values.
 ///
-/// Escape the following characters with HTML entity encoding to prevent switching
-/// into any execution context, such as script, style, or event handlers. Using
-/// hex entities is recommended in the spec. In addition to the 5 characters
-/// significant in XML (&, <, >, ", '), the forward slash is included as it helps
-/// to end an HTML entity.
+/// This escapes more than is ever necessary in any given place, so that one method can be used for
+/// almost all forms of escaping ever needed in both HTML and XML. Here’s all that you actually *need*
+/// to escape:
 ///
-/// ```text
-/// & --> &amp;
-/// < --> &lt;
-/// > --> &gt;
-/// " --> &quot;
-/// ' --> &#x27;     &apos; is not recommended
-/// / --> &#x2F;     forward slash is included as it helps end an HTML entity
-/// ```
+/// - In HTML body text: `<` and `&`;
+/// - In HTML quoted attribute values: `&` and the quote (`'` or `"`);
+/// - In XML body text: `<`, `>` and `&`;
+/// - In XML quoted attribute values: `<`, `>`, `&` and the quote (`'` or `"`).
+///
+/// This method is only certified for use in these contexts. It may not be suitable in other
+/// contexts; for example, inside a ``, `