From e6dd2eabcaa0934b226d69cbbda3958be106361a Mon Sep 17 00:00:00 2001 From: Mike Samuel Date: Mon, 15 Jun 2020 11:44:51 -0400 Subject: [PATCH 1/9] s/master/main/ for default branch --- RELEASE-checklist.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE-checklist.sh b/RELEASE-checklist.sh index 9315800d..c871ccf8 100644 --- a/RELEASE-checklist.sh +++ b/RELEASE-checklist.sh @@ -92,7 +92,7 @@ find . -name pom.xml \ git commit -am "Bumped dev version" -git push origin master --tags +git push origin main --tags # Now Release echo '1. Go to oss.sonatype.org' From f3f56d4d2555f55f46af39b0c074eb0888b43900 Mon Sep 17 00:00:00 2001 From: Mike Samuel Date: Mon, 15 Jun 2020 12:04:58 -0400 Subject: [PATCH 2/9] Release candidate 20200615.1 --- README.md | 10 +++++----- aggregate/pom.xml | 4 ++-- change_log.md | 7 +++++-- docs/getting_started.md | 10 +++++----- docs/maven.md | 2 +- empiricism/pom.xml | 4 ++-- html-types/pom.xml | 4 ++-- parent/pom.xml | 2 +- pom.xml | 2 +- 9 files changed, 24 insertions(+), 21 deletions(-) diff --git a/README.md b/README.md index abdf69b0..dd2b3533 100644 --- a/README.md +++ b/README.md @@ -35,7 +35,7 @@ how to get started with or without Maven. 
## Prepackaged Policies You can use -[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/Sanitizers.html): +[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/Sanitizers.html): ```Java PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS); @@ -47,7 +47,7 @@ String safeHTML = policy.sanitize(untrustedHTML); The [tests](https://github.com/OWASP/java-html-sanitizer/blob/master/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java) show how to configure your own -[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/HtmlPolicyBuilder.html): +[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlPolicyBuilder.html): ```Java PolicyFactory policy = new HtmlPolicyBuilder() @@ -62,7 +62,7 @@ String safeHTML = policy.sanitize(untrustedHTML); ## Custom Policies You can write -[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/ElementPolicy.html) +[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/ElementPolicy.html) to do things like changing `h1`s to `div`s with a certain class: ```Java @@ -85,7 +85,7 @@ need to be explicitly whitelisted using the `allowWithoutAttributes()` method if you want them to be allowed through the filter when these elements do not include any attributes. -[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/AttributePolicy.html) allow running custom code too. 
Adding an attribute policy will not water down any default policy like `style` or URL attribute checks. +[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/AttributePolicy.html) allow running custom code too. Adding an attribute policy will not water down any default policy like `style` or URL attribute checks. ```Java new HtmlPolicyBuilder = new HtmlPolicyBuilder() @@ -153,7 +153,7 @@ of the output. ## Telemetry -When a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/HtmlChangeListener.html). +When a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlChangeListener.html). You can use this to keep track of policy violation trends and find out when someone is making an effort to breach your security. diff --git a/aggregate/pom.xml b/aggregate/pom.xml index 2673aa37..f41ff770 100644 --- a/aggregate/pom.xml +++ b/aggregate/pom.xml @@ -3,12 +3,12 @@ com.googlecode.owasp-java-html-sanitizer aggregate pom - 20191001.2-SNAPSHOT + 20200615.1 ../parent com.googlecode.owasp-java-html-sanitizer parent - 20191001.2-SNAPSHOT + 20200615.1 diff --git a/change_log.md b/change_log.md index 8cd2b6a4..ce66fe97 100644 --- a/change_log.md +++ b/change_log.md @@ -1,8 +1,11 @@ # OWASP Java HTML Sanitizer Change Log Most recent at top. - * Pending - * Fix table formatting + * Release 20200615.1 + * Change `.and` when combining two policies to respect explicit `skipIfEmpty` decisions. + * HTML entity decoding now follows HTML standard rules about when a semicolon is optional. 
+ [Fixes #193](https://github.com/OWASP/java-html-sanitizer/issues/193) + * Fix table formatting [#137](https://github.com/OWASP/java-html-sanitizer/issues/137) * Release 20191001.1 * Package as an OSGI bundle * Release 20190610.1 diff --git a/docs/getting_started.md b/docs/getting_started.md index a46f254d..103b4e01 100644 --- a/docs/getting_started.md +++ b/docs/getting_started.md @@ -30,16 +30,16 @@ it to HTML. The [javadoc](http://javadoc.io/doc/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/) covers more detailed topics, including -[customization](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/HtmlPolicyBuilder.html). +[customization](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlPolicyBuilder.html). Important classes are: - * [Sanitizers](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/Sanitizers.html) contains combinable pre-packaged policies. - * [HtmlPolicyBuilder](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/HtmlPolicyBuilder.html) lets you easily build custom policies. + * [Sanitizers](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/Sanitizers.html) contains combinable pre-packaged policies. + * [HtmlPolicyBuilder](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlPolicyBuilder.html) lets you easily build custom policies. 
For advanced use, see: - * [AttributePolicy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/AttributePolicy.html) and [ElementPolicy](http://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20180219.1/org/owasp/html/ElementPolicy.html) allow complex customization. - * [HtmlStreamEventReceiver](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/org/owasp/html/HtmlStreamEventReceiver.html) if you don't just want a `String` as output. + * [AttributePolicy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/AttributePolicy.html) and [ElementPolicy](http://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20180219.1/org/owasp/html/ElementPolicy.html) allow complex customization. + * [HtmlStreamEventReceiver](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlStreamEventReceiver.html) if you don't just want a `String` as output. ## Asking Questions diff --git a/docs/maven.md b/docs/maven.md index 4f2ad1bd..192db266 100644 --- a/docs/maven.md +++ b/docs/maven.md @@ -23,7 +23,7 @@ Bigger numbers are more recent and the [change log](../change_log.md) can shed light on the salient differences. You should be able to build with the HTML sanitizer. You can read the -[javadoc](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20191001.1/index.html), +[javadoc](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/index.html), and if you have questions that aren't answered by these wiki pages, you can ask on the [mailing list](http://groups.google.com/group/owasp-java-html-sanitizer-support). 
diff --git a/empiricism/pom.xml b/empiricism/pom.xml index f62386e0..ce286f65 100644 --- a/empiricism/pom.xml +++ b/empiricism/pom.xml @@ -2,13 +2,13 @@ 4.0.0 com.googlecode.owasp-java-html-sanitizer html-types - 20191001.2-SNAPSHOT + 20200615.1 jar ../parent com.googlecode.owasp-java-html-sanitizer parent - 20191001.2-SNAPSHOT + 20200615.1 empiricism diff --git a/html-types/pom.xml b/html-types/pom.xml index 2e454f9b..3637bf64 100644 --- a/html-types/pom.xml +++ b/html-types/pom.xml @@ -2,13 +2,13 @@ 4.0.0 com.googlecode.owasp-java-html-sanitizer html-types - 20191001.2-SNAPSHOT + 20200615.1 bundle ../parent com.googlecode.owasp-java-html-sanitizer parent - 20191001.2-SNAPSHOT + 20200615.1 OWASP Java HTML Sanitizer Safe HTML Compatibility diff --git a/parent/pom.xml b/parent/pom.xml index e98cf6ad..6c2d68be 100644 --- a/parent/pom.xml +++ b/parent/pom.xml @@ -2,7 +2,7 @@ 4.0.0 com.googlecode.owasp-java-html-sanitizer parent - 20191001.2-SNAPSHOT + 20200615.1 pom diff --git a/pom.xml b/pom.xml index c9c25d82..ccb740c0 100644 --- a/pom.xml +++ b/pom.xml @@ -6,7 +6,7 @@ parent com.googlecode.owasp-java-html-sanitizer parent - 20191001.2-SNAPSHOT + 20200615.1 OWASP Java HTML Sanitizer From fd6b2ddbf7ded2ba4fc1d718e450dc701368cdba Mon Sep 17 00:00:00 2001 From: Mike Samuel Date: Mon, 15 Jun 2020 12:11:49 -0400 Subject: [PATCH 3/9] Bumped dev version --- aggregate/pom.xml | 4 ++-- empiricism/pom.xml | 4 ++-- html-types/pom.xml | 4 ++-- parent/pom.xml | 2 +- pom.xml | 2 +- 5 files changed, 8 insertions(+), 8 deletions(-) diff --git a/aggregate/pom.xml b/aggregate/pom.xml index f41ff770..1dac6b95 100644 --- a/aggregate/pom.xml +++ b/aggregate/pom.xml @@ -3,12 +3,12 @@ com.googlecode.owasp-java-html-sanitizer aggregate pom - 20200615.1 + 20200615.2-SNAPSHOT ../parent com.googlecode.owasp-java-html-sanitizer parent - 20200615.1 + 20200615.2-SNAPSHOT diff --git a/empiricism/pom.xml b/empiricism/pom.xml index ce286f65..2468149c 100644 --- a/empiricism/pom.xml +++ 
b/empiricism/pom.xml @@ -2,13 +2,13 @@ 4.0.0 com.googlecode.owasp-java-html-sanitizer html-types - 20200615.1 + 20200615.2-SNAPSHOT jar ../parent com.googlecode.owasp-java-html-sanitizer parent - 20200615.1 + 20200615.2-SNAPSHOT empiricism diff --git a/html-types/pom.xml b/html-types/pom.xml index 3637bf64..36b14d8f 100644 --- a/html-types/pom.xml +++ b/html-types/pom.xml @@ -2,13 +2,13 @@ 4.0.0 com.googlecode.owasp-java-html-sanitizer html-types - 20200615.1 + 20200615.2-SNAPSHOT bundle ../parent com.googlecode.owasp-java-html-sanitizer parent - 20200615.1 + 20200615.2-SNAPSHOT OWASP Java HTML Sanitizer Safe HTML Compatibility diff --git a/parent/pom.xml b/parent/pom.xml index 6c2d68be..95f95bd7 100644 --- a/parent/pom.xml +++ b/parent/pom.xml @@ -2,7 +2,7 @@ 4.0.0 com.googlecode.owasp-java-html-sanitizer parent - 20200615.1 + 20200615.2-SNAPSHOT pom diff --git a/pom.xml b/pom.xml index ccb740c0..c15a5c02 100644 --- a/pom.xml +++ b/pom.xml @@ -6,7 +6,7 @@ parent com.googlecode.owasp-java-html-sanitizer parent - 20200615.1 + 20200615.2-SNAPSHOT OWASP Java HTML Sanitizer From eb6ef0297b5a960ac25a85a6d8a56803ccc11a0c Mon Sep 17 00:00:00 2001 From: Mike Samuel Date: Mon, 13 Jul 2020 11:27:08 -0400 Subject: [PATCH 4/9] =?UTF-8?q?Do=20not=20lcase=20element=20or=20attribute?= =?UTF-8?q?=20names=20that=20match=20SVG=20or=20MathML=20name=E2=80=A6=20(?= =?UTF-8?q?#206)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Do not lcase element or attribute names that match SVG or MathML names exactly > Currently all names are converted to lowercase which is ok when > you're using it for HTML only, but if there is an SVG image nested > inside the HTML it breaks. For example, when `viewBox` attribute is > converted to `viewbox` the image is not displayed correctly. 
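The breakage quoted above can be reproduced outside the library. What follows is a minimal standalone sketch, not the sanitizer's own code: the class name `CanonicalNameSketch`, the method `lowerCaseEverything`, and the two-entry set are hypothetical stand-ins, and `java.util.Set` is used in place of whatever collection the library uses. It contrasts lower-casing every name with preserving names that match a known mixed-case SVG name exactly.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the name canonicalization discussed in this commit.
final class CanonicalNameSketch {
  // Assumed, deliberately tiny sample of mixed-case SVG attribute names.
  private static final Set<String> MIXED_CASE_ATTRS =
      Set.of("viewBox", "preserveAspectRatio");

  // Old behaviour: lower-case anything that is not name-spaced.
  static String lowerCaseEverything(String name) {
    return name.indexOf(':') >= 0 ? name : name.toLowerCase(Locale.ROOT);
  }

  // New behaviour: also keep names that match a known mixed-case name exactly.
  static String canonicalAttributeName(String name) {
    return name.indexOf(':') >= 0 || MIXED_CASE_ATTRS.contains(name)
        ? name
        : name.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(lowerCaseEverything("viewBox"));       // viewbox
    System.out.println(canonicalAttributeName("viewBox"));    // viewBox
    System.out.println(canonicalAttributeName("VIEWBOX"));    // viewbox
    System.out.println(canonicalAttributeName("xlink:href")); // xlink:href
  }
}
```

Because the match is exact, a name such as `VIEWBOX` is still lower-cased in this sketch; only the spelling listed in the set survives unchanged.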
This commit splits *HtmlLexer*.*canonicalName* into variants which preserve items on whitelists derived from the SVG and MathML specifications, and adjusts callers of *canonicalName* to use the appropriate variant. Fixes #182 * add unittests for mixed-case SVG names --- .../html/ElementAndAttributePolicies.java | 2 +- ...ndAttributePolicyBasedSanitizerPolicy.java | 2 +- src/main/java/org/owasp/html/HtmlLexer.java | 155 ++++++++++++++++-- .../org/owasp/html/HtmlPolicyBuilder.java | 18 +- .../java/org/owasp/html/HtmlSanitizer.java | 6 +- .../org/owasp/html/HtmlStreamRenderer.java | 6 +- .../TagBalancingHtmlStreamEventReceiver.java | 4 +- .../org/owasp/html/HtmlPolicyBuilderTest.java | 19 +++ 8 files changed, 182 insertions(+), 30 deletions(-) diff --git a/src/main/java/org/owasp/html/ElementAndAttributePolicies.java b/src/main/java/org/owasp/html/ElementAndAttributePolicies.java index 219a3d80..8f2cdca6 100644 --- a/src/main/java/org/owasp/html/ElementAndAttributePolicies.java +++ b/src/main/java/org/owasp/html/ElementAndAttributePolicies.java @@ -35,7 +35,7 @@ /** * Encapsulates all the information needed by the - * {@link ElementAndAttributePolicySanitizerPolicy} to sanitize one kind + * {@link ElementAndAttributePolicyBasedSanitizerPolicy} to sanitize one kind * of element. 
*/ @Immutable diff --git a/src/main/java/org/owasp/html/ElementAndAttributePolicyBasedSanitizerPolicy.java b/src/main/java/org/owasp/html/ElementAndAttributePolicyBasedSanitizerPolicy.java index d578221f..c8da4617 100644 --- a/src/main/java/org/owasp/html/ElementAndAttributePolicyBasedSanitizerPolicy.java +++ b/src/main/java/org/owasp/html/ElementAndAttributePolicyBasedSanitizerPolicy.java @@ -144,7 +144,7 @@ public void openTag(String elementName, List attrs) { adjustedElementName = policies.elPolicy.apply(elementName, attrs); if (adjustedElementName != null) { - adjustedElementName = HtmlLexer.canonicalName(adjustedElementName); + adjustedElementName = HtmlLexer.canonicalElementName(adjustedElementName); } } else { adjustedElementName = null; diff --git a/src/main/java/org/owasp/html/HtmlLexer.java b/src/main/java/org/owasp/html/HtmlLexer.java index ed326b15..00fcc7dc 100644 --- a/src/main/java/org/owasp/html/HtmlLexer.java +++ b/src/main/java/org/owasp/html/HtmlLexer.java @@ -55,11 +55,27 @@ public HtmlLexer(String input) { /** * Normalize case of names that are not name-spaced. This lower-cases HTML - * element and attribute names, but not ones for embedded SVG or MATHML. + * element names, but not ones for embedded SVG or MathML. */ - static String canonicalName(String elementOrAttribName) { - return elementOrAttribName.indexOf(':') >= 0 - ? elementOrAttribName : Strings.toLowerCase(elementOrAttribName); + static String canonicalElementName(String elementName) { + return elementName.indexOf(':') >= 0 || mixedCaseForeignElementNames.contains(elementName) + ? elementName : Strings.toLowerCase(elementName); + } + + /** + * Normalize case of names that are not name-spaced. This lower-cases HTML + * attribute names, but not ones for embedded SVG or MathML. + */ + static String canonicalAttributeName(String attribName) { + return attribName.indexOf(':') >= 0 || mixedCaseForeignAttributeNames.contains(attribName) + ? 
attribName : Strings.toLowerCase(attribName); + } + + /** + * Normalize case of keywords in attribute values. + */ + public static String canonicalKeywordAttributeValue(String keywordValue) { + return Strings.toLowerCase(keywordValue); } /** @@ -243,9 +259,7 @@ private void pushbackToken(HtmlToken token) { /** Can the attribute appear in HTML without a value. */ private static boolean isValuelessAttribute(String attribName) { - boolean valueless = VALUELESS_ATTRIB_NAMES.contains( - Strings.toLowerCase(attribName)); - return valueless; + return VALUELESS_ATTRIB_NAMES.contains(canonicalAttributeName(attribName)); } // From http://issues.apache.org/jira/browse/XALANC-519 @@ -253,6 +267,125 @@ private static boolean isValuelessAttribute(String attribName) { "checked", "compact", "declare", "defer", "disabled", "ismap", "multiple", "nohref", "noresize", "noshade", "nowrap", "readonly", "selected"); + + private static final ImmutableSet mixedCaseForeignAttributeNames = ImmutableSet.of( + "attributeName", + "attributeType", + "baseFrequency", + "baseProfile", + "calcMode", + "clipPathUnits", + "contentScriptType", + "defaultAction", + "definitionURL", + "diffuseConstant", + "edgeMode", + "externalResourcesRequired", + "filterUnits", + "focusHighlight", + "gradientTransform", + "gradientUnits", + "initialVisibility", + "kernelMatrix", + "kernelUnitLength", + "keyPoints", + "keySplines", + "keyTimes", + "lengthAdjust", + "limitingConeAngle", + "markerHeight", + "markerUnits", + "markerWidth", + "maskContentUnits", + "maskUnits", + "mediaCharacterEncoding", + "mediaContentEncodings", + "mediaSize", + "mediaTime", + "numOctaves", + "pathLength", + "patternContentUnits", + "patternTransform", + "patternUnits", + "playbackOrder", + "pointsAtX", + "pointsAtY", + "pointsAtZ", + "preserveAlpha", + "preserveAspectRatio", + "primitiveUnits", + "refX", + "refY", + "repeatCount", + "repeatDur", + "requiredExtensions", + "requiredFeatures", + "requiredFonts", + "requiredFormats", + 
"schemaLocation", + "snapshotTime", + "specularConstant", + "specularExponent", + "spreadMethod", + "startOffset", + "stdDeviation", + "stitchTiles", + "surfaceScale", + "syncBehavior", + "syncBehaviorDefault", + "syncMaster", + "syncTolerance", + "syncToleranceDefault", + "systemLanguage", + "tableValues", + "targetX", + "targetY", + "textLength", + "timelineBegin", + "transformBehavior", + "viewBox", + "xChannelSelector", + "yChannelSelector", + "zoomAndPan" + ); + + private static final ImmutableSet mixedCaseForeignElementNames = ImmutableSet.of( + "animateColor", + "animateMotion", + "animateTransform", + "clipPath", + "feBlend", + "feColorMatrix", + "feComponentTransfer", + "feComposite", + "feConvolveMatrix", + "feDiffuseLighting", + "feDisplacementMap", + "feDistantLight", + "feDropShadow", + "feFlood", + "feFuncA", + "feFuncB", + "feFuncG", + "feFuncR", + "feGaussianBlur", + "feImage", + "feMerge", + "feMergeNode", + "feMorphology", + "feOffset", + "fePointLight", + "feSpecularLighting", + "feSpotLight", + "feTile", + "feTurbulence", + "foreignObject", + "linearGradient", + "radialGradient", + "solidColor", + "textArea", + "textPath" + ); } /** @@ -311,7 +444,7 @@ protected HtmlToken produce() { switch (token.type) { case TAGBEGIN: { - String canonTagName = canonicalName( + String canonTagName = canonicalElementName( token.start + 1, token.end); if (HtmlTextEscapingMode.isTagFollowedByLiteralContent( canonTagName)) { @@ -478,7 +611,7 @@ private HtmlToken parseToken() { if (this.inEscapeExemptBlock && '/' == input.charAt(start + 1) && textEscapingMode != HtmlTextEscapingMode.PLAIN_TEXT - && canonicalName(start + 2, end) + && canonicalElementName(start + 2, end) .equals(escapeExemptTagName)) { this.inEscapeExemptBlock = false; this.escapeExemptTagName = null; @@ -612,8 +745,8 @@ && canonicalName(start + 2, end) return result; } - private String canonicalName(int start, int end) { - return HtmlLexer.canonicalName(input.substring(start, end)); + private String 
canonicalElementName(int start, int end) { + return HtmlLexer.canonicalElementName(input.substring(start, end)); } private static boolean isIdentStart(char ch) { diff --git a/src/main/java/org/owasp/html/HtmlPolicyBuilder.java b/src/main/java/org/owasp/html/HtmlPolicyBuilder.java index 46b42230..aa1b51a5 100644 --- a/src/main/java/org/owasp/html/HtmlPolicyBuilder.java +++ b/src/main/java/org/owasp/html/HtmlPolicyBuilder.java @@ -238,7 +238,7 @@ public HtmlPolicyBuilder allowElements( ElementPolicy policy, String... elementNames) { invalidateCompiledState(); for (String elementName : elementNames) { - elementName = HtmlLexer.canonicalName(elementName); + elementName = HtmlLexer.canonicalElementName(elementName); ElementPolicy newPolicy = ElementPolicy.Util.join( elPolicies.get(elementName), policy); // Don't remove if newPolicy is the always reject policy since we want @@ -286,7 +286,7 @@ public HtmlPolicyBuilder allowCommonBlockElements() { public HtmlPolicyBuilder allowTextIn(String... elementNames) { invalidateCompiledState(); for (String elementName : elementNames) { - elementName = HtmlLexer.canonicalName(elementName); + elementName = HtmlLexer.canonicalElementName(elementName); textContainers.put(elementName, true); } return this; @@ -305,7 +305,7 @@ public HtmlPolicyBuilder allowTextIn(String... elementNames) { public HtmlPolicyBuilder disallowTextIn(String... elementNames) { invalidateCompiledState(); for (String elementName : elementNames) { - elementName = HtmlLexer.canonicalName(elementName); + elementName = HtmlLexer.canonicalElementName(elementName); textContainers.put(elementName, false); } return this; @@ -321,7 +321,7 @@ public HtmlPolicyBuilder disallowTextIn(String... elementNames) { public HtmlPolicyBuilder allowWithoutAttributes(String... 
elementNames) { invalidateCompiledState(); for (String elementName : elementNames) { - elementName = HtmlLexer.canonicalName(elementName); + elementName = HtmlLexer.canonicalElementName(elementName); skipIssueTagMap.put(elementName, HtmlTagSkipType.DO_NOT_SKIP); } return this; @@ -336,7 +336,7 @@ public HtmlPolicyBuilder allowWithoutAttributes(String... elementNames) { public HtmlPolicyBuilder disallowWithoutAttributes(String... elementNames) { invalidateCompiledState(); for (String elementName : elementNames) { - elementName = HtmlLexer.canonicalName(elementName); + elementName = HtmlLexer.canonicalElementName(elementName); skipIssueTagMap.put(elementName, HtmlTagSkipType.SKIP); } return this; @@ -349,7 +349,7 @@ public HtmlPolicyBuilder disallowWithoutAttributes(String... elementNames) { public AttributeBuilder allowAttributes(String... attributeNames) { ImmutableList.Builder b = ImmutableList.builder(); for (String attributeName : attributeNames) { - b.add(HtmlLexer.canonicalName(attributeName)); + b.add(HtmlLexer.canonicalAttributeName(attributeName)); } return new AttributeBuilder(b.build()); } @@ -432,7 +432,7 @@ public HtmlPolicyBuilder requireRelsOnLinks(String... linkValues) { this.extraRelsForLinks = Sets.newLinkedHashSet(); } for (String linkValue : linkValues) { - linkValue = HtmlLexer.canonicalName(linkValue); + linkValue = HtmlLexer.canonicalKeywordAttributeValue(linkValue); Preconditions.checkArgument( !Strings.containsHtmlSpace(linkValue), "spaces in input. use f(\"foo\", \"bar\") not f(\"foo bar\")"); @@ -456,7 +456,7 @@ public HtmlPolicyBuilder skipRelsOnLinks(String... linkValues) { this.skipRelsForLinks = Sets.newLinkedHashSet(); } for (String linkValue : linkValues) { - linkValue = HtmlLexer.canonicalName(linkValue); + linkValue = HtmlLexer.canonicalKeywordAttributeValue(linkValue); Preconditions.checkArgument( !Strings.containsHtmlSpace(linkValue), "spaces in input. 
use f(\"foo\", \"bar\") not f(\"foo bar\")"); @@ -980,7 +980,7 @@ public HtmlPolicyBuilder globally() { public HtmlPolicyBuilder onElements(String... elementNames) { ImmutableList.Builder b = ImmutableList.builder(); for (String elementName : elementNames) { - b.add(HtmlLexer.canonicalName(elementName)); + b.add(HtmlLexer.canonicalElementName(elementName)); } return HtmlPolicyBuilder.this.allowAttributesOnElements( policy, attributeNames, b.build()); diff --git a/src/main/java/org/owasp/html/HtmlSanitizer.java b/src/main/java/org/owasp/html/HtmlSanitizer.java index f9e932e8..63d7ae95 100644 --- a/src/main/java/org/owasp/html/HtmlSanitizer.java +++ b/src/main/java/org/owasp/html/HtmlSanitizer.java @@ -152,7 +152,7 @@ public static void sanitize( break; case TAGBEGIN: if (htmlContent.charAt(token.start + 1) == '/') { // A close tag. - receiver.closeTag(HtmlLexer.canonicalName( + receiver.closeTag(HtmlLexer.canonicalElementName( htmlContent.substring(token.start + 2, token.end))); while (lexer.hasNext() && lexer.next().type != HtmlTokenType.TAGEND) { @@ -173,7 +173,7 @@ public static void sanitize( } else { attrsReadyForName = false; } - attrs.add(HtmlLexer.canonicalName( + attrs.add(HtmlLexer.canonicalAttributeName( htmlContent.substring(tagBodyToken.start, tagBodyToken.end))); break; case ATTRVALUE: @@ -191,7 +191,7 @@ public static void sanitize( attrs.add(attrs.getLast()); } receiver.openTag( - HtmlLexer.canonicalName( + HtmlLexer.canonicalElementName( htmlContent.substring(token.start + 1, token.end)), attrs); } diff --git a/src/main/java/org/owasp/html/HtmlStreamRenderer.java b/src/main/java/org/owasp/html/HtmlStreamRenderer.java index 019895a5..16645531 100644 --- a/src/main/java/org/owasp/html/HtmlStreamRenderer.java +++ b/src/main/java/org/owasp/html/HtmlStreamRenderer.java @@ -187,7 +187,7 @@ private void writeOpenTag( attrIt.hasNext();) { String name = attrIt.next(); String value = attrIt.next(); - name = HtmlLexer.canonicalName(name); + name = 
HtmlLexer.canonicalAttributeName(name); if (!isValidHtmlName(name)) { error("Invalid attr name", name); continue; @@ -234,7 +234,7 @@ public final void closeTag(String elementName) { private final void writeCloseTag(String uncanonElementName) throws IOException { if (!open) { throw new IllegalStateException(); } - String elementName = HtmlLexer.canonicalName(uncanonElementName); + String elementName = HtmlLexer.canonicalElementName(uncanonElementName); if (!isValidHtmlName(elementName)) { error("Invalid element name", elementName); return; @@ -386,7 +386,7 @@ static boolean isValidHtmlName(String name) { * that has more consistent semantics. */ static String safeName(String unsafeElementName) { - String elementName = HtmlLexer.canonicalName(unsafeElementName); + String elementName = HtmlLexer.canonicalElementName(unsafeElementName); // Substitute a reliably non-raw-text element for raw-text and // plain-text elements. diff --git a/src/main/java/org/owasp/html/TagBalancingHtmlStreamEventReceiver.java b/src/main/java/org/owasp/html/TagBalancingHtmlStreamEventReceiver.java index c94d5735..aa2eb075 100644 --- a/src/main/java/org/owasp/html/TagBalancingHtmlStreamEventReceiver.java +++ b/src/main/java/org/owasp/html/TagBalancingHtmlStreamEventReceiver.java @@ -97,7 +97,7 @@ public void openTag(String elementName, List attrs) { if (DEBUG) { dumpState("open " + elementName); } - String canonElementName = HtmlLexer.canonicalName(elementName); + String canonElementName = HtmlLexer.canonicalElementName(elementName); int elIndex = METADATA.indexForName(canonElementName); // Treat unrecognized tags as void, but emit closing tags in closeTag(). 
@@ -238,7 +238,7 @@ public void closeTag(String elementName) { if (DEBUG) { dumpState("close " + elementName); } - String canonElementName = HtmlLexer.canonicalName(elementName); + String canonElementName = HtmlLexer.canonicalElementName(elementName); int elIndex = METADATA.indexForName(canonElementName); if (elIndex == UNRECOGNIZED_TAG) { // Allow unrecognized end tags through. diff --git a/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java b/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java index a06fafee..db75e4c7 100644 --- a/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java +++ b/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java @@ -975,6 +975,25 @@ public static final void testTableStructure() { sanitized); } + @Test + public static final void testSvgNames() { + PolicyFactory policyFactory = new HtmlPolicyBuilder() + .allowElements("svg", "animateColor") + .allowAttributes("viewBox").onElements("svg") + .toFactory(); + String svg = ""; + assertEquals(svg, policyFactory.sanitize(svg)); + } + + @Test + public static final void testTextareaIsNotTextArea() { + String input = ""; + PolicyFactory textareaPolicy = new HtmlPolicyBuilder().allowElements("textarea").toFactory(); + PolicyFactory textAreaPolicy = new HtmlPolicyBuilder().allowElements("textArea").toFactory(); + assertEquals("y", textareaPolicy.sanitize(input)); + assertEquals("x", textAreaPolicy.sanitize(input)); + } + private static String apply(HtmlPolicyBuilder b) { return apply(b, EXAMPLE); } From 25c3d64c4c0764d86792a2e23b8f5498a449b9de Mon Sep 17 00:00:00 2001 From: Mike Samuel Date: Mon, 13 Jul 2020 11:37:22 -0400 Subject: [PATCH 5/9] Release candidate 20200713.1 --- README.md | 10 +++++----- aggregate/pom.xml | 4 ++-- change_log.md | 4 ++++ docs/getting_started.md | 10 +++++----- docs/maven.md | 2 +- empiricism/pom.xml | 4 ++-- html-types/pom.xml | 4 ++-- parent/pom.xml | 2 +- pom.xml | 2 +- 9 files changed, 23 insertions(+), 19 deletions(-) diff --git a/README.md 
b/README.md index dd2b3533..2f9708a2 100644 --- a/README.md +++ b/README.md @@ -35,7 +35,7 @@ how to get started with or without Maven. ## Prepackaged Policies You can use -[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/Sanitizers.html): +[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200713.1/org/owasp/html/Sanitizers.html): ```Java PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS); @@ -47,7 +47,7 @@ String safeHTML = policy.sanitize(untrustedHTML); The [tests](https://github.com/OWASP/java-html-sanitizer/blob/master/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java) show how to configure your own -[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlPolicyBuilder.html): +[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200713.1/org/owasp/html/HtmlPolicyBuilder.html): ```Java PolicyFactory policy = new HtmlPolicyBuilder() @@ -62,7 +62,7 @@ String safeHTML = policy.sanitize(untrustedHTML); ## Custom Policies You can write -[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/ElementPolicy.html) +[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200713.1/org/owasp/html/ElementPolicy.html) to do things like changing `h1`s to `div`s with a certain class: ```Java @@ -85,7 +85,7 @@ need to be explicitly whitelisted using the `allowWithoutAttributes()` method if you want them to be allowed through the filter when these elements do not include any attributes. 
-[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/AttributePolicy.html) allow running custom code too. Adding an attribute policy will not water down any default policy like `style` or URL attribute checks. +[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200713.1/org/owasp/html/AttributePolicy.html) allow running custom code too. Adding an attribute policy will not water down any default policy like `style` or URL attribute checks. ```Java new HtmlPolicyBuilder = new HtmlPolicyBuilder() @@ -153,7 +153,7 @@ of the output. ## Telemetry -When a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200615.1/org/owasp/html/HtmlChangeListener.html). +When a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20200713.1/org/owasp/html/HtmlChangeListener.html). You can use this to keep track of policy violation trends and find out when someone is making an effort to breach your security. diff --git a/aggregate/pom.xml b/aggregate/pom.xml index 1dac6b95..c41fc9a5 100644 --- a/aggregate/pom.xml +++ b/aggregate/pom.xml @@ -3,12 +3,12 @@ com.googlecode.owasp-java-html-sanitizer aggregate pom - 20200615.2-SNAPSHOT + 20200713.1 ../parent com.googlecode.owasp-java-html-sanitizer parent - 20200615.2-SNAPSHOT + 20200713.1 diff --git a/change_log.md b/change_log.md index ce66fe97..9619abc5 100644 --- a/change_log.md +++ b/change_log.md @@ -1,6 +1,10 @@ # OWASP Java HTML Sanitizer Change Log Most recent at top. + * Release 20200713.1 + * Do not lower-case SVG/MathML names. + This shouldn't cause problems since it was hard to write policies for + SVG, but be aware that SVG's `