Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible code injection #21

Closed
mindhaq opened this issue Mar 10, 2018 · 6 comments
Closed

Possible code injection #21

mindhaq opened this issue Mar 10, 2018 · 6 comments

Comments

@mindhaq
Copy link

mindhaq commented Mar 10, 2018

When using autolink on a text including a link like this one

www.google.com"onclick="alert('gotcha!')

And render the output as it is suggested in the example:

String result = Autolink.renderLinks(input, links, (link, text, sb) -> {
    sb.append("<a href=\"");
    sb.append(text, link.getBeginIndex(), link.getEndIndex());
    sb.append("\">");
    sb.append(text, link.getBeginIndex(), link.getEndIndex());
    sb.append("</a>");
});

the output will be

<a href="www.google.com"onclick="alert('gotcha!')">www.google.com"onclick="alert('gotcha!')</a>"

which is strictly speaking invalid HTML, but browsers will still execute the click handler. See https://jsfiddle.net/vLjLLo8n/2/ to try it out.

I understand that appending a subsequence to the StringBuilder is the more efficient than providing the link as a String, but to make this secure, you would need to get the substring and perform encoding on it.

So, for example using OWASP Java Encoder, the rendering needs to be done like this:

String result = Autolink.renderLinks(input, links, (link, text, sb) -> {
    String linkString = new StringBuilder().append(text, link.getBeginIndex(), link.getEndIndex()).toString();

    sb.append("<a href=\"");
    sb.append(Encode.forHtmlAttribute(linkString));
    sb.append("\">");
    sb.append(Encode.forHtml(linkString));
    sb.append("</a>");
});

resulting in a safe output:

<a href="www.google.com&#34;onclick=&#34;alert(&#39;gotcha!&#39;)">www.google.com&#34;onclick=&#34;alert(&#39;gotcha!&#39;)</a>

Easiest fix for this particular problem would probably be if autolink would not include single or double quotes, or any other character not legal in a URL.
(EDIT: single quotes are legal characters)

A possibly breaking API change would be to provide the linkString as part of the LinkSpan interface.

@robinst
Copy link
Owner

robinst commented Mar 19, 2018

Yeah, I'm fully aware of this, that's why there is a note for that example:

Note that it doesn't handle escaping at all

But maybe I should either:

  • Not include this example at all, as it's dangerous if someone copied it like that
  • Change the example so that it escapes all parts of text
  • Change the example so that it parses the input as HTML and only finds links in text (to prevent double linking)

Note that even your suggested change of using Encode would have problems in the real world, as it wouldn't escape text outside links.

What do you think?

@mindhaq
Copy link
Author

mindhaq commented Mar 19, 2018

First, the URLs really should not include characters not allowed in URLs. According to this post on Stackoverflow double quotes are not legal.

Giving an example in the readme is important, I wouldn't have learned about the AutoLink class otherwise. As a general effort to spread knowledge about important security best practices, it's maybe a good thing to include usage of Encode.

If you are using the library as a tool to create HTML out of some kind of plaintext, that plaintext needs to be sanitized before using it to autolink. That's what I do in my real world application; only during autolinking some of it becomes unwanted code again.

@robinst
Copy link
Owner

robinst commented Mar 20, 2018

I was following Rinku's behavior in this case. Having said that, I don't think stopping URL's at " would be a problem in practice, so I'm open to changing it. Or did you want to work on that?

I'll update the example when I find some time.

robinst added a commit that referenced this issue May 28, 2018
robinst added a commit that referenced this issue May 28, 2018
With spans, the code for renderLinks can just be written as a normal
loop with an if statement. This makes it possible to render the text
between links in a special way too (e.g. escape it).
@robinst
Copy link
Owner

robinst commented May 28, 2018

Ok, I've:

Iterable<Span> spans = linkExtractor.extractSpans(input);

StringBuilder sb = new StringBuilder();
for (Span span : spans) {
    String text = input.substring(span.getBeginIndex(), span.getEndIndex());
    if (span instanceof LinkSpan) {
        // span is a URL
        sb.append("<a href=\"");
        sb.append(Encode.forHtmlAttribute(text));
        sb.append("\">");
        sb.append(Encode.forHtml(text));
        sb.append("</a>");
    } else {
        // span is plain text before/after link
        sb.append(Encode.forHtml(text));
    }
}
result = sb.toString();

@robinst
Copy link
Owner

robinst commented Jun 4, 2018

@mindhaq nothing? Ok. I'm gonna merge that PR.

robinst added a commit that referenced this issue Jun 4, 2018
Add extractSpans method so that we can deprecate renderLinks (#21)
@robinst
Copy link
Owner

robinst commented Jun 18, 2018

I have since released this change as 0.9.0, see CHANGELOG: https://github.com/robinst/autolink-java/blob/master/CHANGELOG.md#090---2018-06-04

@robinst robinst closed this as completed Jun 18, 2018
xzel23 pushed a commit to xzel23/autolink-java that referenced this issue Mar 2, 2020
xzel23 pushed a commit to xzel23/autolink-java that referenced this issue Mar 2, 2020
)

With spans, the code for renderLinks can just be written as a normal
loop with an if statement. This makes it possible to render the text
between links in a special way too (e.g. escape it).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants