Autolinks extension should ignore URIs inside link descriptions #97

Open
kukimik opened this issue Sep 14, 2022 · 1 comment

@kukimik
Contributor

kukimik commented Sep 14, 2022

Calling:

commonmark-cli -x autolinks <<EOF
[https://www.website.com](https://www.website.com#something)

[[email protected]](mailto:[email protected]?subject=Some%20subject)

[A website similar to https://www.foo.com and https://www.bar.com](https://www.baz.com)
EOF

results in (note the nested <a> tags):

<p><a href="https://www.website.com#something"><a href="https://www.website.com">https://www.website.com</a></a></p>
<p><a href="mailto:[email protected]?subject=Some%20subject"><a href="mailto:[email protected]">[email protected]</a></a></p>
<p><a href="https://www.baz.com">A website similar to <a href="https://www.foo.com">https://www.foo.com</a> and <a href="https://www.bar.com">https://www.bar.com</a></a></p>

while I would expect

<p><a href="https://www.website.com#something">https://www.website.com</a></p>
<p><a href="mailto:[email protected]?subject=Some%20subject">[email protected]</a></p>
<p><a href="https://www.baz.com">A website similar to https://www.foo.com and https://www.bar.com</a></p>

One reason is that nested links are illegal in HTML5 and HTML4.
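
To check generated output for this problem, something like the following could be used. It is only a rough sketch in Python using the standard library's `html.parser`; the names `NestedAnchorFinder` and `nested_links` are made up for illustration and are not part of commonmark-cli or any related tool.

```python
from html.parser import HTMLParser


class NestedAnchorFinder(HTMLParser):
    """Collects the href of every <a> that opens while another <a> is still open."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of currently open <a> elements
        self.nested = []    # hrefs of anchors found inside another anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            if self.depth > 0:
                self.nested.append(dict(attrs).get("href"))
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.depth > 0:
            self.depth -= 1


def nested_links(html_text):
    """Return the hrefs of all anchors nested inside another anchor."""
    finder = NestedAnchorFinder()
    finder.feed(html_text)
    finder.close()
    return finder.nested


actual = (
    '<p><a href="https://www.baz.com">A website similar to '
    '<a href="https://www.foo.com">https://www.foo.com</a> and '
    '<a href="https://www.bar.com">https://www.bar.com</a></a></p>'
)
expected = (
    '<p><a href="https://www.baz.com">A website similar to '
    'https://www.foo.com and https://www.bar.com</a></p>'
)

print(nested_links(actual))    # the two inner autolinks are reported
print(nested_links(expected))  # nothing is reported
```

`html.parser` reports tags as they appear in the text and does not apply the HTML5 tree-construction fixups a browser would, which is why the nesting is still visible to it.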

This bit me in srid/emanote#349.

@jgm
Owner

jgm commented Sep 19, 2022

Related issue about explicit autolinks: commonmark/commonmark-spec#719

Actually this may be a bit hard to achieve, given the architecture used in this library. If we were parsing to an AST, we could simply substitute any links in the link description for their associated link text. But this library allows you to parse directly to an output format, so this isn't possible in general. Moreover, we don't know whether a bit of text is part of a link description until AFTER we've parsed it as an autolink (since the matching of brackets takes place at a later stage).

If you parse to an AST (which is possible, just not required, with this library), then you can always walk the document after parsing and remove links inside links.
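
As a rough illustration of that post-parse walk, here is a minimal sketch in Python over a toy inline AST. The `Text` and `Link` types and the `unlink_nested` and `render` functions are hypothetical stand-ins invented for this example; they are not this library's types or API, and a real AST would have many more node kinds to recurse through.

```python
import html
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Text:
    value: str


@dataclass
class Link:
    target: str
    children: List["Inline"] = field(default_factory=list)


Inline = Union[Text, Link]  # toy AST: only the two node kinds needed here


def unlink_nested(nodes, inside_link=False):
    """Return a copy of `nodes` in which any link found inside another
    link is replaced by its own children (its link text)."""
    result = []
    for node in nodes:
        if isinstance(node, Link):
            children = unlink_nested(node.children, inside_link=True)
            if inside_link:
                result.extend(children)  # drop the inner link, keep its text
            else:
                result.append(Link(node.target, children))
        else:
            result.append(node)
    return result


def render(nodes):
    """Render the toy AST to HTML."""
    parts = []
    for node in nodes:
        if isinstance(node, Link):
            href = html.escape(node.target, quote=True)
            parts.append(f'<a href="{href}">{render(node.children)}</a>')
        else:
            parts.append(html.escape(node.value))
    return "".join(parts)


# The third example from the report, as the autolinks extension parses it.
doc = [
    Link("https://www.baz.com", [
        Text("A website similar to "),
        Link("https://www.foo.com", [Text("https://www.foo.com")]),
        Text(" and "),
        Link("https://www.bar.com", [Text("https://www.bar.com")]),
    ])
]

cleaned = unlink_nested(doc)
print(render(cleaned))
```

The walk keeps the outermost link and flattens every link beneath it, which gives the output expected in the report for that example.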
