Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using a & in an image URL makes text2confl crash #183

Closed
feliksik opened this issue Jun 19, 2024 · 6 comments
Closed

using a & in an image URL makes text2confl crash #183

feliksik opened this issue Jun 19, 2024 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@feliksik
Copy link

imagine the following asciidoc:

= test

image::https://example.com/image?id=12&param=b[image]

https://example.com/image?id=12&param=b[link]

this results in the following Confluence XML:

<div class="openblock">
<div class="content">

</div>
</div>
<p><ac:image ac:alt="image"><ri:url ri:value="https://example.com/image?id=12&param=b" /></ac:image></p>
<p><a href="https://example.com/image?id=12&amp;param=b">link</a></p>

And it makes the tool crash:

Some pages content is invalid:
1. ./test.adoc: [6:84] The reference to entity "param" must end with the ';' delimiter.

The cause is that & in an XML attribute must apparently be escaped. As we can see, this is done for the <a href, but not for the <ri:url ri:value attribute.

@zeldigas zeldigas added the bug Something isn't working label Jun 19, 2024
@zeldigas
Copy link
Owner

@feliksik do you know if confluence itself is ready to process such content?

@feliksik
Copy link
Author

feliksik commented Jun 20, 2024

Confluence is ready to process content with the escaped variant;

  <ac:image ac:alt="image">
    <ri:url ri:value="https://example.com/image.png?id=12&amp;param=b" />
  </ac:image>

This is properly rendered a an image to the unescaped variant (and it's the only valid XML to represent an &).

If the current implementation would not trip over the XML validation, Confluence actually accepts the invalid (unescaped) XML, and then converts it to store it as an escaped &amp; in the storage format (but it would still be ugly to depend on Confluence fixing it).

I just discovered that with an inline image (i.e., using image: instead of image::), it does already work as it should:

= test

image:https://example.com/image.png?id=12&param=b[image]

is converted to

  <ac:image ac:alt="image">
    <ri:url ri:value="https://example.com/image.png?id=12&amp;param=b" />
  </ac:image>

@zeldigas zeldigas self-assigned this Jun 22, 2024
@zeldigas
Copy link
Owner

@feliksik fixed and released as 0.17.0

@feliksik
Copy link
Author

feliksik commented Jul 4, 2024

The following also breaks with some pages content is invalid: ./bug.adoc: [13:47] The reference to entity "param" must end with the ';' delimiter.


++++
<a href="https://example.com/image?id=12&param=b">this links breaks text2confl</a>
++++

I think we should conclude that the proper approach is not to to have macro-specific escapings, like we did before, but instead, only do a replacement of the & in the html output, because it will be treated as XML, which will first unescape the &.

I'm not entirely sure though - alternatively, we could also apply the mechanism used above for literal sections (++++, +++, and i believe ~~~ or something).

@zeldigas
Copy link
Owner

zeldigas commented Jul 5, 2024

@feliksik please create separate issue for this finding. The challenge for asciidoc here is that some rendered stuff is escaped while other parts are not, so it's not possible just "normalize" whole output as some post processing step. Maybe i'm missing something, but this is my current understanding..

@feliksik
Copy link
Author

feliksik commented Jul 5, 2024

I think I might have been mistaken. It look like a href-attribute in an HTML anchor needs & characters to be escaped as &amp;, so it's really the responsibility of the author of the passthrough block to provide this to make sure the AsciiDoc HTML output is correct -- that is, the fact that's it's later interpreted/treated as XML does not add this as a new requirement. So I think issue regarding the ++++ should be ignored. Thanks Dmitry!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants