Newlines in attributes are not escaped #152

jwoudenberg · 2020-02-02T19:46:10Z

Hey there! Thank you everyone involved with creating and maintaining this library!

I'm trying to use this library to generate JUnit-style XML reports, which our CI system (Jenkins) then parses and presents in a pretty way to users. I've been running into an issue, though, which I believe has to do with how this library encodes newlines in attributes (it doesn't).

Given the following XML document description:

doc = 
  Document
    (Prologue [] Nothing [])
    (Element "root" (fromList [("attr", "line\nline")]) [])
    []

renderText produces the following XML:

<?xml version="1.0" encoding="UTF-8"?><root attr="line\nline"/>

Parsing this back using parseText recovers the original value. That sounds good, but it's different from the behavior defined in the spec, which requires a normalization phase before parsing that would turn newlines into spaces.
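To make the normalization step concrete, here is a minimal sketch of attribute-value normalization as described in the XML 1.0 spec (section 3.3.3), simplified to literal whitespace and numeric character references. normalizeAttr is a hypothetical helper written for illustration and is not part of xml-conduit; named entity references such as &amp;amp; and the extra trimming applied to non-CDATA attributes are left out.

```haskell
import Data.Char (chr, isDigit, isHexDigit)
import Numeric (readHex)

-- | Simplified attribute-value normalization (assumption: CDATA attribute,
-- no named entities). Literal tab, CR and LF become a single space, while
-- numeric character references are replaced by the character they denote.
normalizeAttr :: String -> String
normalizeAttr [] = []
-- End-of-line handling: a CRLF pair counts as one line break.
normalizeAttr ('\r' : '\n' : rest) = ' ' : normalizeAttr rest
-- Any other literal whitespace character is replaced by a space.
normalizeAttr (c : rest)
  | c `elem` "\t\r\n" = ' ' : normalizeAttr rest
-- Hexadecimal character reference, e.g. &#xA;
normalizeAttr ('&' : '#' : 'x' : rest)
  | (ds@(_ : _), ';' : rest') <- span isHexDigit rest
  , [(n, "")] <- readHex ds
  , n <= 0x10FFFF = chr n : normalizeAttr rest'
-- Decimal character reference, e.g. &#10;
normalizeAttr ('&' : '#' : rest)
  | (ds@(_ : _), ';' : rest') <- span isDigit rest
  , let n = read ds
  , n <= 0x10FFFF = chr n : normalizeAttr rest'
-- Everything else is copied through unchanged.
normalizeAttr (c : rest) = c : normalizeAttr rest
```

Under these assumptions, normalizeAttr "line\nline" gives "line line", while normalizeAttr "line&#10;line" gives "line\nline". That is why a literal newline is lost by a conforming parser but an escaped one survives.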

The XML parser in our Jenkins CI system does seem to perform this normalization step and so turns newlines in attributes into spaces. If we have something like a GHC compiler error, use this library to encode it into the attribute of an XML document, and then have Jenkins parse it back, then what gets displayed in the UI isn't particularly readable.

The JUnit XML format specifies that the main error message goes into an attribute. We've been able to work around this by storing it in an element intended to contain a stack trace of an error, which Jenkins displays ungarbled under a 'Stacktrace' header. It works but isn't great.

This Stack Overflow answer claims that to avoid this problem the XML renderer should escape newlines into &#10;. Making the replacement myself in a string before passing it into an attribute for xml-conduit to encode is no use, because xml-conduit will escape the & character. I checked that if I manually construct an XML string containing &#10; then xml-conduit does parse that back into a newline.

Would a patch to escape newlines into &#10; during rendering be accepted? I imagine there's a risk of breaking people relying on the current behavior.
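As a rough illustration of what such a patch would do, here is a minimal sketch of attribute escaping that emits character references for whitespace. escapeAttr and renderAttr are hypothetical names used only for this example; they are not xml-conduit's actual rendering code, which works on Text and builders rather than String.

```haskell
-- | Escape a string for use inside a double-quoted XML attribute value.
-- Newline, carriage return and tab are written as character references so
-- that a conforming parser's attribute-value normalization does not turn
-- them into spaces.
escapeAttr :: String -> String
escapeAttr = concatMap escapeChar
  where
    escapeChar '&'  = "&amp;"
    escapeChar '<'  = "&lt;"
    escapeChar '>'  = "&gt;"
    escapeChar '"'  = "&quot;"
    escapeChar '\n' = "&#10;"
    escapeChar '\r' = "&#13;"
    escapeChar '\t' = "&#9;"
    escapeChar c    = [c]

-- | Render a single attribute as name="value" (hypothetical helper).
renderAttr :: String -> String -> String
renderAttr name value = name ++ "=\"" ++ escapeAttr value ++ "\""
```

With this sketch, renderAttr "attr" "line\nline" yields attr="line&#10;line", which a normalizing parser reads back as a value containing a real newline.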

k0ral · 2020-02-03T15:37:52Z

I would gladly accept a pull-request that makes xml-conduit more standard-compliant.
This warrants the usual backward-incompatible-change precautions:

change must be mentioned in the changelog
major version bump: A.B+1.C.D
old release will be set as preferred version in Hackage
