Formatted output (ident) not working when using XMLAnyElement #1707

Closed
ChrisHuebsch-FLIG opened this issue Mar 27, 2023 · 5 comments

@ChrisHuebsch-FLIG

ChrisHuebsch-FLIG commented Mar 27, 2023

I read that it is possible to write out pretty-printed XML with the help of

marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);

When using @XmlAnyElement, I get an unexpected result: only the start tags are indented, the end tags are not.

Is there a way to fix this, other than piping the output through an XSLT formatter?

See my example below.

I have an extremely simple model like so:

import java.util.List;

import jakarta.xml.bind.Element;
import jakarta.xml.bind.annotation.XmlAnyElement;
import jakarta.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Job {

	@XmlAnyElement
	private List<Element> content;

}

The test program looks like this:

import java.io.File;
import java.io.StringWriter;

import org.junit.jupiter.api.Test;

import jakarta.xml.bind.JAXBContext;
import jakarta.xml.bind.JAXBException;
import jakarta.xml.bind.Marshaller;
import jakarta.xml.bind.Unmarshaller;

public class JobTestXML {
	@Test
	public void anyTypeTest() {
		try {
			JAXBContext ctx = JAXBContext.newInstance(Job.class);

			Unmarshaller unmarshaller = ctx.createUnmarshaller();
			File f = new File(getClass().getResource("test.xml").getFile());
			Job job = (Job) (unmarshaller.unmarshal(f));

			Marshaller marshaller = ctx.createMarshaller();
			marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
			StringWriter sw = new StringWriter();
			marshaller.marshal(job, sw);
			System.out.println(sw.toString());

		} catch (JAXBException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
	}
}

The content of test.xml is this:

<job type="NewMethod">

<foo id="1">
<bar>
<baz>
<value name="xxx"/>
  </baz>
  </bar>
 </foo>

</job>

The program generates this output:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<job>
    <foo id="1">
        <bar>
            <baz>
                <value name="xxx"/>
  </baz>
  </bar>
 </foo>
</job>

I was expecting something like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<job>
	<foo id="1">
		<bar>
			<baz>
				<value name="xxx" />
			</baz>
		</bar>
	</foo>
</job>
@laurentschoelens
Contributor

Hi @ChrisHuebsch-FLIG

I was preparing a PR for this, but in fact your test case is wrong.

By unmarshalling test.xml into the Job class, you create "text nodes" that contain the whitespace between the XML tags and the closing tags.
For example:

<baz>
<value name="xxx"/>
  </baz>

will create a baz element that has a child element value and a whitespace-only text node containing \n

When this is marshalled, the \n is written to the String result as-is, and the closing tag is not indented because character content has already been encountered.
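To make the whitespace-only text nodes visible, here is a minimal sketch that parses the snippet above and lists the children of the root element. It uses the JDK's DOM parser directly rather than JAXB, so it only illustrates where such text nodes come from; which of them JAXB keeps when unmarshalling may differ. The class name `TextNodeDemo` and the method `describeChildren` are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JAXB: shows the DOM children of a snippet.
final class TextNodeDemo {

    /** Lists the children of the root element, rendering whitespace visibly. */
    static String describeChildren(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            StringBuilder sb = new StringBuilder();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE) {
                    // Show newlines as \n and spaces as dots.
                    String visible = child.getNodeValue()
                            .replace("\n", "\\n")
                            .replace(" ", ".");
                    sb.append("text[").append(visible).append("] ");
                } else {
                    sb.append(child.getNodeName()).append(' ');
                }
            }
            return sb.toString().trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse snippet", e);
        }
    }

    public static void main(String[] args) {
        // prints: text[\n] value text[\n..]
        System.out.println(describeChildren("<baz>\n<value name=\"xxx\"/>\n  </baz>"));
    }
}
```

The text node in front of the closing tag holds exactly the line break and the two spaces that show up before </baz> in the marshalled output.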

Try constructing the Job element from Java code (or remove all the irrelevant whitespace from the input); you'll then see that the output is as expected.
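As a rough sketch of the "construct it from Java code" route, the element tree from test.xml could be built with the JDK's DOM API, which creates no whitespace text nodes unless they are added explicitly. The class name `JobContentBuilder` and the method `buildFoo` are hypothetical, and how the resulting element ends up in Job.content is left open, since the Job class shown has a private field and no setter.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds <foo id="1"><bar><baz><value name="xxx"/></baz></bar></foo>.
final class JobContentBuilder {

    static Element buildFoo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            // The *NS variants are used so that getLocalName() is populated;
            // the namespace itself is null (no namespace), as in test.xml.
            Element foo = doc.createElementNS(null, "foo");
            foo.setAttributeNS(null, "id", "1");
            Element bar = doc.createElementNS(null, "bar");
            Element baz = doc.createElementNS(null, "baz");
            Element value = doc.createElementNS(null, "value");
            value.setAttributeNS(null, "name", "xxx");

            // Only element children are appended, so no text nodes exist.
            baz.appendChild(value);
            bar.appendChild(baz);
            foo.appendChild(bar);
            return foo;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }
}
```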

Regards
Laurent

@ChrisHuebsch-FLIG
Author

Hello Laurent,

Thank you for your explanation. I was expecting the JAXB_FORMATTED_OUTPUT option to deal with "additional" whitespace.

But I agree: how should the Marshaller know which whitespace is additional and which is not?

But you gave me a good hint. I might just need to scan the elements list for "whitespace only" text elements and remove them.

This might be "cheaper" than starting up an entire XSLT processor.

Perhaps you can close this issue with "won't fix".
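The idea of scanning for whitespace-only text nodes and removing them could look roughly like the sketch below. It works on plain org.w3c.dom nodes and uses only the JDK; the class name `WhitespaceStripper` and its methods are invented for this illustration and are not part of JAXB.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JAXB.
final class WhitespaceStripper {

    /** Recursively removes text nodes that contain nothing but whitespace. */
    static void stripWhitespaceOnlyText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripWhitespaceOnlyText(child);
            }
            child = next;
        }
    }

    /** Counts all text nodes below the given node (for demonstration only). */
    static int countTextNodes(Node node) {
        int count = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += (c.getNodeType() == Node.TEXT_NODE) ? 1 : countTextNodes(c);
        }
        return count;
    }

    /** Parses an XML string with the JDK DOM parser and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element foo = parse("<foo id=\"1\">\n<bar>\n<baz>\n<value name=\"xxx\"/>\n"
                + "  </baz>\n  </bar>\n </foo>");
        System.out.println("before: " + countTextNodes(foo)); // prints: before: 6
        stripWhitespaceOnlyText(foo);
        System.out.println("after: " + countTextNodes(foo));  // prints: after: 0
    }
}
```

Applied to the model above, this would be called on each entry of Job.content before marshalling, assuming the entries are org.w3c.dom.Element instances at runtime. Note that it also removes whitespace that may be significant, for example in mixed content or under xml:space="preserve", so it only fits documents where whitespace between tags carries no meaning.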

@laurentschoelens
Contributor

> But you gave me a good hint. I might just need to scan the elements list for "whitespace only" text elements and remove them.

That's what I was trying to do in my original PR submission: ignore all "whitespace only" character output when producing formatted output. But I was curious why it wanted to output \n on the problematic lines.
In the end I figured out that, because of the formatted XML in the input and the JAXB model used, the whitespace was parsed as actual text nodes.
After removing these extra lines and spaces, I got the properly indented output.

@laurentschoelens
Contributor

@lukasj: as suggested, maybe close this issue with the "won't fix" tag?

@lukasj
Member

lukasj commented Oct 9, 2023

closing as suggested

@lukasj closed this as not planned on Oct 9, 2023