XMLDocument

Santhosh Kumar Tekuri edited this page Mar 18, 2015 · 1 revision

Getting Started

Consider a Company class that contains an array of Employee objects:

class Company{
    String name;
    Employee employees[];

    Company(String name, Employee... employees){
        this.name = name;
        this.employees = employees;
    }
}

class Employee{
    String id;
    String name;
    String email;
    int age;

    Employee(String id, String name, String email, int age){
        this.id = id;
        this.name = name;
        this.email = email;
        this.age = age;
    }
}

To create XML from a Company instance (named company below):

import jlibs.xml.sax.XMLDocument;
import javax.xml.transform.stream.StreamResult;

XMLDocument xml = new XMLDocument(new StreamResult(System.out), false, 4, null);
xml.startDocument();{
    xml.startElement("company");{
        xml.addAttribute("name", company.name);
        for(Employee emp: company.employees){
            xml.startElement("employee");{
                xml.addAttribute("id", emp.id);
                xml.addAttribute("age", ""+emp.age);
                xml.addElement("name", emp.name);
                xml.addElement("email", emp.email);
            }
            xml.endElement("employee");
        }
    }
    xml.endElement("company");
}
xml.endDocument();

Running this prints the following:

<?xml version="1.0" encoding="UTF-8"?>
<company name="MyCompany">
    <employee id="1" age="20">
        <name>scott</name>
        <email>[email protected]</email>
    </employee>
    <employee id="2" age="25">
        <name>alice</name>
        <email>[email protected]</email>
    </employee>
</company>

The constructor of XMLDocument is:

XMLDocument(Result result, boolean omitXMLDeclaration, int indentAmount, String encoding) throws TransformerConfigurationException

The first argument is of type javax.xml.transform.Result, so you can also pass a DOMResult to build a DOM tree.
If the last argument, encoding, is null, the default XML encoding (UTF-8) is used.
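
To see why accepting a Result makes DOM output possible, here is a minimal sketch that uses only the JDK (not JLibs). It fires SAX events into a TransformerHandler whose result is a DOMResult. The class and method names are made up for illustration, and this is not necessarily how XMLDocument is implemented internally.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class; plain JDK APIs only.
class SaxToDomSketch {
    // Fires a few SAX events into a DOMResult and returns the resulting DOM document.
    static Document buildCompany(String name) throws Exception {
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler(); // identity transformation
        DOMResult result = new DOMResult();
        handler.setResult(result);

        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "name", "name", "CDATA", name);

        handler.startDocument();
        handler.startElement("", "company", "company", attrs);
        handler.endElement("", "company", "company");
        handler.endDocument();
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Element root = buildCompany("MyCompany").getDocumentElement();
        System.out.println(root.getNodeName() + " name=" + root.getAttribute("name"));
    }
}
```

Swapping the DOMResult for a StreamResult would serialize the same events as text instead.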


NULL Friendly

The methods that fire SAX events are null friendly. This means that:

xml.addAttribute("id", emp.id);

will not add the attribute if emp.id is null. So you no longer need to write:

if(emp.id!=null)
    xml.addAttribute("id", emp.id);

Null-friendly methods avoid code clutter and make the code more readable.

Method Chaining

The methods that fire SAX events return this, so calls can be chained to produce more readable code:

xml.startElement("employee")
    .addAttribute("id", emp.id)
    .addAttribute("age", ""+emp.age);

instead of

xml.startElement("employee");
xml.addAttribute("id", emp.id);
xml.addAttribute("age", ""+emp.age);
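
The two conventions described so far (skipping null values and returning this) are easy to picture with a small stand-alone sketch. The NullFriendlyAttributes class below is hypothetical and uses only the JDK's AttributesImpl; it illustrates the idea and is not taken from the JLibs source.

```java
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper, not part of JLibs.
class NullFriendlyAttributes {
    private final AttributesImpl attrs = new AttributesImpl();

    // Adds the attribute only when value is non-null; returns this so calls can be chained.
    NullFriendlyAttributes add(String name, String value) {
        if (value != null) {
            attrs.addAttribute("", name, name, "CDATA", value);
        }
        return this;
    }

    AttributesImpl toAttributes() {
        return attrs;
    }

    public static void main(String[] args) {
        AttributesImpl attrs = new NullFriendlyAttributes()
                .add("id", null)   // silently skipped
                .add("age", "20")
                .toAttributes();
        System.out.println(attrs.getLength()); // prints 1
    }
}
```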

Simple Text Only Elements

You can do the following:

xml.addElement("email", emp.email);

instead of

if(emp.email!=null){
    xml.startElement("email");
    xml.addText(emp.email);
    xml.endElement("email");
}

There is also an addCDATAElement(...) method available.

End Element

To end an element:

xml.endElement("employee");

If you misspell the element name here, a SAXException is thrown:

org.xml.sax.SAXException: expected </employee>

There is also a variation of endElement that takes no arguments:

xml.endElement();

This ends the most recently started element.

Suppose we have:

xml.endElement("elem3");
xml.endElement("elem2");
xml.endElement("elem1");

The same can be done in a single line:

xml.endElements("elem1");

This calls endElement() repeatedly until elem1 is closed.

To end all elements that are still open:

xml.endElements();

NOTE:

  • endElements() does nothing if all elements are already closed.
  • endElements() is implicitly called in endDocument(), so you can safely omit the trailing endElement calls at the end of the document.
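
The endElement and endElements variants above all amount to bookkeeping on a stack of open element names. The sketch below models that behaviour with a plain Deque. ElementStackSketch is a hypothetical class, it throws IllegalStateException where XMLDocument throws SAXException, and it is an assumption about the mechanics rather than the actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the element stack; not the JLibs implementation.
class ElementStackSketch {
    private final Deque<String> open = new ArrayDeque<>();
    private final StringBuilder out = new StringBuilder();

    ElementStackSketch startElement(String name) {
        open.push(name);
        out.append('<').append(name).append('>');
        return this;
    }

    // Ends the most recently started element.
    ElementStackSketch endElement() {
        if (open.isEmpty()) {
            throw new IllegalStateException("can't find matching start element");
        }
        out.append("</").append(open.pop()).append('>');
        return this;
    }

    // Ends the most recently started element, but only if its name matches.
    ElementStackSketch endElement(String name) {
        if (open.isEmpty()) {
            throw new IllegalStateException("can't find matching start element");
        }
        if (!name.equals(open.peek())) {
            throw new IllegalStateException("expected </" + open.peek() + ">");
        }
        return endElement();
    }

    // Ends elements one by one until the named element has been closed.
    ElementStackSketch endElements(String name) {
        if (!open.contains(name)) {
            throw new IllegalStateException("can't find matching start element");
        }
        String closed;
        do {
            closed = open.peek();
            endElement();
        } while (!closed.equals(name));
        return this;
    }

    // Ends every open element; does nothing when none are open.
    ElementStackSketch endElements() {
        while (!open.isEmpty()) {
            endElement();
        }
        return this;
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        ElementStackSketch xml = new ElementStackSketch();
        xml.startElement("elem1").startElement("elem2").startElement("elem3");
        xml.endElements("elem1");
        System.out.println(xml); // <elem1><elem2><elem3></elem3></elem2></elem1>
    }
}
```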

DTD

xml.addSystemDTD("company", "company.dtd");

will produce

<!DOCTYPE company SYSTEM "company.dtd">

xml.addPublicDTD("company", "-//W3C//DTD XHTML 1.0 Transitional//EN", "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd");

will produce

<!DOCTYPE company PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Adding XML

xml.startElement("elem1");
xml.addXML("<test><test1>first</test1><test2>second</test2></test>", false);
xml.endElement();

will produce:

<elem1>
    <test>
        <test1>first</test1>
        <test2>second</test2>
    </test>
</elem1>

The first argument to addXML(...) must be a well-formed XML string.
The second argument tells whether to exclude the root element of that string.
When true is passed in the sample above, it produces:

<elem1>
    <test1>first</test1>
    <test2>second</test2>
</elem1>
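
One way to picture the second argument is as a SAX filter that tracks element depth and drops the events of the outermost element. The sketch below does this with the JDK parser only. FragmentFilter, Recorder and replay are hypothetical names, and the code is an assumption about the technique, not the JLibs implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; plain JDK APIs only.
class AddXmlSketch {
    // Forwards element and text events to a target handler. Document events are
    // dropped, and so is the root element when excludeRoot is true.
    static class FragmentFilter extends DefaultHandler {
        private final ContentHandler target;
        private final boolean excludeRoot;
        private int depth = 0; // number of currently open elements in the fragment

        FragmentFilter(ContentHandler target, boolean excludeRoot) {
            this.target = target;
            this.excludeRoot = excludeRoot;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if (depth > 0 || !excludeRoot) {
                target.startElement(uri, localName, qName, atts);
            }
            depth++;
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            depth--;
            if (depth > 0 || !excludeRoot) {
                target.endElement(uri, localName, qName);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            target.characters(ch, start, length);
        }
    }

    // Records events as markup text (attributes are ignored to keep the sketch short).
    static class Recorder extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    // Parses the fragment and replays its events through the filter into a Recorder.
    static String replay(String xml, boolean excludeRoot) throws Exception {
        Recorder recorder = new Recorder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new FragmentFilter(recorder, excludeRoot));
        return recorder.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String fragment = "<test><test1>first</test1><test2>second</test2></test>";
        System.out.println(replay(fragment, false));
        System.out.println(replay(fragment, true));
    }
}
```";
assert AddXmlSketch.replay(fragment, false).equals("<test><test1>first</test1><test2>second</test2></test>");
assert AddXmlSketch.replay(fragment, true).equals("<test1>first</test1><test2>second</test2>");
</test>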

Another variation of addXML(...) takes an InputSource:

public XMLDocument addXML(InputSource is, boolean excludeRoot) throws SAXException

For example, you could write:

xml.addXML(new InputSource("notes.xml"), true);

Miscellaneous

xml.addComment("this is comment");
xml.addCDATA("this is inside cdata");

// to produce: <?xml-stylesheet href="classic.xsl" type="text/xml"?>
xml.addProcessingInstruction("xml-stylesheet", "href=\"classic.xsl\" type=\"text/xml\"");

Namespaces

static final String URI_JLIBS = "http://jlibs.org";
static final String URI_COMP = "http://mycompany.com";
static final String URI_EMP = "http://employee.com";

xml.startDocument();
xml.startElement(URI_COMP, "company")
        .addAttribute("name", "mycompany")
        .addAttribute(URI_JLIBS, "version", "0.1")
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "scott")
            .addElement(URI_EMP, "email", "[email protected]")
        .endElement()
    .endElement();
xml.endDocument();

will produce the following:

<?xml version="1.0" encoding="UTF-8"?>
<mycompany:company xmlns:mycompany="http://mycompany.com" name="mycompany" jlibs:version="0.1" xmlns:jlibs="http://jlibs.org">
    <employee:employee xmlns:employee="http://employee.com" name="scott">
        <employee:email>[email protected]</employee:email>
    </employee:employee>
</mycompany:company>

Notice that we never specified which prefixes to use.
XMLDocument generates prefixes automatically from the namespace URI.

Standard Namespaces

The jlibs.xml.Namespaces class defines constants for the most frequently used namespaces, such as:

public static final String URI_XSD = "http://www.w3.org/2001/XMLSchema";
public static final String URI_XSI = "http://www.w3.org/2001/XMLSchema-instance";
public static final String URI_XSL = "http://www.w3.org/1999/XSL/Transform";

Namespaces.suggestPrefix(String uri) returns the most commonly used prefix for any of these standard namespaces:

String prefix = Namespaces.suggestPrefix(Namespaces.URI_XSD); // prefix will be "xsd"

XMLDocument uses the prefixes suggested by Namespaces when available. For example:

import static jlibs.xml.Namespaces.*;

xml.startDocument();
xml.startElement(URI_XSD, "element")
    .addAttribute("name", "employee")
    .addAttribute("type", "employeeType");
xml.endDocument();

will produce the following:

<xsd:element xmlns:xsd="http://www.w3.org/2001/XMLSchema" name="employee" type="employeeType"/>

Suggesting Prefixes

public void suggestPrefix(String prefix, String uri)

This method suggests a prefix for the given URI.
Note that with this method you can even override the prefixes of standard namespaces, if needed.

xml.startDocument();
xml.suggestPrefix("jlibs", URI_JLIBS);
xml.suggestPrefix("comp", URI_COMP);
xml.suggestPrefix("emp", URI_EMP);

xml.startElement(URI_COMP, "company")
        .addAttribute("name", "mycompany")
        .addAttribute(URI_JLIBS, "version", "0.1")
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "scott")
            .addElement(URI_EMP, "email", "[email protected]")
        .endElement()
    .endElement();
xml.endDocument();

will produce the following:

<comp:company xmlns:comp="http://mycompany.com" name="mycompany" jlibs:version="0.1" xmlns:jlibs="http://jlibs.org">
    <emp:employee xmlns:emp="http://employee.com" name="scott">
        <emp:email>[email protected]</emp:email>
    </emp:employee>
</comp:company>

Declaring Prefixes

Declaring a prefix adds the corresponding xmlns attribute to the generated XML.
This is handy in the following situation:

xml.startDocument();
xml.startElement(URI_COMP, "company")
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "scott")
        .endElement()
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "alice")
        .endElement()
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "alean")
        .endElement()
    .endElement();
xml.endDocument();

produces the following:

<mycompany:company xmlns:mycompany="http://mycompany.com">
    <employee:employee xmlns:employee="http://employee.com" name="scott"/>
    <employee:employee xmlns:employee="http://employee.com" name="alice"/>
    <employee:employee xmlns:employee="http://employee.com" name="alean"/>
</mycompany:company>

In the output, the employee namespace is declared on every <employee:employee> element, which clutters the XML.
It would be better to declare the employee namespace once, on <mycompany:company>.

To do this:

xml.startDocument();
xml.declarePrefix(URI_EMP); // we are declaring manually here

xml.startElement(URI_COMP, "company")
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "scott")
        .endElement()
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "alice")
        .endElement()
        .startElement(URI_EMP, "employee")
            .addAttribute("name", "alean")
        .endElement()
    .endElement();
xml.endDocument();

Now the above code produces:

<mycompany:company xmlns:mycompany="http://mycompany.com" xmlns:employee="http://employee.com">
    <employee:employee name="scott"/>
    <employee:employee name="alice"/>
    <employee:employee name="alean"/>
</mycompany:company>

Notice that the xmlns:employee attribute has moved to the <mycompany:company> element.

There is also another variant of declarePrefix(...):

public boolean declarePrefix(String prefix, String uri)

With this variant, you can specify the prefix of your choice.

Computing QNames

xml.startDocument();
xml.declarePrefix("emp", URI_EMP);

xml.startElement(URI_XSD, "schema")
    .startElement(URI_XSD, "element")
        .addAttribute("name", "employee")
        .addAttribute("type", xml.toQName(URI_EMP, "employeeType"));
xml.endDocument();

will produce the following:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:emp="http://employee.com">
    <xsd:element name="employee" type="emp:employeeType"/>
</xsd:schema>

Here the value of @type is a QName, which is valid only if it uses the correct prefix.
toQName(uri, localPart) returns the correct QName string.
If the given URI is not yet declared, it is declared automatically.
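
The lookup-or-declare behaviour described here can be sketched with the JDK's org.xml.sax.helpers.NamespaceSupport. QNameSketch and its "ns1", "ns2", ... naming scheme are assumptions made for illustration; XMLDocument derives its generated prefixes differently.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical demo class; plain JDK APIs only.
class QNameSketch {
    private final NamespaceSupport namespaces = new NamespaceSupport();
    private int counter = 0;

    void declarePrefix(String prefix, String uri) {
        namespaces.declarePrefix(prefix, uri);
    }

    // Returns "prefix:localPart", declaring a generated prefix if the URI has none yet.
    String toQName(String uri, String localPart) {
        String prefix = namespaces.getPrefix(uri);
        if (prefix == null) {
            // Hypothetical naming scheme: the first unused "nsN" prefix.
            do {
                prefix = "ns" + (++counter);
            } while (namespaces.getURI(prefix) != null);
            namespaces.declarePrefix(prefix, uri);
        }
        return prefix + ":" + localPart;
    }

    public static void main(String[] args) {
        QNameSketch xml = new QNameSketch();
        xml.declarePrefix("emp", "http://employee.com");
        System.out.println(xml.toQName("http://employee.com", "employeeType")); // emp:employeeType
        System.out.println(xml.toQName("http://example.com/other", "thing"));   // ns1:thing
    }
}
```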


Mark and Release

Let us say we have the following methods:

public static void serializeCompany(XMLDocument xml, Company company) throws SAXException{
    xml.startElement("company");
    xml.addAttribute("name", company.name);
    for(Employee emp: company.employees){
        serializeEmployee(xml, emp);
    }
    xml.endElement("company");
}

public static void serializeEmployee(XMLDocument xml, Employee emp) throws SAXException{
    xml.startElement("employee");{
        xml.addAttribute("id", emp.id);
        xml.addAttribute("age", ""+emp.age);
        xml.addElement("name", emp.name);
        xml.addElement("email", emp.email);
    }
    xml.endElement();
    // xml.endElement();
}

public static void main(String[] args) throws Exception{
    Company company = createCompany();
    XMLDocument xml = new XMLDocument(new StreamResult(System.out), false, 4, null);
    xml.startDocument();
    serializeCompany(xml, company);
    xml.endDocument();
}

When you uncomment the last line in serializeEmployee(...), it produces the following exception:

Exception in thread "main" org.xml.sax.SAXException: can't find matching start element
at jlibs.xml.sax.XMLDocument.findEndElement(XMLDocument.java:244)
at jlibs.xml.sax.XMLDocument.endElement(XMLDocument.java:257)
at jlibs.xml.sax.XMLDocument.endElement(XMLDocument.java:264)
at Example.serializeCompany(XMLDocument.java:483)
at Example.main(XMLDocument.java:504)

From the above stack trace, the error appears to be in serializeCompany(...),
but the bug is actually in the serializeEmployee(...) method.

Now change serializeCompany(...) to use marking support as follows:

public static void serializeCompany(XMLDocument xml, Company company) throws SAXException{
    xml.startElement("company");
    xml.addAttribute("name", company.name);
    for(Employee emp: company.employees){
        xml.mark();
        serializeEmployee(xml, emp);
        xml.release();
    }
    xml.endElement("company");
}

Now the exception produced will be:

Exception in thread "main" org.xml.sax.SAXException: can't find matching start element
at jlibs.xml.sax.XMLDocument.findEndElement(XMLDocument.java:244)
at jlibs.xml.sax.XMLDocument.endElement(XMLDocument.java:268)
at Example.serializeEmployee(XMLDocument.java:496)
at Example.serializeCompany(XMLDocument.java:482)
at Example.main(XMLDocument.java:506)

That is, the stack trace now clearly shows that the bug is in the serializeEmployee(...) method.

Let us look at marking support in detail:

xml.startElement("elem1");
...
xml.startElement("elem2");
....
xml.mark();
xml.startElement("elem3");
....
xml.startElement("elem4");
.....
xml.release(); // closes elem4 and elem3, i.e. everything up to the mark, and clears the mark
xml.endElement("elem2");

xml.release() must be called before ending elements that were started before the mark. For example:

xml.startElement("elem1");
...
xml.mark();
....
xml.endElement("elem1"); // will throw SAXException: can't find matching start element

NOTE:

  • You can mark as many times as you like, i.e. multiple marks can exist at once.
  • endElements() only ends elements that were started after the most recent mark.
  • release() implicitly ends those elements.

You can also release any earlier mark, not just the last one:

int mark = xml.mark();
...
xml.mark();
...
xml.mark();
...
xml.release(mark);

Each call to mark() returns the number of the new mark.
The first call to mark() returns 1; the next call returns 2 if the earlier mark has not been released.

NOTE:
There is an implicit mark 0, which is used by XMLDocument itself and should not be released by the user.
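
The rules above can be modelled by remembering, for each mark, how many elements were open when it was set. The sketch below is a hypothetical stand-alone model (MarkReleaseSketch is not a JLibs class, and it throws IllegalStateException where XMLDocument throws SAXException). It is meant to make the semantics concrete, not to mirror the real implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of mark/release; not the JLibs implementation.
class MarkReleaseSketch {
    private final Deque<String> open = new ArrayDeque<>();
    private final List<Integer> marks = new ArrayList<>(); // open-element count at each mark
    private final StringBuilder out = new StringBuilder();

    MarkReleaseSketch() {
        marks.add(0); // implicit mark 0
    }

    void startElement(String name) {
        open.push(name);
        out.append('<').append(name).append('>');
    }

    // Ends the most recent element, but never one started before the latest mark.
    void endElement() {
        if (open.size() == marks.get(marks.size() - 1)) {
            throw new IllegalStateException("can't find matching start element");
        }
        out.append("</").append(open.pop()).append('>');
    }

    // Ends only the elements started after the most recent mark.
    void endElements() {
        while (open.size() > marks.get(marks.size() - 1)) {
            endElement();
        }
    }

    // Sets a new mark and returns its number (1 for the first user mark).
    int mark() {
        marks.add(open.size());
        return marks.size() - 1;
    }

    // Releases the most recent mark.
    void release() {
        release(marks.size() - 1);
    }

    // Releases the given mark and every mark set after it, ending their elements.
    void release(int mark) {
        if (mark < 1 || mark >= marks.size()) {
            throw new IllegalArgumentException("no such mark: " + mark);
        }
        while (marks.size() > mark) {
            endElements();
            marks.remove(marks.size() - 1);
        }
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        MarkReleaseSketch xml = new MarkReleaseSketch();
        xml.startElement("elem1");
        xml.startElement("elem2");
        xml.mark();
        xml.startElement("elem3");
        xml.startElement("elem4");
        xml.release(); // closes elem4 and elem3
        System.out.println(xml); // <elem1><elem2><elem3><elem4></elem4></elem3>
    }
}
```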


You can create wrappers around XMLDocument to make creating a specific type of XML document easier.

JLibs has one such wrapper, jlibs.xml.xsd.XSDocument, which makes creating XML Schema documents easier:

import jlibs.xml.xsd.XSDocument;

XSDocument xsd = new XSDocument(new StreamResult(System.out), false, 4, null);
xsd.startDocument();
{
    String n1 = "http://www.example.com/N1";
    String n2 = "http://www.example.com/N2";
    xsd.xml().declarePrefix("n1", n1);
    xsd.xml().declarePrefix("n2", n2);
    xsd.startSchema(n1);
    {
        xsd.addImport(n2, "imports/b.xsd");
        xsd.startComplexType().name("MyType");
        {
            xsd.startCompositor(Compositor.SEQUENCE);
            xsd.startElement().ref(n1, "e1").endElement();
            xsd.endCompositor();
        }
        xsd.endComplexType();
        xsd.startElement().name("root").type(n1, "MyType").endElement();
    }
    xsd.endSchema();
}
xsd.endDocument();

produces following output:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.com/N1" xmlns:n1="http://www.example.com/N1" xmlns:n2="http://www.example.com/N2">
    <xsd:import namespace="http://www.example.com/N2" schemaLocation="imports/b.xsd"/>
    <xsd:complexType name="MyType">
        <xsd:sequence>
            <xsd:element ref="n1:e1"/>
        </xsd:sequence>
    </xsd:complexType>
    <xsd:element name="root" type="n1:MyType"/>
</xsd:schema>

You can write similar wrappers for your own document types.


ObjectInputSource

This is an extension of org.xml.sax.InputSource with a single abstract method:

protected abstract void write(E obj, XMLDocument xml) throws SAXException;

InputSource wraps a systemID, InputStream, or Reader that is the source of XML.
Similarly, ObjectInputSource wraps a Java object that is the source of XML:

new ObjectInputSource<E>(E obj)

It is the job of the subclass to override write(E obj, XMLDocument xml) and fire SAX events.

Let us write an implementation of ObjectInputSource for Company:

import jlibs.xml.sax.ObjectInputSource;
import jlibs.xml.sax.XMLDocument;
import org.xml.sax.SAXException;

class CompanyInputSource extends ObjectInputSource<Company>{
    public CompanyInputSource(Company company){
        super(company);
    }

    @Override
    protected void write(Company company, XMLDocument xml) throws SAXException{
        xml.startElement("company");
        xml.addAttribute("name", company.name);
        for(Employee emp: company.employees){
            xml.startElement("employee");{
                xml.addAttribute("id", emp.id);
                xml.addAttribute("age", ""+emp.age);
                xml.addElement("name", emp.name);
                xml.addElement("email", emp.email);
            }
            xml.endElement("employee");
        }
        xml.endElement("company");
    }
}

Note that xml.startDocument() is called implicitly before write(...) and
xml.endDocument() is called implicitly after it.

To create the XML, we can now do the following:

import javax.xml.transform.TransformerException;

public static void main(String[] args) throws TransformerException{
    Employee scott = new Employee("1", "scott", "[email protected]", 20);
    Employee alice = new Employee("2", "alice", "[email protected]", 25);
    Company company = new Company("MyCompany", scott, alice);

    // print company to System.out as xml
    new CompanyInputSource(company).writeTo(System.out, false, 4, null);
}

ObjectInputSource contains several methods to serialize the SAX events:

public void writeTo(Writer writer, boolean omitXMLDeclaration, int indentAmount) throws TransformerException
public void writeTo(OutputStream out, boolean omitXMLDeclaration, int indentAmount, String encoding) throws TransformerException
public void writeTo(String systemID, boolean omitXMLDeclaration, int indentAmount, String encoding) throws TransformerException

If encoding is null, the default XML encoding (UTF-8) is used.
These writeTo(...) methods use an identity Transformer.
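
For readers unfamiliar with the identity transformer trick, here is a minimal JDK-only sketch: an XMLReader that ignores its input and fires SAX events is wrapped in a SAXSource and handed to a Transformer created without a stylesheet, which serializes the events unchanged. CompanyReader and serialize are hypothetical names; this shows the general technique, not the JLibs source.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; plain JDK APIs only.
class IdentityTransformSketch {
    // An XMLReader that ignores its input and fires SAX events for one company element.
    static class CompanyReader extends XMLFilterImpl {
        private final String name;

        CompanyReader(String name) {
            this.name = name;
        }

        @Override
        public void parse(InputSource input) throws SAXException {
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "name", "name", "CDATA", name);
            // These inherited methods forward to the ContentHandler set by the transformer.
            startDocument();
            startElement("", "company", "company", attrs);
            endElement("", "company", "company");
            endDocument();
        }
    }

    // Serializes the events to a string using a stylesheet-less (identity) Transformer.
    static String serialize(String companyName) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new SAXSource(new CompanyReader(companyName), new InputSource()),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(serialize("MyCompany"));
    }
}
```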

Your comments are appreciated.
