Working with JDom

January 19, 2010

JDom is the simplest interface for working with XML in Java that I've found.

Parsing to a Document

To parse XML from a file into a JDom Document, use the following code:

SAXBuilder lBuilder = new SAXBuilder();
Document lDocument = lBuilder.build(new FileReader("file.xml"));

To read XML from a String instead, wrap it in a StringReader, because the build() overload that takes a String expects a URI rather than the XML content itself:

SAXBuilder lBuilder = new SAXBuilder();
Document lDocument = lBuilder.build(new StringReader("<root></root>"));
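For completeness, here is a minimal self-contained sketch of the String case, with the imports and the checked exceptions that build() throws. It assumes JDOM 1.x (the org.jdom packages) is on the classpath; the class name ParseFromString and the helper parse() are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;

import org.jdom.Document;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class ParseFromString
{
   // Hypothetical helper: parses an XML string into a JDom Document.
   // build() throws JDOMException for malformed XML and IOException for read errors.
   static Document parse(String pXml) throws JDOMException, IOException
   {
      SAXBuilder lBuilder = new SAXBuilder();
      return lBuilder.build(new StringReader(pXml));
   }

   public static void main(String[] args) throws Exception
   {
      Document lDocument = parse("<root></root>");
      // Prints the name of the top level tag: root
      System.out.println(lDocument.getRootElement().getName());
   }
}
```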

Converting Document back to XML

To convert a parsed Document back into a pretty-printed XML string:

XMLOutputter lXMLOutputter = new XMLOutputter();
lXMLOutputter.setFormat(Format.getPrettyFormat());
String lXml = lXMLOutputter.outputString(lDocument);
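A round trip from String to Document and back might look like the sketch below. It again assumes JDOM 1.x; the class name PrettyPrintExample and the helper prettyPrint() are invented for illustration.

```java
import java.io.StringReader;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class PrettyPrintExample
{
   // Hypothetical helper: parses the given XML and re-serialises it with indentation.
   static String prettyPrint(String pXml) throws Exception
   {
      Document lDocument = new SAXBuilder().build(new StringReader(pXml));
      XMLOutputter lXMLOutputter = new XMLOutputter();
      lXMLOutputter.setFormat(Format.getPrettyFormat());
      return lXMLOutputter.outputString(lDocument);
   }

   public static void main(String[] args) throws Exception
   {
      // The single-line input comes back with each nested tag on its own indented line.
      System.out.println(prettyPrint("<root><child>text</child></root>"));
   }
}
```

Because the whole Document is output, the result should also start with an XML declaration.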

Getting the Root Element

To navigate around the structure, the two key classes are Document and Element. You will already have a Document parsed from a file (or other source). To get hold of the root element (the top-level tag in the XML document, of which there can be only one), ask the Document for it:

Element lRootElement = lDocument.getRootElement();

Navigating the Document

Check the Element JavaDoc for the full list of methods available on the Element class. The key methods for navigating down through the structure are getChild(), which returns the first child tag with a given name, and getChildren(), which returns a list of child tags.

List lChildren = lRootElement.getChildren();
Iterator lChildIterator = lChildren.iterator();
while (lChildIterator.hasNext())
{
   Element lNextElement = (Element)lChildIterator.next();
   // do something with lNextElement
}

Or in a Java 5 style:

List<Element> lChildren = lRootElement.getChildren();
for (Element lElement : lChildren)
{
  // do something with lElement
}
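Putting the pieces together, the sketch below parses a small document and collects the names of the root's children. It assumes JDOM 1.x, where getChildren() returns an untyped List, so each entry is cast to Element; ListChildNames and childNames() are names invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;

public class ListChildNames
{
   // Hypothetical helper: returns the tag names of the root's direct children, in document order.
   static List<String> childNames(String pXml) throws Exception
   {
      Document lDocument = new SAXBuilder().build(new StringReader(pXml));
      Element lRootElement = lDocument.getRootElement();
      List<String> lNames = new ArrayList<String>();
      for (Object lChild : lRootElement.getChildren())
      {
         // getChildren() only contains Element objects, so the cast is safe
         lNames.add(((Element) lChild).getName());
      }
      return lNames;
   }

   public static void main(String[] args) throws Exception
   {
      // Prints [a, b, a]
      System.out.println(childNames("<root><a/><b/><a/></root>"));
   }
}
```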

Getting Tag Contents

Once you have navigated down to an Element which has a text value, then the value can be retrieved using the getText() method on the Element. It is also possible to get a child tag's text all in one go, using the getChildText() method. A convenience method for trimming the contents of excess whitespace is also available in getChildTextTrim().

Given this structure:

<Root>
   <Element>
     <Test>Test</Test>
   </Element>
</Root>

the following two code blocks are equivalent, where lElement is the <Element> tag:

Element lTest = lElement.getChild("Test");
String lTagText = lTest.getText();

String lTagText = lElement.getChildText("Test");
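The difference between the text methods is easier to see when the tag contains extra whitespace. The following is a small sketch under the assumption of JDOM 1.x; the class name ChildTextExample and the helper elementOf() are hypothetical.

```java
import java.io.StringReader;

import org.jdom.Element;
import org.jdom.input.SAXBuilder;

public class ChildTextExample
{
   // Hypothetical helper: parses the XML and returns the <Element> tag under the root.
   static Element elementOf(String pXml) throws Exception
   {
      Element lRootElement = new SAXBuilder().build(new StringReader(pXml)).getRootElement();
      return lRootElement.getChild("Element");
   }

   public static void main(String[] args) throws Exception
   {
      Element lElement = elementOf("<Root><Element><Test>  Test  </Test></Element></Root>");

      // getChildText() keeps the surrounding whitespace
      System.out.println("[" + lElement.getChildText("Test") + "]");      // prints [  Test  ]
      // getChildTextTrim() strips it
      System.out.println("[" + lElement.getChildTextTrim("Test") + "]");  // prints [Test]
      // A missing child gives null rather than an exception
      System.out.println(lElement.getChildText("Missing"));               // prints null
   }
}
```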

XPath

JDom supports evaluating XPath expressions.

SAXBuilder lBuilder = new SAXBuilder();
Document lDocument = lBuilder.build("file.xml");
XPath lXPath = XPath.newInstance("/path/expression");
List lList = lXPath.selectNodes(lDocument);

If you know that your expression will only return one object, you can use the selectSingleNode() method instead:

Element lElement = (Element)lXPath.selectSingleNode(lDocument);

You can pass either an Element or a Document to the select methods, and the XPath expression can be relative to whichever one you pass in.
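As a sketch of a relative expression, the example below first selects an Element with an absolute path and then evaluates a second, relative path against it. It assumes JDOM 1.x with the Jaxen library on the classpath, which JDOM's XPath class needs at runtime; RelativeXPathExample and testText() are made-up names.

```java
import java.io.StringReader;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.xpath.XPath;

public class RelativeXPathExample
{
   // Hypothetical helper: returns the text of <Test>, found via a relative XPath.
   static String testText(String pXml) throws Exception
   {
      Document lDocument = new SAXBuilder().build(new StringReader(pXml));

      // Absolute expression, evaluated against the Document
      XPath lAbsolute = XPath.newInstance("/Root/Element");
      Element lElement = (Element) lAbsolute.selectSingleNode(lDocument);

      // Relative expression, evaluated against the Element found above
      XPath lRelative = XPath.newInstance("Test");
      Element lTest = (Element) lRelative.selectSingleNode(lElement);
      return lTest.getText();
   }

   public static void main(String[] args) throws Exception
   {
      // Prints Test
      System.out.println(testText("<Root><Element><Test>Test</Test></Element></Root>"));
   }
}
```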

Namespaces

If the child you are looking for is in a namespace, a plain getChild() call won't find it, even if the child is in the same namespace as the element you are calling it on.

Given this XML:

<root xmlns="http://namespace.com/">
  <element1>
    <element2>Fred</element2>
  </element1>
</root>

This won't work, because getChild() returns null and the following line then throws a NullPointerException:

Element lRootElement = lDocument.getRootElement();
Element lElement1 = lRootElement.getChild("element1");
String lElement2Value = lElement1.getChildText("element2");

You need to pass the namespace to each call instead:

Element lRootElement = lDocument.getRootElement();
Element lElement1 = lRootElement.getChild("element1", lRootElement.getNamespace());
String lElement2Value = lElement1.getChildText("element2", lRootElement.getNamespace());

This final line could also have been written as:

String lElement2Value = lElement1.getChildText("element2", lElement1.getNamespace());
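The namespace does not have to come from an existing element. As a sketch, assuming JDOM 1.x, you can also create the Namespace object from its URI with Namespace.getNamespace() and reuse it for every lookup; NamespaceExample and element2Value() are names invented for this example.

```java
import java.io.StringReader;

import org.jdom.Element;
import org.jdom.Namespace;
import org.jdom.input.SAXBuilder;

public class NamespaceExample
{
   static final String XML =
      "<root xmlns=\"http://namespace.com/\">"
      + "<element1><element2>Fred</element2></element1>"
      + "</root>";

   // Hypothetical helper: reads element2's text using an explicitly created Namespace.
   static String element2Value(String pXml) throws Exception
   {
      Element lRootElement = new SAXBuilder().build(new StringReader(pXml)).getRootElement();
      Namespace lNamespace = Namespace.getNamespace("http://namespace.com/");
      Element lElement1 = lRootElement.getChild("element1", lNamespace);
      return lElement1.getChildText("element2", lNamespace);
   }

   public static void main(String[] args) throws Exception
   {
      Element lRootElement = new SAXBuilder().build(new StringReader(XML)).getRootElement();
      // Without a namespace the child is not found
      System.out.println(lRootElement.getChild("element1"));  // prints null
      // With the namespace the lookup succeeds
      System.out.println(element2Value(XML));                 // prints Fred
   }
}
```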
