
Cure Your Java XML Troubles with a Dose of Castor Oil

  • December 21, 2006
  • By James Turner

XML has become the lingua franca of the computer industry, driving out older formats such as comma-separated values and fixed-length field files for moving data between companies and applications. Combined with newer technologies like web services, it has become a pervasive component of software development.

When manipulating XML under Java, there have been two basic approaches commonly available. The first is to use a DOM-style parser (JDOM is one popular choice), reading the entire document into memory and then operating on it. This has the advantage of requiring very little custom coding to parse documents, and it also allows software to generate XML documents. The primary disadvantage (apart from requiring the entire document to fit into memory) is that the DOM created is very generic. Everything is expressed in terms of nodes and attributes, and it's cumbersome to navigate.
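To make the "cumbersome to navigate" point concrete, here is a minimal sketch that pulls book titles out of a small made-up document. It uses the W3C DOM API that ships with the JDK (javax.xml.parsers) rather than JDOM, and the class name DomTitles and the sample XML are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the article's code.
class DomTitles {
    // Loads the whole document into memory, then walks the generic node tree.
    static List<String> titles(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> result = new ArrayList<String>();
            NodeList books = doc.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                NodeList titleNodes = book.getElementsByTagName("title");
                if (titleNodes.getLength() > 0) {
                    result.add(book.getAttribute("isbn") + ": "
                            + titleNodes.item(0).getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            // Sketch only: wrap parser and I/O failures in one unchecked exception.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<collection type=\"private\">"
                + "<book isbn=\"0345334302\"><title>The Ringworld Engineers</title></book>"
                + "</collection>";
        System.out.println(titles(xml)); // prints [0345334302: The Ringworld Engineers]
    }
}
```

Even for this small task, the code deals in NodeList, Element, and casts rather than in books and titles.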

The other approach is to use a SAX-style event-driven parser. This is more akin to a traditional compiler, where callbacks are made whenever elements or attributes are encountered during the parse. This has the advantage of being very memory efficient, but is even harder to use.
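As a rough illustration of the callback style, the sketch below collects the text of every title element using the SAX parser bundled with the JDK. The handler name TitleHandler and the element name are assumptions chosen for this example; note how the handler has to track its own state between callbacks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of the article's code.
class TitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<String>();
    private StringBuilder text; // non-null only while inside a <title> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("title".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text run in several chunks, so accumulate.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString());
            text = null;
        }
    }

    // Convenience wrapper: parse a string and return the collected titles.
    static List<String> parse(String xml) {
        try {
            TitleHandler handler = new TitleHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            // Sketch only: wrap parser and I/O failures in one unchecked exception.
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Nothing is held in memory beyond the titles themselves, but every additional element you care about means more state to manage by hand.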

A final possibility is to use XSLT to directly transform the XML to something else, usually an XHTML document. This is a great approach to take, assuming that what you want to do is to create an HTML document as an end product.
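For comparison, a transformation of this kind can be sketched with the javax.xml.transform API included in the JDK. The stylesheet, the class name TitlesToXhtml, and the collection/book/title element names are all assumptions made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the article's code.
class TitlesToXhtml {
    // A tiny stylesheet that turns each book title into a list item.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
            + "<xsl:template match=\"/collection\">"
            + "<ul><xsl:for-each select=\"book\">"
            + "<li><xsl:value-of select=\"title\"/></li>"
            + "</xsl:for-each></ul>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Sketch only: wrap transformer failures in one unchecked exception.
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

The result is markup, not Java objects, which is why this route does not help when the data is needed inside the program.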

Many times, however, all you really want to do is copy the contents of an XML file into a collection of Plain Old Java Objects (POJOs). A classic example of this is reading an XML configuration file on application startup. In these types of applications, Castor is just what the doctor ordered. Imagine a simple XML format for a bookstore:

<collection type="private">
	<book isbn="0345334302">
		<title>The Ringworld Engineers</title>
		...
	</book>
	<book isbn="0671462148">...</book>
</collection>

There's obviously a lot more that you could be storing about the book (price, publication date, etc.), but for this example the data shown is sufficient. One important feature to note is that there can be multiple authors for a single book.

All you want to do in this example is load all the books in the collection and print a report of all the authors in the collection and the books they have written. To do this, you need POJOs to hold the various levels of objects in the schema. First comes the Author object, which is at the bottom of the schema hierarchy (the getters and setters have been removed for brevity):

package com.blackbear.examples.castor;

public class Author {
	private String lastname;
	private String firstname;

	// getters and setters omitted
}
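The listing above leaves out the accessors. Written out, the class might look like the following sketch. The accessor names are an assumption based on the usual JavaBean naming convention (getX/setX), which mapping tools such as Castor generally rely on; the package declaration is omitted here so the class compiles on its own.

```java
// Sketch only: accessor names assume the JavaBean convention.
class Author {
    private String lastname;
    private String firstname;

    public String getLastname() {
        return lastname;
    }

    public void setLastname(String lastname) {
        this.lastname = lastname;
    }

    public String getFirstname() {
        return firstname;
    }

    public void setFirstname(String firstname) {
        this.firstname = firstname;
    }
}
```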

An author has only a lastname and a firstname. The Book object is a little more complicated.

package com.blackbear.examples.castor;

import java.util.ArrayList;
import java.util.List;

public class Book {
	private String title;
	private String isbn;
	private List<Author> authors = new ArrayList<Author>();

	// getters and setters omitted
}

In addition to a title and an ISBN, books have a list of authors, implemented as an ArrayList. Because you're being a good Java 5.0 citizen, you use Generics to specify the type of object that the List will hold. Finally, books go into Collection objects:

package com.blackbear.examples.castor;

import java.util.ArrayList;
import java.util.List;

public class Collection {
	private String type;
	private List<Book> books = new ArrayList<Book>();

	// getters and setters omitted
}

Again, you use Generics for the List. With the object structure in place, you're ready to use Castor to read in the XML file. There are two ways to use Castor. One way is to have the classes themselves contain information about how to pack and unpack the XML. This is what you get if you use the Eclipse Castor plug-in to generate Castor POJOs from an XSD file. However, in this example, you're going to use the other method, which is to use a mapping file.

Technically, you can read data from XML into Java using Castor without defining a mapping file at all, but only in a very restricted set of circumstances. You'd need to define your classes in the default package, and only be interested in flat XML objects (in other words, elements without sub-elements). If that were the case, reading in the data would be as simple as saying:

Unmarshaller unmarshaller = new Unmarshaller();
InputStream xmlFile = Example1.class.getResourceAsStream("books.xml");
InputSource f = new InputSource(xmlFile);
try {
	Object shouldBeCollection = unmarshaller.unmarshal(f);
} catch (MarshalException e) {
	// handle or report the unmarshalling failure
} catch (ValidationException e) {
	// handle or report the validation failure
}

First, you instantiate a copy of the Castor unmarshaller. You get an InputSource for the XML file you want to parse. You call the unmarshaller on the input, and it returns an object that will be the top-level element in the file. In this case, you'd need to have a class called Collection (matching the XML element name "collection"). Further, all that will be generated is a Collection object with the type filled out; none of the books inside the collection will be created.

So, clearly you need to help Castor out a bit. For one thing, having to use the default package isn't anything you want to be doing. You also want to see the books in the collection, since that's the point of the exercise. So you need to create a mapping file (which you'll call collection-mapping.xml):

<!DOCTYPE mapping PUBLIC "-//EXOLAB/Castor Mapping DTD Version 1.0//EN" "http://castor.org/mapping.dtd">
<mapping>

First, a standard DOCTYPE declaration and the mapping element, which starts all Castor mapping files.

		<map-to xml="collection"/>
		<field name="books" collection="arraylist" ...>
			<bind-xml name="book" node="element"/>
		</field>
		<field name="type" direct="false" ...>
			<bind-xml name="type" node="attribute"/>
		</field>
