Cure Your Java XML Troubles with a Dose of Castor Oil

XML has become the lingua franca of the computer industry, driving out older formats such as comma separated values and fixed field length files for moving data between companies and applications. When combined with new technologies like web services, it has become a pervasive component of software development.

When manipulating XML under Java, two basic approaches have commonly been available. The first is to use a DOM parser, reading the entire document into memory and then operating on it. This has the advantage of requiring very little custom coding to parse documents, and it also allows software to generate XML documents. The primary disadvantage (apart from requiring the entire document to fit into memory) is that the DOM created is very generic. Everything is expressed in terms of nodes and attributes, and it’s cumbersome to navigate.
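To make the "generic nodes" point concrete, here is a minimal sketch that pulls book titles out of a document with the W3C DOM API that ships with the JDK. The class name DomTitleReader and the method readTitles are hypothetical names chosen for illustration; they do not appear elsewhere in this article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, used only to illustrate DOM-style navigation.
class DomTitleReader {

    // Loads the whole document into memory, then walks generic
    // Element and NodeList objects to collect each book's title text.
    static List<String> readTitles(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> titles = new ArrayList<String>();
            NodeList books = doc.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                // Every node comes back untyped, so a cast is needed.
                Element book = (Element) books.item(i);
                NodeList titleNodes = book.getElementsByTagName("title");
                if (titleNodes.getLength() > 0) {
                    titles.add(titleNodes.item(0).getTextContent());
                }
            }
            return titles;
        } catch (Exception e) {
            // Parser configuration, SAX, and I/O failures all end up here.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Even for a single field, the code deals in node lists, indexes, and casts rather than in books and titles.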

The other approach is to use a SAX-style event-driven parser. This is more akin to a traditional compiler, where callbacks are made whenever elements or attributes are encountered during the parse. This has the advantage of being very memory efficient, but is even harder to use.
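As a rough illustration of the callback style, the sketch below collects title text with the SAX parser bundled in the JDK. The class name SaxTitleReader and the method readTitles are hypothetical. Note how the handler has to track for itself whether it is currently inside a title element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, used only to illustrate SAX-style callbacks.
class SaxTitleReader {

    static List<String> readTitles(String xml) {
        final List<String> titles = new ArrayList<String>();

        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside a <title> element.
            private StringBuilder text;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so append each one.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return titles;
    }
}
```

Nothing is held in memory beyond what the handler chooses to keep, but all of the bookkeeping falls to the programmer.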

A final possibility is to use XSLT to directly transform the XML to something else, usually an XHTML document. This is a great approach to take, assuming that what you want to do is to create an HTML document as an end product.
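For comparison, a transformation of this kind can be sketched with the javax.xml.transform API included in the JDK. The stylesheet, the class name XsltTitleList, and the method toHtmlList are all assumptions made up for this illustration; the stylesheet simply emits one list item per book title.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, used only to illustrate an XSLT transformation.
class XsltTitleList {

    // Illustrative stylesheet: wraps each book title in an <li>.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/collection\">"
        + "<ul><xsl:for-each select=\"book\">"
        + "<li><xsl:value-of select=\"title\"/></li>"
        + "</xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the markup.
    static String toHtmlList(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

The result is markup, not Java objects, which is why this route suits page generation better than loading data into an application.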

Many times, however, all you really want to do is copy the contents of an XML file into a collection of Plain Old Java Objects (POJOs). A classic example of this is reading an XML configuration file on application startup. In these types of applications, Castor is just what the doctor ordered. Imagine a simple XML format for a bookstore:

<collection type="private">
	<book isbn= "0345334302">
		<title>The Ringworld Engineers</title>
		<author>
			<lastname>Niven</lastname>
			<firstname>Larry</firstname>
		</author>
	</book>
	<book isbn= "0671462148">
		<title>Inferno</title>
		<author>
			<lastname>Niven</lastname>
			<firstname>Larry</firstname>
		</author>
		<author>
			<lastname>Pournelle</lastname>
			<firstname>Jerry</firstname>
		</author>		
	</book>	
</collection>

There’s obviously a lot more that you could be storing about the book (price, publication date, etc.), but for this example the data shown is sufficient. One important feature to note is that there can be multiple authors for a single book.

All you want to do in this example is load all the books in the collection and print a report of all the authors and the books they have written. To do this, you need POJOs to hold the various levels of objects in the schema. First comes the Author object, which is at the bottom of the schema hierarchy (the getters and setters have been removed for brevity):

package com.blackbear.examples.castor;

public class Author {
	private String lastname;
	private String firstname;
} 
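For reference, a sketch of the same class with its accessors restored might look like the following. The accessor names match the calls used later in this article (getLastname, setFirstname, and so on). The package declaration and public modifier are left off here only to keep the sketch self-contained.

```java
// Sketch of the Author POJO with its accessors restored.
// In the article's code this class is public and lives in
// com.blackbear.examples.castor; both are omitted here for brevity.
class Author {
    private String lastname;
    private String firstname;

    public String getLastname() {
        return lastname;
    }

    public void setLastname(String lastname) {
        this.lastname = lastname;
    }

    public String getFirstname() {
        return firstname;
    }

    public void setFirstname(String firstname) {
        this.firstname = firstname;
    }
}
```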

All that an author has is a lastname and firstname. The Book object is a little more complicated.

package com.blackbear.examples.castor;

import java.util.ArrayList;
import java.util.List;

public class Book {
	private String title;
	private String isbn;
	private List<Author> authors = new ArrayList<Author>();
}

In addition to a title and ISBN, books have a list of authors, implemented as an ArrayList. Because you’re being good Java 5.0 citizens, you use Generics to specify the type of object that the List will hold. Finally, books go into Collection objects:

package com.blackbear.examples.castor;

import java.util.ArrayList;
import java.util.List;

public class Collection {
	private String type;
	
	private List<Book> books = new ArrayList<Book>();
}

Again, you use Generics for the List. With the object structure in place, you’re ready to use Castor to read in the XML file. There are two ways to use Castor. One way is to have the classes themselves contain information about how to pack and unpack the XML. This is what you get if you use the Eclipse Castor plug-in to generate Castor POJOs from an XSD schema file. However, in this example, you’re going to use the other method, which is to use a mapping file.

Technically, you can read data from XML into Java using Castor without defining a mapping file at all, but only in a very restricted set of circumstances. You’d need to define your classes in the default package, and only be interested in flat XML objects (in other words, elements without sub-elements.) If that were the case, reading in the data would be as simple as saying:

Unmarshaller unmarshaller = new Unmarshaller();
InputStream xmlFile = Example1.class.getResourceAsStream("books.xml");
InputSource f = new InputSource(xmlFile);
try {
	Object shouldBeCollection = unmarshaller.unmarshal(f);
} catch (MarshalException e) {
	e.printStackTrace();
} catch (ValidationException e) {
	e.printStackTrace();
}

First, you instantiate the Castor unmarshaller. Next, you get an InputSource for the XML file you want to parse. You then call the unmarshaller on the input, and it returns an object corresponding to the top-level element in the file. In this case, you’d need to have a class called Collection (matching the XML element name “collection”). Further, all that will be generated is a Collection object with the type filled in; none of the books inside the collection will be created.

So, clearly you need to help Castor out a bit. For one thing, having to use the default package isn’t something you want to be doing. You also want to see the books in the collection, since that’s the point of the exercise. So you need to create a mapping file (which you’ll call collection-mapping.xml):

<!DOCTYPE mapping PUBLIC "-//EXOLAB/Castor Mapping DTD Version 1.0//EN" "http://castor.org/mapping.dtd">
<mapping>

First, a standard DOCTYPE declaration and the mapping element, which starts all Castor mapping files.

	<class 
        name="com.blackbear.examples.castor.Collection">
		<map-to xml="collection"/>
		<field name="books" collection="arraylist" 
						   direct="false"			      
                 type="com.blackbear.examples.castor.Book">
			<bind-xml name="book" node="element"/>
	   </field>
		<field name="type" direct="false" 
						   type="java.lang.String">
			<bind-xml name="type" node="attribute"/>
	   </field>
	</class>

The first interesting content in the file is the declaration of the top-level element, the collection. This definition maps a specific class to an XML element. The only XML elements that need the “map-to” tag are the ones that appear at the top level of XML documents; all others are mapped using the “bind-xml” tag. The “field” tag defines a relationship between a child element or attribute of the current element and a Java property or collection. In this case, there are two fields defined. The simpler is the “type” field, which maps to the type bean property of the Collection class. The “direct” attribute indicates whether Castor should access the properties directly or through the accessor methods. Since you declared the properties private in the classes, you need to set “direct” to false. You also set “node” to “attribute”, which means that the property is stored as an XML attribute rather than as text or an element.

The more interesting field is the books field, which stores the list of books in the collection. Because a single collection can hold more than one book, you have to specify the “collection” attribute and set it to the type of Java collection you’re going to store the values in. In this case, you use an ArrayList. The “type” attribute tells Castor the type of the individual elements. Here, you use the node type of “element” to indicate that each Book is populated from an XML sub-element.
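The difference between accessor-based and direct access can be illustrated with plain Java reflection. This is only a sketch of the idea, not Castor's actual implementation. The class AccessStyles, its nested Shelf class, and both helper methods are hypothetical names invented for this example.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of accessor-based versus direct field access.
// This is NOT how Castor is implemented; it only demonstrates the concept.
class AccessStyles {

    // Stand-in for the article's Collection class, reduced to one property.
    public static class Shelf {
        private String type;

        public String getType() {
            return type;
        }

        public void setType(String type) {
            this.type = type;
        }
    }

    // The accessor route: derive the setter name from the property name
    // and invoke it, which works even though the field itself is private.
    static void setViaAccessor(Object target, String property, String value) {
        String setter = "set"
            + Character.toUpperCase(property.charAt(0))
            + property.substring(1);
        try {
            Method method = target.getClass().getMethod(setter, String.class);
            method.invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable setter: " + setter, e);
        }
    }

    // The direct route: Class.getField only finds public fields, so a
    // private field such as "type" is reported as not directly accessible.
    static boolean isDirectlyAccessible(Object target, String property) {
        try {
            target.getClass().getField(property);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }
}
```

With private fields and public accessors, as in this article's POJOs, only the accessor route succeeds.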

The rest of the file follows the same pattern. You define a book, with an ISBN code and title. The ISBN is an attribute, and the title comes from the contents of the title element. The book also has a list of authors, just as the collection had a list of books. Notice that you don’t even have to define the Author class in the mapping. Castor will figure out how to populate the two fields because the Java properties match the XML element names.

	<class name="com.blackbear.examples.castor.Book">
		<field name="isbn" direct="false" 
						   type="java.lang.String">
			<bind-xml name="isbn" node="attribute"/>
	   </field>
		<field name="title" direct="false" 
						   type="java.lang.String">
			<bind-xml name="title" node="element"/>
	   </field>
		<field name="authors" collection="arraylist" 
						   direct="false" 
		   type="com.blackbear.examples.castor.Author">
			<bind-xml name="author" node="element"/>
	   </field>
	</class>
</mapping>

Now it’s just a matter of loading the mapping file, unmarshalling with the mapping, and walking the resulting Java objects. As you can see from the code below, you need only about eight lines of Java to actually unpack the XML into POJOs. The rest of the code processes the resulting objects to produce a list of books by each author.

 public static void main(String[] args) {
    try {
        Mapping mapping = new Mapping();
        InputStream mappingStream =
            Example2.class.getResourceAsStream("collection-mapping.xml");
        mapping.loadMapping(new InputSource(mappingStream));
                            
        Unmarshaller unmarshaller = new Unmarshaller();
        InputStream xmlFile = 
            Example2.class.getResourceAsStream("books.xml");
        InputSource f = new InputSource(xmlFile);
        unmarshaller.setMapping(mapping);
        Object shouldBeCollection = unmarshaller.unmarshal(f);

        // Done unmarshalling XML; the rest is processing

        if (shouldBeCollection instanceof Collection) {
            Collection collection = (Collection) shouldBeCollection;
            Map<String, List<Book>> authors = new HashMap<String, List<Book>>();
            if (collection.getBooks() != null) {
                List<Book> books = collection.getBooks();
                int numBooks = books.size();
                for (int i = 0; i < numBooks; i++) {
                    Book book = books.get(i);
                    List<Author> bookAuthors = book.getAuthors();
                    int numAuthors = bookAuthors.size();
                    for (int j = 0; j < numAuthors; j++) {
                        Author author = bookAuthors.get(j);
                        String name = author.getLastname()
                                + ","
                                    + author.getFirstname();
                        List<Book> authorBooks = authors.get(name);
                        if (authorBooks == null) {
                            authorBooks = new ArrayList<Book>();
                            authors.put(name, authorBooks);
                        }
                        authorBooks.add(book);
                    }
                }
                Iterator<String> keys = authors.keySet().iterator();
                while (keys.hasNext()) {
                    String key = keys.next();
                    System.out.println(key);
                    List<Book> authorBooks = authors.get(key);
                    int numAuthorBooks = authorBooks.size();
                    for (int i = 0; i < numAuthorBooks; i++) {
                        Book book = authorBooks.get(i);
                        System.out.println("   "
                                + book.getTitle() + "("
                                + book.getIsbn() + ")");
                    }
                    System.out.println("");
                }
            }
        }
    } catch (MarshalException e) {
        e.printStackTrace();
    } catch (ValidationException e) {
        e.printStackTrace();
    } catch (MappingException e) {
        e.printStackTrace();
    }
}

Running this code results in the output below:

Niven,Larry
   The Ringworld Engineers(0345334302)
   Inferno(0671462148)

Pournelle,Jerry
   Inferno(0671462148)
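The grouping step on its own can be written more compactly with enhanced for loops. The sketch below uses simplified stand-in classes in place of the Castor-populated POJOs, and it swaps the HashMap for a TreeMap so that authors come out in alphabetical order regardless of hashing. The class name AuthorReport and its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the report logic, independent of Castor.
class AuthorReport {

    // Simplified stand-ins for the article's Author and Book POJOs.
    public static class Author {
        final String lastname;
        final String firstname;

        Author(String lastname, String firstname) {
            this.lastname = lastname;
            this.firstname = firstname;
        }
    }

    public static class Book {
        final String title;
        final String isbn;
        final List<Author> authors;

        Book(String title, String isbn, Author... authors) {
            this.title = title;
            this.isbn = isbn;
            this.authors = Arrays.asList(authors);
        }
    }

    // Groups books under a "lastname,firstname" key. A TreeMap keeps
    // the keys sorted, so iteration order is predictable.
    static Map<String, List<Book>> booksByAuthor(List<Book> books) {
        Map<String, List<Book>> byAuthor = new TreeMap<String, List<Book>>();
        for (Book book : books) {
            for (Author author : book.authors) {
                String name = author.lastname + "," + author.firstname;
                List<Book> authorBooks = byAuthor.get(name);
                if (authorBooks == null) {
                    authorBooks = new ArrayList<Book>();
                    byAuthor.put(name, authorBooks);
                }
                authorBooks.add(book);
            }
        }
        return byAuthor;
    }

    // Prints each author followed by an indented list of titles and ISBNs.
    static void print(Map<String, List<Book>> byAuthor) {
        for (Map.Entry<String, List<Book>> entry : byAuthor.entrySet()) {
            System.out.println(entry.getKey());
            for (Book book : entry.getValue()) {
                System.out.println("   " + book.title + "(" + book.isbn + ")");
            }
            System.out.println();
        }
    }
}
```

The report format is the same as in the listing above; only the iteration style and the map implementation differ.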

You can also turn things around and marshal some XML with Castor from the same Java objects. Let’s try the code shown here. First, you create a pair of outstanding Java books to turn into XML:

public static void main(String[] args) {
    try {
        Collection col = new Collection();
        Book book = new Book();
        book.setTitle("MySQL and JSP Web Applications");
        book.setIsbn("0672323095");
        Author james = new Author();
        james.setFirstname("James");
        james.setLastname("Turner");
        book.getAuthors().add(james);
        col.getBooks().add(book);
        book = new Book();
        book.setTitle("Struts Kick Start");
        book.setIsbn("0672324725");
        Author kevin = new Author();
        kevin.setFirstname("Kevin");
        kevin.setLastname("Bedell");
        book.getAuthors().add(kevin);
        book.getAuthors().add(james);
        col.getBooks().add(book);

Now you’re ready to generate some XML. As before, you read the mapping file in, but this time you create a Marshaller, handing it a java.io.Writer (in this case, a StringWriter that will store the XML for you to print). Then you have only to set the mapping on the marshaller and marshal the top-level object, in this case the book collection.

        Mapping mapping = new Mapping();
        InputStream mappingStream =
            Example3.class.getResourceAsStream("collection-mapping.xml");
        mapping.loadMapping(new InputSource(mappingStream));
                            
        Writer stringWriter = new StringWriter();
            
        Marshaller marshaller = new Marshaller(stringWriter);
        marshaller.setMapping(mapping);
        marshaller.marshal(col);
        stringWriter.close();
        System.out.println(stringWriter.toString());
    } catch (MarshalException e) {
        e.printStackTrace();
    } catch (ValidationException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (MappingException e) {
        e.printStackTrace();
    }
}

The resulting XML looks just the way you want it:

<?xml version="1.0" encoding="UTF-8"?>
<collection>
	<book isbn="0672323095">
		<title>MySQL and JSP Web Applications</title>
		<author>
			<lastname>Turner</lastname>
			<firstname>James</firstname>
		</author>
	</book>
	<book isbn="0672324725">
		<title>Struts Kick Start</title>
		<author>
			<lastname>Bedell</lastname>
			<firstname>Kevin</firstname>
		</author>
		<author>
			<lastname>Turner</lastname>
			<firstname>James</firstname>
		</author>
	</book>
</collection>

This article only scratches the surface of what Castor can do. A particularly good walkthrough of how to use mapping files with Castor can be found at:

http://www.castor.org/xml-mapping.html

About the Author

James Turner is a Senior Software Engineer at Kronos, Inc. He has written two books on Java Web Development and writes frequently on technology and software development.
