http://www.developer.com/

XML Manipulation with Apache AXIOM


January 30, 2009

What Is AXIOM?

AXIOM stands for AXis2 Object Model and refers to the XML infoset model that was initially developed as part of Apache Axis2 and later moved to the WS Commons project. Axiom is the main XML representation mechanism in Axis2, so any message coming into Axis2 is represented as an Axiom object. The XML infoset is the information carried by an XML document, and for programmatic manipulation it is convenient to have a representation of that infoset in a language-specific form. For an object-oriented language, the obvious choice is a model made up of objects. DOM and JDOM are two classic examples of such XML models. From the outside, AXIOM behaves much like these models, but internally it is very different.

AXIOM is a lightweight XML infoset representation with deferred building, based on StAX, the standard streaming pull parser API. The object model can be manipulated as flexibly as any other object model (such as JDOM), but underneath, the objects are created only when they are actually required. This makes programs much less memory intensive, and it is very useful in applications such as message routing and an ESB (Enterprise Service Bus).

Among the features of AXIOM, deferred building can be considered the most important; it was also one of AXIOM's design goals. As you have seen in Axis2-related articles, one of the issues with Apache Axis1 was its XML representation. It was based entirely on DOM and loaded full messages for processing; this became a performance killer for large messages. AXIOM was introduced to solve those issues, and it has the following key features.

  • Lightweight: AXIOM is specifically targeted to be lightweight. This is achieved by reducing the depth of the hierarchy, number of methods, and the attributes enclosed in the objects. This makes the objects less memory intensive.
  • Deferred building: The objects are not made unless a need arises for them. This passes the control of building over to the object model itself rather than an external builder.
  • Pull based: For a deferred building mechanism, a pull-based parser is required. AXIOM is based on StAX, the standard pull parser API.

What Are Pull Parsing and Push Parsing?

You will encounter the term pull parsing several times throughout this article, so it is essential to understand the concept behind it. XML documents can be parsed using a "pull-based" (StAX) or a "push-based" (SAX) process. Pull parsing is a recent trend in XML processing. The previously popular XML processing frameworks, such as SAX and DOM, were push-based; this meant that parsing control was in the hands of the parser itself. This approach is fine and easy to use, but it is not efficient for large XML documents: with DOM, a complete model of the document is built in memory. Pull parsing inverts the control; the parser proceeds only at the user's command, and the user can decide to store or discard the events the parser generates. AXIOM is based on pull parsing.
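The difference is easier to see side by side. The following sketch is not AXIOM code; it uses only the SAX and StAX APIs that ship with the JDK, and the class and method names (PullVersusPush, pushElementNames, pullElementNamesUntil) are made up for illustration. With SAX, the parser calls the handler for every element; with StAX, the caller asks for each event and can simply stop asking.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: class and method names are hypothetical.
class PullVersusPush {

    static final String XML =
        "<book><name>Quick-start Axis</name><isbn>978-1-84719-286-8</isbn></book>";

    /** Push (SAX): the parser drives and calls back for every start tag. */
    static List<String> pushElementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        names.add(qName);
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return names;
    }

    /** Pull (StAX): the caller requests each event and may stop at any time. */
    static List<String> pullElementNamesUntil(String xml, String stopAt) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                    if (reader.getLocalName().equals(stopAt)) {
                        break; // no further events are requested
                    }
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("push: " + pushElementNames(XML));
        System.out.println("pull: " + pullElementNamesUntil(XML, "name"));
    }
}
```

The push version always reports all three elements, because the handler has no say in how far the parser goes. The pull version stops after the name element, which is the kind of control deferred building relies on.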

Get Started with Samples

To start working with Axiom, you first need to download it. You can either download the AXIOM binary or build it from the source distribution (or from the source repository). As you already know, AXIOM started as part of Axis2, but it now has its own release cycle. Therefore, you can get the AXIOM binary either from an AXIOM release or from an Axis2 release.

Once you have the Axiom binary, the next step is to add it (and the binaries it depends on) to your classpath; only then can you start to work with Axiom. If your application uses a build system such as Maven, you can add the dependency there and let it download the AXIOM jars automatically.

Creating AXIOM

You can create Axiom in three ways, as shown in the following figure. First, you can create Axiom from a pull event stream (StAX). Second, you can create Axiom from a push event stream (SAX). Third, you can create Axiom programmatically. Here, you will learn how to create Axiom from a pull event stream and programmatically, because those are the two most common methods.

First, look at how to create Axiom using a pull event stream. Axiom provides the notion of a factory and a builder to create objects. The factory helps keep the code at the interface level and the implementations separate. Because Axiom is tightly bound to StAX, a StAX-compliant reader should be created first with the desired input stream. Then, you can select one of the available builders. Axiom offers different types of builders, mainly for user convenience: there are OM builders as well as SOAP builders, so you can use the one that fits your requirement. StAXOMBuilder builds a pure XML infoset-compliant object model, whereas the SOAPModelBuilder returns SOAP-specific objects (such as the SOAPEnvelope, which are subtypes of OMElement) through its builder methods.

Sample 1: Creating AXIOM from an Input Stream

The following code illustrates the correct method of creating an Axiom document from a file input stream.

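import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import org.apache.axiom.om.OMElement;
import org.apache.axiom.om.impl.builder.StAXOMBuilder;

//create the parser ("file" is the path or File of the XML document)
XMLStreamReader parser =
   XMLInputFactory.newInstance().createXMLStreamReader(
   new FileInputStream(file));
//create the builder
StAXOMBuilder builder = new StAXOMBuilder(parser);
//get the root element of the XML
OMElement documentElement = builder.getDocumentElement();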

Step 1: Create a parser or reader. In this case, you create a parser.

Step 2: Next, create a builder using the parser (or reader). In this case, create StAXOMBuilder.

Step 3: Get the Axiom document element from the builder.

NOTE
When you ask the builder for the document element, it gives you a pointer to the Axiom wrapper. However, the XML is still in the stream, and no object tree has been created at that time. The object tree is created only when you navigate or build the Axiom.
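The behaviour described in this note can be imitated with plain StAX. The sketch below is a hypothetical illustration, not AXIOM's implementation: the class DeferredBuildSketch and its methods are invented for this example, and it does not build an object tree at all. It only counts how many parser events have been requested, to show that asking for the root element pulls just enough of the stream to answer the question.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical illustration of deferred building; not AXIOM's implementation. */
class DeferredBuildSketch {

    private final XMLStreamReader reader;
    private int eventsPulled = 0; // parser events requested so far
    private String rootName;      // discovered lazily, on first request

    DeferredBuildSketch(String xml) {
        this.reader = open(xml);  // creates the parser; requests no events yet
    }

    private static XMLStreamReader open(String xml) {
        try {
            return XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not create the parser", e);
        }
    }

    /** Pulls events only until the root start tag has been seen. */
    String getRootName() {
        try {
            while (rootName == null && reader.hasNext()) {
                int event = reader.next();
                eventsPulled++;
                if (event == XMLStreamConstants.START_ELEMENT) {
                    rootName = reader.getLocalName();
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return rootName;
    }

    /** Forces the rest of the document through the parser. */
    void build() {
        try {
            while (reader.hasNext()) {
                reader.next();
                eventsPulled++;
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    int getEventsPulled() {
        return eventsPulled;
    }

    public static void main(String[] args) {
        DeferredBuildSketch doc = new DeferredBuildSketch(
            "<book><name>Quick-start Axis</name><isbn>978-1-84719-286-8</isbn></book>");
        System.out.println("after creation: " + doc.getEventsPulled() + " events");
        System.out.println("root element:   " + doc.getRootName());
        System.out.println("after root:     " + doc.getEventsPulled() + " events");
        doc.build();
        System.out.println("after build:    " + doc.getEventsPulled() + " events");
    }
}
```

Calling getRootName() a second time requests no further events, because the answer is cached. A deferred-building model applies the same idea to every node it hands out.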

Sample 2: Creating AXIOM Using a String

Next, try creating an Axiom document from a string, which is also very straightforward. The only difference is that you first need to convert the string into a stream.

String xmlString = "<book>" +
                   "<name>Quick-start Axis</name>" +
                   "<isbn>978-1-84719-286-8</isbn>" +
                   "</book>";
//Convert the string into a stream. The charset is given explicitly
//so that the bytes do not depend on the platform default.
ByteArrayInputStream xmlStream =
   new ByteArrayInputStream(
      xmlString.getBytes(java.nio.charset.StandardCharsets.UTF_8));
//Create a builder. Because we want plain XML,
//we can just use the plain OM builder.
StAXBuilder builder = new StAXOMBuilder(xmlStream);
//Get the root element.
OMElement documentElement = builder.getDocumentElement();

As you can see, to create an Axiom from a string, you first get an input stream from it and then follow the same procedure as before. The example makes it clear that creating an Axiom from an input stream or from a string is pretty straightforward. However, elements and nodes also can be created programmatically to modify the structure of the AXIOM element that you created above. The recommended way to create Axiom objects programmatically is to use the factory.

The OMAbstractFactory.getOMFactory() method returns the proper factory; you then call that factory's creator method for each type of object you need.

Sample 3: Creating AXIOM Programmatically

Creating an Axiom programmatically is a little more involved than the previous two cases, and it requires additional steps.

//Obtain a factory.
OMFactory factory = OMAbstractFactory.getOMFactory();
//Use the factory to create a namespace object (URI "axis2", prefix "ns").
OMNamespace axis2 = factory.createOMNamespace("axis2","ns");
//Use the factory to create three elements to represent the book
//element.
OMElement root = factory.createOMElement("book",axis2);
OMElement name = factory.createOMElement("name",axis2);
OMElement isbn = factory.createOMElement("isbn",axis2);

As you can see above, the factory offers a set of factory.create methods. They exist mainly to cater to different implementations while keeping the programmer's code intact. When you use Axiom, it is a best practice to create Axiom objects through the factory; this eases switching between Axiom implementations. Several differences exist between a programmatically created OMNode and one built conventionally from a stream. The most important difference is that the former has no builder object enclosed, whereas the latter always carries a reference to its builder. As discussed earlier, the object model is built as and when required, so every OMNode built from a stream needs a reference to its builder. If that reference is not available, it is because the object was created without a builder. This difference becomes evident when the user tries to get a non-caching pull parser from the OMElement.

Sample 4: Adding a Child Node and Attributes

So far, you've learned how to create Axiom programmatically and by using the StAX API, but these techniques alone are not enough to work with Axiom. You also need to learn how to create and add child nodes. Addition and removal methods are primarily defined in the OMElement interface. The following are the important methods for adding nodes.

public void addChild(OMNode omNode);
public void addAttribute(OMAttribute omAttribute);

Now, try to complete the book element you previously created by adding the "name" and "isbn" child elements to the root element.

root.addChild(name);
root.addChild(isbn);

  • The addChild method will always add the child as the last child of the parent.
  • A given node can be removed from the tree by calling its detach() method. A node also can be removed by calling the remove method of the iterator that returned it; this calls the detach method of that node internally.
  • Namespaces are a tricky part of any XML object model, and Axiom is no exception. However, the interface to the namespace has been made very simple. OMNamespace is the class that represents a namespace, and its setter methods have been intentionally left out. This makes the OMNamespace immutable and allows the underlying implementation to share the objects without any difficulty.
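To see why leaving out the setters makes sharing safe, consider the following minimal sketch. It is not AXIOM's OMNamespace; ImmutableNamespaceSketch, its of() method, and the cache are all assumptions made for illustration. Because an instance can never change after construction, the same object can be handed to any number of elements.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable namespace value; a sketch, not AXIOM's OMNamespace. */
final class ImmutableNamespaceSketch {

    // One shared instance per (prefix, URI) pair.
    private static final Map<String, ImmutableNamespaceSketch> CACHE = new HashMap<>();

    private final String uri;
    private final String prefix;

    private ImmutableNamespaceSketch(String uri, String prefix) {
        this.uri = uri;
        this.prefix = prefix;
    }

    /** Returns the shared instance for this pair, creating it on first use. */
    static synchronized ImmutableNamespaceSketch of(String uri, String prefix) {
        // A prefix cannot contain a space, so the key is unambiguous.
        String key = prefix + " " + uri;
        return CACHE.computeIfAbsent(key, k -> new ImmutableNamespaceSketch(uri, prefix));
    }

    // Getters only: with no setters, a shared instance cannot change under its users.
    String getNamespaceURI() {
        return uri;
    }

    String getPrefix() {
        return prefix;
    }

    public static void main(String[] args) {
        ImmutableNamespaceSketch first = ImmutableNamespaceSketch.of("axis2", "ns");
        ImmutableNamespaceSketch second = ImmutableNamespaceSketch.of("axis2", "ns");
        System.out.println("same instance shared: " + (first == second));
    }
}
```

If the class had a setPrefix method, changing the prefix through one element would silently change it for every other element holding the same object, which is exactly the problem immutability avoids.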

Sample 5: Working with OM Namespaces

As you just saw, namespace handling is a key part in XML processing. Hence, Axiom provides a set of APIs to handle namespaces.

public OMNamespace declareNamespace(String uri, String prefix);
public OMNamespace declareNamespace(OMNamespace namespace);
public OMNamespace findNamespace(String uri, String prefix)
   throws OMException;

As you can see, there are two declareNamespace methods and they are fairly straightforward. Note that a namespace declaration that has already been added will not be added twice. findNamespace is a very handy method to locate a namespace object higher up the object tree. It searches for a matching namespace in its own declarations section and jumps to the parent if it's not found. The search progresses up the tree until a matching namespace is found or the root has been reached.
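The upward search can be sketched with a tiny element model. The class below is hypothetical and is not an AXIOM class; NamespaceLookupSketch and findDeclaringElement are names invented here, and the method returns the declaring element rather than a namespace object so that the walk up the tree is visible. It assumes both the URI and the prefix are supplied.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mini element used only to illustrate the lookup; not an AXIOM class. */
class NamespaceLookupSketch {

    private final String name;
    private final NamespaceLookupSketch parent;                     // null for the root
    private final Map<String, String> declarations = new HashMap<>(); // prefix -> URI

    NamespaceLookupSketch(String name, NamespaceLookupSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    String getName() {
        return name;
    }

    /** Declares a prefix on this element; the map keeps one entry per prefix. */
    void declareNamespace(String uri, String prefix) {
        declarations.put(prefix, uri);
    }

    /**
     * Looks in this element's own declarations first, then in each ancestor,
     * until a match is found or the root has been passed.
     */
    NamespaceLookupSketch findDeclaringElement(String uri, String prefix) {
        for (NamespaceLookupSketch e = this; e != null; e = e.parent) {
            String declared = e.declarations.get(prefix);
            if (declared != null) {
                // A nearer declaration of the prefix shadows any outer one.
                return declared.equals(uri) ? e : null;
            }
        }
        return null; // reached the root without finding the prefix
    }

    public static void main(String[] args) {
        NamespaceLookupSketch book = new NamespaceLookupSketch("book", null);
        NamespaceLookupSketch name = new NamespaceLookupSketch("name", book);
        book.declareNamespace("axis2", "ns");

        NamespaceLookupSketch found = name.findDeclaringElement("axis2", "ns");
        System.out.println("declared on: " + (found == null ? "nowhere" : found.getName()));
    }
}
```

Here the name element declares nothing itself, so the search moves to its parent, book, where the declaration is found.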

During serialization, a namespace created directly from the factory is added to the declarations only when the serializer encounters that prefix. You will learn how to serialize Axiom elements in the next article; however, if you serialize the element you created, you get the following output.

<ns:book xmlns:ns="axis2">
   <ns:name></ns:name>
   <ns:isbn></ns:isbn>
</ns:book>

Sample 6: Working with Attributes

Now, see how to create and add attributes to the book element.

OMAttribute type =
   factory.createOMAttribute("type",null,"web-services");
root.addAttribute(type);

If you serialize the element again, you will see the following output:

<ns:book xmlns:ns="axis2" type="web-services">
   <ns:name></ns:name>
   <ns:isbn></ns:isbn>
</ns:book>

Summary

In this article, I discussed a little of the history of Axiom and the main ways of processing XML, especially the pull and push models. Then, I went through a few examples to give you an idea of the Axiom XML APIs. In the next article, I will discuss more about XML navigation, XPath, and SOAP manipulation with Axiom.

About the Author

Deepal Jayasinghe is a Computer Science graduate student. Before starting his studies, he worked at WSO2 Inc. He is an Apache Member and active contributor to Apache Axis2 and other Apache Web services projects.
