XML Manipulation with Apache AXIOM
What Is AXIOM?
AXIOM stands for AXis2 Object Model and refers to the XML infoset model that was initially developed as part of Apache Axis2 and later moved to the WS Commons project. Axiom is the main XML representation mechanism in Axis2, so any message that arrives at Axis2 is represented as an Axiom object. The XML infoset is the information contained in an XML document, and for programmatic manipulation it is convenient to have a representation of this infoset in a language-specific form. For an object-oriented language, the obvious choice is a model made up of objects. DOM and JDOM are two classic examples of such XML models. From the outside, AXIOM behaves much like these models, but internally it is very different.
AXIOM is a lightweight XML infoset representation with deferred building, based on StAX, the standard streaming pull parser API. The object model can be manipulated as flexibly as any other object model (such as JDOM), but underneath, the objects are created only when they are actually required. This makes programs much less memory intensive, which is particularly useful in applications such as message routing and an ESB (Enterprise Service Bus).
Among the features of AXIOM, deferred building can be considered the most important; it was also one of AXIOM's design goals. As you have seen in Axis2-related articles, one of the issues with Apache Axis1 was its XML representation. It was fully based on DOM and loaded complete messages for processing, which became a performance killer for large messages. AXIOM was introduced to solve those issues, and it has the following key features.
- Lightweight: AXIOM is specifically targeted to be lightweight. This is achieved by reducing the depth of the hierarchy, number of methods, and the attributes enclosed in the objects. This makes the objects less memory intensive.
- Deferred building: The objects are not made unless a need arises for them. This passes the control of building over to the object model itself rather than an external builder.
- Pull based: For a deferred building mechanism, a pull-based parser is required. AXIOM is based on StAX, the standard pull parser API.
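The idea behind deferred building can be sketched with the StAX API that ships with the JDK. The class below is a hypothetical illustration, not AXIOM's implementation: the name DeferredDocument and its methods are assumptions made for this example. It pulls parser events only when the caller asks for something and keeps a count of the events it has pulled, so you can see that accessing the root element consumes far less of the stream than a full build.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical illustration of deferred building; this is not AXIOM's implementation. */
class DeferredDocument {
    private final XMLStreamReader reader;
    private String rootName;      // filled in only when first requested
    private int eventsPulled = 0; // number of next() calls made on the parser so far

    DeferredDocument(String xml) throws XMLStreamException {
        // At this point the caller has not pulled any events from the parser.
        this.reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
    }

    /** Pulls events only until the root start tag is reached, then caches its name. */
    String getRootName() throws XMLStreamException {
        while (rootName == null && reader.hasNext()) {
            int event = reader.next();
            eventsPulled++;
            if (event == XMLStreamConstants.START_ELEMENT) {
                rootName = reader.getLocalName();
            }
        }
        return rootName;
    }

    /** Pulls every remaining event, which corresponds to building the whole model. */
    void buildAll() throws XMLStreamException {
        while (reader.hasNext()) {
            reader.next();
            eventsPulled++;
        }
    }

    int getEventsPulled() {
        return eventsPulled;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<order><id>42</id><item>book</item><item>pen</item></order>";
        DeferredDocument doc = new DeferredDocument(xml);
        System.out.println("events before any access: " + doc.getEventsPulled());
        System.out.println("root element: " + doc.getRootName());
        System.out.println("events after root access: " + doc.getEventsPulled());
        doc.buildAll();
        System.out.println("events after full build: " + doc.getEventsPulled());
    }
}
```

A real deferred model would also create and link objects for the events it pulls; the sketch only counts them to keep the control flow visible.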
What Are Pull Parsing and Push Parsing?
You will encounter the term pull parsing several times throughout the article, so it is essential to understand the concept behind it. XML documents can be parsed using a "pull-based" (StAX) or a "push-based" (SAX) process. Pull parsing is a recent trend in XML processing. The previously popular approaches worked differently: SAX is push-based, which means that parsing control is in the hands of the parser itself, and DOM builds a complete model of the document in memory. These approaches are easy to use, but they are not efficient for large XML documents: a push parser delivers every event whether or not the application needs it, and a DOM tree holds the whole document in memory. Pull parsing inverts the control, so the parser proceeds only at the user's command. The user can decide to store or discard the events generated by the parser. AXIOM is based on pull parsing.
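The difference in control flow can be seen with the two parser APIs included in the JDK. The following is a minimal sketch that does not use AXIOM; the class name PullVsPush, its method names, and the sample document are assumptions made for illustration. With StAX the caller asks for each event and can stop as soon as it has what it needs, whereas with SAX the parser calls back into the handler for every element in the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PullVsPush {
    /** Pull (StAX): the caller drives the parser and stops at the first match. */
    static String firstElementPull(String xml, String wanted) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(wanted)) {
                    // The caller never asks for the events that follow this element.
                    return reader.getElementText();
                }
            }
            return null; // no element with the requested name
        } finally {
            reader.close();
        }
    }

    /** Push (SAX): the parser drives and reports every element to the handler. */
    static List<String> elementNamesPush(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><id>42</id><item>book</item><item>pen</item></order>";
        System.out.println(firstElementPull(xml, "id"));
        System.out.println(elementNamesPush(xml));
    }
}
```

In the pull version the loop belongs to the application, which is what lets a model such as AXIOM pause parsing and resume it later.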
Get Started with Samples
To start working with Axiom, you first need to obtain it. You can either download the AXIOM binary or build it from the source distribution (or from the source repository). AXIOM was started as a part of Axis2, but it now has its own release cycle. Therefore, you can get the AXIOM binary either from an AXIOM release or from an Axis2 release.
Once you have the Axiom binary, the next step is to add it, together with the binaries it depends on, to your classpath; only then can you start to work with Axiom. If your application uses a build system such as Maven, you can add AXIOM as a dependency and let the build system download the AXIOM jars automatically.
You can create Axiom in three ways, as shown in the following figure. First, you can create Axiom from a pull event stream (StAX). Second, you can create Axiom from a push event stream (SAX). Third, you can create Axiom programmatically. Here, you will learn how to create Axiom from a pull event stream and programmatically, because those are the two most common methods.
First, look at how to create Axiom using a pull event stream. Axiom provides the notion of a factory and a builder to create objects. The factory helps keep your code at the interface level, separate from the implementations. Because Axiom is tightly bound to StAX, a StAX-compliant reader should be created first with the desired input stream. Then, you can select one of the available builders. Axiom offers different types of builders, mainly for user convenience: there are OM builders as well as SOAP builders, so you can use the one that suits your requirement. StAXOMBuilder builds a pure XML infoset-compliant object model, whereas the SOAPModelBuilder returns SOAP-specific objects (such as the SOAPEnvelope, which is a subtype of OMElement) through its builder methods.
Sample 1: Creating AXIOM from an inputstream
The following code illustrates the correct method of creating an Axiom document from a file input stream.
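    import java.io.FileInputStream;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamReader;
    import org.apache.axiom.om.OMElement;
    import org.apache.axiom.om.impl.builder.StAXOMBuilder;

    // create the parser ("file" is the XML file to read, defined elsewhere)
    XMLStreamReader parser = XMLInputFactory.newInstance().createXMLStreamReader(
            new FileInputStream(file));
    // create the builder
    StAXOMBuilder builder = new StAXOMBuilder(parser);
    // get the root element of the XML
    OMElement documentElement = builder.getDocumentElement();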
Step 1: Create a parser or reader. In this case, you create a parser.
Step 2: Next, create a builder using the parser (or reader). In this case, create StAXOMBuilder.
Step 3: Get the Axiom document element from the builder.