Jumping into JAXP

Java plus XML is a combination of skills that is currently much in demand.
For Java programmers who want to jump into the XML fray, this article shows the
basics of using the Java API for XML Processing (JAXP).

API Overview

The Java API for XML Processing (JAXP) gives Java programmers a standardized API for working with XML documents, independent of the actual XML parser that is used. The classes and interfaces that comprise JAXP can be divided into three general categories:

  • Document Object Model (DOM) – These classes are used to process XML
    documents as DOM documents.

  • Simple API for XML (SAX) – These classes are used to parse XML documents in
    an event-driven manner.

  • Extensible Stylesheet Language Transformations (XSLT) –
    These classes are used to transform XML documents using XSL stylesheets.

Tools

To try the examples in this article, you will need the following tools:

  • Java Development Kit (JDK) 1.3 or higher
  • Java XML Pack 1.2
  • Your favorite text editor

Using DOM

The Document Object Model (DOM) is a standardized API that is used to represent,
navigate, and manipulate the structure and content of structured documents, such
as valid HTML and XML. Documents are represented in DOM as trees: each
document contains one root node, which has zero or more child nodes, and each
child node can in turn be the root of its own subtree.

To create a DOM document from an XML file, you use classes in the
javax.xml.parsers package. DocumentBuilderFactory is used to create instances of
DocumentBuilder, which is used to create DOM documents from XML sources. To
navigate and manipulate the document, you use the classes in the org.w3c.dom
package, such as Document and Node.
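
As a minimal sketch of the factory, builder, and document chain described above, the following parses XML from a string rather than a file and returns the name of the root element. The class name DomParseSketch and the method rootElementName are made up for illustration and are not part of JAXP.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
public class DomParseSketch {

    // Parses an XML string into a DOM Document and returns the root element's name.
    public static String rootElementName(String xml) throws Exception {
        // The factory hides which parser implementation is actually used.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // An InputSource wraps the character stream that holds the XML text.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // The document element is the single root element of the tree.
        Element root = doc.getDocumentElement();
        return root.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<project name=\"demo\"><target name=\"build\"/></project>";
        System.out.println(rootElementName(xml));
    }
}
```

To read from a file instead, DocumentBuilder also has a parse method that accepts a java.io.File.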

DomPrint is an example of using the JAXP
DOM classes. It creates a DOM document from a given XML file, then prints the
content as plain text, with indentation to indicate nested elements. Even though
it is recursive, the algorithm for DomPrint is straightforward:

  1. Check command-line arguments. If not enough arguments,
    print usage message, then exit.

  2. Create a File object from the first command-line
    argument.

  3. Get an instance of DocumentBuilderFactory and
    configure it.

  4. Get a DocumentBuilder from the DocumentBuilderFactory.

  5. Tell the DocumentBuilder to parse the given file and
    return a DOM Document.

  6. Print the tree, starting from the root node:

    1. Print indentation for the given nesting level (0 =
      no indentation).

    2. Print the node name.

    3. If the node has attributes, print them, one per
      line, indented under the node name.

    4. Print the node value on the next line after the node
      name.

    5. If the node has children:

      1. Increment indentation level.

      2. For each child: print the tree, starting from the
        child.
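
The steps above can be sketched roughly as follows. This is not the original DomPrint source, which is not reproduced here. The class name DomPrintSketch, the helper renderXml, the "@" prefix on attributes, and the choice to skip whitespace-only node values are all assumptions made for this illustration.

```java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch of the algorithm, not the article's DomPrint.
public class DomPrintSketch {

    // Returns two spaces per nesting level.
    private static String indent(int level) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < level; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    // Appends the node and, recursively, its subtree to out.
    public static void printTree(Node node, int level, StringBuilder out) {
        String pad = indent(level);
        out.append(pad).append(node.getNodeName()).append('\n');

        // getAttributes() returns null for anything that is not an element.
        NamedNodeMap attrs = node.getAttributes();
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(pad).append("  @").append(attr.getNodeName())
                   .append('=').append(attr.getNodeValue()).append('\n');
            }
        }

        // getNodeValue() is null for elements; whitespace-only text is skipped here.
        String value = node.getNodeValue();
        if (value != null && !value.trim().isEmpty()) {
            out.append(pad).append("  ").append(value.trim()).append('\n');
        }

        // Recurse into each child one level deeper.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            printTree(children.item(i), level + 1, out);
        }
    }

    // Convenience for trying the sketch on an in-memory string.
    public static String renderXml(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        printTree(doc, 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: java DomPrintSketch <xml-file>");
            System.exit(1);
        }
        File file = new File(args[0]);
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(file);

        StringBuilder out = new StringBuilder();
        printTree(doc, 0, out);
        System.out.print(out);
    }
}
```

Because the Document itself is a Node, the traversal starts at the document node (named #document) and reaches the root element as its child.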

Running DomPrint on an Ant project file produces
this output. Running it on a DocBook article produces this output.

Using SAX

The Simple API for XML (SAX) is an event-based API for processing XML documents.
As a document is parsed, events, such as document start or element start, are
reported to an application. In order to handle these events, the application
implements event handling interfaces.

To parse an XML document with SAX, you use the classes in the javax.xml.parsers
package. SAXParserFactory is used to create instances of SAXParser, which is
used to parse XML documents. To handle parsing events, you extend
org.xml.sax.helpers.DefaultHandler or implement org.xml.sax.ContentHandler.

SaxPrint is an example of using SAX to parse
an XML document. It parses a given XML file and prints the content as block-
structured text. Here is the algorithm:

  1. Get command-line arguments. If not enough arguments,
    print usage message, then exit.

  2. Create File from first command-line argument.

  3. Get an instance of SAXParserFactory and configure it.

  4. Get a SAXParser from the SAXParserFactory.

  5. Tell the SAXParser to parse the given file.

  6. Handle events:

    1. When startDocument: print “BEGIN DOCUMENT”.

    2. When endDocument: print “END DOCUMENT”.

    3. When startElement:

      1. Print “BEGIN ” + element name.

      2. If element has attributes, print them, indented
        under element name.

    4. When endElement: print “END ” + element name.
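
A rough sketch of this event-handling approach is shown below. It is not the original SaxPrint source. The class name SaxPrintSketch, the nested PrintHandler, and the helper render (which parses a string so the sketch can be tried without a file) are assumptions made for illustration.

```java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a sketch of the algorithm, not the article's SaxPrint.
public class SaxPrintSketch {

    // Records one line of text per SAX event it receives.
    public static class PrintHandler extends DefaultHandler {
        public final StringBuilder out = new StringBuilder();

        @Override
        public void startDocument() {
            out.append("BEGIN DOCUMENT\n");
        }

        @Override
        public void endDocument() {
            out.append("END DOCUMENT\n");
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) {
            out.append("BEGIN ").append(qName).append('\n');
            // Attributes are listed indented under the element name.
            for (int i = 0; i < attrs.getLength(); i++) {
                out.append("  ").append(attrs.getQName(i))
                   .append('=').append(attrs.getValue(i)).append('\n');
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("END ").append(qName).append('\n');
        }
    }

    // Parses an in-memory XML string and returns the recorded event text.
    public static String render(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        PrintHandler handler = new PrintHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: java SaxPrintSketch <xml-file>");
            System.exit(1);
        }
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        PrintHandler handler = new PrintHandler();
        // The parser calls back into the handler as it reads the file.
        parser.parse(new File(args[0]), handler);
        System.out.print(handler.out);
    }
}
```

Extending DefaultHandler means only the events of interest need to be overridden; the rest keep their empty default implementations.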

Running SaxPrint on an Ant project file produces
this output. Running it on a DocBook article produces this output.

Using Transformations

The Extensible Stylesheet Language Transformations (XSLT) classes are used to
transform XML documents into other forms, such as other XML structures, HTML, or
plain text. Transformation is accomplished by applying instructions (rules) in
an XSL stylesheet to an input source and creating an output result. Both the
input source and the output result can be a DOM document, SAX events, or an
XML stream.

To transform an XML document with XSLT, you use the classes in the
javax.xml.transform package. TransformerFactory is used to create instances of
Transformer, which is used to run transformations. Input sources and output
results are created with the classes in the package that corresponds to the type
of source or result. For example, stream sources are created with the classes in
the javax.xml.transform.stream package.

Transform is an example of transforming a given XML
file with a given XSL stylesheet. Both the input and the result are streams.
Here is the algorithm:

  1. Get command-line arguments. If not enough arguments, print usage
    message, then exit.

  2. Create stylesheet File from first command-line argument, input File from
    second command-line argument.

  3. Create stream sources for stylesheet and input file, stream result for
    System.out.

  4. Get instance of TransformerFactory.

  5. Get a Transformer from TransformerFactory that uses the given
    stylesheet.

  6. Tell the Transformer to transform the input stream
    and write the output to the result stream.
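
The steps above might look roughly like the following sketch. It is not the original Transform source. The class name TransformSketch and the string-based helper transform (added so the sketch can be tried without files) are assumptions made for illustration.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name; a sketch of the algorithm, not the article's Transform.
public class TransformSketch {

    // Applies the stylesheet text to the XML text and returns the result as a string.
    public static String transform(String xsl, String xml) throws Exception {
        StreamSource stylesheet = new StreamSource(new StringReader(xsl));
        StreamSource input = new StreamSource(new StringReader(xml));
        StringWriter buffer = new StringWriter();

        TransformerFactory factory = TransformerFactory.newInstance();
        // The Transformer is bound to one stylesheet when it is created.
        Transformer transformer = factory.newTransformer(stylesheet);
        transformer.transform(input, new StreamResult(buffer));
        return buffer.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: java TransformSketch <stylesheet> <xml-file>");
            System.exit(1);
        }
        // Stream sources for the stylesheet and input, stream result for System.out.
        StreamSource stylesheet = new StreamSource(new File(args[0]));
        StreamSource input = new StreamSource(new File(args[1]));
        StreamResult result = new StreamResult(System.out);

        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(stylesheet);
        transformer.transform(input, result);
    }
}
```

Swapping the stream classes for those in javax.xml.transform.dom or javax.xml.transform.sax changes the kind of source or result without changing the rest of the code.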

Running Transform on article.xml using article2html.xsl produces this output.

Copyright © 2002, Thornton Rose
