Java JAXP, Exposing a DOM Tree

Java Programming Notes # 2204


Preface

What is JAXP?

As the name implies, the Java API for XML Processing (JAXP) is an
API designed
to help you write programs for processing XML documents.  JAXP is
very important for many reasons, not the least of which is the
fact that it is a critical part of the Java Web Services Developer Pack
(JWSDP).  As you are probably already aware, many expect web services
to be a very important aspect of the Internet of the future.

This is the third lesson in a series designed to initially help you
understand how to use JAXP,
and to eventually help you understand how to use the JWSDP.

The first lesson was entitled Java API for XML Processing (JAXP),
Getting Started.  The previous lesson was entitled Getting Started
with Java JAXP and XSL Transformations (XSLT).

What is XML?

XML is an acronym for the eXtensible Markup Language. 
I will not attempt to teach XML in this series of
tutorial lessons.  Rather, I will assume that you already
understand
XML, and I will teach you how to use JAXP to write programs for
creating and processing XML documents.

I have published numerous tutorial lessons on XML at Gamelan.com and www.DickBaldwin.com.
You may find it useful to refer to those lessons.  In addition, I
provided
a review of the salient aspects of XML in the first lesson in this
series.  From time to time, I will also provide background
information regarding XML in the lessons in this series.

Viewing tip

You may find it useful to open another copy of this lesson in a
separate browser window.  That will make it easier for you to
scroll back and forth among the different listings and figures while
you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive
collection of online Java tutorials.  You will find those lessons
published at Gamelan.com.
However, as of the date of this writing, Gamelan doesn’t maintain a
consolidated index of my Java tutorial lessons, and sometimes
they are difficult to locate there.  You will find a consolidated
index at www.DickBaldwin.com.

Preview

A tree structure in memory

A DOM parser can be used to
create a tree structure in memory that represents an XML
document.  In Java, that tree structure is encapsulated in an
object of the interface type Document.  Document
and its superinterface Node declare numerous methods that may
be used to navigate, extract information from, modify, and otherwise
manipulate the DOM tree.  As
is always the case, classes that implement Document must
provide concrete definitions of those methods.

Many operations are possible

Given an object of type Document, there are many
methods that
can be invoked on the object to perform a variety of operations. 
For example, it is possible to move nodes from one location in the tree
to another location in the tree, thus rearranging the structure of the
XML document represented by the Document object.  It is
also possible to delete nodes, and to insert new nodes.  It is
also possible
to
recursively traverse the tree, extracting information about the nodes
along
the way.
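
The operations described above can be sketched in a few lines.  The following is a minimal illustration, not one of the programs from this lesson.  The class name DomEditSketch, its helper methods, and the tiny inline XML string are all assumptions made for the example.  It moves a node, deletes a node, and inserts a new node using only methods declared by the Node and Document interfaces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class DomEditSketch {

  // Parse an XML string into a Document, wrapping checked exceptions.
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Concatenate the names of a node's child elements, in document order.
  static String childNames(Node parent) {
    StringBuilder sb = new StringBuilder();
    for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.ELEMENT_NODE) {
        sb.append(n.getNodeName());
      }
    }
    return sb.toString();
  }

  // Move the first child to the end, delete the new first child,
  // then insert a new element named x at the front.
  static String rearrange(String xml) {
    Document doc = parse(xml);
    Element root = doc.getDocumentElement();

    // Move: appendChild detaches the node from its old position first.
    root.appendChild(root.getFirstChild());
    // Delete the node that is now first.
    root.removeChild(root.getFirstChild());
    // Insert a newly created element before the current first child.
    root.insertBefore(doc.createElement("x"), root.getFirstChild());

    return childNames(root);
  }

  public static void main(String[] args) {
    // Children a, b, c become x, c, a.
    System.out.println(rearrange("<r><a/><b/><c/></r>"));
  }
}
```

Note that no separate "move" method is needed.  Appending a node that is already in the tree removes it from its old location as part of the operation.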

I showed you …

In the previous lesson on Java JAXP, I began by providing a brief
review of XSL and XSL Transformations (XSLT).

Then I showed you how to create an identity Transformer
object, and how to use that object to:

  • Display a DOM tree structure on the screen in XML format.
  • Write the contents of a DOM tree structure into an output XML
    file.

Following that, I showed you how to write exception handlers that
provide meaningful information in the event of errors and exceptions,
with particular emphasis on parser errors and exceptions.

I will show you

In this lesson, I will show you how to write a program to display a
DOM tree on the screen in a format that is much easier to interpret
than raw XML code.  I will explain two different versions of the
program.  One version will simply identify text nodes in the
output tree.  The other will display the value of text nodes in
the output tree.  The first version will ignore attributes in the
output tree.  The second version will include attributes in the
output tree.

Discussion
and Sample Code


The first program that I will discuss, named DomTree01, analyzes a DOM tree that
represents an XML document, and produces an output on the screen
similar to the tree shown in Figure 1.
 

#document DOCUMENT_NODE
A DOCUMENT_TYPE_NODE
#comment COMMENT_NODE
xml-stylesheet PROCESSING_INSTRUCTION_NODE
processor PROCESSING_INSTRUCTION_NODE
A ELEMENT_NODE
Q ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
#cdata-section CDATA_SECTION_NODE
R ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
S ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
S ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
T ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
D ELEMENT_NODE
E ELEMENT_NODE
#text TEXT_NODE
G ELEMENT_NODE
#text TEXT_NODE
#text TEXT_NODE
F ELEMENT_NODE
#text TEXT_NODE
E ELEMENT_NODE
#text TEXT_NODE
F ELEMENT_NODE
#text TEXT_NODE
E ELEMENT_NODE
#text TEXT_NODE
F ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
R ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
R ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
Figure 1

The physical tree structure shown in Figure 1 represents the
corresponding XML document as a visual tree.  As I discuss the
various parts of the XML document, you should be able to correlate
those parts of the document to the tree structure shown in Figure 1.

The sample XML
file named DomTree01.xml

The tree structure in Figure 1 corresponds to an XML file named DomTree01.xml.  As is often the
case, I will discuss the XML files and the programs in fragments. 
A complete listing of DomTree01.xml
is shown in Listing 21 near the end of the lesson.  Listing 1
shows the beginning of the XML file.

<?xml version="1.0"?>

<!DOCTYPE A [
<!ELEMENT A (Q,B,B)*>
<!ELEMENT B (B | C | D | R | S | T)*>
<!ELEMENT C (#PCDATA)>
<!ELEMENT D (E | F)*>
<!ELEMENT E (#PCDATA | G)*>
<!ELEMENT F (#PCDATA)>
<!ELEMENT G (#PCDATA)>
<!ELEMENT Q (#PCDATA)>
<!ELEMENT R (#PCDATA)>
<!ELEMENT S (#PCDATA)>
<!ELEMENT T (#PCDATA)>
]>

<!-- File DomTree01.xml
Copyright 2003 R. G. Baldwin
Used to test the program named
DomTree01.java
-->

<?xml-stylesheet type="text/xsl"
href="Dom03.xsl"?>

<?processor ProcInstr="Dummy"?>

Listing 1

The structure
of the XML file named DomTree01.xml

That portion of the XML file shown in Listing 1 consists of five
items that are represented by the following nodes in the DOM tree:

  • A Document node
    • A Document-Type node
    • A Comment node
    • A Processing Instruction node representing a stylesheet
    • A Processing Instruction node representing a dummy processing
      instruction

The last four nodes in the above list are
children of the Document node.  The Document node is the root of
the entire DOM tree, and all other nodes in the DOM tree are descendants
of the Document node.

The five items are separated by blank lines in Listing 1, so you
should be able to correlate them visually with the five nodes in the
above list.

(Note
that although it is tempting to believe that the Document node
correlates with the XML declaration in the first line of Listing 1, the
XML declaration is not required, and the DOM tree will be rooted in a
Document node, even in the absence of an XML declaration.)

The DOM tree
exposed

Figure 2 shows a reproduction of the first five lines from Figure
1.  Each line in Figure 2 represents a node in the DOM tree. 
You should be able to correlate each line in Figure 2 with one
of the nodes in the above list, and also with one of the items in
Listing 1 (except for the
DOCUMENT_NODE for which there is no explicit item in Listing 1).

The indentation in Figure 2 indicates that the last four lines in
Figure 2 represent nodes that are children of the node represented by
the Document node in the first line.
 

#document DOCUMENT_NODE
  A DOCUMENT_TYPE_NODE
  #comment COMMENT_NODE
  xml-stylesheet PROCESSING_INSTRUCTION_NODE
  processor PROCESSING_INSTRUCTION_NODE
Figure 2

The prolog of
the XML document

Listing 1 shows the prolog
for this XML document, which includes everything prior to the start tag
for the root element.  Figure 2 shows the DOM nodes associated
with the prolog.

The root
element in the XML document

Listing 2 shows the XML code for the root element and the six nodes
following the root-element node in the DOM tree. 
The XML code in Listing 2 produces the following node types in the DOM
tree, with the parent-child relationships shown.

  • An Element node named A, which is the root element node
    • An Element node named Q
      • A Text node
    • An Element node named B
      • An Element node named C
        • A Text node
        • A CDATA Section node
<A>
<Q>A Big Header</Q>

<B>
<C>Level 0. This is the beginning of a B.
This text is in the Introduction section.
<![CDATA[This is CDATA < > " &]]></C>

Listing 2

A is a child of
the document root node

Referring back to Figure 1, you can see that the Element node named A
is a child of the Document node,
which forms the root of the DOM tree.  The node for element A is
the root element node for the DOM tree (which is different from the
root node for the DOM tree).  All of the data stored in an XML
document is stored in the root element node and its descendants.
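
The distinction between the root node and the root element node can be checked directly.  The following is a small sketch under stated assumptions: the class name RootNodeSketch and the inline XML string are hypothetical, and the XML deliberately has no XML declaration.  It relies on the getDocumentElement method of the Document interface to obtain the root element node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class RootNodeSketch {

  // Parse an XML string into a Document, wrapping checked exceptions.
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Describe the root node (the Document) and the root element node.
  static String describe(String xml) {
    Document doc = parse(xml);
    Element rootElement = doc.getDocumentElement();
    return doc.getNodeName() + " type=" + doc.getNodeType()
        + "; " + rootElement.getNodeName()
        + " type=" + rootElement.getNodeType()
        + "; parentIsDocument="
        + rootElement.getParentNode().isSameNode(doc);
  }

  public static void main(String[] args) {
    // No XML declaration is present, yet the tree is still rooted
    // in a Document node whose child is the root element A.
    System.out.println(describe("<A><Q>A Big Header</Q></A>"));
  }
}
```

The type values 9 and 1 in the result are the values of the constants Node.DOCUMENT_NODE and Node.ELEMENT_NODE respectively.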

Figure 3 shows a reproduction of the next seven lines from Figure 1,
showing the tree structure and the parent-child relationships among the
nodes.  The nodes shown in Figure 3 correspond to the XML code in
Listing 2.
 

  A ELEMENT_NODE
    Q ELEMENT_NODE
      #text TEXT_NODE
    B ELEMENT_NODE
      C ELEMENT_NODE
        #text TEXT_NODE
        #cdata-section CDATA_SECTION_NODE
Figure 3

Easier to
interpret

Unless you have a lot of practice reading XML code, you may have
concluded by now that the representations of the DOM tree in Figures 2
and 3 are much easier to get your mind around than the raw XML shown in
Listings 1 and 2.

Node types seen
thus far

So far, we have seen the following types of nodes:

  • Document node
  • Document-Type node
  • Comment node
  • Processing Instruction node
  • Element node
  • Text node
  • CDATA Section node

It will be useful at this point to provide a brief explanation for each
of
these node
types.

The Document
node and the XML declaration

According to XML in a Nutshell by Harold and Means, which I recommend
as an excellent book,

“XML
documents should, (but do not have to) begin with an XML
declaration.  The XML declaration looks like a processing
instruction with the name xml and version, standalone, and encoding
attributes.  Technically, it’s not a processing instruction
though, just the XML declaration; nothing more, nothing less.”

As I mentioned earlier, every XML DOM tree is rooted in a Document
node, even in the absence of an XML declaration.  Apparently, the
DOM
tree does not contain a node that represents the XML declaration, and
the XML document doesn’t contain any specific text that represents the
Document node.

Although the XML declaration is used for
information purposes by a validating XML parser, if it is possible to
recover the XML declaration from the DOM tree, I don’t know how to do
that at this time.

Document-Type
node

A valid XML document contains a reference to a Document Type Definition (DTD) to
which the document should be compared for validation purposes. 
The DTD can also be included in the XML document prolog, as is the case
in Listing 1.

(The
DTD in Listing 1 begins with <!DOCTYPE and ends with ]>)

According to XML in a Nutshell,

“DTDs
are written in a formal syntax that explains precisely which elements
and entities may appear where in the document and what the elements’
contents and attributes are.”

For example, the DTD in Listing 1 states that the content of the element
named A consists of zero or more repetitions of the sequence of elements
named Q, B, and B, in that order.  I’m not
going to try to explain the rules for writing DTDs.  There are
numerous tutorials on the Web that you can refer to in this regard.

The DTD in Listing 1 produced the Document-Type node in the tree in
Figure 2.

(In
certain situations, a schema can be used for validation in place of a
DTD.)



Comment node

A comment in XML means pretty much the same thing as a comment in
Java.  XML comments are generally ignored by XML processors. 
They are intended primarily for human consumption.

Listing 1 contains an XML comment with the file name and some other
information.  This comment produced the Comment node in the tree
of Figure 2.

Processing
Instruction node

XML processing instructions begin with <? and end with ?>. 
Processing instructions are intended to provide instructions to
processing programs that may be called upon to process an XML document.

Listing 1 contains two separate processing instructions.  The two
processing instructions gave rise to the two Processing Instruction
nodes in the tree in Figure 2.

Element node

As you learned in the previous two lessons, XML syntax includes
elements, consisting of start tags, end tags, optional content, and
optional attributes.

Listing 2 contains all or part of several elements.  The elements
gave rise to the Element nodes in Figure 3.  The text content of
the elements gave rise to the Text nodes in Figure 3.

(Note
that the actual text in this XML document is not intended to have any
meaning other than to constitute text nodes in the DOM tree for
illustration purposes.)

Text node

When you include text as part or all of
the content of an XML element, each chunk of text gives rise to a text
node in the DOM tree.  Figure 3 shows two text nodes produced by
the text content of the elements in Listing 2.

CDATA Section
node

XML recognizes two kinds of text data, PCDATA and CDATA.  PCDATA
stands for parsed character data.  CDATA stands for character data.

The primary difference between the two is as follows.  PCDATA
cannot contain
certain characters such as left angle brackets (<) and ampersands
(&).  The reason is that a left angle bracket would confuse
the parser, causing it to believe that it had encountered the first
character in a start or end tag.  Therefore, if these characters
appear in
PCDATA, they must be represented by entities, such as &lt;.
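
To see how such entities behave after parsing, consider the following minimal sketch.  The class name EntitySketch and the sample element are assumptions made for illustration.  The XML source uses the predefined entities for the left angle bracket and the ampersand, and the text retrieved from the DOM tree contains the actual characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class EntitySketch {

  // Parse the XML string and return the text content of the root element.
  static String textOf(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // The source contains entities; the DOM text contains < and &.
    System.out.println(textOf("<Q>a &lt; b &amp; c</Q>"));
  }
}
```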

A CDATA section

When a block of text is declared to be of type CDATA, the parser does
not interpret its content as markup.  Therefore, it can contain any
characters (with the exception of the sequence ]]>, which ends the
block).  A block of CDATA always begins with
<![CDATA[.  The block always ends with ]]>.

(Note
that the periods in the above sentences are not parts of the CDATA
beginning and ending syntax.)

Listing 2 contains a block of CDATA, which gave rise to the CDATA
Section node in Figure 3.

Note that the Element node named C in Figure 3 has two children. 
One child is a text node.  The other child is a CDATA
Section node.
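
The two children of an element like C can be observed with a short sketch.  This is an illustration only: the class name CdataSketch and the sample XML string are hypothetical, and the sketch assumes a factory in its default configuration, where coalescing is off so that CDATA sections are kept as separate nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class CdataSketch {

  // Return the node names of the root element's children,
  // separated by commas.
  static String childNodeNames(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      NodeList children = doc.getDocumentElement().getChildNodes();
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < children.getLength(); i++) {
        if (i > 0) {
          sb.append(',');
        }
        sb.append(children.item(i).getNodeName());
      }
      return sb.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Ordinary text followed by a CDATA block inside one element.
    System.out.println(childNodeNames(
        "<C>Level 0.<![CDATA[This is CDATA < > \" &]]></C>"));
  }
}
```

If the factory's setCoalescing method is called with true before the parser is created, the CDATA section is converted to text and merged with the adjacent Text node, so only one child would be reported.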

An interesting
case involving whitespace

I’m not going to bore you by discussing the entire XML document in this
level of detail.  By now, you should be able to compare the XML in
Listing 21 with the DOM tree represented by Figure 1, and understand
how the XML code relates to the DOM tree.

However, there is one tricky aspect involving whitespace that deserves a little
more explanation.  The DOM tree nodes shown in Figure 4 represent the
XML code shown in Listing 3.
 

            E ELEMENT_NODE
              #text TEXT_NODE
              G ELEMENT_NODE
                #text TEXT_NODE
              #text TEXT_NODE
            F ELEMENT_NODE
              #text TEXT_NODE
Figure 4

Too many text
nodes

I have colored the obvious text in Listing 3 green for emphasis. 
At
first glance, it would appear that there are too many Text nodes
showing in Figure 4 to correspond to the text shown in Listing 3.

<E>First list item in E
<G>Nested G text element</G>
</E>
<F>First list item in F</F>

Listing 3

Another
representation of the DOM tree

Figure 5 shows another representation of the DOM tree, similar to
Figure
4, except that the actual text belonging to each Text node is shown in
Figure 5.
 

            E ELEMENT_NODE
              #text First list item in E

              G ELEMENT_NODE
                #text Nested G text element
              #text

            F ELEMENT_NODE
              #text First list item in F
Figure 5

Note the blank lines in Figure 5.  These are caused by newline
characters in the actual XML code in Listing 3.  In particular,
there are two Text nodes belonging to the element named E.  One of
those Text nodes appears before
the element named G and the other appears after the element named
G.  The Text
node after the element named G was caused by the newline character
immediately following the end tag for the element named G.

Element E may
contain PCDATA

This happens because of one line in the DTD shown in Listing 1 and
repeated below for convenience.

<!ELEMENT E (#PCDATA | G)*>

This DTD statement says that the content for an element named E may
contain Text nodes (#PCDATA) and/or elements named G in any number and
in
any order.  Thus, simple newline characters inserted into the XML
to make it easier to read were interpreted as Text nodes.  This
gave rise to what appears to be extra Text nodes in Figure 4.
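
The extra Text node can be reproduced with a short sketch.  The class name WhitespaceTextSketch is hypothetical, and for brevity the sketch parses only the E element from Listing 3 with a non-validating parser and no DTD, which is an assumption that differs from the way DomTree01 was run.  The newline after the end tag for G still shows up as a third child of E.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class WhitespaceTextSketch {

  // The E element from Listing 3, including its newline characters.
  static final String XML =
      "<E>First list item in E\n"
      + "<G>Nested G text element</G>\n"
      + "</E>";

  // Parse an XML string and return its root element.
  static Element rootOf(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Number of child nodes belonging to the root element.
  static int childCount(String xml) {
    return rootOf(xml).getChildNodes().getLength();
  }

  // Value of the last child node of the root element.
  static String lastChildValue(String xml) {
    return rootOf(xml).getLastChild().getNodeValue();
  }

  public static void main(String[] args) {
    // Text, element G, then a Text node holding only a newline.
    System.out.println("children of E: " + childCount(XML));
    System.out.println("last child is whitespace only: "
        + lastChildValue(XML).trim().isEmpty());
  }
}
```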

That’s probably enough talk.  It’s time to see some Java code.

The program
named DomTree01

With the preceding discussion as background, I will now discuss the
program named DomTree01,
which was used to process the file named DomTree01.xml
and to produce the DOM tree representation shown in Figure 1.  As
usual, I
will discuss the program in fragments.  A complete listing of the
program is shown in Listing 20 near the end of the lesson.

Purpose and
limitations of the program

This program produces a text-based output on the screen that represents
the DOM tree structure for an XML file.  Note that although the
code was written to support these node types, the program was not
actually tested for the following node types:

  • DOCUMENT_FRAGMENT_NODE
  • ENTITY_NODE
  • ENTITY_REFERENCE_NODE
  • NOTATION_NODE

Note also that this program does not display attributes.  That
will be accomplished in the sample program named DomTree02 to be discussed later in
this lesson.

Also note that for simplicity, no effort was made to cause the program
to produce meaningful output in the event of errors and exceptions.

The program was tested using Sun’s SDK 1.4.2 under WinXP.

Overall program
structure

This program consists of a single class with a main method that runs as a Java
application.  Listing 4 shows the beginning of the class
definition and the beginning of the main
method.

public class DomTree01{

  int indent = -1;//Indentation level for display

  public static void main(String argv[]){
    if (argv.length != 2){
      System.err.println(
              "usage: java DomTree01 fileIn validate");
      System.err.println(
              "validate = n for no, y for yes");
      System.exit(0);
    }//end if

Listing 4

The code in Listing 4 is straightforward:

  • It declares and
    initializes an instance variable that is used later for control of
    indentation in the output display.
  • It also provides usage instructions if the user
    starts the program with the wrong number of command-line arguments.

Running the
program

Two command-line parameters are required.  The first parameter is
the path and file name of the file containing the XML document to be
processed.  The second command-line parameter is either “y” or “n”
specifying whether or not the parser should attempt to validate the XML
document.

(If
the program is instructed to validate the document,
a DTD (or schema) must be
provided either inline or as a reference in the XML document.)

Steps for creating a Document object

As you learned in an earlier lesson, three steps
are required to create a Document object:

  1. Create a DocumentBuilderFactory object
  2. Use the DocumentBuilderFactory object to create a DocumentBuilder
    object
  3. Use the parse method of the DocumentBuilder object
    to create a Document object

Create a
DocumentBuilderFactory object

The first step in the above list is accomplished by the code in Listing
5.

    try{
      //Get a factory object for DocumentBuilder
      // objects
      DocumentBuilderFactory factory =
              DocumentBuilderFactory.newInstance();

      //Configure the factory object setting
      // validating true or false based on user
      // input.
      if(argv[1].equals("n")){
        factory.setValidating(false);
      }else{
        factory.setValidating(true);
      }//end if/else

      factory.setNamespaceAware(false);

      //Set to ignore cosmetic white space
      // between elements.
      factory.
        setIgnoringElementContentWhitespace(true);

Listing 5

There is very little in Listing 5 that wasn’t discussed in detail in
earlier lessons.  About the only thing that is new is the
invocation of the setter method at the end of Listing 5 to cause the
parser to ignore cosmetic whitespace in the XML document.

(Cosmetic
whitespace consists of spaces, tabs, newlines, etc., inserted into the
XML document between elements to make the document easier to read.)

This wasn’t discussed in the previous lessons because it only works
with a validating parser.  The parsers used in the two previous
lessons were not validating parsers.
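
The effect of this setting can be seen in a small experiment.  The following sketch is an illustration only: the class name IgnoreWhitespaceSketch and the tiny inline document with its DTD are assumptions.  It turns validation on in both cases and varies only the whitespace setting, then counts the children of the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class IgnoreWhitespaceSketch {

  // A valid document whose root element A has element-only content,
  // with newlines inserted between the tags for readability.
  static final String XML =
      "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE A [\n"
      + "<!ELEMENT A (Q)*>\n"
      + "<!ELEMENT Q (#PCDATA)>\n"
      + "]>\n"
      + "<A>\n<Q>x</Q>\n</A>";

  // Parse with a validating parser and count the children of A.
  static int rootChildCount(boolean ignoreWhitespace) {
    try {
      DocumentBuilderFactory factory =
          DocumentBuilderFactory.newInstance();
      factory.setValidating(true);
      factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
      Document doc = factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(XML)));
      return doc.getDocumentElement().getChildNodes().getLength();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("ignoring whitespace: " + rootChildCount(true));
    System.out.println("keeping whitespace:  " + rootChildCount(false));
  }
}
```

With the parser bundled in the JDK, the first count is expected to be 1 (only Q) and the second 3 (newline, Q, newline).  The DTD matters here because it is what tells the parser that A has element-only content, so that the whitespace between the tags can be recognized as cosmetic.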

Create a
Document object

The remaining two steps required to create a Document object are accomplished in
Listing 6.

      //Get a DocumentBuilder (parser) object
      DocumentBuilder builder =
                    factory.newDocumentBuilder();

      //Parse the XML input file to create a
      // Document object that represents the
      // input XML file.
      Document document = builder.parse(
                              new File(argv[0]));

Listing 6

The code in Listing 6 was also discussed in detail in the two previous
lessons, so I won’t discuss that code further here.

Process the
Document object

Code that is new to this lesson begins in Listing 7.  The code in
Listing 7 instantiates a new object of the program class and invokes
the processNode method on that
object, passing the Document
object’s reference as a parameter.

      //Instantiate an object of this class
      DomTree01 thisObj = new DomTree01();

      thisObj.processNode(document);

    }catch(Exception e){
      e.printStackTrace(System.err);
    }//end catch

  }// end main()

Listing 7

Listing 7 also contains a simple exception handler, which signals the
end of the main method.

The processNode
method

The processNode method, which
begins in Listing 8, is used to recursively process the DOM tree,
identifying and displaying the tree structure along the way.

  private void processNode(Node node){
    if (node == null){
      System.err.println(
                  "Nothing to do, node is null");
      return;
    }//end if

Listing 8

Recall from an earlier lesson that the Document
interface extends the Node
interface, which provides a multiplicity of
methods that can be used to navigate and manipulate the DOM tree. 
Therefore, a Document object
can be treated as
type Node.  The required
type for the incoming
parameter to the processNode
method is type Node.

The code in Listing 8 simply checks to confirm that the incoming
reference
does not have a value of null.  If it does, the code in Listing 8
prints an error message and
returns.

Perform the
recursive processing on the incoming node

The code in Listing 9 shows the beginning of what happens if the
incoming parameter is not null.

    indent++;

    String nodeName = node.getNodeName();
    int type = node.getNodeType();

Listing 9

As you will see later, the processNode
method will continue calling itself recursively until all of the nodes
in the DOM tree have been examined.  Information about the tree
structure will be extracted and displayed as each node is
examined.  When all of
the nodes in the DOM tree have been examined, the program will
terminate.

Indentation

Recall the instance variable named indent
that was declared and initialized in Listing 4.  Each time control
enters the processNode method (with a non-null Node parameter), the
value of that instance variable is incremented.  Each time control
exits the method (except for the case of a null Node parameter),
the value of that instance variable is decremented.  Therefore, at
any point in time, the value of indent indicates the current depth
(in the DOM tree) of the node that is being examined.

Get node name
and type

The variable named indent is
incremented in Listing 9.  Following this, two methods are called
on the incoming Node parameter
to get and save the name and the type of the node currently being
examined.

Some types of nodes have generic names, such as #text.  Other types of nodes
have actual names, which match element names in the XML document.

The doIndent
method

At this point, I am going to skip ahead and show you a very simple
method named doIndent, (which actually appears near the end of
the program code in Listing 20).
 
The code for this method is shown in Listing 10.

  private void doIndent(){
    for(int cnt = 0; cnt < indent; cnt++){
      //Two spaces per indentation level
      System.out.print("  ");
    }//end for loop
  }//end doIndent

Listing 10

The purpose of this method is to move the cursor to the right on the
screen to accomplish indentation in the display.  Each time the method
is called, it moves the cursor to the right by a number of spaces equal
to twice the value of the variable named indent.
This produces two spaces for each level of indentation.
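
As a variation, the same indentation logic can be written as a method that returns the indentation string instead of printing it, which makes the behavior easy to check.  This is only a sketch.  The class name IndentSketch and the method name indentFor are hypothetical and do not appear in DomTree01.

```java
// Hypothetical class name; not part of the lesson's programs.
class IndentSketch {

  // Build a string containing two spaces for each indentation level.
  // A level of zero or less yields an empty string.
  static String indentFor(int level) {
    StringBuilder sb = new StringBuilder();
    for (int cnt = 0; cnt < level; cnt++) {
      sb.append("  ");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Show a node name at depths 0 through 2.
    for (int depth = 0; depth < 3; depth++) {
      System.out.println(indentFor(depth) + "node at depth " + depth);
    }
  }
}
```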

Display the
name of the node

Returning to the discussion of the processNode
method, Listing 11 invokes the doIndent
method to produce the required indentation, and then displays the name
of the
current node, followed by a space.  Note that the
cursor remains immediately to the right of the space and does not
advance to the
next line at this time.

    doIndent();

    System.out.print(nodeName + " ");

Listing 11

Display the
type of the node on the same line

Recall that the invocation of the getNodeType
method in Listing 9 returned a value of type int.  The Node interface defines about a dozen
symbolic constants that correlate the type values to names such as CDATA_SECTION_NODE.

A switch
statement

Listing 12 shows the beginning of a switch
statement that uses the type value from Listing 9, along with the
constants from the Node
interface to display the alphanumeric node type to the right of the
node name that was displayed by the code in Listing 11.

    switch(type){
      case Node.CDATA_SECTION_NODE:{
        System.out.println("CDATA_SECTION_NODE");
        break;
      }//end case Node.CDATA_SECTION_NODE

Listing 12

When the alphanumeric node type is displayed, the cursor moves down to
the left-hand side of the next line.

For example, the code in Listings 11 and 12 would produce output
similar to that shown in Figure 6 (the
indentation may be different for different XML documents).

 

        #cdata-section CDATA_SECTION_NODE
Figure 6

The remainder
of the switch statement

Listing 13 shows the remainder of the switch
statement.  There is nothing special about the code in Listing
13.  As each node is examined, the code in Listing 11 performs the
proper indentation and displays the name of the node.  Then one of
the cases in the switch
statement is invoked to display the alphanumeric node type to the
right of the node name and to advance the display cursor to the next
line.

      case Node.COMMENT_NODE:{
        System.out.println("COMMENT_NODE");
        break;
      }//end case

      case Node.DOCUMENT_FRAGMENT_NODE:{
        System.out.println(
                      "DOCUMENT_FRAGMENT_NODE");
        break;
      }//end case

      case Node.DOCUMENT_NODE:{
        System.out.println("DOCUMENT_NODE");
        break;
      }//end case Node.DOCUMENT_NODE

      case Node.DOCUMENT_TYPE_NODE:{
        System.out.println("DOCUMENT_TYPE_NODE");
        break;
      }//end case

      case Node.ELEMENT_NODE:{
        System.out.println("ELEMENT_NODE");
        break;
      }//end case Node.ELEMENT_NODE

      case Node.ENTITY_NODE:{
        System.out.println("ENTITY_NODE");
        break;
      }//end case

      case Node.ENTITY_REFERENCE_NODE:{
        System.out.println(
                       "ENTITY_REFERENCE_NODE");
        break;
      }//end case Node.ENTITY_REFERENCE_NODE

      case Node.NOTATION_NODE:{
        System.out.println("NOTATION_NODE");
        break;
      }//end case

      case Node.PROCESSING_INSTRUCTION_NODE:{
        System.out.println(
                 "PROCESSING_INSTRUCTION_NODE");
        break;
      }//end case

      //Handle text nodes
      case Node.TEXT_NODE:{
        System.out.println("TEXT_NODE");
        break;
      }//end case Node.TEXT_NODE

      default:{
        System.out.println("Unknown Node Type");
      }//end default case
    }//end switch

Listing 13

Get and process
children of the current node

Following the switch
statement, the code in Listing 14 invokes the getChildNodes method on the current
node to get a list of the nodes that are children of the current
node.  That list is returned as an object of type NodeList.  The NodeList object’s
reference is stored in the reference variable named children.

    NodeList children = node.getChildNodes();

Listing 14

A NodeList object provides an
ordered collection of nodes, and provides two methods for accessing the
items in the list:

  • A method named getLength
    returns the number of
    nodes in the list.
  • A method named item
    takes a parameter of type int,
    and uses that parameter to
    return the Node object’s
    reference that is stored at that index.
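
The two NodeList methods described above are typically used together in a simple counting loop.  The following is a minimal sketch with a hypothetical class name, NodeListSketch, and a hypothetical inline XML string.  It lists the names of the children of the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class NodeListSketch {

  // Return the names of the root element's children, separated by spaces.
  static String childNames(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      NodeList children = doc.getDocumentElement().getChildNodes();

      StringBuilder sb = new StringBuilder();
      // getLength gives the loop bound; item(i) fetches each node.
      for (int i = 0; i < children.getLength(); i++) {
        if (i > 0) {
          sb.append(' ');
        }
        sb.append(children.item(i).getNodeName());
      }
      return sb.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(childNames("<A><Q>x</Q><B/><B/></A>"));
  }
}
```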

Make recursive
call to processNode method on each child node

Provided that the NodeList
reference in the variable named children
is not null, the code in Listing 15 uses a for loop to process each node whose
reference is stored in the list.

    if (children != null){
      int len = children.getLength();

      for (int i = 0; i < len; i++){

        //Recursion !!!
        processNode(children.item(i));

      }//end for loop
    }//end if children

Listing 15

This is where the recursive processing occurs.  The statement that
follows the Recursion comment in Listing 15 invokes the processNode
method once for each
item in the list, passing the item as a parameter to the processNode method.

This causes the program to recursively examine every node in the DOM
tree, (except for attribute nodes)
extracting and
displaying information about each node as it is examined.  This
includes nodes in the prolog of the XML document as well as nodes in
the body of the XML document.

(The
issue of attribute nodes will be addressed in the next sample program.)
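
To pull the pieces together, here is a compact, self-contained sketch of the same recursive idea.  It is not the DomTree01 program.  The class name TreeDumpSketch and its method names are hypothetical, the XML comes from an inline string rather than a file, and the depth is passed as a parameter instead of being kept in an instance variable, which is a design variation rather than the approach taken in this lesson.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class TreeDumpSketch {

  // Map a node type value to the name of the matching Node constant.
  static String typeName(short type) {
    switch (type) {
      case Node.ELEMENT_NODE: return "ELEMENT_NODE";
      case Node.ATTRIBUTE_NODE: return "ATTRIBUTE_NODE";
      case Node.TEXT_NODE: return "TEXT_NODE";
      case Node.CDATA_SECTION_NODE: return "CDATA_SECTION_NODE";
      case Node.ENTITY_REFERENCE_NODE: return "ENTITY_REFERENCE_NODE";
      case Node.ENTITY_NODE: return "ENTITY_NODE";
      case Node.PROCESSING_INSTRUCTION_NODE:
        return "PROCESSING_INSTRUCTION_NODE";
      case Node.COMMENT_NODE: return "COMMENT_NODE";
      case Node.DOCUMENT_NODE: return "DOCUMENT_NODE";
      case Node.DOCUMENT_TYPE_NODE: return "DOCUMENT_TYPE_NODE";
      case Node.DOCUMENT_FRAGMENT_NODE: return "DOCUMENT_FRAGMENT_NODE";
      case Node.NOTATION_NODE: return "NOTATION_NODE";
      default: return "Unknown Node Type";
    }
  }

  // Append one line for this node, then recurse into its children.
  static void walk(Node node, int depth, StringBuilder out) {
    if (node == null) {
      return;
    }
    for (int i = 0; i < depth; i++) {
      out.append("  "); // two spaces per level
    }
    out.append(node.getNodeName()).append(' ')
       .append(typeName(node.getNodeType())).append('\n');

    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      walk(children.item(i), depth + 1, out); // recursion
    }
  }

  // Parse an XML string and return the indented tree as text.
  static String dumpXml(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      StringBuilder out = new StringBuilder();
      walk(doc, 0, out);
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.print(dumpXml("<A><Q>x</Q><B><C>y</C></B></A>"));
  }
}
```

Passing the depth as a parameter means there is nothing to decrement on the way out of the method, at the cost of an extra argument on every call.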

Decrease
indentation level and terminate processNode method

When all of the recursive invocations of the processNode
method on the child nodes have returned, the current invocation of the
processNode method decreases the value of the variable named indent and
then terminates, as shown in Listing 16.

    indent--;

  }//end processNode(Node)
  //-------------------------------------------//

  // doIndent method goes here

}//end class DomTree01

Listing 16

Listing 16 signals the end of the processNode
method, and the beginning of the method named doIndent, which was discussed
earlier.

(Because
it was discussed earlier, the code for the doIndent method was not included in
Listing 16.)

The end of the doIndent method
signals the end of the class and the end of the program named DomTree01.

The program
named DomTree02

The program named DomTree02 is
an upgraded version of DomTree01.
This program displays the actual text belonging to text nodes instead
of simply showing the type of node as TEXT_NODE.

DomTree02 also displays
attribute names and values, which is not the
case with DomTree01.
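
Before looking at DomTree02 itself, it may help to see the general DOM calls on which such features can be built.  The following sketch is not the DomTree02 code.  The class name AttrTextSketch and the sample element are assumptions, and the output format merely imitates Figure 7.  It uses the getAttributes method of the Node interface, which returns a NamedNodeMap for element nodes, and the getNodeValue method, which returns the text for a Text node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the lesson's programs.
class AttrTextSketch {

  // List the attributes of the root element, followed by the value
  // of its first child if that child is a Text node.
  static String describe(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      Element element = doc.getDocumentElement();
      StringBuilder sb = new StringBuilder();

      // Attributes are not children; they come from getAttributes.
      NamedNodeMap attrs = element.getAttributes();
      for (int i = 0; i < attrs.getLength(); i++) {
        Node attr = attrs.item(i);
        sb.append("Attribute: ").append(attr.getNodeName())
          .append('=').append(attr.getNodeValue()).append('\n');
      }

      // For a Text node, getNodeValue returns the actual text.
      Node first = element.getFirstChild();
      if (first != null && first.getNodeType() == Node.TEXT_NODE) {
        sb.append("#text ").append(first.getNodeValue()).append('\n');
      }
      return sb.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.print(describe(
        "<title another_test_attr=\"More Test\">Dogs</title>"));
  }
}
```

The order in which a NamedNodeMap returns multiple attributes is not guaranteed to match the order in the XML source, which is why the example uses an element with a single attribute.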

Sample output
from DomTree02

Figure 7 shows the output produced by using DomTree02 to process the XML file
named DomTree02.xml.  (You can view a listing of this XML file
in Listing 23
near the end of the lesson.)

I colored the attributes red and the text green in Figure 7
to make them easy to spot.

(Note
that some of the text consists of invisible newline characters, which
are impossible to color green.)

 

#document DOCUMENT_NODE
  top DOCUMENT_TYPE_NODE
  #comment COMMENT_NODE
  xml-stylesheet PROCESSING_INSTRUCTION_NODE
  top ELEMENT_NODE
    #comment COMMENT_NODE
    theData ELEMENT_NODE
        Attribute: type=Programming
        Attribute: test_attr=Testing
      title ELEMENT_NODE
        #text Java
      author ELEMENT_NODE
        #text R.Baldwin
      price ELEMENT_NODE
        #text $9.95
        uvw ELEMENT_NODE
          #text abc-
          xyz ELEMENT_NODE
            #text def-
          #text

        uvw ELEMENT_NODE
          #text ghi-
        #text each

    theData ELEMENT_NODE
        Attribute: type=Pets
      title ELEMENT_NODE
          Attribute: another_test_attr=More Test
        #text Dogs
      author ELEMENT_NODE
        #text R.U.Barking
      price ELEMENT_NODE
        #text $19.95
Figure 7

Displaying text versus displaying node type

Sometimes it can be very useful to display the actual text values in
the tree.  At other times, the text is so voluminous that it
completely overwhelms the display, making it difficult to pick out the
structure of the tree.  In those cases, the version that simply
identifies the node as a text node is probably advantageous.

(A
good learning exercise would be for you to write a single program where
the user specifies whether the tree is to simply identify text nodes,
or is to display the actual
text value of each text node, by entering a parameter on the command
line.)
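One possible starting point for that exercise is sketched below. It is an assumption-laden sketch, not part of DomTree01 or DomTree02: the class name DomTreeSwitchable, the optional second command-line argument "text", and the helper method renderXml are all hypothetical. To stay short, it names only three node types and passes the depth as a parameter instead of keeping it in an instance variable.

```java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeSwitchable {

  // Appends one line for this node, then recurses into its children.
  // showText selects between the two display modes.
  static void render(Node node, int depth, boolean showText,
                     StringBuilder out) {
    for (int cnt = 0; cnt < depth; cnt++) {
      out.append("  ");
    }
    out.append(node.getNodeName()).append(' ');

    switch (node.getNodeType()) {
      case Node.DOCUMENT_NODE:
        out.append("DOCUMENT_NODE");
        break;
      case Node.ELEMENT_NODE:
        out.append("ELEMENT_NODE");
        break;
      case Node.TEXT_NODE:
        out.append(showText ? node.getNodeValue() : "TEXT_NODE");
        break;
      default:
        // Remaining node types omitted to keep the sketch short.
        out.append("type code ").append(node.getNodeType());
    }
    out.append('\n');

    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      render(children.item(i), depth + 1, showText, out);
    }
  }

  // Hypothetical helper for trying the sketch on an in-memory string.
  static String renderXml(String xml, boolean showText) {
    try {
      DocumentBuilder builder =
          DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc =
          builder.parse(new InputSource(new StringReader(xml)));
      StringBuilder out = new StringBuilder();
      render(doc, 0, showText, out);
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
  }

  public static void main(String[] args) throws Exception {
    if (args.length < 1) {
      System.err.println("usage: java DomTreeSwitchable fileIn [text]");
      return;
    }
    // Assumed convention: a second argument of "text" shows values.
    boolean showText = args.length > 1 && args[1].equals("text");
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new File(args[0]));
    StringBuilder out = new StringBuilder();
    render(doc, 0, showText, out);
    System.out.print(out);
  }
}
```

Completing the exercise would still require adding the remaining node types and the attribute display.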

Will discuss in fragments

I will discuss the program named DomTree02
in fragments.  A complete listing of the program is shown in
Listing 22 near the end of the lesson.

Large portions of this program are identical or very similar to the
code in the program named DomTree01,
discussed earlier in this lesson.  Therefore, I won’t repeat the
discussion of that code.  Rather, I will restrict this discussion
to those parts of this program that differ from the earlier program.

The main method in this
program is essentially the same as the main
method in the previous program, so I will skip a discussion of the main method.

As before, the method named processNode
is used to recursively process the entire DOM tree, extracting and
displaying information about the nodes in the tree along the way. 
The method named processNode
in this program is the same as in the previous program except for the
code in a couple of cases in the switch
statement.

New features in DomTree02

Previously, the cases in the switch
statement were used to display the alphanumeric type of each node in
the tree.  In this program, the case for TEXT_NODE is modified to cause
the actual text value of the text node to be displayed instead of the
type of the node.

In addition, the case for ELEMENT_NODE
in this program is modified to get and display the names and
values of all attributes associated with elements.

The ELEMENT_NODE case

I will begin by explaining the changes to the ELEMENT_NODE case in the switch statement.  Listing 17
shows the beginning of the ELEMENT_NODE
case.

  private void processNode(Node node){

    //Code deleted for brevity

    switch(type){

      //Case code deleted for brevity

      case Node.ELEMENT_NODE:{
        System.out.println("ELEMENT_NODE");
        //Get and display attributes if any
        NamedNodeMap attrList =
                            node.getAttributes();
        int attrLen = 0;
        if(attrList != null){
          attrLen = attrList.getLength();
        }//end if

Listing 17

A map of attribute nodes

There is a very important conceptual issue to deal with here. 
Specifically, attribute nodes are not child nodes of element
nodes.  All child nodes of an element node can be
obtained in a collection of type NodeList
by invoking the method named getChildNodes
on the element node, but that collection does not contain the
attribute nodes belonging to the element.

In order to get the attributes belonging to an element node, it is
necessary to invoke the method named getAttributes
on the element node.  This method returns a reference to an object
of type
NamedNodeMap containing
unordered references to the
attribute nodes.
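The distinction can be checked with a few lines of code. The following sketch is not part of DomTree02. The class name AttrVersusChild, its helper methods, and the sample XML string are hypothetical and exist only to compare the two counts for a single element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttrVersusChild {

  // Parses an in-memory XML string and returns its root element.
  static Element parseRoot(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
  }

  // Number of child nodes of the root element.
  static int childCount(String xml) {
    return parseRoot(xml).getChildNodes().getLength();
  }

  // Number of attribute nodes of the root element.
  static int attributeCount(String xml) {
    return parseRoot(xml).getAttributes().getLength();
  }

  public static void main(String[] args) {
    // Hypothetical sample: two attributes, one child element.
    String xml = "<theData type=\"Pets\" test_attr=\"Testing\">"
        + "<title>Dogs</title></theData>";
    System.out.println("child nodes: " + childCount(xml)); // 1
    System.out.println("attributes:  " + attributeCount(xml)); // 2
  }
}
```

The only child node is the title element. The two attributes show up only in the map returned by getAttributes.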

NamedNodeMap versus NodeList

A NamedNodeMap is a different
type of data structure than a NodeList.

A NodeList is an ordered
collection of references to Node
objects.  Items in the list are accessed on the basis of an
ordinal index.  They cannot be accessed on the basis of the name
of a node.  The order of the items in the list matches the
ordering of the corresponding nodes in the DOM tree.
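As a small illustration of that ordering, the sketch below collects the names of the child nodes of a root element by ordinal index. The class name NodeListOrder, the method childNames, and the sample XML string are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListOrder {

  // Returns the names of the root element's children, in list order.
  static List<String> childNames(String xml) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
    NodeList children = doc.getDocumentElement().getChildNodes();
    List<String> names = new ArrayList<String>();
    // item(i) is the only accessor; there is no lookup by name.
    for (int i = 0; i < children.getLength(); i++) {
      names.add(children.item(i).getNodeName());
    }
    return names;
  }

  public static void main(String[] args) {
    // Prints [title, author, price], matching document order.
    System.out.println(
        childNames("<theData><title/><author/><price/></theData>"));
  }
}
```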

NamedNodeMap

Sun describes objects of type NamedNodeMap
as

“collections
of nodes that can be accessed by name”.

Sun goes on to tell us,

“NamedNodeMaps
are not maintained in any particular order. Objects contained in an
object implementing NamedNodeMap may also be accessed by an ordinal
index, but this is simply to allow convenient enumeration of the
contents of a NamedNodeMap, and does not imply that the DOM specifies
an order to these Nodes.”

Therefore, references to objects representing attribute nodes can be
accessed
in a NamedNodeMap object
either on the basis of the attribute name, or on the basis of an
ordinal index.  I will use an ordinal index in this program, as
shown in Listing 18.
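Both styles of access can be sketched as follows. This is an illustration rather than code from DomTree02: the class name NamedNodeMapAccess and its methods are hypothetical. Because the DOM does not specify the order of the items, the sketch sorts the names before returning them, which is one way to get a repeatable display.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamedNodeMapAccess {

  // Returns the attribute map of the root element of an XML string.
  static NamedNodeMap rootAttributes(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getAttributes();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
  }

  // Access by name: getNamedItem returns null if there is no match.
  static String valueByName(String xml, String attrName) {
    Node attr = rootAttributes(xml).getNamedItem(attrName);
    return attr == null ? null : attr.getNodeValue();
  }

  // Access by ordinal index, then sorted because the index order
  // carries no meaning.
  static List<String> sortedNames(String xml) {
    NamedNodeMap attrs = rootAttributes(xml);
    List<String> names = new ArrayList<String>();
    for (int i = 0; i < attrs.getLength(); i++) {
      names.add(attrs.item(i).getNodeName());
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    String xml = "<theData type=\"Pets\" test_attr=\"Testing\"/>";
    System.out.println(valueByName(xml, "type")); // Pets
    System.out.println(sortedNames(xml)); // [test_attr, type]
  }
}
```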

Get and display name and value of attribute nodes

Listing 18 shows the remaining code for the ELEMENT_NODE case in the switch statement.

        for(int i = 0; i < attrLen; i++){
          Node attrNode = attrList.item(i);
          doIndent();
          System.out.println("    Attribute: "
                     + attrNode.getNodeName()
                     + "="
                     + attrNode.getNodeValue());
        }//end for loop
        break;
      }//end case Node.ELEMENT_NODE

Listing 18

Listing 18 uses a for loop to
iterate on the NamedNodeMap
object, getting a reference to each attribute node in sequence, and
using that
reference to get and display the name and value of the attribute
properly indented.

(Note
in Listing 18 and Figure 7 that the attribute information was indented
an additional four spaces relative to the element node to visually
separate the attribute information from the child nodes of the
element.  This was done solely for cosmetic purposes.)

The modified TEXT_NODE case

Listing 19 shows the modified TEXT_NODE
case in the switch statement,
and the end of the switch
statement.

      //Case code deleted for brevity

      case Node.TEXT_NODE:{
        System.out.println(node.getNodeValue());
        break;
      }//end case Node.TEXT_NODE

      //default case code deleted for brevity

    }//end switch

Listing 19

The version of this case in the program named DomTree01 simply displayed the text TEXT_NODE each time the case was
executed.

This version invokes the method named getNodeValue
on the node and displays the String
that is returned by that method.  This code produced the green
text values for the text nodes represented in Figure 7.

(Recall
that the word #text in Figure
7 was displayed by code that invoked the getNodeName method prior to control
entering the switch
statement.  This is the same in both programs.  Only the red
and green text in Figure 7 is new.)
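The difference between the name and the value of a text node can be seen in isolation in the following sketch, which is not part of DomTree02. The class name TextNodeValues and the method describeChildren are hypothetical. The sample string is a simplified version of the price element in DomTree02.xml, where text and element children are mixed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextNodeValues {

  // Describes each child of the root element on one line.
  static List<String> describeChildren(String xml) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
    NodeList children = doc.getDocumentElement().getChildNodes();
    List<String> lines = new ArrayList<String>();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      if (child.getNodeType() == Node.TEXT_NODE) {
        // The name is the generic "#text"; the value is the text.
        lines.add(child.getNodeName() + " " + child.getNodeValue());
      } else if (child.getNodeType() == Node.ELEMENT_NODE) {
        // getNodeValue returns null for an element node.
        lines.add(child.getNodeName() + " ELEMENT_NODE");
      } else {
        lines.add(child.getNodeName() + " (other node type)");
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    // Prints [#text $9.95, uvw ELEMENT_NODE, #text each]
    System.out.println(
        describeChildren("<price>$9.95<uvw>abc-</uvw>each</price>"));
  }
}
```

The text on either side of the uvw element becomes a separate text node, which is why Figure 7 shows more than one #text entry under the first price element.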

Beyond this point, both programs are the same

The remainder of this program is the same as DomTree01, and therefore, doesn’t
merit further discussion.

Run the Programs

I encourage you to copy the code and XML data from Listings 20
through 23 into your text editor.  Compile and execute the
programs.  Experiment with them, making changes, and observing the
results
of your
changes.

Summary

In this lesson, I showed you how to write a program to display a
DOM tree on the screen in a format that is much easier to interpret
than raw XML code.  I explained two different versions of the
program.  One version simply identifies text nodes in the
output tree.  The other version displays the value of text nodes
in
the output tree.  Also, the first version ignores attributes in
the
output tree, while the second version includes attributes in the
output tree.

What’s Next?

In the next lesson, I will explain default XSLT behavior
and show you how to write Java code that mimics that behavior. 
The resulting Java code will serve as a skeleton for more advanced
transformation programs.

Complete Program Listings


Complete listings of the Java classes and the XML documents discussed in
this lesson are shown in Listings 20 through 23 below.

/*File DomTree01.java
Copyright 2003 R.G.Baldwin

This program produces a text-based output on
the screen that represents the tree structure
of an XML file.

Not tested for DOCUMENT_FRAGMENT_NODE.
Not tested for ENTITY_NODE.
Not tested for ENTITY_REFERENCE_NODE.
Not tested for NOTATION_NODE.

The following output was produced by testing
this program with the XML file named
DomTree01.xml

#document DOCUMENT_NODE
A DOCUMENT_TYPE_NODE
#comment COMMENT_NODE
xml-stylesheet PROCESSING_INSTRUCTION_NODE
processor PROCESSING_INSTRUCTION_NODE
A ELEMENT_NODE
Q ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
#cdata-section CDATA_SECTION_NODE
R ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
S ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
S ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
T ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
C ELEMENT_NODE
#text TEXT_NODE
D ELEMENT_NODE
E ELEMENT_NODE
#text TEXT_NODE
G ELEMENT_NODE
#text TEXT_NODE
#text TEXT_NODE
F ELEMENT_NODE
#text TEXT_NODE
E ELEMENT_NODE
#text TEXT_NODE
F ELEMENT_NODE
#text TEXT_NODE
E ELEMENT_NODE
#text TEXT_NODE
F ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
R ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE
B ELEMENT_NODE
R ELEMENT_NODE
#text TEXT_NODE
C ELEMENT_NODE
#text TEXT_NODE

Note. No effort was made to provide meaningful
information about errors and exceptions.

Tested using SDK 1.4.2 under WinXP.
************************************************/

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import java.io.File;
import org.w3c.dom.*;

public class DomTree01{

int indent = -1;//Indentation level for display

public static void main(String argv[]){
if (argv.length != 2){
System.err.println(
"usage: java DomTree01 fileIn validate");
System.err.println(
"validate = n for no, y for yes");
System.exit(0);
}//end if

try{
//Get a factory object for DocumentBuilder
// objects
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();

//Configure the factory object setting
// validating true or false based on user
// input.
if(argv[1].equals("n")){
factory.setValidating(false);
}else{
factory.setValidating(true);
}//end if/else

factory.setNamespaceAware(false);
//Set to ignore cosmetic white space
// between elements.
factory.
setIgnoringElementContentWhitespace(true);

//Get a DocumentBuilder (parser) object
DocumentBuilder builder =
factory.newDocumentBuilder();

//Parse the XML input file to create a
// Document object that represents the
// input XML file.
Document document = builder.parse(
new File(argv[0]));

//Instantiate an object of this class
DomTree01 thisObj = new DomTree01();

thisObj.processNode(document);

}catch(Exception e){
e.printStackTrace(System.err);
}//end catch

}// end main()
//-------------------------------------------//

//This method is used recursively to identify
// and display node structure.
private void processNode(Node node){
if (node == null){
System.err.println(
"Nothing to do, node is null");
return;
}//end if

//Increase indentation level for display.
indent++;
//Get name and type of node. Some types of
// nodes have generic names, such as #text.
// Other nodes have actual names.
String nodeName = node.getNodeName();
int type = node.getNodeType();

//Indent to the correct level and display the
// name of the node.
doIndent();
System.out.print(nodeName + " ");

//Use the type to display the type of the
// node on the same line following the name
// of the node.
switch(type){
case Node.CDATA_SECTION_NODE:{
System.out.println("CDATA_SECTION_NODE");
break;
}//end case Node.CDATA_SECTION_NODE

case Node.COMMENT_NODE:{
System.out.println("COMMENT_NODE");
break;
}//end case

case Node.DOCUMENT_FRAGMENT_NODE:{
System.out.println(
"DOCUMENT_FRAGMENT_NODE");
break;
}//end case

case Node.DOCUMENT_NODE:{
System.out.println("DOCUMENT_NODE");
break;
}//end case Node.DOCUMENT_NODE

case Node.DOCUMENT_TYPE_NODE:{
System.out.println("DOCUMENT_TYPE_NODE");
break;
}//end case

case Node.ELEMENT_NODE:{
System.out.println("ELEMENT_NODE");
break;
}//end case Node.ELEMENT_NODE

case Node.ENTITY_NODE:{
System.out.println("ENTITY_NODE");
break;
}//end case

case Node.ENTITY_REFERENCE_NODE:{
System.out.println(
"ENTITY_REFERENCE_NODE");
break;
}//end case Node.ENTITY_REFERENCE_NODE

case Node.NOTATION_NODE:{
System.out.println("NOTATION_NODE");
break;
}//end case

case Node.PROCESSING_INSTRUCTION_NODE:{
System.out.println(
"PROCESSING_INSTRUCTION_NODE");
break;
}//end case

//Handle text nodes
case Node.TEXT_NODE:{
System.out.println("TEXT_NODE");
break;
}//end case Node.TEXT_NODE

default:{
System.out.println("Unknown Node Type");
}//end default case
}//end switch

//This method is first called on the node
// that represents the root node of the DOM
// tree. The following code recursively
// processes the entire tree.
NodeList children = node.getChildNodes();
if (children != null){
int len = children.getLength();
//Iterate on NodeList of child nodes.
for (int i = 0; i < len; i++){
//Process each of the nested elements
// recursively.
processNode(children.item(i));
}//end for loop
}//end if children

//Decrease indentation level for display
indent--;
}//end processNode(Node)
//-------------------------------------------//

//This method displays two spaces for each
// level of indentation.
private void doIndent(){
for(int cnt = 0; cnt < indent; cnt++){
System.out.print("  ");
}//end for loop
}//end doIndent

}//end class DomTree01

Listing 20

<?xml version="1.0"?>

<!DOCTYPE A [
<!ELEMENT A (Q,B,B)*>
<!ELEMENT B (B | C | D | R | S | T)*>
<!ELEMENT C (#PCDATA)>
<!ELEMENT D (E | F)*>
<!ELEMENT E (#PCDATA | G)*>
<!ELEMENT F (#PCDATA)>
<!ELEMENT G (#PCDATA)>
<!ELEMENT Q (#PCDATA)>
<!ELEMENT R (#PCDATA)>
<!ELEMENT S (#PCDATA)>
<!ELEMENT T (#PCDATA)>
]>


<!-- File DomTree01.xml
Copyright 2003 R. G. Baldwin
Used to test the program named
DomTree01.java
-->

<?xml-stylesheet type="text/xsl"
href="Dom03.xsl"?>
<?processor ProcInstr="Dummy"?>

<A>
<Q>A Big Header</Q>

<B>
<C>Level 0. This is the beginning of a B.
This text is in the Introduction section.
<![CDATA[This is CDATA < > " &]]></C>

<R>A Mid Header</R>

<C>Text block 1.</C>

<S>A Small Header</S>
<B>
<C>Text block 2.</C>
</B>

<S>Another Small Header</S>
<B>
<C>Text block 3.</C>

<T>A Smallest Header</T>
<B>
<C>Text block 4.</C>

<D>
<E>First list item in E
<G>Nested G text element</G>
</E>
<F>First list item in F</F>
<E>Second list item in E</E>
<F>Second list item in F</F>
<E>Third list item in E</E>
<F>Third list item in F</F>
</D>

<C>Text block 5.</C>
</B>
<C>Text block 6.</C>
</B>

<R>Another Mid Header</R>
<C>Text block 7.</C>
</B>

<B>
<R>Another Mid Header in Another B</R>
<C>Text block 8.</C>
</B>
</A>

Listing 21

/*File DomTree02.java
Copyright 2003 R.G.Baldwin

This program is an upgraded version of DomTree01.
This version shows the actual text belonging to
text nodes instead of simply showing the type
of node.

This version also displays attribute names and
values, which was not the case with DomTree01.

This program produces a text-based output on
the screen that represents the tree structure
of an XML file.

Not tested for DOCUMENT_FRAGMENT_NODE.
Not tested for ENTITY_NODE.
Not tested for ENTITY_REFERENCE_NODE.
Not tested for NOTATION_NODE.

The following output was produced by testing
this program with the XML file named
DomTree02.xml

#document DOCUMENT_NODE
top DOCUMENT_TYPE_NODE
#comment COMMENT_NODE
xml-stylesheet PROCESSING_INSTRUCTION_NODE
top ELEMENT_NODE
#comment COMMENT_NODE
theData ELEMENT_NODE
Attribute: type=Programming
Attribute: test_attr=Testing
title ELEMENT_NODE
#text Java
author ELEMENT_NODE
#text R.Baldwin
price ELEMENT_NODE
#text $9.95
uvw ELEMENT_NODE
#text abc-
xyz ELEMENT_NODE
#text def-
#text

uvw ELEMENT_NODE
#text ghi-
#text each

theData ELEMENT_NODE
Attribute: type=Pets
title ELEMENT_NODE
Attribute: another_test_attr=More Test
#text Dogs
author ELEMENT_NODE
#text R.U.Barking
price ELEMENT_NODE
#text $19.95

Note. No effort was made to provide meaningful
information about errors and exceptions.

Tested using SDK 1.4.2 under WinXP.
************************************************/

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import java.io.File;
import org.w3c.dom.*;

public class DomTree02{

int indent = -1;//Indentation level for display

public static void main(String argv[]){
if (argv.length != 2){
System.err.println(
"usage: java DomTree02 fileIn validate");
System.err.println(
"validate = n for no, y for yes");
System.exit(0);
}//end if

try{
//Get a factory object for DocumentBuilder
// objects
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();

//Configure the factory object setting
// validating true or false based on user
// input.
if(argv[1].equals("n")){
factory.setValidating(false);
}else{
factory.setValidating(true);
}//end if/else

factory.setNamespaceAware(false);
//Set to ignore cosmetic white space
// between elements.
factory.
setIgnoringElementContentWhitespace(true);

//Get a DocumentBuilder (parser) object
DocumentBuilder builder =
factory.newDocumentBuilder();

//Parse the XML input file to create a
// Document object that represents the
// input XML file.
Document document = builder.parse(
new File(argv[0]));

//Instantiate an object of this class
DomTree02 thisObj = new DomTree02();

thisObj.processNode(document);

}catch(Exception e){
e.printStackTrace(System.err);
}//end catch

}// end main()
//-------------------------------------------//

//This method is used recursively to identify
// and display node structure.
private void processNode(Node node){
if (node == null){
System.err.println(
"Nothing to do, node is null");
return;
}//end if

//Increase indentation level for display.
indent++;
//Get name and type of node. Some types of
// nodes have generic names, such as #text.
// Other nodes have actual names.
String nodeName = node.getNodeName();
int type = node.getNodeType();

//Indent to the correct level and display the
// name of the node.
doIndent();
System.out.print(nodeName + " ");

//Use the type to display the type of the
// node on the same line following the name
// of the node.
switch(type){
case Node.CDATA_SECTION_NODE:{
System.out.println("CDATA_SECTION_NODE");
break;
}//end case Node.CDATA_SECTION_NODE

case Node.COMMENT_NODE:{
System.out.println("COMMENT_NODE");
break;
}//end case

case Node.DOCUMENT_FRAGMENT_NODE:{
System.out.println(
"DOCUMENT_FRAGMENT_NODE");
break;
}//end case

case Node.DOCUMENT_NODE:{
System.out.println("DOCUMENT_NODE");
break;
}//end case Node.DOCUMENT_NODE

case Node.DOCUMENT_TYPE_NODE:{
System.out.println("DOCUMENT_TYPE_NODE");
break;
}//end case

case Node.ELEMENT_NODE:{
System.out.println("ELEMENT_NODE");
//Get and display attributes if any
NamedNodeMap attrList =
node.getAttributes();
int attrLen = 0;
if(attrList != null){
attrLen = attrList.getLength();
}//end if

for(int i = 0; i < attrLen; i++){
Node attrNode = attrList.item(i);
doIndent();
System.out.println("    Attribute: "
+ attrNode.getNodeName()
+ "="
+ attrNode.getNodeValue());
}//end for loop
break;
}//end case Node.ELEMENT_NODE

case Node.ENTITY_NODE:{
System.out.println("ENTITY_NODE");
break;
}//end case

case Node.ENTITY_REFERENCE_NODE:{
System.out.println(
"ENTITY_REFERENCE_NODE");
break;
}//end case Node.ENTITY_REFERENCE_NODE

case Node.NOTATION_NODE:{
System.out.println("NOTATION_NODE");
break;
}//end case

case Node.PROCESSING_INSTRUCTION_NODE:{
System.out.println(
"PROCESSING_INSTRUCTION_NODE");
break;
}//end case

//Handle text nodes
case Node.TEXT_NODE:{
System.out.println(node.getNodeValue());
break;
}//end case Node.TEXT_NODE

default:{
System.out.println("Unknown Node Type");
}//end default case
}//end switch

//This method is first called on the node
// that represents the root node of the DOM
// tree. The following code recursively
// processes the entire tree.
NodeList children = node.getChildNodes();
if (children != null){
int len = children.getLength();
//Iterate on NodeList of child nodes.
for (int i = 0; i < len; i++){
//Process each of the nested elements
// recursively.
processNode(children.item(i));
}//end for loop
}//end if children

//Decrease indentation level for display
indent--;
}//end processNode(Node)
//-------------------------------------------//

//This method displays two spaces for each
// level of indentation.
private void doIndent(){
for(int cnt = 0; cnt < indent; cnt++){
System.out.print("  ");
}//end for loop
}//end doIndent

}//end class DomTree02

Listing 22

<?xml version="1.0"?>

<!DOCTYPE top [
<!ELEMENT top (theData)*>
<!ELEMENT theData (title,author,price)*>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA | uvw)*>
<!ELEMENT uvw (#PCDATA | xyz)*>
<!ELEMENT xyz ANY>
<!ATTLIST theData type CDATA #REQUIRED>
<!ATTLIST theData test_attr CDATA #IMPLIED>
<!ATTLIST title another_test_attr CDATA #IMPLIED>
]>

<!-- File DomTree02.xml
Copyright 2003 R. G. Baldwin
Each data element contains a type
attribute that is displayed in the
output tree display.-->

<?xml-stylesheet type="text/xsl"
href="Dom07.xsl"?>

<top>

<!--The following element is designed to
be complex, involving combination text
and element child nodes. However, it is
not intended to make any sense in a real-world
sense-->
<theData type="Programming" test_attr="Testing">
<title>Java</title>
<author>R.Baldwin</author>
<price>$9.95<uvw>abc-<xyz>def-</xyz>
</uvw><uvw>ghi-</uvw>each
</price>
</theData>

<theData type="Pets">
<title another_test_attr="More Test">Dogs</title>
<author>R.U.Barking</author>
<price>$19.95</price>
</theData>

</top>

Listing 23


Copyright 2003, Richard G. Baldwin.  Reproduction in whole or
in
part in any form or medium without express written permission from
Richard
Baldwin is prohibited.

About the author

Richard Baldwin
is a college professor (at Austin Community College in Austin, TX) and
private consultant whose primary focus is a combination of Java, C#,
and XML. In addition to the many platform and/or language independent
benefits of Java and C# applications, he believes that a combination of
Java, C#, and XML will become the primary driving force in the delivery
of structured information on the Web.

Richard has participated in numerous consulting projects, and he
frequently provides onsite training at the high-tech companies located
in and around Austin, Texas.  He is the author of Baldwin’s
Programming Tutorials, which
has gained a worldwide following among experienced and aspiring
programmers. He has also published articles in JavaPro magazine.

Richard holds an MSEE degree from Southern Methodist University
and has many years of experience in the application of computer
technology to real-world problems.

Baldwin@DickBaldwin.com

-end-
 
