Java JAXP, Implementing Default XSLT Behavior in Java

Java Programming Notes # 2206


Preface

In this lesson, I will explain default XSLT behavior,
and will show you how to write Java code that mimics that
behavior. 
The resulting Java code serves as a skeleton for more advanced
transformation programs.

What is JAXP?

JAXP is an API designed to help you write programs for creating and
processing XML documents.  JAXP is very important for many reasons,
not the least of which is the fact that it is a critical part of
Sun’s Java Web Services Developer Pack (JWSDP).  As you are probably
already aware, many expect web services to be a very important aspect
of the Internet of the future.

This lesson is one in a series designed to help you
understand how to use JAXP and how to use the JWSDP.

The first lesson in this series was entitled Java API for XML
Processing (JAXP), Getting Started.  The previous lesson was entitled
Java JAXP, Exposing a DOM Tree.

What is XML?

XML is an acronym for the eXtensible Markup Language. 
I will assume that you already
understand
XML, and will teach you how to use JAXP to write programs for
creating and processing XML documents.

What are XSL and XSLT?

I provided quite a lot of background material on XSL and XSLT
in a previous lesson in this series.  A brief review of
that
material follows.

XSL is an acronym for Extensible Stylesheet Language.  XSLT is an
acronym for XSL Transformations.  The W3C is a governing body that has
published many important documents on XML, XSL, and XSLT.


The uses of XSLT include the following:

  • Transforming non-XML documents into XML documents.
  • Transforming XML documents into other XML documents.
  • Transforming XML documents into non-XML documents.

Viewing tip

You may find it useful to open another copy of this lesson in a
separate browser window.  That will make it easier for you to
scroll back and forth among the different listings and figures while
you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive
collection of online Java and XML tutorials.  You will find those
lessons published at Gamelan.com.  As of the date of this writing,
Gamelan doesn’t maintain a consolidated index of my tutorial lessons,
and sometimes they are difficult to locate there.  You will find a
consolidated index at www.DickBaldwin.com.

Preview

A tree structure in memory

A DOM parser can be used to create a tree structure in memory that
represents an XML document.  In Java, that tree structure is
encapsulated in an object of the interface type Document.  Document
and its superinterface Node declare numerous methods that can be used
to navigate, extract information from, modify, and otherwise
manipulate the DOM tree.  As is always the case, classes that
implement Document must provide concrete definitions of those methods.

Many operations are possible

Given an object of type Document, there are many
methods that
can be invoked on the object to perform a variety of operations. 
For example, it is possible to write Java code to move nodes from one
location in the tree
to another location in the tree, thus rearranging the structure of the
XML document represented by the Document object.  It is
possible to delete nodes, and to insert new nodes.  It is
also possible
to
recursively traverse the tree, extracting information about the nodes
along
the way.
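
The operations described above can be illustrated with a minimal
sketch.  This is not the program developed later in this lesson.  The
class name DomWalkSketch, its method names, and the sample XML string
are all assumptions made for illustration.  The sketch parses a small
document into a Document object, recursively traverses the tree to
collect element names, and then moves one node to a new location.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the lesson's Dom11 program.
class DomWalkSketch {

  // Parse an XML string into a DOM Document.  Checked exceptions are
  // wrapped to keep the sketch short.
  static Document parse(String xml) {
    try {
      DocumentBuilder builder =
          DocumentBuilderFactory.newInstance().newDocumentBuilder();
      return builder.parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Recursively visit every node, recording the names of the element
  // nodes in document order.
  static void collectElementNames(Node node, StringBuilder out) {
    if (node.getNodeType() == Node.ELEMENT_NODE) {
      out.append(node.getNodeName()).append(' ');
    }
    for (Node child = node.getFirstChild(); child != null;
         child = child.getNextSibling()) {
      collectElementNames(child, out);
    }
  }

  static String elementNames(Document doc) {
    StringBuilder out = new StringBuilder();
    collectElementNames(doc, out);
    return out.toString().trim();
  }

  // Move the last child of the root element so that it becomes the
  // first child.  insertBefore removes the node from its old position
  // before inserting it at the new one.
  static void moveLastChildToFront(Document doc) {
    Element root = doc.getDocumentElement();
    Node last = root.getLastChild();
    root.insertBefore(last, root.getFirstChild());
  }

  public static void main(String[] args) {
    Document doc =
        parse("<top><title>Java</title><price>$9.95</price></top>");
    System.out.println(elementNames(doc)); // top title price
    moveLastChildToFront(doc);
    System.out.println(elementNames(doc)); // top price title
  }
}
```

Only standard JAXP and DOM calls are used here, so the same approach
should carry over to larger documents.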

Two ways to transform an XML document

There are at least two ways to transform the contents of an XML
document into another document:

  • By writing Java code to manipulate the DOM and perform the
    transformation.
  • By using XSLT to perform the transformation.

It should be possible to write Java code to perform any
transformation that can be performed using XSLT, but the reverse may
not be true.

General description of XSLT

Here is a partial quotation from XML In A Nutshell (which I highly
recommend), by Elliotte Rusty Harold and W. Scott Means.  This
quotation provides a general description of XSLT:

“…
(XSLT) is a functional programming language used to specify how an
input XML document is converted into another text document — possibly,
though not necessarily, another XML document.  An XSLT processor
reads both an input XML document and an XSLT stylesheet (which is
itself an XML document because XSLT is an XML application) and produces
a result tree as output. … Documents can be transformed using a
standalone program or as part of a larger program that communicates
with the XSLT processor through its API.”

In this lesson, I will provide and explain a larger program that communicates
with the XSLT processor through its API.  The program will also
execute Java code that mimics the transformation provided by XSLT.
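
As a rough preview of what communicating with the XSLT processor
through its API can look like, here is a minimal sketch based on the
standard javax.xml.transform classes.  It is not the Dom11 program.
The class name XsltApiSketch, the transform helper, and the tiny
stylesheet and document held in strings are assumptions made for
illustration (the program in this lesson works with files).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of the lesson's Dom11 program.
class XsltApiSketch {

  // Illustrative stylesheet: writes literal text followed by the
  // string value of the title element.
  static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/'>"
    +   "Title: <xsl:value-of select='top/title'/>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  static final String XML = "<top><title>Java</title></top>";

  // Apply the stylesheet text to the XML text and return the result.
  static String transform(String xml, String xsl) {
    try {
      // The factory compiles the stylesheet into a Transformer.
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xsl)));
      StringWriter out = new StringWriter();
      // The Transformer reads the input document and writes the result.
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(transform(XML, XSL)); // Title: Java
  }
}
```

To work with files instead of strings, the StreamSource and
StreamResult objects can be constructed from File objects.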

Advantages and disadvantages

As is usually the case, there are advantages and disadvantages to
both approaches to
document transformation.

As an example of an advantage provided by XSLT, if it is possible to
perform the required
transformation using XSLT, that approach will probably require you to
write less code than would be required to perform the same
transformation by writing a Java program from scratch.

A large library of functions

With the XSLT transformation process, you write a stylesheet, which
is somewhat analogous to a driver program in a more conventional
programming environment.  That driver program accesses and
uses functions from a large library of pre-written functions to perform
a series of well-defined operations on the DOM tree to produce
the desired transformation.

(XSLT authors don’t call them functions.  Rather, they are called
XSLT elements.  According to XML In A Nutshell, there are 37 standard
XSLT elements.  Also according to XML In A Nutshell, most XSLT
processors also provide various nonstandard extension elements and
allow you to write your own extension elements in languages such as
Java.)

Is there a similar library of Java methods?

I am not aware of a library of Java methods in the public domain
that emulates the 37 standard XSLT Elements.  However, I freely
admit that such a library may exist and I may simply not know
about it.

Therefore, to write a Java program that emulates an XSLT
transformation, you need to either

  • Create your own library of Java
    methods and use that library with your Java code to perform the
    transformation, or
  • Start from scratch each time and write a
    custom program to perform the transformation.

A skeleton library of Java methods

This lesson, and several lessons to follow this one, will show you
how to write the skeleton of a Java library containing methods that
emulate the most common XSLT elements.  Once you have the library,
writing Java code to transform XML documents consists simply of writing
a short driver program to access and use those methods.  Thus,
given the proper library of methods, it is no more difficult to write a
driver Java program to perform the transformation than it is to write
an
XSLT stylesheet.

Library is not my primary purpose

However, my primary purpose in these lessons is not to provide such
a library, but rather is to help you understand how to use a DOM
tree to create, modify, and manipulate XML documents.  By
comparing Java code that manipulates a DOM tree with similar XSLT
operations, you will have an opportunity to learn a little about XSLT
in the process of learning how to manipulate a DOM tree using Java code.

If you already know a lot about XSLT, you may learn a little
about Java by studying these lessons.  If you already know a lot
about Java, you may learn a little about XSLT.  If you don’t
already know either
Java or XSLT, you may learn a little about both.

Debugging XSLT can be difficult

While writing a Java program to emulate an XSLT Transformation may
require you to write more code than writing a stylesheet, in my
opinion, it is much easier to debug a Java program that fails to
deliver the desired result than it is to debug an XSL stylesheet that
fails to deliver.  This is an advantage of
using Java code over XSLT.  I find XSLT to be extremely difficult
to debug (but I haven’t attempted to
use a fancy XSLT debugger, several of which are freely available on the
Internet).

Java provides more detailed control

Another difference in using Java code relative to XSLT has to do with
the detailed control of the transformation process.  I believe (but
cannot prove) that it is possible to write Java programs that perform
transformations that are not possible using standard XSLT elements.
If I am correct, this may be another advantage of writing Java code
over using XSLT.

Some Details Regarding XSLT

The following is a partial quotation from XML In A Nutshell.  (Note that I will be referring to
this excellent book several more times in this lesson.  For
brevity, I will refer to it simply as Nutshell.)

“XSLT
is an XML application for specifying rules by which one XML document is
transformed into another XML document.  An XSLT document — that
is, an XSLT stylesheet — contains template rules.  Each template
rule has a pattern and a template.  An XSLT processor compares the
elements and other nodes in an input XML document to the template-rule
patterns in a stylesheet.  When one matches, it writes the
template from that rule into the output tree.  … XSLT uses the
XPath syntax to identify matching nodes.”

My explanation

Let’s see if I can explain this process in my own words. 
Assume that an XML document has been parsed so as to produce a DOM tree
in memory that represents the XML document.  (The creation of a DOM tree in this manner
was discussed in several previous lessons
in this series.)

An XSLT processor starts examining the DOM tree at its root
node.  It
obtains instructions from the XSLT stylesheet telling it how to
navigate the
tree, and what to do with each node that it encounters along the way.

Finding matching template rules

As each node is encountered, the processor searches the stylesheet
looking for instructions on how to treat that node.  (These
instructions will be referred to later as template rules.)  If the
processor finds instructions that match the node type, it performs the
operations indicated by the instructions.  If it doesn’t find matching
instructions, it executes built-in instructions appropriate to that
node.

(An XML
document can contain seven different types of nodes.  The
different types will be identified later.  This lesson will
describe and explain the built-in
instructions for six of those seven node types.  Java code will be
developed that emulates the built-in
instructions for each of the six types of nodes.)

Establishing the context node

An XPath expression can be
used to point to a specific node and to
establish that node as the context node.  Once a context node is
established, there are at least two XSLT elements that can be used to
manage the traversal among children of that node:

  • xsl:apply-templates
      select, optional attribute
      mode, optional attribute
      xsl:sort, optional XSLT element
  • xsl:for-each
      select, required attribute
      xsl:sort, optional XSLT element

The xsl:apply-templates XSLT element

The first of these, xsl:apply-templates,
examines and processes all child nodes of the context node that match
an optional select
attribute.

(When
combined with a default template rule to be discussed later, this often
results in a recursive examination and processing of all descendant
nodes of the context node.)

According to Nutshell,

“The
xsl:apply-templates instruction tells the processor to search for and
apply the highest-priority template in the stylesheet that matches each
node identified by the select attribute.”

Applying template rules

As each node is examined, the processor searches the stylesheet to
determine if the XSLT programmer has provided a template rule that
matches the node and defines how that
node should be treated.  If a matching template rule is found, the
node is treated in the manner prescribed by the template rule.

Literal text in the XSLT stylesheet elements

You can think of the XSLT process as operating on an input DOM tree
to produce an output DOM tree.  If the template rule being applied
contains literal text, that literal text is used to
create a text node in the output tree.

(I will
explain how this feature is used to transform XML documents into XHTML
documents in a future lesson.)

If no match is found

If a matching template rule is not found, the processor executes a
built-in template rule appropriate to the type of node involved. 
Built-in template rules are provided by the XSLT processor to handle
the seven different types of nodes in an XML document:

  1. root node
  2. element node
  3. attribute node
  4. text node
  5. comment node
  6. processing instruction node
  7. namespace node

This lesson will explain the built-in rules that handle the first
six types of nodes in the above list.

Recursion is common

As mentioned earlier, the combination of xsl:apply-templates and a built-in
template rule often produces recursion.  Assuming that there is
nothing in a matching template rule that stops
the recursion operation, recursion continues until all descendant nodes
of the original context node have been examined and processed.

The mode attribute

The mode attribute of xsl:apply-templates makes it
possible to cause different template
rules to match nodes of the same type at different places in the DOM
tree.
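
To make the effect of the mode attribute concrete, here is a minimal
sketch driven from Java.  The class name ModeSketch, the mode name
brackets, and the stylesheet and document strings are assumptions made
for illustration.  The stylesheet contains two template rules that
match the same title element.  Which rule is applied depends on
whether xsl:apply-templates specifies the mode.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of the lesson's Dom11 program.
class ModeSketch {

  // Two template rules match title.  The first applies only when
  // templates are applied in the (hypothetical) mode "brackets".
  static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/'>"
    +   "<xsl:apply-templates select='top/title' mode='brackets'/>"
    +   "<xsl:text>|</xsl:text>"
    +   "<xsl:apply-templates select='top/title'/>"
    + "</xsl:template>"
    + "<xsl:template match='title' mode='brackets'>"
    +   "[<xsl:value-of select='.'/>]"
    + "</xsl:template>"
    + "<xsl:template match='title'>"
    +   "(<xsl:value-of select='.'/>)"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  static final String XML = "<top><title>Java</title></top>";

  // Apply the stylesheet text to the XML text and return the result.
  static String transform(String xml, String xsl) {
    try {
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xsl)));
      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // The same title node is processed twice, by different rules.
    System.out.println(transform(XML, XSL)); // [Java]|(Java)
  }
}
```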

Sorting

The optional xsl:sort
element makes it possible to modify the
order in which the nodes are examined.

Iterative operation

The second XSLT element in the above list, xsl:for-each, executes an iterative
examination and processing of all child nodes of the context node that
match the required select attribute. 
According to Nutshell,

“The
xsl:for-each instruction iterates over the nodes identified by its
select attribute and applies templates to each one.”

In other words, the processor visits each node selected by the select
attribute, making that node the current node in turn.  For each such
node, the processor instantiates the template that forms the body of
the xsl:for-each element.  Unlike xsl:apply-templates, xsl:for-each
does not search the stylesheet for a matching template rule.  Template
rules, including the built-in rules, come into play only if the body
itself contains an xsl:apply-templates element.

As before, the optional xsl:sort
element makes it possible to modify the
order in which the nodes are examined.  I will explain this in
detail in a future lesson.
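
Pending that lesson, here is a minimal sketch of xsl:for-each combined
with xsl:sort, driven from Java.  The class name ForEachSortSketch and
the stylesheet and document strings are assumptions made for
illustration.  The element names theData and title are borrowed from
the XML file used in this lesson.  The body of the xsl:for-each
element is instantiated once for each selected node, in sorted order.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of the lesson's Dom11 program.
class ForEachSortSketch {

  // Iterate over the theData elements, sorted by title, and write
  // each title followed by a semicolon.
  static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/'>"
    +   "<xsl:for-each select='top/theData'>"
    +     "<xsl:sort select='title'/>"
    +     "<xsl:value-of select='title'/>"
    +     "<xsl:text>;</xsl:text>"
    +   "</xsl:for-each>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  // The titles are deliberately out of alphabetical order.
  static final String XML =
      "<top>"
    + "<theData><title>XML</title></theData>"
    + "<theData><title>Java</title></theData>"
    + "<theData><title>Python</title></theData>"
    + "</top>";

  // Apply the stylesheet text to the XML text and return the result.
  static String transform(String xml, String xsl) {
    try {
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xsl)));
      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(transform(XML, XSL)); // Java;Python;XML;
  }
}
```

Removing the xsl:sort element from the sketch should cause the titles
to appear in document order instead.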

Combined operations

Frequently a stylesheet will combine recursive and iterative
operations to produce more complex operations.

Enough talk, let’s see some code

I will begin by discussing the XML file named Dom11.xml (shown in Listing 29) along with
the XSL
stylesheet file named Dom11.xsl
(shown in Listing 30). 
These two listings are provided near the end of the lesson.

After explaining the transformation produced by applying this
stylesheet to this XML document, I will explain the transformation
produced by applying the empty stylesheet named Dom11a.xsl (shown in
Listing 33) to a nearly identical XML document.

(The
two XML files are the
same except that they refer to different stylesheet files, one of which
is empty.)
 

A Java program named Dom11

Following that, I will explain a Java program (shown in Listing 31) that
emulates the behavior of the stylesheets shown in Listings 30 and 33
when
applied to the XML file shown in Listing 29.

I will explain that the Java program shown in Listing 31 emulates
the behavior of the empty stylesheet shown in Listing 33, and will
explain why that is true.

Discussion and Sample Code


The XML file named Dom11.xml

The XML file shown in Listing 29 is relatively straightforward.  A
tree view of that XML file is shown in Figure 1.

(The
program named DomTree02, discussed in an earlier lesson, was used to
produce this tree view of the XML file.

The values of the text nodes in Figure 1 were manually highlighted in
red to make it easier to refer to those values later in this lesson.)

#document DOCUMENT_NODE
  top DOCUMENT_TYPE_NODE
  #comment COMMENT_NODE
  #comment COMMENT_NODE
  dummy-target PROCESSING_INSTRUCTION_NODE
  xml-stylesheet PROCESSING_INSTRUCTION_NODE
  false-target PROCESSING_INSTRUCTION_NODE
  top ELEMENT_NODE
    theData ELEMENT_NODE
      Attribute: attr=Dummy Attr Value
      title ELEMENT_NODE
        #text Java

        subtitle ELEMENT_NODE
          Attribute: position=Low
          #text really
        #text rules

      author ELEMENT_NODE
        #text R.Baldwin
      price ELEMENT_NODE
        #text $9.95
    theData ELEMENT_NODE
      title ELEMENT_NODE
        #text Python
      author ELEMENT_NODE
        #text R.Baldwin
      price ELEMENT_NODE
        #text $15.42
    theData ELEMENT_NODE
      title ELEMENT_NODE
        #text XML
      author ELEMENT_NODE
        #text R.Baldwin
      price ELEMENT_NODE
        #text $19.60
Figure 1

A database of books

As you may already have figured out,
this XML document represents a small database containing information
about books.  However, the structure and content of this XML file
was not intended to have any purpose other than to illustrate the
default
behavior of the built-in XSLT template rules.

The XSL stylesheet file named Dom11.xsl

The stylesheet file shown in Listing 30 is very important relative to
the purpose
of this lesson, so I will discuss it in detail.

Recall that an XSL stylesheet is itself an XML file, and can therefore
be represented as a tree.  I will begin by showing you an
abbreviated version of a tree view of the stylesheet, as shown in
Figure 2.

#document DOCUMENT_NODE
  xsl:stylesheet ELEMENT_NODE
    Attribute: xmlns_xsl=http:
      //www.w3.org/1999/XSL/Transform
    Attribute: version=1.0

    xsl:template ELEMENT_NODE
      Attribute: match=*|/
      xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=text()|@*
      xsl:value-of ELEMENT_NODE
        Attribute: select=.
Figure 2

Why abbreviated?

I refer to this as an abbreviated version because I manually deleted
comment nodes and extraneous text nodes in order to emphasize the
important elements in the document.

(Note that I also manually entered a line
break in the third line to force the material to fit into this narrow
publication format.)


The root element

The root node of all XML documents is the document node.  However,
in addition to the root node, there is also a root element.

As you can see from Figure 2, the root element in the XSL document is
of type xsl:stylesheet.  The root element has two attributes, each of
which is standard for XSL stylesheets.

The first attribute points to the XSLT namespace URI, which you can
read about in the W3C Recommendation.  The second attribute provides
the XSLT version.  According to Nutshell, the version must be 1.0.
Also, according to Nutshell,

“The namespace URI must be exactly
correct.  If even so much as a single character is wrong, the
stylesheet processor will output the stylesheet itself instead of
either the input document or the transformed input document.”


Unable to verify this behavior

I have been unable to verify this behavior experimentally.  When I
delete a character from the XSL namespace URI and then load the XML
file into IE 6.0, there is simply no output.  The browser screen
remains blank.  When I modify the XSL namespace URI and attempt to
use JAXP to apply the stylesheet to the XML file, the system throws
several errors and the program aborts.  Neither approach seems to “output the stylesheet itself” as
indicated by Nutshell.

Children of the root element node

As you can see from Figure 2, the root element node has two child
nodes, both of which are of type xsl:template.  Here is what XSLT and
XPath On The Edge by Jeni Tennison has to say about xsl:template:

“This element defines a template, which
can be applied (if a match pattern is specified) or called (if a name
is specified).”


As you can see from the attribute values in Figure 2, a match pattern
is provided for both of the xsl:template
nodes in Figure 2.

(The child nodes shown in Figure 2 are
also called template rules.)


Back to basics

Getting back to XSLT basics, whenever the XSLT processor encounters a
node while traversing the DOM tree, it will examine all of the template
rules in the stylesheet searching for one whose match pattern matches
the node.  If it finds a matching template rule, it will execute
the instructions contained as elements within the template rule. 
If it doesn’t find a match, it will execute a built-in template rule
that matches the node.

An explicit representation of a built-in template rule

Consider the first child node of the xsl:stylesheet root element in
Figure 2.  Listing 1 shows this template rule in XSL syntax (extracted
from Listing 30).

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

Listing 1

The template rule shown in Listing 1
is an explicit representation of one of the built-in template
rules. 

Matching the root node and element nodes

Consider the match pattern for this template rule (the text value of
the attribute named match).  According to Nutshell,

“The
asterisk * is an XPath wild-card pattern that matches all element
nodes, regardless of what name they have or what namespace they’re in.

The forward slash / is an XPath pattern that matches the root
node. 

This is the first node the processor selects for processing, and
therefore this is the first template rule the processor executes
(unless a nondefault template rule also matches the root node).

… the vertical bar combines these two expressions so that it matches
both the root node and element nodes.” 


The <xsl:apply-templates/> element

Now consider the <xsl:apply-templates/> element
that makes up the body of this template rule.  This element causes
the processor to process all child nodes of each matching node,
examining nodes, searching for matching template rules, and executing
the elements embedded in matching template rules along the way. 
Again, according to Nutshell, still speaking of the template rule in
Listing 1,

“In
isolation, this rule means that the XSLT processor eventually finds and
applies templates to all nodes except attribute and namespace nodes
because every nonattribute, non-namespace node is either the root node,
a child of the root node, or a child of an element.  Only
attribute and namespace nodes are not children of their parents.”

An explicit representation of a built-in template rule

Once again, the template rule shown in Listing 1 is an explicit
representation of one of the built-in template rules.  If I were
to remove this template rule from the stylesheet, and then apply the
stylesheet to the XML document, this template rule would still be
applied where appropriate by the XSLT processor, because it is built
into the processor.

Handling text nodes by default

Listing 2 shows the template rule, in XSL syntax, that corresponds to
the second child node of the root element node in Figure 2.  Once
again, this is a template rule with a match pattern.  This template
rule is also an explicit representation of one of the built-in rules,
which copies the value of text and attribute nodes into the output
document.

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

Listing 2

The match pattern

The text() in the value of
the attribute named match is an XPath pattern matching all text
nodes.  The @* is
an XPath pattern matching all attribute nodes.  The vertical bar
combines the two patterns.  Hence, the template rule matches all
text and all attribute nodes.

The xsl:value-of element

Once a match is made, the behavior of the rule is governed by the
single element that is embedded in the rule.  The xsl:value-of
element, with a select value of “.”, returns the text value of the
context or current node.  (This is similar to the use of a single
period to represent the current directory in some file management
systems such as MS-DOS.)

Text value to the output

Therefore, whenever the XSLT processor applies this template rule to a
text or attribute node, the text value of that node is sent to the
output document (a text node is
created in the output tree).

If the node is a text node, the value is simply the text in the node.

If the node is an attribute node, the value is the attribute value, but
not the attribute name.

The output

Now it’s time for the big question.  What does the output look
like when the stylesheet shown in Listing 30 is used to transform the
XML document shown in Listing 29?  The result of such a
transformation is shown in Figure 3.

(Note
that I manually inserted a line break near the end of the fourth line
in Figure 3 to force the material to fit in this narrow publication
format.  This caused the text $19.60 to move down to the fifth
line.)

<?xml version="1.0" encoding="UTF-8"?>
Java
reallyrules
R.Baldwin$9.95PythonR.Baldwin$15.42XMLR.Baldwin
$19.60
Figure 3

The XML declaration

The first line in Figure 3 is an XML declaration that was placed there
by the XSLT processor independent of the content of the XML file.

The text in the output

If you compare the text in Figure 3 with the material highlighted in
red in Figure 1, you will see that the output produced by this
stylesheet containing only explicit representations of default
template rules is the concatenation of text values for all the element
nodes in the XML document.

Line breaks in the output

The two line breaks following the words Java and rules in Figure 3 correspond to the
line breaks in the text portion of the title
element shown in Listing 3.  (This
element was extracted from the original XML file in Listing 29.)

<title>Java
<subtitle position="Low">really</subtitle>rules
</title>

Listing 3

Because these two line breaks occur within the text portion of the
element, they also appear in the output in Figure 3.  In other
words, the line breaks are considered by the XSLT processor to be a
legitimate part of the text content of the element.

The remaining line breaks in the XML file shown in Listing 29 occur
between XML tags.  Therefore, they are not considered to be a part
of the text content of any element and they do not appear in Figure 3.

No attribute values in the output

You may have noticed that even though a couple of the elements in the
XML file have attributes (see Figure 1), and one of the template rules
matches attribute nodes, the attribute values do not appear in the
output shown in Figure 3.  Nutshell explains this in the following
way:

“… although this template rule says what
should happen when an attribute node is reached, by default the XSLT
processor never reaches attribute nodes and, therefore, never outputs
the value of an attribute.”


Nutshell goes on to tell us,

“Attribute values are output according to
this template only if a specific rule applies templates to them, and
none of the default rules do this because attributes are not considered
to be children of their parents.  In other words, if element E has
an attribute A, then E is the parent of A, but A is not the child of E.”


Finally, Nutshell tells us,

“Applying templates to the children of an
element with <xsl:apply-templates/> does not apply templates to
attributes of the element.  To do that, the xsl:apply-templates
element must contain an XPath expression specifically selecting
attributes.”
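
The following minimal sketch illustrates that point from Java.  The
class name AttributeSelectSketch and the stylesheet and document
strings are assumptions made for illustration.  The attribute named
attr and its value are borrowed from Figure 1.  The first stylesheet
is empty, so attribute nodes are never reached.  The second contains a
single template rule whose xsl:apply-templates element explicitly
selects attributes as well as child nodes, so the built-in rule for
attribute nodes finally gets a chance to run.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of the lesson's Dom11 program.
class AttributeSelectSketch {

  static final String HEADER =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>";

  // No template rules: only the built-in rules apply.
  static final String EMPTY_XSL = HEADER + "</xsl:stylesheet>";

  // One rule that applies templates to attributes and child nodes.
  static final String ATTR_XSL = HEADER
    + "<xsl:template match='*'>"
    +   "<xsl:apply-templates select='@*|node()'/>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  static final String XML =
      "<top><theData attr='Dummy Attr Value'>"
    + "<title>Java</title></theData></top>";

  // Apply the stylesheet text to the XML text and return the result.
  static String transform(String xml, String xsl) {
    try {
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xsl)));
      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Attribute value is absent with the empty stylesheet.
    System.out.println(transform(XML, EMPTY_XSL)); // Java
    // Attribute value appears once attributes are selected.
    System.out.println(transform(XML, ATTR_XSL));  // Dummy Attr ValueJava
  }
}
```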


Applying an empty stylesheet

Now consider the stylesheet shown in Listing 33, as shown in
abbreviated tree format in Figure 4.

(As was the case with Figure 2, comment
nodes and extraneous text nodes were manually removed from Figure 4.)

#document DOCUMENT_NODE
  xsl:stylesheet ELEMENT_NODE
    Attribute: xmlns_xsl=http:
      //www.w3.org/1999/XSL/Transform
    Attribute: version=1.0
Figure 4

Unlike Figure 2, the stylesheet
represented by Figure 4 doesn’t contain any template rules.  In
fact, except for the root (document)
node and the xsl:stylesheet
root
element node, the stylesheet is completely empty.

Produces exactly the same output

However, applying the empty stylesheet to the XML file discussed
earlier produces exactly the same result as was produced by applying
the stylesheet shown in Listing 30 and Figure 2 to that XML file.

This is because the two template rules shown in Listing 30 and Figure 2
replicate the behavior of two of the built-in template rules. 
Therefore, removing them from the stylesheet has no impact on the
result produced by applying the stylesheet to the XML file.  If
they are needed, they are available as built-in rules of the XSLT
processor.
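
This equivalence can be checked with a minimal sketch that applies
both kinds of stylesheet to the same input and compares the results.
The class name EmptyStylesheetSketch and the small input document are
assumptions made for illustration.  The two template rules are the
ones shown in Listing 1 and Listing 2.  The input contains no
whitespace between tags, so that only the text content of the elements
is at issue.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of the lesson's Dom11 program.
class EmptyStylesheetSketch {

  static final String HEADER =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

  // The two explicit representations of built-in template rules.
  static final String EXPLICIT_XSL = HEADER
    + "<xsl:template match='*|/'>"
    +   "<xsl:apply-templates/>"
    + "</xsl:template>"
    + "<xsl:template match='text()|@*'>"
    +   "<xsl:value-of select='.'/>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  // No template rules at all.
  static final String EMPTY_XSL = HEADER + "</xsl:stylesheet>";

  static final String XML =
      "<top><title>Java</title>"
    + "<price attr='x'>$9.95</price></top>";

  // Apply the stylesheet text to the XML text and return the result.
  static String transform(String xml, String xsl) {
    try {
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xsl)));
      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String explicit = transform(XML, EXPLICIT_XSL);
    String empty = transform(XML, EMPTY_XSL);
    System.out.println(explicit);
    System.out.println(empty);
    System.out.println("Same output: " + explicit.equals(empty));
  }
}
```

In both cases the output should consist of an XML declaration followed
by the concatenated text Java$9.95, with no markup and no attribute
value.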

Transformation behavior of an empty stylesheet

Because the two template rules in the previous stylesheet replicate
the behavior of two of the built-in template rules, removing those
template rules from the stylesheet to produce an empty stylesheet had
absolutely no impact on the transformation result.  The transformation
result produced by the previous stylesheet was identical to that
produced by the empty stylesheet.

According to Nutshell, when you transform an XML document using an
empty stylesheet,

“… the output consists of a text
declaration plus the text of the input document. … Markup from the
input document has been stripped.  The net effect of applying an
empty stylesheet … to any input XML document is to reproduce the
content but not the markup of the input document.  To change that,
we’ll need to add template rules to the stylesheet telling the XSLT
processor how to handle the specific elements in the input
document.  In the absence of explicit template rules, an XSLT
processor falls back on built-in rules …”


Combined output

Whenever the XSLT processor
encounters a node for which you haven’t defined a matching template
rule, the default template rule for that type of node will be
applied. 
Therefore, the total output is often a combination of output produced
by template rules that you provide and built-in template rules.

Therefore, if you are going to create a stylesheet containing template
rules of your own design, it is very important for you to understand
the default behavior provided by the built-in template rules.  The
total output produced by your stylesheet is very likely to be a
combination of the output produced by your template rules and the
output produced by the built-in template rules.

Other built-in template rules

I have explained the behavior of the built-in template rules that cover
the following four types of nodes:

  • root node
  • element node
  • attribute node
  • text node

I will explain the behavior of the
built-in template rules that cover the following two types of nodes
later in this lesson:

  • comment node
  • processing instruction node

I will also have some comments about namespace nodes later in this
lesson.

A Java program that emulates the built-in template rules


Now let’s change direction and concentrate on Java code rather than
XSLT elements.  The
following paragraphs describe a Java program named Dom11.

The primary purposes of
this lesson are to:

  • Demonstrate Java code that
    replicates the behavior of the built-in template rules for six of
    the seven possible types of nodes.
  • Provide a skeleton program
    that can be expanded later to provide more complex behavior.

This program implements six built-in template rules for an XSLT
processor.  In addition, it implements methods that emulate several
XSLT elements that are required to support the built-in rules, such as
xsl:value-of and xsl:apply-templates.

As such, the program serves as the skeleton for the definition of
custom template rules.

Behavior of the program

As written, this program extracts and concatenates all text values from
a specified XML file, and writes that text into a result file, using
two different approaches:

  • An XSLT transformation
    operating under program control.
  • Program code that emulates the
    behavior of the XSLT transformation.

In particular, this program
illustrates Java code that emulates the XSLT templates in the files
named Dom11.xsl and Dom11a.xsl.  These two XSL
files differ in terms of their dependence on the built-in templates.

As you saw in the earlier discussion, both XSL files produce the same
result when processed against the XML files named Dom11.xml and Dom11a.xml, demonstrating the
behavior of the built-in template rules.  The execution of these
built-in template rules causes the contents of every text node to be
concatenated and written into the result file.

The program code in this program emulates those built-in template rules
and produces the same results.
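
To see the built-in template rules in isolation, the following minimal sketch runs a transformation with a stylesheet that contains no template rules at all. It is not part of Dom11; the class name BuiltInRulesSketch, the method name transform, and the sample XML are assumptions made for illustration. The sketch also assumes the text output method, so that no XML declaration is written.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BuiltInRulesSketch {

  // A stylesheet with no template rules, so that only the
  // built-in template rules are applied.
  static final String EMPTY_XSL =
      "<xsl:stylesheet version=\"1.0\" xmlns:xsl="
      + "\"http://www.w3.org/1999/XSL/Transform\"/>";

  static String transform(String xml) {
    try {
      Transformer transformer =
          TransformerFactory.newInstance().newTransformer(
              new StreamSource(new StringReader(EMPTY_XSL)));
      // Assumption: text output, so no XML declaration.
      transformer.setOutputProperty(OutputKeys.METHOD, "text");
      StringWriter result = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(result));
      return result.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // The attribute and the comment contribute nothing;
    // only the two text nodes reach the output.
    String xml =
        "<a x=\"1\"><b>Hello </b><!-- note --><c>World</c></a>";
    System.out.println(transform(xml)); // prints "Hello World"
  }
}
```

The attribute value does not appear in the output because the built-in rule for elements applies templates to child nodes only, and attributes are not children of their element.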

Usage
instructions

The program requires three command line arguments in the following
order:

  • The name of the input XML file
    – must be Dom11.xml or Dom11a.xml.
  • The name of the output file to
    be produced by the XSLT transformation.
  • The name of the output file to
    be produced by the program code that emulates the XSLT transformation.

Order of execution

The program begins by executing code to transform the incoming XML file
in a way that mimics the XSLT transformation.  Along the way, it
saves the processing instructions (one
of which contains the name of the stylesheet file)
for later
use by the code that governs the XSLT transformation process.  (Otherwise,
the code that performs the XSLT transformation later would have to
search the DOM tree for the XSL stylesheet file name.)

The name of the XSL
stylesheet file is extracted from the processing instruction in the XML
file. 
Then the program
uses the XSL stylesheet to transform the XML file into a result file.

Errors,
exceptions, and testing

No effort was made to provide meaningful information about errors and
exceptions.  If an error or exception occurs, the default behavior
for that error or exception will occur.

The program was tested using SDK 1.4.2 under WinXP.

Will discuss in
fragments


I will discuss this program in fragments.  A complete listing of
the program is shown in Listing 31 near the end of the lesson.

Listing 4 shows the beginning of the class named Dom11 and the beginning of the main method.

public class Dom11{

  PrintWriter out;//output stream
  //Save processing instruction nodes here
  static Vector procInstr = new Vector();

  public static void main(String argv[]){
    if (argv.length != 3){
      System.err.println(
          "usage: java Dom11 "
          + "xmlFileIn "
          + "xformFileOut "
          + "codeFileOut");
      System.exit(0);
    }//end if

Listing 4

The code in Listing 4 declares a couple of variables, one of
which will be used later to save processing instruction nodes.

Then the code in Listing 4 provides usage instructions based on
command-line arguments.

Parse the input
XML file

The code in Listing 5 parses the input XML file, producing an object of
type Document, which is a DOM
tree in memory.

    try{
      //Get a factory object for DocumentBuilder
      // objects
      DocumentBuilderFactory factory =
          DocumentBuilderFactory.newInstance();

      //Configure the factory object. Change
      // the following parameter to false for a
      // non-validating parser.
      factory.setValidating(true);
      factory.setNamespaceAware(false);
      //The following statement causes the parser
      // to ignore cosmetic whitespace between
      // elements.
      factory.
          setIgnoringElementContentWhitespace(true);

      //Get a DocumentBuilder (parser) object
      DocumentBuilder builder =
          factory.newDocumentBuilder();

      //Parse the XML input file to create a
      // Document object that represents the
      // input XML file.
      Document document = builder.parse(
          new File(argv[0]));

Listing 5

Steps for creating a Document object

There is nothing new in the code in Listing 5.  I have discussed
the code required to create a Document
object in several previous lessons beginning with the lesson entitled Java
API for XML Processing (JAXP), Getting Started.

As you saw in those earlier lessons, creating a Document
object involves three steps:

  1. Create a DocumentBuilderFactory object
  2. Use the DocumentBuilderFactory object to create a DocumentBuilder
    object
  3. Use the DocumentBuilder object to create a Document
    object

Both the DocumentBuilderFactory class and the DocumentBuilder
class belong to the javax.xml.parsers package.  As of this
writing, this package is part of J2SE 1.4.2.
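
The three steps can be sketched in a compact, self-contained form as follows. This sketch parses from a string with a non-validating parser rather than from a file, which is a departure from Listing 5 made only to keep the example runnable on its own. The class name ParseSketch and the sample XML are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParseSketch {

  static Document parse(String xml) {
    try {
      // Step 1: create a DocumentBuilderFactory object.
      DocumentBuilderFactory factory =
          DocumentBuilderFactory.newInstance();
      // Step 2: use the factory to create a DocumentBuilder.
      DocumentBuilder builder = factory.newDocumentBuilder();
      // Step 3: use the builder to create a Document.
      return builder.parse(
          new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Document doc = parse("<book><title>Java</title></book>");
    // Prints the name of the root element: book
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```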

Transformation
through program code

The code in Listing 6 begins the process of transforming the DOM tree
into an output file through the execution of program code (as opposed to an XSLT transformation).

The code begins by instantiating a new object of the Dom11 class.

      Dom11 thisObj = new Dom11();

      thisObj.out = new PrintWriter(
          new FileOutputStream(argv[2]));

Listing 6

Get an output
stream

Then the program gets an output stream for the output produced by the
program code.  This stream points to an output file that was
specified by the third command-line parameter.

Process the DOM
tree

The code in Listing 7 invokes the processDocumentNode
method to process the DOM tree.  This method (and the methods that it calls)
begins with the Document node,
and processes all
the nodes in the DOM tree to produce the required output.

      thisObj.processDocumentNode(document);

Listing 7

Note that the code in Listing 7
passes the Document object’s
reference to the method named processDocumentNode.
This is the root node of the entire DOM tree, and can be treated as
type Node, because the Document interface extends the Node interface.

Set the main method aside

My explanation of this program will follow the execution thread through
the program.  At this point, I will set the discussion of the main method aside temporarily and
come back to it later when the  processDocumentNode
method returns control to the main
method.

The
processDocumentNode method

The entire processDocumentNode
method is shown in Listing 8.

  void processDocumentNode(Node node){
    //Write one line of text into the output.
    out.println("<?xml version=\"1.0\" "
        + "encoding=\"UTF-8\"?>");

    processNode(node);

    out.flush();
  }//end processDocumentNode

Listing 8

This method is used to produce any text required in the output at
the document level, such as the XML declaration for an XML
document.  (As you can see from
Listing 8, the code in this method writes an XML declaration into the
output.)

Invoke the
processNode method

Despite the name that I chose to give to the processDocumentNode method, it
doesn’t actually process the document node directly.  Rather, after
sending any required text to the output, it invokes the
method named processNode to
actually process the document node.

(Note
that the Document object’s
reference is passed to the method named processNode in Listing 8.)

When the DOM
tree has been processed …

When the processNode method
returns, (after the entire DOM tree
has been processed),
the processDocumentNode
method flushes the output stream and returns control to the main method. 

As you will see
later, subsequent code in the main
method invokes a method that will perform an XSLT transformation on the
XML file and write the output into a different output file.  I
will discuss that method later in this lesson.

The processNode
method

There are seven possible types of nodes in an XML document:

  1. root or document node
  2. element node
  3. attribute node
  4. text node
  5. comment node
  6. processing instruction node
  7. namespace node

The processNode method handles
the first six types and ignores namespace nodes.

(Apparently
it is not possible to handle namespace nodes in a Java program because
there is no constant in the
Node interface that can be used to identify
namespace nodes.  This will become clearer later as we examine the
code in the processNode
method.)

Get and save
the node type

The beginning of the processNode
method is shown in Listing 9.  Note that the method receives an
incoming parameter, which is a reference to an object of type Node.  This can be any of
the node types that can occur in a DOM tree.

If the parameter doesn’t point to an actual object, the method writes a
message to the standard error stream and returns, as opposed to throwing a NullPointerException.

  void processNode(Node node){

    try{
      if (node == null){
        System.err.println(
            "Nothing to do, node is null");
        return;
      }//end if

      //Process the incoming node based on its
      // type.
      int type = node.getNodeType();


Listing 9

The final statement in Listing 9 invokes the getNodeType method to get and save
the type of the node whose reference was received as an incoming
parameter.

Process the node

Each time the processNode
method is invoked, it receives a Node
object’s reference as an incoming parameter.  The code in Listing
9 determines the type of the incoming node.  Listing 10 shows the
beginning of a switch
statement that is used to initiate the processing of each incoming node
based on its type.

      switch (type){
        case Node.DOCUMENT_NODE:{
          if(false){
            //cannot be reached in this example
          }else{//invoke default behavior
            defElOrRtNodeTemp(node);
          }//end else
          break;
        }//end case DOCUMENT_NODE

Listing 10

The switch statement has six
cases to handle six types of nodes, plus a default case to ignore
namespace nodes.

The
DOCUMENT_NODE case

The code in Listing 10 will be executed whenever the incoming method
parameter points to a document node.

(Note
that this will happen only once during the processing of a DOM
tree.  The first node processed will always be the document node,
and there is only one document node in a DOM tree.)

DOCUMENT_NODE is a constant (public static final variable) that
is defined in the Node
interface.  (The interface
provides similar constants for all node types other than namespace
nodes.)
  These constants can be used to distinguish between
different node types.
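
As a small illustration of how these constants distinguish node types, the sketch below reports the type of each child of the document node. The class name NodeTypeSketch, the method names, and the sample XML are assumptions; only the Node constants and DOM methods come from the standard API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeTypeSketch {

  // Map a node to a readable name by switching on the
  // constants defined in the Node interface.
  static String typeName(Node node) {
    switch (node.getNodeType()) {
      case Node.DOCUMENT_NODE: return "document";
      case Node.ELEMENT_NODE: return "element";
      case Node.ATTRIBUTE_NODE: return "attribute";
      case Node.TEXT_NODE: return "text";
      case Node.COMMENT_NODE: return "comment";
      case Node.PROCESSING_INSTRUCTION_NODE:
        return "processing instruction";
      default: return "other";
    }
  }

  // List the type names of the children of the document node.
  static List<String> childTypes(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      List<String> names = new ArrayList<>();
      NodeList children = doc.getChildNodes();
      for (int i = 0; i < children.getLength(); i++) {
        names.add(typeName(children.item(i)));
      }
      return names;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Prints [processing instruction, comment, element]
    System.out.println(childTypes(
        "<?target data?><!--note--><root>text</root>"));
  }
}
```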

Will invoke
default behavior in this case

Note that the code in the case in Listing 10 is an if/else construct.  If the
conditional clause in the if statement
evaluates to true (which is not
possible in this case),
the code in the if statement will be executed. 
(This is where I will place the code
for custom template rules in subsequent lessons.)

If the conditional clause in the if statement does not evaluate to
true, the code in the else
statement will be executed.  (This
is where I have placed the code that mimics the built-in template
rules.)

Note that the code in the else
statement in Listing 10 invokes a method named defElOrRtNodeTemp.  When I
discuss this method momentarily, you will see that its behavior mimics
one of the built-in template rules that I discussed earlier in this
lesson.  Before getting to that, however, I want to give you a
preview of how I will define custom template rules in future lessons.

Creating custom
template rules

As you will see in subsequent lessons, the process for creating a
custom template rule is as follows:

  • Go to the method named processNode, which I am
    discussing right now.
  • Identify the case for the node
    type in the switch statement.
  • Change the conditional clause
    in the if statement for that
    case to
    implement a match for a particular node of that type.
  • Write code in the body of the if statement to implement the custom
    template rule.

If the modified conditional clause
evaluates to true, the custom template rule will be executed.  If
false, the
default rule will be executed.
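
As a preview only, the following self-contained sketch shows what that pattern might look like when one conditional clause is changed. It is not the code from Dom11 or from the later lessons: the class CustomRuleSketch, the simplified processNode and applyTemplates methods, and the match on an element named title are all assumptions. The custom rule corresponds roughly to a template rule with match="title" that writes the element's text inside square brackets.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CustomRuleSketch {

  static void processNode(Node node, StringBuilder out) {
    switch (node.getNodeType()) {
      case Node.DOCUMENT_NODE:
        applyTemplates(node, out); // default behavior
        break;
      case Node.ELEMENT_NODE:
        if (node.getNodeName().equals("title")) {
          // Hypothetical custom rule for title elements.
          // getTextContent requires DOM Level 3 (Java 5+).
          out.append('[').append(node.getTextContent())
             .append(']');
        } else {
          applyTemplates(node, out); // default behavior
        }
        break;
      case Node.TEXT_NODE:
        out.append(node.getNodeValue()); // default behavior
        break;
      default:
        break; // ignore all other node types
    }
  }

  // Process every child of the context node.
  static void applyTemplates(Node node, StringBuilder out) {
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      processNode(children.item(i), out);
    }
  }

  static String run(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      StringBuilder out = new StringBuilder();
      processNode(doc, out);
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Prints [Java]Baldwin
    System.out.println(run(
        "<book><title>Java</title>"
        + "<author>Baldwin</author></book>"));
  }
}
```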

The
ELEMENT_NODE case

Before getting to the discussion of the method named defElOrRtNodeTemp, I want to show
you the ELEMENT_NODE case in
Listing 11.

(This
is still part of the switch
statement that was begun in Listing 10.)

        case Node.ELEMENT_NODE:{
          if(false){
            //unreachable in this example
          }else{//invoke default behavior
            defElOrRtNodeTemp(node);
          }//end else
          break;
        }//end case ELEMENT_NODE

Listing 11

Except for the type of node in the first line in Listing 11, the code
in this case is identical to the code in the DOCUMENT_NODE case shown in Listing
10.  Note in particular that the default behavior for this case
invokes the same method as the default behavior for the document node
case.

As before, the code in the if
statement is not reachable in this program.

(That
will be true for every case in this program, because this program is
designed specifically to exhibit the same behavior as the built-in XSLT
template rules.)

The method
named defElOrRtNodeTemp

Still following the execution thread, I will set my discussion of the switch
statement aside temporarily and discuss the method named defElOrRtNodeTemp.  As
mentioned above, this method is invoked
as the default behavior for document nodes and element nodes in
Listings 10 and 11.

I will return to my discussion of the switch
statement shortly.

The entire method named defElOrRtNodeTemp
is shown in Listing 12.

  void defElOrRtNodeTemp(Node node)
      throws Exception{
    int nodeType = node.getNodeType();
    if((nodeType == Node.ELEMENT_NODE) ||
        (nodeType == Node.DOCUMENT_NODE)){

      applyTemplates(node,null);
    }else{
      throw new Exception(
          "Bad call to defElOrRtNodeTemp");
    }//end else
  }//end defElOrRtNodeTemp

Listing 12

Behavior of the
method named defElOrRtNodeTemp

This method mimics the behavior of the built-in XSLT template rule
shown in Listing 1, and repeated in Figure 5 below for convenient
viewing.

  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>
Figure 5

As I indicated earlier, the match
pattern for this template rule matches the document node and all
element nodes.

(Hence, this method is invoked by the two
cases in the switch statement
corresponding to the document node and an element node.)


Code is
straightforward

The code in this method is relatively straightforward.  First it
tests to confirm that the incoming parameter points to a node of the
correct type, and throws an exception if the incoming parameter is not
of the correct type.

If the incoming parameter is of the correct type, the code in the
method invokes a method named applyTemplates
passing the node as a parameter to that method.

(Note the similarity between the code in
Listing 12 and the XSLT template rule in Figure 5.)

The method
named applyTemplates

Continuing to follow the execution thread, I will now discuss the
method named applyTemplates,
shown in Listing 13.

  void applyTemplates(Node node,String select){
    NodeList children = node.getChildNodes();
    if (children != null){
      int len = children.getLength();
      //Iterate on NodeList of child nodes.
      for (int i = 0; i < len; i++){
        if((select == null) ||
            (select.equals(children.item(i).
                getNodeName()))){

          //Recursive method call
          processNode(children.item(i));

        }//end if
      }//end for loop
    }//end if children != null

  }//end applyTemplates

Listing 13

Behavior of the
apply-templates rule

The applyTemplates method
partially emulates the XSLT apply-templates
rule discussed earlier in this lesson, and shown in Figure 6.

<xsl:apply-templates
optional attribute select="..."
optional attribute mode="..."
/>
Figure 6

The apply-templates
rule has two attributes, select and
mode. 

(The applyTemplates method shown in Listing 13 does not
support the mode attribute.  Perhaps I will update the method in a
future lesson to support this attribute.)


As I explained earlier in this lesson,

“The xsl:apply-templates
rule processes all child nodes of the context node that match
an optional select
attribute.  If the select attribute is omitted, all
child nodes are processed.”


Behavior
of the method named applyTemplates


The applyTemplates method
shown in Listing 13 receives two incoming parameters:

  • The context node.
  • The select parameter.

If the select parameter is null, the method
examines and processes all child nodes of the context node. 
Otherwise, it processes only those child nodes that match the select parameter.

The code in Listing 13 invokes the getChildNodes
method on the context node to get a list of all child nodes of the
context node.  If there are no child nodes, it quietly returns.

A recursive
method call

If there are child nodes, the
method uses a for loop to
process all child nodes that match the select
parameter as described above.

(Note that the match or lack thereof is
based on the name of the node obtained by invoking the method named getNodeName on the child node being
examined.)


For each matching child node, the applyTemplates
method makes a recursive call to the method
named processNode, passing the
child node’s reference as a parameter to the processNode method.
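
The name-based matching can be isolated in a short sketch. Instead of calling processNode, this version simply collects the names of the child nodes that would be processed, so the effect of the select parameter is easy to see. The class name SelectSketch, both method names, and the sample XML are assumptions. Note that a select attribute in real XSLT is an XPath expression, whereas this filter, like the one in Listing 13, compares node names only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SelectSketch {

  // Return the names of the child nodes that match select.
  // A null select matches every child node.
  static List<String> selectChildren(Node context,
                                     String select) {
    List<String> names = new ArrayList<>();
    NodeList children = context.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      if (select == null
          || select.equals(child.getNodeName())) {
        names.add(child.getNodeName());
      }
    }
    return names;
  }

  static Element root(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Element r = root("<r><a/>x<b/><a/></r>");
    // Prints [a, #text, b, a]
    System.out.println(selectChildren(r, null));
    // Prints [a, a]
    System.out.println(selectChildren(r, "a"));
  }
}
```

The name of a text node is always #text, which is why a select value that names an element never matches a text node.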


Return to
defElOrRtNodeTemp method

Eventually, the recursive process will end, and control will return to
the defElOrRtNodeTemp method
shown in Listing 12.  From there, control will return to either
the DOCUMENT_NODE case or the ELEMENT_NODE
case in the switch statement
in Listing 10 or Listing 11 from which the
defElOrRtNodeTemp
method was called.

That, in turn, brings us back to a discussion of the other cases in the
switch statement.

The TEXT_NODE
and ATTRIBUTE_NODE cases

The next two cases from the switch
statement that I will discuss are shown in Listing 14.  (The switch
statement began in Listing 10.)

Listing 14 shows the cases for text nodes and attribute nodes.  I
have grouped these two cases together because the default behavior of
both cases is to invoke the method named defTextOrAttrTemp, and to send the String returned by that method to
the output.

        case Node.TEXT_NODE:{
          if(false){
            //unreachable in this program
          }else{//invoke default behavior
            out.print(defTextOrAttrTemp(node));
          }//end else
          break;
        }//end case Node.TEXT_NODE

        case Node.ATTRIBUTE_NODE:{
          if(false){
            //unreachable in this program
          }else{//invoke default behavior
            out.print(defTextOrAttrTemp(node));
          }//end else
          break;
        }//end case Node.ATTRIBUTE_NODE

Listing 14

The
defTextOrAttrTemp method

Once again, following the execution thread, I will now discuss the
method named defTextOrAttrTemp.  This method is called whenever:

  • The processNode method
    is called with a reference to either a text node or an attribute node,
    and
  • The default behavior for the node type is executed.

Listing 15 shows the entire method named defTextOrAttrTemp.

  String defTextOrAttrTemp(Node node)
      throws Exception{
    int nodeType = node.getNodeType();
    if((nodeType == Node.ATTRIBUTE_NODE)
        || (nodeType == Node.TEXT_NODE)){

      //Get and return value of context node.
      return valueOf(node,".");
    }else{
      throw new Exception(
          "Bad call to defTextOrAttrTemp method");
    }//end else
  }//end defTextOrAttrTemp

Listing 15

Emulates a
built-in XSLT template rule

This method emulates the built-in XSLT template rule shown in Listing 2
and repeated in Figure 7 below for convenient viewing.

  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
Figure 7

As I told you earlier, this template
rule matches all text nodes and all attribute nodes.  Therefore,
the
defTextOrAttrTemp
method is invoked by the default behavior of either the TEXT_NODE case or the ATTRIBUTE_NODE case in the switch statement in Listing 14.

Similar behavior

Once again, note the similarity between the method named defTextOrAttrTemp in Listing 15 and
the template rule shown in Figure 7.

In Figure 7, the template rule executes the xsl:value-of XSLT element to send
the value of the context node to the output.

The method shown in Listing 15 invokes a method named valueOf,
passing “.” as a parameter (note the
period between the quotation marks).
  The value returned by
that method is
sent to the output by the code in the default behaviors of the two
cases in Listing 14.

The method named
valueOf

The method named valueOf,
which begins in Listing 16, is
fairly complex.  I will discuss portions of this method
in this lesson and will discuss the remainder of the method in
subsequent lessons.

This method emulates an <xsl:value-of
select="???"/>
XSLT element.

Three forms of
method call

The method requires two parameters.  The first parameter is of
type Node, and is the context
node.  The second parameter is of type String and is a select parameter.

The valueOf method recognizes
three forms of call:

  1. valueOf(Node theNode,String “@attrName”)
  2. valueOf(Node theNode,String “.”)
  3. valueOf(Node theNode,String “nodeName”)

In the first form, the method returns the text value of the named
attribute of theNode.  An attribute is specified by a select value
that begins with @.  If the attribute doesn’t
exist, the method returns an empty string.

In the second form, which is the only form actually used in this
program, the value of the select parameter is a String containing a single
period.  In this form, the method returns the concatenated text
values of the context node and all
descendants
of the context node (including text
nodes that are children of the context node).

In the third form, the method returns the concatenated text values of
all descendants of a specified child node of the context node.  If
the context node has more than one child node with
the specified name, only the first one found is processed. 
The others are ignored.
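
The three forms of call can be illustrated with the sketch below. This is not the valueOf method from Dom11, large parts of which are not shown in this lesson. It is a hypothetical method written only to match the behavior described above, and the class name ValueOfSketch and the sample XML are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ValueOfSketch {

  static String valueOf(Node node, String select) {
    if (select == null) {
      return "";
    }
    if (select.startsWith("@")) {
      // Form 1: value of the named attribute. getAttribute
      // returns an empty string for a missing attribute.
      if (node.getNodeType() != Node.ELEMENT_NODE) {
        return "";
      }
      return ((Element) node).getAttribute(
          select.substring(1));
    }
    NodeList children = node.getChildNodes();
    if (select.equals(".")) {
      // Form 2: value of the context node.
      if (node.getNodeType() == Node.TEXT_NODE) {
        return node.getNodeValue();
      }
      if (node.getNodeType() != Node.ELEMENT_NODE) {
        return "";
      }
      StringBuilder text = new StringBuilder();
      for (int i = 0; i < children.getLength(); i++) {
        text.append(valueOf(children.item(i), "."));
      }
      return text.toString();
    }
    // Form 3: value of the first child with the given name.
    for (int i = 0; i < children.getLength(); i++) {
      if (select.equals(children.item(i).getNodeName())) {
        return valueOf(children.item(i), ".");
      }
    }
    return "";
  }

  static Element root(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Element book = root("<book id=\"7\">"
        + "<title>Java <b>JAXP</b></title>"
        + "<title>Second</title></book>");
    System.out.println(valueOf(book, "@id"));   // 7
    System.out.println(valueOf(book, "."));     // Java JAXPSecond
    System.out.println(valueOf(book, "title")); // Java JAXP
  }
}
```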

Features not
supported

The valueOf method does not
support the following features, which are standard features of the xsl:value-of XSLT element:

  • disable-output-escaping
  • processing instruction nodes
  • comment nodes
  • namespace nodes

Will discuss
the second form only

Since the second form of call listed above is the only form actually
used in this program, I will discuss only those portions of the method
that support that form.  I will defer discussion of the other
portions of the method until they are used in subsequent lessons.

Process the
context node

The code in Listing 16 picks up at the point where it is determined
that the incoming value for select
is a String object’s reference
with a value of “.” (note the period
between the quotation marks).
  This is a request to return
the value of the context node.

This method supports two possibilities for the context node:

  1. Element node – return the concatenated text values of all
    descendant nodes of the context node.
  2. Text node – return the text value of the text node.

Clearly the first possibility is the more complex of the two, but as
you will see, recursion makes it easy to accomplish.

When the
context node is an element node …

The code in Listing 16 shows the beginning of the code required to
process the context node as an element node.

  public String valueOf(Node node,String select){

    //code deleted for brevity

    else if(select != null
        && select.equals(".")){

      int nodeType = node.getNodeType();
      if(nodeType == Node.ELEMENT_NODE){
        //Process the context node as an element
        // node.

Listing 16

Get list of
child nodes

In preparation for processing all descendant nodes of the context node,
the code in Listing 17 gets a list of child nodes, along with the
length of the list.

In addition, the code in Listing 17 initializes a String variable named nodeTextValue that will be used to
collect the concatenated text values of the descendant nodes. 
Note that this variable is initialized to contain an empty string.

        NodeList childNodes =
            node.getChildNodes();
        int listLen = childNodes.getLength();

        String nodeTextValue = "";//result

Listing 17

Process child
nodes of context node

Having gotten a list of child nodes of the context node, all that is
required to accomplish the objective is to make a series of recursive
calls to the valueOf method,
passing each child
node in turn to the valueOf method
as shown in Listing 18.

        for(int j = 0; j < listLen; j++){
          nodeTextValue +=
              valueOf(childNodes.item(j),".");
        }//end for loop

        return nodeTextValue;

Listing 18

Each child node becomes the new context node upon re-entry into the valueOf method, and each call
requests the value of the context node (the current child node) by passing
“.” for the select parameter.

Concatenation

The code in Listing 18 also deals with concatenation.  The value
returned from each call to the valueOf
method is concatenated with the text value already stored in the
variable named nodeTextValue.

Finally, after all child nodes have been processed, the code in Listing
18 returns the concatenated value stored
in the variable named nodeTextValue.
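
The recursion and concatenation can be checked against the DOM itself. The sketch below is a standalone restatement of the idea, not the code from Dom11: the class TextValueSketch and its methods are assumptions, and it collects text in a StringBuilder instead of a String. It compares the recursive result with the getTextContent method, which was added to the Node interface in DOM Level 3 (Java 5 and later) and is therefore not available in the J2SE 1.4.2 release used for this lesson.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextValueSketch {

  // Concatenate the values of all descendant text nodes.
  static String textValue(Node node) {
    if (node.getNodeType() == Node.TEXT_NODE) {
      return node.getNodeValue();
    }
    if (node.getNodeType() != Node.ELEMENT_NODE) {
      return ""; // comments and other node types add nothing
    }
    StringBuilder text = new StringBuilder();
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      // Each child becomes the context node in turn.
      text.append(textValue(children.item(i)));
    }
    return text.toString();
  }

  static Element root(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Element p = root("<p>one <b>two</b><!--skip--> three</p>");
    System.out.println(textValue(p));        // one two three
    System.out.println(p.getTextContent());  // one two three
  }
}
```

One difference is worth knowing about. Unless the parser is configured to coalesce them, CDATA sections appear in the DOM tree as CDATA section nodes rather than text nodes, so a method that accepts only TEXT_NODE skips their content, whereas getTextContent includes it.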

When the
context node is a text node …

If you understood all of the above (including
the recursion),
you should find it easy to
understand the code shown in Listing 19.  Listing 19 shows the
case where the context node is a text node.

      }else if(nodeType == Node.TEXT_NODE){
        return node.getNodeValue();

Listing 19

In this case, the method simply returns the value obtained by invoking getNodeValue on the text node.

One other
possibility

There is one other possibility that is handled by the code in Listing
20.  That possibility is that the context node is neither a text
node nor an element node.  In that case, the valueOf method returns an empty
string.

      }else{
        //ignore all other context node types
      }//end else
    }//end if for context node

    //code deleted for brevity

    return "";//empty string
  }//end method valueOf

Listing 20

Other types of
nodes in the switch statement

Returning to the switch
statement that began in Listing 10, we find two additional cases, each
of which invokes the same method by default:

  • COMMENT_NODE
  • PROCESSING_INSTRUCTION_NODE

The default behavior of the cases corresponding to both of these node
types is to invoke the method named defComOrProcInstrTemp.

        case Node.COMMENT_NODE:{
          if(false){
            //unreachable in this program
          }else{//invoke default behavior
            defComOrProcInstrTemp(node);
          }//end else
          break;
        }//end case COMMENT_NODE

        case Node.PROCESSING_INSTRUCTION_NODE:{
          if(false){
            //unreachable in this program
          }else{//invoke default behavior
            //First save proc instr for later
            // use.
            procInstr.add(node);
            //Now invoke default behavior.
            defComOrProcInstrTemp(node);
          }//end else
          break;
        }//end case PROCESSING_INSTRUCTION_NODE

Listing 21

Save all
processing instructions

I will discuss the defComOrProcInstrTemp
method shortly.  First, however, I will explain the extra code
that appears in the default portion of the processing instruction node
case in Listing 21.

The purpose of a processing instruction in an XML file is to provide
instructions to processing programs such as this one.  The XML
file shown in Listing 29 contains the three processing instructions
shown in Listing 22.

<?dummy-target dummy-data="def"?>
<?xml-stylesheet
type="text/xsl" href="Dom11.xsl"?>
<?false-target false-data="ghi"?>

Listing 22

Stylesheet
identified in a processing instruction

The first and third of the three processing instructions are dummy
processing instructions put there
to test the capabilities of this program.  However, the
processing instruction in the middle is a real processing instruction
that specifies the name of the file containing a stylesheet.  That
stylesheet will be used later when this program causes an XSLT
transformation to take place using the XML file in Listing 29, and the
stylesheet file identified in Listing 22.  (That
stylesheet actually appears in Listing 30.)

In order to use that processing instruction to identify the stylesheet
file, this program must capture the processing instruction and extract
the file name from the processing instruction.  A statement in the
second case in Listing 21 causes references to all processing
instruction nodes to be added to and saved in a static variable of the
Dom11 class named procInstr.

That information will be used later to extract the name of the
stylesheet file from the processing instruction.

The
defComOrProcInstrTemp method

Both of the switch cases shown
in Listing 21 invoke this method as their default behavior.  A
complete listing of the defComOrProcInstrTemp
method is shown in Listing 23.

  String defComOrProcInstrTemp(Node node)
      throws Exception{
    int nodeType = node.getNodeType();
    if((nodeType == Node.COMMENT_NODE) ||
        (nodeType ==
            Node.PROCESSING_INSTRUCTION_NODE)){

      return "";//empty string
    }else{
      throw new Exception("Bad call to " +
          "defComOrProcInstrTemp");
    }//end else
  }//end defComOrProcInstrTemp

Listing 23

The defComOrProcInstrTemp
method emulates the built-in template rule shown in Figure 8.

<xsl:template
match="processing-instruction()|comment()"/>
Figure 8

According to Nutshell, the built-in
template rule for comments and processing instructions doesn’t output
anything into the output tree.  Therefore, the
defComOrProcInstrTemp method shown
in Listing 23 simply returns an empty string.

The namespace
node case


The default case for the switch
statement begun in Listing 10 is shown in Listing 24.

        default:{
          //Ignore all other node types.
        }//end default

      }//end switch

Listing 24

Since the switch statement
contains explicit cases for six of the seven possible types of nodes in
an XML document, none of those six types can reach the default case.  As I mentioned earlier, the Node interface doesn’t provide a
constant that can be used to identify namespace nodes, so it isn’t
possible to create an explicit case for namespace nodes.  (The default
case also catches the remaining DOM node types for which the switch statement has no
explicit case, such as document type nodes and CDATA section nodes.)

Also, here is what Nutshell has to say about the built-in template rule
for namespace nodes:

“A
… template rule … instructs the processor not to copy any part of
the namespace node to the output.”

Therefore, the default case in Listing 24, which catches all namespace
nodes, doesn’t send anything to the output.
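
To see which DOM nodes would actually fall through to a default case like this one, the following sketch walks a small tree and records every node type that is not one of the six explicit cases. The class name DefaultCaseSketch, its methods, and the sample XML are assumptions. The sketch relies on the fact that a non-coalescing parser, which is the default for DocumentBuilderFactory, reports CDATA sections as separate nodes, and that a DOCTYPE declaration produces a document type node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DefaultCaseSketch {

  // Record the type of every node that has no explicit case.
  static void collect(Node node, List<Short> others) {
    switch (node.getNodeType()) {
      case Node.DOCUMENT_NODE:
      case Node.ELEMENT_NODE:
      case Node.ATTRIBUTE_NODE:
      case Node.TEXT_NODE:
      case Node.COMMENT_NODE:
      case Node.PROCESSING_INSTRUCTION_NODE:
        break;
      default:
        others.add(node.getNodeType());
    }
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      collect(children.item(i), others);
    }
  }

  static List<Short> unhandledTypes(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      List<Short> others = new ArrayList<>();
      collect(doc, others);
      return others;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Prints [10, 4]: DOCUMENT_TYPE_NODE, CDATA_SECTION_NODE
    System.out.println(unhandledTypes(
        "<!DOCTYPE r [<!ELEMENT r (#PCDATA)>]>"
        + "<r><![CDATA[raw]]></r>"));
  }
}
```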

End of the
processNode method

I have discussed everything of significance in the processNode method.  Continuing
to follow the execution thread, I will now turn my attention back to
the main method.

Perform an XSLT
transformation

After the code has been executed to process the document using program
code (beginning with the invocation
of the processDocumentNode
method in Listing 7),
the
statement in Listing 25 invokes the doXslTransform
method to cause the XML document to be transformed using the stylesheet
identified in one of the processing instructions in the XML file.

      thisObj.doXslTransform(
          document,argv[1],procInstr);

Listing 25

Stylesheet
reference has been saved

The success of the method call in Listing 25 depends on the
stylesheet processing instruction having been saved while the document
was being processed.  Otherwise, it would be necessary to add code
in this method to search the DOM tree for the stylesheet processing
instruction.

All processing instructions are saved in a Vector object by this
program.  The Vector
object’s reference is passed as the third parameter to this
method.  The first parameter is a reference to the Document or root node in the DOM
tree.  The second parameter is the name of the output file.

The
doXslTransform method

The doXslTransform method
begins in Listing 26.  This method uses an XSLT stylesheet file to
transform an incoming Document object
into an output file.  A large portion of the code in this method
is dedicated to:

  • Identifying the processing instruction containing the stylesheet
    information.
  • Extracting the stylesheet information from the processing
    instruction.

Identify the
processing instruction containing the stylesheet reference

The code in Listing 26 searches the Vector
object seeking a processing instruction node that contains a stylesheet
reference.

  void doXslTransform(Document document,
                      String outFile,
                      Vector procInstr)
      throws Exception{
    try{
      //Get stylesheet ID from proc instr.
      ProcessingInstruction pi = null;
      boolean piFlag = false;
      int size = procInstr.size();

      //Search for a stylesheet in the Vector
      // containing processing instruction nodes.
      for(int i = 0; i < size; i++){
        pi = (ProcessingInstruction)procInstr.
            get(i);
        if(pi.getTarget().startsWith(
            "xml-stylesheet") && pi.getData().
            startsWith("type=\"text/xsl\"")){
          //Looks like a good stylesheet.
          piFlag = true;
          break;
        }//end if
      }//end for loop
      if(piFlag == false){//still false?
        throw new Exception(
            "No valid stylesheet");
      }//end if

Listing 26

How does this
work?

To see how this code works, first take a look at the processing
instruction in the XML file that contains the stylesheet
reference.  This processing instruction was shown in Listing 22,
and is repeated below in Figure 9 for convenient viewing.

<?xml-stylesheet 
type="text/xsl" href="Dom11.xsl"?>
Figure 9

The purpose of a processing instruction is to provide information to
processing programs that will be used to process the XML file.

Format of a processing instruction

According to Nutshell,

“A processing instruction begins with
<? and ends with ?>.  Immediately following the <? is an
XML name called the target, possibly the name of the application for
which this processing instruction is intended or possibly just an
identifier for this particular processing instruction.  The rest
of the processing instruction contains text in a format appropriate for
the application for which the instruction is intended.”


Applying this knowledge to the stylesheet processing instruction in
Figure 9, you can see that the target consists of the following
text:  xml-stylesheet.

Accessing the target and the data

The target of a processing instruction node can be accessed in Java by
invoking the getTarget method
on the processing
instruction node’s reference.

The remainder of the text in the processing instruction can be accessed
by invoking the getData method
on the same reference.

The code in Listing 26 examines each of the objects in the Vector, invoking getTarget and getData, searching for a processing
instruction whose target and data match that which is known to be true
for a stylesheet.  When a match is found, the code breaks out of
the for loop.

If no match is found, the code in Listing 26 throws an exception.
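To make the roles of getTarget and getData a little more concrete, here is a small stand-alone sketch.  It does not parse Dom11.xml.  Instead, it builds three processing instruction nodes with the same targets and data as those in Dom11.xml and then searches them in much the same way as Listing 26.  The class name PiTargetDataDemo and its method names are hypothetical, and the sketch uses a List rather than a Vector.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;

// Hypothetical demo class, not part of the Dom11 program.
final class PiTargetDataDemo {

  // Builds processing instruction nodes that mirror the three
  // processing instructions in Dom11.xml.
  static List<ProcessingInstruction> samplePis() {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder().newDocument();
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
    List<ProcessingInstruction> list = new ArrayList<>();
    list.add(doc.createProcessingInstruction(
        "dummy-target", "dummy-data=\"def\""));
    list.add(doc.createProcessingInstruction(
        "xml-stylesheet", "type=\"text/xsl\" href=\"Dom11.xsl\""));
    list.add(doc.createProcessingInstruction(
        "false-target", "false-data=\"ghi\""));
    return list;
  }

  // Returns the index of the first processing instruction whose target
  // and data look like an XSL stylesheet reference, or -1 if none does.
  static int indexOfStylesheet(List<ProcessingInstruction> pis) {
    for (int i = 0; i < pis.size(); i++) {
      ProcessingInstruction pi = pis.get(i);
      if (pi.getTarget().equals("xml-stylesheet")
          && pi.getData().startsWith("type=\"text/xsl\"")) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    List<ProcessingInstruction> pis = samplePis();
    for (ProcessingInstruction pi : pis) {
      // getTarget returns the name after <?, getData returns the rest.
      System.out.println(pi.getTarget() + " -> " + pi.getData());
    }
    System.out.println("stylesheet index: " + indexOfStylesheet(pis));
  }
}
```

Like Listing 26, this test depends on the type pseudo-attribute appearing first in the data.  A processing instruction that lists href before type would not be recognized.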

Extract the stylesheet file name

Having identified the processing instruction that contains the
stylesheet reference, the code in Listing 27 uses the getData method of the ProcessingInstruction interface,
along with some methods of the String
class to extract the name of the file containing the stylesheet.

      String xslFile = pi.getData().
          substring(pi.getData().indexOf(
                                  "href=")+6);
      //Eliminate the quotation mark at the end
      xslFile = xslFile.substring(
                        0,xslFile.length()-1);

Listing 27

The ability to extract the file name is based on the known format of
the stylesheet processing instruction.
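Because Listing 27 depends on href being the last item in the processing instruction data, it would return extra text if another pseudo-attribute followed the file name.  If you need something a little more tolerant, one possible approach is shown in the following sketch, which uses a regular expression to pull out a named pseudo-attribute regardless of its position or the kind of quotation marks used.  The class name PseudoAttr and the method name get are hypothetical.  The sketch does not handle character references inside the value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Dom11 program.
final class PseudoAttr {

  // Returns the value of the named pseudo-attribute in the data of a
  // processing instruction, or null if the name is not present.
  // Accepts either single or double quotation marks around the value.
  static String get(String data, String name) {
    Pattern p = Pattern.compile(
        "(?:^|\\s)" + Pattern.quote(name)
        + "\\s*=\\s*([\"'])(.*?)\\1");
    Matcher m = p.matcher(data);
    return m.find() ? m.group(2) : null;
  }

  public static void main(String[] args) {
    // Same data as the stylesheet processing instruction in Figure 9.
    String data = "type=\"text/xsl\" href=\"Dom11.xsl\"";
    System.out.println(get(data, "href"));
    System.out.println(get(data, "type"));
  }
}
```

In Dom11, a call such as PseudoAttr.get(pi.getData(), "href") could stand in for the two substring operations in Listing 27.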

Do the XSLT transformation

The remaining code in the doXslTransform
method is shown in Listing 28.

      //Get a TransformerFactory object
      TransformerFactory xformFactory =
               TransformerFactory.newInstance();

      //Get an XSL Transformer object based on
      // the XSL file discovered above.
      Transformer transformer =
                  xformFactory.newTransformer(
                      new StreamSource(
                          new File(xslFile)));

      //Get a DOMSource object that represents
      // the DOM tree.
      DOMSource source = new DOMSource(document);

      //Get an output stream for the output
      // file.
      PrintWriter xformStream = new PrintWriter(
                 new FileOutputStream(outFile));

      //Get a StreamResult object that points to
      // the output file.  Then transform the DOM
      // sending text to the output file.
      StreamResult xformResult =
                  new StreamResult(xformStream);

      //Do the transform
      transformer.transform(source,xformResult);
    }catch(Exception e){
      e.printStackTrace(System.err);
    }//end catch

  }//end doXslTransform

Listing 28

You have seen this code before

The code in Listing 28 is not new to this series of lessons.  This
code was discussed in detail in the earlier lesson entitled Getting
Started with Java JAXP and XSL Transformations (XSLT).  Therefore,
other than to point out one difference relative to the previous code,
and to review the steps involved, I won’t discuss the code in
Listing 28 further in this lesson.

Steps for creating a Transformer object

The following two steps
are required to create a Transformer object.  Once a Transformer object is available, it
can be used to transform a source tree (the DOM tree in this program) into a result
(an output file in this program).

  1. Create a TransformerFactory object by invoking the
    static newInstance method of the TransformerFactory class.
  2. Invoke the newTransformer method on the TransformerFactory
    object.
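The two steps listed above can be sketched in a few lines of stand-alone code.  To keep the sketch free of any dependence on files, it uses the version of newTransformer that takes no parameters, reads its input from a string, and writes its result to a string.  The class name TwoStepTransformer and the method name copy are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the Dom11 program.
final class TwoStepTransformer {

  // Creates a Transformer using the two steps and uses it to copy
  // the incoming XML text to a result string.
  static String copy(String xml) {
    try {
      // Step 1: get a TransformerFactory object.
      TransformerFactory factory = TransformerFactory.newInstance();

      // Step 2: get a Transformer object from the factory.
      Transformer transformer = factory.newTransformer();

      // Leave the XML declaration out of the result so that the
      // output is easier to compare with the input.
      transformer.setOutputProperty(
          OutputKeys.OMIT_XML_DECLARATION, "yes");

      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      // Wrapped to keep the sketch short.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(copy("<top><title>Java</title></top>"));
  }
}
```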

One important difference

There is one important difference between the code in Listing
28 and the code in the earlier lesson.  The two programs invoke
different overloaded versions of the newTransformer
method of the TransformerFactory
class.

The earlier lesson entitled Getting
Started with Java JAXP and XSL Transformations (XSLT)
invoked a version that took no parameters and returned a Transformer object that simply
copies a source tree to a result tree.

The code in Listing 28 invokes a version of the newTransformer method that takes
the stylesheet file as an input parameter and returns a Transformer object that uses the
stylesheet file to perform an XSLT transformation.
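The difference between the two overloaded versions can be seen by running the same XML text through both of them.  The following sketch does that.  For the stylesheet version, it assumes an empty stylesheet similar to Dom11a.xsl, supplied as a string rather than as a file, so that only the built-in template rules apply.  The class name NewTransformerOverloads, the constant EMPTY_XSL, and the method name transform are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the Dom11 program.
final class NewTransformerOverloads {

  // An empty stylesheet, so only the built-in template rules apply.
  static final String EMPTY_XSL =
      "<xsl:stylesheet "
      + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" "
      + "version=\"1.0\"/>";

  // Transforms xml. If xsl is null, the no-parameter version of
  // newTransformer is used. Otherwise the version that takes a
  // stylesheet Source is used.
  static String transform(String xml, String xsl) {
    try {
      TransformerFactory factory = TransformerFactory.newInstance();
      Transformer transformer = (xsl == null)
          ? factory.newTransformer()
          : factory.newTransformer(
              new StreamSource(new StringReader(xsl)));
      transformer.setOutputProperty(
          OutputKeys.OMIT_XML_DECLARATION, "yes");
      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)),
          new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      // Wrapped to keep the sketch short.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String xml =
        "<top><title>Java</title><price>$9.95</price></top>";
    // Copies the source, markup included.
    System.out.println(transform(xml, null));
    // Applies the built-in rules, which emit only the text.
    System.out.println(transform(xml, EMPTY_XSL));
  }
}
```

The first call should reproduce the element markup, while the second should produce only the concatenated text of the text nodes, which is the default XSLT behavior that Dom11 emulates.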

That concludes the discussion of the program named Dom11.

Run the Program

I encourage you to copy the Java code, XML files, and XSL files from
the listings near the end of this lesson.  Compile and execute the
programs.  Experiment with them, making changes, and observing the
results of your changes.

Summary

I explained default XSLT behavior
and showed you how to write Java code that mimics that behavior. 
The resulting Java code serves as a skeleton for more advanced
transformation programs.

What’s Next?

In the next lesson, I will show you how to
write a Java program that mimics an XSLT transformation for converting
an XML file into a text file.  I will also show that once you
have a library of Java methods that
emulate XSLT elements, it is no more difficult to
write a Java program to transform an XML document than it is to
write an XSL stylesheet to transform the same document.

Complete Program Listings


Complete listings of the various files discussed in this lesson are
contained in the listings that follow.

<?xml version="1.0"?>

<!DOCTYPE top [
<!ELEMENT top (theData)*>
<!ELEMENT theData (title,author,price)*>
<!ELEMENT title (#PCDATA | subtitle)*>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT subtitle (#PCDATA)>
<!ATTLIST theData attr CDATA #IMPLIED>
<!ATTLIST subtitle position CDATA #IMPLIED>
]>

<!-- File Dom11.xml
Copyright 2003 R. G. Baldwin
Illustrates built-in template rules.
-->

<!--Two of the following proc instr were included
to test the ability of the program to find the
actual stylesheet proc instr.-->
<?dummy-target dummy-data="def"?>
<?xml-stylesheet
type="text/xsl" href="Dom11.xsl"?>
<?false-target false-data="ghi"?>

<top>

<theData attr="Dummy Attr Value">
<title>Java
<subtitle position="Low">really</subtitle>rules
</title>
<author>R.Baldwin</author>
<price>$9.95</price>
</theData>

<theData>
<title>Python</title>
<author>R.Baldwin</author>
<price>$15.42</price>
</theData>

<theData>
<title>XML</title>
<author>R.Baldwin</author>
<price>$19.60</price>
</theData>

</top>

Listing 29

<?xml version='1.0'?>
<!-- File Dom11.xsl
Copyright 2003 R. G. Baldwin
Illustrates extraction of text from an XML file.

This version specifies a template rule that
guarantees that the root and all child nodes
are processed.

It also specifies a template rule that copies
the value of text and attribute nodes into the
output. However, attribute nodes are not copied.
See Nutshell page 147 for the reason.
-->
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<!--According to Nutshell, this matches a
default template.-->
<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>

<!--According to Nutshell, this matches a
default template.-->
<xsl:template match="text()|@*">
<xsl:value-of select="."/>
</xsl:template>

</xsl:stylesheet>

Listing 30

/*File Dom11.java
Copyright 2003 R.G.Baldwin

This program implements all six built-in default
template rules for an XML processor. In
addition, it implements a couple of other
template rules that are required to support
the built-in rules, such as xsl:value-of.

As such, the program serves as the skeleton for
the definition of custom template rules.

To create a custom template rule:
1. Go to the processNode method.
2. Identify the node type.
3. Change the conditional clause in the if
statement to implement the match.
4. Write code in the body of the if statement to
implement the custom rule.

If the modified conditional clause evaluates to
true, the custom rule will be executed. If
false, the default rule will be executed.

As written, this program extracts and
concatenates all text values from a specified
XML file, and writes that text into a result
file, using two different approaches:

1. An XSLT style sheet and transformation.
2. Program code that emulates the behavior of the
XSL transformation.

In particular, this program illustrates Java code
that emulates the XSLT templates in the files
named Dom11.xsl and Dom11a.xsl. These two XSL
files differ in terms of their dependence on the
built-in templates.

Dom11.xsl explicitly includes template rules that
replicate the built-in rules for text, nodes, and
documents.

Dom11a.xsl doesn't explicitly include any
template rules, but depends entirely on built-in
rules for proper operation.

Both XSL files produce the same result when
processed against the XML files named Dom11.xml
and Dom11a.xml, demonstrating the behavior of
the built-in template rules.

The execution of these template rules causes the
explicit template rules, or the built-in template
rules to be executed on every node, thereby
causing the contents of every text node to be
concatenated and written into the result file.


The program requires three command line
parameters in the following order:
1. The name of the input XML file - must be
Dom11.xml or Dom11a.xml.
2. The name of the output file to be
produced by the XSL transformation.
3. The name of the output file to be
produced by the program code that emulates
the XSL transformation.

The name of the XSL stylesheet file is extracted
from the processing instruction in the XML file.

The program begins by executing code to transform
the incoming XML file in a way that mimics the
XSL Transformation. Along the way, it saves the
processing instructions containing the ID of the
stylesheet file for use by the XSLT process
later. Otherwise, the code that performs the
XSL transformation later would have to search the
DOM tree for the XSL stylesheet file.

Then the program uses the XSLT style sheet to
transform the XML file into a result file.

No effort was made to provide meaningful
information about errors and exceptions.

Tested with SDK 1.4.2 under WinXP.
************************************************/

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;

import org.w3c.dom.*;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.*;

import java.util.*;
import java.io.*;

public class Dom11{

PrintWriter out;//output stream
//Save processing instruction nodes here
static Vector procInstr = new Vector();

public static void main(String argv[]){
if (argv.length != 3){
System.err.println(
"usage: java Dom11 "
+ "xmlFileIn "
+ "xformFileOut "
+ "codeFileOut");
System.exit(0);
}//end if

try{
//Get a factory object for DocumentBuilder
// objects
///
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();

//Configure the factory object. Change
// the following parameter to false for a
// non-validating parser.
///
factory.setValidating(true);
factory.setNamespaceAware(false);
//The following statement causes the parser
// to ignore cosmetic whitespace between
// elements.
///
factory.
setIgnoringElementContentWhitespace(true);

//Get a DocumentBuilder (parser) object
///
DocumentBuilder builder =
factory.newDocumentBuilder();

//Parse the XML input file to create a
// Document object that represents the
// input XML file.
///
Document document = builder.parse(
new File(argv[0]));

//Instantiate an object of this class
///
Dom11 thisObj = new Dom11();

//TRANSFORMATION THROUGH PROGRAM CODE
//Use program code to transform the
// DOM tree into an output file.
//
//Get an output stream for the output
// produced by the program code. This
// stream object is used by several
// methods, so it was instantiated at this
// point and saved as an instance variable
// of the object.
///
thisObj.out = new PrintWriter(
new FileOutputStream(argv[2]));

//Process the DOM tree, beginning with the
// Document node to produce the output.
// Invocation of processDocumentNode starts
// a recursive process that processes the
// entire DOM tree.
///
thisObj.processDocumentNode(document);


//XSLT TRANSFORMATION
//Use XSLT to transform the DOM tree into
// an output file. Note that the success
// of this method call depends on the
// stylesheet processing instruction having
// been saved while the transformation was
// being performed using program code
// above. Otherwise, it would be necessary
// to include the code in this method to
// search the DOM tree for the stylesheet
// processing instruction. All processing
// instructions are saved in a Vector
// object, which is passed as the third
// parameter to this method.
///
thisObj.doXslTransform(
document,argv[1],procInstr);

}catch(Exception e){
//Note that no effort was made to provide
// meaningful results in the event of an
// exception or error.
///
e.printStackTrace(System.err);
}//end catch
}// end main()
//-------------------------------------------//

//This method is used to produce any text
// required in the output at the document
// level, such as the XML declaration for an
// XML document.
///
void processDocumentNode(Node node){
//Write one line of text into the output.
///
out.println("<?xml version=\"1.0\" "
+ "encoding=\"UTF-8\"?>");

//Go process the root (document) node. This
// method call triggers a recursive process
// that processes the entire DOM tree.
///
processNode(node);

out.flush();
}//end processDocumentNode
//-------------------------------------------//

//There are seven kinds of nodes:
// root or document
// element
// attribute
// text
// comment
// processing instruction
// namespace
//
//This method handles the first six.
// Apparently it is not possible to handle
// namespace nodes in Java because there is
// no constant in the Node class to identify
// namespace nodes
///
void processNode(Node node){

try{
if (node == null){
System.err.println(
"Nothing to do, node is null");
return;
}//end if

//Process the incoming node based on its
// type.
///
int type = node.getNodeType();

//To define an overriding template rule,
// insert the matching condition in the
// conditional clause of the if statement,
// and provide code to implement the rule
// in the body of the if statement. If the
// conditional clause evaluates to true,
// the default rule for that element type
// will not be processed.
///
switch (type){
case Node.TEXT_NODE:{
if(false){
//Change conditional and write
// overriding handler here
///
}else{//invoke default behavior
out.print(defTextOrAttrTemp(node));
}//end else
break;
}//end case Node.TEXT_NODE

case Node.ATTRIBUTE_NODE:{
if(false){
//Change conditional and write
// overriding handler here
///
}else{//invoke default behavior
out.print(defTextOrAttrTemp(node));
}//end else
break;
}//end case Node.ATTRIBUTE_NODE

case Node.ELEMENT_NODE:{
if(false){
//Change conditional and write
// overriding handler here
///
}else{//invoke default behavior
defElOrRtNodeTemp(node);
}//end else
break;
}//end case ELEMENT_NODE

case Node.DOCUMENT_NODE:{
if(false){
//Change conditional and write
// overriding handler here
///
}else{//invoke default behavior
defElOrRtNodeTemp(node);
}//end else
break;
}//end case DOCUMENT_NODE

case Node.COMMENT_NODE:{
if(false){
//Change conditional and write
// overriding handler here
///
}else{//invoke default behavior
defComOrProcInstrTemp(node);
}//end else
break;
}//end case COMMENT_NODE

case Node.PROCESSING_INSTRUCTION_NODE:{
if(false){
//Change conditional and write
// overriding handler here
///
//Save proc instr for later use
procInstr.add(node);
}else{//invoke default behavior
//First save proc instr for later
// use.
///
procInstr.add(node);
//Now invoke default behavior.
///
defComOrProcInstrTemp(node);
}//end else
break;
}//end case PROCESSING_INSTRUCTION_NODE

default:{
//Ignore all other node types.
}//end default

}//end switch

}catch(Exception e){
e.printStackTrace(System.err);
}//end catch
}//end processNode(Node)
//-------------------------------------------//

//This method emulates the following default
// template rule:
// <xsl:template match="text()|@*">
// <xsl:value-of select="."/>
// </xsl:template>
///
String defTextOrAttrTemp(Node node)
throws Exception{
int nodeType = node.getNodeType();
if((nodeType == Node.ATTRIBUTE_NODE)
|| (nodeType == Node.TEXT_NODE)){
//Get and return the value of the context
// node.
///
return valueOf(node,".");
}else{
throw new Exception(
"Bad call to defTextOrAttrTemp method");
}//end else
}//end defTextOrAttrTemp
//-------------------------------------------//

//This method emulates the following default
// template rule:
// <xsl:template match="*|/">
// <xsl:apply-templates/>
// </xsl:template>
///
void defElOrRtNodeTemp(Node node)
throws Exception{
int nodeType = node.getNodeType();
if((nodeType == Node.ELEMENT_NODE) ||
(nodeType == Node.DOCUMENT_NODE)){
//Note that the following is a recursive
// method call.
///
applyTemplates(node,null);
}else{
throw new Exception(
"Bad call to defElOrRtNodeTemp");
}//end else
}//end defElOrRtNodeTemp
//-------------------------------------------//

//This method emulates the following default
// template rule:
// <xsl:template
// match="processing-instruction()|comment()"/>
///
String defComOrProcInstrTemp(Node node)
throws Exception{
int nodeType = node.getNodeType();
if((nodeType == Node.COMMENT_NODE) ||
(nodeType ==
Node.PROCESSING_INSTRUCTION_NODE)){
//According to Nutshell pg 148, the
// default rule for comments and processing
// instructions doesn't output anything
// into the result tree.
///
return "";//empty string
}else{
throw new Exception("Bad call to " +
"defComOrProcInstrTemp");
}//end else
}//end defComOrProcInstrTemp
//-------------------------------------------//

//See Nutshell, pg 148 for an explanation as to
// why it is not possible to write a Java
// method that emulates the default namespace
// template.
///
void defaultNamespaceTemplate(Node node)
throws Exception{
throw new Exception("See Nutshell pg 148 " +
"regarding default behavior for " +
"namespace template.");
}//end defaultNamespaceTemplate
//-------------------------------------------//

//Simulates an XSLT apply-templates rule.
// <xsl:apply-templates
// optional select = "..."
// optional mode = "..."
// >
//Note that the mode attribute is not supported
// in this version.
//If the select parameter is null, all child
// nodes are processed.
void applyTemplates(Node node,String select){
NodeList children = node.getChildNodes();
if (children != null){
int len = children.getLength();
//Iterate on NodeList of child nodes.
for (int i = 0; i < len; i++){
if((select == null) ||
(select.equals(children.item(i).
getNodeName()))){
//Note that the following is a
// recursive method call.
///
processNode(children.item(i));
}//end if
}//end for loop
}//end if children != null

}//end applyTemplates
//-------------------------------------------//

//This method simulates an XSLT
// <xsl:value-of select="???"/>
// The general form of the method call is
// valueOf(Node theNode,String select)
//
//The method recognizes three forms of call:
// valueOf(Node theNode,String "@attrName")
// valueOf(Node theNode,String ".")
// valueOf(Node theNode,String "nodeName")
//
//In the first form, the method returns the
// text value of the named attribute of
// theNode. An attribute is specified by a
// select value that begins with @. If the
// attribute doesn't exist, the method returns
// an empty string.
//
//In the second form, the method returns the
// concatenated text values of descendants of
// the context node.
//
//In the third form, the method returns the
// concatenated text values of all descendants
// of a specified child node of the context
// node. If the context node has more than one
// child node with the specified name, only the
// first one found is processed. The others
// are ignored.
//
//The method does not support the following,
// which are standard features of xsl:value-of:
// disable-output-escaping
// processing instruction nodes
// comment nodes
// namespace nodes
///

public String valueOf(Node node,String select){

if(select != null
&& select.charAt(0) == '@'){
//This is a request for the value of an
// attribute. Returns empty string if the
// attribute doesn't exist on the element.
String attrName = select.substring(1);
NamedNodeMap attrList =
node.getAttributes();
Node attrNode = attrList.getNamedItem(
attrName);
if(attrNode != null){
return attrNode.getNodeValue();
}else{
return "";//empty string
}//end else
}//end if on @

else if(select != null
&& select.equals(".")){
//This is a request to process the context
// node
int nodeType = node.getNodeType();
if(nodeType == Node.ELEMENT_NODE){
//Process the context node as an element
// node. Return the concatenated text
// values of all descendants of the
// context node.
NodeList childNodes =
node.getChildNodes();
int listLen = childNodes.getLength();
String nodeTextValue = "";//result

for(int j = 0; j < listLen; j++){
nodeTextValue +=
valueOf(childNodes.item(j),".");
}//end for loop
return nodeTextValue;
}else if(nodeType == Node.TEXT_NODE){
//Process the context node as a text
// node. Simply get and return its
// value.
return node.getNodeValue();
}else{
//ignore all other context node types
}//end else
}//end if for context node

else if(select != null){
//Process a child node whose name is
// specified by the value of the incoming
// parameter named select. Get and return
// the concatenated text values of all
// descendants of the specified child node.
//This process assumes that there is only
// one child node with the specified name
// and processes the first one that it
// finds.
NodeList children = node.getChildNodes();
int len = children.getLength();
for (int i = 0; i < len; i++){
//Trap the specified child node
if(children.item(i).getNodeName().
equals(select)){
//Make a recursive call and let
// existing code do the work.
return valueOf(children.item(i),".");
//The above return statement causes any
// additional child nodes having the
// same name to be ignored.
}//end if getNodeName == select
}//end for loop on all child nodes
}//end else if(select != null)
//Will reach here if select is null, if the
// context node type is not supported, or if
// no matching child node was found.
///
return "";//empty string
}//end method valueOf
//-------------------------------------------//

//This method uses an incoming XSLT stylesheet
// file to transform an incoming Document
// object into an output file. Note that the
// successful invocation of this method depends
// on the processing instruction containing the
// stylesheet having been saved in a Vector
// object that is received as an incoming
// parameter. Otherwise, this method would
// have to search the DOM for the stylesheet
// processing instruction.
///
void doXslTransform(Document document,
String outFile,
Vector procInstr)
throws Exception{
try{
//Get stylesheet ID from proc instr.
ProcessingInstruction pi = null;
boolean piFlag = false;
int size = procInstr.size();
//Search for a stylesheet in the Vector
// containing processing instruction nodes.
///
for(int i = 0; i < size; i++){
pi = (ProcessingInstruction)procInstr.
get(i);
if(pi.getTarget().startsWith(
"xml-stylesheet") && pi.getData().
startsWith("type=\"text/xsl\"")){
//Looks like a good stylesheet.
///
piFlag = true;
break;
}//end if
}//end for loop
if(piFlag == false){//still false?
throw new Exception(
"No valid stylesheet");
}//end if
//Get the stylesheet file reference
///
String xslFile = pi.getData().
substring(pi.getData().indexOf(
"href=")+6);
//Eliminate the quotation mark at the end
///
xslFile = xslFile.substring(
0,xslFile.length()-1);

//Get a TransformerFactory object
///
TransformerFactory xformFactory =
TransformerFactory.newInstance();
//Get an XSL Transformer object based on
// the XSL file discovered above.
///
Transformer transformer =
xformFactory.newTransformer(
new StreamSource(
new File(xslFile)));
//Get a DOMSource object that represents
// the DOM tree.
///
DOMSource source = new DOMSource(document);

//Get an output stream for the output
// file.
///
PrintWriter xformStream = new PrintWriter(
new FileOutputStream(outFile));

//Get a StreamResult object that points to
// the output file. Then transform the DOM
// sending text to the output file.
///
StreamResult xformResult =
new StreamResult(xformStream);

//Do the transform
///
transformer.transform(source,xformResult);
}catch(Exception e){
e.printStackTrace(System.err);
}//end catch

}//end doXslTransform

}// class Dom11

Listing 31

<?xml version="1.0"?>

<!DOCTYPE top [
<!ELEMENT top (theData)*>
<!ELEMENT theData (title,author,price)*>
<!ELEMENT title (#PCDATA | subtitle)*>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT subtitle (#PCDATA)>
<!ATTLIST theData attr CDATA #IMPLIED>
<!ATTLIST subtitle position CDATA #IMPLIED>
]>

<!-- File Dom11a.xml
Copyright 2003 R. G. Baldwin
Illustrates built-in template rules.
Same as Dom11.xml except for stylesheet
specification.
-->

<!--Two of the following proc instr were included
to test the ability of the program to find the
actual stylesheet proc instr.-->
<?dummy-target dummy-data="def"?>
<?xml-stylesheet
type="text/xsl" href="Dom11a.xsl"?>
<?false-target false-data="ghi"?>

<top>

<theData attr="Dummy Attr Value">
<title>Java
<subtitle position="Low">really</subtitle>rules
</title>
<author>R.Baldwin</author>
<price>$9.95</price>
</theData>

<theData>
<title>Python</title>
<author>R.Baldwin</author>
<price>$15.42</price>
</theData>

<theData>
<title>XML</title>
<author>R.Baldwin</author>
<price>$19.60</price>
</theData>

</top>

Listing 32

<?xml version='1.0'?>
<!-- File Dom11a.xsl
Copyright 2003 R. G. Baldwin
Illustrates extraction of text from an XML file.

This version accepts a built-in template rule
that guarantees that the root and all child nodes
are processed.

It also accepts the built-in template rule
that copies the value of text and attribute nodes
into the output. However, attribute nodes are
not copied. See Nutshell page 147 for the
reason.

As a result, the stylesheet is completely
empty.
-->
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

</xsl:stylesheet>

Listing 33


Copyright 2004, Richard G. Baldwin.  Reproduction in whole or in
part in any form or medium without express written permission from
Richard Baldwin is prohibited.

About the author

Richard Baldwin
is a college professor (at Austin Community College in Austin, TX) and
private consultant whose primary focus is a combination of Java, C#,
and XML. In addition to the many platform and/or language independent
benefits of Java and C# applications, he believes that a combination of
Java, C#, and XML will become the primary driving force in the delivery
of structured information on the Web.

Richard has participated in numerous consulting projects, and he
frequently provides onsite training at the high-tech companies located
in and around Austin, Texas.  He is the author of Baldwin’s
Programming Tutorials, which
has gained a worldwide following among experienced and aspiring
programmers. He has also published articles in JavaPro magazine.

Richard holds an MSEE degree from Southern Methodist University
and has many years of experience in the application of computer
technology to real-world problems.

Baldwin@DickBaldwin.com

-end-
 
