Java JAXP, Transforming XML to XHTML
By Richard G. Baldwin

Java Programming Notes # 2210


Preface

In the previous lesson entitled Java JAXP, Writing Java Code to Emulate an XSLT Transformation, I showed you how to write a Java program that mimics an XSLT transformation for converting an XML file into a text file.  I also showed that once you have a library of Java methods that emulate XSLT elements, it is no more difficult to write a Java program to transform an XML document than it is to write an XSL stylesheet to transform the same document.

In this lesson, I will show you how to use XSLT to transform an XML document into an XHTML document.  I will also show you how to write Java code that performs the same transformation.

This lesson is one in a series designed to teach you how to use JAXP and Sun's Java Web Services Developer Pack (JWSDP).

The first lesson in the series was entitled Java API for XML Processing (JAXP), Getting Started.  As mentioned above, the previous lesson was entitled Java JAXP, Writing Java Code to Emulate an XSLT Transformation.

JAXP, XML, XSL, XSLT, W3C, and XHTML, a Review

JAXP is an API designed to help you write programs for creating and processing XML documents. It is a critical part of Sun's Java Web Services Developer Pack (JWSDP).

XML is an acronym for the eXtensible Markup Language.  I will assume that you already understand XML, and will teach you how to use JAXP to write programs for creating and processing XML documents.

XSL is an acronym for Extensible Stylesheet Language.  XSLT is an acronym for XSL Transformations.

The numerous uses of XSLT include the following:
  • Transforming non-XML documents into XML documents.
  • Transforming XML documents into other XML documents.
  • Transforming XML documents into non-XML documents.
This lesson explains a Java program that transforms an XML document into an XHTML document.

An XHTML document is an XML document that provides a rigorous alternative to the use of an HTML document.  According to the W3C, XHTML 1.0 is a "Reformulation of HTML 4 in XML 1.0."

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different listings and figures while you are reading about them.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java and XML tutorials.  You will find those lessons published at Gamelan.com.  As of the date of this writing, Gamelan doesn't maintain a consolidated index of my tutorial lessons, and sometimes they are difficult to locate there.  You will find a consolidated index at www.DickBaldwin.com.

Preview

A tree structure in memory

A DOM parser can be used to create a tree structure in memory that represents an XML document.  In Java, that tree structure is encapsulated in an object of the interface type Document.

Many operations are possible

Given an object of type Document (often called a DOM tree), there are many methods that can be invoked on the object to perform a variety of operations.
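As a minimal sketch of the idea (not code taken from this lesson's listings), the following program uses the standard JAXP DocumentBuilderFactory and DocumentBuilder classes to parse a small XML string into a Document object.  The class name DomParseSketch and the sample XML text are assumptions made for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name; illustrates building a DOM tree with JAXP.
class DomParseSketch {
    // Parses XML text into a DOM tree. Checked exceptions are wrapped
    // so that callers do not need a throws clause.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<A><Q>A Big Header</Q><B><C>Text block 1.</C></B></A>");
        // The root element is a child of the document node.
        System.out.println(doc.getDocumentElement().getNodeName());
        // Number of child nodes of the root element.
        System.out.println(
            doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

To parse a file instead of a string, the parse method of DocumentBuilder also accepts a java.io.File object.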

Two ways to transform an XML document

There are at least two ways to transform the contents of an XML document into another document:

  • By writing Java code to manipulate the DOM tree and perform the transformation.
  • By using XSLT to perform the transformation.

A skeleton library of Java methods

This is one of several lessons that show you how to write the skeleton of a Java library containing methods that emulate the most common XSLT elements.  Once you have the library, writing Java code to transform XML documents consists mainly of writing a short driver program to access those methods.  Given the proper library of methods, it is no more difficult to write a Java program to perform the transformation than it is to write an XSLT stylesheet.

Library is not my primary purpose

However, my primary purpose in these lessons is not to provide such a library, but rather to help you understand how to use a DOM tree to create, modify, and manipulate XML documents.  By comparing Java code that manipulates a DOM tree with similar XSLT operations, you will have an opportunity to learn a little about XSLT in the process of learning how to manipulate a DOM tree using Java code.

Some Details Regarding XHTML

XHTML documents, a special case

An XHTML document is an XML document.  It is a rigorous alternative to an HTML document. 

One of the interesting uses of XSLT is the transformation of XML documents into XHTML documents.  This makes it possible to render the information contained in an XML document using an XHTML-compatible Web browser.

Where does the transformation take place?

When transforming an XML document for rendering with an XHTML browser, the transformation can take place anywhere between the source of the XML document and the browser.

Transforming on the server

For example, a transformation program can be written in Java and run on a web server as a servlet, or it can be written as a JavaBeans component and accessed from a scriptlet in JavaServer Pages (JSP).

Transforming at the browser

The transformation can also be performed by the browser.  For example, Microsoft IE 6.0 and XSLT can be used for this purpose.

Will transform XML into XHTML

This and the next several lessons will illustrate parallel Java code and XSLT transformations to transform XML documents into XHTML documents.  The sample programs will illustrate various aspects of the manipulation of a DOM tree using Java code.

Requirements for XHTML documents

According to Web Design & Development Using XHTML by Griffin, Morales, and Finnegan, an XHTML document differs from an HTML document in the following ways:

  • XHTML documents must be well-formed.
  • Element and attribute names must be in lower case.
  • Non-empty elements require end tags.
  • Attribute values must always be quoted.
  • XHTML documents have no attribute minimization.
  • XHTML documents terminate empty elements (for example, <br />).
  • XHTML documents use elements with id and name attributes.
  • XHTML documents use Document Type Declarations.
  • XHTML documents use XML namespaces.
Although it is not a requirement, an XHTML document often has an XML declaration at the beginning to identify the document as an XML document.
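Several of the requirements in the list above (end tags, quoted attribute values, no attribute minimization, terminated empty elements) follow directly from the rule that the document must be well-formed XML.  The following sketch, which assumes a hypothetical class named WellFormedCheck, uses the JAXP parser to test fragments of markup for well-formedness.  It does not check the remaining requirements, such as lower-case names or the presence of a Document Type Declaration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; checks well-formedness only, not validity.
class WellFormedCheck {
    // Returns true if the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the markup
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // HTML style: empty element not terminated.
        System.out.println(isWellFormed("<p>one<br>two</p>"));
        // XHTML style: empty element terminated.
        System.out.println(isWellFormed("<p>one<br />two</p>"));
        // Unquoted attribute value.
        System.out.println(isWellFormed("<td width=330>x</td>"));
        // Attribute minimization.
        System.out.println(isWellFormed("<input checked />"));
    }
}
```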

Some Details Regarding XSLT

Previous lessons in this series have provided quite a bit of detailed information regarding the operation of XSLT.  Therefore, this discussion will be brief.

Assume that an XML document has been parsed to produce a DOM tree in memory that represents the XML document.

Execute template rules

An XSLT processor starts examining the DOM tree at its root node.  It obtains instructions from the XSLT stylesheet telling it how to navigate the tree, and how to treat each node that it encounters along the way.

As each node is encountered, the processor searches the stylesheet looking for a template rule that governs how to treat nodes of that type.  If the processor finds a template rule that matches the node type, it performs the operations indicated by the template rule.  Otherwise, it executes a built-in template rule appropriate to that node.

Literal text in template rules

If the template rule being applied contains literal text, that literal text is used to create text in the output.
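The search for a matching template rule, with a fallback to built-in behavior, can be sketched in Java as shown below.  This is only an illustration under simplifying assumptions and is not the library developed in this series: the class name TemplateDispatchSketch, the RULES table, and the method names are hypothetical, a "rule" is reduced to an XHTML tag that wraps the result, and text is copied to the output without escaping markup characters.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of template-rule lookup with built-in fallback.
class TemplateDispatchSketch {
    // Stand-in for template rules: element name -> XHTML tag.
    static final Map<String, String> RULES = new HashMap<>();
    static {
        RULES.put("Q", "h1");
        RULES.put("C", "p");
    }

    static void processNode(Node node, StringBuilder out) {
        short type = node.getNodeType();
        String tag = (type == Node.ELEMENT_NODE)
            ? RULES.get(node.getNodeName()) : null;
        if (tag != null) {
            // A matching rule: emit its literal text around the
            // result of applying templates to the children.
            out.append('<').append(tag).append('>');
            applyTemplates(node, out);
            out.append("</").append(tag).append('>');
        } else if (type == Node.TEXT_NODE) {
            // Built-in behavior for text: copy the value (unescaped here).
            out.append(node.getNodeValue());
        } else if (type == Node.ELEMENT_NODE
                || type == Node.DOCUMENT_NODE) {
            // Built-in behavior for the root and for elements.
            applyTemplates(node, out);
        }
        // Comments and processing instructions produce no output.
    }

    // Emulates xsl:apply-templates with no select attribute.
    static void applyTemplates(Node parent, StringBuilder out) {
        for (Node child = parent.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            processNode(child, out);
        }
    }

    static String transform(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            processNode(doc, out);
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
            "<A><Q>A Big Header</Q><B><C>Text block 1.</C></B></A>"));
    }
}
```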

Traversal of the DOM tree

There are at least two XSLT elements that can be used to traverse the children of a context node:
  • xsl:apply-templates
  • xsl:for-each

The xsl:apply-templates element

The xsl:apply-templates element was discussed in detail in previous lessons.

The xsl:for-each element

The xsl:for-each element executes an iterative examination of all child nodes of the context node that match a required select attribute.  As each child node is examined, it is processed using XSLT elements that form the content of the xsl:for-each element in the template rule.

This lesson will include examples that use the xsl:for-each element in addition to the xsl:apply-templates element.  The lesson will also explain a Java method that emulates the xsl:for-each element.
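To give a rough idea of what such an emulation might look like, here is a minimal sketch.  It is not the method from this lesson's Dom03 program.  The class name ForEachSketch and the methods forEach and renderList are hypothetical, and the select argument is assumed to be a simple child element name rather than a general XPath expression.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of an xsl:for-each emulation.
class ForEachSketch {
    // Visits, in document order, each child element of the context
    // node whose name equals select, and runs body on it.
    static void forEach(Node context, String select,
                        Consumer<Element> body) {
        for (Node child = context.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && child.getNodeName().equals(select)) {
                body.accept((Element) child);
            }
        }
    }

    // Builds an XHTML list from the selected children of the root
    // element. Text is copied without escaping (a simplification).
    static String renderList(String xml, String select, String listTag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            out.append('<').append(listTag).append('>');
            forEach(doc.getDocumentElement(), select, item ->
                out.append("<li>")
                   .append(item.getTextContent())
                   .append("</li>"));
            out.append("</").append(listTag).append('>');
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<D><E>e1</E><F>f1</F><E>e2</E><F>f2</F></D>";
        System.out.println(renderList(xml, "E", "ul"));
        System.out.println(renderList(xml, "F", "ol"));
    }
}
```

Passing the loop body as a Consumer mirrors the way the content of an xsl:for-each element is applied to each selected node in turn.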

Enough talk, let's see some code

I will begin by discussing the XML file named Dom03.xml (shown in Listing 24 near the end of the lesson) along with the XSL stylesheet file named Dom03.xsl (shown in Listing 25).

A Java program named Dom03

After explaining the transformation produced by applying this stylesheet to this XML document, I will explain the transformation produced by processing the XML file with a Java program named Dom03 (shown in Listing 23) that mimics the behavior of the XSLT transformation.

Discussion and Sample Code

The XML file named Dom03.xml

The XML file shown in Listing 24 is relatively straightforward.  A tree view of the XML file is shown in Figure 1.  (This XML file is both well-formed and valid.)

#document DOCUMENT_NODE
A DOCUMENT_TYPE_NODE
#comment COMMENT_NODE
xml-stylesheet PROCESSING_INSTRUCTION_NODE
A ELEMENT_NODE
Q ELEMENT_NODE
#text A Big Header
B ELEMENT_NODE
C ELEMENT_NODE
#text Text block 1.
R ELEMENT_NODE
#text A Mid Header
C ELEMENT_NODE
#text Text block 2.
#comment COMMENT_NODE
processor PROCESSING_INSTRUCTION_NODE
S ELEMENT_NODE
#text A Small Header
B ELEMENT_NODE
C ELEMENT_NODE
#text Text block 3.
S ELEMENT_NODE
#text Another Small Header
B ELEMENT_NODE
C ELEMENT_NODE
#text Text block 4.
T ELEMENT_NODE
#text A Smallest Header
B ELEMENT_NODE
C ELEMENT_NODE
#text Text block 5.
D ELEMENT_NODE
E ELEMENT_NODE
#text First list item in E
G ELEMENT_NODE
#text Nested G text element
F ELEMENT_NODE
#text First list item in F
E ELEMENT_NODE
#text Second list item in E
F ELEMENT_NODE
#text Second list item in F
E ELEMENT_NODE
#text Third list item in E
F ELEMENT_NODE
#text Third list item in F
C ELEMENT_NODE
#text Text block 6.
C ELEMENT_NODE
#text Text block 7.
R ELEMENT_NODE
#text Another Mid Header
C ELEMENT_NODE
#text Text block 8.
B ELEMENT_NODE
R ELEMENT_NODE
#text Another Mid Header in Another B
C ELEMENT_NODE
#text Text block 9.
Figure 1

(This tree view of the XML file was produced using a program named DomTree02, which was discussed in an earlier lesson.

Note that in order to make the tree view more meaningful, I manually removed extraneous line breaks and text nodes associated with those line breaks.  The extraneous line breaks in Figure 1 were caused by extraneous line breaks in the XML file.  The extraneous line breaks in the XML file were placed there for cosmetic reasons and to force it to fit into this narrow publication format.)


Content of the XML document

The structure and content of the XML document were primarily designed to illustrate various transformation concepts that I intend to explain in this lesson.  However, to some extent, I designed the structure and content keeping in mind the ultimate rendering of the XHTML file that will be produced by transforming the XML file into an XHTML file.

The rendered XHTML file

At this point, I'm going to jump ahead and show you what the final XHTML file looks like when rendered using Netscape Navigator v7.1.  The rendering of the XHTML file is shown in Figure 2. 

(You may find it useful to compare the rendering in Figure 2 with the XML file structure and content in Figure 1.  You should be able to identify text nodes in Figure 1 that match up with rendered text in Figure 2.)

Figure 2 Rendered XHTML file

The XSLT Transformation

The XSL stylesheet file named Dom03.xsl

Recall that an XSL stylesheet is itself an XML file, and can therefore be represented as a tree.  Figure 3 presents an abbreviated tree view of the stylesheet shown in Listing 25.  I colored each of the template rules in this view with alternating colors of red and blue to make them easier to identify.

(As is often the case with XSL stylesheets, this stylesheet file is well-formed but it is not valid.)

NOTE:  IT WAS NECESSARY TO MANUALLY ENTER SOME
LINE BREAKS IN THIS PRESENTATION TO FORCE IT TO
FIT INTO THIS NARROW PUBLICATION FORMAT.

#document DOCUMENT_NODE
  xsl:stylesheet ELEMENT_NODE
    Attribute: version=1.0
    Attribute: xmlns:xsl=http://www.w3.org/1999
      /XSL/Transform
    xsl:output ELEMENT_NODE
      Attribute: method=xml
      Attribute: doctype-public=-//W3C//DTD
        XHTML 1.0 Transitional//EN
      Attribute: doctype-system=http://www.w3.
        org/TR/xhtml1/DTD/xhtml1-transitional.dtd

    xsl:template ELEMENT_NODE
      Attribute: match=/
      html ELEMENT_NODE
        head ELEMENT_NODE
          meta ELEMENT_NODE
            Attribute: http-equiv=content-type
            Attribute: content=text/html;
              charset=UTF-8
          title ELEMENT_NODE
            #text Generated XHTML file
        body ELEMENT_NODE
          table ELEMENT_NODE
            Attribute: border=2
            Attribute: cellspacing=0
            Attribute: cellpadding=0
            Attribute: width=330
            Attribute: bgcolor=#FFFF00
            tr ELEMENT_NODE
              td ELEMENT_NODE
                xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=B
      xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=C
      p ELEMENT_NODE
        xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=D
      #text List of items in E

      ul ELEMENT_NODE
        xsl:for-each ELEMENT_NODE
          Attribute: select=E
          li ELEMENT_NODE
            xsl:apply-templates ELEMENT_NODE
      #text List of items in F
      ol ELEMENT_NODE
        xsl:for-each ELEMENT_NODE
          Attribute: select=F
          li ELEMENT_NODE
            xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=G
      b ELEMENT_NODE
        xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=Q
      h1 ELEMENT_NODE
        xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=R
      h2 ELEMENT_NODE
        xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=S
      h3 ELEMENT_NODE
        xsl:apply-templates ELEMENT_NODE

    xsl:template ELEMENT_NODE
      Attribute: match=T
      h4 ELEMENT_NODE
        xsl:apply-templates ELEMENT_NODE
Figure 3

Why abbreviated?

I refer to this as an abbreviated tree view because I manually deleted comment nodes and extraneous text nodes in order to emphasize the important elements in the stylesheet.

(Extraneous text nodes occur as a result of inserting line breaks in the original XSL document for cosmetic purposes.

Note that I also manually entered several line breaks near the beginning to force the material to fit into this narrow publication format.)


The root element

The root node of all XML documents is the document node.  In addition to the root node, there is also a root element, and it is important not to confuse the two.

As you can see from Figure 3, the root element in the XSL document is of type xsl:stylesheet.  The root element has two attributes, each of which is standard for XSL stylesheets.

(Note that I manually entered a line break in the second attribute of the xsl:stylesheet node to force it to fit into this narrow publication format.  I also manually entered line breaks into two of the attributes of the xsl:output element node to force them to fit into this narrow publication format.)

The first attribute provides the XSLT version. The second attribute points to the XSLT namespace URI, which you can read about in the W3C Recommendation.

Children of the root element node

The root element node (xsl:stylesheet) in Figure 3 has ten child nodes, nine of which are template rules.  (The green child node, xsl:output, is not a template rule.  I will discuss it in detail later.)  I colored the template rules in alternating colors of red and blue to make them easier to identify visually.

The template rules

Each of the nine template rules has a match pattern.  The nine match patterns in the order that they appear in Figure 3 are as follows:
  1. match=/ (root node)
  2. match=B (matches element node named B)
  3. match=C (matches element node named C)
  4. match=D (matches element node named D)
  5. match=G (matches element node named G)
  6. match=Q (matches element node named Q)
  7. match=R (matches element node named R)
  8. match=S (matches element node named S)
  9. match=T (matches element node named T)
I will discuss each of the nine template rules later, but before doing that I will show you the raw XHTML output produced by this XSLT transformation.

(Note that the Java program discussed later produces essentially the same output as the XSLT transformation.)

The output from the transformation


The result of performing an XSLT transformation (by applying the XSL stylesheet shown in Listing 25 to the XML file shown in Listing 24) is shown in Figure 4.  This is the raw XHTML code that was rendered in Figure 2.

I will explain the operations in the XSLT transformation that produced most of the text in Figure 4.

NOTE THAT IT WAS NECESSARY FOR ME TO MANUALLY
INSERT LINE BREAKS IN SEVERAL OF THE LONG LINES
IN THIS MATERIAL TO FORCE IT TO FIT INTO THIS
NARROW PUBLICATION FORMAT. I ALSO MANUALLY
INSERTED LINE BREAKS AT CRITICAL POINTS TO
MAKE IT EASIER TO INTERPRET THE MATERIAL
VISUALLY.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
Transitional//EN" "http://www.w3.org/TR/
xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="content-type"
content="text/html; charset=UTF-8"/>
<title>Generated XHTML file</title>
</head>
<body>
<table border="2" cellspacing="0" cellpadding="0"
width="330" bgcolor="#FFFF00"><tr><td>
<h1>
A Big Header
</h1>
<p>
Text block 1.
</p>
<h2>
A Mid Header
</h2>
<p>
Text block 2.
</p>
<h3>
A Small Header
</h3>
<p>
Text block 3.
</p>
<h3>
Another Small Header
</h3>
<p>
Text block 4.
</p>
<h4>
A Smallest Header
</h4>
<p>
Text block 5.
</p>
List of items in E
<ul>
<li>
First list item in E
<b>
Nested G text element
</b>
</li>
<li>
Second list item in E
</li>
<li>
Third list item in E
</li>
</ul>
List of items in F
<ol>
<li>
First list item in F
</li>
<li>
Second list item in F
</li>
<li>
Third list item in F
</li>
</ol>
<p>
Text block 6.
</p>
<p>
Text block 7.
</p>
<h2>
Another Mid Header
</h2>
<p>
Text block 8.
</p>
<h2>
Another Mid Header in Another B
</h2>
<p>
Text block 9.
</p>
</td></tr></table>
</body></html>

Figure 4

(Note that I manually deleted a couple of extraneous line breaks from the output shown in Figure 4.  It was also necessary for me to manually insert line breaks in several of the long lines to force the material to fit in this narrow publication format.  I also manually inserted line breaks at certain critical points to make it easier to interpret the material visually.)
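A transformation of this kind can be run from Java using the JAXP TransformerFactory and Transformer classes.  The following sketch is not the program used to produce Figure 4.  The class name XsltRunSketch is hypothetical, and to keep the example self-contained it uses a very small XML document and stylesheet held in strings rather than the files Dom03.xml and Dom03.xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: apply an XSL stylesheet to an XML document.
class XsltRunSketch {
    // A reduced stand-in for the XML document in this lesson.
    static final String SAMPLE_XML =
        "<A><Q>A Big Header</Q><B><C>Text block 1.</C></B></A>";

    // A reduced stand-in for the stylesheet: three template rules.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml'/>"
        + "<xsl:template match='/'><html><body>"
        + "<xsl:apply-templates/></body></html></xsl:template>"
        + "<xsl:template match='Q'><h1>"
        + "<xsl:apply-templates/></h1></xsl:template>"
        + "<xsl:template match='C'><p>"
        + "<xsl:apply-templates/></p></xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the
    // serialized result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSL));
    }
}
```

To work with files, each StringReader-based StreamSource can be replaced by a StreamSource constructed from a java.io.File object.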

Can sometimes get confusing

I will caution you up front that this discussion can become confusing but I will do everything that I can to minimize the confusion.  The problem is that the discussion will be mixing tags, attributes and elements from the XML file with tags, attributes, and elements from the stylesheet file and the XHTML file.  With so many tags, attributes, and elements being discussed, it is sometimes difficult to keep them separated in your mind.

In particular, in order to cause the output to be a valid XHTML document, it is necessary to manually insert XHTML tags, attributes, and elements in the XSL template rules, which themselves involve XML tags, attributes, and elements.

I will make heavy use of color in an attempt to minimize the confusion.

The first line of text

The first line of text in the output shown in Figure 4 is an XML declaration that is produced automatically by the XSLT transformer available with JAXP.  As I mentioned earlier, such a declaration is not required, but is highly recommended by most authors.

The xsl:output element

Before getting into the template rules in Figure 3, I need to explain the xsl:output element shown in green in Figure 3 and reproduced in Figure 5 below for convenient viewing.


xsl:output ELEMENT_NODE
  Attribute: method=xml
  Attribute: doctype-public=-//W3C//DTD
    XHTML 1.0 Transitional//EN
  Attribute: doctype-system=http://www.w3.
    org/TR/xhtml1/DTD/xhtml1-transitional.dtd
Figure 5

The XSL stylesheet version

Listing 1 shows the XSL code that corresponds to the tree view of the xsl:output element shown in Figure 5.

<xsl:output method="xml" 
doctype-public="-//W3C//DTD
XHTML 1.0 Transitional//EN"
doctype-system="http://www.w3.
org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />

Listing 1

(As on several previous occasions, I need to remind you that it was necessary for me to manually insert line breaks in Listing 1 to cause the material to fit in this narrow publication format.)

Literal text passes through to the output


As you learned in the previous lesson, any literal text that you include in your XSL stylesheet will be passed through to the output.  As you will see later, I will cause the output to contain much of the required XHTML text simply by including that XHTML text as literal text in the stylesheet.

The stylesheet is an XML document

It is important to remember, however, that the XSL stylesheet is itself an XML document, and you cannot include any literal text that would cause a parser to reject it as an XML document.  You also cannot do anything that will cause the XSLT processor to reject it as a stylesheet.

XHTML document requires a specific DTD reference

One of the things that is required in the XHTML output is the DTD reference shown in Figure 6.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 
Transitional//EN" "http://www.w3.org/TR/
xhtml1/DTD/xhtml1-transitional.dtd">

Figure 6

(The material in Figure 6 was extracted from Figure 4 and reproduced here for convenient viewing.  This is one of three alternative DTDs that can be used with an XHTML document.)

Correct DTD for XHTML but not for stylesheet


The DTD reference in Figure 6 is a correct DTD reference for an XHTML document, but it is not a correct DTD reference for an XSL stylesheet.  (In fact, stylesheets don't require a DTD and often don't have one.)

If you simply include the text from Figure 6 as literal text in the stylesheet, (in hopes that it will pass through to the output), the XSLT processor will interpret it as a DTD reference for the stylesheet, and will attempt to validate the stylesheet against that reference.  The stylesheet will then be declared invalid and the transformation effort will fail.

Therefore, you must find a way to cause this DTD reference to end up in the XHTML document without confusing the XSLT transformation process.

Two ways to accomplish that

I know of two ways to accomplish that objective.  One way is to include the text from Figure 6 in a CDATA section in the stylesheet.  This raises some other issues, but it can be made to work.

The easier way is to use the xsl:output element shown in Listing 1 to cause the DTD reference to be written into the output without confusing the parser or the XSLT processor.

The xsl:output element

Here is a partial quotation from XML in a Nutshell, (which I highly recommend), by Elliotte Rusty Harold and W. Scott Means.

"The top-level xsl:output element helps determine the exact formatting of the XML document produced when the result tree is stored in a file, written onto a stream, or otherwise serialized into a sequence of bytes."

Ten optional attributes

To make a long story short, this element has ten optional attributes that are used by the XSLT processor to determine the formatting of the output.  The XSLT element shown in Listing 1 specifies values for three of those optional attributes:
  1. method
  2. doctype-public
  3. doctype-system
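The default value for method is normally xml.  However, the XSLT 1.0 Recommendation specifies that the default becomes html when the first element of the result tree is named html and is not in a namespace, which is the case for the output of this stylesheet.  Therefore, it is safest to specify the attribute explicitly, as I did in Listing 1.  When the value of this attribute is xml, that instructs the processor to produce a well-formed XML document.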

The doctype-public attribute sets the public identifier used in the document type declaration.

The doctype-system attribute sets the system identifier used in the document type declaration.

The required XHTML DTD

There are three allowable DTDs that can be used for an XHTML document:
  • Strict
  • Transitional
  • Frameset
I'm not going to get into the differences between these three DTDs in this lesson.  Suffice it to say that I elected to use the transitional DTD for this example because it is somewhat easier to use than the other two.

The transitional DTD

Here is what the W3C has to say about the DTD for XHTML 1.0 Transitional:

This DTD module is identified by the following PUBLIC and SYSTEM identifiers:

PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
SYSTEM
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"

As you can see, these values match the doctype-public and doctype-system attribute values  in Listing 1, and result in the correct output for the XHTML DTD in Figure 6.
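When a DOM tree is written out from Java code rather than by a stylesheet, the same three settings are available as output properties of a JAXP Transformer, using the constants defined in the OutputKeys class.  The sketch below assumes a hypothetical class named DoctypeSketch and a tiny hand-built DOM tree.  It is meant only to show the correspondence with the attributes in Listing 1.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: write a DOM tree with an XHTML DOCTYPE.
class DoctypeSketch {
    static String serializeWithDoctype() {
        try {
            // Build a minimal tree: <html><body>Hello</body></html>
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            doc.appendChild(html);
            Element body = doc.createElement("body");
            html.appendChild(body);
            body.appendChild(doc.createTextNode("Hello"));

            // An identity transformer copies the tree to the output.
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            // These properties play the role of the method,
            // doctype-public, and doctype-system attributes.
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,
                "-//W3C//DTD XHTML 1.0 Transitional//EN");
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
                "http://www.w3.org/TR/xhtml1/DTD/"
                + "xhtml1-transitional.dtd");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeWithDoctype());
    }
}
```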

The first template rule


The first template rule (extracted from Figure 3 and given a different color scheme) is shown in tree view in Figure 7.  This template rule contains an XPath expression that matches the document root (note the forward slash).


xsl:template ELEMENT_NODE
  Attribute: match=/
  html ELEMENT_NODE
    head ELEMENT_NODE
      meta ELEMENT_NODE
        Attribute: http-equiv=content-type
        Attribute: content=text/html;
          charset=UTF-8
      title ELEMENT_NODE
        #text Generated XHTML file
    body ELEMENT_NODE
      table ELEMENT_NODE
        Attribute: border=2
        Attribute: cellspacing=0
        Attribute: cellpadding=0
        Attribute: width=330
        Attribute: bgcolor=#FFFF00
        tr ELEMENT_NODE
          td ELEMENT_NODE
            xsl:apply-templates ELEMENT_NODE
Figure 7

The template rule in XSL format

Listing 2 shows the same template rule in XSL format, (extracted from Listing 25).

<xsl:template match="/">
  <html>
    <head>
      <meta http-equiv="content-type"
            content="text/html; charset=UTF-8"/>
      <title>Generated XHTML file</title>
    </head>
    <body>
      <table border="2" cellspacing="0"
             cellpadding="0" width="330"
             bgcolor="#FFFF00">
        <tr>
          <td>
            <xsl:apply-templates/>
          </td>
        </tr>
      </table>
    </body>
  </html>
</xsl:template>

Listing 2

(Note that according to most of the books that I have read, the following namespace attribute should be used on the html tag.  However, something about it causes problems with the JAXP transformer so I left it off.  The resulting XHTML file is still valid according to the W3C Markup Validation Service even without the namespace attribute.

xmlns="http://www.w3.org/1999/xhtml"
xml:lang="en" lang="en")

The literal text is shown in red


From my viewpoint as the author of the stylesheet, everything that is colored red in Listing 2 is simply literal text that I want to pass through to the output so that it will become part of the raw XHTML text.

The template rule must be well-formed

However, as you can see from Figure 7, the XML parser considers all of this material to be well-formed (but not valid) XML element nodes, attribute nodes, and text nodes.  Were I to make a change to any of the red literal text that would corrupt the well-formed nature of the XML code in Listing 2, the stylesheet could not be used to control an XSLT transformation.  While a stylesheet is not required to be valid, it is required to be well-formed.

Must be very careful when including markup in stylesheet

Therefore, you must be very careful when you include literal markup text in the stylesheet for whatever purpose.  Any markup that you include in the stylesheet must result in the stylesheet being well-formed.

(This was not a problem with the inclusion of literal text in the stylesheet in the previous lesson, because the literal text didn't contain markup characters.  As a result, the literal text was interpreted simply as text nodes in the stylesheet.  As you can see from Figure 7, however, the literal markup text that was included in this stylesheet was interpreted by the parser as element nodes, attributes and text nodes.)

A very simple template rule

At first blush, this template rule appears to be very long and very complex.  However, as you can see from Listing 2, once you isolate out all of the literal XHTML text that's included in the template rule, the actual XSLT template rule is very simple.  This rule simply passes a lot of literal markup text through to the output and causes templates to be applied to all children of the root (document) node.  (You learned what it means to apply templates in the previous lesson.)

The XHTML tags

If you are familiar with XHTML syntax, you will recognize that the literal text shown in red in Listing 2 begins with typical XHTML tags such as <html>, <head>, and <body>.  These tags are required for an XHTML document.  This text is sent to the output before any processing of the DOM tree is performed.

Then the literal text creates an XHTML table with a yellow background.  The start tags for the table are sent to the output before the xsl:apply-templates element is executed.

All of the output produced by executing the xsl:apply-templates element is inserted into a single data <td> cell in the table.

Finally, when the xsl:apply-templates element returns, the end tags for the table and the end tags for the document are sent to the output.

The raw XHTML output

Figure 8 shows a condensed version of the raw XHTML output.  The XHTML output shown in red in Figure 8 matches the literal text shown in red in the template rule of Listing 2.

NOTE THAT IT WAS NECESSARY FOR ME TO MANUALLY
INSERT LINE BREAKS IN SEVERAL OF THE LONG LINES
IN THIS MATERIAL TO FORCE IT TO FIT INTO THIS
NARROW PUBLICATION FORMAT.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
Transitional//EN" "http://www.w3.org/TR/
xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="content-type"
content="text/html; charset=UTF-8"/>
<title>Generated XHTML file</title>
</head>
<body>
<table border="2" cellspacing="0" cellpadding="0"
width="330" bgcolor="#FFFF00"><tr><td>

...HTML CODE DELETED FOR BREVITY...

</td></tr></table>
</body></html>

Figure 8

The effect of xsl:apply-templates

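Referring once again to Listing 2, we see that this template rule causes templates to be applied to all child nodes of the root or document node.  A root node can have only one element child, which is the root element node, although it can also have comment and processing instruction children (as it does in Figure 1).  The built-in template rules for comments and processing instructions produce no output, so only the root element matters here.  Referring back to Figure 1, we see that the root element node is named A.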

Now referring back to the tree view of the stylesheet in Figure 3 (and also the list of match patterns presented earlier), we see that the stylesheet doesn't contain a template rule that matches an element named A.

Important to understand built-in behavior

If the processor encounters a node for which there is no matching template rule, it executes a built-in template rule for that type of node.  This is where it becomes important to understand the behavior of the built-in template rules, which I explained in the earlier lesson entitled Java JAXP, Implementing Default XSLT Behavior in Java.

The behavior of the built-in template rule for element nodes is to apply templates to all child nodes of the element node.  Therefore, in this case, the processor will apply templates to all child nodes of the root element node named A.

Referring back to Figure 1, we see that the root element node has three child nodes, which occur in the following order:  Q, B, and B.  Therefore, the first node that will be processed is the node named Q.
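One way to observe the built-in template rules in isolation is to run a stylesheet that contains no template rules at all.  In that case, every node is handled by a built-in rule: the root node and element nodes apply templates to their children, text nodes are copied to the output, and comments and processing instructions produce nothing.  The following sketch assumes a hypothetical class named BuiltInRulesSketch and a small sample document.  The output method is set to text so that only the copied text appears in the result.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: a stylesheet with no template rules, so that
// only the built-in template rules are executed.
class BuiltInRulesSketch {
    static final String EMPTY_XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "</xsl:stylesheet>";

    // The document node here has a comment child, a processing
    // instruction child, and the root element A.
    static final String SAMPLE_XML =
        "<?xml version='1.0'?>"
        + "<!-- a comment -->"
        + "<?processor data?>"
        + "<A><Q>A Big Header</Q><B><C>Text block 1.</C></B></A>";

    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Only the text node values reach the output.
        System.out.println(transform(SAMPLE_XML, EMPTY_XSL));
    }
}
```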

A template rule that matches Q

Figure 9 and Listing 3 show a template rule that matches an element named Q.

xsl:template ELEMENT_NODE
  Attribute: match=Q
  h1 ELEMENT_NODE
    xsl:apply-templates ELEMENT_NODE
Figure 9

The tree view of the template rule is shown in Figure 9.  The XSL stylesheet code is shown in Listing 3.

<xsl:template match="Q">
  <h1>
    <xsl:apply-templates />
  </h1>
</xsl:template>

Listing 3

A level 1 header in the output

This template rule sends the start and end tags for a level 1 XHTML header to the output, and inserts something between those tags by applying templates to all child nodes of the element node named Q.

Referring back to the element node named Q in Figure 1, we see that it has only one child node, and that node is a text node.  Executing the xsl:apply-templates element on a text node causes the built-in version of the template rule to be applied.  The built-in version gets the value of the text node and sends it to the output.  This produces the raw XHTML output shown in Figure 10.

<h1>
A Big Header
</h1>

Figure 10

You should be able to easily identify the header from Figure 10 in the first line of the rendered output in Figure 2.

A template rule that matches B


That takes care of processing the root element node's child named Q.  The next child to be processed is a child node named B.

A template rule that matches an element node named B is shown in Figure 11 and Listing 4.