Learning XML: Trees, Nodes, and Templates, Part III


Preface

I have authored numerous online articles on XML.  These articles cover the waterfront from introductory topics to advanced topics. I maintain a consolidated index of hyperlinks to all of my XML articles at my personal website so that you can access earlier articles from there.

You may find it useful to open another copy of this lesson in a separate browser window.  That will make it easier for you to scroll back and forth among the different listings while you are reading about them. 

As of this writing, to my knowledge, Microsoft IE5 is the only widely-used web browser that has the ability to render XML documents.  IE5 can render XML documents using either CSS or XSL. This is one in a series of articles that discuss the use of XSL for the rendering of XML documents, with particular emphasis on the use of IE5 for that purpose.

Introduction

In Part I of this series on Trees, Nodes, and Templates, I showed you how to manually convert an XML document into a tree representation.  You can view that tree in Listing 1.
 

A(root)
+-Q(big header)(text)
+-B(block)
| +-C(paragraph)(text)
| +-R(mid-size header)(text)
| +-C(paragraph)(text)
| +-S(small header)(text)
| +-B(block)
| | +-C(paragraph)(text)
| |
| +-S(small header)(text)
| +-B(block)
| | +-C(paragraph)(text)
| | +-T(smallest header)(text)
| | +-B(block)
| | | +-C(paragraph)(text)
| | | +-D(list)
| | | | +-E(list item)(text)
| | | | +-E(list item)(text)
| | | | +-E(list item)(text)
| | | |
| | | +-C(paragraph)(text)
| | |
| | +-C(paragraph)(text)
| |
| +-R(mid-size header)(text)
| +-C(paragraph)(text)
|
+-B(block)
  +-R(mid-size header)(text)
  +-C(paragraph)(text)

Listing 1

I also mentioned that XSLT could be used to transform an XML tree into an HTML tree.  You can view such an HTML tree in Listing 2.
 

HTML
 +-BODY
   +-table(with attributes)
     +-tr
       +-td
         +-h1(text)
         +-P(text)
         +-h2(text)
         +-P(text)
         +-h3(text)
         +-P(text)
         +-h3(text)
         +-P(text)
         +-h4(text)
         +-P(text)
         +-UL
         | +-LI(text)
         | +-LI(text)
         | +-LI(text)
         |
         +-P(text)
         +-P(text)
         +-h2(text)
         +-P(text)
         +-h2(text)
         +-P(text)

Listing 2

In this lesson, I will continue the development and explanation of the XSLT code that can be used to transform the XML tree shown in Listing 1 into the HTML tree shown in Listing 2.  The standard browser rendering of the HTML represented by Listing 2 can be viewed in Figure 1.
 

A Big Header

Level 0. This is the beginning of a B. This text is in the Introduction section.

A Mid Header

Level 0. This is a continuation of the same B. This text is in the Technical Details section. This section contains two smallHeaders, each of which is followed by a nested B.

A Small Header

Level 1. This is the beginning of a nested B. This block of text is nested inside of a larger overall block of text. This is also the end of this B.

Another Small Header

Level 1. This is the beginning of another nested B at the same level as the previous one. Another B is nested inside of this one.

A Smallest Header

Level 2. This is the beginning of another nested B that is inside of the previous one. This block of text is nested another level down. This B also contains a list.

  • First list item
  • Second list item
  • Third list item

Level 2. Still inside of, but ending the innermost B.

Level 1. Still inside of, but ending the middle B.

Another Mid Header

Level 0. This block of text is back out at the top level of the outermost B.

Another Mid Header in Another B

Level 0. This text is in a completely new B that is at the top B level.
 

Figure 1

The XML File

The XML file is named XSL007.xml.  A copy of the source code for the XML file is shown near the end of the lesson.  I have discussed this file in some detail in the two previous lessons and won’t discuss it further here.

The XSLT File

This file is named XSL007.xsl.  I will continue discussing the XSLT code in fragments.  Complete listings of the XML file, the XSLT file, and the HTML output are all presented near the end of the lesson.

In Part II of this series on Trees, Nodes, and Templates, I discussed the template that matches the root node of the XML file in detail. Of most interest in that discussion was the following fragment from that template:

<td>
<xsl:apply-templates select="A/*" />
</td>

I explained how the xsl:apply-templates processing element sends the XML processor off looking for specified nodes (usually child nodes) and matching templates. When a template that matches a node is found, the information in the node is processed according to the instructions in the template.  The result is to insert text at that point into the output stream.  I then wrapped up the discussion of the template that matches the root node by showing you that it contains nothing exciting beyond the fragment shown above.

At this point in the process, the XML processor is off processing the child nodes of the root node in the order that they appear in the XML document. According to Listing 1, the first child node that the XML processor will find is a node named Q. Then it will go looking for a template that matches a node named Q. If it finds a matching template, it will use that template along with the information in the node named Q to insert text into the output stream.

The following fragment shows a template that forms a match for the node named Q:

<xsl:template match="Q">
    <h1>
        <xsl:apply-templates /> 
    </h1>
</xsl:template>

 The behavior of this template is straightforward.

  • Insert <h1> into the output stream.  (This is an HTML start tag for a large header.)
  • Go search for child nodes of the current node.  If any are found, and if they have matching templates, process them against the templates.
  • After finishing the processing of all of the child nodes, insert </h1> into the output stream.  (This is an HTML end tag for a large header.)
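As a concrete trace of those three steps, the fragments below pair the Q element from XSL007.xml with the text that the Q template contributes to the output. Both fragments are copied from the complete listings near the end of this lesson; they are shown side by side only for illustration and do not form a single document.

```xml
<!-- Input node, as it appears in XSL007.xml -->
<Q>A Big Header</Q>

<!-- Text inserted into the output stream by the Q template -->
<h1>A Big Header</h1>
```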

Note that the xsl:apply-templates processing element used in the above fragment doesn’t include a select attribute.  As explained by the W3C:
 

In the absence of a select attribute, the xsl:apply-templates instruction processes all of the children of the current node, including text nodes. 

An examination of Listing 1 shows that the content of the Q element doesn’t contain any other elements. However, it does contain some text, and therefore can be viewed as having a child node of the special text node type. According to the W3C, this text should be processed by default.  However, according to Microsoft,
 

The December 1998 XSL Working Draft includes two built-in templates. These templates make it easier to create style sheets for irregular data by passing text from the source document to the output automatically.

Microsoft® Internet Explorer 5 does not have these templates built in but they are easily added to a style sheet without affecting the use of the style sheet by other XSL processors. Other processors will simply use the templates you specify instead of their own built-in templates.

Microsoft goes on to explain that the template shown in the following listing should be included in the XSLT file to cause text to be passed from the source document to the output automatically:

<xsl:template match="text()">
    <xsl:value-of select="."/>
</xsl:template>

Since this is a workaround to cover a deficiency in the IE5 XML processor, I’m not going to explain how it does what it does.  Just remember to include it if you need compatibility with IE5. Hopefully some future version of IE will eliminate the requirement for this workaround.
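For comparison, a processor that follows the W3C XSLT 1.0 Recommendation should not need the workaround, because that Recommendation defines built-in template rules. The sketch below writes those rules out as explicit templates. It assumes the XSLT 1.0 namespace, which XSL007.xsl does not use, and it is offered only as an illustration, not as part of the IE5 stylesheet.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes an XSLT 1.0 processor, not the IE5 WD-xsl dialect. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Built-in rule for the root node and element nodes:
       keep recursing into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Built-in rule for text and attribute nodes:
       copy the text through to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The second rule plays the same role as the text() template that Microsoft recommends adding for IE5.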

Referring again to Listing 1, after the Q node is processed, the next child node of the root that will be discovered is a node of type B. As you will recall from Listing 1, this node has many children of several different types. The XML processor will search for a template that matches the B node.  The template that it will find is shown in the following fragment:
 

<xsl:template match="B">
    <xsl:apply-templates /> 
</xsl:template>

The behavior of this template is very straightforward also.  It simply says to go and process all child nodes of the current B node recursively. Checking Listing 1 again, the first child node that will be found will be a node of type C. Note that the explanatory text in Listing 1 shows this to be a paragraph node.  This explanatory text is in the form of a comment in the tree diagram, and has no bearing on how the node will be processed. The following fragment shows the matching template for a node of type C:

<xsl:template match="C">
    <P>
        <xsl:apply-templates /> 
    </P>
</xsl:template>

Now you see why I commented this as a paragraph node. The behavior of this template is to insert the start and end tags for an HTML paragraph in the output stream (<P>…</P>). Sandwiched between the start and end tags for the HTML paragraph will be whatever is produced by processing the child nodes of the C node. If you examine the original XML file at the end of this lesson or the XML tree in Listing 1 you will see that in this case, the only child node is a text node. However, there is nothing about this XSLT structure that restricts it to text only.  Paragraph tags are often used in HTML to produce blank lines above and below other elements such as images.

Following the C node in the tree is a node of type R, which I referred to in Listing 1 as a mid-size header. The following fragment shows the template that matches this node:

<xsl:template match="R">
    <h2>
        <xsl:apply-templates /> 
    </h2>
</xsl:template>

As you can see, this template simply produces text in the output stream that represents an HTML header of the h2 variety.  Normally an h2 header is a little smaller than an h1 header. According to Listing 1, the R node is followed by another C (paragraph) node. I have already defined the template that matches a C node, so I don’t need to define it again. The existing template in the XSLT file that matches a C node in the XML file will be used whenever it is necessary to process a node of type C.

According to Listing 1, this C node is followed by a node of type S (small header).  The matching template is shown in the following listing:

<xsl:template match="S">
    <h3>
        <xsl:apply-templates /> 
    </h3>
</xsl:template>

As you can see, it looks and behaves just like the other header templates, except that it creates a different HTML header tag set in the output stream (h3 instead of h2 or h1).

That brings us to something very interesting.  According to Listing 1, this B node has a child node of type B. As I mentioned in an earlier lesson, it is OK for a node to have a child of its own type. This represents nested elements if you want to think of it in terms of the XML code. Since I have already defined a template for a type B node, I don’t need to define another one.  The existing template for type B will be used to process all type B nodes that are encountered in processing the XML file.
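To make the effect of the nested B concrete, the fragments below pair this part of XSL007.xml with the corresponding part of the HTML output. Both are copied from the complete listings near the end of this lesson. Note that the B element leaves no tag of its own in the output; only the results of processing its children appear.

```xml
<!-- Input: an S header followed by a nested B, from XSL007.xml -->
<S>A Small Header</S>
<B>
<C>Level 1.  This is the beginning of a nested B. 
This block of text is nested inside of a larger overall 
block of text. This is also the end of this B.</C>
</B>

<!-- Output: the same content after the S, B, and C templates have run -->
<h3>A Small Header</h3><P>Level 1.  This is the beginning of a nested B. 
This block of text is nested inside of a larger overall 
block of text. This is also the end of this B.</P>
```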

According to Listing 1, this B node has several child nodes of types C, T, and another nested B. I have already discussed the matching template for type C and don’t need to discuss it further. 

The template for the T node is simply another header, which I show in the following listing.  It is almost identical to the previous header templates:

<xsl:template match="T">
    <h4>
        <xsl:apply-templates /> 
    </h4>
</xsl:template>

Here we need to deal with this nested B node. As before, the existing template for type B will be used to process it.

According to Listing 1, this nested B node has children of type C and type D. We already know all about C. The node of type D has three children, all of type E. In the tree diagram of Listing 1, I refer to D as a list node and to E as a list item node.  The reason for this will become apparent in our examination of the following fragment:

<xsl:template match="D">
    <UL>
        <!-- loop -->
        <xsl:for-each select="E">
            <LI>
                <xsl:apply-templates /> 
            </LI>
        </xsl:for-each>
        <!-- End loop -->
    </UL>
</xsl:template>

 This template inserts the start and end tags for an HTML unordered list <UL> in the output stream. This template uses an xsl:for-each loop to extract the text value of each E node and to insert that text in the output stream. Each time text is inserted into the output stream, that text is surrounded by HTML list item <LI> tags. Thus, you can see why I identified the D and E nodes as a list node and three list item nodes in the tree diagram of Listing 1.
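The same list could also be produced without a loop by giving E its own template and letting the D template delegate to it. The sketch below shows that alternative. It assumes an XSLT 1.0 processor and namespace rather than IE5, and the E template is hypothetical: it does not appear in XSL007.xsl.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes an XSLT 1.0 processor.
     The E template is hypothetical and is not part of XSL007.xsl. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- D still produces the list wrapper, but hands each E child
       to the template that matches E. -->
  <xsl:template match="D">
    <UL>
      <xsl:apply-templates select="E"/>
    </UL>
  </xsl:template>

  <!-- Each E becomes one list item. -->
  <xsl:template match="E">
    <LI>
      <xsl:apply-templates/>
    </LI>
  </xsl:template>

</xsl:stylesheet>
```

The for-each form keeps the whole list in one template, which suits regular data. The template form is easier to extend if E elements ever appear outside of a D.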

There are quite a few more nodes in the tree that I haven’t discussed.  The types of those nodes are the same as the types of the nodes already discussed. Therefore, no additional templates are required in the XSLT file to process the remainder of the tree. There is also a lot more to be discussed about making selections and matches. 

In this example, I used the simplest possible select and match criteria because I wanted to illustrate structure, not details.  I will discuss some of those details in subsequent lessons.

Summary

In summary, to transform an XML tree into an HTML output tree using IE5, you must begin by providing a template that matches the root node of the XML tree. That template must include everything necessary to construct the skeleton of the HTML code, beginning with the <HTML> tag and ending with the </HTML> tag. The root template must somehow produce all of the HTML code in the output stream that is required between those two tags.

In previous lessons, all of the XSLT processing code was contained in the template that matches the root node.  That is often useful for the very regular XML documents of the type discussed in those lessons. However, it is not necessary to define all of the XSLT processing code in the root template.  Another approach is discussed in this lesson.

In this approach, an xsl:apply-templates processing element is used to cause the XML processor to process all of the children of the root node in a recursive fashion. This requires that you create templates in the XSLT file that match those child nodes. The templates that match those child nodes can send the XML processor off to process the children of those child nodes in a recursive fashion as well. As illustrated in this lesson, the XSL transformation process allows you to mix the processing of regular XML files (using for-each loops, for example) with the processing of irregular XML files using recursion.

All in all, this is a powerful concept.  Once you understand the overall concept, it isn’t too difficult to understand the details. However, there are a myriad of details involving many options.  Quite a lot of knowledge would be needed for someone to claim to be an XSLT expert.

Complete Program Listings

A listing of the XML file (XSL007.xml) is provided in the following listing:
 

<?xml version="1.0"?>

<!-- File XSL007.xml
Copyright 2000 R. G. Baldwin
Illustrates recursive transformation using templates.
Works with IE5.0
-->

<?xml-stylesheet type="text/xsl" href="XSL007.xsl"?>
 

<A>

<Q>A Big Header</Q>

<B>
<C>Level 0.  This is the beginning of a B. 
This text is in the Introduction section.</C>

<R>A Mid Header</R>

<C>Level 0.  This is a continuation of the same B. 
This text is in the Technical Details section. This 
section contains two smallHeaders, each of which is 
followed by a nested B.</C>

<S>A Small Header</S>
<B>
<C>Level 1.  This is the beginning of a nested B. 
This block of text is nested inside of a larger overall 
block of text. This is also the end of this B.</C>
</B>

<S>Another Small Header</S>
<B>
<C>Level 1.  This is the beginning of another nested B
at the same level as the previous one. 
Another B is nested inside of this one.</C>

<T>A Smallest Header</T>
<B>
<C>Level 2.  This is the beginning of another nested B
that is inside of the previous one.  This block of text is nested 
another level down.  This B also contains a list.</C>

<D>
<E>First list item</E>
<E>Second list item</E>
<E>Third list item</E>
</D>

<C>Level 2.  Still inside of, but ending the innermost B.</C>
</B>
<C>Level 1.  Still inside of, but ending the middle B.</C>
</B>
 

<R>Another Mid Header</R>
<C>Level 0.  This block of text is back out at the top level of 
the outermost B.</C>
</B>

<B>
<R>Another Mid Header in Another B</R>
<C>Level 0.  This text is in a completely new B that is at the
top B level.</C>
</B>

</A>

A listing of the XSLT file (XSL007.xsl) is provided in the following listing:
 

<?xml version='1.0'?>
<!-- File XSL007.xsl
Copyright 2000 R. G. Baldwin
Illustrates recursive transformation using templates.

Works with IE5.0
-->
<xsl:stylesheet 
xmlns:xsl="http://www.w3.org/TR/WD-xsl">

<xsl:template match="/">
<HTML>
<BODY>
<table BORDER="2" CELLSPACING="0" 
    CELLPADDING="0" WIDTH="399" 
    BGCOLOR="#FFFF00" >
<tr>
<td>
<xsl:apply-templates select="A/*" />
</td>
</tr>
</table>

</BODY>
</HTML>
</xsl:template>
<!-- End root match template -->

<!-- Simulate built-in text template -->
<xsl:template match="text()">
<xsl:value-of select="."/>
</xsl:template>
<!-- End text match template -->

<xsl:template match="B">
<xsl:apply-templates /> 
</xsl:template>
<!-- End B match template -->

<xsl:template match="C">
<P>
<xsl:apply-templates /> 
</P>
</xsl:template>
<!-- End C match template -->

<xsl:template match="D">
<UL>
<!-- loop -->
<xsl:for-each select="E">
<LI>
<xsl:apply-templates /> 
</LI>
</xsl:for-each>
<!-- End loop -->
</UL>
</xsl:template>
<!-- End D match template -->

<!-- Header templates follow -->
<xsl:template match="Q">
<h1>
<xsl:apply-templates /> 
</h1>
</xsl:template>
<!-- End Q match template -->

<xsl:template match="R">
<h2>
<xsl:apply-templates /> 
</h2>
</xsl:template>
<!-- End R match template -->

<xsl:template match="S">
<h3>
<xsl:apply-templates /> 
</h3>
</xsl:template>
<!-- End S match template -->

<xsl:template match="T">
<h4>
<xsl:apply-templates /> 
</h4>
</xsl:template>
<!-- End T match template -->

</xsl:stylesheet>

A listing of the HTML resulting from using the above XSLT file to transform the above XML file is shown in the following listing:

<HTML><BODY><table BORDER="2" CELLSPACING="0" CELLPADDING="0" BGCOLOR="#FFFF00"><tr><td><h1>A Big Header</h1><P>Level 0.  This is the beginning of a B. 
This text is in the Introduction section.</P><h2>A Mid Header</h2><P>Level 0.  This is a continuation of the same B. 
This text is in the Technical Details section. This 
section contains two smallHeaders, each of which is 
followed by a nested B.</P><h3>A Small Header</h3><P>Level 1.  This is the beginning of a nested B. 
This block of text is nested inside of a larger overall 
block of text. This is also the end of this B.</P><h3>Another Small Header</h3><P>Level 1.  This is the beginning of another nested B
at the same level as the previous one. 
Another B is nested inside of this one.</P><h4>A Smallest Header</h4><P>Level 2.  This is the beginning of another nested B
that is inside of the previous one.  This block of text is nested 
another level down.  This B also contains a list.</P><UL><LI>First list item</LI><LI>Second list item</LI><LI>Third list item</LI></UL><P>Level 2.  Still inside of, but ending the innermost B.</P><P>Level 1.  Still inside of, but ending the middle B.</P><h2>Another Mid Header</h2><P>Level 0.  This block of text is back out at the top level of 
the outermost B.</P><h2>Another Mid Header in Another B</h2><P>Level 0.  This text is in a completely new B that is at the
top B level.</P></td></tr></table></BODY></HTML>
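
For readers who want to try the transformation with a processor other than IE5, the sketch below adapts XSL007.xsl to the W3C XSLT 1.0 Recommendation. It rests on assumptions that are not part of the original file: the XSLT 1.0 namespace and version attribute, an added xsl:output element, and the removal of the text() workaround, which XSLT 1.0 supplies as a built-in rule. Treat it as a starting point rather than a verified replacement for the file above.

```xml
<?xml version="1.0"?>
<!-- Sketch only: an XSLT 1.0 adaptation of XSL007.xsl.
     The namespace, version attribute, and xsl:output element are
     assumptions that do not appear in the original file. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Root template: builds the HTML skeleton, then processes
       the child elements of A. -->
  <xsl:template match="/">
    <HTML>
      <BODY>
        <table BORDER="2" CELLSPACING="0"
            CELLPADDING="0" WIDTH="399"
            BGCOLOR="#FFFF00">
          <tr>
            <td>
              <xsl:apply-templates select="A/*"/>
            </td>
          </tr>
        </table>
      </BODY>
    </HTML>
  </xsl:template>

  <!-- No text() template here: XSLT 1.0 copies text through by default. -->

  <!-- Block: no output of its own, just recurse into the children. -->
  <xsl:template match="B">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Paragraph -->
  <xsl:template match="C">
    <P><xsl:apply-templates/></P>
  </xsl:template>

  <!-- List and list items -->
  <xsl:template match="D">
    <UL>
      <xsl:for-each select="E">
        <LI><xsl:apply-templates/></LI>
      </xsl:for-each>
    </UL>
  </xsl:template>

  <!-- Header templates -->
  <xsl:template match="Q"><h1><xsl:apply-templates/></h1></xsl:template>
  <xsl:template match="R"><h2><xsl:apply-templates/></h2></xsl:template>
  <xsl:template match="S"><h3><xsl:apply-templates/></h3></xsl:template>
  <xsl:template match="T"><h4><xsl:apply-templates/></h4></xsl:template>

</xsl:stylesheet>
```

The templates themselves are unchanged from the IE5 version; only the stylesheet element and the handling of text nodes differ.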


Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in part in any form or medium without  express written permission from Richard Baldwin is prohibited.

About the author

Richard Baldwin (baldwin.richard@iname.com) is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two.  He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin’s Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.
