XSL (eXtensible Stylesheet Language) has some features that may be foreign to a developer’s existing programming repertoire. One of these features is the notion of templates, which are used to transform matching nodes of the XML document being processed. Learning to exploit rule-based templates early in your stylesheet career will help you to develop more maintainable code and to avoid the procedural trap.
This article compares and contrasts rule-based and procedural-based approaches to developing stylesheets to help you understand the tradeoffs. A detailed example is used to illustrate the differences. The following outline is a roadmap for how we’ll come to our conclusions:
- A sample XML document to be transformed
- The desired HTML output we would like to produce
- A simple procedural stylesheet to produce the output
- A simple rule-based stylesheet to produce the output
- Comparison of the rule-based and procedural approaches
A basic understanding of XML and XSL technology will be assumed. Additional resources at the end will provide links to XSL tutorial and reference material.
Sample XML Document
For the purposes of this article, we’ll define a simple XML document that we’ll transform into HTML output using our procedural- and rule-based stylesheets. Here we have a simple list of customers.
<Customers>
  <Customer>
    <Name>XYZ Plumbing</Name>
    <City>New Haven</City>
    <State>CT</State>
  </Customer>
  <Customer>
    <Name>Joe's Bar and Grill</Name>
    <City>Waterbury</City>
    <State>CT</State>
  </Customer>
  <Customer>
    <Name>ABC Pizza</Name>
    <City>Hartford</City>
    <State>CT</State>
  </Customer>
</Customers>
The Desired HTML Output
Now, let’s determine what we would like the output of our transformation to look like. We’ll create some fairly simple HTML that displays the customers in a table, sorted by name. To keep things simple, we won’t apply any special formatting to the data.
<html>
  <body>
    <table>
      <tr>
        <td>Customer</td>
        <td>City</td>
        <td>State</td>
      </tr>
      <tr>
        <td>ABC Pizza</td>
        <td>Hartford</td>
        <td>CT</td>
      </tr>
      <tr>
        <td>Joe's Bar and Grill</td>
        <td>Waterbury</td>
        <td>CT</td>
      </tr>
      <tr>
        <td>XYZ Plumbing</td>
        <td>New Haven</td>
        <td>CT</td>
      </tr>
    </table>
  </body>
</html>
A Procedural-Based Stylesheet
To produce the output document, we need to find a way to process a collection of Customer nodes from our XML document and transform them into a tabular list. If you are new to XSL, you would probably look for a looping construct and start with <xsl:for-each>. Your resulting stylesheet might look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <table>
          <tr>
            <td>Customer</td>
            <td>City</td>
            <td>State</td>
          </tr>
          <!-- Loop over every Customer in the document, sorted by Name -->
          <xsl:for-each select="//Customer">
            <xsl:sort select="Name"/>
            <tr>
              <td><xsl:value-of select="Name"/></td>
              <td><xsl:value-of select="City"/></td>
              <td><xsl:value-of select="State"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
The <xsl:template> statement in this stylesheet matches the root node of the document, so processing begins there. We build the HTML according to our specification. When it comes time to loop through the Customer nodes in our document, we use the <xsl:for-each> statement to select the nodes to be processed. The select="//Customer" attribute is an XPath expression that selects all the Customer nodes in the document tree. The <xsl:value-of> element is used to select the text values of the child nodes (Name, City, and State) of each Customer.
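To make the XPath step concrete, here is a minimal sketch of the same loop written with an explicit path instead of the // abbreviation. For the sample document the two expressions select the same three nodes; the difference is that //Customer searches the whole tree at any depth, while /Customers/Customer only looks at Customer children of the top-level Customers element. The trimmed-down table (no header row) is a simplification for illustration, not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <table>
      <!-- Explicit path from the root: Customer children of Customers only -->
      <xsl:for-each select="/Customers/Customer">
        <!-- Sort the selected nodes by the text of their Name child -->
        <xsl:sort select="Name"/>
        <tr>
          <!-- Inside the loop, paths are relative to the current Customer -->
          <td><xsl:value-of select="Name"/></td>
          <td><xsl:value-of select="City"/></td>
          <td><xsl:value-of select="State"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

Whichever path you choose, the expressions inside the loop (Name, City, State) are evaluated relative to the Customer currently being processed.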
A Rule-Based Stylesheet
Let’s look at how we might solve this problem using rule-based templates instead of a procedural-based approach. The “rules” in the rule-based template approach are the XPath expressions in the match attribute that cause a particular template to be invoked.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Rule 1: the root node produces the outer HTML shell -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="Customers"/>
      </body>
    </html>
  </xsl:template>
  <!-- Rule 2: the Customers element produces the table and its header row -->
  <xsl:template match="Customers">
    <table>
      <tr>
        <td>Customer</td>
        <td>City</td>
        <td>State</td>
      </tr>
      <xsl:apply-templates select="Customer">
        <xsl:sort select="Name"/>
      </xsl:apply-templates>
    </table>
  </xsl:template>
  <!-- Rule 3: each Customer element produces one table row -->
  <xsl:template match="Customer">
    <tr>
      <td><xsl:value-of select="Name"/></td>
      <td><xsl:value-of select="City"/></td>
      <td><xsl:value-of select="State"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
Now we have three templates in our stylesheet rather than one. There are no looping constructs, either. The first template creates the outer shell of HTML. The <xsl:apply-templates select="Customers"/> statement causes the second template to be invoked, since the rule in its match attribute is looking for Customers nodes. This template builds the outer shell of the table. It in turn selects the Customer nodes and causes the third template to be invoked via the <xsl:apply-templates select="Customer"> statement. Finally, the last template builds each table row. Note that this template will be invoked once for each Customer. Looping occurs naturally with the rule-based approach.
Once again, try to imagine this as a complex HTML page. Your code would be divided up among the three templates rather than the single template in the procedural example. There would also be a good chance that each template would take up no more than a page’s worth of code.
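The same idea can be pushed one level further. The sketch below is one possible variation, not the stylesheet discussed above: it adds a fourth rule that turns each Name, City, or State node into a table cell, so the Customer template no longer needs to know how a cell is built. It assumes the child elements appear in the source in the same order as the columns, because apply-templates processes the selected nodes in document order unless told otherwise.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="Customers"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Customers">
    <table>
      <tr>
        <td>Customer</td>
        <td>City</td>
        <td>State</td>
      </tr>
      <xsl:apply-templates select="Customer">
        <xsl:sort select="Name"/>
      </xsl:apply-templates>
    </table>
  </xsl:template>
  <!-- Each Customer becomes a row; the cells are delegated to the rule below -->
  <xsl:template match="Customer">
    <tr>
      <xsl:apply-templates select="Name | City | State"/>
    </tr>
  </xsl:template>
  <!-- One rule shared by all three child elements: wrap the text in a cell -->
  <xsl:template match="Name | City | State">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document this should produce the same table as the three-template version. Whether the extra rule is worth it depends on the page: it pays off when cell formatting is likely to change independently of row layout.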
It may be hard to believe if you haven’t run the examples, but the procedural-based and rule-based stylesheets will produce identical output. If the end result is what counts, does it matter which approach you use?
I would argue yes, it does matter.
The procedural-based example had a single “godzilla” template that contained all of the HTML markup as well as the looping logic required to create that markup. If this were a complex page, this template would span several pages’ worth of code. It could be very difficult to match up the starting and ending HTML tags because they could be several pages apart.
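To see how quickly the single template grows, consider a hedged sketch in which each Customer also contains a list of orders. The Order, Item, and Total elements are hypothetical and do not exist in the sample document; they are only here to show the nesting that the procedural style tends to accumulate.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: assumes each Customer also has Order children,
     each with Item and Total elements. The sample document has none. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:for-each select="//Customer">
          <xsl:sort select="Name"/>
          <h2><xsl:value-of select="Name"/></h2>
          <table>
            <!-- A second loop nested inside the first -->
            <xsl:for-each select="Order">
              <tr>
                <td><xsl:value-of select="Item"/></td>
                <td><xsl:value-of select="Total"/></td>
              </tr>
            </xsl:for-each>
          </table>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Every new level of data adds another level of nesting inside the same template, and the opening and closing tags of the outer markup drift further apart.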
The rule-based example had three templates, each designed to process nodes in the XML document that matched the “rule” specified in its match attribute. The HTML markup is divided neatly among the templates and is produced at the appropriate time. Each template would result in a manageable piece of code.
The coding guidelines for most languages strongly suggest keeping methods or procedures to no more than one page of code. An XSL template is analogous to a method, although it is invoked differently. Shorter templates are much easier to read, understand, and maintain. A template should do only one thing and do it well.
Because early efforts often become the basis for more involved endeavors, it’s as important to form good programming habits when beginning your stylesheet development as with any other language. Calling procedural-based stylesheets an “anti-pattern” might be too strong a criticism of this approach because it definitely has its place or it wouldn’t be there. However, making a resolution to force yourself to use the rule-based pattern early in your stylesheet career will pay dividends later. It is one of the core features of the language. As you can see from the side-by-side comparison, it can result in more maintainable code.
This article doesn’t cover other features of XSL that help you write modular and maintainable code, such as imports, includes, calling named templates, parameters, and so on. Rather, it focuses on one of the core features of the language: applying templates to certain nodes in the source document in a rules-based fashion.
About the Author
Jeff Ryan is an architect for Hartford Financial Services. He has eighteen years of experience in information technology, architecting and developing automated solutions to business problems. He may be reached at email@example.com.