Discover the Wonders of XSLT: Workflows

This article concludes the introduction to XSLT at developer.com.

The previous four articles covered the essentials of XSLT coding. This final article moves on to more advanced subjects, such as working with functions and multiple files.

Functions

Functions are implemented in XPath so they are valid wherever an XPath is valid. We have already encountered functions, such as count() and not():

count(a:para)

A function takes zero, one, or more arguments and computes a result. The result may be a number, a string, a boolean, or a node set.

Much of the power of functions arises from their integration with XPaths. Functions can appear in predicates or, for those that return node sets, in place of an element:

a:section[count(a:para) = 1]
current()/a:para

Because functions appear in XPaths, to use the result, you turn to the familiar value-of instruction:

<xsl:value-of select="count(xxx)"/>

As always, if the function/XPath returns multiple values, you will need the for-each or apply-templates instructions instead:

<xsl:for-each select="a:section[count(a:para) > 1]">
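Put together, the fragments above might look like the following sketch. It is only an illustration: the namespace URI bound to the a prefix and the a:title element are assumptions, because the article's sample documents are not reproduced here.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:a="http://example.com/article">
  <!-- The URI bound to "a" above is a placeholder; use your document's own -->

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- count() returns a single number, so value-of is enough -->
    <xsl:text>Sections: </xsl:text>
    <xsl:value-of select="count(//a:section)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The predicate may select several sections, so iterate with for-each -->
    <xsl:for-each select="//a:section[count(a:para) &gt; 1]">
      <!-- a:title is a hypothetical child element used for illustration -->
      <xsl:value-of select="a:title"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count(a:para)"/>
      <xsl:text> paragraphs&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Inside the for-each, count(a:para) is evaluated relative to the current section, which is why the same expression works both in the predicate and in the value-of.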

Predefined Functions and Extensions

XPath and XSLT include functions to cover most common needs: string manipulation (substring, length), number manipulation (sum, conversion), boolean (negation), indexing (key search), and more.

The XPath and XSLT recommendations themselves do a good job of documenting the functions. I suggest you bookmark the recommendations.

Still, you will find yourself looking for your favorite “insert name here” function. XSLT offers a half-baked extension mechanism that links with functions written in Java, JavaScript, Python, C#, or other languages.

Unfortunately, the W3C has not fully defined the extension mechanism; much is left as implementation details that create serious incompatibilities among XSLT processors. Therefore, to implement a function, you must forgo portability and tie yourself to one specific XSLT implementation.

If this is unacceptable to you, there are two workarounds. First, if at all possible, don’t use extensions. As you become more familiar with XSLT, you will find that many algorithms are best implemented through XSLT native (and portable) templates.

If you still need a function, check EXSLT. EXSLT defines standards for the most commonly requested extensions. Unless your needs are really exotic, chances are EXSLT covers them. However, because it’s a voluntary effort and not part of the official W3C recommendations, not every processor supports EXSLT, although the major ones do. Again, check your processor documentation.
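As a rough sketch of what relying on EXSLT can look like, the style sheet below calls the EXSLT Common function node-set() and guards the call with the standard function-available() test, so a processor without EXSLT support reports the problem instead of failing. The item elements and the message text are made up for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">

  <xsl:output method="text"/>

  <!-- A result tree fragment; XSLT 1.0 cannot query it with an XPath directly -->
  <xsl:variable name="fragment">
    <item>one</item>
    <item>two</item>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Only call the extension if this processor implements it -->
      <xsl:when test="function-available('exsl:node-set')">
        <xsl:value-of select="count(exsl:node-set($fragment)/item)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message>exsl:node-set() is not available in this processor.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The same guard works for any extension function, which keeps a style sheet usable, if less capable, on processors that lack the extension.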

Many Documents

The default workflow with XSLT is to process one file through one style sheet. While this simple workflow is appropriate for basic applications, you may want something more sophisticated.

Figure 1: Four common XSLT workflows

Figure 1 illustrates four common workflow options, clockwise:

  • The default workflow: one XML document is the input for a style sheet that produces one document.
  • The document() function (see below) opens multiple input documents, but it still produces only one output.
  • XSLT 2.0 (see below) supports multiple outputs. Think of a photo gallery where the style sheet generates as many HTML pages as there are photos in the input document.
  • Finally, a batch engine extends the XSLT processor to work with directories and file hierarchies instead of isolated files. If you followed the exercises throughout the series, you have been using such a batch engine (XM).

document() Function

The document() function opens a second (or a third, fourth, and so on) input document. The function takes the URI to the file, opens the file, parses it, and returns a node set with the file content.

Because the result is a node set, you can query the result with an XPath, as we saw in the Functions section:

document('params.xml')/p:Parameters/p:Param[@id='1']

The usual combination of for-each and apply-templates instructions offers many options to process the second document:

<xsl:for-each select="document('params.xml')/p:Parameters/p:Param">

Typically, document() accesses parameter files. It is also handy to combine several documents into one output.
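A minimal sketch of the parameter-file case follows. It assumes that params.xml sits next to the style sheet and that each Param element holds its value as text; the namespace URI bound to the p prefix is a placeholder.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:p="http://example.com/params">
  <!-- The URI bound to "p" above is a placeholder; use your parameter file's own -->

  <xsl:output method="text"/>

  <!-- Load the parameter file once. A relative URI given as a string
       is resolved against the location of the style sheet. -->
  <xsl:variable name="params" select="document('params.xml')/p:Parameters"/>

  <xsl:template match="/">
    <!-- List every parameter as "id = value" -->
    <xsl:for-each select="$params/p:Param">
      <xsl:value-of select="@id"/>
      <xsl:text> = </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Storing the node set in a top-level variable keeps the rest of the style sheet free of repeated document() calls.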

result-document

What about the opposite: taking an XML document and splitting it into multiple output documents? There is no solution with XSLT 1.0, but support for multiple output documents will be added in XSLT 2.0.

At the time of writing, the draft XSLT 2.0 proposes the result-document element. Basically, anything that appears within a result-document element is written to a separate file.

<xsl:result-document href="photo-{@id}.html">
   <!-- ... -->
</xsl:result-document>

A word of warning: XSLT 2.0 has not been formally approved at the time of writing, so this feature may still change. Furthermore, chances are your XSLT processor does not implement it (most processors have a proprietary alternative, though). Again, consult your processor documentation if you need this feature.
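For the photo gallery mentioned earlier, a complete style sheet could look like the sketch below. The input vocabulary (a gallery element containing photo elements with id and src attributes and a title child) is assumed for the example, and the syntax follows the XSLT 2.0 draft, so it requires a processor that implements result-document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- gallery, photo, title, @id, and @src are hypothetical names -->
  <xsl:template match="/gallery">
    <xsl:for-each select="photo">
      <!-- One output file per photo, named after its id attribute -->
      <xsl:result-document href="photo-{@id}.html">
        <html>
          <head>
            <title><xsl:value-of select="title"/></title>
          </head>
          <body>
            <h1><xsl:value-of select="title"/></h1>
            <img src="{@src}" alt="{title}"/>
          </body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The href attribute is an attribute value template, so the expression in curly braces is evaluated for each photo to build a distinct file name.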

Exercise

As a final exercise, I encourage you to revisit the previous exercises and update your style sheets to reformat the dates.

So far, dates are displayed in the ISO format: 2004-02-08. By using the substring-before() and substring-after() functions, you can reformat it to the 02/08/2004 (month/day/year) format.
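If you get stuck, here is one possible approach, shown as a standalone sketch rather than as part of the earlier style sheets. It pairs substring-before() with substring-after() to split the date at each hyphen; the template name format-date and the parameter name iso are arbitrary.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Turns an ISO date (YYYY-MM-DD) into MM/DD/YYYY -->
  <xsl:template name="format-date">
    <xsl:param name="iso"/>
    <!-- Everything before the first hyphen is the year -->
    <xsl:variable name="year" select="substring-before($iso, '-')"/>
    <!-- What remains is MM-DD; split it again at its hyphen -->
    <xsl:variable name="rest" select="substring-after($iso, '-')"/>
    <xsl:variable name="month" select="substring-before($rest, '-')"/>
    <xsl:variable name="day" select="substring-after($rest, '-')"/>
    <xsl:value-of select="concat($month, '/', $day, '/', $year)"/>
  </xsl:template>

  <!-- Demonstration with the date used in the text -->
  <xsl:template match="/">
    <xsl:call-template name="format-date">
      <xsl:with-param name="iso" select="'2004-02-08'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

In your own style sheets, you would pass the date element or attribute in the select of with-param instead of the literal string.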

Conclusion

XSLT is a versatile and flexible language. The last five articles have laid down the basics to get you started. Remember that practice makes proficient.

About the Author

Benoît Marchal is a Belgian writer and consultant. He is the author of XML by Example and other XML books.

He is currently developing new training material on UML modelling and XML.
