http://www.developer.com/

Discover the Wonders of XSLT: XPaths


March 15, 2004

This is Part 2 of the developer.com introduction to XSLT. The first part was about tools and the basic syntax. I recommend you read it first.

Make sure you download the updated listings before reading any further.

XPaths

The style sheet language is made up of two W3C recommendations:

  • XPath, which is a querying language
  • XSLT itself, which is a transformation language with an XML syntax

A style sheet describes how to convert the input document into the output. XPath deals with the input; it allows you to retrieve values from the input document. XSLT deals with generating the output. It offers instructions to create elements, attributes and other XML markup in the output.

XPaths are not unlike file paths and URLs, but are adapted to the XML syntax. For example, download the listings and open the sample2.xml file. The path to the article title is the following:

/a:article/a:info/a:title

Essentially, an XPath lists all the elements that lead to the one you're interested in, just as a file path lists all the directories leading to the file you're interested in. The separator is the forward slash, /.

An XPath returns a node set, i.e. a list of nodes that match the XPath. A node set may contain zero (which most likely indicates an error in the XPath), one, or more nodes. The node set for the XPath above contains only one node (the article title).

When the input document uses a namespace, the element names in an XPath must be fully qualified, i.e. they must include both the namespace prefix and the local name. Make sure you declare the namespace prefix in the style sheet as well (see Listing 1 below).
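As a minimal sketch of that rule, the style sheet below declares the a prefix and prints the article title with the absolute XPath shown above. The namespace URI is the one used in Listing 1; the input is assumed to be shaped like sample2.xml, which is not reproduced here.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="text"/>

<!-- The a prefix must map to the same namespace URI as the one used
     in the input document. The prefix itself is a local choice and
     could differ from the prefix in the input. -->
<xsl:template match="/">
   <xsl:value-of select="/a:article/a:info/a:title"/>
</xsl:template>

</xsl:stylesheet>
```

In XSLT 1.0, an unprefixed name in an XPath refers to an element in no namespace, so /article/info/title would return an empty node set for such a document.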

Relative XPaths

The previous example was an absolute path because it starts from the root of the document. XPaths may also be relative to the current node. Again, the concept is very similar to file paths, which can either start from the root (or a drive under Windows) or be relative to the current directory.

Absolute XPaths start with the forward slash; relative XPaths usually start with an element name. Assuming the current node is /a:article, the following XPath points to the article title.

a:info/a:title

You may recognize this XPath from the style sheet in the previous article. Indeed, the template match attribute contains an XPath, in most cases a relative one.

As it interprets the style sheet, the XSLT processor keeps track of the current node. Some instructions, such as xsl:apply-templates and xsl:for-each (see below), change the current node.
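The sketch below illustrates how the current node changes, under the assumption that the input follows the structure implied by Listing 1 (an a:article with an a:info, and an a:body made of a:section elements that each have an a:title). The same relative XPath, a:title, yields a different result depending on the current node.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="html"/>

<!-- In this template the current node is the a:article element -->
<xsl:template match="/a:article">
   <html>
      <head><title><xsl:value-of select="a:info/a:title"/></title></head>
      <!-- xsl:apply-templates makes each section the current node in turn -->
      <body><xsl:apply-templates select="a:body/a:section"/></body>
   </html>
</xsl:template>

<!-- In this template the current node is a section,
     so a:title means the title of that section -->
<xsl:template match="a:section">
   <h2><xsl:value-of select="a:title"/></h2>
</xsl:template>

</xsl:stylesheet>
```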

Attributes and other special cases

To include an attribute in an XPath, prefix its name with the @ character. The following (relative) XPath selects the link's URI if the current node is a section:

a:para/a:link/@uri

The @ is not a separator but a prefix identifying attributes. Therefore, you still need the forward slash between the attribute name and its parent.
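As an illustration, the following sketch prints every link URI found in each section, one per line. It assumes an input shaped like sample2.xml, where links sit inside paragraphs; the root template is only there to make each section the current node.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="text"/>

<!-- Visit every section in the document -->
<xsl:template match="/">
   <xsl:apply-templates select="//a:section"/>
</xsl:template>

<!-- The current node is a section: the XPath is relative to it -->
<xsl:template match="a:section">
   <xsl:for-each select="a:para/a:link/@uri">
      <!-- Inside the loop the current node is the attribute itself -->
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
   </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```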

The single and double dot (. and ..) represent the current element and the parent of the current element respectively. If the current element is a paragraph,

../a:para

selects all the paragraphs in the section. The .. selects the paragraph's parent (the section); from there, the XPath selects all the paragraphs in the section. Note that this XPath may return a node set with several nodes, as many nodes as paragraphs in the section, in fact.

Still assuming the current element is a paragraph, this XPath selects all the paragraphs in the body:

../../a:section/a:para

Using two slashes, //, as a separator selects among the descendants, as opposed to the children, of the element. The descendants include the children, the children of the children, the children of the children of the children, and so on. The following absolute XPath selects all the titles (article and section titles):

/a:article//a:title
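A short sketch of the descendant separator in use: the style sheet below lists every title in the document, whatever its depth. It assumes an input shaped like sample2.xml; the HTML list markup is only one possible way to present the result.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="html"/>

<xsl:template match="/">
   <ul>
      <!-- // reaches a:title at any depth below a:article, so both
           the article title and the section titles are selected -->
      <xsl:for-each select="/a:article//a:title">
         <li><xsl:value-of select="."/></li>
      </xsl:for-each>
   </ul>
</xsl:template>

</xsl:stylesheet>
```

Replacing // with a single / would return an empty node set, assuming, as in sample2.xml, that a:article has no a:title child of its own.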

Predicates

To conclude this section on XPaths, let's look at predicates. Predicates allow you to specify conditions that must apply to an element. The predicate appears between square brackets, [ and ], immediately after the element on which the condition applies.

The following XPath selects links pointing to the XSLT recommendation:

//a:link[@uri='http://www.w3.org/TR/xslt']

Predicates allow you to compare an XPath (@uri in this example) with a literal or another XPath. A whole set of functions is also available (see an XPath or XSLT reference for a complete list of functions).

For example, this XPath uses the count function to select the paragraph of every section that has exactly one paragraph:

//a:section[count(a:para) = 1]/a:para

Note that the predicate appears after the element on which it applies, which is not necessarily the last element in the XPath.

Be careful not to confuse the separator, /, with the predicate indicators, [ and ].
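The two predicates above can be tried out with a sketch like the following, which assumes an input shaped like sample2.xml. The first instruction counts the links to the XSLT recommendation; the loop prints the paragraphs that are alone in their section.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="text"/>

<xsl:template match="/">
   <!-- The predicate filters a:link elements on their uri attribute -->
   <xsl:value-of select="count(//a:link[@uri='http://www.w3.org/TR/xslt'])"/>
   <xsl:text> link(s) to the XSLT recommendation&#10;</xsl:text>

   <!-- The predicate applies to a:section, not to a:para: it keeps
        the sections that have exactly one a:para child -->
   <xsl:for-each select="//a:section[count(a:para) = 1]/a:para">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
   </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```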

Attributes

Attributes have a weird syntax in XSLT:

<a href="{@uri}">

The curly brackets, { and }, mark the content of the attribute as an XPath (the XSLT recommendation calls this an attribute value template). If the curly brackets are missing, the processor assumes that the content is a literal.

The XPath should return one node only. If it returns several nodes, an XSLT 1.0 processor retains only the first one in document order.

Students of XSLT often confuse the curly brackets and the at symbol. Both are related to attributes, but they serve completely different roles. The curly brackets are part of XSLT; they indicate that the content of the attribute is an XPath. The at symbol is part of XPath; it indicates the path points to an attribute.

A quick debugging tip: If you can't get what you want in an attribute, make sure you have not forgotten the curly brackets.
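To make the debugging tip concrete, here is a sketch that writes three hyperlinks for every a:link in the input: one correct, one with the curly brackets forgotten, and one mixing literal text with an XPath. The #top fragment and the link texts are hypothetical and serve only as illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="html"/>

<xsl:template match="/">
   <p><xsl:apply-templates select="//a:link"/></p>
</xsl:template>

<xsl:template match="a:link">
   <!-- Curly brackets: href receives the value of the uri attribute -->
   <a href="{@uri}"><xsl:value-of select="."/></a>
   <!-- No curly brackets: href is literally the four characters @uri -->
   <a href="@uri">broken</a>
   <!-- Literal text and an XPath may be mixed in the same attribute -->
   <a href="{@uri}#top">top</a>
</xsl:template>

</xsl:stylesheet>
```

The curly brackets are meant for attributes of elements written to the output. Attributes such as select already expect an XPath, so they take no curly brackets.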

Regular Structure

When the output is structured and repetitive, it may be easier to use the xsl:for-each and xsl:value-of instructions.

xsl:for-each loops over a node set. xsl:value-of prints the text content of the first node in a node set. Used together, they allow you to loop over and format the result of an XPath.

For example, to print the paragraphs, you could write:

<xsl:for-each select="/a:article/a:section/a:para">
   <p><xsl:value-of select="."/></p>
</xsl:for-each>

Be warned that xsl:for-each changes the current node, so it is crucial that you use a relative XPath in the loop! An absolute path would select data outside of the loop, which is most likely not what you want.
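The following sketch contrasts the two cases, assuming an input shaped like sample2.xml with several sections. Inside the loop, the relative XPath follows the current section, whereas the absolute XPath returns the same value on every iteration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="html"/>

<xsl:template match="/">
   <xsl:for-each select="/a:article/a:body/a:section">
      <!-- Relative: the title of the section being looped over -->
      <h2><xsl:value-of select="a:title"/></h2>
      <!-- Absolute: selects every section title in the document, and
           xsl:value-of keeps the first one, whatever the iteration -->
      <p><xsl:value-of select="/a:article/a:body/a:section/a:title"/></p>
   </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```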

A New Style Sheet

Listing 1 is an updated style sheet that demonstrates the techniques introduced in this article:

  • The template for the body now prints the article title using an xsl:value-of and a table of contents through an xsl:for-each.
  • A predicate differentiates the templates for bold and italics.
  • The style sheet inserts hyperlinks by using the special syntax for attribute contents.

Listing 1: updated style sheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://psol.com/2004/article">

<xsl:output method="html"/>

<xsl:template match="a:article">
   <html><xsl:apply-templates/></html>
</xsl:template>

<xsl:template match="a:body">
   <body>
      <h1><xsl:value-of select="../a:info/a:title"/></h1>
      <h2>Table of Contents</h2>
      <ul>
         <xsl:for-each select="a:section">
            <li><xsl:value-of select="a:title"/></li>
         </xsl:for-each>
      </ul><hr/>
      <xsl:apply-templates/>
      <p>This page was made with XML and XSLT.</p>
   </body>
</xsl:template>

<xsl:template match="a:para">
   <p><xsl:apply-templates/></p>
</xsl:template>

<xsl:template match="a:section">
   <xsl:apply-templates/><hr/>
</xsl:template>

<xsl:template match="a:info/a:title">
   <head><title><xsl:apply-templates/></title></head>
</xsl:template>

<xsl:template match="a:section/a:title">
   <h2><xsl:apply-templates/></h2>
</xsl:template>

<xsl:template match="a:link">
   <a href="{@uri}"><xsl:apply-templates/></a>
</xsl:template>

<xsl:template match="a:em">
   <i><xsl:apply-templates/></i>
</xsl:template>

<xsl:template match="a:em[@role='bold']">
   <b><xsl:apply-templates/></b>
</xsl:template>

</xsl:stylesheet>

Testing and exercise

I encourage you to download the listing and run the example for yourself. The listings also include a small exercise so that you can practice what you have learned.

Remember to adapt the processing instruction, as explained in Part 1, if you change the style sheet.

Next month, we will cover more XSLT instructions.

About the Author

Benoît Marchal is a Belgian writer and consultant. He is the author of XML by Example and other XML books. He works mostly on Web services, XML, and Java.
