January 19, 2021
Hot Topics:

What's New in XPath 2.0?

  • By Steven Holzner
  • Send Email »
  • More Articles »

As of this writing, XPath 2.0 is still in Working Draft form, but it's now stabilized, giving us the chance to work with it. XPath 2.0 is described this way by W3C—just as you'd describe XPath 1.0, in fact:

"The primary purpose of XPath is to address parts of an XML document. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document."

Although the primary purpose of XPath hasn't changed in this new version, much of the actual specification has. You'll still be able to use the familiar path steps, each made up of an axis (XPath 2.0 uses the same axes as XPath 1.0), followed by a node test, followed by a predicate. However, much of the terminology has changed, along with some basic concepts—for example, XPath supports sequences instead of node-sets. I go into more detail on this in my book XPath Kick Start : Navigating XML with XPath 1.0 and 2.0.

XPath 2.0, XQuery 1.0, and XSLT 2.0 are all tied together, and XPath 2.0 is the common denominator. The W3C groups working on these standards have been working together closely. One way of looking at what's been going on is that XSLT 2.0 and XQuery 1.0 are designed to share as much as possible—and that what they share is in fact XPath 2.0.

So why XPath 2.0? What's it got that XPath 1.0 doesn't have? There are many answers, but one of the main ones is support for new data types. As you know, XPath 1.0 supports only these data types:

  • string

  • boolean

  • node-set

  • number

That was okay long ago, but things have changed—in particular, W3C has been moving toward XML schema for its data types. Supporting new data types based on XML schema means that XPath 2.0 supports all the simple primitive types built into XML schema. There are 19 such types in all, including many that XPath 1.0 doesn't support, such as data types for dates, URIs, and so on.

The XPath 2.0 data model also supports data types that you can derive from these data types in your own XML schema. We're going to see how to work with these various types ourselves.

XML Schema - If you're not familiar with XML schema, you can get all the details at http://www.w3.org/TR/xmlschema-0/, http://www.w3.org/TR/xmlschema-1/, and http://www.w3.org/TR/xmlschema-2/. Another good resource is the book Sams Teach Yourself XML in 21 Days (ISBN: 0672325764).

XPath 2.0 also gives you tremendously more power than XPath 1.0 did. There are dozens of new built-in functions that you can use now, and many more operators. These functions and operators are far more type-aware than what we've seen in XPath 1.0.

Also new in XPath 2.0 are sequences, which replace the familiar node-sets from XPath 1.0. In fact, all XPath 2.0 expressions evaluate to sequences, as we're going to see. And you can also use variables in XPath 2.0.

The current working draft for XPath 2.0 is at http://www.w3.org/TR/xpath20/. This document tells you about XPath 2.0 in some detail, but it doesn't provide the whole story. In addition, there are documents outlining the XPath 2.0 data model—which tells you how XPath 2.0 sees an XML document—the data types used in XPath 2.0, and the functions and operators available. Here's the list:

You still create location paths in XPath 2.0, of course, and build them from location steps. A location step, as in XPath 1.0, can contain an axis, a node test, and a predicate. The allowable axes are the same as in XPath 1.0.However, there are differences already—the namespace axis is considered deprecated in XPath 2.0, which means it's considered obsolete. It's included for backward compatibility, but is not available at all in XQuery 1.0.

Handling Nodes

Although the data types have changed, the node kinds are more or less the same in XPath 2.0 compared to XPath 1.0. As you recall, you can have these kinds of nodes in XPath 1.0: root nodes, element nodes, attribute nodes, processing instruction nodes, comment nodes, text nodes, and namespace nodes. There is one difference in XPath 2.0, however—root nodes are now called document nodes instead, ending a long-standing confusion.

Handling Data Types

As also mentioned, one of the main motivations behind XPath 2.0 was to expand the data types available. XPath 1.0 supported Booleans, node-sets, strings, and numbers, but that was pretty basic. XPath 2.0 supports all the primitive simple types built into XML schema, as well as the types you can derive by restriction from the primitive simple types, which gives you a great deal more control over data typing. Here are the simple primitive types—the xs namespace corresponds to "http://www.w3.org/2001/XMLSchema":

  • xs:string

  • xs:boolean

  • xs:decimal

  • xs:float

  • xs:double

  • xs:duration

  • xs:dateTime

  • xs:time

  • xs:date

  • xs:gYearMonth

  • xs:gYear

  • xs:gMonthDay

  • xs:gDay

  • xs:gMonth

  • xs:hexBinary

  • xs:base64Binary

  • xs:anyURI

  • xs:QName


Besides these types, you can also use types derived from primitive simple types by restriction. Collectively, these simple primitive types and the types derived from primitive simple types by restriction are called atomic types. And XPath 2.0 sequences can contain both atomic types and nodes.

Page 1 of 3

This article was originally published on April 23, 2004

Enterprise Development Update

Don't miss an article. Subscribe to our newsletter below.

Thanks for your registration, follow us on our social networks to keep up-to-date