What's New in XPath 2.0?
As of this writing, XPath 2.0 is still in Working Draft form, but it has stabilized, giving us the chance to work with it. W3C describes XPath 2.0 this way (just as you'd describe XPath 1.0, in fact):
"The primary purpose of XPath is to address parts of an XML document. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document."
Although the primary purpose of XPath hasn't changed in this new version, much of the actual specification has. You'll still be able to use the familiar path steps, each made up of an axis (XPath 2.0 uses the same axes as XPath 1.0), followed by a node test, followed by an optional predicate. However, much of the terminology has changed, along with some basic concepts: for example, XPath 2.0 supports sequences instead of node-sets. I go into more detail on this in my book XPath Kick Start: Navigating XML with XPath 1.0 and 2.0.
XPath 2.0, XQuery 1.0, and XSLT 2.0 are all tied together, and XPath 2.0 is the common denominator. The W3C groups working on these standards have been working together closely. One way of looking at what's been going on is that XSLT 2.0 and XQuery 1.0 are designed to share as much as possible, and that what they share is in fact XPath 2.0.
So why XPath 2.0? What's it got that XPath 1.0 doesn't have? There are many answers, but one of the main ones is support for new data types. As you know, XPath 1.0 supports only these data types:
string
boolean
node-set
number
That was okay long ago, but things have changed. In particular, W3C has been moving toward XML Schema for its data types. Because its data types are now based on XML Schema, XPath 2.0 supports all the primitive simple types built into XML Schema. There are 19 such types in all, including many that XPath 1.0 doesn't support, such as data types for dates, URIs, and so on.
The XPath 2.0 data model also supports data types that you derive from these built-in types in your own XML schemas. We're going to see how to work with these various types ourselves.
XML Schema - If you're not familiar with XML Schema, you can get all the details at http://www.w3.org/TR/xmlschema-0/, http://www.w3.org/TR/xmlschema-1/, and http://www.w3.org/TR/xmlschema-2/. Another good resource is the book Sams Teach Yourself XML in 21 Days (ISBN: 0672325764).
XPath 2.0 also gives you far more power than XPath 1.0 did. There are dozens of new built-in functions that you can use now, and many more operators. These functions and operators are far more type-aware than their XPath 1.0 counterparts.
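Also new in XPath 2.0 are sequences, which replace the familiar node-sets from XPath 1.0. In fact, all XPath 2.0 expressions evaluate to sequences, as we're going to see. And you can also bind variables inside XPath 2.0 expressions themselves (in a for expression, for example), rather than only referring to variables supplied from outside the expression.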
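To get a feel for how a sequence differs from a node-set, here's a rough Python model (an analogy only, not an XPath processor). It assumes the behavior described in the XPath 2.0 drafts: sequences are ordered, may contain duplicates, and are always flat, so a sequence nested inside another sequence is merged into it. The function name make_sequence is made up for this illustration.

```python
from typing import Any, List


def make_sequence(*items: Any) -> List[Any]:
    """Model XPath 2.0 sequence construction in Python.

    Nested lists/tuples stand in for nested sequences and are
    flattened, because a sequence never contains another sequence.
    """
    flat: List[Any] = []
    for item in items:
        if isinstance(item, (list, tuple)):
            # A nested sequence contributes its items, in order.
            flat.extend(make_sequence(*item))
        else:
            flat.append(item)
    return flat


# Roughly analogous to the XPath 2.0 expression (1, (2, 3), (), 1):
seq = make_sequence(1, (2, 3), (), 1)
print(seq)

# A set-like model of an XPath 1.0 node-set keeps no duplicates
# and promises no particular order.
print(len(set(seq)))
```

Note that the duplicate 1 survives in the sequence and that the empty sequence simply disappears, neither of which has an equivalent in an XPath 1.0 node-set.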
The current working draft for XPath 2.0 is at http://www.w3.org/TR/xpath20/. This document tells you about XPath 2.0 in some detail, but it doesn't provide the whole story. In addition, there are documents outlining the XPath 2.0 data model (which tells you how XPath 2.0 sees an XML document), the data types used in XPath 2.0, and the functions and operators available. Here's the list:
The XPath 2.0 specification is at http://www.w3.org/TR/xpath20/.
The XPath data model defines the information in an XML document that is available to an XPath processor. The data model is defined in the XQuery 1.0 and XPath 2.0 Data Model document at http://www.w3.org/TR/xpath-datamodel/.
The library of functions and operators supported by XPath 2.0 is defined in the XQuery 1.0 and XPath 2.0 Functions and Operators document, which is at http://www.w3.org/TR/xquery-operators/.
The type system used in XPath 2.0 is based on XML Schema, which you can read all about at http://www.w3.org/TR/xmlschema-0/, http://www.w3.org/TR/xmlschema-1/, and http://www.w3.org/TR/xmlschema-2/. The data types themselves are defined in http://www.w3.org/TR/xmlschema-2/.
The formal semantics of XPath 2.0 are defined in the XQuery 1.0 and XPath 2.0 Formal Semantics document. This document is useful for programmers creating XPath processors, and you can find it at http://www.w3.org/TR/xquery-semantics/.
You still create location paths in XPath 2.0, of course, and build them from location steps. A location step, as in XPath 1.0, can contain an axis, a node test, and a predicate. The allowable axes are the same as in XPath 1.0. However, there are differences already: the namespace axis is deprecated in XPath 2.0, which means it's considered obsolete. It's included for backward compatibility, but is not available at all in XQuery 1.0.
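Here's a small sketch of a location path built from two steps, using Python's standard xml.etree.ElementTree module. ElementTree handles only a limited, XPath 1.0-style subset of path syntax (abbreviated steps on the implied child axis, plus simple predicates), so this shows the shape of a location step rather than anything specific to XPath 2.0. The sample document and its element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML = """<library>
  <book year="1999"><title>XPath Basics</title></book>
  <book year="2004"><title>XPath 2.0 Preview</title></book>
</library>"""

root = ET.fromstring(XML)

# The path below has two location steps, separated by "/":
#   book[@year='2004']  node test "book" plus the predicate [@year='2004']
#   title               node test "title", no predicate
# In this abbreviated syntax the child axis is implied in both steps.
matches = root.findall("./book[@year='2004']/title")

titles = [node.text for node in matches]
print(titles)
```

In full XPath syntax, the same path would spell out the axis in each step, as in child::book[attribute::year='2004']/child::title, but ElementTree does not accept that unabbreviated form.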
Handling Nodes
Although the data types have changed, the node kinds are more or less the same in XPath 2.0 compared to XPath 1.0. As you recall, you can have these kinds of nodes in XPath 1.0: root nodes, element nodes, attribute nodes, processing instruction nodes, comment nodes, text nodes, and namespace nodes. There is one difference in XPath 2.0, however: root nodes are now called document nodes instead, ending a long-standing confusion.
Handling Data Types
As mentioned, one of the main motivations behind XPath 2.0 was to expand the data types available. XPath 1.0 supported Booleans, node-sets, strings, and numbers, but that was pretty basic. XPath 2.0 supports all the primitive simple types built into XML Schema, as well as the types you can derive by restriction from the primitive simple types, which gives you a great deal more control over data typing. Here are the primitive simple types (the xs prefix corresponds to the namespace "http://www.w3.org/2001/XMLSchema"):
xs:string
xs:boolean
xs:decimal
xs:float
xs:double
xs:duration
xs:dateTime
xs:time
xs:date
xs:gYearMonth
xs:gYear
xs:gMonthDay
xs:gDay
xs:gMonth
xs:hexBinary
xs:base64Binary
xs:anyURI
xs:QName
xs:NOTATION
Besides these types, you can also use types derived from primitive simple types by restriction. Collectively, the primitive simple types and the types derived from them by restriction are called atomic types. XPath 2.0 sequences can contain both atomic values (instances of atomic types) and nodes.
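To make the idea of typed atomic values a little more concrete, here's a rough Python sketch that converts the text form of a few of these types into typed values. It is only an analogy: the Python parsers accept slightly different text than XML Schema does (for example, date.fromisoformat here does not handle the optional time zone that xs:date allows), and the names PARSERS, parse_boolean, and cast are invented for this illustration.

```python
import base64
from datetime import date, datetime
from decimal import Decimal


def parse_boolean(text: str) -> bool:
    """xs:boolean allows the lexical forms true, false, 1, and 0."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xs:boolean lexical form: {text!r}")


# Approximate Python stand-ins for a handful of the primitive types.
PARSERS = {
    "xs:string": str,
    "xs:boolean": parse_boolean,
    "xs:decimal": Decimal,
    "xs:double": float,
    "xs:date": date.fromisoformat,
    "xs:dateTime": datetime.fromisoformat,
    "xs:hexBinary": bytes.fromhex,
    "xs:base64Binary": base64.b64decode,
    "xs:anyURI": str,
}


def cast(type_name: str, lexical: str):
    """Turn a lexical (text) form into a typed Python value."""
    return PARSERS[type_name](lexical)


# A list standing in for a sequence of atomic values of mixed types.
values = [
    cast("xs:date", "2004-01-15"),
    cast("xs:decimal", "19.95"),
    cast("xs:boolean", "true"),
    cast("xs:anyURI", "http://www.w3.org/TR/xpath20/"),
]
for value in values:
    print(type(value).__name__, value)
```

The point is that a value such as 2004-01-15 is handled as a date rather than as a plain string, which is the kind of type awareness that XPath 1.0's four data types couldn't offer.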
