March 2, 2021
Hot Topics:

What's New in XPath 2.0?

  • By Steven Holzner
  • Send Email »
  • More Articles »

Working with Sequences

Every XPath 2.0 expression (that is, anything an XPath processor can evaluate, including expressions that return nodes from a document or string values and so on) evaluates to a sequence. Here's the XPath 2.0 definition of a sequence:

  • A sequence is an ordered collection of zero or more items.

  • An item is either an atomic value or a node.

  • An atomic value is a value in the value space of an XML Schema atomic type, as defined in the XML Schema specification. Atomic values can either be simple primitive types, or be derived by restriction from these types.

  • A node is one of the seven node kinds described in the XQuery 1.0 and XPath 2.0 Data Model document.

  • A sequence containing exactly one item is called a singleton sequence. An item is identical to a singleton sequence containing that item.

  • A sequence containing zero items is called an empty sequence.

Sequences can contain nodes or atomic values. As we've seen, an atomic value is a value of one of the 19 built-in simple primitive data types defined in the XML schema specification, or a type derived from them by restriction.

Sequences are the successor to node-sets—besides nodes, they also let you work with simple data items. The term "sequence" is really a catch-all way to refer to data you can work with in XPath 2.0, either an atomic value or a node, or a collection of such items. Sequences can be made up of a single item or multiple items; it's all the same to XPath 2.0. Giving them one name, sequence is an easy way to let you handle single or multiple items (even though the term "sequence" is not very apt for single items).

Sequences can be constructed with this kind of syntax: (1, 2, 3), which is a sequence of the atomic values 1, 2, and 3. In fact, the comma is an operator in XPath 2.0—the sequence construction operator. You can also extract items from a sequence using the [] operator. Here's an example:

(4, 5, 6)[2]

This expression returns the value 5. You can also use the range operator, to, to create sequences, as in this example:

(1 to 1000)

Note that you cannot nest sequences—that is, if you have a sequence (1, 2) and then try to nest that in another sequence as ((1, 2), 3), the result is simply the sequence (1, 2, 3).

Sequences are also ordered, which is different from node-sets in XPath 1.0. For example, take a look at this sequence:

(//planet/mass, //planet/name)

Here, we're creating a sequence in which <mass> elements from our planetary data XML document come before <name> elements—which is the opposite of the way these elements appear in actual document order. But the order of these elements as we've specified them is preserved in the sequence we're creating here.

Here's another way in which XPath 2.0 differs from 1.0—sequences, unlike node-sets, can have duplicate items. For example, take a look at this sequence:

(//planet/mass, //planet/name, //planet/mass)

Here, we're creating a sequence of all <mass> elements, followed by all <name> elements—followed by all <mass> elements again. This is legal in sequences, but not in node-sets. (In fact, the very definition of XPath 1.0 node-sets precludes duplicate items.)

Ordered Versus Unordered Sequences - Here's something to know behind the scenes about sequences versus node-sets. W3C wanted to make life a little easier for people moving from XPath 1.0 to 2.0, so the way sequences are constructed is designed to be somewhat node-set friendly.

Although node-sets are unordered, node-sets are usually constructed in document order. XSLT 2.0 is designed to work on sequences in sequence order, but in order to be compatible with XPath 1.0, path expressions are designed to always return their results using document order by default.

Also, duplicates are removed from the results by default, which means the sequence you get from a path expression is usually going to be the same as the node-set you'd get.

So that's what sequences are all about in general—instead of only supporting one multiple-item construct, the node-set, XPath 2.0 supports sequences, which can contain multiple simple-typed data items as well as nodes.

The for Expression

Sequences are more than just a new concept—XPath 2.0 is really centered around them. There are whole new expressions designed to work with sequences, such as the for expression. This expression is designed to let you handle sequences by looping, or iterating, over all items in a sequence.

Here's a preview that also puts XPath 2.0 variables to work. Say that you wanted to find the average planetary mass in our planets example. Doing that with the for expression is easy—here's what that might look like:

for $variable in /planets/planet return $variable/mass 

Notice what we're doing here—we're using the for expression to loop over all <mass> values. We do that with a variable, something new for us in XPath, named $variable. Variables in XPath 2.0 start with a $ preceding a normal XML-legal name, so you can use any legal XML name here, like $var, $numberProducts, $name, and so on.

We're using the path expression /planets/planet to return a sequence holding all <planet> elements in the document. How do we return the <mass> elements of these <planet> elements in a sequence? We can use the return keyword, as you see here. In this case, the expression we want to return each time through the loop is $variable/mass, and because $variable holds a new <planet> element each time through the loop, we'll get a sequence of all <mass> elements this way.

To get the average mass of the planets, you could use the avg function this way:

avg(for $variable in /planets/planet return $variable/mass)

Note that you could also write our for expression as

for $variable in /planets/planet/mass return $variable 

This does the same thing that the expression /planets/planet/mass does—it returns a sequence of <mass> elements. Here's another example, where we're multiplying the miles per gallon of a number of cars by their fuel capacity to get their total operating ranges:

for $variable in /cars return $variable/milesPerGallon * $variable/gasCapacity 

That's how the for expression works in general, like this:

for variable in sequence return expression 

The if Expression

Besides the for expression, you can now use the conditional if expression in XPath 2.0. Being able to use conditional expressions like if and loop expressions like for in XPath adds a lot of the programming power of true programming languages to XPath 2.0.

Here's an example of an if expression, which finds the minimum of two temperatures (which you can also do with the XPath 2.0 min function):

if ($temperature1 < $temperature2) then $temperature1 else $temperature2 

Here, we're comparing the value in $temperature1 to the value in $temperature2. If $temperature1 holds a value that is less than the value in $temperature2, this if expression returns the value in $temperature1; otherwise, it returns to the value in $temperature2.

This has the feel of a true programming language, and there's a lot more power here than in XPath 1.0. Now, you're allowed to branch to different expressions based on a test expression.

Page 2 of 3

This article was originally published on April 23, 2004

Enterprise Development Update

Don't miss an article. Subscribe to our newsletter below.

Thanks for your registration, follow us on our social networks to keep up-to-date