
Understanding XPath

  • May 31, 2002
  • By Kirk Allen Evans

Position and Predicates

Using a predicate, we can return the FOLDER element on line 19 using the following XPath:

child::*[position() = 1]

We can also use an abbreviated syntax to specify the same result:

*[1]

Besides testing a numeric position relative to the context node, we can use predicates to express more complex Boolean tests. The modulus operator, mod, is a common way to test whether a value is even or odd. We can retrieve the even-numbered child elements of the context node:

child::*[position() mod 2 = 0]

To return only the last node, we can use the XPath function last(), which returns the size of the context (here, the number of child elements), and select the node whose position equals it:

child::*[position()=last()]
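The three position predicates above can be sketched in Python. This is only an illustration under stated assumptions: the XML below is a hypothetical stand-in for Listing 3 (the original listing is not reproduced here), and because the standard library's ElementTree implements only a subset of XPath, the predicates are emulated with 1-based positions rather than evaluated by an XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3; folder names are illustrative only.
DOC = """<FILESYSTEM>
  <DRIVE LETTER="D">
    <FOLDER NAME="INETPUB"/>
    <FOLDER NAME="TEMP"/>
    <FOLDER NAME="BACKUP"/>
    <FOLDER NAME="LOGS"/>
  </DRIVE>
</FILESYSTEM>"""

context = ET.fromstring(DOC).find("DRIVE")   # context node: the DRIVE element
children = list(context)                     # child::* in document order

# XPath positions are 1-based, so pair each child with its position.
positioned = list(enumerate(children, start=1))
size = len(children)                         # what last() returns for this step

# child::*[position() = 1]
first = [c.get("NAME") for pos, c in positioned if pos == 1]
# child::*[position() mod 2 = 0]
even = [c.get("NAME") for pos, c in positioned if pos % 2 == 0]
# child::*[position() = last()]
last = [c.get("NAME") for pos, c in positioned if pos == size]

print(first, even, last)
```

With a full XPath 1.0 processor, the expressions in the comments could be passed to the engine directly; the list comprehensions only mirror their meaning.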

Besides complex predicates, we can also specify complex location steps using axes, node tests, and predicates. If we wanted to find out all the drives on the current machine using the document in Listing 3, we could issue the following:

parent::FILESYSTEM/child::DRIVE

Abbreviated Location Path Syntax

Because XPath statements can become quite verbose, XPath also defines an abbreviated syntax. The child axis is the default, so child:: can be omitted, and the preceding XPath query is equivalent to:

parent::FILESYSTEM/DRIVE
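As a rough sketch of what this location path does, the following Python walks from a DRIVE context node up to its FILESYSTEM parent and back down to every DRIVE child. The document is a hypothetical stand-in for Listing 3, and the parent_of map is our own helper: ElementTree elements do not keep a reference to their parent, so the parent axis is emulated here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3.
DOC = """<FILESYSTEM>
  <DRIVE LETTER="C"/>
  <DRIVE LETTER="D"/>
</FILESYSTEM>"""

root = ET.fromstring(DOC)

# Helper (not part of ElementTree): map every child element to its parent.
parent_of = {child: parent for parent in root.iter() for child in parent}

context = root.find("DRIVE[@LETTER='D']")    # context node: one DRIVE element

# parent::FILESYSTEM -- take the parent, keep it only if the name test matches.
parent = parent_of[context]
# .../DRIVE -- then select the DRIVE children of that parent.
drives = parent.findall("DRIVE") if parent.tag == "FILESYSTEM" else []

letters = [d.get("LETTER") for d in drives]
print(letters)
```

Note that the result includes the context node itself, since it is also a DRIVE child of FILESYSTEM.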

Another abbreviation uses a leading forward slash (/) to denote the root node of the document containing the context node. This was previously explained as an absolute location path. As an example, this XPath statement returns the FOLDER element on line 19:

/FILESYSTEM/DRIVE[@LETTER='D']/FOLDER[@NAME='INETPUB']
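A minimal sketch of the same query using Python's standard library follows. The XML is a hypothetical stand-in for Listing 3. ElementTree evaluates paths relative to the element they are called on, so the leading /FILESYSTEM step is expressed by starting at the document element and checking its name.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3.
DOC = """<FILESYSTEM>
  <DRIVE LETTER="C">
    <FOLDER NAME="INETPUB"/>
  </DRIVE>
  <DRIVE LETTER="D">
    <FOLDER NAME="INETPUB">
      <FOLDER NAME="WWWROOT"/>
    </FOLDER>
    <FOLDER NAME="TEMP"/>
  </DRIVE>
</FILESYSTEM>"""

root = ET.fromstring(DOC)                    # document element

# /FILESYSTEM -- the absolute path only matches if the document element
# is named FILESYSTEM.
matches = []
if root.tag == "FILESYSTEM":
    # /DRIVE[@LETTER='D']/FOLDER[@NAME='INETPUB'], relative to the root element
    matches = root.findall("./DRIVE[@LETTER='D']/FOLDER[@NAME='INETPUB']")

# Only the INETPUB folder on drive D is selected, not the one on drive C.
child_names = [f.get("NAME") for f in matches[0]]
print(len(matches), child_names)
```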

Using two forward slashes in succession (//) indicates that the entire document should be searched recursively. Developers used to UNC notation for directory paths often misread this syntax. The // notation is useful where an element pattern may occur anywhere in the current document, but it is rarely used in this context.

//FOLDER[@NAME="INETPUB"]/FOLDER[@NAME="WWWROOT"]/FOLDER

This notation, while seemingly simple, becomes quite involved when dissected. We begin by searching the entire document for an element called FOLDER that has a child named FOLDER and a grandchild named FOLDER. We further limit the matches by specifying the value of the NAME attribute for the first two FOLDER elements. Finally, we return all matching grandchild FOLDER elements. This example would return the FOLDER elements on lines 7 and 21. Note that this is not the same as parent::FILESYSTEM/DRIVE, where we limit the search to a specified path rather than the entire document.
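The difference between a recursive search and a fixed path can be sketched with ElementTree, which supports this subset of XPath when the path is written relative to the document element (.// in place of //). The XML is a hypothetical stand-in for Listing 3, with the same INETPUB/WWWROOT pattern placed on two drives.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3; SITE1, SITE2 and UPLOADS are made-up names.
DOC = """<FILESYSTEM>
  <DRIVE LETTER="C">
    <FOLDER NAME="INETPUB">
      <FOLDER NAME="WWWROOT">
        <FOLDER NAME="SITE1"/>
      </FOLDER>
    </FOLDER>
  </DRIVE>
  <DRIVE LETTER="D">
    <FOLDER NAME="INETPUB">
      <FOLDER NAME="WWWROOT">
        <FOLDER NAME="SITE2"/>
      </FOLDER>
      <FOLDER NAME="FTPROOT">
        <FOLDER NAME="UPLOADS"/>
      </FOLDER>
    </FOLDER>
  </DRIVE>
</FILESYSTEM>"""

root = ET.fromstring(DOC)

# //FOLDER[@NAME="INETPUB"]/FOLDER[@NAME="WWWROOT"]/FOLDER
# Searches the whole document, so the pattern matches on every drive.
anywhere = root.findall(
    ".//FOLDER[@NAME='INETPUB']/FOLDER[@NAME='WWWROOT']/FOLDER")

# The same pattern pinned to one specific path matches on drive D only.
pinned = root.findall(
    "./DRIVE[@LETTER='D']/FOLDER[@NAME='INETPUB']/FOLDER[@NAME='WWWROOT']/FOLDER")

anywhere_names = [f.get("NAME") for f in anywhere]
pinned_names = [f.get("NAME") for f in pinned]
print(anywhere_names, pinned_names)
```

The UPLOADS folder is never returned because its parent is named FTPROOT, not WWWROOT.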

Attributes and Predicates

We have seen examples of using attributes in predicates, but have not formally addressed attributes. Attributes can be retrieved using the attribute axis or its abbreviated syntax, the at symbol (@). Referring to Listing 3 again, where the context node is the element on line 18, we can retrieve all attributes where the name of the attribute is LETTER:

attribute::*[name()='LETTER']

This can be simplified by naming the attribute in the node test, which selects only the LETTER attribute and no other attributes:

attribute::LETTER

This syntax can be abbreviated further using the at symbol:

@LETTER
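The two styles of attribute selection can be mirrored in Python. ElementTree does not model attributes as nodes, so this sketch emulates the attribute axis with the element's attrib dictionary. The XML is a hypothetical stand-in for Listing 3, and the LABEL attribute is an assumption added only so there is more than one attribute to filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3; LABEL is an invented second attribute.
DOC = """<FILESYSTEM>
  <DRIVE LETTER="D" LABEL="DATA"/>
</FILESYSTEM>"""

context = ET.fromstring(DOC).find("DRIVE")   # context node: the DRIVE element

# attribute::*[name()='LETTER']
# Visit every attribute, then keep those whose name is LETTER.
by_scan = [value for name, value in context.attrib.items() if name == "LETTER"]

# attribute::LETTER, or @LETTER
# Ask for the LETTER attribute directly.
direct = context.get("LETTER")

# An attribute that does not exist yields an empty result (None here).
missing = context.get("SIZE")

print(by_scan, direct, missing)
```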

XPath Functions

We have mentioned Boolean expressions in the context of predicates, but let's look at how we can leverage Boolean expressions in predicates. XPath 1.0 defines 27 core functions relating to strings, numbers, node-sets, and Booleans. Without listing them all here, we will focus on the Boolean function not(). The not() function returns true if its argument is false, and false if its argument is true. Let's see what this really means by looking at an example. Here, we will select ourselves only if we contain an attribute named LETTER.

self::*[@LETTER]

What if we wanted to select ourselves only if we did not contain an attribute named LETTER? One way is to use the XPath function not().

self::*[not(@LETTER)]

This statement can be misleading, so let's think about what is really being queried. It would be easy to misinterpret this statement as "return the context node's children that are not an attribute named LETTER." Recall from Table 1 that the self axis returns the context node. So, we actually return the DRIVE element if the predicate matches. The not() function tests whether a LETTER attribute is absent. If the LETTER attribute is present, the predicate evaluates to false, and the context node is not selected.
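To make the behavior of the two self-axis queries concrete, here is a small Python sketch. The function names are our own, the XML is a hypothetical stand-in for Listing 3, and the predicates are emulated with attribute lookups because ElementTree has no not() function. Each function returns a list to mimic a node-set: it holds the context node when the predicate is true and is empty otherwise.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3.
DOC = """<FILESYSTEM>
  <DRIVE LETTER="D">
    <FOLDER NAME="INETPUB"/>
  </DRIVE>
</FILESYSTEM>"""

root = ET.fromstring(DOC)
drive = root.find("DRIVE")            # has a LETTER attribute
folder = drive.find("FOLDER")         # has no LETTER attribute


def self_with_letter(node):
    """Emulates self::*[@LETTER]: the node itself if it has LETTER."""
    return [node] if "LETTER" in node.attrib else []


def self_without_letter(node):
    """Emulates self::*[not(@LETTER)]: the node itself if it lacks LETTER."""
    return [node] if "LETTER" not in node.attrib else []


# The result is always the context node or nothing, never its children
# and never the attribute.
print(self_with_letter(drive) == [drive])        # predicate true
print(self_without_letter(drive) == [])          # predicate false
print(self_without_letter(folder) == [folder])   # predicate true
```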

XPath functions cannot be used as location paths by themselves. For instance, the following is not legal where a location path is expected:

not(@LETTER)

This is because a location path must evaluate to a node-set, whereas not() returns a Boolean. In other words, we omitted two parts of the location step, the axis and the node test, and skipped right to the predicate.

Why Use XPath?

XPath is used in conjunction with XPointer and XSLT for searching documents. XPath statements can also be used on their own against a Document Object Model (DOM) representation of an XML document. Chapter 6 of XML and ASP.NET, "Exploring the System.Xml Namespace", delves deeper into XPath in the .NET Framework and looks at how XPath statements can be issued using the System.Xml.XPath namespace. Chapter 5 of XML and ASP.NET, "MSXML Parser", also shows how to use the MSXML Parser to query using XPath statements.

This article is an excerpt from the book XML and ASP.NET (ISBN:073571200X), written by Kirk Allen Evans, Ashwin Kamanna, and Joel Mueller, published by New Riders Publishing.

Kirk Allen Evans has been developing applications for over 10 years. His focus remains developing complex, distributed systems using Microsoft technologies. Kirk also developed and maintains vbdna.net, a website devoted to demonstrating techniques for developing distributed systems using Visual Basic.




