
Understanding XPath

  • May 31, 2002
  • By Kirk Allen Evans

The examples so far have used forward axes; that is, we have navigated only to nodes that are descendants of the context node. Let's look at some XPath statements that use reverse axes, which navigate up the document hierarchy. Consider the following representation of a file system with drives A, C, and D, where D holds a backup copy of the contents of the C drive. This document is shown in Listing 1. The context node, the FOLDER element on line 11, is highlighted. Note that the line numbers are shown only for explanation and are not part of the XML document.

Listing 1—An XML Representation of a File System

1 <?xml version="1.0" encoding="utf-8" ?>
2 <FILESYSTEM>
3    <DRIVE LETTER="A"/>
4    <DRIVE LETTER="C">
5     <FOLDER NAME="INETPUB">
6        <FOLDER NAME="WWWROOT">
7          <FOLDER NAME="ASPNET_CLIENT" />
8        </FOLDER>
9     </FOLDER>
10     <FOLDER NAME="Program Files">
11        <FOLDER NAME="Microsoft Visual Studio .NET">
12          <FOLDER NAME="Framework SDK">
13             <FOLDER NAME="BIN"></FOLDER>
14          </FOLDER>
15        </FOLDER>
16     </FOLDER>
17   </DRIVE>
18   <DRIVE LETTER="D">
19     <FOLDER NAME="INETPUB">
20        <FOLDER NAME="WWWROOT">
21          <FOLDER NAME="ASPNET_CLIENT" />
22        </FOLDER>
23     </FOLDER>
24     <FOLDER NAME="Program Files">
25        <FOLDER NAME="Microsoft Visual Studio .NET"/>
26     </FOLDER>
27   </DRIVE>
28 </FILESYSTEM>

Working with the preceding XML structure, we introduce the following XPath statement:

parent::*

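This XPath query translates to "retrieve the parent node of the context node" and returns the FOLDER element on line 10. A node has at most one parent, so the result contains a single node. The next statement widens the search to the context node and all of its ancestors: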

ancestor-or-self::*

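This query returns a more complex structure: the context node itself plus every element above it, namely the "Program Files" FOLDER, the DRIVE element for C, and FILESYSTEM. The result is depicted in Listing 2, where the returned nodes are highlighted.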

Listing 2—An XML Representation of a File System

<?xml version="1.0" encoding="utf-8" ?>
<FILESYSTEM>
   <DRIVE LETTER="A"/>
   <DRIVE LETTER="C">
     <FOLDER NAME="INETPUB">
        <FOLDER NAME="WWWROOT">
          <FOLDER NAME="ASPNET_CLIENT" />
        </FOLDER>
     </FOLDER>
     <FOLDER NAME="Program Files">
        <FOLDER NAME="Microsoft Visual Studio .NET">
          <FOLDER NAME="Framework SDK">
             <FOLDER NAME="BIN"></FOLDER>
          </FOLDER>
        </FOLDER>
     </FOLDER>
   </DRIVE>
   <DRIVE LETTER="D">
     <FOLDER NAME="INETPUB">
        <FOLDER NAME="WWWROOT">
          <FOLDER NAME="ASPNET_CLIENT" />
        </FOLDER>
     </FOLDER>
     <FOLDER NAME="Program Files">
        <FOLDER NAME="Microsoft Visual Studio .NET"/>
     </FOLDER>
   </DRIVE>
</FILESYSTEM>

As you can see in Listing 2, the returned nodes trace a path from the context node directly up to the document's root element, FILESYSTEM.
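The two reverse-axis queries above can be sketched in Python. This is a minimal illustration, not a full XPath engine: the standard library's ElementTree supports only a subset of XPath and has no parent:: or ancestor-or-self:: axes, so the sketch emulates them with a child-to-parent map. The helper names (parent_axis, ancestor_or_self, label) are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# The file system document from Listing 1 (XML declaration omitted).
XML = """
<FILESYSTEM>
  <DRIVE LETTER="A"/>
  <DRIVE LETTER="C">
    <FOLDER NAME="INETPUB">
      <FOLDER NAME="WWWROOT">
        <FOLDER NAME="ASPNET_CLIENT"/>
      </FOLDER>
    </FOLDER>
    <FOLDER NAME="Program Files">
      <FOLDER NAME="Microsoft Visual Studio .NET">
        <FOLDER NAME="Framework SDK">
          <FOLDER NAME="BIN"></FOLDER>
        </FOLDER>
      </FOLDER>
    </FOLDER>
  </DRIVE>
  <DRIVE LETTER="D">
    <FOLDER NAME="INETPUB">
      <FOLDER NAME="WWWROOT">
        <FOLDER NAME="ASPNET_CLIENT"/>
      </FOLDER>
    </FOLDER>
    <FOLDER NAME="Program Files">
      <FOLDER NAME="Microsoft Visual Studio .NET"/>
    </FOLDER>
  </DRIVE>
</FILESYSTEM>
"""

root = ET.fromstring(XML)

# ElementTree elements keep no reference to their parent,
# so build a child -> parent lookup once.
parent_of = {c: p for p in root.iter() for c in p}


def label(elem):
    """Return a short description such as 'FOLDER[Program Files]'."""
    name = elem.get("NAME") or elem.get("LETTER") or ""
    return f"{elem.tag}[{name}]"


def parent_axis(elem):
    """Emulate parent::* (None when elem is the root element)."""
    return parent_of.get(elem)


def ancestor_or_self(elem):
    """Emulate ancestor-or-self::*, listed nearest node first."""
    nodes = []
    while elem is not None:
        nodes.append(elem)
        elem = parent_of.get(elem)
    return nodes


# Context node: the "Microsoft Visual Studio .NET" folder on drive C (line 11).
context = root.find("./DRIVE[@LETTER='C']/FOLDER[@NAME='Program Files']/FOLDER")

print(label(parent_axis(context)))
print([label(e) for e in ancestor_or_self(context)])
```

The second call lists the context node first and FILESYSTEM last, which mirrors the upward path described for Listing 2.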

Our examples paired each axis with an asterisk. The asterisk is a wildcard that translates to "all nodes of the axis's principal node type": elements for most axes, and attributes for the attribute axis. This part of the statement is known as the node test.

Location paths can be relative or absolute. A relative location path consists of one or more location steps separated by forward slashes (/). An absolute location path consists of a forward slash optionally followed by a relative location path. In other words, a relative location path navigates relative to the context node, while an absolute location path starts at the root of the document. An absolute location path would then be:

/FILESYSTEM/DRIVE[@LETTER='C']/FOLDER[@NAME='Program Files']

With an absolute location path, the context node does not affect where evaluation starts; it only determines which document is searched.
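The difference between the two kinds of path can be sketched with Python's ElementTree. Two assumptions apply: the document is a trimmed version of Listing 1, and because ElementTree evaluates every path relative to an element, the leading /FILESYSTEM step of the absolute path is expressed by starting the search at the root element.

```python
import xml.etree.ElementTree as ET

# A trimmed version of the Listing 1 document (an assumption for brevity).
XML = """
<FILESYSTEM>
  <DRIVE LETTER="A"/>
  <DRIVE LETTER="C">
    <FOLDER NAME="INETPUB"/>
    <FOLDER NAME="Program Files">
      <FOLDER NAME="Microsoft Visual Studio .NET"/>
    </FOLDER>
  </DRIVE>
  <DRIVE LETTER="D">
    <FOLDER NAME="Program Files"/>
  </DRIVE>
</FILESYSTEM>
"""

root = ET.fromstring(XML)  # the FILESYSTEM element

# Equivalent of /FILESYSTEM/DRIVE[@LETTER='C']/FOLDER[@NAME='Program Files']:
# the search always starts at the top of the document.
absolute = root.findall("./DRIVE[@LETTER='C']/FOLDER[@NAME='Program Files']")

# The relative path "FOLDER" depends on which node is the context node.
drive_c = root.find("DRIVE[@LETTER='C']")
drive_d = root.find("DRIVE[@LETTER='D']")
from_c = [f.get("NAME") for f in drive_c.findall("FOLDER")]
from_d = [f.get("NAME") for f in drive_d.findall("FOLDER")]

print(len(absolute), from_c, from_d)
```

The same relative path yields two folders from drive C and one from drive D, while the absolute path always selects the single "Program Files" folder on drive C.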

XPath Node Test

The XPath node test does just what its name implies: it tests nodes to determine whether they meet a condition. We have already used one test, the asterisk, which specifies that all nodes on the axis should be returned. We can limit the nodes that are returned by specifying names. Using the document in Listing 1 again, suppose we want to retrieve all ancestor elements that are named "DRIVE":

ancestor::DRIVE

By specifying the node name in the node test component of the XPath statement, we limit the results to a single node: the DRIVE element on line 4.
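A name test on a reverse axis can be sketched in the same way. The code below is an illustration under assumptions: the document is trimmed from Listing 1, the ancestors helper is hypothetical, and the ancestor axis is emulated with a parent map because ElementTree does not provide it.

```python
import xml.etree.ElementTree as ET

# A trimmed version of the Listing 1 document (an assumption for brevity).
XML = """
<FILESYSTEM>
  <DRIVE LETTER="A"/>
  <DRIVE LETTER="C">
    <FOLDER NAME="Program Files">
      <FOLDER NAME="Microsoft Visual Studio .NET"/>
    </FOLDER>
  </DRIVE>
  <DRIVE LETTER="D"/>
</FILESYSTEM>
"""

root = ET.fromstring(XML)
parent_of = {c: p for p in root.iter() for c in p}


def ancestors(elem, name=None):
    """Emulate ancestor::* (name is None) or ancestor::name, nearest first."""
    found = []
    elem = parent_of.get(elem)
    while elem is not None:
        if name is None or elem.tag == name:
            found.append(elem)
        elem = parent_of.get(elem)
    return found


context = root.find(".//FOLDER[@NAME='Microsoft Visual Studio .NET']")

all_ancestors = [e.tag for e in ancestors(context)]          # ancestor::*
drives = [e.get("LETTER") for e in ancestors(context, "DRIVE")]  # ancestor::DRIVE

print(all_ancestors, drives)
```

The wildcard returns three ancestors, while the name test narrows the result to the single DRIVE element for C.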

Besides using names for node tests, we can also use node types. In Table 1, we saw that one of the axes is the attribute axis, which retrieves attributes that match the specified node test. Again using the document in Listing 1, the following statement returns the NAME attribute of the context node (highlighted in Listing 1):

attribute::NAME

If we wanted to select all attributes for the context node, we could also issue a wildcard node test:

attribute::*

So, the type of node returned depends partially on the axis specified. Attributes are not children of elements, so the following XPath statement would not return any nodes:

child::NAME

This is because there is no child element of the context node that is named "NAME". We can also use XPath functions as node tests to return certain nodes. The available node tests are listed in Table 2.
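The distinction between the attribute axis and the child axis can be sketched with ElementTree, which exposes attributes through get and attrib rather than through an axis. The document below is a trimmed version of Listing 1 (an assumption for brevity), and the comments show which XPath statement each line approximates.

```python
import xml.etree.ElementTree as ET

# A trimmed version of the Listing 1 document (an assumption for brevity).
XML = """
<FILESYSTEM>
  <DRIVE LETTER="C">
    <FOLDER NAME="Program Files">
      <FOLDER NAME="Microsoft Visual Studio .NET">
        <FOLDER NAME="Framework SDK"/>
      </FOLDER>
    </FOLDER>
  </DRIVE>
</FILESYSTEM>
"""

root = ET.fromstring(XML)
context = root.find(".//FOLDER[@NAME='Microsoft Visual Studio .NET']")

name_attr = context.get("NAME")             # attribute::NAME
all_attrs = dict(context.attrib)            # attribute::*
child_named_name = context.findall("NAME")  # child::NAME
child_folders = context.findall("FOLDER")   # child::FOLDER

print(name_attr, all_attrs, child_named_name, len(child_folders))
```

The child lookup for NAME comes back empty because NAME exists only as an attribute, whereas the lookup for FOLDER finds the one child element.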

Table 2 Available XPath Function Node Tests

Node Test                   Description
comment()                   Returns True if the matched node is a comment node.
node()                      Returns True for any matched node, regardless of its type.
processing-instruction()    Returns True if the matched node is a processing instruction.
text()                      Returns True if the matched node is a text node.
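The four node tests in Table 2 can be approximated in Python with xml.dom.minidom, which keeps comments and processing instructions in the tree. This is a sketch under assumptions: the small FOLDER document, its comment, and its backup processing instruction are invented for illustration, and the NODE_TESTS table and select_children helper are hypothetical names that filter child nodes by DOM node type rather than evaluate real XPath.

```python
from xml.dom import Node, minidom

# A made-up element containing one child of each node type.
doc = minidom.parseString(
    '<FOLDER NAME="BIN"><!-- build output --><?backup daily?>readme<FILE/></FOLDER>'
)
folder = doc.documentElement

# Each XPath node test mapped to a check on the DOM node type.
NODE_TESTS = {
    "comment()": lambda n: n.nodeType == Node.COMMENT_NODE,
    "text()": lambda n: n.nodeType == Node.TEXT_NODE,
    "processing-instruction()": lambda n: n.nodeType == Node.PROCESSING_INSTRUCTION_NODE,
    "node()": lambda n: True,  # matches every node, whatever its type
}


def select_children(elem, node_test):
    """Approximate child::<node_test> for the given element."""
    return [n for n in elem.childNodes if NODE_TESTS[node_test](n)]


comments = select_children(folder, "comment()")
texts = select_children(folder, "text()")
instructions = select_children(folder, "processing-instruction()")
everything = select_children(folder, "node()")

print(comments[0].data, texts[0].data, instructions[0].target, len(everything))
```

Here node() matches all four children, including the FILE element, while each of the other tests matches exactly one.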


Considering the analogy of an XPath statement to a SQL statement, we have looked at the equivalent in XPath to a SQL SELECT statement. Now, let's look at the equivalent to a SQL WHERE clause in XPath.

XPath Predicates

Predicates filter the node set produced by an XPath query, yielding a new node set. A predicate is evaluated as either a Boolean or a number. When it evaluates to a number, only the node at that position is kept, and positions are 1-based. Listing 3 shows the same document as Listing 1 but highlights a new context node, the DRIVE element on line 18.

Listing 3—An XML Representation of a File System

1 <?xml version="1.0" encoding="utf-8" ?>
2 <FILESYSTEM>
3    <DRIVE LETTER="A"/>
4    <DRIVE LETTER="C">
5     <FOLDER NAME="INETPUB">
6        <FOLDER NAME="WWWROOT">
7          <FOLDER NAME="ASPNET_CLIENT" />
8        </FOLDER>
9     </FOLDER>
10     <FOLDER NAME="Program Files">
11        <FOLDER NAME="Microsoft Visual Studio .NET">
12          <FOLDER NAME="Framework SDK">
13             <FOLDER NAME="BIN"></FOLDER>
14          </FOLDER>
15        </FOLDER>
16     </FOLDER>
17   </DRIVE>
18   <DRIVE LETTER="D">
19     <FOLDER NAME="INETPUB">
20        <FOLDER NAME="WWWROOT">
21          <FOLDER NAME="ASPNET_CLIENT" />
22        </FOLDER>
23     </FOLDER>
24     <FOLDER NAME="Program Files">
25        <FOLDER NAME="Microsoft Visual Studio .NET"/>
26     </FOLDER>
27   </DRIVE>
28 </FILESYSTEM>
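Both kinds of predicate can be tried against the new context node with ElementTree, which supports 1-based positional predicates and attribute comparisons. The queries below are illustrative assumptions rather than statements taken from the article, and the document is trimmed to the D drive.

```python
import xml.etree.ElementTree as ET

# The D drive from Listing 3, trimmed to its top-level folders plus one level.
XML = """
<FILESYSTEM>
  <DRIVE LETTER="D">
    <FOLDER NAME="INETPUB">
      <FOLDER NAME="WWWROOT"/>
    </FOLDER>
    <FOLDER NAME="Program Files">
      <FOLDER NAME="Microsoft Visual Studio .NET"/>
    </FOLDER>
  </DRIVE>
</FILESYSTEM>
"""

root = ET.fromstring(XML)

# Context node: the DRIVE element for D (line 18 in Listing 3).
context = root.find("DRIVE[@LETTER='D']")


def names(elems):
    """Collect the NAME attribute of each element."""
    return [e.get("NAME") for e in elems]


# Numeric predicates select by position; the first child is position 1.
first = names(context.findall("FOLDER[1]"))
last = names(context.findall("FOLDER[last()]"))

# A Boolean predicate keeps only the nodes for which the test is true.
by_name = names(context.findall("FOLDER[@NAME='Program Files']"))

print(first, last, by_name)
```

The positional predicate [1] selects INETPUB, not "Program Files", which confirms that counting starts at 1 rather than 0.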



