DatabaseXML Queries and Indexing in SQL Server 2005

XML Queries and Indexing in SQL Server 2005

Developer.com content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More.

SQL Server 2005 is the first version of Microsoft’s flagship database
to truly embrace XML. True, SQL Server 2000 has some bolt-on XML
functionality, but in SQL Server 2005 XML is pervasive. In particular, you
can use the new XML columns to actually store XML documents as part of a
table. In this article, I’ll dig into another part of the XML story:
indexing those columns for faster searching.

Storing XML in a Table

XML columns allow you to store XML documents or fragments in a SQL
Server database (an XML fragment is an XML document without the
top-level xml element. When you add an XML column to a table,
each row of the table stores an individual XML document or fragment. The
individual documents cannot exceed 2GB each. So, for example, you might
create a table of document information in a publishing-related database
that contained the actual text of documents in XML format as a single
column.

To create an XML column, you assign the xml data type to
the column, whether you’re using the CREATE TABLE statement or
creating the table with SQL Server Management Studio. Here’s an
example:


CREATE TABLE Document (
	DocID int IDENTITY NOT NULL,
	Description nvarchar(50) NOT NULL,
	DocumentStore xml NOT NULL,
 CONSTRAINT PK_Document PRIMARY KEY CLUSTERED 
 (
	DocID ASC
 )
) 

This creates a table with three columns, the third of which can store
XML documents. To insert data into an XML columns, you can use string
constants. These will be implicitly converted to XML when you perform the
insert. For example:


INSERT INTO Document (Description, DocumentStore)
VALUES('Bruce''s poem',
N'

Bruce
The cat/is flat.
')

INSERT INTO Document (Description, DocumentStore)
VALUES('Code of Hammurabi',
N'

Hammurabi
An eye for an eye, a tooth for a tooth.
')

INSERT INTO Document (Description, DocumentStore)
VALUES('Nursery Rhyme',
N'

Mother Hubbard
Jack and Jill/went up the hill.
')

Writing XML Queries

There are two methods to extract data from XML columns: the
.value method and the .query method. When
you want to extract a single piece of information from an XML
column and have it returned as a SQL data type (such as an int
or a varchar), use the .value method. Its syntax is as
follows:

value(XQuery, SQLType)

Here XQuery is an XQuery expression, and SQLType
is the name of the target SQL data type. For example:


SELECT DocumentStore.value('(/Document/@Name)[1]',
 'varchar(50)' )
FROM Document

This gives the results shown in Figure 1.

Using the .value method

As an alternative to the .value method you can use the
.query method. The .query method is used to run
an XQuery expression against an XML column, returning the result as a
value of the xml data type. Its syntax is simple:

query('XQuery')

For example:


SELECT DocumentStore.query('(/Document/Text)')
FROM Document

This gives the results shown in Figure 2.

Using the .query method

Primary XML Indexes

XML columns are stored as binary large objects (BLOBs) in your database.
Because of this, queries that search within an XML column can be slow. You
can speed up these queries by creating a primary XML index on each XML column.
A primary XML index is a special index that “shreds” the XML data to store
information including:

  • Each tag name in the XML
  • The path from the root of the document to the tag
  • The value of the node
  • The type of the node
  • The corresponding primary key of the base table

To create a primary XML index, execute the CREATE PRIMARY
XML INDEX
statement:


CREATE PRIMARY XML INDEX index_name
    ON table_name (xml_column_name)

For the table I’ve been working with, this would be:


CREATE PRIMARY XML INDEX PXML_Document
    ON Document (DocumentStore)

A primary XML index will speed up most queries against an XML column,
and it’s a good idea to create one whenever you’ll be searching XML
columns that contain large amounts of data.

Secondary XML Indexes

In addition to the primary XML index, each XML column can have up to
three secondary XML indexes. These are specialized XML indexes that
help with particular types of XML queries, and can only be created on
columns that already have a primary XML index:

  • The PATH secondary XML index helps with queries that
    use XML path expressions.
  • The VALUE secondary XML index helps with queries that
    search for values anywhere in the XML document.
  • The PROPERTY secondary XML index helps with queries
    that retrieve particular object properties from within an XML
    document.

To create a secondary XML index, you must use the CREATE XML
INDEX
statement:

CREATE XML INDEX index_name 
    ON table_name (xml_column_name)
    [USING XML INDEX xml_index_name 
        [FOR {VALUE|PATH|PROPERTY}]

Maintaining XML indexes is costly, especially if the data in the XML
column changes frequently. Avoid creating secondary XML indexes unless
you frequently perform the types of queries that they are specifically
designed to help with.

Use Your Power Wisely

A final word of caution: just because SQL Server 2005 can store
XML columns doesn’t mean that you have to put XML columns in every
database. In many cases, it will make more sense to break down the data
that might go into XML columns and make regular relational columns from it
instead. But in cases where SQL Server needs to interoperate with some
other component that already depends on XML, storing data in XML
columns can make a lot of sense. That way you can avoid doing constant
data conversions, and use a pure XML format throughout the business
process. In those cases, think carefully about your querying requirements,
create indexes as necessary, and test performance carefully. You’ll
prohably find that XML columns make your life considerably easier.

About the Author

Mike Gunderloy is the author of over 20 books and numerous articles on
development topics, and the Senior Technology Partner for Adaptive Strategy, a
Washington State consulting firm. When
he’s not writing code, Mike putters in the garden on his farm in eastern
Washington state.

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Latest Posts

Related Stories