With the growing use of XML data, the need for the coexistence of relational data and XML data is also growing. The classic approach of storing XML data as physical disk files is unsuitable and tedious in many situations. No wonder modern database engines are geared to store XML documents right along with the rest of the relational data. To that end, the XML data type introduced in SQL Server 2005 is a great addition to the database engine. Prior to SQL Server 2005, developers often used VARCHAR or TEXT column types to store XML documents and fragments. Although this approach served well as far as data storage is concerned, it proved to be poor in terms of querying and manipulating the XML data. This article will give you a jump start in using the XML data type of SQL server and will teach you how XML data can be manipulated with the help of new XML Data Modification Language (XML DML).
Adding an XML Column to a Table
Before you delve any further, create a table in an SQL Server database that contains a column of type XML. Open SQL Server Management Studio and create a new database. (You also can use any existing database if you so wnt.) Then, create a new table, named EmployeesAsXml, inside the database. The EmployeesAsXml table is supposed to store details of employees in XML format. This table is shown in Figure 1:
Figure 1: Adding a column with XML data type
The table has two columns: Id and EmployeeData. The Id column is an identity column and acts as the primary key column. The EmployeeData column is of type XML. Choosing the data type as XML will allow you to store and retrieve XML documents or fragments in this column.
Storing XML Data in the Table
Now that you have created EmployeesAsXml table with XML column, you can learn to store and modify XML data into it.
Assume that you have an XML fragment, as shown below, that represents details of one employee.
<employee employeeid="1"> <firstname>Nancy</firstname> <lastname>Davolio</lastname> </employee>
The employeeid attribute represents a unique ID of an employee. The firstname and lastname tags represent the first name and last name of the employee, respectively.
When performing INSERT or UPDATE operations on the XML data type, you use the same INSERT and UPDATE statements of SQL. The example in Figure 2 will make it clear:
Figure 2: Inserting data in an XML column
Here, you supply the XML fragment to be inserted as a string into the INSERT statement. There also is an alternative. See the query in Figure 3:
Figure 3: Inserting data via an XML variable
In the example shown in Figure 3, you first declare a T-SQL variable of type XML (@xmldata) and then store the XML fragment into it. In the INSERT statement, you then used the XML variable.
The UPDATE on the XML column follows a similar pattern (see Figure 4).
Figure 4: Updating the XML column
In the example provided in Figure 4, you used the XML data type for the @xmldata variable. What if you receive the XML fragment as a plain string (VARCHAR)? In such cases, it would be wise to type cast the string into an XML data type by using the CONVERT() function. Figure 5 will make this approach clear.
Figure 5: Using the CONVERT() function to convert a string to an XML data type
Carefully observe the use of the CONVERT() function. The CONVERT() function accepts three parameters. The first parameter indicates the target data type (XML in your case). The second parameter indicates the source data (@plainstring variable in your case). Finally, the third parameter indicates that the insignificant white spaces will be discarded from the XML fragment. Note that using XML data type is recommended over using the VARCHAR type because the XML data type checks for well formedness of the XML fragment.
Modifying Parts of the XML Data
In the above examples we inserted or updated an entire record in the EmployeesAsXml table. What if we wish to modify a part of the XML fragment already stored? Luckily, the XML data type provides what is known as XML Data Modification Language (XML DML). XMl DML statements allow you to insert, update or delete data from the XML fragment stored in the xml column.
Now, say you want to insert another <employee> element in addition to what you originally inserted (with employee ID 1). To accomplish this, you will need to execute the following XML DML statement:
Figure 6: Inserting nodes in existing the XML data
Observe the above UPDATE statement carefully. It invokes the modify() method of the XML data type. The modify() method accepts a string that instructs whether the data is to be inserted, updated, or deleted followed by the actual data. The above example inserts a new <employee> node after the first employee node. Notice the insert and after clauses closely. The insert clause itself appears inside the string value followed by an <employee> node to be inserted. The after clause specifies the location where the new node is to be inserted. Just like the after clause you can also use before, as first and as last clauses.
Now, assume that you want to substitute the employeeid attribute of an <employee> element with a value of 100. To do this, you need to issue the statement shown in Figure 7.
Figure 7: Updating existing XML data
Here, you used a replace value of a statement that specifies the markup to be replaced (employeeid attribute in your case) and the new value that is to be substituted (100 in your case). You set the employeeid attribute of the first <employee> element to 100 in the example above.
To delete an <employee> node, you will need to use the delete statement as shown in Figure 8:
Figure 8: Deleting existing XML content
Here, the delete statement specifies that the <employee> node with employeeid value of 1 be deleted from the XML fragment.
As you might have noticed by now, the XML DML heavily uses XQuery syntax. XQuery is a W3C proposed standard that deals with XML queries. The syntax of XQuery is based on XPath syntax.
Querying XML Data
In the preceding examples, you dealt with data insertion and modification operations. Equally important are query operations. Just like the Modify() method, the XML data type has several methods that allow you to query data. Some of the commonly used methods of the XML data type are listed below:
- query(): Queries the XML data based on XQuery expression and fetches the results of the query
- value(): Queries the XML data and returns a scalar value as a result
- exist(): Indicates whether a given XQuery expression returned any results
Now, you can see the usage of all the above methods.
Using the query() method
The query() method accepts an XQuery expression on the basis of which data is to be filtered and returns the results as XML. The following example (see Figure 9) fetches an employee whose employeeid attribute is 1.
Figure 9: Fetching XML elements
Here, you use a SELECT statement along with the query() method. The query() method filters only the <employee> node where the employeeid attribute value is 1. Figure 10 presents the results of executing the above query (formatted for better readability):
Figure 10: Results of the query() method
Using the value() method
The value() method accepts an XQuery expression and returns a scalar value as the result Consult Figure 11.
Figure 11: Fetching single values from XML data
The value() method takes two parameters: the XQuery expression returns a scalar value and data type of the returned scalar value. You specify that the firstname element value of the first node be returned and its data type as varchar(255). If you execute the above query, you will get Nancy as the return value.
Using the exist() method
The exist() method accepts an XQuery expression and returns a value indicating whether any matching node was found. If the matching node is found, it returns 1; otherwise, it returns 0. If the column contains NULL, the value() method also returns NULL. The example shown in Figure 12 checks whether any employee node with the employeeid attribute equal to 1 exists.
Figure 12: Checking if an XML node exists
Attaching an XSD Schema
In preceding examples, your XML markup was not validated against any XSD schema. In many cases, using XSD schema can be advantageous. Some of the advantages include:
- Schemas allow you to detect errors in the XML data early in the life cycle of the application.
- It allows you to validate the new XML data being inserted to the XML column.
- It also can improve the performance of query execution.
To attach a schema to an XML column, first of all you need to create a Schema Collection. This is done as shown in Figure 13:
Figure 13: Adding a schema to schema collection
Here, you used a CREATE XML SCHEMA COLLECTION statement to create a schema collection named EmployeeSchemas. What follows is the actual schema that you want to attach. Once you create the schema collection, you can attach the schema as shown in Figure 14:
Figure 14: Attaching a schema to an XML column
Have a look at the XML Type Specification property for the EmployeeData column. Your schema collection EmployeeSchema is listed there. This is how you can attach an XSD schema to the XML column.
In this article, you were introduced to the basics of using the XML data type. The XML data type is a convenient way of storing XML documents and fragments in the SQL Server database itself. The XML DML allows you to query and modify the XML data in a variety of ways. To do so, XML DML heavily relies on XQuery syntax. The methods such as modify(), query(), value(), and exist() enable you to alter and query your XML data in variety of ways. The XML data also can have a schema attached with it that helps you to ensure validity of the XML data being stored.
About the Author
Bipin Joshi is the proprietor of BinaryIntellect Consulting, where he conducts premier training programs on .NET technologies. He wears many hats, including software consultant, mentor, prolific author, webmaster, Microsoft MVP, and member of ASPInsiders. Having adopted the Yoga way of life, Bipin also teaches Kriya Yoga to interested individuals. His detailed profile can be read on his blog. He also can be reached there.