Inferring an XML Schema from an XML Document

  • September 3, 2008
  • By Paul Kimmel

Introduction

After a couple of attempts, XML isn't that hard to write. Create a text document with matching opening and closing tags, like <Customer></Customer>, with text values in between. That's not too hard. Unfortunately, I don't write XSD (XML Schema) documents from scratch often enough for them to be easy.

An XSD document is to an XML document what a SQL schema is to a SQL object such as a table. An XML document is valid relative to a schema only if it contains the elements the schema defines, in the order and with the types the schema defines. That is, the XSD document describes what to anticipate in an XML document that matches the schema. The challenge is that an XSD document contains attributes and namespaced elements that are a little cryptic and hard to remember. Fortunately, you don't have to remember them.

The .NET Framework can infer a schema from an XML document. If you have the document, you can generate the schema. This article shows you how.

Defining an XML Document

For your purposes, any XML document will do. The XML contained in Listing 1 is an XML document containing columns from the Northwind Customers table. (It was used because it is convenient.) The XML document begins with the <?xml ?> declaration, which carries the version and encoding attributes, and the rest of the document describes the content.

Listing 1: A sample XML document containing customer information.

<?xml version="1.0" encoding="utf-8" ?>
<!--Generated XML-->
<Root>
   <Customer>
      <CustomerID>ALFKI</CustomerID>
      <CompanyName>Alfreds Futterkiste</CompanyName>
      <ContactName>Paul Kimmel</ContactName>
      <ContactTitle>Sales Representative</ContactTitle>
      <Address>Obere Str. 57</Address>
      <City>Berlin</City>
      <Region></Region>
      <PostalCode>12209</PostalCode>
      <Country>Germany</Country>
      <Phone>030-0074321</Phone>
      <Fax>030-0076541</Fax>
   </Customer>
   <Customer>
      <CustomerID>ANATR</CustomerID>
      <CompanyName>Ana Trujillo Emparedados y helados</CompanyName>
      <ContactName>Ana Trujillo</ContactName>
      <ContactTitle>Owner</ContactTitle>
      <Address>Avda. de la Constitución 2222</Address>
      <City>México D.F.</City>
      <Region></Region>
      <PostalCode>05021</PostalCode>
      <Country>Mexico</Country>
      <Phone>(5) 555-4729</Phone>
      <Fax>(5) 555-3745</Fax>
   </Customer>
</Root>

The number of records was shortened to conserve space, but the size of the document doesn't matter. This XML document (refer to Listing 1) repeats Customer elements, with each child element corresponding to a column in the Northwind Customers table.

A corresponding XSD document would need to describe the contents that one would expect in all Customer XML documents, such as the fact that the document contains repeating complex types and that each type has specific fields. The field names and types would be expressed in the XSD as well.
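To make the idea of "field names and types expressed in the XSD" concrete, here is a minimal sketch rather than the schema the framework would actually infer. It assumes a hand-written schema trimmed to two of the fields from Listing 1 (CustomerID and CompanyName), with xs:string chosen as the type for both. The module name SchemaSketch and the helper names BuildSchemaSet and FindProblems are hypothetical, and the lambda syntax assumes Visual Basic 2010 or later.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml.Linq
Imports System.Xml.Schema

Module SchemaSketch

   ' Hand-written schema for a trimmed Customer (two fields only).
   ' The xs:string types are an assumption made for illustration.
   Public Function BuildSchemaSet() As XmlSchemaSet
      Dim xsd As XElement = _
         <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
            <xs:element name="Root">
               <xs:complexType>
                  <xs:sequence>
                     <xs:element name="Customer" maxOccurs="unbounded">
                        <xs:complexType>
                           <xs:sequence>
                              <xs:element name="CustomerID" type="xs:string"/>
                              <xs:element name="CompanyName" type="xs:string"/>
                           </xs:sequence>
                        </xs:complexType>
                     </xs:element>
                  </xs:sequence>
               </xs:complexType>
            </xs:element>
         </xs:schema>

      Dim schemas As New XmlSchemaSet()
      schemas.Add("", xsd.CreateReader())   ' "" = schema has no target namespace
      Return schemas
   End Function

   ' Returns the validation messages; an empty list means the document
   ' is valid relative to the schema above.
   Public Function FindProblems(ByVal doc As XDocument) As List(Of String)
      Dim problems As New List(Of String)
      doc.Validate(BuildSchemaSet(), Sub(sender, e) problems.Add(e.Message))
      Return problems
   End Function

   Sub Main()
      Dim inOrder As XElement = _
         <Root>
            <Customer>
               <CustomerID>ALFKI</CustomerID>
               <CompanyName>Alfreds Futterkiste</CompanyName>
            </Customer>
         </Root>

      ' Same data, but the child elements are swapped, which breaks the
      ' order that the schema's xs:sequence requires.
      Dim swapped As XElement = _
         <Root>
            <Customer>
               <CompanyName>Alfreds Futterkiste</CompanyName>
               <CustomerID>ALFKI</CustomerID>
            </Customer>
         </Root>

      Console.WriteLine("In order: {0} problem(s)", _
         FindProblems(New XDocument(inOrder)).Count)
      Console.WriteLine("Swapped:  {0} problem(s)", _
         FindProblems(New XDocument(swapped)).Count)
   End Sub
End Module
```

The first document should validate cleanly, while the second should produce at least one message because xs:sequence fixes the element order. A full schema for Listing 1 would list all eleven child elements in the same way.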

Writing Code to Infer the XML Schema and Return an XDocument

The XDocument type is a new type that is part of LINQ to XML. (For more on LINQ to XML, check out my book LINQ Unleashed for C#. VB programmers shouldn't have that much trouble following the C# examples in the book.)

XDocument represents an XML document, and because XSD documents are also XML documents, an XDocument can hold a schema too. Listing 2 demonstrates how to use streams, basic IO, and System.Xml classes to get the framework to infer (figure out) what the schema should be as indicated by the XML data.

Listing 2: Inferring the XSD (schema) for the XML document in Listing 1.

Imports System.IO
Imports System.Linq
Imports System.Text
Imports System.Xml
Imports System.Xml.Linq
Imports System.Xml.Schema

Module Module1

   Sub Main()

      ' Print the schema inferred from the document in Listing 1
      Console.WriteLine(CreateXSD("..\..\Customers.xml"))
      Console.ReadLine()

   End Sub

   Public Function CreateXSD(ByVal filename As String) As XDocument

      Dim xml As XDocument = XDocument.Load(filename)
      Dim inference As XmlSchemaInference = New XmlSchemaInference

      ' Encode as UTF-8 rather than ASCII so that characters such as
      ' "é" in Listing 1 are not replaced with "?"
      Dim stream As MemoryStream = _
         New MemoryStream(Encoding.UTF8.GetBytes(xml.ToString()))
      Dim reader As XmlTextReader = New XmlTextReader(stream)
      Dim schemaSet As XmlSchemaSet = inference.InferSchema(reader)

      ' Schemas() returns an untyped ICollection; take the first schema
      Dim schema As XmlSchema = _
         schemaSet.Schemas().Cast(Of XmlSchema)().First()

      ' Write the schema out as text, then reload it as an XDocument
      Using target As TextWriter = New StringWriter()
         schema.Write(target)
         Return XDocument.Parse(target.ToString())
      End Using

   End Function
End Module
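One way to sanity-check an inferred schema is to validate the source document against it. The sketch below is an illustration under stated assumptions, not part of the listing above. It infers the schema directly from an in-memory XDocument through XDocument.CreateReader, which skips the byte-encoding step, and then counts validation problems. The module name RoundTripSketch and the helper names InferSchemas and CountProblems are hypothetical, the sample document is a trimmed stand-in for Customers.xml, and the lambda syntax assumes Visual Basic 2010 or later.

```vbnet
Imports System
Imports System.Xml
Imports System.Xml.Linq
Imports System.Xml.Schema

Module RoundTripSketch

   ' Infers a schema set directly from an in-memory document.
   Public Function InferSchemas(ByVal doc As XDocument) As XmlSchemaSet
      Dim inference As New XmlSchemaInference()
      Using reader As XmlReader = doc.CreateReader()
         Return inference.InferSchema(reader)
      End Using
   End Function

   ' Counts the validation events raised when doc is checked
   ' against the given schemas.
   Public Function CountProblems(ByVal doc As XDocument, _
                                 ByVal schemas As XmlSchemaSet) As Integer
      Dim count As Integer = 0
      doc.Validate(schemas, Sub(sender, e) count += 1)
      Return count
   End Function

   Sub Main()
      ' Trimmed stand-in for Customers.xml
      Dim root As XElement = _
         <Root>
            <Customer>
               <CustomerID>ALFKI</CustomerID>
               <PostalCode>12209</PostalCode>
            </Customer>
         </Root>
      Dim doc As New XDocument(root)

      Dim schemas As XmlSchemaSet = InferSchemas(doc)
      Console.WriteLine("Schemas inferred: {0}", schemas.Count)
      Console.WriteLine("Problems found: {0}", CountProblems(doc, schemas))
   End Sub
End Module
```

A document checked against the schema inferred from it should report no problems. Keep in mind that inference only sees the sample data: a PostalCode made up of digits is likely to be inferred as a numeric type, so a later document with a postal code containing letters would fail validation against that schema.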



