
Creating Valid XML Documents: DTDs

  • January 16, 2004
  • By Steven Holzner
Allowing Choices

DTDs can support choices. A choice specifies that exactly one item from a group of items must appear. For example, if you want to specify that one (and only one) of either <x>, <y>, or <z> will appear, use a choice like this:

(x | y | z)

Listing 3 shows an example of using choices in a document. In that example, each <project> element is allowed to contain either a <price> element or a <discountprice> element. To allow that, you only need to make this change to the DTD (as well as declare the new <discountprice> element):

<!ELEMENT project (product, id, (price | discountprice))> 

Listing 3 A Sample XML Document That Uses Choices in a DTD

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE document [ 
<!ELEMENT document (employee)*> 
<!ELEMENT employee (name, hiredate, projects)> 
<!ELEMENT name (lastname, firstname)> 
<!ELEMENT lastname (#PCDATA)> 
<!ELEMENT firstname (#PCDATA)> 
<!ELEMENT hiredate (#PCDATA)> 
<!ELEMENT projects (project)*> 
<!ELEMENT project (product, id, (price | discountprice))> 
<!ELEMENT product (#PCDATA)> 
<!ELEMENT id (#PCDATA)> 
<!ELEMENT price (#PCDATA) > 
<!ELEMENT discountprice (#PCDATA)> 
]> 
<document>
  <employee>
    <name>
      <lastname>Kelly</lastname>
      <firstname>Grace</firstname>
    </name>
    <hiredate>October 15, 2005</hiredate>
    <projects>
      <project>
        <product>Printer</product>
        <id>111</id>
        <discountprice>$111.00</discountprice>
      </project>
      <project>
        <product>Laptop</product>
        <id>222</id>
        <price>$989.00</price>
      </project>
    </projects>
  </employee>
    .
    .
    .
  <employee>
    <name>
      <lastname>Gable</lastname>
      <firstname>Clark</firstname>
    </name>
    <hiredate>October 25, 2005</hiredate>
    <projects>
      <project>
        <product>Keyboard</product>
        <id>555</id>
        <price>$129.00</price>
      </project>
      <project>
        <product>Mouse</product>
        <id>666</id>
        <discountprice>$25.00</discountprice>
      </project>
    </projects>
  </employee>
</document>

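You can also use the +, *, and ? operators with choices. For example, to allow multiple discount prices while still insisting that at least one element from the choice appear in the XML document, you can do something like this:

<!ELEMENT project (product, id, (price | discountprice+))> 

Here each project must end with either exactly one <price> element or one or more <discountprice> elements. Note that the + is applied to discountprice itself; using * there instead would let the choice match no elements at all.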
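As a minimal sketch of how an occurrence operator combines with a choice, the following self-contained document assumes the content model (price | discountprice+), that is, one regular price or one or more discount prices. The <projects> root element and the second discount price are illustrative values, not taken from Listing 3.

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE projects [
<!ELEMENT projects (project)*>
<!-- Assumed model: exactly one price, or one or more discountprice elements -->
<!ELEMENT project (product, id, (price | discountprice+))>
<!ELEMENT product (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT discountprice (#PCDATA)>
]>
<projects>
  <!-- Matches the first branch of the choice: a single regular price -->
  <project>
    <product>Laptop</product>
    <id>222</id>
    <price>$989.00</price>
  </project>
  <!-- Matches the second branch: two discount prices (second value is illustrative) -->
  <project>
    <product>Printer</product>
    <id>111</id>
    <discountprice>$111.00</discountprice>
    <discountprice>$99.00</discountprice>
  </project>
</projects>
```

Under this model, a <project> with neither element, or with both a <price> and a <discountprice>, should be reported as invalid by a validating XML processor.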

As you can see, there are plenty of options available when it comes to specifying elements or text content in DTDs (although XML schemas allow us to be even more precise, specifying numeric formats for numbers and so on). But what if we want a content model to let an element contain both elements and text? That's coming up next.

Allowing Mixed Content

When using a DTD, you can allow an element to contain text as well as child elements, giving it a mixed content model. A mixed content model lists #PCDATA first, followed by the names of the allowed child elements, in a choice that ends with *. Under such a model, text and the listed child elements can appear in any order and any number of times, so a DTD cannot restrict an element to holding only text or only child elements. For example, if <product> is declared with the mixed content model (#PCDATA | stocknumber)*, the following is valid:

<product>
  Keyboard
  <stocknumber>1113</stocknumber>
</product>

To set this up, we treat #PCDATA as we would any element name in a DTD choice. Listing 4 shows an example of this; in this example, the <product> element is declared so that it can have text content, contain a <stocknumber> element, or both.

Listing 4 A Sample XML Document That Uses a Mixed Content Model

<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [ 
<!ELEMENT document (employee)*> 
<!ELEMENT employee (name, hiredate, projects)> 
<!ELEMENT name (lastname, firstname)> 
<!ELEMENT lastname (#PCDATA)> 
<!ELEMENT firstname (#PCDATA)> 
<!ELEMENT hiredate (#PCDATA)> 
<!ELEMENT projects (project)*> 
<!ELEMENT project (product, id, price)> 
<!ELEMENT product (#PCDATA | stocknumber)*> 
<!ELEMENT id (#PCDATA)> 
<!ELEMENT price (#PCDATA)> 
<!ELEMENT stocknumber (#PCDATA)> 
]> 
<document>
  <employee>
    <name>
      <lastname>Kelly</lastname>
      <firstname>Grace</firstname>
    </name>
    <hiredate>October 15, 2005</hiredate>
    <projects>
      <project>
        <product>
          <stocknumber>1111</stocknumber>
        </product>
        <id>111</id>
        <price>$111.00</price>
      </project>
      <project>
        <product>
          Laptop
        </product>
        <id>222</id>
        <price>$989.00</price>
      </project>
    </projects>
  </employee>
    .
    .
    .
  <employee>
    <name>
      <lastname>Gable</lastname>
      <firstname>Clark</firstname>
    </name>
    <hiredate>October 25, 2005</hiredate>
    <projects>
      <project>
        <product>
          <stocknumber>1113</stocknumber>
        </product>
        <id>555</id>
        <price>$129.00</price>
      </project>
      <project>
        <product>Mouse</product>
        <id>666</id>
        <price>$25.00</price>
      </project>
    </projects>
  </employee>
</document>

There are plenty of restrictions when we use a mixed content model like this in a DTD. #PCDATA must be listed first in the choice, the group must end with *, and we cannot specify the order of the child elements or apply the +, *, or ? operators to individual items in the group. In fact, there's usually very little reason to use mixed content models at all in XML. We're almost always better off being consistent and declaring a new element that can contain our text data than using a mixed content model.
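One way to follow that advice, sketched here under the assumption of a hypothetical <productname> element and a simplified <projects> root (neither appears in Listing 4), is to wrap the text in its own element so that <product> has an element-only content model:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE projects [
<!ELEMENT projects (project)*>
<!ELEMENT project (product, id, price)>
<!-- Element-only content: exactly one productname or one stocknumber -->
<!ELEMENT product (productname | stocknumber)>
<!-- productname is a hypothetical element that holds the text data -->
<!ELEMENT productname (#PCDATA)>
<!ELEMENT stocknumber (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT price (#PCDATA)>
]>
<projects>
  <project>
    <product>
      <stocknumber>1111</stocknumber>
    </product>
    <id>111</id>
    <price>$111.00</price>
  </project>
  <project>
    <product>
      <productname>Laptop</productname>
    </product>
    <id>222</id>
    <price>$989.00</price>
  </project>
</projects>
```

Because <product> now uses an ordinary choice, the DTD can require exactly one of the two children, which a mixed content model cannot express.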

Allowing Empty Elements

Elements don't need to have any content at all, of course; they can be empty. As you would expect, you can support empty elements by using DTDs. In particular, you can create an empty content model with the keyword EMPTY, like this:

<!ELEMENT intern EMPTY> 

This declares an empty element named intern, written <intern/>, that you can use to indicate that an employee is an intern. Listing 5 shows this new empty element at work. As you can see, this example allows each <employee> element to contain an <intern/> element and makes that element optional.

Listing 5 A Sample XML Document That Uses an Empty Element

<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [ 
<!ELEMENT document (employee)*> 
<!ELEMENT employee (intern?, name, hiredate, projects)> 
<!ELEMENT name (lastname, firstname)> 
<!ELEMENT lastname (#PCDATA)> 
<!ELEMENT firstname (#PCDATA)> 
<!ELEMENT hiredate (#PCDATA)> 
<!ELEMENT projects (project)*> 
<!ELEMENT project (product, id, price)> 
<!ELEMENT product (#PCDATA)> 
<!ELEMENT id (#PCDATA)> 
<!ELEMENT price (#PCDATA)> 
<!ELEMENT intern EMPTY> 
]> 
<document>
  <employee>
    <intern/>
    <name>
      <lastname>Kelly</lastname>
      <firstname>Grace</firstname>
    </name>
    <hiredate>October 15, 2005</hiredate>
    <projects>
      <project>
        <product>Printer</product>
        <id>111</id>
        <price>$111.00</price>
      </project>
      <project>
        <product>Laptop</product>
        <id>222</id>
        <price>$989.00</price>
      </project>
    </projects>
  </employee>
    .
    .
    .
  <employee>
    <intern/>
    <name>
      <lastname>Gable</lastname>
      <firstname>Clark</firstname>
    </name>
    <hiredate>October 25, 2005</hiredate>
    <projects>
      <project>
        <product>Keyboard</product>
        <id>555</id>
        <price>$129.00</price>
      </project>
      <project>
        <product>Mouse</product>
        <id>666</id>
        <price>$25.00</price>
      </project>
    </projects>
  </employee>
</document>

Empty elements can't contain any content, but they can have attributes.
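As a brief sketch of that point, the following document declares an attribute on the empty <intern/> element. The term attribute, its value, and the simplified <employee> content model are assumptions made for illustration and are not part of Listing 5.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE employee [
<!-- Simplified content model for this sketch only -->
<!ELEMENT employee (intern?, name)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT intern EMPTY>
<!-- Hypothetical optional attribute carried by the empty element -->
<!ATTLIST intern term CDATA #IMPLIED>
]>
<employee>
  <intern term="summer"/>
  <name>Grace Kelly</name>
</employee>
```

The element still has no content between its tags; the extra information travels entirely in the attribute.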

Summary

In this article you have practiced validating XML documents with DTDs and specified the syntax of XML documents for XML processors to check. In a perfect world, there would be no data-entry errors in XML documents, but real life is a different story. If you specify the syntax of an XML document, you can let an XML processor check that document automatically.

About the Author

Steven Holzner is an award-winning author who has written 80 computing books. Material in this article was taken from Sams Teach Yourself XML in 21 Days, Third Edition. (Copyright Sams Publishing) He has been writing about XML since it first appeared and is one of the foremost XML experts in the United States, having written several XML bestsellers and being a much-requested speaker on the topic. He's also been a contributing editor at PC Magazine, has been on the faculty of Cornell University and MIT, and teaches corporate programming classes around the United States.





