Modeling One-To-Many Relationships With XML
Somehow, modeling and XML aren't often found together in the same sentence. In my experience, I've seen XML vocabularies created using the "fly by the seat of your pants" methodology more than anything else. After all, because XML is the eXtensible Markup Language, it's easy to create your own markup, right?
If we were talking about storing data in a relational database, on the other hand, we would model the entities, relationships, and attributes the way they exist in the real world to provide the most flexible data access. We would have the rigor of third normal form to guide us. There could be performance considerations in how we model the data as well.
In this article, we'll discuss some options for implementing one-to-many relationships in XML. We'll consider three different techniques:
- Containment relationship
- Intra-document relationships
- Inter-document relationships
For each technique, we'll produce the following artifacts to flesh out the idea:
- DTD(s) to represent document structure
- Sample XML stream(s)
- An XSL stylesheet to demonstrate data access
We'll discuss when it would be appropriate to use each approach and then summarize what we've learned. Some additional resources about modeling XML are also listed at the end of the article.
Department and Employee Domain
To begin, let's define a business domain to model. We'll implement this model by using our three one-to-many XML modeling techniques.
In the relational database world, departments and employees are often used to illustrate concepts. Because many readers will already know this problem domain, I'll use it here as well. No need to reinvent the wheel.
The Entity-Relationship, or ER, diagram depicted here shows that we have two entities, Department and Employee. Departments are uniquely identified by department_id. Similarly, Employee uses emp_id as its unique identifier.
The line between Department and Employee indicates a relationship. The infinity symbol next to Employee indicates that there may be many Employees in a Department. In a one-to-many relationship, the key of the one side of the relationship, in this case Department, would become a foreign key on the many side of the relationship, in this case Employee.
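To make the foreign-key idea concrete, here is a minimal sketch of the relational view in Python. Only department_id and emp_id come from the diagram; the name fields, the numeric ID values, and the helper employees_in are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Department:
    department_id: int  # unique identifier of the "one" side
    name: str           # assumed attribute, not shown in the ER diagram


@dataclass(frozen=True)
class Employee:
    emp_id: int         # unique identifier of the "many" side
    name: str           # assumed attribute
    department_id: int  # foreign key pointing back at Department


# Hypothetical rows; the ID values are invented for this sketch.
departments = [
    Department(10, "Enterprise Development"),
    Department(20, "Foundation Services"),
]
employees = [
    Employee(1, "Jeff", 10),
    Employee(2, "Mike", 10),
    Employee(3, "Sam", 20),
]


def employees_in(department_id):
    """Follow the foreign key from the many side back to one department."""
    return [e.name for e in employees if e.department_id == department_id]


staff_by_department = {d.name: employees_in(d.department_id) for d in departments}
print(staff_by_department)
```

Note that the relationship lives entirely in the department_id value carried by each employee; nothing in a Department row lists its employees.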
In a Containment Relationship, a structure is defined where one element is contained within another. In the strongest form of this relationship, the "contained" element ceases to exist when the "container" element is removed.
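The strong form of containment can be seen with a few lines of Python and the standard xml.etree.ElementTree module. This is only a sketch: the small document and the helper employee_names are assumptions chosen for illustration, using element names from our domain.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: employees are nested inside their department.
company = ET.fromstring(
    "<Company>"
    "<Department><Name>Enterprise Development</Name>"
    "<Employee><Name>Jeff</Name></Employee>"
    "<Employee><Name>Mike</Name></Employee>"
    "</Department>"
    "<Department><Name>Foundation Services</Name>"
    "<Employee><Name>Sam</Name></Employee>"
    "</Department>"
    "</Company>"
)


def employee_names(root):
    """Collect every employee name in document order."""
    return [e.findtext("Name") for e in root.iter("Employee")]


before = employee_names(company)

# Removing the container also removes everything it contains.
first_department = company.find("Department")
company.remove(first_department)

after = employee_names(company)
print(before)
print(after)
```

Once the first Department is removed, its employees are no longer reachable from the document; there is no separate place where they continue to exist.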
Let's take a look at a DTD we'll use in the containment relationship implementation of our domain model:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT Company (Department+)>
<!ELEMENT Department (Name, Employee+)>
<!ELEMENT Employee (Name)>
<!ELEMENT Name (#PCDATA)>
A Company element contains one or more Department elements. A Department element contains a Name followed by one or more Employee elements. Each Employee element contains exactly one Name.
A sample XML stream follows:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="Containment.xsl" type="text/xsl"?>
<Company>
  <Department>
    <Name>Enterprise Development</Name>
    <Employee>
      <Name>Jeff</Name>
    </Employee>
    <Employee>
      <Name>Mike</Name>
    </Employee>
  </Department>
  <Department>
    <Name>Foundation Services</Name>
    <Employee>
      <Name>Sam</Name>
    </Employee>
  </Department>
</Company>
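Python's xml.etree.ElementTree parses XML but does not validate it against a DTD, so the sketch below restates the content model of the DTD as hand-written checks. It is a partial illustration, not a validator: the function names and messages are assumptions, and it does not verify that Name holds only character data.

```python
import xml.etree.ElementTree as ET

# Copy of the sample stream, without the declaration and processing instruction.
SAMPLE = """<Company>
  <Department>
    <Name>Enterprise Development</Name>
    <Employee><Name>Jeff</Name></Employee>
    <Employee><Name>Mike</Name></Employee>
  </Department>
  <Department>
    <Name>Foundation Services</Name>
    <Employee><Name>Sam</Name></Employee>
  </Department>
</Company>"""


def child_tags(element):
    """Return the tag names of the direct children, in order."""
    return [child.tag for child in element]


def check_company(root):
    """Return a list of content-model violations (empty when none are found)."""
    problems = []
    tags = child_tags(root)
    # Company (Department+): one or more Department children, nothing else.
    if root.tag != "Company" or not tags or set(tags) != {"Department"}:
        problems.append("Company must contain one or more Department elements")
    for department in root.findall("Department"):
        tags = child_tags(department)
        # Department (Name, Employee+): one Name first, then at least one Employee.
        if len(tags) < 2 or tags[0] != "Name" or set(tags[1:]) != {"Employee"}:
            problems.append("Department must be a Name followed by one or more Employees")
        for employee in department.findall("Employee"):
            # Employee (Name): exactly one Name child.
            if child_tags(employee) != ["Name"]:
                problems.append("Employee must contain exactly one Name")
    return problems


valid_problems = check_company(ET.fromstring(SAMPLE))

# A department without employees violates the Employee+ part of the model.
empty_department = ET.fromstring(
    "<Company><Department><Name>Empty</Name></Department></Company>"
)
invalid_problems = check_company(empty_department)

print(valid_problems)
print(invalid_problems)
```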
An employee list page can be created from the stream with the stylesheet below:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="Company">
    <html>
      <head>
        <title>Employee List</title>
      </head>
      <body>
        <h1>Employee Listing</h1>
        <table>
          <tr>
            <td>Employee</td>
            <td>Department</td>
          </tr>
          <xsl:apply-templates select="//Employee"/>
        </table>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Employee">
    <tr>
      <td>
        <xsl:value-of select="Name"/>
      </td>
      <td>
        <xsl:value-of select="../Name"/>
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
The Company template creates the shell of the HTML page, including the column headers of a table that lists employees and their departments. The Employee template is invoked for each Employee node in the document. It selects the employee name (Name) and the department name (../Name) into the appropriate table cells.
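For readers who want to check the data access outside an XSLT processor, the same employee and department pairs can be produced with a short Python sketch. It is not part of the article's stylesheet: the function employee_rows is an assumption, and because ElementTree elements do not hold a reference to their parent, the sketch walks down from each Department instead of stepping up with "..".

```python
import xml.etree.ElementTree as ET

# Copy of the sample stream, without the declaration and processing instruction.
SAMPLE = """<Company>
  <Department>
    <Name>Enterprise Development</Name>
    <Employee><Name>Jeff</Name></Employee>
    <Employee><Name>Mike</Name></Employee>
  </Department>
  <Department>
    <Name>Foundation Services</Name>
    <Employee><Name>Sam</Name></Employee>
  </Department>
</Company>"""


def employee_rows(xml_text):
    """Return (employee name, department name) pairs in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    for department in root.iter("Department"):
        # Equivalent of ../Name as seen from an Employee: the container's Name.
        department_name = department.findtext("Name")
        for employee in department.findall("Employee"):
            rows.append((employee.findtext("Name"), department_name))
    return rows


rows = employee_rows(SAMPLE)
for employee, department in rows:
    print(f"{employee}\t{department}")
```

With containment, the department of an employee never has to be looked up by key; it is simply the element the employee sits inside.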