An introduction to XML
In the beginning, there was HTML. It was simple, easy to learn, and anyone with an editor could create files that were instantly publishable on the World Wide Web. Unfortunately, users have continually demanded more sophistication and control over what they wish to publish and browser makers have responded by introducing a plethora of their own vendor-specific tags in order to satisfy these needs. Now we have entire sites that are optimized for one browser and not another, and designing pages that work well on all browsers is something of a lost art. Scripting languages and technologies, such as plug-ins, Java applets and ActiveX controls, bind the presentation aspects of data ("how" something is displayed within Web browsers) more tightly to the data itself ("what" is being displayed).
XML (eXtensible Markup Language) championed by the W3C (World Wide Web Consortium) represents a significant industry-wide attempt to stop this conflation of features within single HTML files, and separate presentation from data.
XML is simple and complex at the same time. The language itself is quite easy to learn and use, given that browsers from Microsoft and Netscape both have built-in support for parsing and rendering XML. It is targeted to provide the basic support needed to describe complex structured data, and give you the ability to create your own tags. It can also be described as a simpler descendant of SGML (the Standard Generalized Markup Language).
But XML is also complex. As a developer you now have several options to write Web pages to publish and present the data. The three options described below provide increasing levels of sophistication, flexibility and user interaction capability. You can:
- write Web pages at the markup level by using XSL (eXtensible Style Language) to specify how XML documents ought to appear
- write Web pages that operate on XML data using scripting languages supported by the browsers using the DOM (Document Object Model) supported by XML parsers and documents
- write Java applets that parse the XML data and provide sophisticated GUI interfaces through which a user might interact with this data
The examples below rely upon Microsoft IE 4.0 and the built-in XML and XSL software provided by Microsoft. The W3C has numerous ongoing projects to standardize XML, XSL and related standards and it can therefore be expected that similar support will be forthcoming from Netscape and other browser vendors. High quality parsers for XML and toolkits are also available from vendors such as IBM and DataChannel.
This zip file contains the samples for this tutorial. Download it and edit the samples provided to experiment with various options. I also recommend downloading the XML Java Parser from Microsoft (and related software), and the XSL Technology Preview -- see the links at the end of this article.
The former contains Java classes that can parse your XML and DTD files and generate verbose errors that can help you considerably in debugging such files. You invoke the utility through the "jview" command-line tool, for example:
jview msxml -d
The utility can also provide a brief synopsis of all of the command-line options available.
The XSL technology preview contains the "msxsl" utility which is a command line program that can read an XML and related XSL file and generate the corresponding HTML file as it would appear within a browser. Since the ActiveX control we will use in IE to render XML/XSL combinations does not provide any error messages, using this utility is the only way you can debug your more complex XSL scripts, until better tools (such as XMLStyler from Arbortext) become more widely available. You can use this tool by typing:
Backward compatibility and benefits
Before we begin, it is wise to consider the question: should you really learn XML, and begin upgrading your Web site and pages to XML, XSL and all these new technologies? It is perhaps obvious that there is a lot of HTML out there and all these Web pages are not going to be converted overnight to XML. However, even though HTML is subsumed by XML and will probably be supported by browsers for some time to come, you can realistically expect that over time, more and more pages will be in XML. As tools mature, support for contenders such as CSS (Cascading Style Sheets), which attempt(ed) to provide separation of presentation from data but not extensibility, will likely dwindle, and XML data and applications will proliferate even though competitors may co-exist for some time to come.
The benefits of XML are many:
- By separating presentation and data, XML makes it easier to change GUI aspects without having to modify data that is being rendered. It therefore makes developing and maintaining Web pages a whole lot simpler.
- XML enables domain specific extension tags through which more specific markup languages (such as those specialized for mathematics or chemistry) can be defined and evolved.
- As extension tags become standardized, it will become easier to interchange data between Web pages. Such acts as dragging and dropping products into shopping carts, or your personal data from a rolodex into a Web form, will all become much simpler.
As with any new technology, however, XML is not without its drawbacks. The masses may not rush to embrace it in spite of Microsoft's and Netscape's efforts, and standardization of tags through accepted DTDs may never materialize. Even though several XML and XSL tools have been announced, many are immature and some can only be characterized as "idea"ware. You may also have to sift through the tremendous hype surrounding it in order to learn what's useful about XML.
With those caveats, let us begin to see what it can do for us.
XML at work
Consider the simple example of a personal address database. Here's a typical entry from such a database ("addresses.xml").
If you have worked with HTML files, several differences should immediately be apparent. Besides the lack of presentation-related tags, the example illustrates how every XML tag is neatly bracketed by <TAG> </TAG> (Note: to handle HTML tags that are often not so neatly bracketed, XML allows an empty tag to be completed as <TAG/>), and the use of a nested tree structure (i.e., in the above example, an ADDRESS contains a WORK and a HOME block, each of which in turn contains STREET, CITY, STATE and COUNTRY elements). Much of HTML is subsumed by such syntax (you only have to remember to replace those pesky <IMG ..> tags with <IMG .. /> tags to convert them to valid XML).
The example data we will use will be a list of addresses of the form given above, bracketed as shown below (which forms a complete XML file -- named "addresses.xml"):
In essence, XML files are not much more complicated than that. Even such simple recursive tree structures allow for the specification of complex structured data, using tags that are specifically intended to convey the meaning of the data (in the example above, a tag named CITY is much more descriptive of a portion of someone's address than the bold tag, or whatever formatting element a Web author might have chosen to use).
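As a rough illustration of that tree structure, the Python sketch below parses a small address list and prints one line per element, indented by depth. It uses Python's standard library, not the Microsoft tools discussed in this article. Only the element names come from the description above; the sample values are invented and are not the contents of the "addresses.xml" file in the samples.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: element names follow the article, values are invented.
SAMPLE = """<?xml version="1.0"?>
<ADDRESSLIST>
  <ADDRESS NAME="Jane Doe">
    <WORK>
      <STREET>1 Example Way</STREET>
      <CITY>Cambridge</CITY>
      <STATE>MA</STATE>
      <COUNTRY>USA</COUNTRY>
    </WORK>
    <HOME>
      <STREET>2 Sample Road</STREET>
      <CITY>Boston</CITY>
      <STATE>MA</STATE>
      <COUNTRY>USA</COUNTRY>
    </HOME>
  </ADDRESS>
</ADDRESSLIST>
"""

def outline(element, depth=0):
    """Return one line per element, indented to show nesting."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(SAMPLE)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The indentation of the printed lines mirrors the nesting of the tags, which is all the "tree" really is.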
The first line specifies the version of XML that is used.
Document Type Definition (DTD) and Document Object Model (DOM)
There are two important points to note about the tree structured form in which an XML file presents itself:
- What goes in each element (or place) in the tree can be described by a simple grammar. Such a grammar (or specification) is called a Document Type Definition. Even though it is not mandatory, specifying a DTD will allow parsers to "validate" your XML file. For example, the grammar may specify that WORK addresses may have an associated COMPANY tag, but not HOME addresses. The appearance of a COMPANY element within a HOME address would trigger an error in validating XML parsers. The XML specification allows a DTD to be contained within the XML file itself or kept in a separate file.
- From the tree-structure representation it should also be apparent that a parser can expose data contained in an XML document using very few objects. The Microsoft parser parses every XML document into a set of such objects. An "XMLDocument" object corresponds to an XML file and contains a single root "XMLElement." Each such XMLElement can contain an XMLElementCollection which, in turn, comprises one or more XMLElements. This set of objects is said to form the DOM or Document Object Model for XML. Elements have properties such as "text" and "children" through which the entire DOM can be traversed. They also have methods through which these properties may be manipulated. Thus it is through this DOM that scripting languages or programming languages such as Java access and manipulate the contents of XML documents.
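The object names above belong to the Microsoft parser. As a loosely analogous sketch, Python's standard xml.dom.minidom module exposes a document object, a single root element and child node lists that can be walked and modified in much the same spirit. The sample data is invented, and the helper names (child_elements, text_of) are my own.

```python
from xml.dom import minidom

# Hypothetical one-entry address list; values are invented.
doc = minidom.parseString(
    '<ADDRESSLIST><ADDRESS NAME="Jane Doe">'
    '<WORK><CITY>Cambridge</CITY></WORK>'
    '<HOME><CITY>Boston</CITY></HOME>'
    '</ADDRESS></ADDRESSLIST>'
)
root = doc.documentElement  # the single root element of the document

def child_elements(node):
    """Child nodes that are elements (text nodes are skipped)."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]

def text_of(node):
    """Concatenate the text found directly under an element."""
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)

# Walk the tree: ADDRESS -> WORK/HOME -> CITY.
cities = {}
for address in child_elements(root):
    for block in child_elements(address):
        for field in child_elements(block):
            cities[block.tagName] = text_of(field)

# The same objects also allow the document to be changed.
first_address = child_elements(root)[0]
first_address.setAttribute("NAME", "J. Doe")
print(root.tagName, cities, first_address.getAttribute("NAME"))
```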
To reference a DTD corresponding to our "addresses.xml" file, we modify it to include:
The second line references a file called "AddressList.dtd", whose root element is an ADDRESSLIST. (The Microsoft tools require the root element's name to be the same as that of the DTD file name).
Here's what this file looks like:
The first line states that an ADDRESSLIST is comprised of one or more ADDRESS blocks. Each ADDRESS element is then defined to contain one or more WORK or HOME elements. This element is also defined to have an attribute named NAME that is required to be present in each ADDRESS definition. Subsequent lines describe the WORK element as having STREET, CITY, STATE, and COUNTRY elements, and then each of these elements is defined as a textual element (as indicated by #PCDATA). Note that WORK and HOME can also have a piece of text associated with them.
DTDs can get quite complex. There is now an effort called XML-Data within the W3C to describe schemas (entities and relations) using XML itself. As the authors of this submission point out, it allows what a DTD expresses to be specified in XML itself, and could hence replace the DTD syntax. Please see Microsoft's Web site in the URLs provided at the end of this article for details.
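To make the kind of checking a validating parser performs more concrete, here is a small hand-written Python sketch that enforces a few of the rules just described. It is not a DTD validator and does not read "AddressList.dtd"; the rules are hard-coded from the prose above, the COMPANY rule is the hypothetical one mentioned earlier, and the function name check_address_list is my own.

```python
import xml.etree.ElementTree as ET

def check_address_list(xml_text):
    """Return a list of rule violations (empty if none were found)."""
    errors = []
    root = ET.fromstring(xml_text)
    if root.tag != "ADDRESSLIST":
        errors.append("root element must be ADDRESSLIST")
    addresses = list(root)
    if not addresses:
        errors.append("ADDRESSLIST needs at least one ADDRESS")
    for address in addresses:
        if address.tag != "ADDRESS":
            errors.append("unexpected element " + address.tag)
            continue
        if "NAME" not in address.attrib:
            errors.append("ADDRESS is missing the required NAME attribute")
        blocks = list(address)
        if not blocks:
            errors.append("ADDRESS needs at least one WORK or HOME")
        for block in blocks:
            if block.tag not in ("WORK", "HOME"):
                errors.append("unexpected element " + block.tag)
            elif block.tag == "HOME" and block.find("COMPANY") is not None:
                errors.append("COMPANY is not allowed inside HOME")
    return errors

# Invented test documents: one that follows the rules, one that breaks two.
good = ('<ADDRESSLIST><ADDRESS NAME="Jane Doe">'
        '<WORK><COMPANY>Acme</COMPANY></WORK></ADDRESS></ADDRESSLIST>')
bad = ('<ADDRESSLIST><ADDRESS>'
       '<HOME><COMPANY>Acme</COMPANY></HOME></ADDRESS></ADDRESSLIST>')
good_errors = check_address_list(good)
bad_errors = check_address_list(bad)
print(good_errors)
print(bad_errors)
```

A real DTD expresses these constraints declaratively, so the parser can enforce them without any code of this kind.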
DTDs can also be embedded inline within an XML document, by modifying the second line in "addresses.xml", to contain:
.. and so on..
How can we look at these XML files?
We now have a cursory understanding of "addresses.xml" and "AddressList.dtd". But how do we look at them from within a Web browser (say IE 4.0)? This is where XSL (or eXtensible Style Language) comes in. You can think of an XSL processor (embedded within an ActiveX control -- which we'll cover shortly) as a piece of code within your browser that uses an XSL file to interpret and convert an XML file to HTML. (Note: Remember that you can use the "msxsl" utility to generate HTML files from XML/XSL files).
It is important to note that XSL is an evolving standard. The approaches taken by browser makers such as Microsoft and Netscape are different today. However, under the W3C's guidance, you can expect XSL to be standardized soon.
Each XSL file contains a list of patterns and actions (several examples are presented below) combined in the form of rules. "Address1.xsl" in the zip file below presents the XML data we have as simple HTML.
This entire file is bracketed with:
The first rule of interest is the rule for the root element:
My address book
As the comments indicate, the root element pattern is indicated by the presence of the token "<root/>". The action taken for this rule is to emit the HTML text included below this token. All the interesting work is done by specifying "<children/>", which instructs the XSL processor to insert the result of processing the children of the root element in the body of the HTML text.
Let us now look at a few simple examples of how we might process some of the XML elements in our address book.
This rule specifies that CITY and STATE elements should be output in italic font. Note that if you omit the "<children/>" specifier, the result of processing these elements will be empty. This feature can be used to restrict your HTML output to contain only relevant data from the XML file -- in effect, to contain simple filtered views of your address book.
The "target-element" tags that are used to specify patterns allow wild-carding. For example, you can use the following rule while debugging your XSL files.
This rule will output all the elements for which you have not specified any tags in red. Note that by not specifying any type or other qualifiers to the target-element tag you will match on "every" element if a more specific rule for that element is not found in your XSL file.
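The pattern-and-action idea can be mimicked outside of XSL. The Python sketch below is not XSL and does not use the Microsoft processor; it is only an analogy in which a dictionary maps element names to actions, a children() helper plays the role of "<children/>", and the "*" entry plays the role of the wildcard rule. The ZIP element and all sample values are invented.

```python
import xml.etree.ElementTree as ET
from html import escape

def children(element, rules):
    """Rough stand-in for <children/>: text plus processed child elements."""
    parts = [escape(element.text or "")]
    for child in element:
        parts.append(apply_rules(child, rules))
        parts.append(escape(child.tail or ""))
    return "".join(parts)

def apply_rules(element, rules):
    """Use the rule for this tag, or fall back to the wildcard rule '*'."""
    action = rules.get(element.tag, rules["*"])
    return action(element, rules)

rules = {
    # Root rule: wrap the processed children in an HTML page.
    "ADDRESSLIST": lambda e, r: "<HTML><BODY>" + children(e, r) + "</BODY></HTML>",
    # CITY and STATE are shown in italics.
    "CITY": lambda e, r: "<I>" + children(e, r) + "</I>",
    "STATE": lambda e, r: "<I>" + children(e, r) + "</I>",
    # A rule that never calls children() filters the element out.
    "COUNTRY": lambda e, r: "",
    # Wildcard: anything without a specific rule shows up in red.
    "*": lambda e, r: '<FONT COLOR="red">' + children(e, r) + "</FONT>",
}

doc = ET.fromstring(
    "<ADDRESSLIST><CITY>Boston</CITY><STATE>MA</STATE>"
    "<COUNTRY>USA</COUNTRY><ZIP>02139</ZIP></ADDRESSLIST>"
)
html_out = apply_rules(doc, rules)
print(html_out)
```

Here COUNTRY disappears from the output because its rule emits nothing, and ZIP appears in red because only the wildcard rule matches it.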
You may be wondering how you might format the CITY field in a WORK address differently from the CITY field in a HOME address. The rule I specified above matches the CITY element regardless of context. In order to specify a context you can use the "element" tag as shown below:
You can have several "element" tags in order to specify a target at a specific level. For example, you may want to format TITLE tags within sections of chapters in books specifically.
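One way to picture context-sensitive matching is as a lookup on the path of ancestors leading to an element, where the most specific matching rule wins. The following Python sketch illustrates only that idea; the rule table, the helper names and the "longest tail of the path wins" policy are assumptions of mine, not a description of how the Microsoft XSL processor resolves rules.

```python
import xml.etree.ElementTree as ET

# Rules keyed by the tail of an element's path; longer tails are more specific.
rules = {
    ("WORK", "CITY"): "<B>{}</B>",   # CITY inside WORK
    ("CITY",): "<I>{}</I>",          # CITY anywhere else
}

def format_leaf(path, text):
    """Pick the rule whose key matches the longest tail of the path."""
    for length in range(len(path), 0, -1):
        template = rules.get(tuple(path[-length:]))
        if template is not None:
            return template.format(text)
    return text  # no rule matched

def render_cities(element, path=()):
    """Collect the formatted CITY elements, tracking the ancestor path."""
    path = path + (element.tag,)
    if element.tag == "CITY":
        return [format_leaf(path, element.text or "")]
    out = []
    for child in element:
        out.extend(render_cities(child, path))
    return out

doc = ET.fromstring(
    "<ADDRESS><WORK><CITY>Cambridge</CITY></WORK>"
    "<HOME><CITY>Boston</CITY></HOME></ADDRESS>"
)
rendered = render_cities(doc)
print(rendered)
```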
Now let us look at how we format a work or home address:
The interesting action items are bracketed by the "eval" tag. The first one specifies that we want to include the "tagName", which will be WORK or HOME followed by the position of this element within its parent. This enables multiple work addresses to be formatted as "WORK1", "WORK2", etc. Note that what goes within the "eval" tags is really ECMAScript (see link below) and this gives you an enormously powerful capability within your pattern's action rules.
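The labelling idea is easy to reproduce in any language. In this Python sketch (again an analogy, not the ECMAScript that goes inside "eval"), each block is labelled with its tag name followed by its 1-based position. Counting the position among same-named siblings is my assumption about how "WORK1", "WORK2" would come about.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def labelled_blocks(address):
    """Label each child as tag name + position among same-named siblings."""
    seen = Counter()
    labels = []
    for block in address:
        seen[block.tag] += 1
        labels.append(block.tag + str(seen[block.tag]))
    return labels

# Invented ADDRESS with two work addresses and one home address.
address = ET.fromstring("<ADDRESS><WORK/><WORK/><HOME/></ADDRESS>")
labels = labelled_blocks(address)
print(labels)
```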
The other interesting bit about that last rule is the "select-elements" tag. For example, the line
instructs the XSL processor to process this element's children one level down, and insert the results of processing such elements which match the target-element pattern STREET at the place indicated in the HTML output. To get ALL the STREET elements below a particular element, use
The rule for ADDRESS elements is not much more complicated. It illustrates how you use
to get the NAME attribute inside of an ADDRESS element.
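The distinction between "one level down" and "all elements below", as well as attribute lookup, has close counterparts in most XML libraries. As an illustrative sketch in Python's standard xml.etree.ElementTree (not in XSL), findall() with a bare tag name returns only direct children, iter() walks every descendant, and get() reads an attribute. The DEPT element and all values are invented for the example.

```python
import xml.etree.ElementTree as ET

address = ET.fromstring(
    '<ADDRESS NAME="Jane Doe">'
    "<WORK><STREET>1 Example Way</STREET>"
    "<DEPT><STREET>Building 7</STREET></DEPT></WORK>"
    "</ADDRESS>"
)
work = address.find("WORK")

# One level down: only STREET elements that are direct children of WORK.
direct = [s.text for s in work.findall("STREET")]
# Any depth: every STREET element below WORK, in document order.
all_below = [s.text for s in work.iter("STREET")]
# Attribute lookup on the ADDRESS element.
name = address.get("NAME")
print(direct, all_below, name)
```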
XSL rules also allow you to process different occurrences of a particular pattern differently. For example, we use
to indicate that the last ADDRESS should be formatted slightly differently. You can use this feature to generate totals and roll-up output in reports.
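Treating the last occurrence specially is what makes totals and roll-ups possible. The Python sketch below shows the idea in miniature: every ADDRESS is followed by a separator except the last one, which is followed by a total instead. It is an analogy to the XSL feature, with invented names and my own choice of HTML output.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<ADDRESSLIST><ADDRESS NAME="Jane Doe"/>'
    '<ADDRESS NAME="John Roe"/><ADDRESS NAME="Ann Poe"/></ADDRESSLIST>'
)

def render(address_list):
    """Emit one line per ADDRESS; the last one is followed by a total."""
    addresses = address_list.findall("ADDRESS")
    lines = []
    for position, address in enumerate(addresses, start=1):
        lines.append("<P>" + address.get("NAME", "") + "</P>")
        if position == len(addresses):
            lines.append("<P><B>Total entries: " + str(position) + "</B></P>")
        else:
            lines.append("<HR>")
    return lines

report = render(doc)
print("\n".join(report))
```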
Rule order and evaluation may seem complex, but if you follow the order presented here (i.e., start with a root pattern, then the leaf XML elements and then the higher level groupings) you will see that XSL files are easy to master and you can get predictable results. You can do far more than what we've indicated with XSL files since, unlike CSS scripts, XSL files can re-order the data that is being output.
Presenting the same data using HTML tables
"Address2.xsl" in the zip file at the end of this article presents WORK addresses in the same XML data, only now as HTML tables. This is perhaps not illustrative of the best HTML around, but the flexibility with XSL styles should indicate how trivial it is to re-format the data without touching the XML data contained in your address book. This, after all, is what separation of presentation from data is all about. This latter file also illustrates the use of "style-rules" which can be used to specify the style associated with XML elements.
The Microsoft tutorial on XSL referenced below illustrates other powerful features supported within XSL which allow it to provide capabilities far exceeding those targeted by CSS (Cascading Style Sheets). Using XSL you can sort and reformat XML documents and even include things such as tables of contents, report rollups, or summary and totalling information.
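To give a feel for what re-ordering and re-formatting mean in practice, here is a Python sketch that sorts the entries by NAME, lays out the WORK addresses as an HTML table and appends a total. It does not reproduce "Address2.xsl"; the sample data, the choice of columns and the table markup are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented data, deliberately out of alphabetical order.
doc = ET.fromstring(
    "<ADDRESSLIST>"
    '<ADDRESS NAME="Zed"><WORK><CITY>Boston</CITY><STATE>MA</STATE></WORK></ADDRESS>'
    '<ADDRESS NAME="Amy"><WORK><CITY>Austin</CITY><STATE>TX</STATE></WORK></ADDRESS>'
    "</ADDRESSLIST>"
)

rows = []
# Sort by the NAME attribute; the XML data itself is left untouched.
for address in sorted(doc.findall("ADDRESS"), key=lambda a: a.get("NAME", "")):
    for work in address.findall("WORK"):
        cells = [
            address.get("NAME", ""),
            work.findtext("CITY", ""),
            work.findtext("STATE", ""),
        ]
        row = "".join("<TD>" + escape(c) + "</TD>" for c in cells)
        rows.append("<TR>" + row + "</TR>")

table = "<TABLE>" + "".join(rows) + "</TABLE>"
summary = "<P>Total: " + str(len(rows)) + "</P>"
print(table)
print(summary)
```

The same data could be rendered as a list again simply by swapping the formatting code, which is the point of keeping presentation separate from data.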
Putting it all together
Now that we have the XML, DTD, and XSL files we are ready to create the HTML file that puts it all together (see address1.html in the zip file).
When you browse this file, it will load the ActiveX control containing the XSL processor which will then proceed to load the XML file indicated in the documentURL parameter using the XSL file specified in the styleURL parameter.
The remaining lines are not that important. They basically instruct the ActiveX control to parse the result of the XML/XSL combination on loading the Web page, and the DIV tag instructs the browser to place the "xslTarget" item as the main contents of the page.
You can browse this HTML file and you will be able to see the XML/XSL combination rendered using the first style sheet we created. By changing "Address1.xsl" to "Address2.xsl" you can see the same data, formatted as an HTML table. I have provided "address2.html" that does this in the sample zip file.
XML and XSL provide sophisticated capabilities and you can be sure that vendors are scrambling to support them better even as you read this article. Once they mature, you may not have to directly edit XML or XSL files anymore, in the same way that sophisticated Web page design tools hide much of the complexity inherent in HTML today. This article has hopefully given you a head start on understanding the nuts and bolts behind these powerful new technologies.
Links on this article
- The recommended XML standard from the W3C
- Starting point for XSL at the W3C
- XMLStyler from Arbortext
- Starting point at Microsoft for XML, XSL etc.
- XML at Netscape
- XML software from IBM -- validating XML parser
- The XML FAQ by Peter Flynn and others
Sundar Narasimhan got his Ph.D. from MIT in 1994, and is now Chief Scientist at Ascent Technology Inc., where he works on real-time resource allocation and database mining systems. You can reach him at: firstname.lastname@example.org.