Preface
I have authored numerous online articles on XML. These articles cover the waterfront from introductory to advanced. I maintain a consolidated index of hyperlinks to all of my XML articles at my personal website so that you can access earlier articles from there.
Introduction
This is the next installment in a series of articles designed to explain XML to beginners.
Experts skip this article
Those of you who already know a lot about XML can skip ahead to something more challenging, such as some of my articles on XSL. You will find links to all of my articles at my personal website.
Beginners, keep reading
Those of you who are just getting your feet wet in this area (and may have found the XML water to be a little deep), keep reading.
What is XML?
In an earlier article, I provided the following brief description of XML.
XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of different ways.
A primary objective of XML is to separate content from presentation.
Since then, I have been working to break down the jargon into plain English and have provided some examples of structured documents and rendering.
I made a promise
So far, I have introduced tags, elements, content and attributes. I have discussed tags and attributes in detail.
At the end of the previous article, I promised that this article would continue the discussion with particular emphasis on elements and content.
What is an element?
You already know about start tags and end tags.
An element consists of a start tag (with optional attributes), an end tag, and the content in between, as shown below:
<chapter number="1">Content for Chapter 1</chapter>
Color coded for clarity
In this lesson, I have used artificial color coding to make it easier to refer to the different parts of the element.
(Note however, that because an XML document is maintained in plain text, the characters in an XML document do not have color properties.)
In the case shown above, the optional attribute (number="1") is colored blue and the content (Content for Chapter 1) is colored green.
Elements can be nested
Elements can be nested inside other elements in the construction of the XML document as shown below:
<book>
  <chapter number="1">Content for Chapter 1</chapter>
  <chapter number="2">Content for Chapter 2</chapter>
</book>
Color coding and indentation
In the above rendering, the tags belonging to the book element are shown in blue while the tags belonging to the chapter elements are shown in green.
I also provided artificial indentation to make it easier to see that two chapter elements are nested inside a single book element.
Indentation is common
Such indentation is common in the presentation of raw XML data for human consumption. For example, the default rendering of an XML document by IE5 is an indented tree structure similar to that shown above.
Identify the elements
The book element consists of its start tag, its end tag, and everything in between, as shown below.
<book> … </book>
Each chapter element consists of its start tag, its end tag, and everything in between, as shown below.
<chapter number="1"> … </chapter>
Content of the book element
In this case, the chapter elements form the content of the book element.
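To make the idea of nested content concrete, here is a minimal Java sketch that parses the book document with the standard DOM parser and lists the elements found directly inside the outer element. The class name BookContent and the method name childElementNames are hypothetical names chosen for this illustration. They do not come from any library or from my earlier articles.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class BookContent {

    // Returns the names of the elements nested directly inside the outer element.
    static List<String> childElementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element outer = doc.getDocumentElement();
            List<String> names = new ArrayList<>();
            NodeList children = outer.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes produced by indentation.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book>\n"
                + "  <chapter number=\"1\">Content for Chapter 1</chapter>\n"
                + "  <chapter number=\"2\">Content for Chapter 2</chapter>\n"
                + "</book>";
        System.out.println(childElementNames(xml)); // prints [chapter, chapter]
    }
}
```

The two chapter entries in the result are the content of the book element. The indentation in the document shows up only as whitespace, which the sketch ignores.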
So, what is an element?
The element is the fundamental unit of information in an XML document. Most XML processing programs (such as rendering engines) depend on this fundamental unit of information in order to do their job.
An XML document is an element
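Every XML document has exactly one outermost element that contains all of the others, so in that sense the entire XML document is an element.
In this example, the entire XML document consists of the book element. It might be referred to as the outer element (more formally, the root element or document element).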
To be of much use, an XML document will have other elements nested inside the outer element.
For example, a nested element can define some type of information, such as chapter in our book example. Other possibilities would be table elements and appendix elements.
Meta information
Through the use of attributes, the element often defines information about the information provided by the XML document (sometimes referred to as meta information).
In our book example, the number attribute provides the chapter number for each of the chapter elements. In effect, the chapter number is information about the information contained in the chapter.
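The difference between meta information and content can be seen in a short Java sketch. It assumes the book document from earlier in this article and uses the standard DOM parser to read the number attribute separately from the text of each chapter. The class name ChapterNumbers and the method name describe are hypothetical names used only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class ChapterNumbers {

    // Builds one line per chapter element: the attribute value, then the content.
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList chapters = doc.getElementsByTagName("chapter");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < chapters.getLength(); i++) {
                Element chapter = (Element) chapters.item(i);
                String number = chapter.getAttribute("number");    // meta information
                String content = chapter.getTextContent().trim();  // the content itself
                lines.add("chapter " + number + ": " + content);
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book>"
                + "<chapter number=\"1\">Content for Chapter 1</chapter>"
                + "<chapter number=\"2\">Content for Chapter 2</chapter>"
                + "</book>";
        for (String line : describe(xml)) {
            System.out.println(line);
        }
    }
}
```

The attribute value and the content come from two different calls, which mirrors the distinction between information about the information and the information itself.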
The content
Sandwiched in between the start tag and the end tag of an element, we find the information (content) that the XML document is designed to convey.
So, what are elements good for?
By using a well-defined structure (based on XML elements) to create and maintain your document, you make it much easier to write computer programs that can render and otherwise process your document.
SAX is an example
At some point, you might want to visit one of my earlier articles entitled “What is SAX, Part 1.” (You will find a link to that article at my personal website.)
Writing programs to process XML documents
That article describes how to write computer programs (using the Java programming language) that decompose an XML document into its elements for some useful purpose.
In those articles, I explain that SAX supports an event-based approach to XML document processing. (If you have a background in event-driven programming in a language such as Java or Visual Basic, you will like the SAX approach.)
Parsing events
An event-based approach reports parsing events (such as the start and end of elements) to the program using callbacks.
The program implements and registers event handlers for the different events.
Code in the event handlers is designed to achieve the objective of the program.
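The three steps above can be sketched in Java with the SAX parser that ships with the standard library. This is only a minimal sketch, not the code from my SAX article. The class name ElementEvents and the method name events are hypothetical. The handler is registered simply by passing it to the parse method, and its only objective is to record the order in which elements start and end.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class ElementEvents {

    // Parses the XML text and returns a log of element start and end events.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();

        // Implement the event handlers by overriding the callbacks of interest.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
        };

        try {
            // Passing the handler to parse() registers it for the parsing events.
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML text", e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<book>"
                + "<chapter number=\"1\">Content for Chapter 1</chapter>"
                + "<chapter number=\"2\">Content for Chapter 2</chapter>"
                + "</book>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

For the book example, the log begins with the start of book and finishes with the end of book, with a start and end pair for each chapter in between. The parser reports each element as it encounters it, and the program decides what to do with each report.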
Not critical to understanding XML
I realize that this discussion of event-driven programming for the processing of XML documents would not be classified as “information for beginners.” It is not even critical for an understanding of XML.
However, it is the best example that I can come up with to explain the benefits provided by XML elements. Don’t worry too much about SAX at this point. Just keep studying, and at some point in the future, it will fall into place.
What’s next?
I will continue this discussion on elements and content in the next article in this series.
Copyright 2000, Richard G. Baldwin. Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.
About the author:
Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.
Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two. He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas. He is the author of Baldwin’s Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.
Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.