XML for Beginners, Part 3: Tags and Attributes

Preface

I have authored numerous online articles on XML.  These articles cover the waterfront from introductory to advanced.  I maintain a consolidated index of hyperlinks to all of my XML articles at my personal website so that you can access earlier articles from there.

Introduction

This is the next installment in a series of articles designed to explain XML to beginners.

Experts, skip this article

Those of you who already know a lot about XML can skip ahead to something more challenging, such as some of my articles on XSL.  You will find links to all of my articles at my personal website.

Beginners, keep reading

Those of you who are just getting your feet wet in this area (and may have found the XML water to be a little deep), keep reading.

What is XML?

In an earlier article, I provided the following brief description of XML.
 

XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of different ways.

A primary objective of XML is to separate content from presentation.

Then I proceeded to break down the jargon into plain English and provided some examples of structured documents.

I made a promise

At the end of the previous article, I promised that this article would explain the XML terms element and attribute.  However, I am going to break my promise somewhat.  This article discusses the XML attribute in some detail, but defers the detailed discussion of element until the next article.

Some XML code for a book

Also, in the previous article, I provided the following XML code that describes a simple book.
 

<book>
    <chap number="1">
        Text for Chapter 1
    </chap>

    <chap number="2">
        Text for Chapter 2
    </chap>
</book>

The book represented by this XML code has two chapters with some text in each chapter.  I told you that this XML code contains an attribute that describes the chapter number in each chapter element.
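
To see that attribute from a program's point of view, here is a minimal sketch that hands the book to a parser and reads the number attribute of each chapter. It assumes Python and its standard xml.etree.ElementTree module, neither of which is part of this series; the names BOOK_XML and chapter_numbers are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The book from this article, with plain double quotes around attribute values.
BOOK_XML = """<book>
    <chap number="1">
        Text for Chapter 1
    </chap>

    <chap number="2">
        Text for Chapter 2
    </chap>
</book>"""


def chapter_numbers(xml_text):
    """Return the value of the number attribute of each chap element."""
    book = ET.fromstring(xml_text)
    return [chap.get("number") for chap in book.findall("chap")]


print(chapter_numbers(BOOK_XML))  # prints ['1', '2']
```

Note that the values come back as the strings '1' and '2'.  An attribute value in XML is text; treating it as a number is up to the program that reads it.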

Let’s define a tag

I am going to begin my explanation with a new jargon word: tag.

What is a tag?

The common jargon for XML items (such as the following) enclosed in angle brackets is tag.  (You may be familiar with this jargon based on HTML experience.)
 

<book>

This is a start tag

The tag shown above is often referred to as a start tag or a beginning tag.

The tag shown below is often referred to as an end tag.
 

</book>

The end tag contains a slash

What is the difference between a start tag and an end tag?  In this case, the start tag and the end tag differ only in that the end tag contains a slash character.

Sometimes there are other differences

However, the start tag can also contain optional attributes as discussed below.
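
The pairing of a start tag with its end tag is something a parser checks.  The following sketch, which assumes Python's standard xml.etree.ElementTree module and a hypothetical helper named is_well_formed, shows a parser accepting a matched pair of tags and rejecting a mismatched or missing end tag.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<book></book>"))  # True: start tag closed by its end tag
print(is_well_formed("<book></chap>"))  # False: end tag name does not match
print(is_well_formed("<book>"))         # False: end tag is missing
```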

Vocabulary words: element, content, and attribute

To begin with, I need to introduce you to another new word:  content.

What is an element?

Using widely-accepted XML jargon, I will call the sequence of characters in the following box an element.

Note that an element begins with a start tag and ends with an end tag and includes everything in between.
 

<chap number="1">Text for Chapter 1</chap>

Color coded for clarity

In this lesson, I have used artificial color coding to make it easier to refer to the different parts of the element.

(Note however, that because an XML document is maintained in plain text, the characters in an XML document do not have color properties.)

What is the content?

The characters in between the tags (Text for Chapter 1, rendered in green in this presentation) constitute the content.

What is an attribute?

The characters number="1" inside the start tag (rendered in blue in the above box) constitute an attribute.

To reiterate so you will remember it

An element consists of a start tag and an end tag, with the content sandwiched between the two tags.  The content is part of the element.

May include optional attributes

The start tag may contain optional attributes.  In this example, a single attribute provides the number value for the chapter.
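
The three parts of the element can be pulled apart with a parser.  This is only a sketch, assuming Python's standard xml.etree.ElementTree module; the variable name chap is my own choice.

```python
import xml.etree.ElementTree as ET

# Parse the single element shown earlier in this article.
chap = ET.fromstring('<chap number="1">Text for Chapter 1</chap>')

print(chap.tag)     # the name used in the start and end tags: chap
print(chap.attrib)  # the attributes from the start tag: {'number': '1'}
print(chap.text)    # the content between the tags: Text for Chapter 1
```

The attributes arrive as a dictionary of names and values, which matches the idea that a start tag may carry zero, one, or several of them.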

Tell me more about attributes

The term attribute is a commonly used term in computer science and usually has about the same meaning, regardless of whether the discussion revolves around XML, Java programming, or database management.

Attributes belong to things, or things have attributes

A chapter in a book is a thing.  A chapter has a number.

An apple has a color, red or green.  An apple also has a taste, sweet or sour.

A dog has a size, small, medium, or large.

In the above statements, number, color, taste, and size are attributes.  Those attributes have values like red, green, sweet, sour, small, medium, and large.

As you can see, attributes are a very common aspect of the world in which we live and work.

People have attributes

A person also has attributes, and each attribute has a value.

Here is a list of some of the attributes (along with their values) that might be used to describe a person.
 

name=Joe
height=84
weight=176
complexion=pale
sex=male
training=Java programmer
degree=Masters

Obviously, there are many more attributes that could be used to describe a person.

The importance of an attribute depends on the context

The decision as to which of many possible attributes are important depends on the context in which the person is being considered.

Attributes for basketball players

For example, if the person is being considered in the context of being a candidate for an all-male basketball team, the height, weight, and sex attributes of a person will probably be important considerations.

Attributes for programmers

On the other hand, if the person is being considered in the context of being a candidate for employment as a programmer, the height, weight, and sex attributes should not be important at all, but the training and degree attributes might be very important.
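
The idea that context decides which attributes matter can be sketched in code.  The example below assumes Python's standard xml.etree.ElementTree module.  The person element, the CONTEXTS table, and the relevant_attributes function are all hypothetical; they simply restate the list and the two contexts above.  Note that in XML, unlike the informal list, each attribute value must be enclosed in quotes.

```python
import xml.etree.ElementTree as ET

# Hypothetical person element carrying the attributes listed in this article.
PERSON_XML = (
    '<person name="Joe" height="84" weight="176" complexion="pale" '
    'sex="male" training="Java programmer" degree="Masters"></person>'
)

# Hypothetical table of which attributes matter in each context.
CONTEXTS = {
    "basketball": ["height", "weight", "sex"],
    "programmer": ["training", "degree"],
}


def relevant_attributes(xml_text, context):
    """Return only the attributes that matter in the given context."""
    person = ET.fromstring(xml_text)
    return {name: person.get(name) for name in CONTEXTS[context]}


print(relevant_attributes(PERSON_XML, "basketball"))
print(relevant_attributes(PERSON_XML, "programmer"))
```

The document itself does not change between the two calls.  Only the consumer's choice of attributes does.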

Why does XML use attributes?

The definition of XML given earlier is repeated here for convenience.
 

XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of different ways.

A primary objective of XML is to separate content from presentation.

Once again, what is rendering?

In an earlier article, I suggested that in its most common modern use, the word rendering means to present something for human consumption.  (Usually, but not always, we are referring to visual consumption.)

Multiple renderings for the same document

I gave an example of a newspaper that can be rendered either on newsprint or on a computer screen.

What is a rendering engine?

If the newspaper (structured document) is created and maintained as an XML document, then some sort of computer program (often referred to as a rendering engine) will probably be used to render it into the desired presentation format.

What about our book?

Our book could also be rendered in a variety of different ways.

Chapter numbers may be important

Regardless of how it is rendered, it will probably be useful to separate and number the chapters.

The value of the number attribute could be used by the rendering engine to present the chapter number for a specific rendering.

Chapter numbers may be rendered differently

In some renderings, the number might appear on an otherwise blank page that begins a new chapter.  This is common in printed books, but is not common in online presentations.

In a different rendering, the chapter number might appear in the upper right or left-hand corner of each page.
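
As a rough illustration of two renderings of the same document, the sketch below produces plain-text lines in two styles.  It assumes Python's standard xml.etree.ElementTree module, and the two function names are hypothetical.  A list of text lines is only a stand-in for real page layout; the first style imitates a chapter number that introduces the chapter, and the second imitates a chapter number tucked into a page header.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
    <chap number="1">
        Text for Chapter 1
    </chap>

    <chap number="2">
        Text for Chapter 2
    </chap>
</book>"""


def render_chapter_openers(xml_text):
    """Put each chapter number on its own line ahead of the chapter text."""
    lines = []
    for chap in ET.fromstring(xml_text).findall("chap"):
        lines.append("Chapter " + chap.get("number"))
        lines.append(chap.text.strip())
    return lines


def render_page_headers(xml_text):
    """Put each chapter number in a short header on the same line as the text."""
    lines = []
    for chap in ET.fromstring(xml_text).findall("chap"):
        lines.append("[Ch. " + chap.get("number") + "] " + chap.text.strip())
    return lines


print(render_chapter_openers(BOOK_XML))
print(render_page_headers(BOOK_XML))
```

Both functions read the same number attribute.  The difference between them lives entirely in the rendering code, not in the XML.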

Separation of content from presentation

To reiterate, one of the most important characteristics of XML (as opposed to HTML) is that XML separates content from presentation.

The XML document contains information about structure and content.  Unlike HTML, it does not contain presentation information.

Presentation of XML requires a rendering engine

Presentation of an XML document requires the use of a rendering engine of some sort to render the XML document in a particular presentation style.

IE 5.0 contains a rendering engine

As of the date of this writing, to the best of my knowledge, the only commonly available XML rendering engine in use on the web is Microsoft’s Internet Explorer 5.0.  (Hopefully, Netscape will catch up soon.)

When provided with an XML document and an appropriate stylesheet, IE5 can transform XML data into HTML data and render it in the browser window.
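
To give a feel for what transforming XML data into HTML data means, here is a hand-written stand-in.  It is not a stylesheet and it is not how IE5 works internally; it is a sketch that assumes Python's standard xml.etree.ElementTree and html modules, with a hypothetical function named book_to_html and an HTML layout of my own choosing.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOK_XML = """<book>
    <chap number="1">
        Text for Chapter 1
    </chap>

    <chap number="2">
        Text for Chapter 2
    </chap>
</book>"""


def book_to_html(xml_text):
    """Build an HTML page with one heading and one paragraph per chapter."""
    parts = ["<html><body>"]
    for chap in ET.fromstring(xml_text).findall("chap"):
        # escape() keeps characters such as < and & from being read as markup.
        parts.append("<h1>Chapter " + escape(chap.get("number")) + "</h1>")
        parts.append("<p>" + escape(chap.text.strip()) + "</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


print(book_to_html(BOOK_XML))
```

The presentation decisions (a heading for the number, a paragraph for the text) appear only in the transforming code.  The XML document says nothing about them.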

What is a stylesheet?

I will have a lot to say about stylesheets in future articles.

Attributes may be useful in rendering

Attributes provide information about XML elements that may be useful to the rendering engine.

If the attribute values for an element are not important in a particular presentation context, the rendering engine for that context can ignore them.  If they are important in a particular context, the rendering engine can use them.
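
That choice can be as simple as a switch in the rendering code.  The sketch below assumes Python's standard xml.etree.ElementTree module; the render function and its use_number parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
    <chap number="1">Text for Chapter 1</chap>
    <chap number="2">Text for Chapter 2</chap>
</book>"""


def render(xml_text, use_number):
    """Render each chapter as one line, with or without its number attribute."""
    lines = []
    for chap in ET.fromstring(xml_text).findall("chap"):
        text = chap.text.strip()
        if use_number:
            text = chap.get("number") + ". " + text
        lines.append(text)
    return lines


print(render(BOOK_XML, use_number=True))   # the attribute is used
print(render(BOOK_XML, use_number=False))  # the attribute is ignored
```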

What’s Next?

I will have more to say about elements and content in the next article in this series.

Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.


About the author:

Richard Baldwin is a college professor and private consultant whose primary focus is a combination of Java and XML. In addition to the many platform-independent benefits of Java applications, he believes that a combination of Java and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two.  He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas.  He is the author of Baldwin’s Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.
