

  • April 3, 2000
  • By Joe Burns

Not too long ago, HTML version 4.0 was recommended by the World Wide Web Consortium (W3C). I put up a tutorial on all of the new HTML 4.0 commands, then set about creating in-depth tutorials for each command. (See "HTML 4.0".)

It was only a matter of time before someone wrote to me and asked when HTML 5.0 would be coming out. I have your answers:

  • It never will.
  • It already has.

On January 26, 2000, the W3C released specifications defining what they termed XHTML 1.0 (Extensible Hypertext Markup Language). I have also seen it written "xHTML", "Xhtml", and "XML/HTML".

Now, depending on which articles you read (and I've read waaay too many at this point), XHTML is either HTML 5.0, or HTML versions breathed their last with 4.0 and there will never be a 5.0 because XHTML is the direction markup languages are taking now.

Confused? Let's beat through it.

Right now there are two languages vying to be number one on the Web. The first is good old HTML and the second is Extensible Markup Language (see "What is XML?"). Which is better really depends on whom you talk to and what they want to do with the pages they create.

HTML is well within the grasp of the Weekend Silicon Warrior and creates decent text and image pages. It is, by far, the most-used language on the Web.

XML is much more dynamic and allows for much more specific database interaction than was ever possible before. An example would be searching for "dog" in Yahoo!. You get everything that has "dog", as well as all related, larger words such as "dogma". Well, XML can change all that. Your searches and requests can be specific. Results will be specific.

Another big plus on the XML side is the ability for you to create custom XML tags. If you want a tag named "zork" that allows you to turn text green and change the font size to 24 point, you can create it. Follow the links above to Goodies tutorials explaining how.

What Is This XHTML?

Once again, some say it's HTML with XML qualities. Others, like me, say it's XML with HTML written into the Document Type Definition (DTD).

Here's the scoop as I understand it. XML has become the chosen language for the Web's future. At least, that's the feeling I get from reading the pages on the W3C Web site. Obviously, you cannot simply eliminate HTML, so they did what, I think, was a pretty smart thing. They combined them. I just don't know that I'm overly thrilled with the way they combined them.

Document Type Definition: DTD

Inside your browser, there's a DTD. It's different from browser to browser depending on which version you're using. The reason that Internet Explorer 4.0 understands some HTML 4.0 level commands and Internet Explorer 3.0 doesn't is because those commands were written into the 4.0 browser's DTD.

The new XHTML 1.0 DTD (which looks like this, in case you're interested) is basically the XML DTD with the HTML 4.0 DTD put inside it. Users must follow the majority of XML rules because HTML is under XML's umbrella rather than being the other way around.


The W3C suggests that HTML should be "an application of XML". The purpose is to tighten HTML's coding standards to make them compliant with XML.

You may not like that, but there's some sense to it. XML is very specific. One thing means one thing. Period. HTML isn't so specific. For example:

  • Tags can be in caps or not.
  • TEXTAREA boxes require end tags, yet text boxes do not.
  • Tags can end in any order regardless of how they were placed.

I'm sure you can come up with some more examples, but these are the three that I point out to students.

By placing the HTML 4.0 DTD under the XML DTD, the language has no choice but to follow the same strict rules that XML does. Some will love that, others won't.

The Rules

At the moment, the best one can hope to do is to write XHTML documents that are compatible with current browsers. I'll run down a few of the rules for writing in XHTML. If you've already read my XML tutorial, many will be familiar to you.

1. You will use the XML declaration and the XHTML DOCTYPE statement to start every XHTML page:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

The commands will alert the browser displaying the page that XHTML is the language to render.

2. The head and body tags are now mandatory.
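If you'd like a machine to check rules 1 and 2 for you, here's a minimal sketch in Python. It assumes the standard library's XML parser is a fair stand-in for a strict XHTML reader, and it only checks that the page is well-formed and has the required parts, not that it obeys the whole DTD. The sample page and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A minimal page following rules 1 and 2. The title and text are made up.
# Note: a strictly conforming page also puts an xmlns attribute on <html>;
# it is left out here so the tag names stay short.
PAGE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head><title>Test page</title></head>
<body><p>Hello</p></body>
</html>"""


def has_head_and_body(document):
    """Return True if the root is <html> with <head> and <body> children."""
    root = ET.fromstring(document)  # raises ParseError if not well-formed
    children = [child.tag for child in root]
    return root.tag == "html" and "head" in children and "body" in children


print(has_head_and_body(PAGE))                       # True
print(has_head_and_body(b"<html><p>Hi</p></html>"))  # False
```

The parser reads the DOCTYPE line but does not fetch the DTD, so this runs without a network connection.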

3. Every tag must be closed.

In HTML, you could get away with simply putting a <P> between paragraphs and the browser would render it just fine. If you only had one table on a page, you didn't need the end TD and end TR tags. Under the XHTML DTD, that's no longer true. All tags that require end tags get end tags.
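Here's a small sketch in Python of how a strict parser reacts to unclosed tags. It assumes Python's built-in XML parser behaves the way an XHTML-aware reader would; the helper name and the sample paragraphs are mine, not part of any standard.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


html_style = "<body><p>First paragraph<p>Second paragraph</body>"
xhtml_style = "<body><p>First paragraph</p><p>Second paragraph</p></body>"

print(is_well_formed(html_style))   # False: the <p> tags are never closed
print(is_well_formed(xhtml_style))  # True
```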

4. Empty tags get a terminating slash.

An empty tag is a tag that holds no content and so never has an end tag. Examples include <BR> and <HR>.

Under the XHTML DTD, empty tags will now carry a space following the tag text and then a terminating slash, like so:

  • <BR> is now <br />.
  • <HR> is now <hr />.
  • <IMG SRC="--"> is now <img src="--" />.
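To see that the trailing slash really is just shorthand for "open and close right away", here's a quick sketch in Python. It assumes the standard library's XML parser and serializer; the sample paragraph is made up.

```python
import xml.etree.ElementTree as ET

# The long form and the short form describe the same empty element.
long_form = ET.fromstring("<p>Line one<br></br>Line two</p>")
short_form = ET.fromstring("<p>Line one<br />Line two</p>")

# Python's serializer writes both back in the short form.
print(ET.tostring(long_form, encoding="unicode"))
print(ET.tostring(short_form, encoding="unicode"))
# both print: <p>Line one<br />Line two</p>

# A bare HTML-style <br> is rejected because it is never closed.
try:
    ET.fromstring("<p>Line one<br>Line two</p>")
    bare_br_accepted = True
except ET.ParseError:
    bare_br_accepted = False
print(bare_br_accepted)  # False
```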

You may have noticed above that I wrote head, body, br, hr, and img in lower case in the XHTML examples. That's because:

5. All tags must be lower case.

This applies to tag names and attribute names, but not to attribute values. For example, both of these formats are acceptable under the XHTML DTD:

  • <font color="#ffffcc">
  • <font color="#FFFFCC">
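Here's a rough sketch in Python that hunts for upper case tag names. It assumes the markup is already well-formed enough for Python's XML parser to read, and the function name is my own invention.

```python
import xml.etree.ElementTree as ET


def uppercase_tags(markup):
    """Return the tag names that are not written entirely in lower case."""
    return [el.tag for el in ET.fromstring(markup).iter()
            if el.tag != el.tag.lower()]


# Attribute values may use either case; only the tag names are checked here.
print(uppercase_tags('<p><font color="#FFFFCC">Hi</font></p>'))  # []
print(uppercase_tags('<P><FONT color="#ffffcc">Hi</FONT></P>'))  # ['P', 'FONT']

# XML is case-sensitive, so an upper case start tag does not match a
# lower case end tag.
try:
    ET.fromstring("<B>Text</b>")
    mixed_case_accepted = True
except ET.ParseError:
    mixed_case_accepted = False
print(mixed_case_accepted)  # False
```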

You may have noticed that I have quotes around all of the attribute values. That's because:

6. Attribute quotes are now mandatory.

7. Tags may not overlap.

In HTML, browsers put up with this format. It will render:

  • <B><I>Text</B></I>

No more. Now the tags must follow a logical begin and end pattern. They must end at the same level as they are started. This is the proper XHTML method of writing the code above:

  • <b><i>Text</i></b>

Once again, note the lower case tags.
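The difference between a forgiving HTML reader and a strict XML one can be sketched in a few lines of Python. This assumes the standard library's html.parser is a fair stand-in for a lenient browser and its XML parser for a strict one; the class name and the overlapping sample are made up for the demonstration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start and end tags in the order a lenient HTML parser sees them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append("open " + tag)

    def handle_endtag(self, tag):
        self.events.append("close " + tag)


overlapping = "<b><i>Text</b></i>"

# The lenient parser reports the tags as they come and never complains.
logger = TagLogger()
logger.feed(overlapping)
logger.close()
print(logger.events)  # ['open b', 'open i', 'close b', 'close i']

# The strict parser refuses the same text.
try:
    ET.fromstring(overlapping)
    strict_result = "accepted"
except ET.ParseError:
    strict_result = "rejected"
print(strict_result)  # rejected
```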

8. Attribute values must be denoted.

Most attributes are done this way. For example, FONT FACE="arial". Notice that "arial" follows the attribute "FACE=".

The attribute and equal signs, in some cases, have been eliminated in HTML. For example:

  • <INPUT TYPE="radio" checked>

The word "checked" is a minimized attribute. Under XHTML, no more. You must denote every attribute. Here's the correct method of writing what is above under the XHTML DTD:

  • <input type="radio" checked="checked" />

These don't come up too often. Here are a few examples in HTML format:

  • <INPUT TYPE="radio" checked>
  • <INPUT TYPE="checkbox" checked>
  • <OPTION selected>
  • <DL compact>
  • <UL compact>

In each case, you'll need to set the minimized attribute to one that is denoted. The easy way to remember it is that it always denotes itself: checked="checked" and selected="selected".
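If you have a pile of old tags to convert, the "it denotes itself" trick is easy to automate. Here's a minimal sketch in Python, assuming the standard library's html.parser reads the old tag for us. The class name, the function name, and the short list of empty tags are all my own choices for this example, not a complete converter.

```python
from html import escape
from html.parser import HTMLParser

# Tags treated as empty here; a partial list chosen for this sketch.
EMPTY_TAGS = {"br", "hr", "img", "input"}


class StartTagRewriter(HTMLParser):
    """Rewrite HTML start tags in XHTML style."""

    def __init__(self):
        super().__init__()
        self.rewritten = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]  # HTMLParser has already lower-cased the tag name
        for name, value in attrs:
            if value is None:   # minimized attribute such as "checked"
                value = name    # it denotes itself
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        closing = " />" if tag in EMPTY_TAGS else ">"  # rule 4
        self.rewritten.append("<" + " ".join(parts) + closing)


def to_xhtml_start_tag(html_tag):
    """Return the first start tag in html_tag, rewritten in XHTML style."""
    rewriter = StartTagRewriter()
    rewriter.feed(html_tag)
    rewriter.close()
    return rewriter.rewritten[0]


print(to_xhtml_start_tag('<INPUT TYPE="radio" checked>'))
# <input type="radio" checked="checked" />
print(to_xhtml_start_tag("<OPTION selected>"))
# <option selected="selected">
```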

9. The <pre> tag cannot contain: img, object, big, small, sub, or sup.

10. You may not have any forms inside of other forms.
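Rules 9 and 10 are both about what may sit inside what, so one small checker can cover them. This is a sketch in Python, assuming the page already parses as XML; the function name and the sample snippets are hypothetical.

```python
import xml.etree.ElementTree as ET

BANNED_IN_PRE = {"img", "object", "big", "small", "sub", "sup"}


def nesting_problems(markup):
    """Return messages for banned tags inside <pre> and forms inside forms."""
    root = ET.fromstring(markup)
    problems = []
    for pre in root.iter("pre"):
        for inner in pre.iter():
            if inner.tag in BANNED_IN_PRE:
                problems.append("<%s> inside <pre>" % inner.tag)
    for form in root.iter("form"):
        # iter() yields the form itself first, so skip it
        nested = [f for f in form.iter("form") if f is not form]
        if nested:
            problems.append("<form> inside <form>")
    return problems


print(nesting_problems("<body><pre>E = mc<sup>2</sup></pre></body>"))
# ['<sup> inside <pre>']
print(nesting_problems("<body><form><div><form></form></div></form></body>"))
# ['<form> inside <form>']
print(nesting_problems("<body><pre>plain text</pre><form></form></body>"))
# []
```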

11. If your code contains a &, it must be written as &amp; (the semicolon is required).
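Here's a short sketch in Python showing the escaped and unescaped forms side by side. It uses the standard library's escape helper and XML parser; the sample text is made up. Note that the escaped form ends with a semicolon.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "Burns & Sons"
safe_text = escape(raw_text)
print(safe_text)  # Burns &amp; Sons

# The escaped form parses, and the parser hands back the plain ampersand.
element = ET.fromstring("<p>" + safe_text + "</p>")
print(element.text)  # Burns & Sons

# The raw form is rejected by the XML parser.
try:
    ET.fromstring("<p>" + raw_text + "</p>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False
print(raw_accepted)  # False
```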

12. Any use of CSS should write tag and attribute names in all lower case lettering.

13. Any use of JavaScript should be done through external JavaScripting.

OK, this is not always true. You can set up a JavaScript within an XHTML DTD page. Here's a look at the format:

<script type="text/javascript"><![CDATA[
document.write("Hi there");
]]></script>

I think you'll agree, my statement above, although not totally true, will save you multiple headaches.
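Why bother with the CDATA wrapper at all? Here's a minimal sketch in Python of what happens when a script contains a less-than sign. It assumes Python's XML parser reads the page the way a strict XHTML reader would; the one-line script and the helper name are made up.

```python
import xml.etree.ElementTree as ET

# A one-line script that uses "<", which XML would read as the start of a tag.
script_body = 'if (1 < 2) { document.write("Hi there"); }'


def parses(markup):
    """Return True if the XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


bare = '<script type="text/javascript">' + script_body + "</script>"
wrapped = ('<script type="text/javascript"><![CDATA['
           + script_body + "]]></script>")

print(parses(bare))     # False
print(parses(wrapped))  # True
print(ET.fromstring(wrapped).text == script_body)  # True
```

An external script file sidesteps the problem entirely, since the parser never sees the script text.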

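14. <!--Comments are still allowed, but CDATA is new.-->

Ordinary comments are written just as they were in HTML. What XHTML adds is the CDATA section, which tells the parser to treat everything inside it as plain text rather than markup. It is not a comment, and it is written as:

  • <![CDATA[text goes in here]]>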

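15. JavaScripts are no longer commented out.

An XML parser is allowed to throw away anything inside <!-- -->, so a script hidden that way may never run, and it will throw big errors in some browsers.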
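Here's a minimal sketch in Python of what an XML parser does with a commented-out script. It assumes Python's standard XML parser, which drops comments by default, behaves like a strict XHTML reader might; the script text is the same made-up one-liner used above.

```python
import xml.etree.ElementTree as ET

# The old HTML trick: hide the script from the page inside a comment.
hidden = ('<script type="text/javascript"><!--\n'
          'document.write("Hi there");\n'
          '//--></script>')

script = ET.fromstring(hidden)

# The parser threw the comment away, so nothing is left of the script.
print(script.text)  # None
print(len(script))  # 0
```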

