HTML vs XHTML

A bit of history

The history of HTML at W3C starts with HTML 3.2, code-named Wilbur, which was followed within a year by HTML 4.0 and then by HTML 4.01, the last version of HTML. HTML 4.01 is also the last W3C specification that defines the semantics of the elements of the main hypertext mark-up language on the Web. From HTML 3.2 to HTML 4.01, the language improved a lot and started to remove features that caused internationalization and accessibility problems. XHTML 1.0 was created shortly after HTML 4.01 to help the transition of hypertext to a new generation of mark-up languages for text.

XHTML 1.1

In this document, we will not address XHTML 1.1, which is an additional step toward a more flexible version of hypertext with the full benefits of the XML architecture and the integration of different technologies. Note that XHTML 1.1 slightly extends the semantics of HTML 4.01 by including a ruby module, used in particular for East Asian languages such as Japanese (read the ruby specification for more information).

Semantics

HTML 4.01 and XHTML 1.0 assign the same semantics to their elements and attributes. For example, the address element has exactly the same meaning in HTML 4.01 and XHTML 1.0. Only parts of the syntax vary between the two languages. For example:

HTML 4.01 example

<img alt="Portrait Murakami Haruki" 
   src="/images/murakami.jpg">

<p lang="fr">Je 
levai la tête pour regarder les 
étoiles.  Leur vue apaisa peu 
à peu les battements de mon 
coeur.</p>

<p><cite class="title">Chroniques 
de l'oiseau à ressort</cite>
 - <cite class="author">Haruki 
 Murakami</cite></p>

XHTML 1.0 example

<img alt="Portrait Murakami Haruki"
   src="/images/murakami.jpg"/>

<p xml:lang="fr">Je 
levai la tête pour regarder les
 étoiles. Leur vue apaisa peu 
 à peu les battements de mon 
 coeur.</p>

<p><cite class="title">Chroniques 
de l'oiseau à ressort</cite>
 - <cite class="author">Haruki 
 Murakami</cite></p>

Their syntaxes are still very similar and there are only a few differences between them.
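
The most visible difference in the two examples, the closing of the empty img element, can be checked mechanically. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree as the XML parser. The helper name is_well_formed and the dummy root wrapper are illustrative choices, not part of either specification, and the check covers XML well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (hypothetical helper)."""
    try:
        # Wrap the fragment in a dummy root so that several sibling
        # elements can be checked in one call.
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


# The img element as written in the HTML 4.01 example: never closed.
html_style = '<img alt="Portrait Murakami Haruki" src="/images/murakami.jpg">'

# The img element as written in the XHTML 1.0 example: closed with "/>".
xhtml_style = '<img alt="Portrait Murakami Haruki" src="/images/murakami.jpg"/>'

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

The HTML 4.01 form is perfectly legal HTML; it is only rejected here because an XML parser requires every element to be closed.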

Both languages come in three flavours: frameset, transitional and strict. The "strict" flavour is strongly recommended for both of them; indeed, the strict flavour makes sure that artefacts and problems that were identified in HTML 3.2 are not allowed, while the transitional flavour allows some of them in order to help implementers upgrade their software or their content smoothly.

Benefits? Use the right tool for the right job.

Is there any advantage in using one version of HTML rather than the other? There is no simple answer, and the benefits you will gain are tied to the context and the use of the language.

Switching from HTML 4.01 to XHTML 1.0 brings almost no direct benefits for the visitors of your Web site; still, there are several good reasons on the authoring side that may encourage you to make the switch:

Easier to maintain
Based on the XML syntax rules, where every opened tag must be closed, XHTML is easier to code and to maintain, since the structure is more apparent and mismatching tags are easier to spot.
XSL ready
XHTML 1.0 is the reformulation of HTML 4.01 in XML. Therefore, XHTML documents are both hypertext documents and XML documents. A powerful technology has been developed at W3C to manipulate and transform XML documents: the Extensible Stylesheet Language Transformations (XSLT). This technology is tremendously useful for creating various new resources automatically from an XHTML document.

A few recipes on using XSLT and XHTML together are given on the W3C Web site.

Ready for the future
When the new version of XHTML becomes a recommendation, XHTML 1.0 documents will be easily upgradable to this new version, so that you can take advantage of its exciting new features. It is likely that an XSLT style sheet will be available by then to help you move your XHTML 1.0 (strict) documents to XHTML 2.0 documents.
Easier to teach and to learn

The syntax rules defined by XML are more consistent and hence easier to explain than the SGML rules on which HTML is based. It also helps teach the CSS Box Model.
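
Because an XHTML document is also an XML document, any generic XML tool can derive new resources from it. The sketch below uses Python's standard xml.etree.ElementTree rather than XSLT (an XSLT style sheet could express the same idea with a template matching the heading elements) to list the headings of a small document, as a starting point for a table of contents. The sample document, the list_headings helper and the choice of h1 to h3 are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML 1.0 documents declare on their root element.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# A tiny, made-up XHTML document used only for this illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>HTML vs XHTML</title></head>
  <body>
    <h1>HTML vs XHTML</h1>
    <h2>A bit of history</h2>
    <p>Some text.</p>
    <h2>Semantics</h2>
  </body>
</html>"""


def list_headings(xhtml_text, levels=("h1", "h2", "h3")):
    """Return (level, text) pairs for headings, in document order."""
    root = ET.fromstring(xhtml_text)
    # ElementTree stores namespaced tags as "{namespace}localname".
    wanted = {"{%s}%s" % (XHTML_NS, name): name for name in levels}
    headings = []
    for element in root.iter():
        if element.tag in wanted:
            text = "".join(element.itertext()).strip()
            headings.append((wanted[element.tag], text))
    return headings


for level, text in list_headings(SAMPLE):
    print(level, text)
# h1 HTML vs XHTML
# h2 A bit of history
# h2 Semantics
```

The same script fails with a parse error on a document that is not well-formed, which is one practical way in which mismatched tags become easier to spot.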

So what?

HTML 4.01 is as valuable as XHTML 1.0 in daily use; the syntax proposed by XHTML 1.0 has several important benefits, but the weight of these benefits has to be evaluated in the context of your project: use the right tool for the right job.

For a Web designer, starting to use XHTML 1.0 might be helpful in some circumstances and will certainly help to negotiate the future smoothly. XHTML 1.0 gives a wonderful opportunity to learn about XML languages and their possibilities without having to learn new semantics. It seems to be a reasonable way of improving one's skills without too much work.


Created: 2003-09-26 by Karl Dubost
Last modified: 2003-10-07 01:37:50 by ot

Copyright © 2000-2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved.