HTML 4.01 or XHTML 1.0? The choice between the two popular ways of authoring for the web seldom yields a clear answer: after all, the two languages share the same semantics, and the differences are mostly about the writing style.
Advocates of the XHTML style will hail the potential of XML for transformation and processing. Advocates of HTML 4.01 will generally reply that Internet Explorer, as of today, does not recognize the preferred media type for XHTML. As a result, most people serve XHTML in a way tantamount to serving tag soup to browsers: by that logic, using HTML 4.01 is actually the “strict” choice.
Both are quite correct, but for anyone authoring (X)HTML by hand, there is one very good reason, often overlooked, to prefer the XHTML syntax to the “classic” HTML one: shorttags.
Let’s look at the following piece of HTML markup.
<p<a href="/">first part of the text</> second part
Now for the surprising part: the above is proper HTML. Valid, conformant, everything. It uses a little-known feature of SGML called shorthand markup, which was allowed in HTML up to HTML 4.01. But what used to be a “cool” feature for SGML experts becomes a liability in HTML, where the construct is more likely to appear as a typo than as a conscious choice.
All could be fine if this kind of typo-that-happens-to-be-legal were properly implemented in contemporary HTML user agents. It is not. In the example above, </> is supposed to close the <a> element. In most browsers today it does not, so the second part of the text becomes part of the link when it should not. Try it.
Validation as a helping tool
This is reason enough for me, as a clumsy author of HTML, to prefer the XHTML document types, notwithstanding all the media type debate: validation is an incentive to keep my code clean. XHTML forces me to close my elements and put my attribute values in quotes, and it won’t let disruptive typos pass as valid.
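To make this concrete, here is a minimal sketch that feeds the example above to an XML parser. It only checks well-formedness, which is a weaker test than validating against an XHTML DTD, but it is already enough to reject the typo. The helper name is_well_formed and the two sample strings are illustrative assumptions, not part of any validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The shorthand example from the article, and the same content written out in full.
typo = '<p<a href="/">first part of the text</> second part'
fixed = '<p><a href="/">first part of the text</a> second part</p>'

print(is_well_formed(typo))   # False: "<p<a" is not a legal XML start tag
print(is_well_formed(fixed))  # True
```

An SGML parser working from the HTML 4.01 DTD would accept both strings, which is exactly why the typo can slip through HTML validation.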
That does not mean we are leaving HTML 4.01 authors in the awkward company of shorttags: since the HTML specification lists these as not recommended, the upcoming release of the Markup Validator will signal any detected use of shorthand markup with a warning. SGML hackers can still use it at their own risk. Others will be warned about their typos and advised to fix them.
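As a rough illustration of what such a warning could look for, the sketch below scans markup for the two shorthand constructs used in the example: an empty end tag and a start tag that is cut short by the next tag. It is a regex-based approximation written for this article, not how the Markup Validator detects them; the names PATTERNS and find_shorttags are hypothetical, and the patterns ignore comments, CDATA, and "<" inside quoted attribute values.

```python
import re

# Hypothetical patterns covering only the two constructs in the article's example.
PATTERNS = [
    ("empty end tag", re.compile(r"</>")),
    # A start tag whose ">" is missing because another tag begins first, e.g. "<p<a ...>".
    ("unclosed start tag", re.compile(r"<[A-Za-z][^<>]*(?=<)")),
]


def find_shorttags(markup):
    """Return (offset, kind, text) tuples for each suspected shorthand construct."""
    findings = []
    for kind, pattern in PATTERNS:
        for match in pattern.finditer(markup):
            findings.append((match.start(), kind, match.group(0)))
    return sorted(findings)


sample = '<p<a href="/">first part of the text</> second part'
for offset, kind, text in find_shorttags(sample):
    print(f"warning: {kind} {text!r} at offset {offset}")
```

Run against the fully written-out form of the same markup, the function returns an empty list, so only the shorthand spelling triggers a warning.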