“HTML5: A vocabulary and associated APIs for HTML and XHTML”
original title: “Web Applications 1.0”
(re)defines HTML as an abstract language
What does HTML5 not do?
HTML5 does not treat HTML as SGML
HTML5 does not use DTDs
Browsers do not have SGML parsers — they don’t check DTDs or follow other SGML parsing rules.
Instead, they use custom parsers built specifically for parsing HTML
HTML5 does specify an XML serialization of HTML…
…but HTML5 does not restrict HTML to only a (well-formed) XML-based serialization
text/html: attribute syntax
All of the examples below are conformant HTML
<input name="be evil">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
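To see how the text/html attribute syntax behaves in practice, here is a minimal sketch using Python's standard-library `html.parser`. It is a tolerant tokenizer, not a conformant HTML5 parser, and the names `AttrCollector` and `parse_attrs` are hypothetical helpers introduced only for this illustration. It shows double-quoted, single-quoted, unquoted, and empty attributes all being accepted, and why a value containing a space needs quotes.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records every start tag with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; a value of None
        # means the attribute was written without any value.
        self.seen.append((tag, dict(attrs)))


def parse_attrs(markup):
    """Return a list of (tag, attributes) tuples found in the markup."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


samples = [
    '<input name="be evil">',  # double-quoted value
    "<input name='be evil'>",  # single-quoted value
    "<input name=evil>",       # unquoted value
    "<input disabled>",        # empty attribute, no value at all
    "<input name=be evil>",    # unquoted value ends at the space
]

for markup in samples:
    print(markup, "->", parse_attrs(markup))
```

In the last sample the unquoted value stops at the first space, so `evil` is read as a second, valueless attribute. That is the practical reason to quote any value that contains whitespace.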
Why not restrict HTML to well-formed XML (that is, XHTML)?
XML requires draconian error handling for all content served as XHTML…
…which means that for any XHTML page that contains even one single minor error…
…a browser must fail to display the page at all…
…and because the vast majority of HTML on the Web is not well-formed XML…
…we need to interoperably handle that “real world” (or “tag soup”) HTML
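The contrast between draconian and tolerant handling can be sketched with two standard-library Python parsers. This is only an illustration: `xml.etree.ElementTree` stands in for an XML processor and `html.parser` for a forgiving HTML tokenizer (it does not implement the HTML5 parsing algorithm). The function names `parse_as_xml` and `parse_as_html` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects the text content it encounters."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_as_xml(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Draconian handling: a single well-formedness error aborts parsing.
        return False


def parse_as_html(markup):
    """Return the text a tolerant tokenizer recovers from the markup."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


tag_soup = "<p>Hello <b>world</p>"  # the <b> element is never closed

print(parse_as_xml(tag_soup))   # False: the XML parser rejects the document
print(parse_as_html(tag_soup))  # Hello world: the content is still recovered
```

The same input yields no usable result from the XML parser, while the tolerant parser still gives the reader the content.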
HTML5 includes a precise algorithm for how conformant UAs/browsers must parse HTML (one that closely matches existing browser implementations)…
…including parsing of HTML that may not be well-formed XML and is served up as
Because many authors produce non-conformant documents and applications must cope with them, HTML5 precisely specifies how markup errors and other classes of errors are handled, so that UAs/browsers handle them interoperably
We need to specify error handling behavior to ensure interoperability “even in the face of documents that do not comply to the letter of the specifications”.
Authors will write invalid content regardless of what we spec. So the spec states “what authors must not do, and then tells implementors what they must do when an author does it anyway”.
See http://esw.w3.org/topic/HTML/DraconianErrorHandling and Ian Hickson’s “Error handling and Web language design”, http://ln.hixie.ch/?start=1074730186
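The idea of flagging an error for authors while still defining what the implementation does with it can be sketched as follows. This is a toy recovery strategy built on Python's `html.parser`, not the HTML5 tree-construction algorithm. The class `RecoveringParser`, the function `parse`, and the shortened `VOID` set are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Assumption: a shortened list of elements that never take an end tag.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class RecoveringParser(HTMLParser):
    """Toy parser: reports markup errors but always keeps going."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.errors = []  # what the author did wrong
        self.events = []  # what the parser did anyway

    def handle_starttag(self, tag, attrs):
        self.events.append("open " + tag)
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # void elements are never on the stack
        if tag not in self.stack:
            # Error reported, then recovery: ignore the stray end tag.
            self.errors.append("stray end tag </%s>" % tag)
            return
        # Recovery: implicitly close anything left open inside this element.
        while self.stack[-1] != tag:
            unclosed = self.stack.pop()
            self.errors.append("<%s> was not closed" % unclosed)
            self.events.append("close " + unclosed)
        self.stack.pop()
        self.events.append("close " + tag)


def parse(markup):
    """Return (events, errors) for the given markup."""
    parser = RecoveringParser()
    parser.feed(markup)
    parser.close()
    return parser.events, parser.errors


events, errors = parse("<p>Hello <b>world</p></i>")
print(events)  # ['open p', 'open b', 'close b', 'close p']
print(errors)  # ['<b> was not closed', 'stray end tag </i>']
```

The `errors` list plays the role of the author-facing conformance rules, and the `events` list plays the role of the implementor-facing behavior. Because the recovery steps are fixed, any two implementations of this sketch would produce the same result for the same broken input.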
So what’s new/different in HTML5?
New elements for better document structure…
img, but scripted…
Used on Y! Pipes…
Along with new elements, we also have new APIs
Persistent client-side data storage
Standards-based Web applications work the same across browsers, so users are not locked into using any particular product from any particular vendor. Users get to choose.