HTML5 Overview

Michael(tm) Smith

W3C HTML WG Co-Chair
W3C Web Apps WG Co-Team Contact

with extensive borrowings from Anne van Kesteren and Simon Pieters

Full title

“HTML5: A vocabulary and associated APIs for HTML and XHTML”

original title: “Web Applications 1.0”

Focuses on Web Applications

HTML as an abstract language

(re)defines HTML as an abstract language

Other serializations possible?

S-expressions, … ?

What does HTML5 not do?

HTML5 does not treat HTML as SGML

HTML5 does not use DTDs

True: Some tools do handle HTML as SGML

But: Browsers don’t process HTML as SGML

Browsers do not have SGML parsers — they don’t check DTDs or follow other SGML parsing rules.

Instead, they use custom parsers built specifically for parsing HTML

HTML5 does specify an XML serialization of HTML…

…but HTML5 does not restrict HTML to only a (well-formed) XML-based serialization

text/html: attribute syntax

All of the examples below are conformant HTML
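As a rough sketch, the four attribute forms that the text/html syntax accepts (empty, unquoted, single-quoted, double-quoted) can be shown with a toy parser. The function name `parseAttributes` and the regular expression are assumptions made for illustration; this is not the HTML5 tokenizer.

```javascript
// Toy illustration only: this regex is NOT the HTML5 tokenizer.
// It recognizes the four attribute forms allowed in text/html:
//   empty:          disabled
//   unquoted:       value=yes
//   single-quoted:  type='checkbox'
//   double-quoted:  name="be evil"
function parseAttributes(attributeSource) {
  const attrs = {};
  // An attribute name, optionally followed by "=" and a
  // double-quoted, single-quoted, or unquoted value.
  const pattern =
    /([^\s"'<>\/=]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'=<>`]+)))?/g;
  let match;
  while ((match = pattern.exec(attributeSource)) !== null) {
    // An attribute with no value is treated as the empty string.
    const value = match[2] ?? match[3] ?? match[4] ?? "";
    attrs[match[1].toLowerCase()] = value;
  }
  return attrs;
}

const parsed = parseAttributes(
  `disabled value=yes type='checkbox' name="be evil"`
);
console.log(parsed);
```

All four forms end up as ordinary name/value pairs, which is the point of the slide: the quoting style is a matter of syntax, not of conformance.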

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!doctype html>

Declaring a character encoding

<meta charset="utf-8">

Why not restrict HTML to well-formed XML (that is, XHTML)?

XML requires draconian error handling for all content served as XHTML…

…which means that for any XHTML page that contains even one single minor error…

…a browser must fail to display the page at all…

…and because the vast majority of HTML on the Web is not well-formed XML…

…we need to interoperably handle that “real world” (or “tag soup”) HTML
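The contrast between draconian and tolerant handling can be sketched with a toy tag checker. Everything here is an assumption for illustration: the name `processMarkup`, the regular expression, and the recovery rule (implicitly closing open elements) are a simplification, not the XML rules and not the HTML5 parsing algorithm.

```javascript
// Toy model only: real XML and HTML5 parsers are far more involved
// (this sketch ignores void elements, comments, attributes, and so on).
function processMarkup(source, draconian) {
  const open = [];        // stack of currently open element names
  const recoveries = [];  // what the tolerant mode did to keep going
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  let match;
  while ((match = tagPattern.exec(source)) !== null) {
    const name = match[2].toLowerCase();
    if (match[1] !== "/") {
      open.push(name);                 // start tag
      continue;
    }
    if (open[open.length - 1] === name) {
      open.pop();                      // correctly nested end tag
      continue;
    }
    if (draconian) {
      // XML-style handling: the first error is fatal.
      throw new Error("fatal: unexpected </" + name + ">");
    }
    const index = open.lastIndexOf(name);
    if (index === -1) {
      recoveries.push("ignored stray </" + name + ">");
    } else {
      // Close everything opened after the matching start tag.
      while (open.length > index + 1) {
        recoveries.push("implicitly closed <" + open.pop() + ">");
      }
      open.pop();
    }
  }
  while (open.length > 0) {
    if (draconian) {
      throw new Error("fatal: <" + open[open.length - 1] + "> was never closed");
    }
    recoveries.push("implicitly closed <" + open.pop() + ">");
  }
  return recoveries;
}
```

With the input `<p>Hello <b>world</p>`, the draconian mode throws and nothing is shown, while the tolerant mode recovers and reports what it did. If every implementation applies the same recovery rule, the result is interoperable.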

Precisely specifying parsing of “real world” HTML

HTML5 includes a precise algorithm for exactly how conformant UAs/browsers must parse HTML (an algorithm that closely matches existing browser implementations)…

…including parsing of HTML that may not be well-formed XML and is served up as text/html

non-browser HTML5 parsers

Formal spec for error handling

Recognizing that many authors produce non-conformant documents and that apps need to deal with such documents, HTML5 precisely specifies handling of markup errors and other classes of errors — so that such errors will be handled in an interoperable way across UAs/browsers

In other words

We need to specify error handling behavior to ensure interoperability “even in the face of documents that do not comply to the letter of the specifications”.


And in yet other words

Authors will write invalid content regardless of what we spec. So the spec states “what authors must not do, and then tells implementors what they must do when an author does it anyway”.

See Ian Hickson’s “Error handling and Web language design”.

HTML5 conformance checking

So what’s new/different in HTML5?

Support for de facto standards

Web Forms 2.0


New elements…

New elements for better document structure…


canvas element: img, but scripted…

Used on Y! Pipes…

<canvas width="150" height="200" id="demo">
  <!-- fallback content here -->
</canvas>

<script type="text/javascript">
  // Look up the canvas and get its 2D drawing context
  var canvas = document.getElementById("demo"),
      context = canvas.getContext("2d");
  // Fill the whole 150x200 canvas with lime
  context.fillStyle = "lime";
  context.fillRect(0, 0, 150, 200);
</script>

video and audio elements
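Like canvas, the media elements are scriptable. A minimal sketch, assuming a media element that exposes the `paused` property and the `play()` and `pause()` methods; the helper name `togglePlayback` is hypothetical.

```javascript
// Toggle playback on a video or audio element.
// "media" is expected to be an HTMLMediaElement (or anything with
// the same paused / play() / pause() members).
function togglePlayback(media) {
  if (media.paused) {
    media.play();
    return "playing";
  }
  media.pause();
  return "paused";
}

// In a browser this could be wired to a button, for example:
//   button.onclick = function () {
//     togglePlayback(document.querySelector("video"));
//   };
```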

Along with new elements, we also have new APIs

Cross-document messaging
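A minimal sketch of cross-document messaging, using `postMessage` on the sending side and a `message` event handler on the receiving side. The origin `https://example.com` and the function names are assumptions for illustration.

```javascript
// Hypothetical origin, used for illustration only.
const TRUSTED_ORIGIN = "https://example.com";

// Receiving side: accept messages only from the origin we trust.
function handleMessage(event) {
  if (event.origin !== TRUSTED_ORIGIN) {
    return null; // ignore messages from unknown origins
  }
  const reply = "received: " + event.data;
  // event.source is the sending window; answer it on its own origin.
  event.source.postMessage(reply, event.origin);
  return reply;
}

// In a browser, the handler is registered on the receiving window.
if (typeof window !== "undefined") {
  window.addEventListener("message", handleMessage);
}

// Sending side: otherWindow is a reference to another window,
// for example an iframe's contentWindow.
function sendGreeting(otherWindow) {
  otherWindow.postMessage("hello", TRUSTED_ORIGIN);
}
```

Checking `event.origin` before acting on `event.data` is the important design choice: any document can send a message, so the receiver decides whom to trust.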

New APIs: Another example

Persistent client-side data storage
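A minimal sketch of persistent key/value storage, assuming the `localStorage` interface (`setItem` and `getItem`) that originated in the HTML5 drafts. The key `"draft"` and the helper names are hypothetical.

```javascript
// Save a value under a fixed key. Storage holds strings only,
// so structured data is serialized as JSON first.
function saveDraft(storage, draft) {
  storage.setItem("draft", JSON.stringify(draft));
}

// Load the value back; getItem returns null when the key is absent.
function loadDraft(storage) {
  const raw = storage.getItem("draft");
  return raw === null ? null : JSON.parse(raw);
}

// In a browser, pass window.localStorage as the storage argument:
//   saveDraft(window.localStorage, { title: "Notes", body: "..." });
//   var draft = loadDraft(window.localStorage);
```

Passing the storage object in as a parameter keeps the helpers usable with `sessionStorage` as well, since both expose the same methods.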

WebKit implementation of HTML5 client-side SQL database API

Summary: How will HTML5 help?

Making things better for developers

Giving choice to users

Standards-based Web applications work the same across browsers, so users are not locked into using any particular product from any particular vendor. Users get to choose.

How can you help HTML5?

Questions? Comments?
