HTML Design Principles

W3C Working Draft 26 November 2007

Editors:
Anne van Kesteren (Opera Software ASA) <annevk@opera.com>
Maciej Stachowiak (Apple Inc) <mjs@apple.com>


Abstract

HTML 5 defines the fifth major revision of the core language of the World Wide Web, HTML. This document describes the set of guiding principles used by the HTML Working Group for the development of HTML 5. The principles offer guidance for the design of HTML in the areas of compatibility, utility and interoperability.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is the First Public Working Draft of "HTML Design Principles" produced by the HTML Working Group, part of the HTML Activity. The Working Group intends to publish this document as a Working Group Note. The working group is working on a new version of HTML not yet published under TR. In the meantime, you can access the HTML 5 Editor's draft. The appropriate forum for comments on this document is public-html-comments@w3.org, a mailing list with a public archive.

The decision to request publication of the document was based on a poll of the members of the HTML working group, with the results being 51 "Yes" votes, 2 "No" votes, and 1 "Formally Object" vote.

The specific objection recorded was judged to be a comment that can be addressed in future drafts, not a critical reason to delay publication. Full consensus is not a prerequisite to publication: the decision of the HTML working group to publish the document reflects the intent of the group to signal to the community that it should begin carefully reviewing the document, and to encourage wide review of the document within and outside of W3C.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

In the HTML Working Group, we have representatives from many different communities, including the WHATWG and other W3C Working Groups. The HTML 5 effort under WHATWG, and much of the work on various W3C standards over the past few years, have been based on different goals and different ideas of what makes for good design. To make useful progress, we need to have some basic agreement on goals for this group.

These design principles are an attempt to capture consensus on design approach. They are pragmatic rules of thumb that must be balanced against each other, not absolutes. They are similar in spirit to the TAG's findings in Architecture of the World Wide Web, but specific to the deliverables of this group.

1.1. Conformance for Documents and Implementations

Many language specifications define a set of conformance requirements for valid documents, and corresponding conformance requirements for implementations processing these valid documents. HTML 5 is somewhat unusual in also defining implementation conformance requirements for many constructs that are not allowed in conforming documents.

This dual nature of the spec allows us to have a relatively clean and understandable language for authors, while at the same time supporting existing documents that make use of older or nonstandard constructs, and enabling better interoperability in error handling.

Some of the design principles below apply much more to the conformance requirements for content (the "conforming language") while others apply much more to the conformance requirements for implementations (the "supported language"). Since the supported language is a strict superset of the conforming language, there is considerable overlap, but the principles will do their best to make clear which set of requirements they apply to.

2. Compatibility

There are many ways of interpreting compatibility. Sometimes the terms "backwards compatibility" and "forwards compatibility" are used, but sometimes the meaning of those terms can be unclear. The principles in this section address different facets of compatibility.

2.1. Support Existing Content

This principle applies primarily to the supported language.

Existing content often relies upon expected user agent processing and behavior to function as intended. Processing requirements should be specified to ensure that user agents implementing this specification will be able to handle most existing content. In particular, it should be possible to process existing HTML documents as HTML 5 and get results that are compatible with the existing expectations of users and authors, based on the behavior of existing browsers. It should be made possible, though not necessarily required, to do this without mode switching.

Content relying on existing browser behavior can take many forms. It may rely on elements, attributes or APIs that are part of earlier HTML specifications, but not part of HTML 5, or on features that are entirely proprietary. It may depend on specific error handling rules. In rare cases, it may depend on a feature from earlier HTML specifications not being implemented as specified.

When considering changes to legacy features or behavior, relative to current implementations and author expectations, the following questions should be considered:

The benefit of the proposed change should be weighed against the likely cost of breaking content, as measured by these criteria. In some cases, it may be desirable to make a nonstandard feature or behavior part of the conforming language, if it satisfies a valid use case. However, the fact that something is part of the supported language does not by itself mean that relying on it is condoned or encouraged.

2.1.1. Examples

Many sites use broken markup, such as badly nested elements (<b>a<i>b</b>c</i>), and both authors and users have expectations based on the error handling used by legacy user agents. We need to define processing requirements that remain compatible with the expected handling of such content.

Some sites rely on the <u> element giving the presentational effect of an underline.

2.2. Degrade Gracefully

This principle applies primarily to the conforming language.

On the World Wide Web, authors are often reluctant to use new language features that cause problems in older user agents, or that do not provide some sort of graceful fallback. HTML 5 document conformance requirements should be designed so that Web content can degrade gracefully in older or less capable user agents, even when making use of new elements, attributes, APIs and content models.

It is not necessarily appropriate to consider every Web user agent ever made, including even very old versions of browsers or tools that are extremely unpopular even in their niche markets. However, strong consideration should be given to the following categories of user agents. It is highly likely that content authors will find it important to target these categories:

In some cases, a new feature may simply not apply to a certain class of user agents, or may be impractical to design in a way that can degrade. For example, new scripting APIs cannot be made to work in scriptless user agents. But in many cases, approaches like the following can be used:

This list is not exhaustive; in some cases slightly more complicated approaches are more effective.

2.2.1. Examples

The default presentation of the proposed irrelevant attribute can be emulated through the CSS rule [irrelevant] { display: none; }.

Proposed new multimedia elements like <canvas> fallback </canvas> or <video> fallback </video> allow fallback content. Older user agents will show "fallback" while user agents supporting canvas or video will show the multimedia content.
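Script that draws on a canvas can follow the same idea: check for the drawing API before using it, and otherwise leave the fallback content in place. The following is a minimal sketch, not a definitive implementation. The helper name canDrawOn is hypothetical, and the sketch assumes that supporting user agents expose a getContext() method on canvas elements while older ones do not.

```javascript
// Sketch: decide whether script may draw on a canvas element or should
// leave its fallback content alone. canDrawOn is a hypothetical helper name.
function canDrawOn(canvasElement) {
  // Assumption: user agents that support canvas expose getContext() on the
  // element; older ones treat <canvas> as an unknown element without it.
  return Boolean(canvasElement) && typeof canvasElement.getContext === "function";
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== "undefined") {
  var canvas = document.getElementsByTagName("canvas")[0];
  if (canDrawOn(canvas)) {
    var context = canvas.getContext("2d");
    context.fillRect(10, 10, 50, 50); // draw a filled square
  }
  // Otherwise nothing is drawn and the fallback content stays visible.
}
```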

The proposed getElementsByClassName() method can be made considerably faster than pure ECMAScript implementations found in existing libraries, but a script-based implementation can be used when the native version is not available.
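The fallback pattern described here can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used by any particular library: the helper name findByClassName is hypothetical, and the script-based path handles only a single class name, whereas a native implementation may accept several.

```javascript
// Sketch: use the native getElementsByClassName() when the root provides it,
// otherwise fall back to a script-based scan of every descendant element.
// The helper name findByClassName is hypothetical.
function findByClassName(root, className) {
  if (typeof root.getElementsByClassName === "function") {
    // Native path: the user agent does the matching.
    return Array.prototype.slice.call(root.getElementsByClassName(className));
  }
  // Fallback path: walk all descendants and compare class tokens by hand.
  // Assumption: className is a single class name without whitespace.
  var matches = [];
  var all = root.getElementsByTagName("*");
  for (var i = 0; i < all.length; i++) {
    var tokens = (all[i].className || "").split(/\s+/);
    if (tokens.indexOf(className) !== -1) {
      matches.push(all[i]);
    }
  }
  return matches;
}
```

Callers get a plain array on both paths, so the rest of the script does not need to know which path was taken.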

The <datalist> element can be associated with an <input> element and may contain a hidden <select> element. This way the fallback for the intended "combo box" control can be a text field or a text field with an associated pop-up menu in existing mainstream browsers.

2.3. Do not Reinvent the Wheel

If there is already a widely used and implemented technology covering particular use cases, consider specifying that technology in preference to inventing something new for the same purpose. Sometimes, though, new use cases may call for a new approach instead of more extensions on an old approach.

contenteditable="" was already used and implemented by user agents. No need to invent a new feature.

2.4. Pave the Cowpaths

When a practice is already widespread among authors, consider adopting it rather than forbidding it or inventing something new.

Authors already use the <br/> syntax as opposed to <br> in HTML and there is no harm done by allowing that to be used.

2.5. Evolution Not Revolution

Revolutions sometimes change the world for the better. Most often, however, it is better to evolve an existing design rather than throwing it away. This way, authors don't have to learn new models and content will live longer. Specifically, this means that one should prefer to design features so that old content can take advantage of new features without having to make unrelated changes. And implementations should be able to add new features to existing code, rather than having to develop whole separate modes.

Switching to XML syntax requires a global change, so continue supporting classic HTML syntax as well.

3. Utility

These principles call for a design that makes sure HTML can be used effectively for its many intended purposes.

3.1. Solve Real Problems

Changes to the spec should solve actual real-world problems. Abstract architectures that don't address an existing need are less favored than pragmatic solutions to problems that web content faces today. And existing widespread problems should be solved, when possible.

3.2. Priority of Constituencies

In case of conflict, consider users over authors over implementors over specifiers over theoretical purity. In other words, costs or difficulties to the user should be given more weight than costs to authors, which in turn should be given more weight than costs to implementors, which should be given more weight than costs to authors of the spec itself, which should be given more weight than the concerns of those proposing changes for theoretical reasons alone. Of course, it is preferred to make things better for multiple constituencies at once.

3.3. Secure By Design

Ensure that features work with the security model of the web. Preferably address security considerations directly in the specification.

Communicating between documents from different sites is useful, but an unrestricted version could put user data at risk. Cross-document messaging is designed to allow this without violating security constraints.
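On the receiving side, the restriction can be sketched as a handler that accepts messages only from origins it expects. This is a hedged illustration, not the specified design: it assumes the message event and its origin and data properties as found in current browsers, and the origin value and the helper names isTrustedOrigin and handleMessage are hypothetical.

```javascript
// Sketch of a receiver for cross-document messages. The origin below and the
// helper names are hypothetical placeholders for illustration.
var TRUSTED_ORIGINS = ["https://example.com"];

function isTrustedOrigin(origin) {
  return TRUSTED_ORIGINS.indexOf(origin) !== -1;
}

// Returns the message data for trusted senders and null for everyone else.
function handleMessage(event) {
  if (!isTrustedOrigin(event.origin)) {
    return null; // ignore messages from origins we did not expect
  }
  return event.data;
}

// Browser-only wiring, guarded so the snippet also loads outside a browser.
if (typeof window !== "undefined" && typeof window.addEventListener === "function") {
  window.addEventListener("message", handleMessage, false);
}
```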

3.4. Separation of Concerns

HTML should allow separation of content and presentation. For this reason, markup that expresses structure is usually preferred to purely presentational markup. However, structural markup is a means to an end such as media independence. Profound and detailed semantic encoding is not necessary if the end can be reached otherwise. Defining reasonable default presentation for different media may be sufficient. HTML strikes a balance between semantic expressiveness and practical usefulness. Names of elements and attributes in the markup may be pragmatic (for brevity, history, simplicity) rather than completely accurate.

The article element defines an individual article, but not the details of how it is displayed. A journal article may be the only article on a page, formatted in multiple columns, while a blog post may share a page with multiple other articles and be presented in a box with a border.

The b and i elements are widely used — it is better to give them good default rendering for various media including aural than to try to ban them.

3.5. DOM Consistency

The two serializations of HTML 5, the HTML syntax and the XML syntax, should be designed in such a way that the DOM trees produced by the respective parsers appear as consistent as feasible to scripts and other program code operating on the document trees. Discrepancies can be allowed for compatibility with legacy implementations, but the differences should be minimized.

Also, unless required for compatibility with legacy implementations and deployed content, gratuitous differences in syntactic appearance should be avoided.

The HTML (text/html) parser puts elements in the http://www.w3.org/1999/xhtml namespace in the DOM for compatibility with the XML syntax of HTML 5.
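One consequence for script authors is that a single check can identify an HTML element whichever parser produced the tree. The sketch below is illustrative only: the helper name isHtmlElement is hypothetical, and it assumes nodes expose the DOM namespaceURI and localName properties.

```javascript
// The XHTML namespace named in the example above.
var XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Hypothetical helper: true when node is an HTML element with the given local
// name, regardless of which serialization the document was parsed from.
function isHtmlElement(node, localName) {
  return node.namespaceURI === XHTML_NAMESPACE && node.localName === localName;
}
```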

4. Interoperability

These principles exist to improve the chances of HTML implementations being truly interoperable.

4.1. Well-defined Behavior

Clearly define behavior that content authors can rely on, in preference to vague or implementation-defined behavior. This way, it is easier to author content that works in a variety of user agents. However, implementations should still be free to make improvements in areas such as user interface and quality of rendering.

4.2. Avoid Needless Complexity

Simple solutions are preferred to complex ones, when possible. Simpler features are easier for user agents to implement, more likely to be interoperable, and easier for authors to understand. But this should not be used as an excuse to avoid satisfying the other principles.

4.3. Handle Errors

Error handling should be defined so that interoperable implementations can be achieved. Prefer graceful error recovery to hard failure, so that users are not exposed to authoring errors.

5. Universal Access

Features should be designed for universal access. This category covers various principles related to that.

5.1. Media Independence

Features should, when possible, work across different platforms, devices, and media. This should not be taken to mean that a feature should be omitted just because some media or platforms can't support it. For example, interactive features should not be omitted merely because they cannot be represented in a printed document.

The general reflowability of HTML text makes it more suitable to variable screen dimensions than a representation of exact glyph positions.

A hyperlink cannot be actuated in a printed document, but that is no reason to omit the a element.

5.2. Support World Languages

Enable publication in all world languages. But this should not be taken as equalizing writing systems by prohibiting features that do not apply to all of them. Features for packing multiple translations of a document in a single file are out of scope.

Supporting Unicode allows text in most of the world's languages, including mixing of text in different languages.

Italic text is useful because it applies to many bicameral scripts, even though some scripts have no such concept. Similarly, ruby is useful for many scripts, even though it has a CJK focus.

Text in element content has better language support than text in attribute content; in element content ruby annotations can be inserted, as well as dir attributes and bdo elements in case the Unicode bidirectional algorithm is insufficient to correctly order adjacent runs of mixed direction text.

5.3. Accessibility

Design features to be accessible to users with disabilities. Access by everyone regardless of ability is essential. This does not mean that features should be omitted entirely if not all users can make full use of them, but alternate mechanisms should be provided.

The image in an img may not be visible to blind users, but that is a reason to provide alternate text, not to leave out images.

The progress element is intrinsically accessible, as it has unambiguous progress bar semantics, which permit mapping to accessibility APIs that can represent progress indicators.


Acknowledgments

The editors would like to thank Charles McCathieNevile, Chris Wilson, Dan Connolly, Henri Sivonen, Ian Hickson, Jirka Kosek, Lachlan Hunt, Nik Thierry, Philip Taylor, Richard Ishida, Stephen Stewart, and Steven Faulkner for their contributions to this document, as well as all the people who have contributed to HTML 5 over the years, for improving the Web!

If you contributed to this document but your name is not listed above, please let the editors know so they can correct this omission.