W3C

XHTML™ 2.0

W3C Working Draft 5 August 2002

This version:
http://www.w3.org/TR/2002/WD-xhtml2-20020805
Latest version:
http://www.w3.org/TR/xhtml2
Editors:
Shane McCarron, Applied Testing and Technology
Jonny Axelsson, Opera Software
Beth Epperson, Netscape/AOL
Ann Navarro, WebGeek, Inc.
Steven Pemberton, CWI (HTML Working Group Chair)

This document is also available in these non-normative formats: Single XHTML file, PostScript version, PDF version, ZIP archive, and Gzip'd TAR archive.


Abstract

This Working Draft specifies the XHTML 2.0 Markup Language and a variety of XHTML-conforming modules that support that language.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.

This document is the first public working draft of this specification. It should in no way be considered stable, and should not be referenced for any purposes whatsoever. This version does not include the implementations of XHTML 2.0 in either DTD or XML Schema form. Those will be included in subsequent versions, once the contents of this language stabilizes.

This document has been produced by the W3C HTML Working Group (members only) as part of the W3C HTML Activity. The goals of the HTML Working Group are discussed in the HTML Working Group charter.

Public discussion of XHTML takes place on www-html@w3.org (archive). To subscribe send an email to www-html-request@w3.org with the word subscribe in the subject line.

Please report errors in this document to www-html-editor@w3.org (archive).

At the time of publication, the Working Group believed there were zero patent disclosures relevant to this specification. A current list of patent disclosures relevant to this specification may be found on the Working Group's patent disclosure page.

A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.

Quick Table of Contents

List of Issues

Full Table of Contents

1. Introduction

This section is informative.

1.1. What is XHTML 2?

XHTML 2 is a markup language intended for rich, portable web-based applications. While the ancestry of XHTML 2 comes from HTML 4, XHTML 1.0, and XHTML 1.1, it is not intended to be backward compatible with its earlier versions. Application developers familiar with earlier its ancestors will be comfortable working with XHTML 2. Appendix A describes the ways in which XHTML 2 differs from previous versions and what application developers need to know to convert existing applications to XHTML 2.

1.2. What are the XHTML 2 Modules?

XHTML 2 is a member of the XHTML Family of markup languages. It is an XHTML Host Language as defined in XHTML Modularization. As such, it is made up of a set of XHTML Modules that together describe the elements and attributes of the language, and their content model. XHTML 2 updates many of the modules defined in XHTML Modularization 1.0 [XHTMLMOD], and includes the updated versions of all those modules and their semantics. XHTML 2 also uses modules from Ruby [RUBY], XML Events [XMLEVENTS], and XForms [XFORMS].

The modules defined in this specification are largely extensions of the modules defined in XHTML Modularization 1.0. This specification also defines the semantics of the modules it includes. So, that means that unlike earlier versions of XHTML that relied upon the semantics defined in HTML 4, all of the semantics for XHTML 2 are defined either in this specification or in the specifications that it normatively references.

Even though the XHTML 2 modules are defined in this specification, they are available for use in other XHTML family markup languages. Over time, it is possible that the modules defined in this specification will migrate into the XHTML Modularization specification.

2. Terms and Definitions

This section is informative.

While some terms are defined in place, the following definitions are used throughout this document. Familiarity with the W3C XML 1.0 Recommendation [XML] is highly recommended.

abstract module
a unit of document type specification corresponding to a distinct type of content, corresponding to a markup construct reflecting this distinct type.
content model
the declared markup structure allowed within instances of an element type. XML 1.0 differentiates two types: elements containing only element content (no character data) and mixed content (elements that may contain character data optionally interspersed with child elements). The latter are characterized by a content specification beginning with the "#PCDATA" string (denoting character data).
deprecated
a feature marked as deprecated is in the process of being removed from this recommendation. Portable applications should not use features marked as deprecated.
document model
the effective structure and constraints of a given document type. The document model constitutes the abstract representation of the physical or semantic structures of a class of documents.
document type
a class of documents sharing a common abstract structure. The ISO 8879 [SGML] definition is as follows: "a class of documents having similar characteristics; for example, journal, article, technical manual, or memo. (4.102)"
document type definition (DTD)
a formal, machine-readable expression of the XML structure and syntax rules to which a document instance of a specific document type must conform; the schema type used in XML 1.0 to validate conformance of a document instance to its declared document type. The same markup model may be expressed by a variety of DTDs.
driver
a generally short file used to declare and instantiate the modules of a DTD. A good rule of thumb is that a DTD driver contains no markup declarations that comprise any part of the document model itself.
element
an instance of an element type.
element type
the definition of an element, that is, a container for a distinct semantic class of document content.
entity
an entity is a logical or physical storage unit containing document content. Entities may be composed of parse-able XML markup or character data, or unparsed (i.e., non-XML, possibly non-textual) content. Entity content may be either defined entirely within the document entity ("internal entities") or external to the document entity ("external entities"). In parsed entities, the replacement text may include references to other entities.
entity reference
a mnemonic string used as a reference to the content of a declared entity (eg., "&amp;" for "&", "&lt;" for "<", "&copy;" for "©".)
generic identifier
the name identifying the element type of an element. Also, element type name.
hybrid document
A hybrid document is a document that uses more than one XML namespace. Hybrid documents may be defined as documents that contain elements or attributes from hybrid document types.
instantiate
to replace an entity reference with an instance of its declared content.
markup declaration
a syntactical construct within a DTD declaring an entity or defining a markup structure. Within XML DTDs, there are four specific types: entity declaration defines the binding between a mnemonic symbol and its replacement content; element declaration constrains which element types may occur as descendants within an element (see also content model); attribute definition list declaration defines the set of attributes for a given element type, and may also establish type constraints and default values; notation declaration defines the binding between a notation name and an external identifier referencing the format of an unparsed entity.
markup model
the markup vocabulary (i.e., the gamut of element and attribute names, notations, etc.) and grammar (i.e., the prescribed use of that vocabulary) as defined by a document type definition (i.e., a schema) The markup model is the concrete representation in markup syntax of the document model, and may be defined with varying levels of strict conformity. The same document model may be expressed by a variety of markup models.
module
an abstract unit within a document model expressed as a DTD fragment, used to consolidate markup declarations to increase the flexibility, modifiability, reuse and understanding of specific logical or semantic structures.
modularization
an implementation of a modularization model; the process of composing or de-composing a DTD by dividing its markup declarations into units or groups to support specific goals. Modules may or may not exist as separate file entities (i.e., the physical and logical structures of a DTD may mirror each other, but there is no such requirement).
modularization model
the abstract design of the document type definition (DTD) in support of the modularization goals, such as reuse, extensibility, expressiveness, ease of documentation, code size, consistency and intuitiveness of use. It is important to note that a modularization model is only orthogonally related to the document model it describes, so that two very different modularization models may describe the same document type.
parameter entity
an entity whose scope of use is within the document prolog (i.e., the external subset/DTD or internal subset). Parameter entities are disallowed within the document instance.
parent document type
A parent document type of a hybrid document is the document type of the root element.
tag
descriptive markup delimiting the start and end (including its generic identifier and any attributes) of an element.

3. Conformance Definition

This section is normative.

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3.1. Document Conformance

3.1.1. Strictly Conforming Documents

DTD Bias

This section has a distinct DTD bias. We need to make it clear that either the DTD or the Schema can be used to validate XHTML 2.0 documents.

A strictly conforming XHTML 2.0 document is a document that requires only the facilities described as mandatory in this specification. Such a document must meet all the following criteria:

  1. The document must conform to the constraints expressed in Appendix B - XHTML 2.0 Schema or Appendix D - XHTML 2.0 Document Type Definition.

  2. The root element of the document must be html.

  3. The root element of the document must contain an xmlns declaration for the XHTML 2.0 namespace [XMLNAMES]. The namespace for XHTML is defined to be http://www.w3.org/2002/06/xhtml2. An example root element might look like:

    <html xmlns="http://www.w3.org/2002/06/xhtml2" xml:lang="en">
    
  4. There must be a DOCTYPE declaration in the document prior to the root element. If present, the public identifier included in the DOCTYPE declaration must reference the DTD found in Appendix C using its Formal Public Identifier. The system identifier may be modified appropriately.

    <!DOCTYPE
     html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
     "http://www.w3.org/TR/xhtml2/DTD/xhtml2.dtd">
    

Here is an example of an XHTML 2.0 document.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
    "http://www.w3.org/TR/xhtml2/DTD/xhtml2.dtd">
<html xmlns="http://www.w3.org/2002/06/xhtml2" xml:lang="en" >
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://vlib.org/">vlib.org</a>.</p>
  </body>
</html>

Note that in this example, the XML declaration is included. An XML declaration like the one above is not required in all XML documents. XHTML document authors are strongly encouraged to use XML declarations in all their documents. Such a declaration is required when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding was determined by a higher-level protocol.

3.2. User Agent Conformance

A conforming user agent must meet all user agent conformance requirements defined in [XHTMLMOD].

4. The XHTML 2.0 Document Type

This section is normative.

The XHTML 2.0 document type is a fully functional document type with rich semantics. It is a collection of XHTML-conforming modules (most of which are defined in this specification). The Modules and their elements are listed here for information purposes, but the definitions in their base documents should be considered authoritative. In the on-line version of this document, the module names in the list below link into the definitions of the modules within the relevant version of the authoritative specification.

Need XHTML 2.0 Definition Table

We need a table that defines the modules that are in XHTML 2.0 and links them into this document. Currently, that will be a bunch of modules that are in this document, and modules from XML Events, Ruby, and XForms. The table below is largely correct, but is still just a place holder.

Structure Module*
body, head, html, title
Text Module*
abbr, acronym, address, blockquote, br, cite, code, dfn, div, em, h, h1, h2, h3, h4, h5, h6, kbd, line, p, pre, quote, samp, section, span, strong, var
Hypertext Module*
a
List Module*
dl, dt, dd, name, nl, ol, ul, li
Bidirectional Text Module
bdo
Client-side Image Map Module
area, map
Edit Module
del, ins
Link Module
link
Metainformation Module
meta
Object Module
object, param
Presentation Module
hr, sub, sup
Scripting Module
noscript, script
Server-side Image Map Module
Attribute ismap on img
Stylesheet Module
style element
Target Module
target attribute
Table Module
caption, col, colgroup, table, tbody, td, tfoot, th, thead, tr

XHTML 2.0 also uses the following externally defined modules:

Ruby Annotation Module [RUBY]
ruby, rbc, rtc, rb, rt, rp
XML Events Module [XMLEVENTS]
listener
XForms Module [XFORMS]
LOTS OF ELEMENTS

There are no additional definitions required by this document type. An implementation of this document type as an XML Schema is defined in Appendix B, and as a DTD in Appendix D.

5. Module Definition Conventions

This section is normative.

This document defines a variety of XHTML modules and the semantics of those modules. This section describes the conventions used in those module definitions.

5.1. Module Structure

Each module in this document is structured in the following way:

5.2. Abstract Module Definitions

An abstract module is a definition of an XHTML module using prose text and some informal markup conventions. While such a definition is not generally useful in the machine processing of document types, it is critical in helping people understand what is contained in a module. This section defines the way in which XHTML abstract modules are defined. An XHTML-conforming module is not required to provide an abstract module definition. However, anyone developing an XHTML module is encouraged to provide an abstraction to ease in the use of that module.

5.3. Syntactic Conventions

The abstract modules are not defined in a formal grammar. However, the definitions do adhere to the following syntactic conventions. These conventions are similar to those of XML DTDs, and should be familiar to XML DTD authors. Each discrete syntactic element can be combined with others to make more complex expressions that conform to the algebra defined here.

element name
When an element is included in a content model, its explicit name will be listed.
content set
Some modules define lists of explicit element names called content sets. When a content set is included in a content model, its name will be listed.
expr ?
Zero or one instances of expr are permitted.
expr +
One or more instances of expr are required.
expr *
Zero or more instances of expr are permitted.
a , b
Expression a is required, followed by expression b.
a | b
Either expression a or expression b is required.
a - b
Expression a is permitted, omitting elements in expression b.
parentheses
When an expression is contained within parentheses, evaluation of any subexpressions within the parentheses take place before evaluation of expressions outside of the parentheses (starting at the deepest level of nesting first).
extending pre-defined elements
In some instances, a module adds attributes to an element. In these instances, the element name is followed by an ampersand (&).
defining required attributes
When an element requires the definition of an attribute, that attribute name is followed by an asterisk (*).
defining the type of attribute values
When a module defines the type of an attribute value, it does so by listing the type in parentheses after the attribute name.
defining the legal values of attributes
When a module defines the legal values for an attribute, it does so by listing the explicit legal values (enclosed in quotation marks), separated by vertical bars (|), inside of parentheses following the attribute name. If the attribute has a default value, that value is followed by an asterisk (*). If the attribute has a fixed value, the attribute name is followed by an equals sign (=) and the fixed value enclosed in quotation marks.

5.4. Content Types

Abstract module definitions define minimal, atomic content models for each module. These minimal content models reference the elements in the module itself. They may also reference elements in other modules upon which the abstract module depends. Finally, the content model in many cases requires that text be permitted as content to one or more elements. In these cases, the symbol used for text is PCDATA. This is a term, defined in the XML 1.0 Recommendation, that refers to processed character data. A content type can also be defined as EMPTY, meaning the element has no content in its minimal content model.

5.5. Attribute Types

In some instances, it is necessary to define the types of attribute values or the explicit set of permitted values for attributes. The following attribute types (defined in the XML 1.0 Recommendation) are used in the definitions of the abstract modules:

Attribute Type Definition
CDATA Character data
ID A document-unique identifier
IDREF A reference to a document-unique identifier
IDREFS A space-separated list of references to document-unique identifiers
NAME A name with the same character constraints as ID above
NMTOKEN A name composed of only name tokens as defined in XML 1.0 [XML].
NMTOKENS One or more white space separated NMTOKEN values
PCDATA Processed character data

In addition to these pre-defined data types, XHTML Modularization defines the following data types and their semantics (as appropriate):

Data type Description
Character A single character from [ISO10646].
Charset A character encoding, as per [RFC2045].
Charsets A space-separated list of character encodings, as per [RFC2045].
ClassName Used by the class attribute, ClassNames are tokens that identify an element as being a member of the set named by the value of the class attribute. ClassName attribute tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
ContentType A media type, as per [RFC2045].
ContentTypes A comma-separated list of media types, as per [RFC2045].
Coordinates Comma separated list of Lengths used in defining areas.
Datetime Date and time information.
FPI A character string representing an SGML Formal Public Identifier.
HrefTarget Window name used as destination for results of certain actions.
LanguageCode A language code, as per [RFC3066].
Length The value may be either in pixels or a percentage of the available horizontal or vertical space. Thus, the value "50%" means half of the available space.
LinkTypes

Authors may use the following recognized link types, listed here with their conventional interpretations. A LinkTypes value refers to a space-separated list of link types. White space characters are not permitted within link types.

These link types are case-insensitive, i.e., "Alternate" has the same meaning as "alternate".

User agents, search engines, etc. may interpret these link types in a variety of ways. For example, user agents may provide access to linked documents through a navigation bar.

Alternate
Designates substitute versions for the document in which the link occurs. When used together with the hreflang attribute, it implies a translated version of the document. When used together with the media attribute, it implies a version designed for a different medium (or media).
Stylesheet
Refers to an external style sheet. See the Style Sheet Module for details. This is used together with the link type "Alternate" for user-selectable alternate style sheets.
Start
Refers to the first document in a collection of documents. This link type tells search engines which document is considered by the author to be the starting point of the collection.
Next
Refers to the next document in a linear sequence of documents. User agents may choose to pre-load the "next" document, to reduce the perceived load time.
Prev
Refers to the previous document in an ordered series of documents. Some user agents also support the synonym "Previous".
Contents
Refers to a document serving as a table of contents. Some user agents also support the synonym ToC (from "Table of Contents").
Index
Refers to a document providing an index for the current document.
Glossary
Refers to a document providing a glossary of terms that pertain to the current document.
Copyright
Refers to a copyright statement for the current document.
Chapter
Refers to a document serving as a chapter in a collection of documents.
Section
Refers to a document serving as a section in a collection of documents.
Subsection
Refers to a document serving as a subsection in a collection of documents.
Appendix
Refers to a document serving as an appendix in a collection of documents.
Help
Refers to a document offering help (more information, links to other sources information, etc.)
Bookmark
Refers to a bookmark. A bookmark is a link to a key entry point within an extended document. The title attribute may be used, for example, to label the bookmark. Note that several bookmarks may be defined in each document.
MediaDesc

A comma-separated list of media descriptors as described by [CSS]. The default is all.

MultiLength The value may be a Length or a relative length. A relative length has the form "i*", where "i" is an integer. When allotting space among elements competing for that space, user agents allot pixel and percentage lengths first, then divide up remaining available space among relative lengths. Each relative length receives a portion of the available space that is proportional to the integer preceding the "*". The value "*" is equivalent to "1*". Thus, if 60 pixels of space are available after the user agent allots pixel and percentage space, and the competing relative lengths are 1*, 2*, and 3*, the 1* will be allotted 10 pixels, the 2* will be allotted 20 pixels, and the 3* will be allotted 30 pixels.
MultiLengths A comma separated list of items of type MultiLength.
Number One or more digits
Pixels The value is an integer that represents the number of pixels of the canvas (screen, paper). Thus, the value "50" means fifty pixels. For normative information about the definition of a pixel, please consult [CSS2]
Shape The shape of a region.
Text Arbitrary textual data, likely meant to be human-readable.
URI A Uniform Resource Identifier Reference, as defined by the type anyURI in [XMLSCHEMA].
URIs A space-separated list of URIs as defined above.
URI List A comma-separated list of URIs as defined above.

6. XHTML Attribute Collections

This section is normative.

Many of the abstract modules in this document define the required attributes for their elements. The table below defines some collections of attributes that are referenced throughout the modules. These expressions should in no way be considered normative or mandatory. They are an editorial convenience for this document. When used in the remainder of this section, it is the expansion of the term that is normative, not the term itself.

The following basic attribute sets are used on many elements. In each case where they are used, their use is identified via their collection name.

6.1. Core Attribute Collection

class = ClassName
This attribute assigns one or more class names to an element; the element may be said to belong to these classes. A class name may be shared by several element instances.

The class attribute can be used for different purposes in XHTML, for instance as a style sheet selector (when an author wishes to assign style information to a set of elements), and for general purpose processing by user agents.

For instance in the following example, the p element is used in conjunction with the class attribute to identify a particular type of paragraph.

<p><span class="note">
These programs are only available if you have purchased the advanced professional suite.
</p>

Style sheets rules can then be used to render the paragraph appropriately, for instance by putting a border round it, giving it a different background colour, or where necessary by not displaying it at all.

id = ID
The id attribute assigns a identifier to an element. The id of an element must be unique within a document.

The id attribute has several roles in XHTML:

As an example, the following headings are distinguished by their id values:

<h id="introduction">Introduction</h>
<p>...</p>
<h id="events">The Events Module</h>
<p>...</p>
title = Text
This attribute offers advisory information about the element for which it is set.

Values of the title attribute may be rendered by user agents in a variety of ways. For instance, visual browsers should display the title as a "tool tip" (a short message that appears when the pointing device pauses over an object). Audio user agents may speak the title information in a similar context.

The title attribute has an additional role when used with the link element to designate an external style sheet. Please consult the section on links and style sheets for details.

Example:

<a href="/Jakob/" title="Author biography">Jakob Nielsen</a>'s Alertbox for January 11, 1998

6.2. I18N Attribute Collection

xml:lang = LanguageCode
This attribute specifies the base language of an element's attribute values and text content. It is defined normatively in [XML] section 2.12 http://www.w3.org/TR/REC-xml#sec-lang-tag. The default value of this attribute is unspecified.

An element inherits language code information according to the following order of precedence (highest to lowest):

In this example, the primary language of the document is French ("fr"). One paragraph is declared to be in US English ("en-us"), after which the primary language returns to French. The following paragraph includes an embedded Japanese ("ja") phrase, after which the primary language returns to French.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
   "http://www.w3.org/xhtml2/DTD/xhtml2.dtd">
<html xmlns="http://www.w3.org/2002/06/xhtml2" xml:lang="fr">
<head>
   <title>Un document multilingue</title>
</head>
<body>
<p>...Interpreted as French...</p>
<p xml:lang="en-us">...Interpreted as US English...</p>
<p>...Interpreted as French again...</p>
<p>...French text interrupted by<em xml:lang="ja">some
         Japanese</em>French begins here again...</p>
</body>
</html>

6.3. Hypertext Attribute Collection

href = URI
This attribute specifies a hypertext link that is activated when the element is selected.
target = HrefTarget
This attribute identifies an environment that will act as the destination for a resource identified by a hyperlink when it is activated.

This specification does not define how this attribute gets used, since that is defined by the environment that the hyperlink is evaluated in.

XFrames not published yet

We need a reference to XFrames here, but XFrames is not yet public.
accesskey = Character
This attribute assigns an access key to an element. An access key is a single character from the document character set. Note. Authors should consider the input method of the expected reader when specifying an accesskey.

Pressing an access key assigned to an element gives focus to the element. The action that occurs when an element receives focus depends on the element. For example, when a user activates a link defined by the a element, the user agent generally follows the link. When a user activates a radio button, the user agent changes the value of the radio button. When the user activates a text field, it allows input, etc.

In this example, we assign an access key to a link defined by the a element. Typing this access key takes the user to another document, in this case, a table of contents.

<p><a accesskey="C" 
      rel="contents"
      href="http://someplace.com/specification/contents.html">
    Table of Contents</a>
</p>

The invocation of access keys depends on the underlying system. For instance, on machines running MS Windows, one generally has to press the "alt" key in addition to the access key. On Apple systems, one generally has to press the "cmd" key in addition to the access key.

The rendering of access keys depends on the user agent. We recommend that authors include the access key in label text or wherever the access key is to apply. User agents should render the value of an access key in such a way as to emphasize its role and to distinguish it from other characters (e.g., by underlining it).

navindex = Number
This attribute specifies the position of the current element in the navingation order for the current document. This value must be a number between 0 and 32767. User agents must ignore leading zeros.

The navigation order defines the order in which elements will receive focus when navigated by the user via the keyboard. The navigation order may include elements nested within other elements.

Elements that may receive focus should be navigated by user agents according to the following rules:

  1. Those elements that support the navindex attribute and assign a positive value to it are navigated first. Navigation proceeds from the element with the lowest navindex value to the element with the highest value. Values need not be sequential nor must they begin with any particular value. Elements that have identical navindex values should be navigated in the order they appear in the character stream.
  2. Those elements that do not support the navindex attribute or support it and assign it a value of "0" are navigated next. These elements are navigated in the order they appear in the character stream.
  3. Elements that are disabled do not participate in the navigation order.

Tabbing keys. The actual key sequence that causes navigation or element activation depends on the configuration of the user agent (e.g., the "tab" key is used for navigation and the "enter" key is used to activate a selected element).

User agents may also define key sequences to navigate the navigation order in reverse. When the end (or beginning) of the navigation order is reached, user agents may circle back to the beginning (or end).

6.4. Events

The global attributes from [XMLEVENTS] are included in the Events attribute collection. The normative definition of those attributes and their semantics is included in that specification. They are described briefly below:

defaultAction = cancel|perform
This attribute defines the default action to take when the matching event is encountered. If not specified, the defaultAction is perform
event = CDATA
This attribute defines the name of the event that is to be captured. The set of legal names for XHTML 2 is to be defined.

List of XHTML 2 Events Needed

We need to define the list of XHTML 2 events and map them into the XHTML DOM.
handler = IDREF
This attribute specifies the ID of a handler element that defines the action that should be performed if the event reaches the observer.
observer = IDREF
This attribute specifies an ID for an observer element for which the listener is to be registered.
phase = capture|default
This attribute specifies the phase of event propagation in which to process the event. If not specified, the default value of this attribute is default.
propagate = stop|continue
This attribute specifies whether an event should stop propagating after this observer processes it, or continue down the event chain for possible further processing. If not specified, the default value of this attribute is continue.
target = IDREF
This attribute specifies the id of the target element of the event (i.e., the node that caused the event). If not specified, the default value of this attribute is the element on which the event attribute is specified.

Note that these attributes are not in the XHTML namespace. Instead, they are in the XML Events namespace. The XHTML namespace is the default namespace for XHTML documents, so XHTML elements and attributes do not require namespace prefixes (although they are permitted). XML Events attributes MUST use a prefix, since they are not in the default namespace of the document. When XML Events are included in an XHTML document, the default prefix for those attribute is ev.

6.5. Common Attribute Collection

This collection assembles the Core, I18N, Events and Hypertext attribute collections defined above.

7. XHTML Structure Module

This section is normative.

The Structure Module defines the major structural elements for XHTML. These elements effectively act as the basis for the content model of many XHTML family document types. The elements and attributes included in this module are:

Elements Attributes Minimal Content Model
body Common (Heading | Block | List)*
head Common title
html I18N, profile (URI), xmlns (URI = "http://www.w3.org/2002/06/xhtml2") head, body
title I18N PCDATA

This module is the basic structural definition for XHTML content. The html element acts as the root element for all XHTML Family Document Types.

Note that the value of the xmlns attribute is defined to be "http://www.w3.org/2002/06/xhtml2". Also note that because the xmlns attribute is treated specially by XML namespace-aware parsers [XMLNAMES], it is legal to have it present as an attribute of each element. However, any time the xmlns attribute is used in the context of an XHTML module, whether with a prefix or not, the value of the attribute shall be the XHTML namespace defined here.

Implementation: DTD

7.1. The html element

After the document type declaration, the remainder of an XHTML document is contained by the html element.

Attributes

The I18N collection
A collection of attributes related to Internationalization, including the xml:lang.
profile = URI
This attribute specifies the location of one or more meta data profiles, separated by white space. For future extensions, user agents should consider the value to be a list even though this specification only considers the first URI to be significant. Profiles are discussed in the section on meta data.

7.2. The head element

The head element contains information about the current document, such as its title, keywords that may be useful to search engines, and other data that is not considered document content. User agents do not generally render elements that appear in the head as content. They may, however, make information in the head available to users through other mechanisms.

Attributes

The I18N collection
A collection of attributes related to Internationalization, including the xml:lang.

7.3. The title element

Every XHTML document must have a title element in the head section.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Authors should use the title element to identify the contents of a document. Since users often consult documents out of context, authors should provide context-rich titles. Thus, instead of a title such as "Introduction", which doesn't provide much contextual background, authors should supply a title such as "Introduction to Medieval Bee-Keeping" instead.

For reasons of accessibility, user agents must always make the content of the title element available to users (including title elements that occur in frames). The mechanism for doing so depends on the user agent (e.g., as a caption, spoken).

Titles may contain entity references (for accented characters, special characters, etc.), but may not contain other markup (including comments). Here is a sample document title:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
   "http://www.w3.org/TR/xhtml2/DTD/xhtml2.dtd">
<html xmlns="http://www.w3.org/2002/06/xhtml2">
<head>
<title>A study of population dynamics</title>
... other head elements...
</head>
<body>
... document body...
</body>
</html>

7.4. The body element

The body of a document contains the document's content. The content may be presented by a user agent in a variety of ways. For example, for visual browsers, you can think of the body as a canvas where the content appears: text, images, colors, graphics, etc. For audio user agents, the same content may be spoken.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

8. XHTML Text Module

This section is normative.

This module defines all of the basic text container elements, attributes, and their content model:

Element Attributes Minimal Content Model
abbr Common (PCDATA | Inline)*
acronym Common (PCDATA | Inline)*
address Common (PCDATA | Inline)*
blockquote Common, cite (URI) (PCDATA | Heading | Block | List)*
br Core EMPTY
cite Common (PCDATA | Inline)*
code Common (PCDATA | Inline)*
dfn Common (PCDATA | Inline)*
div Common (PCDATA | Flow)*
em Common (PCDATA | Inline)*
h Common (PCDATA | Inline)*
h1 Common (PCDATA | Inline)*
h2 Common (PCDATA | Inline)*
h3 Common (PCDATA | Inline)*
h4 Common (PCDATA | Inline)*
h5 Common (PCDATA | Inline)*
h6 Common (PCDATA | Inline)*
kbd Common (PCDATA | Inline)*
line Common (PCDATA | Inline)*
p Common (PCDATA | Inline | List | blockquote | pre | table)*
pre Common (PCDATA | Inline)*
quote Common, cite (URI) (PCDATA | Inline)*
samp Common (PCDATA | Inline)*
section Common (PCDATA | Flow)*
span Common (PCDATA | Inline)*
strong Common (PCDATA | Inline)*
var Common (PCDATA | Inline)*

The minimal content model for this module defines some content sets:

Heading
h | h1 | h2 | h3 | h4 | h5 | h6
Block
address | blockquote | div | p | pre
Inline
abbr | acronym | br | cite | code | dfn | em | kbd | q | samp | span | strong | var
Flow
Heading | Block | Inline

Note that the use of the words Block and Inline here are meant to be suggestive of the role the content sets play. They are not normative with regards to presentation (in other words, a style sheet might give an element within the Block content a display property of inline).

Implementation: DTD

8.1. The abbr element

The abbr element indicates that a text fragment is an abbreviation (e.g., W3C, XML, Inc., Ltd., Mass., etc.).

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

The content of the abbr and acronym elements specifies the abbreviated expression itself, as it would normally appear in running text. The title attribute of these elements may be used to provide the full or expanded form of the expression.

Note that abbreviations and acronyms often have idiosyncratic pronunciations. For example, while "IRS" and "BBC" are typically pronounced letter by letter, "NATO" and "UNESCO" are pronounced phonetically. Still other abbreviated forms (e.g., "URI" and "SQL") are spelled out by some people and pronounced as words by other people. When necessary, authors should use style sheets to specify the pronunciation of an abbreviated form.

Examples:

  <abbr title="Limited">Ltd.</abbr>
  <abbr title="Abbreviation">abbr.</abbr>

8.2. The acronym element

The acronym element indicates that a text fragment is an acronym (e.g., BBC, WWW, URL, etc.). Its usage is the same as the abbr element above.

While some dictionaries define an acronym to be just a word formed from the initial letters of other words, others require the acronym to be pronouncable as a word. This specification does not require the acronym element to adhere to either definition, but is only provided for author convenience.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Examples:

  <acronym title="World Wide Web">WWW</acronym>
  <acronym xml:lang="fr" 
        title="Société Nationale des Chemins de Fer">
     SNCF
  </acronym>

8.3. The address element

The address element may be used by authors to supply contact information for a document or a major part of a document such as a form. This element often appears at the beginning or end of a document.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

For example:

<address>
<a href="mailto:webmaster@example.net">Webmaster</a>
</address>

8.4. The blockquote element

This element designates a block of quoted text.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext
cite = URI
The value of this attribute is a URI that designates a source document or message. This attribute is intended to give further information about the element's contents (e.g., the source from which a quotation was borrowed, or the reason text was inserted or deleted). User Agents should provide a means for the user to access the further information.

This example formats an excerpt from "The Two Towers", by J.R.R. Tolkien, as a blockquote.

<blockquote cite="http://www.example.com/tolkien/twotowers.html">
<p>They went in single file, running like hounds on a strong scent,
and an eager light was in their eyes. Nearly due west the broad
swath of the marching Orcs tramped its ugly slot; the sweet grass
of Rohan had been bruised and blackened as they passed.</p>
</blockquote>

8.5. The br element

The br element indicates that the current output line should be ended at this point, and a new line begun. This element is deprecated in favor of the line element.

Attributes

The Core collection
A collection of basic attributes used on all elements, including class, id, title.

Example:

<p class="poem" xml:lang="fr">
Un petit d'un petit<br/>
S'etonne aux Halles.<br/>
Un petit d'un petit,<br/>
Ah! Degres te fallent.
</p>

8.6. The cite element

The cite element contains a citation or a reference to other sources.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext
,
cite = URI
The value of this attribute is a URI that designates a source document or message. This attribute is intended to give further information about the element's contents (e.g., the source from which a quotation was borrowed, or the reason text was inserted or deleted). User Agents should provide a means for the user to access the further information.

In the following example, the cite element is used to delineate the speaker:

As <cite cite="http://www.whitehouse.gov/history/presidents/ht33.html">Harry S. Truman</cite> said,
<quote lang="en-us">The buck stops here.</quote>

More information can be found in <cite cite="http://www.w3.org/TR/REC-xml">[XML]</cite>.

8.7. The code element

The code element contains a fragment of computer code.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Example:

The <code>code</code> element contains a fragment of computer code.

8.8. The dfn element

The dfn element contains the defining instance of the enclosed term.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Example:

An <dfn id="def-acronym">acronym</dfn> is a word formed
from the initial letters or groups of letters of words in a set phrase
or series of words.

8.9. The div element

The div element, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. This element defines no presentational idioms on the content. Thus, authors may use this element in conjunction with style sheets, the xml:lang attribute, etc., to tailor XHTML to their own needs and tastes.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

For example, suppose you wish to make a presentation in XHTML, where each slide is enclosed in a separate element. You could use a div element, with a class of slide:

<body>
    <h>The meaning of life</h>
    <p>By Huntington B. Snark</p>
    <div class="slide">
        <h>What do I mean by "life"</h>
        <p>....</p>
    </div>
    <div class="slide">
        <h>What do I mean by "mean"?</h>
        ...
    </div>
    ...
</body>

8.10. The em element

The em element indicates emphasis for its contents.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Example:

Do <em>not</em> phone before 9 a.m.

8.11. The heading elements

A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

There are two styles of headings in XHTML: the numbered versions h1, h2 etc., and the structured version h, which is used in combination with the section element.

There are six levels of numbered headings in XHTML with h1 as the most important and h6 as the least. The visual presentation of headers can render more important headings in larger fonts than less important ones.

Structured headings use the single h element, in combination with the section element to indicate the structure of the document, and the nesting of the sections indicate the importance of the heading.

For example:

<body>
<h>This is a top level heading</h>
<p>....</p>
<section>
    <p>....</p>
    <h>This is a second-level heading</h>
    <p>....</p>
    <h>This is another second-level heading</h>
    <p>....</p>
</section>
<section>
    <p>....</p>
    <h>This is another second-level heading</h>
    <p>....</p>
    <section>
        <h>This is a third-level heading</h>
        <p>....</p>
    </section>
</section>

These visual representation of these levels can be distinguished in a style sheet:

h {font-family: sans-serif; font-weight: bold; font-size: 200%}
section h {font-size: 150%} /* A second-level heading */
section section h {font-size: 120%} /* A third-level heading */

etc.

Numbered sections and references
XHTML does not itself cause section numbers to be generated from headings. Style sheet languages such as CSS however allow authors to control the generation of section numbers.

The practice of skipping heading levels is considered to be bad practice. The series h1 h2 h1 is acceptable, while h1 h3 h1 is not, since the heading level h2 has been skipped.

8.12. The kbd element

The kbd element indicates text to be entered by the user.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Example:

To exit, type <kbd>QUIT</kbd>.

8.13. The line element

The line element represents a sub-paragraph. It is intended as a structured replacement for the br element. It contains a piece of text that when visually represented should start on a new line, and have a line break at the end. Whether the line should wrap or not visually depends on styling properties of the element.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

By retaining structure in text that has to be broken over lines, you retain essential information about its makeup. This gives you greater freedom with styling the content. For instance, line numbers can be generated automatically from the stylesheet if needed.

For instance, for a document with the following structure:

<p class="program">
<line>program p(input, output);</line>
<line>begin</line>
<line>    writeln("Hello world");</line>
<line>end.</line>
</p>

the following CSS stylesheet would number each line:

.program { counter-reset: linenumber }

line:before {
    position: relative;
    left: -1em;
    counter-increment: linenumber;
    content: counter(linenumber);
}

8.14. The p element

The p element represents a paragraph.

In comparison with earlier versions of HTML, where a paragraph could only contain inline text, XHTML2's paragraphs represent the conceptual idea of a paragraph, and so may contain lists, blockquotes, pre's and tables as well as inline text. They may not, however, contain directly nested p elements.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Authors are discouraged from using empty p elements. User agents should ignore empty p.

<p>Payment options include:
<ul>
<li>cash</li>
<li>credit card</li>
<li>luncheon vouchers.</li>
</ul>
</p>

8.15. The pre element

The pre element indicates that whitespace in the enclosed text has semantic relevance, and will normally be included in renderings of the content

Note that all elements in the XHTML family preserve their whitespace in the document, which is only removed on rendering, via a stylesheet, according to the rules of CSS [CSS]. This means that in principle all elements may preserve or collapse whitespace on rendering, under control of a stylesheet. Also note that there is no requirement that the <pre> element be rendered in a monospace font (although this is the default rendering), nor that text wrapping be disabled.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Non-visual user agents are not required to respect extra white space in the content of a pre element.

The following example shows a preformatted verse from Shelly's poem To a Skylark:

<pre>
       Higher still and higher
         From the earth thou springest
       Like a cloud of fire;
         The blue deep thou wingest,
And singing still dost soar, and soaring ever singest.
</pre>

Here is how this might be rendered:

       Higher still and higher
         From the earth thou springest
       Like a cloud of fire;
         The blue deep thou wingest,
And singing still dost soar, and soaring ever singest.

8.16. The quote element

This element designates a inline text fragment of quoted text.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext
cite = URI
The value of this attribute is a URI that designates a source document or message. This attribute is intended to give further information about the element's contents (e.g., the source from which a quotation was borrowed, or the reason text was inserted or deleted). User Agents should provide a means for the user to access the further information.

Visual user agents are not required to add delimiting quotation marks (as was the case for the q element in earlier versions of HTML). It is the responsibility of the document author to add any required quotation marks.

The following example illustrates nested quotations with the quote element.

<p>John said, <quote lang="en-us">"I saw Lucy at lunch, she told me
<quote lang="en-us">'Mary wants you
to get some ice cream on your way home.'</quote> I think I will get
some at Jen and Berry's, on Gloucester Road."</quote></p>

Here is an example using the cite attribute:

Steven replied: <quote cite="http://lists.w3.org/Public/www-html/June2002/001.html">We quite agree</quote>

8.17. The samp element

The samp element designates sample output from programs, scripts, etc.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Example:

On starting, you will see the prompt <samp>$ </samp>.

8.18. The section element

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

The section element, in conjunction with the h element, offers a mechanism for structuring documents into sections. This element defines content to be block-level but imposes no other presentational idioms on the content, which may otherwise be controlled from a style sheet.

By representing the structure of documents explicitely using the section and h elements gives the author greater control over presentation possibilities than the traditional implicit structuring using numbered levels of headings. For instance, it is then possible to indicate the nesting of sections by causing a border to be displayed to the left of sections.

Here is an example

<body>
<h>Events</h>
<section>
    <h>Introduction</h>
    <p>....</p>
    <h>Specifying events</h>
    <p>...</p>
    <section>
        <h>Attaching events to the handler</h>
        <p>...</p>
    </section>
    <section>
        <h>Attaching events to the listener</h>
        <p>...</p>
    </section>
    <section>
        <h>Specifying the binding elsewhere</h>
        <p>...</p>
    </section>
</section>    

8.19. The span element

The span element, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. This element imposes no presentational idioms on the content. Thus, authors may use this element in conjunction with style sheets, the xml:lang attribute, etc., to tailor XHTML to their own needs and tastes.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

For example, suppose you wish to mark all words in a document that need to be collected into an index. You could use a span element, with a class of xref:

<p>This operation is called
the <span class="xref">transpose</span>
or <span class="xref">inverse</span>.</p>

8.20. The strong element

The strong element indicates strong emphasis for its contents.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext
On <strong>Monday</strong> please put the rubbish out,
but <em>not</em> before nightfall!

8.21. The var element

The var element indicates an instance of a variable or program argument.

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext

Example:

The parameter <var>ncols</var> represents
the number of colors to use.

9. XHTML Hypertext Module

This section is normative.

The Hypertext Module provides the element that is used to define hypertext links to other resources, as well as a number of attributes.

This module supports the following element:

Element Attributes Minimal Content Model
a Common, charset (Charset), hreflang (LanguageCode), rel (LinkTypes), rev (LinkTypes), type (ContentType) (PCDATA | Inline)*

This module adds the a element to the Inline content set of the Text Module, and activates the Hypertext Attribute Collection.

Implementation: DTD

9.1. The a element

Attributes

The Common collection
A collection of other attribute collections, including: Core, Events, I18N, and Hypertext
charset = Charset
This attribute specifies the character encoding of the resource designated by the link. Please consult the section on character encodings for more details.
hreflang = LanguageCode
This attribute specifies the base language of the resource designated by href and may only be used when href is specified.
type = ContentType
This attribute gives an advisory hint as to the content type of the content available at the link target address. It allows user agents to opt to use a fallback mechanism rather than fetch the content if they are advised that they will get content in a content type they do not support.

Authors who use this attribute take responsibility to manage the risk that it may become inconsistent with the content available at the link target address.

For the current list of registered content types, please consult [MIMETYPES].

rel = LinkTypes
This attribute describes the relationship from the current document to the URI referred to by the element. The value of this attribute is a space-separated list of link types.