This section is normative.
In order to ensure that XHTML-family documents are maximally
portable among XHTML-family user agents, this specification
rigidly defines conformance requirements for both documents
and user agents, as well as for XHTML-family document types.
While the conformance
definitions can be found in this section, they necessarily
reference normative text within this document, within the
base XHTML specification [XHTML1], and within other
related specifications. It is only possible to fully
comprehend the conformance requirements of XHTML through a
complete reading of all normative references.
3.1. XHTML Family Document Type Conformance
It is possible to modify existing document types and define
wholly new document types using both modules defined in this
specification and other modules. Such a document type
conforms to this specification when it meets the following
criteria:
The document type must be defined using one of the
implementation methods defined by the W3C (currently this
is limited to XML DTDs, but XML Schema will be available
The document type must have a unique identifier as defined
The document type must include, at a minimum, the
Structure, Hypertext, Basic Text, and List modules defined
in this specification.
For each of the W3C-defined modules that are included, all
of the elements, attributes, and any required minimal
content models must be included (and optionally extended)
in the document type's content model.
The document type may define additional elements and
attributes. However, these must be in their own XML
3.2. XHTML Family Document Conformance
Documents that rely upon XHTML-family document types are
considered XHTML conforming if they validate against their
referenced document type.
3.3. XHTML Family User Agent Conformance
A conforming user agent must meet all of the following
criteria (as defined in [XHTML1]):
In order to be consistent with the XML 1.0 Recommendation
[XML], the user agent must parse and evaluate an XHTML
document for well-formedness. If the user agent claims to be a
validating user agent, it must also validate documents
against their referenced DTDs according to [XML].
When the user agent claims to support facilities defined
within this specification or required by this specification
through normative reference, it must do so in ways
consistent with the facilities' definition.
When a user agent processes an XHTML document as generic
XML, it shall only recognize attributes of type ID (e.g. the
id attribute on most XHTML elements) as fragment identifiers.
If a user agent encounters an element it does not
recognize, it must render the element's content.
If a user agent encounters an attribute it does not
recognize, it must ignore the entire attribute
specification (i.e., the attribute and its value).
If a user agent encounters an attribute value it doesn't
recognize, it must use the default attribute value.
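The handling of unrecognized elements and attributes described above can be sketched as follows. This is a minimal illustration, not a conforming user agent: the function names, the set of recognized attributes, and the choice to mark up em with asterisks are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of attributes this toy renderer recognizes.
KNOWN_ATTRIBUTES = {"id", "class"}


def render(element):
    """Return the text rendering of an element and its descendants."""
    inner = element.text or ""
    for child in element:
        inner += render(child) + (child.tail or "")
    if element.tag == "em":
        # A recognized element: apply its (toy) presentation.
        return "*" + inner + "*"
    # Any other element, including unrecognized ones: render its content.
    return inner


def recognized_attributes(element):
    """Drop every attribute specification the renderer does not recognize."""
    return {name: value for name, value in element.attrib.items()
            if name in KNOWN_ATTRIBUTES}


doc = ET.fromstring(
    '<p class="x" blink="yes">Hi <em>there</em> <sparkle>friend</sparkle>!</p>'
)
print(render(doc))
print(recognized_attributes(doc))
```

The content of the unrecognized sparkle element still appears in the output, while the unrecognized blink attribute is dropped together with its value.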
If a user agent encounters an entity reference (other than one of the
predefined entities) for which the User Agent has processed
no declaration (which could happen if the declaration is in
the external subset which the User Agent hasn't read), the
entity reference should be rendered as the characters
(starting with the ampersand and ending with the
semi-colon) that make up the entity reference.
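The entity reference rule above might be sketched as follows, assuming the text has already been extracted from the document. The function name and the declared parameter are hypothetical; only the five entities predefined by XML 1.0 are built in.

```python
import re

# The five entities predefined by XML 1.0.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "apos": "'", "quot": '"'}

# A simplified pattern for named entity references (not the full XML Name rule).
ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.:-]*);")


def render_entity_refs(text, declared=None):
    """Expand known entity references; leave undeclared ones as literal text."""
    known = dict(PREDEFINED)
    known.update(declared or {})

    def substitute(match):
        # For an undeclared name, fall back to the original characters,
        # from the ampersand through the semi-colon.
        return known.get(match.group(1), match.group(0))

    return ENTITY_REF.sub(substitute, text)


print(render_entity_refs("a &lt; b &nbsp; c"))
```

Here the reference to nbsp has no processed declaration, so it is rendered character for character, while the predefined lt reference is expanded.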
When rendering content, User Agents that encounter
characters or character entity references that are
recognized but not renderable should display the document
in such a way that it is obvious to the user that normal
rendering has not taken place.
The following characters are defined in [XML] as whitespace
characters:
Space (&#x0020;)
Tab (&#x0009;)
Carriage return (&#x000D;)
Line feed (&#x000A;)
The XML processor normalizes different systems' line-end
codes into a single line-feed character, which is passed
up to the application. In addition, the XHTML user agent
must treat the following characters as whitespace:
Form feed (&#x000C;)
Zero-width space (&#x200B;)
In elements where the 'xml:space' attribute is set to
'preserve', the user agent must leave all whitespace
characters intact (with the exception of leading and
trailing whitespace characters, which should be removed).
Otherwise, whitespace is handled according to the
following rules:
All whitespace surrounding block elements should be
removed.
Comments are removed entirely and do not affect
whitespace handling. One whitespace character on either
side of a comment is treated as two white space
characters.
Leading and trailing whitespace inside a block element
must be removed.
Line feed characters within a block element must be
converted into a space (except when the 'xml:space'
attribute is set to 'preserve').
A sequence of white space characters must be reduced to
a single space character (except when the 'xml:space'
attribute is set to 'preserve').
With regard to rendition, the User Agent should render
the content in a manner appropriate to the language in
which the content is written. In languages whose
primary script is Latinate, the ASCII space character
is typically used to encode both grammatical word
boundaries and typographic whitespace; in languages
whose script is related to Nagari (e.g., Sanskrit,
Thai, etc.), grammatical boundaries may be encoded
using the ZW 'space' character, but will not typically
be represented by typographic whitespace in rendered
output; languages using Arabiform scripts may encode
typographic whitespace using a space character, but may
also use the ZW space character to delimit 'internal'
grammatical boundaries (what look like words in Arabic
to an English eye frequently encode several words, e.g.
'kitAbuhum' = 'kitAbu-hum' = 'book them' = 'their
book'); and languages in the Chinese script tradition
typically neither encode such delimiters nor use
typographic whitespace in this way.
Whitespace in attribute values is processed according to
[XML].
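The whitespace rules for text inside a block element can be illustrated with a short sketch. It assumes the text has already been delivered by the XML processor, and it covers only the trimming, line-feed conversion, and collapsing rules; comment handling, whitespace around block elements, and language-specific rendition are outside its scope. The function name and the preserve parameter (standing in for an xml:space value of 'preserve') are hypothetical.

```python
import re

# Whitespace characters defined by XML 1.0.
XML_WHITESPACE = "\u0020\u0009\u000D\u000A"
# Additional characters an XHTML user agent must treat as whitespace.
XHTML_EXTRA_WHITESPACE = "\u000C\u200B"
WHITESPACE = XML_WHITESPACE + XHTML_EXTRA_WHITESPACE

# One or more consecutive whitespace characters.
WHITESPACE_RUN = re.compile("[" + WHITESPACE + "]+")


def normalize_block_text(text, preserve=False):
    """Apply the block-level whitespace rules to a run of character data."""
    # Leading and trailing whitespace is removed in both modes.
    stripped = text.strip(WHITESPACE)
    if preserve:
        # xml:space='preserve': interior whitespace is left intact.
        return stripped
    # Line feeds become spaces and every run collapses to a single space.
    return WHITESPACE_RUN.sub(" ", stripped)


print(repr(normalize_block_text("  Hello,\n\tworld\u200B \f! ")))
print(repr(normalize_block_text("  a \n b  ", preserve=True)))
```

A single substitution handles both the line-feed conversion and the collapsing rule, since a line feed is simply one member of the whitespace set.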
3.4. Naming Rules
Names for XHTML-conforming document types must adhere to
strict naming conventions so that it is possible for software
and users to readily determine the relationship of document
types to XHTML. The names for document types implemented as
XML Document Type Definitions are defined through XML Formal
Public Identifiers (FPIs). Within FPIs, fields are separated
by double slash character sequences (//). The
various fields MUST be composed as follows:
The leading field identifies the resource's relationship to
a formal standard. For privately defined resources, this
field MUST be "-". For formal standards, this
field MUST be the formal reference to the standard (e.g.
The second field MUST contain the name of the organization
responsible for maintaining the named item. There is no
formal registry for these organization names. Each
organization SHOULD define a name that is unique. The name
used by the W3C is, for example,
The third field MUST take the form
followed by an organization-defined unique identifier (e.g.
MyML 1.0). This identifier SHOULD be composed of a unique
name and a version identifier that can be updated as the
document type evolves.
The fourth field defines the language in which the item is
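As an illustration of the field structure described above, the following sketch splits a Formal Public Identifier on the double slash separator and checks only what can be checked mechanically: that there are exactly four fields and that the first two are non-empty. The FPI class and parse_fpi function are hypothetical names, and the required form of the third field is deliberately not validated here. The sample input is the public identifier of the XHTML 1.0 Strict DTD.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    """The four double-slash-separated fields of a Formal Public Identifier."""
    standard: str       # "-" or a formal reference to a standard
    organization: str   # name of the maintaining organization
    description: str    # document type name and version
    language: str       # language field


def parse_fpi(identifier):
    """Split an FPI into its four fields, rejecting malformed input."""
    fields = identifier.split("//")
    if len(fields) != 4:
        raise ValueError("expected 4 fields, found %d" % len(fields))
    fpi = FPI(*fields)
    if not fpi.standard:
        raise ValueError("leading field must not be empty")
    if not fpi.organization:
        raise ValueError("organization field must not be empty")
    return fpi


print(parse_fpi("-//W3C//DTD XHTML 1.0 Strict//EN"))
```

Software that needs to relate a document type to XHTML could then inspect the organization and description fields rather than matching on the whole string.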
Using these rules, the name for an XHTML family conforming
document type might be
for Naming Rules
Naming Rules are critical for portability of user agents
and XHTML-conforming tools. These rules need to be simple
enough that they can be readily adhered to, and need to
convey upon document type and module designers the power to
readily associate their creations with XHTML (for marketing
purposes, if nothing else). The above rules address these
concerns. Other naming conventions were considered; they
were not adopted for the following reasons:
Use the XHTML version in the identifier.
In the case of new modules, there is no need to
associate the module with a specific version of XHTML -
the name does not need to identify version
dependencies. In the case of new document types, the
new type does not necessarily have any relationship to
a specific version of XHTML. Instead, the new document
type should itself have versioning that will help in
its evolution. Document types will necessarily evolve
out of step with XHTML from the W3C.