3. Conformance Definition

Contents

3.1. XHTML Family Document Type Conformance
3.2. XHTML Family Document Conformance
3.3. XHTML Family User Agent Conformance
3.4. Naming Rules
- 3.4.1. Rationale for Naming Rules

This section is normative.

In order to ensure that XHTML-family documents are maximally portable among XHTML-family user agents, this specification rigidly defines conformance requirements for both documents and user agents, as well as for XHTML-family document types. While the conformance definitions can be found in this section, they necessarily reference normative text within this document, within the base XHTML specification [XHTML1], and within other related specifications. It is only possible to fully comprehend the conformance requirements of XHTML through a complete reading of all normative references.

3.1. XHTML Family Document Type Conformance

It is possible to modify existing document types and define wholly new document types using both modules defined in this specification and other modules. Such a document type conforms to this specification when it meets the following criteria:

The document type must be defined using one of the implementation methods defined by the W3C (currently this is limited to XML DTDs, but XML Schema will be available soon).
The document type must have a unique identifier as defined in Naming Rules.
The document type must include, at a minimum, the Structure, Hypertext, Basic Text, and List modules defined in this specification.
For each of the W3C-defined modules that are included, all of the elements, attributes, and any required minimal content models must be included (and optionally extended) in the document type's content model.
The document type may define additional elements and attributes. However, these must be in their own XML Namespace [XMLNAMES].
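
As an informal illustration of the minimum-module criterion above, the following Python sketch reports which required modules a document type omits. The function name and the idea of representing a document type as a list of module names are assumptions made for this example. It covers only that one criterion, not the implementation method, identifier, content model, or namespace criteria.

```python
# Module names taken from the criterion above; everything else is hypothetical.
REQUIRED_MODULES = frozenset({"Structure", "Hypertext", "Basic Text", "List"})


def missing_required_modules(included_modules):
    """Return the required module names absent from included_modules, sorted."""
    return sorted(REQUIRED_MODULES - set(included_modules))


# A document type that adds a Forms module but omits two required modules.
print(missing_required_modules(["Structure", "List", "Forms"]))
```

An empty result means the minimum-module criterion is met. The remaining criteria still have to be checked separately.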

3.2. XHTML Family Document Conformance

Documents that rely upon XHTML-family document types are considered XHTML conforming if they validate against their referenced document type.

3.3. XHTML Family User Agent Conformance

A conforming user agent must meet all of the following criteria (as defined in [XHTML1]):

In order to be consistent with the XML 1.0 Recommendation [XML], the user agent must parse and evaluate an XHTML document for well-formedness. If the user agent claims to be a validating user agent, it must also validate documents against their referenced DTDs according to [XML].
When the user agent claims to support facilities defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.
When a user agent processes an XHTML document as generic XML, it shall only recognize attributes of type ID (e.g. the id attribute on most XHTML elements) as fragment identifiers.
If a user agent encounters an element it does not recognize, it must render the element's content.
If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).
If a user agent encounters an attribute value it doesn't recognize, it must use the default attribute value.
If a user agent encounters an entity reference (other than one of the predefined entities) for which it has processed no declaration (which could happen if the declaration is in the external subset, which the user agent has not read), the entity reference should be rendered as the characters (starting with the ampersand and ending with the semicolon) that make up the entity reference.
When rendering content, User Agents that encounter characters or character entity references that are recognized but not renderable should display the document in such a way that it is obvious to the user that normal rendering has not taken place.
The following characters are defined in [XML] as whitespace characters:
- Space (&#x0020;)
- Tab (&#x0009;)
- Carriage return (&#x000D;)
- Line feed (&#x000A;)
The XML processor normalizes different systems' line-end codes into a single line-feed character, which is passed up to the application. In addition, the XHTML user agent must treat the following characters as whitespace:
- Form feed (&#x000C;)
- Zero-width space (&#x200B;)
In elements where the 'xml:space' attribute is set to 'preserve', the user agent must leave all whitespace characters intact (with the exception of leading and trailing whitespace characters, which should be removed). Otherwise, whitespace is handled according to the following rules:
- All whitespace surrounding block elements should be removed.
- Comments are removed entirely and do not affect whitespace handling. One whitespace character on either side of a comment is treated as two white space characters.
- Leading and trailing whitespace inside a block element must be removed.
- Line feed characters within a block element must be converted into a space (except when the 'xml:space' attribute is set to 'preserve').
- A sequence of white space characters must be reduced to a single space character (except when the 'xml:space' attribute is set to 'preserve').
- With regard to rendition, the User Agent should render the content in a manner appropriate to the language in which the content is written. In languages whose primary script is Latinate, the ASCII space character is typically used to encode both grammatical word boundaries and typographic whitespace; in languages whose script is related to Nagari (e.g., Sanskrit, Thai, etc.), grammatical boundaries may be encoded using the ZW 'space' character, but will not typically be represented by typographic whitespace in rendered output; languages using Arabiform scripts may encode typographic whitespace using a space character, but may also use the ZW space character to delimit 'internal' grammatical boundaries (what look like words in Arabic to an English eye frequently encode several words, e.g. 'kitAbuhum' = 'kitAbu-hum' = 'book them' == their book); and languages in the Chinese script tradition typically neither encode such delimiters nor use typographic whitespace in this way.
Whitespace in attribute values is processed according to [XML].
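
As an informal illustration of the whitespace rules above, the following Python sketch normalizes the text content of a single block element. The function name and the `preserve` flag (standing in for 'xml:space' set to 'preserve') are assumptions made for this example. It assumes the XML processor has already normalized line ends, and it does not handle comments, whitespace surrounding block elements, or language-specific rendition.

```python
import re

# Whitespace per [XML] (space, tab, carriage return, line feed) plus the two
# characters an XHTML user agent must also treat as whitespace
# (form feed, zero-width space).
XHTML_WHITESPACE = "\u0020\u0009\u000D\u000A\u000C\u200B"

# None of these characters is special inside a regex character class.
_WHITESPACE_RUN = re.compile("[" + XHTML_WHITESPACE + "]+")


def normalize_block_text(text, preserve=False):
    """Hypothetical helper: normalize the text content of one block element."""
    # Leading and trailing whitespace is removed in both modes.
    stripped = text.strip(XHTML_WHITESPACE)
    if preserve:
        # xml:space="preserve": interior whitespace is left intact.
        return stripped
    # Otherwise each run of whitespace, line feeds included, becomes one space.
    return _WHITESPACE_RUN.sub(" ", stripped)


print(repr(normalize_block_text("  Hello\n\tworld\u200B \f")))
print(repr(normalize_block_text("\n a \n b \n", preserve=True)))
```

Collapsing every whitespace run to one space in a single pass covers both the line-feed conversion rule and the sequence-reduction rule, since a line feed is simply one member of the run.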

3.4. Naming Rules

Names for XHTML-conforming document types must adhere to strict naming conventions so that it is possible for software and users to readily determine the relationship of document types to XHTML. The names for document types implemented as XML Document Type Definitions are defined through XML Formal Public Identifiers (FPIs). Within FPIs, fields are separated by double slash character sequences (//). The various fields MUST be composed as follows:

The leading field identifies the resource's relationship to a formal standard. For privately defined resources, this field MUST be "-". For formal standards, this field MUST be the formal reference to the standard (e.g. ISO/IEC 15445:1999).
The second field MUST contain the name of the organization responsible for maintaining the named item. There is no formal registry for these organization names. Each organization SHOULD define a name that is unique. For example, the name used by the W3C is W3C.
The third field MUST take the form DTD XHTML- followed by an organization-defined unique identifier (e.g. MyML 1.0). This identifier SHOULD be composed of a unique name and a version identifier that can be updated as the document type evolves.
The fourth field defines the language in which the item is developed (e.g. EN).

Using these rules, the name for an XHTML family conforming document type might be -//MyCompany//DTD XHTML-MyML 1.0//EN.
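
As an informal illustration of the field rules above, the following Python sketch splits an FPI on the double slash separator and checks each field. The class name, function name, and error messages are assumptions made for this example. The sketch checks only the shape of the identifier and cannot verify that an organization name or identifier is actually unique.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    standard: str      # "-" for privately defined resources
    organization: str  # e.g. "W3C"
    description: str   # "DTD XHTML-" plus an organization-defined identifier
    language: str      # e.g. "EN"


def parse_xhtml_family_fpi(fpi):
    """Hypothetical helper: split an FPI into its four fields and check them."""
    fields = fpi.split("//")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields separated by '//', got {len(fields)}")
    parsed = FPI(*fields)
    if not parsed.standard or not parsed.organization or not parsed.language:
        raise ValueError("FPI fields must not be empty")
    prefix = "DTD XHTML-"
    # The third field needs the prefix and a non-empty identifier after it.
    if not parsed.description.startswith(prefix) or parsed.description == prefix:
        raise ValueError(f"third field must be '{prefix}' followed by an identifier")
    return parsed


print(parse_xhtml_family_fpi("-//MyCompany//DTD XHTML-MyML 1.0//EN"))
```

A formal reference such as ISO/IEC 15445:1999 contains only single slashes, so splitting on the double slash leaves it intact as the leading field.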

3.4.1. Rationale for Naming Rules

Naming Rules are critical for portability of user agents and XHTML-conforming tools. These rules need to be simple enough that they can be readily adhered to, and need to confer upon document type and module designers the power to readily associate their creations with XHTML (for marketing purposes, if nothing else). The above rules address these concerns. Other naming conventions were considered, and they were not used for the following reasons:

Use the XHTML version in the identifier.
In the case of new modules, there is no need to associate the module with a specific version of XHTML - the name does not need to identify version dependencies. In the case of new document types, the new type does not necessarily have any relationship to a specific version of XHTML. Instead, the new document type should itself have versioning that will help in its evolution. Document types will necessarily evolve out of step with XHTML from the W3C.