1. Introduction

Contents

    1.1.  What is XHTML?
    1.2.  Modularization Framework
    1.3.  Modularization of XHTML
        1.3.1.  Semantic modules
        1.3.2.  DTD modules
        1.3.3.  Compound document types
        1.3.4.  Validation
        1.3.5.  Conformance

This section is normative.

1.1. What is XHTML?

XHTML is the reformulation of HTML 4.0 as an application of XML. XHTML 1.0 specifies three XML document types that correspond to the three HTML 4.0 DTDs: Strict, Transitional, and Frameset. XHTML 1.0 is the basis for a family of document types that subset and extend HTML. This document describes how to create additional members of the XHTML family of document types.

1.2. Modularization Framework

This framework provides mechanisms for defining members of the XHTML family of document types, including the three standard XHTML 1.0 document types (Strict, Transitional, and Frameset), subset document types that include only some of the elements from one of the standard document types, and extension document types that incorporate elements from other XML document types.

The modularization framework defines a collection of semantic modules that form the basis for the XHTML family of document types. These semantic modules may be combined with each other and with semantic modules defined for other XML document types to create XHTML subset and extension document types that qualify as members of the XHTML family of document types.

The modularization framework also defines a collection of DTD modules that represent the underlying building blocks used to define the XHTML semantic modules. These DTD modules are created according to certain conventions, as specified by the framework, to allow their combination into semantic modules and complete, functional DTDs.

The conventions specified by the modularization framework may also be used to create new semantic and DTD modules for other XML document types. These new semantic modules can then be used with the XHTML semantic modules to create XHTML document types that incorporate elements from other XML document types.

The modularization framework provides instructions to document authors for associating an XHTML document type with a document instance, and for verifying that the document is a valid instance of the XHTML document type associated with the document.

1.3. Modularization of XHTML

The modularization of XHTML refers to the task of specifying well-defined sets of XHTML elements that can be combined and extended by document authors, document type architects, other XML standards specifications, and application and product designers. The aim is to make it economically feasible for content developers to deliver content on a greater number and diversity of platforms.

Over the last couple of years, many specialized markets have begun looking to HTML as a content language. There is a strong movement toward using HTML across increasingly diverse computing platforms. Currently there is activity to move HTML onto mobile devices (handheld computers, portable phones, etc.), television devices (digital televisions, TV-based web browsers, etc.), and appliances (fixed-function devices). Each of these devices has different requirements and constraints.

Modularizing XHTML provides a means for product designers to specify which elements are supported by a device, using standard building blocks and standard methods for specifying which building blocks are used. These modules serve as "points of conformance" for the content community. The content community can now target the installed base that supports a certain collection of modules, rather than worrying about the installed base that supports one particular permutation of XHTML elements or another. The use of standards is critical for modularized XHTML to be successful on a large scale, because it is not economically feasible for content developers to tailor content to each and every permutation of XHTML elements. With a standard in place, either software processes can autonomously tailor content to a device, or the device can automatically load the software required to process a module.

Modularization also allows for the extension of XHTML's layout and presentation capabilities, using the extensibility of XML, without breaking the XHTML standard. This development path provides a stable, useful, and implementable framework for content developers and publishers to manage the rapid pace of technological change on the Web.

The modularization of XHTML is accomplished on two major levels: at the semantic level, and at the document type level. Roughly speaking, the semantic level provides a conceptual approach to the modularization of XHTML, while the document type level provides DTD-level building blocks that allow document type designers to support the semantic modules.

1.3.1. Semantic modules

An XHTML document type is defined as a set of semantic modules. A semantic module defines, in a document type, one kind of data that is semantically different from all others. Semantic modules can be combined into document types without a deep understanding of the underlying schema that defines the modules.
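As an informal illustration of treating a document type as a set of semantic modules, the following Python sketch checks whether a device that supports a given collection of modules can handle a given document type. All module names and the names BASIC, RICH, and PHONE are hypothetical and chosen only for illustration; this framework does not define them.

```python
# Sketch only: every module name below is hypothetical, not defined by this framework.

# A document type modeled as a set of semantic module names.
BASIC = frozenset({"structure", "text", "lists"})
RICH = BASIC | {"tables", "forms"}

# The collection of modules a hypothetical device supports.
PHONE = frozenset({"structure", "text", "lists", "forms"})


def device_supports(device_modules, document_type):
    """Return True if every module of the document type is supported by the device."""
    return frozenset(document_type) <= frozenset(device_modules)


def missing_modules(device_modules, document_type):
    """Return the sorted names of modules the document type needs but the device lacks."""
    return sorted(frozenset(document_type) - frozenset(device_modules))


if __name__ == "__main__":
    print(device_supports(PHONE, BASIC))
    print(missing_modules(PHONE, RICH))
```

Modeling a document type as a plain set mirrors the idea that modules, not individual elements, are the unit a content developer targets.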

1.3.2. DTD modules

A DTD module consists of a set of element types, a set of attribute list declarations, and a set of content model declarations, where any of these three sets may be empty. Both an attribute list declaration and a content model declaration in a DTD module may modify an element type that is not in the module's own set of element types.
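The three-part structure of a DTD module can be pictured with a small Python data structure. This is a minimal sketch, not a representation prescribed by the framework: the class name DTDModule, its field names, and the example module with its element and attribute names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DTDModule:
    """Hypothetical in-memory model of a DTD module (illustration only)."""
    name: str
    # Element types the module itself defines.
    element_types: set = field(default_factory=set)
    # Attribute list declarations: element name -> list of attribute names.
    attlists: dict = field(default_factory=dict)
    # Content model declarations: element name -> content model string.
    content_models: dict = field(default_factory=dict)

    def external_targets(self):
        """Element types this module modifies but does not itself define."""
        touched = set(self.attlists) | set(self.content_models)
        return touched - self.element_types


# Hypothetical module: defines "note", adds an attribute to "p",
# and redefines the content model of "div" to admit "note".
annotation = DTDModule(
    name="example-annotation",
    element_types={"note"},
    attlists={"note": ["type"], "p": ["note-ref"]},
    content_models={"note": "(#PCDATA)", "div": "(p | note)*"},
)

if __name__ == "__main__":
    print(sorted(annotation.external_targets()))
```

Here external_targets reports the element types outside the module that its declarations modify, which is the case a document type designer has to check when combining modules.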

An XML DTD is a means of describing the structure of a class of XML documents, collectively known as an XML document type. XML schemas are currently represented as DTDs, as described in the XML 1.0 Recommendation [XML]. Where possible, this document also allows for the potential use of other schema languages that are currently under consideration by the W3C XML Schema Working Group (e.g., DCD, SOX, DDML, XSchema).

1.3.3. Compound document types

A compound document type is an XML DTD composed from a collection of XML DTDs or DTD modules. The primary purpose of the modularization framework described in this document is to allow a DTD author to combine elements from multiple semantic modules into a compound document type, to develop documents against that compound document type, and to validate those documents against it.

One of the most valuable benefits of XML over SGML is that XML reduces the barrier to entry for standardization of element sets that allow communities to exchange data in an interoperable format. However, the relatively static nature of HTML as the content language for the Web has meant that these communities have previously held little hope that their XML document types would see widespread adoption as part of Web standards. The modularization framework allows for the dynamic incorporation of these diverse document types within the XHTML family of document types, further reducing the barriers to the incorporation of these domain-specific vocabularies in XHTML documents.

1.3.4. Validation

The use of well-formed, but not valid, documents is an important benefit of XML. In the process of developing a document type, however, the additional leverage provided by a validating parser for error checking is important. The same statement applies to XHTML document types with elements from multiple semantic modules.
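The distinction between well-formedness and validity can be seen with a non-validating parser. The sketch below uses the parser in Python's standard library, which checks well-formedness only; the function name is_well_formed and the two sample documents are assumptions made for illustration. Checking validity would additionally require a validating parser and the DTD, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as well-formed XML (validity is not checked)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Well-formed, but not valid against the XHTML 1.0 DTDs: the required
# head element is missing. A non-validating parser accepts it anyway.
WELL_FORMED_ONLY = "<html><body><p>Hello</p></body></html>"

# Not well-formed: the p element is never closed.
NOT_WELL_FORMED = "<html><body><p>Hello</body></html>"

if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED_ONLY))
    print(is_well_formed(NOT_WELL_FORMED))
```

The first document passes this check even though a validating parser would reject it, which is the extra error checking a document type developer gains from validation.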

The general problem of fragment validation (that is, validation of XML documents with different schemas from multiple XML Namespaces [XMLNS] in different portions of the document) is beyond the scope of this framework. An essential feature of this framework, however, is a collection of conventions for creating compound DTDs from a set of semantic modules.

1.3.5. Conformance

This section introduces three notions of conformance relating to the modularization of XHTML: document type conformance, document conformance, and browser conformance.

1.3.5.1. Document type conformance

The goal of the modularization framework is to support the creation of new modules beyond those envisioned for XHTML. To support this activity, it is the intent of this document that semantic modules form the atomic building blocks for new XHTML document types. For a compound document type to be considered an XHTML document type, it must satisfy the following properties:

These requirements are intended to be extremely permissive, while ensuring that document authors can rely on the behavior of a module at the semantic level.

1.3.5.2. Document conformance

An XHTML document that conforms to the modularization framework must meet the following requirements:

1.3.5.3. Browser conformance

A browser conforming to this document shall be a conforming browser as defined in [XHTML1]. In addition, such a browser shall support the following functionality:

  1. The browser shall use the value of the xmlns attribute of the html element to uniquely identify the default namespace of the document as associated with the document's XHTML document type.
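The namespace lookup described in item 1 might be sketched as follows. This is an illustration under stated assumptions, not a required implementation: the function name document_type_for and the registry KNOWN_DOCUMENT_TYPES are hypothetical, as is the example.org namespace URI and the labels attached to each URI. The URI http://www.w3.org/1999/xhtml is the namespace of the XHTML 1.0 Recommendation; its pairing with a label here is for illustration only.

```python
from xml.dom import minidom

# Hypothetical registry mapping namespace URIs to document type labels.
# Only the first URI is a real namespace (XHTML 1.0 Recommendation);
# the second entry is invented for this example.
KNOWN_DOCUMENT_TYPES = {
    "http://www.w3.org/1999/xhtml": "XHTML 1.0",
    "http://example.org/ns/xhtml-subset": "hypothetical subset document type",
}


def document_type_for(text):
    """Return the document type label for the xmlns value on the html element.

    Returns None if the root element is not html, if it carries no xmlns
    attribute, or if the namespace is not in the registry.
    """
    root = minidom.parseString(text).documentElement
    if root.tagName != "html":
        return None
    namespace = root.getAttribute("xmlns")  # empty string when absent
    return KNOWN_DOCUMENT_TYPES.get(namespace)


if __name__ == "__main__":
    sample = '<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
    print(document_type_for(sample))
```

The sketch reads the xmlns attribute literally from the html element, so a document that declares its namespace only through a prefix is deliberately not matched.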