6. Developing DTDs with defined and extended modules

Contents

    6.1. Defining additional attributes
    6.2. Defining additional elements
    6.3. Defining a new module
    6.4. Defining the content model for a collection of modules
        6.4.1. Integrating a stand-alone module into XHTML
        6.4.2. Mixing a new module throughout the modules in XHTML
    6.5. Creating a new DTD
        6.5.1. Creating a simple DTD
        6.5.2. Creating a DTD by extending XHTML
        6.5.3. Creating a DTD by removing and replacing XHTML modules
    6.6. Using the new DTD

This section is informative.

The primary purpose of defining XHTML modules and a general modularization methodology is to ease the development of DTDs that are based upon XHTML. These DTDs may extend XHTML by integrating additional capabilities (e.g. [SMIL] or [MathML]), or they may define a subset of XHTML for use in a specialized device. The same modules and techniques apply in either case. This section describes the techniques that DTD designers use to take advantage of this modularization architecture. It does this by applying the techniques defined in the previous sections in progressively more complex ways, culminating in the creation of a complete DTD from disparate modules.

Note that in no case do these examples require the modification of the XHTML-provided module files themselves. The XHTML module files are completely parameterized, so that it is possible through separate module definitions and driver files to customize the definition and the content model of each element and each element's hierarchy.

Finally, remember that most users of XHTML are not expected to be DTD authors. DTD authors are generally people who are defining specialized markup that will improve the readability of documents, simplify their rendering, or ease their machine processing, or they are client designers who need to define a specialized DTD for their specific client. Consider these cases:

An organization is providing subscriber information via a web interface. The organization stores its subscriber information in an XML-based database. One way to publish that information from the database to the web is to embed the XML records from the database directly in the XHTML document. While it is possible to merely embed the records, the organization could define a DTD module that describes the records, attach that module to an XHTML DTD, and thereby create a complete DTD for the pages. The organization can then access the data within the new elements via the Document Object Model [DOM], validate the documents, provide style definitions for the elements that cascade using Cascading Style Sheets [CSS], etc. By taking the time to define the structure of their data and create a DTD using the processes defined in this section, the organization can realize the full benefits of XML.
An Internet client developer is designing a specialized device. That device will only support a subset of XHTML, and the devices will always access the Internet via a proxy server that validates content before passing it on to the client (to minimize error handling on the client). In order to ensure that the content is valid, the developer creates a DTD that is a subset of XHTML using the processes defined in this section. They then use the new DTD in their proxy server and in their devices, and also make the DTD available to content developers so that developers can validate their content before making it available. By performing a few simple steps, the client developer can use the modularization architecture defined in this document to greatly reduce their DTD development cost and ensure that they fully support the subset of XHTML that they choose to include.

6.1. Defining additional attributes

In some cases, an extension to XHTML can be as simple as additional attributes. Attributes can be added to an element just by specifying an additional ATTLIST for the element, for example:

<!ATTLIST a
	  myattr   CDATA        #IMPLIED
>

would add the "myattr" attribute, with a value type of CDATA, to the "a" element. This works because XML permits the extension of the attribute list for an element at any point in a DTD.

Naturally, adding an attribute to a DTD does not mean that any new behavior is defined for arbitrary clients. However, a content developer could use an extra attribute to store information that is accessed by associated scripts via the Document Object Model (for example).
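
As a rough illustration, one way to try this without creating a new DTD file is to place the declaration in a document's internal subset. The sketch below assumes the Transitional DTD identifiers used later in this section. The link target and the attribute value are invented for the example, and the xmlns attribute is omitted for brevity.

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/DTDs/XHTML1/XHTML1-t.dtd" [
  <!-- Extra attribute declared in the internal subset; XML merges it
       with the attribute list the XHTML DTD declares for "a". -->
  <!ATTLIST a
            myattr   CDATA   #IMPLIED
  >
]>
<html>
<head>
<title>Attribute example</title>
</head>
<body>
<!-- "report.html" and "record-42" are hypothetical values -->
<p><a href="report.html" myattr="record-42">Report</a></p>
</body>
</html>
```

A validating processor would accept the myattr attribute on the a element here, while a document that uses the unmodified DTD would be reported as invalid.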

6.2. Defining additional elements

Defining additional elements is only slightly more complicated than defining additional attributes. DTD authors write an element declaration, and an attribute list declaration where one is needed, for each new element:

<!ELEMENT myelement ( #PCDATA | myotherelement )* >
<!ATTLIST myelement
          myattribute    CDATA    #IMPLIED
>

<!ELEMENT myotherelement EMPTY >

After the elements are defined, they need to be integrated into the content model. Strategies for integrating new elements or sets of elements into the content model are addressed in Defining the content model for a collection of modules below.

6.3. Defining a new module

When work on extending XHTML modules is done with the intent of making the work generally available for use in developing additional extended DTDs, developers should adhere to the module definition techniques defined in Implementing Document Model Modules in the DTD. A module constructed using those techniques has several characteristics:

The definitions in the module are related by some common theme that makes it reasonable to have them in a single module.
The module is declared such that its entities are uniquely named.
The module adheres to a strict naming convention for the various classes of definitions, so that their names are predictable.
The module may rely upon entities defined in other modules to specify its entities, elements, and attribute lists.
Unless the content model of elements in the module is fixed, each element's content model is parameterized so that it can be extended by DTD authors.
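
A minimal sketch of a module file with these characteristics, using the elements from the previous section, might look as follows. The names MyOrg-myelement.module and Myelement.content are hypothetical, modeled on the XHTML1-form.module and "<element>.content" patterns that appear elsewhere in this section; they are not taken from any published module. Because the sketch uses a conditional section and parameter entity references inside declarations, it would have to live in an external file rather than in a document's internal subset.

```xml
<!-- myorg-myelement.mod: hypothetical module file -->

<!-- Switch that lets a driver drop the whole module by declaring
     this entity as "IGNORE" before the module is read. -->
<!ENTITY % MyOrg-myelement.module "INCLUDE" >
<![%MyOrg-myelement.module;[

<!-- Parameterized content model: a driver that declares
     Myelement.content first replaces this default. -->
<!ENTITY % Myelement.content "( #PCDATA | myotherelement )*" >

<!ELEMENT myelement %Myelement.content; >
<!ATTLIST myelement
          myattribute    CDATA    #IMPLIED
>

<!-- Fixed content model, so no parameter entity is needed. -->
<!ELEMENT myotherelement EMPTY >

]]>
```

The defaults only take effect when no earlier declaration exists, since an XML processor keeps the first declaration it sees for a given entity. This is what lets a driver file customize the module without editing it.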

6.4. Defining the content model for a collection of modules

Since the content model of modules is fully parameterized, DTD authors may modify the content model for every element in every module. There are two ways to approach this modification:

Re-define the "<element>.content" entity for each element.
Re-define one or more of the global content model entities (*.class or *.mix).

The appropriate strategy depends on the nature of the modules being combined and integrated. The remainder of this section describes techniques for integrating two different classes of modules.

6.4.1. Integrating a stand-alone module into XHTML

When a module (and remember, a module can be a collection of other modules) contains elements that only reference each other in their content models, it is said to be "internally complete". As such, the module can be used on its own (for example, you could define a DTD that was just that module, and use one of its elements as the root element). Integrating such a module into XHTML is a three-step process:

Decide what element(s) can be thought of as the root(s) of the new module.
Decide where these elements need to attach in the XHTML content tree.
Then, for each attachment point in the content tree, add the root element(s) to the content definition for the XHTML elements.

Consider attaching the elements defined above. In that example, the element myelement is the root. To attach this element under the object element, and only the object element, of XHTML, the following would work:

<!ENTITY % Object.content "( %Flow.mix; | param | myelement )*">

A DTD defined with this content model would allow a document like the following fragment:

<object data="...">
<p>The object didn't load!</p>
<myelement>This is content of a locally defined element</myelement>
</object>
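
By contrast, because only the content model of object was redefined, the same element would presumably be rejected anywhere else. A fragment such as the following would not validate against that DTD, assuming no other content model mentions myelement:

```xml
<!-- Invalid under the DTD above: myelement was only added to the
     content model of object, not to that of p. -->
<p>
<myelement>This element is outside an object element</myelement>
</p>
```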

6.4.2. Mixing a new module throughout the modules in XHTML

Extending the example above, to attach this module everywhere that the %Flow.mix content model group is permitted, would require something like the following:

<!ENTITY % Misc.class
     "ins | del | script | noscript | myelement" >

Since the %Misc.class content model class is used throughout the XHTML Transitional DTD, the new module would become available throughout an extended XHTML document.
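
As an illustration, a fragment like the one below would then be expected to validate. This assumes that the content models of div and li are built from %Flow.mix and that %Flow.mix includes %Misc.class; neither assumption is spelled out in this section.

```xml
<div>
<myelement>Directly inside a block container</myelement>
<ul>
<li><myelement>Inside a list item</myelement></li>
</ul>
</div>
```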

6.5. Creating a new DTD

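So far the examples in this section have described the methods of extending XHTML and XHTML's content model. Once this is done, the next step is to collect the modules that comprise the DTD into a single DTD driver, incorporating the new definitions so that they override and augment the basic XHTML definitions as appropriate. Because an XML processor uses the first declaration of an entity that it encounters, the overriding declarations must appear in the driver before the reference to the XHTML DTD that they modify.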

When defining a new DTD, it is essential that each DTD have a unique identifier to use in the xmlns attribute of the root element (usually the html element). This identifier is often a URI, but in any event is something that can be used by browsers to differentiate the DTD from others. This identifier is defined using the XHTML1.ns parameter entity when creating a DTD that uses the XHTML1 structure module.

6.5.1. Creating a simple DTD

Using the trivial example above, it is possible to define a new DTD that extends the XHTML Transitional DTD with little effort. The following is a complete, working extended DTD:

<!ENTITY % XHTML1.ns "http://my.company.com/DTDs/example.dtd" >

<!ELEMENT myelement ( #PCDATA | myotherelement )* >
<!ATTLIST myelement
     myattribute	CDATA	#IMPLIED
>

<!ELEMENT myotherelement EMPTY >

<!ENTITY % Misc.class
     "ins | del | script | noscript | myelement" >

<!ENTITY % XHTML1-t.dtd PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
          "http://www.w3.org/DTDs/XHTML1/XHTML1-t.dtd">
%XHTML1-t.dtd;

6.5.2. Creating a DTD by extending XHTML

Next, there is the situation where a complete, additional, and complex module is added to XHTML (or to a subset of XHTML). In essence, this is the same as the trivial example above. The only difference is that the module being added is incorporated into the DTD by reference, rather than by including the new definitions explicitly in the DTD.

One such complex module is the DTD for [MathML]. In order to combine MathML and XHTML into a single DTD, an author would just decide where MathML content should be legal in the document, and add the MathML root element to the content model at that point:

<!ENTITY % XHTML1.ns "http://www.w3.org/DTDs/XHTML1_plus_MathML.dtd" >
<!ENTITY % XHTML1-math
     PUBLIC "-//W3C//MathML 1.0//EN"
            "http://www.w3.org/DTDs/MathML/MathML1.dtd" >
%XHTML1-math;

<!ENTITY % Inlspecial.class "a | img | object | map | math" >

<!ENTITY % XHTML1-strict
     PUBLIC "-//W3C//XHTML 1.0 Strict//EN"
            "http://www.w3.org/DTDs/XHTML/XHTML1-s.dtd" >
%XHTML1-strict;

Note that, while this is a valid example, it does not create a working DTD at this time. The reason for this is that the MathML DTD defines two elements (var and select) that conflict directly with XHTML. This conflict needs to be resolved in order for the new DTD to work correctly.

6.5.3. Creating a DTD by removing and replacing XHTML modules

Finally, another way in which DTD authors may use XHTML modules is to define a DTD that is a subset of XHTML (because, for example, they are building devices or software that only supports a subset of XHTML). Doing this is only slightly more complex than the previous example. The basic steps to follow are:

Select the predefined XHTML DTD to use as a basis for the new DTD (Strict, Transitional, or Frameset).
Select the modules to remove from that DTD.
Define a new DTD that "IGNORES" the modules.

For example, consider a device that supports Strict XHTML 1.0, but without forms or tables. The DTD for such a device would look like this:

<!ENTITY % XHTML1.ns "http://www.w3.org/DTDs/XHTML1_simple.dtd" >

<!ENTITY % XHTML1-form.module "IGNORE" >
<!ENTITY % XHTML1-table.module "IGNORE" >

<!ENTITY % XHTML1-strict
     PUBLIC "-//W3C//XHTML 1.0 Strict//EN"
            "http://www.w3.org/DTDs/XHTML/XHTML1-s.dtd" >
%XHTML1-strict;

Note that this does not actually modify the content model for the Strict XHTML 1.0 DTD. However, since XML allows a content model to name elements that are never declared, no change is needed: the form and table elements have no declarations, so they cannot appear in a valid document and are effectively dropped from the model.
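
A document written for such a device might then look like the sketch below. The public identifier is hypothetical, and the sketch assumes that the new DTD is published at the location named by its XHTML1.ns entity.

```xml
<!DOCTYPE html PUBLIC "-//Example//DTD XHTML Simple//EN"
          "http://www.w3.org/DTDs/XHTML1_simple.dtd">
<html xmlns="http://www.w3.org/DTDs/XHTML1_simple.dtd">
<head>
<title>Subset example</title>
</head>
<body>
<p>Paragraphs remain available in the subset.</p>
<!-- A table or form element here would make the document invalid,
     because neither element is declared in the subset DTD. -->
</body>
</html>
```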

6.6. Using the new DTD

Once a new DTD has been developed, it can be used in any document. Using the DTD is as simple as referencing it in the DOCTYPE declaration of a document:

<!DOCTYPE html PUBLIC "-//MyOrg//DTD My XHTML Extensions//EN"
          "http://www.myorg.com/DTDs/myorg.dtd">
<html xmlns="http://www.myorg.com/DTDs/myorg.dtd">
<head>
<title>MyOrg Document</title>
</head>
<body>
<p>This is an example document using the new elements:
<myelement>A test element <myotherelement /> </myelement>
</p>
</body>
</html>