XTiger Language Specification

Authors: Émilien Kia, Vincent Quint, Irène Vatton -- INRIA Rhône-Alpes

Date: 2007-10-11


Abstract

This document presents XTiger, an XML language for specifying document templates. XTiger templates are intended to guide an editing tool for building documents that follow a predefined model. The XTiger language is used jointly with another XML language, typically XHTML, which is called the target language. A template is a target language document where XTiger elements indicate how the document can be edited and still conform with the model. XTiger is versatile enough to represent templates that capture the overall structure of large documents as well as the fine details of a microformat.

1. Introduction to the XTiger language

Most popular XML document formats used on the Web, such as XHTML or SVG, are very flexible: they allow many different types of documents to be represented. This is an advantage in a wide space such as the Web, as a broad range of documents can be handled consistently. XHTML, for instance, is used to represent not only traditional Web pages, but also complex technical documents, sophisticated e-commerce forms or rich media slides, and all these documents can be accessed with a single browser. But this flexibility makes document authoring a complex task. When producing a specific type of document, an author is faced with all the possibilities provided by XHTML, and she has to make a number of difficult decisions. If multiple similar documents have to be produced consistently, for a particular use or for some specific application, authors have to use the XHTML document format consistently, which has proven to be very difficult.

XTiger (eXtensible Templates for Interactive Guided Editing of Resources) tackles this problem by defining how the document format (XHTML, for instance) has to be used for representing a certain type of document. To do so, XTiger relies on the notion of a template. A template is a skeleton representing a given type of document, expressed in the format of the final documents to be produced (XHTML, for instance). The format of the final documents is called the target language and must be an XML language. The skeleton contains some statements, expressed in the XTiger language, that specify how this minimal document can evolve and grow, while keeping in line with the intended type of the final documents. Some parts of the template may be frozen, if they have to appear as is in the final document. Some parts may be modified when producing the final document, some others may be added either freely or under some constraints. It is the role of the XTiger language to specify these possibilities and constraints.

When talking about XTiger, it is important to make a distinction between two kinds of documents: a template and its instances. A template is the skeleton presented above, containing XTiger elements and defining a certain type of document. It is the seed used to produce a series of documents, called instances, that are derived from the template by following the statements expressed by the embedded XTiger elements. In the rest of this document, we use the term instance instead of final document.

The statements expressed by XTiger elements are supposed to be interpreted by a document authoring tool. Starting from a template, the tool helps the user to follow the XTiger statements, thus ensuring that the instance being edited will stick to the document type specified by the template.

XTiger templates may be used to specify the overall structure of a large document, as well as the fine details of some of its parts. In particular, the latter makes it possible to express how microformats are to be used in large documents.

XTiger is not a document type like XHTML, SVG or MathML. It is always used in combination with a target language, which is a document type. The XTiger elements interspersed in a template are not supposed to be displayed in the same way as elements of the target language. Instead, the role of these XTiger elements is to specify what elements and attributes of the target language must, should or could be present at these positions in the document instance. That is the core of the language, which specifies the structure and (parts of) the content of documents.

This functionality is complemented with additional features that make the language easier to use. For instance, structure fragments can be defined only once and used at several places, in one or several templates. This facilitates a modular construction of templates, by sharing reusable pieces of structure stored in libraries.

As XTiger is used to describe structures, and because it is always mixed with XML languages, it is itself an XML language. XML namespaces are used to distinguish between XTiger elements and elements from the target language. This distinction allows existing web browsers to simply ignore the XTiger elements and to display a template as if these elements were not present.

The XTiger namespace is http://ns.inria.org/xtiger. For the sake of readability, all examples in this document use prefix xt: for XTiger element names, while names from the target language are not prefixed.
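As an illustration, the namespace declarations on the root element of an XHTML template might look as follows. This is a minimal sketch: the XHTML namespace URI is the standard one, the xt prefix is only the convention used in this document, and the title, label and text content are placeholders chosen for the example.

```xml
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xt="http://ns.inria.org/xtiger">
  <head>
    <title>Sample template</title>
    <!-- Empty XTiger head: this sketch defines no constructed types -->
    <xt:head version="0.9"/>
  </head>
  <body>
    <!-- Unprefixed elements belong to the target language (XHTML) -->
    <h1>
      <!-- Elements with the xt: prefix belong to the XTiger namespace -->
      <xt:use label="title" types="string">Document title</xt:use>
    </h1>
  </body>
</html>
```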

The target language used in the following examples is XHTML, but it could be any other XML language. The first example below is a piece of XTiger language that defines a component called "author". This component consists of an XHTML paragraph that contains a few XHTML span and br elements, with classes from the hCard microformat. The "author" component can be used to generate the XHTML structure representing an author in document instances, following the hCard microformat.

<xt:component name="author">
  <p class="vcard">
    <span class="fn">
      <xt:use types="string" label="name">Author name</xt:use>
    </span>
    <br/>
    <span class="adr">
      <xt:use types="string" label="address">Address ...</xt:use>
    </span>
    <br/>
    <span class="email">
      <xt:use types="string" label="email">email ...</xt:use>
    </span>
  </p>
</xt:component>
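A sketch of how this component might be referenced from a template body, assuming the definition above is placed in the template head. The label value first_author and the surrounding div element are illustrative choices, not part of the specification.

```xml
<div class="authors">
  <!-- One "author" structure (the hCard paragraph) is generated here -->
  <xt:use label="first_author" types="author"/>
</div>
```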

2. Types in XTiger

In XTiger, types are used to specify pieces of structure that may occur at several places in a template or in several templates. XTiger offers a few basic types and allows constructed types to be built. Constructed types are built with constructors that combine XTiger basic types and types from the target language. Two constructors are available: component and union.

2.1. Basic types

XTiger offers three basic types: number, boolean and string.

2.2. Target language types

As XTiger always works with a target language and is used to produce documents in that language, it may use elements and attributes from the target language. For instance, when the target language is XHTML, elements h1, h2, p, strong, span, cite are target language types.

2.3. Component

Component is a constructor that creates a new constructed type by specifying an XML structure assembling other types, which may be basic types, target language types and constructed types (unions and other components). The type thus created has a name that allows it to be referred to from other XTiger elements. This name must be unique in the template where it is defined.

The XTiger element component is used to define a component type:

<!ELEMENT component ANY>
<!ATTLIST component
    name  NMTOKEN  #REQUIRED>
Attributes:
name
The name of the type. This attribute must be unique in the template and is mandatory.
Content:
The content of the component element defines the structure of the new type. It may be any XML structure that combines target language elements (possibly with attributes) and XTiger elements allowed in the template body.

Example:

<xt:component name="hello">
    <p>Hello world!</p>
</xt:component>

This example defines a type called "hello" that is an XHTML paragraph containing the text "Hello world!" (the target language of the template where this element occurs is XHTML). It uses a target language type (the p element).

2.4. Union

Union is a constructor that defines a new type as a choice between several types, each of which is a basic type, a target language type, or a constructed type (component or other union). The new type has a name that allows it to be used in other XTiger elements. This name must be unique in the template where it is defined.

The XTiger element union is used to define a union type:

<!ELEMENT union EMPTY>
<!ATTLIST union
    name    NMTOKEN  #REQUIRED
    include CDATA    #REQUIRED
    exclude CDATA    #IMPLIED>
Attributes:
name
The name of the type. This attribute must be unique in the template and is mandatory.
include
A list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ... for XHTML), or the name attribute of a component or another union. This attribute is used to define the options that constitute the union. This attribute is mandatory.
exclude
This attribute is a list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ...), or the name attribute of a component or another union. This attribute is used to exclude some elements that are part of the union as defined by the include attribute. This attribute is optional.
Content:
empty.

XTiger provides four predefined unions that may be used in any type definition:

anySimple
includes all basic types (number, string and boolean)
anyElement
includes all elements defined in the target language.
anyComponent
includes all components defined in the template.
any
includes anySimple, anyElement and anyComponent.

Example:

<xt:union name="hello_or_p" include="hello p"/>
<xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
<xt:union name="headings1to4" include="headings" exclude="h5 h6"/>

With these definitions, the hello_or_p union provides a choice between the hello component and the p element. The headings union provides a choice between all HTML headings (h1 to h6). The headings1to4 union provides a choice between all HTML headings except h5 and h6.
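The predefined unions can be combined in the same way. The following sketch assumes that a predefined union name may appear in an include attribute like any other union name; the union names inline_value, no_table_no_form and other_components are invented for the example.

```xml
<!-- Any basic type, or an em or strong element -->
<xt:union name="inline_value" include="anySimple em strong"/>
<!-- Every element of the target language except table and form -->
<xt:union name="no_table_no_form" include="anyElement" exclude="table form"/>
<!-- Every component of the template except the hello component -->
<xt:union name="other_components" include="anyComponent" exclude="hello"/>
```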

3. Type definitions

The definitions of components and unions presented above must appear in the head of an XTiger template, or in an XTiger library imported by a template.

Type definitions do not appear in document instances. Instead, instances include a Processing Instruction that refers to their template, which contains type definitions and references to libraries containing additional type definitions.

3.1. Head element

The head element collects definitions of components and unions that are used in the template. It also refers to the libraries that contain additional components and/or unions used in the template. This is done with the import element.

A template always contains exactly one head element. It may appear anywhere in the template, but it cannot be the root of the document. In XHTML documents, it is recommended to insert it in the XHTML head element.

<!ELEMENT head ((component | union | import )*) >
<!ATTLIST head
    version         CDATA    #REQUIRED
    templateVersion CDATA    #IMPLIED>
Attributes:
version
Version of the XTiger language used in the template. For the XTiger language defined in this specification, the value of the version attribute must be "0.9". This attribute is mandatory.
templateVersion
Version of the template. A template may evolve over time, when the type of document it represents is modified and several versions are available. This version number may be used to make sure that the right version of the template is used for a given document instance.
Content:
The head element contains any number (including 0) of component, union and import elements, but no other elements.
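As a sketch of the recommended placement in XHTML, the XTiger head can sit inside the XHTML head element. The namespace declarations are assumed to be on the root element, and the library name common.xtl, the type names and the version value 1.2 are hypothetical.

```xml
<head>
  <title>Report template</title>
  <xt:head version="0.9" templateVersion="1.2">
    <!-- Only import, component and union elements are allowed here -->
    <xt:import src="common.xtl"/>
    <xt:component name="summary">
      <p class="summary">
        <xt:use label="summary_text" types="string">Summary ...</xt:use>
      </p>
    </xt:component>
    <xt:union name="block" include="summary p ul"/>
  </xt:head>
</head>
```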

3.2. XTiger libraries

An XTiger library is an XML document containing definitions of constructed types (components and/or unions). Libraries allow types to be declared only once and to be shared between different templates. An XTiger library is defined by the root element library. Its content model is the same as that of the head element of a template. Like the head element, a library can import other XTiger libraries using the import element.

<!ELEMENT library ((component | union | import)*)>
<!ATTLIST library
    version         CDATA    #REQUIRED
    templateVersion CDATA    #IMPLIED>
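A complete library file might then look like the following sketch. It assumes that the library element carries the same version attribute as the head element and that the target language namespace is declared as the default namespace; the imported file dates.xtl and the type names note and emphasis are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xt:library xmlns:xt="http://ns.inria.org/xtiger"
            xmlns="http://www.w3.org/1999/xhtml"
            version="0.9">
  <!-- A library may itself import other libraries -->
  <xt:import src="dates.xtl"/>
  <xt:component name="note">
    <p class="note">
      <xt:use label="note_text" types="string">Note ...</xt:use>
    </p>
  </xt:component>
  <xt:union name="emphasis" include="em strong"/>
</xt:library>
```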

3.3. Using libraries

When a template or a library uses constructed types defined in a library, that library must be explicitly imported in the template or library that uses it by an import element.

<!ELEMENT import EMPTY >
<!ATTLIST import
    src    CDATA    #REQUIRED>
Attributes:
src
URI of the imported library. This attribute is mandatory.
Content:
empty.

All components and unions declared in the imported library are inserted at the position of the import element. Some imported components and unions can be redeclared (same name attribute) in the current head or library element. The order of import elements in a head or library is important: a component or a union defined in an imported library with the same name attribute as a previous definition replaces that previous definition.
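The ordering rule can be sketched as follows, under the reading that a later definition always replaces an earlier one with the same name, whether it comes from an import or from the current head. The library names base.xtl and house_style.xtl are hypothetical, and both are supposed to define a component named "note".

```xml
<xt:head version="0.9">
  <!-- Definitions from base.xtl are inserted here -->
  <xt:import src="base.xtl"/>
  <!-- The "note" from house_style.xtl replaces the one from base.xtl -->
  <xt:import src="house_style.xtl"/>
  <!-- Redeclaration in the current head, with the same name attribute -->
  <xt:component name="note">
    <div class="note">
      <xt:use label="note_text" types="string">Note ...</xt:use>
    </div>
  </xt:component>
</xt:head>
```

Under that reading, a use element with types="note" in this template would refer to the div-based definition.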

4. The template body

A template contains a set of type definitions grouped in the head element but also the skeleton of a target language document and some XTiger statements that are used to generate instances. The latter (skeleton and statements) is called the template body. A copy of it serves as initial instance when a new document is created from the template. The head element and its definitions are not copied in the instance.

All target language elements included in the template body appear in all document instances exactly as they are in the template. Their content is preserved and cannot be modified in instances. This is the static part of the template.

There is also a dynamic part in a template, i.e. a part that can be modified under the control of XTiger elements. The XTiger elements that control the dynamic part are use, bag, repeat, option and attribute.

4.1. Inclusion of types

The use element indicates what type(s) of element can appear at that position in an instance. Only one element of the specified type(s) can appear at that position in an instance document.

<!ELEMENT use ANY>
<!ATTLIST use
    label       NMTOKEN  #REQUIRED
    types       CDATA    #REQUIRED
    currentType CDATA    #IMPLIED
    initial (true|false) #IMPLIED "false">
Attributes:
label
Label associated with the use element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
types
A list of space separated names. A single name is also allowed. Each name is either a basic type (number, boolean, string), an element of the target language (h1, h2, p, ...), or the name attribute of a component or a union. The element to be inserted at that position in an instance must be of one of these types, but there is no constraint on the descendants of the inserted elements, provided they comply with the DTD or schema of the target language, when target language elements are used. This attribute is mandatory.
currentType
Name of the selected type, when a choice has been made. This attribute is present only in instances. It should not be used in a template.
initial
Indicates whether the content of the use instance is the initial value provided by the template (true) or not (false). This attribute is present only in instances. It should not be used in a template.
Content:
The use element may have content. If content is present, it must be of one of the types listed in the types attribute. This content is considered an initial value that will be present in an instance. An instance author may replace it with other content, provided that content complies with the types attribute.

When a component is used only once in the template, and is the only type allowed at that position, the use element may be replaced by a component element. This is a shortcut: its semantics are the same as those of a use element at that position that refers to the component.

Example 1:

<xt:use label="birthday" types="string">
Your birth date here
</xt:use>

In this example "Your birth date here" is the content that will be displayed when a new instance is created from the template. An instance author can freely replace this string with any other string, but only with a string.

Example 2:

<xt:head version="0.9">
  <xt:component name="short_date">
    <xt:use label="day"   types="number">20</xt:use> /
    <xt:use label="month" types="number">10</xt:use> /
    <xt:use label="year"  types="number">1981</xt:use>
  </xt:component>
...
</xt:head>
...
<xt:use label="birthday" types="short_date"/>

This example shows how a component can be used to make sure that the user will enter a date in the dd/mm/yyyy format.

Example 3:

<xt:use label="date" types="em short_date">
  <em>20 october 1981</em>
</xt:use>

Here, the content of the xt:use element may be either an XHTML em element or a short_date component. Only one of them can be inserted at that position in an instance. The current content <em>20 october 1981</em> is a valid value, because it is an em. It does not need to be also a short_date.
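In an instance, the same use element would carry the instance-only attributes currentType and initial. The following sketch shows one plausible serialization, first with the initial value kept and then with a value entered by the author; the exact form written by a given authoring tool is an assumption, and the second date is invented.

```xml
<!-- The author has kept the initial value from the template -->
<xt:use label="date" types="em short_date" currentType="em" initial="true">
  <em>20 october 1981</em>
</xt:use>

<!-- The author has replaced the initial value with another em element -->
<xt:use label="date" types="em short_date" currentType="em" initial="false">
  <em>3 march 1975</em>
</xt:use>
```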

4.2. Free content areas

The use element puts strong constraints on the structure and/or content of a part of a document. It is sometimes useful to have more flexibility. That is the role of the bag element. It indicates that any number of elements may appear at that position in an instance document, and it specifies the allowed types for these elements.

<!ELEMENT bag ANY>
<!ATTLIST bag
    label   NMTOKEN  #REQUIRED
    types   CDATA    #REQUIRED
    include CDATA    #IMPLIED
    exclude CDATA    #IMPLIED>
Attributes:
label
Label associated with the bag element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
types
A list of space separated names. A single name is also allowed. Each name is either a basic type (number, boolean, string), an element of the target language (h1, h2, p, ...), or the name attribute of a component or a union. The elements to be inserted at that position in an instance (bag children) must be of one of these types. The types attribute is mandatory.
include
A list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ... for XHTML), or the name attribute of a component or another union. This attribute is used to limit the list of allowed descendant element types that could be inserted into bag children. This attribute is optional. When it is omitted all descendant element types allowed by the target language can be inserted into bag children.
exclude
This attribute is a list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ...), or the name attribute of a component or another union. This attribute is used to exclude some element types from the possible set of descendant element types. This attribute is optional.
Content:
The bag element may have content. If content is present, it must follow the constraints set by the types attribute. This content is considered an initial value that will be present in an instance. An instance author may replace it with other content, provided that content remains compliant with the types attribute.

Example:

<p>
  <xt:bag label="para" types="string em strong code">
  This <em>paragraph</em> contains <em><strong>strings</strong></em>
  and <strong><code>any</code></strong> combination of <em>emphasis</em>,
  <code>code</code> and <strong>strong</strong> elements.
  </xt:bag>
</p>
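The include and exclude attributes act one level further down, on the descendants of bag children. A sketch, assuming that exclude alone can be used to remove a few element types from everything the target language would otherwise allow; the label and class values are illustrative.

```xml
<div class="content">
  <!-- Bag children must be p, ul or ol elements.
       Inside those children, img and table elements are not allowed. -->
  <xt:bag label="body_text" types="p ul ol" exclude="img table">
    <p>First paragraph ...</p>
    <ul>
      <li>First item ...</li>
    </ul>
  </xt:bag>
</div>
```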

4.3. Repeated elements

It is often useful to be able to repeat a part of the document structure several times. In this case, the structure to be repeated must first be declared as a component. It can then be used with an xt:repeat element in the template body.

<!ELEMENT repeat ( use+ | component )>
<!ATTLIST repeat
    label         NMTOKEN #REQUIRED
    minOccurs     CDATA   #IMPLIED "0"
    maxOccurs     CDATA   #IMPLIED "*">
Attributes:
label
Label associated with the repeat element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
minOccurs
Minimum number of times the component must be repeated. If this attribute is absent, the minimum is 0.
maxOccurs
Maximum number of times the component may be repeated. "*" means no upper bound. If this attribute is absent, it is equivalent to "*".
Content:
A use element indicates (with its types attribute) the type of the component to be repeated. Basic types are not allowed. If the types attribute of the use element is a list of several types, the repeated elements may have any of these types. Several use elements may be present in a repeat element in a template to provide initial values to several repeated elements.

Instead of a use element, the content may be a single component element, when this component is used only there in the whole template. This is equivalent to defining this component in the head of the template and putting a use element (that refers to that virtual component) in the repeat element.

Example:

<xt:head version="0.9">
  <xt:component name="bib_item">
    <li>
      <xt:repeat label="authors" minOccurs="1" maxOccurs="5">
        <xt:component name="author">
          <xt:use label="given_name" types="string"/>
          <xt:use label="family_name" types="string"/>
        </xt:component>
      </xt:repeat>
      ...
    </li>
  </xt:component>
  ...
</xt:head>
...
<h2>Bibliography</h2>
<ul>
  <xt:repeat label="bib_list" minOccurs="1">
    <xt:use label="entry" types="bib_item"/>
  </xt:repeat>
</ul>

This example describes a bibliography section which includes at least one bib_item component. Each of these components contains between one and five authors.
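When the types attribute of the use element lists several types, each repeated element may be of any of them. The following sketch illustrates this with two hypothetical components, journal_item and book_item; the labels, class values and the upper bound of 20 are invented for the example.

```xml
<xt:head version="0.9">
  <xt:component name="journal_item">
    <li class="journal">
      <xt:use label="article_title" types="string">Article title</xt:use>,
      <cite><xt:use label="journal_name" types="string">Journal</xt:use></cite>
    </li>
  </xt:component>
  <xt:component name="book_item">
    <li class="book">
      <cite><xt:use label="book_title" types="string">Book title</xt:use></cite>
    </li>
  </xt:component>
</xt:head>
...
<ul>
  <!-- Between 1 and 20 entries, each a journal_item or a book_item -->
  <xt:repeat label="references" minOccurs="1" maxOccurs="20">
    <xt:use label="reference" types="journal_item book_item"/>
  </xt:repeat>
</ul>
```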

4.4. Optional elements

It is often useful to indicate that some part of the document is optional. This is done with the option element. This element is equivalent to an xt:repeat element with maxOccurs="1" and minOccurs="0".

<!ELEMENT option ANY>
<!ATTLIST option
    label   NMTOKEN      #REQUIRED
    checked (true|false) #IMPLIED "true">
Attributes:
label
Label associated with the option element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
checked
Indicates whether the optional element is present (true) or not (false). If this attribute is not present, the element is there (true). This attribute is usually not present in a template. It is used in document instances.
Content:
The piece of document that is optional.

Example:

<xt:option label="bibliography">
  <xt:component name="biblio">
    <h2>Bibliography</h2>
    <ul>
      <xt:repeat label="bib_list" minOccurs="1">
        <xt:use label="entry" types="bib_item"/>
      </xt:repeat>
    </ul>
  </xt:component>
</xt:option>

This example defines a component called biblio and makes it optional at the position where it appears in the template body.
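The checked attribute records the author's choice in an instance. The sketch below shows a simpler optional part in a template and a possible instance form once the author has switched it off. Whether a tool keeps the content of an unchecked option in the instance is an assumption made here, and the label values are illustrative.

```xml
<!-- In the template: an optional subtitle, present by default -->
<xt:option label="subtitle">
  <h2><xt:use label="subtitle_text" types="string">Subtitle</xt:use></h2>
</xt:option>

<!-- In an instance: the author has switched the subtitle off.
     Keeping the content here is an assumption, not specified behaviour. -->
<xt:option label="subtitle" checked="false">
  <h2><xt:use label="subtitle_text" types="string">Subtitle</xt:use></h2>
</xt:option>
```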

4.5. Attributes

XTiger provides a way to control attributes from the target language. This is achieved by inserting an attribute element as a child of a target language element. The attribute element makes an attribute of its parent element mandatory, fixed, or prohibited. If several attributes of a single target language element have to be controlled, several attribute elements must be used, one for each of these attributes.

<!ELEMENT attribute EMPTY>
<!ATTLIST attribute
    name    NMTOKEN                          #REQUIRED
    type    (number|string|list)             #IMPLIED "string"
    use     (required|optional|prohibited)   #IMPLIED "required"
    default CDATA #IMPLIED
    fixed   CDATA #IMPLIED
    values  CDATA #IMPLIED>
Attributes:
name
Name of the attribute of the parent element that is constrained. This attribute is mandatory.
type
Type of the constrained attribute (number, string, list). If the type attribute is not present, the default type "string" is assumed.
use
Indicates whether the constrained attribute is required, optional, or prohibited. If an attribute is required by its DTD, this attribute will be added even if the template makes it prohibited or optional. If attribute use is not present, the default value "required" is assumed.
default
Default value of the constrained attribute. This value can be replaced by another value in the instance. Attribute default is optional.
fixed
Fixed value of the constrained attribute. Attribute fixed is optional.
values
List of possible values. Possible values are separated by spaces. Attribute values is optional.
Content:
empty.

Example:

<div>
  <xt:attribute name="class" use="optional" 
    values="comment example info" default="comment"/>
  ...
</div>

This example shows an XHTML div element whose class attribute is made optional, with its value limited to the three options comment, example and info. The default value is set to comment.
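Several attribute elements can be combined on one target language element, one per controlled attribute. A sketch, with attribute choices made up for illustration: href is required and left to the author, rel is given a fixed value, and style is prohibited.

```xml
<a>
  <!-- href must be present; its value is supplied in the instance -->
  <xt:attribute name="href" type="string" use="required"/>
  <!-- rel always has the same value -->
  <xt:attribute name="rel" fixed="nofollow"/>
  <!-- style may not be set on this element -->
  <xt:attribute name="style" use="prohibited"/>
  <xt:use label="link_text" types="string">Link text</xt:use>
</a>
```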

5. Resources and processing

When working with XTiger templates, three different kinds of resources are involved:

Template file
A template defines the skeleton of a document and the constructed types (components and unions) it uses. Template files have the .xtd extension.
XTiger libraries
Libraries are lists of constructed type definitions. They can be imported by templates and other libraries. Library files have the .xtl extension.
Document instances
Instances are documents generated from templates. Instance files have the usual extension of their target language.

When a user creates a document from a template, the new document instance is created as a copy of the template. The xt:head element with its type definitions is kept by the authoring tool, but it is not copied into the document instance.

The template is linked to the new instance by a processing instruction:
<?xtiger template="URI/of/the/template.xtd" version="0.9" templateVersion="xx" ?>
which is inserted at the beginning of the instance, in the same way CSS style sheets are linked to XML documents. With this link, the authoring tool can find all the type definitions needed during editing sessions. All other XTiger elements (use, bag, repeat, option, attribute) as well as all target language elements are kept in the copy that constitutes the initial instance. XTiger types that appear in these elements are replaced by references to their definition in the template (actually, by references to a parsed representation of types in core memory which is more compact).
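The beginning of an instance might therefore look like the following sketch. The template URI, the version numbers and the document content are hypothetical; the point is only that the processing instruction is present, the xt:head element is absent, and the XTiger elements of the body are kept.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xtiger template="http://example.org/templates/article.xtd" version="0.9" templateVersion="1.0" ?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xt="http://ns.inria.org/xtiger">
  <head>
    <title>My first article</title>
    <!-- No xt:head here: type definitions stay in the template -->
  </head>
  <body>
    <h1>
      <xt:use label="title" types="string" initial="false">My first article</xt:use>
    </h1>
  </body>
</html>
```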

6. References

Francesc Campoy Flores, Vincent Quint, Irène Vatton, Templates, Microformats and Structured Editing, Proceedings of DocEng'06, ACM Symposium on Document Engineering, 10-13 October 2006, Amsterdam, The Netherlands, pp. 188-197. This research paper presents an early version of the XTiger language.