Authors: Émilien Kia, Vincent Quint, Irène Vatton -- INRIA Rhône-Alpes
Date: 2007-10-11
Abstract
This document presents XTiger, an XML language for specifying document templates. XTiger templates are intended to guide an editing tool for building documents that follow a predefined model. The XTiger language is used jointly with another XML language, typically XHTML, which is called the target language. A template is a target language document where XTiger elements indicate how the document can be edited and still conform with the model. XTiger is versatile enough to represent templates that capture the overall structure of large documents as well as the fine details of a microformat.
Most popular XML document formats used on the Web, such as XHTML or SVG, are very flexible: they allow many different types of documents to be represented. This is an advantage in a wide space such as the Web, as a broad range of documents can be handled consistently. XHTML, for instance, is used to represent not only traditional Web pages, but also complex technical documents, sophisticated e-commerce forms or rich media slides, and all these documents can be accessed with a single browser. But this flexibility makes document authoring a complex task. When producing a specific type of document, an author is faced with all the possibilities provided by XHTML, and she has to make a number of difficult decisions. If multiple similar documents have to be produced consistently, for a particular use or for some specific application, authors have to use the XHTML document format consistently, which has proven to be very difficult.
XTiger (eXtensible Templates for Interactive Guided Editing of Resources) tackles this problem by defining how the document format (XHTML, for instance) has to be used for representing a certain type of document. To do so, XTiger relies on the notion of a template. A template is a skeleton representing a given type of document, expressed in the format of the final documents to be produced (XHTML, for instance). The format of the final documents is called the target language and must be an XML language. The skeleton contains some statements, expressed in the XTiger language, that specify how this minimal document can evolve and grow, while keeping in line with the intended type of the final documents. Some parts of the template may be frozen, if they have to appear as is in the final document. Some parts may be modified when producing the final document, some others may be added either freely or under some constraints. It is the role of the XTiger language to specify these possibilities and constraints.
When talking about XTiger, it is important to make a distinction between two kinds of documents: a template and its instances. A template is the skeleton presented above, containing XTiger elements and defining a certain type of document. It is the seed used to produce a series of documents, called instances, that are derived from the template by following the statements expressed by the embedded XTiger elements. In the rest of the paper, we use the term instance instead of final document.
The statements expressed by XTiger elements are supposed to be interpreted by a document authoring tool. Starting from a template, the tool helps the user to follow the XTiger statements, thus ensuring that the instance being edited will stick to the document type specified by the template.
XTiger templates may be used to specify the overall structure of a large document, as well as the fine details of some of its parts. In particular, the latter feature makes it possible to express how microformats are to be used in large documents.
XTiger is not a document type like XHTML, SVG or MathML. It is always used in combination with a target language, which is a document type. The XTiger elements interspersed in a template are not supposed to be displayed in the same way as elements of the target language. Instead, the role of these XTiger elements is to specify what elements and attributes of the target language must, should or could be present at these positions in the document instance. That is the core of the language, which specifies the structure and (parts of) the content of documents.
This functionality is complemented with additional features that make the language easier to use. For instance, structure fragments can be defined only once and used at several places, in one or several templates. This facilitates a modular construction of templates, by sharing reusable pieces of structure stored in libraries.
As XTiger is used to describe structures, and because it is always mixed with XML languages, it is itself an XML language. XML namespaces are used to distinguish between XTiger elements and elements from the target language. This distinction allows existing web browsers to simply ignore the XTiger elements and to display a template as if these elements were not present.
The XTiger namespace is http://ns.inria.org/xtiger. For the
sake of readability, all examples in this document use prefix
xt: for XTiger element names, while names from the target
language are not prefixed.
The target language used in the following examples is XHTML, but it might
be any other XML language as well. The first example below is a piece of
XTiger language that defines a component called "author". This component
consists of an XHTML paragraph that contains a few XHTML
span and br elements, with classes from the hCard
microformat. The "author" component can be used to generate the XHTML
structure representing an author in document instances, following the hCard
microformat.
<xt:component name="author">
  <p class="vcard">
    <span class="fn">
      <xt:use types="string" label="name">Author name</xt:use>
    </span>
    <br/>
    <span class="adr">
      <xt:use types="string" label="address">Address ...</xt:use>
    </span>
    <br/>
    <span class="email">
      <xt:use types="string" label="email">email ...</xt:use>
    </span>
  </p>
</xt:component>
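As a rough illustration of how a tool might tell XTiger elements apart from target language elements, the following Python sketch parses the "author" component and lists the labels of its xt:use elements. It relies only on the namespace given above. The xmlns:xt declaration added to the fragment and all variable names are assumptions made so that the sketch can run on its own; this is not how any particular authoring tool is implemented.

```python
import xml.etree.ElementTree as ET

XT_NS = "http://ns.inria.org/xtiger"  # XTiger namespace stated above

# The "author" component, with a namespace declaration added so that
# the fragment can be parsed on its own (an assumption of this sketch).
AUTHOR = """
<xt:component xmlns:xt="http://ns.inria.org/xtiger" name="author">
  <p class="vcard">
    <span class="fn"><xt:use types="string" label="name">Author name</xt:use></span>
    <br/>
    <span class="adr"><xt:use types="string" label="address">Address ...</xt:use></span>
    <br/>
    <span class="email"><xt:use types="string" label="email">email ...</xt:use></span>
  </p>
</xt:component>
"""

root = ET.fromstring(AUTHOR.strip())

# Elements in the XTiger namespace carry editing statements.
use_labels = [el.get("label") for el in root.iter(f"{{{XT_NS}}}use")]

# Everything outside that namespace belongs to the target language.
target_tags = sorted(
    {el.tag for el in root.iter() if not el.tag.startswith(f"{{{XT_NS}}}")}
)

print(use_labels)
print(target_tags)
```

A browser that does not know the XTiger namespace would see only the elements collected in target_tags, which is why the template can still be displayed.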
In XTiger, types are used to specify pieces of structure that may occur at several places in a template or in several templates. XTiger offers a few basic types and allows constructed types to be built. Constructed types are built with constructors that combine XTiger basic types and types from the target language. Two constructors are available: component and union.
XTiger offers three basic types:
number, representing integers and floating point numbers,
boolean, representing boolean values (true or false),
string, representing variable length character strings.
As XTiger always works with a target language and is used to produce
documents in that language, it may use elements and attributes from the
target language. For instance, when the target language is XHTML, elements
h1, h2, p, strong,
span, cite are target language types.
Component is a constructor that creates a new constructed
type by specifying an XML structure assembling other types, which may be
basic types, target language types and constructed types (unions and other
components). The type thus created has a name that allows it to be referred
to from other XTiger elements. This name must be unique in the template where it
is defined.
The XTiger element component is used to define a component
type:
<!ELEMENT component ANY>
<!ATTLIST component
name NMTOKEN #REQUIRED>
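name: the name of the new type. It must be unique in the template where it is defined. This attribute is mandatory.
content: the content of the component element defines the
structure of the new type. It may be any XML structure that combines
target language elements (possibly with attributes) and XTiger elements
allowed in the template body.
Example: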
<xt:component name="hello">
  <p>Hello world!</p>
</xt:component>
This example defines a type called "hello" that is an XHTML paragraph (the
target language of the template where this element occurs is XHTML)
containing the text "Hello world!". It uses a target language type
(p element).
Union is a constructor that defines a new type as a choice
between several types, each of which is a basic type, a target language
type, or a constructed type (component or other union). The new type has a
name that allows it to be used in other XTiger elements. This name must be
unique in the template where it is defined.
The XTiger element union is used to define a union type:
<!ELEMENT union EMPTY>
<!ATTLIST union
name NMTOKEN #REQUIRED
include CDATA #REQUIRED
exclude CDATA #IMPLIED>
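name: the name of the new type. It must be unique in the template where it is defined. This attribute is mandatory.
include: a list of type names separated by spaces. Each name is a basic type, an element of the target language (div, h1, h2, p,
... for XHTML), or the name attribute of a component or another union.
This attribute is used to define the options that constitute the union.
This attribute is mandatory.
exclude: a list of type names separated by spaces, such as elements of the target language (div, h1,
h2, p, ...) or the name attribute of a
component or another union. This attribute is used to exclude some
elements that are part of the union as defined by the
include attribute. This attribute is optional.
XTiger provides four predefined unions that may be used in any type definition:
anySimple: any basic type (number, string and boolean),
anyElement: any element of the target language,
anyComponent: any component,
any: any type from anySimple, anyElement and anyComponent.
Example:
<xt:union name="hello_or_p" include="hello p"/>
<xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
<xt:union name="headings1to4" include="headings" exclude="h5 h6"/>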
With these definitions, the hello_or_p union provides a
choice between the hello component and the p
element. The headings union provides a choice between all HTML
headings (h1 to h6). The headings1to4
union provides a choice between all HTML headings except h5 and
h6.
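The way include and exclude combine can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: unions named in include are expanded recursively, exclusions are applied after inclusions, and the cycle check is a precaution that this document does not discuss. The function name resolve and the UNIONS table are hypothetical.

```python
# Union definitions from the example above: name -> (include, exclude).
UNIONS = {
    "hello_or_p": ("hello p", ""),
    "headings": ("h1 h2 h3 h4 h5 h6", ""),
    "headings1to4": ("headings", "h5 h6"),
}


def resolve(name, unions, _seen=()):
    """Return the set of non-union type names that `name` stands for."""
    if name not in unions:
        # Basic type, target language element or component: kept as is.
        return {name}
    if name in _seen:
        raise ValueError(f"circular union definition: {name}")
    include, exclude = unions[name]
    seen = _seen + (name,)
    options = set()
    for item in include.split():
        options |= resolve(item, unions, seen)
    for item in exclude.split():
        options -= resolve(item, unions, seen)
    return options


print(sorted(resolve("headings1to4", UNIONS)))  # ['h1', 'h2', 'h3', 'h4']
print(sorted(resolve("hello_or_p", UNIONS)))    # ['hello', 'p']
```

Component names such as hello are left unexpanded here, since a component is a single option of the union rather than a choice.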
The definitions of components and unions presented above must appear in the head of an XTiger template, or in an XTiger library imported by a template.
Type definitions do not appear in document instances. Instead, instances include a processing instruction that refers to their template, which contains type definitions and references to libraries containing additional type definitions.
The head element collects definitions of components and unions that are
used in the template. It also refers to the libraries that contain additional
components and/or unions used in the template. This is done with the
import element.
A template always contains exactly one head element.
It may appear anywhere in the template, but it cannot be the root of the
document. In XHTML documents it is recommended to insert it in the XHTML
head element.
<!ELEMENT head ((component | union | import )*) >
<!ATTLIST head
version CDATA #REQUIRED
templateVersion CDATA #IMPLIED>
version: the version of the XTiger language used in the template. This attribute is mandatory.
templateVersion: the version of the template itself. This attribute is optional.
content: the head element may contain component, union and import
elements, but no other elements.
An XTiger library is an XML document containing definitions of constructed
types (components and/or unions). Libraries allow types to be declared only
once and to be shared between different templates. An XTiger library is
defined by the root element library. Its content model is the
same as the head element of a template. Like the
head element, a library can import other XTiger libraries using
the import element.
<!ELEMENT library ((component | union | import)*)>
<!ATTLIST library
version CDATA #REQUIRED
templateVersion CDATA #IMPLIED>
When a template or a library uses constructed types defined in a library,
that library must be explicitly imported in the template or library that uses
it by an import element.
<!ELEMENT import EMPTY >
<!ATTLIST import
src CDATA #REQUIRED>
src: the location (URI) of the library to be imported. This attribute is mandatory.
All components and unions declared in the imported library are inserted at
the position of the import element. Some imported components and
unions can be redeclared (same name attribute) in the current
head or library element. The order of
import elements in a head or library is important: a
component or a union defined in an imported library
with the same name attribute as a previous definition replaces
that previous definition.
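The replacement rule can be pictured with a short Python sketch that walks a head element in document order and expands each import at its position. It is only an illustration: the LIBRARIES table stands in for fetching the document designated by src, the file name common.xtl and the union names are invented, and the sketch assumes that a local redeclaration placed after an import wins in the same way. It does not guard against libraries that import each other.

```python
import xml.etree.ElementTree as ET

XT = "{http://ns.inria.org/xtiger}"

# Hypothetical library store: src value -> library text. A real tool
# would fetch the document that the src attribute points to.
LIBRARIES = {
    "common.xtl": """
<xt:library xmlns:xt="http://ns.inria.org/xtiger" version="0.9">
  <xt:union name="headings" include="h1 h2 h3"/>
  <xt:union name="inline" include="em strong"/>
</xt:library>""",
}

HEAD = """
<xt:head xmlns:xt="http://ns.inria.org/xtiger" version="0.9">
  <xt:union name="headings" include="h1 h2"/>
  <xt:import src="common.xtl"/>
  <xt:union name="inline" include="em strong code"/>
</xt:head>"""


def collect(container, libraries, defs=None):
    """Gather type definitions in document order; a later definition
    with the same name replaces an earlier one."""
    defs = {} if defs is None else defs
    for child in container:
        if child.tag == XT + "import":
            library = ET.fromstring(libraries[child.get("src")].strip())
            # Library definitions take the place of the import element.
            collect(library, libraries, defs)
        elif child.tag in (XT + "component", XT + "union"):
            defs[child.get("name")] = child
    return defs


defs = collect(ET.fromstring(HEAD.strip()), LIBRARIES)
print(defs["headings"].get("include"))  # library version replaces the earlier local one
print(defs["inline"].get("include"))    # local redeclaration follows the import
```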
A template contains a set of type definitions grouped in the
head element but also the skeleton of a target language document
and some XTiger statements that are used to generate instances. The latter
(skeleton and statements) is called the template body. A copy of it
serves as initial instance when a new document is created from the template.
The head element and its definitions are not copied in the
instance.
All target language elements included in the template body appear in all document instances exactly as they are in the template. Their content is preserved and cannot be modified in instances. This is the static part of the template.
There is also a dynamic part in a template, i.e. a part that can be modified under the control of XTiger elements. The XTiger elements that control the dynamic part are:
use, for including elements defined by their type,
bag, for defining free content areas,
repeat, for repeating elements of a given type,
option, for defining an optional part,
attribute, for specifying how to use target language
attributes and their values.
The use element indicates what type(s) of element can appear
at that position in an instance. Only one element of the specified type(s)
can appear at that position in an instance document.
<!ELEMENT use ANY>
<!ATTLIST use
label NMTOKEN #REQUIRED
types CDATA #REQUIRED
currentType CDATA #IMPLIED
initial (true|false) #IMPLIED "false">
label: a name given to the use element. This attribute
allows authors of instances to distinguish between the many
XTiger elements that appear in a document. It is mandatory.
types: a list of type names separated by spaces. Each name is a basic type (number, boolean,
string), an element of the target language
(h1, h2, p, ...), or the
name attribute of a component or a union. The element to
be inserted at that position in an instance must be of one of these
types, but there is no constraint on the descendants of the inserted
elements, provided they comply with the DTD or schema of the target
language, when target language elements are used. This attribute is
mandatory.
currentType: this attribute is optional.
initial: this attribute is optional. Its value is true or false; false is assumed when the attribute is not present.
content: the use element may have a content. If a content is
present, it must be of one of the types listed in the
types attribute. This content is considered as an initial
value that will be present in an instance. It may be replaced by an
instance author by another content, provided it is compliant with the
types attribute.
When a use element refers to a single component and this component is used only once in the
template, the use element may be replaced by a
component element. This is a shortcut. Its semantics are the
same as those of a use element at that position which refers to the
component.
Example 1:
<xt:use label="birthday" types="string">
  Your birth date here
</xt:use>
In this example "Your birth date here" is the content that will be displayed when a new instance is created from the template. This string can be freely replaced by an instance author by any other string, but only by a string.
Example 2:
<xt:head version="0.9">
  <xt:component name="short_date">
    <xt:use label="day" types="number">20</xt:use> /
    <xt:use label="month" types="number">10</xt:use> /
    <xt:use label="year" types="number">1981</xt:use>
  </xt:component>
  ...
</xt:head>
...
<xt:use label="birthday" types="short_date"/>
This example shows how a component can be used to make sure that the user will enter a date in the dd/mm/yyyy format.
Example 3:
<xt:use label="date" types="em short_date">
  <em>20 october 1981</em>
</xt:use>
Here, the content of the xt:use element may be either an
XHTML em element or a short_date component. Only
one of them can be inserted at that position in an instance. The current
content <em>20 october 1981</em> is a valid value,
because it is an em. It does not need to be also a
short_date.
The use element puts strong constraints on the structure
and/or content of a part of a document. It is sometimes useful to have more
flexibility. That is the role of the bag element. It indicates
that any number of elements may appear at that position in an instance
document, and it specifies the allowed types for these elements.
<!ELEMENT bag ANY>
<!ATTLIST bag
label NMTOKEN #REQUIRED
types CDATA #REQUIRED
include CDATA #IMPLIED
exclude CDATA #IMPLIED>
label: a name given to the bag element. This attribute
allows authors of instances to distinguish between the many
XTiger elements that appear in a document. It is mandatory.
types: a list of type names separated by spaces. Each name is a basic type (number, boolean,
string), an element of the target language
(h1, h2, p, ...), or the
name attribute of a component or a union. The elements to
be inserted at that position in an instance (bag children) must be of
one of these types. The types attribute is mandatory.
include: this attribute is optional.
exclude: this attribute is optional.
content: the bag element may have a content. If a content is
present, it must follow the constraints set by the types
attribute. This content is considered as an initial value that will be
present in an instance. It may be replaced by an instance author by
another content, provided it remains compliant with the
types attribute.
Example:
<p>
  <xt:bag label="para" types="string em strong code">
    This <em>paragraph</em> contains <em><strong>strings</strong></em>
    and <strong><code>any</code></strong> combination of
    <em>emphasis</em>, <code>code</code> and <strong>strong</strong>
    elements.
  </xt:bag>
</p>
It is often useful to be able to repeat a part of the document structure
several times. In this case, the structure to be repeated must first be
declared as a component. It can then be used with an xt:repeat
element in the template body.
<!ELEMENT repeat ( use+ | component )>
<!ATTLIST repeat
label NMTOKEN #REQUIRED
minOccurs CDATA #IMPLIED "0"
maxOccurs CDATA #IMPLIED "*">
label: a name given to the repeat element. This attribute
allows authors of instances to distinguish between the many
XTiger elements that appear in a document. It is mandatory.
minOccurs: the minimum number of occurrences of the repeated element. This attribute is optional; its default value is 0.
maxOccurs: the maximum number of occurrences of the repeated element. This attribute is optional; its default value is "*", which means that there is no upper limit.
content: the use element indicates (with its types
attribute) the type of the component to be repeated. Basic types are
not allowed. If the types attribute of the
use element is a list of several types, the repeated
elements may have any of these types. Several use elements
may be present in a repeat element in a template to
provide initial values to several repeated elements.
Instead of a use element, the content may be a single
component element, when this component is used only there
in the whole template. This is equivalent to defining this component in
the head of the template and putting a use
element (that refers to that virtual component) in the
repeat element.
Example:
<xt:head version="0.9">
  <xt:component name="bib_item">
    <li>
      <xt:repeat label="authors" minOccurs="1" maxOccurs="5">
        <xt:component name="author">
          <xt:use label="given_name" types="string"/>
          <xt:use label="family_name" types="string"/>
        </xt:component>
      </xt:repeat>
      ...
    </li>
  </xt:component>
  ...
</xt:head>
...
<h2>Bibliography</h2>
<ul>
  <xt:repeat label="bib_list" minOccurs="1">
    <xt:use label="entry" types="bib_item"/>
  </xt:repeat>
</ul>
This example describes a bibliography section which includes at least one
bib_item component. Each of these components contains from one to five
authors.
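The occurrence constraints of the two repeat elements in this example can be checked with a small Python sketch. The defaults come from the attribute list declaration of repeat; reading "*" as "no upper limit" and the function names bounds and count_allowed are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

XT_NS = "http://ns.inria.org/xtiger"


def bounds(repeat):
    """Return (min, max) for a repeat element; max is None when unbounded."""
    low = int(repeat.get("minOccurs", "0"))
    raw_high = repeat.get("maxOccurs", "*")
    high = None if raw_high == "*" else int(raw_high)
    return low, high


def count_allowed(repeat, count):
    """Tell whether `count` repeated elements satisfy the constraints."""
    low, high = bounds(repeat)
    return count >= low and (high is None or count <= high)


# The two repeat elements of the example, reduced to their attributes.
authors = ET.fromstring(
    f'<xt:repeat xmlns:xt="{XT_NS}" label="authors" minOccurs="1" maxOccurs="5"/>'
)
bib_list = ET.fromstring(
    f'<xt:repeat xmlns:xt="{XT_NS}" label="bib_list" minOccurs="1"/>'
)

print(bounds(authors), count_allowed(authors, 6))     # (1, 5) False
print(bounds(bib_list), count_allowed(bib_list, 40))  # (1, None) True
```

An editing tool could use such a check to decide when to offer or disable the commands that add or remove a repeated element.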
It is often useful to indicate that some part of the document is optional.
This is done with the option element. This element is equivalent
to an xt:repeat element with maxOccurs="1" and
minOccurs="0".
<!ELEMENT option ANY>
<!ATTLIST option
label NMTOKEN #REQUIRED
checked (true|false) #IMPLIED "true">
label: a name given to the option element. This attribute
allows authors of instances to distinguish between the many
XTiger elements that appear in a document. It is mandatory.
checked: this attribute indicates whether the optional element is present (true)
or not (false). If this attribute is not present, the
element is there (true). This attribute is usually not
present in a template. It is used in document instances.
Example:
<xt:option label="bibliography">
  <xt:component name="biblio">
    <h2>Bibliography</h2>
    <ul>
      <xt:repeat label="bib_list" minOccurs="1">
        <xt:use label="entry" types="bib_item"/>
      </xt:repeat>
    </ul>
  </xt:component>
</xt:option>
This example defines a component called biblio and makes it optional at the position where it appears in the template body.
XTiger provides a way to control attributes from the target language. This
is achieved by inserting an attribute element as a child of a
target language element. The attribute element makes an
attribute of its parent element mandatory, fixed, or prohibited. If several
attributes of a single target language element have to be controlled, several
attribute elements must be used, one for each of these
attributes.
<!ELEMENT attribute EMPTY>
<!ATTLIST attribute
name NMTOKEN #REQUIRED
type (number|string|list) #IMPLIED "string"
use (required|optional|prohibited) #IMPLIED "required"
default CDATA #IMPLIED
fixed CDATA #IMPLIED
values CDATA #IMPLIED>
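name: the name of the attribute of the parent element that is controlled. This attribute is mandatory.
type: the type of the attribute value: number, string or list. If the type attribute is not present, the default type "string"
is assumed.
use: indicates whether the attribute is required, optional or prohibited. If use is not present, the default value "required"
is assumed.
default: the default value of the attribute. default is optional.
fixed: a fixed value for the attribute. fixed is optional.
values: the list of allowed values for the attribute, separated by spaces. values is optional.
Example: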
<div>
  <xt:attribute name="class" use="optional"
                values="comment example info" default="comment"/>
  ...
</div>
This example shows an XHTML div element whose
class attribute is made optional with value limited to the three
options comment, example and info. The
default value is set to comment.
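One possible reading of these constraints is sketched below in Python. The sketch is an interpretation, not a normative algorithm: it assumes that the default value applies when an optional attribute is absent, that a fixed value must be matched exactly, and that values is a list separated by spaces. It ignores the type attribute. The function name check_attribute and the dictionary representation are hypothetical.

```python
def check_attribute(decl, value):
    """Check `value` (None when the attribute is absent) against the
    settings of an xt:attribute element given as a dict.
    Returns the effective value or raises ValueError."""
    use = decl.get("use", "required")  # default stated in the declaration
    if use == "prohibited":
        if value is not None:
            raise ValueError(f"{decl['name']} is prohibited")
        return None
    if value is None:
        if use == "required":
            raise ValueError(f"{decl['name']} is required")
        # Assumption: an absent optional attribute takes the default.
        return decl.get("default")
    if "fixed" in decl and value != decl["fixed"]:
        raise ValueError(f"{decl['name']} must be {decl['fixed']!r}")
    if "values" in decl and value not in decl["values"].split():
        raise ValueError(f"{value!r} is not allowed for {decl['name']}")
    return value


# Settings of the xt:attribute element in the example above.
CLASS_DECL = {
    "name": "class",
    "use": "optional",
    "values": "comment example info",
    "default": "comment",
}

print(check_attribute(CLASS_DECL, None))       # comment
print(check_attribute(CLASS_DECL, "example"))  # example
try:
    check_attribute(CLASS_DECL, "warning")
except ValueError as err:
    print("rejected:", err)
```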
When working with XTiger templates, three different kinds of resources are involved:
templates, stored in files with the .xtd extension,
libraries, stored in files with the .xtl extension,
document instances.
When a user creates a document from a template, the new document
instance is created as a copy of the template. The
xt:head element with its type definitions is kept by the
authoring tool, but it is not copied in the document instance.
The template is linked to the new instance by a processing
instruction:
<?xtiger template="URI/of/the/template.xtd" version="0.9"
templateVersion="xx" ?>
which is inserted at the beginning of the instance, in the same way CSS style
sheets are linked to XML documents. With this link, the authoring tool can
find all the type definitions needed during editing sessions. All other
XTiger elements (use, bag, repeat,
option, attribute) as well as all target language
elements are kept in the copy that constitutes the initial instance. XTiger
types that appear in these elements are replaced by references to their
definition in the template (actually, by references to a parsed
representation of types in core memory which is more compact).
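The creation of an initial instance can be sketched in Python as follows. This is a simplified illustration under several assumptions: the template is a small invented document without an XHTML namespace, the optional templateVersion pseudo-attribute is omitted, and the replacement of type names by references to parsed definitions is not modelled. The function name new_instance is hypothetical.

```python
import xml.etree.ElementTree as ET

XT_NS = "http://ns.inria.org/xtiger"
ET.register_namespace("xt", XT_NS)  # keep the xt: prefix when serializing

# A small invented template, used only for this sketch.
TEMPLATE = """
<html xmlns:xt="http://ns.inria.org/xtiger">
  <head>
    <title>Report</title>
    <xt:head version="0.9"><xt:union name="inline" include="em strong"/></xt:head>
  </head>
  <body>
    <p><xt:use label="summary" types="string">Summary here</xt:use></p>
  </body>
</html>"""


def new_instance(template_text, template_uri):
    """Copy a template, drop its xt:head and link the copy to the template."""
    root = ET.fromstring(template_text.strip())
    head = root.find(f".//{{{XT_NS}}}head")
    version = head.get("version")
    # ElementTree removes a child through its parent, so look for it.
    for parent in root.iter():
        if head in list(parent):
            parent.remove(head)
            break
    pi = f'<?xtiger template="{template_uri}" version="{version}"?>'
    return pi + "\n" + ET.tostring(root, encoding="unicode")


instance = new_instance(TEMPLATE, "URI/of/the/template.xtd")
print(instance)
```

The xt:use element and all target language elements survive in the copy, while the type definitions stay in the template that the processing instruction designates.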
Francesc Campoy Flores, Vincent Quint, Irène Vatton, Templates, Microformats and Structured Editing, Proceedings of DocEng'06, ACM Symposium on Document Engineering, 10-13 October 2006, Amsterdam, The Netherlands, pp. 188-197. This research paper presents an early version of the XTiger language.