Authors: Émilien Kia, Vincent Quint, Irène Vatton -- INRIA Rhône-Alpes
Version: 1.0 - Date: 2009-12-15
Abstract
This document presents XTiger, an XML language for specifying document templates. XTiger templates are intended to guide an editing tool for building documents that follow a predefined model. The XTiger language is used jointly with another XML language, typically XHTML, which is called the target language. A template is a target language document where XTiger elements indicate how the document can be edited and still conform with the model. XTiger is versatile enough to represent templates that capture the overall structure of large documents as well as the fine details of a microformat.
Most popular XML document formats used on the web, such as XHTML or SVG, are very flexible: they allow many different types of documents to be represented. This is an advantage in a wide space such as the Web, as a broad range of documents can be handled consistently. XHTML, for instance, is used to represent not only traditional Web pages, but also complex technical documents, sophisticated e-commerce forms or rich media slides, and all these documents can be accessed with a single browser. But this flexibility makes document authoring a complex task. When producing a specific type of document, an author is faced with all the possibilities provided by XHTML, and she has to make a number of difficult decisions. If multiple similar documents have to be produced consistently, for a particular use or for some specific application, authors have to make a consistent use of the XHTML document format, which has proven to be very difficult.
XTiger (eXtensible Templates for Interactive Guided Editing of Resources) tackles this problem by defining how the document format (XHTML, for instance) has to be used for representing a certain type of document. To do so, XTiger relies on the notion of a template. A template is a skeleton representing a given type of document, expressed in the format of the final documents to be produced (XHTML, for instance). The format of the final documents is called the target language and must be an XML language. The skeleton contains some statements, expressed in the XTiger language, that specify how this minimal document can evolve and grow, while keeping in line with the intended type of the final documents. Some parts of the template may be frozen, if they have to appear as is in the final document. Some parts may be modified when producing the final document, some others may be added either freely or under some constraints. It is the role of the XTiger language to specify these possibilities and constraints.
When talking about XTiger, it is important to make a distinction between two kinds of documents: a template and its instances. A template is the skeleton presented above, containing XTiger elements and defining a certain type of document. It is the seed used to produce a series of documents, called instances, that are derived from the template by following the statements expressed by the embedded XTiger elements. In the rest of the paper, we use the term instance instead of final document.
The statements expressed by XTiger elements are supposed to be interpreted by a document authoring tool. Starting from a template, the tool helps the user to follow the XTiger statements, thus ensuring that the instance being edited will stick to the document type specified by the template.
XTiger templates may be used to specify the overall structure of a large document, as well as the fine details of some of its parts. This latter feature makes it possible, in particular, to express how microformats are used in large documents.
XTiger is not a document type like XHTML, SVG or MathML. It is always used in combination with a target language, which is a document type. The XTiger elements interspersed in a template are not supposed to be displayed in the same way as elements of the target language. Instead, the role of these XTiger elements is to specify what elements and attributes of the target language must, should or could be present at these positions in the document instance. That is the core of the language, which specifies the structure and (parts of) the content of documents.
This functionality is complemented with additional features that make the language easier to use. For instance, structure fragments can be defined only once and used at several places, in one or several templates. This facilitates a modular construction of templates, by sharing reusable pieces of structure stored in libraries.
As XTiger is used to describe structures, and because it is always mixed with XML languages, it is itself an XML language. XML namespaces are used to distinguish between XTiger elements and elements from the target language. This distinction allows existing web browsers to simply ignore the XTiger elements and to display a template as if these elements were not present.
The XTiger namespace is http://ns.inria.org/xtiger. For the sake of readability, all examples in this document use the prefix xt: for XTiger element names, while names from the target language are not prefixed.
The target language used in the following examples is XHTML, but it could be any other XML language. The first example below is a piece of XTiger language that defines a component called "author". This component consists of an XHTML paragraph (p element) that contains a few XHTML span and br elements, with classes from the hCard microformat. The "author" component can be used to generate the XHTML structure representing an author in document instances, following the hCard microformat.
<xt:component name="author">
  <p class="vcard">
    <span class="fn">
      <xt:use types="string" label="name">Author name</xt:use>
    </span>
    <br/>
    <span class="adr">
      <xt:use types="string" label="address">Address ...</xt:use>
    </span>
    <br/>
    <span class="email">
      <xt:use types="string" label="email">email ...</xt:use>
    </span>
  </p>
</xt:component>
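As an illustration of how namespaces separate XTiger elements from target language elements, the following Python sketch parses the "author" component with the standard xml.etree.ElementTree module and sorts element names by namespace. The helper name split_by_namespace is hypothetical, and the target language elements are left without a namespace, as in the example above.

```python
import xml.etree.ElementTree as ET

XT_NS = "http://ns.inria.org/xtiger"

# The "author" component from the example, with the xt: prefix declared.
TEMPLATE = """
<xt:component xmlns:xt="http://ns.inria.org/xtiger" name="author">
  <p class="vcard">
    <span class="fn"><xt:use types="string" label="name">Author name</xt:use></span>
    <br/>
    <span class="adr"><xt:use types="string" label="address">Address ...</xt:use></span>
    <br/>
    <span class="email"><xt:use types="string" label="email">email ...</xt:use></span>
  </p>
</xt:component>
"""


def split_by_namespace(root):
    """Return (XTiger local names, target language tag names) in document order."""
    prefix = "{" + XT_NS + "}"
    xtiger, target = [], []
    for element in root.iter():
        if element.tag.startswith(prefix):
            xtiger.append(element.tag[len(prefix):])
        else:
            target.append(element.tag)
    return xtiger, target


root = ET.fromstring(TEMPLATE)
xtiger_names, target_names = split_by_namespace(root)
use_labels = [use.get("label") for use in root.iter("{%s}use" % XT_NS)]
```

A tool that ignores the XTiger names and keeps only the target names sees the plain XHTML skeleton, which is what allows a browser to display a template.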
In XTiger, types are used to specify pieces of structure that may occur at several places in a template or in several templates. XTiger offers a few basic types and allows constructed types to be built. Constructed types are built with constructors that combine XTiger basic types and types from the target language. Two constructors are available: component and union.
XTiger offers three basic types:
- number, representing integers and floating point numbers,
- boolean, representing boolean values (true or false),
- string, representing variable length character strings.
As XTiger always works with a target language and is used to produce documents in that language, it may use elements and attributes from the target language. For instance, when the target language is XHTML, elements h1, h2, p, strong, span, cite are target language types.
Component is a constructor that creates a new constructed type by specifying an XML structure assembling other types, which may be basic types, target language types and constructed types (unions and other components). The type thus created has a name that allows it to be referred to from other XTiger elements. This name must be unique in the template where it is defined.
The XTiger element component is used to define a component type:
<!ELEMENT component ANY>
<!ATTLIST component
name NMTOKEN #REQUIRED>
- name: the name of the new type. This attribute is mandatory.
- Content: the content of the component element defines the structure of the new type. It may be any XML structure that combines target language elements (possibly with attributes) and XTiger elements allowed in the template body.
Example:
<xt:component name="hello">
  <p>Hello world!</p>
</xt:component>
This example defines a type called "hello" that is an XHTML paragraph (the target language of the template where this element occurs is XHTML) containing the text "Hello world!". It uses a target language type (the p element).
Union is a constructor that defines a new type as a choice between several types, each of which is a basic type, a target language type, or a constructed type (component or other union). The new type has a name that allows it to be used in other XTiger elements. This name must be unique in the template where it is defined.
The XTiger element union is used to define a union type:
<!ELEMENT union EMPTY>
<!ATTLIST union
name NMTOKEN #REQUIRED
include CDATA #REQUIRED
exclude CDATA #IMPLIED>
- name: the name of the new type. This attribute is mandatory.
- include: a space-separated list of type names. Each name is a basic type, an element of the target language (div, h1, h2, p, ... for XHTML), or the name attribute of a component or another union. This attribute is used to define the options that constitute the union. This attribute is mandatory.
- exclude: a space-separated list of type names. Each name is an element of the target language (div, h1, h2, p, ...), or the name attribute of a component or another union. This attribute is used to exclude some elements that are part of the union as defined by the include attribute. This attribute is optional.
XTiger provides four predefined unions that may be used in any type definition:
- anySimple: any basic type (number, string and boolean),
- anyElement: any element of the target language,
- anyComponent: any component,
- any: any type covered by anySimple, anyElement and anyComponent.
Example:
<xt:union name="hello_or_p" include="hello p"/>
<xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
<xt:union name="headings1to4" include="headings" exclude="h5 h6"/>
With these definitions, the hello_or_p union provides a choice between the hello component and the p element. The headings union provides a choice between all HTML headings (h1 to h6). The headings1to4 union provides a choice between all HTML headings except h5 and h6.
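The way include and exclude combine can be sketched in Python as follows. This is only an illustration of the three unions above, not a reference implementation: the function name expand is hypothetical, and the guard against circular union definitions is a defensive assumption that the text does not state.

```python
import xml.etree.ElementTree as ET

XT = "{http://ns.inria.org/xtiger}"

HEAD = """<xt:head xmlns:xt="http://ns.inria.org/xtiger" version="1.0">
  <xt:union name="hello_or_p" include="hello p"/>
  <xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
  <xt:union name="headings1to4" include="headings" exclude="h5 h6"/>
</xt:head>"""

# Map each union name to its (include, exclude) attribute values.
unions = {
    union.get("name"): (union.get("include"), union.get("exclude", ""))
    for union in ET.fromstring(HEAD).iter(XT + "union")
}


def expand(name, unions, seen=()):
    """Expand a type name into the set of non-union type names it stands for."""
    if name not in unions:
        return {name}  # basic type, target language element or component
    if name in seen:
        raise ValueError("circular union definition: " + name)
    include, exclude = unions[name]
    result = set()
    for item in include.split():
        result |= expand(item, unions, seen + (name,))
    for item in exclude.split():
        result -= expand(item, unions, seen + (name,))
    return result


headings_1_to_4 = expand("headings1to4", unions)
hello_or_p = expand("hello_or_p", unions)
```

The predefined unions (anySimple, anyElement, anyComponent, any) are not handled here; they would need knowledge of the target language and of the declared components.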
The definitions of components and unions presented above must appear in the head of an XTiger template, or in an XTiger library imported by a template.
Type definitions do not appear in document instances. Instead, instances include a processing instruction that refers to their template, which contains type definitions and references to libraries containing additional type definitions.
The head element collects the definitions of components and unions that are used in the template. It also refers to the libraries that contain additional components and/or unions used in the template. This is done with the import element.
A template always contains exactly one head element. It may appear anywhere in the template, but it cannot be the root of the document. In XHTML documents it is recommended to insert it in the XHTML head element.
<!ELEMENT head ((component | union | import)*)>
<!ATTLIST head
version CDATA #REQUIRED
templateVersion CDATA #IMPLIED>
- version: this attribute is mandatory.
- templateVersion: this attribute is optional.
- Content: the head element may contain component, union and import elements, but no other elements.
An XTiger library is an XML document containing definitions of constructed types (components and/or unions). Libraries allow types to be declared only once and to be shared between different templates. An XTiger library is defined by the root element library. Its content model is the same as that of the head element of a template. Like the head element, a library can import other XTiger libraries using the import element.
<!ELEMENT library ((component | union | import)*)>
<!ATTLIST library
version CDATA #REQUIRED
templateVersion CDATA #IMPLIED>
When a template or a library uses constructed types defined in a library, that library must be explicitly imported, by an import element, in the template or library that uses it.
<!ELEMENT import EMPTY>
<!ATTLIST import
src CDATA #REQUIRED>
- src: refers to the library to be imported. This attribute is mandatory.
All components and unions declared in the imported library are inserted at the position of the import element. Some imported components and unions can be redeclared (same name attribute) in the current head or library element. The order of import elements in a head or library is important: a component or a union defined in an imported library with the same name attribute as a previous definition replaces that previous definition.
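The effect of import order can be sketched in Python as a walk over the head in document order, where the latest definition of a name wins. The LIBRARIES dictionary is a hypothetical in-memory stand-in for fetching a library from its src value, the library name common.xtl is invented for the illustration, and the sketch does not guard against libraries that import each other.

```python
import xml.etree.ElementTree as ET

XT = "{http://ns.inria.org/xtiger}"

# Hypothetical stand-in for loading libraries by their src value.
LIBRARIES = {
    "common.xtl": """<xt:library xmlns:xt="http://ns.inria.org/xtiger" version="1.0">
      <xt:component name="author"><p>library author</p></xt:component>
      <xt:union name="headings" include="h1 h2 h3"/>
    </xt:library>""",
}


def collect_definitions(container, load_library, definitions=None):
    """Gather components and unions in document order; later definitions win."""
    if definitions is None:
        definitions = {}
    for child in container:
        if child.tag == XT + "import":
            # Imported definitions are processed at the position of the import.
            library = ET.fromstring(load_library(child.get("src")))
            collect_definitions(library, load_library, definitions)
        elif child.tag in (XT + "component", XT + "union"):
            definitions[child.get("name")] = child
    return definitions


HEAD = """<xt:head xmlns:xt="http://ns.inria.org/xtiger" version="1.0">
  <xt:import src="common.xtl"/>
  <xt:component name="author"><p>local author</p></xt:component>
</xt:head>"""

definitions = collect_definitions(ET.fromstring(HEAD), LIBRARIES.__getitem__)
```

Here the local "author" component follows the import, so it redeclares the imported one. If the import came after the local definition, the library version would replace it.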
A template contains a set of type definitions grouped in the head element but also the skeleton of a target language document and some XTiger statements that are used to generate instances. The latter (skeleton and statements) is called the template body. A copy of it serves as the initial instance when a new document is created from the template. The head element and its definitions are not copied in the instance.
All target language elements included in the template body appear in all document instances exactly as they are in the template. Their content is preserved and cannot be modified in instances. This is the static part of the template.
There is also a dynamic part in a template, i.e. a part that can be modified under the control of XTiger elements. The XTiger elements that control the dynamic part are:
- use, for including elements defined by their type,
- bag, for defining free content areas,
- repeat, for repeating elements of a given type,
- attribute, for specifying how to use target language attributes and their values.
The use element indicates what type(s) of element can appear at that position in an instance. Only one element of the specified type(s) can appear at that position in an instance document.
<!ELEMENT use ANY>
<!ATTLIST use
label NMTOKEN #IMPLIED
types CDATA #REQUIRED
option (set|unset) #IMPLIED
currentType CDATA #IMPLIED
initial (true) #IMPLIED>
- label: identifies the use element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
- types: a space-separated list of type names. Each name is a basic type (number, boolean, string), an element of the target language (h1, h2, p, ...), or the name attribute of a component or a union. The element to be inserted at that position in an instance must be of one of these types, but there is no constraint on the descendants of the inserted elements, provided they comply with the DTD or schema of the target language, when target language elements are used. This attribute is mandatory. Recursion is forbidden. For example, when a use element is part of a component, it cannot refer to that component.
- option: indicates that the use element is optional. The value is set when the content is generated and unset when it is omitted. Usually in a template the value is set.
- currentType: this attribute is optional.
- initial: this attribute is optional.
- Content: the use element may have a content. If a content is present in the template, it must be of one of the types listed in the types attribute. This content is considered as an initial value that will be present in an instance. An instance author may replace it with another content, provided it is compliant with the types attribute.
Even if a component is used only once in the template, it must be declared within the template head and a use element will refer to it.
Example 1:
<xt:use label="birthday" types="string">
Your birth date here
</xt:use>
In this example "Your birth date here" is the content that will be displayed when a new instance is created from the template. This string can be freely replaced by an instance author by any other string, but only by a string.
Example 2:
<xt:head version="1.0">
  <xt:component name="short_date">
    <xt:use label="day" types="number">20</xt:use> /
    <xt:use label="month" types="number">10</xt:use> /
    <xt:use label="year" types="number">1981</xt:use>
  </xt:component>
  ...
</xt:head>
...
<xt:use label="birthday" types="short_date"/>
This example shows how a component can be used to make sure that the user will enter a date in the dd/mm/yyyy format.
<xt:use label="date" types="em short_date">
<em>20 october 1981</em>
</xt:use>
Here, the content of the xt:use element may be either an XHTML em element or a short_date component. Only one of them can be inserted at that position in an instance. The current content <em>20 october 1981</em> is a valid value, because it is an em. It does not need to be also a short_date.
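The rule that forbids recursion between a component and the use elements it contains could be checked by a tool roughly as follows. This Python sketch is an assumption about how such a check might look: the function name find_recursive_components and the component names in the sample are hypothetical, it also reports indirect cycles (a refers to b, which refers back to a), and it does not expand unions listed in types.

```python
import xml.etree.ElementTree as ET

XT = "{http://ns.inria.org/xtiger}"


def find_recursive_components(head):
    """Return the sorted names of components that refer to themselves."""
    components = {c.get("name"): c for c in head.iter(XT + "component")}
    # For each component, the component names listed by its use elements.
    references = {
        name: {
            type_name
            for use in component.iter(XT + "use")
            for type_name in use.get("types", "").split()
            if type_name in components
        }
        for name, component in components.items()
    }

    def reaches(start, current, visited):
        # Depth-first search for a path leading back to the start component.
        for following in references[current]:
            if following == start:
                return True
            if following not in visited:
                visited.add(following)
                if reaches(start, following, visited):
                    return True
        return False

    return sorted(name for name in components if reaches(name, name, set()))


HEAD = ET.fromstring("""<xt:head xmlns:xt="http://ns.inria.org/xtiger" version="1.0">
  <xt:component name="short_date"><xt:use label="day" types="number"/></xt:component>
  <xt:component name="section"><div><xt:use label="sub" types="p section"/></div></xt:component>
  <xt:component name="a"><xt:use label="x" types="b"/></xt:component>
  <xt:component name="b"><xt:use label="y" types="a"/></xt:component>
</xt:head>""")

recursive = find_recursive_components(HEAD)
```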
The use element puts strong constraints on the structure and/or content of a part of a document. It is sometimes useful to have more flexibility. That is the role of the bag element. It indicates that any number of a set of elements may appear at that position in an instance document, and it specifies the allowed types for these elements.
<!ELEMENT bag ANY>
<!ATTLIST bag
label NMTOKEN #REQUIRED
types CDATA #REQUIRED
include CDATA #IMPLIED
exclude CDATA #IMPLIED>
- label: identifies the bag element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
- types: a space-separated list of type names. Each name is a basic type (number, boolean, string), an element of the target language (h1, h2, p, ...), or the name attribute of a component or a union. The elements to be inserted at the top level of the bag in an instance (bag children) must be of one of these types. The types attribute is mandatory. By default, all descendant element types allowed by the target language can be inserted into bag children.
- include: this attribute is used to extend the list of allowed descendant element types that could be inserted into bag children. This attribute is optional.
- exclude: this attribute is used to exclude some element types from the possible set of descendant element types. This attribute is optional.
- Content: the bag element may have a content. If a content is present, it must follow the constraints set by the types attribute. This content is considered as an initial value that will be present in an instance. An instance author may replace it with another content, provided it remains compliant with the types attribute.
Example 1:
<div>
  <xt:bag label="sect" types="p h2 h3 h4 div">
    <p>
      This <em>paragraph</em> contains <em><strong>strings</strong></em>
      and <strong><code>any</code></strong> combination of <em>emphasis</em>,
      <code>code</code> and <strong>strong</strong> elements.
    </p>
  </xt:bag>
</div>
Many occurrences of p, h2, h3, h4, and div elements may appear at the top level of the bag, and only these elements. There is no constraint on the order of these elements and, as for the use element, no constraint is specified on the content of these elements.
By default the bag element will generate this initial paragraph; em and strong elements are allowed by the target language.
Example 2:
<div>
  <xt:bag label="sect" types="p h2 div" include="author" exclude="h2">
    <h2>Title...</h2>
    <p>
      This <em>paragraph</em> contains <em><strong>strings</strong></em>
      and <strong><code>any</code></strong> combination of <em>emphasis</em>,
      <code>code</code> and <strong>strong</strong> elements.
    </p>
  </xt:bag>
</div>
In example 2, h3 and h4 elements cannot appear at the top level of the bag; only p, h2, and div elements are allowed. The include attribute says that the author component can be inserted within the bag but not at the top level. The exclude attribute says that the h2 element can be inserted only at the top level.
Example 3:
<div>
  <xt:bag label="sect" types="anyElement">
    <p>
      This <em>paragraph</em> contains <em><strong>strings</strong></em>
      and <strong><code>any</code></strong> combination of <em>emphasis</em>,
      <code>code</code> and <strong>strong</strong> elements.
    </p>
  </xt:bag>
</div>
In example 3, any element of the target language can appear at any level of the bag. Only the target language constraints apply.
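The interplay of types, include and exclude in example 2 can be sketched in Python as below. This is a simplified reading of the rules, not the authors' algorithm: the function name bag_allows is hypothetical, TARGET_TYPES is a small invented subset standing in for the elements that XHTML would allow, and union names (including predefined ones such as anyElement) are assumed to have been expanded beforehand.

```python
import xml.etree.ElementTree as ET

XT = "{http://ns.inria.org/xtiger}"

# Hypothetical subset of target language elements, for illustration only.
TARGET_TYPES = {"p", "h2", "h3", "h4", "div", "em", "strong", "code"}


def bag_allows(bag, type_name, top_level, target_types):
    """Tell whether type_name may be inserted in the bag at the given level."""
    if top_level:
        # Bag children must be of one of the types listed in "types".
        return type_name in bag.get("types", "").split()
    # Descendants: everything the target language allows, plus "include",
    # minus "exclude".
    allowed = set(target_types) | set(bag.get("include", "").split())
    allowed -= set(bag.get("exclude", "").split())
    return type_name in allowed


bag = ET.fromstring(
    '<xt:bag xmlns:xt="http://ns.inria.org/xtiger" label="sect" '
    'types="p h2 div" include="author" exclude="h2"/>'
)

top_level_h2 = bag_allows(bag, "h2", True, TARGET_TYPES)
nested_h2 = bag_allows(bag, "h2", False, TARGET_TYPES)
nested_author = bag_allows(bag, "author", False, TARGET_TYPES)
```

Whether a given descendant is also acceptable to its actual parent element remains a matter for the target language's DTD or schema, which this sketch does not model.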
It is often useful to be able to repeat a piece of the document structure (or an alternative of pieces) several times. In this case, the structure to be repeated must first be declared as a component. It can then be used with a repeat element in the template body around a use element that refers to the component(s).
<!ELEMENT repeat (use+)>
<!ATTLIST repeat
label NMTOKEN #REQUIRED
minOccurs CDATA "1"
maxOccurs CDATA "*">
- label: identifies the repeat element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
- minOccurs: the minimum number of repeated elements. The default value is "1".
- maxOccurs: the maximum number of repeated elements. The default value is "*".
- Content: a use element indicates (with its types attribute) the type of the component to be repeated. Basic types are not allowed. The use element cannot have an option attribute, as the option is equivalent to minOccurs="0". If the types attribute of the use element is a list of several types, the repeated elements may have any of these types. Several use elements may be present in a repeat element in a template to provide initial values to several repeated elements.
Example:
<xt:head version="1.0">
  <xt:component name="author">
    <xt:use label="given_name" types="string"/>
    <xt:use label="family_name" types="string"/>
  </xt:component>
  <xt:component name="bib_item">
    <li>
      <xt:repeat label="authors" minOccurs="1" maxOccurs="5">
        <xt:use types="author"/>
      </xt:repeat>
      ...
    </li>
  </xt:component>
  ...
</xt:head>
...
<h2>Bibliography</h2>
<ul>
  <xt:repeat label="bib_list" minOccurs="1">
    <xt:use label="entry" types="bib_item"/>
  </xt:repeat>
</ul>
This example describes a bibliography section which includes at least one bib_item element. Each of these elements may contain one to five authors.
The document bibliography could also be defined with a bag element:
<h2>Bibliography</h2>
<ul>
  <xt:bag label="bib_list" types="bib_item"/>
</ul>
In that case, the list of bib_item elements could be empty. It is equivalent to a repeat with minOccurs="0".
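How a tool might check the number of repeated elements against minOccurs and maxOccurs can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the default maxOccurs value "*" means that there is no upper limit.

```python
import xml.etree.ElementTree as ET


def occurrence_bounds(repeat):
    """Return (minimum, maximum); maximum is None when it is unbounded."""
    minimum = int(repeat.get("minOccurs", "1"))
    raw_maximum = repeat.get("maxOccurs", "*")
    maximum = None if raw_maximum == "*" else int(raw_maximum)
    return minimum, maximum


def count_is_allowed(repeat, count):
    """Tell whether an instance may hold `count` repeated elements."""
    minimum, maximum = occurrence_bounds(repeat)
    return count >= minimum and (maximum is None or count <= maximum)


authors = ET.fromstring(
    '<xt:repeat xmlns:xt="http://ns.inria.org/xtiger" '
    'label="authors" minOccurs="1" maxOccurs="5"/>'
)
bib_list = ET.fromstring(
    '<xt:repeat xmlns:xt="http://ns.inria.org/xtiger" '
    'label="bib_list" minOccurs="1"/>'
)

authors_bounds = occurrence_bounds(authors)
bib_list_bounds = occurrence_bounds(bib_list)
```

With these bounds, an editor could disable its "add author" command once five authors are present and its "delete" command when only one is left.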
XTiger provides a way to control attributes from the target language. This is achieved by inserting an attribute element as a child of a target language element. The attribute element makes an attribute of its parent element mandatory, fixed, or prohibited. If several attributes of a single target language element have to be controlled, several attribute elements must be used, one for each of these attributes.
<!ELEMENT attribute EMPTY>
<!ATTLIST attribute
name NMTOKEN #REQUIRED
type (number|string|list) "string"
use (required|optional|prohibited) "required"
default CDATA #IMPLIED
fixed CDATA #IMPLIED
values CDATA #IMPLIED>
- name: the name of the controlled attribute of the parent element. This attribute is mandatory.
- type: if the type attribute is not present, the default type "string" is assumed.
- use: if use is not present, the default value "required" is assumed.
- default: default is optional.
- fixed: fixed is optional.
- values: values is optional.
Example:
<div>
  <xt:attribute name="class" use="optional" values="comment example info" default="comment"/>
  ...
</div>
This example shows an XHTML div element whose class attribute is made optional, with its value limited to the three options comment, example and info. The default value is set to comment.
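A possible check of an instance attribute against such a declaration is sketched below in Python. The semantics are assumptions drawn from the example rather than from a normative definition: values is read as a space-separated list of allowed values, fixed as the only accepted value, and a missing attribute is accepted unless use is "required". The function name attribute_is_valid is hypothetical, and the default and type attributes are not interpreted.

```python
import xml.etree.ElementTree as ET

XT = "{http://ns.inria.org/xtiger}"

TEMPLATE_DIV = """<div xmlns:xt="http://ns.inria.org/xtiger">
  <xt:attribute name="class" use="optional"
                values="comment example info" default="comment"/>
</div>"""

rule = ET.fromstring(TEMPLATE_DIV).find(XT + "attribute")


def attribute_is_valid(rule, value):
    """Check an attribute value (None when absent) against an xt:attribute."""
    use = rule.get("use", "required")  # "required" is the declared default
    if value is None:
        return use != "required"
    if use == "prohibited":
        return False
    fixed = rule.get("fixed")
    if fixed is not None and value != fixed:
        return False
    allowed = rule.get("values")
    if allowed is not None and value not in allowed.split():
        return False
    return True


absent_ok = attribute_is_valid(rule, None)
info_ok = attribute_is_valid(rule, "info")
warning_ok = attribute_is_valid(rule, "warning")
```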
When working with XTiger templates, three different kinds of resources are involved:
- templates, stored in files with the .xtd extension,
- libraries, stored in files with the .xtl extension,
- document instances.
When a user creates a document from a template, the new document instance is created as a copy of the template. The xt:head element with its type definitions is kept by the authoring tool, but it is not copied into the document instance. The template is linked to the new instance by a processing instruction:
<?xtiger template="URI/of/the/template.xtd" version="1.0" templateVersion="xx" ?>
which is inserted at the beginning of the instance, in the same way CSS style sheets are linked to XML documents. With this link, the authoring tool can find all the type definitions needed during editing sessions. All other XTiger elements (use, bag, repeat, attribute) as well as all target language elements are kept in the copy that constitutes the initial instance. XTiger types that appear in these elements are replaced by references to their definition in the template (actually, by references to a parsed representation of types in core memory which is more compact).
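The creation of an initial instance can be sketched in Python as follows: the template is copied, xt:head is dropped, and the processing instruction is placed in front. The function name create_instance, the file name card.xtd and the sample template are hypothetical. The sketch leaves the target language elements without a namespace to keep the serialized output simple, and it does not model the in-memory type references mentioned above.

```python
import xml.etree.ElementTree as ET

XT_NS = "http://ns.inria.org/xtiger"
XT = "{" + XT_NS + "}"

TEMPLATE = """<html xmlns:xt="http://ns.inria.org/xtiger">
  <head>
    <title>Card</title>
    <xt:head version="1.0" templateVersion="0.3">
      <xt:component name="hello"><p>Hello world!</p></xt:component>
    </xt:head>
  </head>
  <body>
    <xt:use label="greeting" types="hello"/>
  </body>
</html>"""


def create_instance(template_text, template_uri):
    """Return the text of an initial instance derived from a template."""
    root = ET.fromstring(template_text)
    head = root.find(".//" + XT + "head")  # never the root of the document
    version = head.get("version")
    template_version = head.get("templateVersion")
    # Remove xt:head from whichever element contains it.
    for parent in root.iter():
        if head in list(parent):
            parent.remove(head)
            break
    pi = '<?xtiger template="%s" version="%s"' % (template_uri, version)
    if template_version is not None:
        pi += ' templateVersion="%s"' % template_version
    pi += "?>"
    ET.register_namespace("xt", XT_NS)  # serialize XTiger names with xt:
    return pi + "\n" + ET.tostring(root, encoding="unicode")


instance = create_instance(TEMPLATE, "card.xtd")
```

The xt:use element survives in the instance, so a tool that follows the processing instruction back to card.xtd can still look up the "hello" component during later editing sessions.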
Francesc Campoy Flores, Vincent Quint, Irène Vatton, Templates, Microformats and Structured Editing, Proceedings of DocEng'06, ACM Symposium on Document Engineering, 10-13 October 2006, Amsterdam, The Netherlands, pp. 188-197. This research paper presents an early version of the XTiger language.