Using the HTML.trans File

This document is an introduction to the structure transformation mechanism provided by Amaya. It describes the syntax of the transformation language and the way transformations are performed in the editor.

The file amaya/HTML.trans contains the description of available transformations. This file can be edited during an Amaya session. It is dynamically parsed when the transformation procedure is called by the editor, so new transformations can be added during an editing session.

Syntax of the Amaya transformation language

Comments begin with ! and continue until the end of the line.
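The comment rule can be illustrated with a minimal Python sketch. It is not Amaya's parser: the function name strip_comments is hypothetical, and the sketch assumes that ! never occurs inside a quoted attribute value, a case the text above does not address.

```python
def strip_comments(source: str) -> str:
    """Drop everything from '!' to the end of each line."""
    cleaned = []
    for line in source.splitlines():
        # partition() returns the text before the first '!',
        # or the whole line when no '!' is present.
        code, _, _ = line.partition("!")
        cleaned.append(code.rstrip())
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = "! a full-line comment\nfirst line ! trailing comment\nsecond line"
    print(strip_comments(sample))
```

Lines that held only a comment are kept as empty lines, so line numbers in any later error report would still match the original file.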

The file consists of a list of transformation descriptions. Each transformation is described by three parts: a name, a pattern, and a list of rules.

The name appears in the Transform menu and identifies the transformation for the end-user.

The pattern

The pattern describes a specific organization of the elements to be transformed. It acts as a filter over the HTML DTD. The purpose of the pattern is to identify a particular combination of elements to which the transformation can be applied. In a pattern it is possible to express conditions on sequences of tags, on the content of a tag, and on the existence and value of attributes.

Formally, a pattern contains HTML tags (possibly with attributes) and some composition operators:

| for choice

, for sibling

+ for sequence

? for option

( ) for grouping nodes

The braces { } define the content of a node.

The symbol * is a token that matches any element type.

It is possible to rename a tag by preceding it with a name followed by a colon (:).

A tag may have attributes. If no value is given for an attribute, an element is matched if the attribute is present. If a value is specified for the attribute, an element is matched only if the attribute is present and has the specified value.
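The attribute condition described above can be sketched in Python as follows. The function name attributes_match and the dictionary representation are hypothetical, chosen only for illustration. A pattern attribute mapped to None stands for "no value given". The exact, case-sensitive string comparison is an assumption of this sketch, not something the text specifies.

```python
from typing import Mapping, Optional


def attributes_match(element_attrs: Mapping[str, str],
                     pattern_attrs: Mapping[str, Optional[str]]) -> bool:
    """Return True if the element satisfies every attribute condition."""
    for name, wanted in pattern_attrs.items():
        # The attribute must be present in both cases.
        if name not in element_attrs:
            return False
        # A value in the pattern must also be equal to the element's value.
        if wanted is not None and element_attrs[name] != wanted:
            return False
    return True


if __name__ == "__main__":
    element = {"border": "1"}
    print(attributes_match(element, {"border": None}))  # presence only
    print(attributes_match(element, {"border": "2"}))   # presence and value
```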

Examples of patterns are given at the end of the document.

The rules

A rule expresses how some elements identified in the pattern are transformed. A rule has two parts separated by the symbol >: on the left, a source tag or a name defined in the pattern; on the right, a target tag list.

The target tag list is itself divided into two parts separated by a colon (:): the generation location path and the list of tags to be generated.

The generation location path is searched in the leftmost branch of the document tree, starting from the parent of the element matching the highest symbol of the pattern.

In the target tag list, the dot symbol (.) is used for descending in the tree structure.

If the special token star (*) ends the list of tags to be generated, the source element tag is not changed, but it can be moved to a different place in the destination.

If the source tag or the name in the left part of a rule is present more than once in the pattern, the rule transforms all the elements matching an occurrence of the tag in the pattern.
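The shape of a rule can be made concrete with a small Python sketch that splits a rule into its parts. This is not Amaya's parser. The names Rule and parse_rule are hypothetical, and so is the rule text in the example. The sketch assumes that the location path comes before the colon, that a rule without a colon has an empty location path, and that target tags carry no attributes containing >, : or . characters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    source: str               # source tag, or a name defined in the pattern
    location_path: List[str]  # where the new tags are inserted
    generated: List[str]      # tags to be generated
    keeps_source_tag: bool    # True when the generated list ends with '*'


def split_on_dots(part: str) -> List[str]:
    """Split a dot-separated tag list, ignoring surrounding spaces."""
    return [tag.strip() for tag in part.split(".") if tag.strip()]


def parse_rule(text: str) -> Rule:
    source, arrow, target = text.partition(">")
    if not arrow:
        raise ValueError("a rule needs a '>' between its two parts")
    path_part, colon, tags_part = target.partition(":")
    if not colon:
        # Assumption of this sketch: no colon means no location path.
        path_part, tags_part = "", target
    generated = split_on_dots(tags_part)
    keeps = bool(generated) and generated[-1] == "*"
    return Rule(source.strip(), split_on_dots(path_part), generated, keeps)


if __name__ == "__main__":
    # Hypothetical rule text, used only to exercise the sketch.
    print(parse_rule("li > ul.li:p.*"))
```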

Transformation process

When the user chooses the Transform command from the Edit menu, Amaya parses the HTML.trans (or the MathML.trans, etc.) file. Then the selected elements are matched with the pattern of each transformation. The names of the matched transformations are proposed to the user in a pop-up menu.

If several transformations with the same name match the selected elements, the higher-level matching transformation is proposed to the user. If several transformations match at the same level, the first one declared in the HTML.trans file is proposed. As a consequence, it is recommended to specify the transformations with specific patterns before the more general ones.
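The selection among matching transformations can be sketched in Python. The names Match and proposals are hypothetical. The level field is an abstraction in which a larger number means a higher-level match; how Amaya actually computes that level is not described here. Ordering the resulting menu by declaration position is also an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Match:
    name: str         # transformation name shown to the user
    level: int        # abstract match level: larger means higher-level
    declared_at: int  # position of the transformation in the .trans file


def proposals(matches: List[Match]) -> List[Match]:
    """Keep one match per name: highest level, then earliest declaration."""
    best: Dict[str, Match] = {}
    for m in matches:
        current = best.get(m.name)
        if (current is None
                or m.level > current.level
                or (m.level == current.level
                    and m.declared_at < current.declared_at)):
            best[m.name] = m
    # Assumption: the menu lists entries in declaration order.
    return sorted(best.values(), key=lambda m: m.declared_at)


if __name__ == "__main__":
    found = [
        Match("Table", level=1, declared_at=0),
        Match("Table", level=2, declared_at=3),
        Match("List", level=1, declared_at=5),
        Match("List", level=1, declared_at=2),
    ]
    for m in proposals(found):
        print(m)
```

Because ties go to the earliest declaration, a general pattern placed first in the file would hide a more specific one with the same name at the same level, which is why specific patterns should be declared first.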

Once a transformation has been chosen by the user, the destination structure is built according to the rules while selected elements are traversed.

Finally, the contents of the source elements (text and pictures, but also structured elements) are moved into the produced elements.

This transformation process for HTML documents is fully described in Interactively Restructuring HTML Documents, a paper presented by Cécile Roisin and Stéphane Bonhomme at the 5th International WWW Conference in Paris, May 1996.