ITS General Decorator

Overview

The ITS General Decorator is an XSLT-based ITS processor that decorates an input document with information for ITS data categories. The processor implements all data categories from the ITS 1.0 specification. It targets users who need a component for a more comprehensive processor (e.g. an ITS-based converter to the XML Localization Interchange File Format, XLIFF).

The XSLT-based "decoration" approach was pioneered by Spritser, a general ITS 1.0 implementation provided by Sebastian Rahtz (Oxford University). The main difference from Spritser is that the ITS General Decorator makes it easy to create new data categories or extend existing ones. Hence, it is also a tool for experimenting with new or modified ITS data categories.

Background

The ITS General Decorator is implemented as an XSLT-based processing chain with two steps:

  • Generation of intermediate stylesheets based on input (including ITS information)
  • Decoration of input by means of intermediate stylesheets

A batch script or build file can chain these two steps together.
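
As an illustration of how the two steps can be chained, here is a minimal Ant build sketch. It is not the build file shipped with the tool: the file names (input.xml, generate-stylesheet.xsl, intermediate.xsl, decorated.xml), the target names and the path id are hypothetical placeholders, and the Saxon location has to be adapted.

```xml
<project name="its-decorator-sketch" default="decorate" basedir=".">

  <!-- Assumption: replace the location with the path to your Saxon JAR. -->
  <path id="saxon.classpath">
    <pathelement location="/path/to/saxon.jar"/>
  </path>

  <!-- Step 1: generate an intermediate stylesheet from the input
       (including its ITS information).
       generate-stylesheet.xsl is a hypothetical name. -->
  <target name="generate">
    <xslt in="input.xml" out="intermediate.xsl"
          style="generate-stylesheet.xsl"
          classpathref="saxon.classpath" force="true"/>
  </target>

  <!-- Step 2: decorate the same input with the stylesheet
       produced in step 1. -->
  <target name="decorate" depends="generate">
    <xslt in="input.xml" out="decorated.xml"
          style="intermediate.xsl"
          classpathref="saxon.classpath" force="true"/>
  </target>

</project>
```

Running the default target would execute both steps in order. Depending on the Saxon version, a nested <factory> element naming its TransformerFactory class may also be needed inside each <xslt> task.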

The output is the same document as the input, with the following additional information:

  • For each element node there is an <itsElementAnnotation> element. It contains an attribute "datacat" which specifies the data category, and markup specific to the data category.
  • For each attribute node there is an <itsAttributeAnnotation> element. It contains an attribute "datacat" which specifies the data category, and markup specific to the data category.
  • If an input document is processed against several data categories, for each data category there will be additional <itsElementAnnotation> and <itsAttributeAnnotation> elements.

Example input and output

An example input document for the ITS "Translate" data category looks like this:

<x a="y" xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
  <its:rules>
    <its:translateRule selector="//@alt" translate="yes"/>
    <its:translateRule selector="//u/@alt" translate="no"/>
  </its:rules>
  <b its:translate="no" alt="b">
    <x>
      <u a="r" alt="j" its:translate="yes">
      </u>
    </x>
  </b>
</x> 

After the processing chain has run, the output looks like the file Its-general-decorator-example-output.xml.
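
To make the example easier to check by hand, the input can be annotated with the "Translate" value each node should end up with. The comments below are a reading of the ITS 1.0 rules (local markup overrides global rules, a later rule overrides an earlier one, elements inherit from their parent, attributes do not inherit and default to "no"). They are not copied from the actual output file.

```xml
<!-- x: "yes" (element default); x/@a: "no" (attribute default) -->
<x a="y" xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
  <its:rules>
    <its:translateRule selector="//@alt" translate="yes"/>
    <its:translateRule selector="//u/@alt" translate="no"/>
  </its:rules>
  <!-- b: "no" (local its:translate); b/@alt: "yes" (first rule, //@alt) -->
  <b its:translate="no" alt="b">
    <!-- inner x: "no" (inherited from b) -->
    <x>
      <!-- u: "yes" (local its:translate overrides the inherited "no");
           u/@alt: "no" (both rules match, the later one wins);
           u/@a: "no" (attribute default) -->
      <u a="r" alt="j" its:translate="yes">
      </u>
    </x>
  </b>
</x>
```

The interesting cases are b/@alt and u/@alt: both are selected by the first rule, but only u/@alt is also selected by the second rule, which takes precedence because it comes later.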

Create your own data categories

The file sampleCatDesc.xml contains the definitions of the data categories. Existing definitions can be modified, and new data categories can be added. As an example, below is the definition of the "Elements within Text" data category, extended with local "Elements within Text" markup (which ITS 1.0 itself does not define for this data category).

<datacat name="elementswithintext">
 <defaults>
  <defaultsElements its:withinText="no"/>
 </defaults>
 <inheritance appliesTo="onlyElements"/>
 <rulesElement name="withinTextRule"/>
 <localAdding datcatSelector="*[@itsx:withinText]" 
  addedMarkup="@itsx:withinText"/>
</datacat>

A data category definition contains the following:

  • the name of the data category, e.g. "elementswithintext"
  • optionally, defaults for elements and/or attributes
  • an inheritance definition
  • optionally, the name of a rules element
  • optionally, definitions for local markup
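
As a sketch of how a new data category might be added, the following definition reuses the structure of the "Elements within Text" example for an invented data category. The name "reviewstatus", the itsx:reviewStatus attribute, its value "unreviewed" and the rule name "reviewStatusRule" are all hypothetical; only the element and attribute names of the definition format are taken from the example above. Whether the processor accepts this definition unchanged is an assumption.

```xml
<!-- Name of the (hypothetical) data category -->
<datacat name="reviewstatus">
 <!-- Optional defaults: elements without other information
      get this value -->
 <defaults>
  <defaultsElements itsx:reviewStatus="unreviewed"/>
 </defaults>
 <!-- Inheritance definition: here it applies to elements only -->
 <inheritance appliesTo="onlyElements"/>
 <!-- Optional name of the rules element for global selection -->
 <rulesElement name="reviewStatusRule"/>
 <!-- Optional local markup: select elements carrying the attribute
      and add that attribute to the annotation -->
 <localAdding datcatSelector="*[@itsx:reviewStatus]"
  addedMarkup="@itsx:reviewStatus"/>
</datacat>
```

Each child element corresponds to one item in the list above, so a definition can be trimmed to the name and the inheritance definition if no defaults, rules element or local markup are needed.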

A more detailed description of the underlying format is available in this paper (see Figure 7).

Downloads

Main (as ZIP file):

  • version 0.1 (2008-12-23) is implemented in XSLT 1.0 and comes with a batch file for AltovaXML and an Ant build file for Saxon. For the latter, the location of Saxon has to be provided in the location attribute of the <pathelement> element.
  • version 0.2 is buggy and deprecated; if you find it somewhere, do not use it.
  • version 0.3 (2009-07-15) is implemented in XSLT 1.0 and comes with an Ant build file for Saxon. To use the build file, the location of Saxon has to be provided in the location attribute of the <pathelement> element. It also comes with a PHP script for running the General Decorator on the server side. See an example installation used for generating XLIFF 1.2 files using ITS Translate information.