W3C

Internationalization Tag Set (ITS)

W3C Working Draft 22 November 2005

This version:
http://www.w3.org/TR/2005/WD-its-20051122/
Latest version:
http://www.w3.org/TR/its
Editors:
Christian Lieske, SAP
Felix Sasaki, W3C

Abstract

This document defines data categories and their implementation as a set of elements and attributes called the Internationalization Tag Set (ITS). ITS is used with new and existing schemas to support the internationalization and localization of schemas and documents. Implementations of ITS are provided for three schema languages: XML DTDs, XML Schema and RELAX NG. In addition, implementations are provided as fixed modularizations of various existing vocabularies (e.g. XHTML, DocBook, Open Document). The definition of the data categories is still in an early draft stage. Feedback is especially appreciated on the general conception of ITS, and in particular on the means of expressing the scope of ITS information.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a First Public Working Draft of "Internationalization Tag Set (ITS)".

This document defines data categories and their implementation as a set of elements and attributes called the Internationalization Tag Set (ITS). ITS is used with new and existing schemas to support the internationalization and localization of schemas and documents. Implementations of ITS are provided for three schema languages: XML DTDs, XML Schema and RELAX NG. In addition, implementations are provided as fixed modularizations of various existing vocabularies (e.g. XHTML, DocBook, Open Document). The definition of the data categories is still in an early draft stage. Feedback is especially appreciated on the general conception of ITS, and in particular on the means of expressing the scope of ITS information.

This document was developed by the ITS Working Group, part of the W3C Internationalization Activity. The Working Group expects to advance this Working Draft to Recommendation Status (see W3C document maturity levels).

Send your comments to www-i18n-comments@w3.org. Use "Comment on its tagset WD" in the subject line of your email. The archives for this list are publicly available.

Per section 4 of the W3C Patent Policy, Working Group participants have 150 days from the title page date of this document to exclude essential claims from the W3C RF licensing requirements with respect to this document series. Exclusions are with respect to the exclusion reference document, defined by the W3C Patent Policy to be the latest version of a document in this series that is published no later than 90 days after the title page date of this document.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced under the 5 February 2004 W3C Patent Policy. Since the Working Group expects this document to become a W3C Recommendation, under that policy it has associated W3C Royalty-Free Licensing obligations. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

Appendices

A Schemas for ITS
B References
C References (Non-Normative)
D Acknowledgements (Non-Normative)

1 Introduction

This section is informative.

This document defines data categories and their implementation as a schema that can be used with new and existing schemas to support the internationalization and localization of schemas and documents. The implementation is provided for three schema languages: XML DTDs [XML 1.0], XML Schema [XML Schema] and RELAX NG [RELAX NG]. In addition, implementations as fixed modularizations of various existing vocabularies (e.g. XHTML [XHTML 1.0], DocBook [DocBook], Open Document [OpenDocument]) are provided.

Requirements for the internationalization and localization of markup are formulated in [ITS REQ]. This working draft responds to only a part of these requirements. Some of the following items are mentioned in [ITS REQ], but are not covered in this working draft:

These requirements have not been addressed at this point in time since the ITS Working Group expects that it will take a substantial amount of time to address them, but that the framework suggested in this document will accommodate them.

Other requirements will also be addressed in the future in a document on techniques for internationalization and localization of XML schemas and XML instances.

1.1 Background: Motivation for ITS

Content or software that is authored in one language (i.e. source language) is often made available in additional languages. This is done through a process called localization, where the original material is translated and adapted to the target audience.

From the viewpoints of feasibility, cost, and efficiency, it is important that the original material should be suitable for localization. This is achieved by proper design and development, and the corresponding process is referred to as internationalization.

The increasing usage of XML as a medium for documentation-related content (e.g. DocBook, being a format for writing structured documentation, well suited to computer hardware and software manuals) and software-related content (e.g. the eXtensible User Interface Language [XUL]) provides growing challenges and opportunities in the domain of XML internationalization and localization.

Example 1: Document with localizable information

In this example the text in square brackets [...] shows the parts that need to be localized. Without localization-specific information it is difficult for tools to detect that PhaseCode should not be translated, or that the title attribute sometimes needs to be translated and sometimes does not.

<Manual>
 <Info>
  <PhaseCode>Review Level</PhaseCode>
  <FormNo>8U81-GS-52C</FormNo>
  <Name>[Owner's Manual]</Name>
  ...
 </Info>
 <Section id="0" title="#Introduction#">
  <Ltitle id="005" title="#ZOOM#">
   <Mtitle id="00501" title="[Getting started]" option="no" cols="1">
    <MultiCol cols="1">
     <Text>[Some text to localize]</Text>
     ...
    </MultiCol>
   </Mtitle>
  </Ltitle>...
 </Section>
</Manual>
Example 2: Document with localizable information

In the example below, predicting what needs to be translated depends not only on the name of the element but also on an attribute value of its parent.

<dialogue xml:lang="en-gb">
 <rsrc id="123">
  <component id="456" type="image">
   <data type="text">images/cancel.gif</data>
   <data type="coordinates">12,20,50,14</data>
  </component>
  <component id="789" type="caption">
   <data type="text">[Cancel]</data>
   <data type="coordinates">12,34,50,14</data>
  </component>
 </rsrc>
</dialogue>
Example 3: Document with localizable information

In the example below, there is no clear mechanism for determining which string elements need to be translated.

<resources>
 <section id="Homepage">
  <arguments>
   <string>page</string>
   <string>childlist</string>
  </arguments>
  <variables>
   <string>POLICY</string>
   <string>[Corporate Policy]</string>
  </variables>
  <keyvalue_pairs>
   <string>Page</string>
   <string>[ABC Corporation - Policy Repository]</string>
   <string>Footer_Last</string>
   <string>[Pages]</string>
   <string>bgColor</string>
   <string>NavajoWhite</string>
   <string>title</string>
   <string>[List of Available Policies]</string>
  </keyvalue_pairs>
 </section>
</resources>

1.2 Out of Scope

The data categories and their implementation as a schema do not address document-external mechanisms or data formats for describing localization-relevant information over and above what is appropriate for inclusion in the format itself. Such mechanisms and data formats, also sometimes called XML Localization Properties, are out of the scope of this document. However, this document specifies a methodology for applying localization properties and information about internationalization and localization to various places in schemas and instance documents. See Section 3: Scope of ITS information.

1.3 Usage Scenarios

Information which supports internationalization and localization with respect to XML schemas and XML instances may be used in many ways. Example usages (see section 2 in [ITS REQ]) are:

  • Content authoring

  • Terminology creation and translation

  • Software development

The diversity of these usages leads to a great variety of requirements and of possible formalizations of how an XML language supports information related to internationalization and localization. The concepts described in this document are meant to provide general answers to these sometimes conflicting requirements.

Example 4: Usage scenarios and possible implementations of ITS data categories: Example translatability

A content author needs a simple way to express whether the content of an element or attribute should be translated or not, e.g. an attribute translate. On the other hand, for translations of large document sets based on the same schema, a specification of defaults for translatability and exceptions from the defaults is of importance, e.g. all p elements should be translated, but not p elements inside of an index element.

This specification responds to this variety by introducing the concept of scope.
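
As a rough, non-normative sketch, the two needs described in Example 4 might be expressed with the translate attribute and with the documentRule element that later sections of this document define. The element names doc, index and p are purely illustrative assumptions, and how overlapping rules interact is governed by the precedence order in Section 3.3, not by this sketch.

```xml
<!-- Illustrative only: doc, index and p are assumed element names. -->
<doc xmlns:its="http://www.w3.org/2005/11/its">
 <its:documentRules>
  <!-- Default for a large document set: every p element is translated. -->
  <its:documentRule translate="yes" translateScope="//p"/>
  <!-- Exception from the default: p elements inside an index element. -->
  <its:documentRule translate="no" translateScope="//index//p"/>
 </its:documentRules>
 <p>Translated by default.</p>
 <!-- A content author can still mark a single element in place. -->
 <p its:translate="no">8U81-GS-52C</p>
 <index>
  <p>entry</p>
 </index>
</doc>
```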

1.4 Important Design Decisions

Five design decisions are crucial for the development of ITS: data categories, scoping, extensibility, limited impact and technological viability.

About data categories: ITS defines data categories as a description of information for internationalization and localization of XML schemas and documents. This description is independent of its implementation e.g. via an element or attribute. See Section 2.4: Data category for a definition of the term data categories, Section 4: Description of Data Categories for the definition of the various ITS data categories, and Section 6: Markup Declarations for the data category implementations.

About scoping: Content authors need a simple way to express whether the content of an element or attribute should be translated or not, e.g. a translate attribute. On the other hand, for translations of large document sets based on the same schema, a specification of defaults for translatability and exceptions from the defaults is of importance (e.g. all p elements should be translated, but not p elements inside of an index element). This specification responds to these conflicting requirements by introducing a methodology for optionally specifying scoping information, cf. Section 3: Scope of ITS information. The methodology also provides a means for attaching information related to attributes (a task for which no standard means exists yet). The ITS mechanisms for expressing scope need to consider the following:

  • viable for both XML schemata and XML instances

  • viable in situ (at the XML node to which it pertains) or dislocated (not at the XML node to which it pertains)

About extensibility: It may be useful or necessary to extend the set of information available for internationalization or localization purposes beyond what is provided by ITS. This specification does not define a general extension mechanism, since ordinary XML mechanisms (e.g. XML Namespaces [XML Names]) may be used.
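
A minimal sketch of such an extension, assuming a project-specific namespace: the ext prefix, its namespace URI and the maxLength attribute are hypothetical and are not defined by ITS. Only the its namespace URI is taken from this specification.

```xml
<!-- The ext prefix, its URI and the maxLength attribute are hypothetical;
     manual and title are assumed element names. -->
<manual xmlns:its="http://www.w3.org/2005/11/its"
 xmlns:ext="http://example.org/ns/l10n-ext">
 <title its:translate="yes" ext:maxLength="40">Getting started</title>
</manual>
```

Because the extra attribute lives in its own namespace, it cannot clash with ITS attributes or with the host vocabulary.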

About limited impact: ITS follows the example from section 4 of [XLink 1.1], by providing mostly global attributes for the implementation of ITS data categories. Avoiding elements for ITS purposes as much as possible assures limited impact on existing markup schemes, see section 3.14 in [ITS REQ]. Only for some requirements additional child elements have to be used, see for example Section 4.5: Ruby.

About technological viability: In order to foster quick adoption, ITS was developed with two important criteria in mind:

  • No dependence on technologies which are yet to be developed

  • Fit with existing work in the W3C architecture (e.g. use of XPath [XPath 1.0] for scoping)

1.5 Development of this Specification

This specification has been developed using the ODD (One Document Does it all) language of the Text Encoding Initiative ([TEI]). This is a literate programming language for writing XML schemas, with three characteristics:

  1. The element and attribute set is specified using an XML vocabulary which includes support for macros (like DTD entities, or schema patterns), a hierarchical class system for attributes and elements, and creation of modules.

  2. The content models for elements and attributes are written using embedded RELAX NG XML notation.

  3. Documentation for elements, attributes, value lists etc is written inline, along with examples and other supporting material.

XSLT transforms are provided by the TEI to extract documentation in HTML, XSL FO or LaTeX forms, and to generate RELAX NG documents and DTDs. From the RELAX NG documents, James Clark's trang can be used to create XML Schema documents.

2 Notation and Terminology

This section is normative.

2.1 Notation and Terminology

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119].

2.2 Namespaces used in this Specification

The namespace URI that must be used by implementations of this specification is:

http://www.w3.org/2005/11/its

The namespace prefix used in this specification for this URI is "its".

In addition, the following namespaces are used in this document:

  • http://www.w3.org/2001/XMLSchema for the XML Schema namespace, here used with the prefix "xs"

  • http://relaxng.org/ns/structure/1.0 for the RELAX NG namespace, here used with the prefix "rng"
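
A minimal, non-normative sketch of binding the "its" prefix in an instance document before ITS markup is used. The doc and p element names are assumptions made for illustration.

```xml
<!-- doc and p are illustrative element names. -->
<doc xmlns:its="http://www.w3.org/2005/11/its">
 <p its:translate="no">8U81-GS-52C</p>
</doc>
```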

2.3 Schema Language

[Definition: The term schema language refers in this specification to XML DTDs, XML Schema or RELAX NG.]

2.4 Data category

[Definition: ITS defines data category as an abstract concept for a particular type of information for internationalization and localization of XML schemas and documents.] The concept of a data category is independent of its implementation in an XML environment (e.g. via an element or attribute).

For each data category, ITS distinguishes between the following:

Example 5: Data categories and their implementation

The data category translatability mainly conveys information about whether a piece of content should be translated or not. The simplest formalization of this prose description on a schema language independent level is a translate attribute with two possible values: yes and no. An implementation on a schema language specific level would be the declaration of the translate attribute in e.g. an XML DTD, an XML Schema document or a RELAX NG document.

An alternative formalization on a schema language independent level is a schemaRule element which conveys via a translate attribute information about translatability. An implementation on a schema language specific level is the declaration of the schemaRule element.

2.5 Scope

[Definition: Scope is a means to describe to which elements and/or attributes an ITS data category and its values should be applied.] Scope is discussed in detail in Section 3: Scope of ITS information.

3 Scope of ITS information

This section is normative.

3.1 Relation between Data Categories and Scope

Scope information is always attached to a single data category. The relation between scope and the various data categories is described in Section 4: Description of Data Categories. The scope information - and the data categories - can be realized in various positions, which are defined below.

Example 6: Example for implementation of scope information about translatability, expressed via a translate attribute
<text its:translate="yes" its:translateScope="//p">...
<!-- all p elements should be translated, except the following one -->
 <p its:translate="no" its:translateScope="."/>
</text>

3.2 Position of Scope (Where to Express Information about Scope)

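Information about scope can appear in three places:

  • in a schema (see Section 3.2.1: Scope in a Schema)

  • dislocated, that is, not at the XML node to which it pertains (see Section 3.2.2: Dislocated Scope)

  • in an instance document (see Section 3.2.3: Scope in an Instance Document)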

The various mechanisms to define scope are defined in detail below.

3.2.1 Scope in a Schema

In schemas, scoping is expressed via schema annotation [Definition: schema annotation is a schema language specific means to provide information about element, attribute, type etc. declarations. This information is not used by the schema processor, but is intended for external, validation-independent applications.]. The scope of the data category depends on the position of the schema annotation. Since schema annotation mechanisms are schema language specific, the following definitions are made:

  • [Definition: scope for elements in XML Schema is expressed via an xs:appinfo element which is a direct child of the xs:element element and which contains a schemaRule element.]

  • [Definition: scope for attributes in XML Schema is expressed via an xs:appinfo element which is a direct child of the xs:attribute element and which contains a schemaRule element.]

  • [Definition: scope for elements in RELAX NG is expressed via a schemaRule element which is a direct child of the rng:element element.]

  • [Definition: scope for attributes in RELAX NG is expressed via a schemaRule element which is a direct child of the rng:attribute element.]

As for XML DTDs, this specification defines no specific mechanism to express scope within the DTD.

Note: To be able to express scope information for XML DTDs, the mechanisms described in Section 3.2.2: Dislocated Scope can be used.

Example 7: Scope for translatability in an XML Schema
<xs:element name="p">
 <xs:annotation>
  <xs:appinfo>
   <its:schemaRule translate="yes"/>
  </xs:appinfo>
 </xs:annotation> ...
</xs:element>
Example 8: Scope for translatability in RELAX NG
<element name="p">
 <its:schemaRule translate="yes"/> ...
</element>              
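
The definitions above also cover attribute declarations, which Examples 7 and 8 do not show. The following is a sketch for RELAX NG under the stated definition (a schemaRule element as a direct child of rng:attribute). The title attribute and the simple text content models are assumptions chosen for illustration.

```xml
<!-- Sketch: the title attribute and the text content models are assumed. -->
<rng:element name="p"
 xmlns:rng="http://relaxng.org/ns/structure/1.0"
 xmlns:its="http://www.w3.org/2005/11/its">
 <rng:attribute name="title">
  <!-- Direct child of rng:attribute, so the rule applies to title. -->
  <its:schemaRule translate="yes"/>
  <rng:text/>
 </rng:attribute>
 <rng:text/>
</rng:element>
```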

To group several schemaRule elements, a schemaRules element should be used.

Example 9: Grouping schemaRule elements with a schemaRules element
<xs:element name="p">
 <xs:annotation>
  <xs:appinfo>
   <its:schemaRules>
    <its:schemaRule translate="yes"/>
    <its:schemaRule locInfo="This has to be handled carefully"
     locInfoType="alert"/>
   </its:schemaRules>
  </xs:appinfo>
 </xs:annotation> ...
</xs:element>

Several data categories that concern the same element or attribute declaration should be expressed on the same schemaRule element.

Example 10: Several data categories at the same element
<its:schemaRule translate="yes" locInfo="This has to be handled carefully"
 locInfoType="alert"/>

3.2.2 Dislocated Scope

Dislocated scope information is expressed via a documentRules element. It contains one or more documentRule elements. Each documentRule element has one or more attributes which express data categories, and for each data category attribute an attribute which expresses scope.

The naming convention for the scope attribute is datacategory + Scope, e.g. translateScope. For dislocated scope, the value of the attribute must be an XPath expression. It must start with "/", that is, it must be an AbsoluteLocationPath as described in [XPath 1.0]. Only in this way can the scope information be applied in a dislocated way.
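
The pairing of data category attributes with their scope attributes can be sketched as follows. This is an illustration under assumptions, not normative markup: the element names manual, title and code in the XPath expressions are hypothetical, and every scope value is an absolute location path.

```xml
<its:documentRules xmlns:its="http://www.w3.org/2005/11/its">
 <!-- translate is paired with translateScope. -->
 <its:documentRule translate="no" translateScope="//code"/>
 <!-- locInfo (with locInfoType) is paired with locInfoScope. -->
 <its:documentRule locInfo="Keep the product name in English"
  locInfoType="alert" locInfoScope="/manual/title"/>
 <!-- One rule may carry several data categories,
      each with its own scope attribute. -->
 <its:documentRule translate="yes" translateScope="/manual/title"
  locInfo="Shown in the window title bar" locInfoType="description"
  locInfoScope="/manual/title"/>
</its:documentRules>
```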

Dislocated scope information can appear in a schema (e.g. as content of the xs:appinfo element), in an instance file or in a separate XML document. The precedence of the processing of the scope information depends on these variations. See also Section 3.3: Processing of Scope Information.

Note: The difference between schemaRule and documentRule is that schemaRule has no attributes for scope, e.g. no translateScope attribute. The reason is that schemaRule always refers to the element or attribute declaration of which it is part. In contrast, documentRule can be used anywhere in a schema for dislocated scope information. It is possible to use schemaRule and documentRule together in a schema.

Example 11: Example for using schemaRule and documentRule together in a schema.
<xs:schema>
 <xs:annotation>
  <xs:appinfo>
   <its:documentRule translate="no" translateScope="//p[@editor='john']"/>
   <!-- This rule holds for p elements which are edited by John. -->
  </xs:appinfo>
 </xs:annotation>
 <xs:element name="p">
  <xs:annotation>
   <xs:appinfo>
    <its:schemaRule translate="yes"/>
    <!-- This rule holds for all p elements -->
   </xs:appinfo>
  </xs:annotation> ...
 </xs:element> ...
</xs:schema>

3.2.3 Scope in an Instance Document

In instance documents scope is expressed via a combination of an attribute which expresses the data category and a scope attribute for the data category. The naming convention for the attribute for scope is datacategory + Scope, e.g. translateScope. This is identical to Section 3.2.2: Dislocated Scope.

Example 12: scope for the content of an element and all attributes attached to the element.
<meta its:translate="yes" its:translateScope=". | @*"/>

Scope in an instance document must be expressed either via the AbbreviatedStep "." or via a RelativeLocationPath as described in [XPath 1.0]. If scope is the AbbreviatedStep ".", it must always be interpreted as the textual value of the current node, that is, the textual value of the element to which the scope attribute is attached. For example, <p its:dir="ltr" its:dirScope="."> ... </p> selects the textual content of a p element.

If child elements should be part of the scope, an XPath step expression like descendant-or-self::* should be used.

Example 13: Scope for the content of an element, including all descendant elements.
<p its:translate="no" its:translateScope="descendant-or-self::*"> ... </p>

To avoid mismatches between the multiple scope attributes, only the following axes should be used in the XPath expression: child, descendant, attribute, descendant-or-self.

Example 14: Scope with various axes
<text its:translate="yes"
 its:translateScope="child::body/descendant::p">
 <body its:translate="no" its:translateScope="descendant::p/attribute::id"> ... 
  <p its:translate="no" its:translateScope="descendant-or-self::*"> ... </p>
 </body>
</text>

Note: The following XML Schema datatype can be used to verify that only these axes are used at the beginning of the XPath expression:

Example 15: XML Schema datatype used to restrict XPath expressions
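<xs:simpleType name="scopeInline">
 <xs:restriction base="xs:string">
  <!-- Pattern facets in one restriction step are alternatives. Keeping each
       alternative in its own facet avoids literal whitespace in a pattern. -->
  <xs:pattern value="child::.+"/>
  <xs:pattern value="descendant::.+"/>
  <xs:pattern value="descendant-or-self::.+"/>
  <xs:pattern value="\.//.+"/>
  <xs:pattern value="attribute::.+"/>
  <xs:pattern value="@.+"/>
  <xs:pattern value="name\(\)=.+"/>
 </xs:restriction>
</xs:simpleType>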

3.3 Processing of Scope Information

3.3.1 Precedence between Scope Information

The following precedence order is defined for scope information:

Note: These precedence rules fulfill the same purpose as the built-in template rules of [XSLT 1.0].

Example 16: Conflicts between scope information which are resolved via the precedence order

Due to the rules described above, the translatability information expressed via the translate attribute at the p element takes precedence over the translatability information at the documentRule element.

<text>
 <head>
  <its:documentRule translate="yes" translateScope="//p"/>
 </head>
 <body> ...
  <p its:translate="no"> ... </p>
 </body>
</text>

3.3.2 Default Scope

The default scope differs for each data category; see the table in Section 4: Description of Data Categories. For many data categories, it is the textual content of an element and all its child elements. This differs from, for example, xml:lang, whose scope is intended to include all attributes as well.
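
A small sketch of the default scope for translatability, following the table in Section 4: with no translateScope attribute present, the rule covers the text of the element and of its child elements, but not its attributes. The doc, p and em element names and the title attribute are illustrative assumptions.

```xml
<doc xmlns:its="http://www.w3.org/2005/11/its">
 <!-- No translateScope attribute, so the default scope applies:
      the text of p and of its child element em is covered,
      the title attribute is not. -->
 <p its:translate="no" title="Not covered by the default scope">Model
  <em>XJ-200</em> firmware</p>
</doc>
```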

For translatability, the default scope may be reset by attaching a scope attribute with one of the following XPath expressions to the element in question: descendant-or-self::* for elements, and descendant-or-self::*/attribute::* for attributes.

Example 17: Reset the default scope for elements for translatability
<p its:translate="yes" its:translateScope="descendant-or-self::*"> ... </p>
<p its:translate="no" its:translateScope="descendant-or-self::*/attribute::*"> ... </p>

3.3.3 Conflict between In Situ Scope Information

It is possible that the resolution of scope information leads to contradictions, for example if the default scope of translatability is to be reset for elements and for attributes at the same time. Such conflicts also occur if different information about the same data category is to be expressed for attributes of the same element.

Example 18: Reset the default scope for elements for translatability.
<p its:translate="yes" its:translateScope="descendant-or-self::*"> ... </p>
Example 19: Reset the default scope for attributes for translatability. This is not possible at the same element:
<p its:translate="no" its:translateScope="descendant-or-self::*/attribute::*"> ... </p>
Example 20: Make an exception for the title attribute. This is not possible at the same element:
<p its:translate="yes" its:translateScope="attribute::title"> ... </p>

Such conflicts should be resolved via scope information which is attached to a different element node and evaluates to the node in question. To avoid mismatches with other descriptions of scope information, this should be the only case where axes other than child, descendant, attribute, descendant-or-self are used for in situ scope.

Example 21: Resolving the conflict between in situ information via scope information at different elements
<body its:translate="no" its:translateScope="child::p[1]/attribute::*">
 <p its:translate="yes" its:translateScope="descendant-or-self::*"
  title="This should be translated"/>
 <p its:translate="yes" its:translateScope="preceding-sibling::p/attribute::title"/>
</body>

3.4 Mapping In Situ Scope to Dislocated Scope

The in situ description of scope and the dislocated description are just positional variants. Both apply to instance documents. It must be possible to convert in situ descriptions to a dislocated description. This conversion can be executed in a generic way only if the in situ scope attributes contain only relative path expressions.

Example 22: Conversion between in situ descriptions and dislocated descriptions of scope
<body its:translate="no" its:translateScope="child::p[1]/attribute::*">
 <p its:translate="yes" its:translateScope="descendant-or-self::*"/>
 <p its:translate="yes" its:translateScope="preceding-sibling::p/attribute::title"/>
</body>
<its:documentRules>
 <its:documentRule translate="no" translateScope="/body/child::p[1]/attribute::*"/>
 <its:documentRule translate="yes"
  translateScope="/body/child::p[1]/descendant-or-self::*"/>
 <its:documentRule translate="yes"
  translateScope="/body/child::p[2]/preceding-sibling::p/attribute::title"/>
</its:documentRules>

3.5 Scope and XPath

When using XPath 1.0 or 2.0 as part of XSLT, the transformation of the document might lead to the loss of ITS scope information. This specification leaves it to an application of ITS what should happen in such cases, since this specification does not mandate XSLT, XQuery or other languages which encompass XPath.

4 Description of Data Categories

This section is normative.

The following table summarizes the relations between data categories and scope.

Data category | Applicable in schema | Dislocated scope applicable | Default scope in instance document
Translatability | + | + | Textual content of element, including child elements, but excluding attributes
Localization information | + | + | Textual content of element, including child elements, but excluding attributes
Terminology | + | + | Textual content of element, without attributes and child elements
Directionality | - | + | Textual content of element, including attributes and child elements
Ruby | - | + | Textual content of element, without attributes and child elements

Note: The data categories differ with respect to defaults in the instance document for reasons of compatibility with existing standards and practices. For example, the dir attribute in [HTML 4.01] refers to the content of the element and all attributes and child elements. Hence, the data category of directionality has the same scope. On the other hand, it is common practice that information about translatability refers only to textual content of an element. Hence, the data category of translatability has this kind of scope.

4.1 Translatability

4.1.1 Definition

[Definition: The data category translatability expresses information about whether the content of an element or attribute should be translated or not.] The values of this data category are yes (translatable) or no (not translatable).

Note: This definition of translatability is identical to the definition in [Dita 1.0]. The ITS implementation of this data category differs from that of [Dita 1.0] because it allows for expressing scope information.

4.1.2 Implementation

Translatability can be expressed in a schema, dislocated or in an instance document.

In a schema, translatability is expressed via a schemaRule element with a translate attribute. The attribute has the values yes or no.

Example 23: Translatability expressed in a schema
<xs:element name="p">
 <xs:annotation>
  <xs:appinfo>
   <its:schemaRule translate="yes"/>
  </xs:appinfo>
 </xs:annotation> ...
</xs:element>

Dislocated, translatability is expressed via a documentRule element with a translate attribute. The attribute has the values yes or no. In addition, a translateScope attribute is required.

Example 24: Translatability expressed dislocated
<its:documentRules>
 <its:documentRule translate="yes" translateScope="//p"/>
<!-- All p elements should be translated-->
</its:documentRules>

In an instance document, translatability is expressed via a translate attribute with the values yes or no. If no translateScope attribute is present, the scope is the textual content of the element, including child elements, but excluding attributes. If a translateScope attribute is present, the scope is defined by the value of this attribute which is an XPath expression.

Example 25: Translatability expressed in an instance document

In the body element, all elements and attributes except id attributes should be translated. The content of the specified quote element, however, must not be translated.

<book>
 <head>...</head>
 <body its:translate="yes" its:translateScope=".//* | .//@*[not(name()='id')]"> ...
  <p>And he said: you need a new <quote its:translate="no">motherboard</quote>
  </p> ... 
 </body>
</book>

4.2 Localization Information

4.2.1 Definition

[Definition: The data category localization information is used to communicate information to localizers about a particular item of content.]

This data category has several purposes:

  • Tell the translator how to translate parts of the content

  • Expand on the meaning or contextual usage of a particular element, such as what a variable refers to or how a string will be used on the user interface

  • Clarify ambiguity and show relationships between items sufficiently to allow correct translation (e.g. in many languages it is impossible to translate the word "enabled" in isolation without knowing the gender, number and case of the thing it refers to.)

  • Indicate why a piece of text is emphasized (important, sarcastic, etc.)

Two types of informative notes are needed:

  • An alert contains information that the translator must read before translating a piece of text. Example: an instruction to the translator to leave parts of the text in the source language.

  • A description provides useful background information that the translator will refer to only if they wish. Example: a clarification of ambiguity in the source text.

4.2.2 Implementation

Localization information can be expressed in a schema, dislocated or in an instance document.

In a schema, localization information is expressed via a schemaRule element with a locInfo attribute. The type of the localization information is expressed via a locInfoType attribute with the values alert or description.

Example 26: Localization information expressed in a schema
<xs:element name="p">
 <xs:annotation>
  <xs:appinfo>
   <its:schemaRule locInfo="This has to be handled carefully" locInfoType="alert"/>
  </xs:appinfo>
 </xs:annotation> ...
</xs:element>

Dislocated, localization information is expressed via a documentRule element with the attributes locInfo and locInfoType. In addition, a locInfoScope attribute is required.

Example 27: Localization information expressed dislocated
<its:documentRules>
 <its:documentRule locInfo="This p element has to be handled carefully"
  locInfoType="alert" locInfoScope="/body/p[1]"/>
</its:documentRules>

In an instance document, localization information is expressed via the attributes locInfo and locInfoType. If no locInfoScope attribute is present, the scope is the textual content of the element, including child elements, but excluding attributes. If a locInfoScope attribute is present, the scope is defined by the value of this attribute, which is an XPath expression.

Example 28: Localization information expressed in an instance document
<book>
 <head>...</head>
 <body its:locInfo="Just translate all p elements." its:locInfoType="alert"
  its:locInfoScope="//p"> ...
  <p its:locInfo="This p element has to be handled
   carefully" its:locInfoType="alert">And he said: you need a new
   <quote>motherboard</quote>
  </p> ...
 </body>
</book>
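
Since alerts have to be read before translation while descriptions are optional reading, a tool might present alerts first. The following Python sketch shows one way to collect the notes from an instance document. The function name collect_notes, the sample notes and the ordering policy are assumptions made for illustration; locInfoScope is not evaluated.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from Example 36, in Clark notation.
ITS = "{http://www.w3.org/2005/11/its}"

DOC = """<book xmlns:its="http://www.w3.org/2005/11/its">
 <body>
  <p its:locInfo="Background: the speaker is a technician." its:locInfoType="description">And he said:</p>
  <p its:locInfo="Leave the product name in English." its:locInfoType="alert">you need a new motherboard</p>
 </body>
</book>"""


def collect_notes(root):
    """Return (type, element name, note) tuples, alerts first."""
    notes = []
    for element in root.iter():
        note = element.get(ITS + "locInfo")
        if note is not None:
            notes.append((element.get(ITS + "locInfoType"), element.tag, note))
    # The sort is stable: alerts move to the front, document order is kept otherwise.
    notes.sort(key=lambda item: item[0] != "alert")
    return notes


notes = collect_notes(ET.fromstring(DOC))
for note_type, tag, text in notes:
    print(note_type, tag, text)
```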

4.3 Terminology

4.3.1 Definition

The terminology data category is used to mark terms. This helps to increase consistency across different parts of the documentation. It is also helpful for translation.

4.3.2 Implementation

The terminology data category can be expressed in a schema, dislocated or in an instance document.

In a schema, the terminology data category is expressed via a schemaRule element with a term attribute, which has the value yes.

Example 29: The terminology data category expressed in a schema
<xs:element name="span">
 <xs:annotation>
  <xs:appinfo>
   <its:schemaRule term="yes"/>
<!-- All span elements are used to mark up terms-->
  </xs:appinfo>
 </xs:annotation> ...
</xs:element>

Dislocated, the terminology data category is expressed via a documentRule element with the term attribute, which has the value yes. A termScope attribute is required. In addition, an optional termRef attribute can be used to refer to external information about the term. The datatype of termRef is xs:anyURI.

Example 30: The terminology data category expressed dislocated
<its:documentRules>
 <its:documentRule term="yes" termScope="/body/p[1]/span"
  termRef="http://example.com/termdatabase/#x142539"/>
</its:documentRules>

In an instance document, the terminology data category is expressed via a term attribute, which has the value yes, and an optional termRef attribute. If no termScope attribute is present, the scope is the textual content of the element, excluding child elements and attributes. If a termScope attribute is present, the scope is defined by the value of this attribute, which is an XPath expression.

Example 31: The terminology data category expressed in an instance document
<book>
 <head>...</head>
 <body> ... 
  <p>And he said: you need a new <quote its:term="yes">motherboard</quote></p> ...
 </body>
</book>
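
A terminology tool could harvest the marked terms as sketched below in Python. The sketch covers the default scope only and takes the text directly inside the marked element as the term. The function name collect_terms is hypothetical, and the termRef value reuses the example URI from Example 30.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from Example 36, in Clark notation.
ITS = "{http://www.w3.org/2005/11/its}"

DOC = """<book xmlns:its="http://www.w3.org/2005/11/its">
 <body>
  <p>And he said: you need a new
   <quote its:term="yes" its:termRef="http://example.com/termdatabase/#x142539">motherboard</quote></p>
 </body>
</book>"""


def collect_terms(root):
    """Return (term text, termRef or None) for every element marked as a term."""
    terms = []
    for element in root.iter():
        if element.get(ITS + "term") == "yes":
            # Default scope: text of the element itself, without child elements.
            text = (element.text or "").strip()
            terms.append((text, element.get(ITS + "termRef")))
    return terms


terms = collect_terms(ET.fromstring(DOC))
print(terms)
```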

4.4 Directionality

4.4.1 Definition

This data category expresses the directionality of a piece of text. Its values are ltr or rtl. This definition is compliant with the dir attribute in [HTML 4.01], except that [HTML 4.01] does not allow for scoping.

In addition, bdo with the value yes can be supplied. It has the same purpose as the bdo element in [HTML 4.01].

4.4.2 Implementation

The dir attribute is used for the implementation of the directionality data category. It has the two values ltr or rtl. An optional bdo attribute with the value yes can be provided.

Directionality can be expressed dislocated or in an instance document.

Dislocated, directionality is expressed via a documentRule element with the dir attribute which has the values ltr or rtl, and an optional bdo attribute with the value yes. In addition, a dirScope attribute is required.

Example 32: Directionality expressed dislocated
<its:documentRules>
 <its:documentRule dir="rtl" dirScope="/body/p[1]/quote[@xml:lang='he']"/>
<!-- Some Hebrew quotation -->
</its:documentRules>

In an instance document, directionality is expressed via a dir attribute, which has the values ltr or rtl, and an optional bdo attribute with the value yes. If no dirScope attribute is present, the scope is the textual content of the element, including all child elements and attributes. If a dirScope attribute is present, the scope is defined by the value of this attribute, which is an XPath expression.

Example 33: Directionality expressed in an instance document
<book>
 <head>...</head>
 <body> ...
  <p>And he said: <quote its:dir="rtl"> ... a Hebrew quotation  ... </quote></p> ... 
 </body>
</book>
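
When ITS directionality has to be carried into plain text, one possible approach is to wrap the text in Unicode bidirectional formatting characters. The Python sketch below assumes the correspondence that [HTML 4.01] describes for inline dir (embedding) and for bdo (override). This mapping is not defined by this draft, and the function name wrap_direction is hypothetical.

```python
# Unicode bidirectional formatting characters.
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE


def wrap_direction(text, direction, bdo=False):
    """Wrap text in the control characters matching dir and bdo."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("dir must be 'ltr' or 'rtl'")
    if bdo:
        # bdo="yes": override the inherent directionality of the characters.
        opener = LRO if direction == "ltr" else RLO
    else:
        # dir alone: open an embedding level in the given direction.
        opener = LRE if direction == "ltr" else RLE
    return opener + text + PDF


print(repr(wrap_direction("a Hebrew quotation", "rtl")))
```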

4.5 Ruby

4.5.1 Definition

The data category ruby is used for a run of text that is associated with another run of text, referred to as the base text. Ruby text is used to provide a short annotation of the associated base text. It is most often used to provide a reading (pronunciation) guide.

4.5.2 Implementation

Ruby can be expressed in an instance document with or without scope information.

Ruby in an instance document without scope information is realized with a ruby element which contains a rubyBase and a rubyText element.

Example 34: Ruby in an instance document without scope
<text>
 <head> ... </head>
 <body>
  <p>This is about the <its:ruby>
   <its:rubyBase>W3C</its:rubyBase>
    <its:rubyText>World Wide Web Consortium</its:rubyText>
   </its:ruby>.</p>
 </body>
</text>

Ruby in an instance document with scope information is expressed via two attributes:

  • A rubyText attribute contains the ruby text (corresponding to the rubyText element in the case of no scope information)

  • A rubyScope attribute contains the scope information. The XPath expression in this attribute selects the ruby base text, corresponding to the rubyBase element in the case of no scope information.

Example 35: Ruby in an instance document with scope
<text>
 <head> ... </head>
 <body>
  <img src="w3c_home.png" alt="W3C"
   its:rubyScope="@alt" its:rubyText="World Wide Web Consortium"/> ...
 </body>
</text>
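
Both forms of ruby markup lead to the same pair of base text and ruby text. The Python sketch below extracts such pairs from a document that combines Examples 34 and 35. It is a simplified illustration: the function name ruby_pairs is hypothetical, and for the scoped form only rubyScope values that select an attribute of the same element (such as @alt) are handled, since the standard library does not evaluate attribute-selecting XPath expressions.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from Example 36, in Clark notation.
ITS = "{http://www.w3.org/2005/11/its}"

DOC = """<text xmlns:its="http://www.w3.org/2005/11/its">
 <body>
  <p>This is about the <its:ruby><its:rubyBase>W3C</its:rubyBase><its:rubyText>World Wide Web Consortium</its:rubyText></its:ruby>.</p>
  <img src="w3c_home.png" alt="W3C" its:rubyScope="@alt" its:rubyText="World Wide Web Consortium"/>
 </body>
</text>"""


def ruby_pairs(root):
    """Return (base text, ruby text) pairs for both forms of ruby markup."""
    pairs = []
    for element in root.iter():
        if element.tag == ITS + "ruby":
            # Form without scope: child elements carry base and ruby text.
            pairs.append((element.findtext(ITS + "rubyBase"),
                          element.findtext(ITS + "rubyText")))
            continue
        scope = element.get(ITS + "rubyScope")
        ruby_text = element.get(ITS + "rubyText")
        # Form with scope: only "@name" scopes are resolved in this sketch.
        if scope is not None and ruby_text is not None and scope.startswith("@"):
            pairs.append((element.get(scope[1:]), ruby_text))
    return pairs


pairs = ruby_pairs(ET.fromstring(DOC))
print(pairs)
```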

Note: The structure of the content model for the ruby element without scope information is identical with the structure of ruby in section 5.4 of [OpenDocument], and simple ruby markup as defined in section 1.2.1 in [Ruby-TR].

5 Modularizations of ITS with existing Markup Schemes

[Ed. note: This section will be written in a subsequent working draft.]

Two topics are to be covered in this section:

5.1 ITS and XHTML 1.0

5.2 ITS and DocBook

TODO

5.3 ITS and Open Document Format 1.0

TODO

5.4 ITS and DITA 1.0

TODO

6 Markup Declarations

This section is normative.

A data type data.scope is defined for scope. Its value is an XPath expression [XPath 1.0]. A data type data.itsBoolean is defined for boolean values, e.g. to express translatability.

data.scope
[1]   data.scope   ::=    text
data.itsBoolean
[2]   data.itsBoolean   ::=    "yes" | "no"

The attribute group att.datacats is used to express the ITS data categories. It makes use of the data type data.itsBoolean.

att.datacats
[3]   att.datacats.attributes   ::=    att.datacats.attribute.translate, att.datacats.attribute.locInfo, att.datacats.attribute.locInfoType, att.datacats.attribute.term, att.datacats.attribute.termRef, att.datacats.attribute.dir, att.datacats.attribute.bdo, att.datacats.attribute.rubyText
[4]   att.datacats.attribute.translate   ::=    attribute translate { data.itsBoolean }?
[5]   att.datacats.attribute.locInfo   ::=    attribute locInfo { text }?
[6]   att.datacats.attribute.locInfoType   ::=    attribute locInfoType { "description" | "alert" }?
[7]   att.datacats.attribute.term   ::=    attribute term { "yes" }?
[8]   att.datacats.attribute.termRef   ::=    attribute termRef { xsd:anyURI }?
[9]   att.datacats.attribute.dir   ::=    attribute dir { "ltr" | "rtl" }?
[10]   att.datacats.attribute.bdo   ::=    attribute bdo { "yes" }?
[11]   att.datacats.attribute.rubyText   ::=    attribute rubyText { text }?
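
The enumerated values in productions [2], [6], [7], [9] and [10] lend themselves to a simple value check. The Python sketch below mirrors those productions in a lookup table. The names ALLOWED_VALUES and invalid_attributes are hypothetical, and attributes with free text or URI content (locInfo, termRef, rubyText) are not checked.

```python
# Enumerated values copied from the productions above.
ALLOWED_VALUES = {
    "translate": {"yes", "no"},               # data.itsBoolean, production [2]
    "locInfoType": {"description", "alert"},  # production [6]
    "term": {"yes"},                          # production [7]
    "dir": {"ltr", "rtl"},                    # production [9]
    "bdo": {"yes"},                           # production [10]
}


def invalid_attributes(attributes):
    """Return the sorted names of ITS attributes whose value is not allowed."""
    return sorted(
        name
        for name, value in attributes.items()
        if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]
    )


sample = {"translate": "maybe", "dir": "rtl", "term": "no", "locInfo": "Any text"}
print(invalid_attributes(sample))
```

In practice this check is performed by validating against one of the ITS schemas; the table only makes the value constraints explicit.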

The attribute group att.scope is used to express scope for ITS data categories. It makes use of the data type data.scope. An overview of the relation between scope and ITS data categories is given in Section 4: Description of Data Categories.

att.scope
[12]   att.scope.attributes   ::=    att.scope.attribute.translateScope, att.scope.attribute.locInfoScope, att.scope.attribute.termScope, att.scope.attribute.dirScope, att.scope.attribute.rubyScope
[13]   att.scope.attribute.translateScope   ::=    attribute translateScope { data.scope }?
[14]   att.scope.attribute.locInfoScope   ::=    attribute locInfoScope { data.scope }?
[15]   att.scope.attribute.termScope   ::=    attribute termScope { data.scope }?
[16]   att.scope.attribute.dirScope   ::=    attribute dirScope { data.scope }?
[17]   att.scope.attribute.rubyScope   ::=    attribute rubyScope { data.scope }?
ruby
[18]   ruby   ::=    element ruby { ruby.content }
[19]   ruby.content   ::=    rubyBase, rubyText
rubyBase
[20]   rubyBase   ::=    element rubyBase { rubyBase.content }
[21]   rubyBase.content   ::=    text
rubyText
[22]   rubyText   ::=    element rubyText { rubyText.content }
[23]   rubyText.content   ::=    text

The schemaRules element contains rules for ITS information, to be used as schema annotation. The schemaRule element contains attributes from the ITS data categories.

schemaRules
[24]   schemaRules   ::=    element schemaRules { schemaRules.content }
[25]   schemaRules.content   ::=    schemaRule+
schemaRule
[26]   schemaRule   ::=    element schemaRule { schemaRule.content, schemaRule.attributes }
[27]   schemaRule.content   ::=    empty
[28]   schemaRule.attributes   ::=    att.datacats.attributes, empty

The documentRules element contains rules for ITS information, to be used for the dislocated expression of ITS information. The documentRule element contains attributes from the ITS data categories and the scope attributes.

documentRules
[29]   documentRules   ::=    element documentRules { documentRules.content }
[30]   documentRules.content   ::=    documentRule+
documentRule
[31]   documentRule   ::=    element documentRule { documentRule.content, documentRule.attributes }
[32]   documentRule.content   ::=    empty
[33]   documentRule.attributes   ::=    att.scope.attributes, att.datacats.attributes, empty

7 Conformance

This section is normative.

Conformance to ITS falls into two categories: conformance to the ITS data categories (cf. Section 4: Description of Data Categories) and conformance to Scope (cf. Section 3: Scope of ITS information).

7.1 Conformance to the ITS data categories

An implementation of the ITS data categories is conformant if it supplies a schema which adopts the ITS data categories:

  • The schema must allow the usage of the attribute group att.datacats at every element which is declared in the schema

  • The schema should allow the usage of the attribute group att.scope at every element which is declared in the schema

  • The schema should allow the usage of the documentRules and documentRule elements in at least one element in the schema

The schemaRules and schemaRule elements are to be used as schema annotations. It is the responsibility of the schema processor to allow for such annotations.

Example 36: A schema which is conformant to the ITS data categories
<xs:schema xmlns:myns="http://example.com/mySchema"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:its="http://www.w3.org/2005/11/its"
 targetNamespace="http://example.com/mySchema" elementFormDefault="qualified"
 attributeFormDefault="unqualified">
 <xs:import namespace="http://www.w3.org/2005/11/its" schemaLocation="its.xsd"/>
 <xs:element name="document">
  <xs:complexType>
   <xs:sequence>
     <xs:element ref="myns:head"/>
     <xs:element ref="myns:body"/>
   </xs:sequence>
   <xs:attributeGroup ref="myns:commonAtts"/>
  </xs:complexType>
 </xs:element>
 <xs:attributeGroup name="commonAtts">
  <xs:attributeGroup ref="its:att.datacats.attributes"/>
  <xs:attributeGroup ref="its:att.scope.attributes"/>
 </xs:attributeGroup>
 <xs:element name="head">
  <xs:complexType>
   <xs:choice minOccurs="0" maxOccurs="unbounded">
     <xs:element ref="its:documentRules"/>
     <xs:element ref="its:documentRule"/>
   </xs:choice>
   <xs:attributeGroup ref="myns:commonAtts"/>
  </xs:complexType>
 </xs:element>
 <xs:element name="body">
  <xs:complexType>
   <xs:sequence>
    <xs:element ref="myns:para" maxOccurs="unbounded"/>
   </xs:sequence>
   <xs:attributeGroup ref="myns:commonAtts"/>
  </xs:complexType>
 </xs:element>
 <xs:element name="para">
  <xs:complexType mixed="true">
   <xs:attributeGroup ref="myns:commonAtts"/>
  </xs:complexType>
 </xs:element>
</xs:schema>

7.2 Conformance to scope

Conformance to scope encompasses conformance to the ITS data categories, with the following changes:

A mandatory part of this conformance criterion is the usage of XPath. An application which processes ITS information must be able to process XPath version 1.0 or higher. It is not required to support a specific host language of XPath, such as [XSLT 1.0].

A Schemas for ITS

This section is informative.

The following schemas are provided:

B References

HTML 4.01
Dave Raggett, Arnaud Le Hors, Ian Jacobs, eds. HTML 4.01 Specification. W3C Recommendation 24 December 1999. Available at http://www.w3.org/TR/1999/REC-html401-19991224/. The latest version of HTML 4.01 is available at http://www.w3.org/TR/html401/.
RELAX NG
James Clark, Makoto Murata. RELAX NG Specification. OASIS Committee Specification 3 December 2001. Available at http://www.oasis-open.org/committees/relax-ng/spec-20011203.html. The latest version of RELAX NG is available at http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=relax-ng.
RFC 2119
S. Bradner. Key Words for use in RFCs to Indicate Requirement Levels. IETF RFC 2119. Available at http://www.ietf.org/rfc/rfc2119.txt.
XLink 1.1
Steve DeRose, Eve Maler, David Orchard, Norman Walsh. XML Linking Language (XLink) Version 1.1. W3C Working Draft 7 July 2005. Available at http://www.w3.org/TR/2005/WD-xlink11-20050707/. The latest version of XLink 1.1 is available at http://www.w3.org/TR/xlink11/.
XML 1.0
Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, et al., editors. Extensible Markup Language (XML) 1.0 (Third Edition), W3C Recommendation 04 February 2004. Available at http://www.w3.org/TR/2004/REC-xml-20040204/. The latest version of XML 1.0 is available at http://www.w3.org/TR/REC-xml/.
XML Names
Tim Bray, Dave Hollander, Andrew Layman. Namespaces in XML. W3C Recommendation 14 January 1999. Available at http://www.w3.org/TR/1999/REC-xml-names-19990114/. The latest version of XML Names is available at http://www.w3.org/TR/REC-xml-names/.
XML Schema
Henry S. Thompson, David Beech, Murray Maloney, Noah Mendelsohn. XML Schema Part 1: Structures Second Edition. W3C Recommendation 28 October 2004. Available at http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version of XML Schema is available at http://www.w3.org/TR/xmlschema-1/.
XPath 1.0
James Clark. XML Path Language (XPath) Version 1.0. W3C Recommendation 16 November 1999. Available at http://www.w3.org/TR/1999/REC-xpath-19991116. The latest version of XPath 1.0 is available at http://www.w3.org/TR/xpath.

C References (Non-Normative)

Dita 1.0
OASIS DITA Architectural Specification Committee Draft 01. Oasis Committee Specification 3 May 2005. Available at http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita.
DocBook
Norman Walsh and Leonard Muellner. DocBook: The Definitive Guide. Available at http://www.docbook.org/.
ITS REQ
Yves Savourel. Internationalization and Localization Markup Requirements. W3C Working Draft 5 August 2005. Available at http://www.w3.org/TR/2005/WD-itsreq-20050805/. The latest version of ITS REQ is available at http://www.w3.org/TR/itsreq/.
OpenDocument
Michael Brauer et al. OASIS Open Document Format for Office Applications (OpenDocument). Oasis Standard 1 May 2005. Available at http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office. The latest version of OpenDocument is available at http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office.
Ruby-TR
Marcin Sawicki (until 10 October, 1999), Michel Suignard, Masayasu Ishikawa (石川 雅康), Martin Dürst, Tex Texin. Ruby Annotation. W3C Recommendation 31 May 2001. Available at http://www.w3.org/TR/2001/REC-ruby-20010531/. The latest version of Ruby Annotation is available at http://www.w3.org/TR/ruby/.
TEI
Lou Burnard and Syd Bauman (eds). Text Encoding Initiative Guidelines development version (P5). TEI Consortium, Charlottesville, Virginia, USA, Text Encoding Initiative.
XHTML 1.0
Steven Pemberton et al. XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition). W3C Recommendation 26 January 2000, revised 1 August 2002. Available at http://www.w3.org/TR/2002/REC-xhtml1-20020801/. The latest version of XHTML 1.0 is available at http://www.w3.org/TR/xhtml1/.
XSLT 1.0
James Clark. XSL Transformations (XSLT) Version 1.0. W3C Recommendation 16 November 1999. Available at http://www.w3.org/TR/1999/REC-xslt-19991116. The latest version of XSLT 1.0 is available at http://www.w3.org/TR/xslt.
XUL
exTensible User Interface Language. Available at http://www.xulplanet.com/.

D Acknowledgements (Non-Normative)

This document has been developed with contributions by the ITS Working Group. At the date of publication, the members of the Working Group were: Damien Donlon (Sun Microsystems), Martin Dürst (Invited Expert), Richard Ishida (W3C), Masaki Itagaki (Invited Expert), Christian Lieske (SAP), Naoyuki Nomura (Ricoh), Sebastian Rahtz (Invited Expert), François Richard (HP), Goutam Saha (CDAC), Felix Sasaki (W3C), Yves Savourel (ENLASO), Dianne Stoick (Boeing), Najib Tounsi (Ecole Mohammadia d'Ingénieurs Rabat (EMI)) and Andrzej Zydroń (Invited Expert).

A special thanks goes to Sebastian Rahtz who introduced us to the ODD language, which was used to create this document, and who provided the stylesheets to generate schemas and the XHTML version (via xmlspec) out of an ODD document.