Best Practices for XML Internationalization

1 Introduction

This document is a complement to [ITS]. Not all internationalization-related issues can be solved with the special markup described in [ITS]; a number of problems can be avoided by correctly designing the XML format and by applying a few guidelines when designing and authoring documents. This document and [ITS] implement requirements formulated in [ITS REQ].

1.1 Who should use this document

This document is divided into two main sections:

The first one is intended for the designers and developers of XML applications.
The second is for XML content authors. This includes users who modify the original content, such as translators.

1.2 How to use this document

Designers and developers of XML applications should read Section 2: When Designing an XML Application. It provides a list of some of the important design choices they should make in order to ensure the internationalization of their format. The techniques are usually illustrated with examples for XML Schema, RELAX NG and XML DTD.

Users and authors of XML content should read Section 3: When Authoring XML Content where they can find a number of guidelines on how to create content with internationalization in mind. Many of these best practices do not require the XML format used to have been developed especially for internationalization.

Section 5: ITS Applied to Existing Formats provides a set of concrete examples of how to apply ITS to existing XML-based formats. This illustrates many of the guidelines in this document.

Each guideline is illustrated by one or more techniques (identified with a sequential number throughout the document).

2 When Designing an XML Application

Designers and developers of XML applications should take into account the following best practices:

Best Practice 1: Provide xml:lang to specify natural language content
Best Practice 2: Provide a way to specify text directionality
Best Practice 3: Avoid translatable attributes
Best Practice 4: Indicate the translatability of elements and attributes
Best Practice 5: Provide a way to override translatability information
Best Practice 6: Provide text segmentation-related information
Best Practice 7: Provide a way to specify ruby text
Best Practice 8: Provide a way to specify comments for translators
Best Practice 9: Provide a way to specify unique identifiers
Best Practice 10: Identify terminology-related elements
Best Practice 11: Provide a way to override terminology information
Best Practice 12: Use multilingual documents with caution
Best Practice 13: Name elements with caution
Best Practice 14: Provide ITS rules for your DTD or schema

Best Practice 1: Provide xml:lang to specify natural language content

Include xml:lang in your DTD or schema so that authors can specify the natural language of the content.

How to do this

Make sure the xml:lang attribute is available for the root element of your document, and for any element where a change of language may occur.

For details on how to add an attribute such as xml:lang to a DTD, an XSD schema, or a RELAX NG schema, see: Section 4.2: Adding an Attribute to an Existing DTD or Schema.

Note: The scope of the xml:lang attribute applies to both the attributes and the content of the element where it appears; therefore one cannot specify different languages for an attribute and the element content. ITS does not provide a remedy for this. Instead, it is recommended not to use attributes for translatable text.

Note: To specify a natural language value as data or meta-data about something external to the document, rather than the language of the content itself, use an attribute other than xml:lang (like hreflang in XHTML).

Example 1: Language information not applicable to content of element where it is used

In this example the XHTML hreflang attribute indicates that the target of the link is in German. The hreflang does not provide any information about the content of the element a.

<a xml:lang="en" href="german.html" hreflang="de">Click here for German</a>

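Note: Make sure that the definition of the xml:lang attribute allows for empty values. In a DTD, declare it as CDATA, not NMTOKEN, because NMTOKEN does not accept an empty value. In XML Schema, the built-in data type language does not accept the empty string by itself; reference the xml:lang declaration of the XML namespace schema, which combines language with the empty string, or define an equivalent union type.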
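
As a minimal sketch of the note above, the following document uses an internal DTD subset that declares xml:lang as CDATA, so that an empty value is valid. The element names myDoc, par and code are hypothetical and serve only as illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myDoc [
 <!ELEMENT myDoc (par+)>
 <!ELEMENT par (#PCDATA | code)*>
 <!ELEMENT code (#PCDATA)>
 <!-- CDATA rather than NMTOKEN, so that xml:lang="" is valid -->
 <!ATTLIST myDoc xml:lang CDATA #IMPLIED>
 <!ATTLIST par xml:lang CDATA #IMPLIED>
 <!ATTLIST code xml:lang CDATA #IMPLIED>
]>
<myDoc xml:lang="en">
 <par>Type <code xml:lang="">ls -l</code> to list the files.</par>
</myDoc>
```

Here the empty xml:lang on code overrides the en value inherited from myDoc without specifying another language.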

For existing DTD or schema:

If you are working with an existing DTD or schema where there is a way to specify content language that is not implemented using the xml:lang attribute (but still uses the same values as xml:lang), you should provide an ITS rules document where you use the its:langRule element to specify what attribute or element is used instead of xml:lang.

Example 2: Non-standard way of declaring language information

In this document the langcode element is used to specify the language of an entry.

<myRes>
 <messages>
  <msg id="1">
   <langcode>en</langcode>
   <text>Cannot find file.</text>
  </msg>
  <msg id="2">
   <langcode>fr</langcode>
   <text>Fichier non trouvé.</text>
  </msg>
 </messages>
</myRes>

Example 3: Associating non-standard language information to ITS

Use the following rule to specify that the langcode element holds the same values as the xml:lang attribute.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:langRule selector="//text[../langcode]" langPointer="../langcode"/>
</its:rules>

Why do this

It is not recommended to use your own attribute or element to specify the language of the content. The xml:lang attribute is supported by various XML technologies such as XPath and XSL (e.g. the lang() function). Using something different would diminish the interoperability of your documents and reduce your capability to take advantage of some XML applications.
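
For instance, a stylesheet can select content by language through the lang() function without knowing anything else about the format. The following XSLT 1.0 sketch assumes a resource file whose text elements carry xml:lang directly (an assumption; Example 2 as written uses langcode instead).

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <!-- Output the French text elements only, one per line -->
 <xsl:template match="/">
  <xsl:for-each select="//text[lang('fr')]">
   <xsl:value-of select="."/>
   <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

The test lang('fr') also matches sub-tags such as fr-CA, and it takes into account values of xml:lang inherited from ancestor elements. A custom language element or attribute would require format-specific logic instead.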

Resources:

Background information

Internationalization FAQ: xml:lang in XML document schemas.
http://www.w3.org/International/questions/qa-when-xmllang
Mechanisms for declaring language in HTML
http://www.w3.org/TR/i18n-html-tech-lang/#ri20050208.095812479

Reference links

Description of the language identification mechanism in the XML specification.
http://www.w3.org/TR/REC-xml/#sec-lang-tag
The "Language Information" data category in ITS.
http://www.w3.org/TR/its/#language-information

Best Practice 2: Provide a way to specify text directionality

Include its:dir in your DTD or schema so that authors can specify text directionality.

How to do this

Make sure the its:dir attribute is available for the root element of your document and for all elements with content that may be rendered.

For details on how to add an attribute such as its:dir to a DTD, an XSD schema, or a RELAX NG schema, see: Section 4.2: Adding an Attribute to an Existing DTD or Schema.

The its:dir attribute is part of the ITS "Directionality" data category which allows the user to specify the base writing direction of blocks, embeddings and overrides for the Unicode bidirectional algorithm.
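
A minimal sketch of local usage, based on the sentence of Example 4 but using its:dir instead of a custom attribute. It assumes that the schema allows its:dir on quote and its:version on the root element; as we understand ITS 1.0, its:version is expected on the root element when the document contains no its:rules element.

```xml
<text xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0" xml:lang="en">
 <body>
  <!-- its:dir="rtl" creates a right-to-left embedding for the Hebrew title -->
  <par>In Hebrew, the title <quote xml:lang="he" its:dir="rtl">פעילות הבינאום, W3C</quote>
     means <quote>Internationalization Activity, W3C</quote>.</par>
 </body>
</text>
```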

For existing DTD or schema:

If you are working with an existing DTD or schema where there is a way to specify text directionality that is not implemented using the its:dir attribute, you should provide an ITS rules document where you use the its:dirRule element to associate the different directionality indicators with their equivalent in ITS.

Example 4: Specifying text directionality with non-ITS markup

In this document the textdir attribute is used to specify directionality of a text run.

<text xml:lang="en">
 <body>
  <par>In Hebrew, the title <quote xml:lang="he" textdir="r2l">פעילות הבינאום, W3C</quote>
     means <quote>Internationalization Activity, W3C</quote>.</par>
 </body>
</text>

[Ed. note: TODO: Update the XSLT template to convert example to correct bidi display]

Example 5: Associating non-ITS text directionality information with ITS

Use the following rule to specify the relationships between the textdir attribute of the format and the ITS "Directionality" data category.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:dirRule selector="//*[@textdir='l2r']" dir="ltr"/>
 <its:dirRule selector="//*[@textdir='r2l']" dir="rtl"/>
 <its:dirRule selector="//*[@textdir='lro']" dir="lro"/>
 <its:dirRule selector="//*[@textdir='rlo']" dir="rlo"/>
</its:rules>

Why do this

Generally, the Unicode bidirectional algorithm orders mixed-script text appropriately when it contains scripts such as Arabic and Hebrew. Sometimes, however, additional help is needed. For instance, in the sentence of Example 6 the 'W3C' and the comma should appear to the left side of the quotation. This cannot be achieved using the bidirectional algorithm alone.

Example 6: Sentence where bidirectional markup is needed for a proper display

The title says "פעילות הבינאום, W3C" in Hebrew.

The desired effect can be achieved using Unicode control characters, but this is not recommended (see: [Unicode in XML]). Markup is needed to establish the default directionality of a document, and to change that where appropriate by creating nested embedding levels.

Markup can also be used to disable the effects of the bidirectional algorithm for a specified range of text.

Resources:

Background information

Internationalization FAQ: What you need to know about the bidi algorithm and inline markup
http://www.w3.org/International/articles/inline-bidi-markup/
Authoring Techniques for XHTML & HTML Internationalization: Handling Bidirectional Text 1.0
http://www.w3.org/TR/i18n-html-tech-bidi/#ri20030728.094313871
Unicode Technical Report #20: Unicode in XML and other Markup Languages
http://www.w3.org/TR/unicode-xml/

Reference links

The "Directionality" data category in ITS.
http://www.w3.org/TR/its/#directionality

Best Practice 3: Avoid translatable attributes

Do not put translatable text in attributes.

How to do this

Make sure all translatable text is stored as element content, not as attribute values.

For example, do not allow this:

Example 7: Bad design

The alt attribute contains translatable text.

<image src="elephants.png" alt="Elephants bathing in the Zambezi River."/>

Instead, design for this:

Example 8: Better design

There is no longer a translatable attribute.

<image src="elephants.png">Elephants bathing in the Zambezi River.</image>

For existing DTD or schema:

If you are working with a DTD or a schema where there are attributes with translatable values, you should provide an ITS rules document where you use the its:translateRule element to specify which attributes are translatable. See Best Practice 4: Indicate the translatability of elements and attributes for more information on how to do this.
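
For the image element of Example 7, such a rule could look as follows. This is a sketch; the selector assumes that alt is the only translatable attribute of image.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- Attributes are non-translatable by default; flag alt as an exception -->
 <its:translateRule selector="//image/@alt" translate="yes"/>
</its:rules>
```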

Why do this

There are a number of issues related to storing translatable text in attribute values. Some of them are:

The language identification mechanism (i.e. xml:lang) applies to the content of the element where it is declared, including its attribute values. If the text of an attribute is in a different language than the text of the element content, one cannot set the language for both correctly.
In some languages, bidirectional markers may be needed to provide a correct display. Normally, those markers are elements, but elements cannot be used within an attribute value. One can use Unicode control characters instead, but this is not recommended (see: [Unicode in XML]).
It is difficult to apply meta-information, such as no-translate flags or designer's notes, to the text of an attribute value.
The difficulty of attaching unique identifiers to translatable attribute text makes it more complicated to use ID-based leveraging tools.
Translatable attributes can create problems when they are prepared for localization because they can occur within the content of a translatable element, breaking it into different parts, and possibly altering the sentence structure.

All these potential problems are less likely to occur when the text is the content of an element rather than the value of an attribute.

Note: In many cases, moving translatable text from an attribute value to element content results in one sentence being embedded within another. For instance, in Example 8 the description of the image will be embedded inside the text of the paragraph where the image is. In such cases, do not forget to declare the relevant element (here image) as 'nested', as described in Best Practice 6: Provide text segmentation-related information.
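
For the design of Example 8, that declaration could be sketched as follows, assuming image is the only element that needs it:

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- The image description is its own run of text inside the paragraph -->
 <its:withinTextRule selector="//image" withinText="nested"/>
</its:rules>
```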

Resources:

Reference links

The "Translate" data category in ITS.
http://www.w3.org/TR/its/#trans-datacat
The "Element Within Text" data category in ITS.
http://www.w3.org/TR/its/#elements-within-text

Best Practice 4: Indicate the translatability of elements and attributes

Define whether elements and attributes are translatable.

How to do this

You should provide an ITS rules document where you use its:translateRule elements to indicate which elements have non-translatable content.

Note: If needed, make provisions for the case where the content of an element is flagged with xml:lang="zxx", where zxx indicates content that is not in any language and is therefore most likely not translatable.

If you are working with a DTD or a schema where there are translatable attributes (something that is not recommended), you should also use its:translateRule to specify these translatable attributes.

Example 9: Document where default ITS "Translate" rules do not apply

In the following document, the content of the head element should not be translated, and the value of the alt attribute should be translated. In addition, the content of the del element should not be translated.

<myDoc xml:lang='en'>
 <head>
  <id xml:lang="zxx">H4-A3-F8-A1</id>
  <author>Page Harrison</author>
  <rev>v13 July-27-2005</rev>
 </head>
 <par>To start click <ins>the <ui>Start</ui>
  button</ins><del>this icon: <ref file='start.png' alt='Start icon'/></del>
  and fill the form.</par>
</myDoc>

Example 10: Overriding default translatability rules

The following rules specify exceptions from the default ITS behavior for the document in Example 9:

Rule 1: Indicates that the content of head in myDoc is not translatable. By inheritance, the child elements of head are also assumed not translatable.
Rule 2: Indicates that all the alt attributes are translatable.
Rule 3: Indicates that the content of del is not translatable.
Rule 4: Indicates that the non-translatability of del applies also to any attribute that may have been set as translatable by a prior rule (i.e. the second rule).
Rule 5: Indicates that any element or attribute with their language set to zxx is not translatable.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:translateRule selector="/myDoc/head" translate="no"/>
 <its:translateRule selector="//*/@alt" translate="yes" /> 
 <its:translateRule selector="//del" translate="no" />
 <its:translateRule selector="//@*[ancestor::del]" translate="no"/>
 <its:translateRule selector="//*[lang('zxx')] | //@*[lang('zxx')]" translate="no" />
</its:rules>

Why do this

By default, ITS assumes that the content of all elements is translatable and that all attributes have non-translatable values. If your XML document type does not correspond to these default assumptions, it is important to indicate what the exceptions are in order to improve translation throughput.

Resources:

Reference links

The "Translate" data category in ITS.
http://www.w3.org/TR/its/#trans-datacat

Best Practice 5: Provide a way to override translatability information

Include its:translate and its:rules in your DTD or schema to allow authors to override translatability information.

How to do this

Make sure the its:translate attribute is available for the root element of your documents, and for any element that has text content.

For details on how to add an attribute such as its:translate to a DTD, an XSD schema, or a RELAX NG schema, see: Section 4.2: Adding an Attribute to an Existing DTD or Schema.

Also make sure the its:rules element is available somewhere in your documents, for example in the header part if there is one. The its:rules element provides access to the its:translateRule element, which can be used to change the translatability property of elements and attributes at the document level.

The its:translate attribute and the its:translateRule element are part of the ITS "Translate" data category which expresses information about whether the content of an element or attribute should be translated or not.
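
A minimal sketch of both mechanisms in one document. The element names myDoc, head, par, quote and code are hypothetical, and the schema is assumed to allow its:rules in head and its:translate on inline elements.

```xml
<myDoc xmlns:its="http://www.w3.org/2005/11/its" xml:lang="en">
 <head>
  <!-- Document-level rule: no code element is translatable -->
  <its:rules version="1.0">
   <its:translateRule selector="//code" translate="no"/>
  </its:rules>
 </head>
 <!-- Local override: this one quote must stay in Latin -->
 <par>The motto <quote its:translate="no" xml:lang="la">Veni, vidi, vici</quote>
  is printed next to the <code>logo.svg</code> image.</par>
</myDoc>
```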

For existing DTD or schema:

If you are working with a DTD or a schema where there is a way to override translatability information that is not its:translate, the authors of the documents should use it. In addition, you should provide an ITS rules document where you use the its:translateRule element to associate this mechanism with the ITS Translate data category.

For example, [DITA 1.0] offers a translate attribute, and [Glade] provides a translatable attribute. Both have the same semantics as its:translate.
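
As a sketch of what such rules might look like for a format with a Glade-style translatable attribute, assuming that only elements carrying translatable="yes" hold translatable text (an assumption to check against the actual format):

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- Reverse the ITS default: nothing is translatable... -->
 <its:translateRule selector="//*" translate="no"/>
 <!-- ...except elements explicitly flagged by the format -->
 <its:translateRule selector="//*[@translatable='yes']" translate="yes"/>
</its:rules>
```

The second rule is listed last so that it takes precedence over the first one for the elements that both rules select.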

Example 11: DITA translation information

The following rules indicate how to associate the DITA translate attribute with the ITS Translate data category. The order in which the rules are listed is important:

Rule 1: Indicates that the content of any element with a translate attribute set to no is not translatable.
Rule 2: Indicates that any attribute value of any element with a translate attribute set to no is not translatable. This is needed because some attributes are translatable in DITA and we need to make sure they are not translated when translate="no" is used.
Rule 3: Indicates that the content of any element with a translate attribute set to yes is translatable. This takes care of the cases where translate="yes" is used to override a prior translate="no".

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:translateRule selector="//*[@translate='no']" translate="no"/>
 <its:translateRule selector="//*[@translate='no']/descendant-or-self::*/@*"
  translate="no"/>
 <its:translateRule selector="//*[@translate='yes']" translate="yes"/>
</its:rules>

You can find a more complete example of how DITA markup is associated with ITS in Section 5.4.2: Relating ITS to Existing Markup in DITA.

Why do this

In some cases, the author of a document may need to change the translatability property on parts of the content, overriding defaults or more general rules.

Resources:

Reference links

The "Translate" data category in ITS.
http://www.w3.org/TR/its/#trans-datacat

Best Practice 6: Provide text segmentation-related information

Define how elements in mixed content should be handled with regard to segmentation.

How to do this

Provide an ITS rules document where you use the its:withinTextRule element to indicate which elements should be treated as part of their parent, or as a nested independent run of text. By default, elements are assumed to be non-nested independent runs of text.

The its:withinTextRule element is part of the ITS "Element Within Text" data category which reveals if and how an element affects the way text content behaves from a linguistic viewpoint.

Example 12: A DITA document with formatting and footnote elements.

In the following DITA document:

The elements term and b should be treated as parts of their parents.
The element fn should be treated as a nested, independent run of text.

<concept id="myConcept" xml:lang="en-us">
 <title>Types of horse</title>
 <conbody>
  <ol>
   <li>Palouse horse:<p><term>Palouse horses</term><fn>A palouse horse is the same as
    an <b>Appaloosa</b>.</fn> have spotted coats.
    The <term>Nez-Perce</term> Indians have been key in breeding this
    type of horse.</p></li>
  </ol>
 </conbody>
</concept>

Example 13: ITS rules to specify some elements as "within text" and "nested".

The its:withinTextRule element is used to specify the behavior of term and b (within text), as well as fn (nested). Any case not listed is assumed to have the value its:withinText="no".

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:withinTextRule selector="//term | //b" withinText="yes"/>
 <its:withinTextRule selector="//fn" withinText="nested"/>
</its:rules>

Applied to the document in Example 12, these rules result in four distinct runs of text:

title: "Types of horse"
li: "Palouse horse:"
p: "{term}Palouse horses{/term}{fn/} have spotted coats. The {term}Nez-Perce{/term} Indians have been key in breeding this type of horse."
fn: "A palouse horse is the same as an {b}Appaloosa{/b}."

Why do this

Many applications that process content for linguistic-related tasks need to be able to perform a basic segmentation of the text content. They need to be able to do this without knowing about the semantics of the elements.

While in many cases it is possible to automatically detect mixed content, there are some occurrences where the structure of an element makes it impossible for tools to know for sure how to treat text. For example, the li element in XHTML can contain text as well as p elements.

Resources:

Reference links

The "Element Within Text" data category in ITS.
http://www.w3.org/TR/its/#elements-within-text

Best Practice 7: Provide a way to specify ruby text

Include its:ruby in your DTD or schema to allow for ruby text.

How to do this

Make sure the its:ruby element is available in all elements where there is text.

[Ed. note: YS: Not sure if Ruby is as important as other BP, do we need to say it?]

The its:ruby element is part of the ITS "Ruby" data category which is used for a run of text that is associated with another run of text, referred to as the base text. Ruby text is used to provide a short annotation of the associated base text. It is most often used to provide a reading (pronunciation) guide.
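
A minimal sketch of its:ruby used directly in a document. The elements text and para are hypothetical, and the schema is assumed to allow its:ruby inside para and its:version on the root element.

```xml
<text xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0" xml:lang="ja">
 <para>この本は <its:ruby>
   <!-- rb: base text; rt: its reading; rp: fallback parentheses -->
   <its:rb>慶応義塾大学</its:rb>
   <its:rp>(</its:rp>
   <its:rt>けいおうぎじゅくだいがく</its:rt>
   <its:rp>)</its:rp>
  </its:ruby>の歴史を説明するものです。</para>
</text>
```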

For existing DTD or schema:

If you are working with an existing DTD or schema where there is a way to specify ruby text that is not implemented using the its:ruby element, you should provide an ITS rules document where you use the its:rubyRule element to associate your ruby markup with its equivalent in ITS.

Example 14: Document with ruby-like elements.

In this document the rubyBlock element is similar to its:ruby, rBase is similar to its:rb, rParen is similar to its:rp, and rText is similar to its:rt.

<text>
 <para>この本は <rubyBlock>
  <rBase>慶応義塾大学</rBase>
  <rParen>(</rParen>
  <rText>けいおうぎじゅくだいがく</rText>
  <rParen>)</rParen>
 </rubyBlock>の歴史を説明するものです。</para>
</text>

Example 15: Association between the ITS "Ruby" data category and equivalent elements

This its:rubyRule element indicates that the rBase element is similar to its:rb and that the elements its:ruby, its:rp and its:rt have corresponding elements as well.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:rubyRule selector="//rBase" rubyPointer=".."
  rpPointer="../rParen" rtPointer="../rText" />
</its:rules>

Note: [Ed. note: TODO: note about need to have ruby-like element set as withinText and nested (for rt equivalent).]

Why do this

Ruby provides markup for phonetic or semantic annotation of text, as is common in Far Eastern scripts for Japanese and Chinese. (Ruby is known as furigana in Japan.)

[Ed. note: TODO: Need more info]

Resources:

Best Practice 8: Provide a way to specify comments for translators

Include its:locNote, its:locNoteType, and its:locNoteRef in your DTD or schema to allow authors to provide translation-related notes and instructions.

How to do this

Make sure the attributes its:locNote, its:locNoteType, and its:locNoteRef are available in your DTD or schema.

Also make sure that the its:rules element is available somewhere in your documents, for example in the header part if there is one. The its:rules element provides access to the its:locNoteRule element, which can be used to specify translation-related notes and instructions at a more general level.
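
A minimal sketch of a note attached locally with its:locNote and its:locNoteType. The messages, msg and text elements are hypothetical, and the schema is assumed to allow the ITS attributes on text and its:version on the root element.

```xml
<messages xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0" xml:lang="en">
 <msg id="ERR_NOFILE">
  <!-- The note travels with the string it describes -->
  <text its:locNote="The variable {0} is the name of a file."
        its:locNoteType="description">The file '{0}' could not be found.</text>
 </msg>
</messages>
```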

For existing DTD or schema:

If you are working with an existing DTD or schema where there is a way to provide notes to the localizers that is not implemented using ITS, you should provide an ITS rules document where you use the its:locNoteRule element to associate your localization note markup with its equivalent in ITS.

Example 16: Document with custom localization notes

In this document the comment element is a note for its sibling text element.

<messages>
 <msg id="ERR_NOFILE">
  <text>The file '{0}' could not be found.</text>
  <comment>The variable {0} is the name of a file.</comment> 
 </msg>
</messages>

Example 17: Association between the ITS "Localization Note" data category and equivalent elements

The its:locNoteRule element specifies that the text elements have an associated localization description in their sibling comment elements.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:locNoteRule selector="//msg/text" locNoteType="description"
  locNotePointer="../comment"/>
</its:rules>

Why do this

To help the translator achieve a correct translation, authors may need to provide information about the text that they have written. For example, the author may want to:

tell the translator how to translate part of the content
expand on the meaning or contextual usage of a particular element, such as what a variable refers to or how a string will be used on the UI
clarify ambiguity and show relationships between items sufficiently to allow correct translation (e.g. in many languages it is impossible to translate the word 'enabled' in isolation without knowing the gender, number and case of the thing it refers to.)
explain why text is not translated, point to text reuse, or describe the use of conditional text
indicate why a piece of text is emphasized (important, sarcastic, etc.)

Resources:

Reference links

The "Localization Note" data category in ITS.
http://www.w3.org/TR/its/#locNote-datacat

Best Practice 9: Provide a way to specify unique identifiers

Provide a way to assign a unique identifier to translatable text.

How to do this

Make sure the attribute xml:id, or an equivalent attribute, is available, at least at the "paragraph" level, for the elements that contain translatable text.
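
A minimal sketch of what this looks like in a document; the element names and identifier values are hypothetical.

```xml
<myDoc xml:lang="en">
 <section xml:id="install">
  <!-- Each translatable block carries its own persistent identifier -->
  <title xml:id="install-title">Installing the software</title>
  <par xml:id="install-p1">Insert the disc and follow the instructions.</par>
  <par xml:id="install-p2">Restart the computer when prompted.</par>
 </section>
</myDoc>
```

The identifiers are only useful for reuse if they stay the same from one version of the document to the next.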

Why do this

In order to most effectively reuse translated text where content is reused (either across update versions or across deliverables) it is necessary to have a unique and persistent identifier associated with the element.

This identifier allows the translation tools to correctly track an item from one version or location to the next. After one is sure that this is the same item, the content can be examined for changes, and if no change has taken place the potential for reuse of the previous translation is very high.

Change analysis constitutes an extremely powerful productivity tool for translation when compared to the typical source matching (a.k.a. translation memory) techniques, which simply look for similar source text in the database without, most of the time, being able to tell whether the context of its use is the same.

Resources:

Reference links

W3C Recommendation: xml:id
http://www.w3.org/TR/xml-id/

Best Practice 10: Identify terminology-related elements

Define which elements are related to terminology information.

How to do this

You should provide an ITS rules document where you use its:termRule elements to indicate which elements are "terms" and where the information related to them (e.g. definitions) can be found.

Example 18: Document with terminology-related markup

In this document, the elements term, syn, and dt denote terms. In addition, they can all have associated information.

<myDoc>
 <body>
  <p>A <term def="d001">doppelgänger</term> is basically <def xml:id="d001">the
  counterpart of a person</def>. It is almost the same as an 
  <syn ref="#alterego">alter ego</syn>, but with a more sinister connotation. 
  Sometimes the word "fetch" is also used.</p>
 </body>
 <definitions>
  <entry xml:id="alterego">
   <dt>alter ego</dt>
   <dd>A second self. Figurative sense: trusted friend.</dd>
   <origin>Latin, literally: "second I"</origin>
  </entry>
 </definitions>
</myDoc>

Example 19: ITS rules specifying terminology-related elements

This set of ITS rules indicates the following:

Rule 1: Indicates that the term element is a term and its associated information can be accessed in the node that has the ID corresponding to the value in its def attribute.
Rule 2: Indicates that the syn element is a term and its ref attribute contains a URI location where some associated information can be found.
Rule 3: Indicates that the dt element is a term and its associated information is in its sibling element dd.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <its:termRule selector="//term" term="yes" termInfoPointer="id(@def)"/>
 <its:termRule selector="//syn" term="yes" termInfoRefPointer="@ref"/>
 <its:termRule selector="//dt" term="yes" termInfoPointer="../dd"/>
</its:rules>

Why do this

The capability of specifying terms within the source content is important for terminology management that is beneficial to translation and localization quality. Term identification also facilitates the creation of glossaries and allows validation of terminology usage in the source and target documents.

Identified terms could be used for indexing that may require some language specific information. For example, Japanese words are sorted not by script characters, but by phonetic characters. Therefore when a Japanese index item is created, it should be accompanied with a phonetic string, called Yomigana.

As a result, terms may require various attributes, such as part of speech, gender, number, term types, definitions, notes on usage, etc. To avoid repeating such a large amount of attribute data within a document, it should be possible for identified terms to link to externalized attribute data, such as glossary documents and terminology databases.

Resources:

Reference links

The "Terminology" data category in ITS.
http://www.w3.org/TR/its/#terminology

Best Practice 11: Provide a way to override terminology information

Include its:term, its:termInfoRef, and its:rules in your DTD or schema to allow authors to override terminology-related information.

How to do this

Make sure the its:term and the its:termInfoRef attributes are available for any element that has text content. [Ed. note: Not sure about this: Shouldn't it apply only to elements that are defined as term?]

Also make sure the its:rules element is available somewhere in your documents, for example in the header part if there is one. The its:rules element provides access to the its:termRule element, which can be used to change the terminology-related information of attributes.
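
A minimal sketch of local overrides with its:term and its:termInfoRef. It assumes a hypothetical format in which a general rule already declares every term element to be a term; the element names and the glossary URI are illustrative only.

```xml
<myDoc xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0" xml:lang="en">
 <!-- q is not a term by default: the author flags this occurrence as one -->
 <par>A <q its:term="yes"
   its:termInfoRef="http://example.org/glossary#fetch">fetch</q> is the
  apparition of a living person.</par>
 <!-- term is a term by default: the author switches it off here -->
 <par>Here <term its:term="no">Start</term> is a button label, not a term.</par>
</myDoc>
```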

[Ed. note: TODO]

Example 20:

[Ed. note: TODO]

Why do this

In some cases, the author of a document may need to change the information indicating what is a term or how to point to term information, overriding more general rules that have been defined for the DTD or schema.

Resources:

Reference links

The "Terminology" data category in ITS.
http://www.w3.org/TR/its/#terminology

Best Practice 12: Use multilingual documents with caution

[Ed. note: TODO]

Example 21:

[Ed. note: TODO]

Best Practice 13: Name elements with caution

Use a meaningful naming scheme for your elements.

If possible, avoid element names that reflect the ID of the element.

Example 22: [Ed. note: TODO]

[Ed. note: TODO]

<strings>
 <INPUTPATH>Input path:</INPUTPATH>
 <HELP>Help</HELP>
 <OK>OK</OK>
 <CANCEL>Cancel</CANCEL>
</strings>

Instead, [Ed. note: TODO]

Example 23: [Ed. note: TODO]

[Ed. note: TODO]

<strings>
 <str xml:id="INPUTPATH">Input path:</str>
 <str xml:id="HELP">Help</str>
 <str xml:id="OK">OK</str>
 <str xml:id="CANCEL">Cancel</str>
</strings>

Best Practice 14: Provide ITS rules for your DTD or schema

Provide all the ITS rules needed to process documents in your format.

[Ed. note: TODO]

Provide these rules in a single standalone ITS document. ITS-aware tools will be able to associate it with the documents it pertains to using their own mechanism, or the authors of the documents will be able to use the ITS linking mechanism to point to it.

Your ITS rules document should include the following information, when applicable:

What part of your markup has translatability rules different from the defaults (See: Best Practice 4: Indicate the translatability of elements and attributes).
The list of elements that should be treated as "nested" or "within text" from a segmentation viewpoint (See: Best Practice 6: Provide text segmentation-related information).
What part of your markup denotes terms and information related to them (See: Best Practice 10: Identify terminology-related elements).
What part of your markup holds notes for the localizers or the translators (See: Best Practice 8: Provide a way to specify comments for translators).
The correspondence between any proprietary mechanism you have to specify the language of content and xml:lang (See: Best Practice 1: Provide xml:lang to specify natural language content).
The correspondence between any proprietary mechanism you have to override translatability information and the ITS equivalent (See: Best Practice 5: Provide a way to override translatability information).
The correspondence between any proprietary mechanism you have to indicate text directionality and its:dir (See: Best Practice 2: Provide a way to specify text directionality).
The correspondence between any proprietary mechanism you have to mark up ruby text and its:ruby (See: Best Practice 7: Provide a way to specify ruby text).
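
The items above can be sketched as a standalone ITS rules document. This is only an illustration under stated assumptions: the element and attribute names selected here (code, b, term, msg, comment, language, direction) belong to a hypothetical format and are not part of ITS; only the its:* elements and their attributes come from [ITS].

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- Translatability that differs from the defaults (hypothetical code element) -->
 <its:translateRule selector="//code" translate="no"/>
 <!-- Elements to be treated as part of the text flow for segmentation -->
 <its:withinTextRule selector="//b" withinText="yes"/>
 <its:withinTextRule selector="//code" withinText="yes"/>
 <!-- Markup that denotes terms -->
 <its:termRule selector="//term" term="yes"/>
 <!-- Notes for localizers, held in a hypothetical comment attribute -->
 <its:locNoteRule selector="//msg[@comment]" locNoteType="description"
  locNotePointer="@comment"/>
 <!-- Proprietary language attribute mapped to the equivalent of xml:lang -->
 <its:langRule selector="//*[@language]" langPointer="@language"/>
 <!-- Proprietary directionality attribute mapped to the ITS equivalent -->
 <its:dirRule selector="//*[@direction='rtl']" dir="rtl"/>
</its:rules>
```

The rules are ordered from the most general to the most specific, so that a later rule can override an earlier one for the nodes it selects.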

Some examples of ITS rules documents for existing XML formats are shown in Section 5: ITS Applied to Existing Formats.

Resources:

Reference links

W3C Proposed Recommendation: Internationalization Tag Set (ITS)
http://www.w3.org/TR/its/

3 When Authoring XML Content

Authors of XML content should consider the following best practices:

Best Practice 15: Specify the language of the content
Best Practice 16: Specify text directionality if needed
Best Practice 17: Override translatability information if needed
Best Practice 18: Assign unique identifiers to text items when possible
Best Practice 19: Use CDATA sections with caution
Best Practice 20: Provide comments for translators
Best Practice 21: Ensure any inserted text is context-independent
Best Practice 22: Use entity references with caution
Best Practice 23: Place sub-flow elements with caution

A number of these practices can be followed only when the XML application has been internationalized properly using the design guidelines in Section 2: When Designing an XML Application.

Best Practice 15: Specify the language of the content

Make sure to indicate the language for all elements and attributes of your document.

How to do this

Your DTD or schema should provide the xml:lang attribute for this purpose. See: Best Practice 1: Provide xml:lang to specify natural language content for more information.

Use this recommended attribute on the root element and, if needed, on each element whose content is in a different language. Elements without a declaration inherit the language information from their parents.

Make sure that the value of xml:lang conforms to BCP 47.

Example 24: Declaring language information

In this example, the main content of the document is in English, while a short citation is identified as being in French.

<document xml:lang="en">
 <para>The motto of Québec is the short phrase:
  <q xml:lang="fr">Je me souviens</q>. It is chiseled on 
  the front of the Parliament Building.</para>
</document>

Why do this

Knowing the language of the content is very important in many situations. Some of them are:

selection of a proper font (e.g. for traditional or simplified Chinese)
processing of the text for wrapping and hyphenation
providing spell-checking or grammar verification of the text
selecting proper formatting properties for data such as date, time, numbers, etc.
selecting proper automated text such as quotation marks or other punctuation marks
using the text with voice browsers

Resources:

Background information

Internationalization FAQ: xml:lang in XML document schemas.
http://www.w3.org/International/questions/qa-when-xmllang

Reference links

The values to use with xml:lang to specify a language.
http://www.rfc-editor.org/rfc/bcp/bcp47.txt
Description of the language identification mechanism in the XML specification.
http://www.w3.org/TR/REC-xml/#sec-lang-tag
Language tags in HTML and XML.
http://www.w3.org/International/articles/language-tags/
Tagging text with no language.
http://www.w3.org/International/questions/qa-no-language

Test data

I18N Tests: Automatic font assignment for CJK text (for XHTML).
http://www.w3.org/International/tests/sec-cjk-fonts.html

Best Practice 16: Specify text directionality if needed

[Ed. note: TODO]

Best Practice 17: Override translatability information if needed

[Ed. note: TODO]

How to do this

[Ed. note: TODO]

Overriding translatability information relates to marking up paragraphs or sections of text that should remain untranslated, but are enclosed in XML elements that are normally translatable.

Example 25: [Ed. note: TODO]

[Ed. note: TODO]

Example 26: Overriding default translation rules

In the following document, the content of the par elements is normally translatable, but in this instance, the last one should remain in English. Declaring its:translate as an optional attribute of the par element allows the author to set the given paragraph as not translatable.

<myDoc xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0">
 <par>To apply these terms to your library, attach the following notice.
  It is safest to attach it to the start of each source file to most 
  effectively convey the exclusion of warranty; and each file should 
  have at least the "copyright" line and a pointer to where the full 
  notice is found.</par>
 <par>The notice should read (preferably in English):</par>
 <par its:translate="no">This library is free software; you can 
  redistribute it and/or modify it under the terms of the GNU Lesser 
  General Public License as published by the Free Software Foundation; 
  either version 2.1 of the License, or (at your option) any later 
  version. This software is distributed as open source under LGPL.</par>
</myDoc>

Note: Authors should NOT use its:translate to tag single words or terms that (they think) should remain the same as the source language when translated into a given target language (e.g. loan-words). This type of decision is made during translation using terminology lookup tools, and does not involve any specific tagging. Authors may decide what is translatable, but not how to translate it.

Do NOT do the following:

Example 27: XML document with inappropriate usage of its:translate.

In this document its:translate is used to mark up a proper name and two loan words in an attempt to indicate what should not be translated. You should NOT do this.

<book xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <body>
  <p>Everything started when <span its:translate="no">Zebulon</span> 
  discovered that he had a <span its:translate="no">doppelgänger</span> 
  who was a serious baseball <span its:translate="no">aficionado</span>.</p>
 </body>
</book>

Why do this

[Ed. note: TODO]

Best Practice 18: Assign unique identifiers to text items when possible

[Ed. note: TODO]

Best Practice 19: Use CDATA sections with caution

[Ed. note: TODO]

Best Practice 20: Provide comments for translators

[Ed. note: TODO]

Best Practice 21: Ensure any inserted text is context-independent

Make sure any piece of inserted text is grammatically independent of its surrounding context.

How to do this

Use inserted text only when the text is self-contained and does not affect its surrounding context. Error messages and quotations are examples of inserted text that usually does not cause problems.

Avoid using inserted text that affects, or depends on, the context where it is inserted.

Why do this

If not used properly, inserted text can cause important (and sometimes unresolvable) problems during localization.

Inserted text refers to any text that is marked by a placeholder in the XML document and automatically inserted into text content when the document is processed. The nature of such text can be for example:

boilerplate text reused in different contexts,
various parts of a compound document put together,
or variable values computed at some point during the processing of the document.

Such text can be implemented in different ways in XML. Some of them are:

Using entity references.
Using [XInclude 1.0] mechanisms.
Using [XLink 1.0] mechanisms.
Using a custom mechanism specific to a given format (e.g. the conref attribute in [DITA 1.0]).

There are several important issues related to inserted text. Consider the following:

Example 28: Using conref in DITA

In this example, the author, working with [DITA 1.0], decided to reference the standard terms she uses and has at her disposal in a termbase by using the conref mechanism. In this instance, the term t123 has the value "hydraulic lift".

<p>Using an <term conref="termbase#t123"/> raise the vehicle from the ground.</p>

At first glance this seems to work fine in English. However, such a construction has several problems:

You do not want to separate the article from the noun. If "hydraulic lift" is modified in the future and replaced by some other term, it may require the article 'a' instead of 'an'.
The article/noun separation also causes trouble for the translator: without any easy way to see the actual term when translating the paragraph, she may not be able to decide the gender of the article.
If it is used at the beginning of a sentence, the term would need to be capitalized.
The term is singular in the termbase, while it may need to be plural somewhere in the document.
In inflected languages the form required in the text may be different from the form stored in the termbase. For example, in Polish the term would be stored in its nominative form ("dźwignia hydrauliczna"), while it should be in its instrumental form once inserted in this context: "Używając [dźwignię hydrauliczną] podnieś pojazd z ziemi."
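
One way to avoid these problems, sketched here under assumptions, is to write the term out in the sentence and to reserve the insertion mechanism for text that is complete in itself, such as a whole note. The file name warnings.dita and the IDs used in the conref value are hypothetical.

```xml
<body>
 <!-- The term is written out: the article, the capitalization, the number
      and the inflection all stay under the control of the author and
      the translator. -->
 <p>Using a hydraulic lift, raise the vehicle from the ground.</p>
 <!-- Only a self-contained unit (a complete note) is inserted by reference. -->
 <note conref="warnings.dita#warnings/lift-safety"/>
</body>
```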

Resources:

Background information

Internationalization article: Working with Composite Messages.
http://www.w3.org/International/articles/composite-messages/
Internationalization article: Re-using Strings in Scripted Content
http://www.w3.org/International/articles/text-reuse/

Best Practice 22: Use entity references with caution

[Ed. note: TODO]

Make sure the content of each entity is grammatically independent of its surrounding context. See: Best Practice 21: Ensure any inserted text is context-independent for more details.
Avoid using entities whose content is not well-formed XML. Entity declarations may be processed separately during localization and should be parsable.
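
As a sketch of the second point, the internal subset below declares an entity whose replacement text is well-formed on its own, so it can be parsed and translated separately. The declaration to avoid is shown inside a comment only. The entity names and the doc and p elements are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
 <!-- Avoid: the replacement text opens an element without closing it,
      so it is not well-formed when processed on its own:
      <!ENTITY warnStart "<b>Warning:">
 -->
 <!-- Prefer: the replacement text is balanced and self-contained. -->
 <!ENTITY warning "<b>Warning:</b>">
]>
<doc>
 <p>&warning; Disconnect the power cable first.</p>
</doc>
```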

[Ed. note: TODO]

Best Practice 23: Place sub-flow elements with caution

Place sub-flow elements where they have the least negative impact on the parent text flow.

[Ed. note: TODO]

Sometimes the content model of an element needs to allow translatable information that constitutes a run of text linguistically separate from the text within which it resides. Index markers are a good example of such a case: each index marker is an independent run of text, but it is located inside the paragraph to which it pertains.

If possible, place sub-flow elements at the beginning or at the end of the paragraph. This reduces the impact the element has on the paragraph content from the translation viewpoint and may improve re-usability.
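
The following sketch contrasts the two placements. The section, para and indexterm element names are assumptions for illustration; any format with inline index markers would behave similarly.

```xml
<section>
 <!-- Less desirable: the index marker interrupts the sentence, so the
      translator sees the paragraph text split around an unrelated run. -->
 <para>Raise the vehicle with the hydraulic lift
  <indexterm>lift, hydraulic</indexterm> before removing the wheels.</para>

 <!-- Preferred: the index marker sits at the start of the paragraph and
      leaves the sentence in one piece. -->
 <para><indexterm>lift, hydraulic</indexterm>Raise the vehicle with the
  hydraulic lift before removing the wheels.</para>
</section>
```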

4 Generic Techniques

This section provides a set of generic techniques that are applicable to various guidelines, for example, how to add ITS attributes or elements to different types of schemas.

4.1 Writing ITS Rules

[Ed. note: TODO]

Whether they are external or embedded, there are a few things you should take into consideration when writing ITS rules.

Note: Try to keep the number of nodes to be overridden to a minimum for better performance. For example, if most of a document should not be translated, it is better to set the root element to be non-translatable than to set every element individually. The inheritance mechanism will have the same effect for a much lower computing cost.

Note: Because a rule has precedence over the ones before it, start with the most general rules and progressively override them as needed. Some rules may need to be more complex to take into account all the aspects of inheritance.

4.1.1 Precedence and Inheritance

[Ed. note: TODO]

The order in which the rules are declared matters greatly. ITS defines an order of precedence to process the rules.

Within an its:rules element, rules go from the most general to the most specific. When two rules select the same nodes of a document, the last rule wins.

Be mindful of the inheritance properties of each data category. A table in [ITS] summarizes the type and scope of inheritance for each data category.

Remember also that inheritance does not override selection. For example:

Example 29:

The first rule sets all nodes as not-translatable, then the second rule sets all p elements as translatable, overriding the first rule for the selected nodes. But the b element is not part of the selection of the second rule and therefore keeps the original setting of not-translatable: Only the text "Some text with " and the terminal "." will be translated.

<doc xmlns:its="http://www.w3.org/2005/11/its">
 <head>
  <its:rules version="1.0">
   <its:translateRule selector="//*" translate="no"/>
   <its:translateRule selector="//p" translate="yes"/>
  </its:rules>
 </head>
 <text>
  <data>Some data with <b>bolded parts</b>.</data>
  <p>Some text with <b>bolded words</b>.</p>
 </text>
</doc>

If you change the selector of the first rule to selector="/doc", the not-translatable property is inherited for each child node of the doc element, and when the second rule is applied, the translate property is also applied to the child nodes of the p element, overriding the previous rule for the b element inside p. Therefore the translatable text is "Some text with bolded words."

You could also get the same effect by changing the selector of the second rule instead of the first rule, and explicitly selecting the nodes inside the p elements with the expression selector="//p/descendant-or-self::*".

In general it is better to let the inheritance propagate the rules, rather than explicitly select child elements. This method is also faster since fewer nodes are selected.
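
For instance, the first alternative described above can be written as follows. This sketch reuses the document of Example 29 and changes only the selector of the first rule; the second alternative is described in a comment.

```xml
<doc xmlns:its="http://www.w3.org/2005/11/its">
 <head>
  <its:rules version="1.0">
   <!-- Only the root is selected; its children inherit translate="no". -->
   <its:translateRule selector="/doc" translate="no"/>
   <!-- p is selected; its children, including b, inherit translate="yes". -->
   <its:translateRule selector="//p" translate="yes"/>
   <!-- Alternative with the same result, but selecting more nodes:
        keep selector="//*" in the first rule and use
        selector="//p/descendant-or-self::*" in the second rule. -->
  </its:rules>
 </head>
 <text>
  <data>Some data with <b>bolded parts</b>.</data>
  <p>Some text with <b>bolded words</b>.</p>
 </text>
</doc>
```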

4.1.2 Dealing with namespaces

[Ed. note: TODO]

When writing rules for documents that use XML namespaces, you must make sure to declare the namespaces and to use the relevant prefixes in the XPath expressions.
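
A minimal sketch, assuming a document whose elements are in the DocBook 5 namespace: the prefix db is declared on the its:rules element and used in the selector. The prefix name is arbitrary. Note that in XPath 1.0 an unprefixed name such as code selects only elements in no namespace, so it would not match elements placed in a default namespace.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0"
 xmlns:db="http://docbook.org/ns/docbook">
 <!-- Matches code elements in the DocBook namespace, whatever prefix
      (or default namespace declaration) the document itself uses. -->
 <its:translateRule selector="//db:code" translate="no"/>
 <!-- selector="//code" would match nothing in such a document. -->
</its:rules>
```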

Example 30:

[Ed. note: TODO]

4.1.3 Create your XPath expressions with care

ITS uses XPath expressions in several contexts to identify nodes. The most prominent contexts are selectors, and pointer attributes such as:

<its:translateRule selector="//term" translate="no"/>

<its:locNoteRule locNoteType="description" selector="//msg/data"
 locNotePointer="../notes"/>

When writing ITS-related XPath expressions like the ones above, the following general dimensions should be considered:

ITS XPath expressions pertain to XPath 1.0 or its successor
The values of ITS selector attributes are XPath absolute location paths
The values of ITS pointer attributes are XPath relative location paths

In environments where XSL is used to process ITS-related XPath expressions, it is important to know about the subset of XPath which is termed "XSLT patterns" (see the note in the section Global Approach of the ITS Specification). Using only XSLT patterns in ITS selector attributes helps to avoid issues which may arise with respect to the "match" attribute in XSL "template" elements.
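
The difference can be illustrated with a short sketch. In XSLT 1.0, a pattern may use only the child and attribute axes (plus the // operator) in its steps, although predicates may contain any expression. The section, title and note element names are assumptions for illustration.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- Valid XPath and valid XSLT pattern. -->
 <its:translateRule selector="//section/title" translate="yes"/>
 <!-- selector="//note/parent::*" is valid XPath but not an XSLT pattern,
      because it contains a step on the parent axis. The rule below
      selects the same elements (every element that has a note child)
      and is a valid pattern. -->
 <its:translateRule selector="//*[note]" translate="no"/>
</its:rules>
```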

In addition to these general dimensions, best practices related to writing XPath expressions should be taken into account (see for example the XPath tutorial http://www.zvon.org/xxl/XPathTutorial/General/examples.html).

4.2 Adding an Attribute to an Existing DTD or Schema

This example shows how to add an attribute (here xml:lang) to an existing document type.

[Ed. note: TODO: to make more generic.]

4.2.1 Including `xml:lang` in XML Schema

Import the xml.xsd file in your schema and use references to xml:lang in your element declarations.

To include the xml:lang attribute in your XSD document, import the W3C xml.xsd schema in your own XSD schema using the xsd:import element.

Example 31:

Importing the xml:lang declaration in an XSD schema.

<xsd:schema targetNamespace="myNamespaceURI" 
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:t="myNamespaceURI" elementFormDefault="qualified" xml:lang="en">
 <!-- Import for xml:lang and xml:space -->
 <xsd:import namespace="http://www.w3.org/XML/1998/namespace"
 schemaLocation="http://www.w3.org/2001/xml.xsd"/>
 ...

Once the xml.xsd schema is imported, you can use the reference to xml:lang in any of your element declarations.

Example 32:

Using xml:lang in an XSD schema.

... 
<xsd:element name="myDoc">
 <xsd:complexType>
  <xsd:sequence maxOccurs="unbounded">
   <xsd:element name="section" type="t:Section_Type"/>
  </xsd:sequence>
  <xsd:attribute name="version" type="xsd:string" use="required"/>
  <xsd:attribute ref="xml:lang" use="optional"/>
 </xsd:complexType>
 ...

4.2.2 Including `xml:lang` in RELAX NG

Declare xml:lang directly in your schema.

In RELAX NG you do not have to import the XML namespace. You can declare xml:lang directly in your schema.

Example 33:

Declaration of xml:lang in RELAX NG

<define name="att.global.attribute.xmllang">
 <optional>
  <attribute name="xml:lang">
   <a:documentation>indicates the language of the element content using the
    codes from RFC3066 or its successor.
   </a:documentation>
   <ref name="data.language"/>
  </attribute>
 </optional>
</define>
<define name="data.language">
 <data type="language"/>
</define>

4.2.3 Including `xml:lang` in XML DTD

Add xml:lang directly in the attribute list of your elements.

For example, to add xml:lang to a <para> element you can specify the following DTD constructs:

Example 34:

Declaration of xml:lang in a DTD.

<!ELEMENT para (#PCDATA) > 
<!ATTLIST para
          xml:lang CDATA #IMPLIED >

5 ITS Applied to Existing Formats

This section presents several examples of how ITS can be used to enhance the internationalization readiness of some well-known XML document types. These examples are only illustrative and may have to be adapted to fit the needs of each specific user.

Two topics are covered for each format:

How should ITS be integrated in specific markup schemes? For example, as for XHTML, it is helpful for the interoperability of ITS implementations to specify that the ITS rules element will always be part of the content model of the head element.
How should ITS data categories be associated with existing markup declarations in a schema, which fulfill identical or overlapping purposes? For example, [DITA 1.0] already has an attribute to indicate translatability of text, but without a mechanism for selection of information in documents and schemas.

The following XML applications are discussed:

Section 5.1: ITS and XHTML 1.0
Section 5.2: ITS and TEI
Section 5.3: ITS and XML Spec
Section 5.4: ITS and DITA
Section 5.5: ITS and Glade
Section 5.6: ITS and DocBook

5.1 ITS and XHTML 1.0

[XHTML 1.0] is a reformulation of the three HTML 4 document types as applications of XML 1.0. HTML is an SGML (Standard Generalized Markup Language) application conforming to International Standard ISO 8879, and is widely regarded as the standard publishing language of the World Wide Web.

5.1.1 Integration of ITS into XHTML

In XHTML 1.0, the XHTML namespace may be used with other XML namespaces as per [XML Names], but such documents are not strictly conforming XHTML 1.0 documents in the sense of XHTML 1.0.

An example of such a non-conformant XHTML 1.0 document is as follows.

Example 35: A non-conformant XHTML 1.0 document

<html xmlns="http://www.w3.org/1999/xhtml"
 xmlns:its="http://www.w3.org/2005/11/its" lang="en" xml:lang="en">
 <head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta name="keywords" content="ITS example, XHTML translation" />
  <its:rules version="1.0" xmlns:h="http://www.w3.org/1999/xhtml">
   <its:translateRule selector="//h:meta[@name='keywords']/@content"
    translate="yes" />
   <its:termRule selector="//h:span[@class='term']" term="yes" />
  </its:rules>
  <title>ITS Working Group</title>
 </head>
 <body>
  <h1>Test of ITS on <span class="term">XHTML</span></h1>
  <p>Some text to translate.</p>
  <p its:translate="no">Some text not to translate.</p>
 </body>
</html>

There are two ways to use ITS with XHTML and keep the XHTML document conformant:

Use [XHTMLMod1.1]. See: Section 5.1.2: Using XHTML Modularization 1.1 for the Definition of ITS for details.
Use external ITS global rules (as shown below). Even local information within the document that would otherwise be handled by ITS attributes can be set indirectly.

Example 36: ITS external rules for XHTML

These rules illustrate some of the ITS data categories you can associate with specific XHTML markup. The first its:translateRule indicates that the content attribute of the meta element should be translated if its name attribute is set to "keywords". The second its:translateRule indicates that no p element with class="notrans" should be translated. And the its:termRule indicates that any span element with class="term" is a term.

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0"
 xmlns:h="http://www.w3.org/1999/xhtml">
 <its:translateRule selector="//h:meta[@name='keywords']/@content"
  translate="yes" />
 <its:translateRule selector="//h:p[@class='notrans']"
  translate="no" />
 <its:termRule selector="//h:span[@class='term']" term="yes" />
</its:rules>

The corresponding document:

<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
 <head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta name="keywords" content="ITS example, XHTML translation" />
  <title>ITS Working Group</title>
 </head>
 <body>
  <h1>Test of ITS on <span class="term">XHTML</span></h1>
  <p>Some text to translate.</p>
  <p class="notrans">Some text not to translate.</p>
 </body>
</html>

5.1.2 Using XHTML Modularization 1.1 for the Definition of ITS

This section describes how to use [XHTMLMod1.1] for the definition of ITS. It first defines an ITS abstract module which is then implemented in the formats of XML Schema, RELAX NG and XML DTD. The module is meant to be integrated in existing or new schemas which rely on [XHTMLMod1.1].

5.1.2.1 Abstract Definition of ITS Markup

The following is the abstract definition of the elements for global ITS markup, which is consistent with the XHTML Modularization framework [XHTMLMod1.1]. Further definitions of XHTML abstract modules can be found in [XHTMLMod1.1].

Note that this definition does not contain the ruby element and the dir attribute, since these are already available in XHTML.

| Elements | Attributes | Minimal Content Model |
| --- | --- | --- |
| rules | version (CDATA), xlink:href (URI), xlink:type ("simple") | ( translateRule \| locNoteRule \| termRule \| dirRule \| rubyRule \| langRule \| withinTextRule )* |
| translateRule | Selector, translate ("yes"\|"no") | EMPTY |
| locNoteRule | Selector, locNotePointer (CDATA), locNoteType ("alert"\| "description"*), locNoteRef (URI), locNoteRefPointer (CDATA) | locNote? |
| locNote | translate ("yes"\|"no"), locNote (CDATA), locNoteType ( "alert" \| "description"* ), locNoteRef (URI), termInfoRef ( URI ), term ( "yes" \| "no" ), dir ( "ltr" \| "rtl" \| "lro" \| "rlo" ) | (PCDATA \| ruby)* |
| termRule | Selector, term ( "yes" \| "no" ), termInfoRef ( URI ), termInfoRefPointer ( CDATA), termInfoPointer ( CDATA ) | EMPTY |
| dirRule | Selector, dir ("ltr" \| "rtl" \| "lro" \| "rlo") | EMPTY |
| rubyRule | Selector, rubyPointer (CDATA), rtPointer (CDATA), rpPointer (CDATA), rbcPointer (CDATA), rtcPointer (CDATA), rbspanPointer (CDATA) | rubyText |
| rubyText | translate ("yes"\|"no"), locNote (CDATA), locNoteType ("alert"\|"description"*), locNoteRef (URI), term ("yes" \| "no"), termInfoRef (CDATA), dir ("ltr" \| "rtl" \| "lro" \| "rlo" ), rbspan (CDATA) | PCDATA |
| langRule | Selector, langPointer (CDATA) | EMPTY |
| withinTextRule | Selector, withinText ("yes"\|"no"\|"nested") | EMPTY |

The following are the abstract definitions of two attribute groups: the selector attribute used within global rules, and ITS attributes to be used locally. Again, these definitions make use of [XHTMLMod1.1].

| Collection | Attributes in Collection |
| --- | --- |
| Selector | selector (CDATA) |
| ITSLocal | translate ("yes"\|"no"), locNote (CDATA), locNoteType ("alert"\|"description"*), locNoteRef (URI), termInfoRef (URI), term ("yes" \| "no") |

5.1.2.2 ITS XML Schema Module Implementation

The following schema contains the implementation of the abstract markup module in XML Schema.

Example 37:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3.org/2005/11/its" xmlns:its="http://www.w3.org/2005/11/its"
    xmlns:h="http://www.w3.org/1999/xhtml" elementFormDefault="qualified"
    xmlns:xlink="http://www.w3.org/1999/xlink">
    <xs:import namespace="http://www.w3.org/1999/xlink" schemaLocation="xlink.xsd"/>
    <xs:import namespace="http://www.w3.org/1999/xhtml"
        schemaLocation="xhtml-schemas/xhtml-ruby-1.xsd"/>
    <xs:simpleType name="translate.type">
        <xs:restriction base="xs:string">
            <xs:enumeration value="yes"/>
            <xs:enumeration value="no"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="term.type">
        <xs:restriction base="xs:string">
            <xs:enumeration value="yes"/>
            <xs:enumeration value="no"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="locNoteType.type">
        <xs:restriction base="xs:string">
            <xs:enumeration value="alert"/>
            <xs:enumeration value="description"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="dir.type">
        <xs:restriction base="xs:string">
            <xs:enumeration value="ltr"/>
            <xs:enumeration value="rtl"/>
            <xs:enumeration value="lro"/>
            <xs:enumeration value="rlo"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="withinText.type">
        <xs:restriction base="xs:string">
            <xs:enumeration value="yes"/>
            <xs:enumeration value="no"/>
            <xs:enumeration value="nested"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:attributeGroup name="its.Selector.attlist">
        <xs:attribute name="selector" type="xs:string" use="required"/>
    </xs:attributeGroup>
    <xs:attributeGroup name="its.ITSLocal.attlist">
        <xs:attribute name="translate" form="qualified" use="optional" type="its:translate.type"/>
        <xs:attribute name="locNote" type="xs:string" form="qualified" use="optional"/>
        <xs:attribute name="locNoteType" form="qualified" use="optional" type="its:locNoteType.type"/>
        <xs:attribute name="locNoteRef" type="xs:anyURI" form="qualified" use="optional"/>
        <xs:attribute name="termInfoRef" type="xs:anyURI" form="qualified" use="optional"/>
        <xs:attribute name="term" type="its:term.type" form="qualified" use="optional"/>
    </xs:attributeGroup>
    <xs:element name="rules" type="its:rules.type"/>
    <xs:complexType name="rules.type" mixed="false">
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element ref="its:translateRule"/>
            <xs:element ref="its:locNoteRule"/>
            <xs:element ref="its:termRule"/>
            <xs:element ref="its:dirRule"/>
            <xs:element ref="its:rubyRule"/>
            <xs:element ref="its:langRule"/>
            <xs:element ref="its:withinTextRule"/>
        </xs:choice>
        <xs:attributeGroup ref="its:rules.attlist"/>
    </xs:complexType>
    <xs:attributeGroup name="rules.attlist">
        <xs:attribute name="version" use="required" type="xs:string"/>
        <xs:attribute ref="xlink:href" use="optional"/>
        <xs:attribute ref="xlink:type" use="optional"/>
    </xs:attributeGroup>
    <xs:element name="translateRule" type="its:translateRule.type"/>
    <xs:complexType name="translateRule.type">
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="translate" use="required" type="its:translate.type"/>
    </xs:complexType>
    <xs:element name="locNoteRule" type="its:locNoteRule.type"/>
    <xs:complexType name="locNoteRule.type">
        <xs:sequence minOccurs="0" maxOccurs="1">
            <xs:element ref="its:locNote"/>
        </xs:sequence>
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="locNotePointer" type="xs:string" use="optional"/>
        <xs:attribute name="locNoteType" use="required" type="its:locNoteType.type"/>
        <xs:attribute name="locNoteRef" type="xs:anyURI" use="optional"/>
        <xs:attribute name="locNoteRefPointer" type="xs:string" use="optional"/>
    </xs:complexType>
    <xs:element name="locNote" type="its:locNote.type"/>
    <xs:complexType name="locNote.type" mixed="true">
        <xs:attribute name="translate" use="optional" type="its:translate.type"/>
        <xs:attribute name="locNote" type="xs:string" use="optional"/>
        <xs:attribute name="locNoteType" use="optional" type="its:locNoteType.type"/>
        <xs:attribute name="locNoteRef" type="xs:anyURI" use="optional"/>
        <xs:attribute name="termInfoRef" type="xs:anyURI" use="optional"/>
        <xs:attribute name="term" use="optional" type="its:term.type"/>
        <xs:attribute name="dir" use="optional" type="its:dir.type"/>
    </xs:complexType>
    <xs:element name="termRule" type="its:termRule.type"/>
    <xs:complexType name="termRule.type">
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="term" type="its:term.type" use="required"/>
        <xs:attribute name="termInfoRef" type="xs:anyURI" use="optional"/>
        <xs:attribute name="termInfoRefPointer" type="xs:string" use="optional"/>
        <xs:attribute name="termInfoPointer" type="xs:string" use="optional"/>
    </xs:complexType>
    <xs:element name="dirRule" type="its:dirRule.type"/>
    <xs:complexType name="dirRule.type">
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="dir" type="its:dir.type" use="required"/>
    </xs:complexType>
    <xs:element name="rubyRule" type="its:rubyRule.type"/>
    <xs:complexType name="rubyRule.type">
        <xs:sequence>
            <xs:element ref="its:rubyText"/>
        </xs:sequence>
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="rubyPointer" type="xs:string" use="optional"/>
        <xs:attribute name="rtPointer" type="xs:string" use="optional"/>
        <xs:attribute name="rpPointer" type="xs:string" use="optional"/>
        <xs:attribute name="rbcPointer" type="xs:string" use="optional"/>
        <xs:attribute name="rtcPointer" type="xs:string" use="optional"/>
        <xs:attribute name="rbspanPointer" type="xs:string" use="optional"/>
    </xs:complexType>
    <xs:element name="rubyText" type="its:rubyText.type"/>
    <xs:complexType name="rubyText.type" mixed="true">
        <xs:attribute name="translate" type="its:translate.type" use="optional"/>
        <xs:attribute name="locNote" type="xs:string" use="optional"/>
        <xs:attribute name="locNoteType" type="its:locNoteType.type" use="optional"/>
        <xs:attribute name="locNoteRef" type="xs:anyURI" use="optional"/>
        <xs:attribute name="term" type="its:term.type" use="optional"/>
        <xs:attribute name="termInfoRef" type="xs:anyURI" use="optional"/>
        <xs:attribute name="dir" type="its:dir.type" use="optional"/>
        <xs:attribute name="rbspan" type="xs:string" use="optional"/>
    </xs:complexType>
    <xs:element name="langRule" type="its:langRule.type"/>
    <xs:complexType name="langRule.type">
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="langPointer" type="xs:string" use="required"/>
    </xs:complexType>
    <xs:element name="withinTextRule" type="its:withinTextRule.type"/>
    <xs:complexType name="withinTextRule.type">
        <xs:attributeGroup ref="its:its.Selector.attlist"/>
        <xs:attribute name="withinText" type="its:withinText.type"/>
    </xs:complexType>
</xs:schema>

The following is a driver file which can be used to invoke the schema above.

Example 38:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xhtml="http://www.w3.org/1999/xhtml"
    targetNamespace="http://www.w3.org/1999/xhtml" xmlns:its="http://www.w3.org/2005/11/its"
    xmlns="http://www.w3.org/1999/xhtml" blockDefault="#all">
    <xs:annotation>
        <xs:documentation> This is the XML Schema Driver for new Document Type XHTML Basic 1.0 + ITS
            $Id: Overview.html,v 1.5 2018/10/09 13:17:02 denis Exp $ </xs:documentation>
        <xs:documentation source="http://www.w3.org/TR/xml-i18n-bp/#integration-its-xhtmlmod"/>
    </xs:annotation>
    <xs:import namespace="http://www.w3.org/2005/11/its" schemaLocation="its-module.xsd"/>
    <xs:redefine schemaLocation="xhtml-schemas/xhtml-basic10.xsd">
        <xs:group name="HeadOpts.mix">
            <xs:choice>
                <xs:group ref="HeadOpts.mix"/>
                <xs:element ref="its:rules"/>
            </xs:choice>
        </xs:group>
        <xs:attributeGroup name="Common.attrib">
            <xs:attributeGroup ref="Common.attrib"/>
            <xs:attributeGroup ref="its:its.ITSLocal.attlist"/>
        </xs:attributeGroup>
    </xs:redefine>
</xs:schema>

The file below is an instance which can be validated against this schema.

Example 39:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:its="http://www.w3.org/2005/11/its" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/1999/xhtml xhtml-plus-its.xsd">
    <head>
        <title> </title>
        <its:rules version="1.0">
            <its:locNoteRule locNoteType="alert" selector="..." locNoteRef="..."> </its:locNoteRule>
            <its:locNoteRule locNoteType="alert" selector="...">
                <its:locNote> </its:locNote>
            </its:locNoteRule>
            <its:termRule selector="..." term="yes"/>
        </its:rules>
    </head>
    <body>
        <h3> </h3>
        <table>
            <tr>
                <td> </td>
            </tr>
        </table>
        <ul>
            <li its:locNote="..." its:translate="no"> </li>
        </ul>
    </body>
</html>

5.1.2.3 ITS DTD Module Implementation

[Ed. note: TODO]

5.1.3 Relating ITS to Existing Markup in XHTML

A number of XHTML constructs implement the same semantics as some of the ITS data categories. In addition, some XHTML attributes are translatable, which is not the default for XML documents under the ITS default settings for translatability. These attributes need to be identified as translatable.

An external ITS rules element can summarize these relations. Because XHTML use is widespread and covers a large amount of legacy material, the rules defined here may not be optimal for everyone.

Example 40: ITS external rules for XHTML documents

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0"
 xmlns:h="http://www.w3.org/1999/xhtml">

 <!-- special content. (See note 1) -->
 <its:translateRule selector="//h:script" translate="no"/>
 <its:translateRule selector="//h:style" translate="no"/>

 <!-- Normal translatable attributes -->
 <its:translateRule selector="//h:*/@abbr" translate="yes"/>
 <its:translateRule selector="//h:*/@accesskey" translate="yes"/>
 <its:translateRule selector="//h:*/@alt" translate="yes"/>

 <its:translateRule selector="//h:*/@prompt" translate="yes"/>
 <its:translateRule selector="//h:*/@standby" translate="yes"/>
 <its:translateRule selector="//h:*/@summary" translate="yes"/>
 <its:translateRule selector="//h:*/@title" translate="yes"/>

 <!-- The input element (Important: See note 2) -->
 <its:translateRule selector="//h:input/@value" translate="yes"/>
 <its:translateRule selector="//h:input[@type='hidden']/@value" translate="no"/>

 <!-- Non-translatable element (See note 3) -->

 <its:translateRule selector="//h:del" translate="no"/>
 <its:translateRule selector="//h:del/descendant-or-self::*/@*" translate="no"/>

 <!-- Often-used translatable meta content. -->
 <its:translateRule selector="//h:meta[@name='keywords']/@content"
		    translate="yes"/>
 <its:translateRule selector="//h:meta[@name='description']/@content"
		    translate="yes"/>

 <!-- Possible term (Important: See note 4) -->
 <its:termRule selector="//h:dt" term="yes"/>

 <!-- Bidirectional information -->
 <its:dirRule selector="//h:*[@dir='ltr']" dir="ltr"/>
 <its:dirRule selector="//h:*[@dir='rtl']" dir="rtl"/>
 <its:dirRule selector="//h:bdo[@dir='ltr']" dir="lro"/>
 <its:dirRule selector="//h:bdo[@dir='rtl']" dir="rlo"/>

 <!-- Elements within text -->
 <its:withinTextRule withinText="yes"
  selector="//h:abbr | //h:acronym | //h:br | //h:cite | //h:code | //h:dfn
  | //h:kbd | //h:q | //h:samp | //h:span | //h:strong | //h:var | //h:b | //h:em
  | //h:big | //h:hr | //h:i | //h:small | //h:sub | //h:sup | //h:tt | //h:del
  | //h:ins | //h:bdo | //h:img | //h:a | //h:font | //h:center | //h:s | //h:strike
  | //h:u | //h:isindex" />

</its:rules>

Additional notes on these rules:

Note 1: The script and style elements may have translatable text, but their content needs to be parsed with a script filter and a CSS filter, respectively. Depending on the capabilities of your translation tools, you may want to leave these elements translatable.
Note 2: The value attribute of the input element may or may not be translatable, depending on how the element is used. Whether to select value as translatable must be decided based on your own use.
Note 3: The del element indicates removed text and is therefore usually not translatable. Because this element may contain elements with translatable attributes, such as img with an alt attribute, and because the scope of translatability does not include attributes, you need to: a) define this rule after the rules for the translatable attributes, and b) use the rule with selector="//h:del/descendant-or-self::*/@*" to override any translatable attribute within a del element or any of its descendants.
Note 4: The dt element is defined by HTML as a "definition term" and can therefore be seen as a candidate to be associated with the ITS Terminology data category. However, for historical reasons, this element has been used for many other purposes. Whether to select dt as a term must be decided based on your own use.
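
The ordering requirement in Note 3 can be illustrated with a short Python sketch. It is not an ITS processor: it uses only the standard library and hard-codes stand-ins for two of the rules above instead of evaluating their XPath selectors. The sample document and the decisions mapping are invented for this illustration, and the sketch assumes that a later global rule overrides an earlier one for the same node.

```python
import xml.etree.ElementTree as ET

NS = {"h": "http://www.w3.org/1999/xhtml"}

# Hypothetical sample document, invented for this sketch.
doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p><img src="a.png" alt="A cat"/></p>'
    '<del><img src="b.png" alt="A dog"/></del>'
    '</body></html>'
)

# Maps (element id, attribute name) to (attribute value, "yes" or "no").
# Writing a key again replaces the earlier entry, which mimics a later
# rule overriding an earlier one for the same attribute node.
decisions = {}

# Stand-in for: <its:translateRule selector="//h:*/@alt" translate="yes"/>
for el in doc.iter():
    if "alt" in el.attrib:
        decisions[(id(el), "alt")] = (el.get("alt"), "yes")

# Stand-in for:
# <its:translateRule selector="//h:del/descendant-or-self::*/@*" translate="no"/>
for deleted in doc.iterfind(".//h:del", NS):
    for el in deleted.iter():  # iter() includes the del element itself
        for name, value in el.attrib.items():
            decisions[(id(el), name)] = (value, "no")

translatable = sorted(value for value, flag in decisions.values() if flag == "yes")
print(translatable)
```

If the two loops were swapped, the alt attribute inside del would end up marked as translatable, which is why the del rule has to come last.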

5.2 ITS and TEI

The Text Encoding Initiative [TEI] is intended for literary and linguistic material, and is most often used for digital editions of existing printed material. It is also suitable, however, for general purpose writing. The P5 release of the TEI consists of 23 modules which can be combined together as needed.

5.2.1 Integration of ITS into TEI

The TEI is maintained as a single ODD document, and customizations of it are also written as ODD documents. These are processed using XSLT stylesheets to make a tailored user-level schema in XML DTD, XML Schema or RELAX NG.

The ITS additions involve two changes to TEI:

Allowing rules to appear in the TEI metadata section (the teiHeader).
Adding the ITS local attributes to the TEI global attribute set.

Both of these can be easily achieved using standard techniques in ODD.

The body of a TEI+ITS customization consists of a schemaSpec which lists the modules to be included (this example includes six common ones):

Example 41: A schemaSpec element with modules to be included

<schemaSpec ident="tei-its" start="TEI">
 <moduleRef key="header"/>
 <moduleRef key="core"/>
 <moduleRef key="tei"/>
 <moduleRef key="textstructure"/>
 <moduleRef key="namesdates"/>
 <moduleRef key="msdescription"/> 
 <!-- Etc. -->
</schemaSpec>

In addition, we load the ITS schema (in its RELAX NG XML format, the language used by the TEI for expressing content models), and overload the definition of the TEI content class model.headerPart to include the ITS rules:

Example 42: Inclusion of ITS rules into the TEI schema

<moduleRef url="its.rng">
 <content xmlns:rng="http://relaxng.org/ns/structure/1.0">
 <rng:define name="model.headerPart" combine="choice">
  <rng:ref name="rules"/>
 </rng:define>
 </content>
</moduleRef>

The content class determines which elements are allowed as children of teiHeader. Lastly, we change the definition of the global attribute class att.global to reference the ITS local attributes (available from the ITS schema we loaded earlier):

Example 43: Addition of the ITS local attributes to the global attributes

<classSpec ident="att.global" type="atts" mode="change">
 <attList>
  <attRef name="span.attributes"/>
 </attList>
</classSpec>

When processed, this customization produces a schema which permits markup like this:

Example 44: Document which is valid against the TEI+ITS schema

<TEI xmlns:its="http://www.w3.org/2005/11/its" xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
  <fileDesc>
   <!-- details of the file -->
  </fileDesc>
  <rules xmlns="http://www.w3.org/2005/11/its" version="1.0"
   xmlns:t="http://www.tei-c.org/ns/1.0">
   <translateRule translate="no" selector="//t:body/t:p/@*"/>
   <translateRule translate="yes" selector="//t:body/t:p"/>
  </rules>
 </teiHeader>
 <text>
  <body>
   <p rend="normal">Hello <hi>world</hi>
   </p>
   <p rend="special">Goodbye</p>
   <p its:translate="no">This must not be translated</p>
  </body>
 </text>
</TEI>

In this example, the rules element in the header provides global rules, and one paragraph in the body of the text overrides them locally with the its:translate attribute.
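
The interplay between a global rule and a local override can be sketched in Python with the standard library. This is only an illustration under simplifying assumptions: the rule selecting //t:body/t:p is hard-coded rather than read from the rules element, the attribute rule is ignored, and the sample document is a shortened copy of the TEI example with the header left out.

```python
import xml.etree.ElementTree as ET

ITS_TRANSLATE = "{http://www.w3.org/2005/11/its}translate"
NS = {"t": "http://www.tei-c.org/ns/1.0"}

# Shortened copy of the TEI example (header omitted for brevity).
doc = ET.fromstring(
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"'
    ' xmlns:its="http://www.w3.org/2005/11/its"><text><body>'
    '<p rend="normal">Hello <hi>world</hi></p>'
    '<p rend="special">Goodbye</p>'
    '<p its:translate="no">This must not be translated</p>'
    '</body></text></TEI>'
)

result = []
for p in doc.iterfind(".//t:body/t:p", NS):
    # Value assigned by the global rule that selects //t:body/t:p.
    value = "yes"
    # A local its:translate attribute, when present, takes precedence.
    value = p.get(ITS_TRANSLATE, value)
    result.append(("".join(p.itertext()), value))

for text, value in result:
    print(value, text)
```

Only the third paragraph carries local markup, so it is the only one reported as not translatable.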

5.3 ITS and XML Spec

[XML Spec] is intended for W3C working drafts, notes, recommendations, and all other document types that fall under the category of technical reports. XML Spec is available in the formats of XML DTD, XML Schema and RELAX NG.

5.3.1 Integration of ITS into XML Spec

ITS has been integrated into xmlspec-i18n.dtd. This is a version of the XML Spec DTD version 2.9 that already supplies various internationalization and localization related features. For example, xmlspec-i18n.dtd has a translate attribute, which can be used for the same purposes as the ITS translate attribute. To keep them separate from the original XML Spec declarations, all additions are stored in two separate files, i18n-extensions.mod and i18n-elements.mod. The xmlspec-i18n.dtd file is used within the W3C Internationalization Activity for the creation of technical reports.

For the integration of ITS, the following modifications to the xmlspec-i18n.dtd have been made:

A new entity <!ENTITY % its SYSTEM "its.dtd"> and the entity call %its; have been added to xmlspec-i18n.dtd.
The existing XML Spec entity %common.att; has been modified. The ITS entities %att.translate.attributes;, %att.locNote.attributes;, %att.term.attributes;, and %att.dir.attributes; have been added to %common.att;. In this way, the local attributes can be used on any element defined in the XML Spec DTD.
The XML Spec entity %header.mdl; contains the content model of the header element. The ITS element rules has been added as the last element to this content model. In this way, rules can be used inside an XML Spec document. The header element of the XML Spec DTD has been chosen as the place for rules, to avoid the impact of ITS markup on XML Spec markup.
The ITS element ruby has been added to the XML Spec entity %p.pcd.mix;. In this way it is possible to use ruby as an inline element.

5.3.2 Relating ITS to Existing Markup in XML Spec

As mentioned before, xmlspec-i18n.dtd has its own existing markup declarations for various internationalization and localization related purposes. In the original XML Spec 2.9 DTD, there is a term element which fulfills the same purpose as the ITS term attribute.

To associate the existing XML Spec and xmlspec-i18n.dtd markup with ITS markup, the following rules element has been created.

Example 45: Mapping ITS markup to XML Spec and xmlspec-i18n.dtd markup

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">

 <!--The following rules are for xmlspec-i18n.dtd-->

 <its:termRule selector="//qterm" term="yes"/>
 <its:dirRule dir="ltr" selector="//*[@dir='ltr']"/>
 <its:dirRule dir="rtl" selector="//*[@dir='rtl']"/>
 <its:dirRule dir="lro" selector="//*[@dir='lro']"/>
 <its:dirRule dir="rlo" selector="//*[@dir='rlo']"/>
 
 <its:locNoteRule locNoteType="alert"
   locNotePointer="@locn-alert" selector="//*"/>
 <its:locNoteRule locNoteType="description"
   locNotePointer="@locn-note" selector="//*"/>
   
 <its:translateRule translate="yes" 
   selector="//*[@translate='yes']"/>
 <its:translateRule translate="no" 
   selector="//*[@translate='no']"/>

 <!--This rule is for the original XML Spec DTD-->
 <its:termRule selector="//term" term="yes"/>

</its:rules>

Since neither XML Spec nor xmlspec-i18n.dtd defines a namespace, the mappings use XPath expressions with unqualified element and attribute names.
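
The localization note mappings can be sketched as follows. This is a rough Python illustration using only the standard library, not an ITS implementation: the sample document is invented, and the two locNotePointer mappings are hard-coded as a table (POINTERS, a name chosen for this sketch) and read relative to each element. Because no namespace is involved, plain attribute and element names are sufficient.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document using the xmlspec-i18n.dtd attributes
# named in the rules above.
doc = ET.fromstring(
    "<spec><body>"
    "<p locn-note='Keep this sentence short.'>First paragraph.</p>"
    "<p>Second paragraph.</p>"
    "<p locn-alert='Do not reorder the steps.'>Third paragraph.</p>"
    "</body></spec>"
)

# Attribute name -> ITS locNoteType, mirroring the two locNoteRule elements.
POINTERS = (("locn-alert", "alert"), ("locn-note", "description"))

notes = []
for el in doc.iter():
    for attr, note_type in POINTERS:
        # The pointer is read relative to the element being visited.
        if attr in el.attrib:
            notes.append((el.text, note_type, el.get(attr)))

for text, note_type, note in notes:
    print(f"{note_type}: {note!r} applies to {text!r}")
```

Elements without either attribute, such as the second paragraph, simply receive no localization note.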

5.4 ITS and DITA

The Darwin Information Typing Architecture [DITA 1.0] is an XML-based architecture for authoring, producing, and delivering readable information as discrete, typed topics.

5.4.1 Integration of ITS into DITA

DITA offers some of the ITS features by default (see Section 5.4.2: Relating ITS to Existing Markup in DITA for more information on that aspect). But in some cases you may still want to use ITS markup directly in your DITA documents, for example the its:locNote attribute or the its:rules element. DITA provides a way to create a domain specialization based on the foreign element and attribute extension points.

For example, the DITA Concept DTD can be extended as follows:

First, create two files for the ITS domain specialization. The first one, itsDomain.ent, contains the entity definitions that will be used in the extended DTD.

Example 46: Content of the itsDomain.ent file

<!ENTITY % its-d-foreign "its"           >
<!ENTITY   its-d-att     "(topic its-d)" >

The second file, itsDomain.mod, contains the definition of the element where the ITS markup will be placed.

Example 47: Content of the itsDomain.mod file

<!-- declaration for the specialized wrapper and alternate element -->
<!ENTITY % its "its">
<!-- definition for the specialized wrapper and alternate element -->
<!ELEMENT its ((%its-rules;) | (%its-ruby;)) >
<!ATTLIST its %global-atts;
          class CDATA "+ topic/foreign its-d/its ">

Then you can adapt the concept.dtd file to take into account the new domain.

Include the ITS domain entities at the end of the Domain Entity Declarations section:
```
<!ENTITY % its-d-dec SYSTEM "itsDomain.ent" >
%its-d-dec;
```
Define the extension element at the end of the Domain Extension section:
```
<!ENTITY % foreign "foreign | %its-d-foreign;" >
```

Modify the list of included domains in the included-domains entity:

<!ENTITY included-domains
   "&ui-d-att; &hi-d-att; &pr-d-att; &sw-d-att;
   &ut-d-att; &indexing-d-att; &its-d-att;" >

Include the ITS domain module at the end of the Domain Element Integration section:
```
<!ENTITY % its-d-def SYSTEM "itsDomain.mod" >
%its-d-def;
```

[Ed. note: TODO: Finish integration of latest DITA info]

5.4.2 Relating ITS to Existing Markup in DITA

There are several ITS data categories that are already implemented in DITA. For example, DITA offers a translate attribute that provides the same functionality as its:translate.

As with other formats, these existing features can be associated with ITS data categories, so that ITS-enabled tools can seamlessly process DITA source documents.

Note: When you have the choice of using a DITA construct or an ITS construct to express the same thing, make sure to use the DITA construct to ensure DITA processors work properly. Use ITS local markup only if DITA does not provide an equivalent.

Example 48: Associating ITS markup to DITA markup

<?xml version="1.0"?>
<!-- Possible default ITS rules for DITA -->
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">

 <!-- Translatable attribute (some are deprecated) -->
 <its:translateRule selector="//image/@alt" translate="yes"/>
 <its:translateRule selector="//lq/@reftitle" translate="yes"/>
 <its:translateRule selector="//note/@othertype" translate="yes"/>
 <its:translateRule selector="//object/@standby" translate="yes"/>
 <its:translateRule selector="//othermeta/@content" translate="yes"/>
 <its:translateRule selector="//state/@value" translate="yes"/>
 <its:translateRule selector="//map/@title" translate="yes"/>
 <its:translateRule selector="//topicref/@navtitle" translate="yes"/>
 <its:translateRule selector="//topicgroup/@navtitle" translate="yes"/>
 <its:translateRule selector="//topichead/@navtitle" translate="yes"/>
 <its:translateRule selector="//data/@label" translate="yes"/>

 <!-- Non-translatable elements -->
 <its:translateRule selector="//draft-comment//*" translate="no"/>
 <its:translateRule selector="//draft-comment/descendant-or-self::*/@*" translate="no"/>
 <its:translateRule selector="//required-cleanup//*" translate="no"/>
 <its:translateRule selector="//required-cleanup/descendant-or-self::*/@*" translate="no"/>
 <its:translateRule selector="//coords" translate="no"/>
 <its:translateRule selector="//shape" translate="no"/>

 <!-- Translatability flags -->
 <its:translateRule selector="//*[@translate='no']" translate="no"/>
 <its:translateRule selector="//*[@translate='no']/descendant-or-self::*/@*" translate="no"/>
 <its:translateRule selector="//*[@translate='yes']" translate="yes"/>

 <!-- Directionality flags -->
 <its:dirRule selector="//*[@dir='ltr']" dir="ltr"/>
 <its:dirRule selector="//*[@dir='rtl']" dir="rtl"/>
 <its:dirRule selector="//*[@dir='lro']" dir="lro"/>
 <its:dirRule selector="//*[@dir='rlo']" dir="rlo"/>

 <!-- Elements within text (inline) -->
 <its:withinTextRule withinText="yes"
  selector="//boolean | //cite | //itemgroup | //keyword | //ph | //q | //state | //term |
   //tm | //xref | //b | //i | //sub | //sup | //tt | //u | //apiname | //codeph | //delim |
   //fragref | //kwd | //oper | //option | //parmname | //repsep | //sep | //synnoteref |
   //synph | //var | //cmdname | //filepath | //msgnum | //msgph | //systemoutput |
   //userinput | //varname | //menucascade | //shortcut | //uicontrol | //wintitle |
   //coords | //shape" />

 <!-- The keyword elements within keywords are sub-flow, not inline -->
 <its:withinTextRule withinText="nested" selector="//keywords/keyword" />

 <!-- Elements within text (subflow) -->
 <its:withinTextRule withinText="nested"
  selector="//draft-comment | //required-cleanup | //alt | //fn | //indexterm" />

 <!-- Terminology -->
 <its:termRule selector="//term | //dt | //termindex" term="yes" />

</its:rules>

The declarations above cover different versions of DITA.
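
As a rough illustration of the translatability flags, the following Python sketch finds the elements that carry the DITA translate attribute and records the ITS value each one maps to. It relies only on the standard library, whose limited XPath support is enough for the attribute predicate used here. The sample concept is invented for this sketch, and a search starting with ".//" does not test the root element itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA concept, invented for this sketch.
doc = ET.fromstring(
    "<concept id='c1'><title>Installing</title><conbody>"
    "<p>Run <codeph translate='no'>setup.exe</codeph> first.</p>"
    "<p translate='yes'>Then restart.</p>"
    "</conbody></concept>"
)

flags = []
for value in ("no", "yes"):
    # Stand-in for the rules selecting //*[@translate='no'] and
    # //*[@translate='yes']; the predicate tests an attribute, hence the "@".
    for el in doc.iterfind(f".//*[@translate='{value}']"):
        flags.append((el.tag, el.text, value))

for tag, text, value in flags:
    print(f"{tag}: translate={value} ({text!r})")
```

Elements without the attribute are not listed; a real processor would give them an inherited or default value.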

5.5 ITS and Glade

[Glade] is a user interface builder system for GTK+ and Gnome. It uses XML files to store the UI components. The library has been ported to different platforms and offers bindings in different programming languages.

Example 49: Example of a Glade document

<?xml version="1.0" standalone="no"?><!--*- mode: xml -*-->
<glade-interface>
 <widget class="GtkWindow" id="main_window">
  <property name="visible">True</property>
  <property name="title" translatable="yes">Glade Text Editor</property>
  <property name="type">GTK_WINDOW_TOPLEVEL</property>
  <property name="window_position">GTK_WIN_POS_NONE</property>
  <property name="modal">False</property>
  <property name="default_width">600</property>
  <property name="default_height">450</property>
  <property name="resizable">True</property>
  <property name="destroy_with_parent">False</property>
  <property name="decorated">True</property>
  <property name="skip_taskbar_hint">False</property>
  <property name="skip_pager_hint">False</property>
  <property name="type_hint">GDK_WINDOW_TYPE_HINT_NORMAL</property>
  <property name="gravity">GDK_GRAVITY_NORTH_WEST</property>
  <property name="focus_on_map">True</property>
  <property name="urgency_hint">False</property>
  <signal name="delete_event" handler="on_main_window_delete_event"/>
  <child>
   <widget class="GtkVBox" id="vbox1">
    <property name="visible">True</property>
    <property name="homogeneous">False</property>
    <property name="spacing">0</property>
    <child>
     <widget class="GtkHandleBox" id="handlebox2">
      <property name="visible">True</property>
      <property name="shadow_type">GTK_SHADOW_OUT</property>
      <property name="handle_position">GTK_POS_LEFT</property>
      <property name="snap_edge">GTK_POS_TOP</property>
      <child>
       <widget class="GtkMenuBar" id="menubar1">
        <property name="visible">True</property>
        <property name="pack_direction">GTK_PACK_DIRECTION_LTR</property>
        <property name="child_pack_direction">GTK_PACK_DIRECTION_LTR</property>
        <child>
         <widget class="GtkMenuItem" id="File">
          <property name="visible">True</property>
          <property name="label" translatable="yes">_File</property>
          <property name="use_underline">True</property>
          <child>
           <widget class="GtkMenu" id="File_menu">
            <child>
             <widget class="GtkImageMenuItem" id="New">
              <property name="visible">True</property>
              <property name="label">gtk-new</property>
              <property name="use_stock">True</property>
              <signal name="activate" handler="on_New_activate"/>
             </widget>
            </child>
            <child>
             <widget class="GtkImageMenuItem" id="Open">
              <property name="visible">True</property>
              <property name="label">gtk-open</property>
              <property name="use_stock">True</property>
              <signal name="activate" handler="on_Open_activate"/>
             </widget>
            </child>
            <child>
             <widget class="GtkImageMenuItem" id="Save">
              <property name="visible">True</property>
              <property name="label">gtk-save</property>
              <property name="use_stock">True</property>
              <signal name="activate" handler="on_Save_activate"/>
             </widget>
            </child>
            <child>
             <widget class="GtkMenuItem" id="separator1">
              <property name="visible">True</property>
             </widget>
            </child>
            <child>
             <widget class="GtkImageMenuItem" id="Exit">
              <property name="visible">True</property>
              <property name="label">gtk-quit</property>
              <property name="use_stock">True</property>
              <signal name="activate" handler="on_Exit_activate"/>
             </widget>
            </child>
           </widget>
          </child>
         </widget>
        </child>
       </widget>
      </child>
     </widget>
     <packing>
      <property name="padding">0</property>
      <property name="expand">False</property>
      <property name="fill">True</property>
     </packing>
    </child>
   </widget>
  </child>
 </widget>
</glade-interface>

5.5.1 Integration of ITS into Glade

The content of Glade files consists mostly of non-translatable data: the properties of UI widgets. Text content is limited to titles, labels and various other types of UI strings. While Glade does offer support for some of the ITS features, in some cases you may still want to use ITS markup directly in your Glade resources.

[Ed. note: TODO]

5.5.2 Relating ITS to Existing Markup in Glade

Glade offers a translatable attribute that provides the same functionality as its:translate. The comments attribute can also be associated with the ITS Localization Note data category.

As with other formats, existing features of Glade can be associated with ITS data categories using global rules, so that ITS-enabled tools can seamlessly process Glade source documents.

Example 50: Associating ITS markup to Glade markup

<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- ITS rules for Glade 2.0, based on http://glade.gnome.org/glade-2.0.dtd -->
 <its:translateRule selector="/glade-interface" translate="no"/>
 <its:translateRule selector="//*[@translatable='yes']" translate="yes"/>
 <its:translateRule selector="//atkaction/@description" translate="yes"/>
 <its:locNoteRule selector="//*[@translatable='yes']"
  locNoteType="description" locNotePointer="@comments"/>
</its:rules>
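
The effect of these rules can be approximated with a few lines of Python. The sketch below is an assumption-laden illustration rather than an ITS processor: it uses only the standard library, the sample interface is a cut-down, invented Glade file, and the translatable and comments attributes are read directly instead of being resolved through the rules.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical Glade file; the comments value is invented.
doc = ET.fromstring(
    '<glade-interface><widget class="GtkWindow" id="main_window">'
    '<property name="visible">True</property>'
    '<property name="title" translatable="yes" comments="Window title">'
    'Glade Text Editor</property>'
    '<child><widget class="GtkMenuItem" id="File">'
    '<property name="label" translatable="yes">_File</property>'
    '</widget></child>'
    '</widget></glade-interface>'
)

# Everything is non-translatable by default (the rule on /glade-interface);
# only nodes flagged translatable="yes" are collected, together with the
# localization note taken from the comments attribute, if there is one.
units = [
    (prop.text, prop.get("comments"))
    for prop in doc.iterfind(".//*[@translatable='yes']")
]

for text, note in units:
    print(f"{text!r} (note: {note})")
```

Properties such as visible are skipped because they carry no translatable flag.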

[Ed. note: TODO]

5.6 ITS and DocBook

DocBook is a general purpose XML schema particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications). DocBook is maintained by the DocBook Technical Committee of OASIS.

5.6.1 Integration of ITS into DocBook

The DocBook V5.0 schema is a highly modular, easy-to-customize schema written in RELAX NG [RELAX NG 1.0]. General techniques for schema customization are described in [DocBook V5.0 HOWTO].

The ITS additions involve the following changes to DocBook schema:

Adding the ITS local attributes to every existing DocBook element.
Not all ITS local attributes are added to the schema, because DocBook already provides its own means for specifying the directionality of text.
Allowing the its:rules element inside the DocBook info element, which is a general metadata container.
Allowing its:ruby as an inline element almost everywhere plain text can appear.

Example 51: DocBook schema customization

# This schema integrates ITS markup (http://www.w3.org/TR/its/) 
# into DocBook schema (http://docbook.org)
#
# This schema conforms to Conformance Type 1 defined in
# http://www.w3.org/TR/its/#conformance-product-schema
# 
# Schema adds the following ITS elements into DocBook schema: 
#  * rules
#  * ruby
#
# Schema adds the following local ITS attributes into DocBook schema:
#  * translate
#  * locNote
#  * locNoteType
#  * locNoteRef
#  * term
#  * termInfoRef
#
# $Id: Overview.html,v 1.5 2018/10/09 13:17:02 denis Exp $
#

# Namespace declarations for DocBook, ITS and HTML (HTML is used internally in DocBook schema)  
namespace db = "http://docbook.org/ns/docbook"
namespace its = "http://www.w3.org/2005/11/its"
namespace html = "http://www.w3.org/1999/xhtml"

# Include base DocBook schema
include "docbook.rnc"
{
   # Exclude ITS markup from "wildcard" element
   db._any =
      element * - (db:* | html:* | its:*) {
         (attribute * { text }
          | text
          | db._any)*
      }
}

# Include base ITS schema
include "its.rnc"

# Define pattern for local ITS attributes
db.its.attributes = 
   its-att.translate.attributes?
   & its-att.locNote.attributes?
   & its-att.term.attributes?
   & its-att.version.attributes?

# Add local ITS attributes to all DocBook elements
db.common.attributes &= db.its.attributes
db.common.idreq.attributes &= db.its.attributes

# Allow its:rules inside info element
db.info.extension |= its-rules

# Allow Ruby markup almost everywhere
db.ubiq.inlines |= its-ruby

For convenience, a “flattened” schema stored in a single file is also available, along with conversions to other schema languages.

dbits.rnc (RELAX NG compact syntax schema in one file) [Ed. note: Flattened versions are broken at this time]
dbits.rng (RELAX NG schema in one file) [Ed. note: Flattened versions are broken at this time]
dbits.dtd (DTD in one file) [Ed. note: Flattened versions are broken at this time]
dbits.xsd (W3C XML Schema) [Ed. note: TODO]

There is no need to add the its:span element, as DocBook provides a similar element called phrase which can be used for attaching ITS local attributes to an arbitrary piece of text.

The following example shows a sample DocBook article conforming to the DocBook+ITS schema. The its:translateRule element is used to indicate that function names (marked up with the function element) should not be translated. The first paragraph is also marked as non-translatable using local ITS markup.

Example 52: Sample DocBook document with ITS markup

<?xml version="1.0" encoding="UTF-8"?>
<article xmlns="http://docbook.org/ns/docbook" 
         xmlns:its="http://www.w3.org/2005/11/its" 
         xmlns:db="http://docbook.org/ns/docbook" 
         version="5.0" xml:lang="en">
  <info>
    <title>Sample article</title>
    <its:rules version="1.0">
      <its:translateRule translate="no" selector="//db:function"/>
    </its:rules>
  </info>
  <para its:translate="no">Nontranslatable content</para>
  <section>
    <title>Sample section</title>
    <para>You can delete file using <function>unlink()</function> function.</para>
  </section>
</article>
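
How the global rule and the local attribute combine in this article can be sketched in Python. This is a simplified illustration using only the standard library, not a conforming ITS processor. It assumes that a local its:translate attribute takes precedence over the global rule, that the resulting value is inherited by child elements, and it hard-codes the //db:function rule instead of reading it from its:rules. The collect helper is a name invented for this sketch.

```python
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"
ITS_TRANSLATE = "{http://www.w3.org/2005/11/its}translate"
FUNCTION = "{%s}function" % DB

# Compact copy of the sample article (its:rules omitted, rule hard-coded below).
doc = ET.fromstring(
    '<article xmlns="http://docbook.org/ns/docbook"'
    ' xmlns:its="http://www.w3.org/2005/11/its" version="5.0">'
    '<info><title>Sample article</title></info>'
    '<para its:translate="no">Nontranslatable content</para>'
    '<section><title>Sample section</title>'
    '<para>You can delete file using <function>unlink()</function> function.</para>'
    '</section></article>'
)

def collect(el, inherited, out):
    """Append the translatable text runs found under el to out."""
    local = el.get(ITS_TRANSLATE)
    if local is not None:
        value = local          # local markup wins
    elif el.tag == FUNCTION:
        value = "no"           # stand-in for the //db:function rule
    else:
        value = inherited      # otherwise inherit from the parent
    if value == "yes" and el.text and el.text.strip():
        out.append(el.text.strip())
    for child in el:
        collect(child, value, out)
        # Text after a child belongs to the current element.
        if value == "yes" and child.tail and child.tail.strip():
            out.append(child.tail.strip())

segments = []
collect(doc, "yes", segments)
print(segments)
```

Neither the first paragraph nor the function name appears in the collected segments.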

5.6.2 Relating ITS to Existing DocBook Markup

A number of DocBook constructs implement the same semantics as some of the ITS data categories. In addition, some of the DocBook attributes are translatable, which is not the default for XML documents under the ITS default settings for translatability. These attributes need to be identified as translatable.

Note: When you have the choice of using a DocBook construct or an ITS construct to express the same thing, make sure to use the DocBook construct to ensure DocBook processing tools work properly. Use ITS local markup only if DocBook does not provide an equivalent.

An external its:rules element can summarize these relations. Because DocBook use is widespread and diverse, the rules defined here are only examples, which may need further tailoring for specific uses.

Example 53: ITS external rules for DocBook documents

<its:rules xmlns:its="http://www.w3.org/2005/11/its" 
	   xmlns:db="http://docbook.org/ns/docbook"
	   xmlns:xlink="http://www.w3.org/1999/xlink"
	   version="1.0">

 <!-- Translatable attributes -->
 <its:translateRule selector="//db:table/@summary" translate="yes"/>
 <its:translateRule selector="//db:*/@xlink:title" translate="yes"/>
 <its:translateRule selector="//db:*/@xreflabel" translate="yes"/>
 <its:translateRule selector="//db:*/@label" translate="yes"/>

 <!-- Non-translatable elements -->
 <its:translateRule translate="no" selector="//db:*[@revisionflag = 'deleted']"/>
 <its:translateRule translate="no"
		    selector="//db:abbrev 
			      | //db:author 
			      | //db:classname 
			      | //db:command 
			      | //db:constant 
			      | //db:date
			      | //db:editor 
			      | //db:email 
			      | //db:envar 
			      | //db:errorcode 
			      | //db:exceptionname 
			      | //db:filename 
			      | //db:function 
			      | //db:initializer 
			      | //db:interfacename 
			      | //db:markup 
			      | //db:methodname 
			      | //db:modifier 
			      | //db:ooclass 
			      | //db:ooexception 
			      | //db:oointerface 
			      | //db:option 
			      | //db:parameter 
			      | //db:person 
			      | //db:personname 
			      | //db:productnumber 
			      | //db:property
			      | //db:returnvalue 
			      | //db:symbol 
			      | //db:tag 
			      | //db:type 
			      | //db:uri 
			      | //db:varname"/>

 <!-- Possible terms -->
 <its:termRule selector="//db:glossterm" term="yes"/>
 <its:termRule selector="//db:firstterm" term="yes"/>

 <!-- Bidirectional information -->
 <its:dirRule selector="//db:*[@dir='ltr']" dir="ltr"/>
 <its:dirRule selector="//db:*[@dir='rtl']" dir="rtl"/>
 <its:dirRule selector="//db:*[@dir='lro']" dir="lro"/>
 <its:dirRule selector="//db:*[@dir='rlo']" dir="rlo"/>

 <!-- Elements within text -->
 <its:withinTextRule withinText="yes"
		     selector="//db:abbrev 
			       | //db:accel 
			       | //db:acronym 
			       | //db:application 
			       | //db:author 
			       | //db:citation  
			       | //db:citebiblioid 
			       | //db:citerefentry 
			       | //db:citetitle 
			       | //db:classname 
			       | //db:code 
			       | //db:command 
			       | //db:computeroutput 
			       | //db:constant 
			       | //db:database 
			       | //db:date 
			       | //db:editor 
			       | //db:email 
			       | //db:emphasis 
			       | //db:envar 
			       | //db:errorcode 
			       | //db:errorname 
			       | //db:errortext 
			       | //db:errortype 
			       | //db:exceptionname 
			       | //db:filename 
			       | //db:foreignphrase 
			       | //db:function 
			       | //db:guibutton 
			       | //db:guiicon 
			       | //db:guilabel 
			       | //db:guimenu 
			       | //db:guimenuitem 
			       | //db:guisubmenu 
			       | //db:hardware 
			       | //db:initializer 
			       | //db:interfacename 
			       | //db:jobtitle 
			       | //db:keycap 
			       | //db:keycode 
			       | //db:keycombo 
			       | //db:keysym 
			       | //db:link 
			       | //db:literal 
			       | //db:markup 
			       | //db:menuchoice 
			       | //db:methodname 
			       | //db:modifier 
			       | //db:mousebutton 
			       | //db:olink
			       | //db:ooclass 
			       | //db:ooexception 
			       | //db:oointerface 
			       | //db:option 
			       | //db:optional 
			       | //db:org 
			       | //db:orgname 
			       | //db:package 
			       | //db:parameter 
			       | //db:person 
			       | //db:personname 
			       | //db:phrase 
			       | //db:productname 
			       | //db:productnumber 
			       | //db:prompt 
			       | //db:property
			       | //db:quote 
			       | //db:replaceable 
			       | //db:returnvalue 
			       | //db:shortcut 
			       | //db:subscript 
			       | //db:superscript 
			       | //db:symbol 
			       | //db:systemitem 
			       | //db:tag 
			       | //db:token 
			       | //db:trademark 
			       | //db:type 
			       | //db:uri 
			       | //db:userinput
			       | //db:varname 
			       | //db:wordasword"/>

 <its:withinTextRule withinText="nested"
		     selector="//db:alt 
			       | //db:footnote 
			       | //db:remark 
			       | //db:indexterm 
			       | //db:primary 
			       | //db:secondary 
			       | //db:tertiary"/>

</its:rules>
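
As a hedged sketch of how these rules might be put to use, the following DocBook article links to the external rules file through the xlink:href attribute of its:rules. The file name docbook-its-rules.xml is hypothetical, and the example assumes that the DocBook+ITS schema accepts this attribute on its:rules inside info.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<article xmlns="http://docbook.org/ns/docbook"
         xmlns:its="http://www.w3.org/2005/11/its"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0" xml:lang="en">
  <info>
    <title>Sample article</title>
    <!-- docbook-its-rules.xml is a hypothetical file holding the rules above -->
    <its:rules version="1.0" xlink:href="docbook-its-rules.xml"/>
  </info>
  <para>A <firstterm>lock file</firstterm> such as
    <filename>/var/run/app.pid</filename> is removed with
    <function>unlink()</function>.<footnote><para>The call fails if the file
    does not exist.</para></footnote></para>
</article>
```

With the rules applied, firstterm is flagged as a possible term, filename and function are treated as non-translatable elements within the text flow of the paragraph, and the footnote is handled as a nested, separate unit of text.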


Best Practices for XML Internationalization

W3C Working Draft 28 June 2007
