ITS WG Collaborative editing page
Status: Initial Draft, i.e., please focus on technical content rather than wordsmithing at this stage.
Author: Yves Savourel
Attributes and Translatable Text
Provisions must be made to ensure that attributes with translatable values do not impair the localization process.
If translatable text is provided as an attribute value rather than element content, the following problems may arise:
- It is difficult to apply meta-information such as no-translate flags or designer's notes to the text of an attribute value (except when using mechanisms such as XPath or XPointer).
- The difficulty of attaching unique identifiers to translatable attribute text makes it more complicated to use ID-based leveraging tools.
- Translatable attributes can create problems when they are prepared for localization because they can occur within the content of a translatable element, breaking it into different parts, and possibly altering the sentence structure.
- The language identification mechanism (i.e. xml:lang) applies to the content of the element where it is declared, including its attribute values. If the text of an attribute is in a different language than the text of the element content, one cannot set the language for both correctly.
- In some languages, bidirectional markers may be needed to provide a correct display. Tags cannot be used within an attribute value. One can use Unicode control characters instead, but this is not recommended (see the W3C Note and Unicode Technical Report Unicode in XML & Other Markup Languages).
In this example the no-translate flag applies to the content of the element, but not to the title text. The title text may benefit from ID-based leveraging, but has no ID. The xml:lang attribute, after translation, will only be relevant for the element content, not the title text.
<extract id="0517.1447" translate="no" xml:lang="en" title="Ambiguous linguistic construct.">The man hit the boy with the stick in the bathroom.</extract>
In this example part of the alt-text value should be left untranslated (the name of the picture), but it is difficult to see how that would be expressed so that a machine translation tool would exhibit the correct behavior.
<image id="0517.1716" alt-text="Catalog number 123: The Fish Wife" source="fishwife.png" />
In this example many translation tools would see the value of the alt attribute as embedded inside the sentence where the image is inserted, making the translation difficult.
<para>Click the button <image source="startnow.png" alt="Start Now!" /> to register now.</para>
The translator would then see something like: "Click the button [code]Start Now![code] to register now."
Whenever possible, a schema should ensure that translatable text is stored in elements rather than attributes.
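As an illustration only, the first two examples above could be restructured so that the translatable text becomes element content. The element names title, text, alt-text and span, the examples wrapper, and the id value given to the title are hypothetical and are not taken from any existing schema. This is a sketch of the idea, not a recommended vocabulary.

```xml
<!-- Hypothetical markup: element names are assumptions made for this sketch. -->
<examples>
  <!-- The title is an element, so it can carry its own id, translate flag and xml:lang. -->
  <extract id="0517.1447" xml:lang="en">
    <title id="0517.1447.t" translate="yes">Ambiguous linguistic construct.</title>
    <text translate="no">The man hit the boy with the stick in the bathroom.</text>
  </extract>

  <!-- The name of the picture can be marked inline as not translatable. -->
  <image id="0517.1716" source="fishwife.png">
    <alt-text>Catalog number 123: <span translate="no">The Fish Wife</span></alt-text>
  </image>
</examples>
```

With this structure, the title can be translated and given its own language while the sentence it describes stays untouched, and a tool can be told to skip the picture name without any special handling of attribute values.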