XLIFF Mapping

IMPORTANT: This page/table is currently being progressively moved to XLIFF 1.2 Mapping and XLIFF 2.0 Mapping.

This page will be cleaned up when all the text is moved.

Notes:

  • When posting emails about this topic, please refer to Issue 55
  • The namespace prefix 'TBD' has been replaced with 'itsx'.
  • The 'structural' entries relate to the case where the element with the ITS information is a non-inline (structural) element. For example a <p> in HTML.
  • The 'inline' entries relate to the case where the element with the ITS information is an inline element. For example a <span> in HTML.
  • In general, this table attempts to map ITS data categories and their attributes first into corresponding attributes or elements in XLIFF 1.2 and 2.0. Only if no native XLIFF equivalent exists is the introduction of a native ITS attribute or element into XLIFF considered, and then only where XLIFF permits extensibility. In other words, the mapping aims to ensure that the resulting document is a conformant XLIFF document.
  • itsx: is a schema prefix for the namespace http://www.w3.org/ns/its-xliff/.
  • Color Code
color meaning
  stuck in XLIFF TC
  dependent on an unstable ITS category
  needs W3C I18N WG review

The mapping currently takes the following approach:

Data Categories ("driver") XLIFF 1.2 XLIFF 2.0
Translate
(Yves)
See XLIFF_1.2_Mapping#Translate See XLIFF_2.0_Mapping#Translate
Localization Note
(Yves)
See XLIFF_1.2_Mapping#Localization_Note structural:
<note>
See XLIFF_1.2_Mapping#Localization_Note inline:
<mrk id='1' type='comment' value='[note]' > should extensibility be introduced here?
Terminology
See XLIFF_1.2_Mapping#Terminology inline:
<mrk type='term' value='info'|ref='infoRef'>
Directionality
See XLIFF_1.2_Mapping#Directionality structural: XLIFF2 directionality mechanism
See XLIFF_1.2_Mapping#Directionality inline: XLIFF2 directionality mechanism
Language information
structural: fall back on mrk if needed

inline:

<mrk mtype='x-its' xml:lang='en'>

inline:

<mrk type='x-its' xml:lang='en'>
Element Within Text
(Yves)
See XLIFF_1.2_Mapping#Element_Within_Text yes: inline codes

no: unit

nested: sub-flows mechanism
Domain See XLIFF_1.2_Mapping#Domain itsx:domains
Text Analysis
(Dave and David)
Recommend only use text analysis inline:
<mrk mtype="phrase" its:taConfidence="0.7"
    its:taClassRef="http://nerd.eurecom.fr/ontology#Place"
    its:taIdentRef="http://dbpedia.org/resource/Arizona">
    Arizona</mrk>

If its:taConfidence is used, then the annotated text must be contained within an element with a relevant its:annotatorsRef, e.g.:

its:annotatorsRef="text-analysis|http://enrycher.ijs.si"
inline:
1.2 <mrk mtype='phrase'> using ITS native and (if used) comment for the resolved prose text
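
As a minimal sketch of the its:annotatorsRef requirement noted above, the XLIFF 1.2 fragment below places the annotated mrk inside a trans-unit that carries the annotator reference. The trans-unit id, the sentence, and the choice to declare the namespaces on the trans-unit itself are illustrative assumptions; the attribute values are the ones used in the example above.

```xml
<!-- Sketch only: id="tu1" and the sentence are hypothetical -->
<trans-unit id="tu1"
    xmlns="urn:oasis:names:tc:xliff:document:1.2"
    xmlns:its="http://www.w3.org/2005/11/its"
    its:annotatorsRef="text-analysis|http://enrycher.ijs.si">
  <!-- its:taConfidence on the mrk is covered by the annotatorsRef on the ancestor -->
  <source>The conference takes place in <mrk mtype="phrase"
      its:taConfidence="0.7"
      its:taClassRef="http://nerd.eurecom.fr/ontology#Place"
      its:taIdentRef="http://dbpedia.org/resource/Arizona">Arizona</mrk>.</source>
</trans-unit>
```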
Locale Filter
(Yves)
structural: When target locale is undefined:
<trans-unit id='id' its:localeFilterList="*-ca" its:localeFilterType="exclude">

When target locale is known: no extraction or

<trans-unit id='id' translate='no|yes'>
structural:
 <trans-unit id='id' its:localeFilterList="*-ca" its:localeFilterType="exclude">

When target locale is known: no extraction or

<trans-unit id='id' translate='no|yes'>
inline: When target locale is undefined:
<mrk mtype='x-its' its:localeFilterList="*-ca" its:localeFilterType="exclude">

When target locale is known: Inline code or

<mrk mtype="protected">...</mrk>
<mrk mtype="x-its-translate-yes">...</mrk>
inline: When target locale is undefined:
 <mrk type='x-its' its:localeFilterList="*-ca" its:localeFilterType="exclude">

When target locale is known: Inline code or

<mrk translate='yes|no'>
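
The "target locale is known" case for XLIFF 1.2 might look like the sketch below, assuming a target locale of fr-CA, which matches the excluded range "*-ca". The file attributes, ids and texts are hypothetical; only translate='no' and mtype="protected" come from the mapping above.

```xml
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="example.html" source-language="en"
      target-language="fr-CA" datatype="html">
    <body>
      <!-- Structural: the original element carried
           its:localeFilterList="*-ca" its:localeFilterType="exclude" -->
      <trans-unit id="tu1" translate="no">
        <source>This offer applies outside Canada only.</source>
      </trans-unit>
      <!-- Inline: only the filtered span is protected -->
      <trans-unit id="tu2">
        <source>Shipping is free <mrk mtype="protected">outside Canada</mrk>.</source>
      </trans-unit>
    </body>
  </file>
</xliff>
```

For a target locale that does not match the range (for example fr-FR), the same units would simply be extracted as translatable.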
Provenance
(Dave)
structural:
<target its:provenanceRecordsRef="#ph3">
...
<its:provenanceRecords xml:id="ph3">
   <its:provenanceRecord 
     person="John Doe"
     orgRef="http://www.legaltrans-ex.com/"
     revPerson="Tommy Atkins"
     revOrgRef="http://www.vistatec.com/"
     provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/>
   <its:provenanceRecord 
     revPerson="John Smith"
     revOrgRef="http://john-smith.qa.example.com"/>
 </its:provenanceRecords>

Important note to the example: The natural carriers of the provenance info on the structural level are <source>, <target>, <trans-unit>, <group>, <file>. However, in case the <source> and <target> elements are used within an <alt-trans> element (as opposed to <trans-unit>), the parent <alt-trans> element MUST carry all the provenance info relevant for the whole translation candidate.

structural:
<target its:provenanceRecordsRef="#ph3">
...
<its:provenanceRecords xml:id="ph3">
   <its:provenanceRecord 
     person="John Doe"
     orgRef="http://www.legaltrans-ex.com/"
     revPerson="Tommy Atkins"
     revOrgRef="http://www.vistatec.com/"
     provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/>
   <its:provenanceRecord 
     revPerson="John Smith"
     revOrgRef="http://john-smith.qa.example.com"/>
 </its:provenanceRecords>
inline:
 <mrk mtype="x-its" its:provenanceRecordsRef="#ph3">

or

 <mrk mtype="seg" its:provenanceRecordsRef="#ph3">
inline:
 <mrk id='1' type="its:provenanceRecordsRef" ref="#ph3">
External Resource
(???)
See XLIFF_1.2_Mapping#External_Resource in unit: NEED TO CHECK RELATIONSHIP TO RESOURCE MODULE
See XLIFF_1.2_Mapping#External_Resource inline:
 <mrk id='1' type="itsx:externalResource" ref="[uri]">
Target Pointer
(Yves)
See XLIFF_1.2_Mapping#Target_Pointer N/A as mapping. ITS processors working on XLIFF documents should use:
<its:targetPointerRule selector="//xlf:source" targetPointer="../xlf:target"/>
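
For context, a global rule of this kind has to sit inside an its:rules element that declares the namespaces used in the selector. The sketch below assumes XLIFF 1.2 documents and binds the xlf prefix to the XLIFF 1.2 namespace; it uses the ITS 2.0 Target Pointer rule element, its:targetPointerRule. Treat it as an illustration rather than a normative rules file.

```xml
<!-- Sketch of a standalone ITS rules document for XLIFF 1.2 input -->
<its:rules version="2.0"
    xmlns:its="http://www.w3.org/2005/11/its"
    xmlns:xlf="urn:oasis:names:tc:xliff:document:1.2">
  <!-- For every source element, the sibling target holds its translation -->
  <its:targetPointerRule selector="//xlf:source"
      targetPointer="../xlf:target"/>
</its:rules>
```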
Id Value
(Yves)
See XLIFF_1.2_Mapping#Id_Value structural: <unit name="[value]">
See XLIFF_1.2_Mapping#Id_Value inline: N/A - after deliberation the resolution would be more problematic than resolving the use case
Preserve Space
(???)
Structural: See XLIFF_1.2_Mapping#Preserve_Space structural: xml:space
inline: See XLIFF_1.2_Mapping#Preserve_Space inline:
Localization Quality Issue
(Yves)
structural: may be used in source, seg-source or target elements in a trans-unit or an alt-trans element:
<trans-unit>
  <target its:locQualityIssuesRef="#lqi1">c'es le contenu
  </target>
</trans-unit> 
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType="misspelling"
         locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
         locQualityIssueSeverity="50" />
</its:locQualityIssues> 

It is recommended that only the stand-off mode of annotation is used and that its:locQualityIssueType, its:locQualityIssueComment, its:locQualityIssueSeverity, its:locQualityIssueProfileRef and its:locQualityIssueEnabled are not used within trans-unit or alt-trans elements.

in unit:
<unit its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 ...
</its:locQualityIssues> 
inline: may be used inline with an mrk within a source, seg-source or target element in a trans-unit or an alt-trans element:
<mrk mtype="x-its" its:locQualityIssuesRef="#lqi1">

It is recommended that only the stand-off mode of annotation is used and that its:locQualityIssueType, its:locQualityIssueComment, its:locQualityIssueSeverity, its:locQualityIssueProfileRef and its:locQualityIssueEnabled are not used with mrk elements within trans-unit or alt-trans elements.

For both structural and inline usage, if the content of an alt-trans target element is copied verbatim to the target element of a trans-unit, i.e. no post-editing is conducted on the translation, then the value of its:locQualityIssuesRef can be copied as an attribute to the target element in the trans-unit.

inline:
<mrk type="its:lqi" ref="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef locQualityIssueEnabled/>
</its:locQualityIssues> 
Localization Quality Rating
(dF)
structural: may be used to annotate a group, trans-unit or alt-trans
<trans-unit its:locQualityRatingScore="100"
 its:locQualityRatingScoreThreshold="95"
 its:locQualityRatingProfileRef="http://example.org/qaModel/v13">
structural:
inline:
<mrk mtype="x-its" its:locQualityRatingScore="100"
 its:locQualityRatingScoreThreshold="95"
 its:locQualityRatingProfileRef="http://example.org/qaModel/v13">

Question: is LQR compatible with inline mark-up, given that it "is used to express an overall measurement of the localization quality of a document or an item in a document"?

For both structural and inline usage, if the content of an alt-trans element is copied verbatim to a trans-unit, i.e. no post-editing is conducted on the translation, then the local Localization Quality Rating attribute values can be copied to the corresponding trans-unit.

inline:
MT Confidence
(dF)
Structural: It is recommended that for use in alt-trans the existing xlf:match-quality attribute be used for presenting the value of its:mtConfidence. In this case, the ITS tools information should be given as its:annotatorsRef e.g.
<alt-trans mid="0" match-quality="0.546" its:annotatorsRef="mtconfidence|http://mlwlt.moravia.com/mlwlt-service-xliff-mt/mlwlt-service.asmx">

Note: Only in cases where the XLIFF files are used with tools that do not consume the XLIFF alt-trans match-quality and origin attributes should consideration be given to using its:mtConfidence, and then only for the target and bin-target sub-elements of the alt-trans, e.g.

<alt-trans>
 <target its:mtConfidence="0.8982">some translated text</target>
</alt-trans>

In addition, if the content of an alt-trans target element is copied verbatim to the target element of a trans-unit, i.e. no post-editing is conducted on the MT translation, then the confidence value can be copied to an its:mtConfidence attribute on the target element in the trans-unit:

<trans-unit>
  <target its:mtConfidence="0.8982" its:annotatorsRef="mtconfidence|http://mlwlt.moravia.com/mlwlt-service-xliff-mt/mlwlt-service.asmx">
  some translated text
  </target>
</trans-unit>

If the translation was NOT performed on the whole unit, each segment mrk element must carry the MT confidence metadata (see the inline case).

structural:
inline:
   <target>
   <mrk mtype="seg" its:mtConfidence="0.8982" its:annotatorsRef="mtconfidence|http://mlwlt.moravia.com/mlwlt-service-xliff-mt/mlwlt-service.asmx">
   some translated text</mrk>
   </target>

dF: I think that the WG consensus was that mtconfidence makes no sense for subsegments, so this is really only relevant with mtype="seg"

inline:
Allowed Characters
(Yves)
structural: this can be applied only to the source and/or the target elements of a trans-unit element.
<trans-unit>
  <target its:allowedCharacters="[character spec]">
  some translated text
  </target>
</trans-unit>
structural: this can be applied only to the source and/or the target elements.
<segment>
  <target its:allowedCharacters="[character spec]">
  some translated text
  </target>
</segment>
inline: this can be applied to mrk only within the source and/or the target elements of a trans-unit element.
<trans-unit>
  <source>
    <mrk mtype="x-its" its:allowedCharacters="[character spec]">some source text</mrk>
  </source>
</trans-unit>


inline: this can be applied to mrk only within the source and/or the target element.
<segment>
  <source>
    <mrk mtype="its" its:allowedCharacters="[character spec]">some source text</mrk>
  </source>
</segment>
Storage Size
(Yves)
structural: this can be applied only to the source and/or the target elements in a trans-unit.
<trans-unit>
  <target its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf">
  some translated text
  </target>
</trans-unit>


(Note: maxbytes not enough and can't use both pointer and local markup)

structural: this can be applied only to the source and/or the target elements.
<segment>
  <target its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf">
  some translated text
  </target>
</segment>


(see also possible module from FE)

inline: this can be applied to mrk only within the source and/or the target elements of a trans-unit element.
<trans-unit>
  <source>
    <mrk mtype="x-its" its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf">some source text</mrk>
  </source>
</trans-unit>


inline: this can be applied to mrk only within the source and/or the target elements.
<segment>
  <source>
    <mrk mtype="x-its" its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf">some source text</mrk>
  </source>
</segment>

Notes

Provenance mapping

Best Practice

In XLIFF, the ITS provenance annotation should only be added as local stand-off markup i.e. using a its:provenanceRecords element within the XLIFF file. This facilitates the addition of further its:provenanceRecord elements as additional translation, translation revisions or other activities recorded in external provenance records are conducted upon the XLIFF file.

If the its:provenanceRecords element referenced by a its:provenanceRecordsRef contains any of the translation or translation revision related attributes, namely: its:person, its:personRef, its:org, its:orgRef, its:tool, its:toolRef, its:revPerson, its:revPersonRef, its:revOrg, its:revOrgRef, its:revTool or its:revToolRef, then the its:provenanceRecordsRef should only be used as local or global annotation selecting xlf:target or xlf:bin-target elements or a xlf:mrk inline markup within either of those XLIFF elements. This is because the provenance mark-up in this case is appropriate only to translated text.

If the its:provenanceRecords element referenced by a its:provenanceRecordsRef contains only the provRef attribute, then the its:provenanceRecordsRef may be used as local or global annotation selecting any XLIFF element, since the provRef attribute may point to external provenance records that could relate to an activity that resulted in the textual content of any of the elements in an XLIFF file.
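
The distinction between the two kinds of record can be sketched as follows for XLIFF 1.2. The ids, texts and the second provRef URI are hypothetical, and placing the stand-off elements inside the trans-unit (one of the XLIFF 1.2 extension points) is an assumption of this sketch, not something this page prescribes.

```xml
<trans-unit id="tu1"
    xmlns="urn:oasis:names:tc:xliff:document:1.2"
    xmlns:its="http://www.w3.org/2005/11/its">
  <!-- pr1 holds only provRef, so it may annotate any element, here the source -->
  <source its:provenanceRecordsRef="#pr1">Some source text</source>
  <!-- pr2 holds translation-related attributes, so it annotates the target only -->
  <target its:provenanceRecordsRef="#pr2">some translated text</target>
  <its:provenanceRecords xml:id="pr1">
    <!-- hypothetical URI of an external provenance record -->
    <its:provenanceRecord provRef="http://www.example.com/prov/authoring/123"/>
  </its:provenanceRecords>
  <its:provenanceRecords xml:id="pr2">
    <its:provenanceRecord person="John Doe"
        orgRef="http://www.legaltrans-ex.com/"/>
  </its:provenanceRecords>
</trans-unit>
```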

If additional activities upon an XLIFF file produce values in an its:provenanceRecord that diverge from those of other elements referencing the same its:provenanceRecords, then that its:provenanceRecords must be copied to a new element with a distinct id, and the reference attribute of the element(s) concerned changed to refer to this new its:provenanceRecords id.
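
A sketch of the state after such a fork is shown below. It assumes two units that originally both referenced pr1, and that only tu2 was subsequently revised. The ids and texts are hypothetical, and hosting the stand-off elements in a group is an assumption made to keep the fragment self-contained.

```xml
<group xmlns="urn:oasis:names:tc:xliff:document:1.2"
    xmlns:its="http://www.w3.org/2005/11/its">
  <!-- Original shared record, still referenced by tu1 -->
  <its:provenanceRecords xml:id="pr1">
    <its:provenanceRecord person="John Doe"
        orgRef="http://www.legaltrans-ex.com/"/>
  </its:provenanceRecords>
  <!-- Copy of pr1 under a distinct id, extended with the revision of tu2 -->
  <its:provenanceRecords xml:id="pr2">
    <its:provenanceRecord revPerson="Tommy Atkins"
        revOrgRef="http://www.vistatec.com/"/>
    <its:provenanceRecord person="John Doe"
        orgRef="http://www.legaltrans-ex.com/"/>
  </its:provenanceRecords>
  <trans-unit id="tu1">
    <source>First sentence.</source>
    <target its:provenanceRecordsRef="#pr1">Première phrase.</target>
  </trans-unit>
  <trans-unit id="tu2">
    <source>Second sentence.</source>
    <!-- Reference changed from #pr1 to #pr2 after the revision -->
    <target its:provenanceRecordsRef="#pr2">Deuxième phrase.</target>
  </trans-unit>
</group>
```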

Design Note

Note: XLIFF 1.2 supports some constructs that could map to ITS provenance annotation, as outlined below. This mapping is not recommended because these elements are dropped in XLIFF 2.0, so the solution would not be future-proof.

<trans-unit phase-name="ph1">
<target phase-name="ph2">
...
<phase-group>
 <phase phase-name="ph1"
   process-name="translate"
   company-name="[value of its:orgRef or its:org]"
   tool-id="tl1"
   contact-name="[value of its:person]"
   contact-email="[value of its:personRef IF it has scheme 'mailto:']"
   /> 
 <phase phase-name="ph2"
 ...
   /> 
</phase-group> 
 ... 
<tool tool-id="tl1"
  tool-name="[value of its:toolRef or its:tool]"/>

One limitation, however, is that an its:personRef value with a scheme other than 'mailto:' cannot be mapped into contact-email. Therefore the proposed mapping uses a reference to the ITS stand-off record, its:provenanceRecords. This also gives a similar mapping for both XLIFF 1.2 and XLIFF 2.0, where phase-group is not available.