Difference between revisions of "XLIFF Mapping"

From MultilingualWeb-LT EC Project Wiki
Line 345:
  <td rowspan=3>MT Confidence<br/>(dF)</td>
  </tr><tr>
- <td bgcolor="white">structural:target only</td>
+ <td bgcolor="white">structural:target and bin-target only
    &lt;target its:mtConfidence="0.8982">
+ </td>
  <td bgcolor="white">structural:</td>
  </tr><tr>

Revision as of 01:05, 8 February 2013

This page provides a tentative mapping of the ITS data categories in XLIFF 1.2 and 2.0.

Notes:

  • When posting emails about this topic, please refer to Issue 55
  • The namespace prefix 'TBD' used below indicates a namespace that needs to be defined.
  • The 'structural' entries relate to the case where the element with the ITS information is a non-inline (structural) element. For example a <p> in HTML.
  • The 'inline' entries relate to the case where the element with the ITS information is an inline element. For example a <span> in HTML.
  • In general, this table attempts to map ITS data categories and their attributes first into corresponding attributes or elements in XLIFF 1.2 and 2.0. Only if no native XLIFF equivalent exists is the introduction of a native ITS attribute or element into XLIFF considered, and then only where XLIFF permits extensibility. In other words, the mapping aims to ensure that the resulting document is a conformant XLIFF document.
  • itsx: is the prefix for a schema that should be designed and hosted by W3C MLW-LT to facilitate ITS mappings, such as the XLIFF mapping. The rationale is that XLIFF processors handling the ITS mappings usually won't be generic ITS processors that could properly parse native ITS constructs.
  • Color Code
color meaning
  stuck in XLIFF TC
  dependent on an unstable ITS category
  needs W3C I18N WG review

The mapping currently takes the following approach:

Data Categories ("driver") XLIFF 1.2 XLIFF 2.0
Translate
(Yves)
structural: no extraction or
<trans-unit id='id' translate='yes|no'>
structural: no extraction or
<unit id='id'>
 <segment translate='yes|no'>
inline: inline code or
<mrk mtype="protected">...</mrk>
<mrk mtype="x-its-Translate-Yes">...</mrk>

Value for 'not-protected' needs to be defined. dF: Proposed verbose: x-its-Translate-Yes.

inline: inline code or
<mrk id="id" translate="yes|no">
Localization Note
(Yves)
structural:
<note>

alert: priority="1" description: priority > 1

structural:
<note>

Informal consensus on the mailing list that note should be extensible; this needs to be formalized and propagated into the spec
No solution for noteType without extension or new XLIFF attribute

inline:
<mrk mtype='x-its' comment='[note]' itsx:locNoteType='alert|description'>

is it a best practice?

inline:
<mrk id='1' type='comment' value='[note]' > should extensibility be introduced here?
Terminology
(???)
structural: Use <mrk> structural: Use <mrk>
inline:
<mrk mtype='term'>  need to define value for termInfo(Ref)
inline:
<mrk type='term' value='info'|ref='infoRef'>
Directionality
(???)
structural: trans-unit its:dir structural: XLIFF2 directionality mechanism
inline: Unicode characters for inline inline: XLIFF2 directionality mechanism
Ruby
(???)
TBD TBD
Language information
(???)
structural: fall back on mrk if needed

inline:

<mrk mtype='x-itsLang???' xml:lang='lang'>

inline:

<mrk type='its:Lang' value='en'>
Element Within Text
(Yves)
yes: inline codes

no: trans-unit

nested: <sub>
yes: inline codes

no: unit

nested: sub-flows mechanism
Domain
(???)
@TBD:itsDomain either DC or local ITS @TBD:itsDomain
Disambiguation
(Dave and David)
inline:
1.2 <mrk mtype='phrase'> using ITS native and (if used) comment for the resolved prose text
Locale Filter
(Yves)
structural: Same as for Translate structural: Same as for Translate
inline: Same as for Translate inline: Same as for Translate
Provenance
(Dave)
structural:
<target its:provenanceRecordsRef="#ph3">
...
<its:provenanceRecords xml:id="ph3">
   <its:provenanceRecord 
     person="John Doe"
     orgRef="http://www.legaltrans-ex.com/"
     revPerson="Tommy Atkins"
     revOrgRef="http://www.vistatec.com/"
     provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/>
   <its:provenanceRecord 
     revPerson="John Smith"
     revOrgRef="http://john-smith.qa.example.com"/>
 </its:provenanceRecords>
structural:
<target its:provenanceRecordsRef="#ph3">
...
<its:provenanceRecords xml:id="ph3">
   <its:provenanceRecord 
     person="John Doe"
     orgRef="http://www.legaltrans-ex.com/"
     revPerson="Tommy Atkins"
     revOrgRef="http://www.vistatec.com/"
     provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/>
   <its:provenanceRecord 
     revPerson="John Smith"
     revOrgRef="http://john-smith.qa.example.com"/>
 </its:provenanceRecords>

inline:
 <mrk mtype="x-its" its:provenanceRecordsRef="#ph3">

or

 <mrk mtype="seg" its:provenanceRecordsRef="#ph3">
inline:
 <mrk id='1' type="its:provenanceRecordsRef" ref="#ph3">
External Resource
(???)
in trans-unit:
 @TBD:itsExternalResource
in unit:
 @TBD:itsExternalResource
inline:
 <mrk mtype="x-its" @TBD:itsExternalResource="[uri]">
inline:
 <mrk id='1' type="its:externalResource" ref="[uri]">
Target Pointer
(Yves)
N/A in the XLIFF document. Used when extracting and merging. N/A in the XLIFF document. Used when extracting and merging.
Id Value
(Yves)
structural:
<trans-unit resname="[value]">
structural: <unit name="[value]">
inline: N/A inline: N/A
Preserve Space
(???)
structural: xml:space structural: xml:space
inline: inline:
Localization Quality Issue
(Yves)
in trans-unit:
<trans-unit its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef />
</its:locQualityIssues> 
in unit:
<unit its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef />
</its:locQualityIssues> 
inline:
<mrk mtype="x-its" its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef locQualityIssueEnabled/>
</its:locQualityIssues> 
inline:
<mrk type="its:lqi" ref="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef locQualityIssueEnabled/>
</its:locQualityIssues> 
Localization Quality Rating
(dF)
structural: structural:
inline: inline:
MT Confidence
(dF)
structural:target and bin-target only
 <target its:mtConfidence="0.8982">
structural:
inline:
   <mrk its:mtConfidence="0.8982">
inline:
Allowed Characters
(Yves)
structural:
<trans-unit its:allowedCharacters>
structural:
<unit its:allowedCharacters>
inline:
<mrk mtype="x-its" its:allowedCharacters="[pattern]">
inline:
<mrk id="id" type="its:allowedCharacters"
 value="[pattern]">
Storage Size
(Yves)
structural:
<trans-unit its:storageSize its:storageEncoding its:lineBreakType>

(maxbytes not enough and can't use both pointer and local markup)

structural:
<unit its:storageSize its:storageEncoding its:lineBreakType>

(see also possible module from FE)

inline:
<mrk mtype="x-its" its:storageSize its:storageEncoding its:lineBreakType>
inline: ??? (Could use a delimited string in mrk's @value. Otherwise, this opens the question of whether to allow extended attributes in <mrk>.)
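
As an illustration of the Translate and Id Value rows for XLIFF 1.2, the following is a minimal sketch using Python's standard xml.etree.ElementTree. The function names make_trans_unit and add_protected_span are hypothetical, the XLIFF namespace is omitted for brevity, and the sketch only covers the structural translate/resname attributes and the inline mtype="protected" case from the table.

```python
import xml.etree.ElementTree as ET


def make_trans_unit(unit_id, source_text, translate=True, id_value=None):
    """Build an XLIFF 1.2 <trans-unit> (namespace omitted, an assumption).

    ITS Translate maps to translate='yes|no'; ITS Id Value maps to resname.
    """
    tu = ET.Element(
        "trans-unit",
        {"id": unit_id, "translate": "yes" if translate else "no"},
    )
    if id_value is not None:
        tu.set("resname", id_value)
    source = ET.SubElement(tu, "source")
    source.text = source_text
    return tu


def add_protected_span(source, before, protected, after):
    """Replace the text of <source> with: before + <mrk mtype='protected'> + after."""
    source.text = before
    mrk = ET.SubElement(source, "mrk", {"mtype": "protected"})
    mrk.text = protected
    mrk.tail = after
    return mrk


if __name__ == "__main__":
    tu = make_trans_unit("tu1", "", id_value="btn.ok")
    add_protected_span(tu.find("source"), "Press ", "OK", " to continue")
    print(ET.tostring(tu, encoding="unicode"))
```

The same pattern would apply to the other inline rows that use mtype="x-its" with an ITS attribute, although those need the ITS namespace declared on the element.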

1 Notes

1.1 Provenance mapping

1.1.1 Best Practice

In XLIFF, the ITS provenance annotation should only be added as local stand-off markup, i.e. using an its:provenanceRecords element within the XLIFF file. This makes it easy to add further its:provenanceRecord elements as additional translations, translation revisions or other activities recorded in external provenance records are performed on the XLIFF file.

If the its:provenanceRecords element referenced by an its:provenanceRecordsRef contains any of the translation or translation revision related attributes, namely its:person, its:personRef, its:org, its:orgRef, its:tool, its:toolRef, its:revPerson, its:revPersonRef, its:revOrg, its:revOrgRef, its:revTool or its:revToolRef, then the its:provenanceRecordsRef should only be used as a local or global annotation selecting xlf:target or xlf:bin-target elements, or xlf:mrk inline markup within either of those XLIFF elements. This is because the provenance markup in this case applies only to translated text.

If the its:provenanceRecords element referenced by an its:provenanceRecordsRef contains only the provRef attribute, then the its:provenanceRecordsRef may be used as a local or global annotation selecting any XLIFF element, since the its:provRef attribute may point to external provenance records that could relate to an activity that produced the textual content of any element in an XLIFF file.
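
The two placement rules above can be sketched as a small check. This is only an illustration: the function name ref_allowed and its inputs (a list of element local names from the root down to the element carrying its:provenanceRecordsRef, and the set of attribute names found in the referenced records) are assumptions, not part of any specification.

```python
# Attributes that describe translation or translation revision agents.
TRANSLATION_ATTRS = {
    "person", "personRef", "org", "orgRef", "tool", "toolRef",
    "revPerson", "revPersonRef", "revOrg", "revOrgRef",
    "revTool", "revToolRef",
}

TARGET_ELEMENTS = ("target", "bin-target")


def ref_allowed(host_path, record_attrs):
    """Return True if its:provenanceRecordsRef may sit on the last element
    of host_path, given the attribute names used in the referenced records.

    host_path: element local names from the root to the host element.
    record_attrs: attribute names found on the its:provenanceRecord elements.
    """
    # Records with only provRef may be referenced from any XLIFF element.
    if not (set(record_attrs) & TRANSLATION_ATTRS):
        return True
    host = host_path[-1]
    # Translation-related records apply only to translated text.
    if host in TARGET_ELEMENTS:
        return True
    if host == "mrk":
        return any(name in TARGET_ELEMENTS for name in host_path[:-1])
    return False
```

For example, a record set containing revPerson would be accepted on a mrk inside target but rejected on a mrk inside source.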

If additional activities on an XLIFF file cause the values in an its:provenanceRecord to diverge from those that apply to other elements referencing the same its:provenanceRecords, then that its:provenanceRecords must be copied to a new element with a distinct id, and the reference attribute of the element(s) concerned must be changed to refer to the new its:provenanceRecords id.
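
The copy-on-divergence rule can be sketched with Python's standard library as follows. The helper name fork_records is hypothetical, and where the copied its:provenanceRecords element is placed (here, appended to a caller-supplied parent) is an assumption, since this page does not fix its location.

```python
import copy
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
RECORDS_REF = "{%s}provenanceRecordsRef" % ITS_NS


def fork_records(parent, records, host, new_id):
    """Copy `records` under `parent` with id `new_id` and repoint `host`.

    Other elements that reference the original records are left untouched,
    so new its:provenanceRecord entries can be added to the copy only.
    """
    clone = copy.deepcopy(records)
    clone.set(XML_ID, new_id)
    parent.append(clone)
    host.set(RECORDS_REF, "#" + new_id)
    return clone
```

After the call, a new its:provenanceRecord describing the additional activity would be added to the returned copy, not to the original element.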

1.1.2 Design Note

Note that XLIFF 1.2 supports some constructs that could map to the ITS provenance annotation, as outlined below.

<trans-unit phase-name="#ph1">
<target phase-name="#ph2">
...
<phase-group>
 <phase phase-name="ph1"
   process-name="translate"
   company-name="[value of its:orgRef or its:org]"
   tool-id="tl1"
   contact-name="[value of its:person]"
   contact-email="[value of its:personRef IF it has scheme 'mailto:']"
   /> 
 <phase phase-name="ph2"
 ...
   /> 
</phase-group> 
 ... 
<tool tool-id="tl1"
  tool-name="[value of its:toolRef or its:tool]"
/>

One limitation, however, is that an its:personRef with a scheme other than 'mailto:' cannot be mapped into contact-email. The proposed mapping therefore uses a reference to the ITS stand-off record, its:provenanceRecords. This also gives the same mapping for XLIFF 1.2 and for XLIFF 2.0, where phase-group is not available.
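
To make the limitation concrete, here is a minimal sketch of the personRef to contact-email step. The function name phase_contact_email is hypothetical, and stripping the 'mailto:' prefix before writing the value is an assumption; the table above only says the value is used if it has that scheme.

```python
from urllib.parse import urlparse


def phase_contact_email(person_ref):
    """Return a value for the XLIFF 1.2 contact-email attribute, or None.

    Only its:personRef values with the 'mailto:' scheme can be mapped;
    any other URI scheme has no place in <phase> and yields None.
    """
    if not person_ref:
        return None
    parsed = urlparse(person_ref)
    if parsed.scheme == "mailto" and parsed.path:
        return parsed.path
    return None
```

A personRef such as an http: URI returns None here, which is the case that motivates keeping the information in its:provenanceRecords.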