Difference between revisions of "XLIFF Mapping"

From MultilingualWeb-LT EC Project Wiki

Revision as of 20:05, 6 February 2013

This page provides a tentative mapping of the ITS data categories to XLIFF 1.2 and 2.0.

Notes:

  • When posting emails about this topic, please refer to Issue 55
  • The namespace prefix 'TBD' used below indicates a namespace that needs to be defined.
  • The 'structural' entries cover the case where the element carrying the ITS information is a non-inline (structural) element, for example a <p> in HTML.
  • The 'inline' entries cover the case where the element carrying the ITS information is an inline element, for example a <span> in HTML.
  • In general, this table attempts to map ITS data categories and their attributes first onto corresponding native attributes or elements in XLIFF 1.2 and 2.0. Only if no native XLIFF equivalent exists is the introduction of a native ITS attribute or element into XLIFF considered, and then only where XLIFF permits extensibility. In other words, the mapping aims to ensure that the resulting document is a conformant XLIFF document.
  • itsx: is a prefix for a schema that should be designed and hosted by W3C MLW-LT to facilitate ITS mappings, such as the XLIFF mapping. The rationale is that XLIFF processors handling the ITS mappings will usually not be generic ITS processors that could properly parse native ITS constructs.
  • Color Code
color meaning
  stuck in XLIFF TC
  dependent on an unstable ITS category
  needs W3C I18N WG review
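
As a hedged illustration of the native-first approach described in the notes, the XLIFF 1.2 fragment below maps Translate onto the native translate attribute, while Allowed Characters, which has no native XLIFF equivalent, is carried as an ITS attribute through XLIFF's extension points. The unit ids, the source text and the character pattern are invented for the example; both mappings are taken from the table that follows and remain tentative.

```xml
<!-- Sketch only: ids, text and the pattern are hypothetical values. -->
<body xmlns="urn:oasis:names:tc:xliff:document:1.2"
      xmlns:its="http://www.w3.org/2005/11/its">
  <!-- Translate: a native XLIFF attribute exists, so no ITS markup is used -->
  <trans-unit id="1" translate="no">
    <source>ACME</source>
  </trans-unit>
  <!-- Allowed Characters: no native equivalent, so an ITS attribute is added -->
  <trans-unit id="2" its:allowedCharacters="[a-zA-Z0-9]">
    <source>user name</source>
  </trans-unit>
</body>
```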

The mapping currently takes the following approach:

Data Categories ("driver") | XLIFF 1.2 | XLIFF 2.0
Translate
(Yves)
structural: no extraction or
<trans-unit id='id' translate='yes|no'>
structural: no extraction or
<unit id='id'>
 <segment translate='yes|no'>
inline: inline code or
<mrk mtype="protected">...</mrk>
<mrk mtype="x-its-Translate-Yes">...</mrk>

Value for 'not-protected' needs to be defined. dF: Proposed verbose: x-its-Translate-Yes.

inline: inline code or
<mrk id="id" translate="yes|no">
Localization Note
(Yves)
structural:
<note>

alert: priority="1"; description: priority > 1

structural:
<note>

Informal consensus on the mailing list that note should be extensible; this needs to be formalized and propagated into the spec.
No solution for noteType without an extension or a new XLIFF attribute.

inline:
<mrk mtype='x-its-Note' comment='[note]' itsx:locNoteType='alert|description'>

is it a best practice?

inline:
<mrk id='1' type='comment' value='[note]'>

Should extensibility be introduced here?
Terminology
(???)
structural: Use <mrk> | structural: Use <mrk>
inline:
<mrk mtype='term'>  need to define value for termInfo(Ref)
inline:
<mrk type='term' value='info'|ref='infoRef'>
Directionality
(???)
structural: trans-unit its:dir | structural: XLIFF2 directionality mechanism
inline: Unicode characters for inline | inline: XLIFF2 directionality mechanism
Ruby
(???)
TBD | TBD
Language information
(???)
structural: fall back on mrk if needed

inline:

<mrk mtype='x-itsLang???' xml:lang='lang'>

inline:

<mrk type='its:Lang' value='en'>
Element Within Text
(Yves)
yes: inline codes

no: trans-unit

nested: <sub>
yes: inline codes

no: unit

nested: sub-flows mechanism
Domain
(???)
@TBD:itsDomain (either DC or local ITS) | @TBD:itsDomain
Disambiguation
(Dave and David)
inline:
1.2 <mrk mtype='phrase'> using ITS native and (if used) comment for the resolved prose text
Locale Filter
(Yves)
structural: Same as for Translate | structural: Same as for Translate
inline: Same as for Translate | inline: Same as for Translate
Provenance
(Dave)
structural:
<unit its:provenanceRecordsRef="#ph3">
...
<segment its:provenanceRecordsRef="#ph3">
...
<its:provenanceRecords xml:id="ph3">
   <its:provenanceRecord 
     person="John Doe"
     orgRef="http://www.legaltrans-ex.com/"
     revPerson="Tommy Atkins"
     revOrgRef="http://www.vistatec.com/"
     provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/>
   <its:provenanceRecord 
     revPerson="John Smith"
     revOrgRef="http://john-smith.qa.example.com"/>
 </its:provenanceRecords>

structural: in source, target, seg-source, bin-source, bin-target
<target its:provenanceRecordsRef="#ph3">
...
<its:provenanceRecords xml:id="ph3">
   <its:provenanceRecord 
     person="John Doe"
     orgRef="http://www.legaltrans-ex.com/"
     revPerson="Tommy Atkins"
     revOrgRef="http://www.vistatec.com/"
     provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/>
   <its:provenanceRecord 
     revPerson="John Smith"
     revOrgRef="http://john-smith.qa.example.com"/>
 </its:provenanceRecords>
inline:
 <mrk mtype="x-itsProvenanceRecordsRef" its:provenanceRecordsRef="#ph3">
inline:
 <mrk id='1' type="its:provenanceRecordsRef" ref="#ph3">
External Resource
(???)
in trans-unit: @TBD:itsExternalResource | in unit: @TBD:itsExternalResource
inline: <mrk mtype="x-itsExternalResource" TBD:itsExternalResource="[uri]"> | inline: <mrk id='1' type="its:externalResource" ref="[uri]">
Target Pointer
(Yves)
N/A in the XLIFF document. Used when extracting and merging. | N/A in the XLIFF document. Used when extracting and merging.
Id Value
(Yves)
structural:
<trans-unit resname="[value]">
structural: <unit name="[value]">
inline: N/A | inline: N/A
Preserve Space
(???)
structural: xml:space | structural: xml:space
inline: | inline:
Localization Quality Issue
(Yves)
in trans-unit:
<trans-unit its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef />
</its:locQualityIssues> 
in unit:
<unit its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef />
</its:locQualityIssues> 
inline:
<mrk mtype="x-itsLQI" its:locQualityIssuesRef="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef locQualityIssueEnabled/>
</its:locQualityIssues> 
inline:
<mrk type="its:lqi" ref="#lqi1">
...
<its:locQualityIssues xml:id="lqi1">
 <its:locQualityIssue locQualityIssueType locQualityIssueComment
  locQualityIssueSeverity locQualityIssueProfileRef locQualityIssueEnabled/>
</its:locQualityIssues> 
Localization Quality Rating
(dF)
structural: | structural:
inline: | inline:
MT Confidence
(dF)
structural: | structural:
inline: | inline:
Allowed Characters
(Yves)
structural:
<trans-unit its:allowedCharacters>
structural:
<unit its:allowedCharacters>
inline:
<mrk mtype="x-itsAllowedCharacters"
 its:allowedCharacters="[pattern]">
inline:
<mrk id="id" type="its:allowedCharacters"
 value="[pattern]">
Storage Size
(Yves)
structural:
<trans-unit its:storageSize its:storageEncoding its:lineBreakType>

(maxbytes not enough and can't use both pointer and local markup)

structural:
<unit its:storageSize its:storageEncoding its:lineBreakType>

(see also possible module from FE)

inline:
<mrk mtype='x-itsSS' its:storageSize its:storageEncoding its:lineBreakType>
inline: ??? (Could use a delimited string in mrk's @value. Otherwise, this opens the question of whether to allow extended attributes in <mrk>.)

1 Notes

1.1 Provenance mapping

1.1.1 Best Practice

In XLIFF, the ITS provenance annotation should only be added as local stand-off markup, i.e. using an its:provenanceRecords element within the XLIFF file. This makes it easy to add further its:provenanceRecord elements as additional translations or translation revisions are carried out in the XLIFF file.

If, as the result of additional translation or translation revision activities, the annotation of an XLIFF translation unit, segment or inline fragment diverges from that of other elements referencing the same its:provenanceRecords, then that its:provenanceRecords must be copied to a new element with a distinct id, and the reference attribute of the element(s) concerned must be changed to refer to this new its:provenanceRecords id.
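
The fragment below sketches the state after such a fork, assuming two units that originally both referenced #ph3 and a later revision that touched only the second one. The unit ids, the new id "ph4", the wrapping element and the position of the stand-off elements are assumptions made for illustration; the record values are shortened from the mapping table.

```xml
<!-- Sketch only: "u1", "u2", "ph4" and the element placement are hypothetical. -->
<file xmlns:its="http://www.w3.org/2005/11/its">
  <!-- u1 was not touched again and keeps referencing the original records -->
  <unit id="u1" its:provenanceRecordsRef="#ph3">...</unit>
  <!-- u2 was revised once more, so its reference now points at the copy -->
  <unit id="u2" its:provenanceRecordsRef="#ph4">...</unit>

  <!-- Original records, still valid for u1 -->
  <its:provenanceRecords xml:id="ph3">
    <its:provenanceRecord person="John Doe" revPerson="Tommy Atkins"/>
  </its:provenanceRecords>

  <!-- Copy of ph3 under a distinct id, extended with the new revision -->
  <its:provenanceRecords xml:id="ph4">
    <its:provenanceRecord person="John Doe" revPerson="Tommy Atkins"/>
    <its:provenanceRecord revPerson="John Smith"/>
  </its:provenanceRecords>
</file>
```

Copying rather than editing ph3 in place keeps the provenance of u1 from silently gaining a revision that never applied to it.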

1.1.2 Design Note

Note that XLIFF 1.2 supports some constructs that could map to the ITS provenance annotation, as outlined below.

<trans-unit phase-name="#ph1">
<target phase-name="#ph2">
...
<phase-group>
 <phase phase-name="ph1"
   process-name="translate"
   company-name="[value of its:orgRef or its:org]"
   tool-id="tl1"
   contact-name="[value of its:person]"
   contact-email="[value of its:personRef IF it has scheme 'mailto:']"
   />
 <phase phase-name="ph2"
 ...
   />
</phase-group>
 ...
<tool tool-id="tl1"
  tool-name="[value of its:toolRef or its:tool]"/>

One limitation, however, is that an its:personRef value with a scheme other than 'mailto:' cannot be mapped into contact-email. The proposed mapping therefore uses a reference to the ITS stand-off record, its:provenanceRecords. This also yields a similar mapping for both XLIFF 1.2 and XLIFF 2.0, where phase-group is not available.
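
A minimal sketch of the case that motivates this choice: the person is identified by an HTTP URI, which has no place in contact-email but fits in the stand-off record. The URI and the unit id are placeholders, and placing its:provenanceRecords inside the trans-unit is an assumption rather than part of the proposed mapping.

```xml
<!-- Sketch only: the personRef URI, the id and the placement are hypothetical. -->
<trans-unit id="1" xmlns:its="http://www.w3.org/2005/11/its">
  <source>...</source>
  <target its:provenanceRecordsRef="#ph3">...</target>
  <!-- A non-mailto personRef cannot go into contact-email, but is fine here -->
  <its:provenanceRecords xml:id="ph3">
    <its:provenanceRecord personRef="http://www.example.com/staff/jdoe"/>
  </its:provenanceRecords>
</trans-unit>
```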