Difference between revisions of "LSP Localization Chain Side Use Case Demonstration"

From MultilingualWeb-LT EC Project Wiki

Revision as of 09:38, 11 February 2013

1 Use Case Description

This implementation demonstrates the use of several ITS 2.0 data categories in an internal LSP localization workflow and is part of a broader showcase called CMS – Localization Chain Integration. The contents are generated in the CMS on the language service client side, sent to the LSP translation server, processed in the LSP internal localization workflow, downloaded by the client side and imported into the CMS.
In this use case only the LSP side will be described.
The interchange format will be XHTML (export from/import to the client CMS).
In this use case the following entities take part: client CMS, web services client, LSP web services server, LSP internal localization workflow, ITS pre-production/post-production XHTML engine, LSP-based Translation Process Managers, CAT tool, LSP-based Translators and LSP-based Reviewers.

The data categories used and their benefits are the following:

  • Translate: this data category ensures that pieces of content marked as not translatable will not be translated.
  • Localization Note: this data category provides more context to the Process Managers, LSP-based Translators and LSP-based Reviewers with the aim that they do a better localization job.
  • Domain: this data category provides more information to the LSP-based Translators and LSP-based Reviewers. The internal workflow also uses this information to select the dictionaries that they will use as support in the localization job. Lastly, it is used to store, classify and select the translation memories.
  • Language Information: expresses the language of a given piece of content. It is useful for selecting the LSP-based Translators and LSP-based Reviewers and for determining the nature of the job. It also adds contextual information and helps them decide whether a piece of content should be translated.
  • Allowed Characters: this data category provides a way to check character restrictions on certain elements of a document, to guarantee that the translated documents work properly on the client side.
  • Storage Size: this data category provides a way to check size limits on a document or on elements within a document, to guarantee that the translated documents work properly on the client side.
  • Provenance: this data category records which LSP-based Translator and Reviewer, and which organization, did the job, so that issues can be traced. Also, if the same content is translated a second time, the system will propose the same Translator/Reviewer that did the job in the first place.
  • Readiness: this data category tells the LSP-based Translation Process Managers when the content was ready to process, when the language service client wants the job to be completed, the priority of the job relative to other concurrent jobs, and which processes are needed. All of this has a direct impact on how they organize the localization job (milestones and dates) and arrange it with the LSP-based Translators and LSP-based Reviewers.

2 Use Case Implementation

The implementation of this use case involves the following components:

  • Linguaserve GBCC: B2B SOAP web services built on OFBiz, developed internally by Linguaserve. Defines the operations and requests that the client side can make to the Linguaserve server; the most important are the request for translation/revision and the download of translated contents.
  • Linguaserve PLINT: internal localization workflow built on OFBiz, developed internally by Linguaserve. Allows the processing of the files sent by the client to the LSP-based Translation Process Managers. This workflow performs the necessary preprocessing and postprocessing of the ITS tags before and after the translation and revision tasks. Supports the following ITS 2.0 data categories: Translate, Localization Note, Domain, Language Information, Allowed Characters, Storage Size, Provenance and Readiness.
  • Linguaserve ITS pre-production/post-production XML engine: Java classes. Transforms the Drupal CMS XML into a CAT-oriented XML to make the translation/revision work in the CAT tool easier, and reconstructs the CMS XML from the translated CAT-oriented XML. Supports the following ITS 2.0 data categories: Translate, Localization Note, Domain, Language Information, Allowed Characters, Storage Size, Provenance and Readiness.
  • STAR Transit XV (CAT tool): tool used by the LSP-based Translators and LSP-based Reviewers to do their work.


The detailed usage of each data category is:

  • Translate: global and local usage.
  • Localization Note: global and local usage.
  • Domain: global usage.
  • Language Information: local usage.
  • Allowed Characters: local usage.
  • Storage Size: local usage.
  • Provenance: local usage.
  • Readiness: global usage.

3 Use Case Demonstration

  • Status:
  • Connection between the CMS client side and the LSP server side tested and working.
  • Client CMS - LSP localization workflow roundtrip tests made in coordination with Cocomore with Drupal XHTML files.
  • LSP workflow integrated engine tested with Drupal XHTML files for processing the selected usage of the data categories.
  • Data category usage integration with the localization workflow finished.
  • Ongoing translation of client contents.
  • Demonstration: https://www-pre.linguaserve.net/las_demos/control/MLWLTWP3DemoEngine user: demos password: demosLingu@serve
  • ITS Data Categories: Translate, Localization Note, Domain, Language Information, Allowed Characters, Storage Size, Provenance and Readiness. The Readiness data category is an ITS 2.0 extension.

4 Interoperability Behaviour

4.1 Step 1: Preproduction process

Localization workflow interaction:

  • Localization Note: when the note type is alert, send a notification to the project manager and add a tooltip visualization in the workflow.
  • Language Information: quality check to ensure that the source language of the content matches the web service parameter.
  • Storage Size: quality check for the original content.
  • Readiness: control of processes to be done. Date control for availability and delivery. Priority control.


The XML engine treats the data categories as follows in the internal preproduction process:

Preproduction process
Data category Global / document (Drupal XML) Local / element (XML node) Content (only when it includes HTML)
Translate The default values are applied. A particular XML node could be not translatable. Block parts of the content marked as not translatable.
Localization Note Inform the translator. Block and inform the translator.
Domain Inform the translator.
Language Information Block and inform the translator.
Storage Size Inform the translator.
Readiness Inform the project manager.

Example CMS XML source file:

<?xml version="1.0" encoding="UTF-8"?>
<source xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
 <its:rules>
  <its:domainRule selector="//job[@id='11']" domainPointer="@domain"/>
  <its:locNoteRule locNoteType="description" selector="//job[@id='11']//item">
   <its:locNote its:translate="no">This is a Press release.</its:locNote>
  </its:locNoteRule>
  <its:readinessRule ready-at="11/10/2012 21:19:50:000 CEST" priority="1/3" complete-by="15/10/2012 17:00:00:000 CEST" ready-to-process="hTranslate, reviseQA, hReview, publish"/>
 </its:rules>
 <job job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
  <item id="11-body-0-value" its:allowedCharacters="."><![CDATA[<p>Malterdingen, 22.08.2012 – Auf der <span translate="no">Fakuma</span> 
in Friedrichshafen vom 16. bis 20. Oktober 2012 präsentiert Ferromatik Milacron drei Exponate mit absoluten Neuheiten (Halle 
<span translate="no">B3</span>, Stand <span translate="no">B3-3203</span>). Erstmals wird die Mehrkomponenten- und Würfeltechnik in 
Kombination mit der neuen modularen F-Serie vorgeführt. Gleichzeitig zeigt der Spritzgießmaschinenbauer ein neues Modell der F-Serie. 
Nicht zuletzt enthüllt das Unternehmen die zweite Generation der vollelektrischen <span translate="no">ELEKTRON</span> Baureihe mit 
neuem Design und neuer Steuerung.</p>]]></item>
  <item id="11-body-0-format" its:translate="no" its:allowedCharacters="."><![CDATA[full_html]]></item>
  <item id="11-node_title" its:allowedCharacters="[^&lt;&gt;]" its:storageSize="255"><![CDATA[Sondertechnologie wohin das Auge reicht]]></item>
 </job>
</source>

The result of this step is shown in the next step.
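
As a rough illustration of the kind of reading this step involves, the following Python sketch parses a shortened copy of the source file above, collects the items that are not marked its:translate="no", and compares the xml:lang value with a source language parameter. It is not the Linguaserve engine (which is implemented as Java classes); the function names extract_job and language_matches and the returned dictionary layout are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Shortened copy of the example CMS XML source file.
SAMPLE = """<source xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
 <job job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
  <item id="11-body-0-value"><![CDATA[<p>Auf der <span translate="no">Fakuma</span></p>]]></item>
  <item id="11-body-0-format" its:translate="no"><![CDATA[full_html]]></item>
  <item id="11-node_title" its:storageSize="255"><![CDATA[Sondertechnologie wohin das Auge reicht]]></item>
 </job>
</source>"""


def extract_job(xml_text):
    """Hypothetical helper: collect language, domain and translatable items."""
    root = ET.fromstring(xml_text)
    job = root.find("job")
    result = {
        "lang": job.get(f"{{{XML_NS}}}lang"),
        "domain": job.get("domain"),
        "items": [],
    }
    for item in job.findall("item"):
        # Nodes carrying its:translate="no" are not extracted.
        if item.get(f"{{{ITS_NS}}}translate") == "no":
            continue
        size = item.get(f"{{{ITS_NS}}}storageSize")
        result["items"].append({
            "id": item.get("id"),
            "text": item.text or "",
            "storage_size": int(size) if size else None,
        })
    return result


def language_matches(job, expected_lang):
    """Hypothetical quality check: source language vs. web service parameter."""
    return job["lang"] == expected_lang


job = extract_job(SAMPLE)
print(job["domain"], [item["id"] for item in job["items"]])
```

The sketch reads the domain directly from the job node; a fuller version would resolve the selector and domainPointer of the its:domainRule instead.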

4.2 Step 2: Translation and revision

Localization workflow interaction:

  • Localization Note: inform the translator/reviewer of all the localization notes at the time the job is assigned.
  • Domain: automatic selection of CAT terminology. Semiautomatic assignment of translators/reviewers. Selection of Translation Memories by domains. Context for translation/revision for new terminology and neologisms.
  • Provenance: possibility to reassign the same translator/reviewer in new versions of the same content (based on identifiers). Inform the project manager.
  • Readiness: control of processes to be done. Inform the translator/reviewer of the priority.


Example preprocessed CAT oriented XML file:

<?xml version="1.0" encoding="UTF-8"?>
<xlas version="2.0">
    <xlasUTrad sourceFile="11_8.xml" xmlGuid="8">
        <xlasRefTrad its2dataCategory="Domain"><![CDATA[Presse]]></xlasRefTrad>
        <xlasRefTrad its2dataCategory="LocalizationNote"><![CDATA[This is a Press release.]]></xlasRefTrad>
        <xlasTrad nodeName="item" xlasId="1"><![CDATA[<p>Malterdingen, 22.08.2012 – Auf der <xlasbloq xlasbloqid="1">
&lt;span translate=&quot;no&quot;&gt;Fakuma&lt;/span&gt;</xlasbloq> in Friedrichshafen vom 16. bis 20. Oktober 2012 präsentiert 
Ferromatik Milacron drei Exponate mit absoluten Neuheiten (Halle <xlasbloq xlasbloqid="2">&lt;span translate=&quot;no&quot;&gt;B3&lt;/span&gt;</xlasbloq>, 
Stand <xlasbloq xlasbloqid="3">&lt;span translate=&quot;no&quot;&gt;B3-3203&lt;/span&gt;</xlasbloq>). Erstmals wird die Mehrkomponenten- und Würfeltechnik 
in Kombination mit der neuen modularen F-Serie vorgeführt. Gleichzeitig zeigt der Spritzgießmaschinenbauer ein neues Modell 
der F-Serie. Nicht zuletzt enthüllt das Unternehmen die zweite Generation der vollelektrischen 
<xlasbloq xlasbloqid="4">&lt;span translate=&quot;no&quot;&gt;ELEKTRON&lt;/span&gt;</xlasbloq> Baureihe mit neuem Design 
und neuer Steuerung.</p>]]></xlasTrad>
        <xlasTrad nodeName="item" xMax="255" xMaxOrig="39" xlasId="2"><![CDATA[Sondertechnologie wohin das Auge reicht]]></xlasTrad>
    </xlasUTrad>
</xlas>

This file will be imported into the CAT tool using an ad hoc filter, based on a filter for XML files with embedded HTML tags and modified to add new special tags.
The content inside the xlasRefTrad nodes is reference content for the translators/reviewers. It is visible but blocked in the CAT tool.
The content inside the xlasTrad nodes is translatable content. It is editable by the translators/reviewers, except for the standard HTML tags and the content between the special inline blocking tags (<xlasbloq>).

The specific manipulation with each data category is the following:

  • Translate: The content of the XML nodes with the local attribute its:translate="no" is not extracted. Additionally, for translatable nodes with HTML content, the preprocessing step adds the xlasbloq tags before and after the pieces of the content that are not translatable.
  • Localization Note: The content of the localization note (from the <its:locNote> node) is added in an xlasRefTrad node, blocked by the CAT filter but visible for the translators and reviewers, if it applies (selector attribute).
  • Domain: The content of the domain attribute (as established by the domain pointer "//job[@id=11]/@domain") is added in an xlasRefTrad node, blocked by the CAT filter but visible for the translators and reviewers, if it applies (selector attribute).
  • Language Information: The local lang attributes are blocked but visible for the translators and reviewers. Workflow use: The source language information is obtained from the DB (originally a web service parameter), is always available for the LSP-based Translation Process Managers and is used to select the translators and reviewers.
  • Storage Size: The value of the local its:storageSize attribute is read and recorded in the xMax attribute of the xlasTrad node when applicable. The size of the original content is also calculated and recorded in the xMaxOrig attribute.
  • Readiness: Workflow use: the priority information is obtained from the DB (originally a web service parameter) and is always available for the LSP-based Translation Process Managers. The expected completion date (complete-by parameter) is written to the system DB in the preprocessing step.
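
The Translate and Storage Size manipulations described above can be sketched in Python as follows. This is only an approximation written for illustration, not the Linguaserve Java engine: the function names block, unblock and size_attributes are hypothetical, the regular expression only handles non-nested <span translate="no"> elements like those in the example, and measuring xMaxOrig in UTF-8 bytes is an assumption (for the example title, characters and bytes both give 39).

```python
import html
import re

# Inline elements marked as not translatable in the HTML content.
# Assumption: only non-nested <span translate="no">...</span> is handled.
NO_TRANSLATE = re.compile(r'<span translate="no">.*?</span>', re.DOTALL)
# Blocked segments as they appear in the CAT-oriented XML.
BLOCKED = re.compile(r'<xlasbloq xlasbloqid="\d+">(.*?)</xlasbloq>', re.DOTALL)


def block(content):
    """Wrap each non-translatable span in a numbered xlasbloq tag, escaping it."""
    counter = 0

    def wrap(match):
        nonlocal counter
        counter += 1
        escaped = html.escape(match.group(0), quote=True)
        return f'<xlasbloq xlasbloqid="{counter}">{escaped}</xlasbloq>'

    return NO_TRANSLATE.sub(wrap, content)


def unblock(content):
    """Reverse of block(): drop the xlasbloq tags and restore the original markup."""
    return BLOCKED.sub(lambda m: html.unescape(m.group(1)), content)


def size_attributes(text, storage_size):
    """Build the xMax / xMaxOrig attribute values for an xlasTrad node."""
    return {
        "xMax": str(storage_size),
        # Assumption: the original size is counted in UTF-8 bytes.
        "xMaxOrig": str(len(text.encode("utf-8"))),
    }


source = '<p>Auf der <span translate="no">Fakuma</span> in Friedrichshafen</p>'
blocked = block(source)
print(blocked)
print(unblock(blocked) == source)
print(size_attributes("Sondertechnologie wohin das Auge reicht", 255))
```

Escaping the blocked markup keeps it from being interpreted as HTML by the CAT filter, and the unblock function shows how the postproduction step could restore it.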

4.3 Step 3: Postproduction process

Localization workflow interaction:

  • Domain: Storage and classification of Translation Memories by domains.
  • Readiness: control of processes to be done. Date control for availability and delivery.


The XML engine treats the data categories as follows in the internal postproduction process:

Postproduction process
Data category Global / document (Drupal XML) Local / element (XML node) Content (only when it includes HTML)
Translate Insert translation on the translatable nodes. Undo blocking of parts of the content marked as not translatable.
Language Information Update the xml:lang attribute in the translated nodes. Update the language attributes in the translated content.
Allowed Characters Check.
Storage Size Check.
Translation Agent Provenance Add or update the data category node.
Revision Agent Provenance Add or update the data category node.
Readiness Update the data category node.

Example translated and postprocessed CMS XML file (with translation simulation marks in the "translated" content):

<?xml version="1.0" encoding="UTF-8"?>
<source its:version="2.0" xmlns:its="http://www.w3.org/2005/11/its">
 <its:rules>
  <its:domainRule domainPointer="@domain" selector="//job[@id='11']"/>
  <its:locNoteRule locNoteType="description" selector="//job[@id='11']//item">
   <its:locNote its:translate="no">This is a Press release.</its:locNote>
  </its:locNoteRule>
  <its:readinessRule complete-by="15/10/2012 17:00:00:000 CEST" priority="1/3" ready-at="15/10/2012 10:59:00:393 CEST" ready-to-process="hReview, publish"/>
  <its:transProvRule selector="//item" transOrg="Linguaserve" transPerson="11236" transRevOrg="Linguaserve" transRevPerson="11239"/>
 </its:rules>
 <job domain="Presse" id="11" job_id="11" type="node" type_id="8" xml:lang="es">
  <item id="11-body-0-value" its:allowedCharacters="."><![CDATA[**es_es** <p>**es_es** Malterdingen, 22.08.2012 – Auf der 
<span translate="no">Fakuma</span>**es_es**  in Friedrichshafen vom 16. bis 20. Oktober 2012 präsentiert Ferromatik Milacron 
drei Exponate mit absoluten Neuheiten (Halle <span translate="no">B3</span>**es_es**, Stand <span translate="no">B3-3203</span>
**es_es** ). Erstmals wird die Mehrkomponenten- und Würfeltechnik in Kombination mit der neuen modularen F-Serie vorgeführt. 
Gleichzeitig zeigt der Spritzgießmaschinenbauer ein neues Modell der F-Serie. Nicht zuletzt enthüllt das Unternehmen die 
zweite Generation der vollelektrischen <span translate="no">ELEKTRON</span>**es_es**  Baureihe mit neuem Design und neuer 
Steuerung.</p>]]></item>
  <item id="11-body-0-format" its:allowedCharacters="." its:translate="no"><![CDATA[full_html]]></item>
  <item id="11-node_title" its:allowedCharacters="[^&lt;&gt;]" its:storageSize="255"><![CDATA[**es_es** Sondertechnologie wohin das Auge reicht]]></item>
 </job>
</source>

This file is exported from the CAT tool and postprocessed in the internal localization workflow.
The specific manipulation with each data category is the following:

  • Translate: Insert the translation into the translatable XML nodes. Additionally, for translatable nodes with HTML content, undo the blocking of the parts of the content marked as not translatable (remove the xlasbloq tags).
  • Language Information: Update the xml:lang attribute in the translated XML nodes, replacing the source language code with the target language code. Also, for translatable nodes with HTML content, update the language attributes in the translated content.
  • Allowed Characters: The postprocessing engine checks if the content of the node fulfills the restriction indicated by the its:allowedCharacters attribute.
  • Storage Size: The postprocessing engine checks if the content of the node fulfills the restriction indicated in the its:storageSize attribute.
  • Translation Agent Provenance: The global data category tag is added or updated showing the internal ID of the translator who has done the job.
  • Revision Agent Provenance: The global data category tag is added or updated showing the internal ID of the reviewer who has done the job.
  • Readiness: The ready-to-process attribute is updated by deleting the processing steps already done and leaving only the next steps in the localization chain. These tasks will be executed on the client CMS side: the import and publication of the translated content. The ready-at attribute is also updated with the new time stamp.
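
The Allowed Characters and Storage Size checks could look roughly like the Python sketch below. It is a simplified illustration and not the actual postprocessing engine: the function names are hypothetical, the its:allowedCharacters value is assumed to be a single-character pattern that Python's re module can interpret directly (true for the values "." and "[^<>]" used in the example), and the size is assumed to be measured in UTF-8 bytes.

```python
import re


def check_allowed_characters(text, allowed):
    """True if every character of text matches the allowedCharacters pattern."""
    # DOTALL so that "." also accepts line breaks inside the content.
    return re.fullmatch(f"(?:{allowed})*", text, flags=re.DOTALL) is not None


def check_storage_size(text, storage_size, encoding="utf-8"):
    """True if the encoded content fits in storage_size bytes."""
    return len(text.encode(encoding)) <= storage_size


def check_item(text, allowed=None, storage_size=None):
    """Return the names of the data categories whose restriction is violated."""
    problems = []
    if allowed is not None and not check_allowed_characters(text, allowed):
        problems.append("allowedCharacters")
    if storage_size is not None and not check_storage_size(text, storage_size):
        problems.append("storageSize")
    return problems


# Title node of the example: its:allowedCharacters="[^<>]", its:storageSize="255"
print(check_item("**es_es** Sondertechnologie wohin das Auge reicht", "[^<>]", 255))
print(check_item("<b>Sondertechnologie</b>", "[^<>]", 255))
```

Returning a list of violated categories, rather than a single boolean, lets the workflow report both problems for the same node in one pass.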