Work Package 3: CMS – Localization Chain Integration – Details

Version 0.2 • 20 December 2012

The following content is presented for informational purposes and has not been edited or condensed.

Cocomore Report

Task 3.1

In this Task Cocomore is developing a set of Drupal modules that will support the use of

Okapi Components for XLIFF is currently in process. This deliverable is composed of several parts:

Report on LT-Web Processing in the CMS

Task 3.2

Deliverable 3.2.2

The contents are generated in Drupal, a Content Management System (CMS). Before they are sent, the contents are annotated with ITS 2.0 metadata in two ways: automatic annotation and manual annotation. XHTML + ITS 2.0 will be used as the interchange format.

Once created, the contents are sent to the Linguaserve Global Business Connector Contents (GBCC) translation server and processed in the Linguaserve internal localization workflow, the Platform for Localization, Interoperability and Normalization of Translation (PLINT). Afterwards, once the annotated content has been translated and the metadata processed, the contents are downloaded by the client and imported into the CMS.

The ITS 2.0 selected data categories for integration are:

  1. Translate
  2. Localization Note
  3. Domain
  4. Language Information
  5. Allowed Characters
  6. Storage Size
  7. Provenance
  8. Readiness (ITS 2.0 extension)
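
As an illustration of how content annotated with the Translate data category might be consumed on the receiving side, the following minimal Python sketch collects the translatable text from a small XML fragment. It assumes the XML-style local attribute `its:translate` in the ITS namespace; the sample fragment and the function name `translatable_segments` are hypothetical and not taken from the Cocomore or Linguaserve implementations.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI
TRANSLATE = f"{{{ITS_NS}}}translate"

# Hypothetical sample content, not taken from the project.
SAMPLE = """<div xmlns:its="http://www.w3.org/2005/11/its">
  <p>Welcome to <span its:translate="no">Drupal</span> training.</p>
  <p its:translate="no">SKU-12345 <em its:translate="yes">limited offer</em></p>
</div>"""


def translatable_segments(elem, inherited=True):
    """Collect text that should go to translation, honouring its:translate."""
    flag = elem.get(TRANSLATE)
    # An element without its own flag inherits the setting of its parent.
    translate = inherited if flag is None else flag == "yes"
    segments = []
    if translate and elem.text and elem.text.strip():
        segments.append(elem.text.strip())
    for child in elem:
        segments.extend(translatable_segments(child, translate))
        # Tail text sits after the child but belongs to this element.
        if translate and child.tail and child.tail.strip():
            segments.append(child.tail.strip())
    return segments


segments = translatable_segments(ET.fromstring(SAMPLE))
print(segments)
```

The sketch treats content as translatable by default and lets a nested element override the setting of its parent, which is how the protected product code and the translatable phrase inside it are separated.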

Integration on the LSP (Language Service Provider) side is being carried out in the three areas listed below, and a complete version is expected by December 2012. The B2B Integration Showcase is expected to be completed with real content, a real client and a real use case by March 2013.

  1. Pre-production/post-production engine for processing content files annotated with ITS 2.0.
  2. Linguaserve localization workflow to provide support to project management and production processes.
  3. Computer Assisted Translation (CAT) tool usage for translation, revision and postediting with ITS 2.0 annotated content.

Current status of the work, pending completion of the real client showcase implementation:

  1. All data categories implemented are pending final unit tests in coordination with Cocomore. Expected date for final tests is December 2012.
  2. Domain: workflow integration is under development and is also expected to be completed in December 2012.
  3. Provenance: workflow integration is pending a reply from Cocomore. Expected completion date is December 2012.

Linguaserve Report

Task 3.1

Linguaserve has provided web service documentation and support to Cocomore for Drupal modules, as well as support in testing.

Detailed progress

Task 3.1: Support for web services and interoperability between CMS and TMS

  1. Webservice definition for Drupal modules.
  2. Input on best practices in content granularity and analysis of possibilities to provide context to the translators.
  3. Testing on intercommunication between Cocomore and Linguaserve.
  4. Localization case study for ITS 2.0 web localization from German into Chinese and French in progress. To be completed in December 2012.

Task 3.2 for Deliverable 3.2.2

The work status is as follows:

  1. Intercommunication between Cocomore and Linguaserve is ready, as shown at TPAC in Lyon.
  2. The CAT tool filter has been adapted to ITS 2.0 usage.
  3. The use of the data categories in the demo engine and in the CAT tool was shown at TPAC in Lyon with the previous XML-Drupal format. This format is now being changed to XHTML-Drupal.
  4. Use of metadata
    | ITS 2.0 Data Category | Behaviour | Linguaserve TMS module modified |
    |---|---|---|
    | Translate | Block parts of untranslatable content | Engine and CAT tool |
    | Localization Note | Provide information to translators/revisers | Engine and CAT tool |
    | Localization Note | Alert the project managers and add tooltip visualization in the workflow | Localization workflow |
    | Domain | Provide context to the translators | Engine and CAT tool |
    | Domain | Automatic selection of CAT terminology and translation memories | Localization workflow |
    | Language Information | Inform the translators/revisers | CAT tool |
    | Language Information | Update the information after the translation job has been completed | Engine |
    | Language Information | Quality check to ensure the source language content complies with the Webservice parameter | Localization workflow |
    | Allowed Characters | Check if the restrictions are met | Engine |
    | Storage Size | Inform the translator/reviser/posteditor | CAT tool |
    | Storage Size | Check if the restrictions are met | Engine |
    | Storage Size | Quality check using the original content | Localization workflow |
    | Provenance | Create or update the data category information with the translator/reviser/posteditor who carried out the work | Engine |
    | Readiness (ITS 2.0 extension) | Update the data category information with the availability dates and the following tasks in the localization chain | Engine |
    | Readiness (ITS 2.0 extension) | Delivery date control and priority control | Localization workflow |
  5. A demo of ITS 2.0 processing is available (with the previous XML-Drupal format) at https://www-pre.linguaserve.net/las_demos/control/MLWLTWP3DemoEngine (User: demos, password: demosLingu@serve). There were several interchange format changes (XML → HTML5 → XHTML) to cover various needs of the manual task, CMS capabilities, and best practices related to the standard. These changes affected the development of the ITS 2.0 engine.
  6. The B2B Integration Showcase is expected to be ready for the Multilingual Web Workshop in Rome (12 March 2013). Key milestones for completion include:
    • December 2012 - Web services connector and engine manipulation unit and integration tests.
    • December 2012 - Text annotation. Linguaserve enriches texts (around 75 thousand words) with metadata (support provided by Cocomore).
    • December 2012 - Cocomore sends all annotated contents to Linguaserve.
    • January 2013 - Translating environment: Linguaserve uses the annotated texts for a human machine-assisted translating scenario.
    • January 2013 - Linguaserve prepares enriched metadata content for Cocomore to import translated annotated texts (150,000 words) into Drupal.
    • February 2013 - Import of translated and annotated texts back into Drupal begins.
    • February 2013 - Import of translated and annotated texts back into Drupal ends.
    • March 2013 - Quality assurance: Review and feedback process.
    • 12 March 2013 - Deliverable D3.2.2. Linguaserve and Cocomore deliver the first version of a website with annotated and translated text.
    • April-June 2013 - Review and web site maintenance with ITS 2.0. Linguaserve and Cocomore review the whole showcase. Web content and annotation update maintenance is undertaken with the full CMS-TMS workflow.
    • June 2013 - Showcase report. Linguaserve delivers to VistaTEC the report on the showcase D3.2.2, to be integrated into Deliverable 3.2.3, including showcase layout, design, dissemination, presentation and demo material (support provided by Cocomore).
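
The engine-side checks listed for Allowed Characters and Storage Size in the metadata table above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Linguaserve engine: it assumes the XML-style attributes `its:allowedCharacters`, `its:storageSize` and `its:storageEncoding` (with UTF-8 as the default encoding), and it assumes that the character class in `its:allowedCharacters` is simple enough to be read directly by Python's `re` module. The sample fragment and the helper names `its_attr` and `check_constraints` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI

# Hypothetical sample content, not taken from the project.
SAMPLE = """<div xmlns:its="http://www.w3.org/2005/11/its">
  <p its:allowedCharacters="[a-zA-Z ]">Hello World</p>
  <p its:allowedCharacters="[a-zA-Z ]">Hello, World!</p>
  <p its:storageSize="6">Madrid</p>
  <p its:storageSize="6">Z\u00fcrich</p>
</div>"""


def its_attr(elem, name, default=None):
    """Read a local ITS attribute such as its:storageSize from an element."""
    return elem.get(f"{{{ITS_NS}}}{name}", default)


def check_constraints(root):
    """Return a list of messages for violated ITS constraints."""
    problems = []
    for elem in root.iter():
        text = "".join(elem.itertext())
        allowed = its_attr(elem, "allowedCharacters")
        # Every character of the text must match the allowed character class.
        if allowed is not None and re.fullmatch(f"(?:{allowed})*", text) is None:
            problems.append(f"allowedCharacters violated: {text!r}")
        size = its_attr(elem, "storageSize")
        if size is not None:
            # Storage size counts bytes in the storage encoding, not characters.
            encoding = its_attr(elem, "storageEncoding", "UTF-8")
            used = len(text.encode(encoding))
            if used > int(size):
                problems.append(
                    f"storageSize exceeded ({used} > {size} bytes): {text!r}"
                )
    return problems


problems = check_constraints(ET.fromstring(SAMPLE))
for message in problems:
    print(message)
```

In the sample, the six-letter city name containing an umlaut fails the six-byte limit because the umlaut occupies two bytes in UTF-8, which is why the check measures the encoded length instead of the character count.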

VistaTEC - Work Package 3 - Localization Quality Assurance Showcase

VistaTEC is developing applications and integrations to its existing systems that enable linguistic reviewers to collect relevant metadata from LT-Web marked content as it moves through the language review workflow. The metadata is then stored, processed, integrated, displayed and modified to support Localisation Managers running Language Quality Review Programmes.

Quality Metadata

VistaTEC’s main area of interest within ITS 2.0 relates to the two localization quality data categories: locQualityIssue and locQualitySummary. locQualityIssue is a somewhat complex data category as it encodes several types of data which can be applied in multiple instances to the same document elements. The desire for interoperability within the data values is also high.

Work on these data categories produced a comprehensive list of data attributes, recommended values and methods for local, global and stand-off markup.

Presentations

VistaTEC presented a browser-based prototype vision for the Reviewer’s Workbench at FEISGILLT in Seattle in October 2012. The presentation outlined the current challenges for reviewers: multiple files, double entry of data and the awkward nature of data capture. The demonstration of the prototype showed how the process could be refined with in-browser identification, classification and capture of errors, using client-side code to alter the HTML DOM and embed information pertaining to errors as ITS 2.0 metadata.
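
The idea of embedding a captured error as ITS 2.0 metadata can be sketched as follows. The prototype described above used client-side code in the browser; this sketch uses Python instead, purely for illustration, and is not VistaTEC's implementation. It assumes the HTML-style local attributes `its-loc-quality-issue-type`, `its-loc-quality-issue-comment` and `its-loc-quality-issue-severity`; the function name `mark_issue` and the sample sentence are hypothetical.

```python
import xml.etree.ElementTree as ET


def mark_issue(paragraph, start, end, issue_type, comment, severity):
    """Wrap paragraph.text[start:end] in a <span> carrying locQualityIssue data.

    Only the leading text of the paragraph is handled; this is enough to
    show the principle of local markup.
    """
    text = paragraph.text or ""
    span = ET.Element("span", {
        "its-loc-quality-issue-type": issue_type,
        "its-loc-quality-issue-comment": comment,
        "its-loc-quality-issue-severity": str(severity),
    })
    span.text = text[start:end]   # the flagged words
    span.tail = text[end:]        # text that follows the flagged words
    paragraph.text = text[:start]
    paragraph.insert(0, span)
    return span


# Hypothetical reviewed sentence with a spelling error.
p = ET.fromstring("<p>The cat sat on teh mat.</p>")
start = p.text.index("teh")
span = mark_issue(p, start, start + 3, "misspelling", "Should be 'the'", 50)
print(ET.tostring(p, encoding="unicode"))
```

Because the issue is stored as attributes on an element wrapping the affected text, several issues can be attached to different parts of the same paragraph without a separate feedback file.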

Implementation

Table 1: Milestones and progress

| Milestone | Status |
|---|---|
| ITS 2.0 Test Suite | In progress and on schedule. |
| Reviewer’s Workbench (desktop application for capturing reviewer feedback as ITS metadata and rendering existing relevant metadata as visual cues for reviewers) | Browser-based prototype coded and demonstrated in Prague. Java application not started. |
| REST API to receive requests from Reviewer’s Workbench and Web Language Quality Dashboard | Not started. Planned start January 2013. |
| Creation of triple store for storage of Provenance and conversion of data to RDF | Prototype coded and demonstrated in Prague. Planned start January 2013. |
| Conversion routines to convert ITS metadata to triple store RDF data | Not started. Planned start January 2013. |
| Business Intelligence queries against triple store | Not started. Planned start January 2013. |
| Decision Support reports | Not started. Planned start date February 2013. |
| Extension of Business Intelligence Dashboard (charts and graphical Key Performance Indicators) | Not started. Planned start date March 2013. |