Online MT System Provenance Quality Metadata

From MultilingualWeb-LT EC Project Wiki
Revision as of 11:52, 25 September 2012 by Pnietoca (Talk | contribs)


1 Summary

This implementation demonstrates how an online MT system can automatically communicate the identity of the agents involved in translating and revising content, and how translation quality reviewers can evaluate the effect of these agents' performance on the quality of the translation.

In this use case, ITS metadata is used to solve the following problems:

  • Informing translation consumers of how the content was translated and subsequently revised.
  • Telling translation consumers which language the original source text was written in.
    • Benefit: Helps revisers and posteditors discern possible cultural nuances of the original text and so perform a better revision.
    • Uses the Source Language data category.
  • Passing feedback to the translation consumer on the severity of the errors detected during a language-oriented quality assurance (QA) process.
    • Benefit: Gives the content author and the translation consumer a better understanding of the errors commonly made during translation and revision, and helps them decide how to improve the quality of these processes.
    • Uses the Quality Error data category.

2 Use Case Description

This use case demonstration illustrates how ITS makes it possible to communicate the identity of the agents involved in translating and revising content, and how it allows translation quality reviewers to evaluate the effect of these agents' performance on the quality of the translation.

This scenario may involve the following product classes: Content Authoring Tools, Postedition Tools, Content Management Systems (CMS), MT Systems and Web Browsers.

The business processes involved are: TBD

3 Use Case Implementation

The implementation of this use case involves the following components:

  • Linguaserve’s RTTS (Real Time Translation System) ATLAS PW1.
  • DCU’s MT System MaTrEx (Statistical)
  • LucySoftware’s MT System (Rule-based)
  • Postedition tool.

4 Use Case Demonstration

  • Status: Specification under development, implementation under development.
  • Demonstration: TBD.

5 Interoperability Behaviour

Design assumptions:

  • After clicking a language in the language selector, the user sends a request to the RTTS to translate the input file.
  • By default, the MT Systems translate the content of elements and do not translate attribute values when no translate metadata is present.
  • Some of the input metadata will be removed from the output at the end of the process.
  • The example input file is based on the HTML5 files of the Test Suite.
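
The default behaviour described in the second assumption can be sketched in a few lines of Python. This is only an illustration, not the RTTS or MT System code: the class and function names are hypothetical, and the sketch assumes that an its-translate attribute is inherited by descendant elements until it is overridden.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not touch the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class TranslatableTextCollector(HTMLParser):
    """Collects the text nodes that would be sent for translation."""

    def __init__(self):
        super().__init__()
        self.translate_stack = [True]  # default: element content is translated
        self.segments = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        flag = dict(attrs).get("its-translate")
        if flag is None:
            translate = self.translate_stack[-1]  # inherit from the parent
        else:
            translate = flag == "yes"
        self.translate_stack.append(translate)

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and len(self.translate_stack) > 1:
            self.translate_stack.pop()

    def handle_data(self, data):
        # Only text nodes are collected; attribute values are never added.
        text = data.strip()
        if text and self.translate_stack[-1]:
            self.segments.append(text)


def translatable_segments(html_text):
    collector = TranslatableTextCollector()
    collector.feed(html_text)
    collector.close()
    return collector.segments


sample = ('<ul><li><a href="/es/index.html" its-translate="no">Español</a></li></ul>'
          '<p title="never collected">Quality matters.</p>')
print(translatable_segments(sample))
```

In the sample, the link text is skipped because of its-translate="no", and the title attribute is skipped because attribute values are never collected.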

5.1 Step 1: Source HTML5

This HTML source file:

<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – The importance of quality</title>
  <link href="Rules.xml" rel="its-rules"/>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html" its-translate="no">English</a></li>
	 <li><a href="/es/index.html" its-translate="no">Español</a></li>
	</ul>
   </span>
  </section>
  <section its-src-lang="en-US">
   <p>
    The text was originally written in American English.
   </p>
   <p>
    Quality assurance (QA) refers to the planned and systematic activities implemented in a quality system so that quality requirements for a product or service will be fulfilled.
   </p>
   <p>
    It is the systematic measurement, comparison with a standard, monitoring of processes and an associated feedback loop that confers error prevention.
   </p>
   <p>
    This can be contrasted with quality control, which is focused on process outputs.
   </p>
  </section>
 </body>
</html>

where the Rules.xml is:

<its:rules xmlns:its="http://www.w3.org/2005/11/its" xmlns:h="http://www.w3.org/1999/xhtml" version="2.0">
    <its:transRevProvRule selector="//h:section/h:p"
     its:transRevToolPointer="./@mtsystem" 
     its:transRevOrgPointer="./@organization"
     its:transRevPersonPointer="./@postedited-by"/>
</its:rules>
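
As a rough sketch of how a consumer might read such a rules file, the following Python snippet parses the rule and lists its selector and pointers. The function name read_prov_rules is hypothetical, the rules text is inlined so that the example is self-contained, and nothing here is taken from the actual RTTS implementation.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Inlined copy of the rules file; xmlns:h binds the h: prefix of the selector.
RULES_XML = """<its:rules xmlns:its="http://www.w3.org/2005/11/its"
    xmlns:h="http://www.w3.org/1999/xhtml" version="2.0">
    <its:transRevProvRule selector="//h:section/h:p"
     its:transRevToolPointer="./@mtsystem"
     its:transRevOrgPointer="./@organization"
     its:transRevPersonPointer="./@postedited-by"/>
</its:rules>"""


def read_prov_rules(xml_text):
    """Return one dict per transRevProvRule: its selector and its pointers."""
    root = ET.fromstring(xml_text)
    rules = []
    for rule in root.findall(f"{{{ITS_NS}}}transRevProvRule"):
        pointers = {}
        for name, value in rule.attrib.items():
            local_name = name.split("}")[-1]  # drop the "{namespace}" part
            if local_name != "selector":
                pointers[local_name] = value
        rules.append({"selector": rule.get("selector"), "pointers": pointers})
    return rules


for rule in read_prov_rules(RULES_XML):
    print(rule["selector"], rule["pointers"])
```

Note that the sketch only reads the rule; evaluating the XPath selector against the HTML document would need an XPath engine, which is outside the scope of this example.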

When the user clicks the Español link, a request for translation into Spanish is sent to the RTTS.

5.2 Step 2: The RTTS processes the content of the source file and sends it to the selected MT System

The RTTS receives the translation request, downloads the original file, and parses and processes its source code. It then downloads and reads the XML rules file, captures the rules, and applies them to the HTML document, overriding where necessary. In this example, the input file that results from applying the rules is exactly the same as the original.

This file is sent to the MT System.

5.3 Step 3A: The MT System returns the results of the translation to the RTTS

The MT System parses the file, translates it and produces the following output:

<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – La importancia de la calidad</title>
  <link href="Rules.xml" rel="its-rules"/>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html" its-translate="no">English</a></li>
	 <li><a href="/es/index.html" its-translate="no">Español</a></li>
	</ul>
   </span>
  </section>
  <section its-src-lang="en-US">
   <p>
    El texto fue escrito originalmente en Inglés Americano.
   </p>
   <p>
    La garantía de calidad (QA) se refiere a las actividades planificadas y sistemáticas aplicadas en un sistema de calidad para que los requisitos de calidad de un producto o servicio va a ser cumplida.
   </p>
   <p>
    Es la medición sistemática, la comparación con un estándar, la supervisión de procesos y un bucle de realimentación asociado que confiere la prevención de errores.
   </p>
   <p>
    Esto se puede contrastar con el control de calidad, que se centra en las salidas del proceso.
   </p>
  </section>
 </body>
</html>

The RTTS then adds the translation agent information, producing the following file:

<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – La importancia de la calidad</title>
  <link href="Rules.xml" rel="its-rules"/>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html" its-translate="no">English</a></li>
	 <li><a href="/es/index.html" its-translate="no">Español</a></li>
	</ul>
   </span>
  </section>
  <section its-src-lang="en-US">
   <p its-trans-rev-tool="lucy-mt-system-6.9">
    El texto fue escrito originalmente en Inglés Americano.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9">
    La garantía de calidad (QA) se refiere a las actividades planificadas y sistemáticas aplicadas en un sistema de calidad para que los requisitos de calidad de un producto o servicio va a ser cumplida.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9">
    Es la medición sistemática, la comparación con un estándar, la supervisión de procesos y un bucle de realimentación asociado que confiere la prevención de errores.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9">
    Esto se puede contrastar con el control de calidad, que se centra en las salidas del proceso.
   </p>
  </section>
 </body>
</html>
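
The annotation step can be sketched as follows. This is a minimal illustration under two assumptions: the fragment being annotated is well-formed XML (a real HTML5 page would need an HTML parser), and the function name annotate_tool is hypothetical.

```python
import xml.etree.ElementTree as ET


def annotate_tool(section_xml, tool_id):
    """Record the translation tool on every <p> of a well-formed fragment."""
    section = ET.fromstring(section_xml)
    for paragraph in section.iter("p"):
        paragraph.set("its-trans-rev-tool", tool_id)
    return ET.tostring(section, encoding="unicode")


fragment = ('<section its-src-lang="en-US">'
            '<p>El texto fue escrito originalmente en Inglés Americano.</p>'
            '<p>Esto se puede contrastar con el control de calidad.</p>'
            '</section>')
print(annotate_tool(fragment, "lucy-mt-system-6.9"))
```

The tool identifier is passed in as a parameter, so the same function would serve for either of the MT Systems listed among the components.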

5.4 Step 3B: The postedition tool returns the provenance metadata information to the MT System

The postedition tool will wrap the provenance information of the postedition and send it to the MT System.

TBD.

5.5 Step 4: The RTTS receives the translated file, improved by the postedition tool, from the MT System and presents the result to the user

Once the RTTS receives the output from the MT System, it modifies some tags and removes others that are no longer needed. The final result is:

<html lang="es">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – La importancia de la calidad</title>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html">English</a></li>
	 <li><a href="/es/index.html">Español</a></li>
	</ul>
   </span>
  </section>
  <section its-src-lang="en-US">
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    El texto fue escrito originariamente en inglés americano.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    La garantía de calidad (GC) se refiere a las actividades planificadas y sistemáticas implementadas en un sistema de calidad para que los requisitos de calidad de un producto o servicio va a ser cumplidas.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    Es la medición sistemática, la comparación con un estándar, la supervisión de procesos y un bucle de retroalimentación asociado que asegura la prevención de errores.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    Esto se puede contrastar con el control de calidad, que se centra en las salidas del proceso.
   </p>
  </section>
 </body>
</html>

In this example, the lang attribute of the html element is updated to the new language, and the ITS translate attributes and the rules link are deleted because they would only add noise on the user side.
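
A minimal sketch of this clean-up is shown below. It assumes the document is available as well-formed XML, which is a simplification of real HTML5 input, and the function name clean_for_delivery is hypothetical.

```python
import xml.etree.ElementTree as ET


def clean_for_delivery(doc_xml, target_lang):
    """Update the language and strip metadata the end user does not need."""
    root = ET.fromstring(doc_xml)
    root.set("lang", target_lang)

    # Remove the link to the external ITS rules file.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "link" and child.get("rel") == "its-rules":
                parent.remove(child)

    # Remove the local translate flags; provenance attributes are kept.
    for element in root.iter():
        element.attrib.pop("its-translate", None)

    return ET.tostring(root, encoding="unicode")


doc = ('<html lang="en"><head><title>ITS 2.0</title>'
       '<link href="Rules.xml" rel="its-rules"/></head>'
       '<body><a href="/es/index.html" its-translate="no">Español</a>'
       '<p its-trans-rev-tool="lucy-mt-system-6.9">Texto.</p></body></html>')
print(clean_for_delivery(doc, "es"))
```

Only the translate flags and the rules link are removed; attributes such as its-trans-rev-tool and its-src-lang stay in place so that the provenance information reaches the user.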

5.6 Step 5: The content is subjected to a QA process

When the QA process is completed, the file will look as follows:

<html lang="es">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – La importancia de la calidad</title>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html">English</a></li>
	 <li><a href="/es/index.html">Español</a></li>
	</ul>
   </span>
  </section>
  <section its-src-lang="en-US">
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    El texto fue escrito originariamente en inglés americano.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    La garantía de calidad (GC) se refiere a las actividades planificadas y sistemáticas implementadas en un sistema de calidad para que los requisitos de calidad de un producto o servicio <span its-qa-type="grammar error" its-qa-ruleSet="SAE J2450" its-qa-severity="major" its-qa-note="puedan ser cumplidos" its-qa-agent="Linguaserve">va a ser cumplidas</span>.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    Es la medición sistemática, la comparación con un estándar, la supervisión de procesos y un bucle de retroalimentación asociado que asegura la prevención de errores.
   </p>
   <p its-trans-rev-tool="lucy-mt-system-6.9" its-trans-person-tool="John Doe" its-trans-org-tool="Linguaserve">
    Esto se puede contrastar con el control de calidad, que se centra en las salidas del proceso.
   </p>
  </section>
 </body>
</html>

The QA reviewer adds notes with information about the errors found.
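
A translation consumer could collect these notes with a small parser such as the one sketched here. The class and function names are hypothetical, the attribute names are the ones used in the example above, and the sketch assumes that annotated spans are not nested.

```python
from html.parser import HTMLParser


class QAIssueCollector(HTMLParser):
    """Collects spans that carry its-qa-* annotations."""

    def __init__(self):
        super().__init__()
        self.issues = []
        self._current = None  # issue whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "span" and "its-qa-type" in attributes:
            self._current = {
                "type": attributes["its-qa-type"],
                "severity": attributes.get("its-qa-severity"),
                "note": attributes.get("its-qa-note"),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.issues.append(self._current)
            self._current = None


def collect_qa_issues(html_text):
    collector = QAIssueCollector()
    collector.feed(html_text)
    collector.close()
    return collector.issues


SAMPLE = ('<p>para que los requisitos de calidad de un producto o servicio '
          '<span its-qa-type="grammar error" its-qa-ruleSet="SAE J2450" '
          'its-qa-severity="major" its-qa-note="puedan ser cumplidos" '
          'its-qa-agent="Linguaserve">va a ser cumplidas</span>.</p>')

for issue in collect_qa_issues(SAMPLE):
    print(issue["severity"], issue["type"], issue["text"], "->", issue["note"])
```

Grouping the collected issues by severity or by translation tool would then give reviewers a simple view of how each agent affects quality.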