Online MT System Process Metadata

From MultilingualWeb-LT EC Project Wiki

1 Summary

This implementation demonstrates how an Online MT System can automatically control a process depending on its progress, convey information to editors and indicate the state of the content to the user.

In this use case, ITS metadata is used to solve the following problems:

  • Informing editors of the priority of a given production process, whether the content has been updated since the last process, and the date by which the process is expected to be completed.
    • Benefit: Provides version control.
    • Benefit: Allows the editor to plan according to the priorities and deadlines of the different processes assigned to them.
    • Uses the readiness data category.
  • Informing the user of the progress of different processes to which the content is submitted.
    • Benefit: Helps the user to extract information about the processes, compile statistics and plan future processes.
    • Uses the progress-indicator data category.
  • Informing the RTTS of the progress of the different processes so that it can take different courses of action.
    • Benefit: Allows the RTTS to manipulate the content and present it to the user in a specific way, depending on the behaviour selected.
    • Uses the progress-indicator data category.

2 Use Case Description

This use case demonstration illustrates how ITS allows an HTML5 Content Author to communicate instructions and information to a posteditor regarding the state of a process, and also to inform the user of the progress of the same process via a Real Time Translation System connected to different MT Service Providers.

This scenario may involve the following product classes: Content Authoring Tools, Postedition Tools, Content Management Systems (CMS), MT Systems and Web Browsers.

The business processes involved are: TBD

3 Use Case Implementation

The implementation of this use case involves the following components:

  • Linguaserve’s RTTS (Real Time Translation System) ATLAS PW1.
  • DCU’s MT System MaTrEx (Statistical)
  • LucySoftware’s MT System (Rule-based)
  • Postedition tool.

4 Use Case Demonstration

  • Status: Specification under development, implementation under development.
  • Demonstration: TBD.

5 Interoperability Behaviour

Design assumptions:

  • After clicking on the language selector, the user sends a request to the RTTS to translate the input file.
  • By default, if the translate metadata is absent, the MT Systems translate the content of the tags and do not translate the attributes.
  • Some of the input metadata will be deleted from the output after the process.
  • The input file example is based on the HTML5 files of the Test Suite.

5.1 Step 1: Source HTML5

This HTML source file:

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – Java Hello World!</title>
  <link href="Rules.xml" rel="its-rules"/>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html" its-translate="no">English</a></li>
	 <li><a href="/es/index.html" its-translate="no">Español</a></li>
	</ul>
   </span>
  </section>
  <section>
   <p><span> 
	Here it's the code of a "Hello World!" application in Java
   </span></p>
   <p><code its-translate="no">
    class HelloWorldApp {
     public static void main(String[] args) {
      System.out.println("Hello World!"); // Display the string.
     }
    }
   </code></p>
   <p><span>
    The "Hello World!" application consists of two primary components: the HelloWorldApp class definition, and the main method. Now compile and run it!
   </span></p>
  </section>
 </body>
</html>

where Rules.xml is:

<its:rules version="2.0" xmlns:its="http://www.w3.org/2005/11/its" xmlns:h="http://www.w3.org/1999/xhtml">
<its:readinessRule ready-to-process="posteditQA" revised="no" priority="2/3" complete-by="01/10/2012 13:00:00.000 UTC" selector="//h:section/h:p/h:span"/>
</its:rules>

The user clicks on the Español link, and a request for translation into Spanish is sent to the RTTS.

5.2 Step 2: The RTTS processes the content of the source file and sends it to the selected MT System

The RTTS receives the translation request and downloads the original file, then parses and processes the source code. Next, it downloads and reads the XML rules file, captures the rules and applies them to the HTML document, overriding where necessary. After this process the input file will look like this:

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – Java Hello World!</title>
  <link href="Rules.xml" rel="its-rules"/>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html" its-translate="no">English</a></li>
	 <li><a href="/es/index.html" its-translate="no">Español</a></li>
	</ul>
   </span>
  </section>
  <section>
   <p><span its-ready-to-process="posteditQA" its-revised="no" its-priority="2/3" its-complete-by="01/10/2012 13:00:00.000 UTC"> 
	Here it's the code of a "Hello World!" application in Java
   </span></p>
   <p><code its-translate="no">
    class HelloWorldApp {
     public static void main(String[] args) {
      System.out.println("Hello World!"); // Display the string.
     }
    }
   </code></p>
   <p><span its-ready-to-process="posteditQA" its-revised="no" its-priority="2/3" its-complete-by="01/10/2012 13:00:00.000 UTC">
    The "Hello World!" application consists of two primary components: the HelloWorldApp class definition, and the main method. Now compile and run it!
   </span></p>
  </section>
 </body>
</html>

This file is sent to the MT System.
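
The rule-application step described in this section can be sketched as follows. This is only a minimal illustration, not the RTTS implementation: it assumes a small well-formed fragment in place of the real HTML5 page, a selector reduced to a path that Python's ElementTree understands (no h: prefix), and hypothetical names (PAGE, RULE, apply_readiness_rule). It also assumes that markup already present on an element takes precedence over the global rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified inputs: a well-formed fragment standing in for the
# HTML5 page, and the readiness rule with a selector ElementTree can evaluate.
PAGE = """<body>
  <section>
    <p><span>Here it's the code</span></p>
    <p><code its-translate="no">class HelloWorldApp {}</code></p>
  </section>
</body>"""

RULE = {
    "selector": ".//section/p/span",
    "ready-to-process": "posteditQA",
    "revised": "no",
    "priority": "2/3",
    "complete-by": "01/10/2012 13:00:00.000 UTC",
}


def apply_readiness_rule(root, rule):
    """Copy the rule's values onto every selected element as its-* attributes."""
    for element in root.findall(rule["selector"]):
        for name, value in rule.items():
            if name == "selector":
                continue
            # Assumption: an attribute already on the element wins over the rule.
            element.attrib.setdefault("its-" + name, value)
    return root


root = apply_readiness_rule(ET.fromstring(PAGE), RULE)
print(ET.tostring(root, encoding="unicode"))
```

Only the span elements matched by the selector receive the readiness attributes; the code element is left untouched.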

5.3 Step 3A: The MT System returns the results of the translation to the RTTS

The MT System will parse the file, translate it and produce the following output:

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – Java Hello World!</title>
  <link href="Rules.xml" rel="its-rules"/>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html" its-translate="no">English</a></li>
	 <li><a href="/es/index.html" its-translate="no">Español</a></li>
	</ul>
   </span>
  </section>
  <section>
   <p><span its-ready-to-process="posteditQA" its-revised="no" its-priority="2/3" its-complete-by="01/10/2012 13:00:00.000 UTC" its-progress-of-process="posteditQA" its-progress-indicator="0" its-progress-units="sentences"> 
	Aquí está el código de un "Hello World!" aplicación en Java
   </span></p>
   <p><code its-translate="no">
    class HelloWorldApp {
     public static void main(String[] args) {
      System.out.println("Hello World!"); // Display the string.
     }
    }
   </code></p>
   <p><span its-ready-to-process="posteditQA" its-revised="no" its-priority="2/3" its-complete-by="01/10/2012 13:00:00.000 UTC" its-progress-of-process="posteditQA" its-progress-indicator="0" its-progress-units="sentences">
    El "Hello World!" aplicación consta de dos componentes principales: la definición de clase HelloWorldApp y el método principal. Ahora compilar y ejecutarlo!
   </span></p>
  </section>
 </body>
</html>

The MT System adds the progress-indicator metadata for the posteditQA process to the translated elements.
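
A minimal sketch of this annotation step is given below. It is not the MT System's actual code: the fragment, the function name add_progress_metadata and the choice to annotate exactly those elements that carry its-ready-to-process are assumptions made for illustration, and the translation itself is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the annotated page sent to the MT System.
PAGE = """<section>
  <p><span its-ready-to-process="posteditQA" its-revised="no">Hola</span></p>
  <p><code its-translate="no">class HelloWorldApp {}</code></p>
</section>"""


def add_progress_metadata(root, units="sentences"):
    """Mark every element that carries readiness metadata as not yet processed."""
    for element in root.iter():
        process = element.get("its-ready-to-process")
        if process is None:
            continue
        # The pending process starts at 0, as in the output shown above.
        element.set("its-progress-of-process", process)
        element.set("its-progress-indicator", "0")
        element.set("its-progress-units", units)
    return root


root = add_progress_metadata(ET.fromstring(PAGE))
print(ET.tostring(root, encoding="unicode"))
```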

5.4 Step 3B: The MT System sends the readiness metadata information to the postedition tool

The MT System will attach the process information to the translatable content and send it to the postedition tool, so that the editor can use it as an aid when reviewing the content.

TBD.

5.5 Step 4: The RTTS receives the translated file from the MT System and presents the result to the user

Once the RTTS receives the output from the MT System, it modifies some tags and removes others that are no longer needed. The final result is:

<!DOCTYPE html>
<html lang="es">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – Java Hello World!</title>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html">English</a></li>
	 <li><a href="/es/index.html">Español</a></li>
	</ul>
   </span>
  </section>
  <section>
   <p><span its-progress-of-process="posteditQA" its-progress-indicator="0" its-progress-units="sentences"> 
	Aquí está el código de un "Hello World!" aplicación en Java
   </span></p>
   <p><code>
    class HelloWorldApp {
     public static void main(String[] args) {
      System.out.println("Hello World!"); // Display the string.
     }
    }
   </code></p>
   <p><span its-progress-of-process="posteditQA" its-progress-indicator="0" its-progress-units="sentences">
    El "Hello World!" aplicación consta de dos componentes principales: la definición de clase HelloWorldApp y el método principal. Ahora compilar y ejecutarlo!
   </span></p>
  </section>
 </body>
</html>

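In this example, the lang attribute of the html element is updated to the target language, and the readiness ITS attributes and the link to the rules file are deleted because they would only produce noise on the user side. The its-translate attributes are likewise absent from the output.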
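
The clean-up described here could look roughly like the sketch below. It assumes a well-formed stand-in for the page and uses hypothetical names (READINESS_ATTRS, clean_for_user); the list of attributes to drop is inferred from the example output, not from a specification of the RTTS.

```python
import xml.etree.ElementTree as ET

# Attributes removed before the page reaches the user (inferred from the example).
READINESS_ATTRS = (
    "its-ready-to-process",
    "its-revised",
    "its-priority",
    "its-complete-by",
)

# Hypothetical, simplified stand-in for the file returned by the MT System.
PAGE = (
    '<html lang="en"><head><title>t</title>'
    '<link href="Rules.xml" rel="its-rules"/></head>'
    '<body><p><span its-ready-to-process="posteditQA" its-revised="no" '
    'its-progress-indicator="0">Hola</span></p>'
    '<p><code its-translate="no">x</code></p></body></html>'
)


def clean_for_user(root, target_lang):
    """Set the target language and strip metadata the end user does not need."""
    root.set("lang", target_lang)
    # Remove the link to the external rules file.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "link" and child.get("rel") == "its-rules":
                parent.remove(child)
    # Drop readiness and translate attributes, keeping the progress attributes.
    for element in root.iter():
        for name in READINESS_ATTRS + ("its-translate",):
            element.attrib.pop(name, None)
    return root


root = clean_for_user(ET.fromstring(PAGE), "es")
print(ET.tostring(root, encoding="unicode"))
```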

5.6 Step 5: The postedition tool improves the translations and feeds the MT System

Once postediting is complete, the next request for the same file returns the following:

<!DOCTYPE html>
<html lang="es">
 <head>
  <meta charset="utf-8">
  <title>ITS 2.0 – Java Hola Mundo!</title>
 </head>
 <body>
  <section>
   <span id="languageSelector">
	<ul>
	 <li><a href="/en/index.html">English</a></li>
	 <li><a href="/es/index.html">Español</a></li>
	</ul>
   </span>
  </section>
  <section>
   <p><span its-progress-of-process="posteditQA" its-progress-indicator="100" its-progress-units="sentences"> 
	Aquí está el código de un programa Java "Hola Mundo!"
   </span></p>
   <p><code>
    class HelloWorldApp {
     public static void main(String[] args) {
      System.out.println("Hello World!"); // Display the string.
     }
    }
   </code></p>
   <p><span its-progress-of-process="posteditQA" its-progress-indicator="100" its-progress-units="sentences">
    El programa "Hola Mundo!" consta de dos componentes principales: la definición de clase HelloWorldApp y el método principal. Ahora a compilarlo y ejecutarlo!
   </span></p>
  </section>
 </body>
</html>
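
A consumer such as the RTTS could read the progress metadata along the lines of the sketch below. The function names are hypothetical, and the sketch assumes, as the two example outputs suggest but the text does not state, that the indicator is an integer and that 100 marks a finished process. What the RTTS then does with the result is outside the scope of the sketch.

```python
import xml.etree.ElementTree as ET


def progress_by_process(root):
    """Map each process name to the progress values found in the document."""
    progress = {}
    for element in root.iter():
        process = element.get("its-progress-of-process")
        if process is None:
            continue
        value = int(element.get("its-progress-indicator", "0"))
        progress.setdefault(process, []).append(value)
    return progress


def is_complete(root, process):
    """True when every annotated element reports 100 for the process (assumption)."""
    values = progress_by_process(root).get(process, [])
    return bool(values) and all(value == 100 for value in values)


# Hypothetical fragments mirroring the pages before and after postediting.
before = ET.fromstring(
    '<section><span its-progress-of-process="posteditQA" '
    'its-progress-indicator="0" its-progress-units="sentences">a</span></section>'
)
after = ET.fromstring(
    '<section><span its-progress-of-process="posteditQA" '
    'its-progress-indicator="100" its-progress-units="sentences">a</span></section>'
)
print(is_complete(before, "posteditQA"), is_complete(after, "posteditQA"))
```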