Online MT Systems Use Case Demonstration

From MultilingualWeb-LT EC Project Wiki

1 Summary

This implementation demonstrates how an Online MT System can automatically:

  • Translate HTML5 documents from an ITS-conformant Web CMS.
  • Control a process depending on its progress, convey information to editors and indicate the state of the content to the user.
  • Communicate the identity of the agents involved in revising and translating the content, so that translation quality reviewers can evaluate how the performance of these agents affects the quality of the translation.

In this use case ITS meta-data is used to solve the following problems:

  • Informing the RTTS (Real Time Translation System) precisely which sentences or sentence fragments should or should not be translated, and what the source language is.
    • Benefit: Allows the user to automatically block machine translation of parts of the Web page that do not need to be translated, or that must not be machine translated because of their difficulty or provenance, e.g. a technical essay or constitutional laws.
    • Benefit: Automatically avoids machine translation of parts of the Web page that are in various languages and must remain that way, e.g. a language selector.
    • Benefit: Automatically tells the RTTS the source language of the text and whether it applies to the whole text or only part of it.
    • Uses the translate data category and the language information data category.
  • Informing the RTTS, at paragraph, sentence or word level, of the appropriate training corpora or glossary (depending on the MT System) that the MT Systems should use for the translation.
    • Benefit: Improves the accuracy and quality of the machine translation.
    • Uses the domain data category.
  • Providing the editor with the information needed to review the text, helping with disambiguation and improving the quality and accuracy of the revision.
    • Benefit: Helps improve the accuracy and quality of the reviewed machine translation after the postediting process.
    • Uses the localization note data category.
  • Informing the editors of the priority of a given production process, whether the content has been updated since the last process, and the date by which the process is expected to be completed.
    • Benefit: Provides version control.
    • Benefit: Allows the editor to plan work based on the priorities and deadlines of the different processes assigned to them.
    • Uses the readiness data category (Extension for CMD, out of ITS 2.0).
  • Informing the user of the progress of different processes to which the content is submitted.
    • Benefit: Helps the user to extract information about the processes, compile statistics and plan future processes.
    • Uses the progress-indicator data category (Extension for CMD, out of ITS 2.0).
  • Informing the RTTS of the progress of different processes so that it can take different courses of action.
    • Benefit: Allows the RTTS to manipulate the content and present it to the user in a particular way, depending on the behaviour selected.
    • Uses the progress-indicator data category (Extension for CMD, out of ITS 2.0).
  • Informing the translation consumers of how the content was translated and subsequently revised.
  • Providing the translation consumers with information about the language of the original source text.
    • Benefit: Helps the revisers and posteditors to discern the possible cultural nuances of the original text so as to perform a better revision.
    • Uses the Source Language data category (Extension for CMD, out of ITS 2.0).
  • Passing feedback to the translation consumer on the severity of the errors detected during a language-oriented quality assurance (QA) process.
    • Benefit: Gives the content author and the consumer of the translations a better understanding of the common errors made during the translation and revision process, and helps them make decisions to improve the quality of these processes.
    • Uses the Localization Quality Issue data category.
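
As an illustration of the first and third problems above, the following is a minimal sketch of how a system might read the translate, language information and localization note metadata from an HTML5 page. It is not the RTTS implementation: the sample markup, the class name ItsSegmenter and its fields are assumptions made for this example. The attributes used (translate, lang and its-loc-note) are the HTML5 forms of the corresponding ITS 2.0 data categories.

```python
from html.parser import HTMLParser

# Hypothetical sample page, written for this sketch (not taken from the Test Suite).
SAMPLE = """<html lang="es">
<body>
<p its-loc-note="banco: financial institution, not a bench">El banco abre a las nueve.</p>
<p translate="no">Constitución Española, artículo 3</p>
<ul translate="no"><li lang="en">English</li><li lang="fr">Français</li></ul>
</body>
</html>"""

# Elements without an end tag; they must not be pushed on the stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class ItsSegmenter(HTMLParser):
    """Splits text into MT-bound and protected segments (illustrative only).

    Assumes well-formed markup in which every non-void element is closed.
    """

    def __init__(self):
        super().__init__()
        self.stack = [(True, None)]  # inherited (translate, lang) per open element
        self.segments = []           # (text, source language) to send to MT
        self.protected = []          # (text, language) that must stay untouched
        self.notes = []              # localization notes for the editor

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        translate, lang = self.stack[-1]  # start from the parent's values
        if "translate" in a:
            value = (a["translate"] or "").lower()
            if value in ("", "yes"):
                translate = True
            elif value == "no":
                translate = False
            # any other value is invalid and keeps the inherited state
        lang = a.get("lang", lang)
        if "its-loc-note" in a:
            self.notes.append(a["its-loc-note"])
        if tag not in VOID:
            self.stack.append((translate, lang))

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        translate, lang = self.stack[-1]
        target = self.segments if translate else self.protected
        target.append((text, lang))


parser = ItsSegmenter()
parser.feed(SAMPLE)
print(parser.segments)   # [('El banco abre a las nueve.', 'es')]
print(parser.protected)  # the legal text and the language selector entries
print(parser.notes)      # notes to show to the editor
```

In this sketch only the first paragraph would be sent to an MT system, with "es" as the source language. The legal text and the language selector entries are kept as they are, and the localization note is collected separately so that it could be shown to the editor during revision.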

2 Use Case Description

This use case demonstration illustrates how ITS, via a Real Time Translation System (RTTS) connected to different MT Service Providers, makes it possible to:

  • Communicate instructions on language, domain and translation, and convey information about the translation to a content editor.
  • Communicate instructions and information to a posteditor regarding the state of a process, and also inform the user of the progress of the same process.
  • Communicate the identity of the agents involved in revising and translating the content, so that translation quality reviewers can evaluate how the performance of these agents affects the quality of the translation.

This scenario may involve the following product classes: Content Authoring Tool, Content Editor, Content Management System (CMS), MT Systems and Web Browsers.

The business processes involved are: TBD

3 Use Case Implementation

The implementation of this use case involves the following components:

  • Linguaserve’s RTMPS (Real Time Multilingual Publication System) ATLAS PW1.
  • DCU’s MT System MaTrEx (Statistical).
  • LucySoftware’s MT System (Rule-based).

4 Use Case Demonstration

5 Interoperability Behaviour

Design assumptions:

  • By clicking on the language selector, the user sends a request to the RTTS to translate the input file.
  • Some of the metadata of the input will be deleted in the output after the process.
  • The input file example is based on the HTML5 files of the Test Suite.