Okapi Use Case - Simple Machine Translation

From MultilingualWeb-LT EC Project Wiki

1 Description

XML and HTML5 documents are translated using a machine translation system, such as Microsoft Translator.

The content of each document is extracted according to its ITS properties, and the extracted content is sent to the translation server. The translated content is then merged back into the original XML or HTML5 format.
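The extract, translate and merge cycle described above can be sketched in a few lines of Python. This is only an illustration and not Okapi's implementation: it handles the local its:translate attribute on XML elements and nothing else (no global rules, no HTML5), and fake_translate is a hypothetical stand-in for a call to a real translation server.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE_ATTR = f"{{{ITS_NS}}}translate"

# Keep the conventional "its" prefix when the document is serialized again.
ET.register_namespace("its", ITS_NS)


def fake_translate(text: str) -> str:
    """Hypothetical stand-in for an MT request; it just upper-cases the text."""
    return text.upper()


def translate_document(xml_source: str, translate=fake_translate) -> str:
    """Translate text in place, skipping content marked its:translate="no"."""
    root = ET.fromstring(xml_source)

    def visit(elem, inherited):
        # A local its:translate value overrides the one inherited from the parent.
        flag = elem.get(TRANSLATE_ATTR)
        translatable = inherited if flag is None else (flag == "yes")
        if translatable and elem.text and elem.text.strip():
            elem.text = translate(elem.text)
        for child in elem:
            visit(child, translatable)
            # Tail text follows the child but belongs to this element's content.
            if translatable and child.tail and child.tail.strip():
                child.tail = translate(child.tail)

    visit(root, True)  # element content is translatable by default in ITS
    return ET.tostring(root, encoding="unicode")


sample = (
    '<doc xmlns:its="http://www.w3.org/2005/11/its">'
    '<p>hello</p><p its:translate="no">keep</p></doc>'
)
result = translate_document(sample)
```

Because the translated text is written back into the parsed tree, the merge step is simply the serialization of that tree; a real pipeline would extract text units, send them in batches, and merge them in a separate pass.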

2 Data categories

The following data categories are directly used:

Translate - The non-translatable content is protected.
Locale Filter - Only the parts in the scope of the locale filter are extracted; all other parts are treated as 'do not translate' content.
Element Within Text - The information is used to decide which elements are extracted as in-line codes and which as sub-flows.
Preserve Space - The information is mapped to the preserveSpace field in the extracted text unit.
Domain - The domain values are placed into a property that can be used to select an MT engine.
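As a rough illustration of the last two items, the sketch below carries a preserve-space flag and a domain property on each extracted text unit. Everything here is an assumption made for the example: the TextUnit class, extract_units, pick_engine and the engine names are hypothetical and are not Okapi APIs. Preserve Space is read from the local xml:space attribute only, and the domain is passed in as a parameter instead of being resolved from ITS global rules.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


@dataclass
class TextUnit:
    """Hypothetical container for one piece of extracted text."""
    text: str
    preserve_space: bool = False
    properties: dict = field(default_factory=dict)


def extract_units(xml_source, domain=None):
    """Collect element text, recording the inherited xml:space setting."""
    root = ET.fromstring(xml_source)
    units = []

    def visit(elem, preserve):
        space = elem.get(XML_SPACE)
        if space is not None:
            preserve = (space == "preserve")
        if elem.text and elem.text.strip():
            # Collapse runs of white space unless the unit asks to preserve them.
            text = elem.text if preserve else " ".join(elem.text.split())
            props = {"domain": domain} if domain else {}
            units.append(TextUnit(text, preserve, props))
        for child in elem:
            visit(child, preserve)  # tail text is ignored in this sketch

    visit(root, False)
    return units


def pick_engine(unit, engines, default="general"):
    """Choose an engine name from the unit's domain property, if it has one."""
    return engines.get(unit.properties.get("domain"), engines[default])


units = extract_units(
    '<doc><p>a   b</p><pre xml:space="preserve">a   b</pre></doc>',
    domain="medical",
)
engine = pick_engine(units[0], {"medical": "med-engine", "general": "gen-engine"})
```

Keeping the domain in a generic properties dictionary, separate from the text itself, lets the step that talks to the translation server decide how (or whether) to use it.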

3 Benefits

The ITS markup provides the key information that drives the extraction in both XML and HTML5.
Information such as white-space preservation can also be passed on with the extracted content to ensure better output.

Retrieved from "https://www.w3.org/International/multilingualweb/lt/wiki/index.php?title=Okapi_Use_Case_-_Simple_Machine_Translation&oldid=2019"
