Metadata and workflow comparison
Contents
- 1 Process Definitions
- 1.1 Generation of Source Locale Content
- 1.2 Preparation for Localization
- 1.3 Localization
- 1.3.1 translate
- 1.3.2 machine-translate
- 1.3.3 statistical-machine-translate
- 1.3.4 rule-based-machine-translate
- 1.3.5 translation-memory-lookup
- 1.3.6 human-translate
- 1.3.7 human-transcreate
- 1.3.8 post-edit
- 1.3.9 Non-language-content-localization
- 1.3.10 review-target-quality
- 1.3.11 review-translation-quality
- 1.3.12 analyse-localization-workflow
- 1.4 Publication of Target Locale Content
- 1.5 Curation of Content Translations
- 2 Mapping process definitions to data categories
1 Process Definitions
To help structure these process definitions, each one is captured as follows:
- It is defined as a class, i.e. a type of process that can have any number of real-world instances. This offers the opportunity to refine definitions in a structured way in future by producing sub-classes.
- It is grouped under a set of process classes, which helps communicate clearly where it sits in the overall internationalization and localization process.
- It should indicate the outputs of the process, or the change in state of those outputs compared to the inputs.
- It should indicate the inputs to the process and any preconditions on those inputs.
- It should be correlated against data categories, indicating whether the process creates, reads, updates or deletes specific data categories in source or target content.
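As a rough illustration of the five points above, one process definition could be held in a small record type. This is only a sketch: the names `ProcessClass` and `Access`, the field layout, and the data category mapping shown for machine-translate are assumptions made for the example, not part of the definitions on this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class Access(Enum):
    """Operations a process may perform on a data category."""
    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"


@dataclass
class ProcessClass:
    """One process definition: class, group, inputs, outputs, data categories."""
    name: str
    group: str                                   # e.g. "Localization"
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    parent: Optional["ProcessClass"] = None      # set when this is a sub-class
    # data category name -> operations the process performs on it
    data_categories: Dict[str, Set[Access]] = field(default_factory=dict)


translate = ProcessClass(
    name="translate",
    group="Localization",
    inputs=["source content"],
    outputs=["binding between source content and a translation into the target language"],
)

machine_translate = ProcessClass(
    name="machine-translate",
    group="Localization",
    outputs=["machine translation agent identification"],
    parent=translate,
    # Illustrative assumption only: this mapping is not stated on the page.
    data_categories={"translationAgent": {Access.CREATE}},
)
```

Keeping the parent as a plain reference leaves room for the later refinement into sub-classes that the first point anticipates.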
1.1 Generation of Source Locale Content
This is a group of process classes that are involved in the generation of content that is intended to be localised for different target locales.
1.1.1 create-source
- Definition
- Author source locale content. This involves the creation of unstructured content but also its annotation to provide structure such as headings and bullet points. The creation of source content may be conducted in compliance with internationalisation guidelines, whereby authors are encouraged to follow certain spelling, grammatical, format, style and annotation guidelines that aim to facilitate the subsequent localization of the content.
1.1.2 revise-source
- Definition
- Provide a revised version of existing source content (needs contentResultSource - yes)
- Input
- Source content
- Output
- Revised version of the source content, associated to the original
1.1.3 annotate-source
- Definition
- This is the annotation of content to associate portions of it with specific meta-data relevant to the processes it will later undergo. This could include marking content to be translated, to be transliterated, that represents a named entity, or that represents a term. Such annotation may be performed by the original author, or by a dedicated quality assurance or localisation specialist, and may be assisted by text analytic services, e.g. a named entity recognizer.
1.1.4 generate-source-terminology
- Definition
- Management of source content terminology in a separate terminology database. Terminology represents certain important and repeated concepts within a set of content, the consistent use of which is seen to improve the comprehension of the content. For localisation, the consistent translation of terminology is an important quality concern.
1.1.5 source-quality-assurance
- Definition
- Quality assurance review of source content. This is the process of ensuring the consistency and comprehensibility of source content, including functions such as spell checking, grammar checking, correct terminology usage, correct use of meta-data, correct annotation of concepts, the use of controlled language to ease the translation task and the inclusion of content specific instructions or comments to inform the translation process.
1.1.6 voiceover-source
- Definition
- Provision of a spoken audio component to source content.
1.1.7 subtitle-source
- Definition
- Provision of textual transcription of audio component of source content.
1.1.8 internationalise-content
- Definition
- check and revise content to meet internationalisation guidelines
- Note
- This seems a generic process encompassing many of the others presented in this section, so this may be redundant as a separate process in this list.
1.2 Preparation for Localization
This is a group of process classes that are involved in preparing content specifically to be localized.
1.2.1 translate-multilingual-terms
- Definition
- associate approved translation to source content terminology
1.2.2 normalize-source
- Definition
- This involves the division of source content into segments, usually at a sentential level, which are amenable to translation. This includes the removal of mark-up not relevant to the translation process and the inclusion of instructions on whether specific content should not be translated in specific locales.
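A minimal sketch of the segmentation step, assuming HTML-like mark-up and sentence boundaries marked by `.`, `!` or `?` followed by whitespace. The function name `normalize_source` and both regular expressions are illustrative choices. The sketch removes all tags rather than only those irrelevant to translation, and it does not handle locale-specific "do not translate" instructions.

```python
import re
from typing import List

TAG = re.compile(r"<[^>]+>")                 # any mark-up tag
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")  # whitespace following ., ! or ?


def normalize_source(content: str) -> List[str]:
    """Strip mark-up, collapse whitespace, then split into sentence-level segments."""
    text = TAG.sub("", content)
    text = re.sub(r"\s+", " ", text).strip()
    return [segment for segment in SENTENCE_END.split(text) if segment]


segments = normalize_source("<p>Press <b>Start</b>. The device restarts!</p>")
```

A rule this naive will split wrongly after abbreviations such as "e.g.", which is one reason production workflows tend to rely on configurable segmentation rules instead.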
1.2.3 assemble-localization-job
- Definition
- assemble the source content and associated language resources, e.g. translation memories, term-bases, MT engines, translation guidelines, as input into a localisation job
1.2.4 generate-localization-quote
- Definition
- provide an estimate of the effort required to localise a job for quoting or pricing purposes, not to perform the job
1.2.5 transcribe-source
- Definition
- transcribe the source content (needs contentResultSource - yes)
1.2.6 transliterate-source
- Definition
- transliterate the source content (needs contentResultSource - yes)
1.3 Localization
This is a group of process classes that are involved in the localization of content from a source locale to one or more target locales.
1.3.1 translate
- Definition
- Bind a translation in the target language to the source content. This involves the translation of source locale content into a language appropriate to a target locale. It may be performed in a way that is sensitive to the domain of the source content, to terminology and entity annotation in the source and to translation instructions provided with the source content. It may be performed by human or automated software agents.
- Input
- source content
- Output
- binding between source content and a translation into the target language
1.3.2 machine-translate
- subclass of
- translate
- Definition
- translate using an automated process
- Output
- Machine translation agent identification
1.3.3 statistical-machine-translate
- subclass of
- machine-translate
- Definition
- translate using an automated process
- Output
- translation confidence score
1.3.4 rule-based-machine-translate
- subclass of
- machine-translate
- Definition
- translate using an automated process
1.3.5 translation-memory-lookup
- subclass of
- machine-translate
- Definition
- translate using an automated process
- Output
- fuzzy match score
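To make the fuzzy match score output concrete, here is a small sketch of a lookup that returns the closest stored translation together with its score. It assumes a character-level similarity ratio from Python's standard `difflib` module and a hypothetical 0.75 threshold. Actual translation memory tools compute match scores in their own ways, so the numbers here are illustrative only.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def tm_lookup(source: str, memory: Dict[str, str],
              threshold: float = 0.75) -> Optional[Tuple[str, float]]:
    """Return (translation, fuzzy match score) for the closest stored segment, or None."""
    best: Optional[Tuple[str, float]] = None
    for stored_source, translation in memory.items():
        # ratio() is 1.0 for identical strings and approaches 0.0 as they diverge
        score = SequenceMatcher(None, source, stored_source).ratio()
        if score >= threshold and (best is None or score > best[1]):
            best = (translation, score)
    return best


# Hypothetical one-entry memory: source segment -> stored translation
memory = {"Press the Start button.": "Appuyez sur le bouton Démarrer."}
exact = tm_lookup("Press the Start button.", memory)
miss = tm_lookup("Quit.", memory)
```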
1.3.6 human-translate
- subclass of
- translate
- Definition
- translate using human judgement only
1.3.7 human-transcreate
- subclass of
- human-translate
- Definition
- Human translation performed with priority given to maintaining intent, style, tone and context, over the literal accuracy of translation. It is typically performed on marketing content.
1.3.8 post-edit
- subclass of
- translate and human-translate
- Definition
- approve a previous machine translation or provide a preferred alternative translation
- Input
- a binding between some source content and one or more translations into the target language
- Output
- a preferred revision of the target content or the selection of one of the input translations as the preferred translation
1.3.9 Non-language-content-localization
- Definition
- This is the adaptation of non-text content to the target locale, which may involve the adaptation or replacement of images, colour schemes, graphical design and content layouts, text fonts, and number and data formats, including currency, numbers, dates etc.
1.3.10 review-target-quality
- Definition
- human review of the target text only, without the source text, for quality assurance (see UNE 15038 “review”), for instance by an expert
1.3.11 review-translation-quality
- Definition
- human revision for quality assurance examining the translation and comparing source and target (see UNE 15038 “revision”)
1.3.12 analyse-localization-workflow
- Definition
- This is the analysis of the performance of the localization process to identify processes that are not achieving agreed performance targets and to provide input to decision making on process improvement.
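The "subclass of" relations given in 1.3.1 to 1.3.8 can be sketched as an ordinary class hierarchy. Two assumptions are made here that the definitions do not state: that a sub-class carries its parent's outputs in addition to its own, and that post-edit, listed as a subclass of both translate and human-translate, can be modelled by deriving from human-translate alone, since human-translate already derives from translate. The Python class names and shortened output labels are illustrative.

```python
class Translate:
    """1.3.1 translate: bind a target-language translation to source content."""
    outputs = ["source-target binding"]

    @classmethod
    def all_outputs(cls):
        """Collect outputs declared on this class and on each ancestor, root first."""
        collected = []
        for klass in reversed(cls.__mro__):
            collected.extend(vars(klass).get("outputs", []))
        return collected


class MachineTranslate(Translate):             # 1.3.2
    outputs = ["machine translation agent identification"]


class StatisticalMachineTranslate(MachineTranslate):   # 1.3.3
    outputs = ["translation confidence score"]


class RuleBasedMachineTranslate(MachineTranslate):     # 1.3.4
    pass


class TranslationMemoryLookup(MachineTranslate):       # 1.3.5
    outputs = ["fuzzy match score"]


class HumanTranslate(Translate):               # 1.3.6
    pass


class HumanTranscreate(HumanTranslate):        # 1.3.7
    pass


class PostEdit(HumanTranslate):                # 1.3.8
    outputs = ["preferred revision or selected translation"]
```

Under these assumptions, `StatisticalMachineTranslate.all_outputs()` lists the binding, the agent identification and the confidence score, in that order.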
1.4 Publication of Target Locale Content
This is a group of process classes that are involved in the consumption of target locale content and any associated rating or annotation by content consuming users.
1.4.1 integrate-target-content
- Definition
- This is the integration of the target content into the form of the content to be consumed. It includes the assembly of content in the correct order and the integration of meta-data previously removed from the source content.
1.4.2 test-target-content
- Definition
- This is testing of applications that use the target content to ensure correct operation and presentation of the content.
1.4.3 proof-target-content
- Definition
- human checking of proofs before publishing for quality assurance (see UNE 15038 “proofreading”)
1.4.4 publish-target-content
- Definition
- Publish target locale content for consumption by its intended consumers.
1.4.5 localisation-meta-data-removal
- Definition
- This is the process of removing meta-data that was used in the localization process, but is not required for the publishing and consumption of the target content.
1.4.6 gather-target-content-consumer-feedback
- Definition
- This is the process of eliciting and collecting feedback on the quality and usefulness of the target content from its consumers.
1.5 Curation of Content Translations
This is a group of process classes that are involved in the analysis and productive reuse of process provenance data and source-target localized content bindings resulting from the execution of process class instances.
1.5.1 maintain-translation-memory
- Definition
- Maintenance of translation memories, including replacement of matches with revised translations and corrections to source content
1.5.2 maintain-termbase
- Definition
- This is the collection and indexing of all identified terms from a set of content being translated, together with definitions, morphologies and their associated translations.
1.5.3 maintain-parallel-text
- Definition
- Maintenance of parallel text for SMT training
2 Mapping process definitions to data categories
2.1 Explanation
This is currently a placeholder from earlier work, to be updated against the revised work on this page. The following table is a static snapshot of this page, where the latest version is maintained.
- Consumes = the metadata is used by the designated process to produce its results, i.e., it is input into that stage of the workflow.
- Generates = the metadata is output from the process. This includes processes that take the metadata item as input but modify its value as output (i.e., a process can both consume and generate a specific metadata item).
- Transforms = the process converts the metadata from one format to another (without modifying the values). A common example would be a process that filters an input format and then passes metadata items found in that format along in a new format without altering their values.
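Because a process can both consume and generate the same metadata item, the three relations are not mutually exclusive. One way to model a process/metadata pairing is as a set of combinable flags, sketched below. The names `Usage` and `describe`, and the two example pairings, are hypothetical and not taken from the table.

```python
from enum import Flag


class Usage(Flag):
    """How one process relates to one metadata item."""
    CONSUMES = 1    # C: the metadata is input to the process
    GENERATES = 2   # G: the metadata is output, including modified values
    TRANSFORMS = 4  # T: the format changes, the values do not


def describe(usage: Usage) -> str:
    """Render a usage value with the table's C/G/T letters."""
    letters = [("C", Usage.CONSUMES), ("G", Usage.GENERATES), ("T", Usage.TRANSFORMS)]
    return "".join(letter for letter, flag in letters if flag in usage)


# Hypothetical pairings, for illustration only.
reads_only = Usage.CONSUMES
reads_and_updates = Usage.CONSUMES | Usage.GENERATES
```

With this representation a "consume and modify" process is simply the union of two flags, which matches the parenthetical remark in the Generates definition.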
2.2 Table
Key | Authoring Phase (CMS/Controlled Language) | Translation Phase | Metrics/Analysis | Publication (CMS) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C = Consumes | Authoring | Source QA | Enrichment | Source Terminology Management | Connection to Provider | Translation | Multilingual Terminology Management | Review | Translation QA | Post-translation Process | Annotate w/process & qual data | Annotate provenance of lang. res. | CMS reintegration | CMS revision management | Publication | |||||||||||||||||||||||||||||||||||||||||||||||||
G = Generates | General | XLIFF Connector | Pretranslation | Human | Machine | L10n | Postediting | Update Linguistic Resources | Costing and Billing | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
T = Transforms | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | C | G | T | |
Internationalization | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
autoLanguageProcessingRule | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||
directionality | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||
dropRule | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||
idValue | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
localElementsWithinText | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||
localeSpecificContent | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||
preserveSpace | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||
ruby | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||
targetPointer | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
translate | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||
Process | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
approvalStatus | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||
cacheStatus | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||
legalStatus | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||
processState | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||
processTrigger | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||
proofreadingState | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
revisionState | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Project Information | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
domain | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||
formatType | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||
genre | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||
purpose | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||
register | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||
translator qualification | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||
Provenance | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
author | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
contentLicensingTerms | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
revisionAgent | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sourceLanguage | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
translationAgent | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||
Quality | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
qualityError | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||
qualityProfile | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||
Translation | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
context | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||
confidentiality | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
externalPlaceholder | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||
languageResource | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||
mtConfidence | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mtDisambiguation | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
namedEntity | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | ||||||||||||||||||||||||||||||||||||||||||||||||
specialRequirements | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||||||||||||||
term | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | |||||||||||||||||||||||||||||||||||||
textAnalysisAnnotation | x | x | x | x | x | x | x | x | x | x |