Re: Issue-55: XLIFF mapping - Terminology and termInfoPointer

Yves, I sent my other reaction, which is relevant to this, after
looking at the terminology mapping samples. Let me add my comments
inline as a specific reaction to this.

Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie


On Wed, Feb 20, 2013 at 1:53 AM, Yves Savourel <ysavourel@enlaso.com> wrote:
> More on the Terminology mapping:
>
> We say we should use <mrk mtype='term' [its:termInfoRef (or xyz:itsTermInfo-like attribute)]>term</mrk>
> That's good but:
>
> --- Can we put other ITS data categories in that same <mrk> too?
> -> why not?
I understand that the mrk is taken exclusively for term if
mtype="term", and I think this is OK.
Generally I am not opposed to using mrk for multiple functions. But we
should be using the core repertoire for encoding ITS stuff whenever
possible, to nurture general interoperability and not force support
for ITS-specific constructs just to make use of the metadata. So for me
being able to use a generic method is more important than making mrk
generally usable for encoding more ITS categories at the same time.
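
As an illustration only, here is a minimal Python sketch of the generic
encoding discussed above: an inline term becomes <mrk mtype="term">,
optionally carrying its:termInfoRef. The helper name make_term_mrk and
the example.com URL are hypothetical; the namespace URI is the ITS one.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ET.register_namespace("its", ITS_NS)

def make_term_mrk(text, term_info_ref=None):
    """Build <mrk mtype="term"> for an inline term (hypothetical helper)."""
    mrk = ET.Element("mrk", {"mtype": "term"})
    if term_info_ref is not None:
        # Only the term pointer travels on the mrk; no other ITS category.
        mrk.set(f"{{{ITS_NS}}}termInfoRef", term_info_ref)
    mrk.text = text
    return mrk

mrk = make_term_mrk("motherboard", "http://example.com/terms#motherboard")
print(ET.tostring(mrk, encoding="unicode"))
```

A tool that knows nothing about ITS can still act on mtype="term"
here, which is the interoperability point made above.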
>
> --- How do we express its:term='no'?
> Is it even needed in XLIFF?
I do not think it is needed. term='no' can either be ignored on
extraction or, if we insist on having it, we can introduce
mtype='x-its-Term-No' or similar.
This is similar to the translate solution, where we chose <mrk
mtype="protected">...</mrk>
and the verbose <mrk mtype="x-its-Translate-Yes">...</mrk> for the
opposite value.
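
The value mapping described here can be sketched as a small lookup in
Python. This is only an illustration under the assumptions in this
thread: the table ITS_TO_MTYPE, the function mtype_for and the flag
keep_term_no are hypothetical names, and 'x-its-Term-No' is the
tentative value floated above, not an agreed one.

```python
# (ITS data category, value) -> XLIFF mtype; None means "drop on extraction".
ITS_TO_MTYPE = {
    ("term", "yes"): "term",
    ("term", "no"): None,
    ("translate", "no"): "protected",
    ("translate", "yes"): "x-its-Translate-Yes",
}

def mtype_for(category, value, keep_term_no=False):
    """Return the mtype for an ITS value, or None if no mrk is emitted."""
    if (category, value) == ("term", "no") and keep_term_no:
        # Only used if we insist on carrying term='no' into XLIFF.
        return "x-its-Term-No"
    return ITS_TO_MTYPE[(category, value)]

print(mtype_for("term", "no"))
print(mtype_for("term", "no", keep_term_no=True))
print(mtype_for("translate", "no"))
```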
>
> --- Do we want to have a <source>/<target>-level terminology info?
I do not think that terminology on the structural level is a strong enough use case.
> If no: then what do we do with something like <html:p its-term='yes'>word</html:p>?
I do not think that paragraphs should be systematically considered as
possible terms. If a paragraph happens to consist of a single
term, I think that it is an exception, and even then the authors should
be encouraged to use an embedded span for encoding this rather than say
that the whole paragraph is a term.
I believe that terminology generally and typically appears 'inline'
and that we should be concentrating on this as the main success
scenario.
>
> We could 'move' the info to an <mrk>, but then things become *very complicated* to map back and forth.
Moving this to mrk is what we agreed back in November. I see that
moving things back onto the structural level in the source format can
be complicated, but XLIFF merge back is supposed to happen with full
knowledge of the extraction mechanism. Besides, we should discourage
using structural [higher than span] elements as ITS terms in source
formats.
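
To make the "move to mrk" case concrete, here is a rough Python sketch
of extracting a paragraph marked its-term='yes'. It is a simplification
that handles text-only paragraphs; extract_paragraph, the skeleton dict
and the note string stored in it are hypothetical stand-ins for
whatever the extraction mechanism really records for merge back.

```python
import xml.etree.ElementTree as ET

def extract_paragraph(p, unit_id, skeleton):
    """Return an XLIFF-like <source> for a text-only paragraph."""
    source = ET.Element("source")
    if p.get("its-term") == "yes":
        # Move the structural-level term info onto an inline mrk.
        mrk = ET.SubElement(source, "mrk", {"mtype": "term"})
        mrk.text = p.text
        # Remember where the term came from so merge back can restore it.
        skeleton[unit_id] = "term-from-structural-element"
    else:
        source.text = p.text
    return source

skeleton = {}
p = ET.fromstring('<p its-term="yes">word</p>')
source = extract_paragraph(p, "u1", skeleton)
print(ET.tostring(source, encoding="unicode"))
```

The merger then relies on the skeleton entry, not on the XLIFF alone,
to put the term flag back on the paragraph.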
>
> any thoughts
> -yves
>
>
>

Received on Wednesday, 20 February 2013 12:21:21 UTC