Re: issue-41 (mtconfidence), issue-42 (mtConfidence, textAnalysisAnnotation, quality)

Hi Yves, all,

2012/9/19 Yves Savourel <ysavourel@enlaso.com>

> Hi Felix, all,
>
> > This creates problems. As Dave and Declan ask at
> > http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0085.html
> > http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0087.html
> > overriding semantics in ITS 1.0 is always complete, and ITS 2.0 so far
> > is the same.
> > I would have to change my whole "artificial output" implementation to
> > change that, so I would probably object.
>
> Actually, I think the bit "Override semantics are always complete, that is
> all information that is specified in one rule element is overridden by the
> next one." has been added in 2.0. It's not in 1.0 (
> http://www.w3.org/TR/its/#selection-precedence).
>

That's correct - I added it after a question from Dave, see

http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0228.html


>
> That may have been the intent, but I even wonder if it was important with
> the initial data category.
> Note also that, the wording is not as specific
>
> If I understand this correctly you are saying that:
>
> - If we have a data category with 3 pieces of information AAA, BBB and CCC.
> - If there is a global rule that defines AAA='a' and BBB='b' for a node N
> - and the same node N has a local attribute that specifies CCC='c', the 3
> pieces of information for N will be AAA=undefined, BBB=undefined, CCC='c'
> and not AAA='a', BBB='b' and CCC='c'?
>

Correct.
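
To make the difference concrete, here is a minimal sketch of the two behaviours for the AAA/BBB/CCC example. The function names `resolve_complete` and `resolve_merged` are hypothetical and not taken from any ITS processor, and `None` stands in for "undefined".

```python
from typing import Dict, Optional

# The three pieces of information from the example above.
FIELDS = ("AAA", "BBB", "CCC")


def resolve_complete(global_rule: Dict[str, str],
                     local: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Complete overriding: any local markup replaces the global rule as a whole."""
    winner = local if local else global_rule
    return {field: winner.get(field) for field in FIELDS}


def resolve_merged(global_rule: Dict[str, str],
                   local: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Per-value merging (the alternative): local values win, global values fill gaps."""
    merged = {**global_rule, **local}
    return {field: merged.get(field) for field in FIELDS}


global_rule = {"AAA": "a", "BBB": "b"}  # set by a global rule for node N
local = {"CCC": "c"}                    # set by a local attribute on node N

complete = resolve_complete(global_rule, local)
merged = resolve_merged(global_rule, local)
print(complete)  # {'AAA': None, 'BBB': None, 'CCC': 'c'}
print(merged)    # {'AAA': 'a', 'BBB': 'b', 'CCC': 'c'}
```

With complete overriding, every value for N comes from a single source, whereas the merged result mixes local and global values.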


>
> If I misunderstood, then forget the rest of this email.
>
> If not:
>
> This is not very natural: how can something undefined (the local AAA and
> BBB) override anything: they don't exist.
>

It is "natural" because we define precedence on a "per data category"
basis. Even if that was implicit for ITS 1.0, I can easily prove that we
did the same in ITS 1.0, via the ITS 1.0 test suite. See e.g.
http://www.w3.org/International/its/tests/test2/EX-locNotePointer-attribute-1-result.xml
where there is a test result for each element node / attribute node *per data
category*. Several values are captured in that manner. We even had
"outputType" attributes making clear where the values came from
(local, global, inheritance, default). These attributes only make sense if
the overriding semantics is complete.


>
> This also prevents the user from defining some information using pointers
> globally and complementing the information with ITS local attributes, like
> this:
>
> <doc xmlns:i='http://www.w3.org/2005/11/its' i:version='2.0'>
> <i:rules version='2.0'>
> <i:locQualityIssueRule selector='//z' locQualityIssueTypePointer='@type'
> locQualityIssueSeverityPointer='@score' />
> </i:rules>
> <p>Text with <z type='other' score='1'
> i:locQualityIssueComment='comment'>error</z></p>
> </doc>
>
> An example where not overriding undefined local information would be
> useful is the Storage Size data category: often the encoding and the line
> break type of the storage will be the same for the whole document, but the
> size constraint will be different locally. Having to repeat everything over
> and over is rather inefficient.
>

But we did the same for ITS 1.0: e.g. its:term="yes" is the same for each
term, and a term reference is additional information. We didn't allow
having just a term reference. We even make that clear in the definition of
the local markup, relating "term" and "termInfoRef" by saying the latter is
optional.

"

   - A term attribute
     <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#att.local.no-ns.attribute.term>
     with the value "yes" or "no".
   - An optional termInfoRef attribute
     <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#att.local.no-ns.attribute.termInfoRef>
     that contains a URI referring to the resource providing information about
     the term.

"

If you introduce the new approach to overriding semantics, this gets messed
up: you will have situations where you look at a given node and ask "where
does this termInfoRef come from - locally set or globally set?".
Imagine also that as a user of an ITS processor you want to debug ITS local
markup and rules, because information for a given node doesn't look right.
Without complete overriding that can be a real challenge.



> It seems 2.0 has several data categories with more than a single
> information.


Yes - but I object (object in the W3C formal objection sense, if needed)
to giving up the complete overriding semantics; I would rather not fulfill
every need in these data categories. Simplicity here is much more important
than expressivity IMO.

One reason is backwards compatibility with ITS 1.0, see above. Another is
the implementation strategy I (and I think Sebastian Rahtz) used for 1.0.
Note that this strategy is also mentioned in the spec, and this comes from
ITS 1.0:
"The precedence order fulfills the same purpose as the built-in template
rules of [XSLT 1.0]."
Now, in XSLT you would create a real mess if you had templates for
each piece of information of a data category - you'd rather have a template
*per data category precedence*, e.g.

<xsl:template match="*[@its:term]" priority="1000" mode="translate">...
</xsl:template>

This template says: local "term" attribute has the highest precedence. The
template doesn't even check for termInfoRef, since that is optional (see
above).
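
The same "one winner per data category" idea can be sketched outside XSLT. This is only an illustration, not the actual 1.0 implementation: `select_terminology` and the `candidates` structure are hypothetical, and the source labels are the ones used for "outputType" in the 1.0 test suite.

```python
from typing import Dict, Tuple

# Sources in descending precedence, using the labels from the 1.0 test suite.
PRECEDENCE = ("local", "global", "inheritance", "default")


def select_terminology(
    candidates: Dict[str, Dict[str, str]]
) -> Tuple[str, Dict[str, str]]:
    """Return (source, values) for the highest-precedence source present.

    The data category is taken as a whole from that one source; nothing is
    filled in from lower-precedence sources.
    """
    for source in PRECEDENCE:
        if source in candidates:
            return source, candidates[source]
    raise ValueError("no information available for this data category")


# Hypothetical node: a global rule supplies term and termInfoRef,
# while local markup supplies only term.
candidates = {
    "global": {"term": "yes", "termInfoRef": "http://example.com/terms#t1"},
    "local": {"term": "yes"},
}
source, values = select_terminology(candidates)
print(source, values)  # local {'term': 'yes'}
```

Because the local source wins as a whole, the global termInfoRef does not apply to the node, and the returned source label answers the "where does this come from" question directly.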


> And obliterating existing information defined globally because one *other*
> piece of information is set locally may prove challenging.
>


I see huge benefits in going that way, in addition to compatibility
with 1.0. With complete overriding, it is very clear for each node in a
document what ITS information pertains to it. You gave storage size as an
example, but think about quality issue, precise or disambiguation: with
the masses of attributes we have here, non-complete overriding will
create a real mess when people want to understand where information comes
from.

Best,

Felix


>
> Cheers,
> -yves
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Wednesday, 19 September 2012 04:29:29 UTC