issue-41 (mtconfidence), issue-42 (mtConfidence, textAnalysisAnnotation, quality) from Felix Sasaki on 2012-09-18 (public-multilingualweb-lt@w3.org from September 2012)

From: Felix Sasaki <fsasaki@w3.org>
Date: Tue, 18 Sep 2012 21:03:47 +0200
To: public-multilingualweb-lt@w3.org
Message-ID: <CAL58czrqOHGL_UDGAaDs8KTRiJwbg47c8nfmECLN0yk98jbsyQ@mail.gmail.com>

Hi all, trying to reply to mails from Tadej and Arle in the issue-42 thread
and to Dave and Declan in the issue-41 thread,

we have a similar issue with mtconfidence and text analysis annotation:
there is information that will probably be given only once, related to
tooling, and there is other information that needs to be repeated
several times.
In the case of mtconfidence, the actual confidence value would be repeated.
In the case of text analysis annotation, it is the named entity
annotations that would be repeated several times.

This creates problems. As Dave and Declan ask at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0085.html
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0087.html
overriding semantics in ITS 1.0 are always complete, and ITS 2.0 so far is
the same. Changing that would mean changing my whole "artificial output"
implementation, so I would probably object.
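
To illustrate why complete overriding is a problem here, a minimal Python
sketch follows. The field names ("tool", "confidence"), the engine name and
both resolver functions are made up for illustration and are not taken from
ITS 1.0 or the ITS 2.0 drafts. The assumption is that tool information sits
on an outer element and the confidence value on an inner one.

```python
from typing import Dict, Optional

# Hypothetical annotation record: field name -> value.
Annotation = Dict[str, str]


def resolve_complete(inherited: Optional[Annotation],
                     local: Optional[Annotation]) -> Optional[Annotation]:
    """Complete overriding: a local annotation replaces the inherited one wholesale."""
    return local if local is not None else inherited


def resolve_merged(inherited: Optional[Annotation],
                   local: Optional[Annotation]) -> Optional[Annotation]:
    """Field-by-field merging, shown only for contrast with complete overriding."""
    merged = dict(inherited or {})
    merged.update(local or {})
    return merged or None


# Tool information given once on an outer element (assumed placement).
outer = {"tool": "example-mt-engine"}
# Confidence value repeated on an inner element (assumed placement).
inner = {"confidence": "0.8"}

complete = resolve_complete(outer, inner)
merged = resolve_merged(outer, inner)

print(complete)
print(merged)
```

With complete overriding the inner annotation wins wholesale, so the tool
information given once on the outer element is no longer visible at the
inner element. Only a merging model would keep both.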

So it might make sense to

1) integrate tool info, input info and process info along the lines of
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0080.html
2) have that kind of information for the three data categories
mtconfidence, quality issue and disambiguation
3) drop text analysis annotation and quality precis, since what they
express is expressed by the "new" data category

This is still a rough thought, so comments are very welcome.

- Felix

-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Tuesday, 18 September 2012 19:04:14 UTC