See also: IRC log
Topic: Doodle poll about virtual f2f
<tadej> daveL: poll shows 27th and 28th to be both good candidates
<tadej> ... I would suggest taking the 27th and 28th, having both around 3 hour calls in the afternoon
<tadej> ... however, we should deal with more specific issues beforehand
<tadej> daveL: Tuesday, Nov 20th is also a good candidate
<tadej> ACTION: daveL to confirm November 20, 27 and 28 as virtual session dates [recorded in http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action01]
<trackbot> Created ACTION-278 - Confirm November 20, 27 and 28 as virtual session dates [on David Lewis - due 2012-11-12].
Topic: upcoming meetings
<tadej> daveL: checking if the schedule makes sense - so far Prague 23-24 Jan, Rome 12-13 March, Bled 7-8 May, and Madrid still unspecified
<tadej> daveL: as for events, there's a GALA event, LocWorld, the WWW conference in Rio, and the LRC conference in Limerick
<tadej> Yves_: the only thing we need to fix is the dates for the Madrid meeting, since July is a holiday month
<Arle> We may be able to get on the GALA program. I will know more soon.
<tadej> Pedro: For July, the sooner the better, ideally first week
<tadej> ... or even last week of June
<tadej> ACTION: daveL to open doodle poll for Madrid dates (end June - beginning July) [recorded in http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action02]
<trackbot> Created ACTION-279 - Open doodle poll for Madrid dates (end June - beginning July) [on David Lewis - due 2012-11-12].
<Arle> (Separate from what Pedro has already submitted, which is a great start.)
Topic: standoff markup
<tadej> Yves_: we should use a single root element, like its:standOffList (or similarly named). The inclusion mechanism would be via the script element, either inline or in a separate file
<tadej> ... given the example, it would be better to split the standoff into two separate <script> elements, and have each script element's id match its standoff list's id.
<tadej> Pedro: the external files can be problematic in cases with real-time translation
<tadej> daveL: do you think the its:rules elements could be the enclosing element?
<tadej> Yves_: since we need to point to multiple its:standofflists, they can't be the root element, since they could exist in the same file; its:rules could be a root.
<tadej> daveL: could you correct the schema so it takes this into account?
<tadej> Yves_: mixing rules and standoff can get messy
<tadej> daveL: its:rules is easy from the conformance point of view, easier to explain, although there may be confusion
<tadej> Jirka: there's conceptual overload with this - we'd be declaring its:rules, and it wouldn't contain actual rules, but standoff info
<tadej> daveL: let's summarize: have a single element, its:standoffList, with an id attribute that matches the script element's id.
<tadej> ... in external files, we could have multiple standoff lists
<tadej> ACTION: Yves_ to edit the spec to unique standoff markup [recorded in http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action03]
<trackbot> Sorry, couldn't find Yves_. You can review and register nicknames at <http://www.w3.org/International/multilingualweb/lt/track/users>.
<tadej> ACTION: Yves to edit the spec to unique standoff markup [recorded in http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action04]
<trackbot> Created ACTION-280 - Edit the spec to unique standoff markup [on Yves Savourel - due 2012-11-12].
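[Editorial sketch, not part of the discussion: the proposal summarized above (one its:standoffList per script element, with matching ids) could be checked roughly as follows. The script type value, the sample ids, and the its:item child element are assumptions made for illustration; the minutes do not fix any of them. Only the Python standard library is used.]

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI
SCRIPT_TYPE = "application/its+xml"       # assumed type value, not settled in the minutes

# Hypothetical page with two inline standoff lists; its:item is a made-up child element.
HTML_DOC = """<html><head>
<script id="lq1" type="application/its+xml">
<its:standoffList xmlns:its="http://www.w3.org/2005/11/its" id="lq1">
  <its:item note="first"/>
</its:standoffList>
</script>
<script id="lq2" type="application/its+xml">
<its:standoffList xmlns:its="http://www.w3.org/2005/11/its" id="lq2">
  <its:item note="second"/>
</its:standoffList>
</script>
</head><body><p>text</p></body></html>"""


class ScriptCollector(HTMLParser):
    """Collects the raw text of script elements that carry the assumed type."""

    def __init__(self):
        super().__init__()
        self.scripts = {}     # script id -> accumulated raw content
        self._current = None  # id of the script element being read, if any

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script" and attr_map.get("type") == SCRIPT_TYPE:
            self._current = attr_map.get("id")
            if self._current is not None:
                self.scripts[self._current] = ""

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._current is not None:
            self.scripts[self._current] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._current = None


def load_standoff_lists(html_text):
    """Return {id: Element} for each standoff list whose id matches its script's id."""
    collector = ScriptCollector()
    collector.feed(html_text)
    collector.close()
    lists = {}
    for script_id, text in collector.scripts.items():
        root = ET.fromstring(text.strip())
        if root.tag != f"{{{ITS_NS}}}standoffList":
            continue  # some other root element; skip it
        if root.get("id") != script_id:
            raise ValueError(f"script id {script_id!r} does not match list id {root.get('id')!r}")
        lists[script_id] = root
    return lists


if __name__ == "__main__":
    for list_id, element in load_standoff_lists(HTML_DOC).items():
        print(list_id, len(element))
```

[A mismatch between the script id and the list id raises an error here; whether a real processor should reject or merely warn is left open by the discussion.]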
<tadej> daveL: Marcis sent an update consolidating MT confidence and TA Annotation into simpler definitions
<tadej> ... there's still an open issue on whether defining its:tools should be compulsory for these two data categories. any opinions?
<tadej> Yves_: sounds reasonable
<tadej> daveL: I'll modify the text and make it compulsory.
<tadej> daveL: Marcis also pointed out that several tools could process a fragment of text, which makes things confusing. it's different than MT, since you're annotating an annotation.
<tadej> ... should we then just apply the its:tool to those data categories rather than have it as a separate data category?
<tadej> tadej: disambiguation could survive that, it's equivalent
<scribe> scribe: daveL
tadej: is currently updating its-tools, looking at use of non-its annotations
<tadej> daveL: right now we have a mechanism to identify which data category it applies to, allowing for user-defined names
<tadej> daveL: ... since you're borrowing the mechanism, you're out of conformance anyway
<tadej> daveL: we could remove it, since we don't have a formal extension mechanism
<Marcis> I hear you, I just cannot say anything
<tadej> tadej: if we define a per-datacategory confidence attribute, how to express multi-valued attributes?
<Marcis> I mean, if the domains are automatically identified, then you will have a confidence (if the systems will return probabilistic results)
<Marcis> As tadej said - the weighted mechanism says that there is a confidence
<tadej> tadej: It boils down to whether that number is useful for the consumer
<Marcis> The categories (not in exact names...) that I see requiring the confidence are: MT, Terminology, Domain segmentation tools (are there any currently used by the MT use cases?), Named Entity Recognition (currently in Disambiguation, right?), others (?)
<tadej> ACTION: daveL to ask for use cases of data category-specific confidence scores [recorded in http://www.w3.org/2012/11/05-mlw-lt-minutes.html#action05]
<trackbot> Created ACTION-281 - Ask for use cases of data category-specific confidence scores [on David Lewis - due 2012-11-12].
<Ankit> w.r.t. confidence scores in MT, they are mainly used in a post-editing environment, i.e. when a human translator uses these scores to determine which outputs of an MT system they want to correct.
<tadej> tadej: disambiguation can produce scores, but not commonly used
<tadej> daveL: its:tools has its own element, the its:standOffList - we should describe how it works within a script element, so it's as similar as possible to the XML markup.