13:54:42 RRSAgent has joined #mlw-lt
13:54:42 logging to http://www.w3.org/2012/08/09-mlw-lt-irc
13:54:47 Zakim has joined #mlw-lt
13:54:51 meeting: MLW-LT WG
13:54:54 chair: David
13:54:58 Declan and Jan sent regrets in addition to the regrets recorded on the agenda in advance.
13:55:01 Arle has joined #mlw-lt
13:55:24 agenda: http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0148.html
13:55:41 present: arle, davidF, dom, fsasaki
13:56:41 the Doodle-based regrets: Pedro, Shaun, Milan, Raphael, Pablo, Giuseppe, Dave. Additional regrets: Tadej
13:57:50 regrets: Pedro, Shaun, Milan, Raphael, Pablo, Giuseppe, Dave, Tadej
14:00:41 Yves_ has joined #mlw-lt
14:00:48 scribe: Arle
14:00:59 leroy has joined #mlw-lt
14:01:27 scribe: DomJones
14:02:02 present+ leroy, Yves
14:02:34 omstefanov has joined #mlw-lt
14:02:49 present+ olaf
14:03:04 present+ phil
14:03:19 topic: agenda review
14:03:25 Hi everyone, is GoToMeeting not yet on? I'm getting "waiting for organizer"
14:03:29 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0148.html
14:03:46 http://www.w3.org/2012/08/02-mlw-lt-minutes.html
14:03:47 dF: Look at the minutes. Any issues or objections to these?
14:03:50 DES has joined #mlw-lt
14:03:56 present+ des
14:04:02 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:04:16 dF: Minutes accepted; moving on to topic 1.
14:04:45 1. Please join my meeting. https://www3.gotomeeting.com/join/220653286 2. Use your microphone and speakers (VoIP) - a headset is recommended. Or, call in using your telephone.
Austria: +43 (0) 7 2088 0034 Germany: +49 (0) 811 8899 6900 Ireland: +353 (0) 15 290 180 Spain: +34 911 23 0850 United Kingdom: +44 (0) 207 151 1850 United States: +1 (773) 945-1031 Access Code: 220-653-286 Audio PIN: Shown after joining the meeting Meeting Password: 54321 Meeting ID:
14:04:48 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:04:57 dF: Meeting details for today are posted above.
14:05:27 Current draft of Quality is here: http://dl.dropbox.com/u/223919/dfki/mlw-lt/locQuality.html
14:05:59 dF: Topic 1: Quality discussion. We also need to discuss issue 42, which is a more general issue. What is the progress of the category, the time frame, etc.?
14:06:03 ... Arle, please report.
14:06:16 philr has joined #MLW-LT
14:06:47 Arle: I just posted a link to the spec draft. It is still not in the correct form, but please use it for reference. At this point we need agreement on the attributes listed in section 6.x.3. These need to be agreed upon, with the exception of ??
14:07:07 ... All those in the top half of the table are agreed upon by Phil, me, Yves, etc.
14:07:17 ... The second half needs people to commit to implementations.
14:07:51 Felix: I think you can add me to the list of people who agree on the information here, but not on whether the items will become attributes.
14:08:35 Arle: Useful distinction. Each "Attribute Name" represents a piece of information. We need to nail these down and agree upon them. Felix, do we need to issue a call for consensus on this?
14:08:49 Felix: There is no W3C process for this...
14:09:09 dF: I think that the quality topic should be addressed in a structured way.
14:09:36 ... Arle is the owner of this. If consensus needs to be achieved, we should do this.
14:10:11 Felix: But what if a decision is later overruled? All we can do is structure the discussion and come back to consensus later.
14:10:37 dF: Clearly every consensus can be overruled, but structuring a discussion ?
14:10:53 fsasaki: Should start discussing the topic itself
14:11:22 dF: There is one action item, #168, which does not seem to have developed much... Arle, can you comment?
14:12:11 Arle: This has been ongoing, and Yves has been active on this. Really the last piece of that was writing to ?cilgrave? about the XLIFF part.
14:12:18 s/itself/itself, not so much about the process/
14:12:27 dF: Not many recorded emails on this.
14:12:41 Arle: Lots of discussions are going on elsewhere.
14:13:54 Arle: Very quickly, some info that we have agreement on. We started out with the idea of having two separate pieces to this: 1) Which metric, process, or tool has generated the markup. This defines a QName with a prefix and a URI pointing to more info.
14:14:07 ... Think of it as a tool, metric, or process signature.
14:14:45 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:15:10 ... 2) Loc quality score, which allows a process to provide a score relevant to a document (95, 32, etc.) that applies at document level. Some at the moment are more inline, locQualityType, for example.
14:15:31 ... These are designed for interoperability between tools.
14:15:47 ... They allow common tagging between different tools.
14:16:26 Arle: Loc quality codes allow mapping of implementation tools to a common set as well as passing over the original code.
14:16:52 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:17:23 topic: quality discussion
14:17:26 Arle: These are the ones we have agreement upon. There are five there that we don't have agreement upon. I won't go through those, but please look at the online document.
14:17:30 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:17:56 dF: Can this be wrapped up in August? Can a cut be made on information pieces that have not made progress?
14:18:16 Arle: I think so, these seem stable. I think we have consensus on them.
14:18:30 dF: Are you prepared to cut those which are not mature enough?
14:18:47 Arle: Yes, except where there are arguments and implementation commitments from Phil, Yves, etc.
14:19:20 dF: I would like to formalise this. Set an action to freeze the number of information pieces. Would you be able to freeze the number by the next call, in a week?
14:19:57 Felix: If you look at issue 42, some of these info pieces are the same across data categories... I'm not saying that we would disagree about them, but we may disagree about where they belong.
14:20:36 Arle: That impacts the first two of these. Whether they stay here or move, we need them. For all but the first two (profile and score) we'll have a decision by next week?
14:20:39 ACTION
14:21:08 action: arle to freeze the number of information items in quality, with the reservation that some items might move to other areas
14:21:08 Created ACTION-192 - Freeze the number of information items in quality, with the reservation that some items might move to other areas [on Arle Lommel - due 2012-08-16].
14:21:21 s/ACTION//
14:21:24 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:21:46 scribe: Arle
14:21:53 topic: issue-42
14:23:30 Felix: I was looking at the proposals we currently have, and in a number of categories we have data about what generated it and the confidence in that. Text analysis, MT confidence, and quality all have similar issues. People have to separate issues generated by multiple tools. Another common aspect between these categories is that these pieces of information are kind of general settings that inherit through the tree to where you need them, much like the language
14:23:50 ..In our case, you might specify one tool, or, if needed, multiple tools used for creating annotations.
14:24:27 ..There is one issue: in Quality, you identify the model, but in the others it is a tool.
14:24:50 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0149.html
14:25:24 David: The common aspect is the state of inheritance, and that you may need to record multiple tools or models on the local level. How does the inheritance relate to global and local approaches?
14:25:46 +q
14:26:07 Felix: Global and local are just different ways to specify the metadata. But these are separate pieces of metadata. Once you have specified them (locally or globally) they inherit throughout the document.
14:26:44 David: Like with translate and the way it can be switched on and off. So the issue really is to specify that these inherit, correct?
14:26:49 philr has joined #MLW-LT
14:27:47 Felix: I see this not as specific to these data categories, but rather as a separate data category. I'm not sure how you would describe the relationship from mtConfidence, quality, and text analysis to these. I don't yet know how it would work in detail.
14:28:13 David: So you propose to introduce a generalized originator category. Isn't that like provenance?
14:28:15 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0149.html
14:28:30 lingProcInfo
14:29:45 Felix: That's a good point. There is a clear relationship. I just pasted a link Christian Lieske supplied on this. It might be provenance or a subcategory of provenance. It is important for at least three categories, but maybe for others. This is so specific that I think maybe we need a specific mechanism. Provenance is really about more complex information related to provenance. This is more about identifying the process used to create something. I'd rather see
14:30:05 ...E.g., pointing to the tool or process.
14:31:39 David: In Dublin I wanted provenance to be independent. I see only two options: (1) subsume it in provenance; (2) specialize it in the categories in question. For example, if I use the LISA QA Model, is it relevant to anything but quality?
I don't think it would be problematic to have these done in specialized categories.
14:32:17 ..I think this would work better to modularize ITS. But if we make them orthogonal, we should put them in provenance.
14:33:24 Felix: But if we specialize them, we run into the issue with the ITS inheritance model that we see with quality.
14:33:45 David: So are you saying that ITS inheritance is for the content only, not the metadata.
14:33:53 s/not the metadata/not the metadata?/
14:35:03 Felix: If you want to apply multiple instances of the same data category to the same node, you cannot do it. You can't say that Tool A gives one value and Tool B gives another value for the same piece of content.
14:35:22 David: So you mean that if there are comparable originators, you can't apply multiple ones, correct?
14:35:50 Felix: Yes.
14:36:27 David: This won't be an issue for mtConfidence, because you are generally working with a single candidate at a time. If you need more, you should look at XLIFF or something.
14:36:51 ..If you are composing a document from multiple sources, the normal inheritance model would work.
14:37:15 scribe: fsasaki
14:37:29 arle: for quality, the normal inheritance model fails
14:37:33 scribe: Arle
14:38:00 David: Would it be OK to state that inheritance is cancelled when two comparable originators are used on the same node?
14:38:49 Felix: We need to consider backward compatibility, and also the test suite, which has examples where inheritance deletes one piece of information. The test suite is just one example where this change would go against running implementations.
14:39:07 Phil: We are talking about child elements inheriting the metadata from a parent?
14:39:20 Felix: Yes. It is CSS-like inheritance.
14:40:13 Phil: Would it be permitted to replicate certain parts of the document when you need to apply multiple pieces to the same content? It would be building a pseudo-parent around multiple builds.
14:40:59 David: That would be out of scope for us.
14:41:29 Yves: What we could do is have a span with an attribute that points to an external element. That is stand-off annotation that could contain several entries, not just one.
14:42:26 ..The inheritance model works fine in the document itself.
14:42:54 Felix: Yves is saying you have a pointer in the document to the list of alternatives. By using the stand-off list you can have all the annotations you want.
14:44:01 David: You wouldn't duplicate the content, but you would have a list of applicable metadata. This is a mechanism to be used for when there is clashing inheritance?
14:44:46 Felix: Arle and I discussed having a separate section in the HTML5 document that is not displayed where you put this information and then you ship around a single document.
14:45:06 David: I think we should specify this mechanism in a separate discussion.
14:45:21 Felix: I think this is related to Issue-37. I'll create an example.
14:45:44 Action: Felix to create an HTML5 example of the externalized markup within a single file.
14:45:44 Created ACTION-193 - Create an HTML5 example of the externalized markup within a single file. [on Felix Sasaki - due 2012-08-16].
14:46:29 David: I think the high-level information is whether we keep the producer information in a specialized category, or whether we put it in provenance. I think we all agreed that in the case of clashing producers we have this other mechanism.
14:47:06 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:47:07 Yves: It's not just about different producers, but also about cases where the same information is applied in multiple places.
14:47:26 Felix: This is not producer-specific, but conflict-specific.
14:47:55 David: The use case I am thinking of is about two different reviewers using the same quality model.
14:48:14 felix: or two different text analytics systems
14:48:31 Phil: The general condition is that you want multiple pieces of metadata. Whether they conflict or not, you can accommodate both within a single node(?)
14:48:48 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0149.html
14:48:49 David: From the point of view of MT confidence, I don't think we need this special mechanism.
14:50:13 Felix: One other point (see pasted link). One reason for opening Issue-42 is the conflict discussion; the other part of the issue is that we want to describe tool-specific data. Arle and I need to create a way to describe what generated the data.
14:50:31 ack des
14:50:42 David: I think we use the same templated piece about inheritance.
14:51:17 +1 to des
14:51:34 Des: I have a related issue. Quality score is normalized, but agent isn't mandatory there, whereas agent is mandatory for MT and text analytics. We need to be consistent across these.
14:52:22 David: If you had to include multiple MT results, you have to replicate content, but text analytics can use multiple tools for one piece of content.
14:52:36 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:52:54 +1 to des
14:53:33 Felix: There are limits to harmonization, but let me make some examples.
14:53:42 +1 to des
14:53:49 Topic: Issue-2
14:54:01 David: Is anyone here to tell us anything?
14:54:05 action: felix to work on issue-42, provide examples and template for various data categories
14:54:05 Created ACTION-194 - Work on issue-42, provide examples and template for various data categories [on Felix Sasaki - due 2012-08-16].
14:54:16 Topic: NIF_RDF roundtrip
14:54:29 http://wiki.nlp2rdf.org/wiki/ITS2NIF2ITS
14:55:25 We choose DBpedia Spotlight(web site) here, because wrapper implem ...
14:55:32 Felix: Short update. Sebastian Hellmann did all the work, but see the wiki link I posted. It shows how to go from HTML/arbitrary XML to RDF in the NIF format.
Various tools understand this format. One application scenario is to produce named entity annotation with the DBpedia Spotlight tool.
14:56:24 The results can be integrated into the original XML. It provides a bridge to language-technology tools that use NIF. It does not impact the description of the data categories. I've started building a conversion. It will give us a nice bridge to other tooling.
14:56:43 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:56:53 s/Topic: Issue-2/Topic: ?/
14:56:59 Topic: ??
14:57:13 topic: test suite
14:58:04 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:58:08 Dom: I'd like people to look at what we've done. I'm going to start looking at the output that tools might produce. So by the beginning of September we should have agreed upon input files and output formats and we can tie implementations against data categories for testing in Prague.
14:58:14 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
14:58:28 ..We're happy with progress, but want others to take a look.
14:58:33 topic: mtConfidence
14:59:40 David: Yves pointed out some deficiencies. I will produce the next draft version. I won't touch the inheritance bit and would wait for Felix. But I think we only need normal inheritance here.
14:59:52 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
15:00:07 Action: dF to produce next draft of mtConfidence.
15:00:08 Created ACTION-195 - Produce next draft of mtConfidence. [on David Filip - due 2012-08-16].
15:00:10 *Topic 6* > *Seattle event* > http://www.localizationworld.com/lwseattle2012/feisgiltt/ > Felix's Action-191 > https://www.w3.org/International/multilingualweb/lt/track/actions/191 > Please tweet and retweet the I18n blog entry > http://www.w3.org/blog/International/2012/08/06/speaking-proposals-for-feisgillt-event-open-until-august-14-dont-delay/ > Please indicate your attendance on LinkedIn: http://linkd.in/Q5Tq7B > Submit speaking and demo proposals by August
15:00:18 please spread the word :) :) :)
15:00:18 Topic: Seattle event
15:00:27 David: Please Tweet, build buzz, etc.
15:00:45 thanks to dF for making all this happen!
15:00:48 ..Thanks to Felix for publishing blog entry, etc.
15:01:04 David: I'll leave housekeeping topics for the next weeks.
15:01:29 ..I think they are self-explanatory. No need to extend the meeting for now.
15:01:44 http://lists.w3.org/Archives/Public/public-multilingualweb-lt-commits/
15:02:23 Felix: One final item. I've created a list at this URL that shows the commits to the W3C CVS. It shows you what changes the editors make.
15:02:30 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
15:02:42 Meeting closed.
15:02:50 I have made the request to generate http://www.w3.org/2012/08/09-mlw-lt-minutes.html fsasaki
15:03:10 *Topic 7* > *Housekeeping* > * > * > 1. Who is to finish action-158? > https://www.w3.org/International/multilingualweb/lt/track/actions/158 > Jirka or Yves? > > > 2. Action-181 > https://www.w3.org/International/multilingualweb/lt/track/actions/181 > cannot finalize consensus, still Felix could report on what seems to be > the issue > > 3. Review and close issues 38, 39, and 40 > https://www.w3.org/International/multilingualweb/lt/track/issues/38 > https://www.w3.
17:00:29 Zakim has left #mlw-lt