13:58:39 RRSAgent has joined #mlw-lt
13:58:39 logging to http://www.w3.org/2012/07/26-mlw-lt-irc
13:58:44 Zakim has joined #mlw-lt
13:58:58 meeting: MLW-LT WG
13:59:00 giusepperizzo has joined #mlw-lt
13:59:03 chair: felix
13:59:29 agenda: http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0254.html
13:59:39 DomJones has joined #mlw-lt
13:59:40 leroy has joined #mlw-lt
14:00:22 daveL has joined #mlw-lt
14:00:29 hi raphael, is gotomeeting working for you
14:00:40 we are about to dial
14:00:43 omstefanov has joined #mlw-lt
14:01:17 scribe: daveL
14:02:19 Yves_ has joined #mlw-lt
14:02:59 Des has joined #mlw-lt
14:03:21 chair: fsasaki
14:03:52 Sebastian has joined #mlw-lt
14:04:05 Meeting: MLW-LT weekly teleconf 26th July 2012 12.00 UTC
14:04:05 Hi everybody!
14:04:22 present: dave, felix, des, dom, olaf, sebastian, yves, raphael, giuseppe
14:04:38 pablomendes has joined #mlw-lt
14:04:44 present+ leroy
14:04:50 present+ pablo
14:05:19 topic: named entity syntax discussion
14:05:22 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
14:06:00 tadej has joined #mlw-lt
14:06:35 raphael and giuseppe introducing themselves and NERD
14:06:52 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
14:06:53 raphael: describes NERD platform developed with giusepperizzo
14:07:17 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
14:07:18 NERD: nerd.eurecom.fr
14:07:33 s/nerd.eurecom.fr/http://nerd.eurecom.fr/
14:07:33 sebastian: introduces himself as member of LOD2 project and developer of NIF, striving to make this compatible with ITS2.0
14:07:47 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0280.html
14:08:46 tadej: named entity data category call for consensus distributed
14:09:17 NERD: a broker over numerous web APIs that perform Named Entity extraction, offers an ontology, an API and a Web UI for performing experiments
14:09:19
...related to terminology but not an extension due to its1.0 backward compatibility option
14:09:52 ... two use cases: type of named entity and which named entity is being mentioned
14:10:45 tadej: disambiguation uses similar pattern but is a separate use case, pointing to a specific meaning in a semantic network
14:11:18 I would rather say that the disambiguation comes from a semantic network, or a knowledge base or a dataset (e.g. dbpedia) ... not an ontology since we are talking about instances
14:11:30 tadej: examples included for XML, HTML, the latter with RDFa lite.
14:11:49 q+ the term "resource" is used throughout to refer to something related to "namespace" in the XML world, or a "data source"
14:11:54 ...microdata would be very similar
14:12:00 ack pab
14:12:29 q+ entityTypeResourceRef should be a URI ... not a string, right?
14:12:41 q+ to ask entityTypeResourceRef should be a URI ... not a string, right?
14:13:07 pablomendes: asks if term resource is confusing because of different usages in language resource and web resources community
14:13:36 namespace or just "source"
14:14:14 tadej: could use 'named graph', but perhaps a bit obscure. 'name space' better but conflicts with xml namespace
14:14:42 tadej: suggestion from floor 'source ref' may be better
14:14:44 ack ra
14:14:44 raphael, you wanted to ask entityTypeResourceRef should be a URI ... not a string, right?
14:15:19 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
14:15:25 raphael: for disamibuate use term 'knowledge base' rather than 'ontology'
14:15:53 ... resource ref is mistakenly a string rather than URI
14:16:05 tadej: yes, it's an error
14:16:20 global = stand off?
14:16:43 ... also explains ITS pattern of local and global tag methods
14:17:10 local = inline?
14:17:32 fsasaki: comparable to CSS and has equivalent of cascading rules
14:18:01 some background here about "global" and "local" http://www.w3.org/TR/2012/WD-its20-20120626/#basic-concepts-selection
14:18:13 tadej: also there is inheritance, e.g. to specify dbpedia as source for all references
14:18:14 Jirka has joined #mlw-lt
14:19:06 s/disamibuate/disambiguate
14:19:25 present+ jirka
14:19:33 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
14:19:38 wondering if "knowledge base", "thesaurus", "ontology", "semantic network" couldn't all be subsumed by "vocabulary"
14:19:54 since the type of knowledge representation is not important here. All those are essentially providers of URIs (vocabularies of globally unique identifiers)
14:19:57 tadej: also 3rd example mentions usage of rdfa lite to be consistent with simple usage and standoff annotation
14:20:09 yes pablo, but for types disambiguation, we should talk about vocabulary (ontology, thesaurus)
14:20:20 ... providing mapping between simple rdfa markup and ITS markup
14:20:20 ... while for entities disambiguation, we should talk about datasets
14:20:50 e.g. dbpedia has 2 parts, the OWL ontology (dbpedia-owl) and the dataset part (much larger)
14:21:12 another name clash, I guess. the Linked Data community already took "vocabulary" as meaning schema
14:21:35 pablo, we are talking about the same thing ... I use vocabulary as in the Linked Data community
14:21:48 tadej: answers pablo's question: 'knowledge base' is preferable to 'vocabulary'
14:22:38 pablomendes: 'knowledge base' is probably fine for this user community, or perhaps 'entity vocabulary'
14:22:57 identifiers ?
14:23:10 identifiers farm :-)
14:23:53 tadej: identifiers could work, with example of instnace, 'onotlogies' etc
14:24:41 link to terminology data category: http://www.w3.org/TR/2012/WD-its20-20120626/#terminology
14:24:52 q+ to ask what is the added value of using the nerd type as value of the typeof attribute (in RDFa) over the native type provided by an extractor
14:24:58 tadej: responds in the affirmative to sebastian's question whether named entity and disambiguation are separate from terminology
14:25:01 s/instnace/instance
14:25:05 http://wiktionary.dbpedia.org/resource/dog
14:25:08 s/onotlogies/ontologies
14:25:27 http://wiktionary.dbpedia.org/page/dog-English-Verb-2fr
14:26:02 Sebastian: this is an issue since repositories such as wiktionary are like knowledge bases
14:26:42 q+ can't the distinction between word and entity come from the type?
14:26:54 Sebastian, there is a relationship between http://dbpedia.org/resource/Dog and http://wiktionary.dbpedia.org/resource/dog ?
14:26:59 tadej: disambiguation lets you specify that type - entity or word
14:27:21 ... as there is a difference between inserting a terminology link and an entity link
14:27:24 ack ra
14:27:24 raphael, you wanted to ask what is the added value of using the nerd type as value of the typeof attribute (in RDFa) over the native type provided by an extractor
14:27:30 q+ pablo
14:28:38 raphael: query on the rdfa example about which of the different vocabs is being used
14:28:53 ... and which tool generated it
14:29:06 tadej: handled by separate data category
14:29:32 see that data category, textanalysis annotation, listed here http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments#Data_categories_2
14:30:38 rephrase attempt: what are the relationships between LexicalEntry instances from Wiktionary and entities in DBpedia?
14:31:13 perhaps lexvo:lexicalization?
14:31:30 http://www.lexvo.org/page/term/eng/lexicalization
14:31:49 http://wiktionary.dbpedia.org/page/dog-English-Noun-1en
14:32:50 Sebastian: responding to rapheal, that in wiktionary the separation between lexical entry and concept is not always clearly defined
14:32:50 q+ In example 1 the rule is but it's in example 3. Is this just a typo, or are those two different types of rules? Because in example 3 we have entityTypeRef="@typeof" (an XPath expression) but its:entityTypeRef="http:/nerd.eurecom.fr/ontology#Place" (a URI) in example 1. Is entityTypeRef of example 3 is supposed to be entityTypeRefPointer?
14:32:57 ack pab
14:33:04 ... might be a tool artefact that causes confusion
14:33:07 ack yv
14:33:11 http://dbpedia.org/page/dog does not resolve BUT http://dbpedia.org/page/Dog does !
14:33:14 q+ pablo
14:33:34 Yves: title is not consistent - entity or named entity
14:33:52 tadej: an error, will fix this
14:34:07 s/rapheal/raphael
14:34:12 Yves: example should be an entity pointer
14:34:17 FYI, "pointer" attribute means: pointing to existing information in the document
14:34:22 tadej: will fix this
14:34:50 raphael: what is the relation between the ITS draft and NIF
14:35:21 tadej: there is some overlap, which will be addressed in future, perhaps as a separate part or document
14:35:33 http://www.w3.org/TR/2012/WD-its20-20120626/
14:36:00 Sebastian: we are considering documenting some roundtrip scenarios between ITS and NIF
14:37:04 fsasaki: some initial work undertaken
14:37:16 ack pab
14:37:18 wrt. its:disambigType = (word | entity) can't the distinction between word and entity be inferred from entityTypeRef? e.g. wiktionary:doc is a word, dbpedia:Dog is an entity
14:37:58 raphael: is disambig type redundant with entity type ref?
14:38:14 s/raphael/pablo/
14:38:16 no, it's meta-meta
14:38:26 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
14:38:36 tadej: this is possible, but unclear how to maintain this mapping and how users can infer this
14:39:12 Q+ to ask what are the implementations of "Internationalization Tag Set (ITS) Version 1.0" and the main diff between 1.0 and 2.0 ?
14:40:14 I really have to learn how this speaker queue thing works, where can I RTFM ?
14:40:28 tadej: disambiguation use cases are often used in cases where text is short and lacks context
14:41:38 ... and computational linguistics community draws a clear distinction between lexical and conceptual meaning
14:41:39 ack raph
14:41:39 raphael, you wanted to ask what are the implementations of "Internationalization Tag Set (ITS) Version 1.0" and the main diff between 1.0 and 2.0 ?
14:42:19 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#relation-to-its10
14:42:50 fsasaki: there is a section describing the differences, mainly html5 coverage and new data categories
14:43:38 ... its1.0 focussed more on classic i18n and l10n, its2.0 brings in more language technology integration
14:43:50 raphael: which tools implement its1.0
14:44:29 okapi http://okapi.sourceforge.net/
14:44:33 yves: there is rainbow in okapi framework as well as translation tools such as trados
14:44:35 trados translation tool
14:44:46 http://okapi.opentag.com/
14:45:06 http://www.opentag.com/okapi/wiki/index.php?title=ITS
14:45:15 fsasaki: thanks everyone
14:45:21 thank you
14:45:40 action: tadej to integrate this feedback
14:45:40 Created ACTION-181 - Integrate this feedback [on Tadej Štajner - due 2012-08-02].
14:45:41 http://www.w3.org/TR/2012/WD-its2req-20120524/#Automatic_enrichment_of_the_source_content_with_named_entity_annotations
14:46:21 fsasaki: aim to finalise entity related meta-data.
14:46:43 another real-life implementation: http://itstool.org/
14:47:04 ...
this link has lots of other requirements, but we aim to keep things as simple as possible to hit w3c timescale including november feature freeze
14:47:18 thanks all
14:47:51 topic: its 20 draft publication
14:48:03 rrsagent, generate minutes
14:48:03 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html daveL
14:48:05 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html
14:48:20 topic: publication of working draft
14:48:39 fsasaki: any objections to publication - there are none
14:49:05 action: fsasaki to publish update to working draft next week
14:49:05 Created ACTION-182 - Publish update to working draft next week [on Felix Sasaki - due 2012-08-02].
14:49:34 fsasaki: will plan another draft in latter half of august
14:49:38 http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments
14:49:49 topic: implementation commitments
14:50:17 fsasaki: please keep implementation commitments table up to date
14:50:47 yves: will try very hard to implement disambiguation and named entity data category
14:51:49 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.odd
14:52:18 fsasaki: tells tadej to continue working on word version and then when finished editors will integrate into ITS draft doc
14:53:15 topic: call for consensus
14:53:34 fsasaki: any comments on parameters for rules and target pointer? are there further comments?
14:53:42 ... no disagreement
14:53:53 scribe: fsasaki
14:54:12 yves: domain, inheritance discussion
14:54:18 .. the discussion with declan
14:54:22 .. what is the outcome?
14:54:36 dave: it was about the fact that in practice with statistical MT
14:54:44 .. some domains will be more important than others
14:54:54 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
14:55:22 dave: I was saying that the rules precedence is different than the domain precedence declan was talking about
14:55:39 ..
in statistical MT, you sometimes have domain precedence
14:56:00 yves: my question was about the domain precedence attribute
14:56:24 felix: yves is asking about the implication for implementing domain
14:56:42 dave: do we want to put this as a new optional attribute, e.g. these are the ones that represent the primary domain
14:57:01 .. it is another optional attribute, need to get declan's feedback what's best
14:57:20 .. in practice it will not be a definite instruction, not on the side of MT providers
14:57:39 .. it is a hint, not a mechanical choice
14:58:05 yves: looking at the example of domain rules
14:58:08 .. usage a) and b)
14:58:19 .. you have a domain precedence attribute and criminal law and medical
14:58:31 .. you have a domain pointer that says where to get the information
14:58:36 .. the precedence is in the rule
14:58:42 .. but how do you know which value to use
14:58:47 .. it is not listed in the domain
14:59:04 .. so what is the relationship
14:59:09 .. do we need a domain precedence pointer
14:59:19 dave: not sure we really need it
14:59:23 .. need feedback from declan
14:59:48 .. a separate MT provider may make other decisions
14:59:54 .. a company like adobe might use it
15:00:03 .. section doing the content, versus the one for MT training
15:00:19 .. looked whether it is actually needed - it's a borderline use case
15:00:43 yves: seems to be a border case
15:00:50 dave: I agree
15:01:37 action: dave6 to contact declan and thomas about domain new attribute proposal
15:01:37 Sorry, couldn't find user - dave6
15:02:04 action: felix to integrate parameters for rules and target pointer into the spec
15:02:04 Created ACTION-183 - Integrate parameters for rules and target pointer into the spec [on Felix Sasaki - due 2012-08-02].
15:02:43 topic: aob
15:02:51 dave: we are adding more content explaining things
15:03:02 .. we end up putting in descriptions of standoff markup that we are pointing to
15:03:10 .. are we happy to put that in the document?
15:03:54 scribe: daveL
15:04:32 fsasaki: responding to query on non normative standoff markup examples - we should definitely collect this and then decide how best to present this
15:04:52 http://www.w3.org/TR/xml-i18n-bp/
15:05:00 ... next year we have more opportunity for separate best practices as in ITS1.0
15:05:01 q+ re materials above and in additions to ITS 2.0
15:05:53 ... and can take other things into account, such as use of meta data in more complex workflow scenarios
15:07:09 ack o
15:07:09 omstefanov, you wanted to discuss materials above and in additions to ITS 2.0
15:07:25 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
15:07:34 omstefanov: many other fields include commentaries on normative or legal content, including examples and implementation questions etc
15:08:07 action: felix to prepare a place for BP material
15:08:08 Created ACTION-184 - Prepare a place for BP material [on Felix Sasaki - due 2012-08-02].
15:09:19 q+ re agenda for prague f2f
15:09:29 ack de
15:09:29 Des, you wanted to discuss agenda for prague f2f
15:10:29 Logistics page is at http://www.w3.org/International/multilingualweb/lt/wiki/PragueSep2012
15:10:35 action: felix to prepare agenda draft for prague
15:10:35 Created ACTION-185 - Prepare agenda draft for prague [on Felix Sasaki - due 2012-08-02].
15:11:02 Nevertheless, I want to make one more pitch to use the more globally understood term, "Commentaries on ..." rather than "Best Practices" which, even if what you say applies, Felix, usually is understood in a more restricted sense.
15:12:40 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html fsasaki
15:12:50 rrsagent, generate minutes
15:12:50 I have made the request to generate http://www.w3.org/2012/07/26-mlw-lt-minutes.html daveL
15:13:03 bye, y'all and thanks
15:13:10 omstefanov has left #mlw-lt
15:14:16 Jirka has left #mlw-lt
15:14:41 DomJones has left #mlw-lt
16:23:19 zakim, bye
16:23:19 Zakim has left #mlw-lt
16:23:21 rrsagent, bye
16:23:21 I see 6 open action items saved in http://www.w3.org/2012/07/26-mlw-lt-actions.rdf :
16:23:21 ACTION: tadej to integrate this feedback [1]
16:23:21 recorded in http://www.w3.org/2012/07/26-mlw-lt-irc#T14-45-40
16:23:21 ACTION: fsasaki to publish update to working draft next week [2]
16:23:21 recorded in http://www.w3.org/2012/07/26-mlw-lt-irc#T14-49-05
16:23:21 ACTION: dave6 to contact declan and thomas about domain new attribute proposal [3]
16:23:21 recorded in http://www.w3.org/2012/07/26-mlw-lt-irc#T15-01-37
16:23:21 ACTION: felix to integrate parameters for rules and target pointer into the spec [4]
16:23:21 recorded in http://www.w3.org/2012/07/26-mlw-lt-irc#T15-02-04
16:23:21 ACTION: felix to prepare a place for BP material [5]
16:23:21 recorded in http://www.w3.org/2012/07/26-mlw-lt-irc#T15-08-07
16:23:21 ACTION: felix to prepare agenda draft for prague [6]
16:23:21 recorded in http://www.w3.org/2012/07/26-mlw-lt-irc#T15-10-35