Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181]

Thanks for your feedback, Pablo and Tadej. I would propose to handle this
in as implementation-driven a way as possible: if somebody is willing to
implement a use case in which the distinction is important, we should
keep it; otherwise we drop it.

Best,

Felix

2012/8/20 Tadej Štajner <tadej.stajner@ijs.si>

>  Hi, Pablo,
> correct. The feedback I got was that this distinction is very important,
> but I can't think of an example with the scenario you mention. Perhaps for
> spans where one is contained within the other, such as assigning a lexical
> meaning to a word, while the whole phrase is an entity, for example
> 'agriculture' in 'Ministry of agriculture'.
>
> I think it boils down to this: could this property be reliably inferred
> from the target itself? For instance, if someone points to
> http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3 - can
> we expect that is definitely a case of lexical disambiguation?
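
The question of whether the type can be inferred from the target can be
tried out with a few lines of code. The following is only a minimal sketch,
assuming that a namespace prefix is enough to tell the two cases apart. The
prefix tables and the function name infer_disambig_type are hypothetical and
not part of any ITS draft; the URIs are copied as written in this thread.

```python
# Hypothetical prefix tables: the namespaces come from URIs quoted in this
# thread, but classifying them this way is an assumption, not a rule of ITS.
LEXICAL_PREFIXES = (
    "http://www.w3.org/2006/03/wn/wn20/instances/",
)
ENTITY_PREFIXES = (
    "http://dbpedia.org/resource/",
)


def infer_disambig_type(target_uri):
    """Guess 'lexical' or 'entity' from the URI prefix; None if unknown."""
    uri = target_uri.strip()
    if uri.startswith(LEXICAL_PREFIXES):
        return "lexical"
    if uri.startswith(ENTITY_PREFIXES):
        return "entity"
    # Unknown namespace: the type cannot be inferred without more data.
    return None


wordnet_guess = infer_disambig_type(
    "http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3"
)
dbpedia_guess = infer_disambig_type("http://dbpedia.org/resource/Dublin")
unknown_guess = infer_disambig_type("http://example.org/some/other/target")
```

Any target outside the known namespaces yields None, which is exactly the
case where an explicit type attribute would still carry information.
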
>
> -- Tadej
>
>
>
> On 20. 08. 2012 11:42, Pablo N. Mendes wrote:
>
> Hi all,
>
>  I would suggest  to merge "its-entity-type-ident-ref" into
>> "its-disambig-type-ref".
>
>
>  If I understand correctly this is the same proposal I made at the call?
>
>  "<pablomendes> wrt. its:disambigType = (word | entity) can't the
> distinction between word and entity be inferred from entityTypeRef? e.g.
> wiktionary:doc is a word, dbpedia:Dog is an entity" [1]
>
>  If so, this is the answer that Tadej gave:
>
>  "tadej: disambiguation use cases are often used in cases where text is
> short and lacks context
> ... and computational linguistic community draw a clear distinction
> between lexical and conceptual meaning" [1]
>
>  Perhaps one way to test how strong this requirement is would be to think
> of use cases where one could assign both lexical and conceptual meaning to
> the same span.
>
>  Cheers,
> Pablo
>
>  [1] http://www.w3.org/2012/07/26-mlw-lt-minutes.html
>
>
> On Mon, Aug 20, 2012 at 11:13 AM, Felix Sasaki <fsasaki@w3.org> wrote:
>
>> Hi Sebastian,
>>
>>  2012/8/20 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>>
>>> Hi Felix,
>>> your proposal is based on the assumption, that more data is available at
>>> these three URLs:
>>>
>>> http://nerd.eurecom.fr/ontology#Place
>>> http://dbpedia.org/resource/Dublin
>>> http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3
>>>
>>> While this assumption is ok for the Semantic Web, I am not sure about
>>> the ITS world.
>>>
>>
>>
>>  You are right that in the "ITS world" one cannot be sure that more data
>> is available. But I would argue that implementors who process links also in
>> the ITS world very likely need to know (not automatically, but as a
>> prerequisite for implementation ) what the URL is about. So I'd rather
>> encourage implementors towards that "Semantic Web like" approach than
>> defining so many attributes.
>>
>>  Feedback from the people who want to process "disambiguation" without
>> Semantic Web processing is of course very important here.
>>
>>
>>>
>>> Furthermore, if you are attempting to minimize it, I would suggest  to
>>> merge
>>> "its-entity-type-ident-ref" into "its-disambig-type-ref". You wouldn't
>>> be limited to entity types and could use any of:
>>>
>>
>>
>>  Makes sense to me, thanks for the proposal - let's see what Tadej and
>> others say.
>>
>>  Best,
>>
>>  Felix
>>
>>
>>>
>>> - http://nerd.eurecom.fr/ontology#Place
>>> - http://dbpedia.org/ontology/Place
>>> - http://www.monnet-project.eu/lemon#LexicalSense
>>> - http://www.monnet-project.eu/lemon#LexicalEntry
>>> - http://wordnet.princeton.edu/wndatamodel#NounWordSense
>>> - http://wordnet.princeton.edu/wndatamodel#Synset
>>>
>>> All the best,
>>> Sebastian
>>>
>>> On 20.08.2012 09:44, Felix Sasaki wrote:
>>>
>>>> Hi Sebastian, all,
>>>>
>>>>  thanks, Sebastian. From what you say in the wiki and in the previous
>>>> mail,
>>>> I think one could simplify things a lot.
>>>>
>>>> The HTML example from Tadej *could* look like this:
>>>>
>>>> <html lang="en">
>>>>
>>>>     <head>
>>>>
>>>>        <meta charset="utf-8" />
>>>>
>>>>        <title>Entity: Local Test</title>
>>>>
>>>>     </head>
>>>>
>>>>     <body>
>>>>
>>>>         <p><span
>>>>
>>>> its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
>>>>
>>>> its-disambig-ident-ref="http://dbpedia.org/resource/Dublin
>>>> ">Dublin</span>
>>>> is the <span
>>>>
>>>> its-disambig-ident-ref="
>>>> http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3
>>>> ">capital</span>
>>>> of Ireland.</p>
>>>>
>>>>     </body>
>>>>
>>>> </html>
>>>>
>>>> That is, no explicit "resource" references for entity type and
>>>> disambiguation source, and no disambig-type.
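
A consumer of the simplified markup only has to read the local attributes.
The following is a rough sketch using Python's standard html.parser module;
the class and function names are hypothetical, and the sample markup is the
paragraph from the example above with the entity type URI written with a
double slash. Attribute values are stripped because they may contain line
breaks after mail wrapping.

```python
from html.parser import HTMLParser

# Paragraph from the example above; the line break inside the first
# its-disambig-ident-ref value is kept on purpose.
SAMPLE = """<p><span
its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
its-disambig-ident-ref="http://dbpedia.org/resource/Dublin
">Dublin</span>
is the <span
its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/worsense-capital-noun-3">capital</span>
of Ireland.</p>"""


class DisambigCollector(HTMLParser):
    """Collects the its-* attributes and text of annotated span elements."""

    def __init__(self):
        super().__init__()
        self.annotations = []
        self._stack = []  # one entry per open span; None if not annotated

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        its = {
            name: (value or "").strip()
            for name, value in attrs
            if name.startswith("its-")
        }
        self._stack.append({"text": "", **its} if its else None)

    def handle_data(self, data):
        # Text belongs to every annotated span that is currently open.
        for record in self._stack:
            if record is not None:
                record["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            record = self._stack.pop()
            if record is not None:
                self.annotations.append(record)


def collect_annotations(html):
    parser = DisambigCollector()
    parser.feed(html)
    parser.close()
    return parser.annotations


annotations = collect_annotations(SAMPLE)
```

Each record carries the annotated text plus whichever its-* attributes were
present, so a missing entity type simply shows up as a missing key.
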
>>>>
>>>> Also, I think one could get rid of adding this kind of information via
>>>> global rules - I really don't see a use case for that.
>>>>
>>>> Tadej, others, thoughts? Maybe Yves, as one of the implementors
>>>> processing the output, and others have some thoughts too?
>>>>
>>>> Best,
>>>>
>>>> Felix
>>>>
>>>> 2012/8/17 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>>>>
>>>>    Dear Felix,
>>>>> to solve this issue I prepared a page:
>>>>>  http://wiki.nlp2rdf.org/wiki/DBpedia_Spotlight
>>>>>
>>>>> It is a rough draft, so there are many mistakes, still. Once it is
>>>>> mature,
>>>>> I will send it to the DBpedia Spotlight and Apache Stanbol lists to get
>>>>> their feedback.
>>>>> Note that I don't have a problem with these properties as XML
>>>>> attributes, where they can naturally occur only once and encoding an
>>>>> implicit dependency (an attribute referring to another attribute) is
>>>>> unproblematic. They are, however, difficult to handle in RDF, even when
>>>>> declaring them functional.
>>>>> I will report back if there is any news.
>>>>>
>>>>> All the best,
>>>>> Sebastian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 14.08.2012 21:34, Felix Sasaki wrote:
>>>>>
>>>>>   Hi Sebastian, all,
>>>>>>
>>>>>> August is taking its toll ... I am wondering if there are any thoughts
>>>>>> on Sebastian's mail below. It seems that some of the proposed ITS
>>>>>> attributes are not needed, but I don't have the competence to evaluate
>>>>>> this. Thoughts from others?  Sebastian, could you confirm that the
>>>>>> output mentioned in this other thread
>>>>>>
>>>>>>  http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0168.html
>>>>>>
>>>>>>
>>>>>>
>>>>>> is correct for NIF? I then would create a test case for our test
>>>>>> suite,
>>>>>> see
>>>>>>
>>>>>>  http://lists.w3.org/Archives/Public/public-multilingualweb-lt-tests/2012Aug/0003.html
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Felix
>>>>>>
>>>>>> On Thursday, 9 August 2012, Sebastian Hellmann wrote:
>>>>>>
>>>>>>   Hi Felix,
>>>>>>
>>>>>>>  below is mostly my opinion on this. There is nothing wrong with
>>>>>>> including these properties, but they might not make sense in RDF. If
>>>>>>> you think that there are people who would really use these properties
>>>>>>> in RDF, then go ahead and include them. Personally, *I* wouldn't know
>>>>>>> what *I* could use them for.
>>>>>>> More comments inline.
>>>>>>>
>>>>>>> On 09.08.2012 15:20, Felix Sasaki wrote:
>>>>>>>
>>>>>>>   its:entityTypeSourceRef
>>>>>>>
>>>>>>>>   I really do not find this property helpful.
>>>>>>>>
>>>>>>>  Do you see any sense in saying that
>>>>>>> http://dbpedia.org/resource/Dublin is from http://dbpedia.org ? In the
>>>>>>> linked data world http://dbpedia.org/resource/Dublin comes from
>>>>>>> http://dbpedia.org/resource/Dublin.
>>>>>>> So you might specify a way to convert that to ITS, but we might not
>>>>>>> need an RDF property for this.
>>>>>>>
>>>>>>>    its:disambigType
>>>>>>>
>>>>>>>>  "(http://www.w3.org/2005/11/its/lexicalConcept|
>>>>>>>> http://www.w3.org/2005/11/its/ontologyConcept|
>>>>>>>> http://www.w3.org/2005/11/its/entity)"
>>>>>>>>
>>>>>>>>   I am unsure about this one.
>>>>>>>>
>>>>>>>     its:entityTypeRef
>>>>>>> is already rdf:type, so it would be a duplicate to have
>>>>>>> its:entityTypeRef in RDF. For http://dbpedia.org/resource/Dublin
>>>>>>> its:entityTypeRef would be one of:
>>>>>>>
>>>>>>> http://dbpedia.org/ontology/PopulatedPlace
>>>>>>> http://dbpedia.org/ontology/Settlement
>>>>>>> http://umbel.org/umbel/rc/PopulatedPlace
>>>>>>> http://dbpedia.org/ontology/Place
>>>>>>> http://umbel.org/umbel/rc/Village
>>>>>>> http://umbel.org/umbel/rc/Location_Underspecified
>>>>>>> http://schema.org/Place
>>>>>>> http://www.w3.org/2002/07/owl#Thing
>>>>>>> http://www.opengis.net/gml/_Feature
>>>>>>> +
>>>>>>> http://nerd.eurecom.fr/ontology#Place
>>>>>>>
>>>>>>>
>>>>>>> If you have a problem with this plurality, then it might be good to
>>>>>>> include an annotation property its:preferedEntityTypeRef.
>>>>>>> So the data is already there in RDF; the problem is rather to find a
>>>>>>> way to convert it back to ITS.
>>>>>>>
>>>>>>> All the best,
>>>>>>> Sebastian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>> Felix
>>>>>>>
>>>>>>> 2012/8/9 Felix Sasaki <fsasaki@w3.org>
>>>>>>>
>>>>>>>    Thanks for this, Tadej, looks good. There is just one comment I
>>>>>>> don't
>>>>>>> see
>>>>>>> reflected:
>>>>>>>
>>>>>>> 7) A question on the data category in general and the "rules"
>>>>>>> element:
>>>>>>> does it make sense to make some attributes mandatory? Currently, this
>>>>>>> would
>>>>>>> be valid:
>>>>>>>  <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> It seems that still all metadata items / attributes are optional. Is
>>>>>>> there
>>>>>>> a way to be more specific about what must or must not appear
>>>>>>> together,
>>>>>>> what
>>>>>>> is optional etc?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Felix
>>>>>>>
>>>>>>> 2012/8/9 Tadej Stajner <tadej.stajner@ijs.si>
>>>>>>>
>>>>>>>      Hi,
>>>>>>>     thanks for the tips. I covered them, and I agree with removing the
>>>>>>> local XPath, since it has very limited use. Here is another version
>>>>>>> incorporating all these comments.
>>>>>>> -- Tadej
>>>>>>>
>>>>>>> On 8/3/2012 1:07 PM, Felix Sasaki wrote:
>>>>>>>
>>>>>>> Hi Tadej, all,
>>>>>>>
>>>>>>>     thanks a lot for this. Just a few comments / questions:
>>>>>>>
>>>>>>>     1) About "The information applies to the textual content of the
>>>>>>> element, including child elements and attributes.": wouldn't it make
>>>>>>> more
>>>>>>> sense to say that this applies to only the content of the element?
>>>>>>> E.g.
>>>>>>> if
>>>>>>> you annotate the "span" element in
>>>>>>>
>>>>>>>     <p>I have seen <span id="timbl"><span class="firstname">Tim</span>
>>>>>>> <span class="lastname">Berners-Lee</span></span> in the Olympics
>>>>>>> opening ceremony</p>
>>>>>>>
>>>>>>>     You want to express disambiguation information about the "span"
>>>>>>> element
>>>>>>> with the "id" attribute, but not about the "id" attribute or the
>>>>>>> nested
>>>>>>> span elements. So inheritance probably should be: "There is no
>>>>>>> inheritance". What do you think?
>>>>>>>
>>>>>>>
>>>>>>>     2) About "The Entity data category can be expressed with global
>>>>>>> rules,
>>>>>>> or locally on an individual element.": This should probably be "The
>>>>>>> Disambiguation data category can be expressed with global rules, or
>>>>>>> locally
>>>>>>> on an individual element."
>>>>>>>
>>>>>>>     3) About local markup: for other data categories, we don't have
>>>>>>> the
>>>>>>> "pointer" attributes as local markup, since processing of XPath in
>>>>>>> local
>>>>>>> markup can be very expensive. So I would propose to drop the local
>>>>>>> pointer
>>>>>>> attributes here too.
>>>>>>>
>>>>>>>     4) In the table at the end, "Global pointing to existing
>>>>>>> information"
>>>>>>> should be "yes" I think.
>>>>>>>
>>>>>>>     5) This selector
>>>>>>>  <its:disambiguation selector="/text/body/p/#dublin" ...
>>>>>>> In XPath should be
>>>>>>> <its:disambiguation selector="/text/body/p[@id='dublin']" ...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     6) To follow the conventions from other data categories, the
>>>>>>> "its:disambiguation" element should probably be called
>>>>>>> "its:disambiguationRule".
>>>>>>>
>>>>>>>     7) A question on the data category in general and the "rules"
>>>>>>> element:
>>>>>>> does it make sense to make some attributes mandatory? Currently, this
>>>>>>> would
>>>>>>> be valid:
>>>>>>>  <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     8) A question to the others in this thread (Guiseppe, Pablo,
>>>>>>> Raphael,
>>>>>>> Sebastian): is this a representation that makes sense to you and that
>>>>>>> your
>>>>>>> tools could produce?
>>>>>>>
>>>>>>>     9) A question to the MT guys: is the way "entity and
>>>>>>> disambiguation"
>>>>>>> information is represented here useful for you?
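
The selector correction in point 5 above can be checked quickly. This is
only a sketch under stated assumptions: the sample document and the function
name select_paragraphs are made up for illustration, and it uses the limited
XPath subset of Python's xml.etree.ElementTree, so the path is written
relative to the root element "text" instead of as the absolute path
/text/body/p[@id='dublin'].

```python
import xml.etree.ElementTree as ET

# Made-up sample document with the /text/body/p structure from the selector.
SAMPLE = (
    "<text><body>"
    '<p id="dublin">Dublin is the capital of Ireland.</p>'
    '<p id="cork">Cork is a city in Ireland.</p>'
    "</body></text>"
)


def select_paragraphs(xml_source, paragraph_id):
    """Return the p elements under text/body whose id attribute matches."""
    root = ET.fromstring(xml_source)  # root is the <text> element
    # Relative form of /text/body/p[@id='...'], evaluated from the root.
    return root.findall(f"./body/p[@id='{paragraph_id}']")


matches = select_paragraphs(SAMPLE, "dublin")
```

The attribute predicate picks out exactly the paragraph with the given id,
which is the behaviour the corrected selector is meant to have.
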
>>>>>>>
>>>>>>>     Best,
>>>>>>>
>>>>>>>     Felix
>>>>>>>
>>>>>>> 2012/8/3 Tadej Štajner <tadej.stajner@ijs.si>
>>>>>>>
>>>>>>>    Hi,
>>>>>>> I incorporated some comments that 'entity' was still conflated from
>>>>>>> several distinct things in the data category proposal. Now, we
>>>>>>> distinguish
>>>>>>> between disambiguation of word sense, ontology concept and entity
>>>>>>> instance.
>>>>>>> Following that, it seems that 'Disambiguation' was the better name
>>>>>>> for
>>>>>>> the
>>>>>>> data category.
>>>>>>>
>>>>>>> Thanks for everyone's input!
>>>>>>>
>>>>>>> -- Tadej
>>>>>>>
>>>>>>> On 02. 08. 2012 17:26, Tadej Štajner wrote:
>>>>>>>
>>>>>>>    Apologies -- wrong link on the previous mail. This is the
>>>>>>> relevant one:
>>>>>>>  http://www.w3.org/International/multilingualweb/lt/track/actions/181
>>>>>>> -- Tadej
>>>>>>>
>>>>>>> On 02. 08. 2012 17:22, Tadej Štajner wrote:
>>>>>>>
>>>>>>> Dipl. Inf. Sebastian Hellmann
>>>>>>> Department of Computer Science, University of Leipzig
>>>>>>> Events:
>>>>>>>     * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>>>>>>>     * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
>>>>>>> Projects: http://nlp2rdf.org , http://dbpedia.org
>>>>>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>>>>>> Research Group: http://aksw.org
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>  --
>> Felix Sasaki
>> DFKI / W3C Fellow
>>
>>
>
>
>  --
> ---
> Pablo N. Mendes
> http://pablomendes.com
> Events: http://wole2012.eurecom.fr
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Monday, 20 August 2012 11:24:35 UTC