Re: [all] Call for consensus on disambiguation - feedback integrated [ACTION-181]

Hi Felix,
Your proposal is based on the assumption that more data is available at
these three URLs:

http://nerd.eurecom.fr/ontology#Place
http://dbpedia.org/resource/Dublin
http://www.w3.org/2006/03/wn/wn20/instances/wordsense-capital-noun-3

While this assumption is ok for the Semantic Web, I am not sure about 
the ITS world.
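
To make the assumption concrete: a Semantic Web client would typically
dereference each URL and ask for RDF via content negotiation. Below is a
minimal Python sketch of such a request, which is only prepared and never
sent. The helper name build_rdf_request and the Accept media types are my
own assumptions for illustration, not something ITS prescribes:

```python
from urllib.request import Request

# Media types a Linked Data client might ask for; this preference
# order is an assumption for illustration.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_rdf_request(url: str) -> Request:
    """Prepare (but do not send) a GET request that asks for RDF."""
    return Request(url, headers={"Accept": RDF_ACCEPT})


urls = [
    "http://nerd.eurecom.fr/ontology#Place",
    "http://dbpedia.org/resource/Dublin",
    "http://www.w3.org/2006/03/wn/wn20/instances/wordsense-capital-noun-3",
]
requests = [build_rdf_request(u) for u in urls]

# Show what would go over the wire without touching the network.
for req in requests:
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Whether an ITS processor should be expected to do this at all is exactly
the open question.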

Furthermore, if you are trying to minimize the markup, I would suggest
merging "its-entity-type-ident-ref" into "its-disambig-type-ref". You
wouldn't be limited to entity types and could use any of:

- http://nerd.eurecom.fr/ontology#Place
- http://dbpedia.org/ontology/Place
- http://www.monnet-project.eu/lemon#LexicalSense
- http://www.monnet-project.eu/lemon#LexicalEntry
- http://wordnet.princeton.edu/wndatamodel#NounWordSense
- http://wordnet.princeton.edu/wndatamodel#Synset
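
To illustrate the merge: a consumer would then only need to read two
attributes per element. Here is a rough Python sketch, assuming the merged
attribute keeps the name its-disambig-type-ref. The DisambigCollector
class, the sample markup and the type URI picked for each span are made up
for illustration:

```python
from html.parser import HTMLParser

# The first attribute is taken from Felix's example; the second is the
# merged attribute suggested above (its final name is an assumption).
IDENT_ATTR = "its-disambig-ident-ref"
TYPE_ATTR = "its-disambig-type-ref"


class DisambigCollector(HTMLParser):
    """Collect ident, type and text of every annotated element."""

    def __init__(self):
        super().__init__()
        self.annotations = []  # finished annotations, in closing order
        self._stack = []       # (tag, annotation or None) per open element

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        entry = None
        if IDENT_ATTR in attr_map or TYPE_ATTR in attr_map:
            entry = {
                "ident": attr_map.get(IDENT_ATTR),
                "type": attr_map.get(TYPE_ATTR),
                "text": "",
            }
        self._stack.append((tag, entry))

    def handle_data(self, data):
        # Text belongs to every annotated element that is still open.
        for _, entry in self._stack:
            if entry is not None:
                entry["text"] += data

    def handle_endtag(self, tag):
        # Pop up to the matching start tag, closing unclosed children too.
        while self._stack:
            open_tag, entry = self._stack.pop()
            if entry is not None:
                self.annotations.append(entry)
            if open_tag == tag:
                break


SAMPLE = (
    '<p><span its-disambig-type-ref="http://dbpedia.org/ontology/Place" '
    'its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span> '
    'is the <span '
    'its-disambig-type-ref="http://wordnet.princeton.edu/wndatamodel#NounWordSense" '
    'its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/'
    'wordsense-capital-noun-3">capital</span> of Ireland.</p>'
)

collector = DisambigCollector()
collector.feed(SAMPLE)
collector.close()
annotations = collector.annotations

for a in annotations:
    print(a["text"], a["type"], a["ident"])
```

The type attribute carries an ontology class for "Dublin" and a lexical
class for "capital", so no separate entity-type attribute is needed.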

All the best,
Sebastian

On 20.08.2012 09:44, Felix Sasaki wrote:
> Hi Sebastian, all,
>
> thanks, Sebastian. From what you say in the wiki and in the previous mail,
> I think one could simplify things a lot.
>
> The HTML example from Tadej *could* look like this:
>
> <html lang="en">
>     <head>
>        <meta charset="utf-8" />
>        <title>Entity: Local Test</title>
>     </head>
>     <body>
>         <p><span
>             its-entity-type-ident-ref="http://nerd.eurecom.fr/ontology#Place"
>             its-disambig-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>
>             is the <span
>             its-disambig-ident-ref="http://www.w3.org/2006/03/wn/wn20/instances/wordsense-capital-noun-3">capital</span>
>             of Ireland.</p>
>     </body>
> </html>
>
> That is, no explicit "resource" references for entity type and
> disambiguation source, and no disambig-type.
>
> Also, I think one could get rid of adding this kind of information via
> global rules - I really don't see a use case for that.
>
> Tadej, others, thoughts? Maybe Yves, as one of the implementors processing
> the output, and others have some thoughts too?
>
> Best,
>
> Felix
>
> 2012/8/17 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
>
>> Dear Felix,
>> to solve this issue I prepared a page:
>> http://wiki.nlp2rdf.org/wiki/DBpedia_Spotlight
>> It is still a rough draft, so it contains many mistakes. Once it is mature,
>> I will send it to the DBpedia Spotlight and Apache Stanbol lists to get
>> their feedback.
>> Note that I don't have a problem with these properties as XML attributes,
>> where they can naturally occur only once and encoding an implicit
>> dependency (an attribute referring to another attribute) is unproblematic.
>> They are, however, difficult to handle in RDF, even when declaring them
>> functional.
>> I will report back if there is any news.
>>
>> All the best,
>> Sebastian
>>
>>
>>
>>
>> On 14.08.2012 21:34, Felix Sasaki wrote:
>>
>>> Hi Sebastian, all,
>>>
>>> August is taking its toll ... I am wondering if there are any thoughts on
>>> Sebastian's mail below. It seems that some of the proposed ITS attributes
>>> are not needed, but I don't have the competence to evaluate this. Thoughts
>>> from others?  Sebastian, could you confirm that the output mentioned in
>>> this other thread
>>>
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0168.html
>>>
>>> is correct for NIF? I would then create a test case for our test suite;
>>> see
>>>
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt-tests/2012Aug/0003.html
>>>
>>> Thanks,
>>>
>>> Felix
>>>
>>> On Thursday, 9 August 2012, Sebastian Hellmann wrote:
>>>
>>>> Hi Felix,
>>>> below is mostly my opinion on this. There is nothing wrong with including
>>>> these properties, but they might not make sense in RDF. If you think that
>>>> there are people who would really use these properties in RDF, then go
>>>> ahead and include them. Personally, *I* wouldn't know what *I* could use
>>>> them for. More comments inline.
>>>>
>>>> On 09.08.2012 15:20, Felix Sasaki wrote:
>>>>
>>>>   its:entityTypeSourceRef
>>>>>   I really do not find this property helpful.
>>>> Do you see any sense in saying that http://dbpedia.org/resource/Dublin
>>>> is from http://dbpedia.org ? In the linked data world
>>>> http://dbpedia.org/resource/Dublin comes from
>>>> http://dbpedia.org/resource/Dublin.
>>>> So you might specify a way to convert that to ITS, but we might not need
>>>> an RDF property for this.
>>>>
>>>>    its:disambigType
>>>>
>>>>> "(http://www.w3.org/2005/11/its/lexicalConcept|
>>>>> http://www.w3.org/2005/11/its/ontologyConcept|
>>>>> http://www.w3.org/2005/11/its/entity)"
>>>>>
>>>>>   I am unsure about this one.
>>>>    its:entityTypeRef
>>>> is already rdf:type, so it would be a duplicate to have its:entityTypeRef
>>>> in RDF. For http://dbpedia.org/resource/Dublin its:entityTypeRef would be
>>>> one of:
>>>> http://dbpedia.org/ontology/PopulatedPlace
>>>> http://dbpedia.org/ontology/Settlement
>>>> http://umbel.org/umbel/rc/PopulatedPlace
>>>> http://dbpedia.org/ontology/Place
>>>> http://umbel.org/umbel/rc/Village
>>>> http://umbel.org/umbel/rc/Location_Underspecified
>>>> http://schema.org/Place
>>>> http://www.w3.org/2002/07/owl#Thing
>>>> http://www.opengis.net/gml/_Feature
>>>> +
>>>> http://nerd.eurecom.fr/ontology#Place
>>>>
>>>> If you have a problem with this plurality, then it might be good to
>>>> include an annotation property its:preferedEntityTypeRef.
>>>> So the data is already there in RDF; the problem is rather to find a way
>>>> to convert it back to ITS.
>>>>
>>>> All the best,
>>>> Sebastian
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> Felix
>>>>
>>>> 2012/8/9 Felix Sasaki <fsasaki@w3.org>
>>>>
>>>> Thanks for this, Tadej, looks good. There is just one comment I don't
>>>> see reflected:
>>>>
>>>> 7) A question on the data category in general and the "rules" element:
>>>> does it make sense to make some attributes mandatory? Currently, this
>>>> would be valid:
>>>> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>>
>>>> It seems that all metadata items / attributes are still optional. Is
>>>> there a way to be more specific about what must or must not appear
>>>> together, what is optional etc?
>>>>
>>>> Best,
>>>>
>>>> Felix
>>>>
>>>> 2012/8/9 Tadej Stajner <tadej.stajner@ijs.si>
>>>>
>>>>      Hi,
>>>>     thanks for the tips. I covered them, and I agree with removing the
>>>> local XPath, since it has very limited use. Here is another version
>>>> incorporating all these comments.
>>>> -- Tadej
>>>>
>>>> On 8/3/2012 1:07 PM, Felix Sasaki wrote:
>>>>
>>>> Hi Tadej, all,
>>>>
>>>>     thanks a lot for this. Just a few comments / questions:
>>>>
>>>>     1) About "The information applies to the textual content of the
>>>> element, including child elements and attributes.": wouldn't it make more
>>>> sense to say that this applies only to the content of the element? E.g.
>>>> if you annotate the "span" element in
>>>>
>>>>     <p>I have seen <span id="timbl"><span class="firstname">Tim</span>
>>>>     <span class="lastname">Berners-Lee</span></span> in the olympics
>>>>     opening ceremony</p>
>>>>
>>>>     You want to express disambiguation information about the "span"
>>>> element with the "id" attribute, but not about the "id" attribute or the
>>>> nested span elements. So inheritance probably should be: "There is no
>>>> inheritance". What do you think?
>>>>
>>>>
>>>>     2) About "The Entity data category can be expressed with global rules,
>>>> or locally on an individual element.": This should probably be "The
>>>> Disambiguation data category can be expressed with global rules, or
>>>> locally on an individual element."
>>>>
>>>>     3) About local markup: for other data categories, we don't have the
>>>> "pointer" attributes as local markup, since processing of XPath in local
>>>> markup can be very expensive. So I would propose to drop the local
>>>> pointer attributes here too.
>>>>
>>>>     4) In the table at the end, "Global pointing to existing information"
>>>> should be "yes" I think.
>>>>
>>>>     5) This selector
>>>> <its:disambiguation selector="/text/body/p/#dublin" ...
>>>> In XPath should be
>>>> <its:disambiguation selector="/text/body/p[@id='dublin']"
>>>>
>>>>
>>>>     6) To follow the conventions from other data categories, the
>>>> "its:disambiguation" element should probably be called
>>>> "its:disambiguationRule".
>>>>
>>>>     7) A question on the data category in general and the "rules" element:
>>>> does it make sense to make some attributes mandatory? Currently, this
>>>> would be valid:
>>>> <its:disambiguation selector="/text/body/p[@id='dublin']"/>
>>>>
>>>>
>>>>     8) A question to the others in this thread (Guiseppe, Pablo, Raphael,
>>>> Sebastian): is this a representation that makes sense to you and that
>>>> your tools could produce?
>>>>
>>>>     9) A question to the MT guys: is the way "entity and disambiguation"
>>>> information is represented here useful for you?
>>>>
>>>>     Best,
>>>>
>>>>     Felix
>>>>
>>>> 2012/8/3 Tadej Štajner <tadej.stajner@ijs.si>
>>>>
>>>>    Hi,
>>>> I incorporated the comments that 'entity' still conflated several
>>>> distinct things in the data category proposal. Now, we distinguish
>>>> between disambiguation of word sense, ontology concept and entity
>>>> instance. Following that, it seems that 'Disambiguation' is the better
>>>> name for the data category.
>>>>
>>>> Thanks for everyone's input!
>>>>
>>>> -- Tadej
>>>>
>>>> On 02. 08. 2012 17:26, Tadej Štajner wrote:
>>>>
>>>>    Apologies -- wrong link on the previous mail. This is the relevant one:
>>>> http://www.w3.org/International/multilingualweb/lt/track/actions/181
>>>> -- Tadej
>>>>
>>>> On 02. 08. 2012 17:22, Tadej Štajner wrote:
>>>>
>>>> Dipl. Inf. Sebastian Hellmann
>>>> Department of Computer Science, University of Leipzig
>>>> Events:
>>>>     * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
>>>>     * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
>>>> Projects: http://nlp2rdf.org , http://dbpedia.org
>>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>>> Research Group: http://aksw.org
>>>>
>>>>
>>>>
>


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events:
   * http://sabre2012.infai.org/mlode (Leipzig, Sept. 23-24-25, 2012)
   * http://wole2012.eurecom.fr (*Deadline: July 31st 2012*)
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

Received on Monday, 20 August 2012 08:38:01 UTC