Re: [ISSUE-2] Module suggestions for META-SHARE RDF vocabulary

Hi all,

I have been following this discussion quite passively until now and I think
I can give my 2 cents here. I believe the following survey gives a good
overview of the topic.

Leonardo Lezcano, Salvador Sánchez-Alonso, Antonio J. Roa-Valverde, (2013)
"A survey on the exchange of linguistic resources: Publishing linguistic
linked open data on the Web", Program: electronic library and information
systems, Vol. 47 Iss: 3, pp.263 - 281

http://www.emeraldinsight.com/journals.htm?issn=0033-0337&volume=47&issue=3&articleid=17093339&show=html

I am not sure I can just share a copy in this thread due to copyright
issues though.

Best regards,
Antonio


On Thu, Jun 12, 2014 at 11:49 AM, Sebastian Hellmann <
hellmann@informatik.uni-leipzig.de> wrote:

>  Hi Asun, all,
>
> The image looks quite appropriate. Here are some additions to the names
> already mentioned in it:
>
> ## General Metadata
> The DBpedia community is currently pursuing an implementation and
> extension of DCat and VOID called DataID [1]. While DCat and VOID are
> vocabularies, DataID will also provide guidelines on exactly how and where
> to publish the DataID file (similar to the robots.txt or sitemap file).
> There will be a validator implementation to help adoption.
>
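
As an aside on the DCat/VOID layer mentioned above, here is a minimal sketch of a dataset description serialized as N-Triples. DataID itself is not shown, since its terms are not spelled out in this thread. The vocabulary namespaces are the real DCat, VOID and Dublin Core ones; the function names and the example.org values are assumptions made up for illustration.

```python
# Namespaces of the vocabularies named above.
DCAT = "http://www.w3.org/ns/dcat#"
VOID = "http://rdfs.org/ns/void#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def plain_literal(value):
    """Serialize a string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_dataset(dataset_iri, title, dump_url, triple_count):
    """Return N-Triples lines describing one dataset (hypothetical helper)."""
    subject = iri(dataset_iri)
    statements = [
        (iri(RDF_TYPE), iri(DCAT + "Dataset")),
        (iri(RDF_TYPE), iri(VOID + "Dataset")),
        (iri(DCT + "title"), plain_literal(title)),
        (iri(VOID + "dataDump"), iri(dump_url)),
        (iri(VOID + "triples"), f'"{int(triple_count)}"^^{iri(XSD_INTEGER)}'),
    ]
    return [f"{subject} {predicate} {obj} ." for predicate, obj in statements]


# All values below are placeholders, not a real dataset.
for line in describe_dataset(
    "http://example.org/dataset/demo",
    "Demo corpus",
    "http://example.org/dumps/demo.nt",
    1200,
):
    print(line)
```

Where such a file would have to be published is exactly what the DataID guidelines are meant to settle, so the sketch leaves that open.
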
> ## Linguistic Specific Metadata
>
> Language codes for ISO 639-1 and ISO 639-2 are provided by the Library of
> Congress (LoC):
> http://id.loc.gov/vocabulary/iso639-1/ab
> http://id.loc.gov/vocabulary/iso639-2/eng
> Also in RDF:
> http://id.loc.gov/vocabulary/iso639-2/eng.rdf
>
> Sadly, the most popular codes, i.e. ISO 639-3, are not available from LoC.
> http://lexvo.org is the authority here at the moment:
> http://www.lexvo.org/page/iso639-3/eng
>
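
To make the URL patterns above concrete, here is a small sketch that builds a link for a language code. The three base URLs are copied from the examples in this message; the function name and the validation rule (letters only, fixed length) are assumptions, and the check only looks at the shape of the code, not at whether the code is actually assigned.

```python
# Base URL and expected code length per standard, taken from the examples above.
PATTERNS = {
    "iso639-1": ("http://id.loc.gov/vocabulary/iso639-1/", 2),
    "iso639-2": ("http://id.loc.gov/vocabulary/iso639-2/", 3),
    "iso639-3": ("http://www.lexvo.org/page/iso639-3/", 3),
}


def language_uri(code, standard):
    """Return the LoC or lexvo.org URL for a language code (hypothetical helper)."""
    if standard not in PATTERNS:
        raise ValueError(f"unknown standard: {standard!r}")
    base, length = PATTERNS[standard]
    normalized = code.strip().lower()
    # Shape check only: ASCII letters of the expected length.
    if len(normalized) != length or not (normalized.isascii() and normalized.isalpha()):
        raise ValueError(f"{code!r} does not look like an {standard} code")
    return base + normalized


print(language_uri("ab", "iso639-1"))
print(language_uri("eng", "iso639-2"))
print(language_uri("eng", "iso639-3"))
```
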
> ## Linguistic Data
> In my opinion, NIF and lemon are able to cover most industrial use cases.
>
> lemon for dictionaries and terminological data
> NIF as an annotation format for text
>
> While NIF itself provides mechanisms to model (offset) annotations as
> linked data, here are the incorporated NIF modules for expressing the
> annotations themselves:
>
> * ITS RDF ontology - http://www.w3.org/2005/11/its/rdf# based on
> http://www.w3.org/TR/its20/
> * NERD - for entity classification  (person, location, ...)
> http://nerd.eurecom.fr/ontology
> * MARL - for sentiment analysis http://purl.org/marl/0.1/ns
> * OLiA - for morpho-syntax, POS tag sets, etc.  http://purl.org/olia
> * DBpedia + DBpedia Ontology for Entity Linking:
> http://dbpedia.org/resource/Barack_Obama
>
> We started to collect them all here:
> https://github.com/NLP2RDF/ontologies/tree/master/vm
>
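
A rough sketch of how an offset annotation and an entity link fit together, assuming the NIF 2.0 Core namespace and its character-offset fragment scheme (`#char=begin,end`). The document URI, the sample sentence and the helper name are made up for illustration, and offsets are counted in Python string positions.

```python
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"


def annotate_entity(doc_uri, text, surface, entity_uri):
    """Return (subject, predicate, object) tuples for the first match of surface."""
    begin = text.find(surface)
    if begin < 0:
        raise ValueError(f"{surface!r} does not occur in the text")
    end = begin + len(surface)
    # The whole text is the context; the mention is a substring of it.
    context = f"{doc_uri}#char=0,{len(text)}"
    mention = f"{doc_uri}#char={begin},{end}"
    return [
        (context, NIF + "isString", text),
        (mention, NIF + "referenceContext", context),
        (mention, NIF + "beginIndex", begin),
        (mention, NIF + "endIndex", end),
        (mention, NIF + "anchorOf", surface),
        # Entity linking via the ITS RDF ontology listed above.
        (mention, ITSRDF + "taIdentRef", entity_uri),
    ]


triples = annotate_entity(
    "http://example.org/doc1",
    "Barack Obama visited Berlin.",
    "Barack Obama",
    "http://dbpedia.org/resource/Barack_Obama",
)
for triple in triples:
    print(triple)
```

NERD, MARL or OLiA values would be attached to the same mention URI in the same way, each with properties from its own namespace.
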
> ## Limitations of the above:
>
> * If the ISO language codes are not enough, http://glottolog.org/ is an
> option
> * If you need fine-grained features like annotations of annotations,
> http://www.openannotation.org/ can be used. The triple count is much
> higher than with NIF though, and scalability can be a problem.
>
> All the best,
> Sebastian
>
> [1]  http://wiki.dbpedia.org/coop/DataIDUnit
>
>
>
> On 31.05.2014 11:49, Asunción Gómez Pérez wrote:
>
>
> Dear all,
>
> Please consider the following picture as a starting point for grouping the
> different kinds of metadata into clusters and separating them from the
> content-oriented part of the LR. Issues related to country codes are not
> included in this slide, but it should be easy to extend. In the middle,
> the white boxes refer to candidate vocabularies to be reused or to
> initiatives that could help us with the definition of the properties and
> their values.
>
>
> I hope this helps.
>
> Asun
>
>
>
>
> On 22/05/2014 14:00, Marta Villegas wrote:
>
> Dear Penny, Dave and all,
>
>  For things like ORGANIZATION, PROJECT, DOCUMENT, PEOPLE (i.e.
> non-linguistic things) we could use existing ontologies like foaf, doap,
> bibo, swrc etc. (just choose the one that best fits your purpose).
> The same goes for language names/codes, country names, mime-types (we did
> not find anything but ...) etc.
>
>  Best
>
>
>
>
>  2014-05-22 11:55 GMT+02:00 Penny Labropoulou <penny@ilsp.gr>:
>
>> Dear Dave and all,
>>
>> We agree that a separation into modules will help the discussion, and we
>> basically agree with your proposal.
>>
>> One point as regards the RESOURCE_TYPE module: all LRs are described via
>> the same set of "administrative/descriptive" components + an additional
>> set of more specific components, depending on their resourceType AND
>> mediaType values - the latter set corresponds to all the components
>> included in the resourceComponentType part. So, there's a specific set of
>> components for corpora, lexical/conceptual resources, language
>> descriptions and tools/services (the four resource types recognized by
>> META-SHARE); inside these, we have separate components, depending on the
>> mediaType, so we have text corpora components, video corpora components,
>> audio corpora components, but also lexical/conceptual text components etc.
>> Inside each of these combinations, some elements are shared (e.g.
>> linguality and language, time classification etc.) or can be similar
>> (e.g. there are similar classification components for text, audio, video
>> and image). So, it might be more convenient to separate RESOURCE_TYPE and
>> MEDIA_TYPE modules. What do you think?
>>
>> We also suggest that we add three further modules: ORGANIZATION, PROJECT
>> and DOCUMENT - corresponding to the organizationInfo, projectInfo &
>> documentationInfo parts of the original model.
>>
>> Best,
>> Penny
>>
>> -----Original Message-----
>> From: Dave Lewis [mailto:dave.lewis@cs.tcd.ie]
>> Sent: Thursday, May 22, 2014 12:38 PM
>> To: public-ld4lt@w3.org
>> Subject: [ISSUE-2] Module suggestions for META-SHARE RDF vocabulary
>>
>> Hi all,
>> At the last call we discussed the template for the meta-share ontology as
>> kindly initiated by Jorge:
>>
>> https://docs.google.com/spreadsheets/d/15SE4_qAqYFostmD52uKxpkCPZh1f5TrPeoXKNTlDYpQ/edit#gid=0
>>
>> with further information at:
>> https://www.w3.org/community/ld4lt/wiki/Meta-Share_OWL_metamodel
>>
>> We discussed modules for this to help break down the tasks and to
>> partition parts that might take more time to agree or need involvement by
>> different subgroups compared to others.
>>
>> We already agreed to have a CORE component and split out a LICENSES
>> module, but had asked for other suggestions.
>>
>> I'd like to propose two further modules:
>>
>> RESOURCE_TYPE corresponding to the resourceComponentType part of the
>> meta-share schema:
>> http://www.meta-share.org/portal/knowledgebase/Resourcecomponenttype
>>
>> and
>>
>> USAGE_TYPE corresponding to the usageInfo part of the meta-share schema:
>> http://www.meta-share.org/portal/knowledgebase/Usageinfo
>>
>> These contain large enumerations that could both be subject to ongoing
>> debate and are likely candidates for extension/specialization. By
>> separating these out we can avoid such debate delaying work on the CORE
>> module.
>>
>> Should we add these as modules to the spreadsheet?
>>
>> From an ontology modelling viewpoint, how should we manage the modelling
>> in these proposed modules: would a class taxonomy be a better approach
>> than an enumeration?
>>
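
To make the two modelling options in this question easier to compare, here is a small sketch that prints both as Turtle-style statements. The `ms:` prefix, the class names and the value identifiers are placeholders loosely based on the four resource types named in this thread, not the agreed vocabulary.

```python
# Placeholder identifiers for the four resource types discussed in the thread.
RESOURCE_TYPES = [
    "corpus",
    "lexicalConceptualResource",
    "languageDescription",
    "toolService",
]


def as_enumeration(prefix, class_name, values):
    """Option 1: one class whose members are a closed list of individuals."""
    members = " ".join(f"{prefix}:{value}" for value in values)
    return f"{prefix}:{class_name} a owl:Class ; owl:oneOf ( {members} ) ."


def as_taxonomy(prefix, parent, values):
    """Option 2: one subclass per value, open to later specialization."""
    lines = []
    for value in values:
        class_name = value[0].upper() + value[1:]
        lines.append(
            f"{prefix}:{class_name} a owl:Class ; rdfs:subClassOf {prefix}:{parent} ."
        )
    return lines


print(as_enumeration("ms", "ResourceType", RESOURCE_TYPES))
for line in as_taxonomy("ms", "LanguageResource", RESOURCE_TYPES):
    print(line)
```

The enumeration closes the list of allowed values, while the taxonomy lets a subgroup add further subclasses without touching the existing ones; which of the two suits these modules is the open question here.
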
>> Kind Regards,
>> Dave
>>
>>
>>
>>
>>
>>
>
>
>  --
> Marta Villegas
> marta.villegas@gmail.com
>
>
> --
> Prof. Asunción Gómez-Pérez
> Catedrática de Universidad
> Director of the Ontology Engineering Group
> Facultad de Informática owl:sameAs Escuela Técnica Superior de Ingenieros Informáticos
> Universidad Politécnica de Madrid
> Campus de Montegancedo, sn
> Boadilla del Monte, 28660, Spain
> Home page: www.oeg-upm.net
> Email: asun@fi.upm.es
> Phone: (34-91) 336-7417
> Fax: (34-91) 352-4819
>
>
>

Received on Thursday, 12 June 2014 11:42:26 UTC