<daveL> Meeting: LD4LT community Group Call
<daveL> chair: Dave Lewis
<daveL> Agenda: http://lists.w3.org/Archives/Public/public-ld4lt/2014Jul/0018.html
<daveL> apologies, we just lost goto meeting for a minute, back now
dave: this week we want to focus on the meta-share ontology
... then we will go through the meta-share changes made since last time
... and will cover victor's suggestions on licensing
... penny is not here today; she mailed comments on licensing, and we will go through those
action-5?
<trackbot> action-5 -- Víctor Rodríguez-Doncel to proposal for a license modue -- due 2014-06-19 -- OPEN
<trackbot> http://www.w3.org/community/ld4lt/track/actions/5
<scribe> done
close action-5
<trackbot> Closed action-5.
<daveL> ACTION-7: Felix - Check with w3c groups if there are other approaches to represent languages as uris
<trackbot> Notes added to ACTION-7 Check with w3c groups if there are other approches to represent languages as uris.
action-7?
<trackbot> action-7 -- Felix Sasaki to Check with w3c groups if there are other approches to represent languages as uris -- due 2014-06-19 -- OPEN
<trackbot> http://www.w3.org/community/ld4lt/track/actions/7
<daveL> http://lists.w3.org/Archives/Public/public-ld4lt/2014Jul/0004.html
close action-7
<trackbot> Closed action-7.
felix: ok to discuss with meta-share, hard to resolve in general
action-8?
<trackbot> action-8 -- David Lewis to Look into isa work related to dcat profiles and report back -- due 2014-07-10 -- OPEN
<trackbot> http://www.w3.org/community/ld4lt/track/actions/8
<scribe> done, see mail from dave
close action-8
<trackbot> Closed action-8.
see mail at http://lists.w3.org/Archives/Public/public-ld4lt/2014Jul/0011.html
action-9?
<trackbot> action-9 -- Jorge Gracia to Implement changes in metashare spreadsheet -- due 2014-07-10 -- OPEN
<trackbot> http://www.w3.org/community/ld4lt/track/actions/9
close action-9
<trackbot> Closed action-9.
<daveL> ACTION-10: Jorge - Identify some external vocabularies to use in ms
<trackbot> Notes added to ACTION-10 Identify some external vocabularies to use in ms.
dave: will discuss later in the call
jorge: will be covered during meta-share discussion
close action-10
<trackbot> Closed action-10.
dave: thanks to all for working on your action points :)
<daveL> http://mlode2014.nlp2rdf.org/lider-roadmapping-workshop/
dave: ld4lt / lider RM workshop in leipzig
<daveL> felix: this event is looking at getting more input from the analytics use cases and needs for linked data
<daveL> .. as there will be a lot of those companies there
dave: it will be a good opportunity for people in this group to meet f2f and discuss content analytics and general topics
... then, we recently had a workshop in dublin at loc world
... we now have an opportunity to repeat that in vancouver
... it will be in the last week of october
... so FYI, I'll send details around later
<daveL> https://docs.google.com/spreadsheets/d/15SE4_qAqYFostmD52uKxpkCPZh1f5TrPeoXKNTlDYpQ/edit#gid=0
jorge: modifications of the gdocs spreadsheet format:
... I created new columns to put new information in
... we keep track of the old information. the old columns have been hidden, as you can do in excel. Just click and the previous info appears
... I added colors so that you can see what changed - this is shown in blue
... in the discussion column: I, penny, dave and others have added comments to feed the discussion
... my proposal: go through the rows in the spreadsheet, re-read the discussion column, see what we can decide
... I colored in red the discussions that may be more critical
... propose to go through the whole list of rows
dave: agree
jorge: the set of classes is short
... first: agent
... proposed to use FOAF agent both for person and organization
... see the suggestion in the comments: provenance agent
dave: using it by itself does not make sense, of course
jorge: for us, as a first step, I propose this, without prov ontology
dave: sure
jorge: now row six: there were some labels expressed in camel case
... I changed these to separate words
... also suggest to write labels in lower case
... it is recommended to write labels as normal English
... something to keep in mind when we do the clean version of this
... for corpus: I removed disjointness, I thought it was not useful
... in row 10: corpus collection
... penny explains that this value does not come from
meta-share model
... I say: we could introduce the collection class of dublin core
... we need to check with the meta-share people whether that fits for them
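The label clean-up jorge describes for row six (camel case to separate, lower-case words) could be automated roughly as follows. This is a minimal sketch, not the procedure used on the spreadsheet: the function name camel_to_label and the sample identifiers are illustrative assumptions.

```python
import re

def camel_to_label(name: str) -> str:
    """Turn a camel-case identifier into a lower-case, space-separated label."""
    # Split where a lower-case letter or digit is followed by an upper-case letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Split an acronym from a following capitalised word, e.g. "XMLSchema".
    spaced = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", spaced)
    return spaced.lower()

# Hypothetical identifier, used for illustration only.
print(camel_to_label("corpusCollection"))  # prints "corpus collection"
```

Labels that are already single lower-case words pass through unchanged, so the function can be applied to a whole label column without special cases.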
<jgracia> http://purl.org/dc/dcmitype/Collection
<daveL> http://www.w3.org/TR/vocab-dcat/#vocabulary-overview
dave: in dcat there is the idea
of a data set
... that would be the language resource in our case
... but it can also be a catalog, which can be a collection
with data sets
... so catalog rather than dct collection may be a better way
of doing it
marta: for corpora you can have audio of the corpus and the transcript
<jgracia> Marta Villegas
marta: in a sense you have two
corpora
... so you need two instances of corpus to encode both parts of the corpus
... that is the idea: to build a higher node so that you can
add more corpora inside
dave: so that is probably
different than dcat catalogue
... in your description a collection is a sub grouping
<daveL> http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=dcmitype#Collection
dave: about dublin core: it says "it is an aggregation + collection of resources"
... so dct collection is maybe more accurate
jorge: we have to decide: how to map dcat data sets with language resources
dave: so stay with collections as suggested
jorge: ok
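The grouping marta describes (an audio corpus and its transcript under one higher node) could be written down as triples along these lines. This is only a sketch of the option agreed above: the dcmitype Collection class comes from the link jorge posted, dct:hasPart is the Dublin Core part-of relation, and the example.org URIs and the helper names group_corpora and to_ntriples are hypothetical.

```python
DCMITYPE = "http://purl.org/dc/dcmitype/"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace, for illustration only

def group_corpora(collection, parts):
    """Return triples typing `collection` as a dcmitype Collection with the given parts."""
    triples = [(collection, RDF_TYPE, DCMITYPE + "Collection")]
    for part in parts:
        triples.append((collection, DCT + "hasPart", part))
    return triples

def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

# Hypothetical resources: one collection holding an audio part and a transcript part.
triples = group_corpora(
    EX + "interviewCollection",
    [EX + "interviewAudio", EX + "interviewTranscript"],
)
print(to_ntriples(triples))
```

How such a collection relates to a dcat data set or catalog is left open here, as it was in the call.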
discussion on the definition of "corpus"
philipp: postpone discussion and decide later whether we define this as property or class
marta: in meta-share corpus you
may have different media types
... you can have audio media type or text part
... penny can give us more info - it is not trivial to move
from annotation schema to ontology at this point
john: two options: we map to other concepts, or we just represent what is in meta-share. what is the goal here?
jorge: the aim is closer to using what is in meta-share and converting that into an owl ontology
... meta-share is based on decades of discussion
john: so if in meta-share there is a corpus collection we use that, if not, we can use something from a semantic web vocabulary
marta: corpus collection has been added, it is not in the original meta-share
john: if this is about alignment we should not have a new vocab that is not in meta-share?
dave: this is a first attempt to map the existing xml format into rdf
... that's slightly different from mapping one vocab into another
(scribe has a hard time to capture discussion, will see if there is a conclusion)
<daveL> https://www.w3.org/community/ld4lt/wiki/Meta-Share_OWL_metamodel
<daveL> http://www.meta-net.eu/meta-share/META-SHARE%20%20documentationUserManual.pdf
https://github.com/metashare/META-SHARE/tree/master/misc/schema/v3.0
marta: above schema is the latest version of the xml schema
<Tcarrasco> Proposal - corpus: collection of linguistic data; it can be in several media-types. Corpus can be: media-type homogeneous or heterogeneous; monolingual or multilingual.
jorge: are corpora of this type first class citizens in the meta-share model?
marta: yes
jorge: maybe good then to define this as first class entity
<Tcarrasco> Today, the relevant corpora are n-lingual plain text
dave: how to wrap up the discussion on corpus definition, jorge?
jorge: let's move to the license topic
... one major issue to clarify: the mapping between language resource and the dcat data set and dcat distribution classes
... this is still open
<scribe> ACTION: daveL to gather info on how to provide more detailed mapping from meta-share to dcat [recorded in http://www.w3.org/2014/07/17-ld4lt-minutes.html#action01]
<trackbot> Created ACTION-11 - Gather info on how to provide more detailed mapping from meta-share to dcat [on David Lewis - due 2014-07-24].
<Tcarrasco> Human annotation is realistic for a small corpus - a large corpus requires programmatic processing for cleaning, annotation and other processing
jorge: agree, now let's move into the licensing topic
<daveL> https://www.w3.org/community/ld4lt/wiki/Licensing_information
<daveL> http://lists.w3.org/Archives/Public/public-ld4lt/2014Jul/0014.html
dave: wikipage from victor, penny sent above mail
<Tcarrasco> Sound poor
victor: penny likes the approach and made some comments
... she said: we should declare more precisely which elements we use
... currently literals are plain strings. they should be replaced by URIs
... penny says the license name has to be kept
... use of URL is also ok
... resources that have double licensing should be supported, I agree
... she discussed more information that should be there
... penny has not reflected her comments in the wiki
... I can do it for her or she can do it herself, I'll send a mail to her about that
... next step will be to update the spreadsheet
... we can use the meta-share term, declaring odrl
... connecting both via owl:sameAs
... in marta's translation I missed an element to aggregate license information
... in marta's model these properties were directly attributed to the resources
dave: is it necessary to have the aggregation? or can you retrieve that via sparql?
victor: if a resource has two licenses the properties will be related to license one or two
dave: ok
... when I look at dcat I will take the discussion of multiple licenses into account too
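victor's point about the aggregating element can be illustrated with a small sketch. When license properties hang directly on the resource, a resource with two licenses no longer says which condition belongs to which license; with one node per license the pairing is kept and can be retrieved by a join on that node (which is what a SPARQL query would do as well). All property names below (ex:licenseInfo, ex:licenseName, ex:condition) and the resource ex:corpus1 are hypothetical and are not terms from the meta-share model or from odrl.

```python
# Flat model: properties attached directly to the resource.
flat = [
    ("ex:corpus1", "ex:licenseName", "License A"),
    ("ex:corpus1", "ex:licenseName", "License B"),
    ("ex:corpus1", "ex:condition", "non-commercial use only"),
]

# Aggregated model: one node per license, properties attached to that node.
grouped = [
    ("ex:corpus1", "ex:licenseInfo", "_:lic1"),
    ("_:lic1", "ex:licenseName", "License A"),
    ("_:lic1", "ex:condition", "non-commercial use only"),
    ("ex:corpus1", "ex:licenseInfo", "_:lic2"),
    ("_:lic2", "ex:licenseName", "License B"),
]

def conditions_per_license(triples, resource):
    """Map each license name of `resource` to the conditions on the same license node."""
    result = {}
    for s, p, node in triples:
        if s != resource or p != "ex:licenseInfo":
            continue
        names = [o for s2, p2, o in triples if s2 == node and p2 == "ex:licenseName"]
        conditions = [o for s2, p2, o in triples if s2 == node and p2 == "ex:condition"]
        for name in names:
            result[name] = conditions
    return result

print(conditions_per_license(grouped, "ex:corpus1"))
```

Run against the flat list, the same function finds no license nodes at all, which mirrors the problem: the condition cannot be assigned to License A or License B from the data alone.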
felix: when you have issues with dcat you may want to talk to phil archer directly, he is on top of things
dave: makes sense - about dcat we can make a wiki page
... so that it is digestible for dcat people
felix: makes a lot of sense
<jgracia> +1
dave: so victor will liaise with penny on the wiki page
... and then we can make changes to the actual spreadsheet
dave: do people want to have
another call next thursday?
... I won't be around but we could arrange it
people can do both weeks
dave: I could not chair next week
but maybe somebody else can do that
... trying to nail things down before we get to August
dave: we want to finish off the spreadsheet, then have a stable core part, and then hand that back to marta / penny to publish on their own github
... hope that we can get to that after the holidays
dave: we will arrange to have a
call next week, assure that we can start the session, then
another call in two weeks too
... thanks to all for your efforts in the mail and here!
adjourned
Present: Renat fsasaki Jorge Victor TizianoFlati DaveLewis M.T. Carrasco Benitez RobertoNavigli kevinkoidl Ali_ H_Vahid serge marta philippC johnMcC
Regrets: penny
Scribe: fsasaki
Date: 17 Jul 2014
Minutes: http://www.w3.org/2014/07/17-ld4lt-minutes.html
People with action items: daveL