W3C

- DRAFT -

LD4LT CG

15 May 2014

Agenda

See also: IRC log

Attendees

Present
DaveLewis, Gary, Kevin, MartinBenjamin, RobertoNavigli, ali, arle(IRC), asun, flati, fsasaki, john, jorge, maria, penny, phil, roberto, tizinao
Regrets
Chair
dave
Scribe
fsasaki


<fsasaki_> scribe: fsasaki

dave: reminder for all to join IRC during calls: go to http://irc.w3.org , channel #ld4lt

agenda at http://lists.w3.org/Archives/Public/public-ld4lt/2014May/0005.html

dave going through agenda at http://lists.w3.org/Archives/Public/public-ld4lt/2014May/0005.html

dave: I'll talk about last week in madrid, we had an LD4LT meeting - others can give their viewpoints as well
... then we'll discuss ongoing work on meta-share ontology
... then jorge, penny who provided useful info on mailing list - would be good to see where that work is
... there are a few technical issues discussed on the list. If there is a thread on this we mark that up as an issue

https://www.w3.org/community/ld4lt/track/issues/

<daveL_> http://www.w3.org/community/ld4lt/track/

the links above give the issue list

dave: the tracker tracks actions and issues
... the issue tracker gives us a way to track topics over several meetings, and can associate that with actions as well
... see e.g. issue-2 which is meta-share metamodel work approach
... we have use case and requirements as issue-1
... would not go through that today
... then there is an issue about dcat, issue-3
... we had a discussion last week in madrid related to dcat
... there is an opportunity for cross over where

madrid meeting

dave: there were LIDER meetings and the MLW workshop, very successful, hosted by UPM

about 100 people turning up

dave: a broader set of discussions around multilingual web topics
... linked data not the only topic

<daveL__> http://www.multilingualweb.eu/documents/2014-madrid-workshop/

see slides now linked from http://www.multilingualweb.eu/documents/2014-madrid-workshop/2014-madrid-program

<daveL__> https://www.w3.org/community/ld4lt/wiki/LD4LT_Group_Madrid_May_2014_Meeting

(slides from the LIDER WS are now also linked)

dave: felix had reached out to wikipedia translation group, alolita sharma came with a large team of wikimedia folks
... wikipedia is now trying to help people to translate pages directly
... they are looking into tools, machine translation and other technologies
... they are also interested in data in other languages
... there is now also wikidata for creating data directly
... related to dbpedia which is about extracting data from content
... does anybody want to add anything about this?
... that was 1st day - 2nd day focused more on linked data and language technology
... we had several presentations from LIDER but also other (EU) projects
... we had discussions about data + metadata aspects of language resources
... several people from LD4LT were here, Christian Chiarcos from OWLG, Stelios Piperidis and Marta Villegas presenting about META-SHARE
... and many others. had good discussions
... had a good opportunity to talk about representations of language resource metadata
... good side discussions with EU and publications office about what they will do
... they are planning to publish parallel documents with fine grained identifiers - they are interested in the RDF version of that
... these are legal texts - high quality because of the domain
... covering many EU languages

asun: one of the other outcomes:
... we need more cooperation with open knowledge foundation
... importance of having pure linguistic resources instead of domain-dependent language resources

dave: agree
... we did not have Christian Chiarcos involved here so far
... these discussions will also continue at LREC related events I assume

<MartinBenjamin> There have been some experiments with Wikipedia translation to Swahili that have been less than successful, using the Google Translate Toolkit. The biggest problem comes from running English articles through Google Translate, which is absolutely horrible for Swahili.

Reaching agreement on core LR metadata ontologies, with META-SHARE

related issue-2

<daveL__> http://lists.w3.org/Archives/Public/public-ld4lt/2014Apr/0011.html

dave: had input from Jorge, thread is listed at https://www.w3.org/community/ld4lt/track/issues/2

<daveL__> http://lists.w3.org/Archives/Public/public-ld4lt/2014Apr/0017.html

dave: meta-share ontology is very comprehensive already

<MartinBenjamin> Big problem with using Wiki data is that things are written in wiki markup - no stable reference point to link anything below the article level. This is a disaster for trying to link Wiktionary data

dave: issue is: how will this be transformed into an ontology to use in the linked data area
... jorge has set up a related link on the website

<jorge> https://www.w3.org/community/ld4lt/wiki/Meta-Share_OWL_metamodel

jorge: in the wiki I added materials about meta-model description and the preliminary RDF version made by UPF
... we discussed - how to work on this in a collaborative manner
... I talked about several options: wiki, protege, gdocs
... it seems people are OK with gdocs spreadsheet
... I put the gdrive doc in a mode so you can share it with everybody

<daveL__> https://docs.google.com/spreadsheets/d/15SE4_qAqYFostmD52uKxpkCPZh1f5TrPeoXKNTlDYpQ/edit#gid=0

jorge: wanted to check if this is suitable for everybody

<MartinBenjamin> On the other hand, Wikipedia has a great hidden multilingual terms feature - interwiki links. Terms like "Down syndrome", or movie titles, etc, are very difficult to find in other languages. But if you go to the Wikipedia page for the topic, then follow the interwiki link to the page in your target language, the concept as expressed in that language is usually the article title or is high at the top. The problem again is how to exploit this a[CUT]

jorge: if that's ok I can fill this with the current state, that is the ontology made at upf

roberto: is this already populated?

jorge: no, this is just a skeleton - if people agree I would add it

dave: jorge, did you have discussions about what modules there might be?
... the current ontology e.g. has dublin core and others
... then there are many meta-share items
... did you have discussions how to reflect these in separate namespaces / modules?

jorge: we just have discussed to keep the meta-share module as is
... some part can be improved, e.g. about licenses

penny: had some audio issues - what are you discussing currently?
... I'm not an RDF expert so looking into this now
... we are now looking into what upf did
... so cutting the meta-share ontology into modules could improve the model a lot

jorge: do you have the gdrive spreadsheet in front of you - is it fine with you to work with this?

penny: yes, let me check

<daveL__> for navigating the original meta-share schema there is a useful structure at:

<daveL__> http://www.meta-share.org/portal/knowledgebase/HomePage

<jorge> https://www.w3.org/community/ld4lt/wiki/Meta-Share_OWL_metamodel

<daveL__> https://docs.google.com/spreadsheets/d/15SE4_qAqYFostmD52uKxpkCPZh1f5TrPeoXKNTlDYpQ/edit#gid=0

<scribe> ACTION: jorge to fill the gdocs with the current meta-share items [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action01]

<trackbot> Created ACTION-2 - Fill the gdocs with the current meta-share items [on Jorge Gracia - due 2014-05-22].

penny: we are already discussing some updates
... we should take that into account

jorge: so you mean it would make sense to add a module for services

maria: we were thinking about adding another module for collections
... to cater for loose collections of data
... that should be identified by themselves
... we have not reached a final decision on that

jorge: so meta-share community does not have a final consensus on this

<daveL__> felix: can a stable version of the schema be identified

felix: would it be possible to identify a version that we use as the basis for conversion?

maria: version 3.0 - marta has already worked on converting that to RDF
... there will be a minor update around LREC

asun: a few things related to the process:
... and how to record some kind of extra information
... during the following weeks we will raise many issues related to models etc.
... at some point we should decide: which are the core properties
... based on that we can start to extend with other terminology that is not in the core but also important
... I would suggest to make the core minimal and try to extend with other items
... I would include a new column in the gdocs spreadsheet: candidate vocabularies that could be re-used for representing meta-share terms

jorge: agree

asun: third comment:
... we should record proposal names, e.g. to know "computational lexicon with property has been proposed by Jorge"
... so that we see who had made a proposal
... and final comment:
... how to relate this with W3C notes that we are writing
... at the moment there is no argumentation
... at some point it would be good to have rationale of decisions together with the term agreed
... otherwise in conference calls like this we could end up repeating the same discussions
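
The row layout suggested above (term, core or not, candidate vocabularies, proposer, rationale) could be sketched roughly as follows. This is only an illustration, not the group's agreed format: the column names, the `TermRow` class and the `rows_to_csv` helper are assumptions made for the example, and the only values taken from the discussion are the term "computational lexicon" and its proposer.

```python
import csv
import io
from dataclasses import dataclass, field, asdict


@dataclass
class TermRow:
    """One spreadsheet row; all column names here are hypothetical."""
    term: str                      # meta-share term under discussion
    source: str = "meta-share"     # where the term comes from
    in_core: bool = False          # whether the term is proposed for the minimal core
    candidate_vocabularies: list = field(default_factory=list)  # reusable vocabulary terms
    proposed_by: str = ""          # who made the proposal
    rationale: str = ""            # why the decision was taken


def rows_to_csv(rows):
    """Serialise rows to CSV text, joining candidate vocabularies with ';'."""
    buffer = io.StringIO()
    fieldnames = list(TermRow.__dataclass_fields__)
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        record = asdict(row)
        record["candidate_vocabularies"] = ";".join(row.candidate_vocabularies)
        writer.writerow(record)
    return buffer.getvalue()


# Example taken from the discussion: a term together with the person who proposed it.
example = TermRow(term="computational lexicon", proposed_by="Jorge")
print(rows_to_csv([example]))
```

Keeping the proposer and the rationale in the same row as the term means a later call can see why a decision was taken without searching the mailing list.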

felix: how about having another gdocs (a word doc)

dave: using mailing list?

felix: sure, for discussions, but for document writing gdocs helps

dave: agree

asun: lots of mail is ok but having the rationale documented in one place helps
... a big discussion started by mail is difficult to follow

penny: really like asun's point
... could we have an issue tracker

asun: we can have a column to store the discussion

https://www.w3.org/community/ld4lt/track/issues

<tcarrasco> Email consensus should be consolidated into a proper document - it might be appropriate to have a couple of editors

felix: we could use the w3c tracker, I would volunteer to keep that up to date with issues that have been discussed
... as an output of today's discussion I'd close issue-2, the work approach

penny: e.g. we said meta-share version 3.0 is stable, that is something to track

jorge: in the example, we could just say in a column: "this comes from the meta-share model"
... for this first version most of the stuff will be authored by meta-share
... then we could add (using the same column) new things

dave: 2nd comment: what should go into core?
... you have indicated that licenses would go out of the core
... are there other natural groupings
... e.g. usages, classification of resources etc.
... should this stay in the core?

maria: usages could be left out
... but maybe first let's have a look at the model and then come up with concrete suggestions

felix: how about timeline expectations from the lider project

asun: we try to reach agreement quite soon
... during LREC we will approach CLARIN + LRE map people
... trying to involve them in the discussion
... if this community building works we should reach agreement by September
... then additional stuff could be added later

<asungomezperez> Yes, end of July instead of September

jorge: in the last LIDER meeting we said: we'd like to have a draft of the core by the end of July

<asungomezperez> sorry .... too many dates in my head

dave: indeed, that will encourage people to contribute to LD4LT
... having a bit of work done on the core will encourage people from industry to bring in their ideas
... would be good for this community group + the meta-share community as well
... would be interesting, penny, if we forward discussion to the meta-share community as well
... e.g. to see what parts in meta-share are stable / will evolve in the next months etc.
... that will influence the discussion on what should be the core

penny: agree
... we have plenty of ideas to improve the whole thing

Guidelines for migrating existing LR metadata into RDF

dave: from LIDER are there any technical pointers we could provide?
... we have a related LIDER work area "reference architecture"

roberto: on the modeling aspect we could provide experience, e.g. about babelnet > RDF conversion
... at the moment there are only the slides from madrid and a short report

dave: roberto, can you send that to the LD4LT list so that people can have a look?

roberto: sure, will try to structure a bit more and then send it out

jorge: what roberto is working on is the data conversion, but on the agenda there is the metadata aspect

roberto: I could focus on the metadata aspect

jorge: your input roberto on the data aspect we are working on in the bpmlod group would be great

dave: there is the bpmlod group and the "data on the web" w3c best practices group people should be aware of

tomas: that is a W3C working group

<tcarrasco> http://www.w3.org/2013/dwbp/wiki/Main_Page

<tcarrasco> Data on the Web Best Practices Working Group (DWBP WG)

<scribe> ACTION: roberto to send out information on architecture for converting (meta)data into rdf [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action02]

<trackbot> Created ACTION-3 - Send out information on architecture for converting (meta)data into rdf [on Roberto Navigli - due 2014-05-22].

other issues + call time

dave: no discussion of other items today
... call time - what to do?

tomas: slot in afternoon better for US people

asun: agree, afternoon much better for this call for getting US people in
... thursday afternoon is good for me

<asungomezperez> I cannot on tuesday afternoon because of teaching

<scribe> ACTION: dave to set up doodle poll for call time [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action03]

<trackbot> Error finding 'dave'. You can review and register nicknames at <http://www.w3.org/community/ld4lt/track/users>.

<tcarrasco> Multilingual Electronic Dossier (MED) - http://joinup.ec.europa.eu/site/med

<scribe> ACTION: david to set up doodle poll for call time [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action04]

<trackbot> Created ACTION-4 - Set up doodle poll for call time [on David Lewis - due 2014-05-22].

dave: reminder - after dublin there will be locworld workshop feisgiltt
... historically XML focused, this time more linked data focused
... esp. morning of 4th june
... terminology and linked data will be an important topic here too

aob

dave: thanks a lot for participating, great participation in the call
... people, please speak up and contribute, for yourselves and for others, that is what the group is for
... thanks all, bye!

Summary of Action Items

[NEW] ACTION: dave to set up doodle poll for call time [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action03]
[NEW] ACTION: david to set up doodle poll for call time [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action04]
[NEW] ACTION: jorge to fill the gdocs with the current meta-share items [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action01]
[NEW] ACTION: roberto to send out information on architecture for converting (meta)data into rdf [recorded in http://www.w3.org/2014/05/15-ld4lt-minutes.html#action02]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-05-15 09:31:33 $
