LLD XG -- 28 Apr 2011

<edsu> scribenick: edsu

<dchud> thanks TomB

Reports on the status of the main deliverable

http://www.w3.org/2005/Incubator/lld/wiki/DraftReport

<kcoyle> Benefits section: http://www.w3.org/2005/Incubator/lld/wiki/Benefits

<kcoyle> emma: it is important to start the report with a section of benefits that illustrates the value of linked data for libraries

<kcoyle> ... started this with a review of the 42 use cases

<emma> http://www.w3.org/2005/Incubator/lld/wiki/Draft_Benefits

<kcoyle> ... started with a bullet point list, then organized in terms of "benefits for whom?" -- everyone, librarians, developers, organizations

<emma> http://www.w3.org/2005/Incubator/lld/wiki/Benefits

<kcoyle> ... then wrote summarizing text

<kcoyle> ... main benefit is that everything will have a URI so it can be referenced and de-referenced

<kcoyle> ... and will make it possible to pull together data

<kcoyle> ... then benefits for different users, like researchers, etc.
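
A minimal sketch of the point about URIs making it possible to pull data together, assuming two hypothetical sources that describe the same book. The example.org URIs, the sample values, and the describe helper are illustrative assumptions and do not come from the draft report; the property URIs are Dublin Core terms.

```python
# Hypothetical data: two independent sources that use the same URI for a book.
BOOK = "http://example.org/book/1"

library_triples = {
    (BOOK, "http://purl.org/dc/terms/title", "Moby-Dick"),
    (BOOK, "http://purl.org/dc/terms/creator", "http://example.org/person/melville"),
}
publisher_triples = {
    (BOOK, "http://purl.org/dc/terms/issued", "1851"),
    (BOOK, "http://purl.org/dc/terms/title", "Moby-Dick"),  # same statement, stated twice
}


def describe(uri, *sources):
    """Collect every (property, value) pair stated about uri in any source."""
    merged = set().union(*sources)  # identical statements collapse into one
    return sorted((p, o) for s, p, o in merged if s == uri)


for prop, value in describe(BOOK, library_triples, publisher_triples):
    print(prop, value)
```

Because both sources name the book with the same URI, merging is a plain set union and needs no record matching step.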

<scribe> scribenick: edsu

Issues

kcoyle: we began with the use cases, and extracted from them all the issues and problems that were identified
... we brought these together and came up w/ 3 different categories: management, collaboration and extending of standards, library standards themselves

http://www.w3.org/2005/Incubator/lld/wiki/Draft_issues_page

kcoyle: libraries by their nature work in a stable and somewhat unchanging environment, and how this affects making changes to linked data: price, rights ... and how libraries have large amounts of data already, and how this needs to get translated into this new format
... that's the high level

TomB: to elaborate on this point of translation: we want to make reference to different design decisions that can be made in translation, but we don't want to go into too much detail

Available data: vocabularies and datasets

http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset

marcia: in general we have two main parts

marcia: metadata element sets (rdf schemas, owl ontologies)

<TomB> http://www.w3.org/2005/Incubator/lld/wiki/File:LLD-MetadataElementSetCloudMock.png

marcia: there is a plan that antoine will draw a picture of how metadata terms are reused by each other
... the 2nd major part includes the value-vocabularies and datasets
... the idea is to use the linked-open-data registered in the ckan to show what is relevant for library linked data
... value vocabularies can be used to cover entities and subject vocabularies
... most of the vocabularies are mentioned in the use cases
... the part we haven't finished yet is on published datasets

<emma> LLD on CKAN : http://ckan.net/package?groups=lld

kind of interesting too: http://semantic.ckan.net/group/?group=http://ckan.net/group/lld

antoine: that's the work of william

<marcia> http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset

antoine: we plan on having a specific section on vocabulary datasets, but we have not yet made progress on it
... the idea would be to start with a summary, to start with the most representative vocabularies/ontologies and value vocabularies
... e.g. for frbr there are several ones, we would identify the issues: when there are more than one, and when there aren't any
... we have been working on the side deliverable to help us identify the issue first

Relevant Technologies

<jeff_> http://www.w3.org/2005/Incubator/lld/wiki/Draft_Relevant_Technologies

<scribe> scribenick: kcoyle

Jeff: Trying to explain linked data is a challenge -- i've written a few paragraphs to try to explain the relevant technologies
... but it's not just a question of having new tools; have to use domain-specific technologies, etc.
... the relevant technologies are allowing us to create the infrastructure; it's not tools, but it's about taking the data we have today
... and mapping to these new technologies; leaving our current infrastructure in place
... sees 3-4 different categories of things happening; like take existing relational databases and map those to technologies
... can store new data in new ways that aren't as hard to map as our old schemas
... use OWL-based design technologies; there are tools to help us do that development
... then there is the controlled vocabulary level, such as SKOS; not classes or properties, but usable vocabularies
... modeling question between what things are best described in OWL and what in SKOS
... much of this gets off-loaded to W3C as the keeper of RDF / Semantic Web standards
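
One way to picture mapping an existing relational database to the new technologies while leaving it in place is the sketch below. It assumes a hypothetical book table, a hypothetical example.org URI base, and a hand-written mapping to Dublin Core terms; it is not the tooling discussed on the call, only an illustration of the idea.

```python
import sqlite3

BASE = "http://example.org/"        # hypothetical URI base
DCT = "http://purl.org/dc/terms/"   # Dublin Core terms namespace


def escape(text):
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def rows_to_ntriples(conn):
    """Read rows from the (hypothetical) book table and emit N-Triples lines."""
    lines = []
    query = "SELECT id, title, author_id FROM book ORDER BY id"
    for book_id, title, author_id in conn.execute(query):
        subject = f"<{BASE}book/{book_id}>"
        lines.append(f'{subject} <{DCT}title> "{escape(title)}" .')
        lines.append(f"{subject} <{DCT}creator> <{BASE}person/{author_id}> .")
    return lines


def demo_connection():
    """Build a tiny in-memory database standing in for a legacy system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (id INTEGER, title TEXT, author_id INTEGER)")
    conn.execute("INSERT INTO book VALUES (1, 'Moby-Dick', 7)")
    return conn


for line in rows_to_ntriples(demo_connection()):
    print(line)
```

The database is only read, never altered, which matches the point about keeping current infrastructure in place.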

<Zakim> edsu, you wanted to ask about using web frameworks and rdfa

<TomB> Edsu: A-ha moment for me: Django tools made it easy to create a Website with URIs - that I could use that for publishing RDF too.

<keven> is there any tech (or policy guidelines) that can be used to keep the linkage in linked data (esp. those used in the namespaces) more sustainable, like cache technology? Failure to maintain the links in linked data is quite fatal.

<antoine> @keven: sthg like that? http://dsnotify.org/

<keven> ok thanks to antoine. i'll look into that

<TomB> ...Cobble together some RDF/XML. Karen in Open Library: Web publishing framework - created templates that would generate RDF.

<TomB> ...Seems overwhelming when people discuss SW tech stack - "convert all your data", "you need a SPARQL endpoint" - developers tune out.

<TomB> ...Legacy systems that we have. Do not have to discard to do something useful.

<TomB> Jeff: In my case, played with Rails - still doing domain models - object-oriented classes - variables get mapped to database. Tried hard. Could produce RDF that way, but frustrating.

<TomB> ...That's why I like D2RQ - do in two days what I spent six months doing with Rails.

<TomB> Edsu: Opposite experience.

<TomB> Jeff: Maybe walk thru the steps I took. Compare scaffolding languages. Important that we be able to do with data we have. Chance to start to migrate.

<TomB> Edsu: RDFa. Rails and Django.

<TomB> Jeff: But Grails has default URI pattern. Now you're stuck. URIs a huge problem - designing good ones.

<TomB> Edsu: Haven't had any trouble - optimized for defining URI spaces. Get you thinking about resources and how am I naming them. Web developers looking at this section would want to see this.

<TomB> Jeff: Compare approaches.
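
The exchange about web frameworks and URI spaces can be sketched without any particular framework. The example below is an assumption-laden illustration, not Django, Rails, or Grails code: the /book/<id> pattern, the BOOKS dictionary, and the respond function are hypothetical, and the Accept handling is a naive substring check that ignores q-values.

```python
import re

# Hypothetical URI space: /book/<id> names a book, whatever storage sits behind it.
BOOK_URI = re.compile(r"^/book/(?P<book_id>\d+)$")

BOOKS = {"1": "Moby-Dick"}  # stand-in for a legacy database


def respond(path, accept="text/html"):
    """Return (status, content_type, body) for a request path and Accept header."""
    match = BOOK_URI.match(path)
    title = BOOKS.get(match.group("book_id")) if match else None
    if title is None:
        return 404, "text/plain", "not found"
    if "text/turtle" in accept:
        # Same resource, same URI, served as a single RDF statement in Turtle.
        body = (f"<http://example.org{path}> "
                f'<http://purl.org/dc/terms/title> "{title}" .')
        return 200, "text/turtle", body
    return 200, "text/html", f"<h1>{title}</h1>"


print(respond("/book/1"))
print(respond("/book/1", accept="text/turtle"))
```

The URI pattern is chosen once and both the HTML page and the RDF hang off it, which is the sense in which a framework gets a developer thinking about resources and how to name them.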

<edsu> scribenick: edsu

keven: are there any policies for keeping the linkage in linked data, e.g. which namespaces are used, using cache technologies to help maintain links

jeff_: caching is normally for network efficiency ; the domain not being supported anymore is a bit different
... imagine dbpedia going away ... i don't know what the answer is
... publishing the information in bulk can help

<keven> thx anyway

TomB: any more questions can be typed into IRC

<Zakim> edsu, you wanted to mention 301

<keven> do you have any comments on drupal used for linked data application?

<marcia> ed: big search engines look at things that move

<keven> we plan to have a try on drupal to publish some experimental biblio data

<jeff_> The PURL server can help too. Somebody could step in.

<marcia> ed: this is the architecture of the Web issue

<TomB> Edsu: Do a 301 redirect when a site moves permanently to another location. People who care about link integrity - don't want to serve up dead links - part of the architecture. Link rot. Identifiers break. They do not give the URI enough respect.
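
A minimal sketch of the 301 approach using Python's standard http.server module. The MOVED table, the paths in it, and the resolve helper are hypothetical; a real site would more likely configure redirects in its web server.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping from retired paths to their permanent new locations.
MOVED = {"/old/book/1": "http://example.org/book/1"}


def resolve(path):
    """Return (status, headers): 301 with a Location if the path moved, else 404."""
    target = MOVED.get(path)
    if target is not None:
        return 301, {"Location": target}
    return 404, {}


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer GET requests for retired URIs with a permanent redirect."""

    def do_GET(self):
        status, headers = resolve(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
```

To try it locally, the handler can be passed to http.server.HTTPServer, for example HTTPServer(("", 8000), RedirectHandler).serve_forever(). Clients that follow the redirect still reach the resource through the old identifier.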

General discussion

<lukose> are there any guidelines for representing and linking the "DataSet" and the "Model" used in producing the results outlined in a scientific publication, to the "meta-data" of the publication?

<lukose> yes

kcoyle: is this about the underlying data?

<kosuke_> @keven are you using this module? http://drupal.org/project/linked_data

<lukose> absolutely correct!

TomB: so linking a scientific publication with the data used

<marcia> tom: this is about linking sci publication with the data used to describe the publication

http://datacite.org/

<marcia> ... is there a standard way to link the two?

<keven> @kosuke: yes

http://www.dlib.org/dlib/january11/starr/01starr.html

<marcia> ed: someone sent a link to this article

<emma> suggest to look at http://www.w3.org/2005/Incubator/lld/wiki/Cluster_Citations

<TomB> Edsu: Link to D-lib article in January - looking at this problem. Looking at LD approaches to linking data to publications. A consortium that started in 2009.

<TomB> ...Herbert van de Sompel - OAI-ORE.

<TomB> Jeff: Hard time understanding OAI-ORE - aggregations nice, but what are its boundaries? How do you draw those boundaries.

jeff_: hard to imagine what the boundaries of aggregations are in oai-ore and how to draw those boundaries

antoine: i think ore could be used, but there is no standard way to use it to link articles to datasets
... i think it's still an active topic of research

<TomB> Antoine: ORE could be used but there is no standard way to use it for linking articles to datasets. Still a topic of research. A lot of activity about scientific data. Have not heard about standard ways.

antoine: i've not heard of standard ways, but there are lots of things happening

lukose: good question :)

<kcoyle> just found this: http://www.std-doi.de/front_content.php

<lukose> ok, thanks guys.... this is an interesting challenge...

<antoine> could be interesting to mention in report!

antoine++

TomB: perhaps you could consider mentioning it in your section?

<jeff_> The Dryad project at UNC Chapel Hill is working on relating scientific publications with scientific data sets

<marcia> D-Lib article: http://www.dlib.org/dlib/january11/starr/01starr.html

<marcia> D-Lib: isCitedBy: A Metadata Scheme for DataCite

antoine: i think it's more of a research area

<jeff_> http://datadryad.org/

edsu: might make sense to capture it as a possible vocabulary gap

<marcia> D-Lib issue on research data: http://www.dlib.org/dlib/january11/01contents.html

TomB: we need to have a good elevator pitch, or top-level story
... one problem we have is that libraries have changed technologies many times
... the movement to linked data could look like another one

<marcia> * antoine, maybe we need to add that metadata scheme even though no use case

TomB: we want to convey that there is a paradigm shift from record-based data to statement-based data

TomB: the report is targeted at decision makers, who will be in a position to set policy within their organisations
... any final questions in the 7 minutes remaining?
... any comments from malaysia, china and japan on how the linked data idea is being perceived, and what sort of arguments do we need to put into place in order to convince decision makers that this is something they should devote some resources to

<marcia> Tom: do you want to talk: Recommendations (Karen, Tom) http://www.w3.org/2005/Incubator/lld/wiki/Draft_recommendations_page

hideaki: is it for leaders of libraries and museums?

<kcoyle> and top level managers, no?

TomB: yes

<TomB> Hideaki: in Japan. To decision-makers, we often have to explain benefits of RDF. Prefer to have simple explanations.

TomB++

<lukose> my challenge is in creating awareness of the LOD developments around the world, to our local lib (national archive, national lib, etc....), so I am conducting workshops...the next challenge is the benefits of this to the organization.

<marcia> TomB: that is exactly what we are trying to summarize in just 3-4 pages

<marcia> .. the benefits for different categories

<Zakim> antoine, you wanted to comment on reviewing or contrib to recs

TomB: that's why we're trying to boil down the high level benefits for different groups

<keven> usually decision makers in library circles are used to adopting turn-key solutions. they don't care about the linked data technology. so the benefits are what is important for raising their awareness. for the techie people they need tools, tools, tools.

<marcia> .. for librarians, developers

<lukose> yes, I would very much like to help....

antoine: could hideaki and lukose play a more formal role in reviewing the benefits? since they have to talk to decision makers it would be great to have them look at it

<kcoyle> I also suspect that benefits may vary by country or region... so there may be benefits that we haven't identified?

TomB: currently the benefits section is about 2 pages, it still has some rough edges, but it should be ready to be reviewed by the teleconference next tuesday

<keven> i'd love to take a review on this

<marcia> TomB: the benefits section is very important and to be discussed next week

TomB: since it is so crucial, it would be great if we had your help

<marcia> .. if any of you can volunteer to review, it will be very helpful.

<marcia> .. we may sign reviewers on May 5th

TomB: if you could comment on the mailing list which ones work and which ones don't ; also a review of the recommendations would be helpful

<Zakim> antoine, you wanted to comment on workshops

antoine: one specific point about workshops and education: if anyone has experience with the kind of topics that should be covered in such workshops and what sort of audience to target, it would be really nice to hear it. it turns out gunter may not be able to contribute

<marcia> Antoine: workshops on education, if anyone can jump in to make recommendations that will be helpful

<marcia> .. especially if there is experience

<lukose> I can make some contribution on my experience in doing these lectures and workshops...

<antoine> lukose++

<marcia> *no, ed, I could not see

<kosuke> @antoine excuse me, does "linking articles to datasets" in ORE mean "citation" in this topic?

<marcia> *just try to duplicate

kosuke: yes we did look at that in the context of citation

<antoine> @kosuke: not sure, maybe we could discuss that by email

kosuke: did you run across that wiki page?

<keven> thanks for having me here

<lukose> tq

<emma> thx !

<dchud> thank you!
