RE: Brainstorming: Key Issues

To answer the question on bulk access to VIAF, the current policy is
that somebody needs to request it. So far, there hasn't been significant
demand for the RDF representation.
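
For individual records, the RDF can be retrieved by dereferencing the
record URI with content negotiation. Here is a minimal sketch, assuming
the viaf.org record URIs honor an Accept header of application/rdf+xml
(the identifier below is only a placeholder):

    import urllib.request

    # Minimal sketch: request RDF/XML for a single VIAF record via
    # content negotiation. The identifier is a placeholder, not a real
    # record cited in this discussion.
    viaf_id = "12345678"
    req = urllib.request.Request(
        "http://viaf.org/viaf/" + viaf_id,
        headers={"Accept": "application/rdf+xml"},
    )
    with urllib.request.urlopen(req) as resp:
        rdf_xml = resp.read().decode("utf-8")
    print(rdf_xml[:500])

That works for one-off lookups; bulk files, as noted above, still have
to be requested.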

 

VIAF is a joint project of several national libraries plus selected
regional and trans-national library agencies. See these pages for more
information:

 

http://www.d-nb.de/eng/wir/projekte/viaf_info.htm

http://www.loc.gov/loc/lcib/08012/viaf.html

http://www.oclc.org/research/activities/viaf/

http://www.bnf.fr/en/professionals/dri_partnerships.html#SHDC__Attribute_BlocArticle15BnF

 

Jeff

 

From: public-xg-lld-request@w3.org [mailto:public-xg-lld-request@w3.org]
On Behalf Of Ross Singer
Sent: Tuesday, February 22, 2011 6:19 AM
To: Karen Coyle
Cc: public-xg-lld@w3.org
Subject: Re: Brainstorming: Key Issues

 

Here are some other ideas, some related to Karen's:

1) Where to start?  To convert a dataset of any significant size, we'll
need name authorities, subject thesauri, controlled vocabulary terms,
etc.  If everyone does this in isolation, minting their own URIs, etc.,
how is this any better than silos of MARC records?  How do institutions
the size of University of Michigan or Stanford get access to datasets
such as VIAF so they don't have to do millions of requests every time
they remodel their data (a rough sketch of such per-record lookups
follows this list)?  How do they know which dataset to look in for
a particular value?  What about all of the data that won't be found in
centralized datasets (local subject headings, subject headings based on
authorities with floating terms, names not in the NAF, etc.)?

2) How do we keep the original data and linked data in sync?  If changes
happen to the linked data representation, how do we funnel that back
into the original representation?  Do we even want to?

3) The richer the data, the more complicated the dependencies: how do we
prevent rat's nests of possible licensing issues (Karen raised this as
well)?  Similarly, this web also creates an n+1 problem: there is always
the potential for new URIs to be introduced with each graph; how much
is enough?  How will a library know?

4) How do we deal with incorrect data that we don't own/manage?

5) As the graph around a particular resource improves in quality, how do
these changes propagate to the various copies of the data?  How do
libraries deal with the changes (not only resolving conflicts, but also
keeping up with changes in the data model, re-indexing, etc.)?

6) Piggybacking on Karen's "chicken or the egg" problem, who will be
first to take the plunge?  What is the benefit for them to do so?  In
the absence of standards, will their experience have any influence on
how standards are created (that is, will they go through the work only
to have to retool everything later)?
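
To make the lookup concern in point 1 concrete, below is a rough sketch
of what per-record reconciliation against a central service like VIAF
involves when done one heading at a time over HTTP. It is only an
illustration: the AutoSuggest endpoint and the response field names are
assumptions about VIAF's public search API and may differ in practice.

    import json
    import urllib.parse
    import urllib.request

    # Illustration only: one HTTP lookup per heading. At the scale of
    # millions of records this is exactly the load that bulk access is
    # meant to avoid. Endpoint and field names are assumptions.
    def viaf_candidates(name):
        url = ("http://viaf.org/viaf/AutoSuggest?query="
               + urllib.parse.quote(name))
        with urllib.request.urlopen(url) as resp:
            data = json.loads(resp.read().decode("utf-8"))
        # Each hit should carry a display form and a VIAF identifier
        # that could be reused instead of minting a new local URI.
        return [(hit.get("term"), hit.get("viafid"))
                for hit in (data.get("result") or [])]

    for term, viafid in viaf_candidates("Austen, Jane"):
        print(term, "http://viaf.org/viaf/" + str(viafid))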

-Ross.

On Thu, Feb 17, 2011 at 12:26 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:

This is my kick-off for brainstorming and key issues. I'd suggest that
for the first go-round we not worry about structure or levels of
granularity but just throw out ideas. I'll do my best to keep track
and we can then come back and have a more coordinated discussion.

Karen's list:

1) Community agreement and leadership
 There are many in the community who are not interested in LLD, do not
know about it, or are actually opposed to it. At the moment, there are
no centers of leadership to facilitate such a major change in how
libraries think about their data (although IFLA is probably the most
active).

2) Funding
 It is still quite difficult to convince potential funders that this
is an important area to be working in. This is the "chicken/egg"
problem: without something to show funders, you can't get funding.

3) Legacy data
 The library world has an enormous cache of data that is somewhat
standardized but uses an antiquated concept of data and data modeling.
Transformation of this data will take coordination (since libraries
share data and systems for data creation). But before it can be
transformed, it needs to be analyzed, and there must be a plan for
converting it to linked data. (There is a need for library systems to
be part of this change, and that is very complex.)

4) Openness and rights issues
 While linked data can be used in an enterprise system, the value
for libraries is to encourage open use of bibliographic data.
Institutions that "own" bibliographic data may be under constraints,
legal or otherwise, that do not allow them to let their data be used
openly. We need to overcome this outdated concept of data ownership.

5) Standards
 Libraries need to take advantage of the economies of scale that
data sharing affords. This means that libraries will need to apply
standards to their data for use within libraries and library systems.

You can comment on these and/or post your own. Don't think about it
too hard -- let's get as many issues on the table as we can! (I did 5
- you can do any number you wish.)

kc

--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

 

Received on Thursday, 24 February 2011 22:47:07 UTC