See also: IRC log
See also: Summary of F2F outcomes
<emma> Scribe: Lars
<emma> scribenick: LarsG
Introductions:
19 participants
three more people arrive, makes it 22
TomB: basic principles
... it's not like a DC working group, guests are _not_
encouraged to participate unless they have something very
specific to contribute. If necessary, guests please move to the
back
... WiFi is not free, and we have no sponsors. TomB paid for it
himself, so we will pass the hat around
... $300
... agenda is tight, so let's go
Presentations are at http://www.w3.org/2005/Incubator/lld/wiki/F2F_Pittsburgh_UCslides
scribe: We received 42 UseCases (that's the meaning of it)
emma: We try to group
UseCases
... it's OK to twitter about the meeting. Hashtag is #lld
... UseCases at http://www.w3.org/2005/Incubator/lld/wiki/UseCases
kcoyle: There are guests who came for specific use cases. TomB will present those
Preparation of use case descriptions. Distribution of PostIts. Presenters please write names of the use cases they present on them
TomB: presents 3 FAO use
cases
... 1) Agrovoc
TomB: 1980 multilingual
thesaurus, since 2000 an owl ontology, since 2009 SKOS
... 2) FAO authority control
... description at
http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_FAO_Authority_Description_Concept_Scheme
... 1) is at
http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_AGROVOC_Thesaurus
... 3) AGRIS. Description at http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_AGRIS
emma: the three UCs fit together since they are from the same organisation. For clustering purposes, it might be better to group them differently.
kcoyle: One large piece of paper with the topic, and then move PostIts with UC names around until we're satisfied
one flipchart per UC area
Antoine will consolidate all UC presentation slides into one presentation and upload it to the wiki
scribe: all presenters please mail their slides to Antoine
Jeff Young: UC Authority Data Enrichment (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Authority_Data_Enrichment)
scribe: authority data used to
collocate information, need to consolidate
internationally
... goal: enrich authority data by linking in and out
... how can we remodel the LinkedData back into MARC
... how far can we re-use existing vocabularies and how much do
we need to define ourselves
Jeff Young: UC Open Library Data (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Open_Library_Data)
scribe: Open Library has much
bibliographic information from different sources (people,
Amazon). It's not in MARC but key-value pairs
... problems: forms of personal names not preserved, no
subfield structure preserving structure of data
... concepts (subject authority data) are probably more user
friendly and less librarianesque
... they use FRBRish structure
... one goal just to present the data as LinkedData and see if
it's useful
... vocabularies used: owl, skos, foaf, frbr, rdvocab,
dcterms
TomB: if UCs don't have a list of used vocabularies, we should add that to the UC
kcoyle: four cases with authority
data
... 1) AuthorClaim (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_AuthorClaim)
... goal: try to identify authors and encourage authors to use
the same name form in future, so that authors can find
themselves in the database
... vocabulary: METS
<michaelp> METS, I believe
3) VIAF (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Virtual_International_Authority_File_(VIAF))
scribe: vocabularies viaf, owl, skos, foaf, frbr entities, frbr elements, dcterms
<edsu> http://www.vivoweb.org/ would've been a nice use case to have in this area ...
scribe: makes sense to cluster it
with Jeff's UC
... i. e. UC Data Enrichment (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Authority_Data_Enrichment)
alex: UC DNB Linked Data
(http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Linked_Data_Service_of_the_German_National_Library)
... Service in prototypical state
... topics: alignment (DBPedia, Wikipedia, VIAF)
... vocabularies: rda, foaf, relationship vocab, gnd (dnb
internal),
kcoyle: NEP: New economic paper is the same as author claim
Thus we have 41 UCs
emma: does GordonD want to cluster with Jeff (Open Library Data)
GordonD: three different
clusters
... 1) Language technology (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Language_Technology)
... problem: different library communities use different
terminology (access points)
... no real authority control for subjects
... differences include language (multilinguality), authority
terminologies and notations, uncontrolled terminology (natural
language)
... need: link terms from different languages (singular/plural
etc). Translate user input into controlled terminology
... LinkedData allows term-by-term matching (if the vocabulary
allows it...)
... also issue with compound vs simple terms (broader/narrower,
part/whole)
... Translation architectures:
... * one2one: translate term in vocab1 to exactly one term in
vocab2 (scalability issues)
... * Hub-spoke: One vocabulary as hub. Issue: What to choose as
hub? Issue: semantic drift between spoke vocabularies
... examples:
... * Vocabulary mapping framework (hub-spoke) http://cdlr.strath.ac.uk/VMF/
... * HILT (hub-spoke) using DDC as hub http://hilt.cdlr.strath.ac.uk/
... HILT experimented with multilinguality and it seemed to
work
... * MACS (one2one) SWD, LCSH, Rameau, DDC http://www.d-nb.de/eng/wir/projekte/macs.htm
... 2) UC Library Address Data (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Library_Address_Data)
... libraries to publish information about themselves as linked
data to allow identification, perhaps including
collection-level data
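A minimal sketch of the scalability point raised above for one2one versus hub-spoke translation, assuming each pair of vocabularies needs one maintained mapping (a simplification; the function names are illustrative and not from any of the projects mentioned):

```python
def one2one_mappings(n: int) -> int:
    """Mappings needed when every vocabulary is mapped directly to every other."""
    return n * (n - 1) // 2


def hub_spoke_mappings(n: int) -> int:
    """Mappings needed when one of the n vocabularies acts as the hub."""
    return max(n - 1, 0)


for n in (4, 10, 50):
    print(n, one2one_mappings(n), hub_spoke_mappings(n))
# e.g. 10 vocabularies: 45 direct mappings versus 9 via a hub
```

The count says nothing about mapping quality: the hub-spoke saving comes at the cost of the semantic drift issue noted above, since two spokes are only related through the hub.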
emma: topic of morning session is to identify the clusters
TomB: then analyse the clusters
one by one
... we hear recurring themes
Marcia to present on Vocabulary Merging (http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Vocabulary_Merging)
marcia: if users find things in a
local service or a tag cloud
... a vocabulary service to relate terms
... vocab merging service to work at the back end e. g. as a
super structure
... sometimes actual merging, sometimes switching system
... mapping of user terms (synonyms) to vocabularies
... presentation of different projects: HILT, MACS, OCLC
terminology services
... UMLS metathesaurus (creating a superstructure) over 1mill
concepts and 4.3 mill concept names
... there, concepts have unique URIs
antoine: who does what in this UC? What does the process look like?
GordonD: It's about terminology services. Black box: Service takes a user term and maps that back to a particular terminology a catalogue/community uses. It's transparent to the user: They enter a term and get a bunch of terminology back they can use in specific services.
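The "black box" described here can be sketched as a lookup from a user term to the preferred term of one controlled vocabulary. This is only an illustration: the synonym table and the function name `to_controlled` are invented for the example, and a real terminology service would sit behind a web API and cover several vocabularies.

```python
# Hypothetical synonym table: normalised user term -> preferred controlled term
SYNONYMS = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "automobile": "Automobiles",
    "movies": "Motion pictures",
    "films": "Motion pictures",
}


def to_controlled(user_term):
    """Map free-text user input to a controlled term, or None if unknown."""
    key = user_term.strip().lower()
    return SYNONYMS.get(key)


print(to_controlled(" Cars "))
print(to_controlled("zebra"))
```

The user never sees the table; they enter a term and get back terminology they can use in a specific catalogue.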
marcia: It's much a silo
GordonD: DDC and UDC do the same
thing but don't talk to one another, but there's rapid
progress
... do you need a terminology service layer to organise the
LinkedOpenData
... good example of statistical mapping technique in the
DDC/LCSH mappings from WorldCat
michaelp: through consistent use of URIs we can get the whole cluster.
<edsu> also nat'l diet library: http://id.ndl.go.jp/auth/ndlsh map their subject terms to lcsh now as linked data / skos
emma: interesting discussion, but
we're pushing the break
... postpone discussion
Break, 1/2 hour
<michaelp> Scribe: MichaelP
<emma> Scribenick: michaelp
Gordon: UC Library Address Data
GordonD: Libraries to publish
information about themselves for identification
... this can be subsumed under collection-level
description
... There is a DCMI AP for this which could be used
... In a LOD environment this still has to be triplified
... This type of collection-level metadata allows for
pre-search filtering and inform decision of users
Jeff: VCard could be used for this.
edsu: Martin has already done this in Sweden.
GordonD: We have this in a DB but
not as linked data. We need advice.
... We want to link Sweden up with Scotland and the US. Sounds
crazy, but is important for travelers.
... and cross-cultural researchers.
Alexander: Accessibility is key here. The accessibility of e.g. digital documents is in scope here.
GordonD: Also availability of assistive technology is important info here.
UC: Bibliographic Network
... Seeking the use of FRBR to bring metadata components
together.
... Matching and deduping is another task in large-scale
aggregations.
... Background issue to this cluster: data in catalogs is
heterogeneous.
... But users want a homogeneous discovery interface.
... Linked data help by breaking these records down into
components.
... Some statements will be the same.
... Focus shifts from the record to the statement.
... Deduping can happen at a much lower level.
... We need to get to the triples from the legacy records.
There is a lot of work going on in this area.
... Main barriers:
... Need to find identification methods.
... Matching URIs, establishing equality of
sub-properties.
... Comparing values; Dewey numbers same as Dewey caption?
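A minimal sketch of the shift from record to statement described above, assuming the two records have already been matched to the same work URI (which is exactly the identification barrier listed). The records, the URI and the property names are made up for illustration.

```python
def record_to_statements(work_uri, fields):
    """Break a flat record into (subject, property, value) statements."""
    return {(work_uri, prop, value) for prop, value in fields.items()}


WORK = "http://example.org/work/1"  # hypothetical identifier

rec_a = {"title": "Moby Dick", "creator": "Melville, Herman", "date": "1851"}
rec_b = {"title": "Moby Dick", "creator": "Melville, Herman", "language": "eng"}

a = record_to_statements(WORK, rec_a)
b = record_to_statements(WORK, rec_b)

shared = a & b   # statements both sources agree on
merged = a | b   # deduplicated at statement level, not record level

print(len(shared), len(merged))
```

Identical statements collapse automatically; statements that differ only in how a value is written (a Dewey number versus its caption, say) do not, which is the value comparison barrier noted above.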
TomB: Do these fit into the same category?
GordonD: They are all about
record identification.
... But they are still multidimensional in terms of the way we
have split up the topics.
<Jeff> http://www.w3.org/2005/Incubator/lld/wiki/Topics
TomB: My use cases fell into different topics.
emma: We don't have to do the clustering today.
TomB: If a UC has three salient topics, there should be a sticker in each category for the UC.
GordonD: I would leave it like it is at the moment; we can go back and look at the aspects of UCs in relation to topics later.
TomB: We now try to identify the key topic. We break up the aspects later.
kcoyle: Open Library UC has some FRBR aspects to it.
TomB: Ok, we place it into LLD SW Technologies category.
UC: Subject search
antoine: Better use of subject
vocabs for web search.
... Subjects, works, web pages about subjects and works
... The case addresses all of these aspects
... The scenario allows the user to select a controlled subject
that the system has selected.
... Requirements/Linked Data: Availability of vocabs on the
web.
... and use of identifiers.
... Issues: Human readable URIs
... URI patterns for real-world objects.
... Also, there might be a difference in the view of the concepts
of the concept provider vs. the user of the info.
... Another issue is the presentation of simple subjects
(user-friendly)
... Vocab merging is another issue.
... Cluster: It is about authority data and bibliographic
data.
Jeff: What I was trying to say in
that UC is that by modeling these systems as linked data we can
use web search technology like Google to do web searches with
controlled vocabularies.
... Leveraging Google for semantic purposes
kcoyle: Would that put it in the Semantic Web section?
Jeff: Semantic Web
environment
... Ok
Antoine: UC Digital
preservation
... Goal is to support planning and realization of digital
preservation
... Two kinds of data: technical data and preservation
processes and agents.
... Some vocabs of interest: Preservation vocabs from LC
... OAI-ORE
... DOAP: Description of a project
... Scenario: Finding objects based on preservation criteria,
tracking and checking preservation actions.
... Value of LD technology: linking items, sharing data across
organizations.
... Two main issues: Scalability and persistence; coverage of
existing vocabs incomplete.
... No related UC, but the data could be used in other UCs than
preservation.
... Cluster: Data management?
emma: Non-bibliographic
information?
... We could cluster together with recollection, but the issue
is completely different.
... but the same context.
antoine: We put this UC in
non-bibliographic data.
... UC: Publishing 20th century press archive
<TomB> Antoine: Provide every item of this collection a persistent identifier for citing.
antoine: General goal: provide
for every item a persistent identifier.
... Support the use of a standard metadata viewer.
... Kind of data: bibliographic data + context data
... Scenarios: User interacts with the system using provided
metadata
... search and browse
<edsu> just added CDL's Merritt digital repository software to the digital preservation use case, since they use linkeddata for coordination of curation services: http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Digital_Preservation
<emma> thx edsu !
antoine: User can then view the
images of the pages with the standard viewer.
... Also, info from other sources is pulled in for the end
user.
antoine: There is also a back-end
service side that focusses on harvesting
... Value of LD technology:
... Good vocabs available
... Availability of external sources as LD
... RDFa for machine/human publication of metadata
... Vocabs: ORE, SKOS, FOAF, RDA (persons), EXIF
... Issues:
... Representation of ad hoc aggregations
... end-user display of rich data aggregations
... Capturing the order of documents. Big problem in RDF
... There are only cumbersome solutions available.
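To illustrate why ordering is cumbersome in RDF, here is a sketch of one standard option, the RDF collection (rdf:first / rdf:rest / rdf:nil), written with plain tuples rather than an RDF library. The page URIs and blank node labels are invented; the minutes do not say which solution the press archive actually uses.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def as_rdf_list(items):
    """Return (head_node, triples) encoding items as an RDF collection."""
    if not items:
        return RDF + "nil", []
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples


pages = [f"http://example.org/page/{n}" for n in (1, 2, 3)]
head, triples = as_rdf_list(pages)
print(len(triples))  # two triples per page just to record the order
```

Three pages need six triples and three extra nodes, and finding "the page after page 2" means walking the chain, which is the kind of overhead meant by "cumbersome".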
<LarsG> added PRONOM as vocabulary for digital preservation http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Digital_Preservation#Related_Vocabularies_.28optional.29
antoine: Related UC: NDNP (Chronicling America), Europeana, VIAF
<edsu> here's an example of martin's linked data for library institutions: http://libris.kb.se/resource/library/S
TomB: Please squeeze in Europeana here.
antoine: I don't think so. It touches many different aspects of several cases.
TomB: Europeana is a
mega-case!
... Can we present NDNP now?
emma: We had a presentation from Ed on the telecon.
TomB: OK, so we just cluster it.
<edsu> Scribe: Ed Summers
<edsu> ScribeNick: edsu
<michaelp> Scribenick: edsu
:-)
antoine: Digital Text
Repositories
... linking texts to authors and other contextual
resources
... there are some repositories that curate at level of books,
and some that will curate at different levels, portions of
books, poems, etc
... there was some frbr mentioned, digital editions as
manifestations
... linking is useful for authors, topics and to existing
descriptions from external sources ; to make cataloging
faster
... also to enable citation
... also automatic alignment tools could be of use, for
suggesting links in the text to other linked data
resources
... linkeddata useful for adopting and sharing identifiers, and
possibly for representing provenance data
... related to the open library data, subject search, and
bibliographic network use cases
... not sure where to fit it in precisely
emma: we can create a topic if necessary
antoine: it seems bibliographic
emma: it seems to be about using library data that's used elsewhere
antoine: ok let's put it under USE.Consuming and using library data
kai: Citation of Scientific Data
Sets
http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Citation_of_Scientific_Datasets
... there is gaining interest in making data associated with
research available
... in some domains there are some best practices, but they
aren't globally identifiable
... focused on making the data citable
... there are 3 use cases
... 1) verification of research
... 2) find publications based on a dataset
... 3) reputation system to provide incentives for researchers
to make their data available and citable
... a citation is nothing but a link, and they want to link the
data so it's relevant for Linked Data
... an interesting case is if the data itself is linked
data
... maybe the distributed nature to it, fits linked data as
well
... possibly a future role for libraries: making data
available
... existing work in the healthcare/lifescience work
... it's a cross domain problem, not very easy to define
requirements
... we have different roles for people that are part of the
process: authors, reviewers, etc
... there's no existing vocabulary for doing this
... may need to link the citations in publications as well
antoine: there is the need to
reference an article in a newspaper
... in some other use cases
kai: i'm not sure how to classify the use case: maybe library data ; but also handling digital objects
antoine: is it also connected to the authorclaim case?
kai: yes
... it relies on authority data
... especially for people
emma: are you looking to enhance publication?
kai: yes
TomB: where are we going to put it, which category?
kai: is citation the main aspect, or scientific data?
oai-ore was kind of designed for this use case btw: http://dlib.org/dlib/october06/vandesompel/10vandesompel.html
kai pins the tail on Citation
markva: Enhanced Publications
UC
... aggregates of papers, chapters, datasets
... contributed by the SURF foundation where they have 4
projects where they actually implemented it
... fits in with what kai just presented
... they've been using foaf, oai-ore, dctypes, dcterms
... i have some questions about what the use case is
about
... are they annotating the content?
... otherwise very little added on top of ORE
... i think it should be clustered with citation scientific
data uc
antoine: it also seemed kind of bibliographic too, focused on the publication
markva: it's focused on aggregates
kcoyle: kind of background information
markva: they have high res geological images that they would like to include
kcoyle: part of that is a data
management issue; how do you make sure you store things and can
assemble them again
... why don't you put it in Data Management
markva: Mapping Scholarly Debate
UC
... modelling rebuttals, reactions, disagreements ; to capture
evolution of thought
... the schemas are frbr like (work/manifestation) ; i wasn't
able to access the schema
... it would be very useful to link to the actual schemas so
you can see what people have been doing
... they have an implementation at bibliographica.org ; i
couldn't drill down to the relationships ; wasn't clear if it
is work that they would like to do, or have done
... could be relevant to the Digital Text Repository UC
... also NDNP UC, 20th Century Press Archives UCs
kcoyle: seems relevant to
citation
... the *why* of citation
TomB: i think there might be overlap with linking across datasets
kcoyle: i think in the end we'll have things in multiple places
antoine: we could go back to the owner to figure what vocabulary they use, since william is in the IG
TomB: it's 12:30 so it's
lunch
... ray, lars, emma still have to present
rayd: Radio Station Archive
Digitization UC
... current practice is that audio programs aren't often
digitized, little metadata ; the goal is to enable cross
references, and search
... the scenario is about an archivist who is creating and
annotating the digital versions
... linked data is useful for subclassing dc:identifier,
creating new vocabulary for interviewer, people, etc
... there is little guidance for creating metadata about audio
recordings, and provenance information (who created various
things)
... also seem to be missing vocabulary for documenting
uncertainty
... it all boils down to a vocabulary problem
... a vocabulary for radio programming
kcoyle: it sounded like building an internal system
rayd: i didn't get that sense that it was internally focused
kcoyle: it is almost identical to the linkeddata discussion we had around someone from pbs who was creating vocabulary for programming
edsu: also the work that the bbc are doing
emma: LOCAH Project and Photo
Museum UCs
... they have a connection because they are both about archival
material
... the materials in archives are generally unique, in high
quantities, and multiple content carriers
... the challenge is to get common view of these materials, so
that they can be found
... they have hierarchical descriptions, contextual information
is very important
... ordered sequences, which are more difficult in RDF
... sometimes the data is semi-structured, and there are
quality issues (similar to radio archive)
... they want linkeddata to provide a hub, to make it easier
for users to get to the materials, and related materials via
the context
... linking to dbpedia, library content, library
authorities
they used dcterms, bibo, foaf, skos, rdfs, frbr
emma: but the use of bibo wasn't
clear, they said they just put it in there
... they aren't working on converting ead to rdf, they are
going back to ISAD(G)
... similar to the FRBR -> RDF efforts, which aren't
oriented around marc
... maybe cluster with radio station archive
... Recollection UC
... an effort from NDIIPP to enable discovery of resources, to
provide a tool to easily aggregate archives, to create
descriptions of them, and to publish as linked data
... could be a bit different because it is a digital
archive
antoine: that case is quite connected to the europeana one
corey harper: just last week there was an interesting thread about generating OWL for EAD
<charper> The thread I mentioned starts here: http://listserv.loc.gov/cgi-bin/wa?A2=ind1010&L=ead&T=0&P=1910
antoine: yes, i've been involved in one of those things
charper: it's strange because it's more a document format for finding aids
GordonD: there was a meeting in helsinki about the archival community's search for a data model that connects up with libraries and museums
kai: are they going to publish it as an ontology
GordonD: on the CIDOC/CRM site it
is published as rdfs
... it's an evolving supermodel across libraries/archives and
museums
LarsG: PODE UC
... it's about pulling together linked data
... wikipedia, project gutenberg
... phase 1 is about frbrising, mashing library data through
web service apis
... 2nd phase is about finding non-fiction material via links
to external datasets
... marc records are very inconsistent, 40 years of doing
things sort of the same way
... also dewey.info is only summaries
... uses frbr, dc, bibo, lexvo, geonames, foaf, skos
... not really sure about what people want to use the data
for
... i would put it under USE
antoine: also related to
bibliographic network
... perhaps what we should do later is flag the more user
oriented ones
See post-meeting cleaning: Outcome of the use case discussion
TomB: ok, it's time to break for lunch
Meeting resumes after lunch
<emma> Scribe: Jeff
<emma> scribenick: jeff
antoine: looking at vocabularies that are being used and how they can be aligned
alexander: still intend to look at requirements?
antoine: yes, look at requirements first
<jodi> hi! I'll just be popping in while I'm online this weekend. :)
antoine: do the vocabularies we have do what we want and where are the gaps?
alexander: requirements should include sparql and protocol into the discussion?
antoine: focus on vocabularies
first and then talk about other requirement issues
... gordon wrote document about library standards and linked
data
... but start first with use cases and look at vocabularies
they're using
... start with bibliographic data vocabularies
bib networks
<marma> http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Bibliographic_Network
<antoine> -> Use Case Bibliographic Network
gordon: bibo and frbrcore
gordon: concerns of frbrcore, including modeling mistakes
karen: but it was the earliest and frbrer is only around 3 weeks
emma: but frbrcore is being used outside the library community
tomb: is persistence of frbrcore a concern?
gordon: ifla frbrer can be trusted with persistence. unlike frbrcore
dianeH: persistence and ownership
is critically important esp. for larger libraries
... not willing to invest in ontologies they don't trust
edsu: I'm willing to trust frbrcore, but it's behind the scenes
<emma> +1 with dianeH's statement
karen: frbrcore was published before FRBR was cooked
kcoyle: encourage groups dragging their feet to realize people want to use these ASAP
antoine: what about bibo? is there a relationship with FRBR?
kcoyle: bibo is more about
academic articles and citations
... and journal articles
bibo uses frbrcore, dc, and a mashup of other vocabularies with some additions
bibo and frbr could be derived from the same underlying data
edsu: bibo is concrete and intuitive and that's a useful thing
karen: looking at bibo, they don't include frbr
<edsu> kcoyle is right (i was wrong) bibo doesn't use frbr at all
<edsu> looks like bibo uses: dcterms, foaf, vann, owl, skos, event, prism
mpanzer: they're more interested in a citation perspective
martin: casual users will be
attracted to bibo
... mapping between frbr and bibo is a useful thing
gordon: true. Who's responsible for dealing with this mapping?
antoine: the LLD XG wiki could be used to list vocabularies and maintain links between them
TomB: LLD XG could provide guidelines for others to maintain links to vocabularies rather than expecting them to be managed centrally
tom: mapping relationships between different vocabularies that are constantly evolving is a complex process
corey: the expertise in this room can help explain how others can connect their vocabularies to others
gordon: the issues of cross relationships becomes a problem of institutional agreements and politics
gordon: who could manage these:
IFLA, W3C and ...
... DCMI
gordon: cultural shift that needs to happen to open world movement/assumption. It's a foreign idea still
<TomB> Gordon: cultural shift - orgs rooted in 20th century - open movement - something completely foreign as paradigms. It suits everyone's interest to move into that for the future, because failed in the past.
edsu: an opportunity to create a process for things to incubate elsewhere and then be adopted and developed by major organizations.
major organizations need to be more open to foreign models
NISO has this problem
NISO has a process to move projects from somebody's garage to a managed space
<edsu> TomB and Harry Halpin's paper: http://www.aaai.org/ocs/index.php/SSS/SSS10/paper/view/1140
TomB: vocabulary developers
partner with cultural memory organizations and national
libraries. Partnership where the organization takes over long
term
... this creates a level of trust without imposing too much
early bureaucracy
emma: could the major organizations take the initiative to encourage and nurture promising vocabularies
<marma> one triple, one vote?
<ww> marma: after minimisation? :P
mpanzer: Simply using a vocabulary is an endorsement, but it's still not curation
<TomB> Ed thanks Antoine for opening up the can of worms :-)
<edsu> it's an important can of worms though :-)
antoine: keep track of links from vocabularies to use cases and vice versa
jeff: we can create a database of two way linking
edsu: I'm keeping a tally, but the links would be useful
<edsu> here's the tally i made of vocabs mentioned during the presentations this morning: http://gist.github.com/642570
ACTION: for each use case champion: on the Vocabularies page, link to each URL use case that uses it - see http://www.w3.org/2005/Incubator/lld/wiki/Vocabularies [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action01]
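The two-way linking between use cases and vocabularies suggested above could start as something as small as an inverted index. The sketch below uses three of the vocabulary lists recorded in this morning's presentations (labels lightly normalised); the structure and names are illustrative, not a description of what the wiki will do.

```python
from collections import defaultdict

# Use case -> vocabularies, as reported by the presenters
USE_CASE_VOCABS = {
    "Open Library Data": ["owl", "skos", "foaf", "frbr", "rdvocab", "dcterms"],
    "DNB Linked Data": ["rda", "foaf", "relationship", "gnd"],
    "PODE": ["frbr", "dc", "bibo", "lexvo", "geonames", "foaf", "skos"],
}


def invert(mapping):
    """Build vocabulary -> set of use cases from use case -> vocabularies."""
    index = defaultdict(set)
    for use_case, vocabs in mapping.items():
        for vocab in vocabs:
            index[vocab].add(use_case)
    return index


VOCAB_USE_CASES = invert(USE_CASE_VOCABS)
print(sorted(VOCAB_USE_CASES["foaf"]))
print(sorted(VOCAB_USE_CASES["bibo"]))
```

Keeping only one direction by hand and deriving the other avoids the two lists drifting apart.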
antoine: continue to look at use cases...
TomB: identification and deduplication
Gordon: no vocabularies listed
TomB: Regional catalog/vocabularies
gordon: bibo, FRBR, etc.
gordon: RDF
... problems and limitations: lack of political will,
ownership, rights, finding synonymous identifiers, lookup
service for bibliographic items
Data BNF: skos, foaf, rda
gordon: frbrization is a concern because it makes assumptions in the underlying data
antoine: the data needs to be enriched
gordon: this may be more of an assumption than a reality. Is it a mistake to mix and match vocabularies?
antoine: there are perceptions that specific vocabularies are psychologically difficult to embrace.
kcoyle: what's the goal of identifying vocabularies listed in the use cases? What's the purpose?
antoine: the purpose is to identify the issues and concerns of using vocabularies
gordon: are we imagining difficulties and issues because the vocabularies are or are not being used?
TomB: persistence, mapping, is
good. Scope and limitations may not be so important
... If the ontologies are slow to publish URIs, is that a clue
to complexity and uncertainty? Bounded/unbounded concerns may
be a problem.
... The goal isn't to "review" these vocabularies, just to
identify the issues
kcoyle: RDF vocabularies only, or are other vocabularies in scope?
antoine: assume that non-RDF vocabularies will be developed eventually
TomB: it's worth mentioning potential issues converting vocabularies into RDF
<emma> Scribe : Karen
<emma> Scribenick : kcoyle
library standards time is very long -- three years is a short time (frbr, etc.)
gordon: we may be providing a framework to encourage linking between vocab developers
karen has better sense of what we are doing. we can go on
gordon: polymath case... viaf, lcsh, rameau, linked data services of dnb, Instituto Geográfico Nacional España (IGN), EDM (Europeana Data Model), dbpedia
using lcsh -- is the data set, not a metadata schema
is a controlled vocabulary; emma: we have them on the wiki page for vocabs
that is a big can of worms because of all of the semantic alignments between them. (gordon, and others)
scribe: this may be too difficult
edsu: disagrees, because there
aren't many more than 10 in the library world
... they should be kept separate, but maybe we can get to that
later
jeff: viaf ontology doesn't always make sense; maybe needs revision before others begin to use it
gordon: feedback mechanism that causes ontologies to be revised
tom: need for namespace policies that articulate how a vocab will evolve, e.g. dc: if semantics change, a new uri is coined
not clear if dbpedia/wikipedia have such a policy
what does stability mean on web?
mpanzer: most semantics are
conveyed in notes fields
... do vocabs from same ontology have to be used together to
have correct semantics?
jon: we are identifying organizational level problems for use of linked data, but are in an environment that doesn't have that commitment
emma: points: ownership, official
and not
... institutions should provide links between vocabs and curate
them
... barriers - some are perceived as more difficult than others
... persistence policies
... can you pick some from a vocabulary and not use whole
vocabulary guidelines?
kc: how do you know what can stand alone?
jon: does it matter?
kc: there can be dependencies between items in vocab
mpanzer: ontological baggage, is not part of linked data stack
charper: isn't that covered by domains and ranges?
mpanzer: domains and ranges are
only two pieces of a relationship; there can be other
parts/relationships
... and domains and ranges are not constraints
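The point that domains and ranges are not constraints can be shown with the RDFS domain rule: a statement using a property does not get rejected when the subject looks "wrong"; instead the subject is inferred to be an instance of the domain class. The sketch below implements just that one rule over plain tuples; the `ex:` names are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Hypothetical schema: ex:author is declared with rdfs:domain ex:Book
DOMAINS = {"ex:author": "ex:Book"}


def infer_types(triples, domains):
    """Apply the rdfs:domain rule: (s p o) and (p domain C) entail (s type C)."""
    inferred = set()
    for s, p, _o in triples:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))
    return inferred


# Someone uses ex:author on a radio programme, not a book
data = {("ex:radio_show_42", "ex:author", "ex:jane")}

print(infer_types(data, DOMAINS))
```

Nothing fails: the radio programme is simply inferred to be a book, which is why picking single properties out of a vocabulary can carry ontological baggage along with it.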
jon: we are talking about LLD,
which has an existing domain model, exemplified by marc21
... and marc21 is not expressed anywhere in rdf
tomB: we are talking about a larger environment
jon: we are talking about other things because we can't talk about marc21 in a linked data context
emma: can't, or don't want to?
<ww> jon: marc21 as rdf: curl -H "Accept: text/n3" http://bibliographica.org/301b111e-0dc0-5e34-a5e6-06c461d51789/57512
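The curl command above relies on HTTP content negotiation: the same URI is requested with an Accept header asking for N3 instead of HTML. A rough Python equivalent, using only the standard library, is sketched below; the request is built but deliberately not sent, since whether the service still answers is not something these minutes can guarantee.

```python
from urllib.request import Request

URL = "http://bibliographica.org/301b111e-0dc0-5e34-a5e6-06c461d51789/57512"

# Ask for the N3 serialisation of the resource rather than the HTML page
req = Request(URL, headers={"Accept": "text/n3"})

print(req.get_method(), req.full_url)
print(req.get_header("Accept"))

# To actually fetch it, one would pass req to urllib.request.urlopen(req)
```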
mark: what could go wrong when we use pieces from other vocabs?
mpanzer: if you assume everyone
using ore properties provides a resource map... but not
necessarily the case
... linked data doesn't know about APs, doesn't know about
records. our domain has highly structured data
... what does linked data mean for us?
<ww> mpanzer: quite so, bibliographica uses ore to group together graphs... and doesn't provide a resource map
<ww> e.g: http://bibliographica.org/aggregate/301b111e-0dc0-5e34-a5e6-06c461d51789/57512/contributor/1
tomB: libraries rely on data
definitions that are out of band in the lld environment
... data received may not meet users' definitions; LD has
formal relationships, but not a community view
jon: this is a significant flaw
in the way we think about linked data
... in LD, each statement in itself makes sense
... but for a complete description, may need more than one
statement
gordon: where we started...
choosing different properties from different name spaces
... one issue is definitions; if they aren't absolutely
precise, they will be used wrongly
... meaning that definitions have to be very clear, but in
library world we have many assumptions
... frad has class called Person defined as "an individual" -
not helpful
<ww> even if the definitions are very precise they will be used wrongly cf. owl:sameAs
gordon: the Mark Twain / Sam
Clemens problem
... Lassie is creator of the paw print outside of Grauman's
Chinese Theatre
<emma> in FOAF i read "Something is a Person if it is a person." is that much better ?!
gordon: vocabulary creators need guidance on creating definitions that can make sense outside of the context of the vocabulary
edsu: in the end, those that don't make sense won't be used
<jodi> gordon++
tomB: library community
definitions are natural language concepts
... LD world uses formal relationships to other terms
... skos vocabulary terms were never defined in natural
language
diane: ref. dcmi/rda task group
work, and its lessons
... no 'how to' guidance for building vocabularies for the
web
... this group is identifying some issues about what that
guidance might be
<edsu> diane's paper http://www.dlib.org/dlib/january10/hillmann/01hillmann.html
antoine: strongly related to concept of application profiles
mpanzer: w3c has recipes and best
practices; that is what could come out of this group
... not normative, but helping people who need to do
something
... could be aimed just at library data, so it is do-able
<marcia> +1 mpanzer recipes and best practices
<edsu> +1 to michael's suggestion for best practice docs
<marma> +1
<ww> +1
<LarsG> +1
antoine: let's make this part of the deliverables discussion
break!
<TomB> as mentioned during break: http://lists.w3.org/Archives/Public/public-lld/2010Oct/0098.html - just posted - Mikael Nilsson on Thoughts on validation / documentation / abstract models in reaction to yesterday's application profile discussion
<emma> Scribe: Marcia
<emma> Scribenick: marcia
another hour for vocabularies
TomB: encoding vocabularies
issues
... how to identify the sources that control the controlled
vocab. terms
... this is an issue
... waiting for Jon for some special issue related to
MARC
... differences discussed yesterday about DCAM and metadata
language
... community and info services case
Gordon: Use Case Community Information Service
tomb: Use Case Linked Data and legacy library applications case?
Jeff: Use Case Open Library Data: FRBR, RDA vocab
karen: Use Case Virtual International Authority File (VIAF):
Jeff: there is a problem. In the VIAF, we kept adding individual elements that make sense. There are no vocabs available.
Gordon: the future is for FRAD to do all the control-related things
Gordon: FRAD has very rich
properties for person.
... compared person as defined by FRBR, FRAD, FOAF
Karen: there are properties in FOAF that library data do not use at all.
Alex: our database has to do a detour to link each variant first name with the corresponding last name. We had to add a bnode
Ed: to help library users, could libraries be partners to develop
michaelP: issues of complexity. local properties are not expected to be adopted by others. They should be added as FOAF sub-properties; in the future people can use the dumb-down approach
<markva> +1 michaelp
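The dumb-down approach can be sketched as follows: if local properties are declared as sub-properties of a widely used property, a consumer that does not know the local vocabulary can rewrite each statement to the more general property. The `lib:` property names are hypothetical; `foaf:name` is an existing FOAF property, but the sub-property declarations shown are assumptions made for illustration.

```python
# Hypothetical rdfs:subPropertyOf declarations (local property -> broader property).
SUB_PROPERTY_OF = {
    "lib:preferredName": "foaf:name",
    "lib:variantName": "foaf:name",
}


def dumb_down(triples, sub_property_of=SUB_PROPERTY_OF):
    """Rewrite each triple to use the most general known super-property."""
    result = []
    for s, p, o in triples:
        seen = set()
        # Walk up the sub-property chain; `seen` guards against cycles.
        while p in sub_property_of and p not in seen:
            seen.add(p)
            p = sub_property_of[p]
        result.append((s, p, o))
    return result


local_data = [
    ("ex:p1", "lib:preferredName", "Mark Twain"),
    ("ex:p1", "lib:variantName", "Sam Clemens"),
]
general_data = dumb_down(local_data)
```

The rewrite loses the preferred/variant distinction, which is the trade-off the dumb-down approach accepts in exchange for interoperability.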
<edsu> for the record I was just relaying to Alex that danbri is looking to partner w/ people like alex and the dnb in the library community to add missing things to foaf
<edsu> doesn't necessarily need to be The Library Community
<edsu> +1 charper # linking to foaf, so that library data can interoperate with the larger world of linked data
gordon: to distinguish different
identities of people, libraries may use other data such as home
address to help.
... other issues: redundant, deprecated
<michaelp> Expressing person name authority data as linked data doesn't necessarily mean producing triples that can act as a surrogate of the MARC data.
gordon: context is important to the meaning, and is not always carried in the definitions. authority headings are different from describing persons
Jeff: that is, the label is different from the concept. heard more about the label of the person
gordon: conversion issue
<LarsG> GordonD: models develop through feedback and eventually they converge.
tomB: when dc:creator was defined, DC had
not merged with RDF; later created dcterms:creator to assign
domain and range. Difficult to explain to the RDF people
... heard people prefer to have property un-constrained
ed: yesterday's Linked Data session of Karen and Corey discussed constraint issues
<emma> ...it's about ontological commitment : the more you say, the more guidance, but also constraints
ed: may bring new problems
tomB: the group is carried away a little bit from LLD per se
<emma> TomB: feedback on DC was that it's good to make that commitment
mark: not constraining the range is only OK if you have a mechanism to constrain it locally
mark: in some cases it is good to have the range constrained
<markva> ... has a function in recommending people what you want in ranges, either literals or URIs
<markva> ... in context of linked data, often you want URIs, e.g. for creators
<antoine> Coming back to MARC: there was a MARC ontology under construction at DERI a few years ago, but now it's gone... none of our use cases mentions the need for a MARC vocabulary
karen: MARC people probably have a big gap with linked data
<LarsG> kcoyle: there is a use for MARC in RDF
<emma> ... the issue is to translate legacy data into other things, one way may be marc
Ray: regarding MARC expressed as
RDF
... MADS
<emma> MADS and MODS were actually mentioned in Use Cases
<edsu> markva: is your dissertation available online somewhere?
<edsu> markva: i was just fishing around on http://www.few.vu.nl/~mark/
<markva> http://www.cs.vu.nl/~mark/papers/thesis-mfjvanassem.pdf
emma: AACR, RDA, ISBD
MODS and MADS are formats
Jon: there was a presentation
that breaks MARC records into statements
... it explains how this data can be expressed in linked data
Karen: there are problems in turning MARC data into that kind of statement
gordon: UNIMARC still aligns with
ISBD
... some other alignments are complicated
... registered ISBD in registry
<markva> funny to hear people talk about modelling me ;)
<edsu> markva: thanks!
<markva> hope somebody actually reads it...
tomB: Use Case FAO Authority
Description Concept Scheme: SKOS, RDF, FOAF, ???
... there is an issue of RDA
Alex: GND vocabulary,
not registered yet, not for reuse. there was one vocab that
was not mentioned. Not sure what's coming next-- official or
not
... RDA, SKOS, ???
... connecting the headings to other vocabs.
... person including academic title
... map to MADS
... all the mappings are being worked on, maybe over the next 6
months
Lars: about the timeline of the LD project
Alex: we have the vocabulary, but
did not register them
... already has the description, document
tomB: this is an issue
registry of registry
scribe: URIs are being
pointed to
... registry becomes a portal and management tool, a secondary
thing
the word 'registry' in the context of pointing to URIs is a problem
michael: you could do in your data
tomB: registry is problematic
now. it is confusing
... there are registries under registries.
... has a problem with the word 'to register'
<michaelp> In linked data context, "registering" in an external database is quite misleading.
gordon: I used term consistently, to represent your property in the registry
<markva> TomB: "registering" is the same as coining a URI
<michaelp> Coining a URI, defining semantics and making this definition available in RDF when this URI is dereferenced is enough.
karen: there are people who do not know the meaning of registry, with domain name behind it
tomB: there is nothing about the registry
in the sense of Diane and Jon's that I do not like
... the issue is the environment
<michaelp> DNB could do that without relying on an external provider.
tomB: registering in the LD
context is to coin a URI.
... that the URI is coined in a registry is... using the word 'to
register' is important from a vocab management point of view, but
that is not its sense in linked data.
<markva> TomB: putting something in registry is orthogonal to use in LD
Jon: formal official namespace
registry
... registration is a formalization of that namespace
Alex: one of the requirements is that a vocab has to have a place to be referred to, to look for provenance, etc
michael: national libraries do not need domain names, no requirements to rely on external services
Alex: we have internal and external services. Human-readable version and machine-readable version.
tomB: something is resolvable is machine ...?
<michaelp> Registries are helpful if you have no easy access to a domain name / namespace. This is usually not an issue for a big library organization.
corey: how do you track the change of the data, should be an important issue to be discussed here
<michaelp> There is no requirement "to register" properties / vocabularies externally to make them "official".
antoine: what are the basic requirements, what are the most important, distinguish with others
tomB: move on.
<michaelp> Coining the URI is the statement that matters.
tomB: wrap up this
discussion
... I would like the group not to use the verb 'to register' when
not coining a URI
<michaelp> Corey: Registries provide services that are not available by just using conneg and publishing a flat RDF.
gordon: not happy to use the word
'publishing' either
... to register implies some requirements
jon: registry is a namespace
service, also for trusted vocabularies
... it is more than the linked data environment
<markva> michaelp: maintaining a URI is not a function of technology but function of an organisation (after Stu Weibel)
<LarsG> +1 to Stu Weibel
<michaelp> maintaining and persistence ...
<edsu> +1 for moving on
<michaelp> Alexander: We have to keep this issue in mind in terms of best practices.
<michaelp> TomB: Registries good, not required.
<edsu> +1 for rolling registry information into potential best practice doc
Alex: need some way to say a property is deprecated, etc.
Corey: Need to know of previous version, etc.
Karen: Let people use the
technology they have
... Best practice, not requirement
... We need to say that versioning, etc. is a good thing, but
shouldn't dictate that this is a requirement
All: Agree, it's best practice, and a good thing to have a registry to support important services
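One way the deprecation and versioning needs raised by Alex and Corey could be met without mandating a registry is to publish that information as ordinary statements about the property. The sketch below assumes a vocabulary that marks terms with `owl:deprecated` and points to a successor with `dcterms:isReplacedBy` (both existing terms); the `ex:` property names and the dictionary layout are hypothetical.

```python
# Hypothetical vocabulary metadata, keyed by property URI (illustration only).
VOCAB = {
    "ex:oldCreator": {
        "owl:deprecated": True,
        "dcterms:isReplacedBy": "ex:creator",
    },
    "ex:creator": {"owl:deprecated": False},
}


def current_term(term, vocab=VOCAB):
    """Follow replacement links from a deprecated term to its current successor."""
    seen = set()
    while vocab.get(term, {}).get("owl:deprecated") and term not in seen:
        seen.add(term)  # guard against circular replacement chains
        replacement = vocab[term].get("dcterms:isReplacedBy")
        if replacement is None:
            break  # deprecated, but no successor has been declared
        term = replacement
    return term


resolved = current_term("ex:oldCreator")
```

A registry can offer this as a service, but the same statements can equally be served from the vocabulary owner's own namespace.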
See post meeting cleaning: Outcome of the vocabulary discussion
<edsu> paulwalk's creating a tool to visualize vocabulary usage in our use cases: http://172.22.172.216/topics
<paulwalk> I have deployed an early version of my visualising app here: http://www.paulwalk.net/lldvis/ Feel free to play with this - any changes you make will **NOT** be persistent yet
<emma> Scribe: Gordon
Emmanuelle: One report
expected.
... Discuss YouTube video this evening
... Have captured vocabulary requirements from this afternoon's
discussion ...
... Requirements on three slides
All: discussion on requirements, slides adjusted, some requirements need to be revisited and further discussed
Emmanuelle: Concern now is to move from use cases/requirements to deliverable
Antoine: Go through the components of the deliverable and identify who is interested in developing them
Emmanuelle: Small groups could analyze use-case clusters
Kai: For each cluster, extract scenarios, abstract from them, and develop single-action use-cases (what a use-case really is)
Antoine: Allows a check that these really are clusters
Karen: What are the clusters?
Emmanuelle: May be other clusters emerging from use-cases not discussed today
Alex: What is the deadline for completing this work?
Emmanuelle: By end of
December
... We will invite XG members not present to add their names to
curation teams
<antoine> ACTION: Karen and Emma to curate archive cluster for end of december [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action02]
Emmanuelle: Other deliverable is relevant technology pieces, etc.
Antoine: Outreach and dissemination activities are in charter - some progress on this is already embedded in the wiki
Emmanuelle: Tomorrow, we should
take each topic and see if it translates into deliverable
... If we want to create further W3C activity we should charter
it
Antoine: We should attempt to inventory what we know is out there (in addition to output from use-case and vocabulary discussions). Using CKAN as for the LOD cloud
Antoine: -> http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
Karen: Any inventory is a moving target, and we should acknowledge that - but inventory useful
<charper> antoine++ re: CKAN
<edsu> antoine: would be good to have ww walk us through adding a package to ckan on a telecon
See post meeting cleaning: Outcome of the deliverables discussion
<antoine> ACTION: Kai and Ed to curate citations cluster for end of december [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action03]
<antoine> ACTION: Mark (and someone else) to curate digital objects cluster for end of december [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action04]
<antoine> ACTION: Gordon and Martin to curate bibliographic data cluster for end of december [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action05]
<antoine> ACTION: Jeff and Alexander to curate authority data cluster for end of december [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action06]
<antoine> ACTION: Antoine and Michael to curate vocabulary alignment cluster for end of december [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2010/10/23-lld-minutes.html#action07]