Difference between revisions of "UseCaseReport"

From Library Linked Data

Revision as of 16:57, 20 June 2011

Contents

Abstract

Selected use cases and case studies from the library community and related sectors are described in this document, along with outreach and dissemination efforts. These were gathered and analyzed by the W3C Library Linked Data Incubator Group, based on submissions from different organizations and individuals. Use cases have been grouped into eight topical clusters, which are described below. Selected use cases from each cluster have also been summarized.

Motivation and method

As described in the group charter, one of the main activities of the W3C Library Linked Data Incubator Group was to gather use cases and case studies demonstrating successful implementation of Semantic Web technologies in libraries and related sectors. Outreach and dissemination initiatives were also gathered, along with innovative uses, which demonstrate the possibilities and benefits offered by the application of Linked Data technologies and methods in libraries.

The use cases presented in this report demonstrate the need for Linked Data technologies in order to DESCRIBE library resources and their context, and SHARE these descriptions among institutions and with the broader public. The issue of description mainly involves the creation or representation of RELATIONSHIPS between resources, by MAPping similar entities, making existing relationships more explicit, and creating new relations, either using machine processing (inferences, alignments, etc.) or manually (tagging, cataloguing). Those relationships can be used to provide DISCOVERy, through BROWSE and SEARCH services, and to FEDERATE or AGGREGATE many sources. They are also involved in data MANAGEment issues. Linked Data technologies are used to improve global interoperability of library data, by RE-USing metadata element sets and value vocabularies, providing URIs for resources, and developing PUBLISHing services like APIs.

Use Cases organization

The collected set of use cases, case studies, initiatives and ideas is organized into eight different clusters:

  • Bibliographic data
  • Authority data
  • Vocabulary alignment
  • Archives and heterogeneous data
  • Citations
  • Digital objects
  • Collections
  • Social and new uses

After collecting and reviewing the submitted use cases for each cluster, group members extracted the main scenarios from the initial set. These extracted scenarios are meant to capture the main ideas of the original use cases and to cover the majority of topics and situations related to each cluster.

The following figure depicts the use case organization and the aforementioned extraction process:


[Figure: UCReport-v1.png, the use case organization and extraction process]



The group first collected the use cases and case studies, then reviewed them and extracted the main scenarios. However, this document presents the extracted use cases first (section 4) and then offers short summaries of each individual use case (section 5). This structure gives the reader an overview of the main topics and scenarios in the different use case clusters before presenting each case in more detail.

Extracted Use Cases

Bibliographic data

Rather than summarizing individual use cases, this section provides generalized scenarios.

Separate page about this cluster: Cluster BibData

Author: Gordon Dunsire

Two general types of agent are involved:

  1. The (machine or human) "processor" agent consumes, amends, and generates metadata.
  2. The (human) "end-user" agent consumes metadata.

Semantics standardization of bibliographic elements

The processor normalizes the element semantics of ingested records, ensuring a standard element set.
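
As an illustration only, normalizing element semantics can be sketched as a lookup from local field names to one shared element set. The report does not name a target element set; the Dublin Core terms below, the local field names in FIELD_MAP, and the function normalize_record are all assumptions made for this example.

```python
# Hypothetical mapping from local field names to a standard element set.
# Dublin Core terms are used purely as an example target.
FIELD_MAP = {
    "ti": "dcterms:title",
    "title": "dcterms:title",
    "au": "dcterms:creator",
    "author": "dcterms:creator",
    "yr": "dcterms:date",
}


def normalize_record(record, field_map=FIELD_MAP):
    """Rename known fields to standard elements.

    Returns (normalized, unmapped) so that fields without a known
    mapping are reported instead of being silently dropped.
    """
    normalized, unmapped = {}, {}
    for field, value in record.items():
        target = field_map.get(field.lower())
        if target is None:
            unmapped[field] = value
        else:
            normalized[target] = value
    return normalized, unmapped


ingested = {"TI": "Typee", "au": "Melville, Herman", "shelf": "A1"}
normalized, unmapped = normalize_record(ingested)
```

Returning the unmapped fields separately is a design choice for this sketch: a human processor can then decide how to handle them.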

Deduplication and unification of records

The processor merges duplicate records for the same resource into a single master record. The end-user is presented with a single record for a resource, with links to records of copies, instead of multiple, slightly varying bibliographic records.
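
A minimal sketch of this scenario, assuming that two records describe the same resource when their normalized title, author, and year agree. Real deduplication usually needs fuzzier matching; the record layout, the identifiers, and the functions normalize and merge_duplicates are hypothetical.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase the text and replace punctuation with single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def merge_duplicates(records):
    """Group records by a normalized (title, author, year) key.

    Returns one master record per group. The identifiers of the original
    records are kept under "copies" so each holding stays reachable.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["title"]), normalize(rec["author"]), rec["year"])
        groups[key].append(rec)

    masters = []
    for copies in groups.values():
        first = copies[0]
        masters.append({
            "title": first["title"],
            "author": first["author"],
            "year": first["year"],
            "copies": [c["id"] for c in copies],
        })
    return masters


records = [
    {"id": "lib-a/1", "title": "Moby-Dick", "author": "Melville, Herman", "year": 1851},
    {"id": "lib-b/7", "title": "Moby Dick", "author": "Melville, Herman.", "year": 1851},
    {"id": "lib-a/2", "title": "Typee", "author": "Melville, Herman", "year": 1846},
]
masters = merge_duplicates(records)
```

Here the two slightly varying records for the same work collapse into one master record that lists both copies.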

Tagging Web resources with standardized bibliographic terms

The processor identifies web resources related to a bibliographic record and tags them with terms taken from a set of standard vocabularies.

Integrated metadata search interfaces across several providers

The end-user searches metadata for all resources in a consortium using a single, integrated interface, and identifies all available copies of a resource, including the nearest to a specified location.

Information aggregation

The end-user refines the results of a search and expands it to include related resources from external collections at web scale. The processor identifies recently published bibliographic resources for dissemination in a current awareness service. The end-user obtains access to an online full-text version of a resource via a link from the bibliographic record for the resource.

Bibliographic records annotation

The end-user can annotate bibliographic records retrieved by a search.

Authority data

Separate page about this cluster: Cluster Authority data

Authors: Alexander, Jeff, Joachim

Metadata addition by non-librarians while uploading a working paper

Alice uploads her working paper on corporate taxation to an economics repository. After entering the title and abstract, she has to add her own name and the names of her co-workers. When she starts typing, a list of already-known authors from an authority file is presented, augmented with additional information (e.g. year of birth) to make the persons identifiable. When she selects an author, the system stores the author's URI in the background, which facilitates precise retrieval of all papers by a given author, irrespective of the possibly varying literal strings she and her colleagues used when entering names. When she starts adding keywords for the paper, suggestions from a disciplinary thesaurus (e.g. STW Thesaurus for Economics) are presented. Hints lead from alternate forms of keywords (possibly in alternate languages) to the preferred form to help select the best-fitting keywords. Storing a URI in the background, again, supports precise retrieval. Additionally, it enables displaying keywords in a given user's preferred language.
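
The author-suggestion step could look roughly like the following sketch. The authority entries, the example.org URIs, and the function suggest_authors are invented for illustration; a real repository would query an actual authority file.

```python
# Hypothetical authority file: (URI, preferred name, year of birth).
AUTHORITY_FILE = [
    ("http://example.org/person/1", "Smith, Alice", 1970),
    ("http://example.org/person/2", "Smith, Alan", 1955),
    ("http://example.org/person/3", "Jones, Bob", 1962),
]


def suggest_authors(prefix, authority_file=AUTHORITY_FILE):
    """Return (label, uri) pairs whose name starts with the typed prefix.

    The label carries the year of birth so that people with similar
    names can be told apart in the suggestion list.
    """
    needle = prefix.strip().lower()
    if not needle:
        return []
    return [
        (f"{name} (born {born})", uri)
        for uri, name, born in authority_file
        if name.lower().startswith(needle)
    ]


suggestions = suggest_authors("smith, al")
# The repository stores only the URI of the entry the user picks,
# not the string that was typed.
chosen_label, chosen_uri = suggestions[0]
```

Because only the URI is stored, later retrieval does not depend on how the name was spelled at upload time.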

Extended Search Results based on Authority data

John searches for “FAO” in a document repository (with data from different sources not controlled by a central authority). The system directs him to all the records associated with the authorized form of this corporate body's name, as indicated in its authority record. The authorized form is “Food and Agriculture Organization of the United Nations”. The authority record thus brings together all forms of the name for this corporate body, authorized and non-authorized, e.g. “FAO, Rome (Italy)”, “F.A.O.”, “FAO”, “Food and Agriculture Organization, F.A.O. of the U.N.” or “FAO of the UN”. All the bibliographic records of documents issued by FAO are associated with its authority record, assuring John that his search is exhaustive. The system could also suggest related terms as further possible search terms.
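
The lookup in this scenario can be sketched as two steps: resolve the typed name to an authority URI, then collect every bibliographic record that points at that URI. The URIs, record titles, and function names below are hypothetical, and only some of the variant names mentioned above are included.

```python
# Hypothetical authority record keyed by URI.
AUTHORITY = {
    "http://example.org/authority/fao": {
        "authorized": "Food and Agriculture Organization of the United Nations",
        "variants": ["FAO, Rome (Italy)", "F.A.O.", "FAO", "FAO of the UN"],
    },
}

# Bibliographic records point at the authority URI, not at a name string.
BIB_RECORDS = [
    {"title": "Report A", "issued_by": "http://example.org/authority/fao"},
    {"title": "Report B", "issued_by": "http://example.org/authority/fao"},
    {"title": "Report C", "issued_by": "http://example.org/authority/other"},
]


def build_name_index(authority):
    """Map every authorized and variant name (case-insensitive) to its URI."""
    index = {}
    for uri, entry in authority.items():
        for name in [entry["authorized"], *entry["variants"]]:
            index[name.casefold()] = uri
    return index


def search_by_name(query, authority=AUTHORITY, records=BIB_RECORDS):
    """Return titles of all records linked to the body the query names."""
    uri = build_name_index(authority).get(query.strip().casefold())
    if uri is None:
        return []
    return [r["title"] for r in records if r["issued_by"] == uri]


hits = search_by_name("F.A.O.")
```

Any of the listed name forms leads to the same URI, so the result set is the same whichever form John types.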

Authority Data aggregation

VIAF collects authority data from the German National Library and other contributors. Despite differences in the information collected, the records in these systems often refer to the same entity. By comparing information, VIAF can create semantic links between them and publish those relationships from a "cluster" URI. The German National Library can then harvest the clusters it contributes to and ingest the links to relate its entities directly to those of other contributors. From an end-user perspective, VIAF can also aggregate the properties of these individuals and include those in the cluster representation to help end-user Alice discover and trace them from a central location.
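
The following toy sketch shows the general idea of clustering and harvesting. It is not VIAF's matching algorithm: it simply assumes that records with the same name and year of birth describe the same person. All URIs, the record layout, and the functions cluster_records and links_for are made up for the example.

```python
from collections import defaultdict


def cluster_records(records, base_uri="http://example.org/cluster/"):
    """Group contributor records that share a (name, birth year) key.

    Returns a dict mapping a cluster URI to the member record URIs.
    """
    by_key = defaultdict(list)
    for rec in records:
        key = (rec["name"].casefold(), rec["born"])
        by_key[key].append(rec["uri"])

    clusters = {}
    for number, (_key, uris) in enumerate(sorted(by_key.items()), start=1):
        clusters[f"{base_uri}{number}"] = sorted(uris)
    return clusters


def links_for(contributor_prefix, clusters):
    """Map each of a contributor's URIs to the other members of its cluster."""
    links = {}
    for members in clusters.values():
        for uri in members:
            if uri.startswith(contributor_prefix):
                links[uri] = [m for m in members if m != uri]
    return links


records = [
    {"uri": "http://example.org/dnb/123", "name": "Goethe, Johann Wolfgang von", "born": 1749},
    {"uri": "http://example.org/other/9", "name": "Goethe, Johann Wolfgang Von", "born": 1749},
    {"uri": "http://example.org/dnb/456", "name": "Schiller, Friedrich", "born": 1759},
]
clusters = cluster_records(records)
harvested = links_for("http://example.org/dnb/", clusters)
```

The harvested dictionary is what a contributor would ingest: for each of its own entities, the equivalent entities held by others.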

Vocabulary alignment

Separate page about this cluster: Cluster VocAlign

Authors: Antoine Isaac, Michael Panzer, Marcia Zeng

The four "general applications" for vocabulary alignment data (as elaborated in [2]) can serve as a foil for the extraction (with Voc1 and Voc2 as the vocabularies to be aligned):

  1. Reindexing of collections: supporting the indexing of documents with Voc2 based on existing indexing with Voc1, or vice versa.
  2. Concept-based search across vocabularies in heterogeneously indexed collections: supporting the retrieval of documents indexed with Voc1 for queries that use Voc2 concepts, or vice versa.
  3. Navigation across vocabularies: supporting the exploration of concept spaces across vocabularies, giving (exploratory) access to collection items indexed with selected concepts.
  4. Vocabulary merging: supporting the construction of a new vocabulary that encompasses both Voc1 and Voc2, or the integration of one vocabulary into the other (as an extension or satellite of the other vocabulary).
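
The first application, reindexing, can be sketched as follows, assuming the alignment is available as a simple mapping from Voc1 concepts to sets of Voc2 concepts. The concept identifiers, document names, and the function reindex are hypothetical.

```python
# Hypothetical alignment from Voc1 concepts to Voc2 concepts.
ALIGNMENT = {
    "voc1:taxation": {"voc2:taxes"},
    "voc1:firms": {"voc2:companies", "voc2:enterprises"},
}


def reindex(doc_index, alignment):
    """Derive Voc2 index terms for documents already indexed with Voc1.

    Concepts without an alignment contribute nothing, so a document
    may end up with an empty Voc2 index.
    """
    reindexed = {}
    for doc, concepts in doc_index.items():
        targets = set()
        for concept in concepts:
            targets.update(alignment.get(concept, ()))
        reindexed[doc] = sorted(targets)
    return reindexed


VOC1_INDEX = {
    "doc-1": ["voc1:taxation", "voc1:firms"],
    "doc-2": ["voc1:taxation"],
    "doc-3": ["voc1:unaligned"],
}
voc2_index = reindex(VOC1_INDEX, ALIGNMENT)
```

A document whose concepts have no alignment, like doc-3 here, is a natural candidate for manual indexing.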

These can be further abstracted into three categories of use:

Enrichment and discovery related use cases

Enrichment and discovery related use cases focus on collections that have applied source or target vocabularies that are part of alignment efforts.

  • Vocabulary-based enrichment of collections: the usage of an alignment technique or an existing alignment (e.g., a crosswalk or link map) to add semantically related concepts from target vocabularies to documents that have been indexed or are otherwise discoverable with the source vocabulary. Reindexing is a specific instance of vocabulary-based enrichment use cases.
  • Vocabulary-based discovery in and across heterogeneously indexed collections: the enhancement of recall (with improved or at least comparable precision) for queries that use terms of multiple source and target vocabularies. Query expansion is a specific instance of vocabulary-based discovery use cases.
  • Exploration of topical spaces by cross-vocabulary navigation: the enabling of interactive query construction by providing guided access to aligned vocabularies, allowing the traversal of intra- and inter-vocabulary (alignment) relationships, optionally resulting in a query using terms from the source vocabulary only.
  • Multilingual discovery: the employment of alignment techniques, e.g., informed by natural language processing (NLP), to establish semantic interoperability between value vocabularies in different languages. Named entity recognition (NER) as described in Use Case Language Technology is a specific instance of multilingual discovery, aiming at establishing semantic equivalence for concepts that have the same entities (persons, places, events, etc.) as extension, referent, or focus across multiple languages.
  • Bridging multiple domains, disciplines, or communities of practice: the enabling of brokering or switching between domain-focused vocabularies of varying terminological specificity to enhance federated discovery in heterogeneously indexed collections or exploration of transdisciplinary topic spaces.
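Query expansion, named above as a specific instance of vocabulary-based discovery, can be sketched as follows. This is a minimal illustration only: the alignment (link map), the concept identifiers, and the sample collection are all hypothetical, and a real service would more likely read the alignment from published mapping data than from an in-memory dictionary.

```python
# Hypothetical alignment (link map) from Voc1 concepts to Voc2 concepts.
# All identifiers are made up for illustration.
ALIGNMENT = {
    "voc1:maize": {"voc2:corn"},
    "voc1:forestry": {"voc2:silviculture", "voc2:forest-management"},
}

# Hypothetical heterogeneously indexed collection: record id -> subject terms.
COLLECTION = {
    "rec1": ["voc1:maize"],
    "rec2": ["voc2:corn"],
    "rec3": ["voc2:wheat"],
}


def expand_query(terms, alignment):
    """Return the original terms plus any aligned concepts, without duplicates."""
    expanded = []
    for term in terms:
        if term not in expanded:
            expanded.append(term)
        # sorted() keeps the result deterministic
        for target in sorted(alignment.get(term, ())):
            if target not in expanded:
                expanded.append(target)
    return expanded


def search(collection, terms):
    """Return the ids of records indexed with at least one of the given terms."""
    wanted = set(terms)
    return sorted(
        rec_id for rec_id, subjects in collection.items() if wanted & set(subjects)
    )


plain = search(COLLECTION, ["voc1:maize"])
expanded = search(COLLECTION, expand_query(["voc1:maize"], ALIGNMENT))
print(plain, expanded)
```

In this sketch the expanded query also retrieves the record indexed with the aligned Voc2 concept, which is the recall gain the use case describes.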

Vocabulary enhancement and reuse

Vocabularies are enhanced and reused either to extend other value vocabularies or as a basis for creating new value vocabularies. (Often, these use cases will be prerequisites to fulfilling the discovery and enrichment use cases described above.)

  • Extending a common pivot or spine vocabulary with specialized vocabularies that become local extensions of a shared upper-level core.
  • Vocabulary merging: supporting the construction of a new vocabulary that encompasses both Voc1 and Voc2, or the integration of one vocabulary into the other (as an extension or satellite of the other vocabulary)

Publication, discovery, and maintenance of tools or services of vocabulary alignment

  • Alignment-level description that enables one-stop shopping of value vocabulary alignments and/or contents provided by these vocabularies.
  • Change management and versioning of alignments (e.g., crosswalks, link maps): offering update and notification services to allow applications using vocabulary alignments to keep pace with changes in source or target vocabularies, or to keep targeting a specific stable version.

Archives and heterogeneous data

Separate page about this cluster: Cluster Archives

Semantic connections

A group of archives would like to better share information about their holdings. They have separate catalogs, and these catalogs do not necessarily use the same data formats. Exporting and sharing their data in linked data format would allow them to make connections between the collections using topics, names, place names, and other information contained in their metadata.

Serendipitous discovery

An archive would like to provide better discovery for its users. Traditional database methods do not allow users to follow connections that may be revealed in the descriptions of the archive's materials. Because it is hard to predict what methods a searcher will use and what information will be useful, linked data would allow searchers to follow the paths provided by any data points in the archival metadata.

Convergence

The archive would like to gain greater visibility by having web resources link to its materials. It would do this by creating and exporting its metadata in linked data format, and by adding that data to the linked data cloud. This scenario is expected to facilitate the creation of semantic links between heterogeneous material such as library, archive, and museum data.

Data management improvement

Build a network of institutions that use similar metadata to describe preservation actions and that exchange expertise and collection information. Use Semantic Web technologies to facilitate and improve interoperability among heterogeneous data described using various metadata formats. Extend the use of digitally preserved materials to a wider user base.

Citations

Separate page about this cluster: Cluster Citations

Authors: Kai Eckert, Peter Murray, Ed Summers

In this section, we list use cases in a very narrow sense, either extracted from the above-mentioned scenarios or devised in addition to them. A use case in this narrow sense is a specific action that an end user might want to perform and that involves the citation data as we have defined it here.

The purpose of such use cases is typically to extract requirements that can then be fulfilled by the underlying implementation. In turn, these use cases provide a rationale for each requirement and explain why it is needed. To illustrate this, we have noted some requirements in italics.

Publication representation enhancement

Creation of an enhanced representation of publications, where the cited reference is directly accessible from the citation (Position in the cited/citing document, What was cited)

Navigation enhancement

Make it possible for the user to click from a citation directly to the location in the referenced publication (URI or other resolver mechanism, like OpenURL)

Automatic evaluation of publications

Determine the value of a resource (easily and automatically) by analyzing the content of citations to that work (backlinks, optional: further qualifications like agrees/disagrees)

Retrieval of citation context

Find other publications that build upon the same cited resource to include them in my “Related work” section (backlinks, optional: qualifications like “Extends”, “builds upon”, etc.)
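The backlink requirement behind the last two use cases can be illustrated with a small sketch. The citation triples, publication identifiers, and qualifier values below are hypothetical; the sketch only shows how an index from cited resource to citing publications supports both a simple citation count and a "Related work" lookup.

```python
from collections import defaultdict

# Hypothetical citation data: (citing publication, cited publication, qualifier).
# The qualifier is optional and may be None.
CITATIONS = [
    ("paperA", "paperX", "extends"),
    ("paperB", "paperX", "disagrees"),
    ("paperC", "paperY", None),
    ("paperA", "paperY", "builds upon"),
]


def build_backlinks(citations):
    """Index citations by cited resource: cited -> [(citing, qualifier), ...]."""
    backlinks = defaultdict(list)
    for citing, cited, qualifier in citations:
        backlinks[cited].append((citing, qualifier))
    return backlinks


def related_work(backlinks, cited, exclude=None):
    """Return publications citing the same resource, optionally excluding one."""
    return sorted(
        {citing for citing, _ in backlinks.get(cited, []) if citing != exclude}
    )


backlinks = build_backlinks(CITATIONS)
print(len(backlinks["paperX"]))                             # number of citations to paperX
print(related_work(backlinks, "paperX", exclude="paperA"))  # other work citing paperX
```

A raw citation count is only one possible input to evaluating a resource; the optional qualifiers are kept in the index so that a more selective measure could be built on top.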

Digital Objects

Separate page about this cluster: Cluster Digital Objects

Authors: ??

Grouping

Users should be able to define groups of resources on the web that for some reason belong together. The relationship that exists between the resources is often left unspecified. Some of the resources in a group may not be under control of the institution that defines the groups.

Enrichment

Users should be enabled to link resources together, e.g. related descriptions, persons, topics, etcetera. For example, a poem in a digital text repository may be linked to the poet as defined in an authority file elsewhere on the Web.

Browsing

Users should be supported in browsing through groups and resources that belong to the groups. Interlinks should allow the user to explore the connections between resources.

Re-use

Users should be enabled to re-use all or parts of a collection, with all or part of its metadata, elsewhere on the linked Web.

Collections

Not defined in cluster page

Authors: Gordon, Karen

Separate page about this cluster: Cluster_Collections

Social and new uses

Not defined in cluster page

Authors: Uldis Bojārs and Jodi Schneider

Separate page about this cluster: Cluster Social Uses

Summary of individual Use Cases

Bibliographic data

Separate page about this cluster: Cluster BibData

Use Case Bibliographic Network

The International Federation of Library Associations and Institutions (IFLA) initiated a fundamental re-examination of these issues to produce a framework that would provide a clear, precisely stated, and commonly shared understanding of what the bibliographic record aims to provide information about, and what we expect the record to achieve in terms of answering user needs. In addition, IFLA aimed to recommend a basic level of functionality and basic data requirements for records created by national bibliographic agencies. Linked data techniques would allow this data, including the concepts and the relationships between them, to be described as an information graph, and web standards could be used to support the user's discovery requirements.

Use Case AGRIS

Since 1975, the AGRIS (International Information System for the Agricultural Sciences and Technology) database has been aggregating and disseminating bibliographic references, such as research papers, studies and theses, each including metadata such as conferences, researchers, publishers, institutions and subjects, catalogued from more than 150 participating institutions in more than 100 countries. The AGRIS linked data strategy focuses on two objectives: to establish AGRIS as a producer of linked data that exploits the semantic richness of the AGRIS data by creating an open RDF dataset in agricultural sciences, and to expose it to other web services that can consume and link to AGRIS data.

Use Case Community Information Service

Academic organizations of varying sizes (research groups, university departments, scholarly societies, special interest groups, etc.) have a strong interest in maintaining awareness and quality of information in their domain, and in openly publishing this information to the broader academic community and to the general public. A linked data approach could provide the data with an open license which allows its reuse for such purposes, and support the APIs, data standards and client software to lower the barrier to participation in information curation and sharing.

Use Case Data BNF

Bibliothèque nationale de France (BnF) makes different kinds of resources available on the Web. Linked data technologies could help the BnF to bring together data from several sources, with a scalable and interoperable data model, to improve the publication of resources in the online catalog as well as to align and link to other useful resources on the Web.

Use Case Identification And Deduplication Of Library Records

Matching algorithms in the library domain need reference data, but current access to such reference data is limited. Applying Linked Data to library records could help to develop automated matching algorithms so that ultimately only one record exists for each intellectual item. This would make it easier to identify a resource's metadata and would help with the deduplication of records.

Use Case Linked Data and legacy library applications

Adding linked data applications to library information systems challenges system architects to adapt legacy systems so that they can make use of the new applications. The main question behind this issue is how libraries transition from pilot linked data applications to the use of linked data throughout their information systems.

Use Case Migrating Library Legacy Data

Libraries wish to convert legacy metadata to RDF triples for several reasons, including taking advantage of systems and services which may emerge in the Semantic Web environment, encouraging increased usage of the metadata and corresponding resources, and contributing to the general sharing of metadata and the common good. The goal is to represent library legacy data as linked data, retaining as much data, utility and semantics as possible, and to enable its use in both traditional and innovative ways. To achieve this goal, there is a need for appropriate vocabularies and element sets available as linked data, as well as stable community-wide mappings from common legacy metadata formats to RDF classes and properties.

Use Case Open Library Data

The Open Library (OL) is a large bibliographic database (approx. 25 million items) with metadata for books. Over one million ebooks are represented in the database, linked from the bibliographic data. The goal is to allow the user to link to the OL data as easily as possible, e.g. to link to a specific manifestation while immediately providing information about other items or manifestations of the same work that are available as full text. Linked data technologies can be used to easily reference specific manifestations in the OL data. Bibliographic data can be exposed through URI dereferencing or a SPARQL endpoint to enhance the citation on the website.

Use Case Regional Catalog

In Germany there is no central catalog of all holdings of German libraries. Academic libraries are organized in regional clusters, where a central catalog for all member libraries is administered by one of the members. With the application of Library Linked Data technologies to the regional central services, a German Central Catalog could be created more easily.

Use Case Pode

The purpose of the Pode project has been to use technology and external data to enrich the data in library catalogue records, and thereby to create a platform for developing end-user services with information and functionality that are not available in the current library web search. This use case concentrates on converting library data to RDF, converting it to FRBRized library data, and linking data to individual instances in other LOD datasets.

Use Case Polymath Virtual Library

Polymath Virtual Library aims to bring together information, data, digital texts and websites about Spanish, Hispano-American, Brazilian and Portuguese polymaths from all times. The authors form the backbone of the system. Linked data will benefit the Polymath Virtual Library by improving the process of obtaining links from different sources and broadening the types of these sources, by increasing the efficiency of the collection through semiautomatic enrichment of data, by obtaining URIs from those available in LOD, and by offering data as LOD to enhance its visibility and use (mainly through aggregators such as Hispana or Europeana).

Use Case Talis Prism 3

Talis Prism 3 is a next-generation OPAC/search and discovery interface. We need to offer a rich interface to surface the large volume of content available in libraries. Browsing by entities such as author, subject and series is important, as is the reliable extraction of data from MARC 21 into a linked data model. Prism 3 is powered by the Talis Platform, a hosted linked data service which offers both SPARQL querying and powerful full text search capabilities.

Authority data

Separate page about this cluster: Cluster Authority data

Use Case AuthorClaim

The AuthorClaim registration service aims to link scholars with the records about the works that they have written, as recorded in a bibliographic database. The application contributes to the identification of authors. In the application scenario, document metadata records are classified by subject experts. Each expert makes a binary decision on whether a document belongs to a category or not. The resulting document collection forms an issue of a subject report. Linked data can be used to further generalize the basic application, and encourage the reuse of the application's results.

Use Case Authority Data Enrichment

Authority control is the practice of creating and maintaining authority data for bibliographic entities. Authority control enables catalogers to disambiguate resources with similar or identical characteristics and to collocate resources that logically belong together. Linked data could enable the reuse of external data sets by linking instead of copying and merging.

Use Case FAO Authority Description Concept Scheme

The objective of the FAO Authority Description Concept Scheme is to provide more efficient management of the several multilingual forms of a concept through the use of URIs and the assignment of relationships between concepts. Its benefits include providing efficient system searching and exhaustive search results. It also improves access dramatically by providing consistency in the forms used to identify the different entities.

Use Case International Registry for Authors

Author names are one of the main pillars of scientific information retrieval. The problem of inconsistent author signatures has been around for many years, but its importance is increasing due to the great number of people publishing research studies and papers. IraLIS works to make authors aware of the need to always sign in the same way, to register different name variants, and to allow suitable and unambiguous signature recognition. Linked Data technologies could help IraLIS by creating specific URIs for each author.

Use Case Linked Data Service of the German National Library

In Germany, authority data is collected and maintained collaboratively. This data, as well as the German National Library’s bibliographic data, is relevant to many libraries and other cultural heritage institutions. Linked Data provides a suitable framework for publishing relevant data at the German National Library and linking it to other data sources of interest.

Use Case Virtual International Authority File (VIAF)

The goal of the VIAF project is to facilitate research across languages anywhere in the world by making authorities truly international. VIAF explores virtually combining the name authority files of all three institutions into a single name authority service. As of the fall of 2009 there are 18 personal name authority files from 15 organizations participating in VIAF. The VIAF Linked Data approach provides useful experience and knowledge on how to apply Linked Data principles to authority records.

Vocabulary alignment

Separate page about this cluster: Cluster VocAlign

Use Case AGROVOC Thesaurus

The AGROVOC Thesaurus of the Food and Agriculture Organization of the UN in Rome (FAO) is a multilingual SKOS concept scheme of terminology in agriculture, forestry, fisheries, food and related domains such as the environment. These concepts are used to tag and discover research results across multiple languages; the thesaurus's expression as Linked Data helps FAO create explicit equivalences between AGROVOC terms and terms in agricultural vocabularies maintained by other organizations.

Use Case Browsing And Searching In Repositories With Different Thesauri

In the library community, different thesauri exist for annotating entries in library catalogues. A user should be able to browse and search in several library catalogues in parallel with the keywords from any of the thesauri used. This requires mappings between the different thesauri and categorization systems. When these mappings are provided as linked open data, it becomes easier to integrate further thesauri and/or categorization systems into the network of interlinked thesauri, which will facilitate parallel search across different catalogues.

Use Case Civil War Data 150

Civil War Data 150 (“CWD150”) is a collaborative project to share and connect American Civil War related data across local, state and federal institutions during the four-year sesquicentennial commemoration of the Civil War, beginning in April of 2011. By aggregating these diverse data sources and performing vocabulary alignment to an ontology specific to the American Civil War but applicable to a broader military schema, it becomes possible to query information about a particular place, regiment, battle, or officer. CWD150 will use linked data technology to create connections based on the strong identifiers and taxonomy of the Civil War, particularly the regiments, battles, battlefields, officers, and soldiers and sailors.

Use Case Component Vocabularies

Creators of metadata use a variety of methods to encode or reference entities associated with the resource described, e.g. names, titles, subjects, geographic names, etc. The goal is to allow metadata to link to established vocabularies. Linked data technology may be used to achieve this: URIs are assigned to the terms in the controlled vocabularies, and metadata descriptions in external systems use those URIs to reference the terms.
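The idea of referencing vocabulary terms by URI can be sketched as follows. The vocabulary, the URIs under example.org, and the record are all hypothetical placeholders; the point is only that the metadata description stores identifiers, while labels are looked up in the vocabulary.

```python
# Hypothetical controlled vocabulary: each term is identified by a URI.
VOCABULARY = {
    "http://example.org/vocab/subject/agriculture": "Agriculture",
    "http://example.org/vocab/subject/forestry": "Forestry",
}

# Hypothetical metadata description held in an external system.
# Subjects are stored as URI references, not as copied label strings.
RECORD = {
    "title": "Annual report on land use",
    "subject": [
        "http://example.org/vocab/subject/agriculture",
        "http://example.org/vocab/subject/forestry",
    ],
}


def labels_for(record, field, vocabulary):
    """Resolve the URI references in a record field to display labels.

    URIs that are not found in the vocabulary are returned unchanged.
    """
    return [vocabulary.get(uri, uri) for uri in record.get(field, [])]


print(labels_for(RECORD, "subject", VOCABULARY))
```

Because the record holds only URIs, a label corrected in the vocabulary is picked up the next time the record is displayed, without editing the record itself.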

Use Case Language Technology

Language Technology is applied in areas like machine translation, automatic summarization, (web) search or spell checking. To estimate the usefulness of library linked data for language technology, it is important to concentrate first on one specific use case: named entity recognition (NER) within a single language and potentially across languages. A traditional approach to NER is the application of a gazetteer, that is, a dictionary with information about places, people, institutions, etc. This approach has the drawback that it is hard to keep the gazetteer up to date. Another problem is the sustainable creation of gazetteers across languages. Linked data could help to solve these two problems of NER ("keeping up to date" and "bridging across languages").
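A gazetteer lookup of the kind described here can be sketched in a few lines. The gazetteer entries and the URIs under example.org are hypothetical; in the scenario above they would be drawn from linked data sources, so that surface forms in different languages share one identifier.

```python
import re

# Hypothetical gazetteer: surface form -> (entity type, entity URI).
# Two surface forms in different languages point to the same URI.
GAZETTEER = {
    "rome": ("place", "http://example.org/place/rome"),
    "roma": ("place", "http://example.org/place/rome"),
    "food and agriculture organization": ("organization", "http://example.org/org/fao"),
}


def recognize(text, gazetteer):
    """Return (matched text, entity type, URI) tuples in order of appearance."""
    found = []
    for surface, (entity_type, uri) in gazetteer.items():
        # Whole-word, case-insensitive match of the surface form
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((match.start(), match.group(0), entity_type, uri))
    return [(matched, etype, uri) for _, matched, etype, uri in sorted(found)]


english = recognize("The agency is based in Rome.", GAZETTEER)
italian = recognize("La sede si trova a Roma.", GAZETTEER)
print(english)
print(italian)
```

Plain dictionary matching like this does not disambiguate names or handle inflected forms; it is meant only to show where an up-to-date, multilingual source of names would plug in.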

Use Case Subject Search

Traditionally, subject heading systems are a way to standardize the names of things, typically concepts. Typical library practice is to store them in bibliographic records and represent those records on the Web in HTML where they can be indexed by Web search engines. Linked Data principles would help libraries to use subject heading systems more effectively for Web discovery and reuse, by using HTTP URIs and OWL to identify and deliver better modeled resources for consolidated use by humans, machines, and semantic agents.

Use Case Vocabulary Merging

Library users expect a single point of search in consortial resource discovery services involving multiple organisations and large-scale metadata aggregations. Users also expect to be able to search for subjects using their own language and terms in an unambiguous, contextualised manner. Linked data technologies could provide the underlying infrastructure through semantic mapping or merging of concepts across vocabularies. The use case discusses several ways of federating linked data vocabularies.

Use Case Bridging OWL and UML

The first linked data principle says: “Use URIs as names for things”. However, in order to avoid inconsistencies and tight coupling, names should be systematically managed and rationalized in an adaptable conceptual model that is based on use cases and managed with sensible meta-model language(s). Web Ontology Language (OWL) and Unified Modeling Language (UML) are two such languages that could be content-negotiated in either direction to manage and represent a common domain model. This use case illustrates how UML class diagrams can be used to explore, reuse, and design OWL. The UML community has developed the Ontology Definition Metamodel (ODM) to help bridge this gap.

Archives and heterogeneous data

Separate page about this cluster: Cluster Archives

Use Case Archipel

The Flemish Archipel Project is focused on providing access to, and long-term archiving of, digitized material from a diverse set of memory institutions. Libraries, archival institutions, the art sector (museums) and broadcasters contribute their content to a network of repositories. One of the challenges is the domain-specific metadata models that are used in each sector. The availability of such domain-specific metadata vocabularies and models expressed as Linked Data would benefit the project by allowing better formalized and tested mappings.

Use Case Digital Preservation

Preservation of digital objects in the long term is a challenging activity which is not limited to storage and back-up: it involves complex strategies aimed at providing a trusted environment where digital objects can evolve along with changes in technology, hardware and software environments. Linked data provides a global environment for describing the objects and their significant properties. It also helps to avoid duplication of effort when describing resources and their attributes, and allows the creation of a global information graph encompassing all the information needed to perform complex queries and actions.

Use Case Europeana

Europeana provides a service to link archives, libraries, museums and audio-visual material from across Europe. It aggregates metadata from various cultural heritage providers. Using that metadata, it allows various object collections to be searched in a unified way through a web portal or an API. It aims at easing further re-use of and reference to the digitized objects it refers to. Linked data can help Europeana to enhance semantic interoperability between metadata models, to enrich existing metadata, to improve data objects and link harvesting, to enhance search processes and to provide low-level, easy access to metadata to third parties.

Use Case LOCAH

The Archives Hub is a national service that provides a wealth of rich interdisciplinary information about archives held across the UK. The LOCAH Project is investigating the creation of links between the Hub and other data sources including DBpedia, BBC, LCSH and others. User studies and log analyses indicate that Archives Hub users frequently search laterally through the descriptions. Linked data is a way of vastly expanding the benefits of lateral search, helping users discover contextually related materials by creating links between archival collections and other sources, which are often widely dispersed.

Use Case Photo Museum

Photo collections are popular material on the Internet. Institutes that carry out long-term preservation of photographic collections need new tools and procedures for presenting the images. Often they have several more or less linked databases describing the collections from different angles (physical descriptions of materials and conservation, content, agents, contracts and intellectual property rights, etc.). Linked Data approaches seem to offer a good solution to the technical problems caused by the heterogeneity of data and data sources.

Use Case Radio Station Archive Digitisation

Many radio stations have archives of audio programming going back many years. In many cases they are not digitised and have little or inconsistent metadata. Current practice for metadata creation and transcription is often ad hoc, conforming in various degrees to established library methods. Linked Data would enable cross references to other events (particularly valuable where the audio in question is a news broadcast) and federated searching both on these cross references and in general, adding value to the digitisation process as well as multiplying the potential impact of the results.

Use Case Recollection

Use Case Ontology of Cantabria's Cultural Heritage

Citations

Separate page about this cluster: Cluster Citations

Use Case Citation of Scientific Datasets

Use Case Enhanced Publications

Use Case Mapping Scholarly Debate

Digital objects

Separate page about this cluster: Cluster Digital Objects

Use Case Collecting material related to courses at The Open University

Use Case Digital Text Repository

Use Case Enhanced Publications

Use Case Editing reports on new academic documents

Use Case NDNP (National Digital Newspaper Program)

Use Case NLL Digitized Map Archive

Use Case Publishing 20th Century Press Archives

Collections

Separate page about this cluster: Cluster_Collections

Use Case AuthorClaim (shared w/Authority data cluster)

Use Case Collection-Level Description

Use Case Community Information Service (shared w/Bibliographic data cluster)

Use Case Digital resources with access restrictions

Use Case Library Address Data

Use Case Nearest physical collection

Social and new uses

Separate page about this cluster: Cluster Social Uses

Use Case Community Information Service (shared w/Bibliographic data cluster)

Use Case Crosslinking Environment Data and the Library

Use Case Crowdsourced Catalog

Use Case Mendeley Research Networks for linking researchers and publications

Use Case Open Library Data (shared w/Bibliographic data cluster)

Use Case Ranking Search Results by Popularity using Circulation Data

Use Case SEO

Use Case Support This Book Button

References

Full URIs for citation...

Full-URI links to clusters: