Cluster Archives

From Library Linked Data

Authors: Karen Coyle, Emmanuelle Bermès


Archives differ from libraries in that each holds unique works, so they cannot engage in the kind of metadata sharing that is common among libraries. Because there is little need for sharing, the metadata created in archives is also less standardized than library metadata. As a result, archives have not had the opportunity to connect with each other in the way that libraries have through consortia formed around cataloging needs. Similar challenges arise when federating or aggregating large collections of heterogeneous material, for instance collections from distinct cultural institutions such as libraries, museums and archives.

Archival materials are often related to historical persons and events, and could benefit in particular from linking to other sources of information that cover those same topics. However, the existence of archival materials on a topic is often unknown, and incomplete analysis of the materials may mean that few of the key facts that would aid discovery are recorded. Much discovery of archival materials is serendipitous: users follow non-specific clues in the hope of arriving at useful archival units.

Like other cultural institutions, archives carry out expensive data management activities that often require particular knowledge and expertise. This includes, for instance, the preservation of materials in a digital format. It is important for the cultural heritage community to share these efforts and to avoid duplicating work on interchangeable materials by widely disseminating both the existence of preserved materials and their particular conditions and status.

Topic in the Context of Linked Data

Linked data would provide an opportunity for archives to create links based not on the ownership of the same items but on topical relationships between materials held in different archives. Such links would allow an institution to provide additional context and detail to its users with minimal effort and cost.

Linking from external materials would also raise the visibility of archives as users would discover the existence of primary source material during their research in other information sources. In particular, a connection between archives, libraries and museums, based on semantics inherent in their collections, would expand the general access to cultural heritage materials and would create new alliances for sharing materials and developing user services based on topics rather than institution type.

Linked Data and Semantic Web technologies are expected to facilitate not only end users' access to cultural collections, but also the sharing and management of data between institutions. Linked Data is seen as a way to enable richer connections between different domains, and hence to improve interoperability.

Scenarios (Case Studies)

These are the scenarios from the LLD XG that were incorporated to create this document, along with their goals.

Use Case Archipel

  • discovery over many sources (federated or union discovery) from different domain models (FEDERATE)
  • use linked data to expand data model -> to expand beyond domain models, and facilitate mappings - MAP(Metadata)

Use Case Digital Preservation

  • use LD to inform preservation actions (probably through federation of different sources?); this can be generalized as: use LD to inform specific library management actions, e.g. preservation (MANAGE)
  • share object descriptions among institutions (SHARE); NB: not limited to bibliographic data

Use Case Europeana

  • combine data from multiple sources (union catalog) (FEDERATE)
  • use linked data to expand data model -> to expand beyond domain models, and facilitate mappings - MAP(Metadata)
  • describe context / enhance through links (RELATE, already in the goals list)

Use Case LOCAH

  • discovery over many sources (federated or union discovery) - one domain model, but the resources are unique & scattered (FEDERATE)
  • data enhancement through links -> is that the "RELATE" goal from the current list?

Use Case Photo museum

  • describe context & hierarchy that preserves context (DESCRIBE)

Use Case Radio Station Archive Digitisation

  • connect archive items to related events (linking to other communities; context) (DESCRIBE)
  • discovery over many sources (federated or union discovery) (FEDERATE)

Use Case Recollection

  • build a network of trust & participation around common goals, using LD to help create a community of shared interest and potential partners (FEDERATE, MANAGE)

Use Case Ontology of Cantabria's Cultural Heritage

  • discovery over many sources from different domain models (FEDERATE)
  • use linked data to expand data model (archives, libraries and museums data) - MAP(Metadata)

Extracted Use Cases

Semantic connections: A group of archives would like to better share information about their holdings. They have separate catalogs, and these catalogs do not necessarily use the same data formats. Exporting and sharing their data in linked data format would allow them to make connections between the collections using topics, names, place names, and other information contained in their metadata.
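
As a rough illustration of this use case, the sketch below converts records from two archives with different local formats into simple (subject, predicate, object) triples and reports the entities that both collections refer to. Everything in it is an assumption made for the example: the `http://example.org/` namespace, the field names, the `title` and `about` predicates, the sample records, and the naive rule that mints a shared identifier from a normalized label. A real export would use an RDF library and published vocabularies.

```python
from collections import defaultdict

BASE = "http://example.org/"  # placeholder namespace, not a real vocabulary


def entity(kind, label):
    """Mint a shared identifier for a name, place or topic (naive: normalized label)."""
    return f"{BASE}{kind}/{label.strip().lower().replace(' ', '-')}"


def triples_from_a(record):
    """Archive A (hypothetical format): flat fields per item."""
    item = f"{BASE}archive-a/{record['id']}"
    yield (item, "title", record["title"])
    yield (item, "about", entity("person", record["person"]))
    yield (item, "about", entity("place", record["place"]))


def triples_from_b(record):
    """Archive B (hypothetical format): a list of typed access points."""
    item = f"{BASE}archive-b/{record['ref']}"
    yield (item, "title", record["label"])
    for kind, label in record["access_points"]:
        yield (item, "about", entity(kind, label))


def shared_entities(triples):
    """Group items by the entity they are about; keep entities with 2+ items."""
    items_by_entity = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "about":
            items_by_entity[obj].add(subject)
    return {e: sorted(items) for e, items in items_by_entity.items() if len(items) > 1}


# Invented sample records, one per archive.
archive_a = [{"id": "101", "title": "Harbour master's letters",
              "person": "Jean Dupont", "place": "Marseille"}]
archive_b = [{"ref": "F-7", "label": "Shipping company ledgers",
              "access_points": [("place", "Marseille"), ("topic", "Shipping")]}]

graph = [t for r in archive_a for t in triples_from_a(r)]
graph += [t for r in archive_b for t in triples_from_b(r)]
links = shared_entities(graph)
print(links)
```

In this sample the only connection found is the shared place, which links one item from each archive. Matching on normalized labels is the weakest part of the sketch; in practice the archives would need to agree on identifiers for persons, places and topics.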

Serendipitous discovery: An archive would like to provide better discovery for its users. Traditional database methods do not allow users to follow connections that may be revealed in the descriptions of the archive's materials. Because it is hard to predict what methods a searcher will use and what information will be useful, linked data would allow searchers to follow the paths provided by any data points in the archival metadata.
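
The idea of following paths through any data point can be sketched as a breadth-first walk over a set of triples. The identifiers, predicates and sample data below are invented for illustration, and treating every link as traversable in both directions is an assumption of this sketch rather than a property of linked data as such.

```python
from collections import deque

# Invented sample triples (subject, predicate, object); all identifiers are placeholders.
TRIPLES = [
    ("item:letters-1898", "mentions", "person:harbour-master"),
    ("item:letters-1898", "place", "place:marseille"),
    ("item:ledger-1901", "place", "place:marseille"),
    ("item:ledger-1901", "creator", "org:shipping-company"),
    ("item:photo-album", "depicts", "org:shipping-company"),
]


def neighbours(node, triples):
    """Yield (predicate, node) pairs one link away, in either direction."""
    for s, p, o in triples:
        if s == node:
            yield p, o
        elif o == node:
            yield p, s


def paths_from(start, triples, max_hops=4):
    """Breadth-first walk; returns the shortest path to each reachable node."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(paths[node]) - 1 >= max_hops:
            continue  # do not expand nodes that are already max_hops away
        for _predicate, other in neighbours(node, triples):
            if other not in paths:
                paths[other] = paths[node] + [other]
                queue.append(other)
    return paths


start = "item:letters-1898"
found = paths_from(start, TRIPLES)
related_items = {n: p for n, p in found.items()
                 if n.startswith("item:") and n != start}
for item, path in related_items.items():
    print(item, "via", " -> ".join(path[1:-1]))
```

Starting from the letters, the walk reaches the ledger through the shared place, and the photo album through the organization that created the ledger. Neither connection would surface from a search on the fields of the starting record alone.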

Convergence: The archive would like to gain greater visibility through links from web resources to its materials. It would do this by creating and exporting its metadata in linked data format, and by adding that data to the linked data cloud. This scenario is expected to facilitate the creation of semantic links between heterogeneous material such as library, archive, and museum data.

Data management improvement: Build a network of institutions that use similar metadata to describe preservation actions and to exchange expertise and collection information. Use semantic web technologies to facilitate and improve interoperability among heterogeneous data described using various metadata formats. Make digitally preserved materials available to a wider user base.

Vocabularies and Technologies

Problems and limitations

Missing Vocabularies

  • the gap is sometimes specific (physical state of the original in a preservation context), sometimes general (need vocabularies for preservation data); in both cases there is no vocabulary for the function or data elements

Data incompatibilities or gaps

  • current data is free text, but contains quantitative information that needs to be pulled out
  • data needs to be qualified as "estimated" or "derived" so users know it is not precise (this is possibly a vocabulary issue)
  • current practice does not include rich relationships, just "related," so there is no source of relationships
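
The first two points can be illustrated with a small sketch that pulls quantities out of a free-text note and records whether each figure is qualified as an estimate. The list of units, the list of marker words, the `precision` labels and the sample note are all assumptions made for this example; the label "stated" in particular is not taken from any existing vocabulary.

```python
import re

# Units and estimate markers recognized by this sketch (assumed, not a standard list).
UNITS = r"boxes|folders|photographs|volumes|items|linear metres"
EXTENT = re.compile(
    rf"(?P<marker>ca\.|circa|approx\.|approximately|about)?\s*"
    rf"(?P<value>\d+(?:\.\d+)?)\s+(?P<unit>{UNITS})",
    re.IGNORECASE,
)


def extract_extents(text):
    """Return one dict per quantity found, flagged as estimated or stated."""
    results = []
    for match in EXTENT.finditer(text):
        results.append({
            "value": float(match.group("value")),
            "unit": match.group("unit").lower(),
            # A marker word before the number means the figure is not precise.
            "precision": "estimated" if match.group("marker") else "stated",
        })
    return results


# Invented free-text extent note.
note = "Family papers, ca. 12 boxes and 3 volumes; approximately 250 photographs."
extents = extract_extents(note)
for extent in extents:
    print(extent)
```

A pattern like this only catches the phrasings it was written for, so any real extraction would need review against the actual descriptive practice of the archive.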

Community guidance

  • no examples in our community domain that we can follow
  • lack of information on how to create a data model
  • no community guidance on which technologies and vocabularies to use

Technology questions

  • is linked data scalable to the size we need?
  • is linked data appropriate for highly hierarchical data models?

Technology availability

  • no systems are available on the market for linked data creation and use
  • the open source solutions that are available are in an unfinished state