Rdb2RdfXG/ReusableIdentifier


As part of our action item, we describe the use cases for reusable identifiers in the RDB2RDF mapping language.

Premise

As stated in the draft Recommendation from the RDB2RDF XG, we intend the following: where possible, the language will encourage the reuse of public identifiers for long-lived entities such as persons and corporations.

Reusing the same public identifiers for the same entities will considerably improve both the effectiveness and the efficiency of data and information integration processes. Where these identifiers are unique (as they are when produced and assigned in a systematic way), these processes can gain significant further benefits in both effectiveness and efficiency.

We understand that a relational DB operates under closed-world semantics, whereas the Web of Data has intrinsic open-world semantics.

We further state that every entity should have a URI ...

Options

There are a couple of existing solutions and/or proposals available that deal with reusable identifiers:

  • linked datasets, especially DBpedia
  • OKKAM's ENS (see also http://www.eswc2008.org/final-pdfs-for-web-site/fisr-2.pdf). The Entity Name System (ENS) is a web-scale system for assigning unique identifiers to entities in the WWW and managing those identifiers. These identifiers are global, with the purpose of consistently identifying a specific entity across system boundaries, regardless of where references to this entity may appear.

Use Cases

In certain cases it is reasonable to reuse existing identifiers:

  • public, well-known entities such as people, institutions, organizations, corporations, geographic locations, pieces of art, etc.
  • ...

Proposed Approach

We propose an optional, discrete step in the RDF generation process that allows users to search for and reuse already existing identifiers for the entities they wish to publish in RDF.
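
As an illustration only, such an optional lookup step might be sketched as follows. Everything named here is an assumption made for the example and not part of the XG draft: the `LOCAL_BASE` namespace, the in-memory `KNOWN_IDENTIFIERS` table (a stand-in for a real service such as a linked dataset or an ENS), and the function names. The sketch reuses an existing identifier when the lookup finds one and otherwise falls back to a locally minted URI.

```python
from typing import Callable, Dict, Optional
from urllib.parse import quote

# Hypothetical namespace for locally minted URIs (assumption for this sketch).
LOCAL_BASE = "http://example.org/resource/"

# Hypothetical stand-in for an external identifier service
# (e.g. a linked dataset or an ENS); a real step would query that service.
KNOWN_IDENTIFIERS: Dict[str, str] = {
    "Berlin": "http://dbpedia.org/resource/Berlin",
}


def lookup_existing(label: str) -> Optional[str]:
    """Return an already existing public identifier for the label, if any."""
    return KNOWN_IDENTIFIERS.get(label)


def mint_local_uri(table: str, key: object) -> str:
    """Build a local URI from the table name and primary key value."""
    return f"{LOCAL_BASE}{quote(table)}/{quote(str(key))}"


def subject_uri(
    table: str,
    key: object,
    label: str,
    lookup: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Choose the subject URI for one row.

    The lookup is optional: when it is omitted, or finds nothing,
    the row receives a locally minted URI.
    """
    if lookup is not None:
        existing = lookup(label)
        if existing:
            return existing
    return mint_local_uri(table, key)


if __name__ == "__main__":
    # Rows as (primary key, label) pairs from a hypothetical "city" table.
    for key, label in [(1, "Berlin"), (2, "Smallville")]:
        print(label, "->", subject_uri("city", key, label, lookup_existing))
```

Passing the lookup as an optional argument keeps the step discrete: a mapping that does not want identifier reuse simply omits it, and the rest of the generation process is unchanged.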