SkosCoreGuideToc/SectionConceptIdentity

Concept Identity

This section describes issues relating to establishing identity of resources of type skos:Concept.

Identity and Mapping

Here 'identity' is taken to mean:

Two resources of type skos:Concept are identical iff they represent the same concept as defined by a specified concept scheme.

Here 'equivalence' is taken to mean:

Two resources of type skos:Concept are equivalent iff they intend the same meaning.

Identical resources of type skos:Concept may be merged in an RDF graph.

Equivalent but not identical resources of type skos:Concept may NOT be merged in an RDF graph.

In plain terms: if two concepts mean the same thing but belong to different concept schemes, they should always be represented as two distinct resources in an RDF graph.

Or, in other words: identical concepts should be merged; equivalent but not identical concepts should be mapped.
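The merge-or-map distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ConceptRef record, its fields, and the scheme URIs are hypothetical, and the "meaning" field is a stand-in for a human judgement that two concepts intend the same meaning.

```python
from typing import NamedTuple


class ConceptRef(NamedTuple):
    """Hypothetical record describing one concept (not part of SKOS)."""
    scheme: str    # URI of the concept scheme that defines the concept
    local_id: str  # identifier of the concept within that scheme
    meaning: str   # stand-in for the intended meaning


def relate(a: ConceptRef, b: ConceptRef) -> str:
    """Return 'merge', 'map' or 'distinct' for two concept references."""
    if a.scheme == b.scheme and a.local_id == b.local_id:
        return "merge"     # identical: same concept in the same scheme
    if a.meaning == b.meaning:
        return "map"       # equivalent but not identical
    return "distinct"      # neither identical nor equivalent


# Hypothetical example data: one meaning, two schemes.
bird_a = ConceptRef("http://example.org/schemeA", "14H87", "spotted bowerbird")
bird_b = ConceptRef("http://example.org/schemeB", "B-001", "spotted bowerbird")
print(relate(bird_a, bird_a), relate(bird_a, bird_b))
```

The point of the sketch is only that membership of a concept scheme, not meaning alone, decides whether two resources may be merged.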

Concept Identifiers

URI Identifiers

The simplest way to work with concepts in an RDF context is to refer to them using URIs.

There are a number of issues relating to assignment of URIs to concepts - for a full discussion see the VM TF note [ref].

Here are a couple of issues briefly stated as an introduction to the options.

HTTP URIs

@@TODO

INFO URIs

@@TODO

Other URIs

@@TODO

Non-URI Identifiers

If you already have an identifier scheme for your concepts that is not based on URIs, you will probably want to include these identifiers in the description of your concepts, and may want to use them instead of URIs for referring to the concepts.

To include non-URI identifiers in a description of a concept, we recommend you define your own property. So for example:


<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  
  <rdf:Property rdf:about="http://my.example.org/knowledgebase/schema#identifier">
    <rdfs:comment>An identifier for a concept within my scheme.</rdfs:comment>
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#value"/>
    <rdfs:subPropertyOf rdf:resource="http://purl.org/dc/elements/1.1/identifier"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
  </rdf:Property>

</rdf:RDF>


You can then use this new property in concept descriptions, for example:


<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#"
  xmlns:my="http://my.example.org/knowledgebase/schema#">

  <skos:Concept rdf:about="http://my.example.org/knowledgebase/biology#spottedbowerbird">
    <!-- Other properties of the concept ... -->
    <my:identifier>14H87</my:identifier>
  </skos:Concept>

</rdf:RDF>


There are three things to note about the definition of the new property.

First, it is a sub-property of dc:identifier. This establishes that the intended use of the new property is to specify an identifier for the given concept, so generic applications can deduce the semantics of the property.

Second, it is a sub-property of rdf:value. This is to support interoperability with Qualified Dublin Core usage of the dc:subject property (see below).

Third, it is an owl:InverseFunctionalProperty. This means that two concepts with the same value for this property will be deemed identical. In practice, this means that you can do things like:


<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#"
  xmlns:my="http://my.example.org/knowledgebase/schema#">

  <skos:Concept>
    <skos:prefLabel>Spotted bowerbird</skos:prefLabel>
    <my:identifier>14H87</my:identifier>
    <skos:broader rdf:parseType="Resource">
      <my:identifier>12L47</my:identifier>
    </skos:broader>
    <skos:narrower rdf:parseType="Resource">
      <my:identifier>42U43</my:identifier>
    </skos:narrower>
    <!-- Other properties of the concept ... -->
  </skos:Concept>

</rdf:RDF>


In other words, you can give a complete RDF description of a concept scheme without needing URIs for the concepts.

Published Subject Indicators

@@TODO

Concept Identity Rules (Smushing)

Strong Identity Rules

The following rules may be used as a reliable basis for merging of graphs of concepts.

  1. If two resources have the same URI, then they are identical (RDF identity).
  2. If two resources have the same literal value (including language tag) for a skos:prefLabel property AND the same resource value for a skos:inScheme property, then they are identical (SKOS identity).
  3. If two resources have the same value for a property of type owl:InverseFunctionalProperty then they are identical (OWL identity).
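As an illustration of how the three strong rules might drive a merge, here is a minimal Python sketch. It makes several assumptions that are not part of this guide: triples are plain (subject, predicate, object) tuples, prefixed names such as "skos:prefLabel" stand in for full URIs, a literal with a language tag is a (text, language) tuple, and the scheme URI and blank node labels are hypothetical. It is not a substitute for a real RDF toolkit.

```python
from collections import defaultdict

PREF_LABEL = "skos:prefLabel"
IN_SCHEME = "skos:inScheme"


def smush(triples, inverse_functional):
    """Map each subject to a representative of its identity group.

    triples: iterable of (subject, predicate, object) tuples.
    inverse_functional: predicates typed owl:InverseFunctionalProperty.
    """
    triples = list(triples)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[max(rx, ry)] = min(rx, ry)
        return rx != ry

    # Rule 1 (RDF identity) is implicit: equal subject strings are one key.
    # Rule 3 (OWL identity): same value for an inverse functional property.
    first_with = {}
    for s, p, o in triples:
        find(s)
        if p in inverse_functional:
            union(s, first_with.setdefault((p, o), s))

    # Rule 2 (SKOS identity): same prefLabel literal and same inScheme value.
    # Repeat until stable, because a merge can bring a label and a scheme
    # together on one resource.
    changed = True
    while changed:
        changed = False
        labels, schemes = defaultdict(set), defaultdict(set)
        for s, p, o in triples:
            if p == PREF_LABEL:
                labels[find(s)].add(o)
            elif p == IN_SCHEME:
                schemes[find(s)].add(o)
        first_with = {}
        for r in labels:
            for label in labels[r]:
                for scheme in schemes.get(r, ()):
                    if union(r, first_with.setdefault((label, scheme), r)):
                        changed = True

    return {s: find(s) for s in list(parent)}


# Hypothetical data: the scheme URI and blank node labels are made up.
SCHEME = "http://my.example.org/knowledgebase/biology"
example = [
    ("_:b1", "my:identifier", "14H87"),
    ("http://my.example.org/knowledgebase/biology#spottedbowerbird",
     "my:identifier", "14H87"),
    ("_:b2", PREF_LABEL, ("Spotted bowerbird", "en")),
    ("_:b2", IN_SCHEME, SCHEME),
    ("_:b3", PREF_LABEL, ("Spotted bowerbird", "en")),
    ("_:b3", IN_SCHEME, SCHEME),
]
groups = smush(example, {"my:identifier"})
```

In this sketch, "_:b1" and the URI-named concept end up in one group under rule 3, and "_:b2" and "_:b3" end up in another under rule 2. A union-find structure is used only because it keeps the grouping transitive; any equivalent bookkeeping would do.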

Weak Identity Rules

The following rules may be applied to attempt graph merging:

  1. If two resources have the same value for an rdf:value property AND the same value for an rdfs:isDefinedBy property, then they are candidates for identity.
  2. If two resources have the same value for an rdf:value property AND the same value for an rdf:type property where the value is not skos:Concept, then they are candidates for identity.
  3. If two resources have the same value for an rdfs:label property AND the same value for an rdfs:isDefinedBy property, then they are candidates for identity.
  4. If two resources have the same value for an rdfs:label property AND the same value for an rdf:type property where the value is not skos:Concept, then they are candidates for identity.

Note that these rules are included to support interoperability with Qualified Dublin Core style use of the dc:subject property. For example, the Qualified Dublin Core spec [ref] recommends usage such as:


<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:dcterms="http://purl.org/dc/terms/">

  <Description rdf:about="http://somewhere.org/someDoc.html">
    <dc:subject>
      <Description>
         <value>19D10</value>
         <rdfs:label>Algebraic K-Theory of spaces</rdfs:label>
         <rdfs:isDefinedBy rdf:resource="URI2"/>
      </Description>
    </dc:subject>
  </Description>

</RDF>


Compare this to a hypothetical description of a concept using SKOS Core:


<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  
  <skos:Concept rdf:about="http://my.example.org/knowledgebase/mathematics#ktheory">
    <skos:prefLabel>Algebraic K-Theory of spaces</skos:prefLabel>
    <skos:inScheme rdf:resource="URI2"/>
  </skos:Concept>

</rdf:RDF>


Because skos:prefLabel is a sub-property of rdfs:label, and skos:inScheme is a sub-property of rdfs:isDefinedBy, the concept described in the second snippet is a candidate for identity with the blank node defined in the first snippet, under weak rule 3.

Or this description of a concept:


<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:skos="http://www.w3.org/2004/02/skos/core#"
  xmlns:my="http://my.example.org/knowledgebase/schema#">
  
  <skos:Concept rdf:about="http://my.example.org/knowledgebase/mathematics#ktheory">
    <my:identifier>19D10</my:identifier>
    <skos:inScheme rdf:resource="URI2"/>
  </skos:Concept>

</rdf:RDF>


Here, because my:identifier is a sub-property of rdf:value (see above), and skos:inScheme is a sub-property of rdfs:isDefinedBy, the concept described in this snippet is a candidate for identity with the blank node defined in the first snippet, under weak rule 1.
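The reasoning in the two comparisons above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: triples are plain tuples, prefixed names stand in for full URIs, "_:subject" is a made-up label for the blank node in the Dublin Core snippet, and only one level of sub-property inference is applied, using the sub-property relationships stated in this section.

```python
from collections import defaultdict
from itertools import combinations

# Sub-property relationships as stated in this section.
SUB_PROPERTY_OF = {
    "my:identifier": ("rdf:value", "dc:identifier"),
    "skos:prefLabel": ("rdfs:label",),
    "skos:inScheme": ("rdfs:isDefinedBy",),
}

# (key property, qualifying property) -> weak rule number.
RULES = {
    ("rdf:value", "rdfs:isDefinedBy"): 1,
    ("rdf:value", "rdf:type"): 2,
    ("rdfs:label", "rdfs:isDefinedBy"): 3,
    ("rdfs:label", "rdf:type"): 4,
}


def expand(triples):
    """Add triples entailed by one level of rdfs:subPropertyOf."""
    out = set(triples)
    for s, p, o in triples:
        for super_property in SUB_PROPERTY_OF.get(p, ()):
            out.add((s, super_property, o))
    return out


def weak_candidates(triples):
    """Return a set of (rule, a, b) for pairs that are identity candidates."""
    props = defaultdict(set)  # subject -> {(predicate, object)}
    for s, p, o in expand(triples):
        props[s].add((p, o))
    buckets = defaultdict(set)  # (rule, key value, qualifier) -> subjects
    for s, pairs in props.items():
        for key_prop, key_value in pairs:
            for qual_prop, qual_value in pairs:
                rule = RULES.get((key_prop, qual_prop))
                if rule is None:
                    continue
                if qual_prop == "rdf:type" and qual_value == "skos:Concept":
                    continue  # rules 2 and 4 exclude skos:Concept
                buckets[(rule, key_value, qual_value)].add(s)
    found = set()
    for (rule, _, _), subjects in buckets.items():
        for a, b in combinations(sorted(subjects), 2):
            found.add((rule, a, b))
    return found


KTHEORY = "http://my.example.org/knowledgebase/mathematics#ktheory"
dc_snippet = [
    ("_:subject", "rdf:value", "19D10"),
    ("_:subject", "rdfs:label", "Algebraic K-Theory of spaces"),
    ("_:subject", "rdfs:isDefinedBy", "URI2"),
]
skos_label_snippet = [
    (KTHEORY, "rdf:type", "skos:Concept"),
    (KTHEORY, "skos:prefLabel", "Algebraic K-Theory of spaces"),
    (KTHEORY, "skos:inScheme", "URI2"),
]
skos_identifier_snippet = [
    (KTHEORY, "rdf:type", "skos:Concept"),
    (KTHEORY, "my:identifier", "19D10"),
    (KTHEORY, "skos:inScheme", "URI2"),
]
```

Calling weak_candidates on dc_snippet plus skos_label_snippet reports the pair under rule 3, and on dc_snippet plus skos_identifier_snippet under rule 1. The result is a set of candidates only; whether to merge them is left to the application.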


End section