HCLS/Labels and Definitions

From W3C Wiki

Harmonization of the representation of labels, descriptions and definitions of entities in biomedical ontologies

This document is intended to contain the following:

  • Examples from biomedical ontologies of non-standard constructs to represent labels, names, descriptions and definitions of entities.
  • An analysis of the motivations behind the creation of these constructs.
  • A description of the problems that arise through a lack of harmonization (e.g. for queries and user interfaces).
  • A review of the basic constructs in the RDF, RDFS and OWL vocabularies and their intended usage for labels, descriptions and definitions.
  • An informal recommendation for the representation of labels, descriptions and definitions and suggestions for the harmonization of biomedical ontologies in this regard.

Examples from ontologies: labels

Ontology                              Property           A / D
BioPAX                                bp:NAME            D
BioPAX                                bp:SHORT-NAME      D
BioPAX                                bp:TERM            D
BioPAX                                bp:SYNONYMS        D
SWAN                                  swan:alias         D
SWAN                                  swan:name          D
SWAN                                  swan:title         D
Subcellular Anatomy Ontology (CCDB)   sao:synonym        A
Subcellular Anatomy Ontology (CCDB)   sao:abbreviation   A

A: annotation property; D: datatype property

Examples from ontologies: descriptions and definitions

Ontology                              Property           A / D
BioPAX                                bp:COMMENT         D
Subcellular Anatomy Ontology (CCDB)   sao:definition     A

A: annotation property; D: datatype property

  • OBO-in-OWL metamodel

Example: see the OBO-in-OWL Celltype ontology at http://www.berkeleybop.org/ontologies/obo-all/cell/cell.owl

The property oboInOwl:hasDefinition relates an entity to an instance of the class oboInOwl:Definition; the actual text of the definition is attached to this instance via rdfs:label.

Pro: This allows further annotations to be attached to the definition itself, e.g. references to publications and database entries, or other kinds of provenance information.

Con: Annotation of definitions in this way is not required for most use cases. Querying for definitions is more complicated, because each lookup has to pass through the intermediate instance. The number of triples needed for definitions is at least doubled (if definitions are further annotated, the number is tripled).
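The trade-off can be illustrated with a minimal Python sketch that models triples as plain tuples. Only oboInOwl:hasDefinition and rdfs:label come from the description above; the identifiers ex:Cell1, ex:definition, _:def1, ex:hasSource and ex:SomePublication, as well as the definition text, are hypothetical and chosen purely for illustration.

```python
# Triples are modeled as (subject, predicate, object) tuples.
# All "ex:" names and the blank node "_:def1" are hypothetical.

DEFINITION_TEXT = "A placeholder definition text."

# Direct style: the definition text hangs directly off the entity.
DIRECT = [
    ("ex:Cell1", "ex:definition", DEFINITION_TEXT),
]

# OBO-in-OWL style: the entity points to a definition instance,
# which carries the text (and, optionally, further annotations).
OBO_IN_OWL = [
    ("ex:Cell1", "oboInOwl:hasDefinition", "_:def1"),
    ("_:def1", "rdfs:label", DEFINITION_TEXT),
    ("_:def1", "ex:hasSource", "ex:SomePublication"),  # annotation on the definition
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def direct_definitions(triples, entity):
    """One lookup: entity -> definition text."""
    return objects(triples, entity, "ex:definition")


def obo_in_owl_definitions(triples, entity):
    """Two lookups: entity -> definition instance -> text via rdfs:label."""
    texts = []
    for node in objects(triples, entity, "oboInOwl:hasDefinition"):
        texts.extend(objects(triples, node, "rdfs:label"))
    return texts
```

Both functions return the same text for ex:Cell1, but the second one needs an extra hop, and the annotated form stores three triples where the direct form stores one.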

Problems that arise through a lack of harmonization

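  • Queries are more complex, because each query has to cover every label and definition property used by the ontologies involved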
  • It is hard to make generic user interfaces (users need to see labels and not URIs)
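As a rough sketch of what this means for a generic user interface, the following Python function has to try a hard-coded list of label-like properties before it can display anything. The property names are taken from the tables in this document; the resources, the property ex:caption and the label strings are hypothetical.

```python
# Label-like properties a generic UI would need to know about in advance.
LABEL_PROPERTIES = [
    "rdfs:label",
    "bp:NAME",
    "bp:SHORT-NAME",
    "swan:name",
    "swan:title",
]

# Hypothetical data, modeled as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:protein1", "bp:NAME", "Example protein"),
    ("ex:article1", "swan:title", "Example article"),
    ("ex:thing1", "ex:caption", "Not recognized"),  # property unknown to the UI
]


def display_label(triples, resource):
    """Return the first label found under any known property, else the identifier."""
    for prop in LABEL_PROPERTIES:
        for s, p, o in triples:
            if s == resource and p == prop:
                return o
    return resource  # fallback: the user sees a raw identifier


print(display_label(TRIPLES, "ex:protein1"))
print(display_label(TRIPLES, "ex:thing1"))
```

Every ontology that introduces its own label property forces a change to LABEL_PROPERTIES; resources labeled with an unlisted property fall back to their identifier.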

Review of the basic constructs in the RDF, RDFS and OWL vocabularies and their intended usage

rdfs:comment

“This is used to provide a human-readable description of a resource.” (from RDF Primer)

“A textual comment helps clarify the meaning of RDF classes and properties. Such in-line documentation complements the use of both formal techniques (Ontology and rule languages) and informal (prose documentation, examples, test cases). A variety of documentation forms can be combined to indicate the intended meaning of the classes and properties described in an RDF vocabulary. Since RDF vocabularies are expressed as RDF graphs, vocabularies defined in other namespaces may be used to provide richer documentation.” (from http://www.w3.org/TR/rdf-schema/#ch_comment )

rdfs:isDefinedBy

The property rdfs:isDefinedBy is a sub-property of rdfs:seeAlso, and indicates the resource defining the subject resource. As with rdfs:seeAlso, this property can be applied to any instance of rdfs:Resource and may have as its value any rdfs:Resource. The most common anticipated usage is to identify an RDF schema, given a name for one of the properties or classes defined by that schema. Although XML namespace declarations will typically provide the URI where RDF vocabulary resources are defined, there are cases where additional information is required. (from http://www.w3.org/TR/2000/CR-rdf-schema-20000327/#s2.3.5 )

rdfs:seeAlso

rdfs:seeAlso is an instance of rdf:Property that is used to indicate a resource that might provide additional information about the subject resource. A triple of the form:

   S rdfs:seeAlso O

states that the resource O may provide additional information about S. It may be possible to retrieve representations of O from the Web, but this is not required. When such representations may be retrieved, no constraints are placed on the format of those representations.

The rdfs:domain of rdfs:seeAlso is rdfs:Resource. The rdfs:range of rdfs:seeAlso is rdfs:Resource. (from http://www.w3.org/TR/rdf-schema/#ch_seealso)

rdf:value

rdf:value is an instance of rdf:Property that may be used in describing structured values.

rdf:value has no meaning on its own. It is provided as a piece of vocabulary that may be used in idioms such as illustrated in example 16 of the RDF primer [RDF-PRIMER]. Despite the lack of formal specification of the meaning of this property, there is value in defining it to encourage the use of a common idiom in examples of this kind.

The rdfs:domain of rdf:value is rdfs:Resource. The rdfs:range of rdf:value is rdfs:Resource. (from http://www.w3.org/TR/rdf-schema/#ch_value)

Recommendations in W3C documents

It is recommended that properties that have the same use as those from the core vocabularies be declared as sub-properties of these core properties.
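The benefit of this recommendation can be sketched in Python, assuming the consuming application applies RDFS sub-property inference (if s p o holds and p is a sub-property of q, then s q o holds as well). The property ex:preferredName, the resources and the label strings are hypothetical; rdfs:label and rdfs:subPropertyOf are the standard RDFS terms.

```python
# Hypothetical data, modeled as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:preferredName", "rdfs:subPropertyOf", "rdfs:label"),
    ("ex:gene1", "ex:preferredName", "Example gene"),
    ("ex:gene2", "rdfs:label", "Another gene"),
]


def super_properties(triples, prop):
    """Return prop together with everything reachable via rdfs:subPropertyOf."""
    seen = {prop}
    stack = [prop]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subPropertyOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def values(triples, subject, prop):
    """Values of prop for subject, including values asserted via sub-properties."""
    return [
        o for s, p, o in triples
        if s == subject and prop in super_properties(triples, p)
    ]


print(values(TRIPLES, "ex:gene1", "rdfs:label"))
print(values(TRIPLES, "ex:gene2", "rdfs:label"))
```

With the sub-property declaration in place, a single request for rdfs:label finds both labels, so the client does not need to know about ex:preferredName at all.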

Existing metadata standards and ontologies (not part of RDF/OWL vocabulary) that could be re-used

  • SKOS is an ontology of concepts, but some of the properties for labels of these concepts could be re-used for other purposes: skos:prefLabel ("preferred label"), skos:altLabel ("alternative label"), skos:definition, skos:example, skos:hiddenLabel.

Informal recommendation

  • Make use of rdfs:label and rdfs:comment where possible (?)
  • If you really need to define a new annotation property, make it a sub-property of rdfs:label or rdfs:comment. Please be aware that making an owl:DatatypeProperty a sub-property of rdfs:label or rdfs:comment is NOT valid.

Discussion

The following OBI relations might also be included in the table:

(from http://obi.sourceforge.net/ontologyInformation/MinimalMetadata.html)

  • 1: preferred_term: The concise, meaningful, and human-friendly name for a class or property preferred by the ontology developers. (US-English)
  • 1: definition: The official OBI definition, explaining the meaning of a class or property. Shall be Aristotelian, formalized and normalized. Can be augmented with informal definitions to further explain the meaning of the term.
  • 1,*: definition_editor: The name of the editor of the definition.
  • 1: definition_source: An unambiguous and traceable reference to the source of the definition. Examples: ISBN, URI plus date, MeSH Term, PUBMED ID, DOI.
  • 1,*: curation_status: The curation status of a class or property. The allowed values come from an enumerated list of predefined terms. Examples: raw import, obo definition incomplete, graph position temporary, uncurated, curation approved.
  • 1,*: example: A phrase describing how a term should be used. May also include other kinds of examples, such as widely known subclasses or instances of the class.

Optional metadata: SHOULD be provided

  • 0,*: alternative_term: An alternative name for a class or property which means the same thing, i.e. semantically equivalent, as the preferred_term.
  • 0,*: alternative_term_tag: A tag to indicate sets of alternative terms. Examples: toxicogenomics_community, abbreviation.
  • 0,*: alternative_term_source: An unambiguous and traceable reference to the source of the alternative_term. Examples: ISBN, URI plus date, MeSH Term, PUBMED ID, DOI.
  • 0,*: editor_note: An administrative note intended for the editor. It will not be included in the publication version of the ontology, so it should contain nothing necessary for end users to understand the ontology.
  • 0,1: external_class: An annotation property that indicates external classes, including their subtrees, for a given anchor class.
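The cardinality prefixes in the two lists above lend themselves to a simple automated check. The following Python sketch is one possible reading, not part of OBI: it assumes that "1" means exactly one value and that "m,n" is a minimum and maximum with "*" meaning unbounded. The function names and the sample annotation values are hypothetical.

```python
# Cardinalities copied from the lists above. Reading "1" as exactly one and
# "m,n" as (minimum, maximum) with "*" as unbounded is an assumption.
CARDINALITIES = {
    "preferred_term": "1",
    "definition": "1",
    "definition_editor": "1,*",
    "definition_source": "1",
    "curation_status": "1,*",
    "example": "1,*",
    "alternative_term": "0,*",
    "alternative_term_tag": "0,*",
    "alternative_term_source": "0,*",
    "editor_note": "0,*",
    "external_class": "0,1",
}


def parse_cardinality(spec):
    """Turn '1', '1,*' or '0,1' into (minimum, maximum); maximum None = unbounded."""
    parts = spec.split(",")
    minimum = int(parts[0])
    if len(parts) == 1:
        return minimum, minimum
    maximum = None if parts[1] == "*" else int(parts[1])
    return minimum, maximum


def violations(annotations):
    """Return the sorted names of properties whose value count is out of range.

    annotations maps a property name to the list of its values for one term.
    """
    bad = []
    for name, spec in CARDINALITIES.items():
        minimum, maximum = parse_cardinality(spec)
        count = len(annotations.get(name, []))
        if count < minimum or (maximum is not None and count > maximum):
            bad.append(name)
    return sorted(bad)


# Hypothetical term annotations used only to exercise the check.
complete_term = {
    "preferred_term": ["example term"],
    "definition": ["A placeholder definition."],
    "definition_editor": ["Editor One", "Editor Two"],
    "definition_source": ["ex:some-source"],
    "curation_status": ["uncurated"],
    "example": ["A placeholder usage example."],
}
print(violations(complete_term))
```

A term that supplies one value for each required property passes; a term with two preferred_term values or a missing definition is reported by name.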