Annotation System

From OWL
Revision as of 19:09, 28 July 2008 by BijanParsia

Motivation

This proposal aims to deal with the following issues:

and the use cases identified in this thread:

The basic idea was hashed out in Spring 2007 with Boris and Bernardo and other members of the UManchester DL Lunch. Alan Rector also recently reinvented something like this proposal.


Annotations are very important to many applications: often the point of the modeling is to support the annotations! But also, annotations might carry critical information for tools and for applications.

Gist

There are three major differences from the current (and OWL 1.0) system:

  • We can have "blobs" of annotations associated with an Entity or Axiom, rather than a single statement per annotation.
  • Annotations can live in multiple separate "annotation spaces". The easiest way to think about these is if you took the annotations in a space and put them in a separate document.
  • Annotations can be marked as affecting the content of the domain model.

Syntax

We need to update certain productions, e.g.:

ontology := 'Ontology' '(' ontologyURI { importDeclaration } { annotationSpaceDeclaration } { annotation } { axiom } ')'

Annotation spaces would be declared as follows:

annotationSpaceDeclaration  := 'AnnotationSpace' '(' spaceUri {spaceTypeUri} {importDeclaration} status ')'
spaceUri  := URI
status  := 'mustUnderstand' | 'mayIgnore'

Annotations themselves may reference an Annotation Space.

annotationURI := URI
explicitAnnotationByConstant := 'Annotation' '(' [ spaceUri ] annotationURI constant ')'
annotationByConstant := explicitAnnotationByConstant | labelAnnotation | commentAnnotation
annotationByEntity := 'Annotation' '(' [ spaceUri ] annotationURI entity ')'

The set of annotations can also include "blob" annotations:

annotationByBlob := 'Annotations' '('[ spaceUri ] { annotationAssertions } ')'
annotationAssertions  := fact**
annotation := annotationByConstant | annotationByEntity | annotationByBlob

At the moment, I've restricted annotation assertions to facts, though in principle you could have arbitrary XML, for example. There is also one difference from normal facts: you can use a distinguished constant Self (written SELF in the examples below), which refers to the entity or axiom being annotated.

If there is no spaceUri, then the annotation is interpreted as being in the default space Default.

Semantics

This is a fairly informal presentation.

First, all annotations with the same spaceUri are gathered together in one "document" (named graph, metaview, whatever). (NOTE: this is a metaphor. No user sees separate documents unless they explicitly request it from a tool.)
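The gathering step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `group_by_space`, the triple layout, and the use of the string "Default" as the key for the default space are not part of the proposal.

```python
from collections import defaultdict

# Assumed key for the default space; the proposal only names it "Default".
DEFAULT_SPACE = "Default"

def group_by_space(annotated_axioms):
    """Collect annotations into one "document" per annotation space.

    annotated_axioms is an iterable of (axiom, space_uri, annotation)
    triples, where space_uri is None when the annotation names no space.
    Returns a dict mapping each space URI to a list of
    (axiom, annotation) pairs.
    """
    spaces = defaultdict(list)
    for axiom, space_uri, annotation in annotated_axioms:
        key = space_uri if space_uri is not None else DEFAULT_SPACE
        spaces[key].append((axiom, annotation))
    return dict(spaces)

# Axioms taken from the Simple Syntax Example, written as plain strings.
example = [
    ("SubClassOf(Student Person)", "ex:Prot4", ("dc:creator", "Bijan Parsia")),
    ("SubClassOf(Employee Person)", None, ("rdfs:comment", "This is required")),
    ("SubClassOf(AI Person)", "ex:mySpace", ("rdfs:comment", "This is controversial")),
]
spaces = group_by_space(example)
```

Here the annotation without a spaceUri ends up under "Default", alongside separate entries for ex:Prot4 and ex:mySpace.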

Each space can have its own logic or processing semantics. The default space is interpreted as an OWL ontology.

If a space is labeled "mayIgnore" then the space is interpreted entirely in isolation from the domain level axioms regardless of the associated semantics. Thus, an OWL reasoner trying to find entailments of the domain is free to throw these away.

If a space is marked "mustUnderstand" then an OWL processor must not give any entailments of the domain model without it "knowing" the effect of the annotation (although a tool could choose to override this). The semantics of the combined space/ontology are determined by an external spec.

There are clearly issues if you have multiple mustUnderstands. The conservative view is that if there are several, the processor must specifically understand that combination. Different spaces could be specified so that they work in isolation (i.e., every other mustUnderstand is free to ignore them).
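A minimal Python sketch of the conservative view, assuming a processor keeps an explicit list of the mustUnderstand combinations it supports. The names `Status`, `may_compute_entailments`, and the example space URIs are hypothetical, and the tool-level override mentioned above is not modeled.

```python
from enum import Enum

class Status(Enum):
    MUST_UNDERSTAND = "mustUnderstand"
    MAY_IGNORE = "mayIgnore"

def may_compute_entailments(declared_spaces, understood_combinations):
    """Decide whether a processor may report entailments of the domain model.

    declared_spaces maps each space URI to its Status.
    understood_combinations is a set of frozensets; each frozenset is a
    combination of mustUnderstand spaces the processor knows how to handle.
    mayIgnore spaces never block reasoning, since they can be thrown away.
    """
    required = frozenset(
        uri for uri, status in declared_spaces.items()
        if status is Status.MUST_UNDERSTAND
    )
    if not required:
        return True
    # Conservative view: the exact combination must be understood.
    return required in understood_combinations

# Hypothetical spaces for illustration only.
declared = {
    "ex:ontoclean": Status.MAY_IGNORE,
    "ex:probability": Status.MUST_UNDERSTAND,
    "ex:constraints": Status.MUST_UNDERSTAND,
}
```

A processor that understands only ex:probability on its own would refuse to reason over `declared`, while one that lists the pair ex:probability plus ex:constraints would proceed.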

RDF Serialization

<http://www.ex.org/owl> rdf:type owl:Ontology;
      hasAnnotationSpace ex:mySpace;
      hasAnnotationSpace ex:Prot4 .

Examples

Simple Syntax Example

Ontology(<http://www.ex.org/owl>
   //The spaces are all default and don't import anything
   AnnotationSpace(ex:mySpace)
   AnnotationSpace(ex:Prot4)

   SubClassOf(Annotation(ex:Prot4 dc:creator "Bijan Parsia") Student Person)

   SubClassOf(Annotations(ex:Prot4 
                             DataPropertyAssertion(dc:creator SELF "Bijan Parsia")
                             DataPropertyAssertion(dc:date-modified SELF "2008-07-03"))
       Person owl:Thing)

   //This goes into the default space and acts pretty much like a normal OWL2 annotation
   //The main difference is that I can import stuff into the default space.
   SubClassOf(Annotation(rdfs:comment "This is required") Employee Person)

   SubClassOf(Annotation(ex:mySpace rdfs:comment "This is controversial") AI Person)
   
)

Probabilistic extension

In Pronto (a version of Pellet implementing probabilistic reasoning), we used axiom annotations to extend the language. In essence we overloaded SubClassOf to express so-called "conditional constraints", that is, the probability of an instance of one class, C, also being an instance of another class, D. If the probability is 1, we have (ehhh...roughly) a normal SubClassOf. However, if it is not 1, then you cannot ignore the annotation. So this is a mustUnderstand.

Why do this? Well, otherwise, we have to introduce a whole new sort of axiom and thus get all tools to rev. Perhaps this isn't hugely common, but it doesn't seem nuts.

Constraints

Boris Motik et al. have proposed a way of interpreting TBox axioms as integrity constraints, that is, with the closed world assumption rather than the open world assumption. This would lead to a mix of axioms in the ontology. One way to achieve this is to annotate the constraints with a mustUnderstand marker that indicates that they should be understood as integrity constraints.

OntoClean

OntoClean is an ontology development technique which involves annotating classes with so-called meta-properties. For example, a class is Rigid if members of that class are essentially members of that class. The classic example is being a student vs. being a human being. Pretty clearly, I can become a student, or stop being a student, without changing who I am (ok, not me, Bijan, personally, but most people). However, if you stop me from being a Person you've destroyed me (consider the most obvious way of doing this...killing me....or someone else). So Person is rigid and Student is not. In fact, Student is anti-rigid...it's not essential to being anyone that they are a student (except me). Annotating classes with rigidity status imposes constraints: e.g., an anti-rigid class cannot subsume a rigid one (the other way round is fine). A violation would be ontocleanically inconsistent.

However, we might want to check these constraints separately. E.g., it might be that an anti-rigid class subsumes a rigid class because of a mistake in my domain modeling or because of a mistake in my OntoClean tagging. If OntoClean inconsistency makes my whole ontology inconsistent, that makes it harder to spot and work with domain issues. Similarly, I don't want OntoClean inconsistency affecting any entailments...OntoClean is for the developer, not the deployment system. So, OntoClean should get its own mayIgnore space. (I have implemented a version like this, but not yet released it.) That space should import the OntoClean ontology. We also need a projection function that tells us what to reflect out from the domain.
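To make the separation concrete, here is a small Python sketch of checking the rigidity constraint outside the domain reasoner. It is not the unreleased implementation mentioned above. It assumes the projection function has already produced the (entailed) subsumptions as (sub, super) pairs, and the function name and the "rigid" / "anti-rigid" tag strings are hypothetical.

```python
def ontoclean_violations(subsumptions, rigidity):
    """Return the subsumptions that break the rigidity constraint.

    subsumptions is an iterable of (sub, super) class-name pairs projected
    out of the domain model.
    rigidity maps class names to "rigid" or "anti-rigid"; untagged classes
    are simply skipped.
    A pair is a violation when an anti-rigid class subsumes a rigid one.
    """
    return [
        (sub, sup)
        for sub, sup in subsumptions
        if rigidity.get(sup) == "anti-rigid" and rigidity.get(sub) == "rigid"
    ]

# Tags from the running example: Person is rigid, Student is anti-rigid.
tags = {"Person": "rigid", "Student": "anti-rigid"}
ok = ontoclean_violations([("Student", "Person")], tags)
bad = ontoclean_violations([("Person", "Student")], tags)
```

Because the check only reads the projected subsumptions and the tags, a violation is reported to the developer without making the domain ontology itself inconsistent.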


Optional Restrictions

We can even use annotation spaces as a solution to Issue-85, explained in Alan's email.

It seems as if various applications require the specification that, for a class C, it is optional to have successors of property P that are instances of a class D. For example, for Wheels, we could say that it is optional to be partOf Car. This would not have any consequences - but a luxurious reasoner could test whether

Wheels And SomeValuesFrom partOf Car

is satisfiable w.r.t. our ontology and, if not, raise a warning.

We could use an annotation space http://@@@/@optional with the status mayIgnore.
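The warning behaviour described above could look roughly like the following Python sketch. Everything here is an assumption for illustration: `optional_restriction_warnings` is a hypothetical helper, the class expression is built as a plain string, and `is_satisfiable` stands in for whatever satisfiability call a real reasoner offers.

```python
def optional_restriction_warnings(optional_restrictions, is_satisfiable):
    """Check optional restrictions and collect warnings.

    optional_restrictions is an iterable of (cls, prop, filler) triples
    read from the optional-restriction annotation space.
    is_satisfiable is a caller-supplied function taking a class expression
    string and returning True or False; it is a stand-in for a reasoner.
    """
    warnings = []
    for cls, prop, filler in optional_restrictions:
        expression = f"{cls} And SomeValuesFrom {prop} {filler}"
        if not is_satisfiable(expression):
            warnings.append(f"Optional restriction is unsatisfiable: {expression}")
    return warnings

# Stub reasoner for demonstration: pretends only Car fillers are satisfiable.
def stub_reasoner(expression):
    return expression.endswith("Car")

result = optional_restriction_warnings(
    [("Wheels", "partOf", "Car"), ("Wheels", "partOf", "Boat")],
    stub_reasoner,
)
```

Since the space is mayIgnore, a reasoner that skips this check still produces the same entailments; the warnings are purely advisory.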

Mapping cues

Experience from e.g. the OAEI contests (http://oaei.ontologymatching.org/2007/) has shown that ontology mapping systems often fail on discrepancies that are not due to inherently different conceptualisations but due to different logical patterns (including those listed on http://www.w3.org/2001/sw/BestPractices/OEP/) used when adapting the conceptualisations to the restricted language. Let me present two such cases: reification of n-ary relations and handling of concept partitions. To illustrate them, I will borrow from the 'conference organisation' domain as one of those examined within OAEI (http://nb.vse.cz/~svabo/oaei2007/).

1) A reviewer can submit a review of a paper. There is inherently a ternary relationship between the reviewer, review and paper. The modeller of ontology A may use the SWBPD n-ary relation pattern (http://www.w3.org/TR/swbp-n-aryRelations/): creating a concept such as ReviewSubmission, with properties reviewSubmitted, submittedBy and reviewSubmissionForPaper. The modeller of ontology B may, however, prefer a simpler (though lossy) solution: creating two (or more) properties expressing alternative variants of review, e.g. submittedPositiveReviewForPaper and submittedNegativeReviewForPaper. If the ReviewSubmission concept were annotated as 'reified relation' in ontology A, the task for a (pattern-aware) mapping system aligning A with B would be easier, and the so-called heterogeneous mapping would be established in this case.

2) Submitted papers can be under review, accepted or rejected. At the same time, papers can be submitted as e.g. full papers, position papers or poster papers. In ontology A, this can be done by class SubmittedPaper having disjoint partition subclasses such as PaperUnderReview, AcceptedPaper and RejectedPaper, as well as FullPaper, PositionPaper and PosterPaper. In ontology B, the same can be done (though possibly not in line with 'best practice') using two data properties for the SubmittedPaper class: phase and category. If the two disjoint partition axioms in ontology A were annotated as (informally) 'criterion=phase' and 'criterion=category', respectively, the task for a (pattern-aware) mapping system aligning A with B would be easier, again yielding a heterogeneous mapping (which would not be so obvious otherwise).

Such types of distinctions would generally pertain to the mayIgnore space. On the other hand, they are not merely an issue for the ontology developer, but also for reasoning tools, although different from mainstream DL reasoners: compared to the situation ten years ago, ontology mapping has become an equally recognised (though not so stable) sphere of semweb reasoning as traditional entailment from a single ontology.

From the point of view of language parsimony, alternative solutions for this class of problems could possibly be formulated, e.g., for the first, creating a special ontology with a class such as ReifiedRelation, importing this ontology and subclassing the ReviewSubmission class to it (thanks to Peter P.-S. for pointing that out). However, this would mean introducing an ontological concept for a notion that is merely anchored in the language (here, OWL as a language lacking n-ary relation constructs) and has no real-world semantics.

In summary, the point of this kind of annotation is 1) to help the modeller encode into the formal ontology what they often have in the informal one and what would otherwise be lost (possibly using things like Protégé pattern wizards, i.e. without much additional effort or background), and 2) to provide, on the other side of the chain, formal knowledge that relevant tools can use (though without interfering with standard entailment semantics).