[TF:DbE] The easiest keys there are

Hi folks,

The OWLED task force on DatabasEsque features:
	http://code.google.com/p/owl1-1/wiki/DatabasEsque

Well, at least Uli and me, have been doing a bit of work on keys  
(aka, inverseFunctional datatype properties) prompted by a visit to  
Manchester by Matthew Pocock. Some sort of keys is a pretty high  
value feature. However, if you check out this poster:
	http://webont.org/owled/taskforces/dbe/keys_poster.pdf

(Bit of explanation: Bits in the tan clouds correspond to the  
asserted parts of the kb. So you see the named entities, e.g., m1 or  
s1, in there. s1, for example, is known to have a key, but we don't  
know what the key is.)

===SEMANTICS===

You'll see that keys in a *general* sense can be tricky to combine  
with OWL. One way of getting an intuitive grasp on this, perhaps, is  
to consider that, in full generality, OWL let us reason about the  
compatibility of key constraint assertions. So, for example, if I  
have a database schema that declares a key to be a 6 digit string and  
I try to align it with a schema that has a key with only 5 digits you  
can see that there is a problem. We can have more distinct  
individuals in the first schema than in the second. This can be a  
very useful thing to learn, but it is a relatively difficult thing  
for a theorem prover to work with.

However, this is, as far as we can tell, a comparatively rare case.  
The far more common --- and desperately needed --- case is for  
working with keys in the data. So, roughly, key entailments should  
only be enforced on *named* entities, and perhaps only when  
considering fairly explicit assertions.

What might we reasonably expect from keys in OWL, then? Suppose we  
say that that the hasID property is a key for individuals who are  
persons and that the keys are positive integers. We might expect:

	1) That if there is a statement :bob rdf:type :Person, then there  
will be a statement :bob hasID x, where x is a specific integer.
	2) If :bob hasID x and :mary hasID x then :bob sameAs :mary.

The first is actually pretty durn tricky since it involves non-first  
order reasoning. Let's drop that for the moment. The latter seems  
similar to a DL Safe rule. E.g.,

	Person(X), Person(Z), hasID(X, Y), hasID(Z,Y) => X=Z

Now we're a little unclear about the safety restriction on Y, in  
general. But let's leave that aside.

Note that I did relativize the key to Persons. So other things could  
have IDs without being merged.

It'd be nice if we didn't need a full DL Safe rule implementation for  
this. Ideally, you'd have a bit of code that scraped through your  
ABox looking for keys and performing merges as necessary and then  
rechecking entailments. Unfortunately, nominals and datatype  
expressions can make this tricky. For example, you could have a class  
expression that said that something hasValue on hasID to be 17. Or a  
set of inequalities could force you to a single value. Reasoners  
don't necessarily generate all these sorts of entailments, so you'd  
have to check for them manually, which starts to get ugly. Query  
engines might help out here.

(Things start to get complicated because you might have to try  
merging a lot of different individuals to see if you can force a  
particular value.)

Alternatively, you could restrict your key properties to *not* be  
used in any class expressions anywhere. Then there would be  
(probably) no scary entailments to worry about. I hope :)

===SYNTAX===

Given that we're thinking of a rather restricted (if common!) case  
with lots of somewhat contorted constraints, it perhaps is better to  
introduce a special syntax for them. An implementation would be free  
to translate these to DL Safe rules or whatever and the syntactic  
checks would be easier.

So, obvious questions:

1) Does this very restricted semantics meet serious needs? I.e., who  
out there would be made happier by having such?
2) Is having special syntax palatable?
3) Is recognizing that you can hack a lot of this with DL Safe rules  
mean you'd rather just do that?

If people think that some approximation of what I discussed in this  
note would be worth having, we can move toward fleshing out the  
proposal and getting some implementation going.

Cheers,
Bijan.

Received on Friday, 28 September 2007 17:49:49 UTC