User defined datatypes & XML Schema WG

This discharges what I thought was an action item, but may have
just been a request :)

A datatype is a unary data predicate. That is, it is a  
description of a set of values. Integers, strings, integers less than  
10 and greater than 5, strings matching "aa*" are all examples of  
datatypes.
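
To make the "datatype as unary predicate" reading concrete, here is a
minimal Python sketch of the four examples above. The function names are
made up for illustration, Python's int and str stand in for the value
spaces, and "aa*" is read as a regular expression; none of this is taken
from the OWL or XML Schema specs.

```python
import re

# Each datatype is modeled as a unary predicate: value -> bool.

def is_integer(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)

def is_string(v):
    return isinstance(v, str)

def integer_between_5_and_10(v):
    # "integers less than 10 and greater than 5" (both bounds exclusive).
    return is_integer(v) and 5 < v < 10

def matches_aa_star(v):
    # "aa*" as a regex: one 'a' followed by zero or more further 'a's.
    return is_string(v) and re.fullmatch(r"aa*", v) is not None
```

Each function describes a set of values: the values for which it returns
True.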

OWL 1.0 has two popular built-in datatypes: xsd:string and
xsd:integer. It also builds in a number of other xsd types; see:

	http://www.w3.org/TR/owl-semantics/syntax.html#2.1

I thought the first two were required and the others optional, but
that doesn't seem to be the case. The first two do show up explicitly
in the semantics.

In principle, OWL 1.0 has an extensible set of datatypes, but section  
2.1 of S&AS points out:
	"""Because there is no standard way to go from a URI reference to an  
XML Schema datatype in an XML Schema, there is no standard way to use  
user-defined XML Schema datatypes in OWL."""

There is a working group note addressing this lack:
	http://www.w3.org/TR/swbp-xsch-datatypes/

Pellet supports (and has for some time) the DAML+OIL solution:
	http://www.w3.org/TR/2005/WD-swbp-xsch-datatypes-20050427/#sec-daml-soln

One point not addressed by this extensibility mechanism is inline  
anonymous datatype expressions. You have to name all the datatypes  
you will use, which can be inconvenient. (My personal experience of  
this was trying to code some features of event streams from video,  
where events were individuals within certain intervals, with  
interval boundaries coded as datatypes.)

OWL 1.1 allows user-defined datatypes using XML Schema facet
restrictions on the built-in simple datatypes (though I do not find
the OWL 1.0 list replicated there; it should be):
	http://www.webont.org/owl/1.1/owl_specification.html#4.3
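
Roughly, a facet restriction takes a base datatype and narrows it. A
hedged Python sketch of that idea follows. The names restrict,
BASE_CHECKS and FACET_CHECKS are hypothetical, only a handful of the
XML Schema facets are modeled, and Python's re syntax stands in for the
XML Schema regular expression dialect, which differs in details.

```python
import re

# Hypothetical membership checks for two base datatypes.
BASE_CHECKS = {
    "xsd:integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "xsd:string": lambda v: isinstance(v, str),
}

# A few XML Schema facet names, each mapped to a check on (value, argument).
FACET_CHECKS = {
    "minInclusive": lambda v, bound: v >= bound,
    "minExclusive": lambda v, bound: v > bound,
    "maxInclusive": lambda v, bound: v <= bound,
    "maxExclusive": lambda v, bound: v < bound,
    "pattern": lambda v, regex: re.fullmatch(regex, v) is not None,
}

def restrict(base, **facets):
    """Return a predicate for `base` narrowed by the given facets."""
    unknown = set(facets) - set(FACET_CHECKS)
    if unknown:
        raise ValueError(f"unsupported facets: {sorted(unknown)}")
    base_check = BASE_CHECKS[base]

    def predicate(v):
        # The base check runs first, so facet checks only see values
        # that already belong to the base datatype.
        return base_check(v) and all(
            FACET_CHECKS[name](v, arg) for name, arg in facets.items()
        )

    return predicate

# "integers less than 10 and greater than 5"
between_5_and_10 = restrict("xsd:integer", minExclusive=5, maxExclusive=10)
```

Note that the result of restrict(...) can be used in place without ever
being given a name, which is the inline anonymous case that a purely
name-based mechanism does not cover.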

The Protege 3.x series OWL plugin supported this sort of thing in
OWL/RDF as an extension (which is how it got into OWL 1.1).

This solution does not say how to use user-defined datatypes from XSD
files and, in the current serializations, it does not use the XML
serialization supplied by the XML Schema working group.

So, things we need to decide:
	1) What datatypes are built in and what constructors are allowed
	2) What syntax(es) to support for "inline" user defined datatypes
	3) What naming mechanism for externally defined datatypes to support

Constraints:
	A) We need to make sure XML Schema WG is cool with what we do, but  
we shouldn't allow them to block us doing SOMETHING
	B) What we do must be reasonably implementable. I would support  
making some base set of datatypes required

Objectives:
	i) It would be good if we had a standard *effective* mechanism for  
extending the base types. It can be a registry, for all I care.

Note that many of these issues arise for n-ary data predicates as
well. At the moment, OWL 1.1 has no built-in ones. (This is distinct
from, e.g., the HP worries about n-ary data predicates.)

Cheers,
Bijan.

Received on Sunday, 4 November 2007 10:24:40 UTC