LC3 Responses/JH1

Revision as of 13:29, 4 September 2009 by IanHorrocks (Talk | contribs)

To: heflin@cse.lehigh.edu
CC: public-owl-comments@w3.org
Subject: [LC response] To Jeff Heflin

Dear Jeff,

Thank you for your comment
     <http://lists.w3.org/Archives/Public/public-owl-comments/2009Aug/0036.html>
on the OWL 2 Web Ontology Language last call drafts.

Regarding deprecation, "essentially the same" refers to the fact that, as mentioned in the Quick Reference Guide [1], <anyIRI owl:deprecated "true"^^xsd:boolean> is equivalent to <someClass rdf:type owl:DeprecatedClass>. As for the mapping, according to the OWL S&AS [5], each triple of the form

 classID rdf:type owl:DeprecatedClass .

must be accompanied by a triple of the form

 classID rdf:type owl:Class .

The situation is similar for deprecated properties. Consequently, the OWL 2 mapping from RDF Graphs to the Structural Specification [6] is backwards compatible with OWL.
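
As a rough illustration of the pairing requirement above, the following sketch scans a set of triples for deprecated-class typings that lack the accompanying owl:Class typing. The representation of triples as plain tuples of strings, the function name, and the ex: identifiers are assumptions made for this example; they are not taken from any OWL specification or library.

```python
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"
OWL_DEPRECATED_CLASS = "owl:DeprecatedClass"


def missing_class_typings(triples):
    """Return subjects typed owl:DeprecatedClass but not also owl:Class."""
    triples = set(triples)
    deprecated = {
        s for (s, p, o) in triples
        if p == RDF_TYPE and o == OWL_DEPRECATED_CLASS
    }
    return sorted(
        s for s in deprecated
        if (s, RDF_TYPE, OWL_CLASS) not in triples
    )


# Hypothetical graph: ex:OldCar is correctly paired, ex:OldBoat is not.
graph = [
    ("ex:OldCar", RDF_TYPE, OWL_DEPRECATED_CLASS),
    ("ex:OldCar", RDF_TYPE, OWL_CLASS),
    ("ex:OldBoat", RDF_TYPE, OWL_DEPRECATED_CLASS),
]
print(missing_class_typings(graph))
```

The same check would apply to deprecated properties by swapping in the corresponding property vocabulary.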


Regarding profiles, the QL profile in particular aims at providing performance that is comparable to the performance of database systems. Achieving this in practice will require some care in the design of both query answering systems and the applications that use them. In this respect we do not believe that OWL QL is significantly different to relational database systems, where good performance at large scale also requires major system engineering efforts as well as care in the design of the schema. Some preliminary but reasonably encouraging evaluations of query answering performance using DL-Lite can be found in [2], [7].

As far as consistency checking is concerned, in QL this can be implemented by a union of queries that is worst-case polynomial in the size of the schema (TBox), where each query is rather simple, and where the overall size of the query depends on the position and number of negations (see [3] page 405). When checking in the presence of updates, it is even simpler as one can ground parts of the query and also determine which disjuncts are relevant.
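
To make the "union of queries" idea more concrete, here is a minimal Python sketch. It is a deliberate simplification: it assumes a TBox with only atomic subclass and disjointness axioms (the construction in [3] covers more, for example axioms involving roles) and an ABox stored as a dictionary from class names to asserted individuals. All function names and the example ontology are hypothetical.

```python
from itertools import product


def subclasses_of(cls, subclass_axioms):
    """All classes entailed to be subclasses of cls (reflexive, transitive)."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_axioms:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def violation_queries(subclass_axioms, disjoint_axioms):
    """Rewrite each disjointness axiom into a union of class pairs; each
    pair {A, B} stands for the boolean query 'exists x . A(x) and B(x)'."""
    queries = set()
    for a, b in disjoint_axioms:
        for sub_a, sub_b in product(subclasses_of(a, subclass_axioms),
                                    subclasses_of(b, subclass_axioms)):
            queries.add(frozenset((sub_a, sub_b)))
    return queries


def is_consistent(abox, queries):
    """Whole-KB check: no disjunct of the union may have an answer."""
    for q in queries:
        members = [abox.get(c, set()) for c in q]
        if set.intersection(*members):
            return False
    return True


def insert_is_safe(abox, queries, individual, cls):
    """Check for an update adding cls(individual): x is grounded to the
    individual, and only disjuncts that mention cls are evaluated."""
    for q in queries:
        if cls not in q:
            continue
        others = q - {cls}
        if all(individual in abox.get(c, set()) for c in others):
            return False
    return True


# Hypothetical schema and data.
subclass_axioms = [("Student", "Person"), ("University", "Organization")]
disjoint_axioms = [("Person", "Organization")]
queries = violation_queries(subclass_axioms, disjoint_axioms)
abox = {"Student": {"alice"}, "University": {"lehigh"}}

print(is_consistent(abox, queries))
print(insert_is_safe(abox, queries, "alice", "University"))
print(insert_is_safe(abox, queries, "bob", "University"))
```

In this sketch the rewriting is computed once from the schema, so the per-update check touches only the disjuncts that mention the inserted class and looks up a single individual in each.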


Regarding arithmetic operations, I apologize for having omitted the pointer -- the latest version of the proposed Working Group Note can be found at [4].


[1] http://www.w3.org/2007/OWL/wiki/Quick_Reference_Guide#Additional_Vocabulary_in_OWL_2_RDF_Syntax

[2] http://www.abdn.ac.uk/~csc280/PaTh07.pdf

[3] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning, 39(3):385–429, 2007.

[4] http://www.w3.org/2007/OWL/wiki/Data_Range_Extension:_Linear_Equations

[5] http://www.w3.org/TR/owl-semantics/mapping.html

[6] http://www.w3.org/2007/OWL/wiki/Mapping_to_RDF_Graphs#Mapping_from_RDF_Graphs_to_the_Structural_Specification

[7] http://www.inf.unibz.it/%7ecalvanese/papers/calv-etal-SDKB-2008.pdf

Please acknowledge receipt of this email to <mailto:public-owl-comments@w3.org> (replying to this email should suffice). In your acknowledgment please let us know whether or not you are satisfied with the working group's response to your comment.

Regards,
Ian Horrocks
on behalf of the W3C OWL Working Group



CUT AND PASTE THE BODY OF THE MESSAGE (I.E. FROM "Dear" TO "Group") INTO THE BODY OF AN EMAIL MESSAGE. SET THE To:, CC:, AND Subject: LINES ACCORDINGLY.

PLEASE TRY TO REPLY IN A WAY THAT WILL ALLOW THREADING TO WORK APPROPRIATELY, I.E., SO THAT YOUR REPLY CONTINUES THE THREAD STARTED BY THE ORIGINAL COMMENT EMAIL


Regarding deprecation:

You say "essentially" the same. How is it different? I pointed out that 
the OWL 1.0 documents gave some discussion of the intention of 
deprecation; which document will contain these points? My original 
e-mail also pointed out an error in the "Mapping to RDF Graphs" doc. 
Does the group agree, and if so, has the fix been incorporated into the 
working version?

Regarding profiles:

First, although disjointness is a common modeling primitive, it is not 
an operation typically performed in databases (yes, you can set up 
triggers to enforce a disjointness constraint, but this can be very 
expensive). Although OWL QL has good theoretical complexity results, 
this does not always lead to systems that perform well. You mention it 
scales to millions of triples (BTW, I'd appreciate some references to this 
work), but I said that people who are interested in scalability are now 
dealing with billions of triples. And if we want to be taken seriously 
by the database community, we need to set our sights on trillions of 
triples in petabyte sized knowledge bases! After all, this is part of 
the Semantic Web effort; it's not just an XML syntax for DL.

You say that consistency checking is relatively easy and need only be 
performed once per ontology. However, in a pragmatic setting data will 
be added every day, if not every hour or even second. Although the T-box 
is consistent, inconsistencies can arise from this ever-evolving A-box. 
Every time we insert a new rdf:type triple, we need to check to see if 
it violates a disjointness constraint. Given that triple tables are 
widely viewed as not scalable (since joining a 1 billion row table with 
itself n times to answer a query with n conjuncts is a bad idea), this 
might mean checking many database tables to see if the subject appears 
in any of them, and this can add up to a significant performance hit. 
Although each of these checks will be less than polynomial in the size 
of the table (assuming you use an index), it takes up valuable cycles 
that will increasingly slow loading as the KB gets larger and larger. 
Alternatively, you could check consistency of the entire KB right before 
each query, but this will involve doing set difference on many large 
tables. Queries will be as slow as molasses.

I think this focus on theoretical properties and omission of pragmatic 
considerations is a symptom of the OWL2 effort being dominated by KR 
academics (not that I have anything against academics, being one 
myself). Clearly, academics should play a role in precisely defining the 
language and in keeping the scope of the effort away from the 
impossible, but there should also be a healthy balance of industry 
personnel and their opinions should be given significant weight. The 
original WebOnt WG had a heavy academic bent, in part because the idea 
was so new. However, since the SemWeb is gaining in usage, I would think 
the trend should be more industry influence on the WG, not less.

Regarding arithmetic operations:

Thanks, I don't expect this feature in OWL 2. However, it is a 
requirement from real users and I think some sort of 
arithmetic will eventually be necessary if OWL is to be the language of 
the Semantic Web. I'd appreciate a link to the proposed extension, if 
you don't mind.


To: heflin@cse.lehigh.edu
CC: public-owl-comments@w3.org
Subject: [LC response] To Jeff Heflin

Dear Jeff,

Thank you for your comment
     <http://lists.w3.org/Archives/Public/public-owl-comments/2009Jul/0014.html>
on the OWL 2 Web Ontology Language last call drafts.

Regarding imports, as you can see from [1] the direct semantics of OWL 2 ontologies is explicitly applied to the axiom closure (with a link back to the definition of axiom closure in Syntax [2]).

Regarding deprecation, we introduced owl:deprecated mainly for backward compatibility in order to capture the deprecated classes of OWL 1. Thus, the capabilities of OWL 2 regarding deprecation are essentially the same as those of OWL 1.

Regarding profiles, the current design is the result of long and careful analysis both within and without the working group. It is true that a consistency check is typically required as part of query answering, but this is true even if the language includes only disjointness, which is a basic feature of conceptual modelling languages; moreover, consistency checking is relatively easy in the profiles (see [2]), and only needs to be performed once for a given ontology. The QL profile has been designed so that query answering has the same complexity as for relational databases, so there is no reason why QL systems should not be just as scalable as relational database systems; tests show that existing implementations can easily deal with data in the order of millions of triples.

Regarding arithmetic operations, the working group has specified an extension that allows for linear (in)equations with rational coefficients; this was not made part of the basic specification as it would place a heavy burden on implementers.
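
As a small illustration of what a linear equation with rational coefficients can express, the sketch below checks the relationship between the lengthInMeters and lengthInFeet properties mentioned in the original comment. It is plain Python using exact rational arithmetic and does not use or reproduce the syntax of the proposed extension; the function name is an assumption made for this example.

```python
from fractions import Fraction

# One foot is exactly 0.3048 m, i.e. 381/1250 m as a rational number.
METERS_PER_FOOT = Fraction(381, 1250)


def satisfies_length_equation(length_in_meters, length_in_feet):
    """Check lengthInMeters - (381/1250) * lengthInFeet = 0 exactly."""
    lhs = Fraction(length_in_meters) - METERS_PER_FOOT * Fraction(length_in_feet)
    return lhs == 0


# 10 ft is exactly 381/125 m (3.048 m).
print(satisfies_length_equation(Fraction(381, 125), 10))
print(satisfies_length_equation(3, 10))
```

Using rationals rather than floating-point numbers keeps the equality test exact, which is one reason a restriction to rational coefficients is convenient.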

[1] http://www.w3.org/2007/OWL/wiki/Direct_Semantics#Ontologies
[2] http://www.w3.org/2007/OWL/wiki/Profiles#Computational_Properties

Please acknowledge receipt of this email to <mailto:public-owl-comments@w3.org> (replying to this email should suffice). In your acknowledgment please let us know whether or not you are satisfied with the working group's response to your comment.

Regards,
Ian Horrocks
on behalf of the W3C OWL Working Group





As a member of the original Web Ontology Working Group, I feel some 
responsibility to comment on OWL 2. First, let me preface that I had my 
doubts that now was an appropriate time to revise OWL. That said, I was 
pleasantly surprised to see that OWL 2 includes some useful features 
without doing any significant damage. I know how hard reaching agreement 
is, so I congratulate the working group on what they have achieved so far.

I will start with what I think are the most positive features of OWL 2:

- property chains: this is something that was in the Requirements 
document of OWL, but never made it into the language. This provides OWL 
with a significant improvement in defining the semantics of properties 
(which it was very weak on in the past). I consider this the single most 
important new feature of the language.

- the mapping from RDF graphs to the functional syntax. This is 
something I argued for in the first WG, but never got.

- the additional syntactic sugar is helpful

Now let me say a few things that I was either disappointed with or that 
I thought could be expressed better in the documents.

- I find it strange that imports is not mentioned at all in the Direct 
Semantics. I think it should be explicit that the semantics should be 
applied to the axiom closure of the ontology. I am aware that Sect. 3.6 
of the Structural Specification provides a canonical parsing of 
ontologies, and that this involves expanding the ontology with all 
declarations in imported ontologies; but I think the Direct Semantics 
could be clarified by expressly mentioning that this parsing is required 
to determine the semantics of an ontology.

- Deprecation. I think the introduction of the owl:deprecated annotation 
property is nice, but it almost seems like an afterthought. I found very 
little discussion of the intent and the purpose of the property. I know 
it has no formal syntax, but at least the original OWL specs explained 
what deprecation was used for. It should also be made clear in some 
document that DeprecatedClass is a class iri with owl:deprecated=true 
(and likewise for DeprecatedProperty). In the "Mapping to RDF Graphs" 
doc, it is shown that the graph containing DeprecatedClass and 
DeprecatedProperty gets translated to this annotation property, but this 
is a bit hidden. Note, I think the mapping should also include 
"Declaration( Class (*:x))" (or the property version) in order to be 
fully backward compatible with OWL. In general, I think versioning could 
be better explained in the documents, but perhaps the Primer would be 
the right place for such exposition.

- Profiles. I don't agree with the choice of features to include in the 
profiles. One particular example is I don't think negation should be in 
OWL QL and OWL RL, since these profiles are intended for databases 
and/or rule-based systems. Because these systems make the closed-world 
assumption, handling negation in them correctly is not straightforward. 
In particular, it seems that a consistency check would be required to 
see if the logic is trivialized and that everything might be entailed.

I also question the choice of basing OWL QL, the profile intended for 
large numbers of instances, on DL-Lite. I know there are a few 
implementations of OWL QL, but do any of them scale to 5 million triples 
or more? Work on scalable systems is now focusing on billions of 
triples, and yes some of these systems do much more than RDFS reasoning 
(e.g. OWLIM). Perhaps the QL profile should focus on the commonalities 
between some truly scalable systems? I'd like to note that property 
chains are an incredibly important feature in OWL, and that as long as 
there are no cycles, are also incredibly easy to implement in databases. 
These should be in OWL QL no matter what.
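
The remark about acyclic property chains in databases can be pictured as a single join. The sketch below is a minimal illustration, assuming each property is stored as a set of (subject, object) pairs; the property names, the individuals, and the function name are hypothetical.

```python
def apply_chain(first, second):
    """Join two binary relations: yield (x, z) whenever first(x, y) and
    second(y, z) hold for some y."""
    by_subject = {}
    for y, z in second:
        by_subject.setdefault(y, set()).add(z)
    return {(x, z) for x, y in first for z in by_subject.get(y, ())}


# Hypothetical chain: hasParent followed by hasBrother implies hasUncle.
has_parent = {("carol", "alice"), ("dave", "alice")}
has_brother = {("alice", "bob")}
has_uncle = apply_chain(has_parent, has_brother)
print(sorted(has_uncle))
```

Because the chain in this example does not feed back into the properties it reads from, one pass suffices; a cyclic chain would instead need iteration to a fixpoint.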

- Some important features that were identified by the OWL Requirements 
document and that still haven't made it in OWL 2 are the ability to 
perform arithmetic computations (e.g., to relate a property 
lengthInMeters to the property lengthInFeet) and to do string 
manipulations (e.g., to say that the name property is equivalent to the 
concatenation of the firstName and lastName properties). These types of 
heterogeneity are extremely common on the Web, and are thus critical for 
any practical Web ontology language.

Thanks for your hard work, and I hope you find my comments helpful. I 
look forward to your response.