Comments on SPARQL from the XML Query and the XSL WGs

Notes on SPARQL Query Language for RDF
Last Call Draft July 21, 2005  

The XML Query and the XSL WGs have reviewed the SPARQL last call draft.  Our comments are below.  We apologize for the late delivery of these comments.  By way of explanation we mention that the WGs were on summer vacation during the month of August.

1. If RDF triples are stored in a relational database (in-memory databases are available now), then SQL can be used to query them.  SQL has much more power and is a well-known language, but there are a couple of features peculiar to RDF that are unique to SPARQL, such as the OPTIONAL clause.  An analysis of how much overlap there is between SQL and SPARQL would be useful.
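
As an illustration of the kind of overlap meant here, the following is a minimal sketch, not part of the original comments.  It assumes a hypothetical three-column table named triples (s, p, o) in an in-memory SQLite database, with made-up data and prefixed names stored as plain strings.  For this simple case a LEFT JOIN behaves roughly like OPTIONAL; this is not a claim that the two coincide in general.

```python
import sqlite3

# Hypothetical schema: one row per RDF triple, every term stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
        ("ex:bob", "foaf:name", "Bob"),  # Bob has no foaf:mbox triple
    ],
)

# Rough SQL analogue of the SPARQL query
#   SELECT ?name ?mbox
#   WHERE { ?x foaf:name ?name . OPTIONAL { ?x foaf:mbox ?mbox } }
rows = conn.execute(
    """
    SELECT n.o AS name, m.o AS mbox
    FROM triples AS n
    LEFT JOIN triples AS m
      ON m.s = n.s AND m.p = 'foaf:mbox'
    WHERE n.p = 'foaf:name'
    ORDER BY n.o
    """
).fetchall()

for name, mbox in rows:
    # A subject without an mbox still yields a row; mbox is None (SQL NULL).
    print(name, mbox)
```

Each additional triple pattern costs one more self-join on the table, which is one place where a careful SQL/SPARQL comparison would be informative.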

2. SPARQL cannot return a function of a stored value.  For example, if a website stores the profit of a corporation, SPARQL cannot be used to return the profit multiplied by a currency conversion factor, if a currency is specified.  This seems like a serious limitation.
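
For comparison, the sketch below shows the kind of computed result meant here, written in SQL over a hypothetical triples table in SQLite.  The table layout, the ex:acme and ex:profit names, the stored value, and the converted_profit helper are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical store: the profit is kept as the object of an ex:profit triple.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.execute("INSERT INTO triples VALUES ('ex:acme', 'ex:profit', '1000')")


def converted_profit(connection, subject, rate):
    """Return the stored profit multiplied by a caller-supplied conversion rate."""
    row = connection.execute(
        # The multiplication is done by the query itself, not by the caller.
        "SELECT CAST(o AS REAL) * ? FROM triples WHERE s = ? AND p = 'ex:profit'",
        (rate, subject),
    ).fetchone()
    return None if row is None else row[0]


print(converted_profit(conn, "ex:acme", 0.5))
```

The point is only that the SELECT list contains an expression rather than a bare column.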

3. The document uses a describe-by-examples style.  This is easy to understand but can give the impression that the language is more restrictive than it is.  For example, if you read section 2 you get the impression that only exact matches are supported.  Only much later do you find out that, in fact, functions and operators are also supported.   

3. How is reification supported?  That is, how does SPARQL handle values that are themselves triples?  I assume nested queries are required, but there is no discussion of support for this important aspect of RDF.  Dan Connolly replied to this comment on the previous draft and explained how it could be done, but an example in the text would be illuminating.

4. Presumably, RDF encodes metadata about data that is represented as XML.  If this is the case, why are not all of the XML Schema datatypes supported?  See section 11.2.  An accompanying document [http://www.w3.org/2001/sw/BestPractices/XSCH/xsch-sw/] on the use of XML Schema datatypes in RDF/OWL discusses the problems with the duration datatype and recommends the use of its two subtypes, xdt:yearMonthDuration and xdt:dayTimeDuration.  These datatypes are not discussed in the SPARQL draft.

5. Instead of the XML Schema datatype anyURI, RDF and SPARQL use their own datatype, called IRI.  The earlier version of the document said that the RDF datatype IRI is a restriction of the XML Schema datatype anyURI in that only absolute URIs are supported.  We commented that it might have been simpler just to support the XML Schema datatype and call out this restriction.  The definition of this datatype has been removed from this document and is said to be contained in the RDF Data Model.  We were unable to locate that document.

6. String comparison is defined only using the code point collation.  Other collations are not supported.  This may be a significant limitation.
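
To illustrate why this matters, the sketch below contrasts code point order with a deliberately simple alternative ordering.  The word list and the simple_key helper are hypothetical; simple_key only strips accents and ignores case, and it is not an implementation of any standard collation.

```python
import unicodedata

# Hypothetical sample data; "\u00e9clair" begins with a precomposed e-acute.
words = ["zebra", "Apple", "\u00e9clair", "apple"]

# Code point order: uppercase sorts before lowercase, and the accented
# letter (U+00E9) sorts after "z" (U+007A).
by_code_point = sorted(words)


def simple_key(text):
    """Illustrative sort key: drop combining accents, then ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# With the simplified key, the accented word sorts among the "e" words.
by_simple_key = sorted(words, key=simple_key)

print(by_code_point)
print(by_simple_key)
```

A query language restricted to the first ordering cannot express the second without help from the application.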

7. Section 3.  Decimal values cannot be written as literals.  This seems like a needless limitation.  Suggest SPARQL use the literal definitions in XPath 2.0.

8. Section 3.1 has a very brief section entitled "Matching Arbitrary Datatypes".  There is no motivation for this and little explanation for how this works.  Motivation and detailed semantics should be provided.

9. Section 11.2.1.3 introduces the function isURI.  Is this necessary?  There is another function that returns the datatype of a variable.

10. sop:str returns the string representation of an r:IRI.  It's not clear what this does.  In the example, it strips the scheme from the IRI.  Does it perform any escaping or unescaping?

Some typos:
- Section 2: "The simplest graph pattern is the triple patterns,"
- Section 11, subsection Namespaces: "XML Schema datatypes with the prefix op: xs:"
- Section 11.1, last para, says "XMLSchema[]"
- There are a number of references to an XML Schema datatype called xs:int.  In fact, the XML datatype is called xs:integer.
 
In conclusion, the XML Query and XSL working groups believe that the SPARQL specification will need considerable rework and another last call WD before it is ready to advance.  In particular, we believe that the spec needs to be made more rigorous and formal, more clearly differentiated from SQL, and more clearly and carefully aligned with other W3C Recommendations, including XML Schema and XPath 2.0.

All the best, Ashok

Received on Tuesday, 13 September 2005 15:28:24 UTC