From SPARQL Working Group

Hello WG,

I have a couple of comments on the current Last Call working draft for 
SPARQL 1.1 query, mainly about the set of built-in functions (section 
17.4). These comments came about as part of me implementing these 
functions in Sesame's SPARQL processor.

We're glad to hear that there is now an implementation of SPARQL 1.1 query and update for Sesame. We hope you will participate in the implementation report that the working group will be doing as part of the W3C process.

1. String functions

The current set of built-in functions on strings seems rather 
arbitrarily chosen, with little evident use case requirements backing 
them up.

For example, while both fn:string-length and fn:substring are included, 
fn:substring-before and fn:substring-after are not, nor is there any 
form of 'indexOf'-function. This makes it currently not possible in 
SPARQL to determine the substring of a string based on a character match.

My comment is not that these functions should or should not be included 
per se, but rather a question: what criteria did the WG use to decide 
which functions 'make the cut'?

Having reviewed the choice of string functions, and having sought to understand the choices in XQuery/XPath functions and operators, the working group has decided to add the functions STRBEFORE (cf. fn:substring-before), STRAFTER (cf. fn:substring-after) and REPLACE (cf. fn:replace). These functions take into account the RDF data model, for example in the handling of literals with language tags.
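
As a rough illustration of the matching behaviour these additions are modelled on, here is a minimal Python sketch of fn:substring-before and fn:substring-after on plain strings. The helper names str_before and str_after are made up for this example, and the sketch deliberately ignores language tags and datatypes, which the SPARQL versions also have to handle.

```python
def str_before(text: str, match: str) -> str:
    """Part of text before the first occurrence of match (plain strings only)."""
    if match == "":
        # fn:substring-before returns the empty string for an empty match.
        return ""
    head, sep, _ = text.partition(match)
    # sep is empty when match does not occur; the result is then "".
    return head if sep else ""


def str_after(text: str, match: str) -> str:
    """Part of text after the first occurrence of match (plain strings only)."""
    if match == "":
        # fn:substring-after returns the whole input for an empty match.
        return text
    _, sep, tail = text.partition(match)
    return tail if sep else ""
```

With these helpers, str_before("tattoo", "attoo") gives "t" and str_after("tattoo", "tat") gives "too", matching the examples in the XPath functions and operators document.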

2. Hash functions

Perhaps my strongest problem with the current Working Draft is the 
inclusion of 6 variations for calculating a hash. Arguably calculating a 
hash is a _very_ outlying use case that comes up rarely in practical 
applications of SPARQL. I'm not denying there are valid use cases for 
it, but adding six different varieties seems, frankly, outlandish.

There is a practical consideration for me in this as well: on the Java 
platform, SHA-224 in particular is not supported by the default 
cryptography architecture. The fact that SPARQL includes it forces me to 
add a third-party dependency to my SPARQL implementation for a feature 
that very few users will ever need. I find this wasteful and an 
unnecessary burden, both on implementors and on users of the software.

Given that the SPARQL specification supports the adding of custom 
functions, so that any vendor who needs to can extend the language, I 
would suggest that this kind of niche functionality has no place in the 
core spec and should be removed, or at the very least only a minimal set 
of hash functions (2 or 3, tops) should be required. In picking this 
subset, the WG should IMHO consider which algorithms are most commonly 
used and supported on various platforms.

The working group reviewed the choice of hash functions and also the availability of implementations for various programming languages. As a decision criterion, the working group chose to keep the SHA-2 functions mentioned in "XML Signature Syntax and Processing Version 1.1" [1], which are SHA-256, SHA-384 and SHA-512. MD5, while not secure, is known to be used as a checksum, and SHA-1 is used by FOAF.

Therefore, the working group has removed SHA-224, which is the function causing you some implementation difficulties.
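
To give a sense of how the retained set maps onto a typical standard library, the following Python sketch computes all five digests with hashlib only. The function name digests, the algorithm names (which are hashlib's, not SPARQL's) and the choice of UTF-8 for encoding the input are assumptions made for this illustration.

```python
import hashlib

# The five algorithms retained after SHA-224 was dropped, under hashlib's names.
RETAINED = ["md5", "sha1", "sha256", "sha384", "sha512"]


def digests(text: str) -> dict:
    """Return a lowercase hex digest of text for each retained algorithm."""
    data = text.encode("utf-8")  # assumption: hash the UTF-8 bytes of the string
    return {name: hashlib.new(name, data).hexdigest() for name in RETAINED}


def all_guaranteed() -> bool:
    """True if every retained algorithm is guaranteed by this Python's hashlib."""
    return all(name in hashlib.algorithms_guaranteed for name in RETAINED)
```

A check along the lines of all_guaranteed() is the kind of availability test an implementor on another platform could run before deciding whether a third-party dependency is needed.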


Jeen Broekstra

Changes for both string functions and hash functions are drafted in the editors' working draft [2]. There will be a second last call for the SPARQL 1.1 Query document because these changes materially affect implementations.

We would be grateful if you would acknowledge that your comment has been answered by sending a reply to this mailing list.

Andy (on behalf of the SPARQL WG)

[1] [2]