RE: SquishQL/RDQL Comments

Sean,

Great news!  There are 5 implementations of RDQL I know of: Jena, Sesame,
3Store, RDFStore, PHPXMLClasses.

Do you pass the test cases for such a style of language?  Libby, Alberto and
I have been getting test cases together.  See www-rdf-rules and the
regular IRC chat.  For specific RDQL tests, the whole testing setup is in
the Jena2 download under testing/RDQL/

Would you mind forwarding this discussion to www-rdf-rules, as there are
other implementers of RDQL who might be interested?

	Andy

Other comments inline:

> -----Original Message-----
> From: Sean B. Palmer [mailto:sean@mysterylights.com] 
> Sent: 1 June 2003 21:50
> To: Libby Miller; Andy Seaborne
> Cc: www-archive@w3.org
> Subject: SquishQL/RDQL Comments
> 
> 
> Hi Libby, Andy,
> 
> I just wrote a SquishQL parser [1] and hooked it up to my query engine
> so that I can run queries (e.g. [2]) over the Web. In writing the
> parser, I came up with a number of comments about SquishQL and RDQL.
> I'm CCing www-archive instead of www-rdf-rules since I'd like to clear
> some of these things up before posting some further RDF query ideas I
> have there.
> 
> I managed to implement SquishQL very quickly indeed from scratch, and
> the test cases given on the grammar page were very useful in testing,
> so that was very encouraging.
> 
> The SquishQL grammar [3] is, however, kinda hard to follow--and even
> wrong--in places. The RDQL grammar [4] is similarly afflicted, and
> worse in some ways since the <> delimited productions aren't further
> defined anywhere (except in the Jena API, I'm guessing) which makes it
> rather difficult to implement!

The grammar for RDQL is 
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/jena/jena2/src/com/hp/hpl/jena/rdql/parser/rdql.jjt?rev=HEAD&content-type=text/plain

The reference [4] is the jjtree output - it is a summary, not a full
grammar.  The ugly URL above is to the JavaCC grammar and has all the
productions for Jena's implementation of RDQL.  It is complete - it is what
the build system uses to build RDQL/Jena!  Note that Jena2 already covers
several of your points below.

Soon there will be a written description of RDQL - I am in the process of
writing a note about it.

> 
> SquishQL Bugs:
> * I've read the list of SquishQL issues at [5], and the grammar page
> is very out of date with it: for example, it now says that "SELECT *"
> and "anon nodes" are supported, but these changes are not implemented
> in the grammar.
> * The definitions of TextLiteral and Identifier are rather dodgy. For
> example, the meaning of "letter" seems to be quite different in each
> one. I interpret TextLiteral as anything following the regexp
> "'[^'\\]*(?:\\.[^'\\]*)*'", and Identifier as anything following the
> regexp '[A-Za-z][A-Za-z0-9]*'.
> * Test 23 shows that URIs can't contain ")". That's kinda been
> resolved in RDQL by using "<" and ">" to delimit URIs. I think that
> all terms in SquishQL/RDQL should follow the style set out in NTriples
> (cf. RDQL bugs below).
> * The definitions of "integer" and "floating point number" are
> basically non-existent: I had to guess when implementing them (and
> just went with '(?:[-+]?[0-9]+)(?:\.[0-9]+)?(?:e[-+]?[0-9]+)?').
> * The concept of a QName isn't introduced at all in the grammar.
> * (minor point) "," in UriList is inconsistently quoted--use
> apostrophes.
> * (minor point) The "inverted commas" mentioned on the grammar page
> are properly called apostrophes... :-)
> 
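For anyone wanting to try the three regular expressions quoted above, here is a minimal Python sketch.  The patterns are copied from the message; the classify() helper, the order in which patterns are tried, and the use of whole-token matching are assumptions made for illustration and are not part of either grammar.

```python
import re

# Patterns copied from the quoted message. Matching the whole token with
# fullmatch() is an assumption made here, not something the grammar states.
TEXT_LITERAL = re.compile(r"'[^'\\]*(?:\\.[^'\\]*)*'")
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")
NUMBER = re.compile(r"(?:[-+]?[0-9]+)(?:\.[0-9]+)?(?:e[-+]?[0-9]+)?")


def classify(token):
    """Return the name of the first pattern matching the whole token, else None."""
    candidates = (
        ("TextLiteral", TEXT_LITERAL),
        ("Number", NUMBER),
        ("Identifier", IDENTIFIER),
    )
    for name, pattern in candidates:
        if pattern.fullmatch(token):
            return name
    return None


if __name__ == "__main__":
    for tok in ["'it\\'s'", "x1", "-12.5e+3", "1e5", ".5", "1E5", "1x"]:
        print(repr(tok), "->", classify(tok))
```

As written, the number pattern rejects forms such as ".5" (no leading digit) and "1E5" (upper-case exponent), which is the kind of detail the grammar would need to pin down.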
> SquishQL testset bugs (this refers to the tests on the SquishQL
> grammar page):
> * pm::DeliverableSpec in test 4 is an invalid qname.
> * "=" is used as string operator in test 11, whereas the grammar
> specifies it as a number operator only. The SquishQL issues list seems
> to say that this is an open issue, but nonetheless, the test data is
> inconsistent with the grammar as it stands. Perhaps a note could be
> added to the grammar?
> 
> RDQL bugs:
> * <> productions are not further explained (see above).

See URL for the master grammar file.  Sorry - that web page was a summary
and has slipped behind development of RDQL in Jena.  I'll fix that.

> * "Anon nodes" and QNames are apparently wrapped in "<" and ">" the
> same as URIs. That seems rather odd: why not use the same definitions
> as are used for NTriples/N3, i.e. _:bNode <uri> q:name ?univar
> "literal"?

Qnames: already fixed, allowing both old and new forms.  Note N3 has a
restricted definition of qnames compared with XML (no '.' allowed in prefix
or local part).

URIs:  I had to use a more liberal character set than N-triples because of
the amount of strictly bad data out there.  No matter - it is an upwardly
compatible definition.  

Bnodes: writing "_:a" isn't going to work - it is not the same bnode as the
one you want to match in the graph (RDQL isn't doing graph isomorphism but
is doing exact match).  The internal id of a bNode is hidden and may, or may
not, have anything to do with the bNode label used in the file.

Consider reading two files into the same model: each file has a bNode _:a
(file scope).  There are then two distinct resource nodes in the model.
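A toy illustration of that point, in Python rather than the Jena API.  The Model class and its read() method are hypothetical names invented for this sketch; it only assumes that each file gets its own label-to-node map and that internal ids are allocated by the model.

```python
import itertools


class Model:
    """Toy model: a set of triples plus a counter for internal bNode ids."""

    def __init__(self):
        self.triples = set()
        self._ids = itertools.count()

    def read(self, triples):
        """Load one 'file'. Labels such as '_:a' are scoped to this call only."""
        label_to_node = {}

        def node(term):
            if term.startswith("_:"):
                # First use of a label in this file allocates a fresh internal id.
                if term not in label_to_node:
                    label_to_node[term] = ("bnode", next(self._ids))
                return label_to_node[term]
            return term

        for s, p, o in triples:
            self.triples.add((node(s), p, node(o)))


model = Model()
model.read([("_:a", "ex:name", '"Alice"')])  # first file
model.read([("_:a", "ex:name", '"Bob"')])    # second file, same label
subjects = {s for s, _, _ in model.triples}
print(len(subjects))  # prints 2: the two '_:a' labels are different nodes
```

A query that writes "_:a" has no way of saying which of those two nodes it means, which is why the label alone cannot be used for an exact match.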

> * In my query engine, I return the triples matched as well as
> bindings. It seems to me that a SELECT "triples" sort of addition to
> the grammar might be a nice idea, but then perhaps this is out of
> scope for the sort of things that RDQL was designed to do. Of course,
> one can always reconstruct the triples matched by feeding the binding
> results back into the query triples. Then again, I think that the
> reason that I return triples as well as bindings in queries is that
> this way you get the bNodes back properly. It seems to me that
> SquishQL and RDQL (and thus, I presume, their implementations) are not
> set up well for dealing with bNodes at the moment, which is odd
> because it's an important issue.

Not sure what you mean here - the objects returned by RDQL/Jena are real API
Java objects, not labels.  You get Resources (URI'ed nodes, bNodes,
Properties) so you can and do get the bNodes.

What do you return?  The labels for graph nodes/arcs?

RDQL in Jena does also return the matching triples.  When the target graph
does inferencing, remember the matched triple may not be a ground fact - I
create matching statements from the query definition.
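Creating statements from the query definition amounts to substituting each result binding back into the query's triple patterns.  A minimal Python sketch of the idea follows; it is not Jena's code.  The instantiate() function, the (subject, predicate, object) tuple layout and the sample data are all assumptions made for illustration.

```python
def instantiate(patterns, binding):
    """Substitute variable bindings into each query triple pattern.

    Variables are strings starting with '?'. Unbound variables are left
    in place rather than raising an error (a choice made for this sketch).
    """

    def resolve(term):
        if isinstance(term, str) and term.startswith("?"):
            return binding.get(term, term)
        return term

    return [tuple(resolve(term) for term in pattern) for pattern in patterns]


# Hypothetical patterns in (subject, predicate, object) order.
patterns = [
    ("<http://www.w3.org/TR/rdf-syntax-grammar>", "ex:editor", "?editor"),
    ("?editor", "ex:fullName", "?name"),
]

# A bNode is bound as an object (here a tuple standing in for a node), not a label.
binding = {"?editor": ("bnode", 0), "?name": '"Dave Beckett"'}

for triple in instantiate(patterns, binding):
    print(triple)
```

Because the binding carries the node object itself rather than a label, the reconstructed statements refer to the same bNode that was matched.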

Joseki http://www.joseki.org/ returns the minimal complete matching subgraph
for an RDQL query.

> * There are some odd small changes from SquishQL that I'm not sure I
> understand--e.g. the introduction of commas to separate
> squishql:ForList/rdql:PrefixDecl (as a compromise, I'd say that these
> should probably be optional in both languages).

Optional in RDQL once I fixed the grammar.  Commas are optional everywhere.

> Actually, the fact that both SquishQL and RDQL exist signals a bit of
> a warning to me: I know that RDQL was derived from SquishQL and that
> all the old code still runs, but having two extraordinarily similar
> implementations of SQL-ish syntaxes for RDF query is rather confusing.
> It'd be nice, if RDQL is deemed superior, to have more "use RDQL"
> style notes in the SquishQL stuff, or vice versa. For example, I'm not
> sure right now whether I should scrap my SquishQL parser and go with
> an RDQL parser instead or not. Or perhaps I need both? Guidance would
> be much appreciated!
>
> And then there are discussions as to whether SQL-ish syntaxes for RDF
> query are a good idea at all. Notation3 gets along well with mixing
> constraints and triples together in formulae, but then it's a
> different kind of system. Personally, I think that RDQL is a good
> direction, but obviously the RDF query community needs to do a lot of
> work. I should send a followup to www-rdf-rules.

Support for more constraints (e.g. date testing) is one of the most common
feature requests I get.

Actually, you can do constraint testing in cwm using the builtin predicates.

> Overall, my issues with the grammar are fairly minimal, and you've
> both done a lot of good work on the RDF query front, so thanks!
> Hopefully a standard syntax (or few) will emerge, and some decent test
> cases will shortly follow...

Libby has done a fine job getting a set of test cases together.  You can
also convert the RDQL ones in the Jena distribution.

> 
> Cheers,
> 
> [1] http://infomesh.net/2003/squishql/
> Announcement:
> http://lists.w3.org/Archives/Public/www-rdf-rules/2003Jun/0001
> [2] SELECT ?name, ?homepage FROM
>    http://www.w3.org/TR/rdf-syntax-grammar/example07.rdf
> WHERE
>    (ex:editor http://www.w3.org/TR/rdf-syntax-grammar ?editor)
>    (ex:fullName ?editor ?name)
>    (ex:homePage ?editor ?homepage)
> USING
>    ex FOR http://example.org/stuff/1.0/
> 
> Output:
> $ ./rdfquery.py squish_test.txt
> ?name: "Dave Beckett"
> ?homepage: <http://purl.org/net/dajobe/>
> 
> [3] http://swordfish.rdfweb.org/rdfquery/squish-bnf.html
> [4] http://www.hpl.hp.com/semweb/rdql.htm
> [5] http://ilrt.org/discovery/2001/07/squishql-issues/
> 
> --
> Sean B. Palmer, <http://purl.org/net/sbp/>
> "phenomicity by the bucketful" - http://miscoranda.com/
> 

Received on Monday, 2 June 2003 04:30:36 UTC