$Id: N3QL.html,v 1.66 2004/07/03 13:33:38 timbl Exp $

N3QL - RDF Data Query Language

N3QL is an implementation of an N3-based query language for RDF. It treats RDF as data and provides querying with triple patterns and constraints over a single RDF model. The target usage is for scripting and for experimentation in information modelling languages. The language is derived from Notation3 and RDQL.

This paper is designed to be comparable with the RDQL paper. However, it also describes the format for returned data, as the other part of the data access protocol.

The purpose of N3QL is as a model-level access mechanism that is higher level than an RDF API. Query provides one way in which the programmer can write a more declarative statement of what is wanted, and have the system retrieve it.

The N3QL grammar
N3QL Usage
Example queries

Features

N3-like language for retrieving sets of values
No syntactic extension of RDF beyond that in N3
Arithmetic operations use RDF properties and so are very extensible.
Builtins could be separated into a separate graph.
Concise and bandwidth-friendly

Advantages of n3 syntax for RDF Query

It is a clean design to specify the graph which is to be matched in the same syntax as one would specify a graph. N3 does this.
The graph to be matched (and/or the graph to be generated) is in the query, not asserted, and so one must separate it from any other asserted information. This can be done using a quoting syntax.
One does not want to write many parsers for different quoting or other non-asserting syntaxes for RDF. N3 uses one syntax for quoting RDF.
In fact, rules, queries and updates have much overlap in terms of syntax, semantics and implementation. Specifically, they need to introduce unasserted graphs, and variables. N3 grammar allows these. It is silly to use a separate syntax for the same thing in each of the various cases.
Although N3 full uses quoting, when restricted to a two-level language, the complications of quoting in the language do not apply: it can be regarded only as a syntax for a query, semantically equivalent to queries in, say, RDQL.
N3 is very extensible. Because the query is itself a graph, metadata can be added easily, for example to give a server clues as to where to federate parts of the server, or to include credentials, resource limits, and so on: all properties of the query. One should use N3 for queries for the same reasons that one should use RDF for data.
There has already been noticeable advantage to having NTriples be a subset of N3, in terms of coding, understanding, and newcomer ramp-up time. At this stage in the semantic web history, getting new people on board is critical. N3 seems to be a good learning language for RDF and OWL, and commonality between that and the RDF-QL would help further.
Looking forward into the future, at MIT we have found that it is very useful to be able to use rules which actually use N3 to its full extent. In other words, while a two-level system of RDF and unasserted RDF is useful and necessary for query, a recursive system has many uses too. One cannot, without a lot of unnecessary pain, extend things to full N3 if one doesn't start with restricted N3.

N3QL Grammar

N3QL is an application of Notation3. Like N3Rules, it is a constrained subset of N3 Full.

See:

BNF description of N3 full grammar

N3QL is constrained in that:

Formulae cannot be nested. As an RDF query language, the graphs are not n3 graphs but simple RDF graphs.
Arguably, N3QL should propagate the failure of RDF to allow literals as subjects.

N3QL Properties

N3QL is simply N3 which uses specific properties to convey the import of a query and its response. These properties are in the N3QL namespace, <http://www.w3.org/2004/ql#>.

select: The select of a query is the template for returned arguments. It is an RDF graph with variables, in N3 formula syntax. No variable may occur in the select graph which does not occur in the where graph.
where: The where of a query is the graph to be matched. It is an RDF graph with variables, in N3 formula syntax. It will typically share variables with the select graph.

Example query message and response

[] select { result is (?x ?y) };
   where  { ?x a ex:Librarian;  ex:hairColor ?y }.
_____________________________________________________
result is ( ex:Joe  "black"), (ex:Mary "red").

Protocol

A query service, on receiving a message under this protocol, finds all things which are the subject of a "where" property. Each of these is deemed to be a Query. For each query, it finds every combination of bindings for the variables in the where graph which unify that graph with the service's knowledge base (stored or virtual). It then determines from that set the set of distinct bindings of the set of variables which also occur in the select graph. For each set of bindings, it creates a result graph by substitution of the bindings into the select graph.

The result graphs are merged and the resulting graph is the response to the query. This has the advantage of simplicity: the result is a single RDF graph, and the query client can simply find the tuples returned.

Example:

result is ( ex:Joe  "black"), (ex:Mary "red").
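The matching and substitution steps described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the cwm implementation: graphs are plain collections of (subject, predicate, object) tuples, variables are strings starting with "?", N3 lists are not modelled, and the names `match_pattern`, `solve` and `answer` are hypothetical.

```python
def is_var(term):
    """Variables are written as strings beginning with '?' (an assumption of this sketch)."""
    return isinstance(term, str) and term.startswith("?")

def match_pattern(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None if they conflict."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def solve(where, kb, bindings=None):
    """Yield every combination of bindings that unifies the where graph with kb."""
    bindings = bindings or {}
    if not where:
        yield bindings
        return
    first, rest = where[0], where[1:]
    for triple in kb:
        extended = match_pattern(first, triple, bindings)
        if extended is not None:
            yield from solve(rest, kb, extended)

def answer(select, where, kb):
    """Return the merged result graph for one query."""
    select_vars = {t for triple in select for t in triple if is_var(t)}
    # Keep only distinct bindings of the variables that occur in the select graph.
    distinct = {frozenset((v, b[v]) for v in select_vars) for b in solve(where, kb)}
    result = set()
    for item in distinct:
        b = dict(item)
        # Substitute the bindings into the select graph and merge into one graph.
        for triple in select:
            result.add(tuple(b.get(t, t) for t in triple))
    return result
```

Because the result is a set of triples, merging the per-binding result graphs is just set union, which is the simplicity the protocol relies on. The sketch assumes, as the language requires, that every variable in the select graph also occurs in the where graph.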

HTTP GET binding

The query is expressed in N3, ideally without a lot of extra whitespace, and then is URL-encoded and appended to the service's URI with an intervening "?" character. The result of dereferencing the URI so formed is the return of a representation in N3 (unless content-negotiated otherwise) of the response message.
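As a rough sketch of how a client might form such a URI, the following Python uses the standard library's `urllib.parse.quote`. The function name `query_uri` and the service URI in the usage note are hypothetical, and the whitespace collapsing is an assumption of this sketch: it would alter any string literal in the query that contains runs of whitespace.

```python
from urllib.parse import quote

def query_uri(service_uri, n3_query):
    """Build the GET URI for a query: service URI, then '?', then the encoded N3."""
    # Collapse runs of whitespace so the encoded query stays short.
    compact = " ".join(n3_query.split())
    # safe="" percent-encodes every reserved character, including '?', '/' and '#'.
    return service_uri + "?" + quote(compact, safe="")
```

For example, `query_uri("http://example.org/query", "[] select { result is (?x ?y) }; where { ?x a ex:Librarian; ex:hairColor ?y }.")` gives a URI that can be dereferenced with any HTTP client, sending an Accept header if a format other than N3 is wanted.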

SOAP binding

TBD

Extensibility

The use of N3 makes the query language very extensible.

May-understand extensions

Extensions to the query message may be added in that extra properties, in any namespace, can be added to the query. This is extra information. The service is not required to understand such extra information, and is not bound to take it into consideration. Clients cannot add extra information which attempts to change the meaning of the query, unless there is a prior agreement between the parties.

Similarly, servers may add extra information to the response if the response is in the second form.

Must-understand extensions

In order to extend the query language in such a way that a service must understand the extension, the select and/or where properties are replaced by different ones, for example ones from a new namespace. This means that existing older services will not be able to find a query, and will have to return an error.
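The two extension mechanisms follow directly from how a service locates queries. A minimal sketch, assuming the message has already been parsed into (subject, predicate, object) tuples with full predicate URIs; the function names and the choice of `ValueError` are hypothetical:

```python
QL = "http://www.w3.org/2004/ql#"

def find_queries(message):
    """Return the subjects of every 'where' property in the N3QL namespace."""
    return {s for (s, p, o) in message if p == QL + "where"}

def handle(message):
    """Locate the queries in a message, or fail if none is recognisable."""
    queries = find_queries(message)
    if not queries:
        # A must-understand extension replaces select/where with properties from
        # another namespace, so a service that only knows QL ends up here.
        raise ValueError("no query found in message")
    # Properties the service does not know are simply never looked at,
    # which is the may-understand behaviour.
    return queries
```

An unknown extra property on a query therefore changes nothing for an older service, while a query written only with replacement properties is rejected rather than misinterpreted.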

Prior agreements between the parties

A query service is a service, which may be described using RDF. The ability of a service to support specific built-in functions may for example be a good thing to advertize in a service description.

Explicit agreement between the parties

A query could be labelled within the request message as requiring specific functionality.

Response message

The result message is the response graph encoded, typically, in N3. (Where HTTP GET is used, format negotiation could be used to negotiate something else.)

Observations

This is a straw proposal which, while functional, really demonstrates how simply a query language is constructed using N3. The Data Access Working Group, in designing such a language, can of course change and add vocabulary.

A number of design options have been given.

Option: default namespace prefixes

Every query could have the default namespace (empty prefix) set up to be

The message itself, for local variables (normal N3; probably the best option)
The query language namespace (makes examples pretty but less compact in real life probably)
The namespace of the service to which the message is posted (interesting for the case of an exported RDB where the column names etc in the tables can all have URIs close to the service, and so the query becomes naturally compact.)

Also, there could be prefixes assumed for the namespaces above even when they are not the empty prefix. For example, "q:" for the query language and "s:" for the service local namespace makes sense.

N3 or XML?

This query language uses the N3 formula extensions, in the sense that graphs are extended to allow nested graphs and variables. However, the N3 syntax does not have to be used. If an XML syntax is desired, an extended form of the RDF/XML syntax could be created to be completely equivalent to the N3 syntax.

For example, the first example above could be represented as shown in the appendix.

More examples

Assuming the declarations (which could be an assumption on the protocol)

@prefix : <http://www.w3.org/2004/ql#>.
@keywords a, is.

Query example 1: Retrieve the value of a known property of a known resource

<> select { ?x a result};
 where { <http://somewhere/res1> <http://somewhere/pred1> ?x }.
______________________________________________
<http://somewhere/pred1> a result.

Query example 2: constraints (option a)

<> select  { Result is (?a ?b)};
   where   {  ?a  <http://somewhere/pred1> ?b.
              ?b math:lessThan 5}.
__________________________________________________
Result is ( "blue" 2), ("red" 1), ("purple" -16).

Query example 2: constraints (option b)

<> select  { (?a ?b) a Result};
   where   {  ?a  <http://somewhere/pred1> ?b};
   and     { ?b math:lessThan 5}.
__________________________________________________
Result is ( "blue" 2), ("red" 1), ("purple" -16).

Query example 3: paths in the graph

<> select  { (?a ?b) a Result};
   where   { ?a <http://somewhere/pred1> [<http://somewhere/pred2> ?b]}.

Compare with RDQL:

SELECT ?a, ?b
WHERE (?a, <http://somewhere/pred1>, ?c) ,
(?c, <http://somewhere/pred2>, ?b)

Query example 4: contents of a collection

<> select { Result is (?x ?y) };
   where  { ?x list:in ?z.
            ?y ex:memberList ?z }.
__________________________________________
Result is ( ex:joe ex:mathClub),
          (ex:joe ex:skiClub),
          (ex:Mary ex:skiClub).

References

RDQL: This document was initially designed on the basis of RDQL
DAWG: Work in this area is done by the W3C Data Access Working Group.

DAWG requirements document

Appendices

Appendix 1: Design Alternatives

These design possibilities are not part of the current language as defined here. They are included to show that the issues are acknowledged, and that solutions to these issues are possible. The primary purpose of this specification in its current state is to show that a query language can easily be built using N3. It would be unwise to reject the syntax because this document did not provide a feature which the working group felt was essential. That said, adding further features would in most cases make the language unnecessarily complicated for a version 1.0.

Design option: optional binding graph

Include the following property of a query or not?:

option: The graph of things which are optionally bound to. The option of a query is a graph which connects to and extends the where graph. The query service binds the largest subset of the union of the where and option graphs which it can. This is not a well-defined operation right now, and it is not clear that it should be a requirement for every query service, or that it should be included in N3QL at all. It requires some way of returning variable amounts of data. One way would be to make the select graph such that a subset could be returned by binding only those variables which had bindings.

Example query message using `option`

:q1 q:select { ex:result ex:is [ex:main(?x ?y); ex:phone ?p] };
   q:where  { ?x a ex:Librarian;  ex:hairColor ?y };
   q:option { ?x ex:phone ?p}.
_____________________________________
ex:result ex:is
  [ex:main ( ex:Joe  "black"); ex:phone "+1-234-555-6789"],
  [ex:main (ex:Mary "red")].

Design option: and property

Include the following property of a query? (No)

and: The and clause is like the where clause but contains constraints which are calculated rather than searched for. The name is from the AND clause in RDQL.

Example without:

<> select { ?p ex:had ?t};
   where  { [ a ex:Reading; ex:temp ?t; ex:place ?p].
            ?t math:greaterThan 25 }.
___________________________________________
ex:London ex:had 26.
ex:Barcelona ex:had 31.

Example with:

<> select { ?p ex:had ?t };
   where  { [ a ex:Reading; ex:temp ?t; ex:place ?p]};
   and    { ?t math:greaterThan 25 }.
___________________________________________
ex:London ex:had 26.
ex:Barcelona ex:had 31.

Advantages:

Some people seem to think that calculations should be clearly separated from knowledge base searches.

Disadvantages:

There is no distinct line between "calculation" and "search", especially when one extends the "search" to include inference, and "calculation" to allow querying of the web and the use of functions which may in turn invoke search and inference. It is short-sighted to make this distinction when in fact an implementation can easily triage the tuples into builtins and non-builtins. To make this more resilient, see the requirement property and the discussion.
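The triage mentioned above might look something like the following Python sketch. It assumes where-graph statements are (subject, predicate, object) tuples and that the service keeps a table of the builtins it supports; the table contents and the names `triage` and `satisfies` are illustrative only.

```python
import operator

# Hypothetical builtin table: the predicate names follow the examples in this
# document, but the table itself is an assumption of this sketch.
BUILTINS = {
    "math:greaterThan": operator.gt,
    "math:lessThan": operator.lt,
}

def triage(where):
    """Split where-graph triples into (search patterns, builtin constraints)."""
    search = [t for t in where if t[1] not in BUILTINS]
    calc = [t for t in where if t[1] in BUILTINS]
    return search, calc

def satisfies(bindings, constraints):
    """Check every builtin constraint against one set of variable bindings."""
    def value(term):
        # Replace a bound variable by its value; constants pass through unchanged.
        return bindings.get(term, term)
    return all(BUILTINS[p](value(s), value(o)) for (s, p, o) in constraints)
```

With this split done by the service, the single where graph of the "Example without" query carries the same information as the separate where and and clauses of the "Example with" query.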

Design option: Premis property

Include the following optional property? (No)

premis: A graph whose contents are to be considered as part of the knowledge base queried

Design option: Requirement property

Include the following property of a query? (No)

requirement: One or more features, identified by URI or qname. See Algae's requires. This could be a way of specifying the forms of processing required, sets of builtins, speed, accuracy, etc.

Design option: Quoted replies in response message.

Use the following alternative form of response message? (No)

The response graph is the set of statements of the form { qqq q:result G } where qqq is the query and G is a result graph. A disadvantage is complexity. An advantage is greater extensibility: the response graph can contain other metadata with less risk of it being confused with the data returned. For example:
```
<mid:2314@ex.com> q:result
   { result is (ex:Joe  "black")},
   { result is (ex:Mary "red")};
  svc:responseTime 0.0023 .
```

Design option: Special syntax for select

The only awkwardness in the syntax above could be the select clause, in cases in which a list of variables is to be returned. The reason for this is that the select clause has to be a graph, in order to be quoted. It is logically incorrect to give the variables unquoted. N3 has no syntax (such as backquote syntax) for explicitly quoting the name of a variable, and as variables are in fact URIs, giving their string values is messy. Two ways this could be changed would be as follows.

One option would be to add variable name quoting to N3, like:

:q1  q:select  `x`, `y`;
     q:where { ?x a ex:Librarian; ex:hairColor ?y}.

A second option would be to add to N3 a special keyword specifically for select clauses, which would be new syntax (a little comparable with @forAll) which implies quoting, something like:

:q1  @select ?x, ?y;
     q:where { ?x a ex:Librarian; ex:hairColor ?y}.

Neither of these adds much, and both complicate the N3 language itself.

Design option: Allow => syntax

The query language is quite similar to a rule language. If rather than [] select F; where G one writes G => F, then one has a rule file rule.n3 which can then be run on a knowledge base with

cwm kb1.n3 --filter=rule.n3.

An alternative form of the query language, if the "option" and "and" clauses are not used, would be to use log:implies, shorthand =>, instead. This would be even more compact for small queries.

{?x a ex:Librarian; ex:hairColor ?y} => {result is (?x ?y)}.
_____________________________________________________
result is ( ex:Joe  "black"), (ex:Mary "red").

Appendix 2: Alternative representations

XML syntax

The query

[] select { result is (?x ?y) };
   where  { ?x a ex:Librarian;  ex:hairColor ?y }.
_____________________________________________________
result is ( ex:Joe  "black"), (ex:Mary "red").

could be represented in XML, by extending the RDF/XML 1.0 syntax, for example as

<rdf:RDF xmlns="http://www.w3.org/2004/ql#"
    xmlns:ex="http://www.example.com/foo#"
    xmlns:q="http://www.w3.org/2004/ql#"
    xmlns:rdf="http://www.example.org/rdf2#"
    rdf:forAll="#x #y">

    <rdf:Description>
        <select rdf:parseType="Quote">
            <rdf:Description rdf:about="http://www.w3.org/2004/ql#result">
                <is rdf:parseType="Resource">
                    <rdf:first rdf:resource="#x"/>
                    <rdf:rest rdf:parseType="Resource">
                        <rdf:first rdf:resource="#y"/>
                        <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
                    </rdf:rest>
                </is>
            </rdf:Description>
        </select>
        <where rdf:parseType="Quote">
            <ex:Librarian rdf:about="#x">
                <ex:hairColor rdf:resource="#y"/>
            </ex:Librarian>
        </where>
    </rdf:Description>
</rdf:RDF>

Note that this is NOT RDF 1.0, as the RDF spec specifically disallows such extensions.

Reification

It can also be represented as an RDF graph using a form of reification, for example:

   @prefix : <http://www.w3.org/2004/06/rei#> .
   @prefix owl: <http://www.w3.org/2002/07/owl#> .
   @prefix t: <#> .
  
    [  :existentials  [owl:oneOf  () ];
       :statements  [
         owl:oneOf  (
         [     :object  [
               :existentials  [
                 owl:oneOf () ];
               :statements  [
                 owl:oneOf  (
                 [   :object  [
                       :items  (
                       [:uri "http://www.w3.org/2000/10/swap/test/ql/t00.n3#x" ]
                       [:uri "http://www.w3.org/2000/10/swap/test/ql/t00.n3#y" ] ) ];
                     :predicate  [
                       :uri "http://www.w3.org/2004/ql#is" ];
                     :subject  [
                       :uri "http://www.w3.org/2004/ql#result" ] ] ) ];
               :universals  [owl:oneOf () ] ];
             :predicate  [:uri "http://www.w3.org/2004/ql#select" ];
             :subject  _:a ]

         [     :object  [
               :existentials  [owl:oneOf () ];
               :statements  [
                 owl:oneOf  (
                 [
                     :object  [:uri "http://www.example.com/foo#Librarian" ];
                     :predicate  [:uri "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" ];
                     :subject  [:uri "http://www.w3.org/2000/10/swap/test/ql/t00.n3#x" ] ]
                 [
                     :object  [:uri "http://www.w3.org/2000/10/swap/test/ql/t00.n3#y" ];
                     :predicate  [:uri "http://www.example.com/foo#hairColor" ];
                     :subject  [:uri "http://www.w3.org/2000/10/swap/test/ql/t00.n3#x" ] ] ) ];
               :universals  [owl:oneOf () ] ];
             :predicate  [:uri "http://www.w3.org/2004/ql#where" ];
             :subject  _:a ] ) ];
       :universals  [ owl:oneOf  (
        "http://www.w3.org/2000/10/swap/test/ql/t00.n3#x" 
        "http://www.w3.org/2000/10/swap/test/ql/t00.n3#y"  ) ] ].
  
#ENDS

for the record.