RDF Query Test Cases

URI
http://www.w3.org/2003/03/rdfqr-tests/rdf-query-testcases.html
Authors and Contributors:
Alberto Reggiori, ??
Abstract:
This document describes an RDF vocabulary for expressing machine-processable RDF query test cases. The vocabulary is meant to support interoperability testing among different RDF query implementations, the creation of a test case repository, and the design of the machinery for mapping between different query syntaxes.

Status:
This is a discussion document, part of the RDF Interest Group collaboration on RDF query and rule test cases. Please send comments and feedback to the (publicly archived) www-rdf-rules@w3.org mailing list.

1. Introduction

This document describes an RDF vocabulary for expressing machine-processable RDF query test cases, to support interoperability testing among different RDF query implementations and the design of the machinery for mapping between different query syntaxes.

The manifest format presented here, together with "Recording Query Results" [1], is part of the ad hoc work on "RDF Query (and Rule) Testcase" [2], which aims to define a set of common vocabularies to mark up RDF query test cases and to build a publicly available test case repository for RDF tool developers. This document is also related to Work Package 7 [3] of the SWAD-Europe [4] project.

1.1 What this document is not

We recognise that this RDF Query Test Cases manifest work is close in scope to the W3C Semantic Web Services IG [5] activity and to related proposals such as Eric Prud'hommeaux's WSDL-RDF mapping [6] work. With this document we take a much simpler and more pragmatic approach: we specify a very basic set of vocabularies for building simple regression testing tools, such as the ones distributed with some RDF toolkits. Future work might clarify how this document fits into a broader scope such as Semantic Web Services descriptions.

2. Scope

This section outlines the rationale for designing a common vocabulary to express RDF query test cases. We also discuss how a generic RDF query can be seen as consisting of three distinct parts: RDF input source selection, query expression and recording of query results.

2.1 Query Languages Interoperability Tests

Several different syntaxes [7] are being used by RDF tool developers to express arbitrary RDF graph queries, and each of these has its benefits and drawbacks. The ad hoc survey "RDF Query and Rule languages Use Cases and Examples" [8], which collects a common set of real-world use cases, shows many syntactic similarities between RDF query languages. Each language has its own syntactic dialect to select an input RDF source, to express a query pattern and to record the result sets. Some tools are very database specific and do not allow an input source to be specified directly, since it is implicit; others allow several input RDF sources to be indicated, which need to be aggregated/merged/smushed together [9][10] at query time. Query expressions vary from a simple single arc to open subgraph matching on triples. The query result sets can generally be returned in a number of forms (triple, result table, multiple sub-graphs and single subgraph); each product and tool implements one or more of these features.

It is clear that to allow RDF query language interoperability tests we need to agree on a common set of guidelines to express input sources, query expressions and the recording of results. The rest of this section outlines a common set of features which could be specified and implemented by different query languages.

2.1.1 Input source selection

Each RDF query is generally run over a number of input RDF datasets, which could be expressed in one of the available syntaxes such as RDF/XML [11], N-Triples [12] or N3 [13]. Other input syntaxes might exist or be created by the user for a specific usage and/or application; existing legacy relational databases or ad hoc triple-stores are other possible input query sources. Here we assume that all such raw input sources are available at some given URL as RDF/XML, N-Triples or N3 files. This guarantees that existing RDF parsers can be used to process such sources in a standard way with well-defined semantics. Alternatively, an application might implement a specific URL scheme to natively support third-party input source formats [14][15], but this is not discussed further in this document.

2.1.2 Query expression

Many of the existing query systems have significant advanced built-in features, such as numerical constraints, regular expressions and free-text operators on the values allowed to match the pattern part of the query. Even so, most (if not all) languages support simple conjunctive triple patterns; in the general case, a query expression is represented as the matching of a graph pattern [16][17][18] against an input source RDF graph. The graph pattern is an RDF graph with variables for some arcs or nodes (resources, bNodes or literals). For some queries this also includes "bArcs" (blank arcs), which takes the pattern outside of pure RDF. Leaving the "bArcs" problem aside, simple triple patterns can be considered the lowest common denominator between different query language implementations for expressing queries. Several attempts have been made to mark up such triple patterns containing "bArcs" in some RDF syntax [19][20].
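
As an illustration of the conjunctive triple patterns described above, the following is a minimal Python sketch of matching a list of triple patterns against a small graph. It is not taken from any of the cited implementations: the "?name" convention for variables, the use of plain strings for every term (URIs, bNodes and literals alike), and the function names match_triple and match_patterns are assumptions made for this example. The sample triples are adapted from the job data in Section 5.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def is_variable(term: str) -> bool:
    """Terms starting with '?' are variables (a convention of this sketch)."""
    return term.startswith("?")


def match_triple(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Extend `bindings` so that `pattern` matches `triple`, or return None."""
    extended = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if is_variable(p_term):
            # Bind the variable, or check it against its earlier binding.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None  # a constant term differs
    return extended


def match_patterns(patterns: List[Triple], graph: List[Triple]) -> List[Bindings]:
    """Return all bindings that satisfy every pattern (a conjunctive query)."""
    solutions: List[Bindings] = [{}]
    for pattern in patterns:
        next_solutions: List[Bindings] = []
        for partial in solutions:
            for triple in graph:
                extended = match_triple(pattern, triple, partial)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


JOB = "http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#"

# "_:job1" is a hypothetical bNode label chosen for this example.
graph = [
    ("http://example.com/job1.html", JOB + "advertises", "_:job1"),
    ("_:job1", JOB + "title", "Job title for job1 goes here"),
    ("_:job1", JOB + "salary", "100000"),
]
patterns = [
    ("?x", JOB + "advertises", "?y"),
    ("?y", JOB + "title", "?t"),
    ("?y", JOB + "salary", "?sal"),
]
solutions = match_patterns(patterns, graph)
```

Because a variable is allowed in any position, including the predicate, the same sketch also covers the "bArcs" case; a real implementation would additionally need to distinguish literals from resources.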

2.1.3 Recording query results

For the class of query languages we are considering, results can be returned in a number of forms:

A graph pattern may match the target graph in a number of ways. The first two forms of results take this into account; the third just merges the different ways into a single result.
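
The difference between keeping the individual matches apart and merging them can be sketched in Python as follows. This is only an illustration under stated assumptions: it takes the three forms to be a result table, one subgraph per match, and a single merged subgraph (the forms named in Section 2.1), and the function names, the "?name" variable convention and the abbreviated "job:salary" predicate are made up for the example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def instantiate(patterns: List[Triple], bindings: Bindings) -> Set[Triple]:
    """Replace each bound variable in the patterns by its value."""
    return {
        tuple(bindings.get(term, term) for term in pattern)
        for pattern in patterns
    }


def as_table(solutions: List[Bindings], variables: List[str]) -> List[List[str]]:
    """Form 1 (assumed): one row of variable values per match."""
    return [[bindings[var] for var in variables] for bindings in solutions]


def as_subgraphs(patterns: List[Triple], solutions: List[Bindings]) -> List[Set[Triple]]:
    """Form 2 (assumed): one subgraph per match."""
    return [instantiate(patterns, bindings) for bindings in solutions]


def as_single_subgraph(patterns: List[Triple], solutions: List[Bindings]) -> Set[Triple]:
    """Form 3 (assumed): all matches merged into a single subgraph."""
    merged: Set[Triple] = set()
    for subgraph in as_subgraphs(patterns, solutions):
        merged |= subgraph
    return merged


# Two hypothetical matches of a one-triple pattern.
patterns = [("?y", "job:salary", "?sal")]
solutions = [
    {"?y": "_:a", "?sal": "100000"},
    {"?y": "_:b", "?sal": "150000"},
]

table = as_table(solutions, ["?y", "?sal"])
subgraphs = as_subgraphs(patterns, solutions)
merged = as_single_subgraph(patterns, solutions)
```

In the first two forms the two matches remain distinguishable (two rows, two subgraphs); in the merged form only the union of the matched triples is kept.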

For a deeper discussion of recording the results of generic RDF queries, and a concrete proposal for an RDF vocabulary, see Andy Seaborne's ongoing work "Recording Query Results" [1].

3. Modelling RDF Query Test Cases in RDF

The manifest format specified here is based on previous work by Libby Miller [25], Dan Brickley [26] and Andy Seaborne [27], who expressed a number of interoperability tests for their SquishQL/RDQL implementations. The RDF vocabulary presented here extends and generalises the format originally proposed; it is also deliberately similar to the one used to express the RDFCore WG "RDF Test Cases" [23][24]. Hopefully a common RDF vocabulary will emerge in the future to express basic RDF syntax and query tests.

4. Query Test Cases Vocabulary

While modelling the RDF vocabulary we tried to capture the general case in which a given RDF query has several different input sources, multiple query graph patterns and possibly several different output result sets. Each input source, query expression and output result document has a URL associated with it, but it is also possible to inline their content by using rdf:parseType="Literal" property values. Specific classes (see Section 4.2) have been defined to identify the content type (RDF/XML, N-Triples or N3) of the input sources or output result sets.

An RDF Query Test Cases manifest file generally consists of an ordered list of tests expressed with the rdf:parseType="Collection" construct; each test has a sequential number, a name or title and a textual description. An exit status (true/false) is also defined for each test case to allow positive and negative query tests to be expressed; a false exit status would, for example, mean that the input query expression (triple pattern) does not express a valid graph path into the given RDF source(s). This also helps the processing software to assess whether or not to expect some output from the given test. Additional properties such as a submission date, author, email archive URL and additional textual notes might be attached to the test case. Each test case also includes an input part and an output part. The former describes the given RDF input sources (RDF/XML, N-Triples or N3) and the query patterns expressed as RDF graphs with "bArcs" (RDF/XML, N-Triples or N3). The latter describes the output documents that result from carrying out the given test query; each output document may state its number of rows (results).
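
The structure described above could be held in memory as in the following Python sketch. It is an assumption of this example, not part of the vocabulary: the class and field names are invented, and check_outcome encodes one possible reading of the exit status (a false status means no rows are expected, a true status means at least one row and, if stated, the declared number of rows). The values are those of the test case shown in Section 5.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Document:
    url: str
    doc_type: str  # "NT-Document", "RDF-XML-Document" or "N3-Document"


@dataclass
class OutputDocument(Document):
    number_rows: Optional[int] = None  # tq:numberRows, if stated


@dataclass
class QueryTest:
    num: int
    name: str
    description: str
    status: bool  # True for a positive test, False for a negative one
    query_documents: List[Document] = field(default_factory=list)
    input_documents: List[Document] = field(default_factory=list)
    output_documents: List[OutputDocument] = field(default_factory=list)


def check_outcome(test: QueryTest, actual_rows: int) -> bool:
    """Compare the number of rows a tool returned with what the test expects.

    Simplification: every output document is checked against the same count.
    """
    if not test.status:
        return actual_rows == 0  # negative test: no output expected
    if actual_rows == 0:
        return False  # positive test: some output expected
    return all(
        out.number_rows is None or out.number_rows == actual_rows
        for out in test.output_documents
    )


test3 = QueryTest(
    num=3,
    name="test3",
    description="two rss files",
    status=True,
    query_documents=[Document("file:queries/nt/q3.sq.nt", "NT-Document")],
    input_documents=[
        Document("file:rdf/jobs-rss.rdf", "RDF-XML-Document"),
        Document("file:rdf/jobs.rss", "RDF-XML-Document"),
    ],
    output_documents=[OutputDocument("file:rdf/rs/q3.rdf", "RDF-XML-Document", 5)],
)
```

A regression tool would call check_outcome(test3, n) with the number of rows n it actually obtained; comparing the content of the rows, rather than only their count, is left out of this sketch.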

It has been a design decision to split the RDF Query Test Cases RDF vocabulary into two parts: a manifest-specific part and an input/output-specific part. Splitting each test into an "inputs" section and an "outputs" section with separate nodes may be useful, so that properties that might apply to both (e.g. rdfs:comment and other annotations) can be attached either to the test as a whole or to elements of the test.

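The namespaces, as declared in the example in Section 5, are the following:

mf
http://www.w3.org/2003/03/rdfqr-tests/manifest.rdfs#
tq
http://www.w3.org/2003/03/rdfqr-tests/query.rdfs#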

4.1 Manifest Vocabulary

Classes

mf:Test
An RDF Query Test Case

Properties

mf:tests
A collection of tests, possibly expressed with rdf:parseType='Collection'
mf:emailArchiveURL
A URL pointing to a mailing list archive, for example that of www-rdf-rules
mf:description
Verbose descriptive text for this test case
mf:name
Single line descriptive text for this test case
mf:label
Unique index string (if needed)
mf:status
Exit status for this test case; possible values are 'true' and 'false'
mf:input
Pointer to a resource describing the input documents (sources and query docs)
mf:output
Pointer to a resource describing the output documents (results)
mf:notes
More textual notes further describing the test case
mf:author
The name of the person or organization that provided the information about the test case
mf:submissionDate
Date of publication of the RDF Query Test Case
mf:num
A sequential number for the test (if the ordering given by rdf:parseType='Collection' is not enough)

4.2 Input/Output Vocabulary

Classes

tq:Document
An input or output document
tq:NT-Document
An N-Triples document
tq:RDF-XML-Document
An RDF/XML document
tq:N3-Document
An N3 document

Properties

tq:queryDocument
An input document encoding the triple-patterns of the RDF Query
tq:inputDocument
An input source document containing the actual RDF data to query
tq:outputDocument
An output document containing the actual results of the RDF query. The format is under discussion, e.g. whether the output is going to be a set of bindings or an RDF sub-graph. A common approach is to have it as RDF of some kind at least.
tq:numberRows
The number of rows (bindings?) in the RDF query result set
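
To illustrate how a tq:queryDocument in N-Triples might be turned back into triple patterns, here is a small Python sketch. It assumes, as the example in Section 5 does, that blank nodes in the query document stand for query variables. The regular expression is a deliberate simplification that only accepts URI references and blank nodes (no literals), and the names read_patterns and to_term are hypothetical.

```python
import re
from typing import List, Tuple

# A term is either <uri> or _:label; literals are not handled in this sketch.
TERM = r"(<[^>]*>|_:[A-Za-z][A-Za-z0-9]*)"
LINE = re.compile(rf"^\s*{TERM}\s+{TERM}\s+{TERM}\s*\.\s*$")


def to_term(token: str) -> str:
    """Map a blank node to a '?variable'; strip the angle brackets from a URI."""
    if token.startswith("_:"):
        return "?" + token[2:]
    return token[1:-1]


def read_patterns(text: str) -> List[Tuple[str, str, str]]:
    """Read triple patterns from N-Triples text, skipping comments and blank lines."""
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"unsupported line: {line!r}")
        patterns.append(tuple(to_term(token) for token in match.groups()))
    return patterns


JOB = "http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#"

QUERY_DOCUMENT = f"""# generated from queries/q3.sq
_:x <{JOB}advertises> _:y .
_:y <{JOB}salary> _:sal .
_:y <{JOB}title> _:t .
"""

patterns = read_patterns(QUERY_DOCUMENT)
```

The pattern accepts a blank node in the predicate position as well, which plain N-Triples does not allow but which would be needed for "bArcs".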

5. Example

Suppose we have an RDQL query like the following:

SELECT
        ?sal, ?t, ?x 
FROM
        <file:rdf/jobs-rss.rdf>,
        <file:rdf/jobs.rss>
WHERE
        (?x, <job:advertises>, ?y),
        (?y, <job:title>, ?t) ,
        (?y, <job:salary>, ?sal)
USING
        job for <http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#>

then we can encode a test case for this query in RDF/XML as:

<?xml version="1.0"?>

<!DOCTYPE rdf:RDF [
<!ENTITY tq 'http://www.w3.org/2003/03/rdfqr-tests/query.rdfs#'>
]>

<rdf:RDF
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:mf='http://www.w3.org/2003/03/rdfqr-tests/manifest.rdfs#'
xmlns:tq='&tq;' >

<rdf:Description>
<mf:tests rdf:parseType='Collection'>
<mf:Test mf:num="3">
<mf:name>test3</mf:name>
<mf:description>two rss files</mf:description>
<mf:status>true</mf:status>
<mf:input rdf:parseType='Resource'>
<tq:queryDocument
rdf:resource='file:queries/nt/q3.sq.nt'
rdf:type="&tq;NT-Document"/>
<tq:inputDocument rdf:resource='file:rdf/jobs-rss.rdf'
rdf:type='&tq;RDF-XML-Document'/>
<tq:inputDocument rdf:resource='file:rdf/jobs.rss'
rdf:type='&tq;RDF-XML-Document'/>
</mf:input>
<mf:output rdf:parseType='Resource'>
<tq:outputDocument rdf:resource='file:rdf/rs/q3.rdf'
rdf:type='&tq;RDF-XML-Document'
tq:numberRows='5'/>
</mf:output>
</mf:Test>
</mf:tests>
</rdf:Description>

</rdf:RDF>
having the tq:queryDocument expressed as N-Triples

# generated from queries/q3.sq
_:x <http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#advertises> _:y .
_:y <http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#salary> _:sal .
_:y <http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#title> _:t .


and tq:inputDocument(s) in RDF/XML syntax

<?xml version="1.0" encoding="ISO-8859-2"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:wn="http://xmlns.com/wordnet/1.6/"
     xmlns:job="http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#">

          <channel rdf:about="http://ilrt.org/discovery/2000/11/rss-query/jobs-rss.rdf">
            <title>A hypothetical job listings channel</title>
            <link>http://ilrt.org/discovery/2000/11/rss-query/</link>
            <description>
        This example shows RSS used as a lightweight data transport mechanism
            </description>

            <image rdf:resource="http://ilrt.org/discovery/2000/11/rss-query/joblogo.gif"/>

            <items>
              <rdf:Seq>
                <rdf:li rdf:resource="http://example.com/job1.html" />
                <rdf:li rdf:resource="http://example.com/job2.html" />
              </rdf:Seq>
            </items>

          </channel>
         
          <image rdf:about="http://ilrt.org/discovery/2000/11/rss-query/joblogo.gif">
            <title>RSS Job listing demo</title>
            <link>http://ilrt.org/discovery/2000/11/rss-query/</link>
            <url>http://ilrt.org/discovery/2000/11/rss-query/joblogo.gif</url>
          </image>
         
          <item rdf:about="http://example.com/job1.html">
            <title>The title of job1 goes here</title>
            <link>http://example.com/job1.html</link>
            <description>
        (Job1-Job1-Job1...) A simple textual description of the
        job (ie. abstract of the job advert we reference) goes here.
            </description>
         
            <job:advertises>        
                 <wn:Job job:title="Job title for job1 goes here"
             job:salary="100000"
             job:currency="USD"
             >
            <job:orgHomepage rdf:resource="http://www.ukoln.ac.uk/"/>
                  </wn:Job>
             </job:advertises>

          </item>

          <item rdf:about="http://example.com/job2.html">
            <title>The title of job2 goes here</title>
            <link>http://example.com/job2.html</link>
            <description>
        (Job2-Job2-Job2...) A simple textual description of the
        job (ie. abstract of the job advert we reference) goes here.
            </description>

            <job:advertises>        
                 <wn:Job job:title="Job title for job2 goes here"
             job:salary="150000"
             job:currency="UKP"
             >
            <job:orgHomepage rdf:resource="http://ilrt.org/"/>
                  </wn:Job>
             </job:advertises>
          </item>
</rdf:RDF>


<?xml version="1.0" encoding="utf-8"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:job="http://ilrt.org/discovery/2000/11/rss-query/jobvocab.rdf#"
 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
 xmlns:wn="http://xmlns.com/wordnet/1.6/"
 xmlns="http://purl.org/rss/1.0/" >

<channel rdf:about="http://ilrt.org/discovery/2000/11/rss-query/jobs.rss">
 <title>JOBS!</title>
 <link>http://www.lotsofjobs.com</link>
 <description>
 Lots of fantastic jobs!
 </description>

 <image rdf:resource="http://ilrt.org/discovery/2000/11/rss-query/joblogo.gif"/>

    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.lotsofjobs.com/job1" />
        <rdf:li rdf:resource="http://www.lotsofjobs.com/job2"/>
        <rdf:li rdf:resource="http://www.lotsofjobs.com/job3" />
      </rdf:Seq>
    </items>         

</channel>


<image rdf:about="http://ilrt.org/discovery/2000/11/rss-query/joblogo.gif">
 <title>Jobs! logo</title>
 <link>http://www.lotsofjobs.com</link>
 <url>http://ilrt.org/discovery/2000/11/rss-query/joblogo.gif</url>
</image>


<item rdf:about="http://www.lotsofjobs.com/job1">
 <title>Job 1</title>
 <link>http://www.lotsofjobs.com/job1</link>

 <job:advertises>
   <wn:Job job:title="Job title for job11 goes here"
            job:salary="50000"
            job:currency="UKP"
            >
   <job:orgHomepage rdf:resource="http://www.nothing.com/"/>
   </wn:Job>
</job:advertises>

</item>

<item rdf:about="http://www.lotsofjobs.com/job2">
 <title>Job 2</title>
 <link>http://www.lotsofjobs.com/job2</link>
 <job:advertises>

    <wn:Job job:title="Job title for job22 goes here"
            job:salary="100000"
            job:currency="USD"
            >

    <job:orgHomepage rdf:resource="http://www.nothing.com/"/>
    </wn:Job>
</job:advertises>

</item>


<item rdf:about="http://www.lotsofjobs.com/job3">
 <title>Job 3</title>
 <link>http://www.lotsofjobs.com/job3</link>
 <job:advertises>
    <wn:Job job:title="Job title for job33 goes here"
            job:salary="60000"
            job:currency="USD"
            >

     <job:orgHomepage rdf:resource="http://www.nothing.com/"/>
 </wn:Job>
</job:advertises>

</item>

</rdf:RDF>


and tq:outputDocument in RDF/XML syntax (which does not yet use Andy Seaborne's RDF vocabulary for results [1])

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns="http://example.com/libby/config/"
         xmlns:foaf="http://xmlns.com/foaf/0.1/" >
<Row>
   <cell>
    <Binding>
        <key>y</key>
        <value rdf:nodeID="a"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>x</key>
        <value rdf:resource="http://example.com/job1.html"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>sal</key>
        <value>100000</value>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>t</key>
        <value>Job title for job1 goes here</value>
    </Binding>
   </cell>
</Row>
<Row>
   <cell>
    <Binding>
        <key>y</key>
        <value rdf:nodeID="b"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>x</key>
        <value rdf:resource="http://example.com/job2.html"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>sal</key>
        <value>150000</value>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>t</key>
        <value>Job title for job2 goes here</value>
    </Binding>
   </cell>
</Row>


<Row>
   <cell>
    <Binding>
        <key>y</key>
        <value rdf:nodeID="c"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>x</key>
        <value rdf:resource="http://www.lotsofjobs.com/job1"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>sal</key>
        <value>50000</value>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>t</key>
        <value>Job title for job11 goes here</value>
    </Binding>
   </cell>
</Row>


<Row>
   <cell>
    <Binding>
        <key>y</key>
        <value rdf:nodeID="d"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>x</key>
        <value rdf:resource="http://www.lotsofjobs.com/job2"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>sal</key>
        <value>100000</value>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>t</key>
        <value>Job title for job22 goes here</value>
    </Binding>
   </cell>
</Row>

<Row>
   <cell>
    <Binding>
        <key>y</key>
        <value rdf:nodeID="e"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>x</key>
        <value rdf:resource="http://www.lotsofjobs.com/job3"/>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>sal</key>
        <value>60000</value>
    </Binding>
   </cell>

   <cell>
    <Binding>
        <key>t</key>
        <value>Job title for job33 goes here</value>
    </Binding>
   </cell>
</Row>

</rdf:RDF>
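
The manifest at the start of this section can be read by a test harness along the following lines. This Python sketch is only an illustration: it works at the XML level with the standard xml.etree.ElementTree module rather than with an RDF parser, so it relies on the manifest being serialised exactly as shown; a real tool would parse the RDF graph instead. The embedded manifest is a trimmed copy of the one above with the &tq; entity written out in full, and the function names and the dictionary keys are assumptions of this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MF = "http://www.w3.org/2003/03/rdfqr-tests/manifest.rdfs#"
TQ = "http://www.w3.org/2003/03/rdfqr-tests/query.rdfs#"

# Trimmed copy of the manifest shown in this section.
MANIFEST = f"""<rdf:RDF xmlns:rdf='{RDF}' xmlns:mf='{MF}' xmlns:tq='{TQ}'>
<rdf:Description>
<mf:tests rdf:parseType='Collection'>
<mf:Test mf:num="3">
<mf:name>test3</mf:name>
<mf:description>two rss files</mf:description>
<mf:status>true</mf:status>
<mf:input rdf:parseType='Resource'>
<tq:queryDocument rdf:resource='file:queries/nt/q3.sq.nt' rdf:type='{TQ}NT-Document'/>
<tq:inputDocument rdf:resource='file:rdf/jobs-rss.rdf' rdf:type='{TQ}RDF-XML-Document'/>
<tq:inputDocument rdf:resource='file:rdf/jobs.rss' rdf:type='{TQ}RDF-XML-Document'/>
</mf:input>
<mf:output rdf:parseType='Resource'>
<tq:outputDocument rdf:resource='file:rdf/rs/q3.rdf' rdf:type='{TQ}RDF-XML-Document' tq:numberRows='5'/>
</mf:output>
</mf:Test>
</mf:tests>
</rdf:Description>
</rdf:RDF>"""


def qname(namespace: str, local: str) -> str:
    """Build the '{namespace}local' form that ElementTree uses for names."""
    return "{" + namespace + "}" + local


def resources(test: ET.Element, prop: str) -> list:
    """Collect the rdf:resource values of every tq:<prop> element in a test."""
    return [doc.get(qname(RDF, "resource")) for doc in test.iter(qname(TQ, prop))]


def read_tests(xml_text: str) -> list:
    """Return one dictionary per mf:Test element found in the manifest."""
    root = ET.fromstring(xml_text)
    tests = []
    for test in root.iter(qname(MF, "Test")):
        tests.append({
            "num": int(test.get(qname(MF, "num"))),
            "name": test.findtext(qname(MF, "name")),
            "status": test.findtext(qname(MF, "status")) == "true",
            "queries": resources(test, "queryDocument"),
            "inputs": resources(test, "inputDocument"),
            "outputs": resources(test, "outputDocument"),
            "rows": [
                int(doc.get(qname(TQ, "numberRows")))
                for doc in test.iter(qname(TQ, "outputDocument"))
            ],
        })
    return tests


tests = read_tests(MANIFEST)
```

With the test description in hand, a harness can fetch the listed input and query documents, run the query with the tool under test, and compare the number of rows obtained with the declared tq:numberRows.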


6. Tools

Libby Miller has been working on a generic Java-based query syntax translation tool [21].

7. References and Resources


[1] http://www.w3.org/2003/03/rdfqr-tests/recording-query-results.html
[2] http://www.w3.org/2003/03/rdfqr-tests/
[3] http://www.w3.org/2001/sw/Europe/plan/workpackages/live/esw-wp-7.html
[4] http://www.w3.org/2001/sw/Europe/
[5] http://www.w3.org/2003/03/swsig-charter.html
[6] http://www.w3.org/2002/02/21-WSDL-RDF-mapping/
[7] http://www.w3.org/2001/11/13-RDF-Query-Rules/
[8] http://rdfstore.sourceforge.net/2002/06/24/rdf-query/
[9] http://rdfweb.org/2001/01/design/smush.html
[10] http://www.w3.org/TR/rdf-concepts/#section-graph-equality
[11] http://www.w3.org/TR/rdf-syntax-grammar/
[12] http://www.w3.org/TR/rdf-testcases/#ntriples
[13] http://www.w3.org/DesignIssues/Notation3
[14] http://www.picdiary.com/triplequerying/
[15] http://rdfweb.org/2002/02/java/squish2sql/intro.html
[16] http://www.w3.org/TandS/QL/QL98/
[17] http://lists.w3.org/Archives/Public/www-archive/2002Apr/0040.html
[18] http://lists.w3.org/Archives/Public/www-rdf-rules/2003Feb/0003.html
[19] http://www.hpl.hp.com/semweb/publications/DaveR-www2003.pdf
[20] http://www.csd.abdn.ac.uk/research/AgentCities/QueryByExample/index.php
[21] http://www.ilrt.bris.ac.uk/discovery/2003/03/query/readme.html
[22] http://lists.w3.org/Archives/Public/www-rdf-rules/2003Mar/0013.html
[23] http://www.w3.org/TR/rdf-testcases/
[24] http://www.w3.org/2000/10/rdf-tests/rdfcore/testSchema.rdf
[25] http://swordfish.rdfweb.org/rdfquery/tests/query-results-manifest.rdf
[26] http://lists.w3.org/Archives/Public/www-rdf-rules/2003Jan/0010.html
[27] http://lists.w3.org/Archives/Public/www-rdf-rules/2003Mar/0008.html


Alberto Reggiori : <alberto@asemantics.com>
