GRDDL Test Cases

Editor's Draft March 17 2007

This version:
Latest version:
Chimezie Ogbuji, Cleveland Clinic Foundation, <ogbujic@ccf.org>
see Acknowledgments


This document describes and includes test cases for software agents that extract RDF from XML source documents by following the set of mechanisms outlined in the Gleaning Resource Descriptions from Dialects of Languages (GRDDL) specification. The test cases demonstrate the expected behavior of a GRDDL-aware agent by specifying one or more RDF graph serializations which are the GRDDL results associated with a single source document.



A set of test cases is provided as part of the definition of [GRDDL]. This document presents those test cases. They are intended to provide examples for, and clarification of, the normative behavior of a GRDDL-aware agent. They should be used for testing the conformance of GRDDL-aware agents. The normative tests cover behavior expected of a GRDDL-aware agent. The informative tests demonstrate other permitted behavior with respect to the issues resolved by the Working Group. This document itself has (as a GRDDL result) a manifest describing the test cases in RDF. For convenience, serializations of the GRDDL result are available as RDF/XML and Turtle.


The deliverables included as part of the test case collection are:

Note: the zip archive does not include tests which require network connectivity in order to properly calculate their GRDDL results.

Test Manifest Format

This test collection uses an RDF vocabulary for manifests developed for the RDF Test Cases Recommendation. A GRDDL-aware agent can extract the test collection and automatically test compliance by attempting to reproduce the expected GRDDL result(s) associated with each test case. Some input documents have multiple output documents; see the section Tests with Multiple GRDDL Results.
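As an illustration of how a manifest pairs source documents with expected results, the following is a minimal sketch rather than the driver's actual logic. It assumes the manifest is serialized as RDF/XML in the striped form shown, that test entries use the `inputDocument` and `outputDocument` properties of the RDF Test Cases vocabulary, and that an input with several expected results appears in several entries. The namespace URI, the `test:Test` element, and all file names are assumptions made for the example. A real harness should use a full RDF parser, since RDF/XML allows many equivalent syntactic forms.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace of the RDF Test Cases manifest vocabulary.
TEST = "http://www.w3.org/2000/10/rdf-tests/rdfcore/testSchema#"

# Hypothetical manifest fragment; entry names and file names are invented.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:test="http://www.w3.org/2000/10/rdf-tests/rdfcore/testSchema#">
  <test:Test rdf:about="#case1">
    <test:inputDocument rdf:resource="input1.xml"/>
    <test:outputDocument rdf:resource="output1.rdf"/>
  </test:Test>
  <test:Test rdf:about="#case2a">
    <test:inputDocument rdf:resource="input2.html"/>
    <test:outputDocument rdf:resource="output2a.rdf"/>
  </test:Test>
  <test:Test rdf:about="#case2b">
    <test:inputDocument rdf:resource="input2.html"/>
    <test:outputDocument rdf:resource="output2b.rdf"/>
  </test:Test>
</rdf:RDF>"""


def expected_results(manifest_xml):
    """Map each input document to the list of its expected output documents."""
    root = ET.fromstring(manifest_xml)
    results = defaultdict(list)
    for case in root:  # each child element is treated as one test entry
        source = case.find(f"{{{TEST}}}inputDocument")
        if source is None:
            continue  # not a test entry in the assumed shape
        source_uri = source.get(f"{{{RDF}}}resource")
        for output in case.findall(f"{{{TEST}}}outputDocument"):
            results[source_uri].append(output.get(f"{{{RDF}}}resource"))
    return dict(results)


if __name__ == "__main__":
    for source_uri, outputs in expected_results(SAMPLE).items():
        print(source_uri, "->", ", ".join(outputs))
```

Grouping by input document makes it easy to see which sources have more than one acceptable GRDDL result.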

Using the Test Driver

We provide testft.py, a test driver written in Python and based on rdflib 2.3.3. Run it as follows:

$ python testft.py --run your_grddl_impl testlist1.rdf >earl_out.rdf
All tests were passed!

It has options for --debug and such; invoke it with no arguments (or with --help) for details:

  -r, --run              path to a GRDDL implementation to use to process the 
                         source document (checking results)
  -u, --update           path to a GRDDL Implementation to use to process the 
                         source document
      --tester           The URI of an agent associated with the EARL test assertions.
                         A BNode is used if none is given                          
      --project          The URI of the EARL 'subject' (the implementation being tested).
                         A BNode is used if none is given

The tests do not require the use of this driver.

EARL Reporting

In addition to writing various diagnostic messages to STDERR, the test harness writes additional RDF data to STDOUT: an [EARL] test assertion about each test it runs.

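To tell the harness about the person running the tests and the software project being tested, use the --tester option to point it to a tester (a URI in a [FOAF] RDF graph) and the --project option to point it to a test subject (a URI in a [DOAP] RDF graph). If either is omitted, a blank node is used in its place.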
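To make the shape of such a report concrete, here is a minimal sketch that builds one assertion as N-Triples using only the Python standard library. It is not the output format of testft.py. The term names (`Assertion`, `assertedBy`, `subject`, `test`, `result`, `TestResult`, `outcome`, `passed`, `failed`) and the namespace follow the later EARL 1.0 Schema and may differ from the 2006 Working Draft cited here; the function name and the example URIs are hypothetical.

```python
from itertools import count

# Assumed EARL namespace (from the later EARL 1.0 Schema).
EARL = "http://www.w3.org/ns/earl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

_ids = count(1)  # source of unique blank node labels


def _term(uri, prefix):
    """Return an N-Triples IRI, or a fresh blank node when no URI is given."""
    return f"<{uri}>" if uri else f"_:{prefix}{next(_ids)}"


def earl_assertion(test_uri, passed, tester=None, project=None):
    """Serialize one EARL-style test assertion as N-Triples."""
    assertion = _term(None, "assertion")
    result = _term(None, "result")
    outcome = "passed" if passed else "failed"
    triples = [
        (assertion, f"<{RDF_TYPE}>", f"<{EARL}Assertion>"),
        # A blank node stands in when no tester or project URI is supplied.
        (assertion, f"<{EARL}assertedBy>", _term(tester, "tester")),
        (assertion, f"<{EARL}subject>", _term(project, "project")),
        (assertion, f"<{EARL}test>", f"<{test_uri}>"),
        (assertion, f"<{EARL}result>", result),
        (result, f"<{RDF_TYPE}>", f"<{EARL}TestResult>"),
        (result, f"<{EARL}outcome>", f"<{EARL}{outcome}>"),
    ]
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    # Hypothetical test and project URIs, for illustration only.
    print(earl_assertion("http://example.org/tests#t1", True,
                         project="http://example.org/impl#me"))
```

Note that this sketch mints a new blank node per assertion when no tester or project is given; a harness would more plausibly reuse a single node for the whole run.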

Protocol Tracing

We find TCPWatch useful for debugging [HTTP] protocol interactions. If you start TCPWatch like so:

$ python tcpwatch.py -p 6543 &

then you can use it as a proxy:

$ http_proxy= python testft.py \
    --run your_grddl_impl testharness.rdf

GRDDL Transform Library

A library of standard transforms is available for widespread use by authors.

Local Policies, Faithful Rendition, and Conformance

The GRDDL specification states that any transformation identified by an author of a GRDDL source document will provide a Faithful Rendition of the information expressed in the source document. The specification also grants a GRDDL-aware agent the license to determine whether or not to apply a particular transformation, guided by user interaction, a local security policy, or the agent's capabilities. However, for the purpose of running these tests to determine compliance, a GRDDL-aware agent whose security policy does not prevent it from applying the transformations identified by each test will produce the GRDDL result associated with each test.
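One way to picture a local policy is as a filter over the transformation URIs an agent discovers. The sketch below is purely illustrative and not prescribed by the GRDDL specification: the host allow-list, the function names, and the example URIs are all assumptions.

```python
from urllib.parse import urlsplit


def allowed_by_policy(transform_uri, trusted_hosts):
    """Hypothetical policy: allow only absolute HTTP(S) URIs on trusted hosts."""
    parts = urlsplit(transform_uri)
    return parts.scheme in ("http", "https") and parts.hostname in trusted_hosts


def transformations_to_apply(transform_uris, trusted_hosts):
    """Keep the discovered transformations that the local policy permits."""
    return [uri for uri in transform_uris
            if allowed_by_policy(uri, trusted_hosts)]


if __name__ == "__main__":
    discovered = [
        "http://www.w3.org/2003/g/glean-profile",   # illustrative URI
        "http://untrusted.example/transform.xsl",   # invented URI
    ]
    print(transformations_to_apply(discovered, {"www.w3.org"}))
```

When running the test collection, such a policy would need to permit every transformation the tests identify; otherwise a missing result reflects the policy rather than a conformance failure.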

Tests with Multiple GRDDL Results

Certain tests have multiple GRDDL results as a direct consequence of Faithful Infoset considerations, information resources with multiple representations, and separate GRDDL mechanisms which produce distinct GRDDL results. For such tests, a GRDDL-aware agent should output at least one of the GRDDL results associated with the test case.
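The "at least one" rule can be sketched as a simple membership check. This is a simplified illustration, not the driver's comparison routine: it assumes graphs are represented as collections of (subject, predicate, object) string tuples and compares them as sets, which is only correct for graphs without blank nodes. Graphs containing blank nodes require a proper graph isomorphism test. All names and triples are hypothetical.

```python
def matches_any(actual, expected_results):
    """Return True if the actual graph equals at least one expected graph.

    Graphs are iterables of (subject, predicate, object) tuples. Set equality
    ignores triple order and duplicates but does not handle blank nodes.
    """
    actual_graph = frozenset(actual)
    return any(actual_graph == frozenset(expected)
               for expected in expected_results)


if __name__ == "__main__":
    # Invented triples standing in for two acceptable GRDDL results.
    result_a = [("http://example.org/doc", "http://purl.org/dc/elements/1.1/title", '"A"')]
    result_b = [("http://example.org/doc", "http://purl.org/dc/elements/1.1/title", '"B"')]
    produced = list(result_b)
    print(matches_any(produced, [result_a, result_b]))
```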

Normative Tests

Each test has an input document and an output document. The output document is an RDF/XML document and represents a GRDDL result of the input document.

Localized Tests

For the sake of convenience, this first set of normative tests covers simple scenarios where neither namespace documents nor absolute URIs are used. Such tests can be run offline rather easily.

Namespace Documents and Absolute Locations

These tests include the use of namespace documents and absolute URIs and are more difficult to run offline.

Ambiguous Infosets, Representations, and Traversals

These tests help check for robustness of implementations in the face of various odd cases.

Informative Tests

This section includes tests that are not covered explicitly by the normative text of the GRDDL specification but that demonstrate additional behavior a GRDDL-aware agent may exhibit. They reflect behavior suggested by the Working Group as a result of resolving certain issues.



[GRDDL]
Gleaning Resource Descriptions from Dialects of Languages (GRDDL), Dan Connolly, 2007/03/02.
[RDF Concepts]
RDF Concepts and Abstract Syntax, Graham Klyne and Jeremy J. Carroll, Editors, W3C Recommendation 10 February 2004. Latest version available at http://www.w3.org/TR/rdf-concepts/.
[RDF Syntax]
RDF/XML Syntax Specification (Revised), Dave Beckett, Editor, W3C Recommendation 10 February 2004. Latest version available at http://www.w3.org/TR/rdf-syntax-grammar/.

[Turtle]
Turtle - Terse RDF Triple Language, Dave Beckett, Editor, 04 December 2006.
[EARL]
Evaluation and Report Language (EARL) 1.0 Schema, Shadi Abou-Zahra and Charles McCathieNevile, Editors, W3C Working Draft 27 September 2006, http://www.w3.org/TR/EARL10-Schema/.
[WebArch]
Architecture of the World Wide Web, Volume One, N. Walsh, I. Jacobs, Editors, W3C Recommendation, 15 December 2004. Latest version available at http://www.w3.org/TR/webarch/.
[HTTP]
IETF RFC 2616: Hypertext Transfer Protocol - HTTP/1.1, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, June 1999. Available at http://www.ietf.org/rfc/rfc2616.txt.
[FOAF]
FOAF Vocabulary Specification, Dan Brickley, Libby Miller, 27 July 2005.
[DOAP]
DOAP: Description of a Project, Edd Dumbill.


The editor gratefully acknowledges the contributions of the following Working Group members:

Change Log

Changes since the Working Group's decision to publish on 7 March:

$Log: grddl-tests.html,v $
Revision 1.7  2007/03/19 14:03:13  cogbuji
added link to archive

Revision 1.6  2007/03/19 14:02:17  cogbuji
fixed well-formedness error - missing div element

Revision 1.5  2007/03/19 13:36:43  cogbuji
attempt to fix embedded "pre" tag in log commentary