Warning: This version accompanies the development of RDFa 1.1 Core. As that document is not final yet, this service, and the underlying code, will change frequently until the development of RDFa 1.1 is finalized. The implementation may actually run ahead of the “official” version and implement the version already in the editors’ draft. Also, the package available for download may be out of sync with the code running this service.
If you intend to use this service regularly on a large scale, consider downloading the package and using it locally. Storing a (conceptually) “cached” version of the generated RDF, instead of referring to the live service, is another way to avoid overloading this server.
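One way to keep such a “cached” copy is sketched below. It is a minimal illustration, not part of pyRdfa: the function names, the cache file layout, and the injectable fetch parameter are all assumptions made for this example, and the code is written for Python 3's standard library. The service URI is the one used in the examples on this page.

```python
import hashlib
import os
import urllib.parse
import urllib.request

# Service URI as it appears in the examples on this page.
SERVICE = "http://www.w3.org/2012/pyRdfa/extract"


def fetch_from_service(page_uri):
    """Ask the live distiller for the RDF of page_uri (default options)."""
    query = urllib.parse.urlencode({"uri": page_uri})
    with urllib.request.urlopen(SERVICE + "?" + query) as response:
        return response.read()


def cached_distill(page_uri, cache_dir, fetch=fetch_from_service):
    """Return the distilled RDF as bytes, calling the service only on a cache miss.

    The fetch parameter is a hypothetical hook so that the network call
    can be replaced, for instance in tests.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the page URI so that it can safely be used as a file name.
    key = hashlib.sha256(page_uri.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".cache")
    if os.path.exists(path):
        with open(path, "rb") as cached:
            return cached.read()
    data = fetch(page_uri)
    with open(path, "wb") as out:
        out.write(data)
    return data
```

This sketch never expires its entries; a real setup would probably also want to refresh the stored copy when the source page changes.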
RDFa is a specification for attributes to be used with XML languages or with HTML5 to express structured data. The rendered, hypertext data of XML or HTML is reused by the RDFa markup, so that publishers don’t need to repeat significant data in the document content. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time.
pyRdfa is a distiller that generates RDF triples, in various RDF serialization formats, from an XML or HTML5 file annotated with RDFa. It can be used either directly from a command line or via a CGI service. It corresponds to the RDFa 1.1 Core document and the XHTML+RDFa and HTML+RDFa specifications, as well as to the SVG Tiny 1.2 Recommendation for the SVG version. The forms above can be used to start the service installed at this site. To learn more about RDFa, please consult the RDFa 1.1 Core Document. See also below for the possibilities to download the package.
As installed in this service, pyRdfa is a server-side implementation of RDFa. This means that pages that generate their (X)HTML content dynamically (e.g., using AJAX) will not be properly processed by this distiller.
The following options can be set:
format (values: turtle, xml, json, nt; default: turtle): the output serialization format.
rdfa-lite (values: true, false; default: false): if true, a warning is issued if RDFa 1.1 Core attributes that are not part of the RDFa 1.1 Lite specification are used. The separate graph option should be used to make these warnings visible.
host-language (values: xhtml, html, svg, atom, xml; default: html)
graph (values: output, processor, processor,output; default: output): if processor is set, then the triples of the processor graph are returned, too. See the RDFa 1.1 Core document for further details.
vocab-expansion (values: true, false; default: false): whether to perform vocabulary expansion for the vocab attribute, i.e., to retrieve the corresponding RDF file and follow the possible subclass and subproperty relationships. See the RDFa 1.1 Core document for further details.
embedded-turtle (values: true, false; default: true)
space-preserve (values: true, false; default: true)
vocab-cache (values: true, false; default: true)
vocab-cache-report (values: true, false; default: false): if true, the graph option is set to processor or processor,default (depending on the original setting of graph).
vocab-cache-refresh (values: true, false; default: false)
When the RDFa resource is accessed through HTTP, the host language is determined based on the content type of the return header.
… the metadata element is also extracted and added to the output.
If you use Firefox, Safari, Chrome, or Opera, you can also drag the following bookmarklets to your browser bar and use them to distill the current page: “RDFa it (Turtle)!”, “RDFa it (RDF/XML)!”, “RDFa it (N triples)!”.
When using the distiller URI directly, options that keep their default values can be omitted. Some examples:
http://www.example.com/rdfa.html, with whitespace preservation and without warnings, serialized in Turtle:
http://www.w3.org/2012/pyRdfa/extract?uri=http://www.example.com/rdfa.html
http://www.example.com/rdfa.html, with whitespace preservation and without warnings, serialized in RDF/XML:
http://www.w3.org/2012/pyRdfa/extract?format=xml&uri=http://www.example.com/rdfa.html
http://www.example.com/rdfa.html, with whitespace preservation and including warnings, serialized in Turtle:
http://www.w3.org/2012/pyRdfa/extract?graph=default,processor&uri=http://www.example.com/rdfa.html
http://www.w3.org/2012/pyRdfa/extract?uri=referer
The underlying package, pyRdfa, is implemented as a Python package and is available for download. The package is based on the standard Python 2.x distribution. (It has been tested on version 2.7.2, which is the highest, and probably the last, stable release in Python 2.x.) The module does not run on the Python 3.x family.
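The distiller URIs shown in the examples above can also be assembled programmatically. The following is a minimal sketch, written for Python 3's standard library and independent of the pyRdfa package itself; the function name distiller_uri and the convention of writing hyphenated option names with underscores are assumptions made for this illustration.

```python
from urllib.parse import urlencode

# Service URI as it appears in the examples on this page.
SERVICE = "http://www.w3.org/2012/pyRdfa/extract"


def _as_text(value):
    """Render booleans as the lowercase true/false values the options use."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def distiller_uri(page_uri, **options):
    """Build a request URI; options that are left out keep their defaults.

    Hypothetical helper: option names are written with underscores
    (vocab_expansion) and converted to the hyphenated form (vocab-expansion).
    """
    params = [(name.replace("_", "-"), _as_text(value))
              for name, value in sorted(options.items())]
    params.append(("uri", page_uri))
    # Leave ':', '/' and ',' unescaped so the result reads like the examples.
    return SERVICE + "?" + urlencode(params, safe=":/,")
```

For instance, distiller_uri("http://www.example.com/rdfa.html", format="xml") yields the second example URI above. Characters such as "?" or "&" inside the page URI are percent-escaped, so that they are not mistaken for separators between service options.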
The core package relies on the RDFLib package. It has been tested with RDFLib 3.1.0, but it also runs with the RDFLib 2.x versions. RDFLib 3.x is preferred because its serialization modules are superior in quality. (Note, however, that the JSON serialization does not run on RDFLib 2.x versions!) The Python HTML5 parser is used to process HTML5. The general package also relies on a slightly modified version of Deron Meranda’s httpheader module. (Both the HTML5 Parser and httpheader are included in the distribution.)
For the JSON-LD serialization, two more external packages are used: Armin Ronacher’s Ordered Dictionary (odict) package, as well as Bob Ippolito’s simplejson package. odict is needed unless Python 2.7.x is used (an ordered dictionary module has been added to the standard distribution of Python 2.7.x); simplejson is needed for Python 2.5 or lower (json has been added to the standard Python 2.6.x distribution).
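The version-dependent choice described above is commonly expressed with import fallbacks. The sketch below shows that general pattern only; it is not taken from the pyRdfa sources, and the import form "from odict import odict" is an assumption about how the odict module exposes its class.

```python
# Prefer the standard library; fall back to the external packages
# only on Python versions that lack the built-in modules.
try:
    from collections import OrderedDict  # standard since Python 2.7
except ImportError:
    # Assumption: the odict module provides a class named odict.
    from odict import odict as OrderedDict

try:
    import json  # standard since Python 2.6
except ImportError:
    import simplejson as json  # needed on Python 2.5 or lower


def dump_ordered(pairs):
    """Serialize (key, value) pairs to JSON, keeping the given key order."""
    return json.dumps(OrderedDict(pairs))
```

On an interpreter that ships both modules, neither fallback is triggered and the external packages do not need to be installed.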
To install the package, download the distribution file (it is a compressed tar file) and either move the pyRdfa directory to your PYTHONPATH or modify your PYTHONPATH to include that directory. The odict and httpheader modules (each consisting of a single Python file) have been added to the pyRdfa package under ‘extras’; you do not have to do anything special to install these. The HTML5 parser must be installed independently; to make this step easier, the compressed tar file has been added to the pyRdfa distribution file. The same is true for the simplejson package although, if you run Python 2.6.x or higher, that module can be ignored.
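As an alternative to editing PYTHONPATH, a script can extend its own import path at run time. The sketch below illustrates this under stated assumptions: the helper name make_importable is hypothetical, parent_dir is taken to be the directory that contains the unpacked package directory, and the check relies on importlib.util, which only exists on Python 3.4 or later; the sys.path manipulation itself works the same way on Python 2.x.

```python
import importlib.util
import sys


def make_importable(parent_dir, package="pyRdfa"):
    """Put parent_dir on the import path and report whether package is found.

    Assumption: parent_dir is the directory that contains the package
    directory, not the package directory itself.
    """
    if parent_dir not in sys.path:
        sys.path.insert(0, parent_dir)
    # Make sure the import system notices directories created recently.
    importlib.invalidate_caches()
    return importlib.util.find_spec(package) is not None
```

Unlike PYTHONPATH, this change only affects the running process, so it has to be repeated in every script that needs the package.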
This software is available for use under the W3C® SOFTWARE NOTICE AND LICENSE