RDF-DSpace Query Scenario

Author: Dan Brickley

DSpace aims to create a standard cookbook, software package or hosting service for digital library services. This document outlines some possibilities for querying data within DSpace in a flexible and expressive manner. As an example scenario it takes a fragment of the data model presented in [tansley-june-2001], the DSpace object composition document. The diagram below describes some objects and relationships relating to DSpace workflow; specifically, it describes a composite object and a format transformation event. We use this as a sample scenario to explore issues relating to (meta)data query.

The relationship between object modelling in this style and the use of 'object oriented' information systems (Java, CORBA IDL, SOAP/WSDL, Fedora) is also of interest here. Mappings can be expressed that relate RDF-like object descriptions to object-based systems; see below.

Figure 1: Sample data (fig 8 from tansley-june-2001 doc)

[Image: DSpace diagram]

RDF Representation

This structure can be represented as a set of binary relationships using the RDF data model, and serialized as an XML document. Here is one such encoding...


<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dst="http://www.w3.org/2001/06/dspace-swad-test#"
         xmlns:eg="http://www.w3.org/2001/06/dspace-swad-examples#"
         >

<eg:OBJ-DO2 rdf:about="http://dspace.org/id/obj-001">
 <dc:description>an object of type obj-do2, supported html bundle type</dc:description>
</eg:OBJ-DO2>

<rdfs:Resource rdf:about="http://dspace.org/id/obj-004">
  <dc:description>some html content...</dc:description>
  <dst:partOf rdf:resource="http://dspace.org/id/obj-001" />
</rdfs:Resource>

<rdfs:Resource rdf:about="http://dspace.org/id/obj-005">
  <dc:description>some kind of table of contents for the object...(?)</dc:description>
  <dst:partOf rdf:resource="http://dspace.org/id/obj-001" />
</rdfs:Resource>


<eg:OBJ-101 rdf:about="http://dspace.org/id/obj-046">
  <dc:description>a transformation record</dc:description>

  <dst:consumed> 

  <rdfs:Resource rdf:about="http://dspace.org/id/obj-006">
    <dc:description>some html content...</dc:description>
    <dst:partOf rdf:resource="http://dspace.org/id/obj-001" />
    <dst:has_format>
      <rdfs:Resource rdf:about="http://dspace.org/id/obj-010">
        <dc:format>image/svg</dc:format>
        <dc:description>the image/svg image format</dc:description>
      </rdfs:Resource>
    </dst:has_format>
  </rdfs:Resource>


  </dst:consumed> 
</eg:OBJ-101>

<rdfs:Resource rdf:about="http://dspace.org/id/obj-021">
  <dst:has_format>
    <rdfs:Resource rdf:about="http://dspace.org/id/obj-012">
      <dc:format>image/jpeg</dc:format>
    </rdfs:Resource>
  </dst:has_format>
  <dst:produced_by rdf:resource="http://dspace.org/id/obj-046" />
</rdfs:Resource>

</rdf:RDF>
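
Read as a graph, the encoding above amounts to a set of (subject, predicate, object) statements. The following is a minimal Python sketch of that view, hand-transcribed from the XML rather than produced by an RDF parser. It covers only a subset of the statements (the rdfs:Resource typings and most descriptions are left out), and the names `TRIPLES`, `ID` and `parts_of` are illustrative assumptions, not part of DSpace or of any RDF toolkit.

```python
# Namespace URIs copied from the RDF/XML above.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
DST = "http://www.w3.org/2001/06/dspace-swad-test#"
EG = "http://www.w3.org/2001/06/dspace-swad-examples#"
ID = "http://dspace.org/id/"  # shorthand used only in this sketch

# Hand-transcribed subset of the statements in the sample data.
TRIPLES = {
    (ID + "obj-001", RDF + "type", EG + "OBJ-DO2"),
    (ID + "obj-004", DST + "partOf", ID + "obj-001"),
    (ID + "obj-005", DST + "partOf", ID + "obj-001"),
    (ID + "obj-006", DST + "partOf", ID + "obj-001"),
    (ID + "obj-046", RDF + "type", EG + "OBJ-101"),
    (ID + "obj-046", DC + "description", "a transformation record"),
    (ID + "obj-046", DST + "consumed", ID + "obj-006"),
    (ID + "obj-006", DST + "has_format", ID + "obj-010"),
    (ID + "obj-010", DC + "format", "image/svg"),
    (ID + "obj-021", DST + "has_format", ID + "obj-012"),
    (ID + "obj-012", DC + "format", "image/jpeg"),
    (ID + "obj-021", DST + "produced_by", ID + "obj-046"),
}


def parts_of(whole):
    """Return, sorted, every subject that is dst:partOf the given resource."""
    return sorted(s for (s, p, o) in TRIPLES
                  if p == DST + "partOf" and o == whole)


if __name__ == "__main__":
    for part in parts_of(ID + "obj-001"):
        print(part)
```

Each nested element in the XML contributes one statement here, which is why the striped RDF/XML syntax and the flat list are interchangeable for query purposes.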

example query code...

[sample Perl RDF query code]



#!/bin/perl
#
# Squish/Algae test script
# usage: ./sqtest.pl --data=../samples/data2.rdf 
#
BEGIN {unshift @INC,('../../..','../..','..','/home/pldab/working/','/home/pldab/working/rudolf-perl/');}

use RDF::RDFWeb::SquishAlgae;
use strict;

my $datafile = "/home/pldab/working/rudolf-perl/samples/data.rdf"; 
print "Sample data: $datafile\n";

my $q  = RDF::RDFWeb::SquishAlgae->new;
my $q2 = RDF::RDFWeb::SquishAlgae->new;

my $RDF='http://www.w3.org/1999/02/22-rdf-syntax-ns#';
my $DC= 'http://purl.org/dc/elements/1.1/';
my $DST ="http://www.w3.org/2001/06/dspace-swad-test#";
my $EG ="http://www.w3.org/2001/06/dspace-swad-examples#";

my $test1 = './test1-ds.rdf';


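#########################################################################
# Query 1 (SquishQL format): every resource ?x together with its rdf:type ?y
my $iq = "SELECT ?x ?y FROM $test1 WHERE 
	 (rdf::type ?x ?y) 
	USING rdf as $RDF AND dc as $DC";

# convert the Squish text to an Algae query, then run it over the test file
my @type_hits = $q2->doalgae( $test1, $q->squish2algae( $iq ) );

print "\n\nQuery results:\n";

# each hit is read as a hash reference; keys 0 and 1 hold ?x and ?y
foreach my $hit (@type_hits) {
  printf ( "object: %s type: %s\n\n",
 		$hit->{0}, $hit->{1} );
}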

#########################################################################
# Query 2 (SquishQL format): every pair where ?x is dst:partOf ?y
my $pq1 = "SELECT ?x ?y  FROM $test1 WHERE 
	 (dst::partOf ?x ?y) 
	USING rdf as $RDF AND dc as $DC AND dst AS $DST";

my @part_hits = $q->doalgae( $test1, $q->squish2algae( $pq1 ) );

print "\n\nQuery results:\n";

# keys 0 and 1 hold ?x (the part) and ?y (the whole)
foreach my $hit (@part_hits) {
  printf ( "object: %s is part of...: %s \n\n",
 		$hit->{0}, $hit->{1} );
}


#########################################################################
# From figure 8 of 'an object model from dspace', Robert Tansley, 11 June 2001
#
# Goal: find objects ?x (with their rdf:type and dc:description) where some
# object ?o is dst:partOf ?x and is dst:consumed by an object ?t of rdf:type
# eg:OBJ-101 (the transformation record type), and where another resource ?y
# with a dst:has_format of 'image/jpeg' was dst:produced_by that transformation.
# Only the first three constraints are expressed so far. The type of ?x gets
# its own variable, because ?t is already bound to the transformation record.

my $complex1="
SELECT ?x ?o ?t  WHERE 
	(dst::consumed ?t ?o)
	(dst::partOf ?o ?x)
	(rdf::type ?x ?xtype)
USING rdf as $RDF AND dc as $DC AND dst AS $DST";

# more constraints...
#	(dc::description ?x ?d)
#	(dst::produced_by ?y ?t)
      
  
my @complex_hits = $q->doalgae( $test1, $q->squish2algae( $complex1 ) );

print "\n\n **notfinished** complex search results:\n";

# keys 0 and 1 hold ?x (the composite object) and ?o (its consumed part)
foreach my $hit (@complex_hits) {
  printf ( "object: %s has consumed part: %s \n\n",
 		$hit->{0}, $hit->{1} );
}

Query examples...

These examples use the 'Squish' prototype RDF query language, and have been tested using the W3C Perllib parser and RDF query system.
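
To make the query model concrete: a Squish WHERE clause is a conjunction of triple patterns that share variables. The sketch below is a small in-memory matcher for such conjunctions, written in plain Python; it is not the Squish or Algae implementation. Several things are assumptions made for illustration: the function names `is_var` and `match`, the abbreviated identifiers such as `"dst:partOf"` and `"obj-001"` standing in for full URIs, and the (subject, predicate, object) ordering of patterns, which differs from the predicate-first ordering used in the Squish text.

```python
# A hand-transcribed subset of the sample data, with abbreviated names.
TRIPLES = [
    ("obj-001", "rdf:type", "eg:OBJ-DO2"),
    ("obj-004", "dst:partOf", "obj-001"),
    ("obj-005", "dst:partOf", "obj-001"),
    ("obj-006", "dst:partOf", "obj-001"),
    ("obj-046", "rdf:type", "eg:OBJ-101"),
    ("obj-046", "dst:consumed", "obj-006"),
    ("obj-021", "dst:produced_by", "obj-046"),
    ("obj-021", "dst:has_format", "obj-012"),
    ("obj-012", "dc:format", "image/jpeg"),
]

# The figure 8 query: a part ?o of ?x was consumed by transformation ?t,
# which produced a resource ?y whose format ?f is image/jpeg.
PATTERNS = [
    ("?t", "dst:consumed", "?o"),
    ("?o", "dst:partOf", "?x"),
    ("?t", "rdf:type", "eg:OBJ-101"),
    ("?y", "dst:produced_by", "?t"),
    ("?y", "dst:has_format", "?f"),
    ("?f", "dc:format", "image/jpeg"),
]


def is_var(term):
    """Pattern terms starting with '?' are variables."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield each variable binding that satisfies every pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        trial = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if trial.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, trial)


if __name__ == "__main__":
    for row in match(PATTERNS, TRIPLES):
        print(row["?x"], row["?o"], row["?t"], row["?y"])
```

The matcher scans the whole triple list once per pattern per partial binding, which is fine for a sample of this size but is exactly the cost that an indexed or database-backed store would aim to reduce.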

Todo: compare in-memory evaluation with rewriting queries into SQL; compare a generic triple store with optimised RDBMS tables; discuss Fedora-like models mapped into RDF-like object graphs, for cases where methods don't take single arguments.
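
As a rough indication of what the SQL option might look like, here is a sketch using Python's standard `sqlite3` module with a generic three-column triple table. Each triple pattern becomes one alias of the table, and a variable shared between two patterns becomes a join condition. The table layout, the name `build_store`, the query constant and the abbreviated identifiers are all assumptions made for this illustration; an optimised schema with dedicated tables per relationship would look different.

```python
import sqlite3

# A hand-transcribed subset of the sample data, with abbreviated names.
TRIPLES = [
    ("obj-004", "dst:partOf", "obj-001"),
    ("obj-005", "dst:partOf", "obj-001"),
    ("obj-006", "dst:partOf", "obj-001"),
    ("obj-046", "rdf:type", "eg:OBJ-101"),
    ("obj-046", "dst:consumed", "obj-006"),
]


def build_store(triples):
    """Load the triples into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn


# Equivalent of the two patterns (dst::consumed ?t ?o) (dst::partOf ?o ?x):
# the shared variable ?o turns into the join between the two aliases.
CONSUMED_PARTS_SQL = """
SELECT part.object AS x, consumed.object AS o, consumed.subject AS t
FROM triples AS consumed
JOIN triples AS part ON part.subject = consumed.object
WHERE consumed.predicate = 'dst:consumed'
  AND part.predicate = 'dst:partOf'
"""

if __name__ == "__main__":
    store = build_store(TRIPLES)
    for x, o, t in store.execute(CONSUMED_PARTS_SQL):
        print(x, o, t)
```

With a generic table, every additional pattern adds another self-join, so the trade-off against purpose-built tables is largely one of schema flexibility versus join cost.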


Dan Brickley, mailto:danbri@w3.org June 20 2001