Author: Dan Brickley
DSpace aims to create a standard cookbook, software package or hosting service for digital library services. This document outlines some possibilities for querying data within DSpace in a flexible and expressive manner. As an example scenario, it takes a fragment of the data model presented in the DSpace object composition document [tansley-june-2001]. The diagram below describes some objects and relationships relating to DSpace workflow; specifically, it describes a composite object and a format transformation event. We use this as a sample scenario to explore issues relating to (meta)data query.
The relationship between object modelling in this style and the use of 'object oriented' information systems (Java, CORBA IDL, SOAP/WSDL, Fedora) is also of interest here. Mappings can be expressed that relate RDF-like object descriptions to object-based systems; see below.
This structure can be represented as a set of binary relationships using the RDF data model, and serialized as an XML document. Here is one such encoding...
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dst="http://www.w3.org/2001/06/dspace-swad-test#"
xmlns:eg="http://www.w3.org/2001/06/dspace-swad-examples#"
>
<eg:OBJ-DO2 rdf:about="http://dspace.org/id/obj-001">
<dc:description>an object of type obj-do2, supported html bundle type</dc:description>
</eg:OBJ-DO2>
<rdfs:Resource rdf:about="http://dspace.org/id/obj-004">
<dc:description>some html content...</dc:description>
<dst:partOf rdf:resource="http://dspace.org/id/obj-001" />
</rdfs:Resource>
<rdfs:Resource rdf:about="http://dspace.org/id/obj-005">
<dc:description>some kind of table of contents for the object...(?)</dc:description>
<dst:partOf rdf:resource="http://dspace.org/id/obj-001" />
</rdfs:Resource>
<eg:OBJ-101 rdf:about="http://dspace.org/id/obj-046">
<dc:description>a transformation record</dc:description>
<dst:consumed>
<rdfs:Resource rdf:about="http://dspace.org/id/obj-006">
<dc:description>some html content...</dc:description>
<dst:partOf rdf:resource="http://dspace.org/id/obj-001" />
<dst:has_format>
<rdfs:Resource rdf:about="http://dspace.org/id/obj-010">
<dc:format>image/svg</dc:format>
<dc:description>the image/svg image format</dc:description>
</rdfs:Resource>
</dst:has_format>
</rdfs:Resource>
</dst:consumed>
</eg:OBJ-101>
<rdfs:Resource rdf:about="http://dspace.org/id/obj-021">
<dst:has_format>
<rdfs:Resource rdf:about="http://dspace.org/id/obj-012">
<dc:format>image/jpeg</dc:format>
</rdfs:Resource>
</dst:has_format>
<dst:produced_by rdf:resource="http://dspace.org/id/obj-046" />
</rdfs:Resource>
</rdf:RDF>
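The RDF/XML above reduces to a flat set of (subject, predicate, object) statements, which is what "a set of binary relationships" amounts to in practice. The following is a minimal Python sketch of that view, not part of the DSpace or Squish software. The triples are a hand-transcribed subset of the sample data, the prefixed names (`dst:`, `dc:`, `rdf:`, `eg:`) stand in for the full namespace URIs, and the helper `parts_of` is a hypothetical name chosen for illustration.

```python
# Hand-transcribed subset of the statements in the RDF/XML sample.
# Prefixed names are abbreviations for the full namespace URIs (an
# assumption made here for readability).
ID = "http://dspace.org/id/"

triples = [
    (ID + "obj-001", "rdf:type", "eg:OBJ-DO2"),
    (ID + "obj-004", "dst:partOf", ID + "obj-001"),
    (ID + "obj-005", "dst:partOf", ID + "obj-001"),
    (ID + "obj-006", "dst:partOf", ID + "obj-001"),
    (ID + "obj-046", "rdf:type", "eg:OBJ-101"),
    (ID + "obj-046", "dst:consumed", ID + "obj-006"),
    (ID + "obj-006", "dst:has_format", ID + "obj-010"),
    (ID + "obj-010", "dc:format", "image/svg"),
    (ID + "obj-021", "dst:has_format", ID + "obj-012"),
    (ID + "obj-012", "dc:format", "image/jpeg"),
    (ID + "obj-021", "dst:produced_by", ID + "obj-046"),
]


def parts_of(whole):
    """Return, in sorted order, the subjects that are dst:partOf `whole`."""
    return sorted(s for (s, p, o) in triples if p == "dst:partOf" and o == whole)


if __name__ == "__main__":
    for part in parts_of(ID + "obj-001"):
        print(part)
```

Each nested element in the XML contributes one such statement, so the nesting of the serialization carries no extra meaning beyond the triples themselves.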
#!/bin/perl
#
# Squish/Algae test script
# usage: ./sqtest.pl --data=../samples/data2.rdf
#
BEGIN {unshift @INC,('../../..','../..','..','/home/pldab/working/','/home/pldab/working/rudolf-perl/');}
use RDF::RDFWeb::SquishAlgae;
use strict;
my $datafile = "/home/pldab/working/rudolf-perl/samples/data.rdf";
print "Sample data: $datafile\n";
my $q = new RDF::RDFWeb::SquishAlgae;
my $q2 = new RDF::RDFWeb::SquishAlgae;
my $RDF='http://www.w3.org/1999/02/22-rdf-syntax-ns#';
my $DC= 'http://purl.org/dc/elements/1.1/';
my $DST ="http://www.w3.org/2001/06/dspace-swad-test#";
my $EG ="http://www.w3.org/2001/06/dspace-swad-examples#";
my $test1 = './test1-ds.rdf';
#########################################################################
# Query 1 (SquishQL): every resource ?x together with its rdf:type ?y
my $iq = "SELECT ?x ?y FROM $test1 WHERE
(rdf::type ?x ?y)
USING rdf as $RDF AND dc as $DC";
my @i = $q2->doalgae( $test1, $q->squish2algae( $iq ) );
print "\n\nQuery results:\n";
foreach my $hit (@i) {
    printf( "object: %s type: %s\n\n", $hit->{0}, $hit->{1} );
}
#########################################################################
# Query 2 (SquishQL): every resource ?x that is dst:partOf some resource ?y
my $pq1 = "SELECT ?x ?y FROM $test1 WHERE
(dst::partOf ?x ?y)
USING rdf as $RDF AND dc as $DC AND dst AS $DST";
# reuse @i rather than redeclaring it with 'my' in the same scope
@i = $q->doalgae( $test1, $q->squish2algae( $pq1 ) );
print "\n\nQuery results:\n";
foreach my $hit (@i) {
    printf( "object: %s is part of...: %s \n\n", $hit->{0}, $hit->{1} );
}
#########################################################################
# from figure 8. of 'an object model from dspace', robert tansley 11 june 2001
#
# find objects ?x, with their rdf:type and dc:description, where there is some
# object ?o which is dst:partOf ?x and which is consumed by an object ?t of
# rdf:type eg:OBJ-101 (the transformation record type), and where there is
# another resource ?y which dst:has_format 'image/jpeg' and was
# dst:produced_by that transformation
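# ?type is kept distinct from ?t (the transformation record); reusing ?t here
# would require the type of ?x to be the record itself, which never matches
my $complex1="
SELECT ?x ?o ?t WHERE
(dst::consumed ?t ?o)
(dst::partOf ?o ?x)
(rdf::type ?x ?type)
USING rdf as $RDF AND dc as $DC AND dst AS $DST";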
# more constraints...
# (dc::description ?x ?d)
# (dst::produced_by ?y ?t)
@i = $q->doalgae( $test1, $q->squish2algae( $complex1 ) );
print "\n\n **notfinished** complex search results:\n";
foreach my $hit (@i) {
    # SELECT order is ?x ?o ?t, and ?o is the part while ?x is the whole
    printf( "object: %s is part of...: %s \n\n", $hit->{1}, $hit->{0} );
}
These examples use the 'Squish' prototype RDF query language, and have been tested using the W3C Perllib parser and RDF query system.
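The Squish queries above are conjunctions of triple patterns that share variables. As a rough illustration of how such a query could be evaluated in memory, here is a minimal Python sketch. It is not the Perl implementation used above; the function name `match`, the shortened identifiers and the choice of constraints in `QUERY` are assumptions made for this example. Note that the tuples are ordered subject, predicate, object, whereas Squish writes the predicate first.

```python
# Statements as (subject, predicate, object); identifiers are shortened
# stand-ins for the full URIs used in the sample data.
TRIPLES = [
    ("obj-004", "dst:partOf", "obj-001"),
    ("obj-005", "dst:partOf", "obj-001"),
    ("obj-006", "dst:partOf", "obj-001"),
    ("obj-046", "rdf:type", "eg:OBJ-101"),
    ("obj-046", "dst:consumed", "obj-006"),
    ("obj-021", "dst:produced_by", "obj-046"),
]


def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every (s, p, o) pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            # Carry the extended binding into the remaining patterns.
            yield from match(rest, triples, new)


# A possible completion of the transformation query, written as patterns.
QUERY = [
    ("?t", "dst:consumed", "?o"),
    ("?o", "dst:partOf", "?x"),
    ("?t", "rdf:type", "eg:OBJ-101"),
    ("?y", "dst:produced_by", "?t"),
]

if __name__ == "__main__":
    for hit in match(QUERY, TRIPLES):
        print(hit)
```

This nested-loop approach rescans the whole statement list for each pattern, which is acceptable for small samples but is the kind of cost that motivates indexing or a database-backed store.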
Todo: compare in-memory evaluation with rewriting queries into SQL; compare a generic triple store with optimised RDBMS tables; discuss Fedora-like models mapped into RDF-like object graphs, for cases where methods don't take single arguments.
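One of the open questions above, rewriting a query into SQL over a generic triple table, can be sketched as follows. This is an illustration under stated assumptions only: the single `triples(s, p, o)` table, the helper names `patterns_to_sql` and `run_query`, and the use of SQLite are hypothetical choices and not part of DSpace or the Squish software. Each triple pattern becomes one alias of the table, constants become parameterised equality tests, and a repeated variable becomes a join condition.

```python
import sqlite3

# Shortened stand-ins for the statements in the sample data.
TRIPLES = [
    ("obj-004", "dst:partOf", "obj-001"),
    ("obj-006", "dst:partOf", "obj-001"),
    ("obj-046", "rdf:type", "eg:OBJ-101"),
    ("obj-046", "dst:consumed", "obj-006"),
]


def patterns_to_sql(patterns, select):
    """Rewrite (s, p, o) patterns into one SELECT with a self-join per pattern."""
    first_seen = {}  # variable -> column where it was first bound, e.g. "t0.s"
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: join against its first occurrence.
                    where.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
            else:
                # Constant term: parameterised equality test.
                where.append(f"{ref} = ?")
                params.append(term)
    cols = ", ".join(first_seen[v] for v in select)
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {cols} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


def run_query(patterns, select, triples=TRIPLES):
    """Load the triples into an in-memory SQLite table and run the rewritten query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    sql, params = patterns_to_sql(patterns, select)
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows


QUERY = [
    ("?t", "dst:consumed", "?o"),
    ("?o", "dst:partOf", "?x"),
]

if __name__ == "__main__":
    print(patterns_to_sql(QUERY, ["?x", "?o", "?t"])[0])
    print(run_query(QUERY, ["?x", "?o", "?t"]))
```

A generic table like this handles any vocabulary without schema changes; tables specialised per property or per object type would trade that flexibility for fewer self-joins.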