Feature:DefaultDescribeResult

From SPARQL Working Group
Revision as of 12:59, 17 April 2009 by Aseaborne (Talk | contribs)



Feature: Default DESCRIBE Query Result

Feature description

SPARQL currently leaves undefined how a processor should generate the results of a DESCRIBE query. Several algorithms could be applied, and each is useful in different contexts. However, for interoperability, and to make DESCRIBE queries more useful, there should be a default algorithm that implementations are expected to support.

This default result should be based on the idea that one uses a DESCRIBE query to say "give me all you know about </foo>".

Example

Considering this simple query:


DESCRIBE ?x WHERE {
  ?x a foaf:Person.
}

With this in mind, it may be interesting to return

<#me> a foaf:Person ;
      foaf:name "Kjetil Kjernsmo" ;
      bio:olb "RDF Geek"@en ;
      foaf:knows <#otherdude> .
foaf:name rdfs:label "Name"@en .
foaf:knows rdfs:label "knows"@en .
bio:olb rdfs:label "Short biography"@en .

This would be sufficient to create a human-readable presentation of my data.
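
The example above can be read as a two-step procedure: collect the triples that have the described resource as subject, then add an rdfs:label triple for each property used in them. The following is a minimal sketch in Python, assuming triples are plain tuples of strings held in an in-memory set. The function name describe_with_labels, the data layout and the ex:unused property are illustrative assumptions, not part of this proposal or of any SPARQL implementation.

```python
RDFS_LABEL = "rdfs:label"


def describe_with_labels(graph, resource):
    """Return the triples about `resource` plus labels of the properties used."""
    # Step 1: every triple that has the resource as its subject.
    about = {t for t in graph if t[0] == resource}
    # Step 2: the properties (predicates) that occur in those triples.
    properties = {p for (_, p, _) in about}
    # Step 3: any rdfs:label triples the graph holds for those properties.
    labels = {t for t in graph if t[0] in properties and t[1] == RDFS_LABEL}
    return about | labels


# Sample data mirroring the example; ex:unused is a hypothetical property
# added only to show that labels of unrelated properties are left out.
graph = {
    ("<#me>", "rdf:type", "foaf:Person"),
    ("<#me>", "foaf:name", '"Kjetil Kjernsmo"'),
    ("<#me>", "bio:olb", '"RDF Geek"@en'),
    ("<#me>", "foaf:knows", "<#otherdude>"),
    ("foaf:name", "rdfs:label", '"Name"@en'),
    ("foaf:knows", "rdfs:label", '"knows"@en'),
    ("bio:olb", "rdfs:label", '"Short biography"@en'),
    ("ex:unused", "rdfs:label", '"Unused"@en'),
}

result = describe_with_labels(graph, "<#me>")
for triple in sorted(result):
    print(" ".join(triple), ".")
```

In this sketch a label is returned only when the graph actually contains one, which matches the "if available" spirit of the proposal. A real processor would have to decide which label-like properties to include.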


Existing Implementation(s)

Various algorithms are implemented. Typically, implementations return all properties and objects of the subject to be described, and also include the triples of blank nodes reached from it. See also the CBD (Concise Bounded Description) submission and Feature:ControlOfDescribeQueries.
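
The typical behaviour described above can be sketched as follows. This is a minimal Python illustration, loosely in the spirit of a concise bounded description, and not the algorithm of any particular implementation. The tuple representation, the convention that blank node identifiers start with "_:", the function name describe and the sample names are all assumptions made for the example.

```python
def is_blank(term):
    """Assumed convention: blank node identifiers start with '_:'."""
    return term.startswith("_:")


def describe(graph, resource):
    """Collect the triples about `resource`, expanding blank-node objects."""
    result = set()
    pending = [resource]
    seen = set()
    while pending:
        subject = pending.pop()
        if subject in seen:
            continue  # guards against cycles between blank nodes
        seen.add(subject)
        for triple in graph:
            s, _, o = triple
            if s != subject:
                continue
            result.add(triple)
            # A blank node has no name a client could ask about later,
            # so its own triples are pulled into the description as well.
            if is_blank(o):
                pending.append(o)
    return result


# Hypothetical sample data: one friend is a blank node, one has an IRI.
graph = {
    ("<#me>", "foaf:name", '"Kjetil Kjernsmo"'),
    ("<#me>", "foaf:knows", "_:b1"),
    ("_:b1", "foaf:name", '"A friend"'),
    ("<#me>", "foaf:knows", "<#otherdude>"),
    ("<#otherdude>", "foaf:name", '"Other Dude"'),
}

description = describe(graph, "<#me>")
```

Here the triples of _:b1 are part of the description, while those of <#otherdude> are not, since a client can describe a named resource with a further query.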

Existing Specification / Documentation

Compatibility

DESCRIBE is not normatively defined in SPARQL 1.0, which is part of the problem.

Links to postponed Issues

[1]


Related Features

Feature:ControlOfDescribeQueries is somewhat related, but this feature does not depend on it, as the query author need not be able to influence the default result.

Feature:BlankNodeRefs

Champions

User:KjetilK, Computas AS. This suggestion is based on extensive implementation experience with the Sublima system. See also Working Group email from Kjetil.

Use cases

I'm a frequent user of a price comparison site with quite advanced search options. For example, consider this page, which can be used to search for CPUs at the Norwegian site Hardware.no.

However, it isn't quite there, as there are a number of properties that will influence the choice of other components in the system, and just a few characteristics I actually care about.

Thus, I'd like to find a CPU, motherboard and RAM by using a query like this:

DESCRIBE ?cpu ?mobo ?ram WHERE {
  ?cpu a <64-bit-CPU> ;
       ex:freq ?freq ;
       ex:cores ?cores ;
       ex:TTP ?power ;
       ex:FSB ?cpufsb ;
       ex:socket ?socket .
  FILTER (?freq > 2500)
  FILTER (?cores >= 2)
  FILTER (?power <= 65)
  ?mobo a <ATXformatMobo> ;
        ex:socket ?socket ;
        ex:FSB ?mobofsb ;
        ex:ramtype <DDR3RAM> ;
        ex:ramslots 4 ;
        ex:minramcapacity ?ramcap ;
        ex:soundcard true ;
        ex:network true ;
        ex:IEEE1394 true ;
        ex:PCIslots ?pci .
  FILTER (?mobofsb >= ?cpufsb)
  FILTER (?ramcap >= 12)
  FILTER (?pci > 2)
  ?ram a <DDR3RAM> ;
       ex:FSB ?ramfsb ;
       ex:amount 4096 .
  FILTER (?ramfsb = ?cpufsb)
}

Although this is a long query, it is not difficult to imagine something like it being generated by the interface at Hardware.no. Nor is it difficult to see how this could be made even more interesting by using linked data concepts: the information about RAID controllers, which many motherboards claim to have, could be augmented with data from a RAID FAQ that has data not provided by manufacturers. Data on, e.g., Linux compatibility could also be added by linking to third-party endpoints. This is truly the power of linked data, and something that is not possible on today's web.

So, now that the use case is clear, what data would one expect from the endpoint? I would expect to get all the information that is needed to build a sensible user interface with the data retrieved in this query. I therefore think that the query should return not only triples with ?cpu, ?mobo and ?ram as subjects. It should also return, if available, certain triples that would aid the user in understanding the result, e.g. the triple

ex:TTP rdfs:label "Typical Thermal Power"@en .

is something that most users will need to understand the result. If such triples are not present, the user will need to issue further queries to make the original query useful. This is just one example, but I think some effort should be put into thinking about how to make DESCRIBE queries as useful as possible.

There is some heterogeneity in the information available for each of the components, so a SELECT or CONSTRUCT query would be much more complex, as many more variables would have to be listed. This is a very good case for using DESCRIBE queries in general, and an interesting use case for figuring out what DESCRIBE should be good for. --KjetilK 22:38, 22 March 2009 (UTC)
