SparqlEndpointDescription

This page is largely obsolete; most of what is discussed here is addressed by voiD: http://rdfs.org/ns/void-guide

This page collects ideas and proposals for describing SPARQL endpoints. There is a growing number of SPARQL endpoints and of tools that access their data. Endpoint descriptions can be used to announce endpoint capabilities and contents, support discovery through service directories, and supply browsing and federation hints.

The initial version of this page was based on discussions between MaxVölkel, BastianQuilitz, DaveBeckett and RichardCyganiak.

Note: OrriErling proposed another list of properties to be described (SparqlEndpointDescription2).

Retrieving endpoint self-descriptions

HTTP GET <endpointurl>

This is the simplest way to obtain an RDF/XML file that includes all metadata about the "endpointurl". RDF Forms (or something like it) would be a possible format.
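A minimal sketch of this retrieval in Python, assuming the endpoint answers a plain GET on its own URL with RDF/XML when asked through the `Accept` header. The URL `http://example.org/sparql` and the helper name `build_description_request` are hypothetical; the request is only constructed here, not sent.

```python
import urllib.request


def build_description_request(endpoint_url):
    """Build a GET request asking the endpoint URL itself for RDF/XML."""
    return urllib.request.Request(
        endpoint_url,
        headers={"Accept": "application/rdf+xml"},
        method="GET",
    )


# Hypothetical endpoint URL, used for illustration only.
request = build_description_request("http://example.org/sparql")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval; whether a given server returns a description at all is up to that server.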

This is a proposed method for retrieving a self-description from a SPARQL endpoint. To retrieve an RDF graph describing the endpoint, this SPARQL query is submitted to the endpoint:

DESCRIBE <servicename>

where `<servicename>` is a URI representing the service. In the resulting RDF graph, `<servicename>` represents the endpoint. Clients must be aware that the result triples may or may not be part of the regular dataset that is queried by `SELECT`, `CONSTRUCT` and `ASK` queries.
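As a rough illustration, the `DESCRIBE` approach could be driven from a client like this. The sketch assumes the SPARQL protocol's HTTP binding (the query is passed in the `query` parameter) and assumes the service name is the endpoint URL unless stated otherwise. The URL and the helper name `build_describe_request` are hypothetical, and the request is built but not sent.

```python
import urllib.parse
import urllib.request


def build_describe_request(endpoint_url, service_name=None):
    """Build a SPARQL protocol GET request carrying DESCRIBE <servicename>."""
    # Assumption: the service name defaults to the endpoint URL itself.
    service_name = service_name or endpoint_url
    query = "DESCRIBE <%s>" % service_name
    # Append the query parameter, respecting an existing query string.
    separator = "&" if "?" in endpoint_url else "?"
    url = endpoint_url + separator + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})


# Hypothetical endpoint URL, used for illustration only.
describe_request = build_describe_request("http://example.org/sparql")
print(describe_request.full_url)
```

The returned graph may contain triples that are not visible to ordinary queries, so a client should treat it as a separate description rather than as part of the dataset.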

The service name URI should be the service endpoint URL. In situations where this is not feasible (e.g. the endpoint is accessed locally through a Java API and therefore doesn't have an obvious service URL), we need a SPARQL extension:

SELECT SERVICENAME

The result is a SPARQL result with one binding of one variable:

?servicename
-------------
<servicename>

where `<servicename>` is the URI representing the service. Clients can use this extension to retrieve the service name and then submit a `DESCRIBE` query with this URI as an argument.
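The two-step flow can be sketched in Python as follows, assuming the server returns the binding in the SPARQL Query Results XML Format. `SELECT SERVICENAME` is the extension proposed here, not standard SPARQL, so the sample response below is made up for illustration, as are the URL and the helper name `extract_service_name`.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
SPARQL_RESULTS_NS = "{http://www.w3.org/2005/sparql-results#}"


def extract_service_name(results_xml):
    """Return the URI bound to ?servicename, or None if there is none."""
    root = ET.fromstring(results_xml)
    for binding in root.iter(SPARQL_RESULTS_NS + "binding"):
        if binding.get("name") == "servicename":
            uri = binding.find(SPARQL_RESULTS_NS + "uri")
            if uri is not None and uri.text:
                return uri.text.strip()
    return None


# Made-up response to the proposed SELECT SERVICENAME query.
sample = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="servicename"/></head>
  <results>
    <result>
      <binding name="servicename"><uri>http://example.org/sparql</uri></binding>
    </result>
  </results>
</sparql>"""

service_name = extract_service_name(sample)
# Second step: build the DESCRIBE query with the retrieved URI.
describe_query = "DESCRIBE <%s>" % service_name
print(describe_query)
```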

@@@ Issue: Capitalization of query and variable?

Pro

can be implemented on all existing SPARQL servers that support `DESCRIBE` (except the `SELECT SERVICENAME` part, which is not necessary in the common case)
simple implementation
easy to remember
works regardless of protocol because it operates at the query language level

Con

uses a data query to retrieve metadata -- violates separation of concerns
needs language extension for the `SELECT SERVICENAME` part
clients may expect that the description graph's triples are accessible over `SELECT` as well
`DESCRIBE` is icky, in part because it doesn't follow the TAG's recommendations on when to use GET; "Use GET if [...] The interaction is more like a question (i.e., it is a safe operation such as a query, read operation, or lookup)"

Design alternatives

extend SPARQL protocol: <endpoint_url?meta> (but what about non-HTTP bindings? It also can't be faked on existing servers)
HTTP `MGET` on endpoint URL (extending HTTP is heavyweight, and what about non-HTTP bindings? --RC)
`DESCRIBE SERVICE` (nice, but not much nicer than DESCRIBE <url>, and requires extension of QL --RC)
`DESCRIBE <this>` where `<this>` is some “magic” URI (but magic URIs cause trouble when processing with normal RDF tools, e.g. merging several service descriptions)
`DESCRIBE ?x WHERE { ?x rdf:type foo:SparqlEndpoint }` (rather complex, can't mention other endpoints in description)
store description in a special named graph
don't bother with all this and just HTTP GET the description from some URL that may or may not be managed by the server

Vocabularies for endpoint descriptions

The method described above returns an RDF graph containing a resource that is known to represent the endpoint. This section is a collection of things one could say about an endpoint.

Basic metadata

`rdfs:label`, `rdfs:comment`
Dublin Core metadata, e.g. `dc:title`, `dc:description`, `dc:creator`, `dc:publisher`, `dc:rights`
copyright -- `cc:license`
last updated
contents -- `saddle:dataSet`
signatures, SWP warrants (see SWP vocabulary, PDF)
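To make the list above concrete, here is a small sketch that writes a few of these properties as N-Triples. It only uses `rdfs:label` and Dublin Core terms, whose namespace URIs are well known; the endpoint URI, the literal values, and the helper names are hypothetical.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DC = "http://purl.org/dc/elements/1.1/"


def literal_triple(subject, predicate, value):
    """Format one N-Triples line with a plain literal object.

    Simplification: only backslashes and double quotes are escaped.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '<%s> <%s> "%s" .' % (subject, predicate, escaped)


def describe_endpoint(endpoint_uri, label, title, creator):
    """Return basic metadata about an endpoint as N-Triples lines."""
    return [
        literal_triple(endpoint_uri, RDFS + "label", label),
        literal_triple(endpoint_uri, DC + "title", title),
        literal_triple(endpoint_uri, DC + "creator", creator),
    ]


# Hypothetical endpoint and values, used for illustration only.
lines = describe_endpoint(
    "http://example.org/sparql",
    "Example endpoint",
    "Example SPARQL service",
    "Example Maintainer",
)
print("\n".join(lines))
```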

Endpoint capabilities

Kendall's SADDLE stuff
level of inference (RDFS, OWL-DL etc)
supported SPARQL extension functions: `sl:extensionFunctions`
supported non-standard SPARQL extensions (e.g. in Andy Seaborne's extended ARQ query language)
features not implemented, e.g. "I can't do OPTIONAL"
other supported query languages: `saddle:queryLanguage`
supported result formats: `saddle:resultFormat`
Using (maybe data-mined) schemas for this purpose: Describing SPARQL source contents

Issue: capabilities of named graphs might differ from each other, e.g. `:graph2` might be `:graph1` plus inference

Browsing / Rendering hints

for generic SPARQL endpoint browsers/visualizers

known classes and properties; links to schemas/ontologies -- `owl:imports`, `saddle:vocabulary`
Fresnel lenses
class/property instance counts
property selectivities (low-selectivity properties make better facets)
good namespace prefixes
human-readable representation of the endpoint's contents -- `saddle:humanInterface`

Query performance and federation

what kinds of queries can the endpoint answer quickly?
- `saddle:vocabulary`
selectivities and instance counts
- Source Content Descriptions
query federation stuff by BastianQuilitz -- DARQ service descriptions
overall number of triples
schedule/frequency of changes (small stores that don't change often can be cached and queried locally)
mirrors of this store