Warning:
This wiki has been archived and is now read-only.

Feature:PropertyPaths

From SPARQL Working Group
Jump to: navigation, search


Feature: Property Paths

Property paths provide for a compact syntax for paths in the graph.

Feature description

Property paths provide for a compact syntax for paths in the graph. They also extends the expressivity of the pattern language to include transitivity.

Examples

SELECT ?x ?name
{
  ?x foaf:knows+ ?y .
  OPTIONAL { ?y foaf:name ?name }
}

?x follow foaf:knows links, returning the name if found.

SELECT ?x ?t 
{ ?x rdf:type/rdfs:subClassOf* ?t }

All types of ?x, following subclass relationships.

Existing Implementation(s)

ARQ provides property paths.

Virtuoso provides a similar feature in expressions with different syntax. Related extension documented for Virtuoso Transitivity is supported through a different extension (see blog post about SQL transitivity, User's Guide section 10.33. Transitivity in SQL and sction 17.9.10. Transitivity in SPARQL The syntax for "inlined" recursive property paths is implemented as soon as a draft of the paper is available, because it is no more than a syntax sugar for generic transitivity.

Computas has a simple property paths idea in the Sublima-project in which e.g. dct:subject/skos:prefLabel when entered in as a name in a HTML form is used to generate two triples, ?s dct:subject ?var1 . ?var1 skos:prefLabel "Foo".

4Suite included an implementation of a path-based graph theoretic query language for RDF called Versa [1] . Traversal expressions look like so:

 all()-rdfs:label->*

Instead of

 SELECT ?label  
 WHERE  {
     ?resource rdfs:label ?label
 }


[1] http://www.xml.com/pub/a/2005/07/20/versa.html

Existing Specification / Documentation

ARQ documentation

Compatibility

If a path of one is equated to a triple pattern, then all existing queries are unaffected. The SPARQL grammar already makes the "+" part of the numeric token so the use trailing a property path does not conflict with existing queries.

Links to postponed Issues

None.

Related Use Cases/Extensions

Champions

Use Cases

  • From Holger Knublauch: "on-the-fly" inferencing so that hierarchies of sub-classes and properties can be traversed transitively

References

  • SPARQLer
  • PSPARQL
  • the topic of regular property paths was also treated in nSPARQL on a more theoretical level (no implementation known)

Implementation Experience in ARQ

(Material for the telecon of 2009-03-10)

This section reports implementation experience for property paths in ARQ and some possible implications for formal specification in SPARQL. This discussion is specific to ARQ.

The path operators are modeled on regular expressions:

  • "^" Reverse path
  • "/" Sequence of two paths
  • "|" alternation of two possibilities.
  • "*" -- A path of zero or more occurrences
  • "+" -- A path of one or more occurrences
  • "?" -- A path of zero or one
  • {n,m} A path between n and m occurrences of elt. The forms {n}, {,m} and {n,} are similar to the regular expression forms based on {n,m}

Simple and Complex Paths

There are two kinds of property paths that can be useful distinguished. Simple property paths are ones that express queries that can already be written in SPARQL/2008, because they can be expanded into triple patterns. Complex property paths and ones that add exprsesivity not directly available before because paths can be of arbitrary length.

Simple Paths

ARQ converts any simple paths that are equivalent to a basic graph pattern to a basic graph pattern and evaluates the BGP. There is no specific path operator in the algebra after this conversion, which is done during the conversion from query string to algebra expression. New variables are introduced to expand the short form. (An alternative design has been tested which is to treat simple path flattening as a transformation of the algebra into an equivalent algebra expression.)

Example of simple path:

 1 PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
 2 
 3 SELECT  ?x ?name
 4 WHERE
 5   { ?x foaf:mbox <mailto:alice@example> .
 6     ?x foaf:knows/foaf:name ?name
 7   }

Algebra structure:

 1 (base <file:///home/afs/Joseki-3.3.0/>
 2   (prefix ((foaf: <http://xmlns.com/foaf/0.1/>))
 3     (project (?x ?name)
 4       (bgp
 5         (triple ?x foaf:mbox <mailto:alice@example>)
 6         (triple ?x foaf:knows ??P0)
 7         (triple ??P0 foaf:name ?name)
 8       ))))

??P0 is a processor-allocated variable.

Algebra syntax is ARQ's internal representation and is not the product of any stanardization process.

Complex Paths

Changes the the algebra and it's evaluation rules are needed for any path that involve arbitrary length matching.

 1 PREFIX  rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 2 PREFIX  rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
 3 PREFIX  :   <http://example/>
 4 
 5 SELECT ?x ?t
 6 {
 7   ?x rdf:type/rdfs:subcClassOf* :Class1 .
 8   ?x :p ?y .
 9 }

Algebra structure:

 1 (base <file:///home/afs/Joseki-3.3.0/>
 2   (prefix ((: <http://example/>)
 3            (rdfs: <http://www.w3.org/2000/01/rdf-schema#>)
 4            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
 5     (project (?x ?t)
 6       (sequence
 7         (bgp (triple ?x rdf:type ??P0)
 8         )
 9         (path ??P0 (mod 0 * rdfs:subcClassOf) :Class1)
10         (bgp (triple ?x :p ?y)
11         )))))

sequence is an evaluation operation taking a list of sub-expressions. Intermediate results are streamed from one sub-expression to the next; earlier mention variables are in scope at each latter point.

Evaluation Notes

Current path evaluation in ARQ is separate from BGP-evaluation because BGP matching is an extension both for SPARQL (for other entailment regimes) and for ARQ (for different storage implementations). This could be changed to make the path part of a basic graph pattern, to repserver blank nodesa as non-distinguished variables in higher levels of inference.

Path evaluation yields a set of bindings of variables (excluding system-generated variables). If there are two or more paths, from <a> to <b>.

Cycles are premitted and included in matches.

Triple patterns are equivalent to a path of length exactly one.

Issues

This section highlights some discussion points (this section is not exhaustive):

Issue 1 Do we allow variables in a path (c.f. PSPARQL)? What happens if it is allowed in complex path segements? Not currently supported in ARQ.

Issue 2 Do we allow the path itself to become a datatype (c.f. SPARQLer)? This would allows a filter to restrict paths in more complex ways.

Issue 3 How does this interact with inference when SPARQL is used with systems carrying out inference as part of query processing?

Issue 4 Can this feature support the ability for results to be returned in the order they are traversed in the path? For example, if property is defined using owl:TransitiveProperty can the results be ordered in a transitive sequence?

References