Experience with N3 rules

Tim Berners-Lee, Dan Connolly, Eric Prud'hommeaux, Yosi Scharf

MIT/CSAIL Decentralized Information Group

Abstract

This short paper summarizes experience at MIT/CSAIL in developing and using Notation3 (N3) as a language for RDF and as a rules language for the Semantic Web. N3 was developed as a simple syntax for RDF. Then, to make a rule language, graph literals and variables were added to N3. RDF properties were then introduced to express rules, web access, and built-in functions.

This paper is provided as input to the W3C workshop on rules languages for the semantic web. A more elaborate introduction is provided by the N3 Tutorial [Ber03].

Introduction

A universal data language: RDF

The semantic web operates at the data level by (a) considering the semantics of any existing data and representing it as a graph of typed binary relationship arcs between items; and (b) using URIs to identify items, including the types of arcs. This is the RDF data model, and the RDF specifications provide a serialization format for such information in XML. [RDFC04]

A human-readable syntax for RDF: N3

From whiteboards to chat channels, it was found useful to have a minimalist syntax for jotting down and reading RDF. Notation3 is a language using conventional Unix-style punctuation, which is easier both to write and to read than the RDF/XML syntax.

ex:c1 rdf:type ex:Car;
      ex:licensedYear  2002, 2003, 2004;
      ex:color  "green".
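The `;` and `,` punctuation in the example above is shorthand: `;` repeats the subject, and `,` repeats both the subject and the predicate. As a rough illustration in plain Python (not cwm's parser; the helper name `expand` is hypothetical and the input is built by hand rather than parsed), the statement corresponds to five triples:

```python
# Illustrative only: a hand-built structure standing in for the parsed N3
# statement above. `expand` is a hypothetical helper, not part of cwm.

def expand(subject, predicate_objects):
    """Expand one subject with (predicate, [objects]) pairs into triples.

    Mirrors N3 shorthand: ';' repeats the subject, ',' repeats the
    subject and the predicate.
    """
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects
        for obj in objects
    ]

triples = expand(
    "ex:c1",
    [
        ("rdf:type", ["ex:Car"]),
        ("ex:licensedYear", [2002, 2003, 2004]),
        ("ex:color", ["green"]),
    ],
)

for t in triples:
    print(t)
```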

Design Points

Adding graphs to N3: {}

Various forms of literal value are allowed in RDF graphs; however, the RDF standard does not itself provide for another RDF graph to be a data value. Remedying this allows one to express relationships between graphs, for example that a given graph is the RDF content of a particular document. Agents on the semantic web need to be aware of where data has come from and where it is allowed to go, which raises a need to be able to talk about graphs explicitly.

ex:Joe  ex:said   {  ex:c1 ex:color "charcoal" }.

Adding variables: ?x

In its blank nodes (items in the graph not directly identified by a URI) an RDF graph has a form of existential variable. Extending the language to allow variables existentially or universally quantified over a graph allows N3 to be used for a form of logic. The initial motivation for this in N3 was that, given variables, a rule is just a relation between two graphs.

Variables are defined such that when substitution occurs in a graph, it also occurs in any nested graph.
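A minimal sketch of this substitution behaviour, assuming a toy representation that is not cwm's: a graph is a frozenset of (subject, predicate, object) tuples, a nested graph is a frozenset appearing as a term, and any string starting with `?` is a variable. The function name `substitute` is hypothetical.

```python
# Illustrative only: graphs are frozensets of (s, p, o) triples, a nested
# graph is simply a frozenset appearing as a term, and variables are
# strings starting with '?'. None of these names come from cwm.

def substitute(term, bindings):
    """Replace variables in a term, descending into nested graphs."""
    if isinstance(term, frozenset):  # a graph literal { ... }
        return frozenset(
            tuple(substitute(part, bindings) for part in triple)
            for triple in term
        )
    if isinstance(term, str) and term.startswith("?"):
        return bindings.get(term, term)  # unbound variables are left alone
    return term

# { ex:Joe ex:said { ?car ex:color ?c } }
inner = frozenset({("?car", "ex:color", "?c")})
outer = frozenset({("ex:Joe", "ex:said", inner)})

# Substituting in the outer graph also rewrites the nested graph.
result = substitute(outer, {"?car": "ex:c1", "?c": "charcoal"})
print(result)
```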

Making rules: log:implies

In the <http://www.w3.org/2000/10/swap/log#> namespace, here given the prefix log:, the log:implies property expresses a rule, its subject being the antecedent graph, and the object being the consequent graph. The shorthand => may be used for log:implies.

{ ?x fam:brother ?y. ?y fam:son ?z } =>  { ?x fam:nephew ?z }.

The N3 rule engine built by the authors, cwm, is a crude forward chaining reasoner operating with such rules. Rules may have full N3, even with nested graphs, on both sides of the implication. This gives a form of completeness as rules can generate rules. When used as a rule language on RDF alone, N3 can of course be constrained so that there is no nesting of graphs.
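To make the idea of forward chaining concrete, here is a naive sketch in plain Python. It is not cwm's algorithm: it handles only flat graphs (no nesting), represents triples as tuples, treats strings starting with `?` as variables, and simply reapplies every rule until nothing new is produced. All function names and the sample family data are hypothetical; the rule has the same shape as a family rule relating a brother's son to a nephew.

```python
# Illustrative only: a naive forward chainer over sets of triples.
# Variables are strings starting with '?'. This is not cwm's algorithm.

def match(pattern, triple, bindings):
    """Unify one pattern triple with one fact; return new bindings or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            if new.setdefault(p, t) != t:  # already bound to something else
                return None
        elif p != t:
            return None
    return new

def solutions(patterns, facts, bindings=None):
    """Yield every set of bindings satisfying all patterns against the facts."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    for fact in facts:
        new = match(patterns[0], fact, bindings)
        if new is not None:
            yield from solutions(patterns[1:], facts, new)

def forward_chain(facts, rules):
    """Apply rules until no new triples appear (a fixpoint)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            # Collect all matches first so the fact set is not mutated
            # while it is being iterated.
            for b in list(solutions(antecedent, facts)):
                for triple in consequent:
                    new = tuple(b.get(term, term) for term in triple)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts

rule = (
    [("?x", "fam:brother", "?y"), ("?y", "fam:son", "?z")],  # antecedent
    [("?x", "fam:nephew", "?z")],                            # consequent
)
data = {
    ("ex:alice", "fam:brother", "ex:bob"),
    ("ex:bob", "fam:son", "ex:carl"),
}
closure = forward_chain(data, [rule])
```

Because a rule is itself just a pair of graphs, an engine that supports nested graphs can treat rules as data; this sketch deliberately leaves that out.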

Built-in Functions and operators: Use RDF Properties

The fact that the rule language and the data language are the same gives a certain simplicity (there is only one syntax) and completeness (rules can operate on themselves, anything written in N3 can be queried in N3). This would be broken if a special syntax were added for built-in functions and operators. Instead, these are simply represented as RDF properties. The cwm engine, when analyzing a rule prior to running it, treats specially those properties it knows as calculable functions which occur in the antecedent.

{ ex:d test:point ?x.  ?x math:sin ?y } =>  {...}

In the wide range of applications we hope to see deployed across the Semantic Web, it is expected that different engines will be capable of implementing different sets of functions. One can also expect to be able to dynamically load software to implement new functions. In addition, cwm can be told that a particular property is defined by a particular remote document or remote service. This means that the treatment of a property can change dynamically as it becomes calculable. The boundary between "built-in" functions and other properties is not well defined. All this speaks against built-in functions being brought out as special syntax, and supports the use of RDF properties for them.
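One way to picture a property "becoming calculable" is a lookup table that an engine consults before falling back to ordinary triple matching. The following is a sketch under that assumption only; `BUILTINS`, `register` and `evaluate` are hypothetical names and do not describe cwm's internals.

```python
import math

# Illustrative only: a registry mapping property names to Python callables.
# The names BUILTINS, register and evaluate are hypothetical, not cwm's API.
BUILTINS = {"math:sin": math.sin}

def register(prop, func):
    """Make a property calculable, e.g. after loading new code at run time."""
    BUILTINS[prop] = func

def evaluate(subject, prop):
    """Return the computed object for a calculable property, else None.

    None signals that the property is ordinary data and must be matched
    against stored triples instead.
    """
    func = BUILTINS.get(prop)
    return None if func is None else func(subject)

before = evaluate(0.0, "math:cos")   # not yet calculable
register("math:cos", math.cos)
after = evaluate(0.0, "math:cos")    # now calculable
```

Nothing in the rule syntax changes when a property moves from one category to the other, which is the point of representing functions as properties.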

N-ary functions: Use argument lists

Using properties as built-in functions raises the common question in RDF of how n-ary functions are represented. The choice taken in cwm was to use RDF lists (collections) to group the multiple arguments to a function:

{   ?x a ex:TestData.
    ( ?x 1 ) math:sum ?y.
    (  ?y  " is one more than " ?x ) string:concatenation ?s
} => { ?s a ex:Result }.

This may require [BP] attributing more tuple-like semantics to lists than they come with out of the RDF box.
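The tuple-like reading of argument lists can be sketched in Python by modelling an RDF list as a tuple that is passed whole to the function. This is an illustration only; `LIST_BUILTINS` and `apply_builtin` are hypothetical names, and the sample value of `x` is made up.

```python
# Illustrative only: built-ins whose subject is an RDF list, modelled here
# as a Python tuple. LIST_BUILTINS is a hypothetical name, not cwm's API.
LIST_BUILTINS = {
    "math:sum": lambda args: sum(args),
    "string:concatenation": lambda args: "".join(str(a) for a in args),
}

def apply_builtin(args, prop):
    """Evaluate `( args... ) prop ?result` and return the result."""
    return LIST_BUILTINS[prop](tuple(args))

x = 4                                        # stands in for a ?x binding
y = apply_builtin((x, 1), "math:sum")        # ( ?x 1 ) math:sum ?y
s = apply_builtin((y, " is one more than ", x), "string:concatenation")
print(s)
```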

Functions using graphs

The built-in function log:semantics accesses a resource, retrieves a representation of it, parses that and returns the graph. (Currently, cwm will parse RDF/XML, and N3 and its subsets; GRDDL may be added later.)

Another function, log:includes, checks whether one graph is a subset of the other. Together, these allow rules to access the web, and to objectively check the contents of documents, without having to load them and believe everything they say. In this example the master.rdf file is checked to see what it says is an order, and those orders are checked to see what items they mention. At no time is either file trusted for any other information.

@forAll v:DOC, v:G1, v:ORDER, v:y.
{   <master.rdf>  log:semantics v:G1.
    v:G1 log:includes   {  v:DOC a biz:CustomerOrder }.
    v:DOC  log:semantics  v:ORDER.
    v:ORDER log:includes  { []  biz:item v:y }.
} => {
    v:DOC ex:orderItem v:y
}.
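The pattern of "fetch, parse, then query without believing" can be sketched in Python as follows. This is an assumption-laden toy: documents live in an in-memory dict instead of on the web, graphs are sets of tuples, and variables (including the blank node in the pattern) are strings starting with `?`. The names `DOCUMENTS`, `semantics` and `includes`, and the sample data, are hypothetical.

```python
# Illustrative only: documents are modelled as an in-memory dict rather than
# fetched from the web, and graphs are sets of (s, p, o) triples. Variables
# (and blank nodes in patterns) are strings starting with '?'.

DOCUMENTS = {  # hypothetical stand-in for the web
    "master.rdf": {("order1.rdf", "rdf:type", "biz:CustomerOrder"),
                   ("master.rdf", "ex:mood", "ex:Cheerful")},
    "order1.rdf": {("_:o", "biz:item", "ex:widget"),
                   ("_:o", "biz:item", "ex:gadget")},
}

def semantics(doc):
    """Stand-in for log:semantics: return the graph parsed from a document."""
    return DOCUMENTS[doc]

def includes(graph, patterns, bindings=None):
    """Stand-in for log:includes: yield bindings under which every pattern
    triple is found in the graph."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        new = dict(bindings)
        for p, t in zip(first, triple):
            if p.startswith("?"):
                if new.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:  # every position matched
            yield from includes(graph, rest, new)

# Only the conclusions are asserted; the fetched graphs are never merged
# into the local store.
order_items = set()
master = semantics("master.rdf")
for b in includes(master, [("?doc", "rdf:type", "biz:CustomerOrder")]):
    order = semantics(b["?doc"])
    for b2 in includes(order, [("?anything", "biz:item", "?y")]):
        order_items.add((b["?doc"], "ex:orderItem", b2["?y"]))
```

Note that the unrelated `ex:mood` statement in the master document never reaches `order_items`: the documents are quoted and inspected, not trusted.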

Defaults: Use explicit domain

Whereas some datasets (such as a list of members of a club) are definitively complete, others (such as a set of temperature measurements) are not: one never knows when another measurement may come to light. This aspect of the semantic web makes negation as failure meaningless unless it is associated with a specific dataset.

Just as RDF statements on the semantic web are reusable by other parties, and combinable with others to make larger applications, so also it is a design goal for semantic web rules that they be reusable in a similar way.

The effect of a default with an explicit domain is achieved with log:notIncludes, the negation of log:includes. In the example below, if an order has an item which is a car, and the order doesn't say that the car has some color, then the car is black.

{    <thisOrder.rdf>  log:semantics ?ORDER.
     ?ORDER  log:includes    { ?x  biz:item ?y. ?y a ex:Car }.
     ?ORDER  log:notIncludes { ?y  ex:color [] }
} => {
     ?y ex:color "black"
}.

The semantics of this are a great improvement on negation as failure with an undisclosed domain, but the syntax is clumsy, and syntactic sugar could be investigated.
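The difference between scoped and unscoped negation can be illustrated with a small Python sketch. It assumes a toy graph of tuples in which `None` in a pattern matches anything; `includes`, `not_includes` and the sample order are hypothetical. The negation is evaluated against one named graph only, so later data elsewhere cannot silently invalidate the conclusion.

```python
# Illustrative only: scoped negation over one explicitly named graph.
# Graphs are sets of (s, p, o) triples; None in a pattern matches anything.

def includes(graph, pattern):
    """True if some triple in the graph matches the pattern."""
    return any(
        all(p is None or p == t for p, t in zip(pattern, triple))
        for triple in graph
    )

def not_includes(graph, pattern):
    """Stand-in for log:notIncludes: negation scoped to this graph only."""
    return not includes(graph, pattern)

order = {  # hypothetical contents of a single order document
    ("_:o", "biz:item", "ex:c1"),
    ("ex:c1", "rdf:type", "ex:Car"),
    ("_:o", "biz:item", "ex:c2"),
    ("ex:c2", "rdf:type", "ex:Car"),
    ("ex:c2", "ex:color", "red"),
}

defaults = set()
for (_, p, item) in order:
    if p == "biz:item" and includes(order, (item, "rdf:type", "ex:Car")):
        # The order itself says nothing about this car's color.
        if not_includes(order, (item, "ex:color", None)):
            defaults.add((item, "ex:color", "black"))
```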

Operator syntax

The authors have considered introducing binary operator syntax as syntactic sugar. This would be a general extension to the N3 syntax. The need for it has not been sufficiently acute to date to merit increasing the complexity of the language.

Other Implementations

The language outlined here has been used as a data language by many implementations. It has also been used as a rule language by Euler [DR05], a backward-chaining reasoner, and Pychinko [Par05], a Rete-based rule engine. The fact that the rule language has been used in fairly different engines is encouraging.

Subsets of N3 have been published as N-Triples [RDFT04] and Turtle [Beck04].

The N3 extensions to RDF have also been used to represent queries and patches. [Ber04]

Acknowledgements

N3 was developed with much discussion with Jos de Roo and Sean Palmer and others in the RDF Interest Group, now the Semantic Web Interest Group. Thanks to everyone involved.

References

[Ber03]: Berners-Lee, T., Hawke, S. and Connolly, D. Semantic Web Tutorial Using N3, Twelfth International World Wide Web Conference, Budapest, Hungary, May 2003.
[RDFC04]: Klyne, G. and Carroll, J. J. Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C Recommendation, 10 February 2004.
Latest version available at http://www.w3.org/TR/rdf-concepts/
[RDFT04]: Grant, J. and Beckett, D. RDF Test Cases, W3C Recommendation, 10 February 2004.
Latest version available at http://www.w3.org/TR/rdf-testcases
[Beck04]: Beckett, D. Turtle - Terse RDF Triple Language, work in progress, 2004.
[Ber04]: Berners-Lee, T. and Connolly, D. Delta: an ontology for the distribution of differences between RDF graphs, work in progress, 2004.
[BP]: Parsia, B. Private communication.
[DR05]: De Roo, J. Euler proof mechanism, work in progress, 2005.
[Par05]: Parsia, B., Katz, Y. and Clark, K. Pychinko: Rete-based RDF friendly rule engine, work in progress, 2005.