RDF bNode Semantics and Interoperability with Rules Languages
An RDF triple has the form (s,p,o), where s and o may be bNodes. "bNode" stands for "blank node", which refers to the fact that the corresponding nodes in the RDF graph are "blank", i.e., have no label.
On this page we first review bNodes and the semantics of bNodes in RDF, identify the problems in combining the RDF bNode semantics with rules languages, and propose a number of possible ways of dealing with the issue.
Blank Nodes in RDF
bNodes serve the purpose of representing resources which cannot be named at present. For example, bNodes allow us to represent something like "somebody whose name is 'John' and who is 25 years old". This can be represented using the following triples:
(_:X, name, "John")
(_:X, age, "25"^^xsd:decimal)
In this particular notation, blank nodes always start with '_:', followed by a name to identify the blank node, so that the same blank node can be referenced in several triples of the same graph (a graph is a set of triples).
Now, bNodes are not only used to represent resources which cannot be named, but also to overcome some syntactical limitations of RDF and to facilitate the encoding of other languages (e.g., OWL DL) in RDF.
Take the triple (_:X, name, "John"). By the RDFS semantic conditions [1], every plain literal must be in the class extension of rdfs:Literal, which means that <"John", rdfs:Literal> must be a tuple in the relation associated with rdf:type. One might therefore expect a graph consisting of the triple (_:X, name, "John") to entail the triple ("John", rdf:type, rdfs:Literal). However, this is not a valid RDF triple, since RDF does not allow literals in the subject position. The RDFS semantics therefore approximates this triple by using a blank node. We can illustrate this as follows:
- (_:X, name, "John")
- rdfs-entails
(_:X, name, "John")
(_:X, name, _:Y)
(_:Y, rdf:type, rdfs:Literal)
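The approximation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples of strings, a plain literal is any term wrapped in double quotes, and the function name allocate_literal_bnodes and the bNode names _:lit1, _:lit2, ... are invented for this sketch rather than taken from [1].

```python
from itertools import count

def is_plain_literal(term):
    # Assumption of this sketch: plain literals are written as "..." strings.
    return term.startswith('"') and term.endswith('"')

def allocate_literal_bnodes(graph):
    """For each plain literal in object position, add a triple in which a bNode
    stands in for the literal, and type that bNode as rdfs:Literal."""
    fresh = count(1)
    allocated = {}  # literal -> bNode allocated to it
    result = set(graph)
    for s, p, o in sorted(graph):
        if is_plain_literal(o):
            if o not in allocated:
                allocated[o] = f"_:lit{next(fresh)}"
            b = allocated[o]
            result.add((s, p, b))
            result.add((b, "rdf:type", "rdfs:Literal"))
    return result

g = {("_:X", "name", '"John"')}
closure = allocate_literal_bnodes(g)
```

Each literal gets one bNode that is reused wherever the literal occurs, so the typing triple refers to the same stand-in as the generalized triple.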
bNodes can be introduced in this way not only for literals, but for any kind of resource:
- (s,p,o)
- rdf-entails
(s,p,o)
(_:X, p, o)
(s, p, _:Y)
(_:X1, p, _:X2)
Note also that the names of bNodes are not important. For example:
- (_:X, property, object)
- rdf-entails
(_:X1, property, object)
(_:X2, property, object)
(_:X3, property, object)
(_:X1, property, _:Y1)
(_:SomeBNode, property, _:SomeOtherBNode)
In fact, bNodes can be interpreted as existentially quantified variables in a first-order logical language.
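The entailments shown in this section can be checked mechanically. The sketch below assumes that a graph G entails a graph E (ignoring RDF and RDFS vocabulary) exactly when the bNodes of E can be mapped to terms of G so that every triple of E then occurs in G. The function names and the tuple-of-strings encoding are choices made for this illustration only.

```python
def is_bnode(term):
    return term.startswith("_:")

def simply_entails(g, e):
    """True if some mapping of e's bNodes to terms of g turns e into a subset of g."""
    e_triples = sorted(e)

    def match(pattern, target, binding):
        # Try to extend the binding so that pattern becomes equal to target.
        new = dict(binding)
        for pt, tt in zip(pattern, target):
            if is_bnode(pt):
                if new.setdefault(pt, tt) != tt:
                    return None  # bNode already bound to a different term
            elif pt != tt:
                return None      # names and literals must match exactly
        return new

    def search(i, binding):
        if i == len(e_triples):
            return True
        for target in g:
            extended = match(e_triples[i], target, binding)
            if extended is not None and search(i + 1, extended):
                return True
        return False

    return search(0, {})

print(simply_entails({("s", "p", "o")}, {("_:X1", "p", "_:X2")}))   # True
print(simply_entails({("_:X", "property", "object")},
                     {("_:SomeBNode", "property", "_:SomeOtherBNode")}))  # True
print(simply_entails({("_:X", "p", "o")}, {("s", "p", "o")}))       # False
```

The last call shows that the direction matters: a graph that only says "something has p o" does not entail that the named resource s does.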
RDF bNodes in RIF
Although we recognize that there is an ongoing debate on how RDF triples are best represented using predicate formulas (http://lists.w3.org/Archives/Public/public-rif-wg/2006Jan/0000.html), we choose, for convenience, to represent each triple (s,p,o) using a ternary predicate triple(s,p,o).
For the bNode issue itself it does not matter whether triples are represented as ternary or binary (e.g., p(s,o)) predicates. However, for the representation of certain SPARQL query patterns, namely patterns which contain a variable in place of the predicate name, the binary representation brings additional challenges, because it requires a higher-order syntax (see also Higher-Orderness).
bNodes are represented by existentially quantified variables and an RDF graph corresponds to the existential closure of a conjunction of atomic formulae of the form triple(s,p,o).
For example, the RDF graph:
(_:X, name, "John")
(_:X, age, "25"^^xsd:decimal)
can be represented as:
exists ?x (triple(?x,name,"John") and triple(?x,age,"25"^^xsd:decimal)).
Although this type of formula is perfectly valid first-order logic, it is not Horn logic (see also Horn_Rules_Semantics) and thus falls outside the domain of traditional rule languages.
Traditional rule languages based on Horn logic and the Herbrand model semantics cannot deal with unknown individuals introduced by existentially quantified variables in the heads of rules. Since RDF triples can be seen as facts, i.e., rules without a body, a bNode in a triple amounts to exactly such a variable in a rule head.
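The translation from a graph to its existential closure is mechanical, as the following sketch suggests. It assumes triples are given as a list of string tuples, that bNodes are written _:name, and that a typed literal is written in the usual "25"^^xsd:decimal form. The function name to_formula is hypothetical.

```python
def to_formula(graph):
    """Render a list of triples as the existential closure of a conjunction
    of triple(s,p,o) atoms, with one variable per bNode."""
    def term(t):
        # The bNode _:x becomes the variable ?x; other terms are kept as written.
        return "?" + t[2:] if t.startswith("_:") else t

    atoms = [f"triple({term(s)},{term(p)},{term(o)})" for s, p, o in graph]
    variables = []
    for triple in graph:
        for t in triple:
            if t.startswith("_:") and term(t) not in variables:
                variables.append(term(t))
    body = " and ".join(atoms)
    if not variables:
        return body + "."  # a ground graph needs no quantifier
    return f"exists {' '.join(variables)} ({body})."

graph = [("_:x", "name", '"John"'), ("_:x", "age", '"25"^^xsd:decimal')]
print(to_formula(graph))
# exists ?x (triple(?x,name,"John") and triple(?x,age,"25"^^xsd:decimal)).
```

Note that the quantifier scopes over the whole conjunction, which is what lets the same bNode be shared between triples.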
Through a process of skolemization (i.e., replacing all existentially quantified variables with fresh constants), one can eliminate the existentially quantified variables. bNodes which have been skolemized are sometimes called "rigid bNodes".
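A minimal sketch of skolemization over a graph, assuming triples are tuples of strings and bNodes are written _:name. The constant names sk1, sk2, ... and the module-level counter are choices made for this illustration. All that matters is that each constant is fresh and that the same bNode is replaced consistently within one graph.

```python
from itertools import count

_fresh = count(1)  # shared counter, so constants are fresh across calls

def skolemize(graph):
    """Replace every bNode in the graph with a fresh constant.
    Occurrences of the same bNode within the graph get the same constant."""
    mapping = {}

    def sk(t):
        if t.startswith("_:"):
            if t not in mapping:
                mapping[t] = f"sk{next(_fresh)}"
            return mapping[t]
        return t

    return {(sk(s), sk(p), sk(o)) for s, p, o in sorted(graph)}

graph = {("_:X", "name", '"John"'), ("_:X", "age", '"25"^^xsd:decimal')}
rigid = skolemize(graph)  # both triples now share one constant as subject
```

The result is a ground set of facts that a traditional rule engine can load directly.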
However, skolemization does not solve all the problems. More specifically, when naively skolemizing RDF graphs, entailment may be lost.
Consider the two graphs:
- S = exists ?x (triple(?x,p,o))
- and
Clearly, S rdf-entails E. The skolemization of S, denoted Sk(S), also rdf-entails E. However, Sk(S) does not rdf-entail Sk(E), since the occurrences of ?x are replaced with different fresh skolem constants.
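The loss of entailment can be reproduced with a small experiment. The sketch below makes several assumptions: E is taken to be a graph of the same shape as S (a hypothetical choice made only for this illustration), entailment is checked by searching for a mapping of bNodes to terms, and the helper names are invented here.

```python
from itertools import count, product

fresh = count(1)

def is_bnode(t):
    return t.startswith("_:")

def skolemize(graph):
    # Replace each bNode with a fresh constant (consistent within one graph).
    mapping = {}
    def sk(t):
        if is_bnode(t):
            if t not in mapping:
                mapping[t] = f"sk{next(fresh)}"
            return mapping[t]
        return t
    return {(sk(s), sk(p), sk(o)) for s, p, o in sorted(graph)}

def simply_entails(g, e):
    # g entails e if some assignment of e's bNodes to terms of g maps e into g.
    bnodes = sorted({t for tr in e for t in tr if is_bnode(t)})
    terms = sorted({t for tr in g for t in tr})
    for values in product(terms, repeat=len(bnodes)):
        m = dict(zip(bnodes, values))
        if all(tuple(m.get(t, t) for t in tr) in g for tr in e):
            return True
    return False

S = {("_:x", "p", "o")}
E = {("_:x", "p", "o")}  # hypothetical E, assumed for this sketch

sk_S = skolemize(S)  # {("sk1", "p", "o")}
sk_E = skolemize(E)  # {("sk2", "p", "o")}

print(simply_entails(S, E))        # True
print(simply_entails(sk_S, E))     # True
print(simply_entails(sk_S, sk_E))  # False
```

Once both sides are ground, entailment reduces to set inclusion, and the two fresh constants can never coincide.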
Conclusion
There are three general ways to deal with the bNode issue.
- Treat bNodes as rigid bNodes, i.e., as skolem constants. This would break the standard RDF entailment, but it would be compatible with traditional rule languages and with current implementations of rule systems. Notice that this is the way bNodes are currently handled by many RDF systems on the Semantic Web.
- Allow existentially quantified variables in the heads of rules. In this case, the rule language would diverge from traditional rule languages and could not be implemented on current rule systems. One language which goes in this direction is SWRL.
- Treat the RDF graph (viewed as a DL knowledge base) and the rule base as two separate components and define an interface between the two. There are two main approaches: [2] and [3]. In [2], the interface between the DL knowledge base and the rule base is defined for single models, whereas in [3], the interface is defined with respect to entailment, i.e., all models. A detailed discussion of these approaches is beyond the scope of this page.
[1] Hayes, P. 2004. RDF Semantics. W3C Recommendation, 10 February 2004. http://www.w3.org/TR/rdf-mt/.
[2] Rosati, R. 2006. DL+log: Tight integration of description logics and disjunctive datalog. In KR2006.
[3] Eiter, T.; Lukasiewicz, T.; Schindlauer, R.; and Tompits, H. 2004. Combining answer set programming with description logics for the semantic web. In Proc. of the International Conference of Knowledge Representation and Reasoning (KR2004).