Copyright © 2007 W3C ® ( MIT , ERCIM , Keio ), All Rights Reserved. W3C liability , trademark and document use rules apply.
This document, developed by the Rule Interchange Format (RIF) Working Group , specifies the basic design for a format that allows logic rules to be translated between rule languages and thus transferred between rule systems.
In Phase 1 , the RIF Working Group first defines a condition language, which is envisioned to be a shared part of all RIF dialects. In this document, the RIF Condition Language is used to define rule bodies of the RIF Basic Logic Dialect (RIF BLD). We give an abstract syntax (with UML visualization) and semantics (model theory) for positive and slotted conditions as well as for a Horn rule language. Examples in this document are based on a human-oriented syntax and an XML syntax, which are derived from the abstract syntax, but in the current working draft these syntaxes are used only for explanations and illustration.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
The Rule Interchange Format (RIF) Working Group seeks public feedback on this Second Public Working Draft. Please send your comments to public-rif-comments@w3.org ( public archive ). If possible, please offer specific changes to the text which will address your concern.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .
(Editor's Note: This text is maintained on wiki page RIF Overview ).
Based on the RIF Use Cases and Requirements , this document develops RIF BLD (the B asic L ogic D ialect of the R ule I nterchange F ormat) through a set of foundational concepts that are intended to be shared by all logic-based RIF dialects. The overall RIF design takes the form of a layered architecture organized around the notion of a dialect .
A
dialect
is
a
rule
language
with
a
well-defined
syntax
and
semantics.
This
semantics
must
be
model-theoretic,
proof-theoretic,
or
operational
in
this
order
of
preference.
Some
dialects
might
be
proper
extensions
of
others
(both
syntactically
and
semantically)
and
some
may
have
incompatible
expressive
power.
However,
all
All
logic-based
dialects
are
required
to
extend
the
RIF
Basic
Logic
Dialect.
From a theoretical perspective, RIF BLD corresponds to the language of definite Horn rules (see Horn Logic ) with equality (and with a standard first-order semantics). Syntactically, however, RIF BLD has a number of extensions to support features such as objects and frames, international resource identifiers (or IRIs) RFC 3987 as identifiers for concepts, and XML Schema data types. These latter features make RIF a Web language . However, RIF is designed to enable interoperability among rule languages in general, and its uses are not limited to the Web. The semantics of RIF has provisions for future extensions towards dialects that support pure FOL, dialects that support negation as failure (NAF), business (or production) rules, reactive rules, and other features.
Eventually, it is hoped that RIF dialects will cover a number of important paradigms in rule-based specification and programming. Our main target paradigms include production rules , logic programming , FOL-based rules, reactive rules , and normative rules (integrity constraints).
The central part of RIF is its Condition Language . The condition language defines the syntax and semantics for the bodies of the rules in RIF BLD and the syntax for the queries. However, it is envisioned that the Condition Language will have wider use in RIF. In particular, it might be used as a sublanguage for specifying the conditional part in the bodies of production rules ( PRD ), reactive rules, and normative rules.
As
mentioned,
IRI
constants
are
used
in
RIF
both
as
logical
constants,
as
names
of
predicate,
and
of
function
symbols.
RIF
BLD
does
not
allow
the
same
symbol
to
play
more
than
one
of
these
roles,
but
the
dialects
that
extend
the
BLD
can
and
will
may
support
polymorphic
symbols
(i.e.,
symbols
that
have
more
than
one
contextual
use;
e.g.,
of
a
constant
and
of
a
predicate).
Such
polymorphism
is
common
practice
in
Logic
Programming
(e.g.,
[Prolog],
[F-logic],
[
HiLog
])
and
in
[RDF].
This
extensibility
is
achieved
in
RIF
by
building
its
syntax
on
the
basis
of
signatures
.
This
draft
also
introduces
a
frame-based
syntax
and
semantics
and
defines
a
normative
way
for
RIF
rules
to
interact
with
RDF.
RIF
BLD
is
described
in
the
current
document
by
means
of
the
following:
An
Abstract
Syntax
Notation
and
Abstract
EBNF
Syntax
specification
visualized
by
a
UML
diagram
(abstract
syntax).
A
Concrete
EBNF
syntax.
This
is
a
concrete
human-oriented
syntax,
which
is
used
for
illustration/explanation
purposes
only.
An
XML
syntax.
XML
is
used
for
serializing
RIF
rule
sets.
In
the
present
document
the
XML
syntax
is
used
for
illustration
purposes
only.
A
model-theoretic
semantics.
The
XML
and
Concrete
EBNF
syntaxes
are
derived
from
the
abstract
syntax.
The
model-theoretic
semantics
is
designed
for
modularity
and
to
enable
further
extensions.
The current document is the second draft of the RIF BLD specification (in the first draft called 'RIF Core'). A number of extensions are planned to support built-ins, additional primitive XML data types, the notion of RIF compliance, and so on. Tool support for RIF BLD is forthcoming. RIF dialects that extend BLD will be specified in other documents by this working group.
(Editor's Note: This text is maintained on wiki page RIF Condition Language ).
The RIF Condition Language is intended to be a common basis for several dialects of RIF. First of all, it is used by RIF BLD, as described in this document. The other future dialects or groups of dialects where the Condition Language or its extensions might be used include:
Rule bodies and queries in declarative logic programming dialects (LP)
Rule bodies in first-order dialects (FO)
Conditions in the rule bodies of the Production Rule Dialect ( PRD )
The event and condition parts of the rule bodies in reactive rule dialects (RR)
Integrity constraints (IC)
It should be noted, however, that apart from RIF BLD and RIF PRD no decision has been made regarding which dialects will ultimately be part of RIF.
The RIF Condition Language is intended to be used only in rule bodies and queries, not in rule heads. The various RIF dialects diverge in the way they specify, interpret, or use rule heads and other components of the rules. By focusing on the condition part of the rule bodies we achieve maximum syntactic and a great deal of semantic reuse among the dialects.
This Condition Language part of the document describes Positive Conditions and Slotted Conditions .
(Editor's Note: This text is maintained on wiki page Positive Conditions ).
The language of positive RIF conditions determines what can appear as a body (the if-part) of a rule supported by the basic RIF logic. As explained in Section Overview , RIF's Basic Logic Dialect corresponds to definite Horn rules, and the bodies of such rules are conjunctions of atomic formulas without negation. However, it is well-known that disjunctions of such conditions can also be used in the bodies of rules without changing the essential properties of the rule language. This is based on the fundamental logical tautologies (h ← b ∨ c) ≡ ((h ← b) ∧ (h ← c)) and ∀ x (F ∧ G) ≡ (∀ x F ∧ ∀ x G). In other words, a rule with a disjunction in the body can be split into two or more rules that have no such disjunction.
In
a
later
draft,
positive
RIF
conditions
will
be
extended
to
include
builtins,
i.e.
calls
to
procedures
defined
outside
the
ruleset.
The
condition
language
will
is
intended
to
be
shared
among
the
bodies
of
the
rules
expressed
in
future
RIF
dialects,
such
as
LP,
FO,
PR
and
RR.
The
condition
language
might
also
be
used
to
uniformly
express
integrity
contraints
and
queries.
This
section
presents
a
syntax
and
semantics
for
the
RIF
condition
language.
To
make
RIF
dialects
suitable
as
Web
languages,
RIF
supports
XML
Schema
and
some
other
primitive
data
types.
In
addition,
RIF
promotes
the
use
of
international
internationalized
resource
identifiers
(or
IRI
s)
RFC
3987
to
refer
to
individuals,
predicates,
and
functions.
To ensure extensibility and to provide for future higher-order dialects based on formalisms, such as HiLog and Common Logic , the RIF logic language does not draw sharp boundary between the symbols used to denote individuals from symbols used as names for functions or predicates. Instead, all constant, preficate, and function symbols are drawn from the same universal set. RIF dialects control the contexts in which the different symbols can occur by attaching signatures to these symbols.
RIF's basic logic dialect carefully selects signatures for the symbols so that the corresponding logic will be first-order: each symbol has a unique role as an individual, a function symbol of a particular arity, or a predicate symbol of a particular arity. However, dialects extending the basic logic will be allowed to use signatures in less restrictive fashions so that symbols could be polymorphic, polyadic, and be allowed to occur in several different contexts (for example, both as individuals and as predicates).
We begin by describing a syntax, which is more general than what the basic logic dialect permits. This syntax can be used in the various dialects that extend RIF's basic logic dialect. Then we introduce the notion of a signature and specify the restrictions on the way signatures are allowed to be assigned to symbols.
In this document we will introduce several related but distinct syntaxes:
Formal syntax. This syntax is used in the definitions and follows the standard textbook conventions for logical syntax.
Presentation syntax. This syntax is closely related to the formal syntax and is derived from it by elaborating on the various aspects of RIF (such as, for example, data types). This is a human-oriented syntax, which we use in our the examples.
Abstract syntax. This is a normative version of the RIF syntax. It is also derived from the formal syntax, but is more verbose than the presentation syntax (this verbose nature of the abstract syntax is the main reason for the presentation syntax).
XML syntax. This syntax is the normative XML serialization of the abstract syntax.
The
language
of
RIF
Condition
Language
consists
of
a
countably
infinite
set
of
constant
symbols
Const
,
a
countably
infinite
set
of
variable
symbols
Var
(disjoint
from
Const
),
connective
symbols
∧
and
∨,
quantifiers
∀
and
quantifier
∃,
and
auxiliary
symbols
like
(,
),
and
so
on.
The
basic
language
construct
is
called
term
,
which
is
defined
inductively
as
follows:
If t ∈ Const or t ∈ Var then t is a term.
If t and t 1 , ..., t n are terms then t ( t 1 ... t n ) is a term
This definition is very general. It makes no distinction between constant symbols that represent individuals, predicates, and function symbols. The same symbol can occur in multiple contexts at the same time. For instance, p(p(a) p(a b c)) is a term. Even variables and general terms are allowed to occur in the position of predicates and function symbols. For instance, p(a)(p(b c) q) is also a term. To control the context in which any given symbol can occur in RIF dialects, the language associates a signature with each symbol (both constant and variable symbols).
Signatures.
Let
SigNames
be
a
non-empty
finite
or
countably
infinite
set
of
signature
names
.
We
assume
require
that
this
set
includes
at
least
the
following
signature
names:
i
and
bool
.
The
signature
name
i
is
intended
to
represent
represents
the
context
where
the
constants
that
denote
individual
objects
are
allowed
to
appear.
The
name
bool
represents
the
context
of
atomic
formulas.
A
signature
is
a
statement
of
the
form
η{
e
1
...
e
n
...}
where
η
∈
SigNames
is
the
name
of
the
signature
and
{
e
1
...
e
n
...}
is
a
set
of
signature
expressions
.
This
set
can
be
empty,
finite,
or
countably
infinite.
In
RIF's
basic
logic
dialect,
this
set
will
have
at
most
one
expression,
but
in
more
expressive
dialects
a
signature
can
have
more
than
one
expression,
and
in
this
way
they
can
support
polymorphism.
A signature expression can be a base signature expression or an arrow signature expression (as a special case, an arrow expression can be a Boolean signature expression ), defined as follows, where κ 1 , ..., κ n , and κ are signature names from SigNames :
i and bool are basic signature expressions .
(κ 1 ... κ n ) ⇒ κ, where n≥0, is an arrow signature expression . In particular, () ⇒ i and ( i ) ⇒ i are arrow signature expressions.
If κ above is bool then the signature is called a Boolean signature expression .
A set S of signatures is said to be coherent if
it contains the signatures i { } and bool { }, which represent the context of individual objects and atomic formulas; and
no two different signatures in S have the same name.
Well-formed terms and formulas. RIF uses signatures to control the context in which terms can appear through the notion of well-formed terms and well-formed atomic formulas. First, as mentioned above, each symbol (constant or variable) is associated with exactly one signature from a coherent set of signatures. (Different symbols can be associated with the same signature, but one symbol cannot be associated with more than one signature.) If σ is a signature then we will use σ # to denote the name of that signature.
If s is a constant or variable symbol with signature η{...} and a base expression i or bool belongs to the set {...} associated with that signature then s is a well-formed term with signature i { } or bool { }, respectively.
Note that if symbol s has a signature, say, κ{ i ( i )⇒ bool } and symbol t has a signature ρ{( i )⇒ i ( i )⇒ bool } then, according to the above definition, s is a well-formed term, while t is not. Also, according to this definition the signature of a symbol can be different from the signature of the same symbol when it is viewed as a term. For instance, in the above example, s has a signature κ{ i ( i )⇒ bool } as a symbol, but its signature as a term is i { }. Furthermore, a symbol always has just one signature (which might have many arrow expressions in it), but a term can have several signatures. For instance, if symbol q has the signature δ{ i bool } then q as a well-formed term has two signatures: i { } and bool { }.
A term t ( t 1 ... t n ) is a well-formed term with signature σ iff
t is a well-formed term with signature σ 0 ;
Each t i is a well-formed term with signature σ j , j = 1,...,n (the σ j 's are not necessarily distinct); and
σ 0 contains an arrow expression of the form (σ 1 # ... σ n # ) ⇒ σ #
A term t ( t 1 ... t n ) is a well-formed atomic formula iff it is a well-formed term with signature bool { }.
Note that f() and f are distinct terms and the proposed XML serialization also treats these as distinct fragments of XML: the nullary function application <Uniterm><Const>f</Const></Uniterm> vs. the symbol <Const>f</Const> .
RIF
assumes
defines
that
there
is
a
special
predicate,
=
,
which
denotes
equality.
Like
other
predicates,
predicates
it
has
a
signature,
which
and
this
signature
includes
the
Boolean
expression
(
i
i
)
⇒
bool
and
possibly
other
expressions
of
the
form
(
s
s
)
⇒
bool
,
where
s
is
a
signature
name.
No
other
signatures
are
allowed
for
=
.
We
note
that
it
is
common
practice
to
write
the
atomic
formulas
involving
=
using
infix
notation,
i.e.,
a = b
instead
of
=(a,b)
.
The
equality
predicate
has
special
model-theoretic
semantics,
as
explained
in
Section
Model
Theory
for
RIF's
Basic
Condition
Language
.
Examples. To illustrate the above definitions, we give several examples.
Consider the term p(p(a) p(a b c)) . If p has the (polymorphic) signature myPsig {( i )⇒ i ( i i )⇒ i ( i i i )⇒ i } and a , b , c each has the signature i { } then p(p(a) p(a b c)) is a well-formed term with signature i { }. If, for instance, p 's signature were myPsig2 {( i i )⇒ i ( i i i )⇒ i } then p(a) would not have been a well-formed term and the entire term would also be ill-formed.
Here is another, fancier, signature for p under which the above term would be well-formed (and again have signature i{ }): myPsig3 {( i )⇒ bool ( bool i )⇒ i ( i i i )⇒ i }. If p 's signature were myPsig4 {( i )⇒ bool ( bool i )⇒ bool ( i i i )⇒ i } instead, then p(p(a) p(a,b,c)) would have been a well-formed term with signature bool { } and, therefore, it would have been a well-formed atomic formula.
An even more advanced example of a signature is when the right-hand side of an arrow expression is a signature other than i or bool . For instance, let John , Mary , NewYork , and Boston have signatures i { }; flight and ancestor have signatures h 2 {i ( i i )⇒ bool }; and closure have signature hh 1 {( h 2 )⇒ p 2 }, where p 2 is the name of the signature p 2 {( i i )⇒ bool }. Then closure(flight)(NewYork,Boston) and closure(ancestor)(John,Mary) would be well-formed formulas. Such formulas are allowed in languages like HiLog , which support predicate constructors like closure in our example.
More general formulas are constructed out of atomic formulas with the help of logical connectives. RIF's Basic Condition Language defines the following general well-formed formulas .
If φ is a well-formed atomic formula then it is also a general well-formed formula.
If are φ and ψ are general well-formed formulas then so is φ ∧ ψ.
If are φ and ψ are general well-formed formulas then so is φ ∨ ψ.
If φ is a well-formed general formula and V 1 , ..., V n are variables then ∃ V 1 ... V n φ is a general well-formed formula.
Signatures in the RIF Basic Condition Language. RIF's basic condition language presents a much simpler picture to the user by restricting the set of well-formed terms to a specific coherent set of signatures. Namely, the signature set of RIF's basic logic language contains the following signatures:
i { } and bool { }
For every integer arity n ≥ 0, there are signatures f n {( i ... i ) ⇒ i } and p n {( i ... i ) ⇒ bool } (there are n i 's inside the parentheses), which represent function symbols of arity n and predicate symbols of arity n, respectively. In addition to i and bool , the symbols f n and p n are reserved signature names in the RIF basic logic dialect. In this way, each constant symbol can be either an individual, a predicate of one particular arity, or a function symbol of one particular arity.
All variables are associated with signature i { }, so they can range only over individuals.
The equality symbol, = , has signature p 2 {( i i ) ⇒ bool }. In this way, the equality predicate can compare only the terms whose signature is i{ }; it cannot compare predicate names or function symbols.
Symbol spaces. Throughout this document, the rif: prefix will stand for the XML Schema namespace URI http://www.w3.org/2001/XMLSchema and rif: will stand for the URI of the RIF namespace, http://www.w3.org/2007/rif .
The set of all constant symbols in RIF has a number of predefined subsets, called symbol spaces , which are used to represent XML data types, data types defined in other W3C specifications, such as rdf:XMLLiteral , and to distinguish other sets of constants. The following primitive data types are supported: xsd:long ( http://www.w3.org/2001/XMLSchema#long ), xsd:string ( http://www.w3.org/2001/XMLSchema#string ), xsd:decimal ( http://www.w3.org/2001/XMLSchema#decimal ), xsd:time ( http://www.w3.org/2001/XMLSchema#time ), xsd:dateTime http://www.w3.org/2001/XMLSchema#dateTime ), and rdf:XMLLiteral ( http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral ).
Basic
RIF
logic
also
defines
two
additional
symbol
spaces,
spaces:
rif:iri
(for
international
internationalized
resource
identifier
or
IRI
)
and
rif:local
(for
constant
symbols
that
are
not
visible
outside
of
a
particular
set
of
RIF
formulas).
Constant
symbols
that
belong
to
this
the
various
symbol
space
spaces
will
have
special
concrete
presentation
syntax,
and
semantic
structures
will
interpret
them
in
a
special
way.
To
In
the
formal
syntax,
however,
support
for
symbol
spaces
in
the
syntax
we
only
need
requires
us
to
state
that
the
set
Const
of
all
constant
symbols
has
subsets
Const
xsd:long
,
Const
xsd:string
,
Const
xsd:decimal
,
Const
xsd:time
,
Const
xsd:dateTime
,
Const
rdf:XMLLiteral
,
Const
rif:iri
,
and
Const
rif:local
.
These
sets
of
symbols
are
described
in
more
detail
later.
We
define
the
Abstract
Syntax
of
BLD
with
(strictly)
alternating
syntactic
class/property
categories,
i.e.
in
a
"(fully)
striped"
manner.
Section
Concrete
Presentation
Syntax
will
then
proceed
to
a
"stripe-skipped"
or
"unstriped"
human-readable
syntax
as
well
as
to
a
(fully)
striped
XML
syntax.
TODO: Explain "fully striped."
To compare two approaches at [ F2F7 ], we represent the abstract syntax of the RIF Condition Language both in Abstract Syntax Notation and in Abstract EBNF Syntax.
The abstract syntax of the condition language is given in asn06 as follows:
class CONDITION
subclass And
property formula : list of CONDITION
subclass Or
property formula : list of CONDITION
subclass Exists
property declare : list of Var
property formula : CONDITION
subclass ATOMIC
class ATOMIC
subclass Uniterm
subclass Equal
class Uniterm
property op: Const
property arg: list of TERM
class Equal
property side: list of TERM
class TERM
subclass Const
property name: CONSTNAME
property type: TYPENAME?
subclass Var
property name: VARNAME
subclass Uniterm
The length of the list of Var of the declare property (role) of the Exists class is assumed to be 1 or more. The multiplicity of the side property of the Equal class is assumed to be exactly 2.
We represent the abstract syntax of the RIF Condition Language with the help of usual EBNF , employed here to define BLD in a fully striped normal form. In this Abstract EBNF Syntax , the properties are named using idioms of the form ' name ->' , e.g. 'formula ->' .
CONDITION ::= 'And' '(' ('formula ->' CONDITION)* ')'
|
'Or' '(' ('formula ->' CONDITION)* ')'
|
'Exists' '(' ('declare ->' Var)+ 'formula ->' CONDITION ')'
|
ATOMIC
ATOMIC ::= Uniterm | Equal
Uniterm ::= 'Uniterm' '(' 'op ->' Const ('arg ->' TERM)* ')'
Equal ::= 'Equal' '(' 'side ->' TERM 'side ->' TERM ')'
TERM ::= 'Const' ( '(' 'type ->' TYPENAME ')' )? '(' CONSTNAME ')'
|
'Var' '(' VARNAME ')'
|
Uniterm
The above abstract syntax can be illustrated with a UML diagram, as shown below. Automatic transformation from the Abstract EBNF Syntax to UML can be implemented based on the earlier asn06-to-UML transformation .
The central class of RIF, CONDITION , is specified recursively through its subclasses and their parts. The Equal class has two side roles. Two pairs of roles distinguish between the declare and formula parts of the Exists class, and between the operation ( op ) and arguments ( arg ) of the Uniterm class, where multiple arguments are subject to an ordered constraint (the natural "left-to-right" order):
The syntactic classes are partitioned into classes that will not be visible in serializations (written in all-uppercase letters) and classes that will be visible in instance markup (written with a leading uppercase letter only).
The three classes Var , CONDITION , and ATOMIC will be required in the abstract syntax of Horn Rules .
The
abstract
syntax
of
Section
Abstract
Syntax
can
be
lowered
to
a
concrete
presentation
syntax.
The
Concrete
Presentation
EBNF
Syntax
below,
which
lowers
the
Abstract
Syntax,
is
used
throughout
this
document
to
explain
and
illustrate
the
main
ideas.
This
syntax
is
similar
in
style
to
what
in
OWL
is
called
the
Abstract
Syntax
http://www.w3.org/TR/owl-semantics/syntax.html
.
Concrete
Presentation
EBNF
Syntax.
The
concrete
human-readable
human-oriented
presentation
syntax,
described
in
this
EBNF
(usual
except
for
whitespace
handling),
is
work
in
progress
and
under
discussion.
CONDITION ::= 'And' '(' CONDITION* ')' |
'Or' '(' CONDITION* ')' |
'Exists' Var+ '(' CONDITION ')' |
ATOMIC
ATOMIC ::= Uniterm | Equal
Uniterm ::= Const '(' TERM* ')'
Equal ::= TERM '=' TERM
TERM ::= Const | Var | Uniterm
Const ::= CONSTNAME | '"'CONSTNAME'"''^^'TYPENAME
Var ::= '?'VARNAME
The above is a standard syntax for a variant of first-order logic. The application of a symbol from Const to a sequence of terms is called Uniterm ("Universal term") since it can be used to play the role of a function term or an atomic formula depending on the syntactic context in which the application occurs. The non-terminal ATOMIC stands for (positive) atomic formula , which can later be complemented with "negated atomic formula". The 'Exists' formula is an "existential formula", which in Horn-like conditions is the only quantified formula but in later conditions may be complemented with "universal formula" ( Var+ denotes the list of free variables in CONDITION ). The 'And' formula defines conjunctions of conditions, and 'Or' denotes disjunctions. Finally, CONDITION assembles everything into what we call RIF conditions . RIF dialects will extend these conditions in various ways.
Note that individuals, function symbols, and predicate symbols all belong to the same set of symbols ( Const ). This syntax is more general than what RIF's basic logic dialect actually permits. As explained in Section Formal Preliminaries a set of signatures restricts this syntax to allow only the terms that are allowed in standard first-order logic.
The
above
Concrete
Presentation
EBNF
Syntax,
where
TERM*
is
understood
as
TERM* ::= | TERM TERM*
,
uses
spaces
to
indicate
Lisp-like
whitespace
(
SPACE
,
TAB
,
or
NEWLINE
)
character
sequences
in
the
defined
language;
these
must
consist
of
at
least
one
whitespace
character
when
used
as
the
separator
between
adjacent
TERM
s,
and
can
consist
of
zero
or
more
whitespace
characters
elsewhere.
Where
spaces
are
omitted
from
the
Concrete
Presentation
EBNF
Syntax,
as
between
'?'
and
VARNAME
in
'?'VARNAME
,
there
must
be
no
whitespace
character
in
the
defined
language.
Constant symbols can have two forms: CONSTNAME and "CONSTNAME"^^TYPENAME . The second form is more general. The first form is used as a shorthand for some symbol spaces, like long integers, decimals, local RIF constants, etc. The precise syntax for RIF constants is given in Section Primitive Data Types .
At this point we do not commit to any particular vocabulary for the names of variables. These are assumed to be alphanumeric character sequences starting with a ?-sign.
**** MK: should probably be more specific: Vars are alphanums starting with a ? or something like ?'.....', if non-alphanum characters are involved.
The
lowering
of
the
abstract
syntax
to
this
concrete
presentation
syntax
can
be
done
via
automatic
EBNF-to-EBNF
mapping.
The
mapper,
abs2con4g
("
abs
tract
to
con
crete
for
g
rammar"),
uses
a
TokenTable
parameter
specifying
how
to
map
certain
abstract
classes,
in
prefix
notation,
to
concrete
operators,
in
infix
or
prefix
notation.
This
table
is
given
in
the
form
of
binary
class2token
facts,
accessed
via
a
lookup
function.
The
main
mapper
is
given
in
the
form
of
(left-to-right
oriented)
equations
defining
abs2con4g
as
a
binary
function
with
the
abstract
syntax
as
first
argument,
the
table
as
second
argument,
and
the
concrete
presentation
syntax
as
returned
value.
class2token('Equal','=')
class2token('type','^^')
class2token('Var','?')
abs2con4g(?Production1 ... ?ProductionN,
?TokenTable)
=
abs2con4g(?Production1,
?TokenTable)
...
abs2con4g(?ProductionN,
?TokenTable)
abs2con4g(?LeftSymbol ::= ?RightExpression, ?TokenTable)
=
?LeftSymbol ::= abs2con4g(?RightExpression, ?TokenTable)
abs2con4g('And'
'('
('formula ->' CONDITION)*
')'
|
'Or'
'('
('formula ->' CONDITION)*
')'
|
'Exists'
'('
('declare ->' Var)+
'formula ->' CONDITION
')'
|
ATOMIC,
?TokenTable)
=
'And' '(' CONDITION* ')' |
'Or' '(' CONDITION* ')' |
'Exists' Var+ '(' CONDITION ')' |
ATOMIC
abs2con4g(Uniterm
|
Equal,
?TokenTable)
=
Uniterm | Equal
abs2con4g('Uniterm'
'('
'op ->' Const
('arg ->' TERM)*
')',
?TokenTable)
=
Const '(' TERM* ')'
abs2con4g('Equal'
'('
'side ->' TERM
'side ->' TERM
')',
?TokenTable)
=
TERM lookup('Equal',?TokenTable) TERM
abs2con4g(Const
|
Var
|
Uniterm,
?TokenTable)
=
Const | Var | Uniterm
abs2con4g('Const' ('(' 'type ->' TYPENAME ')')?
'('
CONSTNAME
')',
?TokenTable)
=
CONSTNAME(lookup('type',?TokenTable)TYPENAME)?
abs2con4g('Var'
'('
VARNAME
')',
?TokenTable)
=
lookup('Var',?TokenTable)VARNAME
Note
that
variables
in
the
RIF
Condition
Language
can
be
free
and
or
quantified
.
All
quantification
is
explicit.
All
variables
introduced
by
quantification
should
must
also
occur
in
the
quantified
formula.
Initially,
only
existential
quantification
is
used.
Universal
quantification
will
be
introduced
later.
We
adopt
the
usual
scoping
rules
for
quantifiers
from
first-order
logic.
Variables
that
are
not
explicitly
quantified
are
free
.
Free variables arise because CONDITION can occur in an if part of a rule. When this happens, the free variables in a condition formula are precisely those variables that also occur in the then part of the rule. We shall see in Section Horn Rules that such variables are quantified universally outside of the rule, and the scope of such quantification is the entire rule. For instance, the variable ?Y in the first RIF condition of Example 1 is quantified, existentially, but ?X is free. However, when this condition occurs in the if part of the second rule in Example 1, then this variable is quantified universally outside of the rule. (The precise syntax for RIF rules will be given in Section Horn Rules .)
Example 1 (A RIF condition in and outside of a rule)
RIF condition:
Exists ?Y (condition(?X ?Y))
RIF Horn rule:
Forall ?X (then-part(?X) :- Exists ?Y (condition(?X ?Y)))
When RIF conditions are used as queries, their free variables carry answer bindings back to the caller.
In
this
document
Following
N-Triples
,
we
will
use
the
following
syntax,
following
N-Triples
,
presentation
syntax
to
specify
the
constant
symbols
that
belong
to
a
symbol
space
label
:
"literal"^^label
For instance, "123"^^xsd:long is a symbol that belongs to the symbol space xsd:long , "2007-11-23T03:55:44-02:30"^^xsd:dateTime is a symbol in symbol space xsd:dateTime , and "ftp://example.com/foobar"^^rif:iri is a symbol that belongs to the symbol space rif:iri . The part of such a symbol that occurs inside the double quotes is called the lexical form of the symbol. The surrounding double quotes are not part of the literal. If a double quote is included as part of a literal, it must be escaped with the backslash. Some data types have shorthand notation in which the ^^label part can be omitted (see below).
A symbol space in RIF has the following components:
A
non-empty
set
of
character
strings
called
the
lexical
space
of
the
symbol
space.
The
lexical
space
for
a
type,
D
,
determines
which
character
strings
are
allowed
as
value
in
a
symbol
like
"
value
"
D
.
.
For
instance,
"1.2"^^xsd:decimal
and
"1"^^xsd:decimal
are
two
legal
symbols
because
1.2
and
1
are
legal
in
the
lexical
space
of
the
XML
data
datatype
xsd:decimal
.
On
the
other
hand,
"a+2"^^xsd:decimal
is
not
a
legal
symbol,
since
a+2
is
not
part
of
the
lexical
space
of
the
XML
data
datatype
xsd:decimal
.
A constant symbol of the form "xyz"^^label is well-formed if its lexical form, xyz belongs to the lexical space associated with the symbol space label .
In addition the semantics of a symbol space defines
A non-empty set called the value space of the data type; and
These
two
components
will
be
explained
in
Section
Model
Theory
for
RIF's
Basic
Condition
Language
.
We
would
like
to
point
out,
however,
that
symbol
spaces
in
RIF
include
spaces,
such
as
xsd:long
,
whose
value
space
is
fixed
as
well
as
spaces
like
rif:iri
and
rif:local
whose
value
spaces
are
not
fixed.
We
will
refer
to
the
former
as
primitive
data
types
.
The
XML
syntax
for
symbol
spaces
can
utilize
the
type
attribute
associated
with
XML
term
elements
such
as
Const
.
For
instance,
the
earlier
xsd:dateTime
example
can
be
represented
as
<Const type="xsd:dateTime">2007-11-23T03:55:44-02:30</Const>
.
A
number
of
primitive
types
in
RIF
are
based
on
the
XML
Schema
data
types
types,
and
thus
it
is
expected
that
corresponding
their
names
of
RIF
primitive
types
will
be
are
references
to
those
the
corresponding
XML
Schema
types.
In
this
case,
these
IRIs
will
be
used
as
type
names.
In this version of the RIF BLD document, we define the following symbol spaces, where the prefix xsd refers to the XML Schema URI and rif is the prefix for the RIF language. The syntax such as xsd:long should be understood as a compact uri , i.e., a macro, which expands to a concatenation of the character sequence denoted by the prefix xsd and the string long . In the next version of this document we will introduce a syntax for defining prefixes for compact URIs and will also expand on the syntax for the symbols that can be used to denote RIF's primitive data types.
xsd:long . This symbol space corresponds to the XML data type xsd:long . Example: "123"^^xsd:long . Long integers also have a short notation, which does not require the "..."^^xsd:long wrapper. That is, long integers can be written as constants from the lexical space of xsd:long , but the surrounding double quotes and the type label ^^xsd:long can be omitted. Example: 123 .
xsd:decimal . This corresponds to the XML data type xsd:decimal . Example: "123.45"^^xsd:decimal . Decimals also have an alternative short notation, which does not require the "..."^^xsd:decimal wrapper. That is, decimals can be written as constants from the lexical space of xsd:decimal , but the surrounding double quotes and the type label ^^xsd:decimal can be omitted. Example: 123.45 .
The type xsd:decimal is a supertype of xsd:long .
xsd:string . This corresponds to the XML data type xsd:string . Example: "a string"^^xsd:string . Double quotes that appear inside strings are escaped with a backslash and a backslash that is supposed to appear in the string must be escaped with another backslash.
xsd:time . This corresponds to the XML data type xsd:time . Example: "18:33:44.2345"^^xsd:time .
xsd:dateTime . This type corresponds to the XML data type xsd:dateTime . Example: "2007-03-12T21:22:33.44-01:30"^^xsd:dateTime .
rdf:XMLLiteral
.
This
type's
lexical
space
contains
all
XML
documents
wrapped
between
a
certain
pair
of
tags.
This
lexical
space
is
used
to
represent
certain
XML
documents,
as
described
in
Resource
Description
Framework
(RDF):
Concepts
and
Abstract
Syntax
.
rif:iri . Symbols in this symbol space have the form "XYZ"^^rif:iri , where XYZ is an absolute IRI as specified in RFC 3987 .
rif:local . Symbols in that symbol space have the form "XYZ"^^rif:local , where XYZ is any string of characters (provided that each occurrence of " and of \ in such a string is escaped with a backslash). Thus, the lexical space of rif:local is the same as the lexical space of xsd:string . Local symbols also have an alternative short notation. Example: 'a local symbol' . In this notation, single quotes and backslashes that occur inside such strings are escaped with backslashes.
TODO: Define CURIs.
Other XML data types that are likely to be incorporated in RIF include xsd:double , xsd:date , and a type for temporal duration . At present, the partial order on the above primitive data types is imposed by the XML Schema hierarchy, so the only subtype relationship is between xsd:long and xsd:decimal . This may be extended as more types are added.
The
symbol
space
rif:iri
is
intended
to
be
used
in
a
way
similar
to
RDF
resources
.
The
domain
value
space
of
the
symbol
space
rif:iri
can
be
any
set
and
no
a
priori
equalities
among
the
members
of
the
type
rif:iri
are
assumed.
This
domain
is
not
the
same
as
the
value
space
of
the
XML
primitive
type
anyURI
.
The symbol space rif:local is used for constant symbols (including predicate and function symbols) that are local to the various sets of RIF formulas. They are not visible outside of the rule set in which these constants occur (except, possibly, when such a constant is equated to a constant of type rif:iri ).
TODO. We will need to define the notion of the rule set more precisely. For instance, what is the rule set in which a particular rule occurs? Will we have #include statements? Import? All this has to be hashed out.
The constant symbols that correspond to XML Schema data types all have the signature i { } in RIF's basic logic dialect. The symbols of type rif:iri and rif:local can have any signature allowed by the basic RIF logic: i , f n , or p n , for n = 0,1,....
Symbols with ill-formed lexical part. RIF constant symbols that belong to one of the RIF-supported symbol spaces must be well-formed, i.e., their lexical form must belong to the lexical space associated with the symbol space. For instance, "123"^^xsd:long has a correct lexical form, since 123 belongs to the lexical space of the data type xsd:long . In contrast, "abc"^^xsd:long is ill-formed, as it does not have a correct lexical form. A compliant RIF interpreter must reject such symbols.
Symbols with undefined symbol spaces. RIF allows symbols of the form "..."^^label where label is not one of the previously defined standard symbol spaces (such as xsd:long , rif:local , etc.). These are treated as uninterpreted constant symbols in the RIF language. For instance, "abc"^^cde is such an uninterpreted symbol. Dialects that extend the basic RIF logic dialect might appropriate some of the symbol spaces, which are undefined in the basic RIF dialect, and give them special semantics.
EDITOR'S
NOTE:
The
following
XML
syntax
for
BLD
presented
here
is
a
possible
XML-serializing
mapping
one
of
the
abstract
syntax
proposals
the
Working
Group
is
considering.
It
is
presented
here
to
get
feedback
on
this
strawman
and
to
give
readers
an
idea
for
the
kind
of
information
that
will
be
presented
in
Section
Abstract
Syntax
.
this
section.
This
The
XML
serialization
is
again
fully
striped,
where
for
RIF
abstract
syntax
presented
in
positional
Section
Abstract
Syntax
is
fully
striped.
Positional
information
is
optionally
exploited
only
for
the
arg
role
elements.
For
example,
role
elements
(
declare
and
formula
)
are
explicit
within
the
Exists
element.
Following
the
examples
of
Java
and
RDF,
we
use
capitalized
names
for
class
elements
and
names
that
start
with
lowercase
for
role
elements.
The all-uppercase classes in the abstract syntax, such as CONDITION , become XML entities. They act like macros and are not visible in instance markup. The other classes as well as non-terminals and symbols (such as Exists or = ) become XML elements with optional attributes, as shown below (see also Appendix Specification ).
- And (conjunction)
- Or (disjunction)
- Exists (quantified formula for 'Exists', containing declare and formula roles)
- declare (declare role, containing a Var)
- formula (formula role, containing a CONDITION formula)
- Uniterm (atomic or function-expression formula)
- op (operation role)
- arg (argument role, positional/non-positional without/with optional 'index' attribute)
- Equal (prefix version of term equation '=')
- side (left-hand and right-hand side role)
- Const (predicate symbol, function symbol, or individual, with optional 'type' attribute)
- Var (logic variable)
The XML syntax for symbol spaces utilizes the type attribute associated with XML term elements such as Const . For instance, a literal in the xsd:dateTime data type can be represented as <Const type="xsd:dateTime">2007-11-23T03:55:44-02:30</Const> .
The following example illustrates XML serialization of RIF conditions.
Example 2 (A RIF condition and its XML serialization):
a. RIF condition:
And ( Exists ?Buyer (purchase(?Buyer
?Seller
book(?Author "http://example.com/books#LeRif"^^rif:iri)
"http://example.com/currencies#USD"^^rif:iri(49)))
?Seller=?Author )
b. XML serialization:
<And>
<formula>
<Exists>
<declare><Var>Buyer</Var></declare>
<formula>
<Uniterm>
<op><Const type="rif:local">purchase</Const></op>
<arg><Var>Buyer</Var></arg>
<arg><Var>Seller</Var></arg>
<arg>
<Uniterm>
<op><Const type="rif:local">book</Const></op>
<arg><Var>Author</Var></arg>
<arg><Const type="rif:iri">http://example.com/books#LeRif</Const></arg>
</Uniterm>
</arg>
<arg>
<Uniterm>
<op><Const type="rif:iri">http://example.com/currencies#USD</Const></op>
<arg><Const type="xsd:long">49</Const></arg>
</Uniterm>
</arg>
</Uniterm>
</formula>
</Exists>
</formula>
<formula>
<Equal>
<side><Var>Seller</Var></side>
<side><Var>Author</Var></side>
</Equal>
</formula>
</And>
DaveR: Not clear what's the value of giving an informal example instance XML document for signature specifications here. Either formally define the XML syntax for signatures or drop the example.
We
will
now
explain
how
signatures
are
defined
in
the
Basic
Condition
Language
and
its
extensions.
In
RIF's
basic
Logic
dialect,
there
is
no
need
to
declare
signatures,
since
they
can
be
inferred.
Indeed,
the
basic
logic
dialect
assumes
requires
that
each
symbol
is
associated
with
a
unique
signature.
Therefore,
the
signature
can
be
determined
by
the
context
in
which
the
symbol
is
used.
If
a
symbol
is
used
in
more
than
one
context,
the
parser
should
deem
it
as
a
syntax
error.
In dialects that extend RIF's Basic Condition Language, signature inference of the above kind is not possible, in general, and advanced signature inference might not always be appropriate even when it is possible. For this reason, we introduce the attribute sig for the XML elements Const and Var . The value of such an attribute is supposed to be the signature name for the corresponding symbol. For instance, in the above example, purchase was a 4-ary predicate, so it could be specified as <Const sig="p4">purchase</Const> . Since variables in the basic condition language always have the signature i , we could write <Var sig="i">Author</Var> .
The sig attribute is optional, and its value can be inferred for the basic language. For instance, in our example, the values p4 and i for the attribute sig could be inferred from the usage of the symbols purchase and Author . Dialects that extend the basic logic dialect can adopt their own rules for omitting this attribute and for inferring its values. The basic language does not need any special syntax for defining signatures -- all signature names are already defined: bool , i , f0 , f1 , p0 , p1 , etc. The dialects, however, will need a special sublanguage for defining signatures. The signature
mysig{
(s1 s2 s3) -> bool
(s i) -> i
}
can, for example, be defined using the following XML excerpt:
<Signature signame="mysig">
<element>
<Sigexpr sighead="bool">
s1 s2 s3
</Sigexpr>
</element>
<element>
<Sigexpr sighead="i">
s i
</Sigexpr>
</element>
</Signature>
The first step in defining a model-theoretic semantics for a logic-based language is to define the notion of a semantic structure , also known as an interpretation . (See end note on the rationale .) In this section we define basic semantic structures. This definition will be extended in Extension of Semantic Structures for Frames when we introduce frame syntax.
Basic
semantic
structures.
A
basic
semantic
structure
,
I
,
is
a
tuple
of
the
form
<
D
,
I
C
,
I
V
,
I
F
,
I
R
>,
which
determines
the
truth
value
of
every
formula,
as
explained
below.
Currently,
by
formula
we
mean
anything
produced
by
the
CONDITION
production
in
the
Concrete
Presentation
EBNF
Syntax.
Later
on
this
term
on,
formulas
will
also
include
rules.
Other
dialects
might
extend
this
notion
even
further.
The set of truth values is denoted by TV . For RIF's basic logic dialect, TV includes only two values, t (true) and f (false). (See end note on truth values .)
The set TV has a total or partial order, called the truth order ; it is denoted with < t . In the basic RIF logic, f < t t , and it is a total order. (See end note on ordering truth values .)
To define how semantic structures determine the truth values of RIF formulas, we introduce the following sets:
D - a non-empty set of elements called the domain of I ,
Const - the set of individuals, predicate names, and function symbols,
Var - the set of variables.
The mappings that constitute an interpretation, I , are as follows:
I C from Const to elements of D
I V from Var to elements of D
I F from Const to functions from D* into D (here D* is a set of all tuples of any length over the domain D )
I R from Const to truth-valued mappings D* → TV
It is convenient to define a more general mapping, I , based on the mappings I C , I V , I F , and I R :
I (k) = I C (k) if k is a symbol in Const
I (?v) = I V (?v) if ?v is a variable in Var
I (f(t 1 ... t n )) = I F (f)( I (t 1 ),..., I (t n ))
Truth valuation for formulas. Observe that the notion of signatures from Section Formal Preliminaries is used only to constrain the syntax and does not appear in the definition of the semantic structure. This is because when we define truth valuations for formulas, below, all formulas are assumed to be well-formed.
Truth valuation for well-formed formulas in RIF's basic condition language is determined using the following function, denoted I Truth :
Atomic formulas: I Truth (r(t 1 ... t n )) = I R (r)( I (t 1 ),..., I (t n ))
Equality: I Truth (t 1 =t 2 ) = t iff I (t 1 ) = I (t 2 ); I Truth (t 1 =t 2 ) = f otherwise.
Conjunction: I Truth (And(c 1 ... c n )) = min t ( I Truth (c 1 ),..., I Truth (c n )), where min t is minimum with respect to the truth order.
Disjunction: I Truth (Or(c 1 ... c n )) = max t ( I Truth (c 1 ),..., I Truth (c n )), where max t is maximum with respect to the truth order.
Quantification: I Truth ( Exists ?v 1 ... ?v n (c)) = max t ( I * Truth ( c )), where max t is taken over all interpretations I * of the form < D , I C , I * V , I F , I R >, where the mapping I * V has the same value as I V on all variables except, possibly, on the variables ?v 1 ,..., ?v n .
Notice that the mapping I Truth is uniquely determined by the four mapping comprising I and, therefore, it does not need to be listed explicitly.
Interpretation of symbol spaces. We now explain how symbol spaces are integrated into the semantics of the basic RIF logic. As mentioned earlier, the semantics of a symbol space defines an associated value space and a mapping from the lexical space to the value space. The value space may be constrained in various ways or it can be completely fixed. For instance, the XML Schema Part 2: Datatypes specification defines a concrete value space for each XML data type, including the data types such as xsd:decimal , which are of interest to RIF. The value space is different from the lexical space. Lexical space refers to the syntax of the constant symbols that belong to a particular primitive data type. For instance, "1.2"^^xsd:decimal and "1.20"^^xsd:decimal are two legal constants in RIF because 1.2 and 1.20 belong to the lexical space of xsd:decimal . However, these two constants are interpreted by the same element of the value space of the xsd:decimal type.
Other symbol spaces may have their value spaces constrained in some ways, but not fixed. For instance, the value space of rif:iri is defined below as the entire domain of interpretation. This means that this value space cannot be an arbitrary subset of that domain. Some symbol spaces may have their value spaces restricted in other ways. For instance, a value space may be required to be disjoint from the value space of the XML data types.
Formally,
each
of
the
XML
or
RDF
data
types
supported
by
RIF
type
comes
with
a
value
space,
denoted
by
D
type
(for
instance,
D
xsd:decimal
),
and
a
mapping,
I
C
:
Const
type
→
D
type
.
We
require
that
D
type
⊆
D
for
each
XML
data
type.
type,
where
Const
type
denotes
the
set
of
constants
in
the
symbol
space
type
.
The
value
spaces
and
the
corresponding
mappings
for
the
XML
data
types
are
defined
by
the
XML
Schema
Part
2:
Datatypes
specification.
In
some
cases,
the
XML
Schema
specification
defines
the
subdomains
D
s
and
D
t
for
different
XML
primitive
types
s
and
t
as
being
disjoint
or
as
subsets
of
each
other.
For
instance,
the
value
space
of
xsd:long
is
defined
to
be
a
subset
of
xsd:decimal
.
The
requirements
imposed
by
the
XML
Schema
specification
may
have
interesting
consequences
for
the
mappings
from
the
lexical
space
to
the
value
space.
For
instance,
for
xsd:decimal
the
mapping
I
C
:
Const
xsd:decimal
→
D
xsd:decimal
is
such
that,
for
instance,
I
C
(
"1.2"^^xsd:decimal
)
=
I
C
(
"1.20"^^xsd:decimal
)
in
every
semantic
structure.
So,
it
follows
that
"1.2"^^xsd:decimal
=
"1.20"^^xsd:decimal
is
a
RIF
tautology.
The
semantics
of
XML
also
implies
some
inequalities.
For
instance,
since
distinct
strings
are
different
in
the
value
space
for
the
xsd:string
data
type,
something
like
"abc"^^xsd:string
≠
"abcd"^^xsd:string
is
a
tautology,
since
it
is
true
in
all
RIF
semantic
structures.
On
the
other
hand,
"abc"^^rif:iri
≠
"abcd"^^rif:iri
is
not
a
tautology
in
RIF
RIF,
since
it
is
possible
in
some
semantic
structures
(but
not
all
such
structures)
that,
for
example,
I
C
(
"abc"^^rif:iri
)
=
I
C
(
"abcd"^^rif:iri
).
The value space for the RDF data type rdf:XMLLiteral is defined in Resource Description Framework (RDF): Concepts and Abstract Syntax .
For the symbol space rif:iri , the corresponding mapping is I C : Const rif:iri → D , i.e., D rif:iri = D . That is, the interpretation allows to map any IRI into any value in the domain of the semantic structure.
The value space of the local RIF symbols is also the entire domain D and the corresponding mapping looks as follows: I C : Const rif:local → D . Thus, local constants can denote any element of the domain of discourse, including integers, strings, etc.
(Editor's Note: This text is maintained on wiki page Slotted Conditions ).
In this section we extend Positive Conditions, defined in Section Positive Conditions , with slotted Uniterm s and Frame formulas . A slot can be represented by an individual symbol or, more generally, by a Uniterm. It can be represented by an IRI or be known only locally. Semantically, a frame slot is a set-valued function that represents a property of an object. Such a function maps an object id to a set of values of the property. This can be a singleton set or even an empty set. In contrast, a uniterm slot semantically behaves like a unary uninterpreted function symbol. In both cases, however, the order of the slots is immaterial (in contrast to positional uniterms).
Syntactically, the extension is achieved by enriching the notion of a uniterm with slots and by complementing these with frame formulas. For uniformity and greater syntactic convenience, frame formulas can be nested inside other frame formulas. This is syntactic sugar, however, as explained later in this section.
The most important additions to the syntax for positive conditions in Section Formal Preliminaries are the notions of slotted terms (including slotted predicates) and frames.
Slotted terms. A term is either a term in the sense of Section Formal Preliminaries or a slotted term. Slotted terms are defined as follows:
A slotted term is of the form t ( p 1 → v 1 ... p n → v n ), where t is a term; p 1 , ..., p n are terms, which represent the names of the slots; and v 1 , ..., v n are terms.
In order to talk about the slotted terms that are also well-formed , we need to extend the notion of a signature to include slots. As before, a signature is a statement of the form η{ e 1 ... e n ...} where η ∈ SigNames is the name of the signature and { e 1 ... e n ...} is a set of signature expressions , which can be empty, finite, or countably infinite. But now a signature expression can be not only a base expression or an arrow expression; it can, in addition, be a slotted arrow expression . A slotted arrow expression is a statement of the form
(γ
1
→κ
1
...
γ
n
→κ
n
)
⇒
κ,
where
n≥0.
The
order
of
arguments
in
a
slotted
signature
is
assumed
to
be
immaterial,
so
any
permutation
of
arguments
is
assumed
to
yield
the
same
signature
expression.
For
instance,
(
i
→
i
i
→
i
)
⇒
i
is
a
slotted
arrow
signature
expression.
By
analogy
with
earlier
definitions,
if
κ
n
is
bool
then
the
expression
is
also
a
slotted
Boolean
signature
expression
.
For
instance,
(
i
→
i
i
→
i
)
⇒
bool
is
a
slotted
Boolean
signature
expression.
A term t ( p 1 → v 1 ... p n → v n ) is a well-formed slotted term with signature σ if
Each p i is a well-formed term with some signature γ i .
Each v i is a well-formed term (slotted or non-slotted) with some signature σ i , i=1, ..., n.
The signature of t includes the arrow expression (γ 1 # →σ 1 # ... γ n # →σ n # ) ⇒ σ # .
Recall that if σ is a signature then σ # denotes its name.
The aforesaid term t is a well-formed slotted formula if σ is bool { }.
A slotted formula is just like an atomic formula of Section Formal Preliminaries except that the arguments of the predicate are named and their order is considered immaterial. This is analogous to names of columns in tables in relational databases.
Frames. A well-formed frame formula is one of the following:
Membership formula : o#c , where o , c are well-formed terms.
Informally,
such
a
formula
says
that
object
o
is
a
member
of
class
c
.
Like
in
RDF,
RIF
classes
Classes
are
also
objects.
For
instance,
they
can
be
members
of
some
other
classes.
(In
object-oriented
languages
these
latter
classes
are
known
as
meta-classes.)
Subclass formula : s##c , where s , c are well-formed terms.
Informally, this formula states that class s is a member of class c .
Frame : t [ p 1 -> v 1 ... p n -> v n ], where t is a well-formed term, a membership formula, or a subclass formula; p 1 , ..., p n are well-formed terms; and each v 1 , ..., v n is either a well-formed term or a well-formed frame formula.
When t , all the p i , and all the v i are terms, such a formula should be understood as a statement that object t has properties p i , ..., p n , and for each i = 1,...,n, the value of property p i is a set that contains the object v i . When t , p i , or v i are not just terms but frame formulas themselves, the above is treated as a conjunction of simpler frame formulas, as defined by the unnest transformation in the section on semantics (below).
Atomic formulas and general formulas. The syntax for atomic formulas is extended with slotted formulas and frame formulas . More precisely, an atomic well-formed formula is
An atomic well-formed formula as defined in Section Formal Preliminaries ; or
The notion of well-formed general formulas needs no adjustments with respect to its earlier definition (without frames). As before, such formulas are constructed from well-formed atomic formulas using logical connectives ∧, ∨ etc.
Extended signatures for RIF's Basic Logic Dialect. Section Formal Preliminaries defined the allowed signatures for the basic logic dialect of RIF. With the introduction of slotted signatures, we augment the set of allowed signatures in the basic dialect as follows:
For every n≥0 there are signatures fs n {( i → i ... i → i ) ⇒ i } and ps n {( i → i ... i → i ) ⇒ bool } (in both cases the number of arguments inside the parentheses is n).
In addition, RIF's Basic Logic Dialect imposes the following restrictions:
they must have the form s , where s is a constant symbol.
In frame formulas, all terms must have the signature i { }.
Therefore, in the frame formulas t # s , t ## s , or t [ r → s ], the terms t , s , and r cannot have the signature bool { } -- they can represent individuals, but not predicates.
To compare two approaches at [ F2F7 ], we represent the abstract syntax of the RIF Slotted Condition Language both in Abstract Syntax Notation and in Abstract EBNF Syntax.
The abstract syntax of the slotted condition language is given in asn06 as follows:
class CONDITION
subclass And
property formula : list of CONDITION
subclass Or
property formula : list of CONDITION
subclass Exists
property declare : list of Var
property formula : CONDITION
subclass ATOMIC
class ATOMIC
subclass Uniterm
subclass Equal
subclass CLASSIFICATION
subclass Frame
class Uniterm
property op: Const
property
subproperty arg: list of TERM
subproperty slot: list of list Const TERM
class Equal
property side: list TERM TERM
class CLASSIFICATION
subclass Instance
property oid: TERM
property op: TERM
subclass Subclass
property sub: TERM
property op: TERM
class Frame
property
subproperty oid: TERM
subproperty classinst: CLASSIFICATION
property slot: list of list TERM TERMFRAME
class TERMFRAME
subclass TERM
subclass Frame
class TERM
subclass Const
property name: CONSTNAME
property type: TYPENAME?
subclass Var
property name: VARNAME
subclass Uniterm
The above asn06 is generalized with a fixed-arity list constructor (complementing the unfixed-arity list of constructor) so that the multiplicity of the side property (role) of the Equal class no longer requires the assumption of being exactly 2 (the 2 TERM s are just list ed). Also, optional of and subproperty constructs are assumed.
The abstract syntax of the slotted condition language is given in Abstract EBNF Syntax as follows:
CONDITION ::= 'And' '(' ('formula ->' CONDITION)* ')'
|
'Or' '(' ('formula ->' CONDITION)* ')'
|
'Exists' '(' ('declare ->' Var)+ 'formula ->' CONDITION ')'
|
ATOMIC
ATOMIC ::= Uniterm
|
Equal
|
CLASSIFICATION
|
Frame
Uniterm ::= 'Uniterm' '('
'op ->' Const ( ('arg ->' TERM)* | ( 'slot ->' '(' Const TERM ')' )* )
')'
Equal ::= 'Equal' '(' 'side ->' TERM 'side ->' TERM ')'
CLASSIFICATION
::= 'Instance' '(' 'oid ->' TERM 'op ->' TERM ')'
|
'Subclass' '(' 'sub ->' TERM 'op ->' TERM ')'
Frame ::= 'Frame' '('
( 'oid ->' TERM | 'classinst ->' CLASSIFICATION )
( 'slot ->' '(' TERM (TERM | Frame) ')' )*
')'
TERM ::= 'Const' ( '(' 'type ->' TYPENAME ')' )? '(' CONSTNAME ')'
|
'Var' '(' VARNAME ')'
|
Uniterm
Note that the name and filler of a slot are defined as a sequence of two classes. Also, a Const can have an optional type property. To maintain a clean separation of concerns, the unorderedness of slots within Uniterms and Frames is taken care of by the semantics in Section Semantics .
The above abstract syntax can be described by a UML diagram as shown below.
Upload new attachment "SlottedConditionModel.png"
The syntactic classes are partitioned into classes that will not be visible in serializations (written in all-uppercase letters) and classes that will be visible in instance markup (written with a leading uppercase letter only).
The three classes Var , CONDITION , and ATOMIC will be required in the abstract syntax of Horn Rules .
The
abstract
syntax
of
Section
Abstract
Syntax
can
be
lowered
to
a
concrete
presentation
syntax
which
is
an
extension
to
the
concrete
presentation
syntax
of
Positive
Conditions
for
slotted
Uniterms.
The
Concrete
Presentation
EBNF
Syntax
below,
which
lowers
the
abstract
syntax,
is
used
throughout
this
document
to
explain
and
illustrate
the
main
ideas.
The
concrete
human-readable
human-oriented
presentation
syntax,
described
in
this
EBNF
(usual
except
for
whitespace
handling),
is
work
in
progress
and
under
discussion.
CONDITION ::= 'And' '(' CONDITION* ')' |
'Or' '(' CONDITION* ')' |
'Exists' Var+ '(' CONDITION ')' |
ATOMIC
ATOMIC ::= Uniterm | Equal | CLASSIFICATION | Frame
Uniterm ::= Const '(' (TERM* | (Const '->' TERM)*) ')'
Equal ::= TERM '=' TERM
CLASSIFICATION ::= TERM '#' TERM | TERM '##' TERM
Frame ::= (TERM | CLASSIFICATION) '[' (TERM '->' (TERM | Frame))* ']'
TERM ::= Const | Var | Uniterm
Const ::= CONSTNAME | '"'CONSTNAME'"''^^'TYPENAME
Var ::= '?'VARNAME
A Uniterm () -applies a Const to positional TERM arguments or to slotted Const -> TERM arguments. A CLASSIFICATION specifies that one object is a member (in case of the #-connective) or a subclass (in case of the ##-connective) of another object (classes are treated as objects). A Frame is a TERM or a CLASSIFICATION applied to to slotted TERM -> (TERM | Frame) arguments.
The
lowering
of
the
abstract
syntax
to
this
concrete
the
presentation
syntax
can
again
be
done
via
automatic
EBNF-to-EBNF
mapping.
For
this,
the
mapper,
abs2con4g
,
and
the
table,
class2token
,
of
Section
Positive
Conditions
are
extended.
class2token('slot','->')
class2token('Instance','#')
class2token('Subclass','##')
abs2con4g(Uniterm
|
Equal
|
CLASSIFICATION
|
Frame,
?TokenTable)
=
Uniterm | Equal | CLASSIFICATION | Frame
abs2con4g('Uniterm'
'('
'op ->' Const
(('arg ->' TERM)* | ('slot ->' '(' Const TERM ')')*)
')',
?TokenTable)
=
Const '(' (TERM* | (Const lookup('slot',?TokenTable) TERM)*) ')'
abs2con4g('Instance'
'('
'oid ->' TERM
'op ->' TERM
')'
|
'Subclass'
'('
'sub ->' TERM
'op ->' TERM
')',
?TokenTable)
=
TERM lookup('Instance',?TokenTable) TERM | TERM lookup('Subclass',?TokenTable) TERM
abs2con4g('Frame'
'('
('oid ->' TERM
|
'classinst ->' CLASSIFICATION)
('slot ->' '(' TERM (TERM | Frame) ')')*
')',
?TokenTable)
=
(TERM | CLASSIFICATION) '[' (TERM lookup('slot',?TokenTable) (TERM | Frame))* ']'
Example
1
3
shows
Uniterm
and
Frame
conditions,
the
latter
with
variables
for
the
three
major
(combinations
of)
syntactic
categories,
corresponding
to
the
three
components
of
RDF
triples.
Example 3 (A RIF condition with bound variables)
RIF conditions using
Positional Uniterms:
book("http://example.com/authors#rifwg"^^rif:iri "http://example.com/books#LeRif"^^rif:iri)
Exists ?X (book(?X LeRif))
Slotted Uniterms:
book(author->"http://example.com/authors#rifwg"^^rif:iri title->"http://example.com/books#LeRif"^^rif:iri)
Exists ?X (book(author->?X title->"http://example.com/books#LeRif"^^rif:iri))
Frames:
wd1[author->"http://example.com/authors#rifwg"^^rif:iri title->"http://example.com/books#LeRif"^^rif:iri]
Exists ?X (wd2[author->?X title->"http://example.com/books#LeRif"^^rif:iri])
Exists ?X (wd2#book[author->?X title->"http://example.com/books#LeRif"^^rif:iri])
Exists ?I ?X (?I[author->?X title->"http://example.com/books#LeRif"^^rif:iri])
Exists ?I ?X (?I#book[author->?X title->"http://example.com/books#LeRif"^^rif:iri])
Exists ?S (wd2[author->"http://example.com/authors#rifwg"^^rif:iri ?S->"http://example.com/books#LeRif"^^rif:iri])
Exists ?X ?S (wd2[author->?X ?S->"http://example.com/books#LeRif"^^rif:iri])
Exists ?I ?X ?S (?I#book[author->?X ?S->"http://example.com/books#LeRif"^^rif:iri])
The following is a possible XML-serializing mapping of the abstract syntax in Section Abstract Syntax , extending the one in Section Positive Conditions (see also Appendix Specification ).
- And (conjunction)
- Or (disjunction)
- Exists (quantified formula for 'Exists', containing declare and formula roles)
- declare (declare role, containing a Var)
- formula (formula role, containing a CONDITION formula)
- Uniterm (positional or slotted Uniterm formula)
- Instance (instance-of formula)
- Subclass (subclass-of formula)
- Frame (slotted Frame formula)
- classinst (Frame role for Subclass or Instance)
- oid (Instance/Frame identifier role, containing a TERM)
- op (Uniterm/Instance/Subclass predicate/class role)
- sub (Subclass role for included class)
- slot (Uniterm/Frame slot role, prefix version of slot infix '->')
- Equal (prefix version of term equation '=')
- side (left-hand and right-hand side role)
- Const (slot, individual, function, or predicate symbol)
- Var (logic variable)
The following example illustrates XML serialization of Slotted RIF conditions.
Example 4 (A RIF condition and its XML serialization): a. RIF condition: And ( Exists ?Buyer ?P (?P#purchase[buyer->?Buyer seller->?Seller item->book(author->?Author title->"http://example.com/books#LeRif"^^rif:iri) price->49 currency->"http://example.com/currencies#USD"^^rif:iri]) ?Seller=?Author ) b. XML serialization: <And> <formula> <Exists> <declare><Var>Buyer</Var></declare> <declare><Var>P</Var></declare> <formula> <Frame> <classinst> <Instance> <oid><Var>P</Var></oid> <op><Const type="rif:local">purchase</Const></op> </Instance> </classinst> <slot><Const type="rif:local">buyer</Const><Var>Buyer</Var></slot> <slot><Const type="rif:local">seller</Const><Var>Seller</Var></slot> <slot> <Const type="rif:local">item</Const> <Uniterm> <op><Const type="rif:local">book</Const></op> <slot><Const type="rif:local">author</Const><Var>Author</Var></slot> <slot><Const type="rif:local">title</Const><Const type="rif:iri">http://example.com/books#LeRif</Const></slot> </Uniterm> </slot><slot><Const type="rif:local">price</Const><Const type="rif:long">49</Const></slot><slot><Const type="rif:local">price</Const><Const type="xsd:long">49</Const></slot> <slot><Const type="rif:local">currency</Const><Const type="rif:iri">http://example.com/currencies#USD</Const></slot> </Frame> </formula> </Exists> </formula> <formula> <Equal> <side><Var>Seller</Var></side> <side><Var>Author</Var></side> </Equal> </formula> </And>
The syntax of RIF frames permits nesting of two kinds. First, a classification formula of the form obj1#obj2 or obj1##obj2 can appear in the object position of a frame. Second, a frame may appear in the value position of an attribute. This nested notation is convenient and allows succinct representation of object properties, but is no more than a shorthand notation. A nested frame represents a conjunction of flat frames. For instance,
a#b[c -> e##f[g -> h]] == a#b /\ a[c -> e] /\ e##f /\ e[g -> h]
Formally, given a frame, f , we define the Unnest transformation and postulate f to be true in a semantic structure iff Unnest(f) is true. In this way, we reduce the semantics of nested frames to that of flat frames. Then we extend the basic semantic structures defined in Model Theory for the Core RIF Condition Language and define an interpretation for flat frames. The rest of the semantic definitions does not change, since it is defined in terms of atomic formulas (the ATOMIC production in the BNF syntax).
If
a
formula,
f
,
has
the
form
a#b
or
a##b
then
define
Obj(f)
to
be
a
.
If
f
has
the
form
o[a->v]
,
where
o
,
a
,
and
v
are
terms,
then
Obj(f)
is
recursively
defined
to
be
Obj(o)
.
Now,
if
f
is
a
uniterm
term
then
we
define
Unnest(f) =
true
,
where
true
is
a
formula
that
is
always
true
(a
tautology,
a
formula
whose
truth
value
is
t
).
If
f
is
a
classification
formula
then
Unnest(f) = f
.
Otherwise,
if
f
is
a
frame
of
the
form
o[a->v]
then
Unnest(f) = Unnest(o) /\ Obj(o)[a -> Obj(v)] /\ Unnest(v)
For instance, in case of the frame a#b[c->e##f[g->h]] above, unnesting yields:
Unnest(a#b[c -> e##f[g -> h]]) = a#b /\ a[c -> e] /\ e##f /\ e[g -> h] /\ true
This is almost the same conjunction as we have seen earlier. The only difference is the trailing true conjunct, which comes from Unnest(h) and can be omitted.
A semantic structure , I , is a tuple of the form
< D , I C , I V , I F , I R , I slot , I SF , I SR , I sub , I isa >.
All
the
components
except
the
last
five,
I
slot
,
I
SF
,
I
SR
,
I
sub
,
I
isa
,
are
the
same
as
before.
The
new
mapping
I
slot
is
used
to
interpret
frames;
the
mappings
I
SR
SF
and
I
SR
interpret
terms
and
predicates
with
named
arguments,
respectively;
I
sub
i
gives
meaning
to
the
subclass
hierarchy;
and
I
isa
interprets
class
membership.
I slot is a function from the domain D to truth-valued functions of the form D × D → TV . The intuitive meaning of frame slots is that they are functions that take objects and return sets of objects, where every member of the set has a certain degree of truth. In the two-valued case, a set of objects associated with the truth value true represents the (set of) values that the slot returns when applied to an object. Formally, this is expressed by extending I Truth to flat frames as follows:
I Truth ( T[p 1 ->V 1 ... p k ->V k ] ) = min i=1...k ( I Truth ( T[p i ->V i ] )), where T , p i , and V i are terms.
I Truth ( T[p->V ]) = I slot ( I ( p ))( I ( T ), I ( V ))
I
SF
interprets
terms
with
named
arguments.
It
is
a
function
from
Const
to
the
mappings
SetOfFiniteSubbags
(
D
×
D
)
→
D
.
This
is
analogous
to
the
interpretation
of
regular
(positional)
predicates
except
for
terms
with
two
differences:
Each
pair
<s,v>
∈
D
×
D
represents
a
slot
name
-
slot
value
slot-name/slot-value
pair
instead
of
just
a
value
in
positional
terms.
Bags
are
used
here
because
because,
for
slotted
terms
terms,
the
order
of
slot-value
slot/value
pairs
does
not
matter,
is
immaterial,
but
sets
cannot
be
used,
since
I
may
happen
to
map
different
slots
into
the
same
value
in
D
.
We can now extend the mapping I from Section Model Theory for RIF's Basic Condition Language as follows (where only the last item is new with respect to the earlier definition of I ):
I (k) = I C (k) if k is a symbol in Const
I (?v) = I V (?v) if ?v is a variable in Var
I (f(t 1 ... t n )) = I F (f)( I (t 1 ),..., I (t n ))
I (f( p 1 -> t 1 ... p n -> t n )) = I SF (f)({ < I ( p 1 ), I (t 1 ) > ,..., < I ( p n ), I (t n ) > })
I SR is used to interpret predicates with slotted arguments. It is a function from the set Const to truth-valued mappings SetOfFiniteSubbags ( D × D ) → TV . This is analogous to the interpretation of regular (positional) predicates except for two differences:
Each
pair
<s,v>
∈
D
×
D
represents
a
slot
name
-
slot
value
slot-name/slot-value
pair
instead
of
just
a
value
in
positional
predicates.
Bags (also known as multisets ) are used here because for slotted predicates the order of slot-value pairs does not matter, but sets cannot be used, since I may happen to map different slots into the same value in D .
We can now define I Truth for slotted predicates as follows:
I
Truth
(p(
p
1
->val
1
...
p
k
->val
k
))
=
I
SR
(p)({
<
I
(
p
1
)
I
(
val
1
)
>
,
...
,
<
I
(
p
k
)
I
(
val
k
)
>
}),
where
p
∈
Const
,
and
p
i
∈
Const
and
val
i
are
terms.
is
a
term
for
i=1,...k
.
Here
we
use
{...}
to
denote
bags.
I sub gives meaning to the subclass relationship. It is a truth-valued function D × D → TV . The truth valuation for classification formulas of the form sc##cl , where sc and cl are terms, is defined as follows:
I Truth ( sc##cl ) = I sub ( I ( sc ), I ( cl ))
In addition, we want the operator ## to be transitive, i.e., we would like c1##c2 and c2##c3 to imply c1##c3 . To this end, we require that the mapping I sub defines a partial order on D . More precisely,
For all elements ec1 , ec2 , ec3 ∈ D , the following must hold: min t ( I sub ( ec1 , ec2 ), I sub ( ec2 , ec3 )) ≤ t I sub ( ec1 , ec3 )
where ≤ t denotes the truth order on TV (see Section Model Theory for RIF's Basic Condition Language ) and min t is the minimum with respect to that truth order.
I isa gives meaning to class membership. It is a truth-valued function D × D → TV . The truth valuation for classification formulas of the form o#cl , where o and cl are terms, is defined as follows:
I Truth ( o#cl ) = I isa ( I ( o ), I ( cl ))
We also want # and ## to have the usual property that all members of a subclass are also members of the superclass, i.e., we want o#cl and cl##scl to imply o#scl . To this end, we introduce the following restriction on the mappings I isa and I sub :
For all elements eo , ec , es ∈ D , the following must hold: min t ( I isa ( eo , ec ), I sub ( ec , es )) ≤ t I isa ( eo , es )
where ≤ t and min t are as before.
MK: ***** Need to decide if we want to extend signatures to restrict what can appear in the object/attribute/value places in a frame ********
(Editor's Note: This text is maintained on wiki page List Constructor ).
This page is not up-to-date, and currently acts as a place-holder only.
In order to represent lists as Lisp-like dotted pairs (head . tail) or Prolog-like lists [head | tail] , a refinement of
Const '(' TERM* ')'
from Positive Conditions , namely the specialized syntax
'List' '(' TERM TERM ')'
is used by employing a distinguished binary function symbol, List , and a distinguished constant symbol, Nil . Besides dealing with these two distinguished symbols, nothing else changes (especially, in the semantics).
The abstract syntax of lists extends the class TERM in Positive Conditions by a class LIST as follows:
class TERM
subclass Var
property name: xsd:string
subclass Const
property name: xsd:string
subclass Uniterm
property op: Const
property arg: list of TERM
subclass LIST
subclass Nil
subclass PAIR
property arg1: TERM
property arg2: TERM
Here, Nil is a special singleton class. The properties ("role stripes") arg1 and arg2 at the same time suggest minimal use of positional information in concrete syntaxes: the natural order of the two elements of a LIST that is a PAIR . This will considerably support readability of the concrete syntaxes (see below). There is an obvious 1-to-1 mapping between the above LIST class and the following earlier-considered LIST class:
subclass LIST
subclass Nil
subclass PAIR
property first: TERM
property rest: TERM
The abstract syntax is visualized by a UML diagram. (***TBD***)
Upload new attachment "ListConditionModel.png"
The abstract syntax of Section Abstract Syntax can again be instantiated to a concrete EBNF syntax.
For this, TERM s, in Positive Conditions defined as
TERM ::= Const | Var | Uniterm
are extended to
TERM ::= Const | Var | Uniterm | LIST
with
LIST ::= 'Nil' | 'List' '(' TERM TERM ')'
For example, the Prolog list [a,Y,c] or [a | [Y | [ c | []]]] becomes the List nesting
List(
a
List(
?Y
List(
c
Nil
)
)
)
The following is a possible XML-serializing mapping of the abstract syntax in Section Abstract Syntax (and of the above EBNF syntax):
<List>
<Const>a</Const>
<List>
<Var>Y</Var>
<List>
<Const>c</Const>
<Nil/>
</List>
</List>
</List>
This shortens an equivalent earlier-considered version using explicit arg stripes:
<List>
<arg index="1"><Const>a</Const></arg>
<arg index="2">
<List>
<arg index="1"><Var>Y</Var></arg>
<arg index="2">
<List>
<arg index="1"><Const>c</Const></arg>
<arg index="2"><Nil/></arg>
</List>
</arg>
</List>
</arg>
</List>
Lists can be decomposed using syntactic equality (unification). For example, [?head | ?tail] = [a ?Y c] :
<List>
<Var>head</Var>
<Var>tail</Var>
</List>
unifies with
<List>
<Const>a</Const>
<List>
<Var>Y</Var>
<List>
<Const>c</Const>
<Nil/>
</List>
</List>
</List>
by binding
<Var>head</Var> to
<Const>a</Const>
and
<Var>tail</Var>
to
<List>
<Var>Y</Var>
<List>
<Const>c</Const>
<Nil/>
</List>
</List>
Nested Prolog Lists such as [a,[X,b],c] or [a | [[X | [b | []]] | [ c | []]]] are of course allowed:
<List>
<Const>a</Const>
<List>
<List>
<Var>X</Var>
<List>
<Const>b</Const>
<Nil/>
</List>
</List>
<List>
<Const>c</Const>
<Nil/>
</List>
</List>
</List>
The mapping to RDF lists is even more direct than in OWL 1.1 , since its source are binary List constructor applications rather than n-ary SEQ constructor applications.
(Editor's Note: This text is maintained on wiki page RIF Rule Language ).
This section develops a RIF Rule Language by extending the RIF Condition Language , where conditions become rule bodies. RIF Phase I covers only Horn Rules and a number of extensions that do not increase the expressive power of the language. The envisioned RIF dialects will extend the BLD rule language by generalizing the positive RIF conditions and by other means.
(Editor's Note: This text is maintained on wiki page Horn Rules ).
This section defines Horn rules for RIF Phase 1. The syntax and semantics incorporates RIF Positive Conditions defined in Section Positive Conditions .
To compare two approaches at [ F2F7 ], we represent the abstract syntax of the RIF Horn Rule Language both in Abstract Syntax Notation and in Abstract EBNF Syntax.
The abstract syntax of the Horn rule language extending Positive Conditions is given in asn06 as follows:
class Ruleset
property formula : list of RULE
class RULE
subclass Forall
property declare : list of Var
property formula : CLAUSE
class CLAUSE
subclass ATOMIC
subclass Implies
class Implies
property if: CONDITION
property then: ATOMIC
The abstract syntax of the Horn rule language extending Positive Conditions is specified using Abstract EBNF Syntax as follows:
Ruleset ::= 'Ruleset' '(' ('formula ->' RULE)* ')'
RULE ::= 'Forall' '(' ('declare ->' Var)* 'formula ->' CLAUSE ')'
CLAUSE ::= ATOMIC | Implies
Implies ::= 'Implies' '(' 'if ->' CONDITION 'then ->' ATOMIC ')'
This syntax can be represented using a UML diagram, which extends the diagram shown in Section Positive Conditions as shown below.
The class Ruleset contains zero or more RULE s, which are universally quantified RIF CLAUSE s. The class Forall is specified through its parts, i.e., zero or more variable ( Var ) declarations and a CLAUSE formula, which can be an Implies or an ATOMIC formula. The Implies class distinguishes if - CONDITION from then - ATOMIC parts.
The classes Var , CONDITION , and ATOMIC were specified in the abstract syntax of Positive Conditions .
The combined RIF BLD abstract syntax and its visualization are given in Appendix Specification .
The
abstract
syntax
of
Section
Abstract
Syntax
can
again
be
lowered
to
a
concrete
presentation
syntax.
The
concrete
human-readable
human-oriented
presentation
syntax,
described
in
this
Concrete
Presentation
EBNF
Syntax
extending
extends
the
one
syntax
for
Positive
Conditions
by
with
the
following
productions,
productions.
It
is
work
in
progress
and
under
discussion:
discussion.
Ruleset ::= RULE*
RULE ::= 'Forall' Var* '(' CLAUSE ')'
CLAUSE ::= Implies | ATOMIC
Implies ::= ATOMIC ':-' CONDITION
Var , ATOMIC , and CONDITION were defined as part of the syntax for positive conditions in Positive Conditions . The symbol :- denotes the implication connective used in rules. The statement ATOMIC :- CONDITION should be informally read as if CONDITION is true then ATOMIC is also true. RIF deliberately avoids using the connective ← here because in some RIF dialects, such as LP and PR, the implication :- will have different meaning from the meaning of the first-order implication ←.
Rules
are
generated
by
the
Implies
production.
Facts
are
generated
by
the
ATOMIC
production,
and
can
be
viewed
as
the
then
part
of
an
Implies
with
an
empty
conjunctive
if
(or
with
true
as
the
if
part).
The
CLAUSE
RULE
production
generates
a
universally
closed
rule
or
fact.
A ruleset is a set of RIF rules; it is generated by the production Ruleset .
Note also that, since CONDITION permits disjunction and existential quantification, the rules defined by the Implies production are more general than pure Horn rules. This extension was explained in the introduction to Section Positive Conditions .
Well-formed rules. A rule is well-formed if its ATOMIC part is a well-formed atomic formula and the CONDITION part is a well-formed condition formula.
The
lowering
of
the
abstract
syntax
to
this
concrete
presentation
syntax
can
again
be
done
via
automatic
EBNF-to-EBNF
mapping.
For
this,
the
mapper,
abs2con4g
,
and
the
table,
class2token
,
of
Section
Positive
Conditions
are
extended.
class2token('Implies',':-')
abs2con4g('Ruleset'
'('
('formula ->' RULE)*
')',
?TokenTable)
=
RULE*
abs2con4g('Forall'
'('
('declare ->' Var)*
'formula ->' CLAUSE
')',
?TokenTable)
=
'Forall' Var* '(' CLAUSE ')'
abs2con4g(ATOMIC
|
Implies,
?TokenTable)
=
ATOMIC | Implies
abs2con4g('Implies'
'('
'if ->' CONDITION
'then ->' ATOMIC
')',
?TokenTable)
=
ATOMIC lookup('Implies',?TokenTable) CONDITION
The document RIF Use Cases and Requirements includes a use case "Negotiating eBusiness Contracts Across Rule Platforms", which discusses a business rule slightly modified here:
If an item is perishable and it is delivered to John more than 10 days after the
scheduled delivery date then the item will be rejected by him.
In
the
Concrete
Presentation
EBNF
Syntax
used
throughout
this
document,
this
rule
can
be
written
in
one
of
these
two
equivalent
ways:
Example 3 (A RIF rule in the presentation syntax)
a. Universal form:
Forall ?item ?deliverydate ?scheduledate ?diffdate
(
reject("http://example.com/people#John"^^rif:iri ?item) :-
And ( perishable(?item)
delivered(?item ?deliverydate "http://example.com/people#John"^^rif:iri)
scheduled(?item ?scheduledate)
timediff(?diffdate ?deliverydate ?scheduledate)
greaterThan(?diffdate 10) )
)
b. Universal-existential form:
Forall ?item
(
reject("http://example.com/people#John"^^rif:iri ?item) :-
Exists ?deliverydate ?scheduledate ?diffdate
(
And ( perishable(?item)
delivered(?item ?deliverydate "http://example.com/people#John"^^rif:iri)
scheduled(?item ?scheduledate)
timediff(?diffdate ?deliverydate ?scheduledate)
greaterThan(?diffdate 10) )
)
)
In form (a), all variables are quantified universally outside of the rule. In form (b), the variables that do not appear in the rule then part are instead quantified existentially in the if part. It is well-known that these two forms are logically equivalent.
The following extends the XML syntax of Positive Conditions , by serializing the above Abstract EBNF Syntax of RIF Horn rules in XML. The Forall element contains the role elements declare and formula , which were earlier used within the Exists element of Positive Conditions . The Implies element contains the role elements if and then to designate these two parts of a rule (see also Appendix Specification ).
- Ruleset (collection of rules)
- Forall (quantified formula for 'Forall', containing declare and formula roles)
- Implies (implication, containing if and then roles)
- if (antecedent role, containing CONDITION)
- then (consequent role, containing ATOMIC)
For instance, the rule in Example 3a can be serialized in XML as shown below as the first element of a rule set whose second element is a business rule for Fred.
Example 4 (A RIF rule set in XML syntax)
<Ruleset>
<formula>
<Forall>
<declare><Var>item</Var></declare>
<declare><Var>deliverydate</Var></declare>
<declare><Var>scheduledate</Var></declare>
<declare><Var>diffdate</Var></declare>
<formula>
<Implies>
<if>
<And>
<formula>
<Uniterm>
<op><Const type="rif:local">perishable</Const></op>
<arg><Var>item</Var></arg>
</Uniterm>
</formula>
<formula>
<Uniterm>
<op><Const type="rif:local">delivered</Const></op>
<arg><Var>item</Var></arg>
<arg><Var>deliverydate</Var></arg>
<arg><Const type="rif:iri">http://example.com/people#John</Const></arg>
</Uniterm>
</formula>
<formula>
<Uniterm>
<op><Const type="rif:local">scheduled</Const></op>
<arg><Var>item</Var></arg>
<arg><Var>scheduledate</Var></arg>
</Uniterm>
</formula>
<formula>
<Uniterm>
<op><Const type="rif:local">timediff</Const></op>
<arg><Var>diffdate</Var></arg>
<arg><Var>deliverydate</Var></arg>
<arg><Var>scheduledate</Var></arg>
</Uniterm>
</formula>
<formula>
<Uniterm>
<op><Const type="rif:iri">http://www.w3.org/2007/rif/builtin/greaterThan</Const></op>
<arg><Var>diffdate</Var></arg>
<arg><Const type="xsd:long">10</Const></arg>
</Uniterm>
</formula>
</And>
</if>
<then>
<Uniterm>
<op><Const type="xsd:long">reject</Const></op>
<arg><Const type="rif:iri">http://example.com/people#John</Const></arg>
<arg><Var>item</Var></arg>
</Uniterm>
</then>
</Implies>
</formula>
</Forall>
</formula>
<formula>
<Forall>
<declare><Var>item</Var></declare>
<formula>
<Implies>
<if>
<Uniterm>
<op><Const type="rif:local">unsolicited</Const></op>
<arg><Var>item</Var></arg>
</Uniterm>
</if>
<then>
<Uniterm>
<op><Const type="rif:local">reject</Const></op>
<arg><Const type="rif:iri">http://example.com/people#Fred</Const></arg>
<arg><Var>item</Var></arg>
</Uniterm>
</then>
</Implies>
</formula>
</Forall>
</formula>
</Ruleset>
Section Positive Conditions defined the notion of semantic structures and how such structures determine truth values of RIF conditions. The current section defines what it means for such a structure to satisfy a rule.
While semantic structures can be multivalued in RIF dialects that extend the BLD, rules are typically two-valued even in dialects that support inconsistency and uncertainty. Consider a rule of the form Q then :- if , where Q is a quantification prefix for all the variables in the rule. For the Horn subset, Q is a universal prefix, i.e., all variables in the rule are universally quantified outside of the rule. We first define the notion of rule satisfaction without the quantification prefix Q :
I |= then :- if |
iff I Truth ( then ) ≥ t I Truth ( if ). (Recall that the set of truth values, TV , has a partial order ≥ t .)
We define I |= Q then :- if iff I* |= then :- if for every I* that agrees with I everywhere except possibly on some variables mentioned in Q . In this case we also say that I is a model of the rule . I is a model of a rule set R , denoted I |= R , if it is a model of every rule in the set, i.e., if it is a semantic structure such that I |= r for every rule r ∈ R .
The above defines the semantics of RIF BLD using standard first-order semantics for Horn clauses. Various RIF dialects will extend this semantics in the required directions. Some of these extended semantics might not have a model theory (for example, production rules) and some will have non-first-order semantics. However, all these extensions are required to be compatible with the above definition when the rule set is completely covered by RIF BLD. For further details on defining the semantics for RIF dialects see end note on intended models for rule sets .
We will now define what it means for a set of rules to entail a RIF condition. Let S be a RIF ruleset and φ a closed RIF condition (a condition with no free variables). Then we say that S entails φ, written as
S |= φ |
if for every semantic structure I , such that I |= S , it is the case that I truth (φ)= t .
(Editor's Note: This text is maintained on wiki page RIF Compatibility ).
The compatibility of RIF BLD is currently focussed on the Semantic Web standards OWL and RDF . The section RIF-OWL Compatibility provides a number of starting points for compatibility with OWL. The section RIF-RDF Compatibility provides a concrete specification of RDF compatibility.
In future versions, these two sections may be moved into a separate deliverable about RDF and OWL compatibility.
(Editor's Note: This text is maintained on wiki page RIF-OWL Compatibility ).
In its current version this page lists a number of possible approaches to combining RIF (BLD) and OWL. It is meant to evolve to a specification of OWL compatibility for BLD.
Besides this page, the working group might create a page which exemplifies typical usage patterns when combining OWL and rules.
This section, together with the mentioned usage patterns, are expected to be included in Recommendation-track deliverable on using RIF in combination with OWL, as called for by the working group charter .
There are some inherent discrepancies between OWL DL and rules languages, as pointed out on the OWL Compatibility page.
In order to specify OWL compatibility the working group needs to make a number of decisions, outlined below.
OWL specifies three increasingly expressive species, namely Lite, DL, and Full.
OWL Lite is a syntactic subset of DL, but the semantics is the same. The semantics of DL and Full are somewhat different; the latter is an extension of the semantics of RDFS, whereas the former has a direct semantics. The OWL semantics specification also defines an RDF-compatible semantics for OWL DL, but the direct semantics is normative. The OWL Full semantics does not directly extend the RDF-compatible OWL DL semantics; however, every OWL DL entailment is an OWL Full entailment.
Finally, the OWL DL syntax does not extend the RDF syntax, but rather restricts it; every OWL DL ontology is an RDF graph. OWL Full and RDF have the same syntax.
Note, however, that the RDF-compatible semantics of OWL DL is defined for arbitrary RDF graphs, not just OWL DL ontologies.
*** The working group will have to decide whether to specify compatibility with both OWL DL and Full, or with just one of the two. If the working group decides to specify compatibility with OWL DL, then the working group has to decide whether to be compatible with the direct or the RDF-compatible semantics of OWL DL; see also the subsection "OWL DL" in this section "semantic correspondence" *** |
The correspondence between URI references and literals in OWL and constant symbols in RIF can and should be defined in the same way as in RDF Compatibility . The only question which remains is how the correspondence of atomic formulas is defined.
We distinguish between syntactic correspondence for the cases of combination with OWL Full and OWL DL, respectively.
For the case of compatibility with OWL Full, syntactic correspondence should be defined in the same way as in RDF Compatibility , because the syntax is the same as that of RDF: a triple s p o . corresponds to an atomic formula s'[p' -> o'] , where s' , p' , and o' are RIF symbols corresponding to the RDF symbols s , p , and o , respectively.
In the case of compatibility with OWL DL, there seem to be two reasonable alternatives:
Correspondence is defined with respect to the RDF representation of OWL DL. In this case, correspondence of atomic formulas would be defined in the same way as for OWL Full, i.e. a triple s p o . corresponds to an atomic formula s'[p' -> o'] .
*** Their is a slight discrepancy between the abstract syntax of OWL DL and the RDF graph representation of it. Namely, the RDF representation is required to fulfill certain disjointness conditions which are not required in the abstract syntax. If we go for correspondence with the abstract syntax, we probably want to require these conditions to hold. *** |
In case option 2 is chosen, it might be necessary, depending on the choice of the semantics of the combination, to restrict the syntactic shape of the RIF rules (e.g. forbid formulas of the form a[rdf:type -> ?x , because ?x is a variable occurring in the class position, which is generally incompatible with the direct semantics of OWL DL).
Note that option 1 might also require certain syntactical restrictions in the rules, e.g. because OWL DL makes a distinction between object and datatype properties. These restrictions would be less severe than the ones required for option 2, and are probably only necessary for reasoning purposes.
Note, however, that such restrictions are not necessary if the combined semantics uses the RDF-compatible semantics of OWL DL, because this semantics is defined for all RDF graphs. In that case, no guarantees whatsoever can be given about reasoning.
*** If the working group decides to define compatibility with OWL DL, then the working group has to decide between options 1 and 2. The main arguments for option 1 are that (i) it is more natural with respect to the abstract syntax of OWL, (ii) more familiar and easier to implement the for people with an understanding of Description Logics, and (iii) it might require fewer syntactic restrictions in the RIF rules in order to allow effective processing. The main argument for option 2 is that it is consistent with the correspondence defined for RDF and (possibly) OWL Full, allowing the (to some extent) uniform processing of RDF, OWL DL and OWL Full. *** |
[here, the tough choices need to be made]
Many approaches for the combination of Description Logics and rules have been described in the academic literature. Most of these approaches are applicable to OWL DL or subsets thereof. For a brief overview of some of the approaches, see the corresponding section on the OWL Compatibility page . We are currently not aware of any approach to combining OWL Full with rules; there is an approach to combining a subset of OWL Full, called pD*-entailment, with rules, described in [1].
A straightforward approach to combining OWL DL and RIF is to define a notion of combined models and define appropriate restrictions on RIF rules, similar to the SWRL proposal . Such a combination is hard to process in the general case (reasoning is undecidable). There are, however, several known decidable subsets which impose certain restrictions either on the OWL ontology or the RIF rules. For example, direct rule-based processing can be done by suitably restricting the OWL component to a so-called DLP subset.
This approach is, however, not straightforward to extend to the case of rules with negation. Several approaches has been developed for this case, the two most prominent being [2], in which the notion of a combined model is defined, and [3], in which an external DL knowledge bases "queried" using special query atoms in the body of a rule. The main advantage of the former approach is that there is a tighter integration of the semantics. The main advantage of the latter approach is that processing can be done using out-of-the-box rules and DL engines (with minimal modifications).
For completeness we also mention [4] and [5], in which even tighter integration (than [2]) is proposed, based on the embeddings in nonmonotonic logics. This is probably out of scope for this working group.
Realistically speaking, RIF can define compatibility with OWL Full in the following ways:
Since OWL Full is an extension of RDF, the most straightforward way is to define the combination of RIF and OWL Full as an extension of the combination of RIF and RDF , i.e. based on correspondence of models. However, such a combination would not be suited for rule-based processing; i.e., there is no embedding of such a combination in RIF. If the working group goes for this option, it might recommend users to use only a subset of OWL Full; the working group might even decide to allow (or recommend) reasoners to implement a weaker (intentional) semantics than the normative (extensional) semantics, saying that it is acceptable to have fewer entailments; RIF would recommend to use that subset of OWL Full which can be processed with rule-based reasoners, and say that anyone going beyond that subset is on their own wrt. processing.
It should be noted at this point that most, if not all, currently available OWL Full reasoners are rule-based and implement a semantic (and sometimes also syntactic) subset. Hence, option 1. seems closer to current practice than option 2.
*** The main arguments for option 1 are (i) that it is more elegant, the semantics of the combination extends both the semantics of OWL Full and RIF, (ii) it builds on existing standards, and (iii) it is closer to the current implementation practice. The main arguments for option 2 are (i) that it is easier to implement (assuming there are complete OWL Full reasoners around, which is actually not a very realistic assumption) and (ii) easier to extend towards rule languages with negation. *** |
*** The working group has to decide whether to define correspondence with respect to the (1) direct semantics or the (2) RDF-compatible semantics of OWL DL. The main argument for option 1 is that it is the normative semantics of OWL DL. The main arguments for option 2 are (i) it is closer to the semantics of RDF and OWL Full, thereby enabling more uniform definitions of compatibility and (ii) it is defined for a larger set of RDF graphs, and thus possibly requires fewer restrictions on RIF rule sets in the combination. The decision on this point is strongly related to the decision on syntactic correspondence. If syntactic correspondence is defined based on the abstract syntax, then it seems reasonable to define the semantic correspondence with respect to the direct semantics. If syntactic correspondence is defined with respect to the RDF representation, then option (2) appears more natural. *** |
The other choices regarding semantic compatibility are similar to those for OWL Full:
For both 1a and 1b, the implementor is more-or-less on his own. However, these choices are probably the only ones which fulfill the requirements for the Interchanging Rule Extensions to OWL use case.
*** The main arguments for option 1 are that it is more elegant; the semantics of the combination extends both the semantics of OWL DL and RIF, it builds on existing standards, and is closer to the current implementation practice. The main arguments for option 2 are that it is easier to implement (than 1a and 1b) and easier to extend (than 1a and 1b) towards rule languages with negation. Finally, most people who care about OWL DL find the semantics for the case of option 1 more natural. *** |
[1] Herman J. ter Horst: Combining RDF and Part of OWL with Rules: Semantics, Decidability, Complexity . International Semantic Web Conference 2005: 668-684
[2] Riccardo Rosati. DL+log: Integration of Description Logics and Disjunctive Datalog In Proceedings of the Tenth International Conference on Principles of Knowledge Representation and Reasoning (KR 2006), pages 68-78, 2006. ISBN 978-1-57735-271-6.
[3] Thomas Eiter, Thomas Lukasiewicz, Roman Schindlauer, Hans Tompits: Combining Answer Set Programming with Description Logics for the Semantic Web . KR 2004: 141-151
[4] Boris Motik, Riccardo Rosati: A Faithful Integration of Description Logics with Logic Programming . IJCAI 2007: 477-482
[5] Jos de Bruijn, Thomas Eiter, Axel Polleres, Hans Tompits: Embedding Non-Ground Logic Programs into Autoepistemic Logic for Knowledge-Base Combination . IJCAI 2007: 304-309
(Editor's Note: This text is maintained on wiki page RIF-RDF Compatibility ).
*** The working group is chartered to deliver a Recommendation document about using RIF in combination with OWL. The working group might consider moving the definition of RIF-RDF combinations to that document as well. *** |
*** See http://lists.w3.org/Archives/Public/public-rif-wg/2007Sep/0012.html , http://lists.w3.org/Archives/Public/public-rif-wg/2007Sep/0045.html , and the ensuing threats for a discussion about the definition of the semantics of RIF-RDF combinations. *** |
[examples to be added throughout] |
This section defines combinations of RIF rules with RDF graphs, taking into account the various (normative) entailment regimes defined by RDF. A typical case where RIF rules and RDF graphs are combined is when an RIF rule set refers to one or more RDF data sets or RDFS ontologies (which are also RDF graphs).
Four semantics are defined for the combination, corresponding to the simple, RDF, RDFS, and D (datatype) entailment regimes of RDF, with the sense that D entailment is only defined for a fixed set of datatypes, because the set of data types supported by RIF is fixed.
*** RDF D-entailment extends RDFS entailment with support for an arbitrary number of datatypes, which is given by a datatype map (a mapping from URIs to datatypes). The only requirement on these datatype maps is that the rdf:XMLLiteral datatype must be in there; apart from that, an arbitrary number of datatypes can be used. OWL uses the same mechanism, but additionally requires the datatypes corresponding to xsd:string and xsd:integer to be defined. It is proposed in http://lists.w3.org/Archives/Public/public-rif-wg/2007Sep/0079.html to adopt the same (extensible) mechanism in RIF. *** |
Combinations are pairs of RIF rule sets and sets of RDF graphs. The semantics of combinations is defined in terms of combined models. Intuitively, given an RIF semantic structure, we define the corresponding RDF interpretation. If the RDF interpretation satisfies the RDF graphs, as well as the RIF rule set, then it is a model of the combination. Entailment is defined as model inclusion, as usual.
It turns out that reasoning with combinations can be reduced to reasoning with RIF rule sets, assuming the RDF graphs are finite, thereby showing how translators can take a combination and translate it to a set of rules, to enable reasoning with combinations using a local rule system.
*** Currently, this document only defines how combinations of RIF rule sets and RDF graphs should be interpreted; it does not suggest how such combinations are specified, or exchanged, nor does it specify which of the RDF entailment regimes (simple, RDF, or RDFS) should be used. The preliminary suggestion for the specification of combinations, as well as entailment regimes, through meta data in RIF rule sets was formulated in a preliminary section in the architecture document. Note that no agreement has yet been reached on this issue, and that especially the issue of the specification of entailment regimes is controversial (see http://lists.w3.org/Archives/Public/public-rif-wg/2007Jul/0030.html and the ensuing thread). *** |
An RDF vocabulary V consists of sets of RDF URI references V U , plain literals V PL (i.e. character strings with an optional language tag), and typed literals V TL (i.e. pairs of character strings and URI references, identifying data types), as defined in [RDF-Concepts].
Given an RDF vocabulary V and a set of blank nodes B , disjoint from the symbols in V , an RDF graph is a set of RDF triples s p o . , where s is either a blank node or a URI reference, p is a URI reference, and o is a blank node, a URI reference, a plain literal, or a typed literal; see also RDF-Concepts .
The RDF Core working group has listed two issues questioning the restrictions that literals may not occur in subject and blank nodes may not occur in predicate positions in triples. Anticipating lifting of these restrictions in a possible future version of RDF, we define the notion of generalized RDF graph. We note that the definitions in the RDF semantics document immediately apply to such generalized RDF graphs.
Given an RDF vocabulary V and a set of blank nodes B , disjoint from the symbols in V , a generalized RDF graph is a set of generalized RDF triples s p o . , where s , p and o are blank nodes, URI references, plain literals, or typed literals.
*** Note that our notion of generalized RDF graphs is more liberal than the notion of RDF graphs used by SPARQL; we additionally allow blank nodes and literals in predicate positions. *** |
Even though RDF allows the use of arbitrary datatype URIs in typed literals, not all such datatype URIs are recognized in the semantics. In fact, simple entailment does not recognize any datatype and RDF and RDFS entailment recognize only the XML content datatype (identified with rdf:XMLLiteral ). Furthermore, RDF allows to express typed literals such that the literal string is not in the lexical space of the datatype; such literals are called ill-typed literals. RIF, on the contrary, recognizes a number of different data types, and does not allow to express ill-typed literals. Finally, D-entailment is defined with respect to arbitrary datatype maps (mappings from URIs to datatypes) which include rdf:XMLLiteral , whereas RIF has a fixed number of datatypes, so it essentially has a fixed datatype map. Combinations of RIF with RDF under D-entailment are only defined for the case where D corresponds to the fixed datatype map of RIF.
A datatype map is a mapping from URIs to datatypes. RIF has a fixed number of datatypes it supports. The datatype map of RIF, denoted with D RIF , is the mapping of datatype identifiers to datatypes as described in the section "Symbol Spaces and Primitive Datatypes".
*** If RIF were to adopt datatypes maps in the way they are used in RDF, as proposed in http://lists.w3.org/Archives/Public/public-rif-wg/2007Sep/0079.html , then such combinations could be defined for arbitrary datatype maps. *** |
In order to ensure that typed literals are interpreted according to the datatypes recognized by RIF, which include the datatypes recognized by RDF and RDFS, we define the notions of well-typed and ill-typed literal relative to the data types recognized by RIF.
A typed literal ( s , u ) is a well-typed literal if
u is in the domain of D RIF and s is in the lexical space of D RIF ( u )
u is the URI of a symbol space defined by RIF and s is in the lexical space of the symbol space, or
u is not in the domain of D RIF and is not a symbol space defined by RIF.
Otherwise ( s , u ) is an ill-typed literal .
We define the RIF equivalent of the RDF names, given an RDF vocabulary V .
V U * is obtained from V U by replacing every RDF URI reference URI in V U with "URI"^^rif:iri . |
*** Note that the symbol space of rif:iri , i.e. the set of absolute IRIs, is a subset of the set of RDF URI references that omits spaces. *** |
V PL * is obtained from V PL by replacing every plain literal without a language tag "abc" in V PL with "abc"^^xsd:string and every plain literal with a language tag "abc"@lang in V PL with "abc'@lang"^^rif:text , where abc' is obtained from abc by replacing every occurrence of @ in abc with @@ , thereby escaping the character. |
*** The value space of RDF plain literals without language tags consists of all Unicode strings. Both in the current specification of XML schema datatypes and in the current working draft of XML schema 1.1 data types the value space of the string datatype is restricted to the sequences of Unicode characters excluding the surrogate blocks, FFFE, and FFFF; these characters are not actual Unicode characters, but rather reserved codes in UTF-16 encoding. There are some further differences between the specification of the string datatype in XML schema 1.0 and XML schema 1.1; in the former case, the datatype is based on the Char production in XML 1.0 ; in the latter case, the datatype is based on the Char production in XML 1.1 ; so, the string datatype in 1.1 is a superset of the string datatype in 1.0. All in all, it appears that the value space of the string datatype in XML Schema 1.1 is a superset of the value space of the RDF plain literals without language tags, but there are RDF plain literals which are not strings in XML Schema 1.0. So, if RIF speaks with the XML Schema 1.0 data types, then we have an issue here. See also http://lists.w3.org/Archives/Public/public-rif-wg/2007Sep/0002.html and the subsequent messages in the thread. *** |
*** We equate RDF plain literals with language tags with symbols in the lexical space of the rif:text datatype. We assume that the value space of this datatype consists of all possible pairs of strings, i.e. the value space of xsd:string , and language tags, i.e. a language identifiers as defined by RFC 3066 , which corresponds to the value space of the xsd:language datatype; therefore, it suffers from the same potential issue as plain literals without language tags, pointed out in the preceding discussion. We assume that the lexical space consists of Unicode strings of the form abc@lang , where abc is a string obtained from a string in the lexical space of xsd:string by replacing every @ with @@ , and lang is a string in the lexical space of xsd:language. *** |
V TL * is obtained from V TL by replacing every ill-typed literal ( s , u ) with http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("s"^^u) and including all other typed literals as such. |
*** Because of the difference in interpretation of ill-typed literals, the proposal is to define the corresponding URI for every ill-typed literal. A canonical definition of this corresponding URI enables the reconstruction of the original ill-typed literal, e.g. for round-tripping. Agreement needs to be reached on what this URI looks like. The proposal is: http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("s"^^u) , where uri-encode is a function which appropriately encodes a typed literal; this function is to be defined. For example, an ill-typed literal "a"^^xsd:int could be encoded as http://www.w3.org/2005/rif/rdf-ill-typed-literal/%22a%22%5E%5Ehttp:/www.w3.org/TR/xmlschema-2/#string *** |
*** (blank nodes) It was not necessary to define corresponding symbols for blank nodes, because blank nodes are interpreted locally in an RDF graph. The equivalent of blank nodes in RIF are existentially quantified variables in the body of the rule. For some considerations on the use of blank nodes in rules and exchanging RDF rule languages, see the preliminary section in the architecture document. *** |
An RIF-RDF combination is a tuple C=< R , S >, where R is a Rule set over sets of constant symbols Const and variable symbols Var , where Const and Var are disjoint, and S is a set of RDF graphs of an RDF vocabulary V , where Const is the union of V U * , V PL * , V TL * , and a set of untyped constants.
***Notice that the vocabularies of the RIF rule set (excluding the untyped constants) and the RDF graph are isomorphic. In this way, both semantics (the RIF and the RDF semantics) apply to all symbols in the combination. *** |
In
RIF-RDF
combinations,
there
is
a
correspondence
between
RDF
triples
of
the
form
s p o .
and
RIF
molecules
frames
of
the
form
s'[p' -> o']
,
where
s'
,
p'
,
and
o'
are
RIF
symbols
corresponding
to
the
RDF
symbols
s
,
p
,
and
o
,
respectively.
The semantics of RIF-RDF combinations is defined in terms of common models.
The
RDF
Semantics
document
defines
4
(normative)
kinds
of
interpretations:
simple
interpretations
(which
do
not
pose
any
conditions
on
the
RDF
and
RDFS
vocabularies),
RDF
interpretations
(which
impose
additional
conditions
on
the
RDF
vocabulary),
RDFS
interpretations
(which
impose
additional
conditions
on
the
RDFS
vocabulary),
and
D-interpretations
(which
impose
additional
conditions
on
the
treatment
of
datatypes,
relative
to
a
datatype
map
D).
We
only
treat
the
case
of
D-interpretation
squared
where
the
datatype
map
D
corresponds
to
the
fixed
datatype
map
of
RIF,
D
RIF
.
We define the notion of common interpretation , which is an interpretation of both an RIF rule sets and an RDF graph. This common interpretation is the basis for the semantic definitions in the following sections. A common interpretation extends both an RIF semantic structure and an RDF simple interpretation, thereby guaranteeing that the semantics of the combination is an extension of the semantics of both formalisms.
The
correspondence
between
the
RIF
structures
and
the
RDF
interpretation
is
defined
through
a
number
of
conditions
which
ensure
the
correspondence
in
the
interpretation
of
names
(i.e.
URIs
and
literals)
and
formulas,
i.e.
the
correspondence
between
RDF
triples
of
the
form
s p o .
and
RIF
molecules
frames
of
the
form
s'[p' -> o']
,
where
s'
,
p'
,
and
o'
are
RIF
symbols
corresponding
to
the
RDF
symbols
s
,
p
,
and
o
,
respectively..
As defined in the RDF Semantics specification, a simple interpretation of a vocabulary V is a tuple I=< IR,IP,IEXT,IS,IL,LV >, where
IS is a mapping from the RDF URI references in V into (IR union IP),
IL is a mapping from typed literals in V into IR, and
LV is the set of literal values, which is a subset of IR, and includes all plain literals in V .
A common interpretation is a combination of an RIF semantic structure and an RDF interpretation ( I = < D , I C , I V , I F , I R , I slot , I SF , I SR , I sub , I isa >, I=<IR, IP, IEXT, IS, IL, LV>) such that the following conditions hold:
IR is a subset of D ;
IP
is
a
superset
of
the
set
of
all
k
in
D
such
that
there
exist
a,b
in
D
and
I
slot
(k)(a,b)=1;
(k)(a,b)=t;
(IR union IP) = D ;
LV is a superset of ( D intersection (V d1 union V d2 union ... union V dn )), where d1 ... dn are in the range of D RIF , and V di denotes the value space of a datatype di, for all 1 ≤ i ≤ n;
IEXT(k)
=
the
set
of
all
pairs
(a,
b),
with
a,b
in
D
,
such
that
I
slot
(k)(a,b)=1,
(k)(a,b)=t,
for
every
k
in
D
;
IS(
u
)
=
I
F
C
(
"u"^^rif:iri
)
for
every
URI
reference
u
in
V
U
;
IL((
s
,
u
))
=
I
F
C
(
"s"^^u
)
for
every
well-typed
literal
(
s
,
u
)
in
V
TL
;
IL((
s
,
u
))
=
I
F
C
(
"http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("s"^^u)"^^rif:iri
)
for
every
ill-typed
literal
(
s
,
u
)
in
V
TL
.
*** Simple interpretations require the set of plain literals V PL to be included in LV; this condition is met through the inclusion of the subset of the value spaces of the string and rif:text datatypes in D . *** |
*** (explanation of the conditions) RDF makes a distinction between the sets of resources and properties; RIF does not. Condition 1 ensures that all resources in an RDF interpretation correspond to elements in the RIF domain. Condition 2 ensures that the set of properties at least includes all elements which are used as properties in the RIF domain. Condition 3 ensures that the combination of resources and properties corresponds exactly to the RIF domain; note that if I is an rdf- or rdfs-interpretation, IP is a subset of IR, and thus IR= D . Condition 4 ensures that all literal values in D are included in LV. Condition 5 ensures that RDF triples are interpreted in the same way as properties frames. Condition 6 ensures that IRIs are interpreted in the same way. Finally, conditions 7 and 8 ensure that typed literals are interpreted in the same way. *** |
*** Alternatively, IR could be required to be equal to D . This will already be the case for RDF and RDFS entailment, because IP is required to be a subset of IR. However, in simple entailment IP is not required to be a subset of IR. Another possibility would be to additionally require IR to be a superset of D \IP, although it is at this point is not entirely clear what this would buy us. *** |
We
now
define
the
notion
of
satisfiability
for
common
interpretation,
i.e.
the
conditions
under
which
a
common
interpretation
(
I
,
I)
is
a
model
of
a
combination
C=<
R
,
S
>.
We
define
notions
of
satisfiability
for
all
3
4
entailment
regimes
of
RDF
(simple,
RDF,
RDFS,
and
RDFS).
D).
The
definitions
are
all
analogous.
Intuitively,
a
common
interpretation
(
I
,
I)
satisfies
a
combination
C=<
R
,
S
>
if
I
satisfies
R
and
I
satisfies
S
.
A common interpretation ( I , I) simple-satisfies an RIF-RDF combination C=< R , S > if I satisfies R , I is a simple interpretation , and I satisfies every RDF graph S in S ; in this case ( I , I) is called a simple model , or simply model , of C, and C is simple-satisfiable .
A simple interpretation I of a vocabulary V is an rdf-interpretation if V includes the RDF vocabulary and the conditions on rdf-interpretations described in the [RDF-Semantics] document hold for I.
A common interpretation ( I , I) rdf-satisfies an RIF-RDF combination C=< R , S > if I satisfies R , I is an rdf-interpretation , and I satisfies every RDF graph S in S ; in this case ( I , I) is called an rdf-model of C, and C is rdf-satisfiable .
An rdf-interpretation I of a vocabulary V is an rdfs-interpretation if V includes the RDFS vocabulary and the conditions on rdfs-interpretations described in the [RDF-Semantics] document hold for I.
A common interpretation ( I , I) rdfs-satisfies an RIF-RDF combination C=< R , S > if I satisfies R , I is a rdfs-interpretation , and I satisfies every RDF graph S in S ; in this case ( I , I) is called a rdfs-model , or simply model , of C, and C is rdfs-satisfiable .
Given a datatype map D, an rdfs-interpretation I of a vocabulary V is a D-interpretation if V includes the URIs in the domain of D RIF and the conditions on each of the <URI, datatype> pairs in D RIF described in the [RDF-Semantics] document hold in I.
A common interpretation ( I , I) D RIF -satisfies an RIF-RDF combination C=< R , S > if I satisfies R , I is a D^RIF^-interpretation , and I satisfies every RDF graph S in S ; in this case ( I , I) is called a rdfs-model , or simply model , of C, and C is D RIF -satisfiable .
Using the notions of models defined above, we define entailment in the usual way, i.e. through inclusion of sets of models.
A combination C 1 simple-entails a combination C 2 if every simple model of C 1 is a model of C 2 .
A combination C 1 rdf-entails a combination C 2 if every rdf-model of C 1 is a model of C 2 .
A combination C 1 rdfs-entails a combination C 2 if every rdfs-model of C 1 is a model of C 2 .
A combination C 1 D RIF -entails a combination C 2 if every D RIF -model of C 1 is a model of C 2 .
RIF-RDF combinations can be embedded into RIF Rule sets in a fairly straightforward way, thereby demonstrating how an RIF-compliant translator without native support for RDF can process RIF-RDF combinations.
For the embedding we use the concrete syntax of RIF and the N-Triples syntax for RDF.
Throughout this section the function tr is defined, which maps symbols, triples, and RDF graphs to RIF symbols, statements, and rule sets.
The function tr maps RDF symbols of a vocabulary V and a set of blank nodes B to RIF symbols, as defined in following table.
RDF Symbol |
RIF Symbol |
Mapping |
RDF URI reference uri in V U |
Constant with datatype rif:iri |
tr( uri ) = "uri"^^rif:iri |
Blank node x in B |
Variable symbols ? x |
tr( x ) = ? x |
Plain literal without a language tag xxx in V PL |
Constant with the datatype xsd:string |
tr( "xxx" ) = "xxx"^^xsd:string |
Plain literals with a language tag ( xxx , lang ) in V PL |
Constant with the datatype rif:text |
tr( "xxx"@lang ) = "xxx'@lang"^^rif:text , where xxx' is obtained from xxx by replacing every occurrence of @ with @@ |
Well-typed literal ( s , u ) in V TL |
Constant with the datatype u |
tr( "s"^^u ) = "s"^^u |
Ill-typed literal ( s , u ) in V TL |
Generated constant with the datatype rif:iri |
tr( "s"^^u ) = "http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("s"^^u)"^^rif:iri |
The
mapping
function
tr
is
extended
to
embed
triples
as
RIF
statements
and
graphs
as
sets
of
RIF
statements.
Finally,
two
embedding
functions,
tr
S
R
and
tr
Q
embed
RDF
graphs
as
RIF
rule
sets
and
conditions,
respectively.
The
following
section
shows
how
these
embeddings
can
be
used
for
reasoning
with
combinations.
We
define
two
mappings
for
RDF
graphs,
one
(tr
S
R
)
in
which
variables
are
Skolemized,
i.e.
replaced
with
constant
symbols,
and
one
(tr
Q
)
in
which
variables
are
existentially
quantified.
The function sk takes as arguments a formula with variables R, and returns a formula R', which is obtained from R by replacing every variable symbol ? x in R with "new-uri"^^rif:iri , where new-uri is a new globally unique URI.
*** alternatively, sk could be defined to replace variables with new globally unique constants with the datatype rif:bNode , as proposed by DaveReynolds . However, as this embedding is only used for reasoning, I ( JosDeBruijn ) don't think this is necessary. This datatype might be necessary for the exchange of RDF rule languages; see the corresponding preliminary section in the architecture document. *** |
RDF Construct |
RIF Construct |
Mapping |
Triple s p o . |
Property frame tr( s ) [ tr( p ) -> tr( o ) ] |
tr( s p o . ) = tr( s ) [ tr( p ) -> tr( o ) ] |
Graph S |
Rule
set
tr
|
tr
|
Graph S |
Condition (query) tr Q (S) |
tr
Q
(S)
=
|
The following theorem shows how checking simple-entailment of combinations can be reduced to checking entailment of RIF conditions by using the embeddings of RDF graphs of the previous section.
Theorem
A
combination
<R
1
,{S1,...,Sn}>
simple-entails
a
combination
<c,{T1,...,Tm}>,
where
c
is
a
condition,
iff
(R
1
union
tr
S1
R
(S1)
union
...
union
tr
Sn
R
(Sn))
entails
c
2
,
c,
tr
Q
(T1),...,
and
tr
Q
(Tm).
The embeddings of RDF and RDFS entailment require a number of built-in predicate symbols to be available to appropriately deal with literals.
*** Alternatively, these predicates might be axiomatized. However, as this requires a large number of facts, we deem such axiomatization undesirable. *** |
Given a vocabulary V ,
the unary predicate wellxml V /1 is interpreted as the set of XML values corresponding to the well-typed XML literals in V TL ,
the unary predicate illxml V /1 is interpreted as the set of objects corresponding to ill-typed XML literals in V TL , and
the unary predicate illD V /1 is interpreted as the set of objects corresponding to ill-typed literals in V TL , and
*** Depending on the built-in predicates which will eventually be available in RIF (BLD), the suggested built-in predicates might not be necessary. *** |
We axiomatize the semantics of the RDF vocabulary using the following RIF rules and conditions.
The compact URIs used in the RIF rules in this section and the next are short for the complete URIs with the rif:iri datatype, e.g. rdf:type is short for "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"^^rif:iri
R RDF |
= |
( Forall tr( s p o . )) for every RDF axiomatic triple s p o . ) union |
|
|
( Forall ?x ?x[rdf:type -> rdf:Property] :- Exists ?y,?z (?y[?x -> ?z]) , |
|
|
Forall ?x ?x[rdf:type -> rdf:XMLLiteral] :- wellxml(?x) ) |
C RDF |
= |
( Exists ?x (And(?x[rdf:type -> rdf:XMLLiteral] illxml(?x)) ) |
Theorem A combination <R 1 ,{S1,...,Sn}> is rdf-satisfiable iff ( R RDF union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) does not entail C RDF .
Theorem An rdf-satisfiable combination <R 1 ,{S1,...,Sn}> rdf-entails a combination <c,{T1,...,Tm}>, where c is a condition, iff ( R RDF union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) entails c, tr Q (T1),..., and tr Q (Tm).
We axiomatize the semantics of the RDF(S) vocabulary using the following RIF rules and conditions.
R RDFS |
= |
R RDF union |
|
|
( Forall tr( s p o . )) for every RDFS axiomatic triple s p o . ) union |
|
|
( Forall ?x ?x[rdf:type -> rdfs:Resource] , |
|
|
Forall ?u,?v,?x,?y ?u[rdf:type -> ?y] :- And(?x[rdfs:domain -> ?y] ?u[?x -> ?v]) , |
|
|
Forall ?u,?v,?x,?y ?v[rdf:type -> ?y] :- And(?x[rdfs:range -> ?y] ?u[?x -> ?v]) , |
|
|
Forall ?x ?x[rdfs:subPropertyOf -> ?x] :- ?x[rdf:type -> rdf:Property] , |
|
|
Forall ?x,?y,?z ?x[rdfs:subPropertyOf -> ?z] :- And (?x[rdfs:subPropertyOf -> ?y] ?y[rdfs:subPropertyOf -> ?z]) , |
|
|
Forall ?x,?y,?z1,?z2 ?z1[y -> ?z2] :- And (?x[rdfs:subPropertyOf -> ?y] ?z1[x -> ?z2]) , |
|
|
Forall ?x ?x[rdfs:subClassOf -> rdfs:Resource] :- ?x[rdf:type -> rdfs:Class] , |
|
|
Forall ?x,?y,?z ?z[rdf:type -> ?y] :- And (?x[rdfs:subClassOf -> ?y] ?z[rdf:type -> ?x]) , |
|
|
Forall ?x ?x[rdfs:subClassOf -> ?x] :- ?x[rdf:type -> rdfs:Class] , |
|
|
Forall ?x,?y,?z ?x[rdfs:subClassOf -> ?z] :- And (?x[rdfs:subClassOf -> ?y] ?y[rdfs:subClassOf -> ?z]) , |
|
|
Forall ?x ?x[rdfs:subPropertyOf -> rdfs:member] :- ?x[rdf:type -> rdfs:ContainerMembershipProperty] , |
|
|
Forall ?x ?x[rdfs:subClassOf -> rdfs:Literal] :- ?x[rdf:type -> rdfs:Datatype] , |
|
|
Forall ?x ?x[rdf:type -> rdfs:Literal] :- lit(?x) ) |
C RDFS |
= |
( Exists ?x (And(?x[rdf:type -> rdfs:Literal] illxml(?x)) ) |
Theorem A combination <R 1 ,{S1,...,Sn}> is rdfs-satisfiable iff ( R RDFS union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) does not entail C RDF .
*** Some more detailed analysis is still to do to see whether more conditions are required in C RDFS to check for ill-typed literals which are of the type rdfs:Literal. *** |
Theorem An rdfs-satisfiable combination <R 1 ,{S1,...,Sn}> rdfs-entails a combination <c,{T1,...,Tm}> iff ( R RDFS union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) entails c, tr Q (T1),..., and tr Q (Tm).
We axiomatize the semantics of the data types using the following RIF rules and conditions.
R D |
= |
R RDFS union |
|
|
( Forall u[rdf:type -> rdfs:Datatype] | for every URI in the domain of D RIF ) union |
|
|
( Forall "s"^^u[rdf:type -> "u"^^rif:iri] | for every well-typed literal ( s , u ) in V TL ) union |
|
|
( Forall ?x, ?y dt(?x,?y) :- And(?x[rdf:type -> ?y] ?y[rdf:type -> rdfs:Datatype]) ) |
***Note that the existence of certain built-ins corresponding to data types (e.g. a built-in string/1 which is always interpreted as the set of xsd:strings) would simplify the axiomatization. *** |
C D |
= |
( Exists ?x (And(?x[rdf:type -> rdfs:Literal] illD(?x)) ) |
Theorem A combination <R 1 ,{S1,...,Sn}>, where R 1 does not contain the equality symbol, is D RIF -satisfiable iff ( R D union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) does not entail C D , does not entail Exists ?x And(dt(?x,u) dt(?x,u')) for any two URIs u and u' in the domain of D RIF such that the value spaces of D RIF ( u ) and D RIF ( u' ) are disjoint, and does not entail Exists ?x dt(s^^u,"u'"^^rif:iri) for any ( s , u ) in V TL and u' in the domain of D RIF such that s is not in the lexical space of D RIF ( u' ).
*** Since this condition is very complex we might want to leave out this theorem, and suggest the above set of rules ( R D ) as an approximation of the semantics. *** |
Theorem A D RIF -satisfiable combination <R 1 ,{S1,...,Sn}>, where R 1 does not contain the equality symbol, D RIF -entails a combination <c,{T1,...,Tm}> iff ( R RDFS union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) entails c, tr Q (T1),..., and tr Q (Tm).
*** The restriction to equality-free rule sets is necessary because D-interpretations impose stronger conditions on the interpretation of typed literals in case different datatype URIs are equal than RIF does. *** |
A disadvantage of using the axiomatization R RDF is that it is infinite, due to the fact that the RDF vocabulary is infinite. Therefore, we define a finite subset of R RDF , R RDF- , which can be used for reasoning. Note that this subset does not change the semantics; the embedding is still faithful with respect to the semantics of combinations.
R RDF- = R RDF \{ Forall tr( s p o . ) | s p o . is an RDF axiomatic triple of the form rdf:_i rdf:type rdf:Property . , with i a positive integer, such that rdf:_i does not occur in the context in which R RDF- is used}.
Theorem A combination <R,{S1,...,Sn}> rdf-entails a combination <c,{T1,...,Tm}> iff ( R RDF- union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) entails c, tr Q (T1),..., and tr Q (Tm), where the context of R RDF- consists of the combinations <R 1 ,{S1,...,Sn}> and <c,{S1,...,Sn}>.
We define a finite subset of R RDFS analogous to the definition of R RDF- .
R RDFS- = R RDFS \{ Forall tr( s p o . ) | s p o . is an RDF axiomatic triple of the form rdf:_i rdf:type rdf:Property . , with i a positive integer, such that rdf:_i does not occur in the context in which R RDF- is used}.
Theorem A combination <R,{S1,...,Sn}> rdfs-entails a combination <c,{T1,...,Tm}> iff ( R RDFS- union R 1 union tr S1 (S1) union ... union tr Sn (Sn)) entails c, tr Q (T1),..., and tr Q (Tm), where the context of R RDFS- consists of the combinations <R 1 ,{S1,...,Sn}> and <c,{S1,...,Sn}>.
(Editor's Note: This text is maintained on wiki page Appendix: Specification ).
A syntactic specification of RIF BLD is given here as the combination of the RIF Condition and RIF Rule syntaxes.
The default namespace of RIF is http://www.w3.org/2007/rif# .
The abstract syntax of RIF BLD is specified in asn06 as follows:
It is visualized by a UML diagram as follows:
Automatic asn06-to-UML transformation has been employed for this, using the available tool.
The concrete syntax , in particular an XML Schema, will be derived from this (to follow in a later Working Draft).
Preliminary XML schemas for the RIF BLD sublanguages are also available , which already helped to find and fix invalid parts in the examples.
(Editor's Note: This text is maintained on wiki page End Notes ).
Rationale : There are several equivalent ways to define first-order semantic structures. The one we adopted has the advantage that it generalizes to rule sets with negation as failure (NAF) and to logics for dealing with uncertainty and inconsistency. The difficulty is that some popular theories for NAF, such as the well-founded semantics , are based on three-valued semantic structures. Some popular ways to handle uncertain or inconsistent information (which is certainly important in the Web environment) rely on four-valued and other multi-valued logics. Therefore, following M. Fitting, Fixpoint Semantics for Logic Programming A Survey, Theoretical Computer Science, 1999 , we build our definitions to be compatible with future RIF dialects, which will be based multivalued logics.
Truth values : Some RIF dialects will have additional truth values. For instance, some versions of NAF use three truth values: t , f , and u (undefined). Handling of contradictions and uncertainty requires at least four truth values: t , u , f , and i (inconsistent).
Ordering truth values : In the well-founded semantics for NAF, f < t u < t t , and it is again a total order. But in four-valued logics, which are often used for dealing with uncertain or inconsistent information, the truth order is partial: f < t u < t t and f < t i < t t . Such logics also have another partial order, called the knowledge order < k : u < k t < k i ; and u < k f < k i . Under the knowledge order, true and false are incomparable, and facts that are both true and false receive the truth value i , which is the least upper bound of f and t in the knowledge order.
Using
sorts
to
simulate
FOL,
RDF,
and
other
logics
:
Recall
that
RIF
BLD
uses
the
symbols
from
Const
to
denote
constants,
predicates,
and
function
symbols
alike,
so
the
same
symbol
can
occur
in
multiple
contexts.
However,
sometimes
it
is
useful
to
restrict
the
contexts
in
which
various
symbols
are
allowed
to
occur.
For
instance,
Prolog
or
RDF
don't
place
any
such
restrictions,
but
OWL-DL
has
a
unique
role
for
each
symbol.
This
restriction
of
the
context
is
accomplished
by
controlling
the
sorts
that
are
associated
with
each
constant.
For
instance,
if
we
do
not
want
integers
to
occur
as
predicate
and
function
symbols
then
we
will
not
associate
any
arrow
or
Boolean
sorts
with
the
constants
that
are
associated
with
the
primitive
sort
integer
.
On
the
other
hand,
we
do
want
URIs
to
denote
concepts
and
other
predicates.
In
that
case,
we
would
associate
every
Boolean
sort
with
every
constant
that
has
a
primitive
sort
URI.
If
we
want
to
also
allow
local
names
for
concepts
and
other
predicates,
then
we
might
introduce
a
separate
primitive
sort,
local
,
and
allow
the
symbols
that
belong
to
this
sort
to
have
Boolean
types.
RIF
multisorted
logic
can
also
easily
represent
the
syntax
of
the
usual
first
order
predicate
calculus.
To
this
end,
we
just
need
to
ensure
that
each
symbol
in
Const
has
exactly
one
sort
associated
with
it.
It
could
be
a
primitive
sort,
an
arrow
sort,
or
a
Boolean
sort,
but
it
must
be
exactly
one.
In
this
case,
any
given
symbol
can
occur
either
as
a
constant,
or
as
a
function
symbol
of
one
particular
arity,
or
as
a
predicate
of
one
particular
arity.
Polymorphic
constants
:
Association
of
multiple
sorts
with
the
same
constant
can
be
achieved
with
the
help
of
equality.
For
instance,
an
assertion
"foo"^^sort1 = "foo"^^sort2
ensures
that
these
two
constants
represent
the
same
entity,
which
therefore
must
belong
simultaneously
to
sorts
sort1
and
sort2
.
If
we
want
short
integers
to
be
also
treated
as
a
subsort
of
long
integers,
we
can
assert
an
equality
"xyz"^^short = "xyz"^^long
for
every
short
integer
xyz
(in
the
range
0
through
2
16
).
Intended
models
for
rule
sets
:
The
notion
of
a
model
is
only
the
basic
ingredient
in
the
definition
of
a
semantics
of
a
rule
set.
In
general,
a
semantics
of
a
rule
set
R
is
the
set
of
its
intended
models
(see
Y.
Shoham.
Nonmonotonic
logics:
meaning
and
utility.
In:
Proc.
10th
International
Joint
Conference
on
Artificial
Intelligence,
Morgan
Kaufmann,
pp.
388--393,
1987
).
There
are
different
theories
of
what
the
intended
sets
of
models
are
supposed
to
look
like
depending
on
the
features
of
the
particular
rule
sets.
For
Horn
rules,
which
we
use
in
this
section,
document,
the
intended
set
of
models
of
R
is
commonly
agreed
upon:
it
is
the
set
of
all
models
of
R
.
However, when rule bodies contain literals negated with the negation-as-failure connective naf , then only some of the models of a rule set are viewed as intended. This issue will be addressed in the appropriate dialects of RIF. The two most common theories of intended models are based on the so called well-founded models and stable models . Here we will just illustrate the problem with a simple example.
Suppose R consists of a single rule p :- naf q . If naf were interpreted as classical negation, not , then this rule would be simply equivalent to p \/ q , and so it would have two kinds of models: one in which p is true and one where q is true. In contrast to first-order logic, most rule-based systems do not consider p and q symmetrically. Instead, they view the rule p :- naf q as a statement that p must be true if it is not possible to establish the truth of q . Since it is, indeed, impossible to establish the truth of q , such theories will derive p even though it does not logically follow from p \/ q . The logic underlying rule-based systems also assumes that only the minimal models as intended (minimality here is with respect to the set of true facts). Therefore, the intended models of the above rule set R must have the property that not only p is true but also that q is false.