Syntax of a RIF Dialect as a Specialization of RIF-FLD

The syntax for a RIF dialect can be obtained from the general syntactic framework of RIF by specializing the following parameters (which are defined in this document):

Alphabet

The alphabet of RIF-FLD consists of a countably infinite set of constant symbols Const, a countably infinite set of variable symbols Var (disjoint from Const), a countably infinite set of argument names ArgNames (disjoint from both Const and Var), connective symbols And and Or, quantifiers Exists and Forall, the symbols =, #, ##, :-, ->, Naf, Neg, and auxiliary symbols, such as "(" and ")". The set of connective symbols, quantifiers, =, etc., is disjoint from Const and Var. Variables are written as Unicode strings preceded with the symbol "?". The syntax for constant symbols is given in Section Symbol Spaces.

The language of RIF-FLD is the set of formulas constructed using the above alphabet according to the rules spelled out below.

Terms

The most basic construct of a logic language is a term. RIF-FLD supports several kinds of terms: constants, variables, regular positional terms, terms with named arguments, equality terms, classification terms, and frames. The word "term" will be used to refer to any of these kinds. Formally, terms are defined as follows:

The above definition is very general. It makes no distinction between constant symbols that represent individuals, predicates, and function symbols. The same symbol can occur in multiple contexts at the same time. For instance, if p, a, and c are symbols then p(p(a) p(a p c)) is a term. Even variables and general terms are allowed to occur in the position of predicates and function symbols, so p(a)(?v(a c) p) is also a term.

Frame, classification, and other terms can be freely nested, as exemplified by p(?X q#r[p(1,2)->s](d->e f->g)). Some language environments, such as FLORA-2 [FL2], OO jDREW [OOjD], and CycL [CycL], support fairly large (partially overlapping) subsets of RIF-FLD terms, but most languages support much smaller subsets. RIF dialects are expected to carve out the appropriate subsets of RIF-FLD terms, and the general form of the RIF logic framework allows a considerable degree of freedom.

The mechanism that allows "carving out" of such subsets is called a signature and works as follows. The RIF-FLD language associates a signature with each symbol (both constant and variable symbols) and uses signatures to define what is called well-formed terms. Each RIF dialect is expected to select appropriate signatures for the symbols in its alphabet, and only the terms that are well-formed according to the selected signatures are allowed in that particular dialect.

Signatures

In this section we introduce the concept of a signature, which is a key mechanism that allows RIF-FLD to control the context in which the various symbols are allowed to occur. Much of this development is inspired by [CK95]. It should be kept in mind that signatures are not part of the logic language in RIF, since they do not appear anywhere in the RIF formulas. Instead they are part of a separate language for signatures, which is akin to grammar rules in that it determines which sequences of tokens are in the language and which are not. In some dialects (for example RIF-BLD), signatures are derived from the context and no separate language for signatures is used. Other dialects may choose to specify signatures explicitly. In that case, they will need to define a concrete language for specifying signatures.

Let SigNames be a non-empty, partially-ordered finite or countably infinite set of symbols, disjoint from Const, Var, and ArgNames, called signature names. We require that this set includes at least the following signature names:

Dialects are expected to introduce additional signature names. For instance, RIF-BLD introduces one other signature name, term. The partial order on SigNames is dialect-specific; it is used in the definition of well-formed terms below.

We use the symbol < to represent the partial order on SigNames. Informally, α < β means that terms with signature α can be used wherever terms with signature β are allowed. We will write α ≤ β if either α = β or α < β.
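The relation α ≤ β can be sketched in code as reachability over the declared α < β pairs. This is only an illustration: the function name make_leq and the signature names integer and number are hypothetical, and the particular ordering shown is an assumption, not one prescribed by any RIF dialect.

```python
from collections import defaultdict


def make_leq(strict_pairs):
    """Build leq(a, b) from declared pairs (lower, upper) meaning lower < upper.

    leq(a, b) holds when a == b or b is reachable from a through the
    declared pairs, i.e. the reflexive-transitive closure of the pairs.
    """
    above = defaultdict(set)
    for lower, upper in strict_pairs:
        above[lower].add(upper)

    def leq(a, b):
        if a == b:
            return True
        seen, stack = set(), [a]
        while stack:
            current = stack.pop()
            for nxt in above[current]:
                if nxt == b:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return leq


# Hypothetical ordering, for illustration only: integer < number < term.
leq = make_leq([("integer", "number"), ("number", "term")])
print(leq("integer", "term"))
print(leq("term", "integer"))
```

Under this assumed ordering, a term with signature integer could be used wherever a term with signature term is allowed, but not the other way around.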

A signature is a statement of the form η{e1, ..., en, ...} where η ∈ SigNames is the name of the signature and {e1, ..., en, ...} is a countable set of arrow expressions. Such a set can thus be infinite, finite, or even empty. In RIF-BLD, signatures can have at most one arrow expression. Other dialects (such as HiLog [CKW93], for example) may require polymorphic symbols and thus allow signatures with more than one arrow expression in them.

An arrow expression is defined as follows:

A set S of signatures is coherent iff

Well-formed Terms and Formulas

Signatures are used to control the context in which various symbols are allowed to occur, as explained next.

Each variable symbol is associated with exactly one signature from a coherent set of signatures. A constant symbol can have one or more signatures, and different symbols can be associated with the same signature. Since signature names uniquely identify signatures in coherent signature sets, we will often refer to signatures simply by their names. For instance, if one of f's signatures is atomic{ }, we may simply say that symbol f has signature atomic.

Next we define well-formed terms and their signatures. Like the constant symbols, well-formed terms can have more than one signature.

Note that, according to the above definition, f() and f are distinct terms. We define atomic formulas as follows:

More general formulas are constructed out of atomic formulas with the help of logical connectives. A formula is a statement that can have one of the following forms:

Example 1 (The use of signatures)

We illustrate the above definitions with the following examples. In addition to atomic, let there be another signature, term{ }, which is also used in RIF-BLD.

Consider the term p(p(a) p(a b c)). If p has the (polymorphic) signature mysig{(term)⇒term, (term term)⇒term, (term term term)⇒term} and a, b, c each has the signature term{ } then p(p(a) p(a b c)) is a well-formed term with signature term{ }. If instead p had the signature mysig2{(term term)⇒term, (term term term)⇒term} then p(p(a) p(a b c)) would not be a well-formed term since then p(a) would not be well-formed (in this case, p would have no arrow expression which allows p to take just one argument).

For a more complex example, let r have the signature mysig3{(term)⇒atomic, (atomic term)⇒term, (term term term)⇒term}. Then r(r(a) r(a b c)) is well-formed. The interesting twist here is that r(a) is an atomic formula that occurs as an argument to a function symbol. However, this is allowed by the arrow expression (atomic term)⇒term, which is part of r's signature. If r's signature were mysig4{(term)⇒atomic, (atomic term)⇒atomic, (term term term)⇒term} instead, then r(r(a) r(a b c)) would be not only a well-formed term, but also a well-formed atomic formula.

An even more advanced example of signatures is when the right-hand side of an arrow expression is something other than term or atomic. For instance, let John, Mary, NewYork, and Boston have signatures term{ }; flight and parent have signature h2{(term term)⇒atomic}; and closure has signature hh1{(h2)⇒p2}, where p2 is the name of the signature p2{(term term)⇒atomic}. Then flight(NewYork Boston), closure(flight)(NewYork Boston), parent(John Mary), and closure(parent)(John Mary) would be well-formed formulas. Such formulas are allowed in languages like HiLog [CKW93], which support predicate constructors like closure in the above example.
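The checks in Example 1 can be reproduced with a small signature checker. The sketch below is a simplification under stated assumptions: it handles only constants and positional terms, it reads an arrow expression (s1 ... sn)⇒s as "an application to n arguments with signatures s1 ... sn has signature s", and it ignores the partial order on signature names. The names App and term_signatures are hypothetical and not part of RIF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    """A positional term head(arg1 ... argn); head may itself be a term."""
    head: object
    args: tuple = ()


def term_signatures(term, symbol_sigs, signatures):
    """Return the set of signature names a term can have (empty if ill-formed)."""
    if isinstance(term, str):  # a constant symbol
        return set(symbol_sigs.get(term, ()))
    head_sigs = term_signatures(term.head, symbol_sigs, signatures)
    arg_sigs = [term_signatures(a, symbol_sigs, signatures) for a in term.args]
    result = set()
    for name in head_sigs:
        for params, target in signatures.get(name, ()):
            # An arrow expression applies when the arity matches and every
            # argument has the signature required at its position.
            if len(params) == len(arg_sigs) and all(
                p in sigs for p, sigs in zip(params, arg_sigs)
            ):
                result.add(target)
    return result


T2, T3 = ("term", "term"), ("term", "term", "term")
SIGNATURES = {
    "term": set(),
    "atomic": set(),
    "mysig": {(("term",), "term"), (T2, "term"), (T3, "term")},
    "mysig2": {(T2, "term"), (T3, "term")},
    "mysig3": {(("term",), "atomic"), (("atomic", "term"), "term"), (T3, "term")},
    "mysig4": {(("term",), "atomic"), (("atomic", "term"), "atomic"), (T3, "term")},
    "h2": {(T2, "atomic")},
    "p2": {(T2, "atomic")},
    "hh1": {(("h2",), "p2")},
}

# Individuals from Example 1 all get the signature term{ }.
CONSTS = {c: {"term"} for c in ("a", "b", "c", "John", "Mary", "NewYork", "Boston")}


def nested(f):
    """Build f(f(a) f(a b c))."""
    return App(f, (App(f, ("a",)), App(f, ("a", "b", "c"))))


def check(term, **assignments):
    """Compute signatures of term with extra symbol-to-signature assignments."""
    symbol_sigs = {**CONSTS, **{s: {sig} for s, sig in assignments.items()}}
    return term_signatures(term, symbol_sigs, SIGNATURES)


closure_flight = App(App("closure", ("flight",)), ("NewYork", "Boston"))
print(check(nested("p"), p="mysig"))
print(check(nested("p"), p="mysig2"))
print(check(nested("r"), r="mysig3"))
print(check(nested("r"), r="mysig4"))
print(check(closure_flight, closure="hh1", flight="h2"))
```

An empty result means that no arrow expression applies, so the term is not well-formed under the given assignment. A result containing atomic marks the term as an atomic formula.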

Symbol Spaces

Throughout this document, the xsd: prefix stands for the XML Schema namespace URI http://www.w3.org/2001/XMLSchema#, the rdf: prefix stands for http://www.w3.org/1999/02/22-rdf-syntax-ns#, and rif: stands for the URI of the RIF namespace, http://www.w3.org/2007/rif#. Syntax such as xsd:string should be understood as a compact URI [CURIE] -- a macro that expands to a concatenation of the character sequence denoted by the prefix xsd and the string string.
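The macro expansion described above amounts to a prefix lookup followed by string concatenation. The following sketch uses the three prefixes listed in this section; the function name expand_curie is hypothetical, and the code does not attempt to implement the full CURIE syntax.

```python
PREFIXES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rif": "http://www.w3.org/2007/rif#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand prefix:local into the prefix's URI concatenated with local."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand_curie("xsd:string"))
print(expand_curie("rdf:XMLLiteral"))
```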

The set of all constant symbols in a RIF dialect is partitioned into a number of subsets, called symbol spaces, which are used to represent XML Schema data types, data types defined in other W3C specifications, such as rdf:XMLLiteral, and to distinguish other sets of constants. Constant symbols that belong to the various symbol spaces have special presentation syntax and semantics.

Formally, a symbol space is a named subset of the set of all constants, Const. The semantic aspects of symbol spaces will be described in Section Semantic Framework. Each symbol in Const belongs to exactly one symbol space.

Each symbol space has an associated lexical space and an identifier.

To refer to a constant in a particular RIF symbol space, we use the following presentation syntax:

     LITERAL^^SYMSPACE

where LITERAL is a Unicode string, called the lexical part of the symbol, and SYMSPACE is an identifier of the symbol space in the form of an absolute IRI string. LITERAL must be an element in the lexical space of the symbol space. For instance, 1.2^^xsd:decimal and 1^^xsd:decimal are legal symbols because 1.2 and 1 are members of the lexical space of the XML Schema data type xsd:decimal. On the other hand, a+2^^xsd:decimal is not a legal symbol, since a+2 is not part of the lexical space of xsd:decimal.
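The legality check in the examples above can be sketched as splitting a symbol at ^^ and matching the lexical part against the lexical space of the named symbol space. The sketch makes several assumptions: the function names and the LEXICAL_SPACES table are hypothetical, only xsd:decimal is registered, the symbol space is looked up by its compact form as written in the examples, and the regular expression is an approximation of the decimal lexical space (optional sign, digits, optional fractional part).

```python
import re

# Hypothetical registry: symbol space identifier -> pattern for its lexical space.
LEXICAL_SPACES = {
    "xsd:decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
}


def parse_symbol(text):
    """Split LITERAL^^SYMSPACE at the last ^^ into (literal, symspace)."""
    literal, sep, symspace = text.rpartition("^^")
    if not sep:
        raise ValueError(f"missing ^^ separator in {text!r}")
    return literal, symspace


def is_legal_symbol(text):
    """Check that the lexical part belongs to the symbol space's lexical space."""
    literal, symspace = parse_symbol(text)
    pattern = LEXICAL_SPACES.get(symspace)
    if pattern is None:
        raise KeyError(f"symbol space {symspace!r} is not registered")
    return pattern.fullmatch(literal) is not None


for symbol in ("1.2^^xsd:decimal", "1^^xsd:decimal", "a+2^^xsd:decimal"):
    print(symbol, is_legal_symbol(symbol))
```

Splitting at the last ^^ rather than the first is a design choice that lets the lexical part itself contain the character sequence ^^.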

The set of all symbol spaces that partition Const is considered to be part of the logic language used by RIF rule sets.

RIF supports the following symbol spaces. Rule sets that are exchanged through RIF can use additional symbol spaces as explained below.

The lexical spaces of the above symbol spaces are defined in the document [XML-SCHEMA2].

Notes on RIF-compliant support for symbol spaces.