The basis of the language is formed by what can appear in the bodies of Horn-like rules with equality -- conjunctions and disjunctions of atomic formulas and equations (such rules reduce to pure Horn). We later extend our proposal to include builtins. As mentioned in the introduction, the motivation is that this sublanguage can be shared among the bodies of the rules expressed in the following RIF dialects:

- FO (first order)
- LP (logic programming)
- PR (production rules)
- RR (reactive rules)

This sublanguage can also be used to uniformly express:

- IC (integrity constraints)
- QY (queries)

## SYNTAX

Essential BNF for human-readable syntax, where Expressions and Atoms abstract from the underlying data model in that they can take a Herbrand (positional) or a Slotted (keyed) form:

NAME ::= a unicode string in normal form C Con ::= a unicode string in normal form C Var ::= '?' NAME TERM ::= Con | Var | Expr Fun ::= a unicode string in normal form C Rel ::= a unicode string in normal form C Expr ::= Fun '(' TERM* ')' | Fun '{' (NAME '->' TERM)* '}' Atom ::= Rel '(' TERM* ')' | Rel '{' (NAME '->' TERM)* '}' LITFORM ::= Atom | TERM '=' TERM QUANTIF ::= 'Exists' Var+ '(' CONDIT ')' CONJ ::= 'And' '(' CONDIT* ')' DISJ ::= 'Or' '(' CONDIT* ')' CONDIT ::= LITFORM | QUANTIF | CONJ | DISJ

Here the Herbrand form (Expr or Atom) applies its operator (Fun or Rel) to positional TERM arguments using round parentheses, (... TERM ...), while the Slotted form (Expr or Atom) applies its operator (Fun or Rel) to keyed NAME->TERM arguments using curly braces, {... NAME->TERM ...}. Notice that LITFORM stands for Literal Formula and anticipates the introduction of negated atoms later on. QUANTIF stands for Quantified Formula, which for Horn-like conditions can only be 'Exists' Formulas (Var+ variables should occur free in the scoped CONDIT, so 'Exists' can quantify them; free variables are discussed below). More explicitly than in logic programming, CONJ expresses formula conjunctions, and DISJ expresses disjunctions. Finally, CONDIT combines everything and defines RIF conditions, which can later be extended beyond LITFORM, QUANTIF, CONJ, or DISJ.

We assume that all constants (Con) belong to one logical sort: the sort of elementary entities. This can be subsequently refined to allow multiple sorts for different data types. See A.1.1 Basis: Positive Conditions over Bipartitioned Constants.

In the present version, variables are not sorted and thus can range over all constants, Data or Ind. Sorted variables might be introduced in a later version of the language.

Note that there are two uses of variables in the RIF Condition Language: free and quantified. All quantified variables are quantified explicitly, existentially (and also universally, later). We adopt the usual scoping rules for quantification from first-order logic. Variables that are not explicitly quantified are free.

The free variables are needed because we are dealing with conditions that occur in rule bodies only. When a condition occurs in such a rule body, the free variables in the condition are precisely those that also occur in the rule head. Such variables are quantified universally outside of the rule, and the scope of such quantification is the entire rule. For instance, the variable ?X in the rule below is free in the condition that occurs in the rule body, but it is universally quantified outside of the rule.

Condition with a free variable ?X: ... Exists ?Y (condition(..?X..?Y..)) ... Rule using the condition in its body: Forall ?X (head(...?X...) :- ... Exists ?Y (condition(..?X..?Y..)) ...)

When conditions are used as queries, their free variables are to be bound to carry the answer bindings back to the caller.

The semantics of conditions is defined in the section SEMANTICS.

Example 1a (A Herbrand RIF condition in human-readable syntax): In this condition, ?Buyer is quantified existentially, while ?Seller and ?Author are free: And ( Exists ?Buyer (purchase(?Buyer ?Seller book(?Author LeRif) $49)) ?Seller=?Author )

This syntax is similar in style, and compatible to, the OWL Abstract Syntax http://www.w3.org/TR/owl-semantics/syntax.html.

Example 1b (A Slotted RIF condition in human-readable syntax): In this condition, ?Buyer is quantified existentially, while ?Seller and ?Author are free: And ( Exists ?Buyer (purchase{buyer->?Buyer seller->?Seller item->book{author->?Author title->LeRif} price->$49}) ?Seller=?Author )

An XML syntax can be obtained from the above BNF as follows. The non-terminals in all-upercase such as CONDIT become XML entities, which act like macros and will not be visible in instance markups. The other non-terminals as well as symbols ('Exists' etc. as well as '=' and '->') become XML elements, which are adapted from RuleML as shown below.

- Con (constant) - Var (logic variable) - slot (prefix version of keyed argument '->') - Fun (n-ary function symbol) - Rel (n-ary relation symbol [a.k.a. predicate]) - Expr (expression formula) - Atom (atomic formula) - Equal (prefix version of term equation '=') - Exists (quantified formula for 'Exists') - And (conjunction) - Or (disjunction)

This can be directly rewritten as a DTD (PositiveConditions.dtd) or an XML Schema.

The condition formula in Example 1a can be serialized in XML as shown below.

Example 2a (A Herbrand RIF condition in XML syntax): <And> <Exists> <Var>Buyer</Var> <Atom> <Rel>purchase</Rel> <Var>Buyer</Var> <Var>Seller</Var> <Expr> <Fun>book</Fun> <Var>Author</Var> <Con>LeRif</Con> </Expr> <Con>$49</Con> </Atom> </Exists> <Equal> <Var>Seller</Var> <Var>Author</Var> </Equal> </And>

The condition formula in Example 1b can be serialized in XML as shown below.

Example 2b (A Slotted RIF condition in XML syntax): <And> <Exists> <Var>Buyer</Var> <Atom> <Rel>purchase</Rel> <slot><Con>buyer</Con><Var>Buyer</Var></slot> <slot><Con>seller</Con><Var>Seller</Var></slot> <slot> <Con>item</Con> <Expr> <Fun>book</Fun> <slot><Con>author</Con><Var>Author</Var></slot> <slot><Con>title</Con><Con>LeRif</Con></slot> </Expr> </slot> <slot><Con>price</Con><Con>$49</Con></slot> </Atom> </Exists> <Equal> <Var>Seller</Var> <Var>Author</Var> </Equal> </And>

Using the DTD spec, Richard Goerwitz' STG Validator succeeds with the conjunction of Example 2a:

<?xml version="1.0" standalone="no"?> <!DOCTYPE And SYSTEM "http://www.jdrew.org/rif/PositiveConditions.dtd"> <And> ... content of Example 2a ... </And>

## SEMANTIC STRUCTURES (a.k.a. INTERPRETATIONS)

TO BE EXTENDED TO SLOTTED FORMS OF EXPRESSIONS AND ATOMS.

The first step in defining a model-theoretic semantics for a logic-based language is to define the notion of a *semantic structure*, also known as *interpretation*, and then to define the notion of truth valuation for the formulas in the language.

In case of the first-order semantics, the setting given here is one of the standard common definitions. Although it is not as frequently used as some other well-known definitions of semantic structures, it has the advantage of being easy to generalize to non-first-order cases --- for instance, rule sets with negation as failure (NAF), some of which (e.g., well-founded negation) use three-valued semantic structures, and settings, such as the Web, where information can be uncertain or contradictory. In the latter case, four-valued and other multi-valued semantic structures are used. (See, for example, M. Fitting, Fixpoint Semantics for Logic Programming A Survey, Theoretical Computer Science, 1999.)

A semantic structure is a mappings of the form

TV |

where * TV* is the set of all truth values. Thus, if φ if a formula then

*(φ) is its truth value.*

**I**The set of truth values * TV* typically has only two values,

**t**and

**f**. However, some versions of NAF have three,

**t**,

**u**(undefined), and

**f**, and, as we remarked, treatment of contradictions and uncertainty requires at least four:

**t**,

**u**,

**f**, and

**i**(inconsistent).

The set * TV* is assumed to have a total or partial order, called the

*truth order*; it is denoted <

_{t}. For instance, in the first-order case,

**f**<

_{t}

**t**, and it is a total order. In the well-founded semantics,

**f**<

_{t}

**u**<

_{t}

**t**, and it is again a total order. But in Belnap-style four-valued logics, which are suitable for dealing with uncertain or inconsistent information, the truth order is partial:

**f**<

_{t}

**u**<

_{t}

**t**and

**f**<

_{t}

**i**<

_{t}

**t**.

As a side remark, Belnap-style logics also have another order, called the *knowledge order* <_{k}: **u** <_{k} **t** <_{k} **i**; and **u** <_{k} **f** <_{k} **i**. Under the knowledge order, true and false are incomparable, and facts that are both true and false receive the truth value **i**, which is the least upper bound of **f** and **t** in the knowledge order.

More formally, let us define the following sets:

- D - a non-empty set (of domain elements),
- Con - the set of syntax elements recognized by the Con / entity production,
- Var - the set of syntax elements recognized by the Var / ?name production,
- Fun - the set of syntax elements recognized by the Fun production,
- Rel - the set of syntax elements recognized by the Rel production.

An interpretation * I* consists of four mappings:

**I**_{C}from Con to elements of D**I**_{V}from Var to elements of D**I**_{F}from Fun to functions from D* into D (D* is a set of all tuples over domain D)**I**_{R}from Rel to truth-valued mappings D* ->**TV**

Using these mappings, we can define a more general mapping, *I*, as follows:

(k) =**I****I**_{C}(k) if k is a constant(?v) =**I****I**_{V}(?v) if v is a variable(f(t1,...,tn)) =**I****I**_{F}(f)((t1),...,**I**(tn)) if f is a function symbol**I**

As explained earlier, an interpretation is supposed to map formulas to truth values. We define this mapping now:

Atomic formulas:

(r(t1,...,tn)) =**I****I**_{R}(r)((t1),...,**I**(tn))**I**Equality:

(t1=t2) =**I****t**iff(t1) =**I**(t2) and it is**I****f**otherwise.Conjunction:

(And(c1,...,cn)) = min**I**_{t}((c1),...,**I**(cn)), where min**I**_{t}is minimum with respect to the truth order.Disjunction:

(Or(c1,...,cn)) = max**I**_{t}((c1),...,**I**(cn)), where max**I**_{t}is maximum with respect to the truth order.Quantification:

(Exists v1 ... vn (c)) = max**I**_{t}(*(c)), where max**I**_{t}is taken over all interpretations* of the form <**I****I**_{C},***I**_{V},**I**_{F},**I**_{R}>, where***I**_{V}is the same as**I**_{V}except possibly on the variables v1,...,vn (i.e.,* agrees with**I**everywhere except possibly in its interpretation of the mappings of variables v1 ... vn).**I**

## POSSIBLE VARIANTS

This Positive Condition Language may evolve towards a smaller variant: A.1.0 Nucleus: Positive Conditions

It may also evolve towards a larger variant or develop towards extensions: A.1.1 Basis: Positive Conditions over Bipartitioned Constants

Back: A. RIF Condition Language --- Root: CORE --- Next: B. Extension: RIF Rule Language