### Semantics of a RIF Dialect as a Specialization of RIF-FLD

The RIF-FLD semantic framework defines the notions of *semantic structures* and of *models* of RIF formulas. The * semantics of a dialect* is derived from these notions by specializing the following parameters.

The

*effect of the syntax*.- The syntax of a dialect may limit the kinds of terms that are supported. For instance, if the dialect does not support frames or terms with named arguments then the parts of the semantic structures whose purpose is to interpret the unsupported types of terms become redundant.

*Truth values*.The RIF-FLD semantic framework allows formulas to have truth values from an arbitrary partially ordered set of truth values,

. A concrete dialect must select a concrete partially or totally ordered set of truth values.**TV**

*Data types*.- A data type is a symbol space that has a fixed interpretation in any semantic structure. RIF-FLD defines a set of core data types that each dialect is expected to support, but its semantics does not limit support to just the core types. RIF dialects can introduce additional data types, and each dialect is expected to define the exact set of data types that it supports.

*Logical entailment*.Logical entailment in RIF-FLD is defined with respect to an unspecified set of

*intended*models. A RIF dialect must define which models are considered to be intended. For instance, one dialect might specify that all models are intended (which leads to classical first-order entailment), another may consider only the minimal models as intended, while a third one might only use so-called well-founded or stable models.

All of the above notions are defined in the remainder of this document.

### Truth Values

Each RIF dialect is expected to define the set of * truth values*, denoted by

*. This set must have a partial order, called the*

**TV***, denoted <*

**truth order**_{t}. As a special case, <

_{t}can be a total order in some dialects. We write

*a*≤

_{t}

*b*if either

*a*<

_{t}

*b*or

*a*and

*b*are the same element of

*. In addition,*

**TV**must be a complete lattice with respect to <**TV**_{t}, i.e., the least upper bound (`lub`_{t}) and the greatest lower bound (`glb`_{t}) must exist for any subset of.**TV**is required to have two distinguished elements,**TV****f**and**t**, such that**f**≤_{t}*elt*and*elt*≤_{t}**t**for every*elt*∈.**TV**has an**TV**,**operator of negation**`~`:→**TV**, such that**TV**`~`is idempotent, i.e., applying`~`twice gives the identity mapping.`~`**t**=**f**(and thus`~`**f**=**t**).

RIF dialects can have additional truth values. For instance, the semantics of some versions of NAF, such as the *well-founded negation*, requires three truth values: **t**, **f**, and **u** (undefined), where **f** <_{t} **u** <_{t} **t**. Handling of contradictions and uncertainty usually requires at least four truth values: **t**, **u**, **f**, and **i** (inconsistent). In this case, the truth order is partial: **f** <_{t} **u** <_{t} **t** and **f** <_{t} **i** <_{t} **t**.

### Primitive Data Types

A * primitive data type* (or just a

*data type*, for short) is a symbol space that has

an associated set, called the

, and**value space**a mapping from the lexical space of the symbol space to the value space, called

.**lexical-to-value-space mapping**

Semantic structures are always defined with respect to a particular set of data types, denoted by * DTS*. In a concrete dialect,

*always includes the data types supported by that dialect. All RIF dialects are expected to support the following primitive data types:*

**DTS**`xsd:long``xsd:integer``xsd:decimal``xsd:string``xsd:time``xsd:dateTime``rdf:XMLLiteral``rif:text`

Their value spaces and the lexical-to-value-space mappings are defined as follows:

For the XML Schema data types of RIF, namely

`xsd:long`,`xsd:integer`,`xsd:decimal`,`xsd:string`,`xsd:time`, and`xsd:dateTime`, the value spaces and the lexical-to-value-space mappings are defined in the XML Schema specification [XML-SCHEMA2].The value space for the primitive data type

`rdf:XMLLiteral`is defined in RDF [RDF-CONCEPTS].The value space of

`rif:text`is the set of all pairs of the form`(string, lang)`, where`string`is a Unicode character sequence and`lang`is a lowercase Unicode character sequence which is a natural language identifier as defined by RFC 3066 [RFC-3066]. The lexical-to-value-space mapping of`rif:text`, denoted`L`_{rif:text}, maps each symbol`string@lang`in the lexical space of`rif:text`to`(string, lower-case(lang))`, where`lower-case(lang)`is`lang`written in all-lowercase letters.The value space and the lexical-to-value-space mapping for

`rif:text`defined here are compatible with RDF's semantics for strings with named tags [RDF-SEMANTICS].

Although the lexical and the value spaces might sometimes look similar, one should not confuse them. Lexical spaces define the syntax of the constant symbols in the RIF language that belong to the various primitive data types. In contrast, value spaces define the *meaning* of those constants. The lexical and the value spaces are often not even isomorphic. For instance, `1.2^^xsd:decimal` and `1.20^^xsd:decimal` are two legal -- and distinct -- constants in RIF because `1.2` and `1.20` belong to the lexical space of `xsd:decimal`. However, these two constants are interpreted by the *same* element of the value space of the `xsd:decimal` type. Therefore, `1.2^^xsd:decimal = 1.20^^xsd:decimal` is a RIF tautology. Likewise, RIF semantics for data types implies certain inequalities. For instance, `abc^^xsd:string` ≠ `abcd^^xsd:string` is a tautology, since the lexical-to-value-space mapping of the `xsd:string` type maps these two constants into distinct elements in the value space of `xsd:string`.

### Semantic Structures

The central step in specifying a model-theoretic semantics for a logic-based language is defining the notion of a *semantic structure*, also known as an *interpretation*. Semantic structures are used to assign truth values to RIF-FLD formulas.

A * semantic structure*,

*, is a tuple of the form <*

**I***,*

**TV***,*

**DTS***,*

**D**

**I**_{C},

**I**_{V},

**I**_{F},

**I**_{frame},

**I**_{SF},

**I**_{sub},

**I**_{isa},

**I**_{=},

**I**_{Truth}>. Here

*is a non-empty set of elements called the*

**D***of*

**domain***. We will continue to use*

**I**`Const`to refer to the set of all constant symbols and

`Var`to refer to the set of all variable symbols.

*denotes the set of truth values that the semantic structure uses and*

**TV***is the set of primitive data types used in*

**DTS***.*

**I**The other components of * I* are

*total*mappings defined as follows:

**I**_{C}maps`Const`to elements of.**D**- This mapping interprets constant symbols.

**I**_{V}maps`Var`to elements of.**D**- This mapping interprets variable symbols.

**I**_{F}mapsto functions**D**→**D***(here**D**is a set of all sequences of any finite length over the domain**D***)**D**- This mapping interprets positional terms.

**I**_{SF}interprets terms with named arguments. It is a total mapping from`Const`to the set of total functions of the form`SetOfFiniteBags`(`ArgNames`×) →**D**. This is analogous to the interpretation of positional terms with two differences:**D**Each pair

`<s,v>`∈`ArgNames`×represents an argument/value pair instead of just a value in the case of a positional term.**D**- The argument to a term with named arguments is a finite set of argument/value pairs rather than a finite ordered sequence of simple elements.

Bags are used here because the order of the argument/value pairs in a term with named arguments is immaterial and the pairs may repeat. For instance,

`p(a->b a->c)`.

**I**_{frame}is a total mapping fromto total functions of the form**D**`SetOfFiniteBags`(×**D**) →**D**.**D**This mapping interprets frame terms. An argument,

`d`∈, to**D****I**_{frame}represent an object and a finite bag`{<a1,v1>, ..., <ak,vk>}`represents a bag (multiset) of attribute-value pairs for`d`. We will see shortly how**I**_{frame}is used to determine the truth valuation of frame terms.Bags are used here because the order of the attribute/value pairs in a frame is immaterial and the pairs may repeat. For instance,

`o[a->b a->c]`means that the value of the attribute`a`on the object`o`is a set that*contains*`b`and`c`.

**I**_{sub}gives meaning to the subclass relationship. It is a total function×**D**→**D**.**D**The operator

`##`is required to be transitive, i.e.,`c1 ## c2`and`c2 ## c3`must imply`c1 ## c3`. This is ensured by a restriction in Section Interpretation of Formulas.

**I**_{isa}gives meaning to class membership. It is a total function×**D**→**D**.**D**The relationships

`#`and`##`are required to have the usual property that all members of a subclass are also members of the superclass, i.e.,`o # cl`and`cl ## scl`must imply`o # scl`. This is ensured by a restriction in Section Interpretation of Formulas.

**I**_{=}gives meaning to the equality. It is a total function×**D**→**D**.**D****I**_{Truth}is a total mapping→**D**.**TV**- It is used to define truth valuation of formulas.

We also define the following mapping * I* :

(k) =**I****I**_{C}(k), if k is a symbol in`Const`(?v) =**I****I**_{V}(?v), if ?v is a variable in`Var`(f(t**I**_{1}... t_{n})) =**I**_{F}((f))(**I**(t**I**_{1}),...,(t**I**_{n}))(f(**I**`s`_{1}`->v`_{1}...`s`_{n}`->v`_{n})) =**I**_{SF}((f))({**I**`<s`_{1},(v**I**_{1})`>`,...,`<s`_{n},(v**I**_{n})`>`})- Here we use {...} to denote a bag of argument/value pairs.

(**I**`o[a`_{1}`->v`_{1}`... a`_{k}`->v`_{k}`]`) =**I**_{frame}((**I**`o`))({`<`(**I**`a`_{1}),(v**I**_{1})`>`, ...,`<`(**I**`a`_{n}),(v**I**_{n})`>`})- Here {...} denotes a bag of attribute/value pairs.

(**I**`c1##c2`) =**I**_{sub}((**I**`c1`),(**I**`c2`))(**I**`o#c`) =**I**_{isa}((**I**`o`),(**I**`c`))(**I**`x=y`) =**I**_{=}((**I**`x`),(**I**`y`))

* The effect of signatures.* For every signature,

`sg`, supported by the dialect, there is a subset

**D**_{sg}⊆

*, called the*

**D***. Terms that have a given signature,*

**domain of the signature**`sg`, are supposed to be mapped by

*to*

**I**

**D**_{sg}, and if a term has more than one signature it is supposed to be mapped into the intersection of the corresponding signature domains. To ensure this, the following is required:

If

`sg < sg'`then**D**_{sg}⊆**D**_{sg'}.If

`k`is a constant that has signature`sg`then**I**_{C}(`k`) ∈**D**_{sg}.If

`?v`is a variable that has signature`sg`then**I**_{V}(`?v`) ∈**D**_{sg}.If

`sg`has an arrow expression of the form (`s1`...`sn`)⇒`s`then, for every`d`∈**D**_{sg},**I**_{F}(`d`) must map**D**_{s1}× ... ×**D**_{sn}to**D**_{s}.If

`sg`has an arrow expression of the form (`p1->s1`...`pn->sn`)⇒`s`then, for every`d`∈**D**_{sg},**I**_{SF}(`d`) must map the set`{<p1`,**D**_{s1}`>`, ...,`<pn`,**D**_{sn}`>`} to**D**_{s}.If the signature

`->`has an arrow expressions (`sg`,`s`_{1},`r`_{1})⇒`k`, ..., (`sg`,`s`_{n},`r`_{n})⇒`k`, then, for every`d`∈**D**_{sg},**I**_{frame}(`d`) must map {`<`**D**_{s1},**D**_{r1}`>`, ...,`<`**D**_{sn},**D**_{rn}`>`} to**D**_{k}.If the signature

`#`has an arrow expression (`s``r`)⇒`k`then**I**_{isa}must map**D**_{s}×**D**_{r}to**D**_{k}.If the signature

`##`has an arrow expression (`s``s`)⇒`k`then**I**_{sub}must map**D**_{s}×**D**_{s}to**D**_{k}.If the signature

`=`has an arrow expression (`s``s`)⇒`k`then**I**_{=}must map**D**_{s}×**D**_{s}to**D**_{k}.

* The effect of data types.* The data types in

*impose the following restrictions. If*

**DTS**`dt`is a symbol space identifier of a data type, let

**LS**_{dt}denote the lexical space of

`dt`,

**VS**_{dt}denote its value space, and

**L**_{dt}:

**LS**_{dt}→

**VS**_{dt}the lexical-to-value-space mapping. Then the following must hold:

**VS**_{dt}⊆; and**D**For each constant

`lit^^dt`∈**LS**_{dt},**I**_{C}(`lit^^dt`) =**L**_{dt}(`lit`).That is,

**I**_{C}must map the constants of a data type`dt`in accordance with**L**_{dt}.

RIF-FLD does not impose special requirements to **I**_{C} for constants in the lexical spaces that do not correspond to primitive datatypes in * DTS*. Dialects may have such requirements, however. An example of such a restriction could be a requirement that no constant in a particular symbol space (such as

`rif:local`) can be mapped to

**VS**_{dt}of a data type

`dt`.

### Interpretation of Formulas

* Truth valuation* for well-formed formulas in RIF-BLD is determined using the following function, denoted

*TVal*

_{I}:

*Constants*:*TVal*_{I}(`k`) =**I**_{Truth}((**I**`k`)), if`k`∈`Const`.*Variables*:*TVal*_{I}(`?v`) =**I**_{Truth}((**I**`?v`)), if`?v`∈`Var`.*Positional atomic formulas*:*TVal*_{I}(`r(t`_{1}...`t`_{n})) =**I**_{Truth}((**I**`r(t`_{1}...`t`_{n})))*Atomic formulas with named arguments*:*TVal*_{I}(`p(s`_{1}`->v`_{1}...`s`_{k}`->v`_{k})) =**I**_{Truth}((**I**`p(s`_{1}`-> v`_{1}...`s`_{k}`->v`_{k}))).*Equality*:*TVal*_{I}(`x = y`) =**I**_{Truth}((**I**`x = y`)).To ensure that equality has precisely the expected properties, it is required that

**I**_{Truth}((**I**`x = y`)) =**t**if and only if(**I**`x`) =(**I**`y`) and that**I**_{Truth}((**I**`x = y`)) =**f**otherwise.

*Subclass*:*TVal*_{I}(`sc ## cl`) =**I**_{Truth}((**I**`sc ## cl`)).To ensure that the operator

`##`is transitive, i.e.,`c1 ## c2`and`c2 ## c3`imply`c1 ## c3`, the following is required: For all`c1`,`c2`,`c3`∈,**D**`glb`_{t}(*TVal*_{I}(`c1 ## c2`),*TVal*_{I}(`c2 ## c3`)) ≤_{t}*TVal*_{I}(`c1 ## c3`).

*Membership*:*TVal*_{I}(`o # cl`) =**I**_{Truth}((**I**`o # cl`)).To ensure that all members of a subclass are also members of the superclass, i.e.,

`o # cl`and`cl ## scl`implies`o # scl`, the following is required: For all`o`,`cl`,`scl`∈,**D**`glb`_{t}(*TVal*_{I}(`o # cl`),*TVal*_{I}(`cl ## scl`)) ≤_{t}*TVal*_{I}(`o # scl`).

*Frame*:*TVal*_{I}(`o[a`_{1}`->v`_{1}`... a`_{k}`->v`_{k}`]`) =**I**_{Truth}((**I**`o[a`_{1}`->v`_{1}`... a`_{k}`->v`_{k}`]`)).- Since the different attribute/value pairs are supposed to be understood as conjunctions, the following is required:
*TVal*_{I}(`o[a`_{1}`->v`_{1}`... a`_{k}`->v`_{k}`]`) =`glb`_{t}(*TVal*_{I}(`o[a`_{1}`->v`_{1}`]`), ...,*TVal*_{I}(`o[a`_{k}`->v`_{k}`]`))

- Since the different attribute/value pairs are supposed to be understood as conjunctions, the following is required:
*Conjunction*:*TVal*_{I}(`And(`c_{1}... c_{n}`)`) =`glb`_{t}(*TVal*_{I}(c_{1}), ...,*TVal*_{I}(c_{n})).*Disjunction*:*TVal*_{I}(`Or(`c_{1}... c_{n}`)`) =`lub`_{t}(*TVal*_{I}(c_{1}), ...,*TVal*_{I}(c_{n})).*Negation*:*TVal*_{I}(`neg`φ) =`~`*TVal*_{I}(φ) and*TVal*_{I}(`naf`φ) =`~`*TVal*_{I}(φ)where

`~`is the idempotent operator of negation onintroduced in Section Truth Values. Note that both classical and default negation are interpreted the same way in any concrete semantic structure. The difference between the two kinds of negation comes into play when logical entailment is defined.**TV***Quantification*:*TVal*_{I}(`Exists``?v`_{1}...`?v`_{n}(φ)) =`lub`_{t}(*TVal*_{I*}(φ)) and*TVal*_{I}(`Forall``?v`_{1}...`?v`_{n}(φ)) =`glb`_{t}(*TVal*_{I*}(φ)).Here

`lub`_{t}(respectively,`glb`_{t}) is taken over all interpretations* of the form <**I**,**TV**,**DTS**,**D****I**_{C},***I**_{V},**I**_{F},**I**_{frame},**I**_{SF},**I**_{sub},**I**_{isa},**I**_{Truth}>, which are exactly like, except that the mapping**I*****I**_{V}, is used instead of**I**_{V}.***I**_{V}is defined to coincide with**I**_{V}on all variables except, possibly, on`?v`_{1},...,`?v`_{n}.*Rules*:*TVal*_{I}(*head*:-*body*) =**t**, if*TVal*_{I}(*head*) ≥_{t}*TVal*_{I}(*body*);*TVal*_{I}(*head*:-*body*) =**f**otherwise.

Note that rules and equality formulas are two-valued even if * TV* has more than two values.

A * model* of a set

**R**of formulas is a semantic structure

*such that*

**I***TVal*

_{I}(φ) =

**t**for every φ∈

**R**.

### Intended Models

The * semantics of a set of formulas*,

**R**, is the set of its

*. RIF-FLD does not specify what these intended structures are, leaving this to RIF dialects. There are different theories of how the intended sets of semantic structures are supposed to look like.*

**intended semantic structures**For the classical first-order logic, every semantic structure is intended. For RIF-BLD, which is based on Horn rules, intended semantic structures are defined only for rulesets: an intended semantic structure of a RIF-BLD ruleset **R** is the unique minimal Herbrand model of **R**. For the dialects in which rule bodies may contain literals negated with the negation-as-failure connective `naf`, only *some* of the minimal Herbrand models of a rule set are intended. Each dialect of RIF is supposed to define the notion of intended semantic structures precisely. The two most common theories of intended semantic structures are the so called * well-founded models* [GRS91] and

*[GL88].*

**stable models**The following example illustrates the notion of intended semantic structures. Suppose **R** consists of a single rule `p :- naf q`. If `naf` were interpreted as classical negation, `not`, then this rule would be simply equivalent to `p \/ q`, and so it would have two kinds of models: those where `p` is true and those where `q` is true. In contrast to first-order logic, most rule-based systems do not consider `p` and `q` symmetrically. Instead, they view the rule `p :- naf q` as a statement that `p` must be true if it is not possible to establish the truth of `q`. Since it is, indeed, impossible to establish the truth of `q`, such theories would derive `p` even though it does not logically follow from `p \/ q`. The logic underlying rule-based systems also assumes that only the *minimal* Herbrand models are intended (minimality here is with respect to the set of true facts). Furthermore, although our example has two minimal Herbrand models -- one where `p` is true and `q` is false, and the other where `p` is false, but `q` is true, only the first model is considered to be intended.

The above concept of intended models and the corresponding notion of logical entailment with respect to the intended models, defined below, is due to [Shoham87].

### Logical Entailment

We will now define what it means for a set of RIF formulas to entail a RIF formula. We assume that each ruleset has an associated set of intended semantic structures.

Let **R** be a set of RIF formulas and φ a closed RIF formula. We say that **R** * entails* φ, written as

**R**

`|=`φ, if and only if for every intended semantic structure

*of*

**I****R**and every ψ ∈

**R**, it is the case that

*TVal*

_{I}(ψ) ≤

*TVal*

_{I}(φ).

This general notion of entailment covers both first-order logic and non-monotonic logics that underlie many rule-based languages [Shoham87].