### Syntax of a RIF Dialect as a Specialization of RIF-FLD

The *syntax for a RIF dialect* can be obtained from the general syntactic framework of RIF by specializing the following parameters (which are defined in this document):

- The alphabet of RIF-FLD can be restricted.
- An *assignment of signatures* to each constant symbol.
  - Signatures determine which terms in the dialect are well-formed and which are not. The exact way this assignment is defined depends on the dialect. The assignment can be explicit or implicit (for instance, derived from the context in which each symbol is used).
- The *choice of the types of terms* supported by the dialect.
  - The RIF logic framework introduces the following types of terms:
    - constant
    - variable
    - positional
    - with named arguments
    - equality
    - frame
    - class membership
    - subclass
- The *choice of symbol spaces* supported by the dialect.
  - Symbol spaces determine the "shapes" of the symbols that are allowed by the syntax of the dialect.
- The *choice of the formulas* supported by the dialect.
  - RIF-FLD allows formulas of the following kinds to be built:
    - Atomic
    - Conjunction
    - Disjunction
    - Classical negation
    - Default negation
    - Rule
    - Quantification: universal and existential

### Alphabet

The *alphabet* of RIF-FLD consists of a countably infinite set of **constant symbols** `Const`, a countably infinite set of **variable symbols** `Var` (disjoint from `Const`), a countably infinite set of **argument names** `ArgNames` (disjoint from both `Const` and `Var`), connective symbols `And` and `Or`, quantifiers `Exists` and `Forall`, the symbols `=`, `#`, `##`, `:-`, `->`, `Naf`, `Neg`, and auxiliary symbols, such as "(" and ")". The set of connective symbols, quantifiers, `=`, etc., is disjoint from `Const` and `Var`. Variables are written as Unicode strings preceded with the symbol "?". The syntax for constant symbols is given in Section Symbol Spaces.

The language of RIF-FLD is the set of formulas constructed using the above alphabet according to the rules spelled out below.

### Terms

The most basic construct of a logic language is a *term*. RIF-FLD supports several kinds of terms: constants, variables, the regular *positional* terms, plus terms with *named arguments*, equality, *classification* terms, and *frames*. The word "*term*" will be used to refer to any kind of term. Formally, terms are defined as follows:

- *Constants and variables*. If `t` ∈ `Const` or `t` ∈ `Var` then `t` is a **simple term**.
- *Positional terms*. If `t` and `t`_{1}, ..., `t`_{n} are terms then `t(t`_{1} ... `t`_{n}`)` is a **positional term**. Positional terms in RIF-FLD generalize the regular notion of a term used in first-order logic. For instance, the above definition allows variables everywhere.
- *Terms with named arguments*. A **term with named arguments** is of the form `t(s`_{1}`->v`_{1} ... `s`_{n}`->v`_{n}`)`, where `t`, `v`_{1}, ..., `v`_{n} are terms (positional, with named arguments, frame, etc.), and `s`_{1}, ..., `s`_{n} are (not necessarily distinct) symbols from the set `ArgNames`. The term `t` here represents a predicate or a function; `s`_{1}, ..., `s`_{n} represent argument names; and `v`_{1}, ..., `v`_{n} represent argument values. Terms with named arguments are like regular positional terms except that the arguments are named and their order is immaterial. Note that a term with no arguments, like `f()`, is both positional and is also considered to have named arguments.
- *Equality terms*. An **equality term** has the form `t = s`, where `t` and `s` are terms.
- *Classification terms*. There are two kinds of classification terms: *class membership terms* (or just *membership terms*) and *subclass terms*.
  - `t#s` is a **membership term** if `t` and `s` are arbitrary terms.
  - `t##s` is a **subclass term** if `t` and `s` are arbitrary terms.
- *Frame terms*. `t[p`_{1}`->v`_{1} ... `p`_{n}`->v`_{n}`]` is a **frame term** (or simply a **frame**) if `t`, `p`_{1}, ..., `p`_{n}, `v`_{1}, ..., `v`_{n}, n ≥ 0, are arbitrary terms. As in the case of the terms with named arguments, the order of the properties `p`_{i}`->v`_{i} in a frame is immaterial.

Classification and frame terms are used to describe objects in object-based logics like F-logic [KLW95].

The above definition is very general. It makes no distinction between constant symbols that represent individuals, predicates, and function symbols. The same symbol can occur in multiple contexts at the same time. For instance, if `p`, `a`, and `c` are symbols then `p(p(a) p(a p c))` is a term. Even variables and general terms are allowed to occur in the position of predicates and function symbols, so `p(a)(?v(a c) p)` is also a term.

Frame, classification, and other terms can be freely nested, as exemplified by `p(?X q#r[p(1,2)->s](d->e f->g))`. Some language environments, like FLORA-2 [FL2], OO jDREW [OOjD], and CycL [CycL] support fairly large (partially overlapping) subsets of RIF-FLD terms, but most languages support much smaller subsets. RIF dialects are expected to carve out the appropriate subsets of RIF-FLD terms, and the general form of the RIF logic framework allows a considerable degree of freedom.

The mechanism that allows "carving out" of such subsets is called a *signature* and works as follows. The RIF-FLD language associates a signature with each symbol (both constant and variable symbols) and uses signatures to define what is called *well-formed terms*. Each RIF dialect is expected to select appropriate signatures for the symbols in its alphabet, and only the terms that are well-formed according to the selected signatures are allowed in that particular dialect.
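The term kinds defined in this section can be pictured as a small data type. The following Python sketch is illustrative only: the class names (`Positional`, `Named`, `Binary`, `Frame`) and the choice of `frozenset` for unordered arguments are assumptions of this example, not part of RIF-FLD. In particular, the sketch does not identify the positional `f()` with the named-argument `f()`.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str  # written ?name in the presentation syntax


@dataclass(frozen=True)
class Positional:
    head: "Term"
    args: Tuple["Term", ...]  # argument order matters


@dataclass(frozen=True)
class Named:
    head: "Term"
    # (argument name, value) pairs; a frozenset makes the order immaterial.
    # Assumption of this sketch: identical duplicate pairs collapse into one.
    args: FrozenSet[Tuple[str, "Term"]]


@dataclass(frozen=True)
class Binary:
    op: str  # one of "=", "#", "##"
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Frame:
    obj: "Term"
    props: FrozenSet[Tuple["Term", "Term"]]  # unordered (property, value) pairs


Term = Union[Const, Var, Positional, Named, Binary, Frame]

# p(a)(?v(a c) p): a term in predicate position, a variable as a function symbol
p, a, c = Const("p"), Const("a"), Const("c")
example = Positional(Positional(p, (a,)), (Positional(Var("v"), (a, c)), p))

# f(d->e g->h) equals f(g->h d->e) because named arguments are unordered
f, e, h = Const("f"), Const("e"), Const("h")
assert Named(f, frozenset({("d", e), ("g", h)})) == Named(f, frozenset({("g", h), ("d", e)}))
```

Nothing in this representation restricts where a symbol may occur; that restriction is the job of signatures.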

### Signatures

In this section we introduce the concept of a *signature*, which is a key mechanism that allows RIF-FLD to control the context in which the various symbols are allowed to occur. Much of this development is inspired by [CK95]. It should be kept in mind that signatures are *not* part of the logic language in RIF, since they do not appear anywhere in the RIF formulas. Instead they are part of a separate language for signatures, which is akin to grammar rules in that it determines which sequences of tokens are in the language and which are not. In some dialects (for example RIF-BLD), signatures are derived from the context and no separate language for signatures is used. Other dialects may choose to specify signatures explicitly. In that case, they will need to define a concrete language for specifying signatures.

Let `SigNames` be a non-empty, partially-ordered finite or countably infinite set of symbols, disjoint from `Const`, `Var`, and `ArgNames`, called *signature names*. We require that this set includes at least the following signature names:

- `atomic` -- used to represent the syntactic context where atomic formulas are allowed to appear.
- `=` -- used for representing contexts where equality terms can appear.
- `#` -- a signature name reserved for membership terms.
- `##` -- a signature name reserved for subclass terms.
- `->` -- a signature name reserved for frame terms.

Dialects are expected to introduce additional signature names. For instance, RIF-BLD introduces one other signature name, `term`. The partial order on `SigNames` is dialect-specific; it is used in the definition of well-formed terms below.

We use the symbol < to represent the partial order on `SigNames`. Informally, α < β means that terms with signature α can be used wherever terms with signature β are allowed. We will write α ≤ β if either α = β or α < β.

A *signature* is a statement of the form η{`e`_{1}, ..., `e`_{n}, ...} where η ∈ `SigNames` is the name of the signature and {`e`_{1}, ..., `e`_{n}, ...} is a countable set of *arrow expressions*. Such a set can thus be infinite, finite, or even empty. In RIF-BLD, signatures can have at most one arrow expression. Other dialects (such as HiLog [CKW93], for example) may require polymorphic symbols and thus allow signatures with more than one arrow expression in them.

An *arrow expression* is defined as follows:

- If κ, κ_{1}, ..., κ_{n} ∈ `SigNames`, n≥0, are signature names then (κ_{1} ... κ_{n}) ⇒ κ is a **positional arrow expression**. For instance, () ⇒ `term` and (`term`) ⇒ `term` are arrow expressions, if `term` is a signature name.
- If κ, κ_{1}, ..., κ_{n} ∈ `SigNames`, n≥0, are signature names and `p`_{1}, ..., `p`_{n} ∈ `ArgNames` are argument names then (`p`_{1}`->`κ_{1} ... `p`_{n}`->`κ_{n}) ⇒ κ is an **arrow expression with named arguments**. For instance, (`arg1->term` `arg2->term`) ⇒ `term` is an arrow expression with named arguments. The order of the arguments in arrow expressions with named arguments is immaterial, so any permutation of arguments yields the same expression.
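The difference between the two kinds of arrow expressions, ordered versus unordered arguments, can be made concrete with a short Python sketch. The class names `PositionalArrow` and `NamedArrow` and the helper `named_arrow` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class PositionalArrow:
    args: Tuple[str, ...]  # signature names k1 ... kn; order matters
    result: str            # signature name on the right of the arrow


@dataclass(frozen=True)
class NamedArrow:
    # (argument name, signature name) pairs; a frozenset ignores order
    args: FrozenSet[Tuple[str, str]]
    result: str


def named_arrow(result, **args):
    """Build a named-argument arrow expression from keyword arguments."""
    return NamedArrow(frozenset(args.items()), result)


# () => term and (term) => term are distinct positional arrow expressions
nullary = PositionalArrow((), "term")
unary = PositionalArrow(("term",), "term")

# (arg1->term arg2->term) => term, written in two different orders
e1 = named_arrow("term", arg1="term", arg2="term")
e2 = named_arrow("term", arg2="term", arg1="term")
assert e1 == e2  # any permutation yields the same expression
```

A signature η{`e`_{1}, ..., `e`_{n}, ...} could then be modeled as a name paired with a set of such objects.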

A set `S` of signatures is *coherent* iff

- `S` contains the special signature `atomic{ }`, which represents the context of atomic formulas.
- `S` contains the signature `=`{`e`_{1}, ..., `e`_{n}, ...} for the equality symbol. All arrow expressions `e`_{i} here have the form (κ κ) ⇒ γ (both arguments in an equation must have the same signature) and at least one of these expressions must have the form (κ κ) ⇒ `atomic` (i.e., some equations should be allowed as atomic formulas). Dialects may further specialize this signature.
- `S` contains the signature `#`{`e`_{1}, ..., `e`_{n}, ...} where all arrow expressions `e`_{i} are binary (have two arguments) and at least one has the form (κ γ) ⇒ `atomic`. Dialects may further specialize this signature.
- `S` contains the signature `##`{`e`_{1}, ..., `e`_{n}, ...} where all arrow expressions `e`_{i} have the form (κ κ) ⇒ γ (both arguments must have the same signature) and at least one of these arrow expressions has the form (κ κ) ⇒ `atomic`. Dialects may further specialize this signature.
- `S` contains the signature `->`{`e`_{1}, ..., `e`_{n}, ...}, where all arrow expressions `e`_{i} are ternary (have three arguments) and at least one of them is of the form (κ_{1} κ_{2} κ_{3}) ⇒ `atomic`. Dialects may further specialize this signature.
- `S` has at most one signature for any given signature name.
- Whenever `S` contains a pair of signatures, η`S` and κ`R`, such that η<κ then `R` ⊆ `S`. Here η`S` denotes a signature with the name η and the associated set of arrow expressions `S`; similarly κ`R` is a signature named κ with the set of expressions `R`. The requirement that `R` ⊆ `S` ensures that symbols that have signature η can be used wherever the symbols with signature κ are allowed.
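The coherence conditions can be read as a checklist. The Python sketch below is one possible encoding, under these assumptions: a set of signatures is a dictionary from signature name to a set of positional arrow expressions, each written as `(args, result)`; the strict order < is given as a set of `(eta, kappa)` pairs; and the function name `is_coherent` is hypothetical. Using a dictionary enforces "at most one signature per name" by construction.

```python
def is_coherent(sigs, less_than=frozenset()):
    """Check the coherence conditions for a finite set of signatures."""
    # The special signature atomic{ } must be present and empty.
    if "atomic" not in sigs or len(sigs["atomic"]) != 0:
        return False

    def reserved_ok(name, arity, same_args):
        arrows = sigs.get(name, set())
        shape_ok = all(
            len(args) == arity and (not same_args or len(set(args)) == 1)
            for args, _ in arrows
        )
        # At least one arrow expression must yield `atomic`.
        return shape_ok and any(result == "atomic" for _, result in arrows)

    if not (reserved_ok("=", 2, True) and reserved_ok("#", 2, False)
            and reserved_ok("##", 2, True) and reserved_ok("->", 3, False)):
        return False

    # eta < kappa requires kappa's arrow expressions to be a subset of eta's.
    return all(sigs[kappa] <= sigs[eta]
               for eta, kappa in less_than
               if eta in sigs and kappa in sigs)


sigs = {
    "atomic": set(),
    "term": set(),
    "=": {(("term", "term"), "atomic")},
    "#": {(("term", "term"), "atomic")},
    "##": {(("term", "term"), "atomic")},
    "->": {(("term", "term", "term"), "atomic")},
}
assert is_coherent(sigs)
```

Only finite sets of arrow expressions can be checked this way; the definition itself also admits countably infinite sets.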

### Well-formed Terms and Formulas

Signatures are used to control the context in which various symbols are allowed to occur, as explained next.

Each variable symbol is associated with *exactly one* signature from a coherent set of signatures. A constant symbol can have *one or more* signatures, and different symbols can be associated with the same signature. Since signature names uniquely identify signatures in coherent signature sets, we will often refer to signatures simply by their names. For instance, if one of `f`'s signatures is `atomic{ }`, we may simply say that symbol `f` *has* signature `atomic`.

Next we define *well-formed terms* and their signatures. Like the constant symbols, well-formed terms can have more than one signature.

- A *constant* or *variable* symbol with signature η is a well-formed term with signature η.
- A *positional term* `t(t`_{1} ... `t`_{n}`)`, 0≤n, is well-formed and has a signature σ iff
  - `t` is a well-formed term that has a signature that contains an arrow expression of the form (σ_{1} ... σ_{n}) ⇒ σ; and
  - Each `t`_{i} is a well-formed term whose signature is γ_{i}, such that γ_{i} ≤ σ_{i}.

  As a special case, when `n=0` we obtain that `t( )` is a well-formed term with signature σ, if `t`'s signature contains the arrow expression () ⇒ σ.
- A *term with named arguments* `t(p`_{1}`->t`_{1} ... `p`_{n}`->t`_{n}`)`, 0≤n, is well-formed and has a signature σ iff
  - `t` is a well-formed term that has a signature that contains an arrow expression with named arguments of the form (`p`_{1}`->`σ_{1} ... `p`_{n}`->`σ_{n}) ⇒ σ; and
  - Each `t`_{i} is a well-formed term whose signature is γ_{i}, such that γ_{i} ≤ σ_{i}.

  As a special case, when `n=0` we obtain that `t( )` is a well-formed term with signature σ, if `t`'s signature contains the arrow expression () ⇒ σ.
- An *equality term* of the form `t`_{1}`=t`_{2} is well-formed and has a signature κ iff
  - The signature `=` has an arrow expression (σ σ) ⇒ κ
  - `t`_{1} and `t`_{2} are well-formed terms with signatures γ_{1} and γ_{2}, respectively, such that γ_{i} ≤ σ, `i=1,2`.
- A *membership term* of the form `t`_{1}`#t`_{2} is well-formed and has a signature κ iff
  - The signature `#` has an arrow expression (σ_{1} σ_{2}) ⇒ κ
  - `t`_{1} and `t`_{2} are well-formed terms with signatures γ_{1} and γ_{2}, respectively, such that γ_{i} ≤ σ_{i}, `i=1,2`.
- A *subclass term* of the form `t`_{1}`##t`_{2} is well-formed and has a signature κ iff
  - The signature `##` has an arrow expression (σ σ) ⇒ κ
  - `t`_{1} and `t`_{2} are well-formed terms with signatures γ_{1} and γ_{2}, respectively, such that γ_{i} ≤ σ, `i=1,2`.
- A *frame term* of the form `t[s`_{1}`->v`_{1} ... `s`_{n}`->v`_{n}`]` is well-formed and has a signature κ iff
  - The signature `->` has arrow expressions (σ σ_{11} σ_{12}) ⇒ κ, ..., (σ σ_{n1} σ_{n2}) ⇒ κ (these `n` expressions need not be distinct).
  - `t`, `s`_{j}, and `v`_{j} are well-formed terms with signatures γ, γ_{j1}, and γ_{j2}, respectively, such that γ ≤ σ and γ_{ji} ≤ σ_{ji}, where `j=1,...,n` and `i=1,2`.

Note that, according to the above definition, `f()` and `f` are distinct terms. We define atomic formulas as follows:

- A term is a **well-formed atomic formula** iff it is a well-formed term one of whose signatures is η, such that η = `atomic` or η < `atomic`.

Note that equality, membership, subclass, and frame terms are always atomic formulas, since `atomic` is always one of their signatures.

More general formulas are constructed out of atomic formulas with the help of logical connectives. A *formula* is a statement that can have one of the following forms:

- *Atomic*: If φ is a well-formed atomic formula then it is also a well-formed formula.
- *Conjunction*: If φ_{1}, ..., φ_{n}, `n` ≥ 0, are well-formed formulas then so is `And(`φ_{1} ... φ_{n}`)`. As a special case, `And()` is allowed and is treated as a tautology, i.e., a formula that is always true.
- *Disjunction*: If φ_{1}, ..., φ_{n}, `n` ≥ 0, are well-formed formulas then so is `Or(`φ_{1} ... φ_{n}`)`. When `n=0`, we get `Or()` as a special case; it is treated as a formula that is always false.
- *Classical negation*: If φ is a well-formed formula then `Neg` φ is a well-formed formula.
- *Default negation*: If φ is a well-formed formula then `Naf` φ is a well-formed formula.
- *Rule*: If φ and ψ are well-formed formulas then φ `:-` ψ is a well-formed formula.
- *Quantification*: If φ is a well-formed formula and `?V`_{1}, ..., `?V`_{n} are variables then `Exists` `?V`_{1} ... `?V`_{n} `(`φ`)` and `Forall` `?V`_{1} ... `?V`_{n} `(`φ`)` are well-formed formulas.
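The treatment of the empty conjunction and the empty disjunction mirrors the usual conventions for "all of nothing" and "any of nothing". The following Python sketch illustrates this for ground formulas built only from atoms, `And`, and `Or`. It assumes a plain two-valued reading and a hypothetical encoding (atoms as strings, compound formulas as tuples); `Neg`, `Naf`, rules, and quantifiers are left out because their meaning is fixed by the semantic framework and by each dialect.

```python
def holds(formula, true_atoms):
    """Evaluate a ground formula against the set of atoms taken to be true."""
    if isinstance(formula, str):        # an atomic formula
        return formula in true_atoms
    op, *subformulas = formula          # ("And", f1, ..., fn) or ("Or", ...)
    if op == "And":
        return all(holds(f, true_atoms) for f in subformulas)
    if op == "Or":
        return any(holds(f, true_atoms) for f in subformulas)
    raise ValueError(f"unsupported connective: {op}")


assert holds(("And",), set()) is True    # And() is always true
assert holds(("Or",), set()) is False    # Or() is always false
```

With this encoding, `And(p Or())` is false even when `p` is true, since its second conjunct can never hold.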

**Example 1** (The use of signatures)

We illustrate the above definitions with the following examples. In addition to `atomic`, let there be another signature, `term{ }`, which is also used in RIF-BLD.

Consider the term `p(p(a) p(a b c))`. If `p` has the (polymorphic) signature `mysig`{(`term`)⇒`term`, (`term` `term`)⇒`term`, (`term` `term` `term`)⇒`term`} and `a`, `b`, `c` each has the signature `term{ }` then `p(p(a) p(a b c))` is a well-formed term with signature `term{ }`. If instead `p` had the signature `mysig2`{(`term` `term`)⇒`term`, (`term` `term` `term`)⇒`term`} then `p(p(a) p(a b c))` would not be a well-formed term since then `p(a)` would not be well-formed (in this case, `p` would have no arrow expression which allows `p` to take just one argument).

For a more complex example, let `r` have the signature `mysig3`{(`term`)⇒`atomic`, (`atomic` `term`)⇒`term`, (`term` `term` `term`)⇒`term`}. Then `r(r(a) r(a b c))` is well-formed. The interesting twist here is that `r(a)` is an atomic formula that occurs as an argument to a function symbol. However, this is allowed by the arrow expression (`atomic` `term`)⇒`term`, which is part of `r`'s signature. If `r`'s signature were `mysig4`{(`term`)⇒`atomic`, (`atomic` `term`)⇒`atomic`, (`term` `term` `term`)⇒`term`} instead, then `r(r(a) r(a b c))` would be not only a well-formed term, but also a well-formed atomic formula.

An even more advanced example of signatures is when the right-hand side of an arrow expression is something other than `term` or `atomic`. For instance, let `John`, `Mary`, `NewYork`, and `Boston` have signatures `term{ }`; `flight` and `parent` have signature `h`_{2}{(`term` `term`)⇒`atomic`}; and `closure` has signature `hh`_{1}{(`h`_{2})⇒`p`_{2}}, where `p`_{2} is the name of the signature `p`_{2}{(`term` `term`)⇒`atomic`}. Then `flight(NewYork Boston)`, `closure(flight)(NewYork Boston)`, `parent(John Mary)`, and `closure(parent)(John Mary)` would be well-formed formulas. Such formulas are allowed in languages like HiLog [CKW93], which support predicate constructors like `closure` in the above example.
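The cases in Example 1 can be replayed mechanically. The Python sketch below computes the set of signature names a positional term can have; an empty set means the term is not well-formed. The encoding is an assumption of this sketch: symbols are strings, the tuple `("p", "a")` stands for `p(a)`, the head of a tuple may itself be a tuple, and the order ≤ defaults to equality of signature names. The function name `signatures_of` is hypothetical, and only positional terms are covered.

```python
def signatures_of(term, symbol_sigs, sigs, leq=lambda a, b: a == b):
    """Return the set of signature names that a positional term can have."""
    if isinstance(term, str):                       # constant or variable symbol
        return set(symbol_sigs.get(term, ()))
    head, *args = term
    arg_sigs = [signatures_of(a, symbol_sigs, sigs, leq) for a in args]
    found = set()
    for name in signatures_of(head, symbol_sigs, sigs, leq):
        for params, result in sigs.get(name, ()):
            # Every argument needs some signature g with g <= the expected one.
            if len(params) == len(args) and all(
                any(leq(g, s) for g in gs) for gs, s in zip(arg_sigs, params)
            ):
                found.add(result)
    return found


T = "term"
sigs = {
    "term": set(),
    "atomic": set(),
    "mysig": {((T,), T), ((T, T), T), ((T, T, T), T)},
    "mysig2": {((T, T), T), ((T, T, T), T)},
    "mysig3": {((T,), "atomic"), (("atomic", T), T), ((T, T, T), T)},
    "mysig4": {((T,), "atomic"), (("atomic", T), "atomic"), ((T, T, T), T)},
    "h2": {((T, T), "atomic")},
    "p2": {((T, T), "atomic")},
    "hh1": {(("h2",), "p2")},
}
consts = {"a": {T}, "b": {T}, "c": {T}, "NewYork": {T}, "Boston": {T}}

p_term = ("p", ("p", "a"), ("p", "a", "b", "c"))
r_term = ("r", ("r", "a"), ("r", "a", "b", "c"))
closure_term = (("closure", "flight"), "NewYork", "Boston")

assert signatures_of(p_term, {**consts, "p": {"mysig"}}, sigs) == {"term"}
assert signatures_of(p_term, {**consts, "p": {"mysig2"}}, sigs) == set()
assert signatures_of(r_term, {**consts, "r": {"mysig3"}}, sigs) == {"term"}
assert signatures_of(r_term, {**consts, "r": {"mysig4"}}, sigs) == {"atomic"}
```

A term is then a well-formed atomic formula when the returned set contains `atomic` (or a name below it in the order), as with `closure(flight)(NewYork Boston)` once `closure` is given `hh1` and `flight` is given `h2`.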

### Symbol Spaces

Throughout this document, the `xsd:` prefix stands for the XML Schema namespace URI `http://www.w3.org/2001/XMLSchema#`, the `rdf:` prefix stands for `http://www.w3.org/1999/02/22-rdf-syntax-ns#`, and `rif:` stands for the URI of the RIF namespace, `http://www.w3.org/2007/rif#`. Syntax such as `xsd:string` should be understood as a *compact URI* [CURIE] -- a macro that expands to a concatenation of the character sequence denoted by the prefix `xsd` and the string `string`.
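The macro expansion of a compact URI amounts to a table lookup followed by string concatenation. A minimal Python sketch, assuming only the three prefixes listed above (the names `PREFIXES` and `expand_curie` are hypothetical):

```python
PREFIXES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rif": "http://www.w3.org/2007/rif#",
}


def expand_curie(curie):
    """Expand prefix:local into the full IRI string."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


assert expand_curie("xsd:string") == "http://www.w3.org/2001/XMLSchema#string"
```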

The set of all constant symbols in a RIF dialect is partitioned into a number of subsets, called *symbol spaces*, which are used to represent XML Schema data types, data types defined in other W3C specifications, such as `rdf:XMLLiteral`, and to distinguish other sets of constants. Constant symbols that belong to the various symbol spaces have special presentation syntax and semantics.

Formally, a *symbol space* is a named subset of the set of all constants, `Const`. The semantic aspects of symbol spaces will be described in Section Semantic Framework. Each symbol in `Const` belongs to exactly one symbol space.

Each symbol space has an associated lexical space and an identifier.

- The **lexical space** of a symbol space is a non-empty set of Unicode character strings.
- The **identifier** of a symbol space is an absolute IRI.

To simplify the language, we will often use symbol space identifiers to refer to the actual symbol spaces (for instance, we may use "symbol space `xsd:string`" instead of "symbol space *identified by* `xsd:string`").

To refer to a constant in a particular RIF symbol space, we use the following presentation syntax:

LITERAL^^SYMSPACE

where `LITERAL` is a Unicode string, called the *lexical part* of the symbol, and `SYMSPACE` is an **identifier** of the symbol space in the form of an absolute IRI string. `LITERAL` must be an element in the lexical space of the symbol space. For instance, `1.2^^xsd:decimal` and `1^^xsd:decimal` are legal symbols because 1.2 and 1 are members of the lexical space of the XML Schema data type `xsd:decimal`. On the other hand, `a+2^^xsd:decimal` is not a legal symbol, since `a+2` is not part of the lexical space of `xsd:decimal`.
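The legality check described above can be sketched in Python as splitting on `^^` and testing the lexical part against the lexical space of the named symbol space. The table `LEXICAL` and the function `parse_constant` are hypothetical. The `xsd:decimal` pattern follows the usual XML Schema lexical form (optional sign, digits, optional fraction), and treating every string as a valid `xsd:string` is a simplifying assumption.

```python
import re

# Regular expressions standing in for lexical spaces (only two are modeled).
LEXICAL = {
    "xsd:decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
    "xsd:string": re.compile(r".*", re.DOTALL),
}


def parse_constant(text):
    """Split LITERAL^^SYMSPACE and check LITERAL against the lexical space."""
    literal, sep, space = text.rpartition("^^")
    if not sep:
        raise ValueError("expected LITERAL^^SYMSPACE")
    pattern = LEXICAL.get(space)
    if pattern is None:
        raise ValueError(f"unsupported symbol space: {space}")
    if not pattern.fullmatch(literal):
        raise ValueError(f"{literal!r} is not in the lexical space of {space}")
    return literal, space


assert parse_constant("1.2^^xsd:decimal") == ("1.2", "xsd:decimal")
assert parse_constant("1^^xsd:decimal") == ("1", "xsd:decimal")
```

Calling `parse_constant("a+2^^xsd:decimal")` raises `ValueError`, matching the illegal symbol in the text.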

The set of all symbol spaces that partition `Const` is considered to be part of the logic language used by RIF rule sets.

RIF supports the following symbol spaces. Rule sets that are exchanged through RIF can use additional symbol spaces as explained below.

- `xsd:string` (`http://www.w3.org/2001/XMLSchema#string`) and all the symbol spaces that correspond to the subtypes of `xsd:string` as specified in [XML-SCHEMA2].
- `xsd:decimal` (`http://www.w3.org/2001/XMLSchema#decimal`) and all the symbol spaces that correspond to the subtypes of `xsd:decimal` as specified in [XML-SCHEMA2].
- `xsd:time` (`http://www.w3.org/2001/XMLSchema#time`).
- `xsd:date` (`http://www.w3.org/2001/XMLSchema#date`).
- `xsd:dateTime` (`http://www.w3.org/2001/XMLSchema#dateTime`).

The lexical spaces of the above symbol spaces are defined in the document [XML-SCHEMA2].

- `rdf:XMLLiteral` (`http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral`). This symbol space represents XML content. The lexical space of `rdf:XMLLiteral` is defined in the document [RDF-CONCEPTS].
- `rif:text` (for text strings with language tags attached). This symbol space represents text strings with a language tag attached. The lexical space of `rif:text` is the set of all Unicode strings of the form `...@LANG`, i.e., strings that end with `@LANG` where `LANG` is a language identifier as defined in [RFC-3066].
- `rif:iri` (for **internationalized resource identifiers** or **IRI**s). Constant symbols that belong to this symbol space are intended to be used in a way similar to RDF resources [RDF-SCHEMA]. The lexical space consists of all absolute IRIs as specified in [RFC-3987]; it is unrelated to the XML primitive type `anyURI`. A `rif:iri` constant is supposed to be interpreted as a reference to one and the same object regardless of the context in which that constant occurs.
- `rif:local` (for constant symbols that are not visible outside of a particular set of RIF formulas). Symbols in this symbol space are used locally in their respective rule sets. This means that occurrences of the same `rif:local`-constant in different rule sets are viewed as unrelated distinct constants, but occurrences of the same constant in the same rule set must refer to the same object. The lexical space of `rif:local` is the same as the lexical space of `xsd:string`.

**Notes on RIF-compliant support for symbol spaces.**

- A *RIF-compliant inference engine* must support the following symbol spaces: `xsd:string`, `xsd:decimal`, `xsd:time`, `xsd:date`, `xsd:dateTime`, `rdf:XMLLiteral`, `rif:text`, `rif:iri`, `rif:local`. Such an engine can support additional symbol spaces.
- A *RIF-producing system* includes a RIF-compliant inference engine and a transformation from the language of that engine into valid RIF XML format. Such an engine must support all the symbol spaces that are mentioned in the documents produced by the aforesaid transformation. In particular, this transformation must not produce invalid constant symbols, i.e., symbols whose lexical part is not an element of the lexical space of the symbol's symbol space.
- A *RIF-consuming system* includes a RIF-compliant inference engine and a transformation from RIF XML to the language of the engine. A consumer engine is not required to support all symbol spaces that are subspaces of the symbol spaces supported by the producer engine. For instance, a RIF-producer system might support `xsd:short`, a subspace of `xsd:decimal`, but RIF consumers do not need to support `xsd:short`. The consumer is allowed to replace the constants in an unsupported symbol space with the corresponding constant symbols in a supported superspace. For example, `"123"^^xsd:short` can be replaced with `"123"^^xsd:decimal` and `"abc123"^^xsd:IDREF` with `"abc123"^^xsd:string`. Such substitutions are permitted because they do not affect the inferences that can be made from RIF documents (see Section RIF Semantic Framework).
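The substitution a consumer may perform can be sketched as a walk up the type derivation chain until a supported symbol space is reached. In the Python sketch below, the table `BASE_TYPE` and the function `widen` are hypothetical names; the entries list only the XML Schema derivation steps needed for the two examples above.

```python
# Each derived XML Schema type mapped to the type it is derived from.
BASE_TYPE = {
    "xsd:short": "xsd:int",
    "xsd:int": "xsd:long",
    "xsd:long": "xsd:integer",
    "xsd:integer": "xsd:decimal",
    "xsd:IDREF": "xsd:NCName",
    "xsd:NCName": "xsd:Name",
    "xsd:Name": "xsd:token",
    "xsd:token": "xsd:normalizedString",
    "xsd:normalizedString": "xsd:string",
}


def widen(literal, space, supported):
    """Move a constant to the nearest supported superspace, keeping its literal."""
    while space not in supported:
        if space not in BASE_TYPE:
            raise ValueError(f"no supported superspace for {space}")
        space = BASE_TYPE[space]
    return literal, space


supported = {"xsd:string", "xsd:decimal"}
assert widen("123", "xsd:short", supported) == ("123", "xsd:decimal")
assert widen("abc123", "xsd:IDREF", supported) == ("abc123", "xsd:string")
```

The lexical part is left unchanged, which is safe here because the lexical space of a derived type is contained in that of its base type.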