W3C


RIF Framework for Logic Dialects

W3C Editor's Draft 23 July 2008

This version:
http://www.w3.org/2005/rules/wg/draft/ED-rif-fld-20080723/
Latest editor's draft:
http://www.w3.org/2005/rules/wg/draft/rif-fld/
Previous version:
http://www.w3.org/2005/rules/wg/draft/ED-rif-fld-20080717/ (color-coded diff)
Editors:
Harold Boley, National Research Council, Canada
Michael Kifer, State University of New York at Stony Brook, USA


Abstract

This document, developed by the Rule Interchange Format (RIF) Working Group, defines a general Framework for Logic-based RIF dialects (RIF-FLD). The framework describes mechanisms for specifying the syntax and semantics of logic-based RIF dialects through a number of generic concepts such as signatures, symbol spaces, semantic structures, and so on. The actual dialects are required to specialize this framework to produce their syntaxes and semantics.

Status of this Document

May Be Superseded

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is being published as one of a set of 6 documents:

  1. RIF Use Cases and Requirements
  2. RIF Basic Logic Dialect
  3. RIF Framework for Logic Dialects (this document)
  4. RIF RDF and OWL Compatibility
  5. RIF Production Rule Dialect
  6. RIF Datatypes and Built-Ins 1.0

Please Comment By 2008-07-28

The Rule Interchange Format (RIF) Working Group seeks public feedback on these Working Drafts. Please send your comments to public-rif-comments@w3.org (public archive). If possible, please offer specific changes to the text that would address your concern. You may also wish to check the Wiki Version of this document for internal-review comments and changes being drafted which may address your concerns.

No Endorsement

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Patents

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.


1 Overview of RIF-FLD

The RIF Framework for Logic-based Dialects (RIF-FLD) is a formalism for specifying all logic-based dialects of RIF, including the RIF Basic Logic Dialect [RIF-BLD]. It is a logic in which both syntax and semantics are described through a number of mechanisms that are commonly used for various logic languages, but are rarely brought all together. Amalgamation of several different mechanisms is required because the framework must be broad enough to accommodate several different types of logic languages and because various advanced mechanisms are needed to facilitate translation into a common framework. RIF-FLD gives precise definitions to these mechanisms, but allows certain details to vary. The design of RIF envisages that future standard logic dialects will be based on RIF-FLD. Therefore, any logic dialect being developed to become a standard should either be a specialization of FLD or justify its deviations from (or extensions to) FLD.

The framework described in this document is very general and captures most of the popular logic-based rule languages found in Databases, Logic Programming, and on the Semantic Web. However, it is anticipated that the needs of future dialects might stimulate further evolution of RIF-FLD. In particular, future extensions might include a logic rendering of actions as found in production and reactive rule languages.

This document is mostly intended for the designers of future RIF dialects. All logic-based RIF dialects are required to be derived from RIF-FLD by specialization, as explained in Sections Syntax of a RIF Dialect as a Specialization of RIF-FLD and Semantics of a RIF Dialect as a Specialization of RIF-FLD. In addition to specialization, to lower the barrier of entry for their intended audiences, a dialect designer may choose to specify the syntax and semantics in a direct, but equivalent, way, which does not require familiarity with RIF-FLD. For instance, the RIF Basic Logic Dialect [RIF-BLD] is specified both by specialization from RIF-FLD and also directly, without relying on the framework. Thus, the reader who is interested in [RIF-BLD] only can proceed directly to that document.

RIF-FLD has the following main components:

Syntactic framework. The syntactic framework defines six types of RIF terms:

Terms are then used to define several types of RIF-BLD formulas. RIF dialects can choose to permit all or some of the aforesaid categories of terms. The syntactic framework also defines the following specialization mechanisms:

Semantic framework. This framework defines the notion of a semantic structure (also known as interpretation in the literature [Enderton01, Mendelson97]). Semantic structures are used to interpret formulas and to define logical entailment. As with the syntax, this framework includes a number of mechanisms that RIF logic-based dialects can specialize to suit their needs. These mechanisms include:

XML serialization framework. This framework defines the general principles for mapping the presentation syntax of RIF-FLD to the concrete XML interchange format. This includes:

This document is the latest draft of the RIF-FLD specification. Each RIF dialect that is derived from RIF-FLD will be described in its own document. The first of such dialects, RIF Basic Logic Dialect, is described in [RIF-BLD].

2 Syntactic Framework

The next subsection explains how to derive the presentation syntax of a RIF dialect from the presentation syntax of the RIF framework. The actual syntax of the RIF framework is given in subsequent subsections.


2.1 Syntax of a RIF Dialect as a Specialization of RIF-FLD

The presentation syntax for a RIF dialect can be obtained from the general syntactic framework of RIF by specializing the following parameters, which are defined later in this document:

  1. The alphabet of RIF-FLD can be restricted (by omitting symbols).
  2. An assignment of signatures to each constant and variable symbol.

    Signatures determine which terms in the dialect are well-formed and which are not.

    The exact way signatures are assigned depends on the dialect. An assignment can be explicit or implicit (for instance, derived from the context in which each symbol is used).

  3. The choice of the types of terms supported by the dialect.

    The RIF logic framework introduces the following types of terms:

    • constant
    • variable
    • positional
    • with named arguments
    • equality
    • frame
    • class membership
    • subclass
    • external

    A dialect might support all of these terms or just a subset. For instance, some dialects might not support terms with named arguments or frame terms or certain forms of external terms (e.g., external frames).

  4. The choice of symbol spaces supported by the dialect.

    Symbol spaces determine the syntax of the constant symbols that are allowed in the dialect.

  5. The choice of the formulas supported by the dialect.

    RIF-FLD allows formulas of the following kind:

    • Atomic
    • Conjunction
    • Disjunction
    • Classical negation
    • Default negation (as in logic programming)
    • Rule (as in logic programming as opposed to the classical material implication)
    • Quantification (universal and existential)

    A dialect might support all of these formulas or it might impose various restrictions. For instance, the formulas allowed in the conclusion and/or premises of implications might be restricted (e.g., [RIF-BLD] essentially allows Horn rules only), certain types of quantification might be prohibited (e.g., [RIF-BLD] disallows existential quantification in the rule head), classical or default negation (or both) might not be allowed (as in RIF-BLD), etc. A subdialect of RIF-BLD might disallow equality formulas in the conclusions of the rules.

Note that although the presentation syntax of a RIF logic-based dialect is normative, since semantics is defined in terms of that syntax, the presentation syntax is not intended as a concrete syntax, and conformant systems are not required to implement it.


2.2 Alphabet

Definition (Alphabet). The alphabet of the presentation language of RIF-FLD consists of

The set of connective symbols, quantifiers, =, etc., is disjoint from Const and Var. Variables are written as Unicode strings preceded by the symbol "?". The argument names in ArgNames are written as Unicode strings that do not start with a "?". The syntax for constant symbols is given in Section Symbol Spaces.

The symbols =, #, and ## are used in formulas that define equality, class membership, and subclass relationships. The symbol -> is used in terms that have named arguments and in frame terms. The symbol External indicates that an atomic formula or a function term is defined externally (e.g., a builtin), Dialect is a directive used to indicate the dialect of a RIF document (for those dialects that require this), the symbols Base and Prefix are used in abridged representations of IRIs, and Import is an import directive.

Finally, the symbol Document is used for specifying RIF-FLD documents and the symbol Group is used to organize RIF-FLD formulas into collections.   ☐



2.3 Symbol Spaces

Throughout this document, we will be using the following abbreviations:

These and other abbreviations will be used as prefixes in the compact URI notation [CURIE], a notation for succinct representation of IRIs. The precise meaning of this notation in RIF is defined in [RIF-DTB].

The set of all constant symbols in a RIF dialect is partitioned into a number of subsets, called symbol spaces, which are used to represent XML Schema datatypes, datatypes defined in other W3C specifications, such as rdf:XMLLiteral, and to distinguish other sets of constants. All constant symbols have a syntax (and sometimes also semantics) imposed by the symbol space to which they belong.

Definition (Symbol space). A symbol space is a named subset of the set of all constants, Const. The semantic aspects of symbol spaces will be described in Section Semantic Framework. Each symbol in Const belongs to exactly one symbol space.

Each symbol space has an associated lexical space and a unique identifier. More precisely,

The identifiers for symbol spaces are not themselves constant symbols in RIF.   ☐

To simplify the language, we will often use symbol space identifiers to refer to the actual symbol spaces (for instance, we may use "symbol space xs:string" instead of "symbol space identified by xs:string").

To refer to a constant in a particular RIF symbol space, we use the following presentation syntax:

     "literal"^^symspace

where literal is called the lexical part of the symbol, and symspace is an identifier of the symbol space. Here literal is a sequence of Unicode characters that must be an element in the lexical space of the symbol space symspace. For instance, "1.2"^^xs:decimal and "1"^^xs:decimal are syntactically valid constants because 1.2 and 1 are members of the lexical space of the XML Schema datatype xs:decimal. On the other hand, "a+2"^^xs:decimal is not a syntactically valid symbol, since a+2 is not part of the lexical space of xs:decimal.
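The check described above can be sketched in Python. This is only an illustration under stated assumptions: the `LEXICAL_SPACES` registry, the `is_valid_constant` helper, and the decision to reject unknown symbol spaces are hypothetical and not part of RIF, and the sketch does not handle escaped quotation marks inside the lexical part.

```python
import re

# Hypothetical registry: symbol space identifier -> pattern for its lexical space.
# Only two illustrative entries are given; real dialects support many more.
LEXICAL_SPACES = {
    # Optional sign, then digits with an optional fractional part.
    "xs:decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
    # Any sequence of characters.
    "xs:string": re.compile(r".*", re.DOTALL),
}

# Shape of a constant in the presentation syntax: "literal"^^symspace
CONSTANT = re.compile(r'"(?P<literal>.*)"\^\^(?P<symspace>\S+)', re.DOTALL)


def is_valid_constant(text):
    """Return True if text has the form "literal"^^symspace and the lexical
    part belongs to the lexical space of the named symbol space."""
    m = CONSTANT.fullmatch(text)
    if m is None:
        return False
    pattern = LEXICAL_SPACES.get(m.group("symspace"))
    if pattern is None:
        # Assumption of this sketch: unknown symbol spaces are rejected.
        return False
    return pattern.fullmatch(m.group("literal")) is not None
```

With this sketch, `is_valid_constant('"1.2"^^xs:decimal')` and `is_valid_constant('"1"^^xs:decimal')` hold, while `is_valid_constant('"a+2"^^xs:decimal')` does not.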

The set of all symbol spaces that partition Const is considered to be part of the logic language of RIF-FLD.

RIF requires that all dialects include the symbol spaces listed and described in Section Constants and Symbol Spaces of [RIF-DTB] as part of their language. These symbol spaces include constants that belong to several important XML Schema datatypes, certain RDF datatypes, and constant symbols specific to RIF. The latter include the symbol spaces rif:iri and rif:local, which are used to represent internationalized resource identifiers (IRIs) and constant symbols that are not visible outside of the RIF document in which they occur, respectively. Documents that are exchanged through RIF can use additional symbol spaces.


The lexical spaces of the mandatory RIF symbol spaces are described in Section Constants and Symbol Spaces of [RIF-DTB].


2.4 Terms

The most basic construct of a logic language is a term. RIF-FLD supports several kinds of terms: constants, variables, the regular positional terms, plus terms with named arguments, equality, classification terms, and frames. The word "term" will be used to refer to any kind of term.

Definition (Term). A term is a statement of one of the following forms:

  1. Constants and variables. If t ∈ Const or t ∈ Var then t is a simple term.
  2. Positional terms. If t and t1, ..., tn are terms then t(t1 ... tn) is a positional term.

    Positional terms in RIF-FLD generalize the regular notion of a term used in first-order logic. For instance, the above definition allows variables everywhere, as in ?X(?Y ?Z(?V "12"^^xs:integer)), where ?X, ?Y, ?Z, and ?V are variables. Even ?X("abc"^^xs:string ?W)(?Y ?Z(?V "33"^^xs:integer)) is a positional term (as in HiLog [CKW93]).

  3. Terms with named arguments. A term with named arguments is of the form t(s1->v1 ... sn->vn), where t, v1, ..., vn are terms, and s1, ..., sn are (not necessarily distinct) symbols from the set ArgNames.

    The term t here represents a predicate or a function; s1, ..., sn represent argument names; and v1, ..., vn represent argument values. Terms with named arguments are like regular positional terms except that the arguments are named and their order is immaterial. Note that a term with no arguments, like f(), is, trivially, both a positional term and a term with named arguments.

    For instance, "person"^^xs:string(name->?Y address->?Z), ?X("123"^^xs:integer ?W)(arg->?Y arg2->?Z(?V)), and "Closure"^^rif:local(relation->"http://example.com/Flight"^^rif:iri)(from->?X to->?Y) are terms with named arguments. The second of these terms has a positional term, ?X("123"^^xs:integer ?W), which occurs in the position of a function, and the third term's function is represented by a term with named arguments.

  4. Equality terms. An equality term has the form t = s, where t and s are terms.
  5. Classification terms. There are two kinds of classification terms: class membership terms (or just membership terms) and subclass terms.

    Classification terms are used to describe class hierarchies.

  6. Frame terms. t[p1->v1 ... pn->vn] is a frame term (or simply a frame) if t, p1, ..., pn, v1, ..., vn, n ≥ 0, are terms.

    Frame terms are used to describe properties of objects. As in the case of the terms with named arguments, the order of the properties pi->vi in a frame is immaterial.

  7. Externally defined terms. If t is a constant, a positional term, a term with named arguments, or a frame term then External(t) is an externally defined term.

    Such terms are used for representing builtin functions and predicates as well as "procedurally attached" terms or predicates, which might exist in various rule-based systems, but are not specified by RIF.

    Note that the above syntax allows very general interfaces to access externally defined data sources: not only predicates and functions, but also frames can be used. In this way, externally defined objects can be accessed using the more natural frame-based interface. For instance, External("http://example.com/acme"^^rif:iri["http://example.com/mycompany/president"^^rif:iri(?Year) -> ?Pres]) could be an interface provided to access an externally defined method "http://example.com/mycompany/president"^^rif:iri of an external object "http://example.com/acme"^^rif:iri.   ☐

The above definitions are very general. They make no distinction between constant symbols that represent individuals, predicates, and function symbols. The same symbol can occur in multiple contexts at the same time. For instance, if p, a, and c are symbols then p(p(a) p(a p c)) is a term. Even variables and general terms are allowed to occur in the position of predicates and function symbols, so p(a)(?v(a c) p) is also a term.

Frame, classification, and other terms can be freely nested, as exemplified by p(?X  q#r[p(1,2)->s](d->e f->g)). Some language environments, like FLORA-2 [FL2], OO jDREW [OOjD], NxBRE [NxBRE], and CycL [CycL] support fairly large (partially overlapping) subsets of RIF-FLD terms, but most languages support much smaller subsets. RIF dialects are expected to carve out the appropriate subsets of RIF-FLD terms, and the general form of the RIF logic framework allows a considerable degree of freedom.
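To make the generality of these definitions concrete, the following Python sketch models a few of the term forms as plain data and renders them back into the presentation syntax. The class names (`Const`, `Var`, `Positional`, `Named`, `Frame`) and the `show` function are hypothetical names chosen for illustration; equality, classification, and external terms are omitted for brevity, and tuples are used only to fix a printing order, even though the order of named arguments and frame properties is immaterial.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Positional:
    head: "Term"                           # any term may occur in head position
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class Named:
    head: "Term"
    args: Tuple[Tuple[str, "Term"], ...]   # (argument name, value) pairs


@dataclass(frozen=True)
class Frame:
    obj: "Term"
    slots: Tuple[Tuple["Term", "Term"], ...]  # (property, value) pairs


Term = Union[Const, Var, Positional, Named, Frame]


def show(t):
    """Render a term in (a simplified form of) the presentation syntax."""
    if isinstance(t, Const):
        return t.name
    if isinstance(t, Var):
        return "?" + t.name
    if isinstance(t, Positional):
        return show(t.head) + "(" + " ".join(show(a) for a in t.args) + ")"
    if isinstance(t, Named):
        return show(t.head) + "(" + " ".join(f"{n}->{show(v)}" for n, v in t.args) + ")"
    if isinstance(t, Frame):
        return show(t.obj) + "[" + " ".join(f"{show(p)}->{show(v)}" for p, v in t.slots) + "]"
    raise TypeError(f"not a term: {t!r}")


p, a, c = Const("p"), Const("a"), Const("c")
# A positional term whose head is itself a positional term.
example = Positional(Positional(p, (a,)), (Positional(Var("v"), (a, c)), p))
```

Here `show(example)` yields `p(a)(?v(a c) p)`, the term discussed above in which a general term and a variable occur in function position.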

Observe that the argument names of frame terms, p1, ..., pn, are terms and, as a special case, can be variables. In contrast, terms with named arguments can use only the symbols from ArgNames to represent their argument names. They cannot be constants from Const or variables from Var. The reason for this restriction has to do with the complexity of unification, which is an integral part of many inference rules underlying first-order logic. We are not aware of any rule language where terms with named arguments use anything more general than what is defined here.

Dialects can restrict the contexts in which the various terms are allowed by using the mechanism of signatures. The RIF-FLD language associates a signature with each symbol (both constant and variable symbols) and uses signatures to define well-formed terms. Each RIF dialect is expected to select appropriate signatures for the symbols in its alphabet, and only the terms that are well-formed according to the selected signatures are allowed in that particular dialect.


2.5 Schemas for Externally Defined Terms

This section introduces the notion of external schemas, which serve as templates for externally defined terms. These schemas determine which externally defined terms are acceptable in a RIF dialect. Externally defined terms include RIF builtins, which are specified in [RIF-DTB], but are more general. They are designed to accommodate the ideas of procedural attachments and querying of external data sources. Because of the need to accommodate many different possibilities, the RIF logical framework supports a very general notion of an externally defined term. Such a term is not necessarily a function or a predicate -- it can be a frame, a classification term, and so on.

Definition (Schema for external term). An external schema is a statement of the form (?X1 ... ?Xn; τ) where

The names of the variables in an external schema are immaterial, but their order is important. For instance, (?X ?Y;  ?X[foo->?Y]) and (?V ?W;  ?V[foo->?W]) are considered to be indistinguishable, but (?X ?Y;  ?X[foo->?Y]) and (?Y ?X;  ?X[foo->?Y]) are viewed as different schemas.

A term t is an instance of an external schema (?X1 ... ?Xn; τ) iff t can be obtained from τ by a simultaneous substitution ?X1/s1 ... ?Xn/sn of the variables ?X1 ... ?Xn with terms s1 ... sn, respectively. Some of the terms si can be variables themselves. For example, ?Z[foo->f(a ?P)] is an instance of (?X ?Y; ?X[foo->?Y]) by the substitution ?X/?Z  ?Y/f(a ?P).    ☐

Observe that a variable cannot be an instance of an external schema, since τ in the above definition cannot be a variable. It will be seen later that this implies that a term of the form External(?X) is not well-formed in RIF.
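The instance relation can be sketched as one-way matching. The Python encoding below is hypothetical: symbols are strings (variables start with "?"), compound terms are tuples whose first element is an illustrative tag such as "frame" or "pos", and a schema is a pair of a variable tuple and a body τ. The sketch assumes that every variable in τ is listed among the schema variables, and it compares frame properties in the order given rather than treating their order as immaterial.

```python
def _match(pattern, term, schema_vars, subst):
    """Extend subst so that pattern with subst applied equals term, if possible."""
    if isinstance(pattern, str):
        if pattern in schema_vars:
            if pattern in subst:
                # A repeated schema variable must be replaced consistently.
                return subst[pattern] == term
            subst[pattern] = term
            return True
        return pattern == term
    # Compound pattern: the term must be compound with the same shape.
    if not isinstance(term, tuple) or len(term) != len(pattern):
        return False
    return all(_match(p, s, schema_vars, subst) for p, s in zip(pattern, term))


def instance_of(term, schema):
    """Return the substitution making term an instance of schema, or None."""
    schema_vars, tau = schema
    if isinstance(tau, str) and tau.startswith("?"):
        raise ValueError("the body of an external schema cannot be a variable")
    subst = {}
    return subst if _match(tau, term, set(schema_vars), subst) else None


# (?X ?Y; ?X[foo->?Y])
schema = (("?X", "?Y"), ("frame", "?X", ("foo", "?Y")))
# ?Z[foo->f(a ?P)]
term = ("frame", "?Z", ("foo", ("pos", "f", "a", "?P")))
```

For the example from the definition, `instance_of(term, schema)` returns the substitution that maps `?X` to `?Z` and `?Y` to the encoding of `f(a ?P)`. A bare variable such as `"?Z"` is not an instance of this schema, in line with the observation above.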

The intuition behind the notion of an external schema, such as (?X ?Y;  ?X["foo"^^xs:string->?Y]) or (?V;  "pred:isTime"^^rif:iri(?V)), is that ?X["foo"^^xs:string->?Y] or "pred:isTime"^^rif:iri(?V) are invocation patterns for querying external sources, and instances of those schemas correspond to concrete invocations. Thus, External("http://foo.bar.com"^^rif:iri["foo"^^xs:string->"123"^^xs:integer]) and External("pred:isTime"^^rif:iri("22:33:44"^^xs:time)) are examples of invocations of external terms -- one querying an external source and another invoking a builtin.


Definition (Coherent set of external schemas). A set of external schemas is coherent if there is no term, t, that is an instance of two distinct schemas in the set.    ☐

The intuition behind this notion is to ensure that any use of an external term is associated with at most one external schema. This assumption is relied upon in the definition of the semantics of externally defined terms. Note that the coherence condition is easy to verify syntactically and that it implies that schemas like (?X ?Y;  ?X[foo->?Y]) and (?Y ?X;  ?X[foo->?Y]), which differ only in the order of their variables, cannot be in the same coherent set.

It is important to keep in mind that external schemas are not part of the language in RIF, since they do not appear anywhere in RIF statements. Instead, like signatures, which are defined below, they are best thought of as part of the grammar of the language. In particular, they will be used to determine which external terms, i.e., the terms of the form External(t), are well-formed.


2.6 Signatures

In this section we introduce the concept of a signature, which is a key mechanism that allows RIF-FLD to control the context in which the various symbols are allowed to occur. For instance, a symbol f with signature {(term term) => term, (term) => term} can occur in terms like f(a b), f(f(a b) a), f(f(a)), etc., if a and b have signature term. But f is not allowed to appear in the context f(a b a) because there is no =>-expression in the signature of f to support such a context.
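The example above can be sketched as a small checker for positional terms. Everything named here is an assumption made for illustration: the signature of f is given the hypothetical name `fsig`, symbols are strings, a positional term is a tuple of a head followed by its arguments, and the order on signature names is taken to be flat, so that one signature name is below another only when the two are equal.

```python
# Signature name -> set of arrow expressions, each encoded as
# (tuple of argument signature names, result signature name).
SIGNATURES = {
    "term": set(),
    "fsig": {(("term", "term"), "term"), (("term",), "term")},
}

# Assignment of signatures to symbols (one signature per symbol in this sketch).
SYMBOLS = {"f": "fsig", "a": "term", "b": "term"}


def leq(alpha, beta):
    # Assumption: a flat order on signature names.
    return alpha == beta


def signatures_of(term):
    """Return the set of signature names that term can have.
    An empty set means that the term is not well-formed."""
    if isinstance(term, str):
        sig = SYMBOLS.get(term)
        return {sig} if sig is not None else set()
    head, *args = term
    arg_sigs = [signatures_of(arg) for arg in args]
    results = set()
    for eta in signatures_of(head):
        for params, result in SIGNATURES.get(eta, set()):
            if len(params) != len(args):
                continue
            # Every argument needs some signature that fits the expected one.
            if all(any(leq(g, s) for g in gs) for gs, s in zip(arg_sigs, params)):
                results.add(result)
    return results
```

Under these assumptions, `signatures_of(("f", "a", "b"))` and `signatures_of(("f", ("f", "a")))` both give `{"term"}`, whereas `signatures_of(("f", "a", "b", "a"))` is empty because no arrow expression of f accepts three arguments.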

The above example provides intuition behind the use of signatures in RIF-FLD. Much of the development, below, is inspired by [CK95]. It should be kept in mind that signatures are not part of the logic language in RIF, since they do not appear anywhere in RIF-FLD formulas. Instead they are part of the grammar: they are used to determine which sequences of tokens are in the language and which are not. The actual way by which signatures are assigned to the symbols of the language may vary from dialect to dialect. In some dialects (for example [RIF-BLD]), this assignment is derived from the context in which each symbol occurs and no separate language for signatures is used. Other dialects may choose to assign signatures explicitly. In that case, they would require a concrete language for signatures (which would be separate from the language for specifying the logic formulas of the dialect).

Definition (Signature name). Let SigNames be a non-empty, partially-ordered finite or countably infinite set of symbols, called signature names. Since signatures are not part of the logic language, their names do not have to be disjoint from Const, Var, and ArgNames. We require that this set includes at least the following signature names:

Dialects may introduce additional signature names. For instance, RIF Basic Logic Dialect [RIF-BLD] introduces one other signature name, individual. The partial order on SigNames is dialect-specific; it is used in the definition of well-formed terms below.

We use the symbol < to represent the partial order on SigNames. Informally, α < β means that terms with signature α can be used wherever terms with signature β are allowed. We will write α ≤ β if either α = β or α < β.

Definition (Signature). A signature is a statement of the form η{e1, ..., en, ...} where η ∈ SigNames is the name of the signature and {e1, ..., en, ...} is a countable set of arrow expressions. Such a set can thus be infinite, finite, or even empty. In RIF-BLD, signatures can have at most one arrow expression. Other dialects (such as HiLog [CKW93], for example) may require polymorphic symbols and thus allow signatures with more than one arrow expression in them.

An arrow expression is defined as follows:

RIF dialects are always associated with sets of coherent signatures, defined next. The overall idea is that a coherent set of signatures must include all the predefined signatures (such as signatures for equality and classification terms) and the signatures included in a coherent set should not conflict with each other. For instance, two different signatures should not have identical names and if one signature is said to extend another then the arrow expressions of the supersignature should be included among the arrow expressions of the subsignature (a kind of an arrow expression "inheritance").

Definition (Coherent signature set). A set Σ of signatures is coherent iff

  1. Σ contains the special signature atomic{ }, which represents the context of atomic formulas.
  2. Σ contains the signature ={e1, ..., en, ...} for the equality symbol.

    All arrow expressions ei here have the form (κ κ) ⇒ γ (the arguments in an equation must be compatible) and at least one of these expressions must have the form (κ κ) ⇒ atomic (i.e., equation terms are also atomic formulas). Dialects may further specialize this signature.

  3. Σ contains the signature #{e1, ..., en...}.

    Here all arrow expressions ei are binary (have two arguments) and at least one has the form (κ γ) ⇒ atomic. Dialects may further specialize this signature.

  4. Σ contains the signature ##{e1, ..., en...}.

    Here all arrow expressions ei have the form (κ κ) ⇒ γ (the arguments must be compatible) and at least one of these arrow expressions has the form (κ κ) ⇒ atomic. Dialects may further specialize this signature.

  5. Σ contains the signature ->{e1, ..., en...}.

    Here all arrow expressions ei are ternary (have three arguments) and at least one of them is of the form (κ1 κ2 κ3) ⇒ atomic. Dialects may further specialize this signature.

  6. Σ has at most one signature for any given signature name.
  7. Whenever Σ contains a pair of signatures, ηA and κB, such that η<κ then B ⊆ A.

    Here ηA denotes a signature with the name η and the associated set of arrow expressions A; similarly κB is a signature named κ with the set of expressions B. The requirement that B ⊆ A ensures that symbols that have signature η can be used wherever the symbols with signature κ are allowed.   ☐


The requirement that coherent sets of signatures must include the signatures for =, #, ->, and so on is just a technicality needed to simplify the definitions. Some of these signatures may go "unused" in a dialect even though, technically speaking, they must be present in the signature set associated with that dialect. If a dialect disallows equality, classification terms, or frames in its syntax then the corresponding signatures will remain unused. Such restrictions can be imposed by specializing RIF-FLD -- see Section Syntax of a RIF Dialect as a Specialization of RIF-FLD.

An incoherent set of signatures would be one that includes signatures mysig{() ⇒ atomic} and mysig{atomic ⇒ atomic} because it has two different signatures with the same name. Likewise, if this set contains mysig1{() ⇒ atomic} and mysig2{atomic ⇒ atomic} and mysig1 < mysig2 then it is incoherent because the set of arrow expressions of mysig1 does not contain the set of arrow expressions of mysig2.
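The last two coherence conditions lend themselves to a mechanical check. The Python sketch below covers only conditions 6 and 7 (unique signature names, and inclusion of the supersignature's arrow expressions in those of the subsignature); it does not check that the predefined signatures are present. The encoding is hypothetical: a signature is a pair of a name and a set of arrow expressions, an arrow expression is a pair of an argument tuple and a result, and the strict order is given as a set of pairs (η, κ) meaning η < κ. The second example assumes the order mysig1 < mysig2.

```python
def satisfies_conditions_6_and_7(signatures, less_than):
    """signatures: list of (name, frozenset of arrow expressions);
    less_than: set of (eta, kappa) pairs meaning eta < kappa."""
    names = [name for name, _ in signatures]
    # Condition 6: at most one signature for any given signature name.
    if len(names) != len(set(names)):
        return False
    arrows = dict(signatures)
    # Condition 7: if eta < kappa then the arrows of kappa are among those of eta.
    for eta, kappa in less_than:
        if eta in arrows and kappa in arrows and not arrows[kappa] <= arrows[eta]:
            return False
    return True


NULLARY = ((), "atomic")          # () => atomic
UNARY = (("atomic",), "atomic")   # atomic => atomic

duplicate_names = [("mysig", frozenset({NULLARY})), ("mysig", frozenset({UNARY}))]
missing_arrows = [("mysig1", frozenset({NULLARY})), ("mysig2", frozenset({UNARY}))]
repaired = [("mysig1", frozenset({NULLARY, UNARY})), ("mysig2", frozenset({UNARY}))]
order = {("mysig1", "mysig2")}
```

Both `satisfies_conditions_6_and_7(duplicate_names, set())` and `satisfies_conditions_6_and_7(missing_arrows, order)` are false, matching the two incoherent sets described above. The `repaired` set, in which mysig1 also carries the arrow expression of mysig2, passes both conditions.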


2.7 Presentation Language of a RIF Dialect

The presentation language of a RIF dialect is a set of well-formed formulas, as defined in the next section. The language is determined by the following parameters (see Syntax of a RIF Dialect as a Specialization of RIF-FLD):

We have already seen how the alphabet and the symbol spaces are used to define RIF terms. The next section shows how signatures and external schemas are used to further specialize this notion to define well-formed RIF-FLD terms.


2.8 Well-formed Terms and Formulas

Since signature names uniquely identify signatures in coherent signature sets, we will often refer to signatures simply by their names. For instance, if one of f's signatures is atomic{ }, we may simply say that symbol f has signature atomic.


Definition (Well-formed term).

  1. A constant or variable symbol with signature η is a well-formed term with signature η.
  2. A positional term t(t1 ... tn), 0≤n, is well-formed and has a signature σ iff
    • t is a well-formed term that has a signature that contains an arrow expression of the form (σ1 ... σn) ⇒ σ; and
    • Each ti is a well-formed term whose signature is γi, such that γi ≤ σi.

    As a special case, when n=0 we obtain that t( ) is a well-formed term with signature σ, if t's signature contains the arrow expression () ⇒ σ.

  3. A term with named arguments t(p1->t1 ... pn->tn), 0≤n, is well-formed and has a signature σ iff
    • t is a well-formed term that has a signature that contains an arrow expression with named arguments of the form (p1->σ1 ... pn->σn) ⇒ σ; and
    • Each ti is a well-formed term whose signature is γi, such that γi ≤ σi.

    As a special case, when n=0 we obtain that t( ) is a well-formed term with signature σ, if t's signature contains the arrow expression () ⇒ σ.

  4. An equality term of the form t1=t2 is well-formed and has a signature κ iff
    • The signature = has an arrow expression (σ σ) ⇒ κ
    • t1 and t2 are well-formed terms with signatures γ1 and γ2, respectively, such that γi ≤ σ, i=1,2.
  5. A membership term of the form t1#t2 is well-formed and has a signature κ iff
    • The signature # has an arrow expression (σ1 σ2) ⇒ κ
    • t1 and t2 are well-formed terms with signatures γ1 and γ2, respectively, such that γi ≤ σi, i=1,2.
  6. A subclass term of the form t1##t2 is well-formed and has a signature κ iff
    • The signature ## has an arrow expression (σ σ) ⇒ κ
    • t1 and t2 are well-formed terms with signatures γ1 and γ2, respectively, such that γi ≤ σ, i=1,2.
  7. A frame term of the form t[s1->v1 ... sn->vn] is well-formed and has a signature κ iff
    • The signature -> has arrow expressions (σ σ11 σ12) ⇒ κ, ..., (σ σn1 σn2) ⇒ κ (these n expressions need not be distinct).
    • t, sj, and vj are well-formed terms with signatures γ, γj1, and γj2, respectively, such that γ ≤ σ and γji ≤ σji, where j=1,...,n and i=1,2.
  8. An externally defined term, External(t), is well-formed and has signature κ iff

Note that, like constant symbols, well-formed terms can have more than one signature. Also note that, according to the above definition, f() and f are distinct terms.


Definition (Well-formed formula). A well-formed term is also a well-formed atomic formula iff one of its signatures is atomic or is ≤ atomic. Note that equality, membership, subclass, and frame terms are atomic formulas, since atomic is one of their signatures.

More general formulas are constructed out of atomic formulas with the help of logical connectives. A well-formed formula is a statement that can have one of the forms (1) -- (9) below. The Group and Document formulas, defined in (8) and (9), are aggregate formulas while the formulas in (1) -- (7) are non-aggregate. This distinction manifests itself in that Group and Document cannot be part of non-aggregate formulas, and Document cannot be part of a group.

  1. Atomic: If φ is a well-formed atomic formula then it is also a well-formed formula.
  2. Conjunction: If φ1, ..., φn, n ≥ 0, are non-aggregate well-formed formulas then so is And(φ1 ... φn).

    As a special case, And() is allowed and is treated as a tautology, i.e., a formula that is always true.

  3. Disjunction: If φ1, ..., φn, n ≥ 0, are non-aggregate well-formed formulas then so is Or(φ1 ... φn).

    As a special case, Or() is treated as a contradiction, i.e., a formula that is always false.

  4. Classical negation: If φ is a non-aggregate well-formed formula then Neg φ is also a well-formed formula.
  5. Default negation: If φ is a non-aggregate well-formed formula then Naf φ is also a well-formed formula.
  6. Rule implication: If φ and ψ are non-aggregate well-formed formulas then φ :- ψ is also a well-formed formula.
  7. Quantification: If φ is a non-aggregate well-formed formula and ?V1, ..., ?Vn are variables then the following formulas are also well-formed:
  8. Group: If φ1, ..., φn are well-formed non-Document formulas then Group(φ1 ... φn) is a well-formed RIF-FLD group formula (or simply a group formula when context is clear).

    Group formulas are intended to represent sets of formulas. Note that some of the φi's can be group formulas themselves, which means that groups can be nested.

  9. Document: An expression of the form Document(directive1 ... directiven Γ) is a well-formed RIF-FLD document formula (or simply a document formula if no ambiguity arises), if

All parts of the document formula -- the directives and the group formula -- are optional and can be omitted.

In this definition, the component formulas φ, φi, ψi, and Γ are said to be subformulas of the respective formulas (conjunction, disjunction, negation, implication, group, etc.) that are built using these components.   ☐

Observe that the restrictions in (1) -- (8) above imply that groups and documents cannot be nested inside non-aggregate formulas and documents cannot be nested inside groups.
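
The nesting restrictions can be illustrated with a small check. The following is only a sketch under stated assumptions: formulas are encoded as hypothetical (kind, children) pairs with atomic formulas as plain strings, directives are ignored, and the function name well_nested is an illustration, not part of RIF-FLD.

```python
def well_nested(formula, parent=None):
    """Check only the aggregate/non-aggregate nesting restrictions.

    `formula` is either a string (an atomic formula) or a pair
    (kind, children). This encoding is an assumption of the sketch.
    """
    if isinstance(formula, str):
        return True
    kind, children = formula
    # A Document may appear only at the top level.
    if kind == "Document" and parent is not None:
        return False
    # A Group may appear at the top level, inside a Group, or inside a Document,
    # but never inside a non-aggregate formula such as And, Or, or Forall.
    if kind == "Group" and parent not in (None, "Group", "Document"):
        return False
    return all(well_nested(child, kind) for child in children)


ok_doc = ("Document", [("Group", [("And", ["p", "q"]), ("Group", ["r"])])])
bad_and = ("And", [("Group", ["p"])])
bad_group = ("Group", [("Document", [])])
```

Here ok_doc passes the check, while bad_and (a group inside a conjunction) and bad_group (a document inside a group) are rejected.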


Example 1 (Signatures, well-formed terms and formulas).

We illustrate the above definitions with the following examples. In addition to atomic, let there be another signature, term{ }, which is intended here to represent the context of the arguments to positional terms or atomic formulas.

Consider the term p(p(a) p(a b c)). If p has the (polymorphic) signature mysig{(term)⇒term, (term term)⇒term, (term term term)⇒term} and a, b, c each has the signature term{ } then p(p(a) p(a b c)) is a well-formed term with signature term{ }. If instead p had the signature mysig2{(term term)⇒term, (term term term)⇒term} then p(p(a) p(a b c)) would not be a well-formed term since then p(a) would not be well-formed (in this case, p would have no arrow expression which allows p to take just one argument).

For a more complex example, let r have the signature mysig3{(term)⇒atomic, (atomic term)⇒term, (term term term)⇒term}. Then r(r(a) r(a b c)) is well-formed. The interesting twist here is that r(a) is an atomic formula that occurs as an argument to a function symbol. However, this is allowed by the arrow expression (atomic term)⇒ term, which is part of r's signature. If r's signature were mysig4{(term)⇒atomic, (atomic term)⇒atomic, (term term term)⇒term} instead, then r(r(a) r(a b c)) would be not only a well-formed term, but also a well-formed atomic formula.

An even more interesting example arises when the right-hand side of an arrow expression is something other than term or atomic. For instance, let John, Mary, NewYork, and Boston have signatures term{ }; flight and parent have signature h2{(term term)⇒atomic}; and closure has signature hh1{(h2)⇒p2}, where p2 is the name of the signature p2{(term term)⇒atomic}. Then flight(NewYork Boston), closure(flight)(NewYork Boston), parent(John Mary), and closure(parent)(John Mary) would be well-formed formulas. Such formulas are allowed in languages like HiLog [CKW93], which support predicate constructors like closure in the above example.   ☐
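
The computations in Example 1 can be replayed mechanically. The sketch below is a simplification and not a normative algorithm: it handles only constants and positional terms, it treats the order ≤ on signatures as plain equality of signature names, and the encoding of terms as (head, arguments) pairs and the names ARROWS and signatures_of are assumptions made for illustration.

```python
# Signature name -> set of arrow expressions (argument signature names, result name).
ARROWS = {
    "mysig":  {(("term",), "term"), (("term", "term"), "term"),
               (("term", "term", "term"), "term")},
    "mysig2": {(("term", "term"), "term"), (("term", "term", "term"), "term")},
    "mysig3": {(("term",), "atomic"), (("atomic", "term"), "term"),
               (("term", "term", "term"), "term")},
    "mysig4": {(("term",), "atomic"), (("atomic", "term"), "atomic"),
               (("term", "term", "term"), "term")},
    "h2":     {(("term", "term"), "atomic")},
    "hh1":    {(("h2",), "p2")},
    "p2":     {(("term", "term"), "atomic")},
}


def signatures_of(term, const_sigs, arrows=ARROWS):
    """Return the set of signature names of `term`; an empty set means not well-formed.

    A term is a constant name (str) or a pair (head, [arguments]), where the
    head is itself a term, which allows HiLog-style terms such as closure(flight)(...).
    """
    if isinstance(term, str):
        return set(const_sigs.get(term, ()))
    head, args = term
    head_sigs = signatures_of(head, const_sigs, arrows)
    arg_sigs = [signatures_of(a, const_sigs, arrows) for a in args]
    result = set()
    for sig in head_sigs:
        for params, res in arrows.get(sig, ()):
            # Simplification: an argument fits if it has exactly the required signature.
            if len(params) == len(arg_sigs) and all(
                p in s for p, s in zip(params, arg_sigs)
            ):
                result.add(res)
    return result


base = {"a": {"term"}, "b": {"term"}, "c": {"term"},
        "John": {"term"}, "Mary": {"term"}, "NewYork": {"term"}, "Boston": {"term"},
        "flight": {"h2"}, "parent": {"h2"}, "closure": {"hh1"}}

p_term = ("p", [("p", ["a"]), ("p", ["a", "b", "c"])])
r_term = ("r", [("r", ["a"]), ("r", ["a", "b", "c"])])
closure_term = (("closure", ["flight"]), ["NewYork", "Boston"])
```

With p declared as mysig the term p(p(a) p(a b c)) receives the signature term, with mysig2 it receives none, and closure(flight)(NewYork Boston) receives atomic because closure(flight) has the signature p2.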


2.9 Annotations in the Presentation Syntax

RIF-FLD allows every term and formula (including terms and formulas that occur inside other terms and formulas) to be optionally preceded by an annotation of the form (* id φ *) where id is a rif:iri constant and φ is a RIF formula, which is not a document-formula. Both items inside the annotation are optional. The id part represents the identifier of the term (or formula) to which the annotation is attached and φ is the rest of the annotation. RIF-FLD does not impose any restrictions on φ apart from what is stated above. In particular, it may include variables, function symbols, rif:local constants, and so on.

Document formulas with and without annotations will be referred to as RIF-FLD documents.

A convention is used to avoid a syntactic ambiguity in the above definition. For instance, in (* id φ *) t[w -> v] the annotation can be attributed to the term t or to the entire frame t[w -> v]. Similarly, for an annotated HiLog-like term of the form (* id φ *) f(a)(b,c), the annotation can be attributed to the entire term f(a)(b,c) or to just f(a). The convention adopted in RIF-FLD is that any annotation is syntactically associated with the largest RIF-FLD term/formula that appears to the right of that annotation. Therefore, in our examples the annotation (* id φ *) is considered to be attached to the entire frame t[w -> v] and to the entire term f(a)(b,c).


Example 2 (A RIF-FLD document with nested groups and annotations).

We illustrate formulas, including documents and groups, with the following complete example (with apologies to Shakespeare for the imperfect rendering of the intended meaning in logic). For better readability, we use the shortcut notation defined in [RIF-DTB]. The example also illustrates attachment of annotations.

 Document(
   Prefix(dc     http://purl.org/dc/terms/)
   Prefix(ex     http://example.org/ontology#)
   Prefix(hamlet http://www.shakespeare-literature.com/Hamlet/)
   
   (* hamlet:assertions hamlet:assertions[dc:title->"Hamlet" dc:creator->"Shakespeare"] *)
   Group(
      Exists ?X (And(?X # ex:RottenThing
                     ex:partof(?X <http://www.denmark.dk>)))
      Forall ?X (Or(hamlet:tobe(?X)  Naf hamlet:tobe(?X)))
      Forall ?X (And(Exists ?B (And(ex:has(?X ?B) ?B # ex:business))
                     Exists ?D (And(ex:has(?X ?D) ?D # ex:desire)))
                   :- ?X # ex:man)
      (* hamlet:facts *)
      Group(
         hamlet:Yorick # ex:poor
         hamlet:Hamlet # ex:prince
      )
   )
 )

Observe that the above set of formulas has a nested subset with its own annotation, hamlet:facts, which contains only a global IRI.   ☐



2.10 EBNF Grammar for the Presentation Syntax of RIF-FLD

Until now, to specify the syntax of RIF-FLD we relied on "mathematical English," a special form of English for communicating mathematical definitions, examples, etc. We will now specify the syntax using the familiar EBNF notation. The following points about the EBNF notation should be kept in mind:

In view of the above, the EBNF grammar can be viewed as just an intermediary between the mathematical English and the XML. However, it also gives a succinct overview of the syntax of RIF-FLD and as such can be useful for dialect designers and users alike.


  Document       ::= IRIMETA? 'Document' '(' Dialect? Base? Prefix* Import* Group? ')'
  Dialect        ::= 'Dialect' '(' Name ')'
  Base           ::= 'Base' '(' IRI ')'  
  Prefix         ::= 'Prefix' '(' Name IRI ')'
  Import         ::= IRIMETA? 'Import' '(' IRICONST PROFILE? ')'
  Group          ::= IRIMETA? 'Group' '(' (FORMULA | Group)* ')'
  Implies        ::= IRIMETA? FORMULA ':-' FORMULA
  FORMULA        ::= IRIMETA? 'And' '(' FORMULA* ')' |
                     IRIMETA? 'Or' '(' FORMULA* ')' |
                     Implies |
                     IRIMETA? 'Exists' Var* '(' FORMULA ')' |
                     IRIMETA? 'Forall' Var* '(' FORMULA ')' |
                     IRIMETA? 'Neg' FORMULA |
                     IRIMETA? 'Naf' FORMULA |
                     FORM
  FORM           ::= IRIMETA? (Var | ATOMIC | 'External' '(' ATOMIC ')')
  ATOMIC         ::= Const | Atom | Equal | Member | Subclass | Frame
  Atom           ::= UNITERM
  UNITERM        ::= TERM '(' (TERM* | (Name '->' TERM)*) ')'
  Equal          ::= TERM '=' TERM
  Member         ::= TERM '#' TERM
  Subclass       ::= TERM '##' TERM
  Frame          ::= TERM '[' (TERM '->' TERM)* ']'
  TERM           ::= IRIMETA? (Var | EXPRIC | 'External' '(' EXPRIC ')')
  EXPRIC         ::= Const | Expr | Equal | Member | Subclass | Frame
  Expr           ::= UNITERM
  Const          ::= '"' UNICODESTRING '"^^' SYMSPACE | CONSTSHORT
  IRICONST       ::= '"' IRI '"^^' 'rif:iri'
  PROFILE        ::= TERM
  Name           ::= UNICODESTRING
  Var            ::= '?' UNICODESTRING
  SYMSPACE       ::= ANGLEBRACKIRI | CURIE
  
  IRIMETA        ::= '(*' IRICONST? (Frame | 'And' '(' Frame* ')')? '*)'

The RIF-FLD presentation syntax does not commit to any particular vocabulary and permits arbitrary Unicode strings in constant symbols, argument names, and variables. Constant symbols can have this form: "UNICODESTRING"^^SYMSPACE, where SYMSPACE is an ANGLEBRACKIRI or CURIE that represents an identifier of the symbol space of the constant, and UNICODESTRING is a Unicode string from the lexical space of that symbol space. ANGLEBRACKIRI and CURIE are defined in Section Shortcuts for Constants in RIF's Presentation Syntax of [RIF-DTB]. Constant symbols can also have several shortcut forms, which are represented by the non-terminal CONSTSHORT. These shortcuts are also defined in the same section of [RIF-DTB]. One of them is the CURIE shortcut, which is used in the examples in this document. Names are Unicode character sequences. Variables are composed of UNICODESTRING symbols prefixed with a ?-sign.

RIF-FLD formulas and terms can be prefixed with optional annotations, IRIMETA, for identification and metadata. IRIMETA is represented using (*...*)-brackets that contain an optional IRI constant as identifier followed by an optional Frame or conjunction of Frames as metadata. An IRICONST is the special case of a Const with the symbol space rif:iri, again permitting the shortcut forms defined in [RIF-DTB]. One such specialization is '"' IRI '"^^' 'rif:iri' from the Const production, where IRI is a sequence of Unicode characters that forms an internationalized resource identifier as defined by [RFC-3987].

3 Semantic Framework

Recall that the presentation syntax of RIF-FLD allows the use of macros, which are specified via the Prefix and Base directives, and various shortcuts for integers, strings, and rif:local symbols. The semantics, below, is described using the full syntax, i.e., we assume that all shortcuts and macros have already been expanded, as defined in [RIF-DTB], Section Constants and Symbol Spaces.

3.1 Semantics of a RIF Dialect as a Specialization of RIF-FLD

The RIF-FLD semantic framework defines the notions of semantic structures and of models for RIF-FLD formulas. The semantics of a dialect is derived from these notions by specializing the following parameters.

  1. The effect of the syntax.

    The syntax of a dialect may limit the kinds of terms that are allowed. For instance, if a dialect's syntax excludes frames or terms with named arguments then the parts of the semantic structures whose purpose is to interpret those types of terms (Iframe and INF in this case) become redundant.

  2. Truth values.

    The RIF-FLD semantic framework allows formulas to have truth values from an arbitrary partially ordered set of truth values, TV. A concrete dialect must select a concrete partially or totally ordered set of truth values.

  3. Datatypes.

    A datatype is a symbol space whose symbols have a fixed interpretation in any semantic structure. RIF-FLD defines a set of core datatypes that each dialect is required to include as part of its syntax and semantics. However, RIF-FLD does not limit dialects to just the core types: they can introduce additional datatypes, and each dialect must define the exact set of datatypes that it includes.

  4. Logical entailment.

    Logical entailment in RIF-FLD is defined with respect to an unspecified set of intended models. A RIF dialect must define which models are considered to be intended. For instance, one dialect might specify that all models are intended (which leads to classical first-order entailment), another may consider only the minimal models as intended, while a third one might only use well-founded or stable models [GRS91, GL88].

These notions are defined in the remainder of this document.


3.2 Truth Values

Definition (Set of truth values). Each RIF dialect must define the set of truth values, denoted by TV. This set must have a partial order, called the truth order, denoted <t. In some dialects, <t can be a total order. We write a ≤t b if either a <t b or a and b are the same element of TV. In addition,

RIF dialects can have additional truth values. For instance, the semantics of some versions of NAF, such as well-founded negation, requires three truth values: t, f, and u (undefined), where f <t u <t t. Handling of contradictions and uncertainty usually requires at least four truth values: t, u, f, and i (inconsistent). In this case, the truth order is partial: f <t u <t t and f <t i <t t.
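
The four-valued truth order just described can be written down explicitly. This is a minimal sketch, assuming the order is given by its covering pairs; the names COVERS, less_t, leq_t, and comparable are hypothetical and chosen only for this illustration.

```python
# Truth values from the text: f (false), u (undefined), i (inconsistent), t (true).
TV = {"f", "u", "i", "t"}

# Covering pairs of the strict truth order: f <t u <t t and f <t i <t t.
COVERS = {("f", "u"), ("u", "t"), ("f", "i"), ("i", "t")}


def less_t(a, b, covers=COVERS):
    """a <t b holds if b is reachable from a by one or more covering steps."""
    frontier = {y for (x, y) in covers if x == a}
    seen = set()
    while frontier:
        y = frontier.pop()
        if y == b:
            return True
        if y not in seen:
            seen.add(y)
            frontier |= {z for (x, z) in covers if x == y}
    return False


def leq_t(a, b):
    """a <=t b holds if a <t b or a and b are the same element of TV."""
    return a == b or less_t(a, b)


def comparable(a, b):
    """True if a and b are related by the truth order in either direction."""
    return leq_t(a, b) or leq_t(b, a)
```

In this encoding f is below every other value and t is above every other value, while u and i are incomparable, which is what makes the order partial rather than total.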


3.3 Primitive Datatypes

Definition (Primitive datatype). A primitive datatype (or just a datatype, for short) is a symbol space that has

Semantic structures are always defined with respect to a particular set of datatypes, denoted by DTS. In a concrete dialect, DTS always includes the datatypes supported by that dialect. All RIF dialects must support the primitive datatypes that are listed in Section Datatypes of [RIF-DTB]. The value spaces and the lexical-to-value-space mappings for these datatypes are described in the same section.


Although the lexical and the value spaces might sometimes look similar, one should not confuse them. Lexical spaces define the syntax of the constant symbols in the RIF language. Value spaces define the meaning of the constants. The lexical and the value spaces are often not even isomorphic. For example, 1.2^^xs:decimal and 1.20^^xs:decimal are two legal -- and distinct -- constants in RIF because 1.2 and 1.20 belong to the lexical space of xs:decimal. However, these two constants are interpreted by the same element of the value space of the xs:decimal type. Therefore, 1.2^^xs:decimal = 1.20^^xs:decimal is a RIF tautology. Likewise, RIF semantics for datatypes implies certain inequalities. For instance, abc^^xs:string ≠ abcd^^xs:string is a tautology, since the lexical-to-value-space mapping of the xs:string type maps these two constants into distinct elements in the value space of xs:string.
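
The distinction between lexical and value spaces can be seen in a few lines of code. The sketch below makes several assumptions: a constant is modeled as a hypothetical (lexical form, symbol space) pair, Python's Decimal stands in for the value space of xs:decimal (it accepts more lexical forms than xs:decimal does), and the names LEX_TO_VALUE and value are illustrative only.

```python
from decimal import Decimal

# Stand-in lexical-to-value-space mappings, keyed by symbol space name.
LEX_TO_VALUE = {
    "xs:decimal": Decimal,  # "1.2" and "1.20" denote the same number
    "xs:string": str,       # each string denotes itself
}


def value(const):
    """Map a constant, given as (lexical form, symbol space), to its value."""
    lexical, symspace = const
    return LEX_TO_VALUE[symspace](lexical)


d1 = ("1.2", "xs:decimal")
d2 = ("1.20", "xs:decimal")
s1 = ("abc", "xs:string")
s2 = ("abcd", "xs:string")

same_constant = d1 == d2            # False: two distinct constant symbols
same_value = value(d1) == value(d2) # True: one element of the value space
```

The two decimal constants differ as symbols but agree as values, while the two string constants differ both as symbols and as values.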


3.4 Semantic Structures

The central step in specifying a model-theoretic semantics for a logic-based language is defining the notion of a semantic structure, also known as an interpretation. Semantic structures are used to assign truth values to RIF-FLD formulas.

Definition (Semantic structure). A semantic structure, I, is a tuple of the form <TV, DTS, D, IC, IV, IF, Iframe, INF, Isub, Iisa, I=, Iexternal, Itruth>. Here D is a non-empty set of elements called the domain of I. We will continue to use Const to refer to the set of all constant symbols and Var to refer to the set of all variable symbols. TV denotes the set of truth values that the semantic structure uses and DTS is a set of identifiers for primitive datatypes.

The other components of I are total mappings defined as follows:

  1. IC maps Const to elements of D.

    This mapping interprets constant symbols.

  2. IV maps Var to elements of D.

    This mapping interprets variable symbols.

  3. IF maps D to functions D* → D (here D* is the set of all sequences of any finite length over the domain D).

    This mapping interprets positional terms.

  4. INF interprets terms with named arguments. It is a total mapping from D to the set of total functions of the form SetOfFiniteBags(ArgNames × D) → D.

    This is analogous to the interpretation of positional terms with two differences:

  5. Iframe is a total mapping from D to total functions of the form SetOfFiniteBags(D × D) → D.

    This mapping interprets frame terms. An argument, d ∈ D, to Iframe represents an object and a finite bag {<a1,v1>, ..., <ak,vk>} represents a bag (multiset) of attribute-value pairs for d. We will see shortly how Iframe is used to determine the truth valuation of frame terms.

    Bags are employed here because the order of the attribute/value pairs in a frame is immaterial and the pairs may repeat, as in o[a->b a->b]. Such repetitions arise naturally when variables are instantiated with constants. For instance, o[?A->?B ?C->?D] becomes o[a->b a->b] if variables ?A and ?C are instantiated with the symbol a and ?B, ?D with b.

  6. Isub gives meaning to the subclass relationship. It is a total function D × D → D.

    The operator ## is required to be transitive, i.e., c1 ## c2 and c2 ## c3 must imply c1 ## c3. This is ensured by a restriction in Section Interpretation of Formulas.

  7. Iisa gives meaning to class membership. It is a total function D × D → D.

    The relationships # and ## are required to have the usual property that all members of a subclass are also members of the superclass, i.e., o # cl and cl ## scl must imply o # scl. This is ensured by a restriction in Section Interpretation of Formulas.

  8. I= is a total function D × D → D.

    It gives meaning to the equality operator.

  9. Itruth is a total mapping D → TV.

    It is used to define truth valuation for formulas.

  10. Iexternal is a mapping from the coherent set of schemas for externally defined functions to total functions D* → D. For each external schema σ = (?X1 ... ?Xn; τ) in the coherent set of such schemas associated with the language, Iexternal(σ) is a function of the form Dn → D.

    For every external schema, σ, associated with the language, Iexternal(σ) is assumed to be specified externally in some document (hence the name external schema). In particular, if σ is a schema of a RIF builtin predicate or function, Iexternal(σ) is specified in [RIF-DTB] so that:

For convenience, we also define the following mapping I:

  1. I(k) = IC(k), if k is a symbol in Const
  2. I(?v) = IV(?v), if ?v is a variable in Var
  3. I(f(t1 ... tn)) = IF(I(f))(I(t1),...,I(tn))
  4. I(f(s1->v1 ... sn->vn)) = INF(I(f))({<s1,I(v1)>,...,<sn,I(vn)>})

    Here we use {...} to denote a bag of argument/value pairs.

  5. I(o[a1->v1 ... ak->vk]) = Iframe(I(o))({<I(a1),I(v1)>, ..., <I(ak),I(vk)>})

    Here {...} denotes a bag of attribute/value pairs.

  6. I(c1##c2) = Isub(I(c1), I(c2))
  7. I(o#c) = Iisa(I(o), I(c))
  8. I(x=y) = I=(I(x), I(y))
  9. I(External(t)) = Iexternal(σ)(I(s1), ..., I(sn)), if t is an instance of the external schema σ = (?X1 ... ?Xn; τ) by substitution ?X1/s1 ... ?Xn/sn.

    Note that, by definition, External(t) is well formed only if t is an instance of an external schema. Furthermore, by the definition of coherent sets of external schemas, t can be an instance of at most one such schema, so I(External(t)) is well-defined.
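
A few cases of the mapping I can be sketched in code to show how bags enter the interpretation of frames. This is an illustration under stated assumptions, not a definition: only constants, variables, positional terms, and frames are covered, terms are encoded as hypothetical tagged tuples, the domain is taken to be the set of Python strings, and the toy mappings i_f and i_frame are invented for the example.

```python
from collections import Counter


def bag(pairs):
    """A finite bag of pairs as a hashable value: each distinct pair with its multiplicity."""
    return frozenset(Counter(pairs).items())


class Structure:
    """Holds stand-ins for IC, IV, IF, and Iframe over a domain of strings."""

    def __init__(self, i_c, i_v, i_f, i_frame):
        self.i_c, self.i_v, self.i_f, self.i_frame = i_c, i_v, i_f, i_frame

    def interpret(self, term):
        kind = term[0]
        if kind == "const":   # I(k) = IC(k)
            return self.i_c[term[1]]
        if kind == "var":     # I(?v) = IV(?v)
            return self.i_v[term[1]]
        if kind == "pos":     # I(f(t1 ... tn)) = IF(I(f))(I(t1),...,I(tn))
            _, f, args = term
            return self.i_f(self.interpret(f))(tuple(self.interpret(t) for t in args))
        if kind == "frame":   # I(o[a1->v1 ...]) = Iframe(I(o))(bag of <I(ai),I(vi)>)
            _, obj, slots = term
            pairs = [(self.interpret(a), self.interpret(v)) for a, v in slots]
            return self.i_frame(self.interpret(obj))(bag(pairs))
        raise ValueError(f"unsupported term kind: {kind}")


# Toy mappings: every domain element denotes a function that builds a new string.
s = Structure(
    i_c={"o": "o", "a": "a", "b": "b", "f": "f"},
    i_v={"?A": "a", "?B": "b", "?C": "a", "?D": "b"},
    i_f=lambda d: lambda args: d + "(" + " ".join(args) + ")",
    i_frame=lambda d: lambda b: d + "[" + repr(sorted(b)) + "]",
)

const = lambda name: ("const", name)
var = lambda name: ("var", name)
ground = ("frame", const("o"), [(const("a"), const("b")), (const("a"), const("b"))])
with_vars = ("frame", const("o"), [(var("?A"), var("?B")), (var("?C"), var("?D"))])
single = ("frame", const("o"), [(const("a"), const("b"))])
```

Under the variable assignment above, o[?A->?B ?C->?D] and o[a->b a->b] are passed to Iframe with the same bag, whereas o[a->b] is passed a different bag because the pair occurs only once.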

The effect of signatures. For every signature, sg, supported by a dialect, there is a subset Dsg ⊆ D, called the domain of the signature. Terms that have a given signature, sg, must be mapped by I to Dsg, and if a term has more than one signature it must be mapped into the intersection of the corresponding signature domains. To ensure this, the following is required:

  1. If sg < sg' then Dsg ⊆ Dsg'.
  2. If k is a constant that has signature sg then IC(k) ∈ Dsg.
  3. If ?v is a variable that has signature sg then IV(?v) ∈ Dsg.
  4. If sg has an arrow expression of the form (s1 ... sn)⇒s then, for every d ∈ Dsg, IF(d) must map Ds1× ... ×Dsn to Ds.
  5. If sg has an arrow expression of the form (p1->s1 ... pn->sn)⇒s then, for every d ∈ Dsg, INF(d) must map the set {<p1,Ds1>, ..., <pn,Dsn>} to Ds.
  6. If the signature -> has arrow expressions (sg,s1,r1)⇒k, ..., (sg,sn,rn)⇒k, then, for every d ∈ Dsg, Iframe(d) must map {<Ds1,Dr1>, ..., <Dsn,Drn>} to Dk.
  7. If the signature # has an arrow expression (s r)⇒k then Iisa must map Ds×Dr to Dk.
  8. If the signature ## has an arrow expression (s s)⇒k then Isub must map Ds×Ds to Dk.
  9. If the signature = has an arrow expression (s s)⇒k then I= must map Ds×Ds to Dk.

The