RIF Production Rule Dialect

W3C Editor's Draft 23 May 2008

This version:: http://www.w3.org/2005/rules/wg/draft/ED-rif-prd-20080523/
Latest editor's draft:: http://www.w3.org/2005/rules/wg/draft/rif-prd/
Previous version:: http://www.w3.org/2005/rules/wg/draft/ED-rif-prd-20080519/ (color-coded diff)

Editors:: Christian de Sainte Marie, ILOG

Abstract

This document specifies RIF-PRD, a Rule Interchange Format (RIF) dialect to enable the interchange of production rules.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Please Comment By 2008-05-25

The Rule Interchange Format (RIF) Working Group seeks public feedback on these Working Drafts. Please send your comments to public-rif-comments@w3.org (public archive). If possible, please offer specific changes to the text that would address your concern. You may also wish to check the Wiki Version of this document for internal-review comments and changes being drafted which may address your concerns.

No Endorsement

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Patents

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1 Overview
2 Syntax
3 Operational semantics
4 Compatibility with other RIF dialects
5 Glossary

1 Overview

1.1 Introduction

Editor's Note: General introduction to be developed.

1.2 Overview of RIF production rule dialect

TBD

1.2.1 RIF-PRD as an interchange format for production rules

Editor's Note: Overview of PRD and its semantics to be developed

1.2.2 RIF-PRD as a Web language

To make RIF dialects suitable as Web languages, RIF supports XML Schema primitive data types and some other data types. In addition, RIF promotes the use of Internationalized Resource Identifiers (or IRIs) RFC 3987 to refer to individuals, predicates, and functions.

Editor's Note: Does this still make sense?

1.2.3 Compatibility with other RIF dialects

TBD

1.2.4 Syntaxes

In this document we will introduce three related but distinct representations for RIF-PRD components:

1.2.4.1 XML syntax

This syntax is the normative XML serialization of RIF-PRD. It is meant, and has been designed, to work as a common XML serialisation for many production rule languages.

Three kinds of syntactic components are used to specify RIF-PRD:

- Concrete classes have a concrete definition and they are associated with specific XML markup. In this document, concrete constructs will be denoted with CamelCase names with leading capital letter;
- Properties, or roles, define how two classes relate to each other. They have concrete definitions and are associated with specific XML markup. In this document, properties will be denoted with camelCase names with leading smallcase letter;

The XML syntax is specified for each component as a pseudo-schemas, as part of the description of the component. The pseudo-schemas use BNF-style conventions for attributes and elements: "?" denotes optionality (i.e. zero or one occurrences), "*" denotes zero or more occurrences, "+" one or more occurrences, "[" and "]" are used to form groups, and "|" represents choice. Attributes are conventionally assigned a value which corresponds to their type, as defined in the normative schema. Elements are conventionally assigned a value which is the name of the syntactic class of their content, as defined in the normative schema.

<!-- sample pseudo-schema -->
    <defined_element
          required_attribute_of_type_string="xs:string"
          optional_attribute_of_type_int="xs:int"? >
      <required_element />
      <optional_element />?
      <one_or_more_of_these_elements />+
      [ <choice_1 /> | <choice_2 /> ]*

    </defined_element>

The BNF-like pseudo schema are not normative: the XML schema is normative.

1.2.4.2 Presentation syntax

A presentation syntax is also specified, to allow a more easily readable presetation of RIF-PRD language fragments (such as examples etc). The presentation syntax is used mainly to specify the intended semantics of RIF-PRD rulesets. It is non normative.

1.2.4.3 Structural diagram

The overall structure of the syntax of RIF-PRD is also presented as an UML-like diagram in appendix [ XXX]. The diagram provides an overview of RIF-PRD syntactic classes and shows at a glance how they relate to each others.

1.3 Overview of this document

TBD

1.3.1 Namespaces

Throughout this document, the xsd: prefix stands for the XML Schema namespace URI http://www.w3.org/2001/XMLSchema#, the rdf: prefix stands for http://www.w3.org/1999/02/22-rdf-syntax-ns#, and rif: stands for the URI of the RIF namespace, http://www.w3.org/2007/rif#. Syntax such as xsd:string should be understood as a compact uri -- a macro that expands to a concatenation of the character sequence denoted by the prefix xsd and the string string. In the next version of this document we intend to introduce a syntax for defining prefixes for compact URIs.

Editor's Note: Namespace for RIF-PRD constructs...

1.3.2 Running example

Jim the Hen Handler developed a very sophisticated program to grow feeble chicks into deliciously plump chickens for a minimal cost. Jim publishes his method on his Web site www.juicychicken.org. Since he is still experimenting to improve his method, the rules to follow change quite often, and Jim decided to publish them in RIF as well, so that his followers can easily, even automatically, implement the very latest version.

Jim's latest addition to his knowledge base is the following advice:

Except on Tuesdays, every potato owned by a chicken gets mashed, and the daily grain allowance of every chicken that owns a potato is increased by 10%, if the chicken is older than 8 months, the potato's weight (in decigrams) is more than half the chicken's age (in months), and there is no fox in the hen house. Do not forget to delete the mashed potatoes from the ownership table!

Jim calls it the "Chicken and Mashed Potatoes", or CMP, rule. Rephrased in a more "rule-like" form, the rule could read like:

Rule ChickensAndMashedPotatoes:

Forall ?c, where ?c is a chicken and the age of ?c in months is > 8;

Forall ?p, where ?p is a potato, ?p is owned by ?c, and the weight in decigrams of ?p > (age of ?c)/2;

If today is not Tuesday, and there is no fox in the hen house

Then mash ?p, increase the grain allowance of ?c by 10%, and remove the couple (?c, ?p) from the ownership relation.

The CMP rule is a production rule: that is, when the rule is implemented and the condition is satisfied, the consequence is that some actions are executed (mashing potatoes, modifying a chicken's regime, managing ownership relations etc). Therefore, it is best interchanged using the production rule dialect of RIF, and Jim the Hen Handler's method, and especially his "chicken and mashed potatoes" rule and components thereof, will be used as a running example in this document.

To make sure that his rules can be implemented unambiguously, Jim (the Hen Handler, not the chicken) published a data model (e.g., an XML schema) that identifies Chicken and Potato as two classes of objects:

instances of Chicken represent individual chickens. They have a property called age that represents the chicken's age in months as an xs:decimal and a property called allowance that represent the chicken's daily grain allowance as an xs:integer;
instances of Potato represent individual potatoes. They have a property called weight that represents the potato's weight in decigram as an xs:decimal.

Jim further specifies:

a data type DayOfTheWeek that specifies the days of the week as a subset of xs:string
a binary predicate owns(?o ?i) that denotes the ownership relation between owner ?o and owned item ?i;
a nullary function today() that returns the day of the week as a DayOfTheWeek;
a boolean fox sensor foxAlarm() that returns true if and only if the fox is in the hen house;
and a command mash(?i), which execution results in the physical item that is denoted by ?i being reduced to a soft pulpy state by beating or pressure (and removed from the instances of its class, as a consequence).

All these (types and names) are defined in Jim's namespace, which is represented in this document by the prefix jim:.

2 Syntax

2.1 Condition Language

This section specifies the XML syntax that is used to represent the different boolean expressions that can be found in production rules. The language is called the language of conditions, because it extends the language of conditions of [ RIF-Core], that determines what can appear as a condition in a rule supported by RIF-Core. The language plays a similar role in RIF-PRD.

The most basic construct in RIF is the TERM. The TERM is an abstract construct, concretly visible, in RIF-PRD, as a Const, a Var or an External.

The ATOMIC is the basic building block of RIF. ATOMIC is an abstract class: it is visible in RIF-PRD as either an Atom, an Equal, a Member, a Subclass or a Frame. Each kind of ATOMIC is a composition of TERMs.

The next level of constructs are assemblies of ATOMICs. Together, the various kinds of assemblies form the abstract construct FORMULA. RIF-PRD knows six kinds of FORMULAs: the single ATOMIC, the Externallly evaluated atomic relation, and the recursively specified And, Or, Naf and Exists.

The following sections specify each construct separately, grouped by abstract syntactic classes. Each element is given an XML syntax, in the form of an informative BNF pseudo schema: the normative XML syntax is given in the [ RIF-PRD XML schema].

2.1.1 TERM

The TERM class of constructs is used to represent constants, variables and the application of function symbols to other TERMs.

As an abstract class, TERM is not associated with specific XML markup in RIF-PRD instance documents. It is specified in the normative schema as a substitution group.

Editor's Note: ...Substitution group or another construct, depending on how we handle extensibility in the XML schema.

    [ Const | Var | External ]

Editor's Note: Difference wrt RIF-BLD: in RIF-PRD, there are no logic functions => a TERM cannot be a (non evaluated) Expr.

2.1.1.1 Const

In RIF, the Const construct is used to represent a constant.

The Const element has a required type attribute and an optional xml:lang attribute:

The value of the type attribute is the name of the Const type. It must be an absolute IRI;

The xml:lang attribute, as defined by 2.12 Language Identification of XML 1.0 or its successor specifications in the W3C recommandation track, is optionally used to identify the language for the presentation of the Const to the user. It is allowed only in association with constants of the type rif:text. A compliant implementation MUST ignore the xml:lang attribute if the type of the Const is not rif:text.

The content of the Const element is the constant's symbol, which can be any Unicode character string.

Editor's Note: Or should constant litterals be limited to xs:NormalizedString instead?

    <Const type="IRI" [xml:lang="xs:language"]? >
        Any Unicode string
    </Const>

Editor's Note: The following (until and exclusing examples, will be revised based on the content of DTB document.

Constant types. RIF-PRD conformant implementations MUST support the following builtin constant types:

xsd:long (http://www.w3.org/2001/XMLSchema#long), as defined in the document XML Schema Part 2: Datatypes;
xsd:string (http://www.w3.org/2001/XMLSchema#string), as defined in the document XML Schema Part 2: Datatypes;
xsd:integer (http://www.w3.org/2001/XMLSchema#integer), as defined in the document XML Schema Part 2: Datatypes;
xsd:decimal (http://www.w3.org/2001/XMLSchema#decimal), as defined in the document XML Schema Part 2: Datatypes;
xsd:time (http://www.w3.org/2001/XMLSchema#time), as defined in the document XML Schema Part 2: Datatypes;
xsd:dateTime (http://www.w3.org/2001/XMLSchema#dateTime), as defined in the document XML Schema Part 2: Datatypes;
rdf:XMLLiteral (http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral), as defined in the document Resource Description Framework (RDF): Concepts and Abstract Syntax;
rif:text, for text strings with language tags attached. The language tag requires a special treatment of rif:text constants with respect to the XML syntax;
rif:iri, for internationalized resource identifier or IRI. This type is used to represent constants that are intended to be used in a way similar to RDF resources. The lexical space consists of all absolute IRIs as specified in RFC 3987; it is unrelated to the XML primitive type anyURI. A rif:iri constant is supposed to be interpreted as a reference to one and the same object regardless of the context in which that constant occurs.
rif:local, for constant symbols that are not visible outside of a particular set of rules interchanged in RIF. Constants of this type are used locally in their respective rule sets. This means that occurrences of the same rif:local-constant in different rule sets are viewed as unrelated distinct constants, but occurrences of the same constant in the same rule set must refer to the same object. The lexical space of rif:local is the same as the lexical space of xsd:string.

Symbols with an ill-formed lexical part. Constants that are represented in RIF in one of the aforesaid builtin types must be well-formed, i.e., their representation must belong to the lexical space associated with the type. For instance, 123^^xsd:long (that is, <Const type="xsd:long">123</Const>) has a correct lexical part, since 123 belongs to the lexical space of the data type xsd:long. In contrast, abc^^xsd:long (<Const type="xsd:long">abc</Const>) is ill-formed, as it does not have a correct lexical part. A compliant RIF-PRD translator MUST reject ill-formed symbols.

Symbols with non-standard types. RIF-PRD builtin types are meant to be used for the interchange of constants that belong to the corresponding XML and RDF types. They can also be used for the interchange of constants with non builtin, application specific types with overlaping lexical and value spaces or similar behaviours: this may require additional translations steps and may result in unexpected or incorrect results; in addition, the information regarding the original types is lost in RIF-PRD, which may affect round-tripping.

Constants that do not belong or cannot be cast in one of RIF builtin types for interchange purposes, can be represented using the same LITERAL^^TYPE (i.e. <Const type="TYPE">LITERAL</Const>) syntax, where TYPE is not one of the RIF builtin types. RIF does not ascribe any specific lexical space to these types and any Unicode string should be considered a well-formed constant symbol as far as RIF is concerned. In the same way, RIF does not ascribe any particular semantics to such non-builtin TYPEs: it is the responsibility of the producers and consumers of RIF-PRD rulesets that reference types that are not built in RIF-PRD to agree on their lexical space and on their semantics.

Dialects that extend RIF-PRD might define additional builtin types, give them special semantics and appropriate the names of some non-standard types to name them.

Examples. In each of the examples below, a constant is first described, followed by its representation in RIF-PRD XML syntax.

a. A constant with builtin type xsd:integer and value 123:

<Const type="xsd:integer">123</Const>

b. A constant which symbol today is defined in Jim the Hen Handler's namespace, identified by the prefix jim:. The type of the constant is rif:iri:

<Const type="rif:iri">jim:today</Const>

c. A constant with symbol BigPotato that is local to the set of rules where it appears (e.g. a RuleSet specific to Paula's farm). The type of the constant is rif:local:

<Const type="rif:local">BigPotato</Const>

d. A constant with non-builtin type xsd:int and value 123:

<Const type="xsd:int">123</Const>

e. A constant with symbol Tuesday, defined in the literal space of the application specific data type jim:DayOfTheWeek:

<Const type="jim:DayOfTheWeek">Tuesday</Const>

2.1.1.2 Var

In RIF, the Var construct is used to represent variables.

The content of the Var element is the variable's name, represented as an Unicode character string.

Editor's Note: Shouldn't variable's names rather be limited to be, e.g. xs:NMTOKEN?

    <Var> xs:NMTOKEN </Var>

Editor's Note: Or should Var have a type attribute (and we could, maybe, get rid of the Member construct)?

Editor's Note: Difference wrt RIF-BLD: RIF-BLD says only that a Var element contains a VARNAME. However, most implemented rule languages enforce some syntactic restrictions on the name of variables and requiring all RIF implementations to support any Unicode string as VARNAMEs might be an additional burden with little benefits associated. This is a proposed alternative: whatever the decision, BLD and PRD should use the same definition.

2.1.1.3 External

As a TERM, an External element is used to represent an evaluated function.

Editor's Note: builtin function, evaluated function, or fixed interpretation function... need consistent terminology (Gary)

The External element contains one content element, which in turn contains one Expr element that contains one op</tt> element, followed by zero or more arg elements:

The content and Expr element ensure compatibility with [ RIF Basic Logic Dialect], that allows non-evaluated (that is, logic) functions to be represented by an Expr element;

The content of the op element must be a construct from the TERM abstract class. When the External is a TERM, the content of the op element represents a function symbol;

Editor's Note: Or should the op element be restricted to contain a Const element only?

The content of the arg elements must be constructs from the TERM abstract class. The order of the arg elements is significant and MUST be preserved.

    <External>
       <content>
          <Expr>
             <op> TERM </op>
             <arg> TERM </arg>*
          </Expr>
       </content>
    </External>

Editor's Note: Difference wrt RIF-BLD: This version of the PRD draft does not use named argument UNITERMs.

Editor's Note: Difference wrt RIF-BLD: Since PRD does not have logical function (non-External Expr, in BLD), it could be a good case for fallbacks to replace, in PRD, the <External><content><Expr> cascade by a single equivalent <Function> element.

Builtin functions. RIF-PRD specifies a subset of XPath/XQuery Functions and Operators [X F&O] that any conformant RIF-PRD implementation MUST support. The RIF-PRD builtin functions are listed in the [ Data Type and Builtins] document.

Editor's Note: TBC how to reference non-builtin evaluated functions

The op TERM must represent a constant symbol of type rif:iri that must uniquely identify the evaluated function to be applied to the arg TERMs. It can be one of the builtin functions specified for RIF-PRD, or it can be application specific. In the latter case, it is the responsibility of the producers and consumers of RIF-PRD rulesets that reference non-builtin functions to agree on their semantics.

Examples.

a. The first example below shows one way to represent, in RIF-PRD, the sum of integer 1 and variable ?X, where the addition conforms to the specification of the builtin fn:numeric-add.

The prefix fn is associated with the namespace http://www.w3.org/2005/xpath-functions.

<External>
   <content>
      <Expr>    
         <op> <Const type="rif:iri"> fn:numeric-add </Const> </op>
         <arg> <Const type="xsd:integer"> 1 </Const> </arg>
         <arg> <Var> ?X </Var> </arg>
      </Expr>
   </content>
</External>

b. Another example, that shows the RIF XML representation of a call to the application-specific nullary function today(), which symbol is defined in the example's namespace:

<External>
   <content>
      <Expr>    
         <op> <Const type="rif:iri"> jim:today </Const> </op>
      </Expr>
   </content>
</External>

2.1.2 ATOMIC

The ATOMIC class is used to represent atomic truth-valued statements. RIF-PRD covers rule languages in which the truth values of atomic statements can only be "true" or "false": that is, ATOMICs represent boolean statements. Dialects extending RIF-PRD may support additional truth values.

As an abstract class, ATOMIC is not associated with specific XML markup in RIF-PRD instance documents. It is specified in the normative schema as a substitution group.

Editor's Note: ...Substitution group or another construct, depending on how we handle extensibility in the XML schema.

    [ Atom | Equal | Member | Subclass | Frame ]

2.1.2.1 Atom

In RIF, the Atom element is used to represent a relation (or predicate).

The Atom element contains one op element, followed by zero or more arg arguments:

The content of the op element must be an element of the TERM class of construct. It represents the predicate symbol (the name of a relation);
The content of the arg elements must be constructs from the TERM abstract class. The order of the arg elements is significant and MUST be preserved.

    <Atom>
       <op> TERM </op>
       <arg> TERM </arg>*
    </Atom>

Editor's Note: Difference wrt RIF-BLD: Not sure what is the support for relational atoms where the predicate would not be a constant, among production rule languages: should op be restricted to be a Const? Depending on the answer, it might make sense for PRD to differ from BLD here. The benefits and drawbacks (e.g. wrt the definition of Core) have to be weighted.

Editor's Note: Difference wrt RIF-BLD: Named arguments Atoms are not included in this draft.

Example. The example below shows the RIF XML representation of the predicate owns(?c ?p), where the predicate symbol owns' is defined in the example namespace denoted by the prefix jim:.

<Atom>
   <op> <Const type="rif:iri"> jim:owns </Const> </op>
   <arg> <Var> ?c </Var> </arg>
   <arg> <Var> ?p </Var> </arg>
</Atom>

2.1.2.2 Equal

In RIF, Equal is used to represent equality relations.

The Equal element must contain two unordered side elements. The content of each side element must be a construct from the TERM abstract class. The order of the side elements is not significant and must not be preserved.

    <Equal>
       <side> TERM </side>
       <side> TERM </side>
    </Equal>

2.1.2.3 Member

In RIF, the Member element is used to represent membership relations.

The Member element contains two unordered sub-elements:

the object elements must be a construct from the TERM abstract class. It is required;
the class element must be a construct from the TERM abstract class. It is required as well.

    <Member>
       <object> TERM </object>
       <class> TERM </class>
    </Member>

Editor's Note: Difference wrt RIF-BLD: object and class make more sense to me as the roles in a membership relation than BLD's lower and upper. But, as noticed earlier, the discussion about naming is still to be had, and I do not care that much about label's names anyway... This is a proposed alternative: whatever the decision, BLD and PRD should use the same definition.

Example. The example below shows the RIF XML representation of a bollean expression that tests whether the individual denoted by the variable ?c is a member of the class Chicken that is defined in the example namespace denoted by the prefix jim:.

<Member>
   <object> <Var> ?c </Var> </object>
   <class> <Const type="rif:iri"> jim:Chicken </Const> </class>
</Member>

2.1.2.4 Subclass

In RIF, the Subclass element is used to represent class inclusion relations.

The Subclass element contains unordered two sub-elements:

the subClass element must be a construct from the TERM abstract class. It is required;
the class elements must be a construct from the TERM abstract class. It is required.

    <Subclass>
       <subClass> TERM </subClass>
       <class> TERM </class>
    </Subclass>

Editor's Note: Difference wrt RIF-BLD: BLD's lower and upper make more sense. I kept class for the upper element for consistency with the Member element, and subClass seemmed to make more sense, associated with class than lower would. But, again, I am not religious about tags. This is a proposed alternative: whatever the decision, BLD and PRD should use the same definition.

2.1.2.5 Frame

In RIF, the Frame construct is used to represent relations that hold between an individual, also called an object, and the values of some its properties or attributes: that is, object-attribute-value triples.

Accordingly, a Frame element must contain:

an object element, that contains an element of the TERM abstract class, the content of which represents the individual;
zero to many slot elements, each containing a Prop element that represents an attribute-value pair as:
- a key element that contains an element of the TERM abstract class that represents the name of the attribute (or property);
- a val element, that contains an element of the TERM abstract class, the content of which represents the attribute's value.

    <Frame>
       <object> TERM </object>
       <slot> Prop </slot>*
    </Frame>

    <Prop>
       <key> TERM </key>
       <val> TERM </val>
    </Prop>

Example. The example below shows the RIF XML syntax of an expression that states that the object denoted by the variable ?c has the value denoted by the variable ?a for the property Chicken/age that is defined in the example namespace.

<Frame>
   <object> <Var> ?c </Var> </object>
   <slot> 
      <Prop>
         <key> <Const type="rif:iri"> jim:Chicken/age </Const> </key>
         <val> <Var> ?a </Var> </val>
      </Prop>
   </slot>*
</Frame>

Editor's Note: Used an XPath style for the key. This has to be discussed, of course.

2.1.3 FORMULA

The FORMULA class is used to represent any truth-valued statement, atomic or not, that is allowed in the condition part of a rule covered by RIF-PRD. In rule languages covered by RIF-PRD, the truth values of statements can be "true" or "false": that is, FORMULAs can only represent boolean statements.

Dialects extending RIF-PRD may support additional truth values.

As an abstract class, FORMULA is not associated with specific XML markup in RIF-PRD instance documents. It is specified in the normative schema as a substitution group.

Editor's Note: ...Substitution group or another construct, depending on how we handle extensibility in the XML schema.

    [ ATOMIC | External | And | Or | NmNot | Exists ]

Editor's Note: Difference wrt RIF-BLD: a RIF-PRD without a construct for closed-world negation would not make mush sense; hence the addition, wrt BLD, of the non-monotonic (inflationary) negation NmNot construct. This construct is specific to PRD.

2.1.3.1 ATOMIC

A FORMULA can be a single ATOMIC statement. See specification of [ ATOMIC], above.

2.1.3.2 External

In RIF-PRD, an External represents an evaluated predicate when it is a FORMULA (as opposed to a TERM; that is, in particular, when it appears in a place where a FORMULA is required, and not a TERM).

The External element contains one content element that contains one Atom element:

    <External>
       <content>
          Atom
       </content>
    </External>

Builtin predicates. RIF-PRD specifies a subset of XPath/XQuery Functions and Operators [X F&O] that any conformant RIF-PRD implementation MUST support. The RIF-PRD builtin predicates are listed in the[ Data Types and Builtins] document.

Editor's Note: TBC how to reference non-builtin evaluated predicates

The op TERM in the Atom element must represent a constant symbol of type rif:iri that must uniquely identify the evaluated predicate to be applied to the arg TERMs. It can be one of the builtin predicates specified for RIF-PRD, or it can be application specific. In the latter case, it is up to the producers and consumers of RIF-PRD rulesets that reference non-builtin predicates to agree on their semantics.

Example. The example below shows the RIF XML representation of a boolean expression that tests whether the value denoted by variable ?a (e.g. the age of a chicken) is greater than the integer value 8, where the test is intended to behave like the builtin predicate op:numeric-greater-than as specified in XQuery 1.0 and XPath 2.0 Functions and Operators.

Editor's Note: What about the namespace associated with the prefix op:?

<External>
   <content>
      <Atom>    
         <op> <Const type="rif:iri"> op:numeric-greater-than </Const> </op>
         <arg> <Var> ?a </Var> </arg>
         <arg>  <Const type="xsd:decimal"> 8 </Const> </arg>
      </Atom>
   </content>
</External>

2.1.3.3 And

A FORMULA can represent the conjunction of zero of more statements, each of them represented by a FORMULA. This is represented by the And construct.

The And element contains zero or more formula elements, each containing an element of the FORMULA group.

    <And>
       <formula> FORMULA </formula>*
    </And>

2.1.3.4 Or

A FORMULA can represent the disjunction of zero of more statements, each of them represented by a FORMULA. This is represented by the Or construct.

The Or element contains zero or more formula elements, each containing an element of the FORMULA group.

    <Or>
       <formula> FORMULA </formula>*
    </Or>

2.1.3.5 NmNot

A FORMULA can represent the non-monotonic negation of a statement, itself represented by a FORMULA: this is represenetd by the NmNot construct.

The NnNot element contains exactly one formula element. The formula element contains an element of the FORMULA group, that represents the negated statement.

    <NnNot>
       <formula> FORMULA </formula>
    </NmNot>

Editor's Note: Difference wrt RIF-BLD: that construct is specific to RIF-PRD. I used a different name form the two kinds of negations specified in FLD (Not and Naf) for fear of snippers: naming remains to be discussed, anyway.

2.1.3.6 Exists

In RIF, the Exists construct is used to represent an existentially quantified FORMULA.

The Exists element contains:

one or more declare sub-elements, each containing a Var element that represents one of the existentially quantified variables;
exactly one required formula sub-element that contains an element from the FORMULA abstract class: the FORMULA represents the formula in the scope of the quantifier.

    <Exists>
       <declare> Var </declare>+
       <formula> FORMULA </formula>
    </Exists>

Example. The example below shows the RIF XML representation of a boolean expression that tests whether the chicken denoted by variable ?c is older than 8 months, by testing the existence of a value, denoted by variable ?a, that is both the age of ?c, as represented by a Frame as in example XX, and greater than 8, as represented by an External ATOMIC as in example YY.

<Exists>
   <declare> <Var> ?a </Var> </declare>
   <formula>
      <And>
         <Frame>
            <object> <Var> ?c </Var> </object>
            <slot> 
               <Prop>
                  <key> <Const type="rif:iri"> jim:Chicken/age </Const> </key>
                  <val> <Var> ?a </Var> </val>
               </Prop>
            </slot>*
         </Frame>
         <External>
            <content>
               <Atom>    
                  <op> <Const type="rif:iri"> op:numeric-greater-than </Const> </op>
                  <arg> <Var> ?a </Var> </arg>
                  <arg>  <Const type="xsd:decimal"> 8 </Const> </arg>
               </Atom>
            </content>
         </External>
      </And>
   </formula>
</Exists>

2.2 Actions

This section specifies the XML syntax that is used in RIF-PRD to represent what can appear in the action part of a rule supported by RIF-PRD.

Editor's Note: Difference wrt RIF-BLD: all the constructs in this section are specific to RIF-PRD.

Editor's Note: What about truth maintenance capabilities in many PR systems?

RIF-PRD defines one single abstract class for actions: ACTION, that is realised by five concrete constructs:

the Assert, Remove and Update constructs are associated with an ATOMIC that represents the target of the eponym actions;
the Execute construct is associated with a TERM that identifies the operation to be executed, and zero or more TERMs that represent the arguments with which the operation is to be executed;
the Assign construct is associated with a Frame that represents the target object and slot, as well as the new value to be assigned.

Editor's Note: Should other actions be considered in RIF-PRD? Esp., do we need some basic programatic constructs, such as a loop?

Editor's Note: Except for the different names, the missing abstract class that groups the Assert, Remove and Update actions, and the possibility to Assign a value to a (ruleset) variable (which are not defined in this draft), the ACTION abstract class is copied from the ImperativeExp abstract class that specifies the actions in the OMG [ PRR OCL] specification.

2.2.1 ACTION

The ACTION class of constructs is used to represent the individual actions in the action part of a production rule represented in RIF-PRD.

This version of RIF-PRD covers five ACTIONs: Assert, Remove, Update, Execute and Assign.

As an abstract class, ACTION is not associated with specific XML markup in RIF-PRD instance documents. It is specified in the normative schema as a substitution group.

Editor's Note: ...Substitution group or another construct, depending on how we handle extensibility in the XML schema.

    [ Assert | Remove  | Update  | Execute  | Assign ]

2.2.1.1 Assert

The Assert construct is used to represent actions that result in asserting an atomic statement.

The Assert element has one target sub-element that contains a construct from the ATOMIC group, which represents the atomic statement to be asserted on implementing the action. The Equal construct is not allowed as the target of an Assert

Editor's Note: Are there other restrictions on the kind of ATOMIC that can be Assert-ed? Shall we allow Member and Subclass ATOMICs to be Assert-ed? Or should we make no restriction: the assertion of an Equal might be well-formed, after all; its assertion might have unexpected results, but is that RIF's concern?

Editor's Note: Issues: ATOMIC might not be the right way to represent the target anyway: what about the case where a new individual has to be created? Use, by convention, the assertion of an anonymous frame (that is, a frame where the object is the name of a class, not an individual)?

Editor's Note: Or remove the Assert construct and interpret an ATOMIC as an ACTION as an assert? That would improve PRD/BLD/Core compatibility at the synatctic level. What would be the drawbacks?

    <Assert>
       <target> ATOMIC </target>
    </Assert>

2.2.1.2 Remove

The Remove construct is used to represent actions that result in negating an atomic statement.

The Remove element has one target sub-element that contains a construct from the ATOMIC group that represents the atomic statement to be removed on implementing the action.

Editor's Note: Are there any restriction on the kind of ATOMIC that can be removed? E.g. what would it mean to remove an Equal? Should we allow a Member or a Subclass ATOMIC to be Remove-ed?

Editor's Note: Issue: ATOMIC might not be the right way to represent the target anyway: e.g. how to Remove an individual (if empty Frames are not allowed)?

    <Remove>
       <target> ATOMIC </target>
    </Remove>

Editor's Note: Issue: How to remove multiple facts at once? E.g. Remove-ing an individual: use empty frames (if allowed) as a convention?

Example. The example below shows the RIF XML representation of an action that updates the chicken-potato ownership table by removing the predicate that states that the chicken denoted by variable ?c owns the potato denoted by variable ?p. The predicate is represented as in example XX.

<Remove>
   <target>
      <Atom>
         <op> <Const type="rif:iri"> jim:owns </Const> </op>
         <arg> <Var> ?c </Var> </arg>
         <arg> <Var> ?p </Var> </arg>
      </Atom>
   </target>
</Remove>

2.2.1.3 Update

The Update construct is used to represent actions that result in updating an atomic statement.

The Update element has one target sub-element that contains a construct from the ATOMIC group that represents the atomic statement to be updated on implementing the action.

    <Update>
       <target> ATOMIC </target>
    </Update>

From the specification of OMG PRR-OCL [ PRR]: "Some operations modify the state of objects and others do not. If the modified objects are in the scope of the engine, the engine must be notified that the objects state has been modified to be able to compute the list of eligible rules. It is not possible from the operation call to determine automatically what objects will be modified so it may be necessary in the rule to explicitly notify the engine."

Editor's Note: Issues: The Update as an action is specific to some rule engine technologies or implementations (e.g. RETE-based). Since it is required by such implementations only because of the opaque nature of the Execute actions with respect to the modification of ATOMICs, and since the specification of the procedural attachments to be Execute-ed requires a specific interchange channel anyway (as builtins or depending on an out-of-band agreement), should not the information relevant to any ATOMIC impacted by an Execute-ed procedure, and thus to required updates for engines that use them, rather be interchanged as part of the specification of said procedures? And Update not be covered by RIF-PRD, as a consequence?

2.2.1.4 Execute

The Execute is used to represent actions that result in the execution of a procedural attachment.

The Execute element has one op sub-element that represents the symbol of the procedural attachement to be executed on implementing the action, followed by zero or more arg sub-elements that represent its arguments:

The content of the op element must be a construct of the TERM abstract class;

Editor's Note: Or should the op element be restricted to a Const instead?

The content of the arg elements must be constructs from the TERM abstract class. The order of the arg elements is significant and MUST be preserved.

Review comment from Adrian
Or use the External construct instead (AdrianP)? But how about (i) the ambiguity with fixed interprestation predicates; (ii) asserted ATOMICs if we remove the Assert construct?

    <Execute>
       <op> TERM </op>
       <arg> TERM </arg>*
    </Execute>

Builtin operations.

TBD

Editor's Note: Should RIF-PRD have builtin executable operations (e.g. print etc) or should they be left to be the application specific?

Example. The example below shows the RIF-PRD XML representation of the action which execution results in the physical potato denoted by variable ?p being mashed, following the specification of the mash action in the example namespace denoted by the prefix jim:.

<Execute>
   <op> <Const type="rif:iri"> jim:mash </Const> </op>
   <arg> <Var> ?p </Var> </arg>
</Execute>

2.2.1.5 Assign

The Assign construct is used to represent actions that result in a value being assigned to a property of an individual.

The Assign element has one target sub-element that contains a Frame that represents an object-attribute-value, where the "value" has to be assigned to the "attribute" of the "object" on implementing the action.

Editor's Note: Or reuse the Equal construct (AdranP)? But that would be with a different semantics

Editor's Note: How will values be assigned to parameters (if and once we add them)?

    <Assign>
       <target> Frame </target>
    </Assign>

Example. The example below shows the RIF-PRD XML represnetation of the action that increases by 10% the value of the daily grain allowance of the chicken denoted by ?c.

<Assign>
   <target>
      <Frame>
         <object> <Var> ?c </Var> </object>
         <slot> 
            <Prop>
               <key> <Const type="rif:iri"> jim:Chicken/allowance </Const> </key>
               <val> ????Would need to declare a variable to compute 110% of its value??? </val>
            </Prop>
         </slot>*
      </Frame>
   </target>
</Assign>

Editor's Note: Really needs to define a standard syntax for getters and setters for properties, not just accessors à la Frame...

2.3 Production Rules

This section specifies the RIF-PRD constructs that are required, in addition to the [ RIF-PRD condition language] and the [ RIF-PRD action language], to represent, and interchange, complete production rules and rule sets.

The RULE construct is an abstract class: in RIF-PRD instance documents, it is either visible as a ConditionalStatement or as a Forall:

The ConditionalStatement associates an optional FORMULA and one non-empty ordered list of ACTIONS.

Editor's Note: Difference wrt RIF-BLD: The ConditionalStatement is the Implies construct of RIF-BLD. However, the word "Implies" does not carry the appropriate meaning for RIF-PRD. The proposal is to rename the Implies in RIF-BLD (to ConditionalStatement or whatever other name that is prefered) so that RIF-BLD and RIF-PRD can inherit the same construct from Core?

The Forall construct declares one or more Vars, and associates them with zero or more FORMULAs, as patterns that constrain the bindings of the variables, and with one RULE as the formula in its scope.

The RuleSet construct has zero or more constructs of the RULE group associated as rules.

Editor's Note: Do we need ruleset parameters, etc?

2.3.1 RULE

In RIF-PRD, the RULE class of constructs is used to represent rules, that is "if-then" statements, possibly universally quantified.

As an abstract class, RULE is not associated with specific XML markup in RIF-PRD instance documents. It is specified in the normative schema as a substitution group.

Editor's Note: ...Substitution group or another construct, depending on how we handle extensibility in the XML schema.

    [ ConditionalStatement | Forall ]

Editor's Note: Difference wrt RIF-BLD: The FORMULA is optional in a ConditionalStatement, but a list of ACTIONS with a trivially satisfied FORMULA is not the same, conceptually, as a head without a body in a logic program (e.g. it does not belong to the fact base). Thus, the rationale for making the head-without-a-body a different kind of RULE does not apply in the RIF-PRD case. ISSUE: how does Core represent a fact? From the answer to that question depends whether PRD and BLD may differ on this; preferably, they should not differ.

2.3.1.1 ConditionalStatement

The ConditionalStatement construct is used to represent the conditional statement (that is, the "if-then", or condition-conclusion, or antecedent-consequent pair) that is at the core of a rule.

The ConditionalStatement element contains zero or one if sub-element and one then sub-element:

the optional if element contains an element from the FORMULA class of constructs, that represents the condition of the rule;
the required then element contains a non-empty list of elements from the ACTION class of constructs. The order of the ACTION constructs in the then element is significant and MUST be preserved.

Editor's Note: Should we have an "else"-part, in addition to the if-part and the then-part? On the one hand, it is one of the rationale for separating the variable binding patterns FORMULAe from the if-part FORMULA; on the other hand, what is the support for it among the rule languages RIF-PRD aims to cover?

    <ConditionalStatement>
       <if> FORMULA </if>?
       <then>
          ACTION
          ACTION*
        </then>
     </ConditionalStatement>

Editor's Note: Difference wrt RIF-BLD: the content of the then element is specific to RIF-PRD (but PRD and BLD share the structure of the construct).

2.3.1.2 Forall

The Forall construct is used to represent universally quantified rules.

The Forall element contains:

one or more declare sub-elements, each containing a Var element that represent one of the universally quantified variable;
exactly one formula sub-element that represents the formula in the scope of the quantifier and that contains an element of the RULE group;
zero or more pattern sub-elements, each containing an element from the FORMULA group of constructs that represent a constraint on the bindings of one of the Vars that are declared by the Forall.

    <Forall>
       <declare> Var </declare>+
       <pattern> FORMULA </pattern>*
       <formula> RULE </formula>
    </Forall>

Editor's Note: Difference wrt RIF-BLD: the Forall in RIF-PRD adds, wrt BLD, the pattern sub-element to associate FORMULAs to the declared Vars. One of the rationale for the Pattern sub-element is to allow "if-then-else" rules. Both differences are specific to RIF-PRD.

Example. The exmaple below shows the RIF-PRD XML representation of a simplified version of the CMP rule that reads: except on Tuesdays, every potato thet belongs to a chicken is mashed, and the ownership table is updated, or

Forall ?c, where ?c is a jim:Chicken
Forall ?p, where ?p is a jim:Potato and owns(?c ?p)
If Not( jim:today() equal "Tueday"
Then mash(?p) and remove (?c ?p) from the ownership table.

Some tags and content are emphasized (bold-face) for readibility in the XML fragment below. . The complete RIF-PRD XML representation of the CMP rule can be found in [ appendix XXX].

<Forall>
   <declare> <Var> ?c </Var> </declare>
   <pattern>
      <Member>
         <object> <Var> ?c </Var> </object>
         <class> <Const type="rif:iri"> jim:Chicken </Const> </class>
      </Member>
   </pattern>
   <formula>
      <Forall>
         <declare> <Var> ?p </Var> </declare>+
         <pattern>
            <And>
               <Member>
                  <object> <Var> ?p </Var> </object>
                  <class> <Const type="rif:iri"> jim:Potato </Const> </class>
               </Member>
               <Atom>
                  <op> <Const type="rif:iri"> jim:owns </Const> </op>
                  <arg> <Var> ?c </Var> </arg>
                  <arg> <Var> ?p </Var> </arg>
               </Atom>
            </And>
         </pattern>
         <formula>
            <ConditionalStatement>
               <if>
                  &lt:NmNot>
                     <formula>
                        <Equal>
                           <side>
                              <External>
                                 <content>
                                    <Expr>    
                                       <op> <Const type="rif:iri"> jim:today </Const> </op>
                                    </Expr>
                                 </content>
                              </External>
                           </side>
                              <Const type="jim:DayOfTheWeek"> Tuesday </Const>
                           <side>
                           </side>
                        </Equal>
                     </formula>
                  </NmNot>
               </if>
               <then>
                  <Execute>
                     <op> <Const type="rif:iri"> jim:mash </Const> </op>
                     <arg> <Var> ?p </Var> </arg>
                  </Execute>
                  <Remove>
                     <target>
                        <Atom>
                           <op> <Const type="rif:iri"> jim:owns </Const> </op>
                           <arg> <Var> ?c </Var> </arg>
                           <arg> <Var> ?p </Var> </arg>
                        </Atom>
                     </target>
                  </Remove>
               </then>
            </ConditionalStatement>
         </formula>
      </Forall>
   </formula>
</Forall>

2.3.2 RuleSet

The RuleSet construct is used to represent a set of rules that have to be considered together from a semantic or operational viewpoint (e.g. a knowledge base, a rule base, a ruleset).

The RuleSet element contains zero or more rule element. Each rule element contains an element from the RULE group of constructs that represents one of the rules contained in the rule set.

Editor's Note: Shouldn't that be "at least one rule element"?

Editor's Note: RuleSet parameters? Input/ouput Vars?

    <RuleSet>
       <rule> RULE </rule>*
    </RuleSet>

Editor's Note: Difference wrt BLD: the RuleSet construct does not exist in BLD, although its grouping capability can be represented by the Group construct. However, production rule systems use the ruleset as a first-class construct, passing parameters and scoping local variables.

2.4 Presentation syntax

This section specifies a presentation syntax for RIF-PRD. The presentation syntax is not normative: its main purpose is to help make the normative specification of the semantics easier to read.

The presentation syntax is specified by the following EBNF, directly in terms of names of the XML elements specified in the previous sections (and, normatively, in the [ XML schema for RIF-PRD]. Where elements with the same name are used in different context with different content, which would lead to ambiguous productions, the ambiguous name has been appended to a disambiguating, context-specific prefix: namely, the name of the containing element. Specifically, the target role in the Assign production (and element) has been renamed assign_target to differenciate it from the target role of other ACTION elements, and the formula role in the Forall production has been renamed forall_formula.

TERM                       ::= Const | Var | External
Const                   ::= LITERAL '^^' type ('@' xml:lang)?
LITERAL         ::= any Unicode string
type                    ::= IRI
xml:lang                ::= xs:language
Var                     ::= xs:NMTOKEN
External                ::= ' External( ' op ' ( ' arg* ' )) '
op                      ::= TERM
arg                     ::= TERM

ATOMIC                     ::= Atom | Equal | Member | Subclass | Frame
Atom                    ::= op ' ( ' arg* ' ) '
Equal                   ::= side ' = ' side
side                    ::= TERM
Member                  ::= object ' # ' class
object                  ::= TERM
class                   ::= TERM
Subclass                ::= subClass ' ## ' class
subClass                ::= TERM
Frame                   ::= object ' [ ' (key ' -> ' val)* ' ] '
key                     ::= TERM
val                     ::= TERM

FORMULA            ::= ATOMIC | External | And | Or | Naf | Exists
And                     ::= 'AND ( ' formula* ' ) '
formula         ::= FORMULA
Or                      ::= 'OR ( ' formula* ' ) '
NmNot                   ::= ' NOT ( ' formula ' ) '
Exists                  ::= ' Exists ' declare+ ' ( ' formula ' ) '
declare         ::= Var

ACTION                     ::= Assert | Remove | Update | Execute | Assign
Assert                  ::= ' ASSERT ( ' target ' ) '
target                  ::= ATOMIC
Remove                  ::= ' REMOVE ( ' target ' ) '
Update                  ::= ' UPDATE ( ' target ' ) '
Execute         ::= ' EXECUTE { ' op  ' ( ' arg* ' ) } '
Assign                  ::= ' SET ( ' assign_target ' ] ) '
assign_target           ::= Frame

RULE                       ::= ConditionalStatement | Forall
ConditionalStatement    ::= (' IF ' if ' THEN ')? then
if                      ::= FORMULA
then                    ::= ACTION (' ; ' ACTION)*
Forall                  ::= ' Forall ' declare+ (' SUCH THAT ' pattern)* ' ( ' forall_formula ' ) '
pattern         ::= FORMULA
forall_formula          ::= RULE

RuleSet         ::= 'RULESET ( ' rule* ' ) '
rule                    ::= RULE

Example. The example below shows the RIF-PRD presentation syntax for the simplified version of the CMP rule shown in example XX. The RIF-PRD presentation syntax for the complete CMP rule is shows in [ appendix XX].

FORALL ?c SUCH THAT ?c#jim:Chicken^^rif:iri
  ( FORALL ?p SUCH THAT AND( ?p#jim:Potato^^rif:iri jim:owns^^rif:iri(?c ?p))
      ( IF NOT( jim:today^^rif:iri() = "Tuesday"^^jim:DayOfTheWeek )
        THEN EXECUTE( jim:mash^^rif:iri(?p) );
             REMOVE( jim:owns^^rif:iri(?c ?p) )
      )
  )

3 Operational semantics

3.1 Introduction

For the purpose of specifying the semantics of a RIF-PRD RuleSet, a production rule system PRS is defined as a labelled terminal transition system (e.g. Plo04).

Definition (labelled terminal transition system): A labelled terminal transition system is a structure {Γ, L, →, T}, where

Γ is a set of elements, γ, called configurations;
L is a set of elements, α, called actions (or labels);
→ ⊆ Γ × L × Γ is the transition relation, that is: (γ, α, γ' ) ∈ → iff there is a transition labelled α from the configuration γ to the configuration γ' or, more appropriately in the case of a PRS, the execution of action α in the configuration γ causes a transition to configuration γ' . We note: γ + α → γ' or γ ^α→ γ';
T ⊆ Γ is the set of final configurations, that is, the set of all the configurations γ from which there are no transitions: T = {γ ∈ Γ | ∀ α ∈ L, ∀ γ' ∈ Γ, (γ, α, γ' ) ∉ →}.

A labelled transition system is a structure {Γ, L, →} (without a set T of final configurations).

The idea of describing a PRS as a labelled terminal transition system is that, given a set of production rules RS and a set of facts w₀, the rules in RS that are satisfied, in some sense, in w₀ determine an action α₁, which execution results in a new set of facts w₁; the rules in RS that are satisfied in w₁ determine an action α₂ to execute in w₁, and so on, until the system reaches a final configuration and stops. The result is the set of facts w_n when the system stops.

Example. Judicael, one follower of Jim the Hen Handler's method, has four chicken, Jim, Jack, Joe and Julia, that own three potatoes (BigPotato, SmallPotato, UglyPotato) among them:

Jim (daily grain allowance = 10) is 12 months old, Jack (daily grain allowance = 12) is 9 months old, Joe (daily grain allowance = 6) is 6 months old and Julia (daily grain allowance = 14) is 10 months old;
BigPotato weights 70g, SmallPotato weights 10g, UglyPotato weights 50g;
Jim owns BigPotato, Jack and Woof own SmallPotato jointly (Woof is the farm's dog. It inherited joint ownership of SmallPotato from its aunt Georgette) and Joe owns UglyPotato.

That is the initial set of facts w₀. Paula's rule set contains one single rule, that is, Jim's CMP rule:

Rule ChickensAndMashedPotatoes:

Forall ?c, where ?c is a chicken and the age of ?c in months is > 8;

Forall ?p, where ?p is a potato, ?p is owned by ?c, and the weight in decigrams of ?p > (age of ?c)/2;

If today is not Tuesday, and there is no fox in the hen house

Then mash ?p, increase the grain allowance of ?c by 10%, and remove the couple (?c, ?p) from the ownership relation.

When the rule is applied to w₀:

the first line selects {Jim/?c, Jack/?c, Julia/?c} as possible values for variable ?c (Joe is too young);
the second line selects {(Jim/?c, BigPotato/?p)} as the only possible substitution for the variables ?c and ?p (UglyPotato does not belong to either Jim, Jack or Julia and SmallPotato is too small);

Suppose that Judicael's implementation of today() returns Monday and that the foxAlarm() is false when the CMP rule is applied: the condition is satisfied, and the actions in the conclusion are executed with BigPotato substituted for ?p and Jim substituted for ?c. This results in the following changes in the set of facts:

BigPotato is mashed (and removed from the list of potatoes to be considered in future applications of the CMP rule);
the daily grain allowance of Jim is now 11;
Jim does not own a potato anymore.

The resulting set of facts w₁ is thus:

Jim (daily grain allowance = 11) is 12 months old, Jack (daily grain allowance = 12) is 9 months old, Joe (daily grain allowance = 6) is 6 months old and Julia (daily grain allowance = 14) is 10 months old;
SmallPotato weights 10g, UglyPotato weights 50g;
Jack and Woof own SmallPotato jointly and Joe owns UglyPotato.

When the CMP rule in applied to w₁, the first line still selects {?c/Jim, ?c/Jack, ?c/Julia} as possible values for variable ?c, but the second line does not select any possible substitution for the couple (?c, ?p) anymore: the rule cannot be satisfied, and the system, having detected a final configuration, stops.

The result of the execution of the system is w₁.

3.2 Definitions

Let Const be the set of constant symbols that can be represented in RIF-PRD, as determined by the the specification of the Const construct; Var, the set of variable symbols that can be represented in RIF-PRD, as determined by the specification of the Var construct; and Term the set of terms that can be represented in RIF-PRD, as determined by the the specification of the TERM class of constructs. A substitution and the composition of substitutions are defined as follows (e.g. HAK07):

Definition (substitution): A substitution is a finitely non-identical assignment of terms to variables; i.e., a function σ from Var to Term such that the set {x ∈ Var | x ≠ σ(x)} is finite. This set is called the domain of σ and denoted by Dom(σ). Such a substitution is also written as a set such as σ = {t_i/x_i}_i=1..n where Dom(σ) = {x_i}_i=1..n and σ(x_i) = t_i for i = 1 to n.

Given X, a set of variables, and σ, a substitution such that X ⊆ Dom(σ), σ_X denotes, in this document, the restriction of σ to the variables in X: σ_X = {σ(x)/x | x ∈ X}.

Let Term_Σ be the set of ground terms (that can be represented in RIF-PRD), defined as follows:

if c ∈ Const, then c ∈ Term_Σ;
if f ∈ Const is the symbol of an externally specified function and t_i ∈ Term_Σ for i = 1..n, then External(f(t₁ ... t_n)) ∈ Term_Σ.

By convention, a ground External term is always evaluated and replaced by a constant symbol that represents its value.

Definition (ground substitution): A ground substitution is a substitution σ such that ∀ x ∈ Dom(σ), σ(x) ∈ Term_Σ.

The function Var(e) that maps a RIF-PRD element e to the set of its free or universally quantified variables is defined as follows:

if e ∈ Const, then Var(e) = {};
if e ∈ Var, then Var(e) = {e};
if p ∈ Term and arg_i ∈ Term, for i = 0..n, then, Var(p(arg_i) = Var(p) ∪_i=0..n Var(arg_i);
if p ∈ Term and arg_i ∈ Term, for i = 0..n, then, Var(External(p(arg_i)) = Var(p) ∪_i=0..n Var(arg_i);
if t₁ ∈ Term and t₂ ∈ Term, then Var(t₁ [=|#|##] t₂) = Var(t₁) ∪ Var(t₂);
if o ∈ Term, k_i ∈ Term for i = 1..n, and v_i ∈ Term for i = 1..n, then Var(o[k₁->v₁ ... k_n->v_n]) = Var(o) ∪_i=1..n Var(k_i) ∪_i=1..n Var(v_i);
if f_i are RIF-PRD FORMULAs for i = 0..n, then Var([AND|OR|NOT](f_i)) = ∪_i=0..n Var(f_i);
if f is a RIF-PRD FORMULA and x_i ∈ Var for i = 1..n, then, Var(Exists x₁ .. x_n (f)) = Var(f) - {x_i}.
if a is a RIF-PRD ATOMIC, then Var([ASSERT|REMOVE|UPDATE|SET](a)) = Var(a);
if op ∈ Term and arg_i ∈ Term, for i = 0..n, then, Var(EXECUTE(op(arg_i)) = Var(op) ∪_i=0..n Var(arg_i);
if f is a RIF-PRD FORMULA and a_i, i = 1..n are RIF-PRD ACTIONs, then Var(IF f THEN a₁ ... a_n) = Var(f) ∪_i=1..n Var(a_i);
if r is a RIF-PRD RULE, x_i ∈ Var for i = 1..n, and p_j, j = 0..m, are RIF-PRD FORMULAs, then Var(Forall x₁ ... x_n SUCH THAT p_j (r)) = Var(r) ∪_j=1..m Var(p_j);
if r_i, i = 0..n, are RIF-PRD RULEs, then Var(RULESET(r_i)) = ∪_i=1..n Var(r_i).

Definition (rule instance): An instance of a rule r is a pair (r^id, σ), where r^id uniquely identifies r and σ is a ground substitution such that Var(r) ⊆ Dom(σ). Given a rule r, id(r) denotes r^id, the unique identifier of r; given a rule instance ri = (r^id, σ), rule(ri) denotes the rule that is uniquely identified by r^id.

Definition (ruleset instance): an instance of a rule set RS is a set of rule instances ri_i, where ∀ i, rule(ri_i) ∈ RS. Given a rule set RS, Inst(RS) denotes the set of all the possible instances of RS.

Definition (agenda): an agenda is an ordered list of pairs (ri_i, mark_i), where each ri_i is a rule instance and mark_i ∈ {fireable, fired}. Given a set of rules RS, A_RS denotes the set of the agendas that are such that the rule instance parts of all their elements are instances of RS, i.e.: ∀ a = {(ri_i, m_i)}, a ∈ A_RS iff ∀ ri_i, ri_i ∈ Inst(RS).

Editor's Note: The order in the agenda might not be used in this draft; but defining an agenda as ordered may be useful in later drafts.

Let W be the set of ground formulas that can be represented in RIF-PRD, defined as follows:

if p ∈ Const is the symbol of an externally specified predicate and t_i ∈ Term_Σ for i = 1..n, then External(p(t₁ ... t_n)) ∈ W</sub>;
if p ∈ Const is neither the symbol of an externally specified predicate nor the symbol of an externally specified function, and t_i ∈ Term_Σ for i = 1..n, then p(t₁ ... t_n) ∈ W</sub>;
if t₁ ∈ Term_Σ and t₂ ∈ Term_Σ, then t₁ = t₂ ∈ W;
if t₁ ∈ Term_Σ and t₂ ∈ Term_Σ, then t₁ # t₂ ∈ W;
if t₁ ∈ Term_Σ and t₂ ∈ Term_Σ, then t₁ ## t₂ ∈ W;
if o ∈ Term_Σ, k_i ∈ Term_Σ for i = 1..n, and v_i ∈ Term_Σ for i = 1..n, then o[k₁->v₁ ... k_n->v_n] ∈ W;
if f_i ∈ W for i = 1..n, then AND(f₁ ... f_n) and OR(f₁ ... f_n) are ground formulas;
if f ∈ W, then NOT(f) ∈ W;
if f ∈ W and x_i ∈ Var for i = 1..n, then, ∀ n ≥ 1, Exists x₁ .. x_n (f) ∈ W. By convention, a ground formula Exists x₁ .. x_n (f) is always replaced by the semantically equivalent ground formula f.

Editor's Note: ..or W could be defined, except for the Exists, as the set of the formulas f such that Var(f) = {}?

Let L be the set of the atomic ground actions that can be represented in RIF-PRD:

if TACTION is a targeted RIF-PRD ACTION: ASSERT, REMOVE, UPDATE or SET, and if its target element is a ground ATOMIC a ∈ W, them TACTION(a) ∈ L;
if f ∈ Const is the symbol of an externally specified function and t_i ∈ Term_Σ for i = 1..n, then EXECUTE(f(t₁ ... t_n)) ∈ L.

Editor's Note: The definition of L is subject to change when the ACTIONs will be specified more precisely.

Finally, let R be the set of all the rules that can be represented in RIF-PRD.

In this document, given a set X, P(X) denotes the power set of X, and S(X) denotes the set of the ordered lists of elements of X.

3.3 Operational semantics of actions

The transition relation →_{_RIF-PRD} ⊆ P(W) × L × P(W) is defined as follows:

∀ w ⊆ P(W), w + ASSERT(f) →_{_RIF-PRD} w' = w ∪ {f};
∀ w ⊆ P(W), w + REMOVE(f) →_{_RIF-PRD} w' = w - f;
∀ w ⊆ P(W), w + UPDATE(f) →_{_RIF-PRD} w' = TBD;
∀ w ⊆ P(W), w + EXECUTE(f) →_{_RIF-PRD} w' = TBD;
∀ w ⊆ P(W), w + ASSIGN(f) →_{_RIF-PRD} w' = TBD;

The intended semantics of the actions that can be represented in RIF-PRD is completely specified by the definition of the labelled transition system {P(W), L, →_{_RIF-PRD}}.

Editor's Note: The definition of W would need to be modified if we wanted to cover non-ground facts as well, but the definition of the relation would remain essentially the same.

The relation →^*_{_RIF-PRD} ⊆ P(W) × S(L) × P(W) denote the transitive closure of the transition relation →_{_RIF-PRD}.

3.4 Operational semantics of rule sets

Assuming a function INSTANTIATE: P(W) × P(R) × A_R → A_R, that, given a set of facts, a set of rules, and an agenda, returns an agenda, let Γ_RS be a set of configurations, where a configuration is a triple γ = (w, a, INSTANTIATE(w, a, RS)), where w ⊆ W is a set of facts, a ∈ A is an agenda, and RS ⊆ R is a rule set.

Given a configuration γ = (w, a, a_γ), let w(γ) denote the first element, the set of facts w; let history(γ) denote the second element, the input agenda a; and let agenda(γ) denote the third element, the new agenda, a_γ, produced by INSTANTIATE.

Assuming two more functions:

PICK: Γ_RS × P(R) → S(L) × P(Inst(R)), that, given a configuration and a set of rules, returns an ordered list of ground actions and a set of rule instances;
FINAL: Γ_RS × P(R) → {true, false}, that, given a configuration and a set of rules, returns true or false;

The idea is that PICK returns the sequence of actions that the input rule set is intended to produce in the input configuration, as well as the set of rule instances that determined that sequence of actions. In the same way, FINAL is meant to return true if the input configuration is intended to be final, given the input rule set, and false otherwise.

Let updateAgenda: A_R × P(Inst(R)) → A_R, a function that, given an agenda a and a set rsinst of rule instances, marks as fired, in a, all the rule instances that belong to rsinst: update(a, rinst) = {(ri, mark) | ((ri, m') ∈ a) and (mark = fired, if ri ∈ rsinst; or mark = m' otherwise)}.

Editor's Note: What is the appropriate presentation for the "case" definition of the value of mark, in the definition of updateAgenda?

A RIF-PRD production rule system is defined as a labelled terminal transition system where:

Γ_RS is the set of configurations;
S(L) is the set of labels;
The transition relation →_{_RS} ⊆ Γ_RS × S(L) × Γ_RS, is defined as follows:
∀ (γ, s, γ') ∈ Γ_RS × S(L) × Γ_RS, (γ, s, γ') ∈ →_{_RS} if and only if all of the following hold:
1. PICK(γ, RS) = (s, pickedInst)
2. and (w(γ), s, w(γ')) ∈ →^*_{_RIF-PRD}
3. and a(γ') = updateAgenda(agenda(γ), pickedInst)
4. and γ ∉ T_RS
A configuration γ is final iff FINAL(γ, RS) = true: T_RS = {γ ∈ Γ_RS | FINAL(γ, RS) = true}.

Intuitively, the conditions 1 and 2 in the definition of the transition relation →_{_RS} say that there must be a path, according to the relation →_{_RIF-PRD} that specifies the semantics of the individual actions, leading from the set of facts w = w(γ) to the set of facts w' = w(γ'), that implements the sequence s of actions returned by PICK in configuration γ. Rule 3 means that the systems keeps a memory of the rule instances which action part has been executed.

The input function is defined as:
Eval(RS, w) →_{_RS} (w, ∅, INSTANTIATE(w, ∅, RS))) ∈ Γ_RS The output function is defined as:
(w', a, a_w') ∈ T_RS →_{_RS} w'

Or, using →^*_{_RS} to denote the transitive closure of the transition relation:

Eval(RS, w) →^*_{_RS} w'

The intended operational semantics of a production ruleset represented by a RIF-PRD RuleSet RS is, therefore, completely specified by the specification of the three functions INSTANTIATE, PICK and FINAL.

Editor's Note: to be developed: explain, by reference to the usual description of a PRS as a match-resolve conflict-fire cycle, why 3 functions instead of only PICK and final if empty pick.

3.4.1 Rules instantiation: INSTANTIATE

The evaluation of the function INSTANTIATE corresponds to the step that is often called matching in the description of production rule systems (e.g. PRR07).

Indeed, the function is most easily specified using the widely spread notion of pattern matching. The following definition is adapted from CIR04.

Definition (pattern matching): Given a set w ⊆ W of ground RIF-PRD ATOMICs and a RIF-PRD FORMULA p, p is said to match w with respect to a theory τ and a ground substitution σ, denoted σ(p) «_τ w, iff one of the following is true:

p is a RIF-PRD ATOMIC and ∃ f ∈ w such that σ(p) =_τ f (read: σ(p) is identical to f modulo τ);
p is a RIF-PRD External and σ(p) evaluates to true;
p is a RIF-PRD And that contains no conjunct, that is, p = AND();
p = AND(p₁ ... p_n), i = 1..n, and ∀ p_i, σ(p_i) «_τ w;
p = OR(p₁ ... p_n), i = 1..n, and ∃ p_i such that σ(p_i) «_τ w;
p = NOT(p') and σ(p') does not match w with respect to τ;
p = Exists {x_i} p' , where {x_i}, i = 1..n, denotes the set of the variables that are existentially quantified in p, and there is a substitution θ with {x_i} ⊆ Dom(θ), Dom(σ) ⊆ Dom(θ) and ∀ y ∈ Var(p') - {x_i}, θ(y) = σ(y), such that θ(p') «_τ w.

A set P of RIF-PRD FORMULAe is said to match a set w ∈ W of ground RIF-PRD ATOMICs with respect to a theory τ and a ground substitution σ, denoted σ(P) «_τ w, iff σ(p_i) «_τ w for all p_i ∈ P.

Editor's Note: The definition is easily, if tediously, extended to cover the case of matching arbitrary FORMULAe against any set of ground FORMULAe ⊆ W.

Editor's Note: The matching theory τ to be taken into account includes the equality theory defined between frames, the builtin datatypes, functions and relations, and any externally defined datatypes, classification theory, functions and relations as required by the ruleset. This is to be further specified when references to application-specific data models and externals is specified.

Editor's Note: Do we need to go into any deeper details wrt the matching of ATOMICs, or is that unambiguous enough?

Example. TBD

Let InstantiateRULE: P(W) × A_R × R → A_R, be a function that, given a set of facts w, an agenda a and a rule r, returns an agenda a' ∈ A_{r}.

The evaluation of InstantiateRULE(w, a, r), where w ⊆ W, a ∈ A_R and r ∈ R, is specified as a terminal transition system where:

A configuration is a sextuple (w, a, r, σ, Σ, acc), where w ⊆ W, a ∈ A_R, r ∈ R, σ is a substitution, Σ is a set of substitutions such that ∀ σ_i ∈ Σ, σ ⊆ σ_i and Var(r) ⊆ Dom(σ_i), and acc is a pair (id, Σ'), where id is the unique identifier of a rule, and Σ' ⊆ Σ;
The transition relation is defined by the following rules:
1. if θ ∈ Σ and ∀ y ∈ Var(P) - {x_i}, θ(y) = σ(y) and θ(P) «_τ w:
  (w, a, Forall {x_i} P (r), σ, Σ, acc) → (w, a, r, θ, Σ, acc)
2. if θ ∈ Σ and ∀ y ∈ Var(P) - {x_i}, θ(y) = σ(y) and ¬(θ(P) «_τ w):
  (w, a, Forall {x_i} P (r), σ, Σ, acc) → (w, a, Forall {x_i} P (r), σ, Σ - θ, acc)
3. if σ ∈ Σ and σ(c) «_τ w:
  (w, a, IF c THEN a, σ, Σ, (id, Σ')) → (w, a, rule(id), ∅, Σ - σ, (id, Σ' ∪ {σ}))
4. (w, a, THEN a, σ, Σ, (id, Σ')) → (w, a, rule(id), ∅, Σ - σ, (id, Σ' ∪ {σ}))
5. if σ ∈ Σ and ¬(σ(c) «_τ w):
  (w, a, IF c THEN a, σ, Σ, (id, Σ')) → (w, a, rule(id), ∅, Σ - σ, (id, Σ'))
Configurations where the set Σ of candidate substitutions is empty are final: T = {(w, a, r, σ, ∅, acc)}.

Intuitively, a configuration is made of the set of facts, w, against which the rule is evaluated; the agenda, a, that contains the memory of the system in terms of which rules were instantiated already, and the actions determined by which rule instances where implemented already; the rule, r, to be instantiated; the substitution, σ, that is under evaluation as possibly instantiating r in such a way that its condition matches w; the set, Σ, of the candidate substitutions that are still to be evaluated, including σ; and the instances accumulator, acc, that relates the identifier, id, of the instantiated rule, r, to the set of the substitutions that have been evaluated and with respect to which the condition of r is known to match w.

In the definition of the transition relation, the first rule says that, if the rule r to be instantiated is represented by a RIF-PRD Forall that declares a set of variables {x_i} constrained by a set of patterns P, and if there is a substitution θ that agrees with σ on all the variables that are free in P, except possibly on the variables in {x_i}, such that P matches the set of facts w with respect to θ, then, the instantiation of the RIF-PRD RULE that is in the scope of the Forall can be continued with the candidate substitution θ instead of σ. Rule 2 says that, in the same case, if θ(P) does not match w, r remains to be instantiated, but θ need not be considered further and can be removed from the set of candidate substitutions.

Rules 3 says that, if the rule to be instantiated is represented by a RIF-PRD ConditionalStatement, and if the condition c represented by the RIF-PRD FORMULA in its if element matches the set of fact w with respect to the substitution σ, then, σ is a matching substitution for the rule rule(id); and that the rule instantiation process can be started all over again with the remaining candidate substitutions. Rule 4 accounts for the special case where the if part is omitted in the ConditionalStatement: the intended semantics, in that case, is that the condition part is meant to be trivial true.

Rule 5 says that, if the rule r to be instantiated is represented by a RIF-PRD ConditionalStatement, and if the condition c represented by the RIF-PRD FORMULA in its if element does not match the set of fact w with respect to the substitution σ, then, r remains to be instantiated, but σ need not be considered further and can be removed from the set of candidate substitutions.

Editor's Note: There are certainly more elegant ways to make sure that all the possible instances have been found... Suggestion, anyone?

Given a rule r ∈ R, let Σ_r denote the set of the restrictions to Var(r) of all the subtitutions σ such that Var(r) ⊆ Dom(σ): Σ_r = {σ_Var(r) | ∀ σ, Var(r) ⊆ Dom(σ)}.

The input function can now be defined as:
InstantiateRULE(w, a, r) → (w, a, r, ∅, Σ_r, (id(r), ∅))

The output function is defined as:

(w, a, r, σ, ∅, (id, Σ')) → {((id, σ), m) | (σ ∈ Σ') and (m = m' if (id, σ, m') ∈ a; fireable otherwise)}

The output function creates an agenda out of the accumulator, marking new instances as fireable, whereas instances that where already in the input agenda keep their marking.

Example. TBD

The evaluation of the function INSTANTIATE(w, a, RS), where w ⊆ W, a ∈ A_R and RS ⊆ R, can now be specified as a simple terminal transition system where:

a configuration is a quadruple (w, a, rs, α), where rs ⊆ RS and α ⊆ A_{(RS -
rs)};
the transition relation is defined by the single rule:
(w, a, rs, α) → (w, a, rs - r, α ∪ InstantiateRULE(w, a, r))
configurations with an empty rule set component are final: T = {(w, a, ∅, α)}.

The input function is defined as:

INSTANTIATE(w, a, RS) → (w, a, RS, ∅)

The output function is defined as:

(w, a, ∅, α) → α

Editor's Note: To Be Added: textual explanation of the rules and the TTS (esp. re its confluence).

Example. TBD

3.4.2 Conflict resolution: PICK

TBD

default for 1st WD could be to return the instantiate then-part of the first, or a random, fireable rule instance in INSTANTIATE(w, a, RS), after refraction.

3.4.3 Halting test: FINAL

TBD

Default for 1st WD is likely to be when no rule is fireable in INSTANTIATE(w, a, RS), that is, either the agenda is empty or all rules are already fired = (if it is the above default) PICK return an empty list of actions.

4 Compatibility with other RIF dialects

TBD

5 Glossary

TBD

RIF Production Rule Dialect

W3C Editor's Draft 23 May 2008

Abstract

Status of this Document

May Be Superseded

Please Comment By 2008-05-25

No Endorsement

Patents

Contents

1 Overview

1.1 Introduction

1.2 Overview of RIF production rule dialect

1.2.1 RIF-PRD as an interchange format for production rules

1.2.2 RIF-PRD as a Web language

1.2.3 Compatibility with other RIF dialects

1.2.4 Syntaxes

1.2.4.1 XML syntax

1.2.4.2 Presentation syntax

1.2.4.3 Structural diagram

1.3 Overview of this document

1.3.1 Namespaces

1.3.2 Running example

2 Syntax

2.1 Condition Language

2.1.1 TERM

2.1.1.1 Const

2.1.1.2 Var

2.1.1.3 External

2.1.2 ATOMIC

2.1.2.1 Atom

2.1.2.2 Equal

2.1.2.3 Member

2.1.2.4 Subclass

2.1.2.5 Frame

2.1.3 FORMULA

2.1.3.1 ATOMIC

2.1.3.2 External

2.1.3.3 And

2.1.3.4 Or

2.1.3.5 NmNot

2.1.3.6 Exists

2.2 Actions

2.2.1 ACTION

2.2.1.1 Assert

2.2.1.2 Remove

2.2.1.3 Update

2.2.1.4 Execute

2.2.1.5 Assign

2.3 Production Rules

2.3.1 RULE

2.3.1.1 ConditionalStatement

2.3.1.2 Forall

2.3.2 RuleSet

2.4 Presentation syntax

3 Operational semantics

3.1 Introduction

3.2 Definitions

3.3 Operational semantics of actions

3.4 Operational semantics of rule sets

3.4.1 Rules instantiation: INSTANTIATE

3.4.2 Conflict resolution: PICK

3.4.3 Halting test: FINAL

4 Compatibility with other RIF dialects

5 Glossary