Deriving BLD PS from the XML syntax (Was: Re: escaping quotes)

Michael Kifer wrote:
> 
> I like your jokes. Especially when you tell them in person. (really :-)
> But April 1 was > 1 month ago.

Hmmm... My sense of humor must be as out of order as my calendar: I do 
not get the joke :-(

Anyway, I said some time ago (at F2F8, I think) that I would spend two 
hours trying to reorganize BLD so that the PS can be presented as derived 
from the XML. Here is the result of my attempt: I reached the time 
limit in the middle of a sentence, so it is still patchy and sketchy. 
But I think that it is enough to make my point (that it is not only 
feasible, but pretty easy).

Here is how it works: replace all of section 2.5 (EBNF Grammar for the 
Presentation Syntax of RIF-BLD) with the text in the file. Print the whole 
document. Read. Does it make sense?

The next step would be to remove section 4.3 (Translation Between the 
RIF-BLD Presentation and XML Syntaxes) and to keep only the examples 
from 4.1 and 4.2 (and probably move them earlier in the text). But you 
can try my suggestion of presentation without going that far.

I believe that, given the short time I spent on it, it is close enough 
that we should at least give that alternative a thought...

Cheers,

Christian
==== XML syntax for RIF-BLD ====

The XML syntax is specified for each component as a pseudo-schema. The pseudo-schemas use BNF-style conventions for attributes and elements: "?" denotes optionality (i.e. zero or one occurrences), "*" denotes zero or more occurrences, "+" one or more occurrences, "[" and "]" are used to form groups, and "|" represents choice. Attributes are conventionally assigned a value which corresponds to their type, as defined in the normative schema. Elements are conventionally assigned a value which is the name of the syntactic class of their content, as defined in the normative schema.

{{{
#!html
<pre>
&lt;!-- sample pseudo-schema --&gt;
    &lt;<strong>defined_element</strong>
          required_attribute_of_type_string="<em>xs:string</em>"
          optional_attribute_of_type_int="<em>xs:int</em>"? &gt;
      &lt;required_element /&gt;
      &lt;optional_element /&gt;?
      &lt;one_or_more_of_these_elements /&gt;+
      [ &lt;choice_1 /&gt; | &lt;choice_2 /&gt; ]*

    &lt;/<strong>defined_element</strong>&gt;
</pre>
}}}

===== Document, Group and rules =====

A RIF-BLD <tt>Document</tt> consists of an optional <tt>Directive</tt> element and an optional <tt>Group</tt> element with an optional <tt>meta</tt> subelement.

{{EdNote|text=Directives not yet in the XML syntax below}}

A <tt>Group</tt> can contain any number of <tt>sentence</tt> subelements that each contain a rule or any number of nested <tt>Group</tt>s.
Rules are represented by a <tt>Forall</tt> element, an <tt>Implies</tt> element, or any one of the <tt>ATOMIC</tt> elements.

{{EdNote|text=I reached the time limit while I was working on this piece of text. CSMA}}
 quantifier. If a <tt>CLAUSE</tt> in the <tt>RULE</tt> production has a
free (non-quantified) variable, it must occur in the <tt>Var+</tt> sequence. <tt>Frame</tt>, <tt>Var</tt>, <tt>ATOMIC</tt>, and <tt>FORMULA</tt> were defined as part of the syntax for positive conditions in Section [[#sec-ebnf-condition-language|EBNF for RIF-BLD Condition Language]]. In the <tt>CLAUSE</tt> production an <tt>ATOMIC</tt> is treated as a rule with an empty condition part -- in which case it is usually called a ''fact''. Note that, by a definition in Section [[#sec-formulas|Formulas]], formulas that query externally defined atoms (i.e., formulas of the form <tt>External(Atom(...))</tt>) are not allowed in the conclusion part of a rule (<tt>ATOMIC</tt> does not expand to <tt>External</tt>). 

  RULE      ::= [ Forall | Implies | ATOMIC ]
  
 &lt;Group>
     &lt;meta> '''''Frame''''' &lt;/meta>?
     &lt;sentence> [ RULE | Group ] &lt;/sentence>*
 &lt;/Group>
 
  &lt;Forall>
      &lt;declare> '''''Var''''' &lt;/declare>+
      &lt;formula> [ '''''Implies''''' | '''''ATOMIC''''' ] &lt;/formula>
  &lt;/Forall>

  &lt;Implies>
      &lt;if> '''''FORMULA''''' &lt;/if>
      &lt;then> '''''ATOMIC''''' &lt;/then>
  &lt;/Implies>
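
The nesting above (role elements such as <tt>declare</tt> and <tt>formula</tt> wrapping class elements such as <tt>Var</tt> and <tt>Implies</tt>) can be sketched with a few helper functions. This is only an illustration under assumptions: the helper names (<tt>wrap</tt>, <tt>var</tt>, <tt>implies</tt>, <tt>forall</tt>, <tt>group</tt>) are hypothetical, namespaces and the optional <tt>meta</tt> subelement are left out, and the atoms are empty placeholders.

```python
import xml.etree.ElementTree as ET


def wrap(role, child):
    """Wrap a class element (e.g. Var) in a role element (e.g. declare)."""
    role_el = ET.Element(role)
    role_el.append(child)
    return role_el


def var(name):
    """<Var> Any Unicode string </Var>"""
    el = ET.Element("Var")
    el.text = name
    return el


def implies(condition, conclusion):
    """<Implies> with one <if> FORMULA and one <then> ATOMIC."""
    el = ET.Element("Implies")
    el.append(wrap("if", condition))
    el.append(wrap("then", conclusion))
    return el


def forall(variables, formula):
    """<Forall> with one or more <declare> Var and exactly one <formula>."""
    if not variables:
        raise ValueError("Forall needs at least one declared Var")
    el = ET.Element("Forall")
    for v in variables:
        el.append(wrap("declare", v))
    el.append(wrap("formula", formula))
    return el


def group(sentences):
    """<Group> with zero or more <sentence> children (rules or nested Groups)."""
    el = ET.Element("Group")
    for s in sentences:
        el.append(wrap("sentence", s))
    return el


# Placeholder atoms stand in for real ATOMIC content.
rule = forall([var("item")], implies(ET.Element("Atom"), ET.Element("Atom")))
doc_group = group([rule])
```

Calling <tt>ET.tostring(doc_group)</tt> serializes the result for inspection.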

===== Formulas and terms =====

The Condition Language represents formulas that can be used in the body of RIF-BLD rules. 

The production rule for the non-terminal <tt>FORMULA</tt> represents ''RIF condition formulas'' (defined earlier). The XML elements <tt>And</tt> and <tt>Or</tt> are used to represent conjunctions and disjunctions of conditions, respectively. The element <tt>Exists</tt> is used to introduce existentially quantified variables. Here <tt>Var+</tt> stands for the list of variables that are free in <tt>FORMULA</tt>. RIF-BLD conditions permit only existential variables. A RIF-BLD <tt>FORMULA</tt> can also be an <tt>ATOMIC</tt> term, i.e. an <tt>Atom</tt>, <tt>External</tt> <tt>Atom</tt>, <tt>Equal</tt>, <tt>Member</tt>, <tt>Subclass</tt>, or <tt>Frame</tt>. A <tt>TERM</tt> can be a constant, variable, <tt>Expr</tt>, or <tt>External</tt> <tt>Expr</tt>.

The RIF-BLD XML syntax does not commit to any particular vocabulary and permits arbitrary Unicode strings in constant symbols, argument names, and variables. Constant symbols are represented by the <tt>Const</tt> element, where the value of the <tt>type</tt> attribute is a Unicode string that represents an identifier or an alias of the symbol space of the constant, and the content is a Unicode string from the lexical space of that symbol space. Names are denoted by Unicode character sequences. Variables are represented by the <tt>Var</tt> element, which contains a Unicode string. Equality, membership, and subclass terms are represented, respectively, by the <tt>Equal</tt>, <tt>Member</tt> and <tt>Subclass</tt> elements. An <tt>Atom</tt> (resp. <tt>Expr</tt>) element can either represent a positional atom (resp. function) or an atom (resp. function) with named arguments. The <tt>Frame</tt> element is used to represent terms that are composed of an object Id and a collection of attribute-value pairs. The <tt>content</tt> subelement of an <tt>External</tt> element can contain any one of the <tt>ATOMIC</tt> elements when it represents a call to an externally defined predicate, equality, membership, subclassing, or frame. It contains an <tt>Expr</tt> element when it is used to represent a call to an externally defined function.

====== FORMULA ====== 

  FORMULA ::= [ ATOMIC | External | And | Or | Exists ]

  &lt;External>
      &lt;content> '''''Atom''''' &lt;/content>
  &lt;/External>
  
  &lt;And>
      &lt;formula> '''''FORMULA''''' &lt;/formula>*
  &lt;/And>

  &lt;Or>
      &lt;formula> '''''FORMULA''''' &lt;/formula>*
  &lt;/Or>

  &lt;Exists>
      &lt;declare> '''''Var''''' &lt;/declare>+
      &lt;formula> '''''FORMULA''''' &lt;/formula>
  &lt;/Exists>

====== ATOMIC ======

  ATOMIC         ::= [ Atom | Equal | Member | Subclass | Frame ]

  &lt;Atom>
      &lt;op> '''''TERM''''' &lt;/op>
      [ &lt;arg> '''''TERM''''' &lt;/arg>* {ordered} | &lt;slot> '''''Prop''''' &lt;/slot>* ]
  &lt;/Atom>

  &lt;Prop>
      &lt;key> '''''Any Unicode string''''' &lt;/key>
      &lt;val> '''''TERM''''' &lt;/val>
  &lt;/Prop>

  &lt;Equal>
      &lt;side> '''''TERM''''' &lt;/side>
      &lt;side> '''''TERM''''' &lt;/side>
  &lt;/Equal>

  &lt;Member>
      &lt;lower> '''''TERM''''' &lt;/lower>
      &lt;upper> '''''TERM''''' &lt;/upper>
  &lt;/Member>

  &lt;Subclass>
      &lt;lower> '''''TERM''''' &lt;/lower>
      &lt;upper> '''''TERM''''' &lt;/upper>
  &lt;/Subclass>

  &lt;Frame>
      &lt;object> '''''TERM''''' &lt;/object>
      &lt;slot>
          &lt;Prop>
              &lt;key> '''''TERM''''' &lt;/key>
              &lt;val> '''''TERM''''' &lt;/val>
          &lt;/Prop>
      &lt;/slot>*
  &lt;/Frame>

====== TERM ======

  TERM ::= [ Const | Var | External | Expr ]

  &lt;Const type="'''''IRI'''''" [xml:lang="'''''xs:language'''''"]? >
      '''''Any Unicode string'''''
  &lt;/Const>

  &lt;Var> '''''Any Unicode string''''' &lt;/Var>

  &lt;External>
      &lt;content> '''''Expr''''' &lt;/content>
  &lt;/External>
  
  &lt;Expr>
      &lt;op> '''''TERM''''' &lt;/op> 
      [ &lt;arg> '''''TERM''''' &lt;/arg>* {ordered} | &lt;slot> '''''Prop''''' &lt;/slot>* ]
  &lt;/Expr>

==== EBNF Grammar for the Presentation Syntax of RIF-BLD ====

So far, the syntax of RIF-BLD has been specified in mathematical English and XML. Tool developers, however, may prefer EBNF notation, which provides a more succinct overview of the syntax. Several points should be kept in mind regarding this notation. 

* The syntax of first-order logic is not context-free, so EBNF does not capture the syntax of RIF-BLD precisely. For instance, it cannot capture the section on [[#sec-well-formed|well-formedness conditions]], i.e., the requirement that each symbol in RIF-BLD can occur in at most one context. As a result, the EBNF grammar defines a strict ''superset'' of RIF-BLD (not all rules that are derivable using the EBNF grammar are well-formed rules in RIF-BLD). 
* The EBNF syntax is ''not a concrete'' syntax: it does not address the details of how constants and variables are represented, and it is not sufficiently precise about the delimiters and escape symbols. Instead, white space is informally used as a delimiter, and white space is implied in productions that use Kleene star. For instance, <tt>TERM*</tt> is to be understood as <tt>TERM&nbsp;TERM&nbsp;...&nbsp;TERM</tt>, where each ' ' abstracts from one or more blanks, tabs, newlines, etc. This is done intentionally, since RIF's presentation syntax is used as a tool for specifying the semantics and for illustration of the main RIF concepts through examples. It is ''not'' intended as a concrete syntax for a rule language. RIF defines a concrete syntax only for ''exchanging'' rules, and that syntax is XML-based, obtained as a refinement and serialization of the EBNF syntax.
* For all the above reasons, the EBNF syntax is ''not normative''.

<span id="sec-ebnf-condition-language" class="anchor"></span> 

===== EBNF for RIF-BLD =====

The RIF-BLD presentation syntax does not commit to any particular vocabulary and permits arbitrary Unicode strings in constant symbols, argument names, and variables. Constant symbols have the form: <tt>"UNICODESTRING"^^SYMSPACE</tt>, where <tt>SYMSPACE</tt> is a Unicode string that represents an identifier or an alias of the symbol space of the constant, and <tt>UNICODESTRING</tt> is a Unicode string from the lexical space of that symbol space. Names are denoted by Unicode character sequences. Variables are denoted by a <tt>UNICODESTRING</tt> prefixed with a ?-sign.

The EBNF grammar for a superset of the RIF-BLD condition language is as follows.

   FORMULA        ::= 'And' '(' FORMULA* ')' |
                      'Or' '(' FORMULA* ')' |
                      'Exists' Var+ '(' FORMULA ')' |
                      ATOMIC |
                      'External' '(' ATOMIC ')'
   ATOMIC         ::= Atom | Equal | Member | Subclass | Frame
   Atom           ::= UNITERM
   UNITERM        ::= Const '(' (TERM* | (Name '->' TERM)*) ')'
   Equal          ::= TERM '=' TERM
   Member         ::= TERM '#' TERM
   Subclass       ::= TERM '##' TERM
   Frame          ::= TERM '[' (TERM '->' TERM)* ']'
   TERM           ::= Const | Var | Expr | 'External' '(' Expr ')'
   Expr           ::= UNITERM
   Const          ::= '"' UNICODESTRING '"^^' SYMSPACE
   Name           ::= UNICODESTRING
   Var            ::= '?' UNICODESTRING
   SYMSPACE       ::= UNICODESTRING
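
As an illustration of how the <tt>Const</tt> and <tt>Var</tt> productions line up with the corresponding XML elements, here is a minimal sketch that turns a presentation-syntax constant or variable into an element. The function name <tt>parse_simple_term</tt> is hypothetical. Because the grammar does not define escape symbols, the sketch assumes that everything between the first quote and the last <tt>"^^</tt> is the lexical form, and that the symbol space contains no white space.

```python
import re
import xml.etree.ElementTree as ET

# Const ::= '"' UNICODESTRING '"^^' SYMSPACE
# Assumption: no escaping rules; the last '"^^' ends the lexical form.
CONST_RE = re.compile(r'^"(.*)"\^\^(\S+)$', re.DOTALL)


def parse_simple_term(text):
    """Turn a Const or Var in presentation syntax into an XML element."""
    text = text.strip()
    # Var ::= '?' UNICODESTRING
    if text.startswith("?") and len(text) > 1:
        el = ET.Element("Var")
        el.text = text[1:]
        return el
    match = CONST_RE.match(text)
    if match:
        # SYMSPACE goes into the type attribute, UNICODESTRING into the content.
        el = ET.Element("Const", {"type": match.group(2)})
        el.text = match.group(1)
        return el
    raise ValueError("not a Const or Var: %r" % text)


ten = parse_simple_term('"10"^^xsd:integer')
item = parse_simple_term("?item")
```

Here <tt>ten</tt> is a <tt>Const</tt> element whose <tt>type</tt> attribute is <tt>xsd:integer</tt> and whose content is <tt>10</tt>, and <tt>item</tt> is a <tt>Var</tt> element whose content is <tt>item</tt>.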

The presentation syntax for RIF-BLD rules extends the syntax in Section [[#sec-ebnf-condition-language|EBNF for RIF-BLD Condition Language]] with the following productions.

{{EdNote|text=The metadata syntax and the approach to rule identification presented in this draft are currently under discussion by the Working Group.   Input is welcome.  See Issue-51}}

   Document  ::= 'Document' '(' IRIMETA? DIRECTIVE* Group? ')'
   DIRECTIVE ::= Import
   Import    ::= 'Import' '(' IRI PROFILE? ')'
   Group     ::= 'Group' IRIMETA? '(' (RULE | Group)* ')'
   IRIMETA   ::= Frame
   RULE      ::= 'Forall' Var+ '(' CLAUSE ')' | CLAUSE
   CLAUSE    ::= Implies | ATOMIC
   Implies   ::= ATOMIC ':-' FORMULA
   IRI       ::= UNICODESTRING
   PROFILE   ::= UNICODESTRING
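
Each production above has a counterpart among the XML pseudo-schemas, so the presentation syntax can be read off the XML mechanically. The following is a minimal sketch of that idea, not a complete translator: the names <tt>children</tt> and <tt>to_ps</tt> are hypothetical, element names are assumed to be namespace-free as in the pseudo-schemas, and only positional atoms and expressions are handled (no frames, named arguments, groups, or documents).

```python
import xml.etree.ElementTree as ET


def children(el, role):
    """Return the class element wrapped by each <role> child of el."""
    return [wrapper[0] for wrapper in el.findall(role)]


def to_ps(el):
    """Render a subset of the RIF-BLD XML in presentation syntax."""
    tag = el.tag
    if tag == "Const":
        return '"%s"^^%s' % (el.text.strip(), el.get("type"))
    if tag == "Var":
        return "?" + el.text.strip()
    if tag in ("Atom", "Expr"):  # positional form only
        op = to_ps(children(el, "op")[0])
        args = " ".join(to_ps(a) for a in children(el, "arg"))
        return "%s(%s)" % (op, args)
    if tag == "External":
        return "External(%s)" % to_ps(children(el, "content")[0])
    if tag == "Equal":
        left, right = children(el, "side")
        return "%s = %s" % (to_ps(left), to_ps(right))
    if tag in ("Member", "Subclass"):
        sep = "#" if tag == "Member" else "##"
        lower = to_ps(children(el, "lower")[0])
        upper = to_ps(children(el, "upper")[0])
        return lower + sep + upper
    if tag in ("And", "Or"):
        parts = " ".join(to_ps(f) for f in children(el, "formula"))
        return "%s(%s)" % (tag, parts)
    if tag in ("Exists", "Forall"):
        decls = " ".join(to_ps(v) for v in children(el, "declare"))
        return "%s %s (%s)" % (tag, decls, to_ps(children(el, "formula")[0]))
    if tag == "Implies":  # conclusion first, then the condition
        head = to_ps(children(el, "then")[0])
        body = to_ps(children(el, "if")[0])
        return "%s :- %s" % (head, body)
    raise ValueError("unsupported element: " + tag)


SAMPLE = """
<Forall>
  <declare><Var>item</Var></declare>
  <formula>
    <Implies>
      <if>
        <Atom>
          <op><Const type="rif:iri">cpt:unsolicited</Const></op>
          <arg><Var>item</Var></arg>
        </Atom>
      </if>
      <then>
        <Atom>
          <op><Const type="rif:iri">cpt:reject</Const></op>
          <arg><Const type="rif:iri">ppl:Fred</Const></arg>
          <arg><Var>item</Var></arg>
        </Atom>
      </then>
    </Implies>
  </formula>
</Forall>
"""

rendered = to_ps(ET.fromstring(SAMPLE))
```

For this sample, <tt>rendered</tt> is <tt>Forall ?item ("cpt:reject"^^rif:iri("ppl:Fred"^^rif:iri ?item) :- "cpt:unsolicited"^^rif:iri(?item))</tt>.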

'''Example 1''' (RIF-BLD conditions).

This example shows conditions that are composed of atoms, expressions, frames, and existentials. In frame formulas variables are shown in the positions of object Ids, object properties, and property values. For brevity, we use the ''compact URI'' notation <nowiki>[</nowiki>[[#ref-curie|CURIE]]], <tt>prefix:suffix</tt>, which should be understood as a macro that expands into a concatenation of the <tt>prefix</tt>  definition and <tt>suffix</tt>. Thus, if <tt>bks</tt> is a prefix that expands into <tt><nowiki>http://</nowiki>example.com/books#</tt> then <tt>bks:LeRif</tt> should be understood merely as an abbreviation for <tt><nowiki>http://</nowiki>example.com/books#LeRif</tt>.  The compact URI notation is ''not'' part of the RIF-BLD syntax.


 Compact URI prefixes:
 
   bks  ''expands into'' <nowiki>http://</nowiki>example.com/books#
   auth ''expands into'' <nowiki>http://</nowiki>example.com/authors#
   cpt  ''expands into'' <nowiki>http://</nowiki>example.com/concepts#

 Positional terms:
 
   "cpt:book"^^rif:iri("auth:rifwg"^^rif:iri "bks:LeRif"^^rif:iri)
   Exists ?X ("cpt:book"^^rif:iri(?X "bks:LeRif"^^rif:iri))
 
 Terms with named arguments:
 
   "cpt:book"^^rif:iri(cpt:author->"auth:rifwg"^^rif:iri
                       cpt:title->"bks:LeRif"^^rif:iri)
   Exists ?X ("cpt:book"^^rif:iri(cpt:author->?X cpt:title->"bks:LeRif"^^rif:iri))
 
 Frames:
 
   "bks:wd1"^^rif:iri["cpt:author"^^rif:iri->"auth:rifwg"^^rif:iri
                      "cpt:title"^^rif:iri->"bks:LeRif"^^rif:iri]
   Exists ?X ("bks:wd2"^^rif:iri["cpt:author"^^rif:iri->?X
                                 "cpt:title"^^rif:iri->"bks:LeRif"^^rif:iri])
   Exists ?X (And ("bks:wd2"^^rif:iri#"cpt:book"^^rif:iri
                   "bks:wd2"^^rif:iri["cpt:author"^^rif:iri->?X
                                      "cpt:title"^^rif:iri->"bks:LeRif"^^rif:iri]))
   Exists ?I ?X (?I["cpt:author"^^rif:iri->?X "cpt:title"^^rif:iri->"bks:LeRif"^^rif:iri])
   Exists ?I ?X (And (?I#"cpt:book"^^rif:iri
                      ?I["cpt:author"^^rif:iri->?X
                         "cpt:title"^^rif:iri->"bks:LeRif"^^rif:iri]))
   Exists ?S ("bks:wd2"^^rif:iri["cpt:author"^^rif:iri->"auth:rifwg"^^rif:iri
                                 ?S->"bks:LeRif"^^rif:iri])
   Exists ?X ?S ("bks:wd2"^^rif:iri["cpt:author"^^rif:iri->?X
                                    ?S->"bks:LeRif"^^rif:iri])
   Exists ?I ?X ?S (And (?I#"cpt:book"^^rif:iri  ?I["cpt:author"^^rif:iri->?X ?S->"bks:LeRif"^^rif:iri]))
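
The compact URI expansion used in this example is a plain concatenation of the prefix definition and the suffix. A minimal sketch, assuming the three prefixes listed above and a hypothetical helper named <tt>expand_curie</tt> (the compact URI notation is not part of the RIF-BLD syntax, so this is only a reading aid):

```python
# Prefix definitions taken from Example 1.
PREFIXES = {
    "bks": "http://example.com/books#",
    "auth": "http://example.com/authors#",
    "cpt": "http://example.com/concepts#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand prefix:suffix into the prefix definition followed by suffix."""
    prefix, sep, suffix = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError("unknown or missing prefix in %r" % curie)
    return prefixes[prefix] + suffix


expanded = expand_curie("bks:LeRif")
```

Here <tt>expanded</tt> is <tt><nowiki>http://</nowiki>example.com/books#LeRif</tt>, as described in the text of the example.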



<span id="sec-ebnf-rule-language" class="anchor"></span>

'''Example 2''' (RIF-BLD rules).

This example shows a business rule borrowed from the document [[UCR|RIF Use Cases and Requirements]]:
<ul>
   <em>
     If an item is perishable and it is delivered to John more than 10 days
     after the scheduled delivery date then the item will be rejected by him.
   </em>
</ul>
As before, for better readability we use the compact URI notation.

 Compact URI prefixes:
 
   ppl ''expands into'' <nowiki>http://</nowiki>example.com/people#
   cpt ''expands into'' <nowiki>http://</nowiki>example.com/concepts#
   op  ''expands into'' the yet-to-be-determined IRI for RIF builtin predicates

 a. Universal form:
 
    Forall ?item ?deliverydate ?scheduledate ?diffduration ?diffdays (
         "cpt:reject"^^rif:iri("ppl:John"^^rif:iri ?item) :-
             And("cpt:perishable"^^rif:iri(?item)
                 "cpt:delivered"^^rif:iri(?item ?deliverydate "ppl:John"^^rif:iri)
                 "cpt:scheduled"^^rif:iri(?item ?scheduledate)
                 External("fn:subtract-dateTimes-yielding-dayTimeDuration"^^rif:iri(?deliverydate ?scheduledate ?diffduration))
                 External("fn:get-days-from-dayTimeDuration"^^rif:iri(?diffduration ?diffdays))
                 External("op:numeric-greater-than"^^rif:iri(?diffdays "10"^^xsd:integer)))
    )
 
 b. Universal-existential form:
 
    Forall ?item (
         "cpt:reject"^^rif:iri("ppl:John"^^rif:iri ?item ) :-
             Exists ?deliverydate ?scheduledate ?diffduration ?diffdays (
                  And("cpt:perishable"^^rif:iri(?item)
                      "cpt:delivered"^^rif:iri(?item ?deliverydate "ppl:John"^^rif:iri)
                      "cpt:scheduled"^^rif:iri(?item ?scheduledate)
                      External("fn:subtract-dateTimes-yielding-dayTimeDuration"^^rif:iri(?deliverydate ?scheduledate ?diffduration))
                      External("fn:get-days-from-dayTimeDuration"^^rif:iri(?diffduration ?diffdays))
                      External("op:numeric-greater-than"^^rif:iri(?diffdays "10"^^xsd:integer)))
             )
    )


'''Example 3''' (A RIF-BLD group annotated with metadata).

This example shows a group formula that consists of two RIF-BLD rules. The first of these rules is copied from Example 2a. The group is annotated with Dublin Core metadata represented as a frame.

 Compact URI prefixes:
 
   ppl  ''expands into'' http://example.com/people#
   cpt  ''expands into'' http://example.com/concepts#
   dc   ''expands into'' http://purl.org/dc/terms/
   w3   ''expands into'' http://www.w3.org/

 Group "http://sample.org"^^rif:iri["dc:publisher"^^rif:iri->"w3:W3C"^^rif:iri
                                    "dc:date"^^rif:iri->"2008-04-04"^^xsd:date]
   (
 
     Forall ?item ?deliverydate ?scheduledate ?diffduration ?diffdays (
         "cpt:reject"^^rif:iri("ppl:John"^^rif:iri ?item) :-
             And("cpt:perishable"^^rif:iri(?item)
                 "cpt:delivered"^^rif:iri(?item ?deliverydate "ppl:John"^^rif:iri)
                 "cpt:scheduled"^^rif:iri(?item ?scheduledate)
                 External("fn:subtract-dateTimes-yielding-dayTimeDuration"^^rif:iri(?deliverydate ?scheduledate ?diffduration))
                 External("fn:get-days-from-dayTimeDuration"^^rif:iri(?diffduration ?diffdays))
                 External("op:numeric-greater-than"^^rif:iri(?diffdays "10"^^xsd:integer)))
     )
  
     Forall ?item (
         "cpt:reject"^^rif:iri("ppl:Fred"^^rif:iri ?item) :- "cpt:unsolicited"^^rif:iri(?item)
     )
 
   )

<span id="sec-bld-direct-semantics" class="anchor"></span>

Received on Wednesday, 7 May 2008 10:54:43 UTC