1605 – The inference engine is not specified (only the rule base it uses is)

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 1605 - The inference engine is not specified (only the rule base it uses is)

Summary: The inference engine is not specified (only the rule base it uses is)

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Formal Semantics 1.0 (show other bugs)
Version:	Last Call drafts
Hardware:	PC Windows 2000

Importance:	P2 normal
Target Milestone:	---
Assignee:	Jerome Simeon
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2005-07-14 23:56 UTC by Fred Zemke
Modified:	2005-09-27 13:56 UTC (History)
CC List:	0 users

See Also:

Attachments

Description Fred Zemke 2005-07-14 23:56:33 UTC

2.1.3 Notations for inference rules
It says "Inference rules express the logical relation between
judgments and describe how complex judgments can be concluded from
simpler premise judgments."  This tells when an inference CAN be
made, but it does not tell me when an inference MUST be made.

There are a few places where you want the application of
inferences to be nondeterminstic, for example, whether an "and"
or "or" expression raises an error, or which error is propagated
from a function invocation with multiple errors.  However, for
the most part, you do not intend to leave the selection of which
inferences to draw entirely to the implementation's discretion.
Otherwise an implementation could simply refuse to draw some
conclusions and (for example) not complete the evaluation of 
some evaluatable expression, perhaps raising a gratuitous 
error, and still claim compliance.  This should be clarified.

Here are some general principles you might consider:

a) Any inference whose conclusion is of the form
statEnv |- Expr : Type that can be drawn during static analysis,
must be drawn during static analysis.

b) If the Static Typing Feature is supported, then any inference
whose conclusion is of the form
statEnv |- Expr raises TypeError that can be drawn during static
analysis, must be drawn during static analysis.

c) Any inference whose conclusion is of the form
dynEnv |- ... that can be drawn during dynamic evaluation, must be
drawn during dynamic evaluation.

Principle c) would seem to exclude certain desired nondeterminisms.
I think the solution is to build the nondeterminism into the
judgment through an "implementation-dependent choice" operator.  Thus

dynEnv |- Expr1 raises Error
dynEnv |- Expr2 => false
----------------------------
dynEnv |- choice (Expr1 and Expr2 => false,
                  Expr1 and Expr2 raises Error)

An alternative approach would be, instead of rule c), say
"if an Expr passes static analysis phase, then the dynamic
evaluation phase must draw inferences until either a judgment
of the form 'dynEnv |- Expr => Value' or a judgment of the form
'dynEnv |- Expr raises an error' is reached."

Comment 1 Jerome Simeon 2005-07-20 19:08:23 UTC

This is a very good comment. This would probably help the reader a lot to have
such a description. This is mostly of editorial nature, but would make the whole
specification much more approachable. Let me try and sketch a description for
the 'inferrence engine' as you call it.

1. What does an inferrence rule mean. (See comment also on bug [1612]).

statEnv |- Expr1 : Type1
statEnv |- Expr2 : Type2
------------------------
statEnv |- Expr1, Expr2 : Type 1, Type2

logically means:

IF
  statEnv |- Expr1 : Type1
AND
  statEnv |- Expr2 : Type2
THEN
  statEnv |- Expr1, Expr2 : Type 1, Type2

The important point is the use of conjunctions between the preconditions.

2. What if you have several inferrence rules?

a set of rules is essentially a disjunction:

dynEnv |- Expr1 => false
--------------------------------------
dynEnv |- Expr1 and Expr2 => false

dynEnv |- Expr2 => false
--------------------------------------
dynEnv |- Expr1 and Expr2 => false

logically means:

IF
  dynEnv |- Expr1 => false
THEN
  dynEnv |- Expr1 and Expr2 => false

OR

IF
  dynEnv |- Expr2 => false
THEN
  dynEnv |- Expr1 and Expr2 => false

3. How does the inferrence engine works

You take the disjunction of all the rules and your try and 'prove'
that a judgment is true or not.

For static typing, typically, you are trying to prove:

statEnv |- Expr : Type

For the given expression, the inferrence rules will tell you if
this is true or not. This is true if you can logically infer that the
judgment holds, and results in a given type. If you cannot draw an inferrence,
then the judgement does not hold and static typing fails.

for example, take the following XQuery expression:

fn:count((1,2,3))

you are trying show that statEnv |- fn:count((1,2,3)) : Type holds

you can draw the inference as follows:

  statEnv |- 1 : xs:integer  (from typing of literals)
  statEnv |- 2 : xs:integer  (from typing of literals)
  ---------------------------------------- (from typing of sequence constr.)
    statEnv |- 1,2 : xs:integer, xs:integer
    statEnv |- 3 : xs:integer
    -----------------------------------------------------(sequence constr.)
    statEnv |- 1,2,3 : xs:integer, xs:integer, xs:integer

    declare function fn:count($x as item()*) as xs:integer
    statEnv |- xs:integer,xs:integer,xs:integer <: item*
    ----------------------------------------------------------(function call)
    statEnv |- fn:count((1,2,3)) : xs:integer

So the static typing succeeds and results in the static type of xs:integer

Here is another example:

   fn:nilled((1,2,3))

Here again, the inferrence can proceed up to:
    ....
    -----------------------------------------------------(sequence constr.)
    statEnv |- 1,2,3 : xs:integer, xs:integer, xs:integer

But there is no rule that can infer the type of fn:nilled((1,2,3)), notably
because the static typing rules for function calls will only hold if
xs:integer,xs:integer,xs:integer <: node?
which is false.

Would something along those lines help?

= Jerome

Comment 2 Jerome Simeon 2005-07-20 23:46:17 UTC

As an addition:

we should have a 4. item which talks about non-determinism and errors.

Notably:
  * An error is either raised when you reach a conclusion of the form 'Expr
raises Error' or when no conclusion can be reached.

  * In the cases several conclusions can be reached, the result is
non-deterministic. Note that this can only occur for the dynamic semantics, not
the static semantics.

For instance, consider the expression:

  true() or fn:error()

The inference rules can allow to infer both:

  dynEnv |- true() or fn:error() raises Error

as well as

  dynEnv |- true() or fn:error() => true()

and both results are acceptable for the query.

- Jerome

Comment 3 Fred Zemke 2005-07-20 23:55:41 UTC

What you wrote is probably useful, but I had in mind something
like the following.  There are still gaps in this 
description because I am not sure this permits the following:

1) static type error if no inference of the form 
statEnv |- Expr : Type can be deduced.

2) short circuit evaluation of some expressions

I wrote this without seeing your second reply, which goes into
some of this.  But I must quit now. 

The group can kick this around.

Execution model of the inference engine

This specification does not specify a particular order of execution.
Instead, it uses a theoretical device called an inference engine.
There are two inference engines, one for the static analysis phase,
and the other for the dynamic evaluation phase.

An inference engine consists of the following components:

1. The rule base (a collection of concrete inferences);

2. The deductions (a collection of concrete judgments);

3. The rule processor (an algorithm for utilizing the rule base
to populate the deductions).

4. The objectives.  The inference engine stops when either:
a) a deduction that is an objective is placed in the collection 
of deductions, or
b) no further deductions can be drawn.

A rule base consists of a theoretically
infinite number of inferences.  Since there are infinitely 
many possible inferences, it is impossible for this specification
to enumerate all of them.  Instead, this specification provides
a finite number of inference patterns.  Each inference pattern
has some finite number of premise patterns (possibly zero) and one
conclusion pattern.  Premise patterns and conclusion patterns
can be viewed as templates from which concrete judgments can be
derived by substituting concrete syntax or values for the 
italicized variables in the premise or conclusion pattern.

For example, here is a possible premise or conclusion pattern:

dynEnv |- Expr => Value

This pattern is a shorthand for the infinite number of judgments 
that can be obtained by substituing concrete objects for the 
two variables, Expr and Value.  For example, "a/b" is a concrete
instance of Expr, and "2" is a concrete instance of Value.  
Substituting these two objects in the pattern produces the 
concrete judgment

dynEnv |- a/b => 2

Note that this judgment is not necessarily true.  It is the role
of the inference engine to find true judgments, out of the space
of all possible judgments.  The inference engine seeks to 
discover true judgments.  Any true judgment that the inference
engine discovers is placed in the collection of
deductions.

Also note that this judgment might be true in some contexts and
might be false in other contexts.  This fact is handled by 
initializing the inference engine with the information about
the initial state of the static and dynamic environment groups.

So far we have seen an example of a pattern for a single judgment.
Such patterns are grouped into an iference pattern.  An
example of an inference pattern is

dynEnv |- Expr1 => Value1
dynEnv |- Expr2 => Value2
-------------------------
dynEnv |- (Expr1, Expr2) => (Value1, Value2)

This inference pattern is a template from which concrete 
inferences such as the following may be constructed:

dynEnv |- a/b => 2
dynEnv |- x/y => 3
-------------------
dynEnv |- (a/b, x/y) => (2, 3)

This specification contains many inference patterns.  Each inference
pattern denotes a collection of concrete inferences, consisting of
all inferences that can be obtained by substituting appropriate
objects for the variables in the inference pattern.  Every possible
combination of objects that substitute for the variables
constitutes a separate concrete inference.

It must also be recognized that there are infinitely many possible
static and dynamic environment groups.  Each  concrete static
environment group or concrete dynamic environment group is a 
possible value for the statEnv or dynEnv in judgments.  Thus a
judgment pattern such as dynEnv |- Expr => Value denotes the 
infinite collection of judgments that a particular dynamic 
environment group evaluates a particular expression to a value
(and that concrete judgment may be true or false).

The inference engine for the static analysis phase uses only 
those inference patterns whose conclusion begins with 
either "|-" or "statEnv |-".  The inference engine for the dynamic
phase uses all inferences in this specification.

The rule processor may be visualized as a collection of 
"daemons".  Each daemon corresponds to a particular concrete 
inference.  If D is the daemon for a concrete inference I, then
the behavior of D is as follows: if D detects that all the 
premises of I are in the pool of deductions, then D places the
conclusion of I in the pool of deductions.

The initial state of the inference engine is that the pool of 
deductions is empty.  From this state, any concrete inference
with no premises may be placed into the pool of deductions
by its daemon.  This will provide a collection of judgments
which will enable other daemons to "fire" and place more 
judgments in the pool of deductions.  
This provides the seed to start the recursive 
activity of an inference engine.

Once started, the inference engine could continue forever, 
expanding its collection of deductions.  However, the user is
only interested in learning certain deductions, so the 
inference engine can stop when the objective set by the user
has been deduced.  The [query/expression] supplied by the
user determines the objectives.

In the static analysis phase, the objectives are to find 
a deduction matching either of these patterns:

statEnv |- Expr : Type

or 

statEnv |- Expr raises a type error

where statEnv is the initial static environment group and Expr
is the user's [query/expression].  

In the dynamic evaluation phase, the objective are to find
a deduction matching any of these patterns;

dynEnv |- Expr => Value

or

dynEnv |- Expr raises a type error

or

dynEnv |- Expr raises a dynamic error

where dynEnv is the initial dynamic environment group.

Comment 4 Jerome Simeon 2005-07-22 16:29:17 UTC

Fred,

The WGs have discussed this thread today and is leaning toward adding
explainatory material along the lines I suggested in my additional comment #1,
and not go all the way to add the kind of description for an 'inference' engine
that you suggest. The rationale for this is that the FS is more declarative in
nature that what you suggest, and that it would require more work.

Please let us know if you could live with this before Wednesday next week if
possible so we can process this issue at our next telcon.

Best,
- Jerome

On behalf of the XML Query and XSL WGs

Comment 5 Fred Zemke 2005-07-25 20:45:28 UTC

I can live with your "declarative description", with a little
tightening of the description, as follows:

Your point 1, "what does an inference rule mean".  What you wrote 
is good, but I would add my proposal, that the judgments and 
inferences in FS are patterns, which represent 
every possible instantiation of the variables with concrete
values matching the BNF non-terminal indicated by the name of
the variable.  For example, the judgment pattern

  statEnv |- Expr1 : Type1
  statEnv |- Expr2 : Type2
  ------------------------
  statEnv |- Expr1, Expr2 : Type1, Type2

has as one of its concrete instantiations

  statEnv |- (a/b) : xs:integer+
  statEnv |- (c + d) : xs:integer
  --------------------------------
  statEnv |- (a/b), (c+d) : xs:integer+, xs:integer

This adddresses the issue about the implicit quantification
of the inferences (more about that in my response to #1612).

Yor point 2, "what if you have several inferences?". 
I question whether you really mean that the implicit conjunction
between inferences is OR.  Consider your example:

  dynEnv |- Expr1 => false
  --------------------------------------
  dynEnv |- Expr1 and Expr2 => false

  dynEnv |- Expr2 => false
  --------------------------------------
  dynEnv |- Expr1 and Expr2 => false

This does not mean OR, ie, that at least one of these inferences is 
true.  Both of the inferences are true.  For example, consider
the concrete instantiations with "2=3" as Expr1 and "1=1" as 
Expr2.  Then the two inferences are:

  dynEnv |- 2=3 => false
  ----------------------------
  dynEnv |- 2=3 and 1=1 => false

  dynEnv |- 1=1 => false
  ----------------------------
  dynEnv |- 2=3 and 1=1 => false

Both of these inferences are true.  The second one is true due to
the mathematical definition of "material implication", ie,
either "dynEnv |- 1=1 => false" is false, 
or "dynEnv |- 2=3 and 1=1 => false" is true.

What may be deceptive about this example is that the second
inference, though true, is also irrelevant
because you can never deduce "dynEnv |- 1=1 => false".
This is an issue in general with material implication, that there
are many irrelevant cases.  Implementations will of course not
need to worry about ever utilizing an inference with a false
premise, but there is no simple way to weed them out of the 
specification (or mathematics) so we just live with it.

Further, I think that saying the implicit conjunction is 
OR actually breaks the specification.  This is because, if 
P OR Q, where P and Q are statements, you do not know whether
it is P that is true, or Q, or both.  You do mean that both are
true, so P AND Q is the correct conjunction between inferences.

I think when you proposed OR, what you meant was "the implementation
gets to pick which inferences to use in making its deductions".
I expressed this concept by providing asynchronous daemons
that fire when they see that their premise(s) are fulfilled.  
(To model that an implementation ignores a particular 
inference, simply let the daemon for that inference be so slow
to fire or with such a low priority that other inferences are 
drawn first.)  Anyway, this is generally true about inference engines,
that there is an element of choice when deciding which inference
to use.  Take for example
a working mathematicican, who is an inference engine whose rule
base is all the mathematics he has learned.  When tackling a 
problem, he may or may not happen to use a particular inference
known to him.  But this does not mean that he regards the 
collection of truths that he knows as logically conjoined by 
OR.  Rather, he simply has a choice about how to reach his 
objective.

Your point 3, "how does an inference engine work".  In this
section you say "For static typing, typically, you are trying to 
prove: 'statEnv |- Expr : Type' ".  I have an issue with the word 
"typically", because it suggests that there is no definite 
target for the inference engine, and to that extent, leaves the 
specification vague.  "Typically" does not connote a normative
specification.

To fix this, I would tighten this to 
say "The objective of the inference engine is to 
deduce either a judgment of the form 'statEnv |- Expr : Type', where
statEnv is the initial static environment group, Expr is the
[query/expression] submitted by the user, and Type is to be 
discovered by the inference engine, or to conclude that there
is no such Type."  

I realize that the inference engine may have other judgments 
that it attempts to prove along the way to its final objective.
This is probably why you inserted the "typically".  If you want
to capture this thought, I suggest a further sentence such as
"The inference engine may also attempt to deduce additional 
judgments as steps in its pursuit of its overall objective."

Your point 4, "about non-determinism and errors", this is good
and important.  Concerning the first subpoint "An error is either 
raised when you reach a conclusion of the form 'Expr
raises Error' or when no conclusion can be reached." My question
is whether an error is raised only for the the failure to
reach the overall objective, or could an error be raised for
the failure to reach an intermediate objective?  I don't know
the answer to this.

Comment 6 Jerome Simeon 2005-07-26 15:02:17 UTC

Dear Fred,

I agree with most of your changes. I think they are all good suggestions. Again,
I think I would like to stay away from any description which is too procedural
or implementation oriented (such as introducing something like a 'deamon').

It seems that we are mostly in agreement though, and that the remaining part
could be left to editorial discretion.

Best,
- Jerome

Comment 7 Michael Rys 2005-07-26 15:30:07 UTC

The working group endorses Jerome's response and will resolve the issue.

Please close the issue if you agree with the resolution or reopen the issue if 
you disagree.