Re: RIF-in-RDF: Requirement 4 from Sandro Hawke on 2010-07-25 (public-rif-wg@w3.org from July 2010)

From: Sandro Hawke <sandro@w3.org>
Date: Sun, 25 Jul 2010 18:46:02 -0400
To: kifer@cs.stonybrook.edu
Cc: public-rif-wg <public-rif-wg@w3.org>
Message-ID: <1280097962.1863.179.camel@waldron>
On Sun, 2010-07-25 at 17:57 -0400, Michael Kifer wrote:
> Sandro,
> I don't understand your argument. So, you are proposing that somebody would
> explicitly write
> 
> foo[my:conjuncts->list(bar1 bar2 bar3)]
> 
> If so, why can't the same one write this:
> 
> foo[my:conjuncts->bar1]
> foo[my:conjuncts->bar2]
> foo[my:conjuncts->bar3]
> 
> ?

The key point is that after the translation rules are done, we need to
have inferred something like:

   foo[rif:formulas->list(bar1 bar2 bar3)]

instead of

   foo[rif:formula->bar1]
   foo[rif:formula->bar2]
   foo[rif:formula->bar3]

And we need this, because otherwise we don't know when to stop looking
for more solutions.   With the first form, we can stop as soon as we get
the first match (in the context of a large matching process, trying to
find a match for rif:Document).  

In the second form, however, it's not clear when to stop.  We will only
know we have all the appropriate values for rif:formula when a complete
reasoner has run until termination.  But I think many RIF reasoners will
not be complete, and I think many sets of fallback rules will produce
lots of unwanted solutions, perhaps never terminating.   

If we can stop after finding the first solution, or the first few good
ones (as in the list style), we're okay -- if we have to find them all
before knowing what a correct translation looks like, I think that will
be a problem.
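
A rough sketch of the difference, in Python rather than RIF. Both generators are hypothetical stand-ins for a backward-chaining search, and the "extra" values model unwanted fallback-rule solutions; none of this is a real reasoner API.

```python
from itertools import count, islice
from typing import Iterator, Tuple


def list_style_solutions() -> Iterator[Tuple[str, ...]]:
    """Each solution is one complete value for rif:formulas (a whole list)."""
    yield ("bar1", "bar2", "bar3")
    # Assumed: fallback rules may keep producing alternatives without end.
    for n in count(1):
        yield ("bar1", "bar2", "bar3", f"extra{n}")


def repeated_property_solutions() -> Iterator[str]:
    """Each solution is a single value of rif:formula; the set is never closed."""
    yield "bar1"
    yield "bar2"
    yield "bar3"
    for n in count(1):
        yield f"extra{n}"


# List style: the first match is already a usable conjunction, so we stop.
first_list = next(list_style_solutions())

# Repeated style: any finite prefix looks plausible, but nothing in the data
# says whether more rif:formula values are still to come.
partial = list(islice(repeated_property_solutions(), 2))
```

Here `first_list` is the full three-element list, while `partial` holds only bar1 and bar2 and gives no sign that it is incomplete.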

(Obviously, I'm thinking in terms of a backward-chaining BLD system
here, trying to extract a rif:Document.   I don't understand termination
conditions in PRD well enough to know how to handle this there, or
whether it's even possible.)

    -- Sandro

> michael
> 
> 
> 
> On Sun, 25 Jul 2010 16:49:52 -0400
> Sandro Hawke <sandro@w3.org> wrote:
> 
> > Dave [1], Harold [2], and Michael [3] have all expressed a desire to
> > have the RIF-in-RDF mapping more closely follow the XML syntax.  In
> > particular, they suggest it use repeated properties instead of
> > gathering all the values of the properties into a list.
> > 
> > I'm extremely sympathetic to this desire.  If you look back at the
> > history of the web page, you'll see this is what my first version did,
> > and then I stalled out for months as I realized it wouldn't work.
> > Eventually I decided I just had to go ahead with the list-based
> > approach that's currently in the document.
> > 
> > The compelling problem for me is that, with repeated properties, it
> > is not possible (as far as I know) to reliably transform a RIF
> > document using an incomplete reasoner.  I've called this
> > "Requirement 4" in RIF-in-RDF [4].
> > 
> > Let me back up and explain what I'm trying to do and why I think it's
> > important.
> > 
> > In my talks and writing about RIF to Semantic Web audiences, I explain
> > that the place I think RIF is essential is data transformation.  With
> > RIF, we can allow interoperation between vocabularies.  My standard
> > example is that FOAF has a foaf:name property, and it also has
> > foaf:firstName and foaf:lastName.  When you're producing FOAF data,
> > which should you use?  When you're consuming FOAF data, which should
> > you look for?  In both cases, if you want interoperability, you have
> > to do both.  When there are only two options, and everyone knows about
> > them, that's okay.  But what happens when the third, fourth, and fifth
> > "standard" properties for representing names come along?  It's a
> > nightmare; the fact that the producer and consumer are both using RDF
> > ends up not buying you very much at all.
> > 
> > But RIF can solve this problem.  By having the ontology documents for
> > each of the terms include some RIF (via rif:importWithProfile), the folks
> > deploying new properties can express how they map data to alternative
> > properties.  (In this case, with some string operations.)  Now,
> > data-consuming systems which implement RIF can automatically get the
> > data in exactly the vocabulary they want.
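
A minimal sketch of the kind of string operation meant here, in Python rather than RIF. The dictionary representation and the function name `add_full_name` are assumptions made for illustration; a real deployment would express the mapping as a RIF rule attached to the vocabulary.

```python
from typing import Dict


def add_full_name(person: Dict[str, str]) -> Dict[str, str]:
    """Derive foaf:name from foaf:firstName and foaf:lastName when it is absent."""
    result = dict(person)  # leave the input data untouched
    has_parts = "foaf:firstName" in result and "foaf:lastName" in result
    if "foaf:name" not in result and has_parts:
        result["foaf:name"] = f'{result["foaf:firstName"]} {result["foaf:lastName"]}'
    return result


producer_data = {"foaf:firstName": "Sandro", "foaf:lastName": "Hawke"}
consumer_view = add_full_name(producer_data)
```

A consumer that only looks for foaf:name can then read `consumer_view` without knowing which properties the producer chose.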
> > 
> > I think this is a very compelling use case.  In fact, without this
> > mechanism (or an equivalent one) I don't see how the Semantic Web can
> > work at all.  More recently, I've started using another example (which
> > I mentioned on a recent telecon), where Facebook's Open Graph Protocol
> > uses RDF with a different style of modeling than most of the Semantic
> > Web; here, again, RIF can provide interoperability via translation
> > rules.
> > 
> > Now, imagine we have this all in place.  Lots of RDF data out there,
> > using various vocabularies.  When you dereference the terms you find
> > some RIF that lets you translate between them, so it's all roughly
> > interoperable.  Of course, not every vocabulary can be mapped; some
> > aren't well understood enough to formalize, etc.  But many can be
> > translated.  This allows new vocabularies to be deployed, and the
> > overall system to grow and evolve in place.
> > 
> > Now, remember the RIF extensibility requirement?  In the current
> > design, we met it by providing may-ignore and must-understand
> > extensions via annotations and new xml elements.  This works, but only
> > in very broad strokes.  We have no "graceful" fallback.  Extensions
> > can't offer syntactic sugar, and they certainly can't offer features
> > which can be approximated.  This mechanism may not be good enough to
> > allow extensions to really be deployed on the open Web.  We talked
> > about all this years ago, but decided we didn't have time to work out
> > all the details, and that it could wait.
> > 
> > So, as you may have guessed by now, I want to provide RIF
> > extensibility the same way I want to provide FOAF name extensibility:
> > with RIF translation (fallback) rules.
> > 
> > I'll walk through this, below, but here's the punchline: I think it
> > works fine with the list-style of RIF-in-RDF, but I don't think it can
> > be done with the repeated-properties style.  This is why I need the
> > lists.
> > 
> > I have a few ideas of transformations I want right now...
> > 
> >   - automatically add universal quantification to free variables
> >   - extend frames to allow for context/named-graphs (cf Decker's TRIPLE)
> >   - convert some kinds of rules between PRD and BLD (trading off
> >     between new() and logic functions)
> >   - convert logic functions to builtin list operations (I think this
> >     can be done; not sure) getting more of BLD into Core
> >   - standard rewritings: get rid of conjunction in rule heads, disjunction
> >     in rule bodies, Skolemize
> >   - re-write out named-argument-uniterms
> > 
> > ... but they're all too complex to use as first illustrations.  For
> > that I'll use something that's ridiculous, but pleasantly simple:
> > 
> >   - Allow people to use the term my:Conjunction instead of rif:And.   Also,
> >     use my:conjunct instead of rif:formula inside it.
> > 
> > Before actually writing the transformation rule, we have to decide
> > what the transformations are going to look like in RIF.   Some options:
> > 
> >    1.  in place, new and old, overlapping; the new data (the output)
> >        is distinguished by using different properties and/or classes.
> >    2.  copy the whole document, with changes
> >    3.  ...   maybe some other approaches?
> > 
> > Let's try (1) first, since it's more terse.  Our input looks like
> > this:
> > 
> >       ...
> >       <if>       <!-- or something else that can have an And in it -->
> >          <my:Conjunction>
> >              <my:conjunct>$1</my:conjunct>
> >              <my:conjunct>$2</my:conjunct>
> >              ...
> >          </my:Conjunction>
> >       </if>
> >       ...
> > 
> > and we'll just "replace" the element names.
> > 
> > However, since we don't have a way to "replace" things in this
> > "overlapping" style, we'll just add a second <if> property, and the
> > serializer or consumer will discard the original one, since it contains
> > an element not allowed by the dialect syntax.
> > 
> > So, the rule will add new triples, but leave the old ones intact.
> > The rule will leave us with this:
> > 
> > 
> >       ...
> >       <if>       <!-- or something else that can have an And in it -->
> >          <my:Conjunction>
> >              <my:conjunct>$1</my:conjunct>
> >              <my:conjunct>$2</my:conjunct>
> >              ...
> >          </my:Conjunction>
> >       </if>
> >       <if>      <!-- the same property, whatever it was -->
> >          <And>
> >              <formula>$1</formula>
> >              <formula>$2</formula>
> >              ...
> >          </And>
> >       </if>
> >       ...
> > 
> > Here's the rule:
> > 
> >  forall ?parent ?prop ?old ?conjunct ?new
> >  if And(
> >    ?parent[?prop->?old]
> >    my:Conjunction#?old[my:conjunct->?conjunct]
> >    ?new = wrapped(?old)  <!-- use a logic function to create a new node -->
> >  ) then And (
> >    ?parent[?prop->?new]
> >    rif:And#?new[rif:formula->?conjunct]
> >  )
> > 
> > This works fine, as long as the reasoning is complete.  However, if
> > the reasoning is ever incomplete, we end up with undetectably
> > incorrect results.  Rules that were "if and(a b c) then d" might get
> > turned into "if and(a b) then d"!
> > 
> > I don't think it's sensible to expect reasoners to be complete.  It's
> > great to have termination conditions arise from the rules; it's not
> > good to require the reasoner to run until it knows all possible
> > inferences have been made.  With the above approach, there's no
> > termination condition other than "make all the inferences possible".
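
A small Python sketch of this failure mode. The triple representation, the `max_steps` budget standing in for an incomplete reasoner, and both function names are assumptions for illustration only.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def translate_per_conjunct(facts: Set[Triple], max_steps: int) -> Set[Triple]:
    """Fire the repeated-property rule at most max_steps times.

    Each firing copies a single my:conjunct value to rif:formula on the
    wrapped node, mirroring one application of the rule.
    """
    derived: Set[Triple] = set()
    matches = sorted(t for t in facts if t[1] == "my:conjunct")
    for subject, _, conjunct in matches[:max_steps]:
        derived.add((f"wrapped({subject})", "rif:formula", conjunct))
    return derived


def extract_conjunction(facts: Set[Triple], node: str) -> List[str]:
    """Collect every rif:formula value currently known for node."""
    return sorted(o for s, p, o in facts if s == node and p == "rif:formula")


facts = {
    ("c1", "my:conjunct", "a"),
    ("c1", "my:conjunct", "b"),
    ("c1", "my:conjunct", "c"),
}
complete = extract_conjunction(translate_per_conjunct(facts, 3), "wrapped(c1)")
truncated = extract_conjunction(translate_per_conjunct(facts, 2), "wrapped(c1)")
```

With a budget of two firings, `truncated` holds only a and b, and nothing in the derived triples marks it as incomplete, which is how "if and(a b c) then d" can become "if and(a b) then d".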
> > 
> > Alternatively, if we use the list encoding, the rule is very similar:
> > 
> >  forall ?parent ?prop ?old ?conjuncts ?new
> >  if And(
> >    ?parent[?prop->?old]
> >    my:Conjunction#?old[my:conjuncts->?conjuncts]
> >    ?new = wrapped(?old)
> >  ) then And (
> >    ?parent[?prop->?new]
> >    rif:And#?new[rif:formulas->?conjuncts]
> >  )
> > 
> > ... but now we can set a termination condition: if a RIF document in
> > the desired dialect *can* be extracted, then you're done.
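
A Python sketch of why the list encoding gives a usable stopping point. The dictionary of facts, the `max_steps` budget modelling an incomplete reasoner, and the function names are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# (subject, property) -> whole list value
Facts = Dict[Tuple[str, str], Tuple[str, ...]]


def translate_list(facts: Facts, max_steps: int) -> Facts:
    """Fire the list-style rule at most max_steps times.

    One firing copies the entire my:conjuncts list to rif:formulas on the
    wrapped node, so a conjunction is translated whole or not at all.
    """
    derived: Facts = {}
    matches = sorted(k for k in facts if k[1] == "my:conjuncts")
    for subject, _ in matches[:max_steps]:
        key = (f"wrapped({subject})", "rif:formulas")
        derived[key] = facts[(subject, "my:conjuncts")]
    return derived


def try_extract(facts: Facts, node: str) -> Optional[Tuple[str, ...]]:
    """Return the translated conjunction, or None if it is not derived yet."""
    return facts.get((node, "rif:formulas"))


facts: Facts = {("c1", "my:conjuncts"): ("a", "b", "c")}
not_yet = try_extract(translate_list(facts, 0), "wrapped(c1)")
done = try_extract(translate_list(facts, 1), "wrapped(c1)")
```

An interrupted run yields `None` for `not_yet` rather than a shortened conjunction, so the consumer can tell "keep going" apart from "finished".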
> > 
> > A few notes:
> > 
> >     * I've included the types (like rif:And) for now.  Whether to do
> >       that is a separate issue (specifically ISSUE-101).
> > 
> >     * It's okay to have the rules produce multiple valid RIF
> >       documents; you can stop after generating one, but you can also
> >       continue.  If there's some kind of weighting on the rules (cf
> >       XTAN's "impact" mechanism) you can search for a solution that's
> >       better than some others.  It may be possible to efficiently
> >       direct this search towards the best solution; I'm not sure.
> > 
> >     * I don't think the copy-the-whole-document approach to
> >       translation helps at all.  There, instead of attaching the new
> >       node to the same parent, we attach it to a new parent, and we
> >       end up with a whole new tree.  But still, branches of the tree
> >       are generated by separate rule applications, so an incomplete
> >       reasoner may produce incomplete (wrong) output trees.
> > 
> > I think that's it.  I trust y'all will point out any confusing or
> > incorrect elements of this argument.
> > 
> >       -- Sandro
> > 
> > [1] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0015
> > [2] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0017
> > [3] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0018
> > [4] http://www.w3.org/2005/rules/wiki/RIF_In_RDF#Requirements
> > 
> > 
> > 
> > 
> > 
>
Received on Sunday, 25 July 2010 22:46:12 UTC