Re: [TF-ENT] Agenda 24th Feb teleconf from Axel Polleres on 2010-02-24 (public-rdf-dawg@w3.org from January to March 2010)

From: Axel Polleres <axel.polleres@deri.org>
Date: Wed, 24 Feb 2010 09:18:30 +0000
To: "Sandro Hawke" <sandro@w3.org>
Cc: "Birte Glimm" <birte.glimm@comlab.ox.ac.uk>, "SPARQL Working Group" <public-rdf-dawg@w3.org>
Message-Id: <F505ACB3-9DF5-464B-B9DF-8324BA1F209C@deri.org>
Hi all, 

some more thoughts.

On 24 Feb 2010, at 06:29, Sandro Hawke wrote:

> 
> A few thoughts, in advance.  Some of this is very much over my head; I'm
> trying....
> 
> > * RIF issues
> >   o Will/should RIF be marked as "at risk" depending on the RIF WG note
> >     about the RIF-to-RDF mapping? What is the status of the RIF to RDF
> >     mapping? Will there be something like rif:import?
> 
> I'm still hard at work on the mapping.  I did a first cut, implemented
> with round tripping, etc, but it wasn't right in a way I'm not sure how
> to formally characterize.  But I'll try: if you encode a RIF document D
> with id U in an RDF graph G, and G entails G', and G' contains an
> encoding of a RIF Document D' with id U, then D entails D'.  Maybe D=D'.
> 
> Less formally, if you put a bunch of RIF documents into a triple store,
> run "sound" rules over the triple store, and try to extract your RIF
> documents again, you'll either get something ill-formed or something
> that's essentially the same as you started with.
> 
> My first attempt, the "obvious" mapping, has a lot of multi-valued
> properties.  For example the zero-or-more conjuncts for an And term are
> each connected to the And term by a rif:formula arc.  If you simply drop
> a triple when extracting D, you'll get a ruleset with a completely
> different meaning.
> 
> My new version uses a lot of rdf:Lists to avoid this problem.  This is a
> little clumsy viewed as triples, but it has several advantages.  It
> leads to a much more direct object mapping for non-RDF systems, and it
> makes RIF documents easier to process (correctly) using rules.
> 
> Anyway, the mapping is perhaps orthogonal to the Ent issue, since graphs
> which merely describe RIF documents don't have any special semantics.
> For the semantics, yeah, we want rif:imports, as Axel has mentioned.
> 
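
To make the point about multi-valued properties versus rdf:Lists concrete, here is a minimal sketch, assuming triples are plain Python tuples. The property name `rif:formulas`, the blank-node labels, and both helper functions are hypothetical illustrations, not the actual mapping under discussion.

```python
# Two hypothetical encodings of And(p, q, r) as (subject, predicate, object) tuples.
MULTI = [
    ("_:and", "rif:formula", "p"),
    ("_:and", "rif:formula", "q"),
    ("_:and", "rif:formula", "r"),
]
LISTED = [
    ("_:and", "rif:formulas", "_:c1"),  # assumed property name, for illustration
    ("_:c1", "rdf:first", "p"), ("_:c1", "rdf:rest", "_:c2"),
    ("_:c2", "rdf:first", "q"), ("_:c2", "rdf:rest", "_:c3"),
    ("_:c3", "rdf:first", "r"), ("_:c3", "rdf:rest", "rdf:nil"),
]

def conjuncts_multivalued(triples, node):
    """Collect every rif:formula value; a missing triple goes unnoticed."""
    return sorted(o for s, p, o in triples if s == node and p == "rif:formula")

def conjuncts_list(triples, node):
    """Walk the rdf:List; a missing triple raises KeyError (ill-formed)."""
    index = {(s, p): o for s, p, o in triples}
    cell = index[(node, "rif:formulas")]
    items = []
    while cell != "rdf:nil":
        items.append(index[(cell, "rdf:first")])
        cell = index[(cell, "rdf:rest")]
    return items

def drops_are_detected(triples, extract, node="_:and"):
    """True if removing any single triple makes extraction fail."""
    for i in range(len(triples)):
        try:
            extract(triples[:i] + triples[i + 1:], node)
        except KeyError:
            continue
        return False
    return True
```

With the multi-valued encoding, dropping one triple silently yields a different conjunction; with the list encoding, every single-triple drop is caught as ill-formed.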

One reason why I am still slightly afraid of the RIF/RDF embedding, and which speaks in favour of
simply using rif:imports, is:

if you encode RIF into triples of the graph at hand, these triples will also contribute to
the rule-based answers/inferences. This would bring us into the same issues as OWL has, i.e.
there would suddenly be two ways to deal with this:
(a) simply treating everything as RDF (RDF-based semantics, i.e. "RIF/RDF Full")
(b) first extracting the rules and not considering them as part of the graph (which includes checking whether the RIF rules indeed form a well-formed ruleset, etc.), and then using the RDF/OWL combination semantics as defined in [1] (RIF "direct semantics"?)

I am not sure whether I want to get into the issues of (a) when we still have a lot to sort out for (b) alone,
for example a reasonable restriction to finite answers.


> >   o Entailment regimes have to define which graphs they accept. Will the RIF
> >     entailment regime work with all RDF graphs? Different lists in RDF and
> >     RIF?
> 
> I think all graphs, yeah, at least for some import profiles.

See above: when going for (a), we should IMO clearly restrict to graphs that let you extract
a well-formed ruleset; for (b), accepting any graph is easier, IMO.

> 
> >   o Will each rule set be an entailment regime, e.g., the SD says something
> >     like: myEndpoint sd:EntailmentRegime <http://example.org/myRules.rif>?

If we go for (b), I would see the allowed RIF rulesets as similar to the "graph universe", that is, one might
want to define/advertise which RIF rulesets the endpoint has access to.
If we go for (a), I think this question doesn't matter... but it does matter which imports profile (cf. [1])
the graph store supports.

> >     Or is there a suitable RIF entailment relation (RIF+RDF semantics) and
> >     one specifies a rule set in a from clause or in the data set? Which RIF
> >     profiles does that cover? This might affect the condition on extensions
> >     to BGP matching that requires that
> >     SG E-entails (SG union P1(BGP1) union ... union Pn(BGPn))
> 

I am not sure I understand that question.

> I'm not following the math, but it seems to me the main thing we want is
> a single RIF entailment relation, which uses rif:imports in the data to
> bring in whatever other entailments you might want.  There may be an
> advantage to also allowing the use of names of RIF documents as names of
> entailment regimes.  (I don't actually understand how users interact
> with entailment regime identifiers, sorry.)
> 
> >   o How are blank nodes defined in RIF? Will skolemization/mapping to RIF
> >     local symbols work as for the other regimes?
> 
> I think so.
> 
> >   o Not all RIF dialects are based on a model-theory (e.g., RIF PRD), so
> >     they do not come with an entailment relation, but have a procedural
> >     semantics. Can we still use the procedural semantics to define
> >     something like an entailment regime?
> 
> I sure hope so.  When we say "entailment regime" what we really mean is
> "specification of inference", right?  And PRD has a specification of
> inference, which is all that actually matters.

I would hope that we can also define the Core/BLD semantics in a procedural way,
via some finitely bounded canonical approximation of the minimal Herbrand model...
I will sketch that in a separate mail...

> 
> >   o Which RIF profiles should be included? Only RIF Core? Does RIF Core
> >     coincide with OWL RDF-Based or Direct Semantics? How many profiles are
> >     there?
> 

The advantage of (strongly safe) RIF Core is that it lends itself straightforwardly to the
idea of defining the entailment regime via the closure (i.e. minimal Herbrand model), since 
it is always finite. For other entailment regimes, restricting to a finite answer set is trickier.
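
A minimal sketch of the closure idea, assuming a toy Datalog-style ruleset with no function symbols or arithmetic (a simplification standing in for the safety conditions, not RIF Core itself). The tuple encoding, the example facts, and the names `match` and `closure` are hypothetical.

```python
# Atoms are tuples ("predicate", arg1, ...); variables are strings starting with "?".
FACTS = {("parent", "ann", "bob"), ("parent", "bob", "cy")}
RULES = [
    # ancestor(?x, ?y) :- parent(?x, ?y)
    (("ancestor", "?x", "?y"), [("parent", "?x", "?y")]),
    # ancestor(?x, ?z) :- parent(?x, ?y), ancestor(?y, ?z)
    (("ancestor", "?x", "?z"), [("parent", "?x", "?y"), ("ancestor", "?y", "?z")]),
]

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(atom, fact, binding):
    """Extend binding so that atom equals fact, or return None on mismatch."""
    if len(atom) != len(fact):
        return None
    binding = dict(binding)
    for a, f in zip(atom, fact):
        if is_var(a):
            if binding.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return binding

def closure(facts, rules):
    """Apply all rules repeatedly until no new fact appears (least fixpoint)."""
    known = set(facts)
    while True:
        new = set()
        for head, body in rules:
            bindings = [{}]
            for atom in body:
                bindings = [b2 for b in bindings for fact in known
                            if (b2 := match(atom, fact, b)) is not None]
            for b in bindings:
                new.add(tuple(b.get(t, t) for t in head))
        if new <= known:
            return known
        known |= new
```

Because no rule can create a new constant, only finitely many ground atoms exist, so the loop must reach a fixpoint; BGP matching could then be evaluated against the returned set.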

> I hope we don't have to say anything about which RIF profiles (dialects)
> are included.  Certainly we don't want to exclude user extensions.
> (But, again, I don't understand how users interact with this spec.)  

I wouldn't be too worried about leaving that open... as we did leave open the definition of
more dialects in RIF. I'd be happy to go with RIF strongly safe Core, RIF Core and
RIF BLD for a start. (If we find a volunteer to tackle RIF PRD, that's also fine.)

> 
> >   o What effects do the non-monotonic features of some RIF dialects have?
> >     E.g., RIF PRD and (anticipated) RIF dialects with default negation.
> >     How does that interact with SPARQL's non-monotonic features?
> >     This probably affects issue-43: Should entailment-regimes be declared
> >     over the whole dataset or individual graphs?
> 
> Eeeeek, scary stuff.

Not really scary: the non-monotonic features of SPARQL are on top of BGP matching.
All we need to define for an entailment regime in SPARQL is the extension of BGP matching
(i.e. roughly speaking "conjunctive queries"); anything else is "on top".

> 
> Still, for PRD, I think it's easy to just say you're querying over the
> state of the system after running all the rules to completion.
> 
> If someone does a logical non-mon RIF dialect, that could be more
> dangerous.

As said above, I don't think so (unless you have a dialect that has several "models", e.g. a dialect based on stable model semantics, but even for those cases there are solutions, such as only considering cautious consequences, i.e. those true
in all models).
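
To illustrate "cautious consequences", here is a minimal sketch that assumes the models have already been computed by some solver and are given as sets of ground atoms. The function name and the handling of the no-model case are assumptions made for this example.

```python
def cautious_consequences(models):
    """Return the atoms that hold in every model (intersection of all models)."""
    models = [set(m) for m in models]
    if not models:
        # Assumption for this sketch: with no model at all, report nothing.
        return set()
    return set.intersection(*models)

# The program "a :- not b.  b :- not a.  c." has two stable models:
STABLE_MODELS = [{"a", "c"}, {"b", "c"}]
```

Here only `c` is true in both models, so `cautious_consequences(STABLE_MODELS)` is `{"c"}`, giving a single answer set even though the dialect has several models.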

> 
> >   o RIF production rules: it is not even clear how conjunctive queries work.
> 
> Just query over the completed state, yes?
> 
> >   o What is our timeline for RIF?
> 
> Soon?  :-)    RIF is trying to get to REC in a very small number of months
> (like 2).
> 
> One little concern I have is Condition 4.  I don't see how one can
> possibly define a finite answer set for RIF.   For example, given the
> document:
> 
>     if p(?n) then p(?n+1).
>     p(1).
> 
> and the query
> 
>     p(?x).
> 

I will sketch a proposal in a separate mail. I think there IS a sensible proposal to limit the answers.
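
For illustration only, the quoted ruleset can be forward-chained under an explicit round limit. This is a minimal sketch, and the round limit is an assumption of this example, not the proposal referred to above; the names `derive_p` and `ask_p` are hypothetical.

```python
def derive_p(max_rounds):
    """Forward-chain 'if p(?n) then p(?n+1)' from p(1) for at most max_rounds rounds."""
    known = {1}
    frontier = {1}
    for _ in range(max_rounds):
        frontier = {n + 1 for n in frontier} - known
        if not frontier:
            break  # fixpoint reached; never happens for this ruleset
        known |= frontier
    return known

def ask_p(value, max_rounds):
    """Answer the ground query p(value) against the bounded derivation."""
    return value in derive_p(max_rounds)
```

Without the limit the loop would never terminate, which is exactly the concern with the query p(?x); with it, the answer set is finite but depends on the chosen bound.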

> well, I don't think there's any sensible way to limit this to a finite
> set of answers.  Is there?  Maybe we can just say such things are
> pathological and we don't define what the answer set might be?  But
> there might be perfectly reasonable, important, useful, working rulesets
> that have infinite answers....  Thinking...  Oh, sure, you could have a
> rule which defines a predicate odd(...), which is true for odd integers
> and false for even integers.  ask odd(17) makes perfect sense; query
> odd(?x) has infinite (very well defined) answers.
> 
>      -- Sandro
> 
> 


1. http://www.w3.org/TR/rif-rdf-owl/
Received on Wednesday, 24 February 2010 09:19:07 UTC