Re: Q: ISSUE-41 bNode semantics from Richard Cyganiak on 2011-05-19 (public-rdb2rdf-wg@w3.org from May 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Thu, 19 May 2011 13:03:21 +0100
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: Ivan Herman <ivan@w3.org>, Pat Hayes <phayes@ihmc.us>, Michael Hausenblas <michael.hausenblas@deri.org>, W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Message-Id: <A54F457B-D0B1-47A9-AFDD-C0394000A8C0@cyganiak.de>
Enrico,

I will repeat:

> If we have this:
> 
> <Alice> <name> "NULL"^^rdb2rdf:NULL .
> <Bob>   <name> "NULL"^^rdb2rdf:NULL .
> 
> then from RDF Semantics it follows that <Alice> and <Bob> have the same name. A constant is a constant, that's how datatypes work in RDF and you can't do anything about that unless you are prepared to change the formal semantics.


Your proposal does not work because of the technical problem above.

You try to fix it up by layering a custom query answering apparatus on top of the RDF. That does not work because RDF triples themselves already have a formal semantics, and your proposal is incompatible with this formal semantics. The formal semantics does not allow the use of a typed literal for any other purpose than denoting an entity.
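
A minimal sketch of the effect, in Python. The Literal class, the graph set and the same_name_pairs helper are made up for illustration and are not part of any RDF library or of the spec; the toy only checks term equality, whereas the RDF Semantics argument is about denotation, but the consequence for a join on the shared object is the same:

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Toy typed literal: a lexical form plus a datatype IRI (hypothetical model)."""
    lexical: str
    datatype: str


# The two triples from the example above, with the NULL cell encoded as a
# typed literal. "rdb2rdf:NULL" is the datatype name used in the example.
graph = {
    ("Alice", "name", Literal("NULL", "rdb2rdf:NULL")),
    ("Bob", "name", Literal("NULL", "rdb2rdf:NULL")),
}


def same_name_pairs(triples):
    """Return pairs of distinct subjects whose 'name' objects are the same term."""
    names = [(s, o) for (s, p, o) in triples if p == "name"]
    return sorted(
        (s1, s2)
        for (s1, o1) in names
        for (s2, o2) in names
        if s1 < s2 and o1 == o2
    )


# The two literals are one and the same constant, so the join matches.
pairs = same_name_pairs(graph)
print(pairs)
```

In this toy model the pair (Alice, Bob) comes out of the join, because nothing distinguishes one occurrence of the constant from the other.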

>>> (2) but given this, you then decide to deal anyway with NULL values, and you give a (already shown incorrect) mapping for them?
>> 
>> Perhaps incomplete, but not incorrect.
> 
> According to a standard notion of incorrectness your proposal is incorrect (and incomplete, of course).
> Indeed, in my example <http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011May/0111.html> if you query DB1 for all the tuples in R which do NOT have any value for the attribute A (a query you can easily write as the difference between all the tuples in R and the tuples in R which do have some value for A) you will get both tuples by using your mapping which neglects the NULL values. So you get spurious wrong tuples in the answer. And this is plain unsound.

You defined your own non-standard query mechanism, declared that it should return X, showed that it does not return X if applied to the graph produced by the Direct Mapping, and hence declared the Direct Mapping unsound. This doesn't show anything. The standard query language for RDF is SPARQL.

And a mapping from some other data model instance to an RDF graph is sound even if it is incomplete -- that is, it does not capture all the knowledge encoded in the data model instance. That's the open world assumption. Of course if you apply a closed-world query language that supports some form of negation or set difference to an open-world logic, you can produce spurious triples -- but that doesn't mean that the mapping is unsound.
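
A rough sketch of that distinction, in Python. The table R(id, A), its two rows and the direct_map function are hypothetical and only loosely modelled on a mapping that skips NULL cells; this is not the example from the linked message:

```python
# Hypothetical table R(id, A); None stands in for SQL NULL.
rows = [(1, None), (2, "x")]


def direct_map(table_rows):
    """Toy mapping: one triple per non-NULL cell; NULL cells produce no triple."""
    return {(f"R/{rid}", "A", a) for (rid, a) in table_rows if a is not None}


graph = direct_map(rows)

# Soundness in the sense used above: every triple corresponds to a real cell.
sound = all(
    any(s == f"R/{rid}" and o == a for (rid, a) in rows)
    for (s, _, o) in graph
)

# Closed-world set difference: "no triple in the graph" is read as "no value".
# The row identifiers are assumed to be known independently of the graph.
all_rows = {f"R/{rid}" for (rid, _) in rows}
rows_with_a = {s for (s, p, _) in graph if p == "A"}
no_value_closed_world = all_rows - rows_with_a

print(sound, no_value_closed_world)
```

Every triple the toy mapping emits is true of the table, so the graph itself asserts nothing false. The difference query gets its answer only from the absence of a triple, which is a closed-world inference that an open-world reading of the graph does not license.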

>> If you need a walkthrough of the spec to see why this is true, let me know.
> 
> I was in the SPARQL WG, so I don't need you to walk me through.


But did you read the RDF Semantics document?
http://www.w3.org/TR/rdf-mt/

If you did, then you can surely speak to the very specific technical problem with your proposal that I pointed out above?

Best,
Richard



On 19 May 2011, at 07:41, Enrico Franconi wrote:

> On 19 May 2011, at 01:09, Richard Cyganiak wrote:
> 
>> On 18 May 2011, at 21:22, Enrico Franconi wrote:
>>> (1) you have no evidence that "the semantics of RDF offers no construct that adequately covers this"
>> 
>> I do have evidence:
>> http://www.w3.org/TR/rdf-mt/
>> 
>> I read it all and nothing in there works for this.
> 
> The burden is on you to show that :-)
> 
>> Burden of proof is on you now I'd say ;-)
> 
> Indeed I proposed how to "use" unmodified standard normative RDF + SPARQL to deal correctly with NULL values.
> 
>>> (2) but given this, you then decide to deal anyway with NULL values, and you give a (already shown incorrect) mapping for them?
>> 
>> Perhaps incomplete, but not incorrect.
> 
> According to a standard notion of incorrectness your proposal is incorrect (and incomplete, of course).
> Indeed, in my example <http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011May/0111.html> if you query DB1 for all the tuples in R which do NOT have any value for the attribute A (a query you can easily write as the difference between all the tuples in R and the tuples in R which do have some value for A) you will get both tuples by using your mapping which neglects the NULL values. So you get spurious wrong tuples in the answer. And this is plain unsound.
> 
>>> When a future WG fixes this, the change will most likely not be backward compatible, since now you are making an unmotivated choice for the mapping of NULLs. And this is bad.
>> 
>> As long as we translate the non-null parts correctly, our translation is correct.
> 
> As long as you translate only RDBs which do not contain any NULL value, your translation is correct. Indeed, this is my proposal to the group in case you do not want to explore the possibility of correctly dealing with NULLs.
> 
>> It may not be complete because the presence of a null may give some extra information that we do not capture.
> 
> As I just said, it is incorrect (unsound) *and* incomplete. So it is just wrong.
> 
>> However, if some clever spirit in the future should find some way of squeezing that extra information into RDF triples, then these extra triples can always be added without breaking backwards compatibility. RDF is monotonic.
> 
> Since you are unsound, monotonicity does not help you to preserve backward compatibility.
> Look: the issue is complex, there is no reason to fight over this. The facts are that your proposal is wrong and my proposal is not trivial to analyse.
> 
>>> If the majority of this WG does not want to explore the correctness of the mapping for NULL values, then my proposal would be something along the lines:
>>> "Note: if a relational database contains NULL values, then the direct mapping is not applicable. This case is postponed for consideration to a future WG."
>> 
>> I repeat: Not mapping nulls may be incomplete, but not incorrect. Punting on the NULL cells doesn't mean that we have to punt on the entire DB.
> 
> I repeat: no.
> 
>>> I propose to translate a NULL value as a special constant from a special datatype, and to understand how SPARQL 1.0 queries should be modified in order to behave properly in the presence of RDF data coming from a direct mapping of an RDB with NULL values. My guess is that it is enough to enrich the BGP part with a conjunct NOT-EQUAL(X,'NULL') [pardon my naive syntax here] for each joined (namely repeated in the BGP) variable X, so we remain in pure SPARQL 1.0.
>> 
>> I'll not comment on charter scope here, I'll leave that to others.
>> 
>> You talk about changing SPARQL.
> 
> No. I'm talking about using unmodified standard normative SPARQL.
> 
>> But let me just say that you'd have to go deeper than that.
>> 
>> If we have this:
>> 
>> <Alice> <name> "NULL"^^rdb2rdf:NULL .
>> <Bob>   <name> "NULL"^^rdb2rdf:NULL .
>> 
>> then from RDF Semantics it follows that <Alice> and <Bob> have the same name. A constant is a constant, that's how datatypes work in RDF and you can't do anything about that unless you are prepared to change the formal semantics. If you need a walkthrough of the spec to see why this is true, let me know.
> 
> I was in the SPARQL WG, so I don't need you to walk me through.
> Please look again at my proposal, and realise that I am not proposing to change anything in SPARQL (or RIF or OWL). I am proposing a recipe on how to *use* unmodified standard normative RDF and SPARQL 1.0 in order to get the correct answers over the graph obtained by a direct mapping where the NULL value is translated as a special recognisable constant.
> It may well be the case that the WG and the SW community at large do not like this approach (since it requires adhering to a recipe in order to get the right answers, à la 'best practices'), but having worked on the theory of SQL null values for a few years now I am confident (though I can always be proved wrong) that this is the only way to correctly deal with the matter.
> 
>> It's not just SPARQL that's built on that foundation, but also OWL2 and RIF. Are you prepared to change them as well?
> 
> No I am not :-)
> 
> cheers
> --e.
> 
>> Best,
>> Richard
> 
>
Received on Thursday, 19 May 2011 12:03:48 UTC