Re: Agenda for June 14 Telcon - Revision 1 from Michael Hausenblas on 2011-06-14 (public-rdb2rdf-wg@w3.org from June 2011)

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Tue, 14 Jun 2011 14:24:29 +0100
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Message-Id: <A09DEF7E-2D8B-414F-AFEE-BC608CAA7AE6@deri.org>
> "Proposal EF-1:

Exactly. EF-*1* ... sorry, my mind reading skills are not yet fully  
developed.

Cheers,
 Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 14 Jun 2011, at 14:07, Enrico Franconi wrote:

>
> On 14 Jun 2011, at 13:51, Michael Hausenblas wrote:
>
>>
>>> The proposal says that the DM is not applicable to RDBs with NULL  
>>> values.
>>
>> I didn't see your proposal, yet.
>
> uh? From the wiki:
> "Proposal EF-1: If a relational database contains NULL values, then  
> the direct mapping is not applicable. This case is postponed for  
> consideration to a future WG."
>
> cheers
> --e.
>
>
>>
>>> Don't restart all the discussion again.
>>
>> Please let's not go there. As a WG co-chair it is my responsibility  
>> to ensure progress. If you don't like that, you're more than  
>> welcome to take over my position, I'll happily resign.
>>
>> Cheers,
>>  Michael
>>
>> On 14 Jun 2011, at 12:44, Enrico Franconi wrote:
>>
>>> NO.
>>> The proposal says that the DM is not applicable to RDBs with NULL  
>>> values.
>>> Don't restart all the discussion again.
>>>
>>> On 14 Jun 2011, at 13:37, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>
>>>>
>>>>> Fair enough. If you believe so, then the proposal should be the  
>>>>> one where we give up on NULL values, since it is the only one  
>>>>> where there is no technical disagreement in the WG :-)
>>>>
>>>> OK. So here is the proposal:
>>>>
>>>> [[
>>>> PROPOSAL: To resolve ISSUE-42, the Direct Mapping will include  
>>>> triples representing the relational schema and will omit triples  
>>>> for NULL values.
>>>> ]]
>>>>
>>>>
>>>> Cheers,
>>>> Michael
>>>>
>>>> On 14 Jun 2011, at 12:24, Enrico Franconi wrote:
>>>>
>>>>> On 14 Jun 2011, at 13:17, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>>>
>>>>>>
>>>>>>> In the wiki I came up explicitly with 3 alternative concrete  
>>>>>>> wordings; please look at them.
>>>>>>
>>>>>>
>>>>>> Looked at them. I need one (1) not three (3).
>>>>>>
>>>>>>
>>>>>>> What I can not do is to solve the open technical problem for  
>>>>>>> the representation with missing NULLs, since it is hard and  
>>>>>>> complex.
>>>>>>
>>>>>> That's also my understanding. Hence we can't normatively spec  
>>>>>> something where even the scientific part is not solved.
>>>>>
>>>>> Fair enough. If you believe so, then the proposal should be the  
>>>>> one where we give up on NULL values, since it is the only one  
>>>>> where there is no technical disagreement in the WG :-)
>>>>> I argued that also the proposal with materialised NULLs is  
>>>>> technically sound, but not everybody in the WG believes so.
>>>>> --e.
>>>>>
>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Michael
>>>>>>
>>>>>> On 14 Jun 2011, at 12:15, Enrico Franconi wrote:
>>>>>>
>>>>>>> In the wiki I came up explicitly with 3 alternative concrete  
>>>>>>> wordings; please look at them.
>>>>>>> What I can not do is to solve the open technical problem for  
>>>>>>> the representation with missing NULLs, since it is hard and  
>>>>>>> complex. The proposers of this representation should come up
>>>>>>> with an answer to this question, so as to support their argument.
>>>>>>> Otherwise only my proposals can stand.
>>>>>>>
>>>>>>> On 14 Jun 2011, at 13:07, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>> I have been asking this WG for ages how to rebuild the correct
>>>>>>>>> answers with explicit NULLs from your representation
>>>>>>>>
>>>>>>>> This is, IMO, the core of the problem. You're asking rather  
>>>>>>>> than coming up with a concrete wording for the proposal.
>>>>>>>>
>>>>>>>> Please, for the sake of getting this issue closed and meeting  
>>>>>>>> the September deadline for LC: Enrico, can you draft a  
>>>>>>>> concrete wording such as:
>>>>>>>>
>>>>>>>>
>>>>>>>> [[
>>>>>>>> PROPOSAL: To resolve ISSUE-42, ...
>>>>>>>> ]]
>>>>>>>>
>>>>>>>>
>>>>>>>> that we can discuss and hopefully resolve today?
>>>>>>>>
>>>>>>>> If we fail to get this done today I'm inclined to change the
>>>>>>>> overall timeline because we have a lot more issues to
>>>>>>>> resolve and simply cannot afford to discuss one single
>>>>>>>> issue (no matter how important it is) till the cows come home.
>>>>>>>>
>>>>>>>> This is not a scientific beauty contest. We're writing a
>>>>>>>> spec, for heaven's sake.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> On 14 Jun 2011, at 11:44, Enrico Franconi wrote:
>>>>>>>>
>>>>>>>>> On 13 Jun 2011, at 23:16, Eric Prud'hommeaux wrote:
>>>>>>>>>
>>>>>>>>>> There is a fundamental difference between SPARQL and SQL  
>>>>>>>>>> users in that SQL users either prohibit a query from  
>>>>>>>>>> answering with NULLs:
>>>>>>>>>> SELECT name, company         ┌────────────────┐
>>>>>>>>>> FROM Conctacts               │ name │ company │
>>>>>>>>>> WHERE name="Sue"             ├──────┼─────────┤
>>>>>>>>>> AND company IS NOT NULL      └──────┴─────────┘
>>>>>>>>>> or they write some application code to skip over the
>>>>>>>>>> NULLs, or, pretty commonly, the UI paints an empty string
>>>>>>>>>> and the interface user has to guess whether it was a NULL
>>>>>>>>>> or a company named "". The intent of the query in this
>>>>>>>>>> example was clearly to get the names of the companies which
>>>>>>>>>> Sue represents, for which neither NULL nor r2rml:NULL nor ""
>>>>>>>>>> are acceptable answers.
>>>>>>>>>
>>>>>>>>> I claim that you can filter out NULLs, exactly like you
>>>>>>>>> would do in SQL. On what grounds do you claim that
>>>>>>>>> applications built on top of RDF data are different from
>>>>>>>>> applications built on top of an RDB wrt the usage of NULLs? I
>>>>>>>>> don't see any evidence that there is such a radical
>>>>>>>>> difference to justify your non-standard way of dealing with
>>>>>>>>> standard NULLs.
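
[Editorial illustration, not part of the original message.] The SQL-side filtering that both Eric and Enrico refer to can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The table contents are assumptions made for illustration only: the thread implies that Sue's company is NULL, while the "Bob"/"Acme" row and the helper name companies_for are hypothetical.

```python
import sqlite3


def companies_for(conn, name):
    """Return (name, company) rows for `name`, skipping NULL companies."""
    cur = conn.execute(
        "SELECT name, company FROM Conctacts "
        "WHERE name = ? AND company IS NOT NULL",
        (name,),
    )
    return cur.fetchall()


# Hypothetical data: Sue has no company recorded (NULL), Bob has one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Conctacts (name TEXT, company TEXT)")
conn.executemany(
    "INSERT INTO Conctacts VALUES (?, ?)",
    [("Sue", None), ("Bob", "Acme")],
)

sue_rows = companies_for(conn, "Sue")  # filtered query: NULL row is dropped
bob_rows = companies_for(conn, "Bob")
# Without the IS NOT NULL clause the NULL comes back as Python's None.
sue_unfiltered = conn.execute(
    "SELECT name, company FROM Conctacts WHERE name = ?", ("Sue",)
).fetchall()
```

With this data the filtered query for Sue yields an empty result, matching the empty table in Eric's example, whereas the unfiltered query exposes the NULL to the application.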
>>>>>>>>>
>>>>>>>>>> At any rate, I was just arguing that given a tension  
>>>>>>>>>> between putting burden on the query author to incorporate  
>>>>>>>>>> <code>FILTER (?company != r2rml:NULL)</code> into the above  
>>>>>>>>>> query, vs. requiring the person who wants to see the NULL  
>>>>>>>>>> to know the schema:
>>>>>>>>>>                                                    ┌────────────────┐
>>>>>>>>>> SELECT *                                           │  who │ company │
>>>>>>>>>> WHERE { ?who <Conctacts#name> "Sue"                ├──────┼─────────┤
>>>>>>>>>> OPTIONAL { ?who <Conctacts#company> ?company } }   │  Sue │ UNBOUND │
>>>>>>>>>>                                                    └──────┴─────────┘
>>>>>>>>>> , I *think* the rest of the WG is in favor of the
>>>>>>>>>> latter (hence the claim of rough consensus).
>>>>>>>>>
>>>>>>>>> No, this doesn't work, since you would confuse the answer
>>>>>>>>> with a NULL value with the answer with a non-existing value.
>>>>>>>>> So, the above query doesn't do the job you are declaring. I
>>>>>>>>> have been asking this WG for ages how to rebuild the correct
>>>>>>>>> answers with explicit NULLs from your representation (even
>>>>>>>>> with the schema). To no avail.
>>>>>>>>> So, please tell me explicitly how you get the right
>>>>>>>>> answer in the above case, with all the details (how the
>>>>>>>>> schema is used, how you distinguish the missing value
>>>>>>>>> from the NULL value, how this can be applied mechanically to
>>>>>>>>> general queries, etc).
>>>>>>>>>
>>>>>>>>>>> That's why I am saying "This mapping for NULL values is  
>>>>>>>>>>> arbitrary since the WG has left unexplored its  
>>>>>>>>>>> relationship with the original meaning and behaviour of  
>>>>>>>>>>> NULL values in the source RDB."
>>>>>>>>>
>>>>>>>>> I can repeat that :-)
>>>>>>>>>
>>>>>>>>>>> What I have been asking you for ages is to go through my three
>>>>>>>>>>> examples and see how your proposal would actually encode
>>>>>>>>>>> the answers, and show how this would lead to a generic
>>>>>>>>>>> recipe.
>>>>>>>>>
>>>>>>>>> This request still stands.
>>>>>>>>>
>>>>>>>>>>> My argument is that this will most likely be possible, but  
>>>>>>>>>>> that it will be overly complex since it will necessarily  
>>>>>>>>>>> require the ability to recognise whether a missing value  
>>>>>>>>>>> is a NULL or not (also in the answer set!).
>>>>>>>>>
>>>>>>>>> Let's see your answer to my question in bold above.
>>>>>>>>>
>>>>>>>>>>> Clearly, by having explicit NULL values this problem is
>>>>>>>>>>> avoided. Moreover, you can easily switch to the absent-NULL
>>>>>>>>>>> representation by just filtering all the tuples with
>>>>>>>>>>> NULL values in one simple shot.
>>>>>>>>>>
>>>>>>>>>> In <http://www.w3.org/2001/sw/rdb2rdf/wiki/RDBNullValues#Comments_and_Proposal_by_Enrico>,
>>>>>>>>>> you asked how to discriminate between the direct graphs of
>>>>>>>>>> ┌┤R├────────┐ and ┌┤R'├┐
>>>>>>>>>> │ ID │    A │     │ ID │
>>>>>>>>>> ├────┼──────┤     ├────┤
>>>>>>>>>> │  1 │ NULL │     │  1 │
>>>>>>>>>> └────┴──────┘     └────┘
>>>>>>>>>> , but we do that by knowing the schema so the question  
>>>>>>>>>> doesn't help us learn what is a reasonable mapping.
>>>>>>>>>
>>>>>>>>> This is too vague: "we do that by knowing the schema". As I  
>>>>>>>>> said above, please tell how do you proceed explicitly.
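
[Editorial illustration, not part of the original message.] The R versus R' question can be made concrete with a toy sketch. It is not the WG's Direct Mapping algorithm: the function direct_graph, the helper null_columns, the IRI shapes, and the use of the string "r2rml:NULL" as an explicit NULL term are all assumptions for illustration, and both tables are deliberately given the same name "R" so that only the NULL handling can make the graphs differ.

```python
def direct_graph(table, columns, rows, omit_nulls=True, null_term="r2rml:NULL"):
    """Toy direct mapping: one triple per (row, column) cell.

    NULL cells (None) are either skipped or emitted with `null_term`.
    IRI shapes are simplified placeholders, not those of the spec.
    """
    triples = set()
    for row in rows:
        subject = f"<{table}/ID={row['ID']}>"
        for col in columns:
            value = row.get(col)
            if value is None:
                if omit_nulls:
                    continue  # absent-NULL representation: no triple at all
                obj = null_term  # explicit-NULL representation
            else:
                obj = repr(value)
            triples.add((subject, f"<{table}#{col}>", obj))
    return triples


def null_columns(graph, table, row_id, columns):
    """Use the schema (column list) to list columns with no triple for a row."""
    subject = f"<{table}/ID={row_id}>"
    present = {p for (s, p, _) in graph if s == subject}
    return [c for c in columns if f"<{table}#{c}>" not in present]


# R has columns (ID, A) with row (1, NULL); R' has only column ID with row (1).
g_r = direct_graph("R", ["ID", "A"], [{"ID": 1, "A": None}])
g_r_prime = direct_graph("R", ["ID"], [{"ID": 1}])
g_r_explicit = direct_graph("R", ["ID", "A"], [{"ID": 1, "A": None}], omit_nulls=False)

same_when_omitted = g_r == g_r_prime            # graphs alone cannot tell R from R'
differ_when_explicit = g_r_explicit != g_r_prime
recovered = null_columns(g_r, "R", 1, ["ID", "A"])  # needs R's column list
```

Under these assumptions the two graphs are identical when NULLs are omitted, so any distinction has to come from schema information held outside the data triples, which is the step Enrico asks to see spelled out; with an explicit NULL term the graphs differ on their own.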
>>>>>>>>>
>>>>>>>>>> I instead propose that you ask questions of the
>>>>>>>>>> ┤Conctacts├ database above and show how, even knowing
>>>>>>>>>> the schema, the direct graph doesn't give you realistic
>>>>>>>>>> access to information. Remember, this isn't a database
>>>>>>>>>> interchange language, but instead a way to give RDF users
>>>>>>>>>> a useful view of relational data.
>>>>>>>>>
>>>>>>>>> I don't understand this point :-(
>>>>>>>>>
>>>>>>>>> cheers
>>>>>>>>> --e.
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>
>
Received on Tuesday, 14 June 2011 13:24:58 UTC