RDF DAWG Weekly -- 6 Mar 2007

housekeeping

<LeeF> Minutes from the 20th Feb: http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/att-0127/2007-02-20-dawg-minutes.html seconded by SimonR and approved.

<LeeF> Minutes from the 27th Feb: http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/att-0123/27-dawg-minutes.html amended to show Jeen's regrets, seconded by EricP and approved.

(Jeen's amendment at http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/0125.html)

Issue: daylight saving time will affect meeting schedules for American and European participants over the next several weeks. The group discussed whether to move DAWG's meeting time in response.

<LeeF> decision: stay at 14:30 UTC

review action items

<LeeF> ACTION: LeeF to talk to SteveH and JeenB about auto distinct behavior in implementations [DONE] [recorded in http://www.w3.org/2007/03/06-dawg-minutes.html#action01]

(Discussion with Steve and Jeen at: http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JanMar/0128.html )

<LeeF> ACTION: EricP to run the yacker tool over and annotate the existing tests [CONTINUES] [recorded in http://www.w3.org/2007/03/06-dawg-minutes.html#action03]

<LeeF> ACTION: EricP to add text to spec noting that ORDER BY comparisons may use extended implementations of < that operate on types beyond what's given in the operator table [CONTINUES] [recorded in http://www.w3.org/2007/03/06-dawg-minutes.html#action04]

<LeeF> ACTION: LeeF to remember that the wee, lost filter tests should be put [CONTINUES] [recorded in http://www.w3.org/2007/03/06-dawg-minutes.html#action05]

<LeeF> ACTION: AndyS to add text clarifying the prohibition on blank node labels in multiple BGPs to rq25 [DONE] [recorded in http://www.w3.org/2007/03/06-dawg-minutes.html#action06]

<LeeF> ACTION: LeeF or EliasT to reply to Bjoern regarding (not) POSTing application/sparql-query documents [CONTINUES] [recorded in http://www.w3.org/2007/03/06-dawg-minutes.html#action07]

unexpected/auto DISTINCT

LeeF: Last week's call tended to favor exact cardinality results. Dissenters were SimonR (during the call) and SteveH (in subsequent communication to LeeF). Jeen said he could work with either.

SimonR: Why do we care about cardinality? What is its logical meaning?

EricP: FredZ told horror stories about the history of SQL, in which migration to "non-capricious DISTINCT" happened painfully.

<AndyS> There are 3 positions:

1. auto DISTINCT
2. defined # duplicates
3. anything in between is OK (loose)

SteveH: Agrees with EricP's account of what FredZ said, but concerned that we're limiting possible indexing schemes.
... What happens if a triple occurs in two graphs?

AndyS: One for each graph.

<LeeF> <s> <p> ?o

<LeeF> occurring in multiple graphs - get 1 solution per graph (w/ the same ?o)

<LeeF> (3a) No way to say that auto-distinct is ok

<LeeF> (3b) A "distinctable" keyword to allow 1 < #solutions < n

<ericP> CONDENSED

<AndyS> I'd prefer to provide defined output (for aggregates later). If impls choose to be indeterminate - fine but not in spec.

<Souri> MIN, ALL, SOME

<Souri> w.r.t cardinality

<SimonR> It certainly becomes harder to do testing, if we permit multiple correct answers -- but we already have that problem. (eg without ORDER BY, etc)

<EricP> In favor of DISTINCT, ALL, and reducible as the default

<AndyS> Not in favor of that because most people will write the default case

EricP: Now in favor of having an ALL keyword, with the default being reducible.

<patH> Might this be handled by deft use of language? condensable is MUST but some clear option we choose is SHOULD, or something like that?

<AndyS> It affects testing, and wouldn't we need conformance language?

The chair called a straw poll on the following five possible designs:

1. default is strict counting as per the algebra, no keyword to loosen the counting
2. default is strict counting as per the algebra, add a DISTINCTABLE keyword to loosen the counting
3. default is loose counting, no keyword to force strict counting
4. default is loose counting, add an ALL keyword to force strict counting
5. always distinct
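
One way to read the counting modes behind these designs, offered as a minimal sketch and not as anything the group agreed: treat a query result as a multiset of solutions. The function names are hypothetical, and the reading of "loose" as "each solution at least once and at most as often as strict counting gives" is an assumption based on the (3b) note above.

```python
from collections import Counter

def strict_ok(expected, actual):
    """Strict counting: every solution appears exactly as often as the algebra says."""
    return Counter(expected) == Counter(actual)

def distinct_ok(expected, actual):
    """Always distinct: every solution appears exactly once."""
    return Counter(actual) == Counter(set(expected))

def loose_ok(expected, actual):
    """Loose counting (assumed reading): each solution appears at least once
    and no more often than under strict counting."""
    want, got = Counter(expected), Counter(actual)
    return set(want) == set(got) and all(1 <= got[s] <= want[s] for s in want)

# Solutions are tuples of bindings; strict evaluation yields three rows here.
strict_result = [("a",), ("a",), ("b",)]
```

Under this reading, a loose default accepts both the strict answer and the distinct answer, which is why it complicates testing: a test harness would need something like `loose_ok` in place of an exact comparison.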

<AndyS> ARQ has external flag for "always distinct" - I think those sorts of issues are outside the language

<LeeF> Andy: +1 on #1

<LeeF> Simon: -1 on strict counting; +1 #5, +0.5 #3

<LeeF> Souri: +1 #2

<SteveH> SteveH: +1 #4, +0.9 #2, -1 #5; #1 is acceptable

<AndyS> Andy's example: SELECT sum(?salary) { ?x :hasSalary ?salary }

<SimonR> Response to Andy's query: SUM(?salary, SELECT ?person ?salary { ?person :salary ?salary })

<AndyS> Simon: I wanted the minutes to record my example.

<SimonR> It's a good example. You see the point of the response, though? I choose the properties that serve to distinguish the salary values, in this case the people drawing them.

<AndyS> And my response was that the burden is now on the app writer. This is for the minutes.
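
The exchange between Andy and Simon can be illustrated with a small sketch. The data, the function name `sum_salary`, and its parameters are all hypothetical; the aggregate syntax in the chat was itself only a proposal. The point is that de-duplicating on ?salary alone merges two people who earn the same amount, while keeping ?person as a distinguishing key preserves both rows.

```python
# Hypothetical data: two different people who happen to draw the same salary.
solutions = [
    {"person": "alice", "salary": 100000},
    {"person": "bob", "salary": 100000},
]

def sum_salary(rows, project, distinct):
    """Project each row onto the given variables, optionally de-duplicate,
    then sum the salary column."""
    projected = [tuple(row[v] for v in project) for row in rows]
    if distinct:
        projected = list(dict.fromkeys(projected))  # order-preserving de-duplication
    salary_index = project.index("salary")
    return sum(row[salary_index] for row in projected)

# Andy's query with exact cardinality: both salaries are counted.
strict_total = sum_salary(solutions, ["salary"], distinct=False)
# The same query if the engine applies DISTINCT on its own: one salary is lost.
auto_distinct_total = sum_salary(solutions, ["salary"], distinct=True)
# Simon's response: keep ?person so that de-duplication is harmless.
keyed_total = sum_salary(solutions, ["person", "salary"], distinct=True)
```

Simon's form gives the right total under either counting rule, but only because the query author chose the distinguishing variables, which is the burden Andy refers to.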

LeeF: Unlikely to arrive at a decision before Andy needs to leave. Not quite sure how to proceed.

EricP: Nothing that'll work in the next 10 minutes. :)

SteveH: What if we required "loose" counting to be either DISTINCT or ALL, but not in between?

EricP: Doesn't really help implementers.

<patH> vote on 1-5: Pat: no very strong opinion, marginally like 2 best.

<AndyS> Hmm - can change the CONSTRUCT answers!

LeeF: Hoping to have QL document ready (minus perhaps just this issue) -- so reviews of rq25 must come in promptly.

<ericP> vote: 1) +1, 2) +1, 3) -1, 4) +1

<AndyS> e.g. CONSTRUCT { [] :salary ?sal } WHERE { ?x :salary ?sal }

<SteveH> AndyS, as [] is a bNode, I don't think it changes anything

AndyS: What are we proposing for next week re: LC?

LeeF: Propose that we make the decision of normative parts of document and DISTINCT issues.

<AndyS> It does! A new one gets generated each template :-) So it is not lean; it counts the results!

<SteveH> AndyS, but leaning is perfectly acceptable

<SteveH> -> _:a :salary 100000 . _:b :salary 100000 .

<AndyS> Yep - leaning is good here.
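
A rough sketch of the CONSTRUCT point, under stated assumptions: the template { [] :salary ?sal } mints a fresh blank node per solution, so the number of triples tracks the number of solutions until the graph is leaned. The names `construct_salary` and `lean_simple` are hypothetical, and `lean_simple` covers only the special case where each blank node occurs in exactly one triple, as its subject. Leaning an arbitrary graph is a harder problem that this does not attempt.

```python
from itertools import count

_fresh = count()

def construct_salary(solutions):
    """Instantiate the template { [] :salary ?sal } once per solution,
    minting a new blank node label each time."""
    return {(f"_:b{next(_fresh)}", ":salary", row["sal"]) for row in solutions}

def lean_simple(graph):
    """Leaning for the special case where every blank node occurs in exactly
    one triple, as its subject: triples that differ only in the blank node
    are redundant, so keep one per (predicate, object)."""
    kept = {}
    for s, p, o in sorted(graph):
        kept.setdefault((p, o), (s, p, o))
    return set(kept.values())

# Strict counting gives two solutions; a de-duplicating engine gives one.
two_rows = construct_salary([{"sal": 100000}, {"sal": 100000}])
one_row = construct_salary([{"sal": 100000}])
```

Before leaning the two graphs differ in size; after leaning they carry the same single statement, which matches SteveH's remark that leaning is acceptable here.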

<AndyS> SELECT sum(?salary) { ?x a :person . ?x :hasSalary ?salary }

<SteveH> G1 { <person-a> :salary 100000 } G2 { <person-a> :salary 100000 }

<SteveH> G1 { [ :salary 100000 ; :id 12 ] } G2 { [ :salary 100000 ; :id 12 ] }

<ericP> G1 { [ :employeeId 21; :salary 100000 ] } G2 { [ :employeeId 21; :salary 100000 ] }

<ericP> by a scant second

<ericP> SELECT ?g ?id ?salary WHERE { GRAPH ?g { :employeeId ?id; :salary ?salary } }

<SimonR> SUM(?salary, SELECT ?co ?id ?salary WHERE { GRAPH ?co { ?e :employeeID ?id . ?e :salary ?salary }}

<ericP> SELECT ?g ?id ?salary WHERE { GRAPH ?g { ?who :employeeId ?id; :salary ?salary } }

<ericP> SELECT SUM(?salary) WHERE { GRAPH ?g { ?who :employeeId ?id; :salary ?salary } }

<ericP> 200000

<Souri> SELECT ?x, SUM(?sal) { ?x a :Person . ?x :hasSalary ?salary } GROUP BY (?x) HAVING COUNT(*) > 1

<Souri> :-)

<ericP> ooo, HAVING COUNT... cool

<patH> are we all talking the same language?

<SimonR> patH, We're all talking our different, slightly preferred languages and trying to subvert each other. :)

<patH> thanks, Simon.

<ericP> SELECT SUM(?salary) WHERE { GRAPH ?g { ?who :employeeId ?id; :salary ?salary } } => 200000

<ericP> SELECT SUM(?salary) WHERE { GRAPH <G1> { ?who :employeeId ?id; :salary ?salary } } => 100000
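
ericP's two totals can be reproduced with a minimal sketch over an in-memory dataset. The blank node labels "_:e1" and "_:e2" and the helper names are hypothetical, and the matcher is hard-wired to this one pattern; it is not a general SPARQL evaluator.

```python
# Each named graph is a set of (subject, predicate, object) triples.
# "_:e1" and "_:e2" are hypothetical labels for the [ ... ] blank nodes.
dataset = {
    "G1": {("_:e1", ":employeeId", 21), ("_:e1", ":salary", 100000)},
    "G2": {("_:e2", ":employeeId", 21), ("_:e2", ":salary", 100000)},
}

def match_employee(graph):
    """Solutions of { ?who :employeeId ?id; :salary ?salary } in one graph."""
    ids = {(s, o) for s, p, o in graph if p == ":employeeId"}
    salaries = {(s, o) for s, p, o in graph if p == ":salary"}
    return [
        {"who": who, "id": emp_id, "salary": salary}
        for who, emp_id in ids
        for who2, salary in salaries
        if who == who2
    ]

def sum_salary(dataset, graph_name=None):
    """SUM(?salary) over GRAPH ?g (graph_name=None) or over one fixed graph."""
    names = dataset if graph_name is None else [graph_name]
    return sum(row["salary"] for g in names for row in match_employee(dataset[g]))

all_graphs_total = sum_salary(dataset)      # GRAPH ?g
g1_total = sum_salary(dataset, "G1")        # GRAPH <G1>
```

With GRAPH ?g each graph contributes its own solution, so the same employee record is summed once per graph.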

<patH> Seems to me this is a lot of hassle to solve a problem that shouldn't even arise in RDF anyway.

<SteveH> G1 { [ :employeeId 21; :salary 100000; :worksFor :A ] } G2 { [ :employeeId 21; :salary 100000; :worksFor :B ] }

<ericP> SELECT SUM(?salary) WHERE { GRAPH ?g { ?who :employeeId ?id; :salary ?salary; ?worksFor ?w } } => 200000

<ericP> SELECT SUM(?salary) WHERE { GRAPH ?g { ?who :employeeId ?id; :salary ?salary; ?worksFor :B } } => 100000

<SteveH> G1 { [ :employeeId 21; :salary 100000; :worksFor :A ] } G2 { [ :employeeId 21; :salary 100000; :worksFor :B ] . [ :employeeId 21; :salary 100000; :worksFor :B ] }

<ericP> pretty-printing that:

<ericP> G1 { [ :employeeId 21; :salary 100000; :worksFor :A ] }

<ericP> G2 { [ :employeeId 21; :salary 100000; :worksFor :B ] }

<patH> If G2 is put into RDF using bnodes as subjects (case 1) then you will likely have two different bnodes. If you use URIs as subjects (case 2) then the redundancy will not get into the graph.
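
patH's two cases can be sketched by modelling a graph as a set of triples. The helper names and the example URI are hypothetical. Because a set cannot hold the same triple twice, a repeated record with a URI subject collapses, while two records with separate blank nodes remain distinct.

```python
from itertools import count

_labels = count()

def fresh_bnode():
    """Mint a blank node label that has not been used before."""
    return f"_:b{next(_labels)}"

def add_record(graph, subject):
    """Add one employee record about `subject` to a graph (a set of triples)."""
    graph.add((subject, ":employeeId", 21))
    graph.add((subject, ":salary", 100000))
    graph.add((subject, ":worksFor", ":B"))

# Case 1: each record gets its own blank node, so the duplicate survives.
g2_bnodes = set()
add_record(g2_bnodes, fresh_bnode())
add_record(g2_bnodes, fresh_bnode())

# Case 2: both records use the same (hypothetical) URI, so the set absorbs the repeat.
g2_uris = set()
add_record(g2_uris, "<http://example.org/emp/21>")
add_record(g2_uris, "<http://example.org/emp/21>")
```

In case 1 the graph holds six triples and a salary sum would count the record twice; in case 2 it holds three.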
