14:28:13 RRSAgent has joined #rdf-star
14:28:17 logging to https://www.w3.org/2025/06/27-rdf-star-irc
14:28:17 pfps has joined #rdf-star
14:28:29 meeting: https://www.w3.org/events/meetings/ffcebe59-f304-495a-bb7d-3a9c2c4dd5f7/20250627T143000/
14:28:45 I have made the request to generate https://www.w3.org/2025/06/27-rdf-star-minutes.html AndyS
14:29:26 olaf has joined #rdf-star
14:29:29 Meeting: SPARQL-TF
14:29:35 present+
14:29:48 agenda: https://www.w3.org/events/meetings/ffcebe59-f304-495a-bb7d-3a9c2c4dd5f7/20250627T143000/#agenda
14:29:49 clear agenda
14:29:49 agenda+ Scribe?
14:29:49 agenda+ Recap discussion on subqueries
14:29:49 agenda+ TBD
14:29:49 agenda+ Topic for next time
14:29:51 present+
14:30:00 I have made the request to generate https://www.w3.org/2025/06/27-rdf-star-minutes.html AndyS
14:30:22 present+
14:30:39 present+
14:30:44 chair: AndyS
14:32:04 present+
14:32:37 present+
14:32:59 zakim, open item 1
14:32:59 agendum 1 -- Scribe? -- taken up [from agendabot]
14:33:13 scribe+
14:33:24 zakim, next item
14:33:24 agendum 1 was just opened, AndyS
14:33:36 zakim, close item 1
14:33:36 agendum 1, Scribe?, closed
14:33:36 previous meeting: https://www.w3.org/2025/06/26-rdf-star-minutes.html
14:33:37 next meeting: https://www.w3.org/2025/07/03-rdf-star-minutes.html
14:33:38 I see 3 items remaining on the agenda; the next one is
14:33:38 2. Recap discussion on subqueries [from agendabot]
14:33:42 james4 has joined #rdf-star
14:33:53 zakim, open item 2
14:33:53 agendum 2 -- Recap discussion on subqueries -- taken up [from agendabot]
14:34:31 AndyS: last week discussed subqueries. how they relate to usage patterns. underlying question: what is the nature of variables inside subqueries that are not projected from the subqueries?
14:35:07 ... related to parameterized queries and intuition that all occurrences of variables would change. cf.
a more algebraic viewpoint where anything inside a subquery which is not projected can be renamed, and no change to query outcome.
14:35:28 ... in terms of values insertion, it's whether you do or do-not do the renaming step.
14:35:46 present+ james
14:35:53 I have made the request to generate https://www.w3.org/2025/06/27-rdf-star-minutes.html TallTed
14:36:05 ... if you perform renaming, you get the more algebraic viewpoint. otherwise you get the parameterized query behavior.
14:37:02 pfps: other option is just doing shallow values injection.
14:37:34 ... options: no values injection (qlever); values injection everywhere (without renaming); values injection (with renaming); values injection only at the top.
14:37:49 AndyS: You have to decide shallow vs. not anyway.
14:37:55 ... Qlever doesn't do substitute at all?
14:38:03 pfps: yes. I put comments in the issue.
14:38:17 ... afaict, it does MINUS, essentially.
14:38:37 ... that's the easy thing to do.
14:38:53 AndyS: I would say it's not "easy" if it doesn't do what user expects.
14:39:19 pfps: Qlever implemented exists in response to complaint from me. Done very quickly.
14:41:37 james: I look at queries and wanted to satisfy curiosity on what people use EXISTS for.
14:41:52 ... there's a file in the repo called exist statistics(?) and the numbers are very low.
14:42:16 ... you can also look into individual directories and there's a txt file in each which shows the number of unique queries.
14:42:29 ... literally every single query that has been run on respective hosts for at least 5 years.
14:43:06 where is the file?
14:43:07 ... they are hash coded, so unique. of that, I used javascript rdf tools to parse, hash-code, and reserialize every exists form.
14:43:32 ... output isn't perfect. places that are not SPARQL (json means re-serialization wasn't possible)
14:43:45 ... surprised that didn't find any attempt to do a sub-select.
14:43:46 q+
14:43:57 ... mostly two triples. three triples.
occasionally 4-triple BGPs. but very few.
14:44:00 email -- "EXISTS variants" -- 24/06/2025, X:02
14:44:09 ... my concerns about complexity are in some sense not well founded.
14:45:14 https://github.com/datagraph/SPARQL-exists/tree/main/test-tools
14:45:24 ... if you look at the csv file in the repo, even the ones that are 4-statement BGPs, if you run into large cases, those won't behave nicely with an iterative process.
14:45:27 ... that's my concern.
14:46:36 ... this is produced by a javascript script. all it cares about is finding a directory of queries. others can run it.
14:46:43 pfps: the serializer doesn't work all the time?
14:46:58 james: yes, that's a problem with the serializer.
14:47:06 pfps: I found a fair number of "SELECT", but those are all serializer issues.
14:47:17 james: I'd have to look more closely to know why they failed.
14:47:18 EXISTS {"queryType":"SELECT","type":"query","variables":[{"termType":"Variable","value":"v0"}],"where":[{"expression":{"termType":"NamedNode","value":"http://example.org/x0"},"type":"bind","variable":{"termType":"Variable","value":"v0"}},{"expression":{"args":[{"termType":"Variable","value":"v1"},{"termType":"NamedNode","value":"http://example.org/x1"}],"operator":"=","type":"operation"},"type":"filter"}]}
14:47:33 pfps: I found this [pasted example]
14:47:49 james: I did not collect the hash code of this example. Would need to look to find context.
14:48:07 james: if that is a sub-select, that might be the only one (or two).
14:48:17 ... 6 select forms in there. I can look at them and find the actual context.
14:48:33 pfps: also could look for embedded groups.
14:48:46 ... where you have a left { and right }.
14:48:49 e.g. {} UNION {}
14:49:03 ... that's another place where normal execution of SPARQL introduces a new variable context.
14:49:11 james: I'd expect to see a join in that case.
14:49:33 ... it's preliminary, but shows information.
14:49:50 pfps: groups are introduced in some places, operations.
MINUS introduces a new group.
14:50:10 AndyS: yes, but your characterization of introducing new variables misses the fact that it's because of bottom-up execution.
14:50:32 ... UNION would introduce. BIND has a join (relative to the things on the left).
14:50:44 ... VALUES also.
14:51:03 james: other thing not attempted is disconnected variables.
14:51:18 ... not good enough at the tooling to do that.
14:51:33 AndyS: most important thing is the observation of how rare EXISTS is at all.
14:51:41 ... you don't support remote SHACL on your stores?
14:51:44 james: no support for SHACL.
14:52:01 AndyS: some implementations can run client side, but run large number of calls.
14:52:21 ... possible because all of SHACL 1.0 have a definition in SPARQL.
14:52:26 ... not true with 1.2.
14:52:40 ... somewhere in SHACL there is something which is effectively EXISTS.
14:53:27 ... generally very fine-grained.
14:53:43 olaf: in the CSV files, there's a column called complexity?
14:54:11 james: that is looking at how many operators were in the parsed version. a single join is 1. second join 2. other operators similarly.
14:54:16 ... in the EXISTS pattern.
14:54:29 ... no sense of what's being fed into the EXISTS pattern.
14:54:36 AndyS: interesting one of them has BOUND in it.
14:55:14 james: I could make a column referring to which query it came from with context.
14:55:54 ... with respect to numbers, the txt files show places where 0.5M queries. 100 EXISTS forms of them. prevalence is very low.
14:56:27 AndyS: Qlever doesn't support exists. Other engines?
14:57:29 pfps: you can look at the issue (156).
14:57:46 ... the last comment has 4 different queries and what happens with them in jena, blazegraph, and qlever.
14:58:05 ...
[summarizing details in ticket]
14:58:12 s/ticket/issue/
14:58:59 https://github.com/w3c/sparql-query/issues/156#issuecomment-2991640816
14:58:59 https://github.com/w3c/sparql-query/issues/156 -> Issue 156 Addressing SPARQL EXISTS errata (by afs) [ErratumRaised]
14:59:11 https://github.com/w3c/sparql-query/issues/156#issuecomment-2991640816
15:00:44 AndyS: can't read too much into qlever results if it's really doing a MINUS.
15:01:00 pfps: hannah had said MINUS is the right thing to do. Not sure there's much support for that.
15:01:06 AndyS: that would be more change for users.
15:01:19 ... couldn't consider that an errata at that point.
15:01:51 pfps: hoped to get virtuoso, too. but endpoint has been down.
15:02:07 ... maybe it has been moved.
15:02:27 ... created queries to not depend on the data. should be able to run them on any data and get same results.
15:02:51 TallTed: I think that's true. Send me an email if that continues to be true.
15:03:24 TallTed: there are a bunch of public endpoints.
15:03:33 pfps: can run it on any endpoint.
15:03:48 ... will try it on virtuoso again.
15:04:05 this one is likely to be up -- https://dbpedia.org/sparql
15:04:39 pfps: Qlever endpoint has a link to the old virtuoso wikidata endpoint.
15:05:55 Yes, the QLever endpoint thinks Virtuoso Wikidata is at https://wikidata.demo.openlinksw.com/sparql
15:06:00 AndyS: summarizing: top-level or everywhere for variable injection.
15:06:10 olaf: can you explain using queries pfps has in the issue?
15:06:32 AndyS: effect means that you only encounter injected value after you've done some evaluation.
15:06:52 ... evaluation may be a filter. so any variables mentioned in the filter and e.g. on branch of a union could be unbound at that point.
15:07:01 ... if you put it at beginning of ALL BGPs, then it is available to the filter.
15:07:15 olaf: you're talking now about substitution?
15:07:22 AndyS: any mechanism.
15:07:27 ... or values insertion.
15:07:32 ...
mechanism for doing the correlation.
15:07:38 https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Alternative_endpoints also has the old endpoint for Virtuoso
15:08:02 ... if we move away from that, then we're not changing the filter expression, just making sure the value is available during evaluation.
15:08:39 ... in the algenra, every group pattern starts with an empty BGP. one way of looking at it is to start with a VALUES instead of an empty BGP.
15:08:46 s/algenra/algebra/
15:09:33 olaf: distinction between shallow and deep is (looking at last example in issue), before the deepest select, in shallow there would be a VALUES and only there?
15:09:48 EXISTS { SELECT ==> EXISTS { HERE join { SELECT ==> }}
15:09:49 pfps: in shallow one, you get VALUES right at the beginning. that VALUES does not penetrate into sub-select or the embedded group.
15:09:57 ... in deep one, you get VALUES at beginning of every group.
15:10:10 ... in second and third, you get values injection in two places. at top and inside the braces.
15:11:18 pfps: in shallow one, only after the starting left brace of the EXISTS
15:11:29 AndyS: in algebra terms, into every leaf of the tree.
15:12:13 ... techniques hyper uses should be applicable. but they are particularly for decorrelating queries.
15:12:31 ... should be possible to do it, but I haven't.
15:12:50 james: how does that apply to the fourth query?
15:13:03 Ted: I sent you an email asking for the new Virtuoso endpoint.
15:13:21 PREFIX ex:
15:13:21 SELECT ?x ?y ?z WHERE {
15:13:21 VALUES (?x ?y ?z) { ( ex:a ex:b 7 ) }
15:13:21 FILTER EXISTS { SELECT ?k WHERE { SELECT ?z WHERE { FILTER ( ?z = 7 ) } } }
15:13:22 }
15:13:45 EXISTS { ^^^ join SELECT ?k WHERE { ^^^ join SELECT ?z WHERE { ^^^ FILTER ( ?z = 7 ) } } }
15:13:49 AndyS: do it at beginning of every BGP
15:14:42 james: even though the SELECT ?k prevents the ?z from being injected?
15:15:04 pfps: inner ?z changes to ?z' (or something).
still do injection, but it doesn't do anything since innermost filter has been changed to use ?z'
15:15:22 AndyS: all the ?z would change.
15:15:48 pfps: I could shorten that example.
15:16:30 pfps: status of getting somebody from Qlever to these meetings?
15:16:46 AndyS: have previously sent invites.
15:17:26 ... we have their input. detailed comment on the issue.
15:19:25 zakim, next item
15:19:25 I see a speaker queue remaining and respectfully decline to close this agendum, AndyS
15:19:31 q?
15:19:35 q-
15:19:38 zakim, next item
15:19:38 agendum 3 -- TBD -- taken up [from agendabot]
15:20:08 AndyS: we got top- vs. everywhere. and to rename or not. any other key areas for discussion?
15:20:22 q+
15:20:22 pfps: haven't seen anything to indicate anything else.
15:20:34 ack james
15:20:47 james: wasn't there an issue about process being bottom-up or top-down?
15:20:57 pfps: that's covered by the one AndyS just mentioned.
15:21:12 ... it's on the issues list.
15:21:42 AndyS: "any execution is top down" is confusing wrt. rest of the spec.
15:22:06 ... top-down comes from the first last call of SPARQL 1.0.
15:22:57 AndyS: will work on agenda and maybe strawpolls for next time.
15:23:37 zakim, end meeting
15:23:37 As of this point the attendees have been james, gtw, olaf, AndyS, pfps, TallTed
15:23:39 RRSAgent, please draft minutes
15:23:40 I have made the request to generate https://www.w3.org/2025/06/27-rdf-star-minutes.html Zakim
15:23:46 I am happy to have been of service, AndyS; please remember to excuse RRSAgent. Goodbye
15:23:47 Zakim has left #rdf-star
15:23:55 rrsagent, please leave us
15:23:55 I see no action items
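Editorial sketch (not part of the log): the options discussed under agendum 2 (shallow vs. deep values injection, with or without renaming of unprojected sub-select variables) can be illustrated on the fourth query from the issue. This is a toy model under stated assumptions, not the algebra of any engine or of the spec: the names `Group`, `SubSelect`, `visible`, `rename`, `rename_unprojected`, `inject` and `show` are hypothetical, patterns are plain strings, and the printed output is informal notation in the spirit of the "^^^ join" line above rather than valid SPARQL.

```python
import itertools
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Group:
    """A brace-delimited group; items are pattern strings or nested nodes."""
    items: List[Union[str, "Group", "SubSelect"]]


@dataclass
class SubSelect:
    """A sub-select projecting `projected` (names without '?') from `body`."""
    projected: List[str]
    body: Group


VAR = re.compile(r"\?(\w+)")


def visible(node):
    """Variables a node exposes to its enclosing group."""
    if isinstance(node, str):
        return set(VAR.findall(node))
    if isinstance(node, SubSelect):
        return set(node.projected)  # only projected variables escape
    return set().union(*(visible(i) for i in node.items))


def rename(node, old, new):
    """Rename ?old to ?new everywhere below node."""
    if isinstance(node, str):
        return re.sub(r"\?" + re.escape(old) + r"\b", "?" + new, node)
    if isinstance(node, SubSelect):
        proj = [new if v == old else v for v in node.projected]
        return SubSelect(proj, rename(node.body, old, new))
    return Group([rename(i, old, new) for i in node.items])


def rename_unprojected(node, counter=None):
    """Give variables that a sub-select does not project a fresh name.

    Fresh names are v_1, v_2, ...; collisions with existing names are
    not checked in this sketch.
    """
    counter = counter if counter is not None else itertools.count(1)
    if isinstance(node, str):
        return node
    if isinstance(node, Group):
        return Group([rename_unprojected(i, counter) for i in node.items])
    body = rename_unprojected(node.body, counter)
    for v in sorted(visible(body) - set(node.projected)):
        body = rename(body, v, f"{v}_{next(counter)}")
    return SubSelect(node.projected, body)


def inject(node, values, deep):
    """Prepend `values` to the top group only (shallow) or to every group (deep)."""
    if isinstance(node, str):
        return node
    if isinstance(node, SubSelect):
        return SubSelect(node.projected, inject(node.body, values, deep)) if deep else node
    items = [inject(i, values, deep) if deep else i for i in node.items]
    return Group([values] + items)


def show(node):
    """Informal one-line rendering (not guaranteed to be valid SPARQL)."""
    if isinstance(node, str):
        return node
    if isinstance(node, SubSelect):
        head = " ".join("?" + v for v in node.projected)
        return f"SELECT {head} WHERE {show(node.body)}"
    return "{ " + " ".join(show(i) for i in node.items) + " }"


# The EXISTS pattern of the fourth query and the current solution as VALUES.
VALUES = "VALUES (?x ?y ?z) { (ex:a ex:b 7) }"
exists = Group([SubSelect(["k"], Group([SubSelect(["z"], Group(["FILTER (?z = 7)"]))]))])

shallow = show(inject(exists, VALUES, deep=False))
deep = show(inject(exists, VALUES, deep=True))
# Renaming is applied before injection, so the injected VALUES keeps ?z.
renamed_deep = show(inject(rename_unprojected(exists), VALUES, deep=True))

for label, text in [("shallow", shallow), ("deep", deep), ("renamed+deep", renamed_deep)]:
    print(f"{label}: {text}")
```

In this sketch renaming runs before injection, which matches pfps's remark that the injection still happens but no longer reaches the innermost filter once that filter uses the renamed variable.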