15:58:37 RRSAgent has joined #rdf-star
15:58:41 logging to https://www.w3.org/2026/04/30-rdf-star-irc
15:58:41 zakim, start meeting
15:58:41 rrsagent, make logs public
15:58:42 RRSAgent, make logs Public
15:58:43 please title this meeting ("meeting: ..."), AndyS
15:58:50 meeting: RDF and SPARQL Working Group
15:58:59 agenda: https://www.w3.org/events/meetings/e5234c80-4c06-4c6b-af43-c78a1dbd390a/20260423T120000/#agenda
15:58:59 previous meeting: https://www.w3.org/2026/04/16-rdf-star-minutes.html
15:58:59 next meeting: https://www.w3.org/2026/04/30-rdf-star-minutes.html
15:59:00 clear agenda
15:59:00 agenda+ Approval of last week’s minutes: -> 1 https://www.w3.org/2026/04/16-rdf-star-minutes.html
15:59:00 agenda+ RDF Threat Model: get some help from the TMCG -> 2 https://www.w3.org/groups/cg/tmcg/calendar/
15:59:00 agenda+ Report from the meeting with the TAG
15:59:01 agenda+ Review of open PRs, available at -> 4 https://github.com/orgs/w3c/projects/20/views/4
15:59:04 agenda+ Identifying issues to solve before CR -> 5 https://github.com/orgs/w3c/projects/20/views/8
15:59:07 agenda+ Any Other Business (AOB), time permitting
15:59:43 niklasl has joined #rdf-star
16:00:12 agenda: https://www.w3.org/events/meetings/11e4d020-9c58-4fff-83c5-37c9e2502295/20260430T120000/#agenda
16:00:12 clear agenda
16:00:12 agenda+ Approval of last week’s minutes: -> 1 https://www.w3.org/2026/04/23-rdf-star-minutes.html
16:00:12 agenda+ Allowing \u escaped surrogate pairs -> 2 https://lists.w3.org/Archives/Public/public-rdf-star-wg/2026Apr/0028.html
16:00:12 agenda+ Review of open actions, available at -> 3 https://github.com/orgs/w3c/projects/20/views/3
16:00:12 agenda+ Review of open PRs, available at -> 4 https://github.com/orgs/w3c/projects/20/views/4
16:00:13 present+
16:00:15 agenda+ Identifying issues to solve before CR -> 5 https://github.com/orgs/w3c/projects/20/views/8
16:00:18 agenda+ Any Other Business (AOB), time permitting
16:00:24 present+
16:00:29 present+
16:00:37 
present+
16:00:37 present+
16:00:37 rrsagent, make logs public
16:00:39 rrsagent, please draft minutes
16:00:40 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html AndyS
16:00:47 agenda?
16:00:47 scribe+
16:00:51 present+
16:00:55 chair+
16:01:10 present+
16:01:19 pfps has joined #rdf-star
16:02:10 previous meeting: https://www.w3.org/2026/04/23-rdf-star-minutes.html
16:02:10 16:58 next meeting: https://www.w3.org/2026/05/07-rdf-star-minutes.html
16:02:19 present+
16:02:42 rrsagent, draft minutes
16:02:43 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html AndyS
16:02:58 present+
16:04:10 previous meeting: https://www.w3.org/2026/04/23-rdf-star-minutes.html
16:04:10 next meeting: https://www.w3.org/2026/05/07-rdf-star-minutes.html
16:04:16 rrsagent, draft minutes
16:04:18 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html AndyS
16:05:03 Souri has joined #rdf-star
16:05:09 present+
16:05:17 doerthe has joined #rdf-star
16:05:24 present+
16:05:29 regrest+ ktk
16:05:30 regrets+ pchampin, ktk
16:05:33 zakim, open item 1
16:05:33 agendum 1 -- Approval of last week’s minutes: -> 1 https://www.w3.org/2026/04/23-rdf-star-minutes.html -- taken up [from agendabot]
16:05:36 minutes look acceptable
16:05:38 s/regrest+ ktk//
16:06:12 PROPOSAL: Approve last week's minutes
16:06:18 +1
16:06:20 +0 (not present)
16:06:22 +1
16:06:25 +1
16:06:29 +1
16:06:32 +1
16:06:51 +1
16:06:53 +0 (not present)
16:06:54 +1
16:06:56 +1
16:07:00 +1
16:07:13 RESOLVED: Approve last week's minutes
16:07:21 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html TallTed
16:07:27 zakim, open next item
16:07:27 agendum 2 -- Allowing \u escaped surrogate pairs -> 2 https://lists.w3.org/Archives/Public/public-rdf-star-wg/2026Apr/0028.html -- taken up [from agendabot]
16:07:33 regrets+ olaf
16:07:42 s|16:58 next meeting: 
https://www.w3.org/2026/05/07-rdf-star-minutes.html||
16:07:47 regrets+ az
16:07:49 https://lists.w3.org/Archives/Public/public-rdf-star-wg/2026Apr/0028.html
16:08:13 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html TallTed
16:08:40 AndyS: There has been an email thread. The sentiment seems to be against allowing \u surrogate pairs, despite the i18n suggestion
16:09:21 AndyS: One choice is whether or not to prevent parsers from doing that
16:09:42 AndyS: We haven't had much response from people using UTF-16 programming languages like JavaScript, Python or Java
16:09:56 q+
16:10:00 ora: Your suggestion would have been to prevent them?
16:10:18 q+ to say that the current document is perfect
16:10:26 """
16:10:26 Two adjacent numeric escape sequences forming a Surrogate Pair MAY be converted to a supplementary codepoint as described by Unicode 17.0 section 3.9.2 UTF-16.
16:10:26 """
16:10:28 AndyS: I am pretty neutral
16:10:53 AndyS: We can put a MAY to don't make any obligation to support it in any way
16:11:20 +1 to MAY
16:11:47 ack pfps
16:11:47 pfps, you wanted to say that the current document is perfect
16:11:47 s/to don't make/to not make/
16:12:12 pfps: It appears to me that the document in its current form is perfect
16:12:33 pfps: It states that surrogates are not allowed and that Turtle parsers are allowed to do whatever they want outside the standard
16:13:09 pfps: If we state that "Turtle documents MAY include escaped surrogate pairs" it means that every processor should support them
16:13:25 pfps: I read Turtle 1.1 as preventing surrogate pairs, they are not characters
16:13:48 q+
16:14:15 AndyS: There is no "unicode character", there is the "abstract character" you don't write, and 1.1 does not explicitly prevent surrogates. 
This is the i18n reading
16:14:54 pfps: The intent of 1.1 is that surrogates are not allowed
16:15:02 AndyS: Unfortunately, intent is not speccable
16:15:28 pfps: You are right, "unicode character" is not a well-defined thing, but I don't think surrogates are "characters"
16:15:57 no characters, but there are "noncharacters" (which is orthogonal to this discussion)
16:15:59 AndyS: The nearest thing at this level is a code point. So technically U+0020 is not a character, it's a code point
16:16:09 ack lisp
16:16:44 q+
16:16:45 lisp: I really don't understand what is being argued about. We are talking about something encoded in UTF-8
16:16:56 lisp: if you have surrogate characters you get an invalid sequence
16:17:11 AndyS: at the UTF-8 level you get the \u... syntax, which is valid UTF-8
16:17:33 ack gtw
16:17:58 AndyS: What is going on is that you process the input as UTF-8 and you happen to find some escape sequences that encode surrogates
16:18:38 lisp: even if there is a pair of surrogates, this does not give a valid sequence of code points
16:20:46 q+
16:21:10 gtw: At the point we come across the \u escape, we are no longer parsing the UTF-8 input string, we are manipulating code points
16:21:57 ack lisp
16:21:57 gtw: My point is that \u is something we have defined in our format, it is used to encode unicode escapes but it is something we handle
16:22:13 lisp: If that is the case, then Turtle is not UTF-8, it is not
16:22:16 q+
16:22:18 q+
16:22:50 ack AndyS
16:22:57 q-
16:23:00 lisp: UTF-8 is a sequence of code units (1..4). If you put into it an escape for a code unit, then we are doing something invalid
16:23:19 q+
16:23:36 AndyS: The 12 characters \u....\u.... 
are decoded into a single Unicode code point, just as \n gives you the newline code point
16:23:36 ack lisp
16:23:51 lisp: We may do that but you are no longer decoding UTF-8
16:24:05 q+
16:24:13 ack gtw
16:24:27 q+
16:24:31 This sequence of characters has a meaning for Unicode: it is a sequence of two surrogates, and it's not allowed
16:24:55 ack lisp
16:25:32 q+ to say that the 12 bytes are *Turtle*
16:25:39 ack pfps
16:25:39 pfps, you wanted to say that the 12 bytes are *Turtle*
16:25:55 q+
16:26:18 q+ to move on to changes - if any - and response to i18n
16:26:35 ack lisp
16:26:42 gtw: Turtle sees a sequence of 12 characters \u....\u.... and has to choose whether they are legal or illegal
16:27:12 lisp: if you are decoding UTF-8, it has a definition for code point, you are not out of UTF-8 yet
16:27:26 lisp: If the WG wants to say it's UTF-8, then it's not possible
16:27:47 ora: What you want to say is that it's a "UTF-8 escape sequence"
16:28:02 q+
16:28:10 UTF-8 does have a definition of the twelve bytes \unnnn\unnnn, it's twelve Unicode code units, each of which is in the ASCII block
16:28:20 q+
16:28:25 lisp: No, it ends up as a UTF-16 escape sequence. Surrogates are not valid escape sequences
16:28:34 q-
16:28:40 ack AndyS
16:28:40 AndyS, you wanted to move on to changes - if any - and response to i18n
16:29:06 AndyS: Can we find some way forward on what we can answer to i18n?
16:29:25 AndyS: What I am hearing is that we can keep the current text (surrogates are not allowed) and leave it at that
16:29:29 q-
16:29:47 AndyS: I would be happy if we had some response from people who have worked in UTF-16 centric languages
16:30:28 AndyS: I changed Jena last fall to put it as close to the spec as possible. 
Java miss represent code on output, it currently will accept invalid surrogate pairs but state in the code "this is an extension"
16:30:33 STRAWPOLL: Keep current text: surrogates not allowed
16:30:41 +1
16:30:41 +1
16:30:47 +0 (neutral)
16:30:50 +1
16:30:56 +0
16:30:59 +1
16:31:04 +0.8
16:31:04 +0
16:31:17 -0.5
16:31:25 For \u: A Unicode code point in the ranges U+0000 to U+D7FF and U+E000 to U+FFFF, corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit.
16:32:00 "A numeric escape sequence MUST NOT produce a code point value in the range U+D800 to U+DFFF, which is the range for Unicode surrogates."
16:32:11 +0
16:32:30 q+
16:33:04 j22: I do not have strong feelings. It is a feature in the spec that could increase compatibility in existing specs
16:33:23 ack lisp
16:33:26 j22: So, I thought it would be reasonable to follow the suggestion but I do not have strong enough feelings
16:33:51 lisp: The text which Andy wrote is an accurate rendition of what UTF-8 requires
16:34:20 lisp: The notion that one will increase compatibility by allowing extensions will mean people won't know what they can expect
16:34:25 q+
16:34:35 ack AndyS
16:34:42 lisp: "surrogates MAY appear" means parsers must implement them to get interoperability
16:35:05 AndyS: The spec text uses the U+ notation that talks about code points after decoding
16:35:44 AndyS: if you want to talk interoperability, we should talk about what systems produce
16:36:47 ora: If we don't allow surrogates, what are the bad things that would happen? I do not have enough experience to know how marginal this issue is
16:36:50 q+
16:36:57 ack Tpt
16:37:41 scribe+
16:38:48 AndyS: The change in Jena to exclude bad use of surrogates did not cause user reports
16:39:59 Tpt: According to the issue author, DotNetRdf writes escaped surrogate pairs
16:40:44 Someone indicated that Python was UTF-16 friendly. I'm not seeing that in documents about strings in Python.
16:41:27 ora: the low energy option is to keep the old text
16:41:40 q+
16:41:47 ack Tpt
16:42:38 q+
16:42:43 ack gtw
16:42:47 Tpt: could put in a note about a common extension to Turtle.
16:43:11 Tpt: What about stating that parsers MAY accept escaped surrogates and serializers MUST NOT use them
16:43:32 gtw: That is the current status quo - don't feel that we need to mention it as it suggests it is acceptable.
16:43:41 gtw: I don't think it's a good idea, it suggests it is something "fine" to do. I am concerned it will cause compatibility issues
16:44:31 ora: Do people want to decide now? In the grand scheme of things, it's a small thing
16:45:03 ora: Adrian and I are giving a small status update at the Knowledge Graph Conference
16:45:27 PROPOSAL: Keep current text: surrogates not allowed
16:45:32 +1
16:45:35 +1
16:45:36 +1
16:45:38 +1
16:45:42 +1
16:45:50 +1
16:45:51 +1
16:45:57 +0
16:46:03 +1
16:46:08 +1
16:46:35 RESOLVED: Keep current text: surrogates not allowed
16:47:24 AndyS: What we answer to i18n, "the WG has considered your position and decided to keep the current text as it is clearer"
16:48:48 ora: AndyS will respond
16:49:23 zakim, open item 4
16:49:23 agendum 4 -- Review of open PRs, available at -> 4 https://github.com/orgs/w3c/projects/20/views/4 -- taken up [from agendabot]
16:52:40 ora: Where are we in tests?
16:52:49 q+
16:52:58 ack Tpt
16:54:43 q+
16:54:49 ack AndyS
16:55:17 +1 to rdf/xml informal meeting next week
16:55:24 Tpt: Tests for N-Triples/N-Quads/Turtle/TriG are in good shape, RDF/XML is blocked on possible substantive changes, SPARQL still misses some parts like EXISTS
16:55:57 AndyS: What about an informal session next week on RDF/XML?
16:56:12 is there a SPARQL TF call tomorrow?
16:56:33 ora: Neither Adrian nor I will likely be there next week
16:56:41 Perhaps also talk a little about https://github.com/w3c/rdf-star-wg/issues/189 next week? (Related.)
16:56:42 https://github.com/w3c/rdf-star-wg/issues/189 -> Issue 189 Consolidate advice about versions of RDF syntax and HTTP access (by niklasl) [needs discussion] [ms:CR]
16:56:46 ora: Let's not cancel the meeting but send a mailing list email explicitely
16:57:08 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html TallTed
16:57:53 s/explicitely/expliciting this/
16:58:00 ora: thank you everyone
16:58:38 Zakim, end meeting
16:58:38 As of this point the attendees have been TallTed, lisp, j, Tpt, AndyS, ora, gtw, niklasl, pfps, Souri, doerthe
16:58:40 RRSAgent, please draft minutes
16:58:41 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html Zakim
16:58:47 I am happy to have been of service, TallTed; please remember to excuse RRSAgent. Goodbye
16:58:48 Zakim has left #rdf-star
16:59:08 present- j
16:59:08 present+ j22
16:59:20 I have made the request to generate https://www.w3.org/2026/04/30-rdf-star-minutes.html TallTed
16:59:33 RRSAgent, bye
16:59:33 I see no action items
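
Illustration of the rule quoted at 16:31:25 and 16:32:00 (a numeric escape must not produce a code point in U+D800 to U+DFFF). This is a minimal sketch, not code from Jena, DotNetRdf or any specification: the function name decode_numeric_escapes and the allow_surrogate_pairs flag are hypothetical. The flag imitates the "MAY" alternative quoted at 16:10:26, which the group resolved not to adopt, so it is off by default. For brevity the sketch assumes the input contains only \u and \U escapes and ignores other Turtle escapes such as \\ or \n.

```python
import re

# \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

# A \u high surrogate (D800-DBFF) immediately followed by a \u low
# surrogate (DC00-DFFF): the 12 characters discussed in the meeting.
_PAIR = re.compile(
    r"\\u([Dd][89ABab][0-9A-Fa-f]{2})\\u([Dd][C-Fc-f][0-9A-Fa-f]{2})"
)


def decode_numeric_escapes(text: str, allow_surrogate_pairs: bool = False) -> str:
    """Decode \\u and \\U escapes; reject surrogate code points.

    Hypothetical helper. With allow_surrogate_pairs=True, an escaped
    surrogate pair is combined into one supplementary code point first
    (a non-standard extension); lone surrogates are still rejected.
    """
    if allow_surrogate_pairs:
        def join_pair(match: re.Match) -> str:
            high = int(match.group(1), 16)
            low = int(match.group(2), 16)
            # Standard UTF-16 surrogate pair arithmetic.
            code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
            return chr(code_point)

        text = _PAIR.sub(join_pair, text)

    def decode_one(match: re.Match) -> str:
        code_point = int(match.group(1) or match.group(2), 16)
        if 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(
                f"escape {match.group(0)} produces a surrogate code point"
            )
        if code_point > 0x10FFFF:
            raise ValueError(
                f"escape {match.group(0)} is outside the Unicode range"
            )
        return chr(code_point)

    return _ESCAPE.sub(decode_one, text)
```

With the default setting, decode_numeric_escapes(r"\uD83D\uDE00") raises ValueError, while decode_numeric_escapes(r"\U0001F600") returns the single code point U+1F600. Passing allow_surrogate_pairs=True makes the first call return U+1F600 as well, which is the parser-side leniency that was discussed and left outside the standard.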