15:59:01 RRSAgent has joined #hcls
15:59:01 logging to https://www.w3.org/2022/03/10-hcls-irc
15:59:07 rrsagent, make logs public
15:59:11 Meeting: FHIR RDF
15:59:14 Chair: David Booth
16:03:28 Topic: RDF lists
16:04:10 jim: Looking into OWL API, trying to understand how the parser works. No prognosis yet.
16:04:25 Present: David Booth, Dagmar, Jim Balhoff, Gaurav Vaidya
16:04:41 jim: Creating a test now.
16:05:14 Topic: Concept IRIs
16:05:44 dbooth: Heard back from Martin Durst. No standard algorithm for percent-encoding Unicode strings to append to IRIs.
16:06:51 [[
16:06:55 Anyway, the characters you need to escape are essentially in two categories:
16:06:55 - ASCII characters that are syntactically relevant in URIs and IRIs.
16:06:55 These are called reserved characters. Overall, they are all ASCII
16:06:55 characters except letters, digits, and "-" / "." / "_" / "~".
16:06:56 (https://datatracker.ietf.org/doc/html/rfc3987#section-2.2)
16:06:56 (If you do a careful analysis on where your Unicode strings can
16:06:58 end up in the IRI, you may be able to leave more characters as-is.
16:07:00 As an example, you may be able to leave in a ":" if you know
16:07:02 that you will always have a full IRI with http: or some such
16:07:04 at the start.)
16:07:06 - Any Unicode codepoints (where there might or might not be characters)
16:07:08 that you want to leave out for one reason or another. In RFC 3987,
16:07:10 we on purpose didn't restrict this too much. But you definitely don't
16:07:12 want surrogates (0xD800-0xDFFF), because these are only used in pairs
16:07:14 in UTF-16. We also excluded private-use characters (except for query
16:07:16 parts) and so-called non-characters (0xYZFFFE, 0xYZFFFF). You can
16:07:20 exclude more, for example unassigned code points,... But it's probably
16:07:22 better to exclude these altogether rather than to allow them when
16:07:24 they are percent-encoded.
16:07:26 As for text for a standard, please have a look through
16:07:28 https://datatracker.ietf.org/doc/html/rfc3987#section-3
16:07:30 where you probably can find quite a few pieces (but you will have to select and put them together yourself).
16:07:32 ]]
16:08:58 Gaurav: I started drafting an algorithm.
16:09:55 eric: We're trying to see the future. If people only use flat name spaces, they can. But it seems likely that someone may want to introduce hierarchy, and we may want to help support that.
16:12:03 eric: I think we only want to escape chars that either produce an invalid IRI or ____ .
16:12:21 ... Unicode chars should never include surrogates anyway.
16:12:43 ... Do we allow dots? Slashes? Colons?
16:13:04 ... Hashes are out because they have a special meaning.
16:15:12 gaurav: In first draft, suggested everything outside of iquery should be percent-encoded.
16:17:22 dbooth: Do we want to do a direct concatenation of stemIRI with percent-encoding of the code?
16:18:04 eric: I think so. Considered relative URL resolution, but don't need it.
16:18:50 gaurav: Yes.
16:20:28 Jim, dagmar: yes
16:20:49 dbooth: Do we want to place any restrictions on stemIRI other than being a valid IRI?
16:21:13 eric: fragID is not sent to server when an HTTP request is sent.
16:21:41 ... I think we should allow users to engineer their URI space how they want.
16:22:21 s/valid absolute IRI/
16:26:24 dbooth: Should the absolute IRI be required to have a slash (after the iauthority) to prevent the code from changing the apparent domain name?
16:26:37 ... Feels like a security risk if we don't.
16:31:11 dbooth: If the stemIRI is something like https://hl7.org (with no slash) and the code is .evil-hacker.com/
16:31:21 ... then that could be a security risk.
16:34:19 eric: Could say that if the scheme is a URL scheme, then the stemIRI must contain a slash after the iauthority.
16:39:42 dbooth: Should we have this restriction?
16:40:36 eric: yes, but it would have to be English text.
16:41:23 dbooth: We should also list currently known URL schemes.
16:41:53 gaurav: Wikidata uses a template, with $1 placeholder.
16:50:38 AGREED: Gaurav will draft re-write
16:51:59 dbooth: directly concatenate stemIRI + percentEncode(code) ?
16:52:07 eric: Agreed.
16:53:29 gaurav: Might want the percentEncode function to depend on the stemIRI -- whether it is in the query string, the fragID, etc.
17:00:25 gaurav: Should we percent-encode everything that is not in ifragment?
17:02:38 ACTION: gaurav to draft algorithm to percent-encode everything but ifragment chars, and show us corner cases or cases we might want to reconsider.
17:02:51 ADJOURNED
17:02:56 rrsagent, draft minutes
17:02:56 I have made the request to generate https://www.w3.org/2022/03/10-hcls-minutes.html dbooth
17:03:02 TallTed has joined #hcls
17:04:07 i/ADJOURNED/gaurav: I might be out next week.
17:08:57 Present+ EricP
17:09:01 rrsagent, draft minutes
17:09:01 I have made the request to generate https://www.w3.org/2022/03/10-hcls-minutes.html dbooth
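
Illustrative sketch (not part of the log, and not the algorithm Gaurav was asked to draft): the "stemIRI + percentEncode(code)" concatenation discussed above could look roughly like the Python below. The function names percent_encode, check_stem_iri and concept_iri are hypothetical. The sketch assumes the most conservative rule from the quoted message, escaping everything except ASCII letters, digits, and "-" / "." / "_" / "~", rather than the looser "everything but ifragment chars" rule in the ACTION. It also assumes the slash restriction applies to any stemIRI that has an authority, instead of consulting a list of known URL schemes.

```python
from urllib.parse import quote, urlsplit


def percent_encode(code: str) -> str:
    """Escape every character except ASCII letters, digits, and "-._~".

    Assumption: non-ASCII characters are emitted as percent-encoded UTF-8
    bytes. A lone surrogate cannot be encoded as UTF-8, so quote() raises
    UnicodeEncodeError for it instead of producing output.
    """
    return quote(code, safe="")


def check_stem_iri(stem_iri: str) -> None:
    """Reject stems where the appended code could alter the apparent host."""
    parts = urlsplit(stem_iri)
    if not parts.scheme:
        raise ValueError("stemIRI must be an absolute IRI")
    # Assumption: the rule is applied whenever an authority is present.
    # "https://hl7.org" has an empty path, so a code such as
    # ".evil-hacker.com/" would extend the host name.
    if parts.netloc and not parts.path.startswith("/"):
        raise ValueError("stemIRI must contain a slash after the authority")


def concept_iri(stem_iri: str, code: str) -> str:
    """Directly concatenate the stem with the percent-encoded code."""
    check_stem_iri(stem_iri)
    return stem_iri + percent_encode(code)


if __name__ == "__main__":
    print(concept_iri("https://hl7.org/fhir/sid/", "a b/c"))
    try:
        concept_iri("https://hl7.org", ".evil-hacker.com/")
    except ValueError as err:
        print("rejected:", err)
```

Because "." is left unescaped under this rule, the slash check is what blocks the https://hl7.org + .evil-hacker.com/ case raised by dbooth. A percentEncode that varies with where the code lands in the stemIRI (query string, fragID, etc.), as gaurav suggested, is not covered here.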