See also: IRC log
so, we have an issue that new TLDs are being introduced that are right-to-left
if the last label in the host-name and the first part of the path are RTL, they get visually mixed
RTL readers expect this, but it confuses everyone else (if you get a URL in an email)
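A minimal sketch of how the condition described above could be detected, assuming the host is available in its Unicode form (not Punycode). The helper names `is_rtl` and `host_path_may_mix` are hypothetical and used only for illustration; the check uses the bidi classes exposed by Python's `unicodedata` module.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # strong right-to-left bidi classes


def is_rtl(text: str) -> bool:
    """True if the first strongly directional character is right-to-left."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return True
        if bidi == "L":
            return False
    return False  # no strong character found


def host_path_may_mix(host: str, path: str) -> bool:
    """True when the last host label and the first path segment are both RTL."""
    last_label = host.rstrip(".").split(".")[-1]
    first_segment = path.lstrip("/").split("/")[0]
    return is_rtl(last_label) and is_rtl(first_segment)
```

This only flags the case raised in the discussion; it does not attempt to reproduce how a renderer would actually order the characters.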
Josh: maybe we can leverage what we learn from certs e.g. the country of origin
<timeless> the country name is visible for EV cert https://usercontent.irccloud-cdn.com/file/1lz68mcV/paypal%20ev.png
<annevk> (it's not necessarily a country, it's a legal jurisdiction)
(this only helps after you’ve visited, and people need to be able to work out whether they SHOULD visit beforehand)
CAs should not sign a cert. that uses a script which cannot reasonably be expected to be readable in the cited jurisdiction
anne: but this relies on EV certs which are problematic. they make it harder to do things securely
unfortunately, LetsEncrypt should be able to issue certs for all domains
Josh: Maybe we can signal a language somewhere?
... the hope that you can have users read URLs and be able to judge whether they are probably OK or probably not is almost an empty hope
dave: then there is the ‘visual cross-over’ problem; numerals get to cross over separators and this confuses everyone
we need to present structured text with BIDI-isolates around the structure markers (and LTR embedding around the whole thing)
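One possible reading of this proposal, sketched under assumptions: each component between separators is wrapped in a first-strong isolate (FSI ... PDI) so that the separators stay in the surrounding left-to-right context, and the whole string is wrapped in an LTR embedding (LRE ... PDF). Isolating the components rather than the markers themselves, the separator set, and the function name `isolate_for_display` are choices made for this sketch, not something agreed in the room.

```python
import re

LRE, PDF = "\u202a", "\u202c"  # left-to-right embedding / pop directional formatting
FSI, PDI = "\u2068", "\u2069"  # first-strong isolate / pop directional isolate

# Assumed set of structure markers; "://" is tried before the single characters.
SEPARATORS = re.compile(r"(://|[/?#&=@:.])")


def isolate_for_display(url: str) -> str:
    """Return a display-only string with bidi controls around each component."""
    parts = SEPARATORS.split(url)  # components at even indices, separators at odd
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)              # separator: left in the LTR context
        elif part:
            out.append(FSI + part + PDI)  # component: isolated from its neighbours
    return LRE + "".join(out) + PDF
```

The result is meant for rendering only; the control characters would have to be stripped again before the string is parsed or copied as a URL.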
note that we typically don’t display scheme names (except some banks tell their users to look for ‘https’)
jeff: basically structured text is an object and we have got in the habit of presenting it as text, and maybe that’s a mistake
<timeless> https://usercontent.irccloud-cdn.com/file/Jbwl1UlB/OS%20X%20hierarchy%20view%20of%20a%20path.png
adrian: you could write an algorithm that inserted the right overrides and isolates
... and note that the slash is a convention, not part of the spec. at all (and the “?” is only weakly so as well)
josh: we’ll probably have to say that the ‘usual’ structure separator is “/” (has the advantage of being true)
note that Microsoft uses the opposite slash
(Bartek saith)
Anne: “/” is special (think of ../.. etc. handling)
Josh: are double-width slash (and other potentially confusing characters) allowed in host-names?
<timeless> /
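To illustrate why a double-width slash is a concern, here is a small sketch that flags characters which are not ASCII delimiters themselves but fold to one under NFKC normalization (for example U+FF0F FULLWIDTH SOLIDUS). The function name and the delimiter set are assumptions for illustration; this is not the check that any host-name specification performs, and it says nothing about whether such characters are actually allowed.

```python
import unicodedata


def ascii_lookalike_delimiters(label: str, delimiters: str = "/?#@:") -> list:
    """Return characters in `label` that normalize (NFKC) to an ASCII delimiter."""
    found = []
    for ch in label:
        folded = unicodedata.normalize("NFKC", ch)
        # Skip real ASCII delimiters; report only look-alikes that fold to one.
        if ch not in delimiters and len(folded) == 1 and folded in delimiters:
            found.append(ch)
    return found
```

A label containing a fullwidth solidus would be reported, while a plain ASCII label would yield an empty list.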
dave: posted my initial anxiety attack to public-iri yesterday https://lists.w3.org/Archives/Public/public-iri/2015Oct/0000.html
(it has many mistakes in it)
anne: overrides are not disallowed in a path
josh: should they be?
anne: but typically (always?) we encode control characters
adrian: is there ever a requirement that we actually don’t encode them for rendering, but let them have an effect?
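As an illustration of the encoding anne mentions, the following sketch percent-encodes a path with Python's `urllib.parse.quote`, which turns non-ASCII characters such as U+202E RIGHT-TO-LEFT OVERRIDE into their UTF-8 percent-escapes. The wrapper name `encode_path` is hypothetical, and this is not claimed to match the exact encode sets a browser applies.

```python
from urllib.parse import quote


def encode_path(path: str) -> str:
    """Percent-encode a path, keeping "/" as the segment separator."""
    return quote(path, safe="/")


# U+202E (right-to-left override) no longer appears as a raw character.
encoded = encode_path("/a\u202eb")
```

Once encoded, the override has no effect on rendering, which is the behavior adrian's question is probing.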
dave: there is a question of course as to whether we’re ‘internationalizing’ the internet the ‘right way’ by introducing new domains in other scripts
josh: the only way for people to validate e.g. email addresses is to build a trust web (whether or not you can read the string)
... we need better messaging around this
dave: the problem spans w3c, unicode, ietf and probably some icann; it’s ugly, but we could each pick off a bit of it
jeff: note that MTAs already have some ‘drop it’ behavior (e.g. DKIM mismatches on mail purporting to come from gmail)
... sometimes policy changes are easier and/or better than trying to re-write specs to change the rules (which gets resistance)
... the dmarc spec has feedback provisions in it
<annevk> https://url.spec.whatwg.org/#url-rendering
anne: the latest URL spec. has a section on rendering; we could revisit, but we need some commitment from browsers that they’ll do it
some sense in the room that another look at this is worthwhile
but do we want best practices in the URL spec or should it be a more general document on structured text (paths, mail addresses, hostnames, URLs, URNs etc.)?
(side conversation on the selection problem)
Present: Josh_Soref, Adrian_Bateman, Bartek_Kozlowski, Dave_Singer, Jeff_Hodges
Date: 28 Oct 2015
Scribe: dsinger (inferred by scribe.perl)
Minutes URL (guessed by scribe.perl): http://www.w3.org/2015/10/28-itld-minutes.html