W3C

- DRAFT -

iTLDs & presentation of them

28 Oct 2015

See also: IRC log

Attendees

Present
Josh_Soref, Adrian_Bateman, Bartek_Kozlowski, Dave_Singer, Jeff_Hodges
Regrets
Chair
dsinger
Scribe
dsinger

so, we have an issue that new TLDs are being introduced that are right-to-left

if the last label in the host-name and the first part of the path are RTL, they get visually mixed

RTL readers expect this, but it confuses everyone else (if you get a URL in an email)
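
A minimal sketch of how the condition described above (last host-name label and first path segment both RTL) might be detected. It assumes that the direction of a component is decided by its first strongly directional character; the helper names `is_rtl` and `rtl_boundary` are hypothetical and were not discussed in the meeting.

```python
import unicodedata
from urllib.parse import urlsplit

# Strong right-to-left bidi classes (Hebrew-like "R", Arabic-like "AL").
RTL_CLASSES = {"R", "AL"}


def is_rtl(text: str) -> bool:
    """Return True if the first strongly directional character is RTL."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return True
        if bidi == "L":
            return False
    return False  # no strong character found; treated as not RTL (assumption)


def rtl_boundary(url: str) -> bool:
    """Return True if the last host label and first path segment are both RTL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    last_label = host.rstrip(".").rsplit(".", 1)[-1]
    first_segment = parts.path.lstrip("/").split("/", 1)[0]
    return is_rtl(last_label) and is_rtl(first_segment)
```

A URL for which `rtl_boundary` returns True is one where the two RTL runs are likely to appear visually merged or reordered in plain-text display.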

Josh: maybe we can leverage what we learn from certs, e.g. the country of origin

<timeless> the country name is visible for EV cert https://usercontent.irccloud-cdn.com/file/1lz68mcV/paypal%20ev.png

<annevk> (it's not necessarily a country, it's a legal jurisdiction)

(this only helps after you’ve visited, and people need to be able to work out whether they SHOULD visit beforehand)

CAs should not sign a cert. that uses a script which cannot reasonably be expected to be readable in the cited jurisdiction

anne: but this relies on EV certs, which are problematic; they make it harder to do things securely

unfortunately, LetsEncrypt should be able to issue certs for all domains

Josh: Maybe we can signal a language somewhere?
... the hope that you can have users read URLs and be able to judge whether they are probably OK or probably not is almost an empty hope

dave: then there is the ‘visual cross-over’ problem; numerals get to cross over separators and this confuses everyone

we need to present structured text with BIDI-isolates around the structure markers (and LTR embedding around the whole thing)
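
A rough sketch of one reading of this proposal: each component between structure markers is wrapped in a first-strong isolate (U+2068 ... U+2069), so the markers themselves stay in a left-to-right context, and the whole string is wrapped in an LTR embedding (U+202A ... U+202C). The function name `display_form` is hypothetical. As simplifications, the scheme and fragment are omitted, the host is split only on ".", and the query is treated as a single component.

```python
from urllib.parse import urlsplit

FSI, PDI = "\u2068", "\u2069"  # FIRST STRONG ISOLATE, POP DIRECTIONAL ISOLATE
LRE, PDF = "\u202a", "\u202c"  # LEFT-TO-RIGHT EMBEDDING, POP DIRECTIONAL FORMATTING


def isolate(component: str) -> str:
    """Wrap a non-empty component in a first-strong isolate."""
    return f"{FSI}{component}{PDI}" if component else component


def display_form(url: str) -> str:
    """Build a display string with isolated components and an LTR wrapper."""
    parts = urlsplit(url)
    host = ".".join(isolate(label) for label in parts.netloc.split("."))
    path = "/".join(isolate(segment) for segment in parts.path.split("/"))
    shown = host + path
    if parts.query:
        shown += "?" + isolate(parts.query)
    return f"{LRE}{shown}{PDF}"
```

The result is meant for display only; the added control characters would have to be stripped again before the string is used as an address.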

note that we typically don’t display scheme names (except some banks tell their users to look for ‘https’)

jeff: basically structured text is an object and we have got in the habit of presenting it as text, and maybe that’s a mistake

<timeless> https://usercontent.irccloud-cdn.com/file/Jbwl1UlB/OS%20X%20hierarchy%20view%20of%20a%20path.png

adrian: you could write an algorithm that inserted the right over-rides and isolates
... and note that the slash is a convention, not part of the spec. at all (and the “?” is only weakly so as well)

josh: we’ll probably have to say that the ‘usual’ structure separator is “/” (has the advantage of being true)

note that Microsoft uses the opposite slash

(Bartek saith)

Anne: “/” is special (think of ../.. etc. handling)

Josh: are double-width slashes (and other potentially confusing characters) allowed in host-names?
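
A small sketch of how slash look-alikes in a host-name string could be flagged. It does not answer whether such characters are allowed; it only reports them. The list of look-alikes is illustrative and incomplete, and the function name `slash_lookalikes` is hypothetical.

```python
import unicodedata

# Illustrative, incomplete set of characters that resemble "/".
SLASH_LOOKALIKES = {
    "\uff0f",  # FULLWIDTH SOLIDUS
    "\u2044",  # FRACTION SLASH
    "\u2215",  # DIVISION SLASH
    "\u29f8",  # BIG SOLIDUS
}


def slash_lookalikes(host: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each slash-like character in host."""
    found = []
    for index, ch in enumerate(host):
        # Also catch anything that NFKC normalization folds to an ASCII slash.
        folds_to_slash = ch != "/" and unicodedata.normalize("NFKC", ch) == "/"
        if ch in SLASH_LOOKALIKES or folds_to_slash:
            found.append((index, unicodedata.name(ch, "UNKNOWN")))
    return found
```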

<timeless> /

dave: posted my initial anxiety attack to public-iri yesterday https://lists.w3.org/Archives/Public/public-iri/2015Oct/0000.html

(it has many mistakes in it)

anne: overrides are not disallowed in a path

josh: should they be?

anne: but typically (always?) we encode control characters
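
A minimal sketch of that kind of encoding, restricted to bidi control characters so that other non-ASCII text in the path stays readable. The set of characters and the function name `encode_bidi_controls` are assumptions made for illustration; this is not what any particular browser does.

```python
from urllib.parse import quote

# Bidi control characters to percent-encode (assumed set for this sketch).
BIDI_CONTROLS = frozenset(
    "\u061c\u200e\u200f"              # ALM, LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def encode_bidi_controls(path: str) -> str:
    """Percent-encode bidi controls as UTF-8 bytes; leave other characters alone."""
    return "".join(quote(ch) if ch in BIDI_CONTROLS else ch for ch in path)
```

With this, a right-to-left override in a path shows up as the visible text `%E2%80%AE` instead of silently reordering what follows it.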

adrian: is there ever a requirement that we actually don’t encode them for rendering, but let them have an effect?

dave: there is a question of course as to whether we’re ‘internationalizing’ the internet the ‘right way’ by introducing new domains in other scripts

josh: the only way for people to validate e.g. email addresses is to build a trust web (whether or not you can read the string)
... we need better messaging around this

dave: the problem spans W3C, Unicode, IETF and probably some ICANN; it’s ugly, but we could each pick off a bit of it

jeff: note that MTAs already have some ‘drop it’ behavior (e.g. DKIM mismatches on mail purporting to come from gmail)
... sometimes policy changes are easier and/or better than trying to re-write specs to change the rules (which gets resistance)
... the DMARC spec has feedback provisions in it

<annevk> https://url.spec.whatwg.org/#url-rendering

anne: the latest URL spec. has a section on rendering; we could revisit, but we need some commitment from browsers that they’ll do it

some sense in the room that another look at this is worthwhile

but do we want best practices in the URL spec or should it be a more general document on structured text (paths, mail addresses, hostnames, URLs, URNs etc.)?

(side conversation on the selection problem)

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2015/10/28 02:59:07 $
