See also: IRC log
NM: Approve minutes of 27 May?
... Regrets from John Kemp
<DKA> +1 to approving 27-may minutes
NM: Minutes of 27 May are approved
... Minutes from f2f?
<DKA> I've read most of them...
<jar> maybe 1/2 or so...
NM: Minutes from f2f of 7 June -- 9 June approved
<DKA> +1 to approve
RESOLUTION: Minutes from f2f of 7 June -- 9 June approved
NM: Minutes of 17 June?
... Minutes of 17 June approved
... Telcons of 1 July and 8 July are cancelled
... Date of next meeting: 15 July, John Kemp is expected to
scribe
... Availability tabulated in a [member-only] message to the TAG internal mailing list
... F2F at Google confirmed for 19--21 October, but hold off booking
plane tickets until review at beginning of September
<DKA> I plan to organize other meetings in Silicon Valley that week so I hope we do stand by having the f2f those dates.
LM: The most recent discussions about
IRI[bis]/LEIRIs/etc. arose originally from a concern about venues
(how many places should something about this be said)
... But once we started looking at that, some technical issues
emerged
... Some of these have begun to be addressed in IRIbis, progressing
(slowly) at IETF
... Here is my list of issues I think are on the table:
<noah> I'd like to know how Larry feels about Roy's position, which I take to be: HTML and similar "containers" don't directly contain URIs or IRIs, but rather reference strings in some document encodings; standardizing those isn't likely to be practical; let HTML5 define what it does.
<masinter> IRI -> URI via hex encoding, and then parse & reencode domain names
LM: 1) (small) -- processing of non-ASCII hostnames, e.g. http://[something in Chinese]/ -- IRIbis draft said to turn all non-ASCII chars in an IRI into hex-encoded UTF-8, and then translate the domain name back into punycode
<noah> This recommendation is embodied where? I missed that.
LM: And that seems to be a bad
approach, because the domain names didn't always get turned back
into punycode
... at the right point
LM: And the browsers don't do that
anyway, in practice
... they parse the UTF-8 and convert directly to punycode
... w/o going through hex-encoding
... Fixing this required changes to IRIbis, but maybe also the http
and ftp URI schemes, maybe LDAP too
... That work can't happen in the HTML WG
... 2) (related) Security issues around spoofing of IRIs are made
worse by the large number of homographs
... (different characters with similar glyphs)
... That's a small part of a larger issue of the visual
presentation of IRIs that also arises with more complexity when
characters from RtoL national languages are used
... Overall, getting a clear picture of the stages of processing
wrt different tasks involving IRIs is still a job that needs to be
done
... Writing ASCII URIs on the side of a bus was easy -- it's not
easy now, with combining chars and bidi
... So that's three areas where we have work to do
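As a rough illustration of the two routes LM contrasts in point 1, the sketch below converts a non-ASCII hostname to punycode both by way of hex-encoded UTF-8 and directly. It is only a sketch under assumptions: the function names and the hostname bücher.example are made up here (the minutes mention a Chinese hostname without spelling it out), and Python's built-in "idna" codec follows the older IDNA 2003 rules, which may differ from what any given browser does.

```python
from urllib.parse import quote, unquote

# Hypothetical example host; "\u00fc" is the letter u with diaeresis.
HOST = "b\u00fccher.example"


def hex_form(host: str) -> str:
    """Step 1 of the hex route: non-ASCII chars become %XX of their UTF-8 bytes."""
    return quote(host, safe="")


def host_via_hex(host: str) -> str:
    """Hex route: percent-encode first, then decode again to recover punycode."""
    return unquote(hex_form(host)).encode("idna").decode("ascii")


def host_direct(host: str) -> str:
    """Direct route: go straight from Unicode characters to punycode."""
    return host.encode("idna").decode("ascii")


intermediate = hex_form(HOST)      # what leaks out if the second step is skipped
via_hex = host_via_hex(HOST)
direct = host_direct(HOST)
```

Both routes end at the same ASCII hostname when every step runs. The hex route, though, passes through a percent-encoded intermediate that is not a usable DNS name if the conversion back does not happen at the right point.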
... And the question of who has to do it, and can we do it so that
HTML is not either overly involved or overly delayed
... ABarth change proposal [http://lists.w3.org/Archives/Public/public-html/2010Jun/0394.html]
seems more about venue than about actually doing the work
... Maastricht IETF meeting coming up might offer the opportunity
to really get moving
... but browser vendor participation is in doubt, which is worrying
. . .
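The homograph concern in point 2 can be made concrete with a small sketch. The labels and the mixed-script check below are assumptions for illustration only: the check relies on the first word of each Unicode character name, which is a crude heuristic and not a real confusable-detection algorithm.

```python
import unicodedata

REAL = "example"          # all Latin letters
SPOOF = "ex\u0430mple"    # third letter is CYRILLIC SMALL LETTER A (U+0430)


def describe(label: str) -> list:
    """List each character as 'U+XXXX NAME' so look-alikes become visible."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in label]


def mixed_script(label: str) -> bool:
    """Crude heuristic: treat the first word of the Unicode name as the script."""
    scripts = {unicodedata.name(c).split()[0] for c in label if c.isalpha()}
    return len(scripts) > 1


same_length = len(REAL) == len(SPOOF)
identical = REAL == SPOOF
spoof_is_mixed = mixed_script(SPOOF)
```

The two labels typically render alike but compare unequal, which is why presenting an IRI to a human and representing it as a character sequence are different problems.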
NM: Helpful summary, thanks
<masinter> not so much in doubt as not quite aggressive enough
NM: I'm confused by some aspects of
ABarth's change proposal which you didn't mention
... Particularly where he quotes Roy Fielding
<noah> From Roy by way of Adam:
RFC 3986 defines how to parse URIs (for recipients) and provides many rules for scheme-specific specs to define how to generate URIs of a given scheme (for producers) within the overall constraint of matching the URI syntax (the formal ABNF). Please understand that browsers almost never parse URI or IRI or anything in between. Browsers have input strings that contain one or more references, usually in the document encoding, and so there is a sequence of context-specific and charset-specific and media-type-specific processing that occurs before you even get to the individual URI-reference or IRI-reference that are defined by 3986/3987. Some people have proposed that most of that pre-processing be added to the IRIbis spec, but I have seen no evidence to suggest that such pre-processing is even remotely standardizable (it seems to be different for every input context). If you can demonstrate or get agreement on a single way to preprocess an input string, or at least a few named processes (like single-ref and multi-ref), then that would be useful.
NM: So Roy is focussing on the
strings in browser input and the processing they do
... which he doesn't think can be standardized
<Yves> [was http://lists.w3.org/Archives/Public/public-iri/2010May/0008.html ]
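For readers less familiar with the generic parsing Roy refers to, the sketch below uses the regular expression given in Appendix B of RFC 3986 to split a reference into its five components. The function name and the second input are assumptions added here. The second call shows what happens when a raw input string with leading blanks is handed to the generic parser without any context-specific preprocessing.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_reference(ref: str) -> dict:
    """Split a URI reference into the five generic components (None if absent)."""
    m = URI_REFERENCE.match(ref)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


clean = split_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related")
raw = split_reference("  http://example.org/")  # leading blanks left in place
```

In the second case the blanks end up inside the scheme component, so some step before generic parsing has to deal with them. Which spec should define that step is the question under discussion.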
NM: We have to confront that point --
either by saying "oh yes it can", or by being more careful and
splitting it into two parts
... one of which can be done globally, and one of which remains
with e.g. the HTML spec.
... So I think that makes it more than just a venue issue
<Zakim> noah, you wanted to look a little more closely at Adam's note and whether the word "venue" captures the issue
AM: LM, are those three technical points written up in more detail?
<masinter> http://tools.ietf.org/wg/iri/charters
LM: Some are to some extent in the
IRI WG mailing list issue processing, and some have been (partly)
addressed in the current IRIbis draft
... Follow your nose from the charter http://tools.ietf.org/wg/iri/charters
to drafts
<masinter> http://tools.ietf.org/wg/iri/draft-ietf-iri-3987bis/
<masinter> that is the current draft that addresses many of the issues I discussed
<noah> To clarify: I think we need to come to a more considered opinion, on the technical merits, as to which aspects of parsing belong in the HTML 5 draft (I suspect at minimum things like leading/trailing blanks and document-encoding-specific concerns), vs. parts that belong in IRIbis. WRT the latter, we need to decide whether HTML5 need hang up waiting for them to be settled.
<masinter> and http://trac.tools.ietf.org/wg/iri/trac/report/1 notes issues still open
<noah> I don't feel that I have an informed opinion on the answers, but I believe those are the questions.
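One possible reading of "document-encoding-specific concerns" is that the bytes behind a percent-encoded character depend on the charset of the containing document. The sketch below illustrates that reading only. The function name and sample value are assumptions, and nothing here is taken from the HTML5 or IRIbis drafts.

```python
from urllib.parse import quote


def encode_query_value(value: str, document_encoding: str) -> str:
    """Percent-encode a value using the byte sequence of the given charset."""
    return quote(value, safe="", encoding=document_encoding)


# The same characters, "caf" plus e-acute (U+00E9), under two document encodings.
utf8_form = encode_query_value("caf\u00e9", "utf-8")
latin1_form = encode_query_value("caf\u00e9", "iso-8859-1")
```

The two results differ, so a step like this cannot be specified without knowing the document context. That is one argument for leaving it with the containing format.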
LM: So what ABarth and RFielding are saying is partly true: some processing is context-dependent and some is generic
<masinter> http://lists.w3.org/Archives/Public/public-html/2010Feb/0882.html
LM: My proposal for HTML issue 56 and the IRIbis draft embody my answer
LM: Specifically, my proposal
includes draft rewrites of parts of the HTML5 spec. which make the
split of responsibility wrt IRIbis clear
... You could make the cut somewhere else
... but you must connect to IRIbis somewhere
<noah> Too much there for me to grok in a hurry -- any guesses as to whether it's consonant with Roy's position?
<masinter> if you go back to http://tools.ietf.org/html/draft-ietf-iri-3987bis-00
<masinter> section 7.2
<noah> This seems to be what Roy is saying is a bad bet architecturally.
LM: That describes some preprocessing
that might be appropriate for HTML alone, except that WebApps has
also proposed to adopt that
... So it shouldn't go in the HTML spec.
<jar> LM is giving evidence then against Roy's "I have seen no evidence to suggest that such pre-processing is even remotely standardizable" ?
LM: I think RFielding is arguing that there is not a second context which might share it
NM: Not sufficiently common across
different contexts to merit factoring
... He's betting against getting much value from saying "mostly do
it this way" in IRIbis
... Which links to the venue issue via scheduling -- why wait for
something being done elsewhere which you believe is not
architecturally subject to good factoring anyway
LM: We nonetheless do need to find
a good venue for the technical issues to be addressed
... I am told anecdotally of a software package with 7--9 options
for parsing IRIs, depending on the kind of IRI and the context
<noah> I think we also need to figure out what likely should have shared specs (I.e. shared between HTML and lots of other uses) vs. separate specs, before we can settle the venue question
LM: 9 is not 100, maybe it can be reduced by removing some inadvertent differences
<noah> OK, but that seems to be the discussion we need to help the community have. Looks to me like, to some degree, people are talking past each other.
LM: And if so, bringing the remainder together would make sense
NM: How can the TAG help?
<noah> Not presuming the answer is that we should do anything.
LM: There are still some problems
with IRIbis, which is a fundamental document for
(internationalizing) the Web
... Separating the presentation of IRIs to humans from the
representation of IRIs as sequences of unicode characters [for
mechanical use] is an important architectural distinction, which we
have not made in the past
... So, in particular, I invite the TAG to look at the issues still
before the IRI WG
... Comparing IRIbis, which includes a change summary, with 3987 as published would be a big
help
NM: So, the ultimate goal is to tell
the whole story of getting between a string in a text/html
representation which is delimited by double-quotes and may have
leading/trailing space, and the characters needed for a GET
request
... And getting the right layering of specs for that is looking
like a lot of people talking past each other
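The "whole story" NM describes can be sketched end to end as follows. This is one hypothetical layering, not what any spec or browser prescribes: the function name, the inputs, and the choice of which step belongs to which layer are all assumptions. The `idna` codec used for the host follows IDNA 2003.

```python
from urllib.parse import quote, urljoin, urlsplit

HTML_WHITESPACE = " \t\n\f\r"


def request_for(attr_value: str, base: str) -> tuple:
    """Turn an attribute value into (Host header value, HTTP request line)."""
    # Container-level step (assumed to belong to HTML): trim surrounding blanks.
    ref = attr_value.strip(HTML_WHITESPACE)
    # Generic step: resolve the reference against the document's base.
    parts = urlsplit(urljoin(base, ref))
    # Host goes directly to punycode, without a hex-encoded intermediate.
    host = parts.hostname.encode("idna").decode("ascii")
    # Path and query: non-ASCII chars become %XX of UTF-8; existing escapes kept.
    target = quote(parts.path or "/", safe="/%")
    if parts.query:
        target += "?" + quote(parts.query, safe="=&%")
    return host, f"GET {target} HTTP/1.1"


host, request_line = request_for(
    "  caf\u00e9.html \n", "http://b\u00fccher.example/dir/"
)
```

Each comment marks a point where the cut between HTML and IRIbis could be made differently, which is the layering question raised above.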
<masinter> the issue of permanence of URIs interacts with things like character encodings, internationalization
NM: Getting input for Maastricht would be good, but it would have to be done w/o further discussion
[no volunteers]
NM: Rather than close 448, I will REOPEN it to cause us to come back to it after Maastricht
<noah> ACTION-448?
<trackbot> ACTION-448 -- Noah Mendelsohn to schedule discussion of http://lists.w3.org/Archives/Public/public-html/2010Jun/0394.html on 15 July (followup to 24 June discussion) -- due 2010-07-13 -- OPEN
<trackbot> http://www.w3.org/2001/tag/group/track/actions/448
NM: That completes the agenda -- we
could look at frag-ids
... but TBL is involved with that, and not here
<masinter> action-382?
<trackbot> ACTION-382 -- Larry Masinter to review Web Arch web material on W3C Web Site and make proposals for changes or TAG action -- due 2010-05-31 -- OPEN
<trackbot> http://www.w3.org/2001/tag/group/track/actions/382
<jar> Can we just close action 382?
<noah> ISSUE-39?
<trackbot> ISSUE-39 -- Meaning of URIs in RDF documents -- open
<trackbot> http://www.w3.org/2001/tag/group/track/issues/39
<noah> ACTION-449?
<trackbot> ACTION-449 -- Noah Mendelsohn to schedule discussion of pushback on generic handling of fragment IDs in application/xxx+xml media types (self-assigned) -- due 2010-07-13 -- OPEN
<trackbot> http://www.w3.org/2001/tag/group/track/actions/449
<masinter> relates to new issue on MIME and the web
<noah> FWIW, I have some sympathy with the suggestion that 3023bis should call out rdf+xml in particular as an exception that's being grandfathered.
LM: Including fragid definitions in media type registrations was new, in that it wasn't needed for the email use of media types
<masinter> but svg+xml and xhtml+xml also need exceptions?
LM: And it wasn't well-architected
<masinter> application/xhtml+xml for polyglot
LM: The URI RFC says the media type registration determines fragment identifier semantics, but the media type registration guidance doesn't cover this
<Zakim> noah, you wanted to say that, in this case, media type registration is working dandy...except that there's a looming inconsistency if 3023bis goes forward
LM: There are at least three cases of conflict -- rdf+, svg+ and xhtml+
NM: Is there a problem with svg+?
YL: Not sure it's a conflict, it's an addition
NM: If a generic processor encounters it, will it do "the wrong thing"?
YL: Not sure
NM: Would you take an action?
<masinter> i think the horse is already out of the barn, and that there is no hope of generic fragment identifier handling for +xml MIME types
<jar> i tend to agree.
<noah> ACTION yves to investigate generic processing of svg+xml and XHTML+xml
<trackbot> Created ACTION-450 - Investigate generic processing of svg+xml and XHTML+xml [on Yves Lafon - due 2010-07-01].
NM: I agree with LM that the
relationship between [guidance for] media reg docs and the URI rfc
could be clarified
... but in practice the right thing has been happening
... But 3023bis should cause a red flag, if it goes forward as
written
... because it contradicts an existing registration
... So the TAG is trying to prevent that, in the first instance by
suggesting the removal of the entry for fragids in the list of
generic processing
... We've gotten strong pushback
NM: So I'd like to look again at some kind of grandfathering as an alternative approach
<noah> So, my position is: 3023bis should say "process generically, except if it's rdf+xml (or, as necessary, svg+xml, etc.)"
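The grandfathering position can be sketched as a dispatch rule. The function name, the return labels and the contents of the exception set are hypothetical; whether svg+xml or xhtml+xml belong in the set is exactly what remains open.

```python
# Assumed exception list: only rdf+xml is named outright in the position above.
GRANDFATHERED = frozenset({"application/rdf+xml"})


def fragid_processing(media_type: str, exceptions=GRANDFATHERED) -> str:
    """Say which fragment identifier rules a processor would apply."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in exceptions:
        return "grandfathered"        # follow the type's own registration
    if essence.endswith("+xml") or essence in ("application/xml", "text/xml"):
        return "generic-xpointer"     # generic XML fragid processing
    return "per-registration"         # non-XML types are outside this rule


rdf = fragid_processing("application/rdf+xml")
atom = fragid_processing("application/atom+xml; charset=utf-8")
svg_default = fragid_processing("image/svg+xml")
svg_excepted = fragid_processing(
    "image/svg+xml", exceptions=GRANDFATHERED | {"image/svg+xml"}
)
```

Every type added to the exception set is one more case a generic processor has to know about.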
<Zakim> jar, you wanted to ask how to involve Norm etc.
JR: I'm interested in the pushback; maybe, since we sort of covered the options they propose, we should summarize our analysis
NM: We could do that, but with the
new information, I'd like to start by reconsidering the whole thing
ourselves
... HST would also like to come back to this, we have new input
from outside, and from LM -- particularly wrt making clear what a
media type registration involves
... so we should come back to this in July
<jar> umm... no, not interested in pushback or defense, just wanted to help open up the discussion by making sure www-tag folks knew about all the options that we considered at the F2F. this gives people a chance to say "no I like option Z" instead of just "no I don't like the option you chose"
LM: Contrasting a position of
allowing/encouraging particular fragid handling vs. focussing on
the generic, I prefer the former
... Better to encourage new registrations to adopt the generic
processing
... of fragids, i.e. XPointer
<masinter> I think, netting out the pros and cons, the advantage lies with allowing MIME types -- even if they're +xml -- to have specific fragment ID processing.
NM: The alternative is to encourage non-generic-enabling types to not use +xml
LM: Need to net out the pros and
cons
... I'm confident that more exceptions will arise
<noah> And in particular, to note that allowing non-XPointer for +xml means that generic processors can never safely gamble on using XPointer.
NM: Or you could play with the syntax
LM: I don't see the pro for {always generic except these 3 exceptions}, because I'm confident that more exceptions will arise
<noah> ACTION-382?
<trackbot> ACTION-382 -- Larry Masinter to review Web Arch web material on W3C Web Site and make proposals for changes or TAG action -- due 2010-05-31 -- OPEN
<trackbot> http://www.w3.org/2001/tag/group/track/actions/382
LM: I can't see getting to this any time soon
JR: I have a very similar action already
ACTION-381?
<trackbot> ACTION-381 -- Jonathan Rees to spend 2 hours helping Ian with http://www.w3.org/standards/webarch/ -- due 2010-06-23 -- OPEN
<trackbot> http://www.w3.org/2001/tag/group/track/actions/381
<noah> close ACTION-382
<trackbot> ACTION-382 Review Web Arch web material on W3C Web Site and make proposals for changes or TAG action closed
NM: Adjourned
<jar> http://www.adobe.com/devnet/xmp/pdfs/DynamicMediaXMPPartnerGuide.pdf