TAG f2f -- 04 Nov 2011

SPDY

<stpeter_> http://www.ietf.org/proceedings/80/slides/httpbis-7.pdf (SPDY slides from IETF 80)

NM: [introduces the TAG to Mike Belshe]

MB: I was at Google until 2 months ago, worked on the Chrome team, we started SPDY around 2008
... Google focus on performance, so interested in protocol speedup
... Using the existing mechanisms in HTTP was just gnarly
... So we started experimenting in the lab with doing something of our own
... but based on a lot of prior work in a lot of areas
... SPDY is beginning to spread -- Firefox have started some work
... to date we've owned it, published an informal spec., some unit tests
... Firefox adoption possibility has pushed us towards standardization
... Interop guarantee is necessary before we can move forward

MB: Roberto Peon is the person at Google who is on point for SPDY now
... We looked at taking this to the IETF, and that took me to PStA

NM: We got into this also in part because of our contact with Jim Gettys over the buffer bloat issue

MB: State of play -- Google Chrome now using SPDY for all SSL traffic to sites that advertise support of SPDY (see about:net-internals)
... All Google SSL "properties" now do advertise SPDY ... Firefox is implementing it
... Amazon Kindle Fire recently announced that they will be using SPDY
... A number of other less big names involved, implementations in Python and Ruby, etc.
... Main parts of SPDY:

Multiplexing;
Compression;
Prioritization;
[Maybe] Server push

MB: Primary focus is on improving page-load latency

MB: Browser use of HTTP and TCP involves various attempts to game the TCP expected behaviour, particularly in the area of multiple connections
... SPDY tries to avoid this by addressing multiplexing, with prioritization, directly
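
A minimal sketch of the idea MB describes here: several logical streams share one connection, and the sender picks the next chunk by priority instead of opening more connections. The function name `multiplex`, the chunk size, and the convention that a lower number means more urgent are assumptions of this sketch, not details taken from the discussion or from the SPDY framing format.

```python
import heapq
from itertools import count


def multiplex(streams, chunk_size=4):
    """Interleave several payloads onto one logical connection.

    streams maps stream_id -> (priority, payload bytes).
    Yields (stream_id, chunk) in the order the chunks would be sent.
    Assumption: a lower priority number is more urgent.
    """
    seq = count()  # tie-breaker so equal-priority streams take turns
    heap = [(prio, next(seq), sid, 0) for sid, (prio, _) in streams.items()]
    heapq.heapify(heap)
    while heap:
        prio, _, sid, offset = heapq.heappop(heap)
        payload = streams[sid][1]
        chunk = payload[offset:offset + chunk_size]
        yield sid, chunk
        offset += len(chunk)
        if offset < len(payload):
            # Re-queue behind other streams of the same priority.
            heapq.heappush(heap, (prio, next(seq), sid, offset))


# Hypothetical page: the HTML is urgent, a stylesheet and an image are not.
page = {
    1: (0, b"HTMLHTML"),
    3: (1, b"cssAcssB"),
    5: (1, b"jpegjpeg"),
}
frames = list(multiplex(page))
```

In this run the two HTML chunks go first, after which streams 3 and 5 alternate chunk by chunk, so a large image cannot starve a stylesheet of the same priority.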

<Yves> problem in doing multiplexing at the spdy level (or httpmux earlier) is the bad interaction that might happen between the tcp window size and the chunk size at l7.

MB: Google research found that when there are packet losses, having two connections is a real win
... Multiplexing uses fewer connections which overall simplifies things
... Performance == minimal latency

NM: Trying to get this via multiple connections did seem to make things worse

MB: Wins with NAT as well
... Two connections is already recommended in HTTP 1.1

<Yves> note that the number of // connections in http has been removed in httpbis (as it was not reflecting the reality of things)

MB: When that got multiplied by separate hosts for e.g. js and jpg, and then 2 per went to 6 per, suddenly we were up to 12 --- 18 connections for a single page

TBL: TCP will back off and reroute if things get stuck

MB: We took a serious approach to this
... The average number of hosts hit per page is 8, rising to 9
... and the size of things retrieved has grown too
... Browsers are trading off resource fetching against page load performance
... SPDY was trying to take that optimization off of browsers' backs

<Yves> having mux helps against multiple tcp conn and badly implemented http pipelining.

MB: Important because the difficulty of modelling web pages has grown enormously
... JS, CSS have execution times, so building benchmarks for web page latency is very tricky
... Didn't want to just use Chrome, because it already has a set of decisions built in
... But building a platform from scratch for benchmarking was too big a job
... So we did in the end build plugins for Chrome to benchmark SPDY and HTTP side-by-side
... Very glad that we have Firefox now doing their own similar work
... First three goals above are parallel to HTTP -- GET etc.
... Server push is different/new, which requires client- and server-level rework
... We tried some experimental services using push
... The cache is an issue
... You have to understand the service/application detail, and you only get rid of one round-trip
... Doesn't look like there's enough energy

HT: There are apps that make sense to build if you don't have to busy wait.

MB: which is kind of what we have today

HT: In HTTP 1.1 you can hang on a get.

NM: Yes, comet.

HT: But there were too many glitches.

MB: Yes -- GoogleDocs does mutual refresh by hanging gets
... So if you have 30 GoogleDocs open, there's a problem with the limit of 6 connections per host
... So GoogleDocs fakes it with multiple hosts :-(
... SPDY multiplexing is enough to fix this
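
The arithmetic behind that workaround, as a rough sketch: each hanging GET holds one connection, so under a per-host cap the application needs enough host names to cover every open document. The function name, and the simplification that hanging GETs are the only traffic, are assumptions made for illustration.

```python
import math


def hosts_needed(hanging_gets: int, per_host_limit: int = 6) -> int:
    """Minimum number of distinct host names needed so that every
    hanging GET can hold its own connection under a per-host cap."""
    if hanging_gets <= 0:
        return 0
    return math.ceil(hanging_gets / per_host_limit)


# 30 open documents with a limit of 6 connections per host.
print(hosts_needed(30))  # -> 5
```

With multiplexing, each hanging GET becomes one stream on a shared connection, so a single host name is enough.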

TBL: Shared editing experience everywhere would be really good

MB: Hanging get over SPDY is cheap, and does the job, so Server Push doesn't seem so urgent
... Server push is server-initiated

MB: Why SSL? Well, SPDY doesn't require SSL
... First choice we had to make was TCP or UDP -- TCP, to save hassle
... So, what port? 80 or 443 -- pipelining not really there for 80, problems with proxies, slowly getting sorted out

NM: Problem would be that new stuff would confuse proxies if it went through port 80, right?

MB: Right
... So we went to 443
... And in any case, we came to feel that securing the Web was independently a good thing

[Various]: Certificates are broken, how to secure the web is a huge issue, maybe not in scope today

LM: What about IPv6?

TBL: For stuff that doesn't need to be secured, it's very tempting to get to a P2P solution.
... But some things, say NYT front page or a TED video, which are truly public
... must that be destined for P2P, or some other architecture?

MB: You're right, we recognize that not everything should go this way
... But operators are not very good at recognizing the difference
... Even the front-page of the NYT is not a simple case, if it's personalized to you on the basis of personal (private) info, then that is important to keep secure

<Yves> encrypting everything is a double edged sword...

MB: The Google China experience made us all very sensitive to the fact that personal data can in fact be a matter of life and death, and you never know where it's going to turn up

<timbl> (ages ago .. TBL: Peter, you said when 1 TCP connection has losses and slows down it is faster to add a second connection. Of course, when there is congestion adding more connections adds to the congestion, which on a large scale when everyone does it once, could have overall very detrimental effect on latency for everyone else. So one should simulate or measure the effect of doing this to everyone on the net )

MB: The whole security layer needs improvement
... both in terms of security as such, and in terms of speed
... What about proxies?

<noah> I think this will drive ubiquitous inclusion of SSL acceleration in hardware, something I've thought for a long time would be a good enabling step

MB: On the back end, inside firewall, use SPDY w/o SSL

MB: Proxies are a good thing, and SPDY doesn't have a story about how to play nice with proxies
... SPDY does not address cacheable secured content
... But everyone is using Content Distribution Networks [CDNs], which have largely overtaken proxies for many large operators
... But this lack is a weakness for SPDY

<noah> I think it's really large organizations that are deploying CDNs...you prejudice against the long tail when you assume that everything accessed from distant locations is sourced by a large organization like CNN or Google

LM: We need an analysis of what the impact of not being able to cache actually is

MB: Right -- what's the impact in aggregate -- even though there are clear cases where it loses on an individual basis

NM: For the original pre-CDN world, your ISP got you pages that started a long way away quickly
... That won't work with SPDY

LM: Right, so that's why some more global measurement and analysis is required

MB: And of course any SSL use today is already not proxied
... With SNI, you can see the hostname, but not in vanilla SSL, which makes virtual hosting difficult
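
A small sketch of what "you can see the hostname" means in practice, using Python's standard `ssl` module. The handshake is started against in-memory buffers, with no real server, and the first bytes the client would send are inspected. The host name `example.org` and the helper name are placeholders. The sketch assumes a TLS library that sends SNI by default and does not encrypt the ClientHello.

```python
import ssl


def client_hello_bytes(hostname: str) -> bytes:
    """Return the bytes a TLS client would send first for this host name.

    No network is used: the handshake runs against in-memory buffers
    and stops as soon as the client is waiting for a server reply.
    """
    ctx = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: there is no server to answer
    return outgoing.read()


hello = client_hello_bytes("example.org")
# The host name travels unencrypted in the SNI extension, so an
# on-path box can route or filter on it without decrypting the payload.
sni_visible = b"example.org" in hello
```

This is what lets a front end pick the right certificate for a virtual host before any encrypted data is exchanged.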

<Yves> java doesn't have SNI as well :(

LM: Corporations may not be happy with the loss of filtering capability that follows from ubiquitous SSL usage

PL: Moving to signed content is another important avenue to look at, necessary for peer to peer failover for http

MB: There are a lot of horror stories out there from big sites about proxy badnesses
... SSL removes that vulnerability
... The mobile operators have this lose-lose tradeoff between idiosyncratic compression (fast but potentially bad on the device) vs. not (slow but reliable on the device)
... Patrick McManus of Firefox has looked at some numbers
... 83 connections for the NYT home page puts really bad pressure on NAT
... But the NAT thing cuts both ways -- dependence on a single channel makes NAT dropout more noticeable/serious

MB: Speculation about Kindle Fire -- you could push the multiplexing out to the Amazon connection point at EC2
... So that all traffic goes via a single connection (over 443) from the Kindle customer
... This appears to contradict the end-to-end story SSL demands
... Requires the notion of trusted proxy -- SSL man-in-the-middle
... So, an explicit proxy: Kindle to EC2, or anyone to their corporate firewall
... Yes, there is a potential for head-of-line blocking, which can amplify in certain cases
... But overall we are still winning
... It's difficult to model this, you have to collect empirical data
... No doubt that with multiple streams, you are more vulnerable
... I think there are some TCP tweaks that can help, we're working on it
... [Stuff about 'slow startup' which scribe didn't get]

NM: Adding another stream to SPDY doesn't allow cross-stream fixup, right?

MB: Yes
... SPDY does in general fix the head-of-line blocking problem
... Firefox guys have been trying to get pipelining working better, but it's really hard
... They presented at IETF last year [ref?]
... We were pushed to start all the way down at the packet protocol level, but resisted
... We think there's a lot of room to optimize on top of TCP, and we'll only look downwards after that's worked through

DKA: So standardization -- What does "take this to IETF in 2012" mean in detail? Should the TAG stay involved, and if so why -- in what way does it impact on Web Arch?

NM: Yes, I think TAG should stay involved, but we should discuss this on a telcon

<noah> ACTION: Noah to schedule discussion of how, if at all, TAG should continue to be involved with SPDY [recorded in http://www.w3.org/2001/tag/2011/11/04-minutes.html#action01]

<trackbot> Created ACTION-626 - Schedule discussion of how, if at all, TAG should continue to be involved with SPDY [on Noah Mendelsohn - due 2011-11-11].

TBL: At the Web level, things such as Content type, the Link header, things like that
... Does SPDY change that? Where are HTTP headers?

MB: Almost entirely untouched
... So the framing layer is pretty much all that changes
... We looked at 'improving' some aspects of HTTP -- absolutizing all URIs
... but there are some servers which don't support it

TBL: Absolute URIs can indeed be problematic

LM: Does HTTP 1.1 really require support?

[Various]: Yes

MB: Net-net on that -- we backed off doing anything like that
... Not yet sure how we go to IETF, exploring that with PStA

LM: SPDY sounds extremely immature to me -- the impact of this on a wide scale, outside Chrome<->Google servers, is just unknown
... W3C used to have resources in this area, it would be good to have W3C involved in taking something like this forward
... Without any guarantee that the outcome will be much like SPDY
... But I think a prerequisite for standardization is more exploration of the requirements and consequences

NM: W3C should do this? My sense is that we've been happy with IETF taking the lead

LM: Well, comparing two protocols wrt 'page load latency' is a W3C area
... 'Doing this' will involve a lot of different tasks - "What does the Web require in terms of optimization" is a W3C issue, almost by definition

TBL: Yes

NM: So, push back on taking it to IETF?

TBL: No, makes sense to do the protocol there, as LM said
... Low-level question -- Can the client change its mind about priority?

MB: Change-priority is not supported, but tab-change in the browser might provoke us to rethink that
... Note that the priorities are advisory, the server can do whatever it likes

<timbl> chrome://net-internals/#spdy

MB: Wrt standardization and changes -- putting it out there means Google understands that other perspectives are now needed, and will lead to changes

NM: Need to stop, so thanks to Mike
... the TAG will come back to this -- who do we feed back to?

<noah> http://groups.google.com/group/spdy-dev

<noah> Send email to this group: spdy-dev 'at' googlegroups.com

HST: I think the W3C IETF liaison needs to be aware of this, and help us decide where the W3C needs to be involved

NM: Suspended until 1050

Publishing and Linking on the Web

NM: Resuming

<noah> http://www.w3.org/2001/tag/doc/publishingAndLinkingOnTheWeb-2011-10-27.html

DKA: [introduces the above doc]
... Questions for Rigo -- Is the overall goal sensible/useful; Are there other terms we should add, e.g. 'performance'; Are there other regulatory/policy issues we should add to the framing?

LM: Great start, ready to figure out what the next steps are
... To be useful, this has to be published in a way that gets community consensus
... How do we get there

<masinter> the TAG could publish it as a NOTE and start a community group?

<jar> but we wanted this to go rec track.

<jar> Thinh [Nguyen] says effectiveness much enhanced by rec status

NM: Maybe we should schedule detailed review of this at some length at the January F2F

<noah> ACTION: Noah to schedule very detailed line-by-line review of Pub&Linking draft at January F2F [recorded in http://www.w3.org/2001/tag/2011/11/04-minutes.html#action02]

<trackbot> Created ACTION-627 - Schedule very detailed line-by-line review of Pub&Linking draft at January F2F [on Noah Mendelsohn - due 2011-11-11].

LM: Can we get the TAG out of the critical path?

LM: Community group, which might have some lawyers in it

RW: The scope of this document is too large

<noah> I would prefer to focus on linking

RW: Publishing is a nightmare, and linking even more so
... Legal tactic is to partition as much as possible
... Reduce this, make two documents

<jar> jar to rw: this is not a legal document

DKA: Start with linking?

RW: Yes

RW: Look at www.linksandlaw.com

<jar> linksandlaw.com - I've given out the pointer a few times

<noah> http://www.linksandlaw.com/linkingcases.htm

RW: there are a very large number of relevant cases

LM: Would you come to the TAG for help if you were tasked to do so?

RW: Your first document, to get the story clear, is not for the world

TBL: Why not for the world?

RW: If you talk too much, your central message goes away [tip of the hat to TLR]

NM: Advice could focus on "if ... then ..." heads-up kinds of observations

RW: There is a long-standing conflict between the technical and legal communities

RW: In particular a conflict over ownership of terms
... There are already laws, with terminology definitions

<masinter> I think we should take what we have and see how we can make it useful....

LM: I just want to make this useful enough to publish
... Suppose this is just an outline of what the TAG understands in this space
... And accepting that we can't resolve the conflicts
... Could we point out where the conflicts are?

RW: If you asked me, I would try to come up with a concise statement of the fact that publication implies the possibility of linking to
... See e.g. the KPMG example from linksandlaw: http://www.linksandlaw.com/linkingcases-linkingpolicies.htm#KPMG

<jar> oops, I had promised to write something about "don't link" terms of use

<jar> look at the american airlines site (whose URL I can't give you)

RW: Not stop this document, to get our own understanding clear
... but also publish simple short observations that are at the core

DKA: But that's a legal statement

NM: We set out to avoid making policy statements
... We can't state that publishing gives a right to others to link

RW: Not a right, but an expectation

TBL: You're preaching to the choir, but how do we say this?

RW: We can distinguish between linking itself and the existence of access control

NM: [Hypothetical KPMG example]
... How do we make it a technical observation, not a policy one

RW: The fact that you include a pointer to something on the Web in your document has no meaning for the content over there, and is completely unrelated to the thing identified

TBL: If you can from the KPMG home page browse to another page, I should be able to pass that link to someone else
... The UK position appears to be that publishing a link collection to pirate music sites is to be an accessory to copyright theft
... We could try reciprocal banning, by notifying KPMG that they are not allowed to read the W3C site :-)

JAR: Wrt what RW and LM said, the original idea, from Thinh Nguyen, was that it would be useful, to forestall bad decisions, if there was a document that simply stated what the technical community thinks these terms mean.
... Thinh went on to say that to get the necessary impact, it needs to go out as a REC
... So it doesn't try to argue with the law, it just says what the technical understanding of these terms is

<jar> session with Thinh: http://www.w3.org/2001/tag/2010/12/02-minutes.html#item01 ...

DKA: So we started from there, and the fact that a URI is a public identifier, to get to the parallel between speech acts and URI use, and that brings in the 'right to link', parallel to free speech
... from which we got drawn in to the distinction between linking and embedding
... And getting embedding clear requires us to get the publishing/hosting distinction clear
... Because it isn't clear that a page with text and video involves multiple sources

RW: The legal side knows about this

DKA: So, maybe we don't need all of that?

NM: We're getting different advice from different legal sources. . .

AM: Alternatively, maybe we should not publish a TAG document, but something in the popular space, e.g. the NYT over Tim's byline

<masinter> so we should make a short statement around which we can get consensus, focused on one small issue around linking and copyright

<masinter> at least that's what Rigo is advocating

RW: The TAG endorsement of a short statement about the passive linking case which is that URIs are constitutive of the Web and publishing a page with a URI creates a legitimate expectation of linkability

AM: If we did a short statement, how do we get it out? To make an impact?

<masinter> the current document is useful for us, but way too broad.

RW: The publication channel of the TAG itself should be ordinary, which has our opinion

LM: It could be a finding

RW: And then you go to the NYT and say "The TAG has said. . ."

NM: If the W3C/the membership/TBL want to say [a quasi-hortatory claim about the right to link], that's fine, but it's not for the TAG to do so
... The TAG does architecture

TBL: But the architecture has a fundamental social component

<masinter> i think this can be a finding, making a statement about the architectural assumption of the web and underlying many of the TAG's other activities is that linking is fundamental, that providing URIs for content is recommended, exactly to promote linking

TBL: Spam is a social violation of the architecture
... So I think the TAG can speak on this subject in social terms

NM: Yes, I'm comfortable with talking about social value, and the impact of policy on value

TBL: Forbidding incoming links breaks the [social] system

NM: But there are clearly (jurisdictional bounded) cases where linking to unacceptable material is itself unacceptable

LM: We haven't said anything about laws

TBL: But we are close to that in saying KPMG are doing something wrong

RW: Wrt DKA's point, using the free speech analogy is going further than I would

<masinter> we have designed the system such that linkability is a benefit. Attempts to restrict linkability is counter to the effective use of the system as designed.

RW: I would focus on the passive case: it's wrong for sites to publish rules which forbid linking to them

<masinter> we don't have to say "the law is a bad idea", we have to point out the negative consequences of such laws

<masinter> i would still like to see Jeni/Dan's document published as a note after review, even if we also have a finding about common use of linking and the TAG's position on it

<jar> Thinh Nguyen: default rules are important in court... may be supplied by technical standards (that was Thinh last December)

LM: Trying to avoid value judgements, but just say what the consequences of doing so would be

<jar> The consequences speak for themselves. Generally speaking.

NM: I want to focus on the consequences of laws which enable behaviours

HST: That's much narrower than I hear from JAR and LM, who are happy to discuss consequences of actions, not just laws (or not laws at all)

NM: Look at the KPMG case -- it's going to get legal very quickly
... Immediately the question will arise as to whether their statement is enforceable by law

RW: As Thinh Nguyen said, the legal interpretation will be based on usage, on community habits

RW: Is the expectation in practice that certain things hold?
... So for example the attempt to smuggle in obligations via shrink-wrap was surprising, contrary to expectation, and so not supported by the courts

RW: Similarly the expectation is that publishing something on the Web gives the possibility of linking

<masinter> it gives the possibility of linking, and we've encouraged that

JAR: I actually do think it makes sense for a technical person to say that "operating under such-and-such a restriction would have the following [bad] consequences", so I'm not ruling out laws

<noah> +1 to what Jonathan just said

<masinter> i'm trying to distinguish between "such restrictions are bad" and "negative consequences of such restrictions are X, Y, Z; such restrictions are bad if they do not have clear redeeming value"

<masinter> the web was designed for X, it was built with this assumption

HST: OK, I see a continuum, was a mistake to try to push for two distinct positions

DKA: Saying "restricting linking will have a detrimental effect on the Web" is weak from a legal perspective -- The lawyer will say "Not my problem"

<noah> What I want to rule out, mostly, is: "law XXX should not be passed". I would rather say: "if you pass law XXX, you should understand that the consequences to the operation of the system, and to its positive social value will be YYY"

<masinter> restrictions on linking are impossible to accomplish because of web architecture? search, robots.txt, harvesting, ...?

<masinter> we should be clearer about who the primary audience is for this finding

<masinter> maybe a blog post?

DKA: We're not asking for unrestricted freedom to link, but no less freedom than the freedom of speech

NM: The speech parallel is weak, because URIs don't point to consistent things necessarily
... I prefer the address analogy

NM: Consider the address of [a prohibited organization] in a public works document about street repairs versus in a list of recommended destinations

NM: So consider a log of URIs versus a Web page which references one

LM: Summarizing RW -- the scope of this document is too broad, you should find a few one-page extracts

DKA: I'll take this back to Jeni and consider all of the input we've gotten
... And decide whether to take this forward broadly but internally
... or whether we can pare it down effectively

NM: There are clearly different opinions about:
... 1) What the goal of the document is;
... 2) What its scope is.
... We should acknowledge the lack of consensus, and maybe the divergence of advice we're getting

RW: Pursuing (w/o publishing) this document, will improve the value of TAG utterances in the future
... What you should say is that the passive case, "I'm a site, and I forbid linking", is wrong
... not the active "I have a right to link"
... The latter gets quickly extremely messy

HST: I'm not sure sending the editors off to work harder when the TAG hasn't agreed on scope isn't an invitation to waste time

DKA: I wasn't going to dive right in to cut the document down -- I want to work with all the feedback we've gotten this week, particularly on Wednesday, and that's where I want to focus

LM: So a new draft is worth working on if it yields something we can publish

NM: Remember Goals and Success Criteria -- we should keep these (http://www.w3.org/2001/tag/products/PublishingLinking.html) in mind
... And consider revising them too

<noah> ACTION: Appelquist with help from Jeni to propose changes to goals, success criteria etc. for publishing/linking product page [recorded in http://www.w3.org/2001/tag/2011/11/04-minutes.html#action03]

<trackbot> Created ACTION-629 - With help from Jeni to propose changes to goals, success criteria etc. for publishing/linking product page [on Daniel Appelquist - due 2011-11-11].

LM: Right -- for example split the work between a small 'official' publication and a larger background unofficial 'white paper'

NM: Thank you Rigo

3023bis -- Media type registration for the XML family

CL: I've mailed a summary of recent progress on 3023bis to www-tag
... Previous draft deprecated text/xml
... Implementors pushed back, as we have to support it even if new authors don't use it
... I took this to IETF80, got lots of interaction
... HTTPbis then removed the default charset handling rules, which is consistent with applications today
... But that left email out of sync with HTTP
... But it now appears there is willingness to fix the text/... story to allow individual text/ media types to declare their own charset rules
... So text/xml can say there is no default charset and the XML spec rules determine [which will apply to email as well as HTTP]
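
One way to read the rule CL describes, sketched in Python: with no protocol-level default, the encoding of a text/xml body comes from the XML spec's own signals (byte-order mark, then encoding declaration, then the UTF-8 default). The function name is made up. Letting an explicit charset parameter win over the in-document signals is an assumption of this sketch, not something the minutes settle.

```python
import codecs
import re

# Encoding declaration in an ASCII-compatible XML declaration.
_XML_DECL = re.compile(
    rb"""<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def xml_encoding(body: bytes, charset_param=None) -> str:
    """Pick an encoding for an XML body without any media-type default.

    Order used here (an assumption): explicit charset parameter,
    byte-order mark, XML encoding declaration, then UTF-8.
    """
    if charset_param:
        return charset_param.lower()
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    match = _XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # the XML spec's default when nothing is declared


declared = xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>')
undeclared = xml_encoding(b"<a/>")
```

The point of the sketch is that the same function gives the same answer whether the body arrived over HTTP or in email, since neither transport contributes a default.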

LM: So, what is in the way of publication?

CL: On this front, nothing, but the fragid situation is still pending

JAR: xhtml+xml can have RDFa in it

HST: What's the problem with that?

JAR: The case is this -- you have a URI with a fragid, and you want to follow your nose
... If you look at the 3023bis registration, it says XPointer tells you the semantics
... so you look for an ID, and don't find one
... So there's an error (per XPointer)

HST: The no new syntax move, for SVG:
... if syntax is not a valid xpointer, then defer to the specific media type reg
... the xhtml media type has to say something about RDFa
... and 3023 needs an override policy
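
A rough sketch of the deferral rule HST describes for SVG: a fragment identifier that is syntactically an XPointer gets generic processing, and anything else is handed to the specific media type's own rules. The regular expressions only approximate the XPointer Framework grammar (ASCII names, no parenthesis balancing), and the function name, the `type_specific` set and the example fragment `xywh=0,0,100,100` are all hypothetical.

```python
import re

# Approximations of the two XPointer Framework forms (assumption:
# ASCII-only names; the real grammar uses NCName and balanced parens).
_NAME = r"[A-Za-z_][A-Za-z0-9._-]*"
SHORTHAND = re.compile(_NAME)                       # e.g. "intro"
SCHEME_BASED = re.compile(r"(?:" + _NAME + r"\(.*\)\s*)+")  # e.g. "element(/1/2)"


def classify_fragid(fragid, media_type, type_specific):
    """Say who gets to interpret a fragment identifier.

    type_specific is the set of media types whose registrations define
    fragment syntax of their own (hypothetical input to this sketch).
    """
    if SHORTHAND.fullmatch(fragid) or SCHEME_BASED.fullmatch(fragid):
        return ("generic-xpointer", fragid)
    if media_type in type_specific:
        return (media_type, fragid)  # defer to the specific registration
    return ("error", fragid)


own_rules = {"image/svg+xml"}
generic = classify_fragid("intro", "image/svg+xml", own_rules)
deferred = classify_fragid("xywh=0,0,100,100", "image/svg+xml", own_rules)
```

A purely syntactic test like this does not cover the case JAR raises: an RDFa fragment can be a well-formed shorthand pointer that matches no ID, which is why an override policy is still needed.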

JAR: The new piece for me was realizing that we need to also talk about application/xhtml+xml

HST: 3023bis says it is the definitive spec for fragids
... what it needs to do is to say under what circumstances it defers to the subsidiary media type reg

<masinter> why is this on critical path for 3023bis ?

ACTION Henry to work with Chris Lilley to bring forward prose for 3023bis wrt generic processing of fragment identifiers which addresses the rdf+xml and xhtml+xml issues

<trackbot> Created ACTION-628 - Work with Chris Lilley to bring forward prose for 3023bis wrt generic processing of fragment identifiers which addresses the rdf+xml and xhtml+xml issues [on Henry Thompson - due 2011-11-11].

ACTION-628 due 2011-12-31

<trackbot> ACTION-628 Work with Chris Lilley to bring forward prose for 3023bis wrt generic processing of fragment identifiers which addresses the rdf+xml and xhtml+xml issues due date now 2011-12-31

<noah> That's fine, Henry, assuming you'll also put in the cover email a bit of framing to remind something about the history and context for whatever is proposed

LM: Isn't this a general point about mixins

<masinter> use of RDFa as a mixin applies to more things, like JSON

LM: Can't we get some cleaner layering?

NM: In the interest of getting 3023bis out, can't we do this locally first, in 3023bis, on the assumption that it will be consistent with any cleaner general solution?

LM: Getting mixins right is likely to be a huge problem, you're going to get stuck in a tarpit

CL: You may be right, we'll see

LM: The way to get follow-your-nose for mixins requires you to modify the host-language media type definitions systematically

HST: In my recent email I distinguish between "witting" and "unwitting" host languages
... The unwitting case is when an XML language (e.g. XML Schema, XSLT, XML Signatures) allows any attribute on any element, as long as it's in a namespace that's not theirs

HST: The witting cases work easily, but the unwitting are important. They allow mixins without explicitly naming them in their specs, so they can't be expected to change the media type registration

<masinter> are there other mixins other than RDFa that add fragment identifier possibilities?

<masinter> can i mix-in SVG into something and use SVG visible objects as fragment targets?

HST: 'Unwitting' embedding of RDFa can't trigger a change to a media type registration

LM: Would unwitting embedding of e.g. SVG introduce the possibility of using SVG-style fragids which only work in the SVG sub-part?

NM: There is at least some discussion of that case in the Self-describing Web finding

TBL: This comes under the XML functions story as well

LM: So, go ahead, but be aware that there be dragons

Web Storage

AM: There is a Web Storage Last Call WD just out
... We need to decide whether we want to comment

Admin

NM: RDFa vs. Microdata will require our attention wrt HTML WG process by mid-January, we will return to this
