RE: Web Request Status Codes from Jatinder Mann on 2013-04-16 (public-web-perf@w3.org from April 2013)

From: Jatinder Mann <jmann@microsoft.com>
Date: Tue, 16 Apr 2013 23:45:51 +0000
To: James Simonsen <simonjam@google.com>, "public-web-perf@w3.org" <public-web-perf@w3.org>
Message-ID: <0df10ded630b4a15be9e3ef81fe73a1b@BLUPR03MB065.namprd03.prod.outlook.com>
Given the level of detail we're considering exposing, I think it's quite reasonable to run this proposed interface by our security and privacy teams first. I will schedule a similar review with the IE Security team and get back to the working group.

Thanks,
Jatinder

From: James Simonsen [mailto:simonjam@google.com]
Sent: Thursday, April 11, 2013 2:01 PM
To: public-web-perf@w3.org
Subject: Re: Web Request Status Codes

Third party errors are absolutely off limits unless we receive explicit permission to report them. Without succeeding with the HTTP request, we don't have that permission. Otherwise, sites can figure out which bank a user uses by requesting third party resources from all of the banks and seeing which report errors.
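
As a rough illustration of the rule described above, here is a minimal sketch of an origin check that withholds error details for third-party resources unless permission was explicitly granted. The function names, the `explicit_permission` flag, and the example URLs are hypothetical; the thread does not specify how permission would be conveyed, only that it cannot be obtained when the HTTP request itself fails.

```python
from urllib.parse import urlsplit

# Default ports so that https://host/ and https://host:443/ compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def may_report_error(page_url, resource_url, explicit_permission=False):
    """Hypothetical policy: report error details only for same-origin
    resources, or when the third party has explicitly opted in."""
    if origin(page_url) == origin(resource_url):
        return True
    return explicit_permission


# A page probing a third-party bank resource learns nothing by default.
print(may_report_error("https://shop.example/", "https://bank.example/logo.png"))
```

Under this sketch, a failed DNS or TCP attempt to a third-party host never yields a report, because the opt-in could only have arrived with a successful response.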

Additionally, we are concerned that our users will be fingerprinted by malicious sites. Exposing additional information makes those attacks much easier.

I've requested review from the Chrome privacy and security teams. I don't think we should bother discussing Error Logging any further until everyone else does the same.

James

On Thu, Apr 11, 2013 at 11:44 AM, Austin,Daniel <daaustin@paypal-inc.com> wrote:
Hi James,

Thanks for the feedback. I appreciate your taking the time to look at this. However, I'm not yet convinced that there is any privacy/security concern here. My reasoning goes like this:


a)      There are a large number of companies doing this already, including Google (Analytics), Yahoo! (Roundtrip and Y! Analytics), Omniture (SiteCatalyst), Mediaplex (Analytics), Compuware/Gomez (RUM), and many others. These services regularly provide collection and transport for this same data and send it upstream, often to a 3rd party (which is worse IMHO). We're not exposing anything that others are not already doing; we're just institutionalizing it and giving the user some control. I can certainly see 304s, 200 (cache) responses, and proxies in that data. Presumably these companies' privacy policies already alert the user about all of this, and the user has provided consent by viewing the page. (This isn't an argument about right or wrong, but about current industry practice.)



b)      Users can see all of this data already, by pressing F12 or similar, so it's not concealed from the user and then exposed to others. The data isn't terribly useful to end users (unless they're performance geeks) but it's not secret.



On the cross-origin issue, I think there's something I'm not understanding. Why would cross-origin requests not be logged by the client? For this data to be useful we need to know what happened when the page loaded, regardless of the source. If I put an analytics tag in my page, for example, and it fails for some reason, I need to know about it, and omitting the error codes is the opposite of helping.



3rd party calls are very often the source of performance problems on the page, and the client, IMHO, should provide full information about everything that happens in all the HTTP request/response cycles that went into that page's composition. In today's world, nearly every page published by any commercial organization is likely to have some 3rd party content.



The more I think about this the more I think the right path is to provide detailed information for everything and be transparent about it all.



Regards,



D-



From: James Simonsen [mailto:simonjam@google.com]
Sent: Wednesday, April 10, 2013 2:58 PM
To: public-web-perf@w3.org
Subject: Re: Web Request Status Codes

Exposing HTTP status codes exposes a lot of information that hasn't been exposed before. For instance, there are codes that explicitly reveal the existence of a proxy and whether or not a resource is cached. We haven't exposed this sort of information before.

Before getting too far ahead of ourselves, I think we need to have a thorough security and privacy review about whether it's safe to expose this level of information. Otherwise, we're just wasting time discussing this.

Separately, note that the DNS and TCP (and possibly many HTTP) errors are useless for cross-origin requests, because there's no way to determine if logging is allowed.

James

On Mon, Apr 8, 2013 at 3:48 PM, Austin,Daniel <daaustin@paypal-inc.com> wrote:
Hi Team,
I've attached to this email an HTML file with the current list of Web Request Status Codes. This list includes all of the status codes that I've been able to track down, with some exceptions. There are a great many of them. Here's a breakdown of the process and the decisions I made to produce the current list:

*         Some status codes were omitted for being ridiculous (418, 420)

*         Some status codes returned by existing servers but not part of any RFC are still listed in red - I don't think they belong here (possibly with the exception of 509) but I've left them in for discussion purposes.

*         Non-HTTP status codes have been added. There are a lot of them (around 40). Since RFC 2616 clearly specifies that HTTP status codes have 3 digits, I've begun the numbering for non-HTTP status codes at 1000. These status codes are broken down by their level in the OSI stack and namespaced accordingly, e.g. 1207 SSL: Cipher Error as opposed to 1109 TCP: No route to host. There are five groups of these, namespaced as DNS:, TCP:, SSL:, HTTP:, and Client:. The HTTP: status codes are not currently included in RFC 2616 or any of the other specs, but are common errors seen by clients, e.g. 1302 HTTP: Header malformed. Perhaps 'HTTP server:' is better?

*         I've included a key to the different RFCs that contain HTTP status codes. There are 13 (!) of them, and 2 status codes are in draft proposals, linked in the document.

*         For any status code not included in RFC 2616, I've tried to provide a rationale for its existence.

*         Color codes: black = RFC 2616, blue = new for this spec or repurposed from some proprietary list, red = proprietary and doesn't belong here IMHO

*         Sources: RFC 2616, other RFCs and drafts as listed, Wikipedia, Stack Overflow, MSFT sites, Compuware/Gomez, KeyNote, Catchpoint, Nginx, Apache

*         For completeness, I've included all status codes received by the client, not just the error codes. There are several that are not in RFC 2616.

*         I took the liberty of repurposing some existing-but-nonstandard codes and renumbering them for our purposes. I've tried to indicate the source e.g. (Nginx)
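
The numbering scheme described in the list above can be sketched roughly as follows. This is only an illustration, not the attached table: the 1000 starting point, the namespace names, and the three sample codes (1109, 1207, 1302) come from this message, while the exact block boundaries (one block of 100 per namespace, with DNS at 1000 and Client at 1400) and the function names are assumptions.

```python
# Assumed block boundaries: only the 1000 starting point and the sample
# codes 1109 (TCP), 1207 (SSL) and 1302 (HTTP) are stated in the message.
NAMESPACE_RANGES = [
    (1000, 1099, "DNS"),
    (1100, 1199, "TCP"),
    (1200, 1299, "SSL"),
    (1300, 1399, "HTTP"),
    (1400, 1499, "Client"),
]


def namespace_for(code):
    """Return the namespace for a non-HTTP status code, or None."""
    for low, high, name in NAMESPACE_RANGES:
        if low <= code <= high:
            return name
    return None


def format_status(code, description):
    """Render a code the way the message writes them,
    e.g. '1207 SSL: Cipher Error'. Three-digit HTTP codes get no prefix."""
    if 100 <= code <= 999:
        return f"{code} {description}"
    name = namespace_for(code)
    if name is None:
        raise ValueError(f"status code {code} is outside the assumed ranges")
    return f"{code} {name}: {description}"


print(format_status(1109, "No route to host"))
print(format_status(1302, "Header malformed"))
```

Keeping the namespace derivable from the number means a consumer can group errors by layer without needing the full table.
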
Here are the next steps as I see them:

*         Agree on a more-or-less final list of status codes, correct any omissions or duplicates

*         Move this table into Jatinder's spec (or maybe a separate Note?)
This task took considerably more time and effort than I had expected. Who knew there were so many status codes?
Regards,
D-
Received on Tuesday, 16 April 2013 23:46:38 UTC