22953 – the websocket API is missing headers access

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	HTML
Version:	unspecified
Hardware:	Other other

Importance:	P3 normal
Target Milestone:	Unsorted
Assignee:	Ian 'Hixie' Hickson
QA Contact:	contributor

URL:	http://www.whatwg.org/specs/web-apps/...
Reported:	2013-08-14 07:40 UTC by contributor
Modified:	2013-09-12 20:45 UTC
CC List:	3 users

Description contributor 2013-08-14 07:40:48 UTC

Specification: http://dev.w3.org/html5/websockets/
Multipage: http://www.whatwg.org/C#top
Complete: http://www.whatwg.org/c#top

Comment:
The websocket API is missing two elements:
1) provide additional headers when opening the websocket: this allows for
application-specific headers, or authentication headers with tokens (e.g. OAuth
etc.). Cookies are not the only headers that can be used.
In my current use case, I use a Token header to authenticate requests, and I
need to be able to authenticate them in the same manner whilst opening a
websocket.
2) give access to return headers: as defined by the RFC, the server doesn't
have to upgrade the connection but can simply return standard HTTP headers,
this allows for sending RESTful responses eg. 404, 402 (payment negotiation
for websockets) etc. Not only the response code but response headers can be
useful.
In my current use case, I use a RESTful request for creating an item (sadly
the RFC only supports GET, so it's always a GET but the other elements remain
standard). The server then decides whether a negotiation is required (our
application uses websockets for different kinds of real-time negotiations). If
no negotiation is required, it simply returns a 401 with a Location header
indicating the URI for the newly created item; otherwise it opens a websocket
for negotiation.

Comment 1 Ian 'Hixie' Hickson 2013-08-15 20:57:24 UTC

> 1) provide additional headers when opening the websocket: this allows for
> application-specific headers, or authentication headers with tokens (e.g.
> OAuth
> etc.). Cookies are not the only headers that can be used.
> In my current use case, I use a Token header to authenticate requests, and I
> need to be able to authenticate them in the same manner whilst opening a
> websocket.

Just send the data in the first WebSocket frame.
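
A minimal sketch of what this suggestion could look like, assuming a JSON envelope with hypothetical `type` and `token` fields and a hypothetical application close code of 4401 (RFC 6455 reserves 4000-4999 for private use). None of these names come from the thread or the spec. The client sends its token as the first message once the socket is open, and the server drops the connection if the first message is anything else:

```python
import json
from typing import Optional, Set

# Hypothetical application-defined close code; RFC 6455 reserves the
# 4000-4999 range for private use between applications.
CLOSE_UNAUTHORIZED = 4401


def build_auth_frame(token: str) -> str:
    """Client side: payload of the first text frame sent after the socket opens."""
    return json.dumps({"type": "auth", "token": token})


def check_first_frame(payload: str, valid_tokens: Set[str]) -> Optional[int]:
    """Server side: return None to keep the connection, or a close code to drop it."""
    try:
        message = json.loads(payload)
    except json.JSONDecodeError:
        return CLOSE_UNAUTHORIZED
    if not isinstance(message, dict) or message.get("type") != "auth":
        return CLOSE_UNAUTHORIZED
    token = message.get("token")
    if not isinstance(token, str) or token not in valid_tokens:
        return CLOSE_UNAUTHORIZED
    return None
```

The trade-off is that the connection is already established before the credentials are checked, so the server has to hold unauthenticated sockets open until the first frame arrives.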


> 2) give access to return headers: as defined by the RFC, the server doesn't
> have to upgrade the connection but can simply return standard HTTP headers,
> this allows for sending RESTful responses eg. 404, 402 (payment negotiation
> for websockets) etc. Not only the response code but response headers can be
> useful.

We can't expose cross-origin headers. That would be a security disaster. Exposing same-origin headers would be possible, but I don't see much point. Just send the data in a WebSocket frame instead.


I think though that this request in general makes the assumption that when you start a WebSocket connection, you first do HTTP and then convert to WebSocket. That's not really a good way to look at it. It's more accurate to view it as WebSocket just being its own independent protocol that just fakes looking like HTTP to confuse proxies in a useful manner, and to allow the protocol to share a port with HTTP.

(Consider, for instance, what happens when you try to do HTTP to a WebSocket server. The WebSocket server won't respond with HTTP, it'll just close the connection.)

Comment 2 Tarik Ansari 2013-08-16 03:28:02 UTC

(In reply to comment #1)
> (Consider, for instance, what happens when you try to do HTTP to a WebSocket
> server. The WebSocket server won't respond with HTTP, it'll just close the
> connection.)

Actually, the Websocket server can respond with 404 (let's say you open a chatroom which does not exist), among other things.

Having access to the headers on a same-origin request (I see why it would pose a security issue otherwise) is significantly useful, as it allows existing authentication schemes (e.g. Token:, OAuth, etc.) to be ported to the websocket handshake.

Quoting the RFC:
"Any status code other than 101 indicates that the WebSocket handshake
   has not completed and that the semantics of HTTP still apply.  The
   headers follow the status code."
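
To illustrate the quoted passage, here is a rough sketch of how a non-browser client could classify a handshake response: 101 means the connection was upgraded, and any other status is read as an ordinary HTTP response whose headers carry meaning. The function name is made up for illustration, and the parsing is deliberately simplified (no body handling, no validation of `Sec-WebSocket-Accept`, which a real client must also perform on a 101).

```python
from typing import Dict, Tuple


def parse_handshake_response(head: str) -> Tuple[bool, int, Dict[str, str]]:
    """Split a raw response head into (upgraded, status, headers)."""
    lines = head.split("\r\n")
    # The status line looks like "HTTP/1.1 404 Not Found".
    status = int(lines[0].split(" ", 2)[1])
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Only 101 completes the WebSocket handshake; every other status
    # keeps ordinary HTTP semantics.
    return status == 101, status, headers
```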

Comment 3 Ian 'Hixie' Hickson 2013-08-16 21:01:08 UTC

(In reply to comment #2)
> 
> Actually, the Websocket server can respond with 404 (let's say you open a
> chatroom which does not exist), among other things.

That would be an HTTP server responding, not a WebSocket server.


> Having access to the headers on a same-origin request (I see why it would
> pose a security issue otherwise) is significantly useful, as it allows
> existing authentication schemes (e.g. Token:, OAuth, etc.) to be ported to
> the websocket handshake.

Just put them in the first WebSocket frame instead.


> Quoting the RFC:
> "Any status code other than 101 indicates that the WebSocket handshake
>    has not completed and that the semantics of HTTP still apply.  The
>    headers follow the status code."

Yeah, I disagree with that. I think that's bogus.

Comment 4 Tarik Ansari 2013-08-16 23:18:58 UTC

(In reply to comment #3)
> (In reply to comment #2)
> > 
> > Actually, the Websocket server can respond with 404 (let's say you open a
> > chatroom which does not exist), among other things.
> 
> That would be an HTTP server responding, not a WebSocket server.

Some well-thought-out server libraries handle both HTTP and Websockets, so the line is blurry here: they offer control over whether to go ahead with the handshake or stop with an HTTP response.

Let's take a practical example. You have a web app/web API that allows connecting to chat rooms. The API has methods which are plain HTTP (e.g. list chat rooms) as well as methods which are websocket-upgradable (e.g. connect). The way I see it (and implemented it), to maintain RESTfulness the server returns a 404 (chatroom not found), 401 (auth required), or 402 (payment required), or goes ahead and opens a websocket if access is granted. This is how I currently have it implemented (the current client is iOS, and I used web sockets hoping to design a browser-accessible API). An alternative might be to query an HTTP URL first that then redirects (302) to the websocket, but this poses all sorts of state issues and adds complexity on the server side, not to mention that it breaks current standards.
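
The flow described above can be sketched roughly as follows. The room table, the token and payment checks, and the function names are all hypothetical. Only the status codes, the `Token` header mentioned earlier in this thread, and the `Sec-WebSocket-Accept` computation from RFC 6455 are taken from existing material.

```python
import base64
import hashlib
from typing import Dict, Set, Tuple

# Fixed GUID from RFC 6455, used to derive Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handle_connect(room: str, headers: Dict[str, str],
                   rooms: Dict[str, dict], tokens: Set[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, response headers) for a GET that asks to upgrade."""
    key = headers.get("Sec-WebSocket-Key")
    if key is None:
        return 400, {}  # not a valid upgrade request
    info = rooms.get(room)
    if info is None:
        return 404, {}  # chatroom not found
    token = headers.get("Token")
    if token not in tokens:
        return 401, {}  # auth required
    if info.get("paid_only") and token not in info.get("paid_tokens", set()):
        return 402, {}  # payment required
    # Access granted: complete the handshake and switch protocols.
    return 101, {
        "Upgrade": "websocket",
        "Connection": "Upgrade",
        "Sec-WebSocket-Accept": accept_value(key),
    }
```

With the browser API as specified, a script cannot tell the non-101 outcomes apart, which is the limitation this bug report is about.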

If XHR had been designed along the same line of thought, APIs would be quite messy IMHO.

Comment 5 Tarik Ansari 2013-08-20 05:37:57 UTC

Just want to add a few things on here:

1) I was going to suggest looking at Cross-origin resource sharing regarding security, until I actually noticed this is in the RFC!

2) I can see your point about why one would use headers when one can communicate through the websocket, especially in a scenario where the websocket stack is separate from the HTTP/application server, as in most current mainstream architectures (e.g. Google App Engine, Pusher, etc.), where the socket server is separate because requests are distributed by a load balancer and spawn new threads, and state is maintained by a database. And I think this is where your bias is. If you consider a stateless architecture where you have micro-distribution of resources and a core thread on each node running the application (which I think is going to become more prevalent in the future), or simply an architecture in which HTTP and Websockets are scaled together, you would see why it makes sense to share header functionality and why forgoing it doesn't.

Less incidental, but your reasoning could also be used to say let's forgo the close status codes, specifically for the applicative range. This is an approach that increases the overall overhead of the protocol for no good reason. Look at this post from Square on releasing SocketRocket and why they chose to use websockets on native mobile clients: http://corner.squareup.com/2012/02/socketrocket-websockets.html. Those are the exact same reasons we made this choice, and I believe they would agree with my position.

You are basically suggesting lowering the utility of websockets, the portability of native to/from web implementations, and integrability with other application components, and trading all that for a simplification which provides few benefits.

A better approach would be making the API consistent with the RFC (which I think was very well designed except for forgoing HTTP verbs) and finding a way to allow access to headers without exposing complexity. I don't even see this posing a high risk of implementors diverging.

Comment 6 Ian 'Hixie' Hickson 2013-08-20 17:13:46 UTC

> Less incidental, but your reasoning could also be used to say let's forgo
> the close status codes, specifically for the applicative range.

Yeah, I think the whole close status codes thing is a waste of time. That's one of the features the IETF added after they took over the protocol.


> Let's take a practical example, you have a web app/web API that allows 
> connecting to chat rooms, the API has methods which are HTTP (ie. list chat 
> rooms) as well as websocket upgradable, ie. connect

Just put it all in WebSocket. Why use both HTTP and WebSocket? That's confusing.


WebSocket is supposed to be the closest thing to TCP that we can expose to JS. It's not supposed to be a framework or architecture and so on. To the extent that it is, those are mistakes made by the IETF working group when they took over the definition of the protocol.

Comment 7 Tarik Ansari 2013-08-22 02:55:03 UTC

> Just put it all in WebSocket. Why use both HTTP and WebSocket? That's
> confusing.

Except that in the real world, applications have existing HTTP APIs and WebSocket is used to augment certain functionalities. So in practice, applications use it with HTTP both on the server and client sides.

Not everything is as simple as a chat client, although if you want to take advantage of web semantics you might still want some parts of a chat application to be HTTP requests.

> WebSocket is supposed to be the closest thing to TCP that we can expose to
> JS. It's not supposed to be a framework or architecture and so on. To the
> extent that it is, those are mistakes made by the IETF working group when
> they took over the definition of the protocol.

That ignores what made the web successful in the first place: it is not merely giving sandboxed access to resources in a browser, but the semantics of its many standards. Why do you think many applications today use REST APIs even when they have no interaction with browsers? Because the semantics were well designed and gave birth to reusable components, both on the software side (e.g. proxies, https) and the hardware side (layer 7 routers, caching appliances, etc.). Not all of it applies to websockets, but some of it does, thanks to maintaining HTTP semantics at the handshake.

This is exactly why it is natural to use websocket headers if you have already built an HTTP API, because you can reuse authentication and whatever application logic you need at the handshake. This is one of the many reasons folks are starting to use websockets in application APIs. If the browser API spec doesn't include those, the applications we are building will not be portable to browsers.

Many HTTP headers could have been forgone with that reasoning, with their information contained within HTML tags instead, but that would have made the web overall less successful and less accessible to optimizations, middleware, etc.

Comment 8 Ian 'Hixie' Hickson 2013-08-22 20:22:42 UTC

I disagree with your conclusions regarding HTTP and REST and the relationship of the latter with the success of the Web.

Comment 9 Ian 'Hixie' Hickson 2013-09-12 20:45:57 UTC

> Except that in the real world, applications have existing HTTP APIs and
> WebSocket is used to augment certain functionalities. So in practice,
> applications use it with HTTP both on the server and client sides.

If you want to use HTTP, you can use XMLHttpRequest.
If you want to use Web Sockets, you can use WebSocket.

I don't see the problem here.


> > WebSocket is supposed to be the closest thing to TCP that we can expose to
> > JS. It's not supposed to be a framework or architecture and so on. To the
> > extent that it is, those are mistakes made by the IETF working group when
> > they took over the definition of the protocol.
> 
> > That ignores what made the web successful in the first place: it is not
> > merely giving sandboxed access to resources in a browser, but the semantics
> > of its many standards.

That's not, IMHO, what made the Web successful. The Web was successful not _because_ of the many features in its standards, but despite them. Barely anyone uses those features. We've dropped tons of stuff from the standards over the years because nobody uses them. In fact we've had to fight to keep the Web using semantics when people wanted to go even more basic — e.g. <font> vs CSS.


> Why do you think many applications today use REST APIs even when they have no 
> interaction with browsers?

Because some people jump on any bandwagon if it has a catchy name. Many applications just use their own protocol over TCP/IP. Many applications jumped on even more crazy bandwagons like WebServices. There's all kinds of stuff out there.


> Because the semantics were well designed and gave birth to reusable
> components.

I disagree that this describes HTTP, but some people feel that way, sure.

That doesn't mean we should shoe-horn that into Web Sockets also.