W3C

HTML 5

A vocabulary and associated APIs for HTML and XHTML


7 Communication

7.1 Event definitions

Messages in server-sent events, Web sockets, cross-document messaging, and channel messaging use the message event.

The following interface is defined for this event:

interface MessageEvent : Event {
  readonly attribute any data;
  readonly attribute DOMString origin;
  readonly attribute DOMString lastEventId;
  readonly attribute WindowProxy source;
  readonly attribute MessagePort messagePort;
  void initMessageEvent(in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in any dataArg, in DOMString originArg, in DOMString lastEventIdArg, in WindowProxy sourceArg, in MessagePort messagePortArg);
  void initMessageEventNS(in DOMString namespaceURI, in DOMString typeArg, in boolean canBubbleArg, in boolean cancelableArg, in any dataArg, in DOMString originArg, in DOMString lastEventIdArg, in WindowProxy sourceArg, in MessagePort messagePortArg);
};

The initMessageEvent() and initMessageEventNS() methods must initialize the event in a manner analogous to the similarly-named methods in the DOM3 Events interfaces. [DOM3EVENTS]

The data attribute represents the message being sent.

The origin attribute represents, in server-sent events and cross-document messaging, the origin of the document that sent the message (typically the scheme, hostname, and port of the document, but not its path or fragment identifier).

The lastEventId attribute represents, in server-sent events, the last event ID string of the event source.

The source attribute represents, in cross-document messaging, the WindowProxy of the browsing context of the Window object from which the message came.

The messagePort attribute represents, in cross-document messaging and channel messaging, the MessagePort being sent, if any.

Unless otherwise specified, when the user agent creates and dispatches a message event in the algorithms described in the following sections, the lastEventId attribute must be the empty string, the origin attribute must be the empty string, the source attribute must be null, and the messagePort attribute must be null.

7.2 Server-sent events

This section describes a mechanism for allowing servers to dispatch DOM events into documents that expect them. The eventsource element provides a simple interface to this mechanism.

7.2.1 The RemoteEventTarget interface

Any object that implements the EventTarget interface must also implement the RemoteEventTarget interface.

[NoInterfaceObject, ImplementedOn=EventTarget] interface RemoteEventTarget {
  void addEventSource(in DOMString src);
  void removeEventSource(in DOMString src);
};

When the addEventSource(src) method is invoked, the user agent must resolve the URL specified in src, relative to the first script's base URL, and if that succeeds, add the resulting absolute URL to the list of event sources for that object. The same URL can be registered multiple times. If the URL fails to resolve, then the user agent must raise a SYNTAX_ERR exception.

When the removeEventSource(src) method is invoked, the user agent must resolve the URL specified in src, relative to the first script's base URL, and if that succeeds, remove the resulting absolute URL from the list of event sources for that object. If the same URL has been registered multiple times, removing it must remove only one instance of that URL for each invocation of the removeEventSource() method. If the URL fails to resolve, the user agent does nothing.
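
The following non-normative sketch models the list of event sources in JavaScript. The class name EventSourceList and its baseURL parameter are hypothetical (the latter stands in for the first script's base URL), a plain SyntaxError stands in for the SYNTAX_ERR exception, and URL resolution is delegated to the standard URL constructor, whose rules may differ in detail from the ones this specification relies on.

```javascript
// Hypothetical model of the per-object "list of event sources".
class EventSourceList {
  constructor(baseURL) {
    this.baseURL = baseURL; // stands in for "the first script's base URL"
    this.sources = [];      // absolute URLs; duplicates are allowed
  }

  // Returns the absolute URL as a string, or null if resolution fails.
  resolve(src) {
    try {
      return new URL(src, this.baseURL).href;
    } catch (e) {
      return null;
    }
  }

  addEventSource(src) {
    const url = this.resolve(src);
    if (url === null) {
      // Stand-in for the SYNTAX_ERR exception named in the text.
      throw new SyntaxError('SYNTAX_ERR: cannot resolve ' + src);
    }
    this.sources.push(url); // the same URL may be registered repeatedly
  }

  removeEventSource(src) {
    const url = this.resolve(src);
    if (url === null) return; // unresolvable URL: do nothing
    const index = this.sources.indexOf(url);
    if (index !== -1) this.sources.splice(index, 1); // remove one instance only
  }
}
```

Registering the same URL twice and removing it once leaves a single entry in the list, which is the behaviour the removeEventSource() paragraph describes.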

7.2.2 Connecting to an event stream

Each object implementing the EventTarget and RemoteEventTarget interfaces has a list of event sources that are registered for that object.

When a new absolute URL is added to this list, the user agent should queue a task to run the following steps with the new absolute URL:

  1. If the entry for the new absolute URL has been removed from the list, then abort these steps.

  2. Fetch the resource identified by that absolute URL.

    As data is received, the tasks queued by the networking task source to handle the data must consist of following the rules given in the following sections.

When an event source is removed from the list of event sources for an object, if that resource is still being fetched, then the relevant connection must be closed.

Since connections established to remote servers for such resources are expected to be long-lived, UAs should ensure that appropriate buffering is used. In particular, while line buffering may be safe if lines are defined to end with a single U+000A LINE FEED character, block buffering or line buffering with different expected line endings can cause delays in event dispatch.

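Each event source in the list must have associated with it the following:

  * A reconnection time, in milliseconds. This must initially be a user-agent-defined value.

  * A last event ID string. This must initially be the empty string.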

In general, the semantics of the transport protocol specified by the URLs for the event sources must be followed, including HTTP caching rules.

For HTTP connections, the Accept header may be included; if included, it must contain only formats of event framing that are supported by the user agent (one of which must be text/event-stream, as described below).

Other formats of event framing may also be supported in addition to text/event-stream, but this specification does not define how they are to be parsed or processed.

Such formats could include systems like SMS-push; for example, servers could use Accept headers and HTTP redirects to an SMS-push mechanism as a kind of protocol negotiation to reduce network load in GSM environments.

User agents should use the Cache-Control: no-cache header in requests to bypass any caches for requests of event sources.

If the event source's last event ID string is not the empty string, then a Last-Event-ID HTTP header must be included with the request, whose value is the value of the event source's last event ID string.
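
As a non-normative illustration, the request headers described above could be assembled as in the following sketch. The function name eventSourceRequestHeaders is hypothetical, and the sketch assumes a user agent that supports only the text/event-stream framing and chooses to send the optional Accept header.

```javascript
// Hypothetical helper: builds the HTTP request headers for an event source.
function eventSourceRequestHeaders(lastEventId) {
  const headers = {
    // Optional; if sent, it must list only supported event framing formats.
    'Accept': 'text/event-stream',
    // Asks intermediate caches to be bypassed.
    'Cache-Control': 'no-cache',
  };
  // Only sent when the last event ID string is not the empty string.
  if (lastEventId !== '') {
    headers['Last-Event-ID'] = lastEventId;
  }
  return headers;
}
```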

For connections to domains other than the document's domain, the semantics of the Access-Control HTTP header must be followed. [ACCESSCONTROL]

HTTP 200 OK responses with a Content-Type header specifying the type text/event-stream that are either from the document's domain or explicitly allowed by the Access-Control HTTP headers must be processed line by line as described below.

For the purposes of such successfully opened event streams only, user agents should ignore HTTP cache headers, and instead assume that the resource indicates that it does not wish to be cached.

If such a resource (with the correct MIME type) completes loading (i.e. the entire HTTP response body is received or the connection itself closes), the user agent should request the event source resource again after a delay equal to the reconnection time of the event source. This doesn't apply for the error cases that are listed below.

HTTP 200 OK responses that have a Content-Type other than text/event-stream (or some other supported type), and HTTP responses whose Access-Control headers indicate that the resource is not to be used, must be ignored.

HTTP 204 No Content and 205 Reset Content responses must be treated as if they were 200 OK responses with the right MIME type but no content, and should therefore cause the user agent to refetch the resource after a delay equal to the reconnection time of the event source.

Other HTTP response codes in the 2xx range must be treated like HTTP 200 OK responses for the purposes of reopening event source resources. They are, however, likely to indicate an error has occurred somewhere and may cause the user agent to emit a warning.

HTTP 300 Multiple Choices responses should be handled automatically if possible (treating the responses as if they were 302 Found responses pointing to the appropriate resource), and otherwise must be treated as HTTP 404 responses.

HTTP 301 Moved Permanently responses must cause the user agent to reconnect using the new server-specified URL instead of the previously specified URL for all subsequent requests for this event source. (It doesn't affect other event sources with the same URL unless they also receive 301 responses, and it doesn't affect future sessions, e.g. if the page is reloaded.)

HTTP 302 Found, 303 See Other, and 307 Temporary Redirect responses must cause the user agent to connect to the new server-specified URL, but if the user agent needs to again request the resource at a later point, it must return to the previously specified URL for this event source.

HTTP 304 Not Modified responses should be handled like HTTP 200 OK responses, with the content coming from the user agent cache. A new request should then be made after a delay equal to the reconnection time of the event source.

HTTP 305 Use Proxy, HTTP 401 Unauthorized, and 407 Proxy Authentication Required should be treated transparently as for any other subresource.

Any other HTTP response code not listed here or network error (e.g. DNS errors) must be ignored.

For non-HTTP protocols, UAs should act in equivalent ways.
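
The response handling rules above can be summarised, non-normatively, as a single decision function. In the sketch below the function name, its parameters and the returned labels are all hypothetical; the sameDomainOrAllowed flag stands in for the Access-Control check, which is not modelled, and text/event-stream is assumed to be the only supported framing.

```javascript
// Returns a hypothetical label naming the action that the rules call for.
function classifyEventSourceResponse(status, contentType, sameDomainOrAllowed = true) {
  // Compare only the MIME type, ignoring any parameters such as charset.
  const mime = (contentType || '').split(';')[0].trim().toLowerCase();
  const usable = mime === 'text/event-stream' && sameDomainOrAllowed;

  // 204 and 205 behave like an empty 200 response with the right MIME type.
  if (status === 204 || status === 205) return 'reconnect-after-delay';
  // 200, and other 2xx codes treated like 200.
  if (status >= 200 && status <= 299) return usable ? 'process-stream' : 'ignore';
  if (status === 300) return 'pick-choice-or-treat-as-404';
  // 301: use the new URL for all later requests of this event source.
  if (status === 301) return 'redirect-and-remember';
  // 302, 303, 307: use the new URL now, the old one for later requests.
  if (status === 302 || status === 303 || status === 307) return 'redirect-once';
  if (status === 304) return 'process-cached-then-reconnect';
  if (status === 305 || status === 401 || status === 407) return 'handle-as-subresource';
  // Any other response code is ignored.
  return 'ignore';
}
```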

7.2.3 Parsing an event stream

This event stream format's MIME type is text/event-stream.

The event stream format is as described by the stream production of the following ABNF, the character set for which is Unicode. [ABNF]

stream        = [ bom ] *event
event         = *( comment / field ) end-of-line
comment       = colon *any-char end-of-line
field         = 1*name-char [ colon [ space ] *any-char ] end-of-line
end-of-line   = ( cr lf / cr / lf / eof )
eof           = < matches repeatedly at the end of the stream >

; characters
lf            = %x000A ; U+000A LINE FEED
cr            = %x000D ; U+000D CARRIAGE RETURN
space         = %x0020 ; U+0020 SPACE
colon         = %x003A ; U+003A COLON
bom           = %xFEFF ; U+FEFF BYTE ORDER MARK
name-char     = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
                ; a Unicode character other than U+000A LINE FEED, U+000D CARRIAGE RETURN, or U+003A COLON
any-char      = %x0000-0009 / %x000B-000C / %x000E-10FFFF
                ; a Unicode character other than U+000A LINE FEED or U+000D CARRIAGE RETURN

Event streams in this format must always be encoded as UTF-8.

Lines must be separated by either a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF) character, or a single U+000D CARRIAGE RETURN (CR) character.

7.2.4 Interpreting an event stream

Bytes or sequences of bytes that are not valid UTF-8 sequences must be interpreted as the U+FFFD REPLACEMENT CHARACTER.

One leading U+FEFF BYTE ORDER MARK character must be ignored if any are present.

The stream must then be parsed by reading everything line by line, with a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair, a single U+000A LINE FEED (LF) character, a single U+000D CARRIAGE RETURN (CR) character, and the end of the file being the four ways in which a line can end.

When a stream is parsed, a data buffer and an event name buffer must be associated with it. They must be initialized to the empty string.

Lines must be processed, in the order they are received, as follows:

If the line is empty (a blank line)

Dispatch the event, as defined below.

If the line starts with a U+003A COLON character (':')

Ignore the line.

If the line contains a U+003A COLON character (':')

Collect the characters on the line before the first U+003A COLON character (':'), and let field be that string.

Collect the characters on the line after the first U+003A COLON character (':'), and let value be that string. If value starts with a single U+0020 SPACE character, remove it from value.

Process the field using the steps described below, using field as the field name and value as the field value.

Otherwise, the string is not empty but does not contain a U+003A COLON character (':')

Process the field using the steps described below, using the whole line as the field name, and the empty string as the field value.

Once the end of the file is reached, the user agent must dispatch the event one final time, as defined below.

The steps to process the field given a field name and a field value depend on the field name, as given in the following list. Field names must be compared literally, with no case folding performed.

If the field name is "event"

Set the event name buffer to the field value.

If the field name is "data"

If the data buffer is not the empty string, then append a single U+000A LINE FEED character to the data buffer. Append the field value to the data buffer.

If the field name is "id"

Set the event stream's last event ID to the field value.

If the field name is "retry"

If the field value consists of only characters in the range U+0030 DIGIT ZERO ('0') to U+0039 DIGIT NINE ('9'), then interpret the field value as an integer in base ten, and set the event stream's reconnection time to that integer. Otherwise, ignore the field.

Otherwise

The field is ignored.

When the user agent is required to dispatch the event, then the user agent must act as follows:

  1. If the data buffer is an empty string, set the data buffer and the event name buffer to the empty string and abort these steps.

  2. If the event name buffer is not the empty string but is also not a valid NCName, set the data buffer and the event name buffer to the empty string and abort these steps.

  3. Otherwise, create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is cancelable, and has no default action. The data attribute must be set to the value of the data buffer, the origin attribute must be set to the Unicode serialization of the origin of the event stream's URL, and the lastEventId attribute must be set to the last event ID string of the event source.

  4. If the event name buffer has a value other than the empty string, change the type of the newly created event to equal the value of the event name buffer.

  5. Set the data buffer and the event name buffer to the empty string.

  6. Queue a task to dispatch the newly created event at the RemoteEventTarget object to which the event stream is registered. The task source for this task is the remote event task source.

If an event doesn't have an "id" field, but an earlier event did set the event source's last event ID string, then the event's lastEventId field will be set to the value of whatever the last seen "id" field was.

The following event stream, once followed by a blank line:

data: YHOO
data: -2
data: 10

...would cause an event message with the interface MessageEvent to be dispatched on the eventsource element, whose data attribute would contain the string YHOO\n-2\n10 (where \n represents a newline).

This could be used as follows:

<eventsource src="http://stocks.example.com/ticker.php"
              onmessage="var data = event.data.split('\n'); updateStocks(data[0], data[1], data[2]);">

...where updateStocks() is a function defined as:

function updateStocks(symbol, delta, value) { ... }

...or some such.

The following stream contains four blocks. The first block has just a comment, and will fire nothing. The second block has two fields with names "data" and "id" respectively; an event will be fired for this block, with the data "first event", and will then set the last event ID to "1" so that if the connection died between this block and the next, the server would be sent a Last-Event-ID header with the value "1". The third block fires an event with data "second event", and also has an "id" field, this time with no value, which resets the last event ID to the empty string (meaning no Last-Event-ID header will now be sent in the event of a reconnection being attempted). Finally the last block just fires an event with the data "third event". Note that the last block doesn't have to end with a blank line; the end of the stream is enough to trigger the dispatch of the last event.

: test stream

data: first event
id: 1

data: second event
id

data: third event

The following stream fires just one event:

data

data
data

data:

The first and last blocks do nothing, since they do not contain any actual data (the data buffer remains at the empty string, and so nothing gets dispatched). The middle block fires an event with the data set to a single newline character.

The following stream fires two identical events:

data:test

data: test

This is because the space after the colon is ignored if present.
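
The line processing and dispatch steps of this section can be sketched, non-normatively, as follows. The sketch assumes that the whole stream is available as an already decoded string rather than arriving incrementally, it returns the events in an array instead of queuing tasks, and it approximates the NCName check with an ASCII-only pattern. The function name parseEventStream and the shape of its return value are hypothetical.

```javascript
// Parses a complete, already decoded event stream and returns the events
// that would be dispatched. All names here are illustrative.
function parseEventStream(text) {
  const events = [];
  let dataBuffer = '';
  let eventNameBuffer = '';
  let lastEventId = '';
  let reconnectionTime = null; // null: no valid "retry" field seen

  // ASCII-only approximation of the NCName check (an assumption of this sketch).
  const looksLikeNCName = (s) => /^[A-Za-z_][A-Za-z0-9_.-]*$/.test(s);

  function dispatchEvent() {
    if (dataBuffer !== '' &&
        (eventNameBuffer === '' || looksLikeNCName(eventNameBuffer))) {
      events.push({
        type: eventNameBuffer === '' ? 'message' : eventNameBuffer,
        data: dataBuffer,
        lastEventId: lastEventId,
      });
    }
    // The buffers are reset whether or not an event was created.
    dataBuffer = '';
    eventNameBuffer = '';
  }

  function processField(name, value) {
    if (name === 'event') {
      eventNameBuffer = value;
    } else if (name === 'data') {
      if (dataBuffer !== '') dataBuffer += '\n';
      dataBuffer += value;
    } else if (name === 'id') {
      lastEventId = value;
    } else if (name === 'retry') {
      if (/^[0-9]+$/.test(value)) reconnectionTime = parseInt(value, 10);
    }
    // Any other field name is ignored.
  }

  // Ignore one leading byte order mark, then split on CRLF, LF, or CR.
  const lines = text.replace(/^\uFEFF/, '').split(/\r\n|\r|\n/);
  // A terminator at the very end leaves an empty trailing piece that is not a line.
  if (lines[lines.length - 1] === '') lines.pop();

  for (const line of lines) {
    const colon = line.indexOf(':');
    if (line === '') {
      dispatchEvent();
    } else if (colon === 0) {
      // Comment line: ignore.
    } else if (colon > 0) {
      let value = line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1); // drop one leading space
      processField(line.slice(0, colon), value);
    } else {
      processField(line, ''); // whole line is the field name
    }
  }
  dispatchEvent(); // end of file
  return { events, lastEventId, reconnectionTime };
}
```

Calling parseEventStream('data:test\n\ndata: test\n') yields two events whose data is "test". Because the sketch follows the "data" step literally, a line feed is appended only when the data buffer is already non-empty.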

7.2.5 Notes

Legacy proxy servers are known to, in certain cases, drop HTTP connections after a short timeout. To protect against such proxy servers, authors can include a comment line (one starting with a ':' character) every 15 seconds or so.
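
A non-normative sketch of how an author's server-side code might serialize events and emit such comment lines is given below. The names formatEvent, KEEP_ALIVE_COMMENT and startKeepAlive are hypothetical, as is the write callback, which stands for whatever function sends text to the client on the open connection.

```javascript
// Serializes one event in the text/event-stream format (hypothetical helper).
function formatEvent(fields) {
  let out = '';
  if (fields.event !== undefined) out += 'event: ' + fields.event + '\n';
  // Each line of the payload becomes its own "data" field.
  for (const line of String(fields.data).split(/\r\n|\r|\n/)) {
    out += 'data: ' + line + '\n';
  }
  if (fields.id !== undefined) out += 'id: ' + fields.id + '\n';
  return out + '\n'; // the blank line triggers dispatch on the client
}

// A comment line: ignored by the client, but it keeps the connection active.
const KEEP_ALIVE_COMMENT = ': keep-alive\n';

// Writes a comment line at a fixed interval; returns the timer so that the
// caller can stop it with clearInterval() when the connection closes.
function startKeepAlive(write, intervalMs = 15000) {
  return setInterval(() => write(KEEP_ALIVE_COMMENT), intervalMs);
}
```

The single space written after each colon is the one that the client removes when parsing, so payload lines that themselves begin with a space survive unchanged.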

Authors wishing to relate event source connections to each other or to specific documents previously served might find that relying on IP addresses doesn't work, as individual clients can have multiple IP addresses (due to having multiple proxy servers) and individual IP addresses can have multiple clients (due to sharing a proxy server). It is better to include a unique identifier in the document when it is served and then pass that identifier as part of the URL in the src attribute of the eventsource element.

Implementations that support HTTP's per-server connection limitation might run into trouble when opening multiple pages from a site if each page has an eventsource to the same domain. Authors can avoid this using the relatively complex mechanism of using unique domain names per connection, or by allowing the user to enable or disable the eventsource functionality on a per-page basis.

7.3 Web sockets

7.3.1 Introduction

This section is non-normative.

To enable Web applications to maintain bidirectional communications with server-side processes, this specification introduces the WebSocket interface.

This interface does not allow for raw access to the underlying network. For example, this interface could not be used to implement an IRC client without proxying messages through a custom server.

An introduction to the client-side and server-side of using the direct connection APIs.

7.3.2 The WebSocket interface

[Constructor(in DOMString url)]
interface WebSocket {
  readonly attribute DOMString URL;

  // ready state
  const unsigned short CONNECTING = 0;
  const unsigned short OPEN = 1;
  const unsigned short CLOSED = 2;
  readonly attribute long readyState;

  // networking
           attribute Function onopen;
           attribute Function onmessage;
           attribute Function onclosed;
  void postMessage(in DOMString data);
  void disconnect();
};

WebSocket objects must also implement the EventTarget interface. [DOM3EVENTS]

The WebSocket(url) constructor takes one argument, url, which specifies the URL to which to connect. When the WebSocket() constructor is invoked, the UA must run these steps:

  1. Parse the url argument.

  2. If the previous step failed, or if url does not have a <scheme> component whose value is either "ws" or "wss", when compared in an ASCII case-insensitive manner, then throw a SYNTAX_ERR exception.

  3. Return a new WebSocket object, and continue these steps in the background (without blocking scripts).

  4. Let origin be the ASCII serialization of the origin of the script that invoked the WebSocket() constructor.

  5. If the <scheme> component of url is "ws", set secure to false; otherwise (the <scheme> component is "wss"), set secure to true.

  6. Let host be the value of the <host> component of url.

  7. If url has a <port> component, then let port be that component's value; otherwise, there is no explicit port.

  8. Let resource name be the value of the <path> component (which might be empty) of url.

  9. If resource name is the empty string, set it to a single character U+002F SOLIDUS (/).

  10. If url has a <query> component, then append a single U+003F QUESTION MARK (?) character to resource name, followed by the value of the <query> component.

  11. Establish a Web Socket connection to a host host, on port port (if one was specified), from origin, with the flag secure, and with resource name as the resource name.
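
Steps 1, 2 and 5 to 10 above can be sketched, non-normatively, as follows. The function name parseWebSocketURL and the shape of the returned object are hypothetical. The sketch uses a simplified pattern in place of a full URL parser: it handles neither user information nor bracketed IPv6 hosts, ignores any fragment, and uses a plain SyntaxError in place of the SYNTAX_ERR exception.

```javascript
// Splits a ws: or wss: URL into the values used to establish the connection.
function parseWebSocketURL(url) {
  // scheme "://" host [":" port] [path] ["?" query] ["#" fragment]
  const pattern =
    /^([A-Za-z][A-Za-z0-9+.-]*):\/\/([^\/?#:]+)(?::(\d+))?(\/[^?#]*)?(?:\?([^#]*))?(?:#.*)?$/;
  const m = pattern.exec(url);
  if (m === null) throw new SyntaxError('SYNTAX_ERR: cannot parse ' + url);

  // The scheme is compared in an ASCII case-insensitive manner.
  const scheme = m[1].toLowerCase();
  if (scheme !== 'ws' && scheme !== 'wss') {
    throw new SyntaxError('SYNTAX_ERR: scheme must be ws or wss');
  }

  // An empty path becomes a single "/".
  let resourceName = m[4] === undefined || m[4] === '' ? '/' : m[4];
  // A query component, if present, is appended after a "?".
  if (m[5] !== undefined) resourceName += '?' + m[5];

  return {
    secure: scheme === 'wss',
    host: m[2],
    port: m[3] === undefined ? null : Number(m[3]), // null: no explicit port
    resourceName: resourceName,
  };
}
```

For instance, parseWebSocketURL('ws://example.com') reports no explicit port and a resource name of "/".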


The URL attribute must return the value that was passed to the constructor.

The readyState attribute represents the state of the connection. It can have the following values:

CONNECTING (numeric value 0)
The connection has not yet been established.
OPEN (numeric value 1)
The Web Socket connection is established and communication is possible.
CLOSED (numeric value 2)
The connection has been closed or could not be opened.

When the object is created its readyState must be set to CONNECTING (0). The steps executed when the constructor is invoked change this attribute's value.

The postMessage(data) method transmits data using the connection. If the connection is not established (readyState is not OPEN), it must raise an INVALID_STATE_ERR exception. If the connection is established, then the user agent must send data using the Web Socket.

The disconnect() method must close the Web Socket connection or connection attempt, if any. If the connection is already closed, it must do nothing. Closing the connection causes a close event to be fired and the readyState attribute's value to change, as described below.
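
The interaction between readyState, postMessage() and disconnect() can be modelled, non-normatively, as in the sketch below. The class WebSocketStateModel and its connectionEstablished() hook are hypothetical, no networking is performed, and events are recorded synchronously in an array, whereas a user agent queues tasks to fire them.

```javascript
const CONNECTING = 0;
const OPEN = 1;
const CLOSED = 2;

// Hypothetical state model; it does not open any connection.
class WebSocketStateModel {
  constructor() {
    this.readyState = CONNECTING;
    this.sent = [];   // stands in for data handed to the connection
    this.events = []; // names of the events that would be fired
  }

  // Hypothetical hook, called when the connection has been established.
  connectionEstablished() {
    if (this.readyState !== CONNECTING) return;
    this.readyState = OPEN;
    this.events.push('open');
  }

  postMessage(data) {
    if (this.readyState !== OPEN) {
      // Stand-in for the INVALID_STATE_ERR exception.
      const error = new Error('connection is not established');
      error.name = 'INVALID_STATE_ERR';
      throw error;
    }
    this.sent.push(String(data));
  }

  disconnect() {
    if (this.readyState === CLOSED) return; // already closed: do nothing
    this.readyState = CLOSED;
    this.events.push('close');
  }
}
```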

7.3.3 WebSocket Events

The open event is fired when the Web Socket connection is established.

The close event is fired when the connection is closed (whether by the author, calling the disconnect() method, or by the server, or by a network error).

No information regarding why the connection was closed is passed to the application in this version of this specification.

The message event is fired when data is received for a connection.


The following are the event handler attributes that must be supported, as DOM attributes, by all objects implementing the WebSocket interface:

onopen

Must be invoked whenever an open event is targeted at or bubbles through the WebSocket object.

onmessage

Must be invoked whenever a message event is targeted at or bubbles through the WebSocket object.

onclosed

Must be invoked whenever a close event is targeted at or bubbles through the WebSocket object.

7.3.4 Feedback from the protocol

When the Web Socket connection is established, the user agent must run the following steps:

  1. Change the readyState attribute's value to OPEN (1).

  2. Queue a task to fire a simple event named open at the WebSocket object.


When a Web Socket message has been received with text data, the user agent must create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is cancelable, has no default action, and whose data attribute is set to data, and queue a task to dispatch it at the WebSocket object.


When the Web Socket connection is closed, the readyState attribute's value must be changed to CLOSED (2), and the user agent must queue a task to fire a simple event named close at the WebSocket object.


The task source for all tasks queued in this section is the Web Socket task source.

7.3.5 The Web Socket protocol

This section will be extracted into an RFC in due course.

7.3.5.1 Introduction

...

7.3.5.2 Client-side requirements

This section only applies to user agents, not to servers.

This specification doesn't currently define a limit to the number of simultaneous connections that a client can establish to a server.

7.3.5.2.1 Handshake

When the user agent is to establish a Web Socket connection to a host host, optionally on port port, from an origin origin, with a flag secure, and with a particular resource name, it must run the following steps:

  1. If there is no explicit port, then: if secure is false, let port be 81, otherwise let port be 815.

  2. If the user agent is configured to use a proxy to connect to host host and/or port port, then connect to that proxy and ask it to open a TCP/IP connection to the host given by host and the port given by port.

    For example, if the user agent uses an HTTP proxy for all traffic, then if it were to try to connect to port 80 on server example.com, it might send the following lines to the proxy server:

    CONNECT example.com:80 HTTP/1.1

    If there was a password, the connection might look like:

    CONNECT example.com:80 HTTP/1.1
    Proxy-authorization: Basic ZWRuYW1vZGU6bm9jYXBlcyE=

    Otherwise, if the user agent is not configured to use a proxy, then open a TCP/IP connection to the host given by host and the port given by port.

  3. If the connection could not be opened, then fail the Web Socket connection and abort these steps.

  4. If secure is true, perform a TLS handshake over the connection. If this fails (e.g. the server's certificate could not be verified), then fail the Web Socket connection and abort these steps. Otherwise, all further communication on this channel must run through the encrypted tunnel. [RFC2246]

  5. Send the following bytes to the remote side (the server):

    47 45 54 20

    Send the resource name value, encoded as US-ASCII.

    Send the following bytes:

    20 48 54 54 50 2f 31 2e  31 0d 0a 55 70 67 72 61
    64 65 3a 20 57 65 62 53  6f 63 6b 65 74 0d 0a 43
    6f 6e 6e 65 63 74 69 6f  6e 3a 20 55 70 67 72 61
    64 65 0d 0a

    The string "GET ", the path, " HTTP/1.1", CRLF, the string "Upgrade: WebSocket", CRLF, and the string "Connection: Upgrade", CRLF.

  6. Send the following bytes:

    48 6f 73 74 3a 20

    Send the host value, encoded as US-ASCII and converted to lowercase, if it represents a host name (and not an IP address).

    Send the following bytes:

    0d 0a

    The string "Host: ", the host, and CRLF.

  7. Send the following bytes:

    4f 72 69 67 69 6e 3a 20

    Send the origin value, encoded as US-ASCII and converted to lowercase.

    Send the following bytes:

    0d 0a

    The string "Origin: ", the origin, and CRLF.

  8. If the client has any authentication information or cookies that would be relevant to a resource accessed over HTTP, if secure is false, or HTTPS, if it is true, on host host, port port, with resource name as the path (and possibly query parameters), then HTTP headers that would be appropriate for that information should be sent at this point. [RFC2616] [RFC2109] [RFC2965]

    Each header must be on a line of its own (each ending with a CR LF sequence). For the purposes of this step, each header must not be split into multiple lines (despite HTTP otherwise allowing this with continuation lines).

    For example, if the user agent had a username and password that applied to http://example.com/socket, and the Web Socket was being opened to ws://example.com:80/socket, it could send them:

    Authorization: Basic d2FsbGU6ZXZl

    However, it would not send them if the Web Socket was being opened to ws://example.com/socket, as that uses a different port (81, not 80).

  9. Send the following bytes:

    0d 0a

    Just a CRLF (a blank line).

  10. Read the first 85 bytes from the server. If the connection closes before 85 bytes are received, or if the first 85 bytes aren't exactly equal to the following bytes, then fail the Web Socket connection and abort these steps.

    48 54 54 50 2f 31 2e 31  20 31 30 31 20 57 65 62
    20 53 6f 63 6b 65 74 20  50 72 6f 74 6f 63 6f 6c
    20 48 61 6e 64 73 68 61  6b 65 0d 0a 55 70 67 72
    61 64 65 3a 20 57 65 62  53 6f 63 6b 65 74 0d 0a
    43 6f 6e 6e 65 63 74 69  6f 6e 3a 20 55 70 67 72
    61 64 65 0d 0a

    The string "HTTP/1.1 101 Web Socket Protocol Handshake", CRLF, the string "Upgrade: WebSocket", CRLF, the string "Connection: Upgrade", CRLF.

  11. Let headers be a list of name-value pairs, initially empty.

  12. Header: Let name and value be empty byte arrays.

  13. Read a byte from the server.

    If the connection closes before this byte is received, then fail the Web Socket connection and abort these steps.

    Otherwise, handle the byte as described in the appropriate entry below:

    If the byte is 0x0d (ASCII CR)
    If the name byte array is empty, then jump to the headers processing step. Otherwise, fail the Web Socket connection and abort these steps.
    If the byte is 0x0a (ASCII LF)
    Fail the Web Socket connection and abort these steps.
    If the byte is 0x3a (ASCII ":")
    Move on to the next step.
    If the byte is in the range 0x41 .. 0x5a (ASCII "A" .. "Z")
    Append a byte whose value is the byte's value plus 0x20 to the name byte array and redo this step for the next byte.
    Otherwise
    Append the byte to the name byte array and redo this step for the next byte.

    This reads a header name, terminated by a colon, converting upper-case ASCII letters to lowercase, and aborting if a stray CR or LF is found.

  14. Read a byte from the server.

    If the connection closes before this byte is received, then fail the Web Socket connection and abort these steps.

    Otherwise, handle the byte as described in the appropriate entry below:

    If the byte is 0x20 (ASCII space)
    Ignore the byte and move on to the next step.
    Otherwise
    Treat the byte as described by the list in the next step, then move on to that next step for real.

    This skips past a space character after the colon, if necessary.

  15. Read a byte from the server.

    If the connection closes before this byte is received, then fail the Web Socket connection and abort these steps.

    Otherwise, handle the byte as described in the appropriate entry below:

    If the byte is 0x0d (ASCII CR)
    Move on to the next step.
    If the byte is 0x0a (ASCII LF)
    Fail the Web Socket connection and abort these steps.
    Otherwise
    Append the byte to the value byte array and redo this step for the next byte.

    This reads a header value, terminated by a CRLF.

  16. Read a byte from the server.

    If the connection closes before this byte is received, or if the byte is not a 0x0a byte (ASCII LF), then fail the Web Socket connection and abort these steps.

    This skips past the LF byte of the CRLF after the header.

  17. Append an entry to the headers list that has the name given by the string obtained by interpreting the name byte array as a UTF-8 byte stream and the value given by the string obtained by interpreting the value byte array as a UTF-8 byte stream.

  18. Return to the header step above.

  19. Headers processing: If there is not exactly one entry in the headers list whose name is "websocket-origin", or if there is not exactly one entry in the headers list whose name is "websocket-location", or if there are any entries in the headers list whose names are the empty string, then fail the Web Socket connection and abort these steps.

  20. Handle each entry in the headers list as follows:

    If the entry's name is "websocket-origin"

    If the value is not exactly equal to origin, converted to lowercase, then fail the Web Socket connection and abort these steps.

    If the entry's name is "websocket-location"

    If the value is not exactly equal to a string consisting of the following components in the same order, then fail the Web Socket connection and abort these steps:

    1. The string "http" if secure is false and "https" if secure is true.
    2. The three characters "://".
    3. The value of host.
    4. If secure is false and port is not 81, or if secure is true and port is not 815: a ":" character followed by the value of port.
    5. The value of resource name.
    If the entry's name is "set-cookie" or "set-cookie2" or another cookie-related header name

    Handle the cookie as defined by the appropriate spec, with the resource being the one with the host host, the port port, the path (and possibly query parameters) resource name, and the scheme http if secure is false and https if secure is true. [RFC2109] [RFC2965]

    Any other name
    Ignore it.
  21. The Web Socket connection is established. Now the user agent must send and receive to and from the connection as described in the next section.
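
The port selection in step 1 and the request sent in steps 5 to 9 can be sketched, non-normatively, as follows. The function names effectivePort, buildHandshakeRequest and toHex are hypothetical, all values are assumed to be US-ASCII, and the host is always lowercased, which is a simplification of the requirement that applies only to host names.

```javascript
// Step 1: choose the default port when none was given explicitly.
function effectivePort(port, secure) {
  if (port !== null) return port;
  return secure ? 815 : 81;
}

// Steps 5 to 9: assembles the opening request as a string of ASCII characters.
// extraHeaders holds complete header lines (step 8), without line endings.
function buildHandshakeRequest(host, origin, resourceName, extraHeaders = []) {
  return 'GET ' + resourceName + ' HTTP/1.1\r\n' +
    'Upgrade: WebSocket\r\n' +
    'Connection: Upgrade\r\n' +
    'Host: ' + host.toLowerCase() + '\r\n' +
    'Origin: ' + origin.toLowerCase() + '\r\n' +
    extraHeaders.map((line) => line + '\r\n').join('') +
    '\r\n';
}

// Renders an ASCII string as space-separated hexadecimal byte values,
// for comparison with the byte listings in the steps above.
function toHex(ascii) {
  return Array.from(ascii, (c) => c.charCodeAt(0).toString(16).padStart(2, '0')).join(' ');
}
```

For example, toHex('GET ') gives "47 45 54 20", the four bytes listed in step 5.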

To fail the Web Socket connection, the user agent must close the Web Socket connection, and may report the problem to the user (which would be especially useful for developers). However, user agents must not convey the failure information to the script that attempted the connection in a way distinguishable from the Web Socket being closed normally.
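
The header reading and checking steps of the handshake (steps 11 to 20) can be sketched, non-normatively, as follows. The sketch assumes that the bytes following the fixed 85-byte response are already available in an array, signals failure by throwing an Error rather than closing a connection, and leaves cookie handling out. The names fail, readHandshakeHeaders and checkHandshakeHeaders are hypothetical.

```javascript
function fail(reason) {
  throw new Error('fail the Web Socket connection: ' + reason);
}

// Steps 11 to 18: reads name-value pairs until an empty header name is
// terminated by CR.
function readHandshakeHeaders(bytes) {
  const utf8 = new TextDecoder('utf-8');
  const headers = [];
  let i = 0;
  const next = () => (i < bytes.length ? bytes[i++] : fail('connection closed'));

  for (;;) {
    const name = [];
    const value = [];
    let b;
    // Header name, with A to Z lowercased, terminated by a colon.
    while ((b = next()) !== 0x3a) {
      if (b === 0x0d && name.length === 0) return headers; // headers processing
      if (b === 0x0d || b === 0x0a) fail('stray CR or LF in header name');
      name.push(b >= 0x41 && b <= 0x5a ? b + 0x20 : b);
    }
    b = next();
    if (b === 0x20) b = next(); // skip a single space after the colon
    // Header value, terminated by CR.
    while (b !== 0x0d) {
      if (b === 0x0a) fail('stray LF in header value');
      value.push(b);
      b = next();
    }
    if (next() !== 0x0a) fail('CR not followed by LF');
    headers.push({
      name: utf8.decode(Uint8Array.from(name)),
      value: utf8.decode(Uint8Array.from(value)),
    });
  }
}

// Steps 19 and 20, without cookie handling. Returns true if the checks pass.
function checkHandshakeHeaders(headers, c) {
  const count = (n) => headers.filter((h) => h.name === n).length;
  if (count('websocket-origin') !== 1 || count('websocket-location') !== 1 ||
      count('') !== 0) {
    return false;
  }
  const defaultPort = c.secure ? 815 : 81;
  const location = (c.secure ? 'https' : 'http') + '://' + c.host +
    (c.port !== defaultPort ? ':' + c.port : '') + c.resourceName;
  return headers.every((h) =>
    (h.name !== 'websocket-origin' || h.value === c.origin.toLowerCase()) &&
    (h.name !== 'websocket-location' || h.value === location));
}
```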

7.3.5.2.2 Data framing

Once a Web Socket connection is established, the user agent must run through the following state machine for the bytes sent by the server.

  1. Try to read a byte from the server. Let frame type be that byte.

    If no byte could be read because the Web Socket connection is closed, then abort.

  2. Handle the frame type byte as follows:

    If the high-order bit of the frame type byte is set (i.e. if frame type anded with 0x80 returns 0x80)

    Run these steps. If at any point during these steps a read is attempted but fails because the Web Socket connection is closed, then abort.

    1. Let length be zero.

    2. Length: Read a byte, let b be that byte.

    3. Let bv be the integer corresponding to the low 7 bits of b (the value you would get by anding b with 0x7f).

    4. Multiply length by 128, add bv to that result, and store the final result in length.

    5. If the high-order bit of b is set (i.e. if b anded with 0x80 returns 0x80), then return to the step above labeled length.

    6. Read length bytes.

    7. Discard the read bytes.

    If the high-order bit of the frame type byte is not set (i.e. if frame type anded with 0x80 returns 0x00)

    Run these steps. If at any point during these steps a read is attempted but fails because the Web Socket connection is closed, then abort.

    1. Let raw data be an empty byte array.

    2. Data: Read a byte, let b be that byte.

    3. If b is not 0xff, then append b to raw data and return to the previous step (labeled data).

    4. Interpret raw data as a UTF-8 string, and store that string in data.

    5. If frame type is 0x00, then a message has been received with text data. Otherwise, discard the data.

  3. Return to the first step to read the next byte.
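
The state machine above can be sketched, non-normatively, as a function over a complete byte sequence. The name parseFrames is an assumption, and running out of input is used here to model a read that fails because the connection is closed. A real user agent would read incrementally from the network instead.

```javascript
// Parse bytes received from the server and return the text messages found.
// Running out of bytes models a read failing on a closed connection.
function parseFrames(bytes) {
  const messages = [];
  let i = 0;
  while (i < bytes.length) {
    const frameType = bytes[i++];
    if ((frameType & 0x80) === 0x80) {
      // Length-prefixed frame: read a base-128 length, then skip the payload.
      let length = 0;
      let b;
      do {
        if (i >= bytes.length) return messages; // connection closed
        b = bytes[i++];
        length = length * 128 + (b & 0x7f);
      } while ((b & 0x80) === 0x80);
      if (i + length > bytes.length) return messages; // connection closed
      i += length; // discard the read bytes
    } else {
      // Sentinel-terminated frame: collect bytes up to the 0xff terminator.
      const raw = [];
      for (;;) {
        if (i >= bytes.length) return messages; // connection closed
        const b = bytes[i++];
        if (b === 0xff) break;
        raw.push(b);
      }
      const data = new TextDecoder("utf-8").decode(new Uint8Array(raw));
      if (frameType === 0x00) messages.push(data);
      // For any other frame type the data is discarded.
    }
  }
  return messages;
}
```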

If the user agent is faced with content that is too large to be handled appropriately, then it must fail the Web Socket connection.


Once a Web Socket connection is established, the user agent must use the following steps to send data using the Web Socket:

  1. Send a 0x00 byte to the server.

  2. Encode data using UTF-8 and send the resulting byte stream to the server.

  3. Send a 0xff byte to the server.
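
A minimal, non-normative sketch of these sending steps follows. The function name frameText is an assumption, and the sketch returns the bytes rather than writing them to a connection. Because the byte 0xff never occurs in UTF-8 encoded text, the terminator cannot be confused with payload data.

```javascript
// Wrap a string in a text frame: 0x00, the UTF-8 bytes, then 0xff.
function frameText(data) {
  const payload = new TextEncoder().encode(data);
  const frame = new Uint8Array(payload.length + 2);
  frame[0] = 0x00;
  frame.set(payload, 1);
  frame[frame.length - 1] = 0xff;
  return frame;
}
```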

7.3.5.3 Server-side requirements

This section only applies to servers.

7.3.5.3.1 Minimal handshake

This section describes the minimal requirements for a server-side implementation of Web Sockets.

Listen on a port for TCP/IP. Upon receiving a connection request, open a connection and send the following bytes back to the client:

48 54 54 50 2f 31 2e 31  20 31 30 31 20 57 65 62
20 53 6f 63 6b 65 74 20  50 72 6f 74 6f 63 6f 6c
20 48 61 6e 64 73 68 61  6b 65 0d 0a 55 70 67 72
61 64 65 3a 20 57 65 62  53 6f 63 6b 65 74 0d 0a
43 6f 6e 6e 65 63 74 69  6f 6e 3a 20 55 70 67 72
61 64 65 0d 0a

Send the string "WebSocket-Origin" followed by a U+003A COLON (":") followed by the ASCII serialization of the origin from which the server is willing to accept connections, followed by a CRLF pair (0x0d 0x0a).

For instance:

WebSocket-Origin: http://example.com

Send the string "WebSocket-Location" followed by a U+003A COLON (":") followed by the URL of the Web Socket script, followed by a CRLF pair (0x0d 0x0a).

For instance:

WebSocket-Location: ws://example.com:80/demo

Send another CRLF pair (0x0d 0x0a).

Read (and discard) data from the client until four bytes 0x0d 0x0a 0x0d 0x0a are read.

If the connection isn't dropped at this point, go to the data framing section.
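
The minimal handshake above might be sketched as follows. The function names buildHandshake and handshakeEnd are assumptions, and the single space after each colon follows the "For instance" examples rather than the prose, which mentions only the colon. The sketch builds the response and locates the end of the client's handshake; it does not open any network connection.

```javascript
// Build the server's handshake response for a given origin and location.
function buildHandshake(origin, location) {
  return "HTTP/1.1 101 Web Socket Protocol Handshake\r\n" +
         "Upgrade: WebSocket\r\n" +
         "Connection: Upgrade\r\n" +
         "WebSocket-Origin: " + origin + "\r\n" +
         "WebSocket-Location: " + location + "\r\n" +
         "\r\n";
}

// Return the index just past the first 0x0d 0x0a 0x0d 0x0a sequence in the
// client's bytes, or -1 if the client handshake is not yet complete.
function handshakeEnd(bytes) {
  for (let i = 0; i + 3 < bytes.length; i++) {
    if (bytes[i] === 0x0d && bytes[i + 1] === 0x0a &&
        bytes[i + 2] === 0x0d && bytes[i + 3] === 0x0a) {
      return i + 4;
    }
  }
  return -1;
}
```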

7.3.5.3.2 Handshake details

The previous section ignores the data that is transmitted by the client during the handshake.

The data sent by the client consists of a number of fields separated by CR LF pairs (bytes 0x0d 0x0a).

The first field consists of three tokens separated by space characters (byte 0x20). The middle token is the path being opened. If the server supports multiple paths, then the server should echo the value of this field in the initial handshake, as part of the URL given on the WebSocket-Location line (after the appropriate scheme and host).

The remaining fields consist of name-value pairs, with the name part separated from the value part by a colon and a space (bytes 0x3a 0x20). Of these, several are interesting:

Host (bytes 48 6f 73 74)

The value gives the hostname that the client intended to use when opening the Web Socket. It would be of interest in particular to virtual hosting environments, where one server might serve multiple hosts, and might therefore want to return different data.

The right host has to be output as part of the URL given on the WebSocket-Location line of the handshake described above, to verify that the server knows that it is really representing that host.

Origin (bytes 4f 72 69 67 69 6e)

The value gives the scheme, hostname, and port (if it's not the default port for the given scheme) of the page that asked the client to open the Web Socket. It would be interesting if the server's operator had deals with operators of other sites, since the server could then decide how to respond (or indeed, whether to respond) based on which site was requesting a connection.

If the server supports connections from more than one origin, then the server should echo the value of this field in the initial handshake, on the WebSocket-Origin line.

Other fields

Other fields can be used, such as "Cookie" or "Authorization", for authentication purposes.
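
As an illustration, a server could extract the path and the named fields from the client's handshake along these lines. The function name parseClientHandshake and the shape of the returned object are assumptions for this sketch, which takes the handshake as an already-decoded string.

```javascript
// Split the client's handshake into the requested path and its fields.
function parseClientHandshake(text) {
  const head = text.split("\r\n\r\n")[0];
  const lines = head.split("\r\n");
  // The first field is three tokens separated by spaces; the middle is the path.
  const tokens = lines[0].split(" ");
  if (tokens.length !== 3) {
    throw new Error("malformed first field");
  }
  const fields = {};
  for (const line of lines.slice(1)) {
    const sep = line.indexOf(": ");
    if (sep === -1) continue; // skip fields without a colon-space separator
    fields[line.slice(0, sep)] = line.slice(sep + 2);
  }
  return { path: tokens[1], fields };
}
```

A server supporting multiple hosts and origins could then echo fields.Origin on the WebSocket-Origin line and build the WebSocket-Location URL from fields.Host and path.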

7.3.5.3.3 Data framing

This section only describes how to handle content that this specification allows user agents to send (text). Unlike the requirements on user agents defined earlier, which handle any content including possible future extensions to the protocols, it does not handle arbitrary content.

The server should run through the following steps to process the bytes sent by the client:

  1. Read a byte from the client. Assuming everything is going according to plan, it will be a 0x00 byte. Behaviour for the server is undefined if the byte is not 0x00.

  2. Let raw data be an empty byte array.

  3. Data: Read a byte, let b be that byte.

  4. If b is not 0xff, then append b to raw data and return to the previous step (labeled data).

  5. Interpret raw data as a UTF-8 string, and apply whatever server-specific processing should occur for the resulting string.

  6. Return to the first step to read the next byte.
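
Since bytes typically arrive from the network in chunks, a server might implement the steps above as an incremental decoder. The following is a sketch under that assumption; the class name ServerFrameDecoder is hypothetical, and throwing on a frame type other than 0x00 is one possible choice for a case this section leaves undefined.

```javascript
// Incrementally decode text frames from chunks of client bytes.
class ServerFrameDecoder {
  constructor() {
    this.inFrame = false; // true between a 0x00 byte and its 0xff terminator
    this.raw = [];
  }

  // Feed a chunk of bytes; returns the strings completed by this chunk.
  push(chunk) {
    const out = [];
    for (const b of chunk) {
      if (!this.inFrame) {
        if (b !== 0x00) throw new Error("unexpected frame type " + b);
        this.inFrame = true;
        this.raw = [];
      } else if (b === 0xff) {
        out.push(new TextDecoder("utf-8").decode(new Uint8Array(this.raw)));
        this.inFrame = false;
      } else {
        this.raw.push(b);
      }
    }
    return out;
  }
}
```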


The server should run through the following steps to send strings to the client:

  1. Send a 0x00 byte to the client to indicate the start of a string.

  2. Encode data using UTF-8 and send the resulting byte stream to the client.

  3. Send a 0xff byte to the client to indicate the end of the message.

7.3.5.4 Closing the connection

To close the Web Socket connection, either the user agent or the server closes the TCP/IP connection. There is no closing handshake. Whether the user agent or the server closes the connection, it is said that the Web Socket connection is closed.

Servers may close the Web Socket connection whenever desired.

User agents should not close the Web Socket connection arbitrarily.

7.3.5.5 Security considerations

...

7.3.5.6 IANA considerations

...(two URI schemes, two ports, HTTP Upgrade keyword)

7.4 Cross-document messaging

Web browsers, for security and privacy reasons, prevent documents in different domains from affecting each other; that is, cross-site scripting is disallowed.

While this is an important security feature, it prevents pages from different domains from communicating even when those pages are not hostile. This section introduces a messaging system that allows documents to communicate with each other regardless of their source domain, in a way designed to not enable cross-site scripting attacks.

The task source for the tasks in cross-document messaging is the posted message task source.

7.4.1 Introduction

This section is non-normative.

For example, if document A contains an iframe element that contains document B, and script in document A calls postMessage() on the Window object of document B, then a message event will be fired on that object, marked as originating from the Window of document A. The script in document A might look like:

var o = document.getElementsByTagName('iframe')[0];
o.contentWindow.postMessage('Hello world', 'http://b.example.org/');

To register an event handler for incoming events, the script would use addEventListener() (or similar mechanisms). For example, the script in document B might look like:

window.addEventListener('message', receiver, false);
function receiver(e) {
  if (e.origin == 'http://example.com') {
    if (e.data == 'Hello world') {
      e.source.postMessage('Hello', e.origin);
    } else {
      alert(e.data);
    }
  }
}

This script first checks the domain is the expected domain, and then looks at the message, which it either displays to the user, or responds to by sending a message back to the document which sent the message in the first place.

7.4.2 Security

7.4.2.1 Authors

Use of this API requires extra care to protect users from hostile entities abusing a site for their own purposes.

Authors should check the origin attribute to ensure that messages are only accepted from domains that they expect to receive messages from. Otherwise, bugs in the author's message handling code could be exploited by hostile sites.

Authors should not use the wildcard keyword ("*") in the targetOrigin argument in messages that contain any confidential information, as otherwise there is no way to guarantee that the message is only delivered to the recipient for which it was intended.
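
One way an author might follow this advice is to wrap the message handler in an origin check. The helper name makeReceiver and the list of allowed origins are assumptions for this sketch.

```javascript
// Return a message handler that ignores events from unexpected origins.
function makeReceiver(allowedOrigins, handle) {
  const allowed = new Set(allowedOrigins);
  return function receiver(e) {
    if (!allowed.has(e.origin)) {
      return; // drop messages from origins that are not expected
    }
    handle(e);
  };
}

// In a page, the receiver would be registered like this:
//   window.addEventListener('message',
//     makeReceiver(['http://example.com'], function (e) {
//       // Reply to the sender's own origin rather than to "*".
//       e.source.postMessage('Hello', e.origin);
//     }), false);
```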

7.4.2.2 User agents

The integrity of this API is based on the inability of scripts of one origin to post arbitrary events (using dispatchEvent() or otherwise) to objects in other origins (those that are not the same).

Implementors are urged to take extra care in the implementation of this feature. It allows authors to transmit information from one domain to another domain, which is normally disallowed for security reasons. It also requires that UAs be careful to allow access to certain properties but not others.

7.4.3 Posting messages

When a script invokes the postMessage(message, targetOrigin) method (with only two arguments) on a Window object, the user agent must follow these steps:

  1. If the value of the targetOrigin argument is not a single U+002A ASTERISK character ("*"), and resolving it relative to the first script's base URL either fails or results in a URL with a <host-specific> component that is neither empty nor a single U+002F SOLIDUS character (/), then throw a SYNTAX_ERR exception and abort the overall set of steps.

  2. Let message clone be the result of obtaining a structured clone of the message argument. If this throws an exception, then throw that exception and abort these steps.

  3. Return from the postMessage() method, but asynchronously continue running these steps.

  4. If the targetOrigin argument has a value other than a single literal U+002A ASTERISK character ("*"), and the Document of the Window object on which the method was invoked does not have the same origin as targetOrigin, then abort these steps silently.

  5. Create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is cancelable, and has no default action. The data attribute must be set to the value of message clone, the origin attribute must be set to the Unicode serialization of the origin of the script that invoked the method, and the source attribute must be set to the script's global object.

  6. Queue a task to dispatch the event created in the previous step at the Window object on which the method was invoked. The task source for this task is the posted message task source.
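
The targetOrigin checks in the first and fourth steps might be approximated as follows. This is a rough sketch, not the normative algorithm: it uses the URL constructor for resolution, treats the path, query, and fragment together as the <host-specific> component, and throws a JavaScript SyntaxError in place of a SYNTAX_ERR exception. All three are assumptions, as are the function names.

```javascript
// Step 1: validate targetOrigin and reduce it to an origin string (or "*").
function checkTargetOrigin(targetOrigin, baseURL) {
  if (targetOrigin === "*") return "*";
  let url;
  try {
    url = new URL(targetOrigin, baseURL);
  } catch (err) {
    throw new SyntaxError("targetOrigin could not be resolved");
  }
  // Assumption: <host-specific> is modelled as path + query + fragment.
  const rest = url.pathname + url.search + url.hash;
  if (rest !== "" && rest !== "/") {
    throw new SyntaxError("targetOrigin must not include a path");
  }
  return url.origin;
}

// Step 4: deliver only to "*" or to a document with the same origin.
function shouldDeliver(checkedTargetOrigin, documentOrigin) {
  return checkedTargetOrigin === "*" || checkedTargetOrigin === documentOrigin;
}
```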

7.4.4 Posting messages with message ports

When a script invokes the postMessage(message, messagePort, targetOrigin) method (with three arguments) on a Window object, the user agent must follow these steps:

  1. If the value of the targetOrigin argument is not a single U+002A ASTERISK character ("*"), and resolving it relative to the first script's base URL either fails or results in a URL with a <host-specific> component that is neither empty nor a single U+002F SOLIDUS character (/), then throw a SYNTAX_ERR exception and abort the overall set of steps.

  2. Let message clone be the result of obtaining a structured clone of the message argument. If this throws an exception, then throw that exception and abort these steps.

  3. If the messagePort argument is null, then act as if the method had just been called with two arguments, message and targetOrigin.

  4. Try to obtain a new port by cloning the messagePort argument with the Window object on which the method was invoked as the owner of the clone. If this returns an exception, then throw that exception and abort these steps.

  5. Return from the postMessage() method, but asynchronously continue running these steps.

  6. If the targetOrigin argument has a value other than a single literal U+002A ASTERISK character ("*"), and the Document of the Window object on which the method was invoked does not have the same origin as targetOrigin, then abort these steps silently.

  7. Create an event that uses the MessageEvent interface, with the event name message, which does not bubble, is cancelable, and has no default action. The data attribute must be set to the value of message clone, the origin attribute must be set to the Unicode serialization of the origin of the script that invoked the method, and the source attribute must be set to the script's global object.

  8. Let the messagePort attribute of the event be the new port.

  9. Queue a task to dispatch the event created in the previous step at the Window object on which the method was invoked. The task source for this task is the posted message task source.

These steps, with the exception of the third and fourth steps and the penultimate step, are identical to those in the previous section.

7.5 Channel messaging

7.5.1 Introduction

This section is non-normative.

An introduction to the channel and port APIs.

7.5.2 Message channels

[Constructor]
interface MessageChannel {
  readonly attribute MessagePort port1;
  readonly attribute MessagePort port2;
};

When the MessageChannel() constructor is called, it must run the following algorithm:

  1. Create a new MessagePort object owned by the script's global object, and let port1 be that object.

  2. Create a new MessagePort object owned by the script's global object, and let port2 be that object.

  3. Entangle the port1 and port2 objects.

  4. Instantiate a new MessageChannel object, and let channel be that object.

  5. Let the port1 attribute of the channel object be port1.

  6. Let the port2 attribute of the channel object be port2.

  7. Return channel.

The port1 and port2 attributes must return the values they were assigned when the MessageChannel object was created.

7.5.3 Message ports

Each channel has two message ports. Data sent through one port is received by the other port, and vice versa.

interface MessagePort {
  readonly attribute boolean active;
  void postMessage(in any message, [Optional] in MessagePort messagePort);
  void start();
  void close();

  // event handler attributes
           attribute Function onmessage;
};

Objects implementing the MessagePort interface must also implement the EventTarget interface.

Each MessagePort object can be entangled with another (a symmetric relationship). Each MessagePort object also has a task source called the port message queue, initially empty. A port message queue can be open or closed, and is initially closed.

When the user agent is to create a new MessagePort object owned by a script's global object owner, it must instantiate a new MessagePort object, and let its owner be owner.


When the user agent is to entangle two MessagePort objects, it must run the following steps:

  1. If one of the ports is already entangled, then unentangle it and the port that it was entangled with.

    If those two previously entangled ports were the two ports of a MessageChannel object, then that MessageChannel object no longer represents an actual channel: the two ports in that object are no longer entangled.

  2. Associate the two ports to be entangled, so that they form the two parts of a new channel. (There is no MessageChannel object that represents this channel.)


When the user agent is to clone a port original port, with the clone being owned by owner, it must run the following steps, which return either a new MessagePort object or an exception for the caller to raise:

  1. If the original port is not entangled with another port, then return an INVALID_STATE_ERR exception and abort all these steps.

  2. Let the remote port be the port with which the original port is entangled.

  3. Create a new MessagePort object owned by owner, and let new port be that object.

  4. Move all the events in the port message queue of original port to the port message queue of new port, if any, leaving the new port's port message queue in its initial closed state.

  5. Entangle the remote port and new port objects. The original port object will be unentangled by this process.

  6. Return new port. It is the clone.
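
The entangling and cloning algorithms above can be modelled with a small, non-normative sketch. SketchPort, entangle, unentangle, and clonePort are hypothetical names, not part of the API. For brevity the sketch throws a plain Error where the algorithm returns an INVALID_STATE_ERR exception for the caller to raise.

```javascript
// A toy stand-in for MessagePort that tracks only ownership, entanglement,
// and the port message queue.
class SketchPort {
  constructor(owner) {
    this.owner = owner;
    this.entangledWith = null;
    this.queue = [];        // port message queue, initially empty
    this.queueOpen = false; // and initially closed
  }
  get active() {
    return this.entangledWith !== null;
  }
}

function unentangle(port) {
  const other = port.entangledWith;
  if (other !== null) {
    other.entangledWith = null;
    port.entangledWith = null;
  }
}

function entangle(a, b) {
  // Any existing entanglement of either port is broken first.
  unentangle(a);
  unentangle(b);
  a.entangledWith = b;
  b.entangledWith = a;
}

function clonePort(original, owner) {
  if (!original.active) {
    throw new Error("INVALID_STATE_ERR");
  }
  const remote = original.entangledWith;
  const fresh = new SketchPort(owner);
  fresh.queue = original.queue.splice(0); // move queued events to the clone
  entangle(remote, fresh);                // this unentangles the original
  return fresh;
}
```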


The active attribute must return true if the port is entangled, and false otherwise.


The postMessage() method, when called on a port source port, must cause the user agent to run the following steps:

  1. Let message be the method's first argument.

  2. Let data port be the method's second argument, if any.

  3. Let message clone be the result of obtaining a structured clone of the message argument. If this throws an exception, then throw that exception and abort these steps.

  4. If the source port is not entangled with another port, then return and abort these steps.

  5. Let target port be the port with which source port is entangled.

  6. Create an event that uses the MessageEvent interface, with the name message, which does not bubble, is cancelable, and has no default action.

  7. Let the data attribute of the event have the value of message clone.

  8. If the method was called with a second argument data port and that argument isn't null, then run the following substeps:

    1. If the data port is the source port or the target port, then throw an INVALID_ACCESS_ERR exception and abort all these steps.

    2. Try to obtain a new data port by cloning the data port with the owner of the target port as the owner of the clone. If this returns an exception, then throw that exception and abort these steps.

    3. Let the messagePort attribute of the event be the new data port.

  9. Return from the method, but continue with these steps.

  10. Add the event to the port message queue of target port.


The start() method must open its port's port message queue, if it is not already open.

When a port's port message queue is open, the event loop must use it as one of its task sources.

If the Document of the port's event handlers' global object is not fully active, then the messages are lost.


The close() method, when called on a port local port that is entangled with another port, must cause the user agent to unentangle the two ports. If the method is called on a port that is not entangled, then the method must do nothing.


The following are the event handler attributes that must be supported, as DOM attributes, by all objects implementing the MessagePort interface:

onmessage

Must be invoked whenever a message event is targeted at or bubbles through the MessagePort object.

The first time a MessagePort object's onmessage DOM attribute is set, the port's port message queue must be opened, as if the start() method had been called.
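
The interaction between the port message queue, start(), and onmessage can be illustrated with a simplified model. SketchQueue is a hypothetical name, and the model delivers events synchronously, whereas a user agent dispatches them as tasks from the event loop.

```javascript
// A toy model of a port message queue that holds events until it is opened.
class SketchQueue {
  constructor() {
    this.events = [];
    this.open = false; // the queue is initially closed
    this.handler = null;
  }

  // Add an event; it is delivered immediately only if the queue is open.
  add(event) {
    this.events.push(event);
    this.drain();
  }

  // Open the queue, if it is not already open.
  start() {
    this.open = true;
    this.drain();
  }

  // Setting onmessage opens the queue, as if start() had been called.
  set onmessage(fn) {
    this.handler = fn;
    this.start();
  }
  get onmessage() {
    return this.handler;
  }

  drain() {
    while (this.open && this.events.length > 0) {
      const event = this.events.shift();
      if (this.handler) this.handler(event);
    }
  }
}
```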

7.5.3.1 Ports and garbage collection

User agents must act as if MessagePort objects have a strong reference to their entangled MessagePort object.

Thus, a message port can be received, given an event listener, and then forgotten, and so long as that event listener could receive a message, the channel will be maintained.

Of course, if this were to occur on both sides of the channel, then both ports would be garbage collected, since they would not be reachable from live code, despite having a strong reference to each other.

Furthermore, a MessagePort object must not be garbage collected while there exists a message in a task queue that is to be dispatched on that MessagePort object, or while the MessagePort object's port message queue is open and there exists a message event in that queue.