From W3C Wiki

Web / Internet Tension

This is a list of areas of tension between the Web [1] and Internet [2] architectures.

Key to understanding these tensions (which do not always follow organisational lines between the W3C and IETF) is that the Web is coalescing into a "Web Stack" of specifications, practices and implementations that have very broad adoption among end users (using browsers) and developers (using HTML, JavaScript, HTTP, etc.). The Web stack is a very appealing target for application development, and once you've adopted it, the cost of working outside of it becomes prohibitive.

Indicating Security Context

Within the Web stack, domains of control are represented by origins, i.e., (scheme, host, port) tuples. The same-origin policy built on them governs, among other things, the ability of JavaScript code to manipulate a document object model, the ability to retrieve data from other origins using XMLHttpRequest, and the ability to place and retrieve cookies.
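As a rough illustration of the origin tuple described above, the following Python sketch derives (scheme, host, port) from a URL and compares two URLs. The function names `origin` and `same_origin` and the `DEFAULT_PORTS` table are hypothetical names chosen for this example, and it only handles http and https; real browsers apply more rules than this.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only http and https are considered,
# with their usual default ports.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (scheme, host, port) tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)

def same_origin(url_a, url_b):
    """True if both URLs share scheme, host and port."""
    return origin(url_a) == origin(url_b)
```

Under these assumptions, `http://example.com/a` and `http://example.com:80/b` share an origin, while `http://example.com/` and `https://example.com/` do not, because the scheme is part of the tuple.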

Indicating the security context of a connection in the URL (usually via the URL scheme) is therefore critical to the Web stack: whether or not a network attacker can control the content of a security context is encoded in the distinction between http and https, and (to first order) secure scripting contexts are isolated from insecure ones.
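To make the role of the scheme concrete, here is a minimal Python sketch that reads the security context off the URL alone. The names `SECURE_SCHEMES`, `is_secure` and `crosses_security_boundary` are hypothetical, and treating only https as secure is a simplifying assumption of the example, not a statement of what any browser implements.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only "https" marks a context as secure.
SECURE_SCHEMES = {"https"}

def is_secure(url):
    """True if the URL scheme signals protection from network attackers."""
    return urlsplit(url).scheme.lower() in SECURE_SCHEMES

def crosses_security_boundary(url_a, url_b):
    """True if exactly one of the two URLs names a secure context."""
    return is_secure(url_a) != is_secure(url_b)
```

The point of the sketch is that no connection has to be opened to make the decision: the URL itself carries the information, which is what lets the Web stack keep the two kinds of context apart.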

New mechanisms (e.g., CORS [3]) are also being built with a requirement to distinguish between secure and insecure contexts based on their URI.

In the Internet architecture, protocols are encouraged NOT to reflect security properties by registering a separate URL scheme or using a separate TCP/UDP port. It is widely believed that protocols should instead be able to upgrade their security properties without changing scheme or port, based upon the requirements of the application.

Performing Discovery

Increasingly, Web applications need to perform discovery to find nearby resources. For example, CORE [4] is chartered to do this and is examining emerging Web architecture for discovery, such as well-known URIs [5] and the host-meta format [6] (packaged as LRDD [7]).
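As a small illustration of the well-known URI approach, the Python sketch below builds the location where a client would look for a site's host-meta document. The helper name `well_known_uri` is hypothetical; the `/.well-known/` path prefix and the `host-meta` suffix are the ones registered for this purpose, and the sketch only constructs the URI without fetching or parsing anything.

```python
from urllib.parse import urlsplit, urlunsplit

def well_known_uri(url, suffix):
    """Build the well-known URI for `suffix` on the site that serves `url`."""
    parts = urlsplit(url)
    # Keep scheme and authority; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/" + suffix, "", ""))
```

For example, `well_known_uri("https://example.com/blog/post?id=1", "host-meta")` yields `https://example.com/.well-known/host-meta`. Discovery of this kind starts from a URL the application already has, using only ordinary HTTP requests.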

The Internet stack already has well-understood solutions to discovery (e.g. ZeroConf [8]). However, they're not readily available from the Web stack; there aren't JavaScript APIs for these capabilities, nor are they widely exposed as part of OS DNS implementations, so they're "invisible" to Web developers.


HTTP is a "MIME-like" protocol, but not actually MIME. While issues in HTTP/MIME gateways haven't often been seen in deployment, there have been architectural issues brought about by the differences.

For example, HTTP uses the media type system from MIME to identify message intent. It also allows a message to be encoded using a content-coding (end to end) or transfer-coding (hop-by-hop). However, some Web formats want a filename extension to indicate unambiguously whether something is compressed (e.g., svg vs. svgz).
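The difference between the two views can be sketched in Python. In the HTTP view, the media type and the content-coding are separate pieces of metadata; in the filename view, one extension has to carry both. The function names and the sample `SVG` payload are hypothetical, and the label "identity" for "no coding applied" is a convention of this sketch.

```python
import gzip

# Hypothetical sample payload for the example.
SVG = b'<svg xmlns="http://www.w3.org/2000/svg"/>'

def describe_by_headers(headers):
    """HTTP view: the type says what it is, the coding says how it is encoded."""
    media_type = headers.get("Content-Type", "")
    coding = headers.get("Content-Encoding", "identity")
    return media_type, coding

def describe_by_filename(name):
    """Filename view: the extension must carry both facts at once."""
    if name.endswith(".svgz"):
        return "image/svg+xml", "gzip"
    if name.endswith(".svg"):
        return "image/svg+xml", "identity"
    return "application/octet-stream", "identity"

def decode(body, coding):
    """Undo the coding, if any, to recover the underlying representation."""
    return gzip.decompress(body) if coding == "gzip" else body
```

Both views describe the same bytes here, but the information travels differently: HTTP headers stay with the message, whereas the extension stays with the file name once the content is saved.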

See also Larry's blog.