HTTP/1.0, 1.1 and Beyond,
An Evolutionary Perspective on HTTP

Henrik Frystyk Nielsen

Purpose of this Talk

Why did HTTP end up the way it did?
- What caused new features to be introduced?
- Nobody could have predicted the path it took!
- The KISS (Keep It Simple, Stupid!) principle is essential
What is HTTP, really?
- The Basic HTTP Building Blocks
Bare Bone HTTP
- In every big protocol there is a small one waiting to get out
Common Comments and Questions about HTTP
- On performance, state, complexity, extensibility

Once Upon a Time...

HTTP started out as a simple hypertext protocol
Send GET request - get back a document
- Hypertext was what you asked for and what you got
There was no separate information about the documents you retrieved - everything was embedded in the document itself
These were the early days of HTTP/0.9
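
A minimal sketch of how small such a request is, assuming the HTTP/0.9 form of a request: the word GET, a path, and a line end, with no version and no headers. The function name and the path are made up for illustration.

```python
def build_http09_request(path: str) -> bytes:
    """Build an HTTP/0.9-style request: method, path, line end. Nothing else."""
    if any(ch in path for ch in " \r\n"):
        raise ValueError("path must not contain whitespace")
    return f"GET {path}\r\n".encode("ascii")


# Hypothetical path, used only for illustration.
print(build_http09_request("/hypertext/overview.html"))
```

The response would simply be the document bytes followed by the server closing the connection, which is why there was no room for information about the document.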

Then came the <IMG …> hack

No more only text based documents
Needed type information to distinguish images from text
MIME provided a mechanism for describing protocol messages
- Was adopted for describing HTTP messages
- A major cross road on the evolutionary path
HTTP/1.0 was on its way
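
Because HTTP headers borrow the MIME header syntax, a MIME parser can read them. The sketch below uses Python's standard email parser on a header block (without the status line) to recover the type that tells an image from text. The helper name and sample headers are assumptions made for illustration.

```python
from email.parser import Parser


def media_type(raw_headers: str) -> str:
    """Return the media type declared in a MIME-style header block."""
    message = Parser().parsestr(raw_headers, headersonly=True)
    # get_content_type() lowercases the value and drops parameters.
    return message.get_content_type()


# Hypothetical response headers, status line already removed.
image_headers = "Content-Type: image/gif\r\nContent-Length: 35\r\n\r\n"
text_headers = "Content-Type: text/html; charset=iso-8859-1\r\n\r\n"

print(media_type(image_headers))
print(media_type(text_headers))
```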

Then came Proxies and Gateways

First to get access to other systems like Gopher and WAIS
- Bootstrap mechanism for accessing information
Then to traverse firewalls
- Turned out to be better than mechanisms like SOCKS
And soon caching became popular based on last modified dates and heuristics
Proxies are a crucial piece of Web architecture
- Allows for new levels of indirection
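
A rough sketch of what caching "based on last modified dates and heuristics" can look like: with no explicit expiry, a cache guesses that a document stays fresh for some fraction of the time it had gone unchanged when fetched. The 10 percent figure and all names here are assumptions for illustration, not something the talk specifies.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness(fetched: datetime, last_modified: datetime,
                        percent: int = 10) -> timedelta:
    """Guess a freshness lifetime from how long the document had been unchanged."""
    unchanged_for = fetched - last_modified
    if unchanged_for < timedelta(0):
        return timedelta(0)  # Last-Modified in the future: do not trust it
    return unchanged_for * percent / 100


def is_fresh(now: datetime, fetched: datetime, last_modified: datetime) -> bool:
    """True while the cached copy is still within its guessed lifetime."""
    return now - fetched <= heuristic_freshness(fetched, last_modified)


fetched = datetime(1998, 1, 11, tzinfo=timezone.utc)
modified = datetime(1998, 1, 1, tzinfo=timezone.utc)
print(heuristic_freshness(fetched, modified))
```

The guess is exactly that, a guess, which is why content providers who counted hits had little trust in it.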

In the Mean Time...

People wanted faster renditions of their pages containing text, images and audio
Solution: Use multiple, parallel TCP connections
- This actually makes a lot of sense
- TCP sockets are easy to program
- You get a lot of resources from the OS and the Net
- It seems to be a lot faster!
One problem - the impact on the Net was a disaster
- Web applications were wasting huge amounts of resources. Servers did not do any real work

The More the Merrier

People wanted all their information in their browser
Use of POST to represent "strange ideas"
POST is not AUTOMATABLE!
- Difference from Automated!
- It is not a question of handling strange ideas!
- It is a question of letting your computer handle strange ideas!
HTTP became a byte transport
- Lack of interoperability

The Web was Commercialized

Vanity host names became popular
- Everybody wants their own domain name (www.henrik.com)
Due to a misoptimization, this could only be done using multiple IP addresses
- Result is that many machines have multiple IP addresses
- Examples of 100 or more IP addresses per machine

Web Fueled by Advertisement

Main accounting mechanism was hit counts in the form of TCP connections
No trust in heuristic caching - bust it!
- We lose revenue every time a cache serves a cached document!

The Internet was on its Knees

Several reports of busy links collapsing - no data got through
IP addresses were consumed at very high rate
But… I don’t think that HTTP would have the position it has today as the most used protocol if it had started with HTTP/1.1

HTTP/1.1 - The Big Fire Fighter

Main purpose was to fix three problems
- Provide a semantically well-defined caching model
- Support vanity hostnames
- Limiting waste of TCP connections
The criterion for solutions was that the end user would see a clear win
- People need personal incentives to change
- Implementors need clear market benefit to implement
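
To make the vanity hostname fix concrete: HTTP/1.1 requests carry a Host header, so one IP address can serve many names. The sketch below picks a site from that header. The site table and function name are hypothetical, and IPv6 literals are deliberately not handled.

```python
def select_site(request_text: str, sites: dict, default: str) -> str:
    """Pick a document root from the Host header of a raw request."""
    for line in request_text.split("\r\n")[1:]:  # skip the request line
        if not line:
            break  # blank line: end of headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]  # drop any port
            return sites.get(host, default)
    return default


# Hypothetical mapping of vanity names to document roots on one machine.
sites = {"www.henrik.com": "/srv/henrik", "www.example.org": "/srv/example"}
request = "GET / HTTP/1.1\r\nHost: www.henrik.com\r\n\r\n"
print(select_site(request, sites, "/srv/default"))
```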

Hmm, Looks Promising!

Success criteria were met
In our performance work, we could show that
- HTTP/1.1 cuts down round trip times by a lot
- It cuts down TCP overhead by a factor of three
- It cuts down the time to transfer data by a factor of two
We can blast out PPP, LANs and WANs
- We have not done explicit testing on wireless
- We would urge people to help do this

But How can We Extend it?

HTTP is not a centrally controlled protocol
- Perhaps it never has been
- It’s extended by everybody for any possible purpose
Clearly suffering from "HTTP is the hammer - everything is a nail" syndrome
No structured way of extending HTTP
Lack of type information
- Using POST as a tunnel mechanism
- Reducing HTTP to a byte transport
We need a more powerful framework!

HTTP - the Next Generation

HTTP-NG is a generic application level protocol
- A simple, extensible framework
Explicit Layering and modularization
- Break up the big "lump" style HTTP message
Extensibility at the core
- Lessons from our HTTP/PEP/Mandatory work
Can the Web be implemented using Distributed Object technology?

So What is HTTP anyway?

Let’s have a quick look at the model
It looks like MIME but isn’t quite
- HTTP is a layered Protocol
- Has Scope, Proxying and Caching
- Has inherent fuzziness built in
- Content negotiation and redirections
It looks like RPC but isn’t quite
- Proxies are explicit in the interfaces
- Has the notion of end-to-end and hop-by-hop scope
- Interfaces are both vertical and horizontal
- Headers separated from methods
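
The end-to-end versus hop-by-hop distinction can be sketched as what a proxy does before forwarding a message: it drops the headers that only concern the current connection, including any named in the Connection header. The header set follows the HTTP/1.1 list of hop-by-hop headers; the function name and sample headers are assumptions for illustration.

```python
from typing import List, Tuple

# Hop-by-hop headers as listed by HTTP/1.1; everything else is end-to-end.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def end_to_end_headers(headers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return only the headers a proxy should pass on to the next hop."""
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Headers named in Connection are hop-by-hop for this hop too.
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]


incoming = [("Host", "www.example.org"), ("Connection", "close, X-Trace"),
            ("X-Trace", "1"), ("Accept", "*/*")]
print(end_to_end_headers(incoming))
```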

Methods, Headers, and Status Codes

No explicit relationship between methods, header fields and status codes in an HTTP message. Relationship must be defined implicitly
Methods to be performed on the resource
- A priori agreement of semantics. Can’t be extended dynamically
Headers carry information about the parties involved, the transaction, the message body or the resource
- Unknown header fields must be ignored without affecting the outcome of the transaction
Status Codes are the results returned by the server
- Status codes are somewhat easier to extend, as unknown status codes must be treated as the x00 code of that class.
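
The status code rule can be sketched in a few lines: a client that does not recognize a code falls back to the x00 code of the same class. The table of known codes below is a small, assumed subset chosen for illustration.

```python
# A deliberately small set of codes this hypothetical client understands.
KNOWN = {100, 200, 300, 304, 400, 404, 500}


def effective_status(code: int, known=frozenset(KNOWN)) -> int:
    """Map an unknown status code to the x00 code of its class."""
    if code in known:
        return code
    fallback = (code // 100) * 100
    if fallback in known:
        return fallback
    raise ValueError(f"status code {code} is outside every known class")


print(effective_status(404))  # recognized, used as-is
print(effective_status(299))  # unknown, handled like 200
```

There is no comparable fallback for an unknown method, which is one way to read why methods cannot be extended dynamically.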

Common Questions about HTTP

People often dismiss HTTP based on inaccurate assumptions
Not a question of "HTTP all over" but a path for evolvability
Working our way towards a generic application level protocol framework
- An important goal of HTTP-NG!

Why is HTTP/1.1 so big?

I have often heard: We only need a small subset, not the whole thing
What does it really take to be an HTTP application? Not a lot!
Most features defined by header fields have a request part and a response part.
The SHOULD and MUST requirements on which header fields to support often come in pairs: if you support a certain feature then you have to support all header fields associated with that feature.

HTTP is for HTTP URLs, right?

HTTP can handle arbitrary URIs - not only "http://…"
This is a consequence of Proxies and Gateways in the HTTP model
I don’t believe in Gateways
- I want one information space with a consistent set of services
URI space is getting more complex
- New URI schemes on a daily basis
- A serious problem for interoperability
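
As an illustration of HTTP carrying non-http URIs: a client talking to a proxy puts the absolute URI on the request line, and the proxy acts as the gateway to the other system. This is a sketch only; the function name and the gopher host are made up.

```python
from urllib.parse import urlsplit


def proxy_request_line(uri: str, version: str = "HTTP/1.0") -> str:
    """Build the request line a client would send to a proxy for any URI scheme."""
    if not urlsplit(uri).scheme:
        raise ValueError("a proxy request needs an absolute URI")
    return f"GET {uri} {version}\r\n"


# Hypothetical Gopher resource fetched through an HTTP proxy.
print(proxy_request_line("gopher://gopher.example.org/1/"))
```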

I can’t use HTTP - I need state!

HTTP is inherently a stateless protocol
- Request-response pairs are independent but not necessarily idempotent.
- POST, as well as sequences of PUT and DELETE, changes state
State can be built on top of HTTP
- Often sufficient to add a simple header field or a parameter on an existing one
Cookies are state at a higher level
- Involves the end-user and hence concerns about privacy etc.
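
A small sketch of state layered on top of stateless requests: each request is handled independently, and the only link between them is a token carried in a header. A cookie is used here because it is the familiar case; the cookie name, the in-memory store and the handler are all assumptions for illustration.

```python
import secrets
from http.cookies import SimpleCookie


def handle_request(cookie_header: str, store: dict):
    """Count visits per session; return (visit_count, Set-Cookie value or None)."""
    jar = SimpleCookie(cookie_header)
    session_id = jar["session"].value if "session" in jar else None
    set_cookie = None
    if session_id not in store:
        # Unknown or missing token: start a new session.
        session_id = secrets.token_hex(8)
        store[session_id] = 0
        set_cookie = f"session={session_id}"
    store[session_id] += 1
    return store[session_id], set_cookie


store = {}
count, cookie = handle_request("", store)      # first request, no cookie yet
count, _ = handle_request(cookie, store)       # client echoes the cookie back
print(count)
```

The protocol itself stays stateless: all the state lives in the server's store and in the token the client chooses to send back.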

HTTP is for TCP Only!

There is (almost) nothing that binds HTTP to TCP
- HTTP is known to run on top of non-TCP networks
Often said that UDP is faster than TCP
- Pipelining changes this dramatically: requests and responses take fragments of TCP packets
UDP Support should be done by layering
- Many examples of MIME based protocols supporting UDP at the application level
- Doesn’t make sense!

HTTP Can’t handle Streamed Data

There is a difference between Controlling and Transmitting
HTTP is not a real time protocol
But can be used to control audio/video streams
- Essentially as a remote control protocol

HTTP is too Slow!

Well, performance is relative
HTTP is not a fast protocol - but there are not very many fast protocols around
- POP is really bad with respect to round trips (RTT).
- CORBA is really bad with respect to bytes and RTTs
On wireless, RTT is the factor that kills you
HTTP/1.1 is fairly good at avoiding RTT delays
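
One way HTTP/1.1 avoids round trips is pipelining, mentioned earlier: several requests are written back to back without waiting for each response. The sketch below builds such a buffer and compares round trips under a deliberately idealized model that ignores TCP setup, window sizes and server delays. All names are hypothetical.

```python
def pipelined_requests(host: str, paths) -> bytes:
    """Concatenate several GET requests so they can be written in one go."""
    return b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )


def round_trips(n_requests: int, pipelined: bool) -> int:
    """Idealized round-trip count: one per request, or one in total if pipelined."""
    return 1 if pipelined and n_requests > 0 else n_requests


paths = ["/", "/logo.gif", "/style.css"]
buffer = pipelined_requests("www.example.org", paths)
print(len(buffer), "bytes carrying", len(paths), "requests")
print(round_trips(len(paths), pipelined=False), "vs",
      round_trips(len(paths), pipelined=True))
```

Small requests like these can share a single TCP segment, which is the sense in which requests and responses take fragments of TCP packets.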

More Information on the Web

HTTP-NG Project
HTTP-NG Activity
HTTP-NG Working Groups (W3C Members only)
HTTP/1.x Overview
W3C Member Site
W3C
