HTTP/1.0, 1.1 and Beyond,
An Evolutionary Perspective on HTTP
Purpose of this Talk
-
Why did HTTP end up the way it did?
-
What caused new features to be introduced?
-
Nobody could have predicted the path it took!
-
The KISS (Keep It Simple, Stupid!) principle is essential
-
What is HTTP, really?
-
The Basic HTTP Building Blocks
-
Bare Bone HTTP
-
In every big protocol there is a small one waiting to get out
-
Common Comments and Questions about HTTP
-
On performance, state, complexity, extensibility
Once Upon a Time...
-
HTTP started out as a simple hypertext protocol
-
Send GET request - get back a document
-
Hypertext was what you asked for and what you got
-
There was no separate information about the documents you retrieved - any
such information was embedded in the document itself
-
These were the early days of HTTP/0.9
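To make the "send GET, get back a document" exchange concrete, here is a minimal sketch of what an HTTP/0.9-style request looks like on the wire. The function names are made up for illustration, and the exact framing is an assumption based on the description above: one request line, no version token, no header fields, and a response that is nothing but the document.

```python
def build_http09_request(path: str) -> bytes:
    """Build an HTTP/0.9-style request: just the method and the path."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    if any(ch in path for ch in " \r\n"):
        raise ValueError("path must not contain whitespace")
    # No version token and no header fields: the request is a single line.
    return f"GET {path}\r\n".encode("ascii")


def read_http09_response(raw: bytes) -> str:
    """The response is the document itself: no status line, no headers."""
    return raw.decode("latin-1")
```

Note that the client has nowhere to look for a type or a length: everything it learns comes from the document bytes themselves.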
Then came the <IMG> hack
-
Documents were no longer text only
-
Needed type information to distinguish images from text
-
MIME provided a mechanism for describing protocol messages
-
Was adopted for describing HTTP messages
-
A major crossroads on the evolutionary path
-
HTTP/1.0 was on its way
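Because HTTP/1.0 borrowed the MIME header syntax, a MIME parser can read the header block of a response. The sketch below uses Python's standard email.parser for that; the function name parse_response_head and the sample response are illustrative assumptions, and real HTTP parsing has more corner cases than this handles.

```python
from email.parser import Parser


def parse_response_head(head: str):
    """Split a response head into status line parts and MIME-style headers."""
    status_line, _, header_text = head.partition("\r\n")
    version, code, reason = status_line.split(" ", 2)
    # The header block uses the MIME "Name: value" syntax, so the
    # standard email parser can read it.
    headers = Parser().parsestr(header_text, headersonly=True)
    return version, int(code), reason, headers


if __name__ == "__main__":
    sample = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: image/gif\r\n"
        "Content-Length: 1024\r\n"
        "\r\n"
    )
    version, code, reason, headers = parse_response_head(sample)
    # The type information tells the client this is an image, not text.
    print(version, code, reason, headers.get_content_type())
```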
Then came Proxies and Gateways
-
First to get access to other systems like Gopher and WAIS
-
Bootstrap mechanism for accessing information
-
Then to traverse firewalls
-
Turned out to be better than mechanisms like SOCKS
-
And soon caching became popular based on last modified dates and heuristics
-
Proxies became a crucial piece of the Web architecture
-
Allows for new levels of indirection
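Heuristic caching based on last modified dates can be sketched as follows. The 10% fraction is an assumption (a commonly used rule of thumb, not something stated in this talk), and the function names are invented for the example: a document that has not changed for a long time is guessed to stay unchanged a little longer.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: treat a document as fresh for 10% of its age when it was served.
HEURISTIC_FRACTION = 0.1


def heuristic_freshness(date_header: str, last_modified_header: str) -> timedelta:
    """Guess how long a response may be reused, from Date and Last-Modified."""
    served = parsedate_to_datetime(date_header)
    modified = parsedate_to_datetime(last_modified_header)
    age_at_serving = served - modified
    if age_at_serving < timedelta(0):
        # A modification date in the future gives no basis for a guess.
        return timedelta(0)
    return age_at_serving * HEURISTIC_FRACTION


def is_fresh(date_header: str, last_modified_header: str, now=None) -> bool:
    """True if the cached copy is still within its guessed lifetime."""
    if now is None:
        now = datetime.now(timezone.utc)
    served = parsedate_to_datetime(date_header)
    return now - served <= heuristic_freshness(date_header, last_modified_header)
```

The guess is exactly that, a guess, which is why content providers who counted hits had little trust in it.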
In the Meantime...
-
People wanted faster renditions of their pages containing text, images and
audio
-
Solution: Use multiple, parallel TCP connections
-
This actually makes a lot of sense
-
TCP sockets are easy to program
-
You get a lot of resources from the OS and the Net
-
It seems to be a lot faster!
-
One problem - the impact on the Net was a disaster
-
Web applications were wasting huge amounts of resources. Servers did not
do any real work
The More the Merrier
-
People wanted all their information in their browser
-
Use of POST to represent "strange ideas"
-
POST is not AUTOMATABLE!
-
Automatable is different from automated!
-
It is not a question of handling strange ideas!
-
It is a question of letting your computer handle strange ideas!
-
HTTP became a byte transport
The Web was Commercialized
-
Vanity host names became popular
-
Everybody wants their own domain name (www.henrik.com)
-
Due to a misoptimization, this could only be done using multiple IP addresses
-
Result is that many machines have multiple IP addresses
-
Examples of 100 or more IP addresses per machine
Web Fueled by Advertisement
-
Main accounting mechanism was hit counts in the form of TCP connections
-
No trust in heuristic caching - bust it!
-
We lose revenue every time a cache serves a cached document!
The Internet was on its Knees
-
Several reports of busy links collapsing - no data got through
-
IP addresses were consumed at a very high rate
-
But I don't think that HTTP would have the position it has today as the most
used protocol if it had started with HTTP/1.1
HTTP/1.1 - The Big Fire Fighter
-
Main purpose was to fix three problems
-
Provide a semantically well-defined caching model
-
Support vanity hostnames
-
Limit the waste of TCP connections
-
The criterion for solutions was that the end user would see a clear win
-
People need personal incentives to change
-
Implementors need clear market benefit to implement
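Support for vanity hostnames in HTTP/1.1 rests on the Host request header: the client states which name it asked for, so one IP address can serve many names. A minimal sketch of the server-side dispatch, assuming a simple table from hostname to document root (the function name, the table and the paths are invented for illustration, and IPv6 literals are not handled):

```python
def select_site(request_head: str, sites: dict, default: str) -> str:
    """Pick a document root from the Host header of an HTTP/1.1 request."""
    lines = request_head.split("\r\n")
    for line in lines[1:]:          # skip the request line
        if not line:                # blank line ends the header block
            break
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            # Drop an optional ":port" suffix and compare case-insensitively.
            host = value.strip().lower().partition(":")[0]
            return sites.get(host, default)
    return default


if __name__ == "__main__":
    # Hypothetical mapping: two vanity names sharing one machine and one address.
    sites = {"www.henrik.com": "/srv/henrik", "www.example.org": "/srv/example"}
    head = "GET / HTTP/1.1\r\nHost: www.henrik.com\r\n\r\n"
    print(select_site(head, sites, "/srv/default"))
```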
Hmm, Looks Promising!
-
Success criteria were met
-
In our performance work, we could show that
-
HTTP/1.1 cuts down Round Trip Times by a lot
-
It cuts down TCP overhead by a factor of three
-
It cuts down time to transfer data by a factor of two
-
We can blast out PPP, LANs and WANs
-
We have not done explicit testing on wireless
-
We would urge people to help with this
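Why persistent connections and pipelining save round trips can be seen with a deliberately idealized count. This is a toy model, not the measurement behind the numbers above: it assumes one round trip per TCP handshake and one per request/response exchange, and it ignores slow start, bandwidth, parallel connections and server think time. The function names are invented for the example.

```python
def round_trips_one_connection_each(n: int) -> int:
    """HTTP/1.0 style, fetched one after another: handshake + exchange per item."""
    return 2 * n


def round_trips_persistent(n: int) -> int:
    """One handshake, then one exchange per item on the same connection."""
    return (1 + n) if n else 0


def round_trips_pipelined(n: int) -> int:
    """One handshake, then all requests sent back to back (ideal case)."""
    return 2 if n else 0


if __name__ == "__main__":
    n = 10  # e.g. a page with ten embedded objects
    print(round_trips_one_connection_each(n),
          round_trips_persistent(n),
          round_trips_pipelined(n))
```

Even in this crude form the model shows why the gain grows with the number of objects on a page and with the latency of the link.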
But How can We Extend it?
-
HTTP is not a centrally controlled protocol
-
Maybe it never has been
-
It's extended by everybody for any possible purpose
-
Clearly suffering from "HTTP is the hammer - everything is a nail" syndrome
-
No structured way of extending HTTP
-
Lack of type information
-
Using POST as a tunnel mechanism
-
Reducing HTTP to a byte transport
-
We need a more powerful framework!
HTTP - the Next Generation
-
HTTP-NG is a generic application-level protocol
-
A simple, extensible framework
-
Explicit Layering and modularization
-
Break up the big "lump" style HTTP message
-
Extensibility at the core
-
Lessons from our HTTP/PEP/Mandatory work
-
Can the Web be implemented using Distributed Object technology?
So What is HTTP anyway?
-
Let's have a quick look at the model
-
It looks like MIME but isn't quite
-
HTTP is a layered protocol
-
Has Scope, Proxying and Caching
-
Has inherent fuzziness built in
-
Content negotiation and redirections
-
It looks like RPC but isn't quite
-
Proxies are explicit in the interfaces
-
Has the notion of end-to-end and hop-by-hop scope
-
Interfaces are both vertical and horizontal
-
Headers separated from methods
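The end-to-end versus hop-by-hop distinction shows up directly in what a proxy forwards. The sketch below drops hop-by-hop header fields before passing a message on. The fixed list follows the HTTP/1.1 specification as I recall it and should be treated as an assumption to verify; the function name is invented for the example.

```python
# Header fields meaningful only for a single connection (assumed list,
# to be checked against the HTTP/1.1 specification).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def end_to_end_headers(headers: list) -> list:
    """Return the (name, value) pairs a proxy should forward to the next hop."""
    # The Connection header can name further fields that are hop-by-hop
    # for this particular message.
    named = set()
    for name, value in headers:
        if name.lower() == "connection":
            named.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | named
    return [(n, v) for n, v in headers if n.lower() not in drop]
```

An RPC system has no counterpart to this: there the intermediaries are invisible, whereas here they are part of the interface.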
Methods, Headers, and Status Codes
-
No explicit relationship between methods, header fields and status codes
in an HTTP message. Relationship must be defined implicitly
-
Methods to be performed on the resource
-
A priori agreement of semantics. Can't be extended dynamically
-
Headers carry information about the parties involved, the transaction, the
message body or the resource
-
Unknown header fields must be ignored without affecting the outcome of the
transaction
-
Status Codes are the results returned by the server
-
Status codes are somewhat easier to extend, as unknown status codes must
be treated as the x00 code of that class.
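The two extension rules above, ignore unknown header fields and treat an unknown status code as the x00 code of its class, are small enough to sketch. The sets of "known" codes and headers are arbitrary examples of what one particular client might implement, and the function names are invented for illustration.

```python
# Illustrative only: what this hypothetical client happens to understand.
KNOWN_STATUS = frozenset({200, 301, 304, 400, 404, 500})
KNOWN_HEADERS = frozenset({"content-type", "content-length"})


def effective_status(code: int, known=KNOWN_STATUS) -> int:
    """Map an unknown status code to the x00 code of its class."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    if code in known:
        return code
    return (code // 100) * 100


def understood_headers(headers: dict, known=KNOWN_HEADERS) -> dict:
    """Keep only the header fields this client understands; ignore the rest."""
    return {n: v for n, v in headers.items() if n.lower() in known}
```

A new status code therefore degrades gracefully to its class, while a new method has no such fallback.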
Common Questions about HTTP
-
People often dismiss HTTP based on inaccurate assumptions
-
Not a question of "HTTP all over" but a path for evolvability
-
Working our way towards a generic application level protocol framework
-
An important goal of HTTP-NG!
Why is HTTP/1.1 so big?
-
I have often heard: We only need a small subset, not the whole thing
-
What does it really take to be an HTTP application? Not a lot!
-
Most features defined by header fields have a request part and a response
part.
-
The SHOULD and MUST requirements on which header fields to support often come
in pairs: if you support a certain feature, then you have to support all
header fields associated with that feature.
HTTP is for HTTP URLs, right?
-
HTTP can handle arbitrary URIs - not only "http://"
-
This is a consequence of Proxies and Gateways in the HTTP model
-
I don't believe in Gateways
-
I want one information space with a consistent set of services
-
URI space is getting more complex
-
New URI schemes on a daily basis
-
A serious problem for interoperability
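The reason HTTP can carry arbitrary URIs is visible in the request line a client sends to a proxy: it contains the full URI, scheme included, rather than just a path. A minimal sketch follows; the function names and the example hosts are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def proxy_request_line(method: str, uri: str, version: str = "HTTP/1.1") -> str:
    """Request line sent to a proxy: the absolute URI, whatever its scheme."""
    if not urlsplit(uri).scheme:
        raise ValueError("a proxy request needs an absolute URI")
    return f"{method} {uri} {version}\r\n"


def origin_request_line(method: str, uri: str, version: str = "HTTP/1.1") -> str:
    """Request line sent directly to an origin server: path and query only."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"{method} {target} {version}\r\n"


if __name__ == "__main__":
    # A proxy acting as a gateway can be asked for a non-HTTP resource.
    print(proxy_request_line("GET", "ftp://ftp.example.org/pub/README"), end="")
    print(origin_request_line("GET", "http://www.example.org/a?b=1"), end="")
```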
I can't use HTTP - I need state!
-
HTTP is inherently a stateless protocol
-
Request-response pairs are independent but not necessarily idempotent.
-
POST, as well as sequences of PUT and DELETE, changes state
-
State can be built on top of HTTP
-
Often sufficient to add a simple header field or a parameter on an existing
one
-
Cookies are state at a higher level
-
Involves the end-user and hence concerns about privacy etc.
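Building state on top of a stateless protocol with "a simple header field" can be sketched like this. Each request is handled independently; the only thing tying requests together is a token the client echoes back. The header name X-Session, the class name and the token format are hypothetical, and a real service would use unguessable tokens.

```python
import itertools

SESSION_HEADER = "X-Session"  # hypothetical header name, for illustration only


class CounterService:
    """Counts visits per session; the protocol exchange itself stays stateless."""

    def __init__(self):
        self._visits = {}
        self._ids = itertools.count(1)

    def handle(self, request_headers: dict) -> tuple:
        """Return (response_headers, body) for one independent request."""
        token = request_headers.get(SESSION_HEADER)
        if token not in self._visits:
            # Unknown or missing token: start a new session.
            # (Predictable tokens are fine for a sketch, not for real use.)
            token = f"s{next(self._ids)}"
            self._visits[token] = 0
        self._visits[token] += 1
        return {SESSION_HEADER: token}, f"visit {self._visits[token]}"
```

The state lives in the application, keyed by the header value; HTTP merely carries the key.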
HTTP is for TCP Only!
-
There is (almost) nothing that binds HTTP to TCP
-
HTTP is known to run on top of non-TCP networks
-
It is often said that UDP is faster than TCP
-
Pipelining changes this dramatically: requests and responses take up only
fragments of TCP packets
-
UDP Support should be done by layering
-
Many examples of MIME based protocols supporting UDP at the application level
-
Doesn't make sense!
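The pipelining point can be illustrated by packing several requests into one buffer and checking how little of a TCP segment they occupy. The 1460-byte payload is an assumption (a typical value on Ethernet, not a figure from this talk), and the function names and host are invented for the example.

```python
# Assumption: payload that fits in one TCP segment on a typical Ethernet path.
ASSUMED_SEGMENT_PAYLOAD = 1460


def pipeline(host: str, paths: list) -> bytes:
    """Concatenate several GET requests so they can be written in one go."""
    return b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )


def segments_needed(buffer: bytes, payload: int = ASSUMED_SEGMENT_PAYLOAD) -> int:
    """Number of segments needed to carry the buffer (ceiling division)."""
    return -(-len(buffer) // payload)


if __name__ == "__main__":
    buf = pipeline("www.example.org", ["/a.gif", "/b.gif", "/c.gif"])
    # Three requests share a single segment instead of taking one packet each.
    print(len(buf), segments_needed(buf))
```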
HTTP Can't Handle Streamed Data
-
There is a difference between Controlling and Transmitting
-
HTTP is not a real time protocol
-
But can be used to control audio/video streams
-
Essentially as a remote control protocol
HTTP is too Slow!
-
Well, performance is relative
-
HTTP is not a fast protocol - but there are not very many fast protocols
around
-
POP is really bad with respect to round trips (RTT).
-
CORBA is really bad with respect to bytes and RTTs
-
On wireless, RTT is the factor that kills you
-
HTTP/1.1 is fairly good at avoiding RTT delays
More Information on the Web