This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by it. A list of current NOTEs can be found at: /TR/
Since NOTEs are subject to frequent change, you are advised to reference the above URL, rather than the URLs for NOTEs themselves. The results contained in this note are preliminary - as we perform further experiments, the note will continue to evolve. When this work is complete and the results are considered by us to be "final", the status of this Note will be updated to reflect its completion. In particular, further experimentation with range requests is planned soon. Please check back again later for further results.
The results here are provided for community interest, though they have not been rigorously validated and should not alone be used to make commercial decisions. In addition, the exact results are obviously a function of the tests performed; your mileage will vary.
Submitted to ACM SIGCOMM '97.
We describe our investigation of the effect of persistent connections, pipelining and link level document compression on our client and server HTTP implementations. A simple test setup is used to verify HTTP/1.1's design and understand HTTP/1.1 implementation strategies. We present TCP and real time performance data between the libwww robot and both the Jigsaw and Apache HTTP servers using HTTP/1.0, HTTP/1.1 with persistent connections, HTTP/1.1 with pipelined requests, and HTTP/1.1 with pipelined requests and deflate data compression [22]. We also investigate whether the TCP Nagle algorithm has an effect on HTTP/1.1 performance. While somewhat artificial and possibly overstating the benefits of HTTP/1.1, we believe the tests and results approximate some common behavior seen in browsers. The results confirm that HTTP/1.1 is meeting its major design goals. Our experience has been that implementation details are very important to achieve all of the benefits of HTTP/1.1.
For all our tests, a pipelined HTTP/1.1 implementation outperformed HTTP/1.0, even when the HTTP/1.0 implementation used multiple connections in parallel, under all network environments tested. The savings were at least a factor of two, and sometimes as much as a factor of ten, in terms of packets transmitted. Elapsed time improvement is less dramatic, and strongly depends on your network connection.
Note that the savings in network traffic and performance shown in this document are solely due to the effects of pipelining, persistent connections, and transport compression. Some data is presented showing the further savings made possible by the use of CSS1 style sheets [10] and the more compact PNG [20] image representation, both enabled by recent recommendations at higher levels than the base protocol. Time did not allow full end-to-end data collection on these cases. The results show that HTTP/1.1 and changes in Web content will have dramatic effects on Internet and Web performance as HTTP/1.1 and related technologies deploy over the near future. Universal use of style sheets, even without deployment of HTTP/1.1, would cause a very significant reduction in network traffic.
This paper does not investigate further performance and network savings enabled by the improved caching facilities provided by the HTTP/1.1 protocol, or by sophisticated use of range requests.
The intent of this paper is to present some of the thought processes that we used to test and optimize our implementations in the hopes it may guide others through their own implementation efforts, rather than just present final polished results, which would not serve as guidance to others.
HTTP/1.1 [4] is an upward compatible protocol to HTTP/1.0 [3]. Both HTTP/1.0 and HTTP/1.1 use the TCP protocol [12] for data transport. The effects of HTTP/1.0's use of TCP on the Internet have resulted in major problems caused by congestion and unnecessary overhead [6].
Major HTTP/1.1 goals include:
HTTP/1.1 includes a number of new elements that together should have a major effect on Internet traffic. These include:
HTTP must become a good network citizen to overcome the current Internet congestion problems. The current "World Wide Wait" can only be solved if both HTTP and the content are significantly changed to reduce unneeded overhead. If end-user performance does not improve, it is unlikely that HTTP/1.1 will be deployed; improved end-user performance is therefore vital to its success.
Protocol elements often interact and have multiple uses; for example, range requests may be very useful to retrieve the remainder of cached images after a communications failure or user-interrupted transfer, avoiding retransmission of data already successfully transferred. They may also be used to avoid excessive serialization of requests behind a large transfer (see the range requests and validation section). Finally, as an example of another use of range requests, the directory of an Adobe PDF document might be retrieved from the end of a document.
HTTP/1.1 does not attempt to solve some commonly seen problems, such as hot spots, or "flash crowds" at popular web sites, but will at least help with these problems.
Simultaneously with the deployment of HTTP/1.1, the Web will see the deployment of style sheets and new image and animation formats, which will also change the nature of the content that HTTP/1.1 will transport. This paper presents measured results of the consequences of HTTP/1.1 protocol additions, and some computed data on the effects that the deployment of new content is likely to have in addition, as well as some speculations on how these changes to the Web will affect Internet behavior.
To test the effects of some of these changes, we took data on two tests that simulate contrasting behavior of clients: visiting a site for the first time, where nothing is in a client cache, and revalidating cached items when a site is revisited. We do so in three network environments we believe span common situations of web use: a local Ethernet, a wide area network connected to a local Ethernet, and a dialup PPP connection using a 28.8Kbaud modem.
While both the first-time and revalidation tests likely simulate common client behavior seen on the web, we have no idea which is more common, and the performance of HTTP/1.1 will likely change client and user behavior to such a large degree that it is impossible to extrapolate from these tests any numeric results on the Internet.
A number of analyses of HTTP/1.0 and proposals influenced HTTP/1.1's design and the work described in this paper.
We synthesized a test web site by combining data (HTML and GIF image data) from two very heavily used home pages (Netscape and Microsoft) into one, hereafter called "Microscape". The initial layout of the Microscape web site was a single page containing typical HTML totalling 42K with 41 GIF inlined images totalling 125K.
The first time test is equivalent to a browser visiting a site for the first time, e.g. its cache is empty and it has to retrieve the top page and all the embedded objects. In HTTP, this is equivalent to 42 GET requests.
This test is equivalent to revisiting a home page where the contents are already available in a local cache. The initial page and all embedded objects are validated, resulting in no actual transfer of the HTML or the embedded objects. In HTTP, this is equivalent to 42 Conditional GET requests. HTTP/1.1 supports two mechanisms for cache validation: entity tags (etags) and date stamps, whereas HTTP/1.0 only supports the latter. This test is roughly equivalent to pressing "reload" on a browser, and, depending on configuration, what a browser may do when revisiting a site which has not changed.
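As a concrete illustration of what such a revalidation request looks like on the wire (the URL, entity tag, and date below are invented placeholders, not values from the test data), a conditional GET carries only headers; if the cached copy is still valid, the server answers "304 Not Modified" with no body:

```c
/* Illustration only: URL, ETag and date are invented placeholders.
 * A conditional GET sends the cached copy's validators; if the resource
 * is unchanged, the server replies "304 Not Modified" with no body, so
 * only headers cross the wire. */
static const char conditional_get[] =
    "GET /images/logo.gif HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "If-None-Match: \"etag-123\"\r\n"                      /* opaque validator (HTTP/1.1)   */
    "If-Modified-Since: Fri, 07 Mar 1997 12:00:00 GMT\r\n" /* date validator (HTTP/1.0/1.1) */
    "\r\n";
```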
HTTP/1.0 support was provided by an old version of libwww (version 4.1D) which supported plain HTTP/1.0 with multiple simultaneous connections between two peers and no persistent cache. In this case we simulated the cache validation behavior by issuing HEAD requests on the images instead of Conditional GET requests. The profile of the HTTP/1.0 revalidation requests therefore was a total of 42 associated with the top page, with one GET (HTML) and 41 HEAD requests (images), in the initial tests. The HTTP/1.1 implementation of libwww (version 5.1) differs from the HTTP/1.0 implementation. It uses a full HTTP/1.1 compliant persistent cache generating 42 Conditional GET requests with appropriate cache validation headers to make the test more similar to likely browser behavior. Therefore the numbers of packets reported below for HTTP/1.0 are higher than those of the correct cache validation data reported for HTTP/1.1. We believe cache validation is a very common operation, and will become even more common, given HTTP/1.1's dramatic improvement in semantics and performance in this area.
In order to measure the performance in commonly used different network environments found in today's Internet, we used the following three combinations of bandwidth and latency:
| Environment | Connection |
|---|---|
| High bandwidth, low latency | LAN - 10Mbit Ethernet |
| High bandwidth, high latency | WAN - Massachusetts (MIT/LCS) to California (LBL) |
| Low bandwidth, high latency | PPP - 28.8k modem line using a telephone switch simulator |
Several platforms were used in the initial stage of the experiments for running the HTTP servers. However, we ended up using relatively fast machines to try to prevent unforeseen bottlenecks in the applications used. Jigsaw is written entirely in Java and relies on specific network features for controlling TCP provided only by JDK 1.1.
| Component | Type and Version |
|---|---|
| Server Hardware | www26.w3.org (Sparc Ultra-1, Solaris 2.5) |
| LAN Client Hardware | zorch.w3.org (Digital AlphaStation 400 4/233, Digital UNIX 4.0a) |
| WAN Client Hardware | turn.ee.lbl.gov (Digital AlphaStation 3000, Digital UNIX 4.0) |
| PPP Client Hardware | big.w3.org (Pentium Pro PC, Windows NT Server 4.0) |
| HTTP Server Software | Jigsaw 1.05 and Apache 1.2b7 + patches |
| HTTP Client Software | libwww robot and Netscape Communicator 4.0 beta 1 on Windows NT |
None of the machines were under significant load while the tests were run. The server is identical throughout our final tests - only the client changes connectivity and behavior. Both Jigsaw and libwww are currently available with HTTP/1.1 implementations, though without support for the features described in this paper, and Apache is in beta release. During the experiments changes were made to all three applications. These changes will be made available through the normal release procedures for each of the applications.
In order to get tcpdumps of the PPP packets over a modem line, we had to route the packets through a UNIX system where we could obtain tcpdumps. We set up a Linux system with both a PPP interface and an Ethernet interface and connected the Windows NT system to the network interface and the PPP interface to the telephone system and changed the routes accordingly. Due to wet weather we constantly had problems with the public telephone lines. We therefore set up our own four port telephone switch simulator for handling PPP data internally.
The HTTP/1.0 robot was set to use plain HTTP/1.0 requests with one TCP connection per request. We set the maximum number of simultaneous connections to 6 (by default, existing HTTP/1.0 applications like Netscape Navigator often use 4 simultaneous connections between client and server; many users, however, raise this to a larger number). Using 6 instead of 4 gives parallel connections an edge (at least on higher speed networks) over the default behavior exhibited by many browsers.
After testing HTTP/1.0, we then ran the robot as a simple HTTP/1.1 client using persistent connections. That is, the request / response sequence looks identical to HTTP/1.0 but all communication happens on the same TCP connection instead of 6, hence serializing all requests. The results, as seen in the table below, were a significant saving in TCP packets using HTTP/1.1, but at the cost of increased elapsed time.
As a means to lower the elapsed time and improve the efficiency, we introduced pipelining into libwww. That is, instead of waiting for a response to arrive before issuing new requests, as many requests as possible are issued at once. The responses are still serialized and no changes were made to the HTTP messages; only the timing has changed, as the robot has multiple outstanding requests on the same connection, as illustrated in the figure below:
The robot generates quite small HTTP requests - our library implementation is very careful not to generate unnecessary headers and not to waste bytes on white space. The result is an average request size of around 190 bytes, which is smaller than many existing HTTP implementations.
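As a rough illustration (this is not the libwww code; the host and paths are placeholders), a pipelined client can issue several such small requests without waiting for responses, appending them to one output buffer so that, as described next, multiple requests can share TCP segments:

```c
/* Sketch only, not the libwww implementation: several small GET requests
 * are appended to one buffer and sent together, so they can share TCP
 * segments; the responses are then read back in order. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static size_t add_request(char *buf, size_t off, size_t cap, const char *path)
{
    int n = snprintf(buf + off, cap - off,
                     "GET %s HTTP/1.1\r\n"
                     "Host: www.example.com\r\n"   /* placeholder host */
                     "\r\n", path);
    return off + (n > 0 ? (size_t)n : 0);
}

void send_pipelined(int fd)
{
    char   buf[4096];
    size_t len = 0;

    len = add_request(buf, len, sizeof buf, "/");           /* the HTML page  */
    len = add_request(buf, len, sizeof buf, "/img/a.gif");  /* inlined images */
    len = add_request(buf, len, sizeof buf, "/img/b.gif");

    write(fd, buf, len);   /* all three requests leave in as few segments
                              as TCP needs, rather than one per request  */
    /* ... read and parse the three responses, in order ... */
}
```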
The requests are buffered before transmission so that multiple HTTP requests can be sent with the same TCP segment. This has a significant impact on the number of packets required to transmit the payload and lowers CPU usage by both client and server. A consequence of the output buffering is that we need a mechanism to flush the output buffer. First we implemented a version with two mechanisms:
| Raw tcpdumps | |||
| Max simultaneous sockets | |||
| Total number of sockets used | |||
| Packets from client to server | |||
| Packets from server to client | |||
| Total number of packets | |||
| Total elapsed time [secs] |
We were simultaneously very happy and quite disappointed with the initial results above, taken late at night on a quiet Ethernet. Elapsed time performance of HTTP/1.1 with pipelining was worse than HTTP/1.0 in this initial implementation, though the number of packets used was dramatically better. We scratched our heads for a day, then convinced ourselves that on a local Ethernet, there was no reason that HTTP/1.1 should ever perform more slowly than HTTP/1.0 (since the network overhead is so much lower and the local Ethernet cannot suffer from fairness problems that might give multiple connections a performance edge in a long haul network), so we dug into our implementation further.
After study, we realized that the application (the robot) has much more knowledge about the requests than libwww, and by introducing an explicit flush mechanism in the application, we could get significantly better performance. We modified the robot to force a flush after issuing the first request on the HTML document and then buffer the following requests on the inlined images. While an HTTP library can be arranged to flush its buffers automatically after some timeout (when the data above was taken, the timeout was set to 1 second), taking advantage of knowledge in the application can result in a considerably faster implementation than relying on such a timeout. The final results show HTTP/1.1 elapsed time performance significantly faster than HTTP/1.0 on even a local network.
Based on the experience of one of the authors, we expected that a pipelined implementation of HTTP might encounter the Nagle algorithm [2][5] in TCP. The Nagle algorithm was introduced in TCP as a means of reducing the number of small TCP segments by delaying their transmission in hopes of further data becoming available, as commonly occurs in telnet or rlogin traffic. As our implementation can generate data asynchronously without waiting for a response, the Nagle algorithm could be a bottleneck. In order to test this we turned the Nagle algorithm off in both the client and the server. This was the first change to the server - all other changes were made in the client. In our initial tests, we did not observe significant problems introduced by Nagle's algorithm, though with hindsight, this was the result of our pipelined implementation and the specific test cases chosen, since with effective buffering, the segment sizes are large, avoiding Nagle's algorithm. In later experiments in which the buffering behavior of the implementations was changed, we did observe significant (sometimes dramatic) transmission delays due to Nagle; we recommend therefore that HTTP/1.1 implementations that buffer output disable Nagle's algorithm (set the TCP_NODELAY socket option). This confirms the experiences of Touch [7].
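On systems with BSD-style sockets, disabling Nagle's algorithm is a single socket option; a minimal sketch:

```c
/* Disable the Nagle algorithm (set TCP_NODELAY) on a connected socket so
 * that buffers the application has already coalesced and flushed are not
 * held back by TCP waiting for outstanding ACKs. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}
```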
We then improved Jigsaw's output buffering behavior. For each connection, it maintains a response buffer that it flushes either when full, when there are no more requests coming in on that connection, or before it goes idle. This allows aggregating responses (for example, cache validation responses) into fewer packets even on a high-speed network, saving CPU time for the server.
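The policy can be sketched as follows (this is our own C rendering of the behavior just described, not Jigsaw's Java code; the buffer size and names are illustrative):

```c
/* Sketch of the response buffering policy described above: aggregate
 * responses and flush when the buffer fills, or when no further pipelined
 * requests are pending on the connection.  Not Jigsaw's actual code. */
#include <errno.h>
#include <string.h>
#include <unistd.h>

typedef struct {
    int    fd;
    size_t used;
    char   buf[8192];
} resp_buffer;

static void flush_responses(resp_buffer *rb)
{
    size_t off = 0;
    while (off < rb->used) {
        ssize_t n = write(rb->fd, rb->buf + off, rb->used - off);
        if (n < 0) { if (errno == EINTR) continue; break; }
        off += (size_t)n;
    }
    rb->used = 0;
}

static void queue_response(resp_buffer *rb, const char *data, size_t len,
                           int more_requests_pending)
{
    if (rb->used + len > sizeof rb->buf)
        flush_responses(rb);              /* buffer full: flush what we have   */
    if (len > sizeof rb->buf) {
        write(rb->fd, data, len);         /* oversized response: send directly */
    } else {
        memcpy(rb->buf + rb->used, data, len);
        rb->used += len;
    }
    if (!more_requests_pending)
        flush_responses(rb);              /* connection about to go idle       */
}
```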
We also performed some tests against the Apache 1.2b2 server, which also supports HTTP/1.1, and observed essentially similar results to Jigsaw. Its output buffering in that initial beta test release was not yet as good as our revised version of Jigsaw, and in that release it processed at most five requests before terminating a TCP connection. When using pipelining, the number of HTTP requests served is often a poor indicator for when to close the connection. We discussed these results with Dean Gaudet and others of the Apache group and similar changes were made to the Apache server; our final results below use a version of Apache 1.2b7 plus patches provided by Dean Gaudet.
With the modified applications, we took a complete set of data, both the first time retrieval and cache validation, in all three network environments. At the same time, to make the tests closer to a real implementation, we took the opportunity to change the HTTP/1.1 version of the robot to issue full HTTP/1.1 cache validation requests which use the If-None-Match header and opaque validators, rather than the HEAD requests used in our HTTP/1.0 version of the robot.
It was easiest to implement this functionality by enabling persistent caching in libwww, but this had unexpected consequences; an initial performance run resulted in worse performance than our first set of tests. Further analysis showed that libwww's implementation of persistent caching on disk is written for ease of porting and implementation rather than performance. Each cached object is stored in two independent files: one containing the cacheable message headers and the other containing the message body. This is an area that one would optimize carefully in a product implementation; the overhead in our implementation became a performance bottleneck in our HTTP/1.1 tests. Time and resources did not permit optimizing this code. Our final measurements use correct HTTP/1.1 cache validation requests, and are run with a persistent cache on a memory file system to reduce the disk performance problems that we observed.
The measurements below therefore represent a second round of data collection, against servers (both Jigsaw and Apache) which have optimized output buffering. While Jigsaw had outperformed Apache in the first round of tests, Apache now outperforms Jigsaw.
After having determined that HTTP/1.1 outperforms HTTP/1.0 we decided to try other means of optimizing the performance. We therefore investigated how much we would gain by using data compression of the HTTP message body. That is, we do not compress the HTTP headers, but only the body, using the "Content-Encoding" header to describe the encoding mechanism. We use the zlib compression library version 1.04, a freely available C-based library. It has a stream based interface which interacts nicely with the libwww stream model. Note that the PNG library uses zlib, so common implementations will share the same data compression code. Implementation took at most a day or two.
The client indicates that it is capable of handling the "deflate" content coding by sending an "Accept-Encoding: deflate" header in the requests. In our test, the server does not perform on-the-fly compression but sends out a precomputed deflated version of the Microscape page. The client performs on-the-fly inflation and parses the inflated HTML using its normal HTML parser.
The zlib library has several flags for tuning the compression algorithm; however, we used the default values for both deflating and inflating. In our case this compressed the Microscape HTML page by more than a factor of three, from 42K to 11K. We believe this is a typical gain for this algorithm on HTML files.
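A minimal sketch of the deflate step using zlib's streaming interface with default settings follows; it is illustrative only (not the exact libwww or server code), and as noted above the server in our tests served a precomputed result rather than compressing on the fly:

```c
/* Illustrative one-shot deflate of a message body using zlib's streaming
 * interface with default settings; error handling reduced to essentials. */
#include <string.h>
#include <zlib.h>

/* Returns the compressed length written to dst, or 0 on error. */
unsigned long deflate_body(const unsigned char *src, unsigned long srclen,
                           unsigned char *dst, unsigned long dstcap)
{
    z_stream zs;
    int rc;

    memset(&zs, 0, sizeof zs);            /* default allocators (Z_NULL)  */
    if (deflateInit(&zs, Z_DEFAULT_COMPRESSION) != Z_OK)
        return 0;

    zs.next_in   = (Bytef *)src;
    zs.avail_in  = (uInt)srclen;
    zs.next_out  = dst;
    zs.avail_out = (uInt)dstcap;

    rc = deflate(&zs, Z_FINISH);          /* all input supplied at once   */
    deflateEnd(&zs);
    return (rc == Z_STREAM_END) ? zs.total_out : 0;
}
```

The compressed body is sent with a "Content-Encoding: deflate" header, and the client runs the corresponding inflate calls on receipt before handing the data to its HTML parser.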
The result was positive for all three types of connections: LAN, WAN, and PPP. On a LAN the number of saved TCP packets amounted to 28, a 17% gain; on a WAN we saw a gain of 14%, and also a gain in the time to handle the data transfer.
The results shown in these tables are a summary of the more detailed data acquisition overview. In all cases, the traces were taken on the client side, as this is where the interesting delays are. Each run was repeated 5 times in order to compensate for network fluctuations.
As a means to compare the various experiments, we define the "efficiency" as the ratio of the number of bytes in the payload to the total number of bytes transmitted:
Efficiency = Bytes in Payload / Total number of bytes
| | First time retrieval | | | | Cache validation | | | |
|---|---|---|---|---|---|---|---|---|
| | Packets | bytes | time [sec] | efficiency | Packets | bytes | time [sec] | efficiency |
| HTTP/1.0 | 455.2 | 187525.6 | | | 362.8 | 58993.0 | | |
| HTTP/1.1 | 234.4 | 189938.0 | | | 88.4 | 16878.0 | | |
| HTTP/1.1 Pipelined | 168.0 | 189646.0 | | | 27.6 | 16878.0 | | |
| HTTP/1.1 Pipelined and compression | 140.4 | 158460.0 | | | 27.2 | 16873.0 | | |
| | First time retrieval | | | | Cache validation | | | |
|---|---|---|---|---|---|---|---|---|
| | Packets | bytes | time [sec] | efficiency | Packets | bytes | time [sec] | efficiency |
| HTTP/1.0 | 455.4 | 191808.2 | | | 339.6 | 60745.0 | | |
| HTTP/1.1 | 254.4 | 190965.2 | | | 90.0 | 16916.4 | | |
| HTTP/1.1 Pipelined | 210.6 | 190635.8 | | | 26.8 | 17170.0 | | |
| HTTP/1.1 Pipelined and compression | 181.0 | 159032.4 | | | 27.8 | 16873.0 | | |
| | First time retrieval | | | | Cache validation | | | |
|---|---|---|---|---|---|---|---|---|
| | Packets | bytes | time [sec] | efficiency | Packets | bytes | time [sec] | efficiency |
| HTTP/1.0 **) | 489 | 235027 | | | - | - | | |
| HTTP/1.1 | 349.6 | 189458.0 | | | 129.0 | 16800.0 | | |
| HTTP/1.1 Pipelined | 286.0 | 190383.2 | | | 32.0 | 16868.0 | | |
| | First time retrieval | | | | Cache validation | | | |
|---|---|---|---|---|---|---|---|---|
| | Packets | bytes | time [sec] | efficiency | Packets | bytes | time [sec] | efficiency |
| HTTP/1.0 | 449.8 | 188237.4 | | | 339.4 | 59008.0 | | |
| HTTP/1.1 | 232.8 | 187618.0 | | | 88.0 | 13731.0 | | |
| HTTP/1.1 Pipelined | 163.2 | 187618.0 | | | 24.4 | 13731.0 | | |
| | First time retrieval | | | | Cache validation | | | |
|---|---|---|---|---|---|---|---|---|
| | Packets | bytes | time [sec] | efficiency | Packets | bytes | time [sec] | efficiency |
| HTTP/1.0 | 473.6 | 191385.4 | | | 340.6 | 59008.0 | | |
| HTTP/1.1 | 252.0 | 188786.0 | | | 88.8 | 13755.2 | | |
| HTTP/1.1 Pipelined | 204.0 | 188811.2 | | | 25.2 | 13731.0 | | |
**) These measurements were performed using Netscape Communicator 4.0 beta 1 with max 4 simultaneous connections and HTTP/1.0 keep-alive connections. The Netscape HTTP client implementation uses the HTTP/1.0 Keep-Alive mechanism to allow for multiple HTTP messages to be transmitted on the same TCP connection. It therefore used 8 connections compared to 42 for the libwww HTTP/1.0 implementation, in which this feature was disabled.
Implementations need to close connections carefully. HTTP/1.0 implementations often naively close both halves of the TCP connection simultaneously when finishing the processing of a request. A pipelined HTTP/1.1 implementation can cause major problems if it does so.
The scenario is as follows: an HTTP/1.1 client talking to an HTTP/1.1 server starts pipelining a batch of requests, for example 15, on an open TCP connection. The server decides that it will not serve more than 5 requests per connection and closes the TCP connection in both directions after it has successfully served the first five requests. The remaining 10 requests that have already been sent from the client will, along with client-generated TCP ACK packets, arrive on a closed port on the server. This "extra" data causes the server's TCP to issue a reset, which makes the client TCP stack pass the last ACK'ed packet to the client application and discard all other packets. This means that HTTP responses that are either being received or have already been received successfully but haven't been ACK'ed will be dropped by the client TCP. In this situation the client does not have any means of finding out which HTTP messages were successful or even why the server closed the connection. The server may have generated a "Connection: Close" header in the 5th response but the header may have been lost due to the TCP reset. Servers must therefore close each half of the connection independently.
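On BSD-style socket APIs, closing each half independently amounts to shutting down only the sending side first and then draining whatever the client has already pipelined; a sketch of the idiom (timeouts and error handling omitted):

```c
/* Sketch: close the connection without provoking a TCP reset against
 * requests the client has already pipelined.  Half-close the sending
 * side, drain remaining input, then close. */
#include <sys/socket.h>
#include <unistd.h>

void graceful_close(int fd)
{
    char junk[4096];

    shutdown(fd, SHUT_WR);                   /* FIN: we will send no more    */
    while (read(fd, junk, sizeof junk) > 0)  /* absorb in-flight requests so */
        ;                                    /* they are ACKed, not reset    */
    close(fd);
}
```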
TCP's congestion control algorithms [11] work best when there are enough packets in a connection that TCP can determine the optimal rate at which to insert packets into the Internet, and the performance of TCP is also best once a connection is beyond the Slow Start algorithm. Observed packet trains in the Internet have been dropping [13], almost certainly due to HTTP/1.0's behavior, as demonstrated in the data above, where a single connection rarely involves more than 10 packets, including TCP open and close. Some IP switch technology exploits packet trains to enable faster IP routing. In the tests above, the packet trains are longer, but not as much longer as one might first expect, since many fewer, larger packets are transmitted due to pipelining. While the HTTP/1.1 proposed standard specification does permit two connections to be established between a client/server pair, it is clear that dividing the mean length of packet trains down by a factor of two would diminish the benefits to the Internet (and possibly to the end user due to slow start) substantially. Range requests need to be exploited to enable good interactive feel in Web browsers while using a single connection. Connections should be maintained as long as it makes reasonable engineering sense, to pick up users' "click ahead" while following links.
We believe the content transported by HTTP will be changing significantly over the next several years, with the introduction of style sheets and incremental improvements in image and animation formats. This section explores some of the changes we may see if these facilities are exploited significantly.
The web has lacked most of the facilities that graphics designers have used to control presentation and layout; as a result, many pages seen on the web have been painfully synthesized using straight HTML and a plethora of small images. Many graphical elements in the Microscape page could easily be expressed using style sheets. The table below lists all images that appear in the "Microscape" test page and gives an estimate of which images might be replaced by more compact HTML+CSS1 code.
| File name | GIF size | PNG size | HTML+CSS estimated size (content) | HTML+CSS estimated size (markup) | Comments |
|---|---|---|---|---|---|
| about1 | 1403 | 653 | 20 | 250 | |
| action | 788 | 770 | 15 | 50 | |
| enter | 157 | 166 | 10 | 50 | |
| prod | 165 | 158 | 10 | 40 | |
| search | 151 | 165 | 10 | 30 | |
| shop | 128 | 131 | 10 | 30 | |
| solutions | 682 | 638 | 10 | 60 | |
| support | 156 | 161 | 10 | 30 | |
| 1ptrans | 44 | 83 | 0 | 30 | the CSS property will be set on an existing element |
| spacer | 70 | 70 | 0 | 70 | a new element should probably be added |
| spacer1 | 40 | 70 | 0 | 70 | ditto |
| spacer2 | 69 | 73 | 0 | 70 | ditto |
| vrule | 62 | 74 | 0 | 70 | ditto |
| nav_home | 1664 | 1355 | 150 | 250 | it's possible to make button bars with floating textual elements |
| navigation_bar | 1698 | 1457 | 150 | 250 | ditto |
| Sum: 15/40 images | 7277 | 6024 | 395 | 1350 | |
| File name | GIF size | PNG size | HTML+CSS estimated size (content) | HTML+CSS estimated size (markup) | Comments |
|---|---|---|---|---|---|
| arrowbl | 69 | 113 | 4 | 60 | A similar arrow glyph exists in Unicode |
| arrowgr | 75 | 125 | 4 | 60 | A similar arrow glyph exists in Unicode |
| arrowr | 75 | 125 | 4 | 60 | A similar arrow glyph exists in Unicode |
| comdex_6 | 799 | 743 | 20 | 200 | the two-colored pointer complicates matters |
| Sum: 4/40 images | 1018 | 1106 | 32 | 380 | |
| File name | GIF size | PNG size | HTML+CSS estimated size (content) | HTML+CSS estimated size (markup) | Comments |
|---|---|---|---|---|---|
| tagline | 2718 | 2581 | 30 | 300 | the shadow effects require negative margins |
| worldwide | 1698 | 1583 | 20 | 300 | the shadow effects require negative margins |
| h_microsoft | 2080 | 1877 | 20 | 300 | |
| Sum: 3/40 images | 6496 | 6041 | 70 | 900 | |
| File name | GIF size | PNG size | HTML+CSS estimated size (content) | HTML+CSS estimated size (markup) | Comments |
|---|---|---|---|---|---|
| Pacman1 | 4378 | 3841 | 50 | 300 | probably can be reduced by 50% |
| one_sm | 2917 | 2534 | 30 | 150 | probably can be reduced by 50% |
| home_on | 246 | 215 | 10 | 80 | the "house" can't be found in Unicode |
| Sum: 3/40 images | 7541 | 6590 | 90 | 520 | |
| File name | GIF size | PNG size | Comments |
|---|---|---|---|
| idc | 2428 | 1545 | |
| comdex | 1102 | 1117 | |
| pointcast_small | 786 | 730 | |
| Sum: 3/40 images | 4316 | 3392 |
| File name | GIF size | PNG size | Comments |
|---|---|---|---|
| msft | 366 | 422 | Microsoft logo, very textual but with two different "o" glyphs |
| tbi | 4012 | 3511 | |
| netnow3 | 1884 | 1294 | rotated text not possible in CSS |
| home_igloo | 40095 | 35933 | |
| n | 1435 | 1107 | |
| Sum: 5/40 images | 47792 | 42237 |
| File name | GIF size | PNG size | Comments |
|---|---|---|---|
| clinton | 2095 | 1912 | |
| appfoundry | 4853 | 4461 | |
| commish | 7540 | 7115 | |
| inbox_img | 5076 | 4668 | |
| rolodex | 3340 | 3145 | |
| sports | 3544 | 3231 | |
| woofer | 2411 | 2174 | |
| Sum: 7/40 images | 28859 | 26706 |
| File name | GIF size | PNG size | Comments |
|---|---|---|---|
| ie_animated | 9132 | - | 20 frame animation |
| msinternet | 15856 | - | 16 frame animation |
| Sum: 2/2 animations | 24988 | | |
The observations here are very preliminary, but indicate that style sheets may make a very significant impact on bandwidth (and end user delays) of the web. Savings from PNG in this data however are modest.
Notes: PNG files listed here additionally contain gamma information, so that they display the same on all platforms; this adds 16 bytes per image. The GIF images do not contain this information.
The conversion of images to PNG was not optimal. The GIFs were clearly optimized by experts. PNG does not perform as well on the very low bit depth images in the sub-200 byte category because its checksums and other information make the file a bit bigger even though the actual image data is often smaller.
The HTML+CSS1 sizes are estimates. At the date of this writing, no CSS browser is able to render all the replacements in HTML+CSS.
Pipelining implementation details can make a very significant difference on network traffic, and bear some careful thought, understanding, and testing. To take full advantage of pipelining in applications may require explicit interfaces to flush buffers and other minor changes to applications.
To get optimal performance over a single connection in HTTP/1.1 implementations, the read buffer size of an implementation, and the details of how urgently data is read from the operating system, can be very significant. If too much data accumulates in a socket buffer, TCP may delay ACKs by 200 ms. Opening multiple connections in HTTP/1.0 resulted in more socket buffers in the operating system, which as a result imposed lower speed requirements on the application, while keeping the network busy.
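A sketch of the reading side of this advice follows (the buffer size is an arbitrary illustrative value, and a real client would interleave this with its parser and event loop rather than looping to end of stream):

```c
/* Sketch: read eagerly with a reasonably large application buffer so that
 * response data does not sit in the kernel socket buffer longer than
 * necessary.  Buffer size is illustrative only. */
#include <unistd.h>

ssize_t drain_responses(int fd, void (*deliver)(const char *data, size_t len))
{
    char    buf[32 * 1024];
    ssize_t n, total = 0;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        deliver(buf, (size_t)n);     /* hand data to the HTTP parser at once */
        total += n;
    }
    return total;                    /* 0 on orderly close, -1 on error */
}
```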
We estimate that the work reported here took two people about two months, starting from working HTTP/1.0 implementations. We expect others leveraging the experience reported here might accomplish the same result in much less time, though of course we may be more expert than many due to our involvement in the HTTP/1.1 design.
Our principal data gathering tool is the widely available tcpdump program [14]. Some vendors do not ship tcpdump; others ship older versions of it. We found it necessary to install a current version (version 3.3) on all platforms we used, as we observed that the last FIN TCP packet was otherwise often missing. We used both the NetMon program and IP forwarding on NT to gather the data for the PPP connection. The output is incompatible with tcpdump, but a conversion made it possible to use our tcpdump tools for handling the data.
We also used Tim Shepard's xplot program [8] to graphically plot the dumps; this was very useful to find a number of problems in our implementation not visible in the raw dumps. We looked at data going over both directions of the TCP connections. In the detailed data summary, there are direct links to all dumps in xplot formats. The tcpshow program [21] was very useful when we needed to see the contents of packets to understand what was happening.
In addition to these generic TCP analysis tools, we produced a set of dedicated tools for handling the large amount of data taken:
We have thought about realistic uses of HTTP/1.1 by browsers. Browsers need to know most urgently if an object has changed in a way that requires reformatting the page. Additionally, if an embedded image is large and no size is specified, a browser will want to be able to lay out the page completely before finishing retrieval of embedded objects. HTTP/1.1 defines (and most current HTTP/1.0 servers implement) byte range facilities that allow a client to perform partial retrieval of objects. We believe therefore that the natural revalidation request for HTTP/1.1 will combine both cache validation headers and an If-Range request header, to prevent large objects from monopolizing the connection to the server. The range requested should be large enough to usually return any embedded metadata for the object for the common data types. This capability of HTTP/1.1 is implicit in its caching and range request design.
When a browser revisits a page, it has a very good idea what the type of any embedded object is likely to be, and can therefore issue a cache validation request and simultaneously request the metadata of the embedded object (to detect any change). This information is much more valuable than the embedded image data itself. Subsequently, the browser might generate requests for the rest of the object, or for enough of each object to allow progressive display of image data types (e.g. progressive PNG, GIF or JPEG images), or to multiplex between multiple large images on the page. We call this style of use of HTTP/1.1 "poor man's multiplexing". Further work is underway [9] to experiment with a multiplexing transport to provide a better way to multiplex the connection than this crude (and relatively high overhead) way provided by HTTP/1.1.
We therefore believe cache validation combined with range requests will likely become a very common idiom of HTTP/1.1.
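Such a combined request might look like the following (the URL, entity tag, and byte range are invented placeholders; this is our reading of the idiom, not a form prescribed by the specification):

```c
/* Illustration only: URL, ETag and byte range are invented placeholders.
 * The request revalidates the cached copy and, at the same time, bounds
 * how much of the object may come back, so one large response cannot
 * monopolize the pipelined connection. */
static const char revalidate_with_range[] =
    "GET /img/photo.jpg HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "If-None-Match: \"etag-456\"\r\n"   /* cache validator: unchanged => 304    */
    "If-Range: \"etag-456\"\r\n"        /* honor the range only if still valid  */
    "Range: bytes=0-2047\r\n"           /* enough to return per-object metadata */
    "\r\n";
```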
The HTTP metadata (as opposed to metadata stored inside the object itself) can become a significant amount of overhead for the "poor man's multiplexing" that browsers are likely to want to perform (first to get the metadata stored in an object, and then to perform progressive display of embedded images). A naive implementation (as in the Jigsaw implementation used for the initial tests) might resend all of the HTTP metadata headers in a 206 (Partial Content) response. The HTTP/1.1 proposed standard specification is silent on which header fields are sent on a 206 response. We believe implementations will likely want to be careful on what headers are sent with partial content, and the HTTP/1.1 specification may want clarification in this area.
We believe the CPU time savings of HTTP/1.1 are very substantial, due to the great reduction in TCP opens and closes and the savings in packet overhead, and could now be quantified for Apache (currently the most popular Web server on the Internet). HTTP/1.1 will increase the importance of reducing the parsing and data transport overhead of the very verbose HTTP protocol, which, for many operations, has been swamped by the TCP open and close overhead required by HTTP/1.0. Optimal server implementations for HTTP/1.1 will likely be significantly different from current servers.
Connection management bears significant further experimentation and modeling. Padmanabhan [1] gives some guidance on how long connections should be kept open, but this work needs updating to reflect current content and usage of the Web, which have changed significantly since completion of the work.
Persistent connections, pipelining, and transport compression, as well as the widespread adoption of style sheets (e.g. CSS1) and more compact image representations (e.g. PNG), will increase the relative overhead of the very verbose, text-based HTTP protocol, particularly for high latency and low bandwidth environments such as cellular telephones and other wireless situations. A binary encoding or tokenized compression of HTTP, and/or a replacement for HTTP, will become more urgent given these changes in the infrastructure of the Web.
We have not investigated perceived time to render (our browser has not yet been converted to use HTTP/1.1), but with the range request techniques outlined above, we believe HTTP/1.1 can perform well over a single connection. PNG also provides significant time to render benefits relative to GIF. The best strategies to optimize time to render are clearly significantly different from those used by HTTP/1.1.
We did not have time to perform a test that would show the relative benefits of deflate compression relative to the data compression provided by current modems.
Future work worth investigating here includes the use of compression dictionaries optimized for HTML and CSS1 text.
For HTTP/1.1 to outperform HTTP/1.0 in elapsed time, an implementation must implement pipelining. Properly buffered pipelined implementations will gain additional performance and reduce network traffic further.
HTTP/1.1 implemented with pipelining outperformed HTTP/1.0, even when the HTTP/1.0 implementation uses multiple connections in parallel, under all circumstances tested. In terms of packets transmitted, the savings are typically at least a factor of two, and often much more, for our tests. Elapsed time improvement is less dramatic, but significant, and all HTTP/1.1 tests using pipelining and a single connection outperformed HTTP/1.0 tests using multiple connections.
Since bandwidth savings due to HTTP/1.1 and associated techniques are modest (between 2% and 35% depending on the techniques used in our tests), it is clear that the HTTP/1.1 work on caching is as important as the improvements reported in this paper to conserving total bandwidth on the Internet. Hotspots on the network also strongly argue for good caching systems. The addition of transport compression in HTTP/1.1 provided the largest bandwidth savings. The savings of HTTP/1.1 in terms of number of packets, however, are truly dramatic.
HTTP/1.1 will significantly change the character of traffic on the Internet (given HTTP's dominant fraction of internet traffic), with significantly larger mean packet sizes, more packets per TCP connection, and drastically fewer packets that are not subject to flow control (by elimination of most packets due to TCP open and close).
HTTP/1.1 changes dramatically the "cost" and performance of HTTP, particularly for revalidating cached items. As a result, we expect that applications will significantly change their behavior. For example, caching proxies intended to enable disconnected operation may find it feasible to perform much more extensive cache validation than was feasible with HTTP/1.0. Researchers and product developers should be very careful when extrapolating future web or Internet traffic from current Internet and HTTP server log data, and should plan to rework any simulations as these improvements to web infrastructure deploy.
Changes in web content enabled by the deployment of style sheets and more compact image, graphics, and animation representations will also significantly improve network and perceived performance during the period that HTTP/1.1 is being deployed. To our surprise, style sheets promise the biggest opportunity for major network performance improvements, whether deployed with HTTP/1.0 or HTTP/1.1, by significantly reducing the need for inlined images to provide graphic elements, and the resulting network traffic. Heavy use of style sheets whenever possible will result in the greatest observed improvements in downloading new web pages, without sacrificing sophisticated graphic design.
Modest, careful implementations can achieve all of the goals set out for HTTP/1.1.
[1] V.N. Padmanabhan, J. Mogul, "Improving HTTP Latency", Computer Networks and ISDN Systems, v.28, pp. 25-35, Dec. 1995. Slightly revised version of paper in Proc. 2nd International WWW Conference '94: Mosaic and the Web, Oct. 1994
[2] J. Nagle, "Congestion Control in IP/TCP Internetworks," RFC 896, Ford Aerospace and Communications Corporation, January 1984.
[3] T. Berners-Lee, R. Fielding, H. Frystyk. "Informational RFC 1945 - Hypertext Transfer Protocol -- HTTP/1.0," MIT/LCS, UC Irvine, May 1996
[4] R. Fielding, J. Gettys, J.C. Mogul, H. Frystyk, T. Berners-Lee, "RFC 2068 - Hypertext Transfer Protocol -- HTTP/1.1," UC Irvine, Digital Equipment Corporation, MIT
[5] J. Touch, J. Heidemann, K. Obraczka, "Analysis of HTTP Performance," USC/Information Sciences Institute, June, 1996.
[6] S. Spero. "Analysis of HTTP Performance Problems," July 1994
[7] J. Heidemann, "Performance Interactions Between P-HTTP and TCP Implementation," USC/Information Sciences Institute, Submitted for publication to ACM Computer Communication Review.
[8] T. Shepard, Source for this very useful program is available at ftp://mercury.lcs.mit.edu/pub/shep. S.M. thesis "TCP Packet Trace Analysis" for David Clark at the MIT Laboratory for Computer Science. The thesis can be ordered from MIT/LCS Publications. Ordering information can be obtained from +1 617 253 5851 or send mail to publications@lcs.mit.edu. Ask for MIT/LCS/TR-494.
[9] J. Gettys, "Simple MUX Protocol Specification," World Wide Web Consortium.
[10] H. Lie, B. Bos, "Cascading Style Sheets, level 1," W3C Recommendation, World Wide Web Consortium, 17 Dec 1996.
[11] Van Jacobson. "Congestion Avoidance and Control". In Proc. SIGCOMM '88 Symposium on Communications Architectures and Protocols, pages 314-329. Stanford, CA, August 1988.
[12] Jon B. Postel. "Transmission Control Protocol," RFC 793, Network Information Center, SRI International, September, 1981.
[13] V. Paxson, "Growth Trends in Wide-Area TCP Connections," IEEE Network, Vol. 8 No. 4, pp. 8-17, July 1994.
[14] V. Jacobson, C. Leres, and S. McCanne, tcpdump, available at ftp://ftp.ee.lbl.gov/tcpdump.tar.Z
[15] R. W. Scheifler, J. Gettys, "The X Window System," ACM Transactions on Graphics # 63, Special Issue on User Interface Software.
[16] Mark S. Manasse and Greg Nelson, "Trestle Reference Manual," Digital Systems Research Center Research Report # 68, December 1991.
[17] Braden, R., "Extending TCP for Transactions -- Concepts," RFC-1379, USC/ISI, November 1992.
[18] Braden, R., "T/TCP -- TCP Extensions for Transactions: Functional Specification," RFC-1644, USC/ISI, July 1994.
[19] Touch, J., "TCP Control Block Interdependence," (work in progress), USC/ISI, June 1996.
[20] T. Boutell, T. Lane et al., "PNG (Portable Network Graphics) Specification", W3C Recommendation, October 1996; RFC 2083, Boutell.Com Inc., January 1997. General PNG information can be found at /Graphics/PNG.
[21] M. Ryan, tcpshow, I.T. NetworX Ltd., 67 Merrion Square, Dublin 2, Ireland, June, 1996.
[22] P. Deutsch, "DEFLATE Compressed Data Format Specification version 1.3," RFC 1951, Aladdin Enterprises, May 1996.
[23] L. Peter Deutsch, Jean-Loup Gailly, "ZLIB Compressed Data Format Specification version 3.3," RFC 1950, Aladdin Enterprises, Info-ZIP, May 1996.
Jeff Mogul of Digital's Western Research Laboratory has been instrumental in making the case for both persistent connections and pipelining in HTTP. We are very happy to be able to produce data with a real implementation confirming his and V.N. Padmanabhan's results and for his discussions with us about several implementation strategies to try.
Our thanks to Sally Floyd, Van Jacobson, and Craig Leres for use of a machine at Lawrence Berkeley Labs for the high bandwidth/high latency test.
Our thanks to Dean Gaudet of the Apache group for his timely cooperation to optimize Apache's HTTP/1.1 implementation given our feedback.
The World Wide Web Consortium supported this work.
Henrik Frystyk Nielsen
W3 Consortium
MIT Laboratory for Computer Science
545 Technology Square Cambridge, MA 02139, USA
Fax: +1 (617) 258 8682
Email: frystyk@w3.org
Jim Gettys
W3 Consortium
MIT Laboratory for Computer Science
545 Technology Square
Cambridge, MA 02139, USA
Fax: +1 (617) 258 8682
Email: jg@w3.org
Anselm Baird-Smith
W3 Consortium
Institut National de Recherche en Informatique et en Automatique
2004, route des Lucioles
BP 93 06902 Sophia Antipolis Cedex
France
Email: anselm@w3.org
Eric Prud'hommeaux
W3 Consortium
MIT Laboratory for Computer Science
545 Technology Square Cambridge, MA 02139, USA
Fax: +1 (617) 258 8682
Email: eric@w3.org
Håkon W. Lie
W3 Consortium
Institut National de Recherche en Informatique et en Automatique
2004, route des Lucioles
BP 93 06902 Sophia Antipolis Cedex
France
Fax: +33 (0) 493657765
Email: howcome@w3.org
Chris Lilley
World Wide Web Consortium
INRIA, Projet W3C
2004, Route des Lucioles - B.P. 93
06902 Sophia Antipolis Cedex
France
Fax: +33 93 65 77 65
Email: chris@w3.org
Corresponding Author: Jim Gettys