This document describes the potential performance benefits of compressing HTML files over a LAN connection, and how compression may interact with the TCP slow start and delayed acknowledgement algorithms.
The compression is done using the zlib compression algorithm, and the test is performed on the microscape web site, which was part of the test suite used in the paper "Network Performance Effects of HTTP/1.1, CSS1, and PNG". We performed the "First time retrieval" test, which is equivalent to a browser visiting a site for the first time, i.e. its cache is empty and it has to retrieve the top page and all the embedded objects. In HTTP, this amounts to 42 GET requests.
Note that we only compress the HTML page (the first GET request) and not any of the following images, which are already compressed using various other compression algorithms. The overall payload transferred in the uncompressed version of the download is a 42K HTML page plus 41 inlined GIF images totalling 125K. Compression decreases the size of the HTML page from 42K to 11K but leaves the images untouched. This means that we decrease the overall payload by 31K, or approximately 19%.
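The arithmetic above can be verified directly, and zlib's behavior on markup-heavy input can be checked with a few lines of Python. The sample HTML string below is illustrative only, not taken from the test:

```python
import zlib

# Figures from the test above (approximating 1K = 1024 bytes)
html_raw = 42 * 1024         # uncompressed HTML top page
html_compressed = 11 * 1024  # same page after zlib compression
images = 125 * 1024          # 41 inlined GIFs, left untouched

total_before = html_raw + images
total_after = html_compressed + images
saved = total_before - total_after

print("Saved %dK, or %.0f%% of the payload"
      % (saved // 1024, 100.0 * saved / total_before))

# zlib itself: markup-heavy HTML is highly redundant, so ratios of
# 3:1 or better are common (this sample is illustrative only)
sample = b"<table><tr><td>cell</td></tr></table>" * 200
print("raw: %d bytes, compressed: %d bytes"
      % (len(sample), len(zlib.compress(sample))))
```

Running this confirms the 31K saving, which is about 19% of the 167K total.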
An important part of using pipelining is that the application buffers its output before it is delivered to the underlying TCP stack. This is, in essence, a Nagle-equivalent buffering scheme suited for HTTP rather than for telnet connections. This is why it is not a good idea to use Nagle's algorithm together with HTTP pipelining: the two buffering mechanisms tend to step on each other, causing an overall performance degradation.
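The buffering scheme can be sketched as follows. The class name, the flush threshold, and the request format are illustrative, not taken from any particular client implementation:

```python
class PipelineBuffer:
    """Sketch of application-level output buffering for pipelined HTTP.

    Instead of one small TCP write per request (telnet-style traffic,
    which Nagle's algorithm was designed for), requests accumulate
    until there is enough for a full segment, then go out in a single
    write.  With this in place, Nagle should be disabled on the socket
    (TCP_NODELAY), since two competing buffering layers delay each
    other.  Names and the threshold are illustrative.
    """
    def __init__(self, send, flush_threshold=1460):
        self.send = send                    # e.g. sock.sendall
        self.flush_threshold = flush_threshold
        self.buffer = bytearray()

    def queue_get(self, host, path):
        self.buffer += ("GET %s HTTP/1.1\r\nHost: %s\r\n\r\n"
                        % (path, host)).encode("ascii")
        if len(self.buffer) >= self.flush_threshold:
            self.flush()    # enough for a full segment: send now

    def flush(self):
        if self.buffer:
            self.send(bytes(self.buffer))
            self.buffer.clear()

# Usage: batch three requests, then flush explicitly when parsing ends
writes = []
buf = PipelineBuffer(writes.append, flush_threshold=1460)
for path in ("/a.gif", "/b.gif", "/c.gif"):
    buf.queue_get("example.org", path)
buf.flush()
print(len(writes))  # one write carrying all three requests
```

All three requests leave in a single write, so the TCP stack can pack them into full-sized segments without Nagle's help.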
When opening a new TCP connection, the first few packets are controlled by the TCP slow start algorithm. This means that the first TCP packet containing any payload (that is, not the TCP open handshake) on an Ethernet contains 1460 bytes: the HTTP response headers and the first part of the HTML body. On the client side, the client starts parsing the HTML in order to present the contents to the end user. The important thing here is the number of inlined objects contained in the first segment, as they are likely to go to the same server and hence can be pipelined onto the same TCP connection. If the number of new requests generated by the parser exceeds the pipeline output buffer size, then a new batch of HTTP requests can be sent immediately; otherwise the batch is delayed until one of the pipeline flush mechanisms is triggered.
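The decision described above can be sketched by parsing only the first segment's worth of HTML and counting the inlined objects found so far. The page contents and the batch threshold are illustrative assumptions, not values from the test:

```python
from html.parser import HTMLParser

MSS = 1460  # Ethernet-sized TCP segment payload

class InlineObjectCounter(HTMLParser):
    """Counts inlined objects (here: <img> tags) seen so far."""
    def __init__(self):
        super().__init__()
        self.count = 0
    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.count += 1

# Illustrative page: introductory text followed by many small images
page = (b"<html><head><title>t</title></head><body>" +
        b"<p>some introductory text</p>" * 30 +
        b"".join(b'<img src="/icons/%02d.gif">' % i for i in range(41)) +
        b"</body></html>")

# Feed the parser only what the first segment would deliver; the
# parser buffers any tag cut off at the segment boundary
parser = InlineObjectCounter()
parser.feed(page[:MSS].decode("ascii"))

# Hypothetical flush rule: issue a new batch immediately only if the
# queued requests already fill the pipeline output buffer
BATCH_THRESHOLD = 8  # requests; illustrative value
if parser.count >= BATCH_THRESHOLD:
    print("flush new batch now (%d requests queued)" % parser.count)
else:
    print("batch delayed until a flush mechanism is triggered")
```

Whether the batch goes out immediately thus depends entirely on how many object references happen to fall within the first 1460 bytes.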
If the next batch of requests is delayed, then no new data is handed to the TCP stack. Because we are in a slow start situation, the server cannot send any more data until it gets an ACK for the first segment. TCP can either piggyback its ACK onto an outgoing packet or generate a separate ACK packet. However, the separate ACK packet is subject to the delayed acknowledgement algorithm, which may delay the packet by up to 200 ms. In either case, the end result is an extra delay, causing an overall performance degradation even on a LAN.
The reason why HTML compression is important in this scenario is that it increases the probability that there are enough inlined objects in the first segment to immediately issue a new batch without introducing any extra delay. As described in "Simple Test of Compressing HTML Using ZLib", the compressed version may be less than 1/3 the original size. As a result, if the inlined objects are evenly distributed through the document, the number of inlined object references visible in the first segment increases by about a factor of 3.
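This factor-of-3 effect can be illustrated by comparing how many image references a client can see after receiving one segment of plain HTML versus one segment of zlib-compressed HTML, which it inflates incrementally. The page layout below is an illustrative assumption:

```python
import zlib
from html.parser import HTMLParser

MSS = 1460  # first-segment payload on Ethernet

class ImgCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.count = 0
    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.count += 1

def imgs_in(text):
    p = ImgCounter()
    p.feed(text)
    return p.count

# Illustrative page with images spread evenly through the text
page = b"".join(
    b"<p>paragraph of descriptive text around an inlined object</p>"
    b'<img src="/icons/icon%02d.gif" alt="icon">' % i
    for i in range(41))

# Uncompressed: the first segment carries the first MSS bytes of HTML
plain_prefix = page[:MSS].decode("ascii", "ignore")

# Compressed: the first segment carries MSS *compressed* bytes, which
# the client inflates incrementally as they arrive
comp = zlib.compress(page)
inflater = zlib.decompressobj()
inflated_prefix = inflater.decompress(comp[:MSS]).decode("ascii", "ignore")

print("images visible, plain:     ", imgs_in(plain_prefix))
print("images visible, compressed:", imgs_in(inflated_prefix))
```

The compressed first segment inflates to several times its wire size, so the parser sees correspondingly more inlined object references and is far more likely to cross the batch threshold immediately.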
In the table below you can see the effects of delayed ACKs on the first packet sent from the server to the client. In the Pipelining case, the HTML text (sent in the clear) does not contain enough information to force a new batch of requests. In the Pipelining and HTML compression case, the first packet contains three times as much data, so the probability of having enough requests to immediately send a new batch is higher.
|                                 | client -> server | server -> client |
| ------------------------------- | ---------------- | ---------------- |
| Pipelining                      | …                | …                |
| Pipelining and HTML compression | …                | …                |
Compare the entries in the table column by column and note the long delay in the first data row (Pipelining) in comparison with the significantly smaller delay in the second row (Pipelining and HTML compression).
In our test case we observed the following gains in time relative to the number of TCP packets:
|                                 | Jigsaw: TCP packets | Jigsaw: time | Apache: TCP packets | Apache: time |
| ------------------------------- | ------------------- | ------------ | ------------------- | ------------ |
| Pipelining                      | …                   | …            | …                   | …            |
| Pipelining and HTML compression | …                   | …            | …                   | …            |
| Saved using compression         | 15%                 | 27%          | 16%                 | 23%          |
The table shows that for the Jigsaw server, compression gives a net gain of 15% fewer TCP packets but as much as a 27% gain in time. Likewise, for Apache we see a packet gain of 16% but a time gain of 23%. The interesting thing is that the overall payload is decreased by 19%, which is more than the gain in TCP packets. From this perspective, compression gives slightly worse "TCP packet usage", as the packet count drops less than the payload does. However, the gain in time is relatively better than the gain in payload. This indicates that the relationship between payload, TCP packets, and transfer time is non-linear, and that the first packets on a connection are relatively more expensive than the rest.
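The non-linearity argument can be made explicit with the percentages from the text; the assertions below simply encode the two inequalities on which the conclusion rests:

```python
# Relative gains reported above (percent saved by compression)
payload_gain = 19.0  # overall bytes saved

gains = {"Jigsaw": {"packets": 15.0, "time": 27.0},
         "Apache": {"packets": 16.0, "time": 23.0}}

for server, g in gains.items():
    # Fewer bytes saved a *smaller* share of packets: packet usage
    # per byte got slightly worse ...
    assert g["packets"] < payload_gain
    # ... yet time improved *more* than payload shrank, so the chain
    # payload -> packets -> time is non-linear, and the early
    # (slow-start) packets are the expensive ones.
    assert g["time"] > payload_gain
    print("%s: -%g%% packets, -%g%% time, vs -%g%% payload"
          % (server, g["packets"], g["time"], payload_gain))
```

If the cost per packet were uniform, the time gain could not exceed the payload gain; that it does so on both servers is the evidence for the slow-start effect.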
The result may depend on how the slow start algorithm is implemented on the particular platform. Some TCP stacks implement slow start with an initial window of one segment, whereas others start with two. However, in both cases we believe that the effect of compression will be positive on a LAN.
This test does not take into account the time it takes to compress an HTML object on the fly, nor whether this takes longer than the time gained by transferring fewer bytes. Decompression is done on the fly by the client.