This is a closer investigation of how Nagle's algorithm affects HTTP/1.1 pipelining, comparing runs with Nagle enabled and disabled.
The test setup is the same as was used for the SIGCOMM'97 paper "Network Performance Effects of HTTP/1.1, CSS1, and PNG". That is, we synthesized a test web site serving data by combining data (HTML and GIF image data) from two very heavily used home pages (Netscape and Microsoft) into one; hereafter called "Microscape". The initial layout of the Microscape web site was a single page containing typical HTML totaling 42KB with 42 inlined GIF images totaling 125KB. The embedded images range in size from 70B to 40KB; most are small, with 19 images less than 1KB, 7 images between 1KB and 2KB, and 6 images between 2KB and 3KB. While the resulting HTML page is larger, and contains more images than might be typical, such pages can be found on the Web.
For this test, I ran the first-time retrieval test and the cache validation test. The former is equivalent to a browser visiting a site for the first time: its cache is empty, so it has to retrieve the top page and all the embedded objects. In HTTP, this amounts to 43 GET requests (one for the page plus 42 for the images). The latter is equivalent to 43 conditional GET requests, as issued by a browser revalidating a fully cached copy.
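For illustration, a cache-validation request is an ordinary GET carrying a validator such as If-Modified-Since; the server answers 304 Not Modified (with no body) when the object is unchanged. A minimal sketch of the two request types (the path, host, and date here are placeholders, not the actual Microscape URLs):

```python
def first_time_get(path, host):
    # Plain GET: the client has nothing cached, so the full entity is returned.
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"\r\n")

def conditional_get(path, host, last_modified):
    # Conditional GET: the server replies 304 Not Modified (no body)
    # if the entity has not changed since the supplied validator.
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"If-Modified-Since: {last_modified}\r\n"
            f"\r\n")

req = conditional_get("/images/logo.gif", "www26.w3.org",
                      "Tue, 01 Apr 1997 12:00:00 GMT")
```

A 304 response is only a few hundred bytes of headers, which is why the cache-validation test exercises many small segments rather than bulk data transfer.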
The test was run on a LAN (10 Mbit Ethernet) with no intermediate hubs, an RTT below 1 ms, and an MSS of 1460 bytes. Both machines had ample memory and were not under significant load while the tests were run:
| Component | Host name | Type and Version |
| --- | --- | --- |
| Server Hardware | www26.w3.org | Sun SPARC Ultra-1, Solaris 2.6 |
| Client Hardware | zorch.w3.org | Digital AlphaStation 400 4/233, UNIX 4.0a |
I should note that I first tried the same tests on Solaris 2.5.1 and ran into several strange delay problems; in one, the stack pauses for a second without explanation when communicating over a LAN. We believe the problem can be avoided by setting the following parameter, which was done when running the tests:
ndd -set /dev/tcp tcp_co_min 1500
Sun's Jerry Chu is aware of the problem and it should be fixed in Solaris 2.6.
I used tcpdump version 3.4a5 on both machines to capture traffic in both directions, and xplot to generate the graphs from the trace data.
Our server serves the xplot files with media type application/x-xplot, so if you configure your browser to launch xplot for this media type, you can browse the data much more easily.
I ran the same tests with Nagle enabled and with Nagle disabled on both sides. You can study the full set of results (including a summary), look at the client-side summary here, or download a tar file of all the traces and xplots.
| Nagle Setting | zorch (client): Binary TCP dumps | zorch (client): Text TCP dumps | zorch (client): xplots | www26 (server): Binary TCP dumps | www26 (server): Text TCP dumps | www26 (server): xplots |
| --- | --- | --- | --- | --- | --- | --- |
| Enabled | dumps | dumps | client / server | dumps | dumps | n/a |
| Disabled | dumps | dumps | client / server | dumps | dumps | n/a |
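Nagle's algorithm is a per-connection setting, turned off with the standard TCP_NODELAY socket option. A minimal sketch of how either side would configure a socket for the "Nagle disabled" runs (illustrative only; no connection is made, and this is not the original test code):

```python
import socket

# Create a TCP socket and disable Nagle's algorithm on it.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)   # disable Nagle

# Read the option back to confirm the setting took effect.
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
s.close()
```

With TCP_NODELAY set, small segments are sent as soon as the application writes them instead of being held until outstanding data is acknowledged.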
The problem seen in these traces is the same "Odd/Short-Final-Segment" problem that John Heidemann, USC/Information Sciences Institute, describes in his paper "Performance Interactions Between P-HTTP and TCP Implementations". When a response happens to line up as an even number of full segments, the presence of Nagle's algorithm makes no difference in the traces. However, when an odd number of full segments is followed by a short final segment, the last ACK from the client is delayed by up to 200 ms, and Nagle's algorithm holds the short segment until that ACK arrives.
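The interaction can be worked through with the test's MSS of 1460 bytes: the sender's Nagle algorithm holds a sub-MSS final segment while earlier data is unacknowledged, and the receiver's delayed-ACK timer acknowledges immediately only after every second full segment, otherwise waiting up to 200 ms. A small sketch of that arithmetic (the stall condition is my reading of the trace behavior, not code from the tests):

```python
MSS = 1460            # from the test setup
DELAYED_ACK_MS = 200  # typical maximum delayed-ACK timer

def nagle_stall(response_bytes):
    """Worst-case extra delay (ms) for one response, per the model above.

    Nagle holds a short (sub-MSS) final segment until all prior data is
    ACKed; the receiver ACKs immediately after every second full segment,
    so a stall occurs only when an odd number of full segments is
    followed by a short final segment.
    """
    full_segments, short_tail = divmod(response_bytes, MSS)
    if short_tail and full_segments % 2 == 1:
        return DELAYED_ACK_MS
    return 0

# Even number of full segments plus a short tail: the ACK for the pair
# goes out at once, so the tail is released immediately -- no stall.
no_stall = nagle_stall(2 * MSS + 100)

# Odd number of full segments plus a short tail: the ACK is delayed and
# Nagle holds the tail until it arrives -- up to a 200 ms stall.
stall = nagle_stall(1 * MSS + 100)
```

On a LAN with sub-millisecond RTTs, a 200 ms stall per affected response dominates the transfer time, which is why the effect stands out so clearly in the xplots.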
We do not see the other performance problems (primarily the Short-Initial-Segment problem) that John alludes to, as pipelining and output buffering eliminate them.
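Output buffering here means the client concatenates all pipelined requests into one buffer and hands them to TCP in a single write, so they are packed into full MSS-sized segments rather than each request starting a tiny segment of its own. A sketch of the idea (the paths are placeholders):

```python
def pipeline(requests):
    # Concatenate all pipelined requests and return one buffer to be sent
    # with a single write(); TCP then fills full MSS-sized segments,
    # avoiding the short initial segments that per-request writes produce.
    return b"".join(requests)

# Illustrative pipelined batch: one request per embedded image.
reqs = [f"GET /img{i}.gif HTTP/1.1\r\nHost: www26.w3.org\r\n\r\n".encode()
        for i in range(42)]
buf = pipeline(reqs)
```

One buffer means at most one sub-MSS segment (the tail) for the whole batch of requests, instead of one per request.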