
MUX Data Flow Analysis

super tcpdump -tt -S \
    '( not port login ) and ( not port nfsd ) and ( not tcp port 6000 )
      and ( ( src www45 and dst www43 ) or ( src www43 and dst www45 ) )'
 

Goals

When considering these TCP dumps, you should keep in mind that MUX tries to save a certain number of packets that would be generated if we were to use multiple separate sockets instead. These savings fall into two categories:

This represents a lot of savings, when possible, but anything that is supposed to avoid these costs has (of course) to be at least better than opening multiple TCP connections.

Building TCP dumps

The TCP dumps were obtained with the well-known tcpdump program. The results were then processed through a Perl script and viewed with the xplot program. This document contains a number of GIF images, which are screen snapshots of xplot's output.
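The Perl script itself is not reproduced here. As a rough illustration of the kind of post-processing involved, the following sketch (hypothetical, not the actual script, and written in Java rather than Perl) counts the packets in each direction of a tcpdump -tt trace; the per-direction totals quoted in the tables below are exactly this kind of summary. It assumes the classic "timestamp src > dst: ..." line format, which depends on the tcpdump version.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: count packets per direction in a "tcpdump -tt" trace.
    public class CountPackets {
        public static void main(String[] args) throws Exception {
            Map<String, Integer> counts = new HashMap<String, Integer>();
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            String line;
            while ((line = in.readLine()) != null) {
                String[] f = line.split("\\s+");
                // f[0] is the timestamp, f[1] the source, f[2] ">", f[3] the destination.
                if (f.length < 4 || !">".equals(f[2]))
                    continue;
                String key = f[1] + " -> " + f[3].replace(":", "");
                Integer n = counts.get(key);
                counts.put(key, (n == null) ? 1 : n + 1);
            }
            in.close();
            for (Map.Entry<String, Integer> e : counts.entrySet())
                System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }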

In all tests:

server
The HTTP server (with or without MUX support) is Jigsaw. The box is an Ultra 1 running Solaris 2.5, named www43.inria.fr.
client
For the client we used Sun's HotJava browser, customized to use the W3C Java HTTP client-side API (which supports both MUX and plain HTTP/1.1). The machine it runs on is a Solaris 2.5 box.

Ethernet link, single file transfer

In this situation, we are interested in the dynamics of a MUX data transfer. Note that MUX does not intend to do better than TCP here; however, it is crucial that it does not do worse, since a lot of web traffic consists of getting only an HTML document (FIXME: needs numbers, will provide numbers): every site generally reuses the same set of icons, which the browser gets only once (it caches them). After that, most of the requests are for HTML documents only (FIXME: again, an unproven assertion here).

[FIXME As an example, browsing our own www.w3.org site with an initially empty cache generated the following requests:

http://www.w3.org/
 /
 /Icons/WWW/w3c_main
 /mit_48x48
 /inria_48x48
 /Icons/DARPA/globe_80x48
 /Icons/Europe/cec.gif
 /Jigsaw/
 /Icons/WWW/w3c_home.gif
 /Icons/WWW/new_red
 /Icons/WWW/relnotes48x
 /Jigsaw/User/Introduction/wp.html
 /Icons/WWW/w3c_home
 /Jigsaw/
 /Jigsaw/User/Introduction/performance.html
 /Jigsaw/
 /Jigsaw/User/Introduction/installation.html
 /Icons/WWW/doc48x.gif
 /Jigsaw/Overview.html
 /Jigsaw/User/Administration/Overview.html
 /Jigsaw/User/Introduction/architecture.html
 /Jigsaw/User/Tutorials/configuration.html
 /Jigsaw/User/Introduction/architecture.html
 /Jigsaw/User/Introduction/
 /Jigsaw/User/Introduction/indexer.html
 /Jigsaw/User/Tutorials/configuration.html
 
 

See how the number of requests for images decreases as I browse our site; this is simply because all our pages reuse the same set of icons, no magic here. Of course, don't compute any numbers out of this trace; it is only there to give you the idea.]

HTTP/1.1 TCP flow

For comparison, we start by showing the TCP dynamics of getting a single 64 kilobyte document using plain HTTP/1.1 on top of TCP. The first figure below shows the TCP dump obtained for such a transfer.

HTTP/1.1 TCP flow

This is a typical TCP data transfer; let's zoom in on a spot in the middle of the transfer to describe what happens more precisely:

Zoomed HTTP/1.1 TCP flow

The client www45.inria.fr.33050 - at the bottom of the figure - keeps sending TCP ACK packets, one each time it receives a packet from the server www43.inria.fr.jigsaw-proxy. Upon receiving the ACKs, the server sends more packets (admire, here, the self-clocking nature of TCP) and the window is always kept full. Some ACKs don't trigger output, probably because the server hasn't yet filled its internal buffer with enough data (which you can see later on, when it uses the full TCP window - here 4 packets).

The numbers of packets exchanged in that HTTP transaction were:

server -> client   client -> server
       55                 33

HTTP/1.1 over MUX TCP flow

Experiment 1

In this experiment, we use HTTP/1.1 over MUX to fetch the same 64k document as above. The MUX settings are as follows (you will shortly understand why this matters):

    // Initial credit value for all sessions (receiver side):
    public static final int RECEIVER_DEFAULT_CREDIT      = 4096;
    // Initial credit value for all sessions (sender side):
    public static final int SENDER_DEFAULT_CREDIT        = 4096;
    // Initial write buffer size (more credit sent only when half-buffer full):
    public static final int WRITER_BUFFER_SIZE           = 8192;
    // Read buffer size (doesn't really matter):
    public static final int READER_BUFFER_SIZE            = 4096;
    // Default MUX fragment size (too small on Ethernet, too big on modems):
    public static final int SENDER_DEFAULT_FRAGMENT_SIZE = 512;
 
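To make the role of these constants concrete, here is a sketch of the sender side of a MUX session (the class and method names are made up, this is not the actual Jigsaw code): data is cut into fragments of at most SENDER_DEFAULT_FRAGMENT_SIZE bytes, and no fragment is sent unless the current credit allows it, so the sender stops as soon as its credit is exhausted and waits for the receiver to grant more.

    // Hypothetical sketch of a MUX session sender: credit and fragment size
    // bound how much can be written before the peer grants more credit.
    class MuxSessionSender {
        static final int SENDER_DEFAULT_CREDIT        = 4096;
        static final int SENDER_DEFAULT_FRAGMENT_SIZE = 512;

        private int credit = SENDER_DEFAULT_CREDIT;  // bytes we may still send

        // Called by the application to send a message body on this session.
        synchronized void send(byte[] data, java.io.OutputStream out)
                throws java.io.IOException, InterruptedException {
            int off = 0;
            while (off < data.length) {
                // Block until the receiver has granted us some credit.
                while (credit == 0)
                    wait();
                // Never emit more than one fragment, never more than the credit allows.
                int len = Math.min(SENDER_DEFAULT_FRAGMENT_SIZE,
                                   Math.min(credit, data.length - off));
                // The real protocol would prepend a small session header here; omitted.
                out.write(data, off, len);
                out.flush();
                credit -= len;
                off    += len;
            }
        }

        // Called when a MUX credit message arrives from the peer.
        synchronized void addCredit(int bytes) {
            credit += bytes;
            notifyAll();
        }
    }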

The global TCP dump during that exchange is as follows:

Global TCP flow

When we zoom into it, we see the following, not-so-nice, picture:

Zoomed TCP flow

If you compare this to our native HTTP/1.1 figure, you will notice that, because of the MUX flow control, the TCP pipe is not always kept full: in the whole zoomed section, we were never able to send more than three segments at a time, and more than half of the bursts were less than two packets (again, normal window filling would mean sending roughly four packets each time). That's bad.
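The "never more than three segments" observation is consistent with the credit value above. Taking the 4096 bytes of credit at face value, and assuming a typical Ethernet MSS of about 1460 bytes (an assumption, the trace does not state it), the sender can have at most three TCP segments' worth of data outstanding before it has to stop and wait for more credit, so it can never keep a four-packet window full. A quick check:

    // Back-of-the-envelope check; the 1460-byte MSS is an assumption.
    public class CreditCheck {
        public static void main(String[] args) {
            int credit   = 4096;                      // SENDER_DEFAULT_CREDIT
            int mss      = 1460;                      // assumed TCP maximum segment size
            int segments = (credit + mss - 1) / mss;  // ceiling of 4096 / 1460
            System.out.println(segments);             // prints 3 (the last segment is partial)
        }
    }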

But this is not all; the numbers of packets used in that test are:

server -> client   client -> server
       62                 55

Compare this with the figures for plain HTTP/1.1: we have nearly doubled the number of packets from the client back to the server. The TCP ACKs did not coincide with the MUX ACKs, so the acknowledgements were duplicated. That's bad.
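To see where the extra client-to-server packets come from, here is a sketch of the receiver side (again hypothetical, not the actual Jigsaw code, and the wire format of the credit message is omitted): once the application has consumed half a buffer's worth of data, the receiver writes a small MUX credit message back to the peer. That message is ordinary application data, so it typically goes out as its own small TCP segment, independently of the kernel's pure TCP ACKs, hence the near doubling of the client-to-server packet count.

    // Hypothetical sketch of the receiver side of a MUX session: credit messages
    // are written back as application data, separately from TCP-level ACKs.
    class MuxSessionReceiver {
        static final int WRITER_BUFFER_SIZE = 8192;              // experiment 1 value
        static final int CREDIT_CHUNK       = WRITER_BUFFER_SIZE / 2;

        private int consumedSinceLastCredit = 0;
        private final java.io.OutputStream out;                  // stream back to the sender

        MuxSessionReceiver(java.io.OutputStream out) {
            this.out = out;
        }

        // Called after the application has read n bytes from this session.
        void consumed(int n) throws java.io.IOException {
            consumedSinceLastCredit += n;
            if (consumedSinceLastCredit >= CREDIT_CHUNK) {
                sendCreditMessage(consumedSinceLastCredit);
                consumedSinceLastCredit = 0;
            }
        }

        private void sendCreditMessage(int bytes) throws java.io.IOException {
            // Placeholder bytes: the real credit message format is defined by the
            // MUX specification. The point is that a few control bytes are written
            // and flushed on their own, producing a separate small TCP segment.
            out.write(new byte[] { 0, 0, (byte) (bytes >> 8), (byte) bytes });
            out.flush();
        }
    }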

Experiment 2

Now for something even worse: experiment 1 was roughly tuned for Ethernet usage (FIXME: and that tuning cannot be done programmatically). Assume we tune MUX for some other network:

    // Initial credit size (receiver):
    public static final int RECEIVER_DEFAULT_CREDIT      = 2048;
    // Initial credit size (sender):
    public static final int SENDER_DEFAULT_CREDIT        = 2048;
    // Buffer size (this is the one that matters: it means the credit size is 1024):
    public static final int WRITER_BUFFER_SIZE           = 2048;
    // Others unchanged:
    public static final int READER_BUFFER_SIZE            = 4096;
    public static final int SENDER_DEFAULT_FRAGMENT_SIZE = 512;
 
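With these values the arithmetic is grim: credit is replenished in chunks of WRITER_BUFFER_SIZE / 2 = 1024 bytes, that is, two 512-byte fragments per grant, which is less than a single full Ethernet segment (again assuming an MSS of about 1460 bytes). The sender therefore stalls after almost every write. A quick check of these numbers:

    // Quick check of the experiment 2 numbers; the 1460-byte MSS is an assumption.
    public class SmallCreditCheck {
        public static void main(String[] args) {
            int writerBufferSize = 2048;                   // WRITER_BUFFER_SIZE
            int fragmentSize     = 512;                    // SENDER_DEFAULT_FRAGMENT_SIZE
            int mss              = 1460;                   // assumed Ethernet MSS

            int creditChunk = writerBufferSize / 2;        // 1024 bytes per credit grant
            int fragments   = creditChunk / fragmentSize;  // 2 fragments per grant
            boolean fillsOneSegment = creditChunk >= mss;  // false: not even one full segment

            System.out.println(creditChunk + " bytes, " + fragments
                               + " fragments per grant, fills a full segment: "
                               + fillsOneSegment);
        }
    }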

Again, it is important to realize that at the application layer you cannot dynamically switch from this configuration to the first one; anyway, these are the horrible results we now get. The global TCP dump shows that everything is even slower than before, but scroll down a little and look at the zoom!

Global TCP dump, small credit values

This is a zoom on a random area of the above TCP dump:

Zoomed TCP dump, small credit values

Is there any need for explanations here? Basically, the server lost approximately one fourth of the available bandwidth. To be complete, here are the numbers of packets exchanged:

server -> client   client -> server
      102                 82

Conclusion

The conclusion is left to the reader (please email me what you think).



Anselm Baird-Smith
$Id: Overview.html,v 1.3 1999/04/17 00:20:41 frystyk Exp $