I'm not sure the w3-mux list is active yet, so I'm sending my comments
out early to you guys... something like this will probably go out to
the list, if I don't see the holes in my own arguments and come to my
senses in the meantime.  So, what follows is drafty, but here goes:

This note discusses the hash-based notion of protocol IDs, and an
alternative.  To start with, I summarize the current hash-based
system, as I understand it.  I also run on for a bit about what are
either problems with the scheme or goofs on my part; if you want to
skip to my proposed alternative, look for the line of asterisks.
Finally, for some notes on the need or non-need for registries in
these systems, see after the line of dashes.

Bill's proposal for a hash-based system starts by assuming that any
protocol stack (he explicitly allows for the possibility of several
stacked transports, perhaps serving different purposes, such as
compression and authentication) can be designated by a string of the
form, say:

   http:zippy-zip:nsafe

where zippy-zip and nsafe are compression and cryptographic layers,
respectively, which I just made up.  

Since some of these protocols may have options, he allows a syntax for
specifying them, as in the following protocol stack designator (I'll
call these things PSDs from here on):

   http;version=1.1:zippy-zip;dictsz=2048:nsafe;trusted-signatory="fred.net"

To save the latency hit incurred by transmitting a string of this
length, the proposal tosses in a hash function (specifically CRC32,
but nothing yet depends on this choice of hash, as opposed to
another).  So, when Alice has a MUX connection open to Bob, and
wants to initiate a session using the protocol stack named by the
above PSD, she runs the string through the hash function, and
transmits the hash code as part of the session-initiating SYN-flagged
MUX block.
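
As a rough illustration, the hashing step might look like the
following in Python.  The function name psd_hash is hypothetical,
and the sketch assumes PSDs are plain ASCII; how the resulting
32-bit code is laid out in the SYN block is not shown, since the
note above doesn't specify it.

```python
import zlib


def psd_hash(psd):
    """Return the CRC32 of a PSD string as an unsigned 32-bit integer.

    Assumption: the PSD is ASCII and is hashed exactly as written.
    """
    return zlib.crc32(psd.encode("ascii")) & 0xFFFFFFFF


example = 'http;version=1.1:zippy-zip;dictsz=2048:nsafe;trusted-signatory="fred.net"'
code = psd_hash(example)   # four bytes on the wire instead of the whole string
```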

Bob is supposed to have a complete dictionary of the hash codes for
all PSDs representing protocols that he understands (or to be able to
compute one on the fly).  So, when he receives Alice's SYN, with the
hash-code, he can look it up in his dictionary.  At this point, there
are three possible cases:

  1) None of Bob's protocols have a PSD with that hash code
  2) Exactly one of Bob's protocols has a PSD with that hash code
  3) More than one of Bob's protocols has a PSD with that hash code

In case 1, the connection attempt is simply rejected; this is no
  problem.

In case 2, Bob accepts the connection, and uses the protocol
  designated by the PSD in his dictionary.  There can be a problem
  here.  Suppose that there are two PSDs, PSDa and PSDb with the same
  hash code.  Suppose further that Alice speaks PSDa, and Bob speaks
  PSDb, but neither party speaks both protocols.  Then, if Alice tries
  to open a session using the protocol named by PSDa, Bob will accept
  it and start using the protocol named by PSDb, with potentially
  confusing results.

  (As discussed below, Bill had assumed that Alice would simply never
  try to talk to Bob using PSDa unless she knew in advance, by some
  unspecified means, that Bob understood that protocol.  If that
  assumption holds, then the troublesome cases can't arise).

Case 3 requires further treatment --- the proposal suggests that Bob
  should ask Alice for the full PSD of the protocol that she is trying
  to use, but the over-the-wire details of how he does that aren't
  fully worked out yet.
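
The three cases can be sketched as a lookup in Bob's dictionary.
This is only an illustration in Python: build_dictionary and
classify are made-up names, and the "ask-for-full-psd" result
stands in for the over-the-wire exchange that the proposal has not
yet worked out.

```python
import zlib
from collections import defaultdict


def build_dictionary(psds):
    """Map each CRC32 hash code to the list of PSDs that hash to it."""
    table = defaultdict(list)
    for psd in psds:
        table[zlib.crc32(psd.encode("ascii"))].append(psd)
    return table


def classify(table, code):
    """Decide what Bob does with the hash code from Alice's SYN."""
    matches = table.get(code, [])
    if not matches:
        return ("reject", None)               # case 1: no such PSD
    if len(matches) == 1:
        return ("accept", matches[0])         # case 2: exactly one PSD
    return ("ask-for-full-psd", matches)      # case 3: collision


bob = build_dictionary(["http:zippy-zip:nsafe", "http:nsafe"])
```

Note that case 2 accepts on the strength of the hash alone, which
is exactly where the PSDa/PSDb confusion described above can creep
in.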

So much for outlining the proposal.  The one remaining feasibility
issue that I'm aware of is that the size of Bob's dictionary must stay
within reasonable bounds for the scheme to be workable.  To begin
with, this implies some constraints on the grammar of the PSDs (to
avoid having to stuff the dictionaries with multiple redundant names
for the same protocol stack which differ only in, say, whitespace).
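
One way to picture such a constraint is a canonical form that both
peers compute before hashing.  The Python sketch below is purely
illustrative: canonical_psd is a hypothetical name, it only strips
whitespace around layer names, option names and option values, and
it assumes that ":" and ";" never appear inside a quoted value.

```python
def canonical_psd(psd):
    """Strip insignificant whitespace so equivalent PSDs compare equal.

    Assumptions: layers are separated by ':', options by ';', and
    neither delimiter occurs inside an option value.
    """
    layers = []
    for layer in psd.split(":"):
        parts = [part.strip() for part in layer.split(";")]
        name, options = parts[0], parts[1:]
        fields = []
        for opt in options:
            key, sep, value = opt.partition("=")
            fields.append(key.strip() + sep + value.strip())
        layers.append(";".join([name] + fields))
    return ":".join(layers)
```

With something like this in place, "http ; version = 1.1 : nsafe"
and "http;version=1.1:nsafe" would need only one dictionary entry.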

However, a more significant issue is the number of *meaningfully
distinct* PSDs Bob might support --- something that depends rather
strongly on what options are supported by the protocols he
understands.  There is at least the possibility of blowup here; if Bob
supports only one protocol stack with three layers, but those layers
have five three-valued options each (giving 243 supported variants of
each layer), then Bob has to put over fourteen million total PSDs in
his dictionary, which starts to get a little burdensome.
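
The arithmetic behind that figure, as a small Python check (the
function name psd_count is just for illustration, and it assumes
every combination of option values is a distinct, supported PSD):

```python
def psd_count(layers, options_per_layer, values_per_option):
    """Number of distinct PSDs for one stack, counting every option combination."""
    variants_per_layer = values_per_option ** options_per_layer
    return variants_per_layer ** layers


variants = 3 ** 5             # 243 variants of each layer
total = psd_count(3, 5, 3)    # 243 ** 3 = 14,348,907 dictionary entries
```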

One potentially dangerous case is illustrated by the trusted-signatory
option to my made-up nsafe authentication layer.  If Bob knows in
advance of some particular set of N signatories he is willing to
trust, he can build a dictionary with N entries for each protocol
stack involving nsafe, one for each signatory --- which isn't
necessarily so bad.

However, if Bob is willing to allow Alice to name a trusted signatory
which he doesn't know about in advance (we presume he's got some test
for the trustworthiness of the named signatory, or maybe Alice is the
paranoid one and Bob just doesn't care), then Bob would have to
somehow come up with dictionary entries for PSDs naming every possible
trusted-signatory on the entire net.  It isn't clear how Bob would go
about doing this.

(I freely admit that the "trusted signatory" option is a bit fanciful,
and perhaps symptomatic of poor design in my nonexistent nsafe layer,
but the same problem can potentially arise with any protocol option
whose values name an external third party or entity which is not known
to both parties in advance).

****************************************************************

FWIW, all the potential problems above would not arise if Bob had
access to Alice's PSD (protocol stack designator), instead of just
having a hash code for it.  He has only a hash code because
transmitting full PSDs over and over would consume bandwidth, and we
want to avoid that.  However, at least to my eyes, the complications
that the hash introduces look somewhat awkward.  So, in the remainder
of this note, I'll sketch an approach which has a lot fewer
difficulties, and consumes (to my taste) little enough bandwidth to be
worth the trade.

What I'll assume is that bandwidth is at least cheap enough that it's
no tragedy to transmit any given full PSD exactly *once* on a given
(hopefully long-lived) MUX connection.  As a check, I'll note that the
somewhat over-elaborate PSD used above as an example is a bit under 80
characters in length, and so should take roughly 40 ms. to transmit
over a 14.4K modem.  IIRC, delays have to be roughly a quarter second
before people start getting really annoyed, so transmitting one PSD of
this length won't annoy anybody much even over a relatively slow link,
but transmitting it repeatedly might.
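
A back-of-the-envelope version of that estimate in Python.  This
is only a sketch: transmit_ms is a made-up helper, and it assumes
8 bits per character at the raw line rate, ignoring start/stop
bits, MUX and TCP headers, and modem compression.

```python
def transmit_ms(n_chars, bits_per_second=14400, bits_per_char=8):
    """Milliseconds needed to push n_chars down the line at the raw rate."""
    return n_chars * bits_per_char * 1000 / bits_per_second


psd = 'http;version=1.1:zippy-zip;dictsz=2048:nsafe;trusted-signatory="fred.net"'
once = transmit_ms(len(psd))    # in the neighborhood of 40 ms
worst = transmit_ms(80)         # a full 80 characters: about 44 ms
```

Allowing ten bits per character for asynchronous framing pushes
the 80-character figure to about 56 ms, which is still well under
the quarter-second mark.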

Anyway, the basic idea of my scheme is as follows: the first time
Alice wants to open a session using a particular protocol, she
transmits the full PSD to Bob, *and* an arbitrary brief designator
which she will use to designate the same protocol stack in the future.
(The brief designator only has to be unique among protocols used on
*this* MUX connection, so fewer than sixteen bits ought to do).  At
this point, with the full PSD, Bob can determine whether he speaks the
protocol without having to try to invert a one-way hash function.

For future sessions, Alice uses the brief designator, and need not
transmit the full PSD.  Note that she need not wait for any sort of
acknowledgment from Bob before knowing this will work, since *she*
assigned the brief ID, and since MUX blocks are guaranteed (by TCP) to
be processed in the order of transmission.  If Bob supports the
protocol stack that Alice is trying to use, everything just works; if
not, he simply notes that fact the first time the brief designator
arrives (with a full PSD), and rejects that attempt along with any
future ones which use the same brief designator.
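
To make the bookkeeping concrete, here is a minimal Python sketch
of both ends.  The class and method names (Opener, Acceptor,
open_session, on_syn) are invented for illustration, brief
designators are simply small integers handed out in order, and the
actual MUX block format is left out entirely.

```python
class Opener:
    """Alice's side: assigns a brief designator to each PSD she uses."""

    def __init__(self):
        self.brief_ids = {}  # full PSD -> brief designator

    def open_session(self, psd):
        """Return (brief_id, full_psd_or_None) to carry in the SYN."""
        if psd in self.brief_ids:
            return self.brief_ids[psd], None   # already announced
        brief_id = len(self.brief_ids)         # next unused small integer
        self.brief_ids[psd] = brief_id
        return brief_id, psd                   # first use: send both


class Acceptor:
    """Bob's side: remembers what each brief designator stands for."""

    def __init__(self, supported_psds):
        self.supported = set(supported_psds)
        self.seen = {}  # brief designator -> PSD, or None if unsupported

    def on_syn(self, brief_id, full_psd=None):
        """Return the PSD to use, or None to reject the session."""
        if full_psd is not None:
            ok = full_psd in self.supported
            self.seen[brief_id] = full_psd if ok else None
        return self.seen.get(brief_id)
```

Because Alice never waits for a reply before reusing a
designator, the only state Bob needs is the small per-connection
table in Acceptor.seen.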

Now for the tricky details:

To avoid hash collisions, and all their attendant problems, it
probably makes the most sense to just assign the IDs dynamically, per
connection; this involves a little extra per-connection data structure
for each peer, but maintaining that structure is likely to be less
work for everybody than complex machinery for handling hash
collisions.

Given dynamic assignment, it is clearly important to keep Alice and
Bob from assigning the same brief designator to different protocol
stacks; this is most easily accomplished, as with session IDs, by
arbitrarily having one peer allocate the odd brief designators, and
the other peer allocate the even ones.  We could also simply require
each peer not to try to open connections using brief designators
assigned by the other, but that could require the same full PSD to be
transmitted both ways in some fairly common cases, and that's just a
waste of bandwidth.
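
The odd/even split is simple enough to sketch in a few lines of
Python.  BriefIdAllocator is a hypothetical name, and which peer
gets which parity is left open here, just as it is above.

```python
class BriefIdAllocator:
    """Hands out brief designators of one parity only."""

    def __init__(self, parity):
        if parity not in (0, 1):
            raise ValueError("parity must be 0 (even) or 1 (odd)")
        self.next_id = parity

    def allocate(self):
        """Return the next unused designator of this peer's parity."""
        brief_id = self.next_id
        self.next_id += 2
        return brief_id


one_peer = BriefIdAllocator(parity=1)     # allocates 1, 3, 5, ...
other_peer = BriefIdAllocator(parity=0)   # allocates 0, 2, 4, ...
```

Since the two ranges never overlap, either peer can safely reuse
a designator that the other one introduced.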

----------------------------------------------------------------

Lastly, some notes on the notion of registries.  In the ILU context,
schemes of this sort don't need a registry, but it's important to note
the reason why.  First, a scenario:

*) Alice is working at SGI, where she's working on networked Virtual
   Reality.  To save bandwidth, she has developed a protocol which
   allows one peer to incrementally transmit data on VRML objects as
   they come into the other peer's virtual view (anticipating what
   will come into view along the other peer's current path, and
   accepting updates and corrections about that peer's direction of
   motion).  She calls this Virtual Reality Transport Protocol, and
   abbreviates it in PSDs as VRTP.

*) Bob is a househusband with a VCR in his apartment.  Since his
   friend's VCR broke down, he lets the poor fellow set the timer on
   his own (Bob's) VCR; to let him do it over the net, he's developed
   a Video Recorder Twiddling Protocol for transmitting the timer
   settings, and he abbreviates the name of that protocol in PSDs as
   VRTP.

So, we have two parties, with two very different protocols which they
both designate in PSDs (which, remember, are the main feature I've
retained from the hash-coding scheme) as VRTP.  The question, now,
is how we keep Alice from trying to download entire virtual worlds
into Bob's VCR.

IIRC from Bill's email, the reason that this isn't supposed to happen
in the ILU context is that Alice is never even supposed to *try* to
speak (her version of) VRTP to Bob unless she *knows in advance*, by
some unspecified, out-of-band means, that Bob speaks that protocol, and
refers to it by the same name in his own PSDs.  It is because of this
assumption about the behavior of peers using the negotiation mechanism
that the scheme doesn't require a global registry; mechanistic
details, such as the use or non-use of hash functions, are not
particularly key.

Now, how workable this is for unrelated parties on the global internet
obviously depends on the unspecified, out-of-band means by which Alice
finds out about Bob's capabilities.  I can come up with schemes which
do the job for completely unrelated parties, but all of mine involve
either the creation of some global registry for contact information
such as layer names (and what they signify), or at least the
exploitation of an existing global registry, such as the DNS.  Of
course, I may very well be missing something here...

rst