infoset and bindings

Here are some initial thoughts on a possible framework for thinking
about things like attachments and bindings and their relationship to
the envelope.

We have consistently been plagued by issues such as "what is the
message", "what is the role of SOAPAction", "what is the difference
between data in attachments or in the envelope", etc.  I'm coming to
the view that a message should be abstractly viewed as an infoset
comprising ALL of the relevant information to be carried along the
wire.  The role of a binding is to determine how to convey that
information to the next hop.  Pictorially, we have the following
situation, where S is the originating sender node, I1 is the first
intermediary, ..., and R is the final recipient.

S ---> I1 ---> I2 . . . In-1 ---> R

For each node k (S, I1, ..., In-1, R), we can define Out(k) to be the
infoset of the message leaving node k, and In(k) to be the infoset
constructible from the message inbound to node k.  [Out(k-1) is not
necessarily equivalent to In(k) -- see the discussion below.]

In practice, a binding is selected to transmit the required parts of
the message to the next node.  The binding specifies an underlying
transfer/transport protocol, (optionally) content-carrying formats
(e.g., for attachments), and the XML envelope.  Each of these levels
in the stack may be parameterized with portions of the infoset
(possibly redundantly).

As each node receives the transmitted message, it (theoretically)
constructs an infoset In(k) for the inbound message, and transforms it
according to the SOAP processing it performs into Out(k).  If the node
is an intermediary, it then selects a binding for the next hop, etc.
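
As a rough sketch of this per-node flow, here is some Python.  Nothing
in it comes from a spec: a plain dict stands in for an infoset, and the
names run_path, http_binding and forward_unchanged are made up purely
for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Illustrative stand-in for a message infoset: a dict of named parts.
Infoset = Dict[str, object]
# A hop's binding maps the sender's Out(k-1) to the receiver's In(k).
Binding = Callable[[Infoset], Infoset]
# A node's SOAP processing maps In(k) to Out(k).
Processing = Callable[[Infoset], Infoset]


def run_path(out_s: Infoset,
             hops: List[Tuple[str, Binding, Processing]]
             ) -> List[Tuple[str, Infoset, Infoset]]:
    """Carry a message along S -> I1 -> ... -> R.

    Returns one (node name, In(k), Out(k)) triple per receiving node.
    """
    trace = []
    outbound = out_s
    for name, binding, process in hops:
        in_k = binding(outbound)   # what node k can reconstruct from the wire
        out_k = process(in_k)      # result of SOAP processing at node k
        trace.append((name, in_k, out_k))
        outbound = out_k
    return trace


def http_binding(out_prev: Infoset) -> Infoset:
    """Hypothetical binding: the first route is consumed by the transport,
    so the receiver sees only the remaining routes."""
    in_k = dict(out_prev)
    in_k["path"] = list(out_prev["path"])[1:]
    return in_k


def forward_unchanged(in_k: Infoset) -> Infoset:
    """Hypothetical intermediary processing that passes the message through."""
    return dict(in_k)


out_s = {"path": ["I1-uri", "R-uri"], "body": "payload"}
trace = run_path(out_s, [("I1", http_binding, forward_unchanged),
                         ("R", http_binding, forward_unchanged)])
```

With this particular (assumed) binding, the In(I1) recorded in trace
differs from out_s because the first route never reaches I1 as part of
the message.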

The behavior of a binding is modulated by the overall messaging
paradigm (static/dynamic routing, request/response, etc.).  To make
this more concrete, consider a one-way message in a simple routing
model in which all of the routing information is determined by the
initial client, perhaps via some out-of-band technology such as WSDL.
I will use an XML serialization of the infoset in this very sketchy
example.  Don't hold me to the precise syntax -- this is just for
illustration.  There also should be a more abstract representation in
the infoset of the various blocks (headers, body and trailers),
perhaps marked with dependency relationships, etc.  As given, the
infoset for the message probably anticipates the infoset for the
envelope a bit too much.

Out(S):

<SOAP-ENV:Message>
  <SOAP-ENV:Path>
    <r:Routing xmlns:r="some-routing-ns-1" SOAP-ENV:actor="I1-uri">
      <transport>http</transport>
      <destination>foo.com:80</destination>
      <intent>some-acceptable-I1-intent</intent>
    </r:Routing>
    <r:Routing xmlns:r="some-routing-ns-2" SOAP-ENV:actor="I2-uri">
      <transport>http</transport>
      <destination>foo.com:80</destination>
      <intent>some-acceptable-I2-intent</intent>
    </r:Routing>
    ...
    <r:Routing xmlns:r="some-routing-ns-n" SOAP-ENV:actor="R-uri">
      <transport>http</transport>
      <destination>soap.baz.com:1080</destination>
      <intent>some-final-intent</intent>
    </r:Routing>
  </SOAP-ENV:Path>
  <SOAP-ENV:Header>
    <a:SomeHeader xmlns:a="some-header-ns" SOAP-ENV:actor="H1-uri">
      ...
    </a:SomeHeader>
    ...
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    ...
  </SOAP-ENV:Body>
  <SOAP-ENV:Trailer>
    <t:SomeTrailer xmlns:t="some-trailer-ns">
      ...
    </t:SomeTrailer>
  </SOAP-ENV:Trailer>
</SOAP-ENV:Message>


The binding strategy for the first hop (from S to I1) will consume
the routing information for the first hop (http to foo.com:80), and
provide the rest of the routing info in the XML envelope.  For some
routing paths, all of the routing information for the entire path
could be carried in the underlying protocol with none of it in the
SOAP envelope.  [It might be necessary to explicitly characterize the
global routing paradigm in the message infoset.]

Here is what the wire format of the SOAP envelope on the first hop
might look like.

<SOAP-ENV:Envelope xmlns:SOAP-ENV="...">
  <SOAP-ENV:Header>
    <r:Routing xmlns:r="some-routing-ns-2" SOAP-ENV:actor="I2-uri">
      <transport>http</transport>
      <destination>foo.com:80</destination>
      <intent>some-acceptable-I2-intent</intent>
    </r:Routing>
    ...
    <r:Routing xmlns:r="some-routing-ns-n" SOAP-ENV:actor="R-uri">
      <transport>http</transport>
      <destination>soap.baz.com:1080</destination>
      <intent>some-final-intent</intent>
    </r:Routing>
    <a:SomeHeader xmlns:a="some-header-ns" SOAP-ENV:actor="H1-uri">
      ...
    </a:SomeHeader>
    ...
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    ...
  </SOAP-ENV:Body>
  <t:SomeTrailer xmlns:t="some-trailer-ns">
    ...
  </t:SomeTrailer>
  ...
</SOAP-ENV:Envelope>
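
To make the first-hop split concrete, here is a minimal Python sketch
using the standard xml.etree.ElementTree module.  It is only one way a
binding might do this.  The function name bind_first_hop, the example
namespace URNs, and the use of the SOAP 1.1 envelope namespace for the
sketchy Message/Path/Trailer elements are all assumptions made for
illustration.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Assumption: the SOAP 1.1 envelope namespace is reused for the
# illustrative Message, Path and Trailer elements.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def q(local: str) -> str:
    """Qualified tag name in the (assumed) envelope namespace."""
    return f"{{{SOAP_ENV}}}{local}"


def bind_first_hop(message: ET.Element) -> Tuple[str, str, ET.Element]:
    """Split a message infoset into transport parameters and an envelope.

    The first routing entry is consumed by the binding; the remaining
    entries travel as header blocks in the envelope.
    """
    path = message.find(q("Path"))
    hops = list(path) if path is not None else []
    if not hops:
        raise ValueError("message infoset carries no routing information")
    first, rest = hops[0], hops[1:]

    envelope = ET.Element(q("Envelope"))
    header = ET.SubElement(envelope, q("Header"))
    header.extend(rest)                      # routing for later hops
    src_header = message.find(q("Header"))
    if src_header is not None:
        header.extend(list(src_header))      # ordinary header blocks
    body = message.find(q("Body"))
    if body is not None:
        envelope.append(body)
    trailer = message.find(q("Trailer"))
    if trailer is not None:
        envelope.extend(list(trailer))       # trailers follow the body

    # Elements are shared with the source tree, not copied.
    return first.findtext("transport"), first.findtext("destination"), envelope


MESSAGE = f"""
<env:Message xmlns:env="{SOAP_ENV}">
  <env:Path>
    <r:Routing xmlns:r="urn:example:routing-1" env:actor="I1-uri">
      <transport>http</transport>
      <destination>foo.com:80</destination>
    </r:Routing>
    <r:Routing xmlns:r="urn:example:routing-n" env:actor="R-uri">
      <transport>http</transport>
      <destination>soap.baz.com:1080</destination>
    </r:Routing>
  </env:Path>
  <env:Header>
    <a:SomeHeader xmlns:a="urn:example:header" env:actor="H1-uri"/>
  </env:Header>
  <env:Body/>
</env:Message>
"""

transport, destination, envelope = bind_first_hop(ET.fromstring(MESSAGE.strip()))
```

Here transport and destination would parameterize the underlying HTTP
request, while envelope is what gets serialized into the request body.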


A different but equally plausible strategy would be to carry the
Routing information in another layer (e.g., as a MIME attachment), but
the above approach seems to fit into the framework better since the
routing for the next hop can be handled by a targeted module.

The infoset received by node I1 would be whatever was available from
the SOAP envelope above and anything observable by the inbound HTTP
request processing (i.e., by the binding on the inbound side).  Thus,
Out(S) is not necessarily equivalent to In(I1).  More complex
messaging patterns and routing dependencies may dictate that earlier
routing information be preserved in the infoset.  Strategies like
dynamic, content-based routing might carry very little explicit
routing information in the infoset.


I hope the ideas are clear enough to serve as a starting point for
discussion.

--mark


Mark A. Jones
AT&T Labs - Research
Shannon Laboratory
Room A201
180 Park Ave.
Florham Park, NJ  07932-0971

email: jones@research.att.com
phone: (973) 360-8326
  fax: (973) 360-8970

Received on Friday, 29 June 2001 14:57:38 UTC