Bill Janssen                                          Internet Draft
Xerox PARC                                     expires in six months
                                                       1 August 1998

            w3ng: Binary Wire Protocol for HTTP-ng

Status of this Document
***********************

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To view the entire list of current Internet-Drafts, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

This document has been produced as part of the W3C HTTP-ng Activity (for current status, see "http://www.w3.org/Protocols/HTTP-NG/Activity"). This is work in progress and does not imply endorsement by, or the consensus of, either W3C or members of the HTTP-ng Protocol Design Working Group. We expect the document to evolve considerably as the project continues.

Distribution of this document is unlimited. Please send comments to the HTTP-NG mailing list at . Discussions are archived at "http://www.w3.org/Protocols/HTTP-NG/".

Please read the "HTTP-NG Short- and Longterm Goals" [HTTP-ng-goals] for a discussion of goals and requirements of a potential new generation of the HTTP protocol and how we intend to evaluate these goals.

Abstract
********

This document describes a binary `on-the-wire' protocol to be used when sending HTTP-NG operation invocations or terminations across a network connection.

Table of Contents
*****************

   1. Terminology and Syntax
   2. Model of Operation
   3. Global Issues
   4. Utility Types
   5. Messages
      5.1. Extension Headers
      5.2. Request Message
         5.2.1. Operation and Object Memoizing
      5.3. Reply Message
      5.4. InitializeConnection Message
      5.5. TerminateConnection Message
      5.6. DefaultCharset Message
   6. Data Marshalling
   7. Connection Exceptions
   8. Security Considerations
   9. References
   10. Address of Author

1. Terminology and Syntax
**************************

Two data description languages are used in this document. The first is a pseudo-C syntax. It should be interpreted as C data structure layouts without any automatic padding to size boundaries, and allowing arbitrary bit-size limits on structs and unions as well as on ints and enums. The second is the data structure definition language defined in the XDR specification [RFC 1832]. Each use of pseudo-C and XDR is marked as to which language is being used.

This document uses a number of terms which are defined in the HTTP-ng Architecture Model document [HTTP-ng-arch]. It also references the type system defined in that document.

2. Model of Operation
**********************

This protocol assumes a particular model of operation based on conventional request/response messaging technology, with certain variations, as described in the HTTP-ng Architecture Model [HTTP-ng-arch]. The basic idea is that clients make use of services exported from a server by invoking operations on objects resident in that server. This model also assumes only hop-by-hop operation; proxying is supported at the application level.

The model used here assumes that operations are grouped into sets, the elements of which have a well-defined ordering; each operation set is called an "object type". It further assumes that an object type is identified by a UUID in the form of a URI; and that each operation in an object type can be identified with the ordinal number of the operation within the ordering of the elements of the object type.
It assumes that every object has an "object ID", which also forms a unique identifier. It provides for the fact that instances of object types are grouped into "object groups"; an object group may contain any number of instances, even only a single instance. Object IDs always consist of a unique identifier for the object group of the instance, which we call the "group ID", along with a group-relative identifier we call the "instance handle". Note that grouping of instances is not required (i.e., every instance might define its own object group), but the protocol provides an efficiency optimization for the case where it is true.

The client is connected to the server by a "connection", which carries operation invocation requests from the client (known as the "caller") to the server (known as the "callee"), and operation results from the callee back to the caller. The connection has state associated with it, which allows the caller and callee to use shorthand notations for some of the data passed to the other party.

Connections, in this model, are to a particular object group on the server; multiple connections can exist simultaneously between the same client and server, to different object groups; multiple connections to the same object group are also allowed. This protocol does not currently allow multiplexing of a single connection across multiple object groups; the HTTP-ng webmux transport layer is assumed to fulfill that function [HTTP-ng-webmux].

Two fundamental messages are defined by this protocol: the Request, which is used by the caller to invoke an operation on the callee, and the Reply, which is used to transfer the results of an operation from the callee to the caller. Every Reply message is associated with a particular Request message, but not every Request message has a Reply message associated with it. Connections are directional; operation invocation Requests always flow from the caller to the callee; Replies always flow from the callee to the caller.

In addition to these messages, several control messages are defined for this protocol. These control messages are used to improve the efficiency and robustness of the connection. They are intended to be generated and consumed by the implementation of the wire protocol, and should have no direct effect on the applications using the protocol.

A Request message indicates two important elements, the "operation" and the "discriminant object", or discriminant; it also contains data values which are the input parameters to the operation. Each Request has an implicit connection-specific serial number associated with it; serial numbers begin with the value one (1), and have a maximum value of 16777215. When the maximum serial number of a connection has been reached, the connection must be terminated, and further operations must be invoked over a new connection.

A Reply message indicates the termination status of the operation, provides information about synchronization, and may contain data values which are output parameters or `return values' from the operation. It contains an explicit serial number to indicate which Request it is a reply to. Replies may either indicate successful completion of the operation, or several different kinds of exceptional termination; if an exception is signalled, additional information is passed to indicate which of the possible exceptions for the operation was raised.

The model assumes that the messages are carried back and forth between the two parties by a "transport" subsystem. It requires that the transport subsystem be "reliable", "sequenced", and "message-oriented". By reliable, we mean that after a message is handed to the transport, the transport will either deliver it to the other party, or will signal an error if its reliable delivery cannot be ascertained. By sequenced, we mean that the transport will deliver messages to the other party in the same order in which the sender handed them to the transport.
By message-oriented, we mean that the transport will provide indication of the beginning and ending of the messages, without reference to any data encoded inside the message. An example of this type of transport would be the record marking defined in ONC RPC [RFC 1831] used with TCP/IP, or the HTTP-ng webmux transport layer [HTTP-ng-webmux] used with TCP/IP.

3. Global Issues
*****************

3.1. Byte Order
================

All values use `network standard' byte order, i.e. big-endian, because all Internet protocols use it. If in the future this becomes a problem for the Internet, this protocol will be affected by whatever solution is used to solve the problem in the wider Internet context. Note that the data marshalling format defined in XDR, which this protocol incorporates by reference, is also defined to be a big-endian protocol.

3.2. Alignment and Padding
===========================

The marshalled form of each value begins on a 32-bit boundary. The marshalled form of each value is padded-after, if necessary, to the next 32-bit boundary. The padding bits may be either 0 or 1 in any combination.

3.3. Marshalling Format
========================

Marshalling is via the XDR format specified in the XDR specification [RFC 1832]. It could be argued that this format is inexcusably wasteful with certain value types, such as boolean (32 bits) or byte (32 bits), and that a 16-bit or 8-bit oriented format should be designed and used in its place. However, the argument of using an existing Internet standards-track marshalling format for this purpose, rather than inventing a new one, is a strong one; a new format should only be defined if measurement of the overhead shows gross waste, or if progress of the XDR specification slows unacceptably. Additionally, the simplicity of the XDR specification should allow correct implementations of it to be realized with a minimum of effort.

3.4. Session Context
=====================

Unlike some previous protocols, this protocol is "session-oriented". That means that individual messages are sent in the context of a session, and are context-sensitive. This context-sensitivity allows session-wide compression. However, to support various kinds of marshalling architectures in implementations of this system, all marshalling can be done in a context-insensitive fashion, at the expense of sending additional bytes across the wire. Unmarshalling implementations, on the other hand, must always be capable of tracking and using context-sensitive information.

4. Utility Types
****************

The following data structures are defined in pseudo-C:

    typedef enum { False = 0, True = 1 } Boolean;

    typedef enum {
      InitializeConnection = 0,
      TerminateConnection = 1,
      DefaultCharset = 2
    } ControlMsgType;

    typedef enum {
      Success = 0,
      UserException = 1,          /* occurred during operation */
      SystemExceptionBefore = 2,  /* occurred before beginning operation */
      SystemExceptionAfter = 3    /* occurred after beginning operation */
    } ReplyStatus;

    typedef struct {
      Boolean cached_disc : 1;         /* True if cached object key */
      union {
        struct {
          Boolean cache_key : 1;       /* True if both sides cache it */
          unsigned key_len : 13;       /* length of key bytes */
        } uncached_key;
        unsigned cache_index : 14;     /* cache index if cached */
      } v;
    } DiscriminantID;

    typedef struct {
      Boolean cached_op : 1;           /* True if cached id */
      union {
        struct {
          Boolean cache_operation : 1; /* True if should be cached */
          unsigned method_id : 13;     /* method index */
        } uncached_op_info;
        unsigned cache_index : 14;     /* cache index if "cached_op" set */
      } v;
    } OperationID;

    typedef enum {
      MangledMessage = 0,      /* bad protocol synchronization */
      ProcessFinished = 1,     /* sending party has `exited' */
      ResourceManagement = 2,  /* transient close */
      WrongCallee = 3,         /* bad object group ID received */
      MaxSerialNumber = 4      /* the maximum serial number was used */
    } TerminationCause;

    typedef struct {
      unsigned major : 4;
      unsigned minor : 4;
    } ProtocolVersion;

    typedef unsigned Unused;

5. Messages
***********

Only a few messages are defined. The `InitializeConnection' message is used by the caller to verify that it has connected to the right server, and that it is using the correct version of the wire protocol. The `DefaultCharset' message allows both sides to independently define a default value for string charsets. The `Request' message causes an operation to be started on the remote server. The `Reply' message is sent from the server to the client to inform it of the completion status of the operation, and to convey any result values. The `TerminateConnection' message allows either side to indicate graceful shutdown of a connection.

5.1. Extension Headers
=======================

This protocol uses a mechanism called an "extension header" to provide for extensibility and tailorability. Features such as transaction contexts or global thread identifiers may be implemented via this mechanism. An extension header is a name-value pair, where the name is a UUID expressed as a URI, and the value is an HTTP-ng pickle (see [HTTP-ng-arch]) value. This name-value pair is then expressed as an XDR value of the type `ExtensionHeader' as described below. Each request message and reply message may contain a sequence of extension headers, expressed as a value of the XDR type `ExtensionHeaderList'.

    /* XDR */
    struct {
      string name<0xFFFF>;  /* URI for extension header */
      opaque value<>;       /* Pickle containing value of header */
    } ExtensionHeader;

    typedef ExtensionHeader ExtensionHeaderList<>;

5.2. `Request' Message
=======================

Request header (pseudo-C):

    typedef struct {
      Boolean control_msg : 1;         /* == FALSE */
      Boolean ext_hdr_present : 1;     /* True if ext hdr list present */
      OperationID operation_id : 15;   /* identifies operation */
      DiscriminantID object_key : 15;  /* identifies discriminant */
    } RequestMsgHeader;                /* 4 bytes total */

The actual message consists of the following sections:

    [ `RequestMsgHeader' ]
    [ extension header list, if any ]
    [ XDR `string' containing object type ID of object type
      defining operation, if not cached ]
    [ bytes of OBJECT_KEY, if not cached, padded to 4 byte boundary ]
    [ explicit input parameter values, if any, padded to a 4 byte
      boundary ]

The OPERATION_ID contains either a connection-specific 14-bit cache index, or a 13-bit method id (the zero-based ordinal position of the method in the definition of the object type in which the operation is defined) of the operation. If the method id is given, an additional value, an XDR `string' value containing the object type ID of the object type in which the operation is defined, is also passed. This means that this protocol will not support interfaces in which object types have more than 8192 methods directly defined.

The OBJECT_KEY is either a 14-bit connection-specific cache index, or the length of a variable length octet sequence of 8191 or fewer bytes (the largest length expressible in the 13-bit `key_len' field) containing the service-point-relative name for the object (the INSTANCE-HANDLE of the URL). The object key value of `{ False, False, 0 }', normally a zero byte variable length object key, is reserved for use by the protocol. The OBJECT_KEY is marshalled onto the transport as an XDR value of type `fixed-length opaque data', where the length is that specified in the `v.key_len' field of the OBJECT_KEY.

5.2.1. Operation and Object Memoizing
-------------------------------------

Callers may reduce the size of messages by memoizing operation IDs and object IDs that are passed in the connection.
This is done by the caller setting the `cache_key' (for object IDs) or `cache_operation' (for operation IDs) bit in the `DiscriminantID' or `OperationID' struct when the object key or operation ID is first sent. Each side must then assign the next available index to that object or operation. The space of operations is separate from the space of object ids, so that a total of 16383 possible values is available for memoizing of discriminant objects, and 16383 different possible values for memoizing of operations. Note that the index is passed implicitly, so both sides of the connection must synchronize their use of indices.

A shared set of indices may be loaded into the connection by some mechanism before any messages are sent. This specification does not define a mechanism for doing so.

5.3. `Reply' Message
=====================

Reply header (pseudo-C):

    typedef struct {
      Boolean control_msg : 1;      /* == FALSE */
      Boolean ext_hdr_present : 1;  /* True if ext hdr list present */
      ReplyStatus : 2;
      Unused reply_1 : 4;
      unsigned serial_no : 24;      /* serial # from Request */
    } ReplyMsgHeader;               /* 4 bytes total */

The actual message consists of the following fields:

    [ `ReplyMsgHeader' ]
    [ extension header list, if any ]
    [ exception ID (32-bit unsigned), if any ]
    [ explicit output parameter values, if any, padded to 4 byte
      boundary ]

5.4. `InitializeConnection' Message
====================================

InitializeConnection header (pseudo-C):

    typedef struct {
      Boolean control_msg : 1;        /* == TRUE */
      ControlMsgType msg_type : 3;    /* == InitializeConnection */
      Unused verify_1 : 4;
      ProtocolVersion version : 8;    /* what version of the protocol? */
      unsigned objgroup_id_len : 16;  /* length of object group ID */
    } InitializeConnectionMsgHeader;

The actual message consists of the following fields:

    [ `InitializeConnectionMsgHeader' ]
    [ `objgroup_id_len'-length object group ID for supposed callee,
      padded to 4-byte boundary ]

This message is sent from caller to callee as the first message of the connection. It is used to pass the object group ID of the connection from client to server, so that both sides understand what the omitted prefix portion of discriminant IDs is. If the object group ID received by the callee is not the correct object group ID for the callee (i.e., the callee has objects which do not have that prefix in their object IDs), the callee should terminate the connection, with the appropriate reason. The object group ID is passed as an XDR `fixed-length opaque data' value of the length specified in `objgroup_id_len'.

5.5. `TerminateConnection' Message
===================================

TerminateConnection header (pseudo-C):

    typedef struct {
      Boolean control_msg : 1;      /* == TRUE */
      ControlMsgType msg_type : 3;  /* == TerminateConnection */
      TerminationCause cause : 4;   /* why connection terminated */
      unsigned serial_no : 24;      /* last request processed/sent */
    } TerminateConnectionMsgHeader;

The actual message consists simply of the header; it provides for graceful connection shutdown. It is sent either from the caller to the callee, or from the callee to the caller, and informs the other party that it is cancelling the connection, for one of these reasons:

  1. A badly formatted message has arrived from the other party, and protocol synchronization is believed lost, or the caller has sent an `InitializeConnection' message with the wrong major version for the protocol;

  2. This party (process, thread, whatever) is going away, and the other party should not attempt to reconnect to it;

  3. This connection is being terminated due to active resource management; the other party should attempt to reconnect if it needs to - this reason is typically only useful from callee to caller;

  4. The caller has sent an `InitializeConnection' message with the wrong object group ID;

  5. The caller has used the maximum serial number available for this connection.

The `serial_no' field contains the serial number of the last message completely processed by the caller (when `TerminateConnection' is sent from caller to callee), or the serial number of the last message sent by the callee (when sent from callee to caller). No further messages should be sent on the connection by a sender of a `TerminateConnection' message after it has been sent, or by a receiver of a `TerminateConnection' message after it has been received.

5.6. `DefaultCharset' Message
==============================

DefaultCharset header (pseudo-C):

    typedef struct {
      Boolean control_msg : 1;         /* == TRUE */
      ControlMsgType msg_type : 3;     /* == DefaultCharset */
      Unused bits_12 : 12;             /* unused */
      unsigned charset_mibenum : 16;   /* default charset */
    } DefaultCharsetMsgHeader;

This message is sent by either side of a connection to establish a default charset for subsequent messages sent by that side of the connection. The charset defines how string values are marshalled as octet sequences. The default charset defines the default marshalling, unless overridden by an explicit charset in a string value. Each side of the connection may establish a default charset independently of the other side of the connection; the default charset only applies to string values in messages coming from that side. A new value of the default charset may be established at any time by sending another `DefaultCharset' message.

6. Data Marshalling
********************

This section defines how values of the type specified in the HTTP-ng type system [HTTP-ng-arch] are marshalled into a Request or Reply message.
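
Marshalled values always follow one of the 32-bit message headers defined in Section 5, so it may help to see how such a header could be assembled before looking at individual value types. The sketch below packs a `ReplyMsgHeader' and a `DefaultCharsetMsgHeader' in C. It rests on one assumption that this document does not state explicitly: bit fields are taken to be laid out in declaration order, starting at the most significant bit of the big-endian 32-bit word. The function names and the `status' parameter name are illustrative only (the pseudo-C leaves the `ReplyStatus' field unnamed).

```c
#include <stdint.h>

/* ReplyStatus values, copied from the pseudo-C in Section 4. */
enum reply_status {
    REPLY_SUCCESS = 0,
    REPLY_USER_EXCEPTION = 1,
    REPLY_SYSTEM_EXCEPTION_BEFORE = 2,
    REPLY_SYSTEM_EXCEPTION_AFTER = 3
};

/* Store a 32-bit word in network (big-endian) byte order. */
static void put_u32_be(uint8_t out[4], uint32_t w)
{
    out[0] = (uint8_t)(w >> 24);
    out[1] = (uint8_t)(w >> 16);
    out[2] = (uint8_t)(w >> 8);
    out[3] = (uint8_t)w;
}

/* ReplyMsgHeader, ASSUMING most-significant-bit-first field order:
   control_msg(1)=0 | ext_hdr_present(1) | status(2) | unused(4) |
   serial_no(24).  "status" is a hypothetical name for the unnamed
   ReplyStatus field. */
static void pack_reply_header(uint8_t out[4], int ext_hdr_present,
                              enum reply_status status, uint32_t serial_no)
{
    uint32_t w = 0;                          /* control_msg = False */
    w |= (uint32_t)(ext_hdr_present ? 1 : 0) << 30;
    w |= ((uint32_t)status & 0x3u) << 28;
    w |= serial_no & 0xFFFFFFu;              /* 24-bit serial number */
    put_u32_be(out, w);
}

/* DefaultCharsetMsgHeader, same bit-order assumption:
   control_msg(1)=1 | msg_type(3)=DefaultCharset | unused(12) |
   charset_mibenum(16). */
static void pack_default_charset_header(uint8_t out[4], uint16_t mibenum)
{
    uint32_t w = 0;
    w |= (uint32_t)1 << 31;                  /* control_msg = True */
    w |= (uint32_t)2 << 28;                  /* msg_type = DefaultCharset */
    w |= mibenum;
    put_u32_be(out, w);
}
```

Under that assumption, a user-exception Reply to the Request with serial number 5 would begin with the bytes 10 00 00 05, and a `DefaultCharset' message announcing MIBEnum 106 (UTF-8 in the IANA registry) would consist of the bytes A0 00 00 6A.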
The data value format used for parameters is the XDR format specified in [RFC 1832]. However, we extend the XDR specification with one additional type, called "flagged variable-length opaque data". It is similar to XDR's regular variable-length opaque data, except that the high-order bit of the length field is used as a flag bit, instead of being part of the length. This means that flagged variable-length opaque data can only carry opaque data of lengths less than or equal to (2**31)-1.

               0     1     2     3     4     5   ...
            ++----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
    flag -->||       length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
    bit     ++----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
            ||<------31 bits------->|<------n bytes------>|<---r bytes--->|
                                    |<----n+r (where (n+r) mod 4 = 0)---->|
                                            FLAGGED VARIABLE-LENGTH OPAQUE

6.1. Boolean Type
==================

Values of type `BOOLEAN' are passed as XDR `bool'.

6.2. Enumeration Types
=======================

Values of enumeration types are passed as XDR `enum'. Each enumeration value is assigned its ordinal value as it appears in the declaration of the enumeration type, starting with the value `one'.

6.3. Numeric Types
===================

6.3.1. Fixed-point Types
-------------------------

Values of fixed-point types are passed by passing the value of the numerator. We define a number of special cases for efficient marshalling of common integer types, as well as a general case for passing values of fixed-point types that are not covered by the special cases.

Special cases:

   * 32-bit integer: Fixed-point values with a minimum numerator value greater than or equal to -2147483648 and with a maximum numerator value less than or equal to 2147483647 are passed as XDR `integer'.

   * 32-bit unsigned integer: Fixed-point values with a minimum numerator value greater than or equal to 0 and with a maximum numerator value less than or equal to 4294967295 are passed as XDR `unsigned integer'.

   * 64-bit integer: Fixed-point values with a minimum numerator value greater than or equal to -9223372036854775808 and with a maximum numerator value less than or equal to 9223372036854775807 are passed as XDR `hyper integer'.

   * 64-bit unsigned integer: Fixed-point values with a minimum numerator value greater than or equal to 0 and with a maximum numerator value less than or equal to 18446744073709551615 are passed as XDR `unsigned hyper integer'.

General case:

The numerator of the value is passed as XDR `flagged variable-length opaque data', with the bytes of the data containing the value expressed as a base-256 number, in big-endian order; that is, with the most significant digit of the value first. The flag bit is used to carry the sign; the flag bit is 0 for a positive number or zero, and 1 for a negative number.

6.3.2. Floating-point Types
----------------------------

We define a number of special cases for efficient marshalling of common floating-point types, as well as a general case for passing values of floating-point types that are not covered by the special cases.

Special cases:

   * IEEE single: floating point types matching the IEEE 32-bit floating-point format (that is, with the parameters significand-size=24, exponent-base=2, maximum-exponent-value=127, minimum-exponent-value=-126, has-Not-A-Number=TRUE, has-Infinity=TRUE, denormalized-value-allowed=TRUE, and has-signed-zero=TRUE) are passed as XDR `floating-point'.

   * IEEE double: floating point types matching the IEEE 64-bit floating-point format (that is, with the parameters significand-size=53, exponent-base=2, maximum-exponent-value=1023, minimum-exponent-value=-1022, has-Not-A-Number=TRUE, has-Infinity=TRUE, denormalized-value-allowed=TRUE, and has-signed-zero=TRUE) are passed as XDR `double-precision floating-point'.

   * Intel extended double: floating point types matching the Intel IEEE floating-point-compliant extended double floating-point format (that is, with the parameters significand-size=64, exponent-base=2, maximum-exponent-value=16383, minimum-exponent-value=-16382, has-Not-A-Number=TRUE, has-Infinity=TRUE, denormalized-value-allowed=TRUE, and has-signed-zero=TRUE), are passed as a 12-byte value of XDR `fixed-length opaque data', containing the floating-point value in the format specified in the UNIX System V Application Binary Interface Intel 386 Processor Supplement (Intel ABI) document: the 63 bits of the fraction occupy the first 7 bytes in little-endian order plus the low seven bits of the eighth byte; the 1 bit explicit leading significand bit occupies the high-order bit of the eighth byte; the 15 bits of the exponent occupy the ninth byte and the low-order bits of the tenth byte, in little-endian order; the sign bit occupies the high-order bit of the tenth byte; the eleventh and twelfth bytes are unused, and should contain zero values.

   * SPARC & PowerPC extended double: floating point types matching the XDR quadruple-precision floating-point format (that is, with the parameters significand-size=113, exponent-base=2, maximum-exponent-value=16383, minimum-exponent-value=-16382, has-Not-A-Number=TRUE, has-Infinity=TRUE, denormalized-value-allowed=TRUE, and has-signed-zero=TRUE), which is the form of extended double floating-point used by PowerPC and SPARC processors, are passed as XDR `quadruple-precision floating-point'.
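
The general cases for both fixed-point and floating-point values build on `flagged variable-length opaque data' carrying a sign and a big-endian base-256 magnitude. As a rough illustration, the C sketch below encodes a numerator in that form. Several things here are assumptions rather than part of this specification: the function names are hypothetical, the numerator is supplied as an `int64_t' purely for convenience (a value that small would normally use one of the special cases), leading zero digits are stripped, zero is sent as a single zero digit, and padding bytes are written as zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode "flagged variable-length opaque data": a big-endian 32-bit word
   whose high-order bit is the flag and whose low 31 bits are the length n,
   followed by n data bytes and zero padding up to a 4-byte boundary.
   Returns the number of bytes written, or 0 if n needs more than 31 bits
   or the output buffer is too small. */
static size_t encode_flagged_opaque(uint8_t *out, size_t cap, int flag,
                                    const uint8_t *data, size_t n)
{
    size_t padded;
    uint32_t word;

    if (n > 0x7FFFFFFFu)
        return 0;
    padded = (n + 3u) & ~(size_t)3u;
    if (cap < 4u + padded)
        return 0;

    word = (uint32_t)n | (flag ? 0x80000000u : 0u);
    out[0] = (uint8_t)(word >> 24);
    out[1] = (uint8_t)(word >> 16);
    out[2] = (uint8_t)(word >> 8);
    out[3] = (uint8_t)word;
    if (n > 0)
        memcpy(out + 4, data, n);
    memset(out + 4 + n, 0, padded - n);      /* padding (zeros assumed) */
    return 4u + padded;
}

/* General-case fixed-point numerator: flag = sign, data = magnitude as a
   base-256 number, most significant digit first.  Stripping leading zero
   digits and sending zero as one digit are choices made for this sketch. */
static size_t encode_general_fixed_point(uint8_t *out, size_t cap,
                                         int64_t numerator)
{
    uint64_t mag = numerator < 0 ? (uint64_t)0 - (uint64_t)numerator
                                 : (uint64_t)numerator;
    uint8_t digits[8];
    size_t n = 0;
    int shift;

    for (shift = 56; shift >= 0; shift -= 8) {
        uint8_t digit = (uint8_t)(mag >> shift);
        if (digit != 0 || n > 0 || shift == 0)
            digits[n++] = digit;
    }
    return encode_flagged_opaque(out, cap, numerator < 0, digits, n);
}
```

With these choices, the numerator -258 would be sent as the eight bytes 80 00 00 02 01 02 00 00: flag bit set, length 2, digits 01 02, and two bytes of padding.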

General case:

Values of floating-point types not matching the special cases identified above are passed as a value of the XDR struct type `GeneralFloatingPointValue', which has the following definition:

    /* XDR */
    enum {
      Normal = 1,
      NotANumber = 2,
      Infinity = 3
    } FloatingPointValueType;

    struct {
      flagged opaque FixedPointSignAndSignificand<>;
      flagged opaque FixedPointExponent<>;
    } NormalFloatingPointValue;

    union switch (FloatingPointValueType disc) {
      case Normal:      NormalFloatingPointValue value;
      case NotANumber:  void;
      case Infinity:    void;
    } GeneralFloatingPointValue;

The two fields of the `NormalFloatingPointValue' struct each contain an on-the-wire representation of a fixed-point value of the fixed-point type (denominator=1, no-minimum-numerator, no-maximum-numerator). The `FixedPointSignAndSignificand' field contains the sign of the floating-point value as the sign, and the actual significand as the absolute value of the fixed-point value. The `FixedPointExponent' field contains the exponent of the floating-point value.

6.4. String Types
==================

Each string value sent in this protocol has a "charset" [RFC 2278] associated with it, identified by the charset's IANA-assigned MIBEnum value. Each side of a session may establish a "default charset" by sending the `DefaultCharset' message. String values that use the default character set do not contain explicit charset information; string values that use a charset other than the default charset contain the MIBEnum value for the charset, along with the bytes of the string.

We send a string value as a value of XDR `flagged variable-length opaque data'. If the flag bit is 1, the first two bytes of the string value are the MIBEnum of the charset, high-order byte first; the remaining bytes are the bytes of the string. If the flag bit is 0, the bytes of opaque data simply contain the bytes of the string; the charset is the default charset for the session.
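
The two string forms just described can be sketched in C as follows. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the length field is taken to count the two MIBEnum bytes together with the string bytes (which is how the rule above reads), and padding bytes are written as zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Marshal a string as flagged variable-length opaque data.
   has_charset != 0: flag bit 1, data = MIBEnum (2 bytes, high-order byte
                     first) followed by the string bytes.
   has_charset == 0: flag bit 0, data = string bytes only; the receiver
                     applies the sender's default charset.
   ASSUMPTION: the 31-bit length counts the MIBEnum bytes as well.
   Returns the number of bytes written, or 0 on overflow. */
static size_t marshal_string(uint8_t *out, size_t cap,
                             const uint8_t *str, size_t len,
                             int has_charset, uint16_t mibenum)
{
    size_t n, padded;
    uint32_t word;
    uint8_t *p;

    if (len > 0x7FFFFFFFu - 2u)
        return 0;
    n = len + (has_charset ? 2u : 0u);
    padded = (n + 3u) & ~(size_t)3u;
    if (cap < 4u + padded)
        return 0;

    word = (uint32_t)n | (has_charset ? 0x80000000u : 0u);
    out[0] = (uint8_t)(word >> 24);
    out[1] = (uint8_t)(word >> 16);
    out[2] = (uint8_t)(word >> 8);
    out[3] = (uint8_t)word;

    p = out + 4;
    if (has_charset) {
        *p++ = (uint8_t)(mibenum >> 8);      /* high-order byte first */
        *p++ = (uint8_t)mibenum;
    }
    if (len > 0)
        memcpy(p, str, len);
    memset(p + len, 0, padded - n);          /* padding (zeros assumed) */
    return 4u + padded;
}
```

For example, the two-byte string "hi" with an explicit MIBEnum of 106 (UTF-8) would be marshalled as 80 00 00 04 00 6A 68 69, while the same string relying on the default charset would be 00 00 00 02 68 69 00 00.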
It is a marshalling error to send a string value with a flag bit of 0 over a session for which no default charset has been established.

To avoid context-sensitivity in marshalling a string, it is always valid to marshal a string with an explicit charset value, even if the charset value is the same as the default charset for the session. When marshalling a string into a pickle, the charset should always be explicitly included.

6.5. Sequence Types
====================

Values of sequence types are passed as XDR `variable-length arrays', with one exception: Sequences of any fixed-point type with a minimum numerator greater than or equal to 0, and a maximum numerator less than or equal to 255, are passed as XDR `variable-length opaque data', with one numerator value per octet.

6.6. Array Types
=================

Values of array types are passed as XDR `fixed-length arrays', with one exception: Arrays of any fixed-point type with a minimum numerator greater than or equal to 0, and a maximum numerator less than or equal to 255, are passed as XDR `fixed-length opaque data', with one numerator value per octet.

6.7. Record Types
==================

Values of record types are passed as XDR `struct'.

6.8. Union Types
=================

Values of union types are passed as XDR `union', with the union discriminant being the zero-based ordinal value for the encapsulated value's type.

6.9. Pickle Type
=================

A pickle is passed as an XDR `variable-length opaque data', containing the type ID of the pickled value's type, followed by the XDR-marshalled pickled value. To save pickle space for common value types used in metadata, we define a packed format for the type ID marshalling.

A type ID is marshalled into a pickle as a 32-bit header, in an XDR `unsigned integer', possibly followed by an XDR `fixed-length opaque data', containing the string form of the type ID of the pickled type.
The header has the following internal structure:

    /* Pseudo-C */
    typedef struct {
      unsigned version : 8;
      PickleTypeKind type_kind : 8;
      unsigned type_id_len : 16;
    } TypeIDHeader;

The `version' field gives the version number of the pickle format; the `type_kind' field contains a value from the enum

    /* Pseudo-C */
    typedef enum {
      TypeKind_unconstrained = 0,  /* anything not covered by other type kinds... */
      TypeKind_boolean = 1,        /* BOOLEAN */
      TypeKind_s8 = 2,             /* FIXED-POINT DENOM=1 MIN-NUM=-128 MAX-NUM=127 */
      TypeKind_s16 = 3,            /* FIXED-POINT DENOM=1 MIN-NUM=-32768 MAX-NUM=32767 */
      TypeKind_s32 = 4,            /* FIXED-POINT DENOM=1 MIN-NUM=-2147483648
                                      MAX-NUM=2147483647 */
      TypeKind_s64 = 5,            /* FIXED-POINT DENOM=1 MIN-NUM=-9223372036854775808
                                      MAX-NUM=9223372036854775807 */
      TypeKind_u8 = 6,             /* FIXED-POINT DENOM=1 MIN-NUM=0 MAX-NUM=255 */
      TypeKind_u16 = 7,            /* FIXED-POINT DENOM=1 MIN-NUM=0 MAX-NUM=65535 */
      TypeKind_u32 = 8,            /* FIXED-POINT DENOM=1 MIN-NUM=0 MAX-NUM=4294967295 */
      TypeKind_u64 = 9,            /* FIXED-POINT DENOM=1 MIN-NUM=0
                                      MAX-NUM=18446744073709551615 */
      TypeKind_ieee_float32 = 10,  /* FLOATING-POINT SIGNIFICAND-SIZE=24 EXPONENT-BASE=2
                                      MAXIMUM-EXPONENT-VALUE=127
                                      MINIMUM-EXPONENT-VALUE=-126
                                      HAS-NOT-A-NUMBER=TRUE HAS-INFINITY=TRUE
                                      DENORMALIZED-VALUE-ALLOWED=TRUE
                                      HAS-SIGNED-ZERO=TRUE */
      TypeKind_ieee_float64 = 11,  /* FLOATING-POINT SIGNIFICAND-SIZE=53 EXPONENT-BASE=2
                                      MAXIMUM-EXPONENT-VALUE=1023
                                      MINIMUM-EXPONENT-VALUE=-1022
                                      HAS-NOT-A-NUMBER=TRUE HAS-INFINITY=TRUE
                                      DENORMALIZED-VALUE-ALLOWED=TRUE
                                      HAS-SIGNED-ZERO=TRUE */
      TypeKind_i_default_str = 12, /* STRING LANGUAGE="i-default" */
      TypeKind_object = 13,        /* local or remote object */
      ...                          /* other types like Date, etc, should be added here... */
      ...
    } PickleTypeKind;

If the value of `type_kind' is `TypeKind_unconstrained', the value of `type_id_len' is the length of a value of XDR type `fixed-length opaque data', containing the full string type ID of the type, which immediately follows the header.
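
A minimal C sketch of this type ID marshalling is given below. It assumes that the `version' field occupies the most significant byte of the 32-bit header (the field bit order is not stated explicitly here), that `type_id_len' is sent as zero when no type ID string follows, and that padding bytes are zero. The function name and the version number used in the example are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TYPEKIND_UNCONSTRAINED 0u   /* TypeKind_unconstrained in the enum */

/* Marshal a pickle type ID: a 32-bit TypeIDHeader, followed (only for
   TypeKind_unconstrained) by the type ID string as fixed-length opaque
   data padded to a 4-byte boundary.
   ASSUMED header layout, most significant byte first:
       version(8) | type_kind(8) | type_id_len(16)
   Returns the number of bytes written, or 0 on error. */
static size_t marshal_type_id(uint8_t *out, size_t cap, uint8_t version,
                              uint8_t type_kind, const char *type_id)
{
    size_t len = 0, padded;

    if (type_kind == TYPEKIND_UNCONSTRAINED) {
        if (type_id == NULL)
            return 0;
        len = strlen(type_id);
        if (len > 0xFFFFu)                   /* type_id_len is 16 bits */
            return 0;
    }
    padded = (len + 3u) & ~(size_t)3u;
    if (cap < 4u + padded)
        return 0;

    out[0] = version;
    out[1] = type_kind;
    out[2] = (uint8_t)(len >> 8);            /* zero for packed type kinds */
    out[3] = (uint8_t)len;
    if (len > 0)
        memcpy(out + 4, type_id, len);
    memset(out + 4 + len, 0, padded - len);  /* padding (zeros assumed) */
    return 4u + padded;
}
```

With a hypothetical pickle format version of 1, a `TypeKind_s32' value would need only the four header bytes 01 04 00 00, whereas an unconstrained type with a five-character type ID would take twelve bytes: the header 01 00 00 05, the five characters, and three bytes of padding.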
If `type_kind' has any other value, no `opaque data' is marshalled.

For the purposes of marshalling, pickles have no default charset; this
means that strings marshalled into a pickle should always contain an
explicit charset.  Pickles should be considered a single "message" for
the purposes of marshalling aliased reference types.

6.10. Reference Types
======================

6.10.1. Optional Types
-----------------------

Optional types are passed as XDR `optional-data'.

6.10.2. Aliased Types
----------------------

The scope of aliasing in this protocol is the message, as in Java RMI,
rather than the call, as in DCE RPC.  That is, aliasing occurs only
within the context of a single invocation or result, rather than
across a full invocation-result pair.  For the purposes of
marshalling, a pickle scope should be considered a single message
scope.

Each unique value of an aliased type that is marshalled is assigned a
32-bit unsigned integer value, unique in the scope of aliasing, called
its "aliased identifier".  This identifier is marshalled as an XDR
`unsigned integer'.  If the aliased value has not previously been sent
in this scope, its value is then marshalled as a value of its base
type would be.  Note that this means that the full value of each
aliased value is sent only once in a scope; subsequent occurrences
send only the aliased identifier.

[ XXX - how to handle overflow of aliased value cache? ]

6.11. Object Types
===================

An instance of an object type is passed as the state of the object
type, which also contains information about the actual type of the
value.  For remote object types, this state is followed by the object
identifier, and optionally information about how the instance may be
contacted.

6.11.1. Parameter Type Versus Actual Type
------------------------------------------

When marshalling the state of an object, it is important to
distinguish two types of the value: the "parameter type", which is the
type that both sides of the session expect the value to have, and the
"actual type" of the value, which is the most-derived type of the
object, and may be a subtype of the parameter type.  If the actual
type is different from the parameter type, extra information must be
passed along with the value to allow the receiver to properly
distinguish the type and its associated data.  However, if the actual
type is the same as the parameter type, some of this information can
be omitted.

6.11.2. Passing the Actual Type Information
--------------------------------------------

We try to pass the type information of the object type as the type ID
of the most-derived type of the object.  However, for proper
unmarshalling of local object types, we also need to pass additional
type IDs.  Type information is thus passed according to the following
rules:

  1. If the parameter type of the object is sealed, both sides already
     know the most-derived type ID of the instance, and know that the
     actual type must be the same as the parameter type.  In this
     case, the type ID is passed as XDR `void'.

  2. Otherwise, type information is passed as a value of the following
     XDR union type `GeneralTypeInformation':

          /* XDR */
          enum {
            FormalType = 1,
            RemoteMSTID = 2,
            LocalTypeTree = 3
          } ObjectTypeInformation;

          union switch (ObjectTypeInformation disc) {
            case FormalType:
              void;             /* passed implicitly */
            case RemoteMSTID:
              opaque<0xFFFF>;   /* contains most-derived type ID */
            case LocalTypeTree:
              VALUE OF HTTP-NG.TYPEIDTREENODEREF;
                                /* full inheritance tree */
          } GeneralTypeInformation;

     That is:

       1. If the actual type of the object is the same as the
          parameter type, again both sides know it, and the type
          information is passed implicitly.

       2. If the object type inherits from `HTTP-ng.RemoteObjectBase',
          the type ID of the most-derived type of the object is passed
          as a value of XDR `variable-length opaque data' containing
          the type ID.

       3. If the object type does not inherit from
          `HTTP-ng.RemoteObjectBase', and is thus a local object type,
          the full type inheritance hierarchy of the type is passed as
          a value of `HTTP-ng.TypeIDTreeNodeRef'.

6.11.3. Passing the State Attributes
-------------------------------------

The state attributes are marshalled in one of two ways:

  1. If the actual type of the instance is the same as the parameter
     type, the state of each of the types of the object is passed by
     walking the supertype inheritance tree of the instance in a
     depth-first order, passing the value of each attribute of any
     particular state in the order in which they are defined, as if
     each state formed an XDR `structure' with the attributes as the
     components of the structure.  The value of each attribute is
     marshalled directly according to the type of the attribute.

  2. If the actual type of the instance is a subtype of the parameter
     type, the receiver has to be able to handle state for types it
     has no knowledge of.  To allow for this, the state of each type
     is passed as an encapsulation.  That is, the state of the
     instance is passed as a sequence of XDR `structure' values, each
     containing the state for one of the types of the instance.  Types
     of the instance which have no associated state do not appear in
     this sequence.  An XDR expression of the sequence would be the
     following:

          /* XDR */
          struct {
            opaque type_id<0xFFFF>;
            opaque state<>;
          } TypeState;

          typedef TypeState StateSequence<>;

     The `type_id' field contains the type ID for that type of the
     object value.  The variable-length opaque data field `state'
     contains the values of the attributes of the state marshalled as
     an XDR `structure', where the components of the structure are the
     attributes of the state.

6.11.4. Passing the Object ID and Contact Info
-----------------------------------------------

In the case of a remote object type, the object group ID, instance
handle and contact info for the value are passed as a value of the
following XDR structure type `RemoteObjectInfo':

     /* XDR */
     typedef string ContactInfo<0xFFFF>;

     struct {
       opaque objgroup_id<>;
       opaque instance_handle<>;
       ContactInfo cinfos<>;
     } RemoteObjectInfo;

where `objgroup_id' is an identifier for the server which supports the
desired object, and `instance_handle' is a server-relative name for
the object.  The `cinfos' field contains zero or more pieces of
information about the way in which the object needs to be contacted,
including information such as whether various transport layers are
involved.

6.11.5. Syntax of Cinfo Strings
--------------------------------

[Note: this cinfo syntax is poorly defined.  An extensible, more
conventionally URI-based scheme should replace this at some point. ]

Each cinfo string has the form described below (where brackets
indicate optionality, an is an identifier composed of ASCII lowercase
alphabetic and numeric characters, beginning with a lowercase
alphabetic character, and a is any string of ASCII characters not
containing the underscore character '_'):

     := '@' := [ '_' ] := := [ '_' ] := := [ '=' ] := [ '_' ]

* 6.11.5.1. Syntax of `w3ng' Pinfo

The current syntax of the pinfo string for the `w3ng' wire protocol is

     := 'w3ng' := [ '.' ]

where `' and `' are numbers between 0 and 15.  If the `' is not
specified, it defaults to 0.

* 6.11.5.2. Syntax of `w3mux' Tinfo

The current syntax of the tinfo string for the `w3mux' transport layer
is

     := 'w3mux' := '_'

where `' is a protocol ID number [HTTP-ng-webmux], and `' is a UUID
string for an endpoint.  The size of the `' string must be less than
1000 bytes.

* 6.11.5.3. Syntax of `tcp' Tinfo

The current syntax of the tinfo string for the `tcp' transport layer
is

     := 'tcp' := '_'

where `' is a string of less than 1000 bytes indicating the IP address
or hostname of the remote machine, and `' is the TCP port on which the
host is listening.

7. Connection Exceptions
*************************

We define a number of exceptions, as defined in [HTTP-ng-arch], which
may be signalled `across the wire' (between compatibility domains) as
a result of any operation invocation, and are called "connection
exceptions".  Each connection exception has a specified "exception
code", and may also have output parameters associated with it.  This
set of exceptions may not be the same as the set of system exceptions
built into any implementation of the HTTP-ng architecture; the
implementation is responsible for mapping from its internal set of
exceptions to that supported by the wire protocol.

7.1. `UnknownProblem'
======================

Exception code:  0
Output parameters:  None

An unknown problem occurred.

7.2. `ImplementationLimit'
===========================

Exception code:  1
Output parameters:  None

The request could not be properly addressed because of some
implementation resource limit on the callee side.

7.3. `SwitchConnectionCinfo'
=============================

Exception code:  2
Output parameters:
     NEW-CINFO : value of a string type with language "i-default"

This exception requests the caller to upgrade the connection protocol
and transport information to the cinfo specified as the argument, and
retry the call.  This is the equivalent of the `UPGRADE' message in
HTTP 1.1, and the `RELOCATE_REPLY' message in CORBA GIOP.

7.4. `Marshal'
===============

Exception code:  3
Output parameters:  None

A marshalling problem was encountered.

7.5. `NoSuchObjectType'
========================

Exception code:  4
Output parameters:  None

The object type of the operation was unknown at the server.

7.6. `NoSuchMethod'
====================

Exception code:  5
Output parameters:  None

The object type of the operation was known at the server, but did not
contain the indicated method.

7.7. `NoSuchObject'
====================

Exception code:  6
Output parameters:  None

The specified discriminant object was not available at the server.

7.8. `InvalidType'
===================

Exception code:  7
Output parameters:  None

The object specified by the discriminant did not participate in the
type specified in the operation.

7.9. `Rejected'
================

Exception code:  8
Output parameters:
     REASON : value of a string type with language "i-default"

The server refused to process the request.  It may return a string
giving a reason for the rejection.

7.10. `OperationOrDiscriminantCacheOverflow'
=============================================

Exception code:  9
Output parameters:  None

The request caused the receiver's cache of operations or discriminants
to overflow.  The sender may retry the request with uncached operation
and discriminant values; subsequent requests should not cache any
additional operation or discriminant values, but may continue to use
previously successfully cached values.

8. Security Considerations
***************************

This protocol assumes that security provisions are made either at some
level above it, typically in the application interfaces, or at some
level below it, typically by use of a secure transport mechanism.  It
contains no protocol-level mechanisms for providing or assuring any of
the concerns normally related to security.

9. References
**************

[HTTP-ng-arch]: HTTP-ng Architecture Model.
(See `http://www.w3.org/TR/WD-HTTP-NG-architecture'.)

[HTTP-ng-goals]: HTTP-ng Short- and Long-term Goals.
(See `http://www.w3.org/TR/WD-http-ng-goals'.)

[HTTP-ng-webmux]: HTTP-ng WEBMUX Protocol Specification.
(See `http://www.w3.org/TR/WD-mux'.)

[RFC 1831]: RFC 1831, RPC: Remote Procedure Call Protocol
Specification Version 2; R. Srinivasan, August 1995.
(See `http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1831.txt'.)

[RFC 1832]: RFC 1832, XDR: External Data Representation Standard;
R. Srinivasan, August 1995.
(See `http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1832.txt'.)

[RFC 2277]: RFC 2277, IETF Policy on Character Sets and Languages;
H. Alvestrand, January 1998.
(See `http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2277.txt'.)

[RFC 2278]: RFC 2278, IANA Charset Registration Procedures;
N. Freed & J. Postel, January 1998.
(See `http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2278.txt'.)

10. Address of Author
**********************

Bill Janssen

Mail:  Xerox Palo Alto Research Center
       3333 Coyote Hill Rd
       Palo Alto, CA  94304
Phone: (650) 812-4763
FAX:   (650) 812-4777
Email: janssen@parc.xerox.com
HTTP:  http://www.parc.xerox.com/istl/members/janssen/