W3 Consortium                                                Rohit Khare
WORKING DRAFT                                        W3 Consortium / MIT
<WD-http-pep-951127.html>
Expires: May 22, 1996                                  November 22, 1995

PEP: An Extension Mechanism for HTTP

Status of this document

This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C tech reports can be found at: http://www.w3.org/pub/WWW/TR/

Note: since working drafts are subject to frequent change, you are advised to reference the above URL, rather than the URLs for working drafts themselves.


Table of Contents

  1. Abstract

  2. Motivation

  3. Concepts

    1. PEP Architecture

      1. Naming

      2. Addressing

      3. Negotiation

      4. Processing

    2. Terminology

  4. Operation

    1. Scenarios

    2. Interpretation

    3. Negotiation

    4. Relay Operation

    5. Origin Operation

    6. Deployment Issues

      1. Relationship to HTTP/1.1

  5. Notation

    1. Header Fields

    2. Content Codings

    3. Status Codes

  6. Usage Examples

  7. Security Considerations

    1. Information Leakage

    2. Trusting PEP Headers

    3. Protocol Interaction Effects

    4. Protocol Substitution Effects

    5. Negotiation Scope

  8. Development Path

  9. Acknowledgements

  10. References

  11. Author's Address

1. Abstract

PEP is a system for HTTP clients, servers, and proxies to reliably reason about custom extensions to HTTP. Traditionally, mutually informed HTTP agents could offer extended behavior by adding new message headers. PEP has features for standardizing scope, strength, and ordering of such extensions. PEP also offers an extensible negotiation framework.

2. Motivation

HTTP messages, like most applications of RFC 822 [5], can be extended with additional header fields. However, this provides no guidance to HTTP agents on whether to strip the header, or to act on a header, and if acted upon, in what order, and so on. Furthermore, multiple extensions may use conflicting header fields.

``Protocol extensions'' are a higher-level abstraction. They can specify any associated header lines and also provide guidance on each of the above features. PEP is an extension protocol for HTTP that captures such information about protocol extensions (hence `PEP': Protocol Extension Protocol).

Using PEP, HTTP agents can interoperate correctly with unknown protocol extensions and also negotiate a set of common protocol extensions. PEP brings to HTTP the extensibility lessons learned from ESMTP (extension naming) [11,13], IPv6 (unknown-option disposition) [6], and Telnet (option negotiation) [14].

3. Concepts

PEP is a tool for deploying applications which can be superimposed on HTTP transactions. PEP provides certain core features -- naming, addressing, negotiation, and processing -- on top of standard HTTP/1.x.

First, though, a brief introduction to the headers of PEP for HTTP/1.x. PEP uses the Protocol: and Accept-Protocol: headers to indicate which protocol extensions the current message conforms to and which are available for subsequent messages, respectively.

The Protocol: header indicates that an HTTP agent is acting in accordance with the named protocol extension in constructing the message, and that the receiver, according to the strength and scope of the header, may be required to understand the same protocol or report an error.

Similarly:

The Accept-Protocol: header indicates that an HTTP agent can act in accordance with the named protocol extension, and that the receiver, according to the strength and scope of the header, may be required to employ the same protocol or report an error.

In many cases, a protocol-extension corresponds to a software module: the "rot13" protocol is an agreement to shift all characters by 13 positions. Applying several protocols, in order, corresponds to a pipeline. To order each stage, PEP reuses a third header, Content-Encoding:.
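The module-and-pipeline correspondence can be sketched in a few lines of Python. This is only an illustration: the registry MODULES, the function apply_pipeline, and the second stage "reverse" are hypothetical names that PEP does not define, and the sketch assumes stages are applied in the order they are listed.

```python
import codecs

def rot13(text: str) -> str:
    """Shift every ASCII letter by 13 positions."""
    return codecs.encode(text, "rot_13")

def reverse(text: str) -> str:
    """A second, purely illustrative stage (not defined by PEP)."""
    return text[::-1]

# Hypothetical registry mapping encoding tokens to software modules.
MODULES = {"rot13": rot13, "reverse": reverse}

def apply_pipeline(content_encoding: str, body: str) -> str:
    """Apply each stage named in a Content-Encoding value, in listed order."""
    for token in (t.strip() for t in content_encoding.split(",")):
        body = MODULES[token](body)
    return body

encoded = apply_pipeline("rot13, reverse", "Hello")
```

A receiver would undo the stages in the opposite order; the point of the sketch is only that the header value fixes the order of the stages.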

3.1 PEP Architecture

The PEP architecture provides the key elements for reliably extending the HTTP protocol between agents. Section 4, Operation, discusses how these facilities modify agent behavior in more detail.

Note that the PEP architecture is completely symmetric; none of the features below distinguishes between ``client'' and ``server''.

3.1.1 Naming

PEP is required to identify protocol-extensions in two contexts: describing a message which employs a protocol, and in describing an offer to use a protocol. The identification process is separated into two parts: a protocol name, and a configuration that describes how the protocol is being used.

The protocol name is a pointer to a specification of the protocol itself: either a full URI, or a registered name. This takes the thinking of ESMTP [11] -- to encourage the proliferation of standardized, named extensions -- one step further, by using the web to dynamically register names.

A protocol specification, in addition to the ``usual'' material defining the meaning and structure of its payload, can also specify well-known parameters that specify how the protocol is being used (e.g. modes, sizes, etc). PEP identifies protocol extensions by combining the name and the parameters.

3.1.2 Addressing

While HTTP is a host-to-host protocol, an HTTP transaction is not; it may be routed through proxies, caches, or gateways. PEP is required to offer a way to select which agents are involved in processing each protocol instance or offer.

PEP uses a standard attribute, "scope", to indicate which HTTP agents are required to pay attention to an offer or instance. Scope has three values: connection, route, and origin. The concept is largely similar to option processing in IPv6 [6], especially in the importance of proper error reporting by handlers which are in scope, but cannot handle the option.

``Connection'' scope is addressed to the next HTTP agent; each protocol instance or offer so labeled must be removed before proceeding. The rationale is similar to the Connection: header in Section 10.9 of HTTP/1.1.

``Route'' scope is addressed to all HTTP agents until the origin, to set up synchronous tunnel-type extensions.

``Origin'' scope is addressed exclusively to the opposite endpoint (the ``origin server'' or ``origin client''). No intermediate agents are allowed to act upon or modify such protocol instances or offers -- unless the agent is explicitly trusted to act for the origin.

3.1.3 Negotiation

Before an HTTP transaction begins, none of the agents involved can be fully informed of the other agents' capabilities. PEP provides a framework for advertising capabilities and selecting interoperable sets of protocol-extensions. Telnet option negotiation [14] is a direct inspiration for the symmetric negotiation model PEP uses. PEP, though, adds a twist borrowed from SHTTP [15]: controlling the process by explicitly encoding the "strength" of a request.

PEP negotiation advertises a protocol offer by name (Section 3.1.1) and by strength. This allows agents to explicitly require, reject, or optionally accept particular protocol configurations.

The only source of asymmetry is that, in HTTP, the client always moves first. Once the client has listed a set of protocol extensions, the server can choose, against its own preferences, which protocol extensions it will use, and which of its own offers to extend.

Note that the negotiation proceeds not just on the name, but on the offered parameters, akin to ``subnegotiation'' in Telnet. The protocol specification can specify how to merge offers and compute responses. For example, consider the simple-cipher extension, which defines one parameter, key-length. The client can offer one range, the server another, and the server's response is computed by choosing a value from the intersection.
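The key-length example can be sketched as follows. The function names, the representation of a range as an inclusive (low, high) pair, and the policy of picking the longest common key are assumptions made for illustration; a real protocol specification would define its own merge rule.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # inclusive (low, high) key lengths, in bits

def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two inclusive ranges, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None

def choose_key_length(client: Range, server: Range) -> Optional[int]:
    """Pick a value from the intersection of the two offered ranges."""
    common = intersect(client, server)
    # Assumed policy: prefer the longest mutually acceptable key.
    return common[1] if common else None

agreed = choose_key_length((40, 128), (56, 64))
no_overlap = choose_key_length((40, 56), (64, 128))
```

When the ranges do not overlap, the sketch returns None, which a server could map to one of the parameter-related status codes of Section 5.3.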

Finally, agents may choose to respond not with the named offer, but a protocol considered to be equivalent.

3.1.4 Processing

The final PEP requirement is to accommodate multiple extensions to a single HTTP message. If a protocol must be evaluated in a certain order, the protocol instance must define an "enc" attribute, allowing that instance to be part of the Content-Encoding: pipeline. Protocol extensions that are order-independent do not need to do this.

PEP does not offer a way to order negotiation offers, per [9] (i.e. ``only accept A after B'').

3.2 Terminology

The following terms have specific meanings in the context of this document. The HTTP/1.0 specification [3] defines additional useful terms.

bag
In HTTP/1.1 and PEP, a named, unordered list according to the <bag> grammar. Attribute names can either be complete URIs or registered short-names.

connection
("conn") The scope which addresses the next HTTP agent to receive the message.

encoding
("enc") An encoding is a process that reads or writes the message (header or body). The Content-Encoding: header is an ordered list of encodings to apply; besides the methods defined in HTTP/1.0, an encoding may refer to a protocol instance's "enc" attribute.

HTTP agent
Any process that communicates according to HTTP. In particular, any process that communicates in HTTP/1.2 or later is expected to be PEP-compliant. ``Agent'' encompasses servers, proxies, caches, and clients.

module
Many protocols will imply complementary processing, which is implemented by a module. A module can be used as a stage of a processing pipeline. A single module can implement several protocols.

negotiation context
To remember what protocols are available between HTTP transactions, protocol offers can be stored into some context, perhaps by session.

optional
("opt") A strength value indicating that the associated protocol is optional. As an attribute of an instance, it means that an agent may elide corresponding processing. As an attribute of an offer, it means that a reply or subsequent request may be created in accordance with this protocol.

origin
("origin") The scope which addresses the opposite endpoint of an HTTP transaction. For a request, it means the origin server; for a reply message, it means the origin client (user). This scope includes proxies trusted to act ``in place of'' the origin.

parameters
("params") A list of values related to the particular instance or offer. Any sub-bags in the list should have registered attributes, and may use differing syntax for negotiation (in offers) and application (in instances). Also referred to as ``parameter configuration''.

protocol
A convention for communication between two or more parties relating the syntax, sequence and semantics of the communication between them.

protocol instance
A bag describing the protocol and parameter configuration to which the message conforms.

protocol name
The name of a protocol specification, either a URI or registered name.

protocol offer
A bag describing a protocol and parameter configuration that may/may not be acceptable to the offering agent.

protocol specification
A human-readable document that describes a protocol defining a message, associated semantics, and possible compatibility with other specifications. The specification must be available from the registry or by dereferencing the protocol-name.

registry
IANA shall register Protocol-Extensions: names, header names, encodings, and additional attributes and parameters. An experimental registry can resolve short-names relative to http://www.w3.org/Registry/PEP/. Unregistered names should be complete URIs.

refused
("ref") A strength value indicating that the associated protocol should not be used. Valid only as an attribute of an offer: a reply or subsequent request must not be created in accordance with this protocol or an equivalent.

required
("req") A strength value indicating that the associated protocol is required. As an attribute of an instance: an agent must not elide corresponding processing. As an attribute of an offer: a reply or subsequent request must be created in accordance with this protocol or an equivalent.

route
("route") The scope which addresses every HTTP agent in a transaction from the current agent to the origin.

scope
("scope") In an HTTP transaction, the set of HTTP agents being addressed: the next hop ("conn"), the subsequent chain ("route"), or the endpoints ("origin").

strength
("str") As an attribute of an instance: whether the recipient may or may not elide processing according to the given protocol. As an attribute of an offer: whether the offering party will require, refuse, or optionally accept the given protocol in a reply or subsequent request.

strip
To strip a protocol instance or offer, an agent must remove the bag describing it, and each of the header fields listed in that bag's header list.

4. Operation

This section is an operational guide to PEP. Section 5 includes a formal presentation of the syntax, status codes, and semantics.

When a PEP-capable HTTP agent receives a PEP-enhanced message, it will parse the various headers, store negotiation data away for later use, and decide which protocols to ``invoke'' and, if relaying the message, what data to strip from the message. This section covers each of these phases in detail.

4.1 Scenarios

There are a few modes of operation. Here is a quick example where two parties are attempting to use the Foo protocol:

4.2 Interpretation

First, parse each of the three PEP headers: Protocol:, Accept-Protocol:, and Content-Encoding: (General parsing problems flag error 420, Bad Protocol Extension Request).

For each protocol offer and instance, the response will depend on its strength and scope:

Second, select only those which are ``in scope'' (for the origin, everything; otherwise all except "origin").

Third, for each protocol instance and offer, if strength = required, the agent must return the error codes below; if the strength is optional or refused, it is at the agent's discretion whether to report the error:

Each protocol instance that ends up in scope, required (or optional and elected by the agent), must be evaluated, either in the order its encoding is mentioned in Content-Encoding:, or after all those mentioned in Content-Encoding:. Note that some extensions may not modify the message contents; successive extensions can then be evaluated immediately.
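The selection and ordering steps above can be sketched as follows. The Instance record, the PepError exception, and the select function are hypothetical names. The sketch also makes two assumptions that this document does not settle: an agent "elects" an optional instance exactly when it supports the protocol, and an unsupported required instance is reported as 421 (Protocol Extension Unknown).

```python
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class Instance:
    name: str
    scope: str = "origin"      # "conn", "route", or "origin"
    strength: str = "opt"      # "req" or "opt" for instances
    enc: Optional[str] = None  # token used in Content-Encoding, if any

class PepError(Exception):
    def __init__(self, status: int, reason: str):
        super().__init__(f"{status} {reason}")
        self.status = status

def select(instances: List[Instance], content_encoding: List[str],
           supported: Set[str], is_origin: bool) -> List[Instance]:
    """Return the instances to evaluate, in evaluation order."""
    # Step 2: the origin sees everything; other agents skip "origin" scope.
    in_scope = [i for i in instances if is_origin or i.scope != "origin"]
    chosen = []
    for inst in in_scope:
        if inst.name in supported:
            chosen.append(inst)
        elif inst.strength == "req":
            # Step 3: a required instance that cannot be honored is an error.
            raise PepError(421, "Protocol Extension Unknown")
        # Unsupported optional instances are skipped (assumed policy).
    # Ordered instances follow Content-Encoding; the rest come afterwards.
    ordered = [i for tok in content_encoding for i in chosen if i.enc == tok]
    unordered = [i for i in chosen if i.enc not in content_encoding]
    return ordered + unordered
```

For example, a proxy that receives instances for "pics" (origin scope), "keep-alive" (conn, req), and "sign" and "rot13" (both route scope, with enc tokens) would ignore "pics" and evaluate the other three, taking "rot13" and "sign" in the order Content-Encoding: names them.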

4.3 Negotiation

Each of the offers received in an HTTP request or reply should be stored for the duration of an HTTP transaction, if not longer, i.e. an entire HTTP ``session''. When preparing a reply, or a subsequent request to the same resource (or server, or security realm), the agent should merge its preferences against the stored offers to choose which protocol extensions to employ.

If there is no compatible set, a server may be forced to reply with Error 520, Protocol Extension Error, akin to Error 406, None Acceptable, for content negotiation.
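One possible way to merge stored offers against local preferences is sketched below. The merge function and NegotiationError are hypothetical, and the policy is an assumption rather than part of PEP: a protocol missing from either side is treated as unavailable, a protocol is employed if either side requires it or both sides accept it optionally, and a require/refuse conflict yields 520.

```python
from typing import Dict, Set

class NegotiationError(Exception):
    def __init__(self, status: int, reason: str):
        super().__init__(f"{status} {reason}")
        self.status = status

def merge(peer_offers: Dict[str, str], own_prefs: Dict[str, str]) -> Set[str]:
    """Choose protocol names to employ from two name -> strength maps."""
    chosen = set()
    for name in sorted(set(peer_offers) | set(own_prefs)):
        # Assumption: a protocol one side never mentioned counts as refused.
        pair = (peer_offers.get(name, "ref"), own_prefs.get(name, "ref"))
        if "req" in pair:
            if "ref" in pair:
                raise NegotiationError(520, "Protocol Extension Error")
            chosen.add(name)
        elif pair == ("opt", "opt"):
            chosen.add(name)
    return chosen

stored = {"pics": "opt", "rot13": "req"}
mine = {"pics": "opt", "rot13": "opt", "sign": "opt"}
employed = merge(stored, mine)
```

Here "sign" is not employed because the peer never offered it, while "rot13" is employed because the peer requires it and the local agent accepts it.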

4.4 Relay Operation

A proxy, gateway, firewall, or other non-origin HTTP agent will have to relay HTTP request and reply messages. When relaying a PEP-enhanced message:

  1. Select protocol instances that must be evaluated and do so (Section 4.2).

  2. Before relaying, the agent must strip all instances and offers in connection scope.

  3. Before relaying, the agent may strip optional instances and offers in route scope. This is only suggested if the agent cannot, in fact, abide by the protocol extension in question.
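Steps 2 and 3 can be sketched as follows, using the definition of ``strip'' from Section 3.2 (remove the bag and every header field in its header list). The function name and the representation of a bag as a Python dict are assumptions made for illustration; the header names in the example are invented.

```python
from typing import Any, Dict, List, Set, Tuple

def strip_for_relay(bags: List[Dict[str, Any]], headers: Dict[str, str],
                    supported: Set[str]) -> Tuple[List[Dict[str, Any]], Dict[str, str]]:
    """Return the bags and header fields that should be relayed onward."""
    kept = []
    remaining = dict(headers)  # copy, so the caller's message is untouched
    for bag in bags:
        scope = bag.get("scope", "origin")
        strength = bag.get("str", "opt")
        # Step 2: connection scope is always stripped.
        # Step 3: optional route-scope bags may be stripped when the relay
        # cannot abide by the extension.
        drop = scope == "conn" or (
            scope == "route" and strength == "opt" and bag["name"] not in supported
        )
        if drop:
            for field in bag.get("headers", []):
                remaining.pop(field, None)
        else:
            kept.append(bag)
    return kept, remaining

bags = [
    {"name": "keep-alive", "scope": "conn", "str": "req", "headers": ["Keep-Alive"]},
    {"name": "rot13", "scope": "route", "str": "opt", "headers": ["Rot13-Mode"]},
    {"name": "pics", "scope": "origin", "str": "opt", "headers": ["PICS-Label"]},
]
headers = {"Keep-Alive": "timeout=10", "Rot13-Mode": "full", "PICS-Label": "(...)"}
kept, relayed = strip_for_relay(bags, headers, supported={"keep-alive"})
```

Origin-scope bags are always passed through untouched, consistent with Section 3.1.2.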

4.5 Origin Operation

The origin client or server need only abide by Section 4.2 and 4.3.

4.6 Deployment Issues

PEP is designed to tolerate being relayed through non-PEP-aware HTTP agents. There is only one PEP-compatibility error, namely detecting a non-PEP-aware relay which passes a PEP message containing protocol instances or offers it should have acted upon.

For experimental purposes, PEP-compatibility is equated with HTTP/1.2.

To deploy PEP services to the installed base of HTTP/1.0 services, it is possible to design a local, trusted PEP HTTP/1.2 <--> HTTP/1.0 proxy.

4.6.1 Relationship to HTTP/1.1

HTTP/1.1 defines a number of new constructs that PEP either relies upon or integrates with.

Wrapping
PEP can enhance wrapped messages, and PEP-enhanced messages can be wrapped. Only the outermost headers are consulted for PEP features, so wrapped PEP messages should not include connection or route scope directives.

Options
This new method returns the methods and other properties of the specified URI, or of the entire server (if the URI is "*"). It is perfectly appropriate for a server to reply with the various Accept-Protocol: configurations it supports for the server or the resource.

Tunneling
PEP can be used to set up a tunnel (e.g. a secure channel protocol). Note that any HTTP agent acting as a tunnel in a transaction, by definition, cannot act upon any PEP directives in the encapsulated traffic.

Chunked Transfer Encoding
This transfer encoding allows agents to manipulate streaming, unknown-length data. Protocol extensions adapted to streaming will operate cleanly on top, but some extensions may force PEP-aware agents to buffer up the entire data stream.

5. Notation

PEP-related syntax is specified here relative to the definitions and rules of the HTTP/1.0 [3], HTTP/1.1 [4], and relative URL [7] specifications.

5.1 Header Fields

PEP defines two new general header fields, Protocol: and Accept-Protocol:, and adds new meaning to a third, Content-Encoding:.

    /* Added to General Header rule, Sec 4.3 of HTTP/1.1 */

    Protocol        = "Protocol" ":" 1#bag
    Accept-Protocol = "Accept-Protocol" ":" 1#bag

    /* Following rules are copied from HTTP/1.1 */

    bag             = "{" word *(word | bag) "}"
    word            = token | quoted-string
    token           = 1*<any CHAR except CTLs or tspecials>
    tspecials       = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\" | <">
                    | "/" | "[" | "]" | "?" | "=" | "{" | "}" | SP | HT
    quoted-string   = ( <"> *(qdtext) <"> )
    qdtext          = <any CHAR except <"> and CTLs but including LWS>

Each instance or offer bag can contain sub-bags for each of the following attributes:

{<protocol name> {scope (origin | conn | route)} {str (opt | req | ref)} {enc <token>} {headers *<token>} {params ...}}

protocol name
The registered name of the protocol.

scope
scope is one of "conn", "route", or "origin" ; the default is "origin".

str
strength is one of "req", "ref" or "opt" ; the default is "opt". ["ref" only allowed for offers]

enc
encoding is a unique token ; default is undefined [only allowed for instances]

headers
A list of associated headers; the default is the empty set.

params
A bag of parameters configuring the instance or offer, according to protocol spec; the default is the empty bag.
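A minimal parser for a single bag, with the defaults listed above, might look like the following sketch. The names parse_bag and attributes are hypothetical, and the lexer is deliberately simplified: it treats any run of characters other than whitespace, braces, and double quotes as a token, whereas the grammar also excludes the remaining tspecials.

```python
import re
from typing import Any, Dict, List, Tuple

Bag = List[Any]  # first item is a word (str); later items are words or nested bags

# Simplified lexer: a brace, a quoted-string, or a run of other characters.
_LEX = re.compile(r'\s*(?:([{}])|"([^"]*)"|([^\s{}"]+))')

def parse_bag(text: str) -> Bag:
    """Parse one bag, e.g. '{foo {scope conn}}', into nested lists."""
    text = text.strip()
    stack: List[Bag] = []
    done = None
    pos = 0
    while pos < len(text):
        match = _LEX.match(text, pos)
        if match is None or done is not None:
            raise ValueError(f"bad bag syntax at offset {pos}")
        pos = match.end()
        brace, quoted, token = match.groups()
        if brace == "{":
            stack.append([])
        elif brace == "}":
            if not stack or not stack[-1] or not isinstance(stack[-1][0], str):
                raise ValueError("a bag must open with '{' and start with a word")
            bag = stack.pop()
            if stack:
                stack[-1].append(bag)
            else:
                done = bag
        elif stack:
            stack[-1].append(quoted if quoted is not None else token)
        else:
            raise ValueError("word outside of a bag")
    if done is None or stack:
        raise ValueError("unbalanced braces")
    return done

def attributes(bag: Bag) -> Tuple[str, Dict[str, Any]]:
    """Return (protocol name, attributes) with the defaults of Section 5.1."""
    attrs: Dict[str, Any] = {"scope": "origin", "str": "opt", "headers": [], "params": []}
    for item in bag[1:]:
        if isinstance(item, list):
            key, values = item[0], item[1:]
            single = key in ("scope", "str", "enc") and values
            attrs[key] = values[0] if single else values
    return bag[0], attrs

name, attrs = attributes(parse_bag(
    "{foo {scope conn} {str req} {headers Foo-Key Foo-Mode} {params {key-length 40 128}}}"
))
```

The "enc" attribute has no default, so it is simply absent from the result when the bag does not supply it.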

5.2 Content Codings

The only tokens describing content-codings in HTTP/1.1 are "gzip" and "compress". Other content-coding tokens may be selected from the "enc" attribute of protocol instances. See Sections 10.10, 3.5, and appendix C.3 of the HTTP/1.0 specification [3] for details.

5.3 Status Codes

PEP defines several new status codes for HTTP replies. Note that the HTTP/1.0 specification [3] states in Section 6.1.1:

The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role.

To informally distinguish PEP-dependent response codes, PEP uses x2z codes.

200 Class
  220 Uses Protocol Extensions

400 Class
  420 Bad Protocol Extension Request
  421 Protocol Extension Unknown
  422 Protocol Extension Refused
  423 Bad Protocol Extension Parameters

500 Class
  520 Protocol Extension Error
  521 Protocol Extension Not Implemented
  522 Protocol Extension Parameters Not Acceptable

Each of the 400 and 500 class responses may include entity bodies with an explanation of the error, and an indication of whether the problem is temporary or permanent.

6. Usage Examples

There is a reasonable example for each of the strength-scope combinations:

PICS Labels
(origin, opt) The only parties that have to agree to transmit PICS labels are the endpoints, and compliance is optional.

rot-13
(origin, req) The only parties that have to agree to transmit rot-13 data are the endpoints, but compliance is required.

keep-alive
(conn, req | opt) The only parties that have to agree to keep alive an HTTP connection are the immediate two hosts, but compliance may be required. [8]

SSL tunnel
(route, req) To set up a secure channel, every agent on the path is required to cooperate.

7. Security Considerations

There are several security issues PEP implementors must be aware of, especially when deploying security protocol extensions. Fundamentally, PEP emphasizes flexibility, which is at odds with principles of secure design. See [10] for further analysis of PEP-based security solutions.

Separately, PEP encourages a plug-in software architecture for HTTP agents. There are manifold risks to executing untrusted or marginally-trusted code, especially if sensitive data is passed into such modules. These are not PEP-specific risks, but are of importance to any implementor.

7.1 Information Leakage

Typically, a PEP-compliant implementation will read Content-Encoding: and create a processing pipeline for each module.

The information passing between processing stages should be considered sensitive.

For example, one module may compute a shared session key, and pass it inband (in the clear) to the next stage, an encipherment protocol that protects the key.

Implementations should carefully protect intermediate data flows. Consider: controlling access to pipe endpoints, avoiding writing to disk, and wiping clean all memory buffers after use.

Implementors should be particularly careful on platforms that do not provide secure interprocess communication.

Finally, protocol-extension designers may wish to specify that implementations should handle several protocol instances with a single module; in the example above, a single module that generated the session key and the ciphertext.

7.2 Trusting PEP Headers

HTTP messages travel in the clear; messages using PEP are no different.

Do not trust the integrity of PEP headers without proof.

Unless PEP headers are being read from an authenticated channel or a wrapped, signed or encrypted message, PEP headers are unreliable. An attacker can modify, remove, or add protocol offers, instances, and encoding order.

The consequences of such a man-in-the-middle attack include denial-of-service, since two parties that actually have the facilities to communicate can each end up believing that the other does not.

In general, the risks here can be limited if implementors apply reasonable sanity checks: e.g. don't send sensitive data in the clear, scrutinize the order and plausibility of the modules to be run, and so on.

7.3 Protocol Interaction Effects

Protocol extension designers must be very careful about interactions with other protocols.

Protocol extensions considered safe individually can be dangerous in combination or reordered.

For example, a digital-signature protocol and an encryption protocol are both separately correct operations to execute; but it is a well-known cryptographic protocol design error [1,2] to allow signature after encryption. PEP, as specified in this document, offers no explicit syntax for expressing order-constraints.

Separate protocols may also interpret data in conflicting ways, or offer contradictory modes of operation. Repeated application may also be an error, e.g. rot-13(rot-13(text)).

7.4 Protocol Substitution Effects

The language of the PEP specification is carefully formulated to allow agents to reply using protocol extensions that an agent believes to be equivalent.

Any protocol offer may be satisfied using a different protocol the originator believes to be interchangeable.

Implementors can use this technique to deploy new technology, or to make generic requests (``I require a Signature on the response''...``OK, it has been signed with the FooBar algorithm''). There may be security risks in trusting the counterparty's beliefs: the other agent might believe that cleartext is no different from ciphertext, for example.

7.5 Negotiation Scope

HTTP is a stateless protocol, and PEP does not modify that. As a result, agents cannot rely on negotiating within a fixed context; each request may be considered anew.

PEP for HTTP/1.x cannot require all agents to maintain common beliefs about capabilities.

In particular, there is no way for servers to enforce ``refused'' semantics on clients. The client may state its preferences (including what it refuses) and the server must reply according to those preferences, since the server is ``fully informed'' at the time the request is received. The reverse is not true: when the server replies, along with its preferences, the client is not obligated to ``remember'' this information.

For example, if a server refuses clear text POST to a certain URI, there is no basis to assume that a client will not, in fact, attempt to POST clear text.

8. Development Path

The W3 Consortium is actively pursuing PEP research and deployment. Reference implementations will be freely available from W3C, as will protocol extension modules for a wide variety of applications, including PICS [12]. W3C's Security and Payments Working Groups have been involved with PEP since July 1995.

Please contact the author with any questions, comments, or concerns at khare@w3.org.

9. Acknowledgements

This specification makes heavy use of the grammar, constructs, and style of HTTP/1.0 and HTTP/1.1. Thanks to Roy T. Fielding for his work on those documents, and for his input to PEP.

The W3 Consortium technical staff at MIT have put as much effort into this proposal as I have: Tim Berners-Lee, Dan Connolly, Jim Gettys, Phillip Hallam-Baker, Jim Miller, Henrik Frystyk Nielsen, and Dave Raggett.

Allan Schiffman helped clarify the logic and power of PEP security. Finally, credit is due to Dave Kristol, whose original ``A Proposed Extension Mechanism for HTTP'' Internet Draft inspired PEP.

10. References

[1]
M. Abadi and R. Needham. "Prudent Engineering Practice for Cryptographic Protocols." Digital Systems Research Center: Report 125, Digital, June 1994.

[2]
R. Anderson and R. Needham. "Robustness principles for public key protocols." ftp://ftp.cl.cam.ac.uk/users/rja14/robustness.ps.Z [in proceedings of Crypto '95], Cambridge University Computer Laboratory, 1995.

[3]
T. Berners-Lee, R. Fielding, and H. Frystyk Nielsen, "Hypertext Transfer Protocol -- HTTP/1.0". Internet Draft W3 Consortium/MIT, UC Irvine, W3 Consortium/MIT, October 1995 (Work in Progress).

[4]
T. Berners-Lee, R. Fielding, and H. Frystyk Nielsen, "Hypertext Transfer Protocol -- HTTP/1.1". Internet Draft W3 Consortium/MIT, UC Irvine, W3 Consortium/MIT, November 1995 (Work in Progress).

[5]
D. H. Crocker. "Standard for the Format of ARPA Internet Text Messages." STD 11, RFC 822, UDEL, August 1982.

[6]
S. Deering, R. Hinden, Editors, "Internet Protocol, Version 6 (IPv6) Specification", Internet Draft, June 1995 (Work in Progress).

[7]
R. Fielding. "Relative Uniform Resource Locators." RFC 1808, UC Irvine, June 1995.

[8]
A. Hopmann, "HTTP Session Extension", Internet Draft, July 1995 (Work in Progress).

[9]
D. Kristol, "A Proposed Extension Mechanism for HTTP", Internet Draft, January 1995 (Work in Progress, Expired).

[10]
R. Khare. "PEP Design & Implementation." W3C Working Draft, W3 Consortium, November 1995 (Work In Progress).

[11]
J. Klensin, N. Freed, M. Rose, E. Stefferud, and D. Crocker. "SMTP Service Extensions." RFC 1869. MCI, Innosoft, Dover Beach Consulting, Network Management Associates, Brandenburg Consulting, November 1995.

[12]
J. Miller. "Label Syntax and Communication Protocols." Internet Draft, W3 Consortium/PICS, November 1995 (Work In Progress).

[13]
J. Postel. "Simple Mail Transfer Protocol." STD 10, RFC 821, USC/ISI, August 1982.

[14]
J. Postel, J. Reynolds, "Telnet Protocol specification." STD 8, RFC 854, USC/ISI, May 1983.

[15]
E. Rescorla and A. Schiffman, "The Secure Hypertext Transfer Protocol", Internet Draft, July 1995 (Work in Progress).

11. Author's Address

Rohit Khare
Technical Staff, W3 Consortium
MIT Laboratory for Computer Science
545 Technology Square
Cambridge, MA 02139, U.S.A.
Tel: +1 (617) 253 5884
Fax: +1 (617) 258 8682
Email: khare@w3.org
Web: http://www.w3.org/People/Khare