HTTP Working Group                                 R. Fielding, UC Irvine
INTERNET-DRAFT                                        H. Frystyk, MIT/LCS
                                                  T. Berners-Lee, MIT/LCS
                                                           J. Gettys, DEC
                                                    Jeffrey C. Mogul, DEC
Expires September 23, 1996                                 April 23, 1996

                Hypertext Transfer Protocol -- HTTP/1.1

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or made obsolete by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as _work in progress_.

To learn the current status of any Internet-Draft, please check the _1id-abstracts.txt_ listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Distribution of this document is unlimited. Please send comments to the HTTP working group at . Discussions of the working group are archived at . General discussions about HTTP and the applications which use HTTP should take place on the mailing list.

NOTE: This specification is for discussion purposes only. It is not claimed to represent the consensus of the HTTP working group, and contains a number of proposals that either have not been discussed or are controversial. The working group is discussing significant changes in many areas, including support for caching, persistent connections, range retrieval, content negotiation, MIME compatibility, authentication, and timing of the PUT operation.

Abstract

The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems.
It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.

Fielding, Frystyk, Berners-Lee, Gettys and Mogul [Page 1]
INTERNET-DRAFT HTTP/1.1 Monday, April 22, 1996

HTTP has been in use by the World-Wide Web global information initiative since 1990. This specification defines the protocol referred to as _HTTP/1.1_.

Note to Readers of This Document

This document is still organized to minimize changes from the previous draft, to ease reviewers' work in finding new material (and because the editor has not had time to reorganize it). However, the current organization is now quite poor for new readers of this document. We recommend that new readers not read it in the current order of presentation; instead, they may want to skip ahead after reading sections 1-9 and read sections 11, 12, 13, and 14 before reading section 10, which contains the header field definitions. Section 10 itself is also no longer in alphabetical order, again to avoid renumbering sections, so that drafts can be compared easily.

If you are reading the version of this document showing revision markup, note that we've tried to preserve significant changes from the previous version, though a few changes may have slipped through unmarked. We make no guarantees that all changes have revision marks, though we've tried to preserve them as an aid to those who wish to check that a specific change has been reflected in this draft.

Note that some sections are still marked as SLUSHY and a few are marked FLUID; these are still undergoing drafting. Note that text in bold marks issues that are as yet incompletely resolved.
Opinions are solicited.

Table of Contents

HYPERTEXT TRANSFER PROTOCOL -- HTTP/1.1
Status of this Memo
Abstract
Note to Readers of This Document
Table of Contents
1. Introduction
   1.1 Purpose
   1.2 Requirements
   1.3 Terminology
   1.4 Overall Operation
   1.4 HTTP and MIME
2. Notational Conventions and Generic Grammar
   2.1 Augmented BNF
   2.2 Basic Rules
3. Protocol Parameters
   3.1 HTTP Version
   3.2 Uniform Resource Identifiers
      3.2.1 General Syntax
      3.2.2 http URL
   3.3 Date/Time Formats
      3.3.1 Full Date
      3.3.2 Delta Seconds
   3.4 Character Sets
   3.5 Content Codings
   3.6 Transfer Codings
   3.7 Media Types
      3.7.1 Canonicalization and Text Defaults
      3.7.2 Multipart Types
   3.8 Product Tokens
   3.9 Quality Values
   3.10 Language Tags
   3.12 Full Date Values
   3.13 Opaque Validators
   3.14 Variant IDs
   3.15 Validator Sets
   3.16 Variant Sets
   3.17 HTTP Protocol Parameters Related to Ranges
      3.17.1 SLUSHY Range Units
      3.17.2 SLUSHY Byte Ranges
      3.17.3 SLUSHY: Content Ranges
4. HTTP Message
   4.1 Message Types
   4.2 Message Headers
   4.3 General Header Fields
5. Request
   5.1 Request-Line
      5.1.1 Method
      5.1.2 Request-URI
   5.2 Request Header Fields
6. Response
   6.1 Status-Line
      6.1.1 Status Code and Reason Phrase
   6.2 Response Header Fields
7. Entity
   7.1 Entity Header Fields
   7.2 Entity Body
      7.2.1 Type
      7.2.2 Length
8. Method Definitions
   8.1 OPTIONS
   8.2 GET
   8.3 HEAD
   8.4 POST
      8.4.1 SLUSHY: Entity Transmission Requirements
   8.5 PUT
   8.9 DELETE
   8.12 TRACE
9. Status Code Definitions
   9.1 Informational 1xx
   9.2 Successful 2xx
   9.3 Redirection 3xx
   9.4 Client Error 4xx
   9.5 Server Error 5xx
10. Header Field Definitions
   10.1 Accept
   10.2 Accept-Charset
   10.3 Accept-Encoding
   10.4 Accept-Language
   10.5 Allow
   10.6 Authorization
   10.7 Cache-Control
      Check: is this true?
      10.7.1 SLUSHY: Restrictions on What is Cachable
      10.7.2 Restrictions On What May be Stored by a Cache
      10.7.3 Modifications of the Basic Expiration Mechanism
      10.7.4 SLUSHY: Controls over cache revalidation and reload
      10.7.5 FLUID: Restrictions on use count and demographic reporting
      10.7.6 Miscellaneous restrictions
   10.8 Connection
      10.8.1 Persist
   10.9 Content-Base
   10.10 Content-Encoding
   10.11 Content-Language
   10.12 Content-Length
   10.13 Content-MD5
   10.14 SLUSHY Content-Range
      10.14.1 MIME multipart/byteranges content-type
      10.14.2 Additional rules for Content-Range
   10.15 Content-Type
   10.16 Content-Location
   10.17 Date
   10.19 SLUSHY Expires
   10.20 Via
   10.21 From
   10.22 Host
   10.23 If-Modified-Since
   10.25 Last-Modified
   10.27 Location
   10.29 Pragma
   10.30 Proxy-Authenticate
   10.31 Proxy-Authorization
   10.32 Public
   10.33 Range
   10.34 Referer
   10.36 Retry-After
   10.37 Server
   10.38 Title
   10.39 Transfer Encoding
   10.41 Upgrade
   10.43 User-Agent
   10.44 WWW-Authenticate
   10.45 Max-Forwards
   10.46 Age
   10.47 CVal
   10.48 If-Invalid
   10.49 If-Valid
   10.50 If-Unmodified-Since
   10.51 Warning
   10.52 Vary
   10.53 Alternates
   10.54 SLUSHY: Accept-Ranges
   10.55 SLUSHY: Range-If
11. Access Authentication
   11.1 Basic Authentication Scheme
   11.2 Digest Authentication Scheme
12. Content Negotiation
   12.1 Negotiation facilities defined in this specification
13 Caching in HTTP
   13.1 Semantic Transparency
   13.2 Expiration Model
      13.2.1 Server-Specified Expiration
      13.2.2 Limitations on the Effect of Expiration Times
      13.2.3 Heuristic Expiration
      13.2.4 Client-controlled Behavior
      13.2.5 Exceptions to the Rules and Warnings
      13.2.6 Age Calculations
      13.2.7 Expiration Calculations
      13.2.8 UT Mandatory
   13.3 Validation Model
      13.3.1 Last-modified Dates
      13.3.2 Opaque Validators
      13.3.3 Weak and Strong Validators
      13.3.4 Rules for When to Use Opaque Validators and Last-modified Dates
      13.3.5 SLUSHY: Non-validating conditionals
      13.3.6 FLUID: Other Issues
   13.4 Cache-control Mechanisms
   13.5 Warnings
   13.6 Explicit Indications Regarding User-specified Overrides
   13.7 Constructing Responses From Caches
      13.7.1 End-to-end and Hop-by-hop Headers
      13.7.2 Non-modifiable Headers
      13.7.3 Combining Headers
      13.7.4 Combining Byte Ranges
      13.7.5 SLUSHY: Scope of Expiration
   13.8 Caching and Content Negotiation
      13.8.1 Use of the Vary header
      13.8.2 SLUSHY: Use of the Alternates header
      13.8.3 Use of Variant-IDs
      13.8.4 Use of Selecting Opaque Validators
   13.10 Shared and Non-Shared Caches
   13.11 SLUSHY: Miscellaneous Considerations
      13.11.1 Detecting Firsthand Responses
      13.11.2 Disambiguating Expiration values
      13.11.3 Disambiguating Multiple Responses
   13.12 SLUSHY: Cache Keys
      13.12.1 Non-varying Resources
      13.12.2 SLUSHY: Varying Resources
      13.12.3 SLUSHY: Key-Matching Procedure
      13.12.4 Canonicalization of URIs
   13.13 FLUID: Cache-Related Problems Not Addressed in HTTP/1.1
   13.14 Cache Operation When Receiving Errors or Incomplete Responses
      13.14.1 Caching and Status Codes
      13.14.2 Handling of Retry-After
   13.15 FLUID: Compatibility With Earlier Versions of HTTP
   13.16 SLUSHY: Side Effects of GET and HEAD
   13.17 SLUSHY: Invalidation After Updates or Deletions
   13.18 Write-Through Mandatory
   13.19 Interoperability of Varying Resources with HTTP/1.0 Proxy Caches
   13.20 Cache Replacement for Varying Resources
   13.22 FLUID: Network Partitions
   13.23 FLUID: Caching of Negative Responses
   13.24 History Lists
14 Persistent Connections
   14.1 Purpose
   14.2 Overall Operation
      14.2.3 Negotiation
      14.2.4 Pipe-lining
      14.2.5 Delimiting Entity-Bodies
   14.3 Proxy Servers
   14.4 Interaction with Security Protocols
   14.5 Practical Considerations
15. Security Considerations
   15.1 Authentication of Clients
   15.2 Safe Methods
   15.3 Abuse of Server Log Information
   15.4 Transfer of Sensitive Information
   15.5 Attacks Based On File and Path Names
   15.6 Personal Information
   15.7 Privacy issues connected to Accept headers
   15.8 DNS Spoofing
   15.9 SLUSHY: Location Headers and Spoofing
16. Acknowledgments
17. References
18. Authors' Addresses
Appendices
A. Internet Media Type message/http
B. Tolerant Applications
C. Differences Between HTTP Bodies and RFC 1521 Internet Message Bodies
   C.1 Conversion to Canonical Form
   C.2 Conversion of Date Formats
   C.3 Introduction of Content-Encoding
   C.4 No Content-Transfer-Encoding
   C.5 HTTP Header Fields in Multipart Body-Parts
   C.6 Introduction of Transfer-Encoding
   C.7 MIME-Version
D. Changes from HTTP/1.0
   D.1 Changes to Simplify Multi-homed Web Servers and Conserve IP Addresses
E. Additional Features
   E.1 Additional Request Methods
      E.1.1 PATCH
      E.1.2 LINK
      E.1.3 UNLINK
   E.2 Additional Header Field Definitions
      E.2.1 Content-Version
      E.2.2 Derived-From
      E.2.3 Link
      E.2.4 URI
      E.2.5 Compatibility with HTTP/1.0 Persistent Connections
   F.1 Compatibility with Previous Versions
G. Proxy Cache Implementation Guidelines
   G.1 Support for Content Negotiation by Proxy Caches
   G.2 Propagation of Changes in Opaque Selection
   G.3 SLUSHY: State
   G.4 FLUID: Cache Replacement Algorithms
   G.5 FLUID: Bypassing in Caching Hierarchies

1. Introduction

1.1 Purpose

The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. HTTP has been in use by the World-Wide Web global information initiative since 1990.
The first version of HTTP, referred to as HTTP/0.9, was a simple protocol for raw data transfer across the Internet. HTTP/1.0, as defined by RFC xxxx [6], improved the protocol by allowing messages to be in the format of MIME-like entities, containing metainformation about the data transferred and modifiers on the request/response semantics. However, HTTP/1.0 does not sufficiently take into consideration the effect of hierarchical proxies and caching, the desire for persistent connections and virtual hosts, and a number of other details that slipped through the cracks of existing implementations. In addition, the proliferation of incompletely-implemented applications calling themselves _HTTP/1.0_ has necessitated a protocol version change in order for two communicating applications to determine each other's true capabilities.

This specification defines the protocol referred to as _HTTP/1.1_. This protocol is backwards-compatible with HTTP/1.0, but includes more stringent requirements in order to ensure reliable implementation of its features.

Practical information systems require more functionality than simple retrieval, including search, front-end update, and annotation. HTTP allows an open-ended set of methods that indicate the purpose of a request. It builds on the discipline of reference provided by the Uniform Resource Identifier (URI) [3], as a location (URL) [4] or name (URN) [20], for indicating the resource on which a method is to be applied. Messages are passed in a format similar to that used by Internet Mail [9] and the Multipurpose Internet Mail Extensions (MIME) [7].

HTTP is also used as a generic protocol for communication between user agents and proxies/gateways to other Internet protocols, such as SMTP [16], NNTP [13], FTP [18], Gopher [2], and WAIS [10], allowing basic hypermedia access to resources available from diverse applications and simplifying the implementation of user agents.

1.2 Requirements

This specification uses the same words as RFC 1123 [8] for defining the significance of each particular requirement. These words are:

MUST
   This word or the adjective _required_ means that the item is an absolute requirement of the specification.

SHOULD
   This word or the adjective _recommended_ means that there may exist valid reasons in particular circumstances to ignore this item, but the full implications should be understood and the case carefully weighed before choosing a different course.

MAY
   This word or the adjective _optional_ means that this item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because it enhances the product, for example; another vendor may omit the same item.

An implementation is not compliant if it fails to satisfy one or more of the MUST requirements for the protocols it implements. An implementation that satisfies all the MUST and all the SHOULD requirements for its protocols is said to be _unconditionally compliant_; one that satisfies all the MUST requirements but not all the SHOULD requirements for its protocols is said to be _conditionally compliant_.

1.3 Terminology

This specification uses a number of terms to refer to the roles played by participants in, and objects of, the HTTP communication.

connection
   A transport layer virtual circuit established between two application programs for the purpose of communication.

message
   The basic unit of HTTP communication, consisting of a structured sequence of octets matching the syntax defined in Section 4 and transmitted via the connection.

request
   An HTTP request message (as defined in Section 5).

response
   An HTTP response message (as defined in Section 6).

resource
   A network data object or service that can be identified by a URI (Section 3.2).

entity
   A particular representation, rendition, encoding, or presentation of a resource. Resources not supporting content negotiation are bound to a single entity. Resources supporting content negotiation are bound to a set of one or more entities, whose membership may vary over time.

entity instance
   The definite value of an entity at a given point in time. The HTTP protocol transfers entity instances in request or response messages. An entity instance is transferred as metainformation in the form of entity headers and content in the form of an entity body.

client
   An application program that establishes connections for the purpose of sending requests.

user agent
   The client which initiates a request. These are often browsers, editors, spiders (web-traversing robots), or other end user tools.

server
   An application program that accepts connections in order to service requests by sending back responses.

origin server
   The server on which a given resource resides or is to be created.

proxy
   An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them, with possible translation, on to other servers. A proxy MUST interpret and, if necessary, rewrite a request message before forwarding it. Proxies are often used as client-side portals through network firewalls and as helper applications for handling requests via protocols not implemented by the user agent.

gateway
   A server which acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is communicating with a gateway.
   Gateways are often used as server-side portals through network firewalls and as protocol translators for access to resources stored on non-HTTP systems.

tunnel
   A tunnel is an intermediary program which is acting as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, though the tunnel may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connections are closed. Tunnels are used when a portal is necessary and the intermediary cannot, or should not, interpret the relayed communication.

cache
   A program's local store of response messages and the subsystem that controls its message storage, retrieval, and deletion. A cache stores cachable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests. Any client or server MAY include a cache, though a cache cannot be used by a server while it is acting as a tunnel.

Any given program MAY be capable of being both a client and a server; our use of these terms refers only to the role being performed by the program for a particular connection, rather than to the program's capabilities in general. Likewise, any server MAY act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.

1.4 Overall Operation

The HTTP protocol is based on a request/response paradigm. A client sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content over a connection with a server. The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possible body content.

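As a rough illustration of the exchange just described, the following Python sketch assembles a request from a method, URI, and protocol version followed by MIME-like header lines, and splits a status line into its parts. The precise message grammar is given in later sections of this specification; the space-separated layout and CRLF line endings used here, the helper names build_request and parse_status_line, and the host name and header values are all assumptions made for this example.

```python
CRLF = "\r\n"

def build_request(method, uri, headers, body="", version="HTTP/1.1"):
    """Assemble a request: request line, header lines, blank line, body."""
    lines = [f"{method} {uri} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF + body

def parse_status_line(line):
    """Split a status line into (protocol version, numeric code, reason text)."""
    version, code, reason = line.rstrip(CRLF).split(" ", 2)
    return version, int(code), reason

# Hypothetical values, chosen only to exercise the two helpers.
request = build_request("GET", "/index.html", {"Host": "www.example.com"})
status = parse_status_line("HTTP/1.1 200 OK\r\n")
```

The sketch treats the request and the response symmetrically: a first line carrying the protocol-level information, then header lines, then an optional body.
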
Most HTTP communication is initiated by a user agent and consists of a request to be applied to a resource on some origin server. In the simplest case, this may be accomplished via a single connection (v) between the user agent (UA) and the origin server (O).

          request chain ------------------------>
       UA -------------------v------------------- O
          <----------------------- response chain

A more complicated situation occurs when one or more intermediaries are present in the request/response chain. There are three common forms of intermediary: proxy, gateway, and tunnel. A proxy is a forwarding agent, receiving requests for a URI in its absolute form, rewriting all or parts of the message, and forwarding the reformatted request toward the server identified by the URI. A gateway is a receiving agent, acting as a layer above some other server(s) and, if necessary, translating the requests to the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages; tunnels are used when the communication needs to pass through an intermediary (such as a firewall) even when the intermediary cannot understand the contents of the messages.

          request chain -------------------------------------->
       UA -----v----- A -----v----- B -----v----- C -----v----- O
          <------------------------------------- response chain

The figure above shows three intermediaries (A, B, and C) between the user agent and origin server. A request or response message that travels the whole chain MUST pass through four separate connections. This distinction is important because some HTTP communication options may apply only to the connection with the nearest, non-tunnel neighbor, only to the end-points of the chain, or to all connections along the chain. Although the diagram is linear, each participant may be engaged in multiple, simultaneous communications.
For example, B may be receiving requests from many clients other than A, and/or forwarding requests to servers other than C, at the same time that it is handling A's request.

Any party to the communication which is not acting as a tunnel may employ an internal cache for handling requests. The effect of a cache is that the request/response chain is shortened if one of the participants along the chain has a cached response applicable to that request. The following illustrates the resulting chain if B has a cached copy of an earlier response from O (via C) for a request which has not been cached by UA or A.

       request chain ---------->
    UA -----v----- A -----v----- B - - - - - - C - - - - - - O
       <--------- response chain

Not all responses are cachable, and some requests may contain modifiers which place special requirements on cache behavior. HTTP requirements for cache behavior and cachable responses are defined in Section 13.

On the Internet, HTTP communication generally takes place over TCP/IP connections. The default port is TCP 80 [19], but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet, or on other networks. HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used; the mapping of the HTTP/1.1 request and response structures onto the transport data units of the protocol in question is outside the scope of this specification.

However, HTTP/1.1 implementations SHOULD implement persistent connections (see Section 14). Both clients and servers MUST be capable of handling cases where either party closes the connection prematurely, due to user action, automated time-out, or program failure. In any case, the closing of the connection by either or both parties always terminates the current request, regardless of its status.
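The chain-shortening effect of a cache described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the protocol: the class name Node, its handle method, and the dictionary used as a cache are assumptions made for this example, and the cachability rules of Section 13 are ignored entirely.

```python
class Node:
    """One participant in the chain; `upstream` is the next hop toward the origin."""

    def __init__(self, name, upstream=None, caching=False):
        self.name = name
        self.upstream = upstream
        self.cache = {} if caching else None  # hypothetical cache: URI -> body

    def handle(self, uri, path=None):
        """Return (body, path), where path lists the nodes the request visited."""
        path = (path or []) + [self.name]
        if self.upstream is None:
            # Origin server: always produces the response itself.
            return f"body of {uri}", path
        if self.cache is not None and uri in self.cache:
            # Applicable cached response: the chain ends at this node.
            return self.cache[uri], path
        body, path = self.upstream.handle(uri, path)
        if self.cache is not None:
            self.cache[uri] = body
        return body, path


# Build the chain UA - A - B - C - O from the figures, with a cache at B only.
origin = Node("O")
c = Node("C", upstream=origin)
b = Node("B", upstream=c, caching=True)
a = Node("A", upstream=b)
ua = Node("UA", upstream=a)

_, first = ua.handle("/index.html")
_, second = ua.handle("/index.html")
print(first)   # ['UA', 'A', 'B', 'C', 'O']
print(second)  # ['UA', 'A', 'B']
```

The second request stops at B, which matches the shortened chain in the last figure.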
1.4 HTTP and MIME

HTTP/1.1 uses many of the constructs defined for MIME, as defined in RFC 1521 [7]. Appendix C describes the ways in which the context of HTTP allows for different use of Internet Media Types than is typically found in Internet mail, and gives the rationale for those differences.

2. Notational Conventions and Generic Grammar

2.1 Augmented BNF

All of the mechanisms specified in this document are described in both prose and an augmented Backus-Naur Form (BNF) similar to that used by RFC 822 [9]. Implementers will need to be familiar with the notation in order to understand this specification. The augmented BNF includes the following constructs:

name = definition
    The name of a rule is simply the name itself (without any enclosing "<" and ">") and is separated from its definition by the equal character "=". Whitespace is only significant in that indentation of continuation lines is used to indicate a rule definition that spans more than one line. Certain basic rules are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle brackets are used within definitions whenever their presence will facilitate discerning the use of rule names.

"literal"
    Quotation marks surround literal text. Unless stated otherwise, the text is case-insensitive.

rule1 | rule2
    Elements separated by a bar ("|") are alternatives, e.g., "yes | no" will accept yes or no.

(rule1 rule2)
    Elements enclosed in parentheses are treated as a single element. Thus, _(elem (foo | bar) elem)_ allows the token sequences _elem foo elem_ and _elem bar elem_.

*rule
    The character _*_ preceding an element indicates repetition. The full form is _<n>*<m>element_ indicating at least <n> and at most <m> occurrences of element. Default values are 0 and infinity so that _*(element)_ allows any number, including zero; _1*element_ requires at least one; and _1*2element_ allows one or two.
[rule]
    Square brackets enclose optional elements; _[foo bar]_ is equivalent to _*1(foo bar)_.

N rule
    Specific repetition: _<n>(element)_ is equivalent to _<n>*<n>(element)_; that is, exactly <n> occurrences of (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three alphabetic characters.

#rule
    A construct "#" is defined, similar to "*", for defining lists of elements. The full form is "<n>#<m>element" indicating at least <n> and at most <m> elements, each separated by one or more commas (",") and optional linear whitespace (LWS). This makes the usual form of lists very easy; a rule such as "( *LWS element *( *LWS "," *LWS element ))" can be shown as "1#element". Wherever this construct is used, null elements are allowed, but do not contribute to the count of elements present. That is, "(element), , (element)" is permitted, but counts as only two elements. Therefore, where at least one element is required, at least one non-null element MUST be present. Default values are 0 and infinity so that "#(element)" allows any number, including zero; "1#element" requires at least one; and _1#2element_ allows one or two.

; comment
    A semi-colon, set off some distance to the right of rule text, starts a comment that continues to the end of line. This is a simple way of including useful notes in parallel with the specifications.

implied *LWS
    The grammar described by this specification is word-based. Except where noted otherwise, linear whitespace (LWS) can be included between any two adjacent words (token or quoted-string), and between adjacent tokens and delimiters (tspecials), without changing the interpretation of a field. At least one delimiter (tspecials) MUST exist between any two tokens, since they would otherwise be interpreted as a single token. However, applications SHOULD attempt to follow _common form_ when generating HTTP constructs, since there exist some implementations that fail to accept anything beyond the common forms.
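The "#" list construct is easiest to see with a small example. The Python sketch below splits a field value on commas, strips the optional LWS around each element, and discards null elements before counting, as the #rule text requires. The function name parse_list and its minimum/maximum parameters are hypothetical, header folding is assumed to have been removed already, and commas inside quoted-strings are not handled.

```python
def parse_list(value, minimum=0, maximum=None):
    """Split a '#element' list; null elements are allowed but not counted."""
    # LWS around each element is SP or HT (folding is assumed already removed).
    elements = [item.strip(" \t") for item in value.split(",")]
    elements = [item for item in elements if item]  # drop null elements
    if len(elements) < minimum:
        raise ValueError(f"expected at least {minimum} element(s)")
    if maximum is not None and len(elements) > maximum:
        raise ValueError(f"expected at most {maximum} element(s)")
    return elements


print(parse_list("gzip, , compress", minimum=1))  # ['gzip', 'compress']
```

A value consisting only of commas and whitespace yields an empty list, so it fails a "1#element" rule even though it is syntactically a list.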
2.2 Basic Rules

The following rules are used throughout this specification to describe basic parsing constructs. The US-ASCII coded character set is defined by [21].

    OCTET          = <any 8-bit sequence of data>
    CHAR           = <any US-ASCII character (octets 0 - 127)>
    UPALPHA        = <any US-ASCII uppercase letter "A".."Z">
    LOALPHA        = <any US-ASCII lowercase letter "a".."z">
    ALPHA          = UPALPHA | LOALPHA
    DIGIT          = <any US-ASCII digit "0".."9">
    CTL            = <any US-ASCII control character
                     (octets 0 - 31) and DEL (127)>
    CR             = <US-ASCII CR, carriage return (13)>
    LF             = <US-ASCII LF, linefeed (10)>
    SP             = <US-ASCII SP, space (32)>
    HT             = <US-ASCII HT, horizontal-tab (9)>
    <">            = <US-ASCII double-quote mark (34)>

HTTP/1.1 defines the octet sequence CR LF as the end-of-line marker for all protocol elements except the Entity-Body (see Appendix B for tolerant applications). The end-of-line marker within an Entity-Body is defined by its associated media type, as described in Section 3.7.

    CRLF           = CR LF

HTTP/1.1 headers can be folded onto multiple lines if the continuation line begins with a space or horizontal tab. All linear whitespace, including folding, has the same semantics as SP.

    LWS            = [CRLF] 1*( SP | HT )

The TEXT rule is only used for descriptive field contents and values that are not intended to be interpreted by the message parser. Words of *TEXT MAY contain octets from character sets other than US-ASCII only when encoded according to the rules of RFC 1522 [14].

    TEXT           = <any OCTET except CTLs,
                     but including LWS>

Recipients of header field TEXT containing octets outside the US-ASCII character set range MAY assume that they represent ISO-8859-1 characters if there is no other encoding indicated by an RFC 1522 mechanism.

Hexadecimal numeric characters are used in several protocol elements.

    HEX            = "A" | "B" | "C" | "D" | "E" | "F"
                   | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT

Many HTTP/1.1 header field values consist of words separated by LWS or special characters. These special characters MUST be in a quoted string to be used within a parameter value.

    word           = token | quoted-string

    token          = 1*<any CHAR except CTLs or tspecials>

    tspecials      = "(" | ")" | "<" | ">" | "@"
                   | "," | ";" | ":" | "\" | <">
                   | "/" | "[" | "]" | "?"
                   | "=" | "{" | "}" | SP | HT

Comments can be included in some HTTP header fields by surrounding the comment text with parentheses. Comments are only allowed in fields containing _comment_ as part of their field value definition. In all other fields, parentheses are considered part of the field value.

    comment        = "(" *( ctext | comment ) ")"
    ctext          = <any TEXT excluding "(" and ")">

A string of text is parsed as a single word if it is quoted using double-quote marks.

    quoted-string  = ( <"> *(qdtext) <"> )

    qdtext         = <any CHAR except <"> and CTLs,
                     but including LWS>

The backslash character (_\_) may be used as a single-character quoting mechanism only within quoted-string and comment constructs.

    quoted-pair    = "\" CHAR

3. Protocol Parameters

3.1 HTTP Version

HTTP uses a _<major>.<minor>_ numbering scheme to indicate versions of the protocol. The protocol versioning policy is intended to allow the sender to indicate the format of a message and its capacity for understanding further HTTP communication, rather than the features obtained via that communication. No change is made to the version number for the addition of message components which do not affect communication behavior or which only add to extensible field values. The <minor> number is incremented when the changes made to the protocol add features which do not change the general message parsing algorithm, but which may add to the message semantics and imply additional capabilities of the sender. The <major> number is incremented when the format of a message within the protocol is changed.

The version of an HTTP message is indicated by an HTTP-Version field in the first line of the message. If the protocol version is not specified, the recipient MUST assume that the message is in the simple HTTP/0.9 format [6].

    HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT

Note that the major and minor numbers SHOULD be treated as separate integers and that each MAY be incremented higher than a single digit.
Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is lower than HTTP/12.3. Leading zeros SHOULD be ignored by recipients and never generated by senders.

Applications sending Full-Request or Full-Response messages, as defined by this specification, MUST include an HTTP-Version of _HTTP/1.1_. Use of this version number indicates that the sending application is at least conditionally compliant with this specification.

Proxy and gateway applications MUST be careful in forwarding requests that are received in a format different than that of the application's native HTTP version. Since the protocol version indicates the protocol capability of the sender, a proxy/gateway MUST never send a message with a version indicator which is greater than its native version; if a higher version request is received, the proxy/gateway MUST either downgrade the request version, respond with an error, or switch to tunnel behavior. Requests with a version lower than that of the application's native format MAY be upgraded before being forwarded; the proxy/gateway's response to that request MUST follow the server requirements listed above.

    Note: Converting between versions of HTTP may involve addition or deletion of headers required or forbidden by the version involved. It is likely more involved than just changing the version indicator.

3.2 Uniform Resource Identifiers

URIs have been known by many names: WWW addresses, Universal Document Identifiers, Universal Resource Identifiers [3], and finally the combination of Uniform Resource Locators (URL) [4] and Names (URN) [20]. As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings which identify--via name, location, or any other characteristic--a network resource.
3.2.1 General Syntax

URIs in HTTP can be represented in absolute form or relative to some known base URI [11], depending upon the context of their use. The two forms are differentiated by the fact that absolute URIs always begin with a scheme name followed by a colon.

    URI            = ( absoluteURI | relativeURI ) [ "#" fragment ]

    absoluteURI    = scheme ":" *( uchar | reserved )

    relativeURI    = net_path | abs_path | rel_path

    net_path       = "//" net_loc [ abs_path ]
    abs_path       = "/" rel_path
    rel_path       = [ path ] [ ";" params ] [ "?" query ]

    path           = fsegment *( "/" segment )
    fsegment       = 1*pchar
    segment        = *pchar

    params         = param *( ";" param )
    param          = *( pchar | "/" )

    scheme         = 1*( ALPHA | DIGIT | "+" | "-" | "." )
    net_loc        = *( pchar | ";" | "?" )
    query          = *( uchar | reserved )
    fragment       = *( uchar | reserved )

    pchar          = uchar | ":" | "@" | "&" | "=" | "+"
    uchar          = unreserved | escape
    unreserved     = ALPHA | DIGIT | safe | extra | national

    escape         = "%" HEX HEX
    reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
    extra          = "!" | "*" | "'" | "(" | ")" | ","
    safe           = "$" | "-" | "_" | "." | "+"
    unsafe         = CTL | SP | <"> | "#" | "%" | "<" | ">"
    national       = <any OCTET excluding ALPHA, DIGIT,
                     reserved, extra, safe, and unsafe>

For definitive information on URL syntax and semantics, see RFC 1738 [4] and RFC 1808 [11]. The BNF above includes national characters not allowed in valid URLs as specified by RFC 1738, since HTTP servers are not restricted in the set of unreserved characters allowed to represent the rel_path part of addresses, and HTTP proxies may receive requests for URIs not defined by RFC 1738.

The HTTP protocol does not place any a-priori limit on the length of a URI. Servers MUST be able to handle the URI of any resource they serve, and SHOULD be able to handle URIs of unbounded length if they provide GET-based forms that could generate such URIs.
A server SHOULD return a status code of 414 Request-URI Too Large if a URI is longer than the server can handle. See section 9.4.

    Note: Servers SHOULD be cautious about depending on URI lengths above 255 bytes, because some older client or proxy implementations may not properly support these.

All client and proxy implementations MUST be able to handle a URI of any finite length.

3.2.2 http URL

The _http_ scheme is used to locate network resources via the HTTP protocol. This section defines the scheme-specific syntax and semantics for http URLs.

    http_URL       = "http:" "//" host [ ":" port ] [ abs_path ]

    host           = <A legal Internet host domain name
                     or IP address (in dotted-decimal form),
                     as defined by Section 2.1 of RFC 1123>

    port           = *DIGIT

If the port is empty or not given, port 80 is assumed. The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path. The use of IP addresses in URLs SHOULD be avoided whenever possible. See RFC 1900 [24]. If the abs_path is not present in the URL, it MUST be given as _/_ when used as a Request-URI for a resource (Section 5.1.2).

    Note: Although the HTTP protocol is independent of the transport layer protocol, the http URL only identifies resources by their TCP location, and thus non-TCP resources MUST be identified by some other URI scheme.

The canonical form for _http_ URLs is obtained by converting any UPALPHA characters in host to their LOALPHA equivalent (hostnames are case-insensitive), eliding the [ ":" port ] if the port is 80, and replacing an empty abs_path with _/_.
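The canonicalization steps just listed can be sketched as follows. This is a minimal illustration under stated assumptions rather than a complete URL parser: the function name canonical_http_url is hypothetical, the host is simply taken to be any run of characters other than "/" and ":", an empty port is treated like an absent one, and a port written with leading zeros (such as 080) is left untouched.

```python
import re

# Mirrors http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
_HTTP_URL = re.compile(r"http://([^/:]+)(?::([0-9]*))?(/.*)?", re.IGNORECASE)


def canonical_http_url(url):
    match = _HTTP_URL.fullmatch(url)
    if match is None:
        raise ValueError(f"not an http URL: {url!r}")
    host, port, abs_path = match.groups()
    host = host.lower()  # hostnames are case-insensitive
    # Elide the port when it is absent, empty, or the default 80.
    port_part = "" if port in (None, "", "80") else f":{port}"
    # An empty abs_path is replaced with "/".
    return f"http://{host}{port_part}{abs_path or '/'}"


print(canonical_http_url("http://WWW.Example.COM:80"))
# http://www.example.com/
print(canonical_http_url("http://www.example.com:8080/a/B"))
# http://www.example.com:8080/a/B
```

Only the host is lowercased; the abs_path is passed through unchanged, since the text above makes no claim that paths are case-insensitive.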
3.3 Date/Time Formats

3.3.1 Full Date

HTTP applications have historically allowed three different formats for the representation of date/time stamps:

    Sun, 06 Nov 1994 08:49:37 GMT    ; RFC 822, updated by RFC 1123
    Sunday, 06-Nov-94 08:49:37 GMT   ; RFC 850, made obsolete by RFC 1036
    Sun Nov  6 08:49:37 1994         ; ANSI C's asctime() format

The first format is preferred as an Internet standard and represents a fixed-length subset of that defined by RFC 1123 [8] (an update to RFC 822 [9]). The second format is in common use, but is based on the obsolete RFC 850 [12] date format and lacks a four-digit year. HTTP/1.1 clients and servers that parse the date value MUST accept all three formats, though they MUST only generate the RFC 1123 format for representing date/time stamps in HTTP message fields.

    Note: Recipients of date values are encouraged to be robust in accepting date values that may have been generated by non-HTTP applications, as is sometimes the case when retrieving or posting messages via proxies/gateways to SMTP or NNTP.

All HTTP date/time stamps MUST be represented in Universal Time (UT), also known as Greenwich Mean Time (GMT), without exception. This is indicated in the first two formats by the inclusion of _GMT_ as the three-letter abbreviation for time zone, and SHOULD be assumed when reading the asctime format.
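As a rough illustration of the accept-three, generate-one rule, the Python sketch below tries each of the three formats in turn and always emits the RFC 1123 form. The function names parse_http_date and format_http_date are hypothetical. The sketch assumes an English (C) locale, because strptime and strftime take day and month names from the current locale, and it relies on Python's own pivot for two-digit years rather than any rule defined by this specification.

```python
from datetime import datetime, timezone

_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)


def parse_http_date(text):
    """Accept any of the three historical formats; the result is always GMT."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # GMT is explicit in the first two formats and assumed for asctime.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"unrecognized HTTP date: {text!r}")


def format_http_date(moment):
    """Generate only the RFC 1123 form."""
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")


stamp = parse_http_date("Sun Nov  6 08:49:37 1994")
print(format_http_date(stamp))  # Sun, 06 Nov 1994 08:49:37 GMT
```

All three example stamps shown earlier parse to the same instant, so a recipient can normalize on input and a sender never has to choose a format.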
    HTTP-date      = rfc1123-date | rfc850-date | asctime-date

    rfc1123-date   = wkday "," SP date1 SP time SP "GMT"
    rfc850-date    = weekday "," SP date2 SP time SP "GMT"
    asctime-date   = wkday SP date3 SP time SP 4DIGIT

    date1          = 2DIGIT SP month SP 4DIGIT
                     ; day month year (e.g., 02 Jun 1982)
    date2          = 2DIGIT "-" month "-" 2DIGIT
                     ; day-month-year (e.g., 02-Jun-82)
    date3          = month SP ( 2DIGIT | ( SP 1DIGIT ))
                     ; month day (e.g., Jun  2)

    time           = 2DIGIT ":" 2DIGIT ":" 2DIGIT
                     ; 00:00:00 - 23:59:59

    wkday          = "Mon" | "Tue" | "Wed"
                   | "Thu" | "Fri" | "Sat" | "Sun"

    weekday        = "Monday" | "Tuesday" | "Wednesday"
                   | "Thursday" | "Friday" | "Saturday" | "Sunday"

    month          = "Jan" | "Feb" | "Mar" | "Apr"
                   | "May" | "Jun" | "Jul" | "Aug"
                   | "Sep" | "Oct" | "Nov" | "Dec"

    Note: HTTP requirements for the date/time stamp format apply only to their usage within the protocol stream. Clients and servers are not required to use these formats for user presentation, request logging, etc.

Additional rules for requirements on parsing and representation of dates and other potential problems with date representations include:

  o  HTTP/1.1 clients and caches should assume that an RFC-850 date which appears to be more than 50 years in the future is in fact in the past (this helps solve the _year 2000_ problem).

  o  An HTTP/1.1 implementation may internally represent a parsed Expires date as earlier than the proper value, but MUST NOT internally represent a parsed Expires date as later than the proper value.

3.3.2 Delta Seconds

Some HTTP header fields allow a time value to be specified as an integer number of seconds, represented in decimal, after the time that the message was received. This format SHOULD only be used to represent short time periods or periods that cannot start until receipt of the message.
    delta-seconds  = 1*DIGIT

3.4 Character Sets

HTTP uses the same definition of the term _character set_ as that described for MIME:

    The term _character set_ is used in this document to refer to a method used with one or more tables to convert a sequence of octets into a sequence of characters. Note that unconditional conversion in the other direction is not required, in that not all characters may be available in a given character set and a character set may provide more than one sequence of octets to represent a particular character. This definition is intended to allow various kinds of character encodings, from simple single-table mappings such as US-ASCII to complex table switching methods such as those that use ISO 2022's techniques. However, the definition associated with a MIME character set name MUST fully specify the mapping to be performed from octets to characters. In particular, use of external profiling information to determine the exact mapping is not permitted.

    Note: This use of the term _character set_ is more commonly referred to as a _character encoding._ However, since HTTP and MIME share the same registry, it is important that the terminology also be shared.

HTTP character sets are identified by case-insensitive tokens. The complete set of tokens is defined by the IANA Character Set registry [19]. However, because that registry does not define a single, consistent token for each character set, we define here the preferred names for those character sets most likely to be used with HTTP entities. These character sets include those registered by RFC 1521 [7] -- the US-ASCII [21] and ISO-8859 [22] character sets -- and other names specifically recommended for use within MIME charset parameters.
    charset = "US-ASCII"
            | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
            | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
            | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
            | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
            | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
            | token

Although HTTP allows an arbitrary token to be used as a charset value, any token that has a predefined value within the IANA Character Set registry [19] MUST represent the character set defined by that registry. Applications SHOULD limit their use of character sets to those defined by the IANA registry.

The character set of an entity body SHOULD be labeled as the lowest common denominator of the character codes used within that body, with the exception that no label is preferred over the labels US-ASCII or ISO-8859-1.

3.5 Content Codings

Content coding values indicate an encoding transformation that has been or can be applied to a resource. Content codings are primarily used to allow a document to be compressed or encrypted without losing the identity of its underlying media type. Typically, the resource is stored in this encoding and only decoded before rendering or analogous usage.

    content-coding = "gzip" | "x-gzip" | "compress" | "x-compress" | token

    Note: For historical reasons, HTTP applications SHOULD consider _x-gzip_ and _x-compress_ to be equivalent to _gzip_ and _compress_, respectively.

All content-coding values are case-insensitive. HTTP/1.1 uses content-coding values in the Accept-Encoding (Section 10.3) and Content-Encoding (Section 10.10) header fields.
Although the value describes the content-coding, what is more important is that it indicates what decoding mechanism will be required to remove the encoding. Note that a single program MAY be capable of decoding multiple content-coding formats. Two values are defined by this specification:

gzip
    An encoding format produced by the file compression program _gzip_ (GNU zip) developed by Jean-loup Gailly [25]. This format is typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC.

compress
    The encoding format produced by the file compression program _compress_. This format is an adaptive Lempel-Ziv-Welch coding (LZW).

    Note: Use of program names for the identification of encoding formats is not desirable and should be discouraged for future encodings. Their use here is representative of historical practice, not good design.

HTTP defines a registration process which uses the Internet Assigned Numbers Authority (IANA) as a central registry for content-coding value tokens. Additional content-coding value tokens beyond the four defined in this document (gzip x-gzip compress x-compress) SHOULD be registered with the IANA. To allow interoperability between clients and servers, specifications of the content coding algorithms used to implement a new value SHOULD be publicly available and adequate for independent implementation, and MUST conform to the purpose of content coding defined in this section.

3.6 Transfer Codings

Transfer coding values are used to indicate an encoding transformation that has been, can be, or may need to be applied to an Entity-Body in order to ensure safe transport through the network. This differs from a content coding in that the transfer coding is a property of the message, not of the original resource.

    transfer-coding    = "chunked" | transfer-extension

    transfer-extension = token

All transfer-coding values are case-insensitive.
HTTP/1.1 uses transfer coding values in the Transfer-Encoding header field (Section 10.39).

Transfer codings are analogous to the Content-Transfer-Encoding values of MIME [7], which were designed to enable safe transport of binary data over a 7-bit transport service. However, _safe transport_ has a different focus for an 8bit-clean transfer protocol. In HTTP, the only unsafe characteristic of message bodies is the difficulty in determining the exact body length (Section 7.2.2), or the desire to encrypt data over a shared transport.

All HTTP/1.1 applications MUST be able to receive and decode the _chunked_ transfer coding, and MUST ignore chunked extensions they do not understand. The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an optional footer containing entity-header fields. This allows dynamically-produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message.

    Chunked-Body   = *chunk
                     "0" CRLF
                     footer
                     CRLF

    chunk          = chunk-size [ chunk-ext ] CRLF
                     chunk-data CRLF

    chunk-size     = hex-no-zero *HEX
    chunk-ext      = *( ";" chunk-ext-name [ "=" chunk-ext-value ] )
    chunk-ext-name = token
    chunk-ext-value = token | quoted-string
    chunk-data     = chunk-size(OCTET)

    footer         = *<<Content-MD5 and future headers that specify
                     they are allowed in footer>>

    hex-no-zero    = <HEX excluding "0">

Note that the chunks are ended by a zero-sized chunk, followed by the footer and terminated by an empty line. An example process for decoding a Chunked-Body is presented in Appendix C.5.

3.7 Media Types

HTTP uses Internet Media Types [17] in the Content-Type (Section 10.15) and Accept (Section 10.1) header fields in order to provide open and extensible data typing and type negotiation.
    media-type     = type "/" subtype *( ";" parameter )
    type           = token
    subtype        = token

Parameters may follow the type/subtype in the form of attribute/value pairs.

    parameter      = attribute "=" value
    attribute      = token
    value          = token | quoted-string

The type, subtype, and parameter attribute names are case-insensitive. Parameter values may or may not be case-sensitive, depending on the semantics of the parameter name. LWS MUST NOT be generated between the type and subtype, nor between an attribute and its value. Upon receipt of a media type with an unrecognized parameter, a user agent SHOULD treat the media type as if the unrecognized parameter and its value were not present.

Some older HTTP applications do not recognize media type parameters. HTTP/1.1 applications SHOULD only use media type parameters when they are necessary to define the content of a message.

Media-type values are registered with the Internet Assigned Number Authority (IANA [19]). The media type registration process is outlined in RFC 1590 [17]. Use of non-registered media types is discouraged.

3.7.1 Canonicalization and Text Defaults

Internet media types are registered with a canonical form. In general, an Entity-Body transferred via HTTP MUST be represented in the appropriate canonical form prior to its transmission. If the body has been encoded with a Content-Encoding, the underlying data SHOULD be in canonical form prior to being encoded.

Media subtypes of the _text_ type use CRLF as the text line break when in canonical form. However, HTTP allows the transport of text media with plain CR or LF alone representing a line break when used consistently within the Entity-Body. HTTP applications MUST accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP.
In addition, if the text media is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. This flexibility regarding line breaks applies only to text media in the Entity-Body; a bare CR or LF SHOULD NOT be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries).

The _charset_ parameter is used with some media types to define the character set (Section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the _text_ type are defined to have a default charset value of _ISO-8859-1_ when received via HTTP. Data in character sets other than _ISO-8859-1_ or its subsets MUST be labeled with an appropriate charset value in order to be consistently interpreted by the recipient.

    Note: Many current HTTP servers provide data using charsets other than _ISO-8859-1_ without proper labeling. This situation reduces interoperability and is not recommended. To compensate for this, some HTTP user agents provide a configuration option to allow the user to change the default interpretation of the media type character set when no charset parameter is given.

3.7.2 Multipart Types

MIME provides for a number of _multipart_ types -- encapsulations of one or more entities within a single message's Entity-Body. All multipart types share a common syntax, as defined in Section 7.2.1 of RFC 1521 [7], and MUST include a boundary parameter as part of the media type value. The message body is itself a protocol element and MUST therefore use only CRLF to represent line breaks between body-parts.
Unlike in RFC 1521, the epilogue of any multipart message MUST be empty; HTTP applications MUST NOT transmit the epilogue even if the original resource contains an epilogue.

In HTTP, multipart body-parts MAY contain header fields which are significant to the meaning of that part.

In general, an HTTP user agent SHOULD follow the same or similar behavior as a MIME user agent would upon receipt of a multipart type. If an application receives an unrecognized multipart subtype, the application MUST treat it as being equivalent to _multipart/mixed_.

    Note: The "multipart/form-data" type has been specifically defined for carrying form data suitable for processing via the POST request method, as described in RFC 1867 [15].

3.8 Product Tokens

Product tokens are used to allow communicating applications to identify themselves via a simple product token, with an optional slash and version designator. Most fields using product tokens also allow sub-products which form a significant part of the application to be listed, separated by whitespace. By convention, the products are listed in order of their significance for identifying the application.

    product         = token ["/" product-version]
    product-version = token

Examples:

    User-Agent: CERN-LineMode/2.15 libwww/2.17b3
    Server: Apache/0.8.4

Product tokens SHOULD be short and to the point -- use of them for advertising or other non-essential information is explicitly forbidden. Although any token character may appear in a product-version, this token SHOULD only be used for a version identifier (i.e., successive versions of the same product SHOULD only differ in the product-version portion of the product value).

3.9 Quality Values

HTTP content negotiation (Section 12) uses short _floating point_ numbers to indicate the relative importance (_weight_) of various negotiable parameters.
The weights are normalized to a real number in the range 0 through 1, where 0 is the minimum and 1 the maximum value. In order to discourage misuse of this feature, HTTP/1.1 applications MUST NOT generate more than three digits after the decimal point. User configuration of these values SHOULD also be limited in this fashion.

    qvalue         = ( "0" [ "." 0*3DIGIT ] )
                   | ( "." 0*3DIGIT )
                   | ( "1" [ "." 0*3("0") ] )

_Quality values_ is a slight misnomer, since these values actually measure relative degradation in perceived quality. Thus, a value of _0.8_ represents a 20% degradation from the optimum rather than a statement of 80% quality.

3.10 Language Tags

A language tag identifies a natural language spoken, written, or otherwise conveyed by human beings for communication of information to other human beings. Computer languages are explicitly excluded. HTTP uses language tags within the Accept-Language and Content-Language fields.

The syntax and registry of HTTP language tags is the same as that defined by RFC 1766 [1]. In summary, a language tag is composed of 1 or more parts: A primary language tag and a possibly empty series of subtags:

    language-tag   = primary-tag *( "-" subtag )

    primary-tag    = 1*8ALPHA
    subtag         = 1*8ALPHA

Whitespace is not allowed within the tag and all tags are case-insensitive. The namespace of language tags is administered by the IANA. Example tags include:

    en, en-US, en-cockney, i-cherokee, x-pig-latin

where any two-letter primary-tag is an ISO 639 language abbreviation and any two-letter initial subtag is an ISO 3166 country code. The last three tags above are not registered tags, but examples of tags which could be registered in future.

3.12 Full Date Values

Contents moved to section 3.3.

3.13 Opaque Validators

Opaque validators are quoted strings whose internal structure is not visible to clients or caches.
       opaque-validator        = strong-opaque-validator
                               | weak-opaque-validator
                               | null-validator
       strong-opaque-validator = quoted-string
       weak-opaque-validator   = quoted-string "/W"
       null-validator          = <"> <">

Note that the _/W_ tag is considered part of a weak opaque validator; it MUST NOT be removed by any cache or client.

There are two comparison functions on opaque validators:

  . The strong comparison function: in order to be considered equal, both validators must be identical in every way, and neither may be weak.

  . The weak comparison function: in order to be considered equal, both validators must be identical in every way, except for the presence or absence of a _weak_ tag.

The weak comparison function MAY be used for simple (non-subrange) GET requests. The strong comparison function MUST be used in all other cases.

The null validator is a special value, defined as never matching the current validator of an existing resource, and always matching the _current_ validator of a resource that does not exist.

3.14 Variant IDs

Variant-IDs are used to identify specific entities (variants) of a varying resource; see section 13.8.3 for how they are used.

       variant-id = quoted-string

Variant-IDs are compared using string octet-equality; case is significant.

3.15 Validator Sets

Validator sets are used for doing conditional retrievals on varying resources; see section 13.8.4.

       validator-set      = 1#validator-set-item
       validator-set-item = opaque-validator

3.16 Variant Sets

Variant sets are used for doing conditional retrievals on varying resources; see section 13.8.3.

       variant-set      = 1#variant-set-item
       variant-set-item = opaque-validator ";" variant-id

3.17 HTTP Protocol Parameters Related to Ranges

This section defines certain HTTP protocol parameters used in range requests and related responses.
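The two comparison functions on opaque validators (section 3.13) can be sketched as follows. This is a minimal illustration in Python and not part of the protocol: the function names are hypothetical, and validators are assumed to be handled as the raw strings seen on the wire, including the quotes and any trailing /W tag.

```python
WEAK_TAG = "/W"  # suffix that marks a weak opaque validator


def is_weak(validator: str) -> bool:
    """True if the quoted-string is followed by the /W tag."""
    return validator.endswith('"' + WEAK_TAG)


def without_weak_tag(validator: str) -> str:
    """Quoted-string part only; used for comparison, never for forwarding."""
    return validator[: -len(WEAK_TAG)] if is_weak(validator) else validator


def strong_equal(a: str, b: str) -> bool:
    """Identical in every way, and neither validator may be weak."""
    return a == b and not is_weak(a) and not is_weak(b)


def weak_equal(a: str, b: str) -> bool:
    """Identical except for the presence or absence of the weak tag."""
    return without_weak_tag(a) == without_weak_tag(b)


def validators_match(a: str, b: str, method: str, has_range: bool) -> bool:
    """Use weak comparison only where it is permitted: simple GET requests.

    The text says weak comparison MAY be used there; this sketch chooses
    to use it whenever it is allowed.
    """
    if method == "GET" and not has_range:
        return weak_equal(a, b)
    return strong_equal(a, b)
```

Note that the tag is stripped only on a local copy for comparison; the validator itself is passed on unchanged, since the /W tag must not be removed by a cache or client.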
3.17.1 SLUSHY: Range Units

A resource may be broken down into subranges according to various structural units.

       range-unit       = bytes-unit | other-range-unit
       bytes-unit       = "bytes"
       other-range-unit = token

The only range unit defined by HTTP/1.1 is _bytes_. HTTP/1.1 implementations may ignore ranges specified using other units.

3.17.2 SLUSHY: Byte Ranges

Since all HTTP entities are represented in HTTP messages as sequences of bytes, the concept of a byte range is meaningful for any HTTP entity. (However, not all clients and servers need to support byte-range operations.)

Byte range specifications in HTTP apply to the sequence of bytes that would be transferred by the protocol if no transfer-encoding were being applied. This means that if Content-Encoding is applied to the data, the byte range specification applies to the resulting content-encoded byte stream, not to the unencoded byte stream. It also means that if the entity-body's media-type is a composite type (e.g., multipart/* and message/rfc822), then the composite's body-parts may have their own content-encoding and content-transfer-encoding, and the byte range applies to the result of those encodings.

A byte range operation may specify a single range of bytes, or a set of ranges within a single entity.

       ranges-specifier      = byte-ranges-specifier
       byte-ranges-specifier = bytes-unit "=" byte-range-set
       byte-range-set        = 1#( byte-range-spec | suffix-byte-range-spec )
       byte-range-spec       = first-byte-pos "-" [last-byte-pos]
       first-byte-pos        = 1*DIGIT
       last-byte-pos         = 1*DIGIT

The first-byte-pos value in a byte-range-spec gives the byte-offset of the first byte in a range. The last-byte-pos value gives the byte-offset of the last byte in the range; that is, the byte positions specified are inclusive. Byte offsets start at zero.
If the last-byte-pos value is present, it must be greater than or equal to the first-byte-pos in that byte-range-spec, or the byte-range-spec is invalid. The recipient of an invalid byte-range-spec must ignore it.

If the last-byte-pos value is absent, it is assumed to be the offset of the last byte of the entity (one less than the current length of the entity in bytes). If the last-byte-pos value is larger than that offset, it is likewise assumed to be the offset of the last byte of the entity.

       suffix-byte-range-spec = "-" suffix-length
       suffix-length          = 1*DIGIT

A suffix-byte-range-spec is used to specify the suffix of the entity, of a length given by the suffix-length value. (That is, this form specifies the last N bytes of an entity.) If the entity is shorter than the specified suffix-length, the entire entity is used.

Examples of byte-ranges-specifier values (assuming an entity of length 10000):

  . The first 500 bytes (byte offsets 0-499, inclusive):

       bytes=0-499

  . The second 500 bytes (byte offsets 500-999, inclusive):

       bytes=500-999

  . The final 500 bytes (byte offsets 9500-9999, inclusive):

       bytes=-500

    or

       bytes=9500-

  . The first and last bytes only (bytes 0 and 9999):

       bytes=0-0,-1

  . Several legal but not canonical specifications of the second 500 bytes (byte offsets 500-999, inclusive):

       bytes=500-600,601-999
       bytes=500-700,601-999

3.17.3 SLUSHY: Content Ranges

When a server returns a partial response to a client, it must describe both the extent of the range covered by the response, and the length of the entire entity.
       content-range-spec      = byte-content-range-spec
       byte-content-range-spec = bytes-unit SP first-byte-pos "-" last-byte-pos "/" entity-length
       entity-length           = 1*DIGIT

Unlike byte-ranges-specifier values, a byte-content-range-spec may only specify one range, and must contain absolute byte positions for both the first and last byte of the range.

A byte-content-range-spec whose last-byte-pos value is less than its first-byte-pos value, or whose entity-length value is less than or equal to its last-byte-pos value, is invalid. The recipient of an invalid byte-content-range-spec must ignore it and any content transferred along with it.

Examples of byte-content-range-spec values, assuming that the entity contains a total of 1234 bytes:

  . The first 500 bytes:

       bytes 0-499/1234

  . The second 500 bytes:

       bytes 500-999/1234

  . All except for the first 500 bytes:

       bytes 500-1233/1234

  . The last 500 bytes:

       bytes 734-1233/1234

4. HTTP Message

4.1 Message Types

HTTP messages consist of requests from client to server and responses from server to client.

       HTTP-message = Full-Request          ; HTTP/1.1 messages
                    | Full-Response
                    | NULL-Request

A NULL-Request (an empty line where a request would normally be expected) MUST be ignored. Clients SHOULD NOT send a NULL-Request, but there are some error and testing circumstances in which a NULL-Request might be sent by mistake and MUST NOT cause failure on the server.

       NULL-Request = CRLF

Full-Request and Full-Response use the generic message format of RFC 822 [9] for transferring entities. Both messages may include optional header fields (also known as _headers_) and an entity body. The entity body is separated from the headers by a null line (i.e., a line with nothing preceding the CRLF).
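The byte-range rules of sections 3.17.2 and 3.17.3 can be illustrated with a short sketch. It is written in Python under stated assumptions and is not a normative algorithm: the function names are hypothetical, the last byte offset of an entity is taken as its length minus one (in line with the examples for a 10000-byte entity), and a range that starts beyond the end of the entity or has a zero suffix-length is simply skipped, a case the text does not address.

```python
import re

# One element of a byte-range-set: "first-last", "first-" or "-suffix".
_RANGE_ITEM = re.compile(r"(\d*)-(\d*)", re.ASCII)


def parse_byte_ranges(specifier, entity_length):
    """Return a list of inclusive (first, last) offsets; invalid specs are ignored."""
    unit, equals, range_set = specifier.partition("=")
    if unit.strip() != "bytes" or not equals:
        return []  # ranges in other units may be ignored
    last_offset = entity_length - 1
    ranges = []
    for item in range_set.split(","):
        match = _RANGE_ITEM.fullmatch(item.strip())
        if not match or not (match.group(1) or match.group(2)):
            continue
        first_text, last_text = match.groups()
        if first_text == "":
            # suffix-byte-range-spec: the last N bytes, or the whole entity
            suffix = min(int(last_text), entity_length)
            if suffix > 0:  # assumption: an empty suffix selects nothing
                ranges.append((entity_length - suffix, last_offset))
            continue
        first = int(first_text)
        if last_text == "":
            last = last_offset
        else:
            last = int(last_text)
            if last < first:
                continue  # invalid byte-range-spec: ignore it
            last = min(last, last_offset)
        if first > last:
            continue  # assumption: range starts past the end of the entity
        ranges.append((first, last))
    return ranges


def format_content_range(first, last, entity_length):
    """Build a byte-content-range-spec, rejecting the invalid combinations."""
    if last < first or entity_length <= last:
        raise ValueError("invalid byte-content-range-spec")
    return f"bytes {first}-{last}/{entity_length}"
```

For an entity of 10000 bytes, `parse_byte_ranges("bytes=0-0,-1", 10000)` yields the first and last byte as two separate ranges, and each resulting pair can be passed to `format_content_range` to describe a partial response.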
       Full-Request   = Request-Line             ; Section 5.1
                        *( General-Header        ; Section 4.3
                         | Request-Header        ; Section 5.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

       Full-Response  = Status-Line              ; Section 6.1
                        *( General-Header        ; Section 4.3
                         | Response-Header       ; Section 6.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

4.2 Message Headers

HTTP header fields, which include General-Header (Section 4.3), Request-Header (Section 5.2), Response-Header (Section 6.2), and Entity-Header (Section 7.1) fields, follow the same generic format as that given in Section 3.1 of RFC 822 [9]. Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The field value may be preceded by any amount of LWS, though a single SP is preferred. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT.

       HTTP-header    = field-name ":" [ field-value ] CRLF

       field-name     = token
       field-value    = *( field-content | LWS )

       field-content  = <the OCTETs making up the field-value
                        and consisting of either *TEXT or combinations
                        of token, tspecials, and quoted-string>

The order in which header fields with differing field names are received is not significant. However, it is _good practice_ to send General-Header fields first, followed by Request-Header or Response-Header fields, and ending with the Entity-Header fields.

Multiple HTTP-header fields with the same field-name may be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one _field-name: field-value_ pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.
Thus, the order in which multiple header fields with the same field-name are received may be significant to the interpretation of the combined field-value.

4.3 General Header Fields

There are a few header fields which have general applicability for both request and response messages, but which do not apply to the entity being transferred. These headers apply only to the message being transmitted.

       General-Header = Cache-Control            ; Section 10.8
                      | Connection               ; Section 10.9
                      | Date                     ; Section 10.17
                      | Via                      ; Section 10.20
                      | Keep-Alive               ; Section 10.24
                      | Pragma                   ; Section 10.29
                      | Upgrade                  ; Section 10.41

General header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields may be given the semantics of general header fields if all parties in the communication recognize them to be general header fields. Unrecognized header fields are treated as Entity-Header fields.

5. Request

A request message from a client to a server includes, within the first line of that message, the method to be applied to the resource, the identifier of the resource, and the protocol version in use. For backwards compatibility with the more limited HTTP/0.9 protocol, there are two valid formats for an HTTP request:

       Request        = Full-Request | NULL-Request

       Full-Request   = Request-Line             ; Section 5.1
                        *( General-Header        ; Section 4.3
                         | Request-Header        ; Section 5.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

       NULL-Request   = CRLF

A NULL-Request MUST be ignored.

5.1 Request-Line

The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
       Request-Line = Method SP Request-URI SP HTTP-Version CRLF

5.1.1 Method

The Method token indicates the method to be performed on the resource identified by the Request-URI. The method is case-sensitive.

       Method         = "OPTIONS"                ; Section 8.1
                      | "GET"                    ; Section 8.2
                      | "HEAD"                   ; Section 8.3
                      | "POST"                   ; Section 8.4
                      | "PUT"                    ; Section 8.5
                      | "DELETE"
                      | "TRACE"                  ; Section 8.12
                      | extension-method

       extension-method = token

The list of methods acceptable by a specific resource can be specified in an Allow header field (Section 10.5). However, the client is always notified through the return code of the response whether a method is currently allowed on a specific resource, as this can change dynamically. Servers SHOULD return the status code 405 (method not allowed) if the method is known by the server but not allowed for the requested resource, and 501 (not implemented) if the method is unrecognized or not implemented by the server. The list of methods known by a server can be listed in a Public response header field (Section 10.32).

The methods GET and HEAD MUST be supported by all general-purpose servers. Servers which provide Last-Modified dates for resources MUST also support the conditional GET method. All other methods are optional; however, if the above methods are implemented, they MUST be implemented with the same semantics as those specified in Section 8.

5.1.2 Request-URI

The Request-URI is a Uniform Resource Identifier (Section 3.2) and identifies the resource upon which to apply the request.

       Request-URI    = "*" | absoluteURI | abs_path

To allow for transition to absoluteURIs in all requests in future versions of HTTP, HTTP/1.1 servers MUST accept the absoluteURI form in requests, even though HTTP/1.1 clients will not normally generate them.
Versions of HTTP after HTTP/1.1 may require absoluteURIs everywhere, after HTTP/1.1 or later have become the dominant implementations.

The three options for Request-URI are dependent on the nature of the request. The asterisk _*_ means that the request does not apply to a particular resource, but to the server itself, and is only allowed when the Method used does not necessarily apply to a resource. One example would be

       OPTIONS * HTTP/1.1

The absoluteURI form is only allowed to an origin server if the client knows the server supports HTTP/1.1 or later. If the absoluteURI form is used, any Host request-header included with the request MUST be ignored. The absoluteURI form is required when the request is being made to a proxy. The proxy is requested to forward the request and return the response. If the request is GET or HEAD and a prior response is cached, the proxy may use the cached message if it passes any restrictions in the Cache-Control and Expires header fields. Note that the proxy MAY forward the request on to another proxy or directly to the server specified by the absoluteURI. In order to avoid request loops, a proxy MUST be able to recognize all of its server names, including any aliases, local variations, and the numeric IP address. An example Request-Line would be:

       GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1

The most common form of Request-URI is that used to identify a resource on an origin server or gateway. In this case, only the absolute path of the URI is transmitted (see Section 3.2.1, abs_path). For example, a client wishing to retrieve the resource above directly from the origin server would create a TCP connection to port 80 of the host _www.w3.org_ and send the lines:

       GET /pub/WWW/TheProject.html HTTP/1.1
       Host: www.w3.org

followed by the remainder of the Full-Request.
Note that the absolute path cannot be empty; if none is present in the original URI, it MUST be given as _/_ (the server root).

If a proxy receives a request without any path in the Request-URI and the method used is capable of supporting the asterisk form of request, then the last proxy on the request chain MUST forward the request with _*_ as the final Request-URI. For example, the request

       OPTIONS http://www.ics.uci.edu:8001 HTTP/1.1

would be forwarded by the proxy as

       OPTIONS * HTTP/1.1

after connecting to port 8001 of host _www.ics.uci.edu_.

The Request-URI is transmitted as an encoded string, where some characters may be escaped using the _% HEX HEX_ encoding defined by RFC 1738 [4]. The origin server MUST decode the Request-URI in order to properly interpret the request.

In requests that they forward, proxies MUST NOT rewrite the _abs_path_ part of a Request-URI in any way except as noted above to replace a null abs_path with _*_. Illegal Request-URIs SHOULD be responded to with an appropriate status code. Proxies MAY transform the Request-URI for internal processing purposes, but SHOULD NOT send such a transformed Request-URI in forwarded requests. Transformations for use in cache updates and lookups are subject to additional requirements; see section 13 on caching.

The main reason for this rule is to make sure that the form of Request-URIs is well specified, to enable future extensions without fear that they will break in the face of some rewritings. Another is that one consequence of rewriting the Request-URI is that integrity or authentication checks by the server may fail; since rewriting MUST be avoided in this case, it may as well be proscribed in general.

  Note: server writers SHOULD be aware that some existing proxies do some rewriting.
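The three Request-URI forms and the forwarding rule for an empty path can be sketched as follows. This is an illustrative Python fragment, not a definitive implementation: the function names are hypothetical, the Request-Line is assumed to have had its final CRLF removed already, and OPTIONS is assumed to be the only method that supports the asterisk form, since it is the only one the text shows using it.

```python
from urllib.parse import urlsplit

# Assumption: only OPTIONS is shown with the "*" form in the text.
ASTERISK_METHODS = {"OPTIONS"}


def parse_request_line(line):
    """Split 'Method SP Request-URI SP HTTP-Version' into its three elements."""
    parts = line.split(" ")
    if len(parts) != 3 or "\r" in line or "\n" in line:
        raise ValueError("malformed Request-Line")
    method, request_uri, http_version = parts
    return method, request_uri, http_version


def classify_request_uri(request_uri):
    """Name the form used: 'asterisk', 'abs_path' or 'absoluteURI'."""
    if request_uri == "*":
        return "asterisk"
    if request_uri.startswith("/"):
        return "abs_path"
    if urlsplit(request_uri).scheme:
        return "absoluteURI"
    raise ValueError("illegal Request-URI")


def origin_form(method, absolute_uri):
    """Return (host, port, request_uri) as the last proxy would send it."""
    parts = urlsplit(absolute_uri)
    path = parts.path
    if not path:
        # A null abs_path becomes "*" where the method allows it, else "/".
        path = "*" if method in ASTERISK_METHODS else "/"
    if parts.query and path != "*":
        path += "?" + parts.query
    return parts.hostname, parts.port or 80, path
```

Apart from supplying a value for a null path, the sketch passes the path through untouched, which matches the rule that proxies do not rewrite the abs_path in forwarded requests.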
5.2 Request Header Fields

The request header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method (procedure) invocation.

       Request-Header = Accept                   ; Section 10.1
                      | Accept-Charset           ; Section 10.2
                      | Accept-Encoding          ; Section 10.3
                      | Accept-Language          ; Section 10.4
                      | Authorization            ; Section 10.6
                      | From                     ; Section 10.21
                      | Host                     ; Section 10.22
                      | If-Modified-Since        ; Section 10.23
                      | Proxy-Authorization      ; Section 10.31
                      | Range                    ; Section 10.33
                      | Referer                  ; Section 10.34
                      | User-Agent               ; Section 10.43
                      | Max-Forwards             ; Section 10.45

Request-Header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields MAY be given the semantics of request header fields if all parties in the communication recognize them to be request header fields. Unrecognized header fields are treated as Entity-Header fields.

6. Response

After receiving and interpreting a request message, a server responds in the form of an HTTP response message.

       Response       = Full-Response

       Full-Response  = Status-Line              ; Section 6.1
                        *( General-Header        ; Section 4.3
                         | Response-Header       ; Section 6.2
                         | Entity-Header )       ; Section 7.1
                        CRLF
                        [ Entity-Body ]          ; Section 7.2

6.1 Status-Line

The first line of a Full-Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.

       Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

6.1.1 Status Code and Reason Phrase

The Status-Code element is a 3-digit integer result code of the attempt to understand and satisfy the request. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata and the Reason-Phrase is intended for the human user. The client is not required to examine or display the Reason-Phrase.

The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. There are 5 values for the first digit:

  . 1xx: Informational - Request received, continuing process

  . 2xx: Success - The action was successfully received, understood, and accepted

  . 3xx: Redirection - Further action must be taken in order to complete the request

  . 4xx: Client Error - The request contains bad syntax or cannot be fulfilled

  . 5xx: Server Error - The server failed to fulfill an apparently valid request

The individual values of the numeric status codes defined for HTTP/1.1, and an example set of corresponding Reason-Phrases, are presented below. The reason phrases listed here are only recommended -- they may be replaced by local equivalents without affecting the protocol. These codes are fully defined in Section 9.
       Status-Code    = "100"   ; Continue
                      | "101"   ; Switching Protocols
                      | "200"   ; OK
                      | "201"   ; Created
                      | "202"   ; Accepted
                      | "203"   ; Non-Authoritative Information
                      | "204"   ; No Content
                      | "205"   ; Reset Content
                      | "206"   ; Partial Content
                      | "300"   ; Multiple Choices
                      | "301"   ; Moved Permanently
                      | "302"   ; Moved Temporarily
                      | "303"   ; See Other
                      | "304"   ; Not Modified
                      | "305"   ; Use Proxy
                      | "400"   ; Bad Request
                      | "401"   ; Unauthorized
                      | "402"   ; Payment Required
                      | "403"   ; Forbidden
                      | "404"   ; Not Found
                      | "405"   ; Method Not Allowed
                      | "406"   ; Not Acceptable
                      | "407"   ; Proxy Authentication Required
                      | "408"   ; Request Time-out
                      | "409"   ; Conflict
                      | "410"   ; Gone
                      | "411"   ; Length Required
                      | "412"   ; Precondition Failed
                      | "413"   ; Request Entity Too Large
                      | "414"   ; Request URI Too Large
                      | "415"   ; Unsupported Media Type
                      | "416"   ; None Acceptable
                      | "500"   ; Internal Server Error
                      | "501"   ; Not Implemented
                      | "502"   ; Bad Gateway
                      | "503"   ; Service Unavailable
                      | "504"   ; Gateway Time-out
                      | "505"   ; HTTP Version not supported
                      | extension-code

       extension-code = 3DIGIT

       Reason-Phrase  = *<TEXT, excluding CR, LF>

HTTP status codes are extensible. HTTP applications are not required to understand the meaning of all registered status codes, though such understanding is obviously desirable. However, applications MUST understand the class of any status code, as indicated by the first digit, and treat any unrecognized response as being equivalent to the x00 status code of that class, with the exception that an unrecognized response MUST NOT be cached. For example, if an unrecognized status code of 431 is received by the client, it can safely assume that there was something wrong with its request and treat the response as if it had received a 400 status code.
In such cases, user agents SHOULD present to the user the entity returned with the response, since that entity is likely to include human-readable information which will explain the unusual status.

6.2 Response Header Fields

The response header fields allow the server to pass additional information about the response which cannot be placed in the Status-Line. These header fields give information about the server and about further access to the resource identified by the Request-URI.

       Response-Header = Location                ; Section 10.27
                       | Proxy-Authenticate      ; Section 10.30
                       | Public                  ; Section 10.32
                       | Retry-After             ; Section 10.36
                       | Server                  ; Section 10.37
                       | WWW-Authenticate        ; Section 10.44

Response-Header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields MAY be given the semantics of response header fields if all parties in the communication recognize them to be response header fields. Unrecognized header fields are treated as Entity-Header fields.

7. Entity

Full-Request and Full-Response messages MAY transfer an entity within some requests and responses. An entity consists of Entity-Header fields and (usually) an Entity-Body. In this section, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity.

7.1 Entity Header Fields

Entity-Header fields define optional metainformation about the Entity-Body or, if no body is present, about the resource identified by the request.
       Entity-Header  = Allow                    ; Section 10.5
                      | Content-Base             ; Section 10.9
                      | Content-Encoding         ; Section 10.10
                      | Content-Language         ; Section 10.11
                      | Content-Length           ; Section 10.12
                      | Content-Location         ; Section 10.16
                      | Content-MD5              ; Section 10.13
                      | Content-Range            ; Section 10.14
                      | Content-Type             ; Section 10.15
                      | Expires                  ; Section 10.19
                      | Last-Modified            ; Section 10.25
                      | Title                    ; Section 10.38
                      | Transfer-Encoding        ; Section 10.39
                      | extension-header

       extension-header = HTTP-header

The extension-header mechanism allows additional Entity-Header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient. Unrecognized header fields SHOULD be ignored by the recipient and forwarded by proxies.

7.2 Entity Body

The entity body (if any) sent with an HTTP request or response is in a format and encoding defined by the Entity-Header fields.

       Entity-Body    = *OCTET

An entity body is included with a request message only when the request method calls for one. The presence of an entity body in a request is signaled by the inclusion of a Content-Length and/or Content-Type header field in the request message headers.

For response messages, whether or not an entity body is included with a message is dependent on both the request method and the response code. All responses to the HEAD request method MUST NOT include a body, even though the presence of entity header fields may lead one to believe they do. All 1xx (informational), 204 (no content), and 304 (not modified) responses MUST NOT include a body. All other responses MUST include an entity body or a Content-Length header field defined with a value of zero (0).
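The rule for when a response carries an entity body can be written as a single predicate. The following Python sketch is offered only as an illustration; the function name is hypothetical, and the status code is assumed to have been parsed to an integer already.

```python
def response_may_have_body(request_method: str, status_code: int) -> bool:
    """Apply the rule of section 7.2 to one request/response pair."""
    if request_method == "HEAD":
        return False  # never a body, whatever the entity headers suggest
    if 100 <= status_code <= 199:
        return False  # 1xx (informational)
    if status_code in (204, 304):
        return False  # no content, not modified
    # All other responses carry a body or a Content-Length of zero.
    return True
```

A recipient can call this before looking at any entity header fields, since the headers of a HEAD response may describe a body that is never sent.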
7.2.1 Type

When an entity body is included with a message, the data type of that body is determined via the header fields Content-Type, Content-Encoding, and Transfer-Encoding. These define a three-layer, ordered encoding model:

       entity-body := Transfer-Encoding( Content-Encoding( Content-Type( data ) ) )

The default for both encodings is none (i.e., the identity function). Content-Type specifies the media type of the underlying data. Content-Encoding may be used to indicate any additional content codings applied to the type, usually for the purpose of data compression, that are a property of the resource requested. Transfer-Encoding may be used to indicate any additional transfer codings applied by an application to ensure safe and proper transfer of the message. Note that Transfer-Encoding is a property of the message, not of the resource.

Any HTTP/1.1 message containing an entity body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type header, the recipient may attempt to guess the media type via inspection of its content and/or the name extension(s) of the URL used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type _application/octet-stream_.

7.2.2 Length

When an entity body is included with a message, the length of that body may be determined in one of several ways. If a Content-Length header field is present, its value in bytes represents the length of the entity body. Otherwise, the body length is determined by the Transfer-Encoding (if the _chunked_ transfer coding has been applied) or by the server closing the connection.

  Note: Any response message which MUST NOT include an entity body (such as the 1xx, 204, and 304 responses and any response to a HEAD request) is always terminated by the first empty line after the header fields, regardless of the entity header fields present in the message.
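As a rough illustration of how a recipient might choose among these ways of delimiting a body, consider the Python sketch below. It rests on several assumptions that are not part of the specification: the function name and the returned labels are hypothetical, header field names are assumed to be lower-cased keys of a dictionary, and the chunked coding is tested before Content-Length because section 7.2.2 goes on to require that Content-Length be ignored when both are received.

```python
def body_framing(headers, is_request, body_allowed=True):
    """Return (how the body ends, length in bytes if known).

    headers: dict with lower-cased field names (an assumption of this sketch).
    body_allowed: False for 1xx, 204, 304 and responses to HEAD.
    """
    if not body_allowed:
        return ("none", 0)  # message ends at the first empty line

    codings = [c.strip().lower()
               for c in headers.get("transfer-encoding", "").split(",")
               if c.strip()]
    if "chunked" in codings:
        return ("chunked", None)  # any Content-Length is ignored

    if "content-length" in headers:
        return ("content-length", int(headers["content-length"]))

    if is_request:
        # Closing the connection cannot end a request body.
        if "content-type" in headers:
            # A body is signaled but its length is unknown: 400 or 411.
            return ("undetermined", None)
        return ("none", 0)

    return ("close", None)  # response body runs until the server closes
```

The "undetermined" result corresponds to the case in which a server answers with 400 (bad request) or 411 (length required).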
Closing the connection cannot be used to indicate the end of a request body, since it leaves no possibility for the server to send back a response. For compatibility with HTTP/1.0 applications, HTTP/1.1 requests containing an entity body MUST include a valid Content-Length header field unless the server is known to be HTTP/1.1 compliant. HTTP/1.1 servers MUST accept the _chunked_ transfer coding (Section 3.6), thus allowing this mechanism to be used for a request when Content-Length is unknown.

If a request contains an entity body and Content-Length is not specified, the server SHOULD respond with 400 (bad request) if it cannot determine the length of the request message's content, or with 411 (length required) if it wishes to insist on receiving a valid Content-Length.

Messages MUST NOT include both a Content-Length header field and the _chunked_ transfer coding. If both are received, the Content-Length MUST be ignored.

When a Content-Length is given in a message where an entity body is allowed, its field value MUST exactly match the number of OCTETs in the entity body. HTTP/1.1 user agents MUST notify the user when an invalid length is received and detected.

8. Method Definitions

The set of common methods for HTTP/1.1 is defined below. Although this set can be expanded, additional methods cannot be assumed to share the same semantics for separately extended clients and servers.

The Host request-header field (Section 10.22) MUST accompany all HTTP/1.1 requests.

8.1 OPTIONS

The OPTIONS method represents a request for information about the communication options available on the request/response chain identified by the Request-URI. This method allows the client to determine the options and/or requirements associated with a resource, or the capabilities of a server, without implying a resource action or initiating a resource retrieval.
Unless the server's response is an error, the response MUST NOT include entity information other than what can be considered as communication options (e.g., Allow is appropriate, but Content-Type is not) and MUST include a Content-Length with a value of zero (0). Responses to this method are not cachable.

If the Request-URI is an asterisk (_*_), the OPTIONS request is intended to apply to the server as a whole. A 200 response SHOULD include any header fields which indicate optional features implemented by the server (e.g., Public), including any extensions not defined by this specification, in addition to any applicable general or response header fields. As described in Section 5.1.2, an _OPTIONS *_ request can be applied through a proxy by specifying the destination server in the Request-URI without any path information.

If the Request-URI is not an asterisk, the OPTIONS request applies only to the options that are available when communicating with that resource. A 200 response SHOULD include any header fields which indicate optional features implemented by the server and applicable to that resource (e.g., Allow), including any extensions not defined by this specification, in addition to any applicable general or response header fields. If the OPTIONS request passes through a proxy, the proxy MUST edit the response to exclude those options known to be unavailable through that proxy.

8.2 GET

The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response and not the source text of the process, unless that text happens to be the output of the process.

The semantics of the GET method change to a _conditional GET_ if the request message includes an If-Modified-Since header field.
A conditional GET method requests that the identified resource be transferred only if it has been modified since the date given by the If-Modified-Since header, as described in Section 10.23. The conditional GET method is intended to reduce unnecessary network usage by allowing cached entities to be refreshed without requiring multiple requests or transferring data already held by the client.

The semantics of the GET method change to a _partial GET_ if the request message includes a Range header field. A partial GET requests that only part of the identified resource be transferred, as described in Section 10.33. The partial GET method is intended to reduce unnecessary network usage by allowing partially-retrieved entities to be completed without transferring data already held by the client.

The response to a GET request may be cachable if and only if it meets the requirements for HTTP caching described in Section 13.

8.3 HEAD

The HEAD method is identical to GET except that the server MUST NOT return any Entity-Body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the resource identified by the Request-URI without transferring the Entity-Body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

The response to a HEAD request may be cachable in the sense that the information contained in the response may be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current resource (as would be indicated by a change in Content-Length, Content-MD5, or Content-Version), then the cache MUST discard the cached entity.
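The use of a HEAD response to update or discard a cached entity might look like the following Python sketch. It is illustrative only: the function name is hypothetical, header fields are assumed to be held in dictionaries keyed by lower-cased field name, and a field is treated as changed only when it is present in both the cached headers and the HEAD response with different values, which is one possible reading of _a change in_ those fields.

```python
# Fields whose change indicates that the cached entity is out of date.
CHANGE_INDICATORS = ("content-length", "content-md5", "content-version")


def refresh_from_head(cached_headers, head_headers):
    """Return the updated header set, or None if the entity must be discarded."""
    for name in CHANGE_INDICATORS:
        if name in cached_headers and name in head_headers:
            if cached_headers[name] != head_headers[name]:
                return None  # cached entity differs from the current resource
    updated = dict(cached_headers)
    updated.update(head_headers)  # newer metainformation replaces the old
    return updated
```

A cache would call this with the stored headers and the headers of the HEAD response, and drop the stored entity whenever the result is None.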
There is no _conditional HEAD_ or _partial HEAD_ request analogous to those associated with the GET method. If an If-Modified-Since and/or Range header field is included with a HEAD request, they SHOULD be ignored.

8.4 POST

The POST method is used to request that the destination server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:

  . Annotation of existing resources;

  . Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;

  . Providing a block of data, such as the result of submitting a form [5], to a data-handling process;

  . Extending a database through an append operation.

The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database.

For compatibility with HTTP/1.0 applications, all POST requests MUST include a valid Content-Length header field unless the server is known to be HTTP/1.1 compliant. When sending a POST request to an HTTP/1.1 server, a client MUST use a valid Content-Length or the _chunked_ Transfer-Encoding. The server SHOULD respond with a 400 (bad request) message if it cannot determine the length of the request message's content, or with 411 (length required) if it wishes to insist on receiving a valid Content-Length.

A successful POST does not require that the entity be created as a resource on the origin server or made accessible for future reference. That is, the action performed by the POST method might not result in a resource that can be identified by a URI.
In this case, either 200 (ok) or 204 (no content) is the appropriate response status, depending on whether or not the response includes an entity that describes the result.

If a resource has been created on the origin server, the response SHOULD be 201 (created) and contain an entity (preferably of type _text/html_) which describes the status of the request and refers to the new resource.

Responses to this method are not cachable. However, the 303 (see other) response can be used to direct the user agent to retrieve a cachable resource.

POST requests must obey the entity transmission requirements set out in section 8.4.1.

8.4.1 SLUSHY: Entity Transmission Requirements

The following rules apply to any method that is subject to the two-phase mechanism.

Upon receiving such a method from an HTTP/1.1 (or later) client, an HTTP/1.1 (or later) server immediately either responds with _100 Continue_ and continues to read from the input stream, or responds with an error status. If it responds with an error status, it MAY close the transport (TCP) connection or it MAY continue to read and discard the rest of the request. It MUST NOT perform the requested action if it returns an error status.

HTTP/1.1 servers are encouraged to maintain persistent connections and use TCP's flow control mechanisms to resolve temporary overloads, rather than terminating connections with the expectation that clients will retry. The latter technique can exacerbate network congestion.

An HTTP/1.1 (or later) client doing a PUT-like method SHOULD monitor the network connection for an error status while it is transmitting the body of the request including any encoding mechanism used to transmit the body. If the client sees an error status, it SHOULD immediately cease transmitting the body.
If the body was preceded by a Content-Length header, the client MUST either close the connection or, if the body is being sent using the _chunked_ encoding, send a zero-length chunk to mark the end of the message.

An HTTP/1.1 (or later) client MUST be prepared to accept a 100 Continue status followed by a regular response.

An HTTP/1.1 (or later) client that sees the connection close before receiving any status from the server SHOULD retry the request, but if it does so, it MUST use the two-phase mechanism. In the two-phase mechanism, the client first sends the request headers, then waits for the server to respond with either a 100 Continue, in which case the client SHOULD continue, or an error status, in which case the client MUST NOT continue and MUST close the connection if it has not already completed sending the full request body including any encoding mechanism used to transmit the body.

If the client knows that the server is an HTTP/1.1 (or later) server, because of the server protocol version returned with a previous request on the same persistent connection [alternatively: within the past hours], it MUST wait for a response. If the client believes that the server is a 1.0 or earlier server, it SHOULD continue transmitting its request after waiting at least [5] seconds for a status response.

An HTTP/1.1 (or later) client that sees the connection close after receiving a _100 Continue_ but before receiving any other status SHOULD retry the request, and need not use the two-phase method (but MAY do so if this simplifies the implementation).

An HTTP/1.1 (or later) server that receives a request from a 1.0 (or earlier) client MUST NOT transmit the _100 Continue_ response; it SHOULD either wait for the request to be completed normally (thus avoiding an i