*** draft-ietf-http-v10-spec-04.txt Sun Oct 15 01:08:20 1995 --- draft-ietf-http-v10-spec-05.txt Mon Feb 19 12:29:53 1996 *************** *** 2,5 **** INTERNET-DRAFT R. Fielding, UC Irvine ! H. Frystyk, MIT/LCS ! Expires April 14, 1996 October 14, 1995 --- 2,5 ---- INTERNET-DRAFT R. Fielding, UC Irvine ! H. Frystyk, MIT/LCS ! Expires August 19, 1996 February 19, 1996 *************** *** 56,57 **** --- 56,58 ---- 1.3 Overall Operation + 1.4 HTTP and MIME *************** *** 120,127 **** 10.11 Location ! 10.12 MIME-Version ! 10.13 Pragma ! 10.14 Referer ! 10.15 Server ! 10.16 User-Agent ! 10.17 WWW-Authenticate --- 121,127 ---- 10.11 Location ! 10.12 Pragma ! 10.13 Referer ! 10.14 Server ! 10.15 User-Agent ! 10.16 WWW-Authenticate *************** *** 135,136 **** --- 135,137 ---- 12.4 Transfer of Sensitive Information + 12.5 Attacks Based On File and Path Names *************** *** 148,151 **** C.1 Conversion to Canonical Form - C.1.1 Representation of Line Breaks - C.1.2 Default Character Set C.2 Conversion of Date Formats --- 149,150 ---- *************** *** 153,155 **** --- 152,172 ---- C.4 No Content-Transfer-Encoding + C.5 HTTP Header Fields in Multipart Body-Parts + + Appendix D. Additional Features + D.1 Additional Request Methods + D.1.1 PUT + D.1.2 DELETE + D.1.3 LINK + D.1.4 UNLINK + D.2 Additional Header Field Definitions + D.2.1 Accept + D.2.2 Accept-Charset + D.2.3 Accept-Encoding + D.2.4 Accept-Language + D.2.5 Content-Language + D.2.6 Link + D.2.7 MIME-Version + D.2.8 Retry-After + D.2.9 Title + D.2.10 URI *************** *** 165,170 **** This specification reflects common usage of the protocol referred ! to as "HTTP/1.0". This specification is not intended to become an ! Internet standard; rather, it defines those features of the HTTP ! protocol that can reasonably be expected of any implementation ! which claims to be using HTTP/1.0. --- 183,191 ---- This specification reflects common usage of the protocol referred ! to as "HTTP/1.0". 
This specification describes the features that ! seem to be consistently implemented in most HTTP/1.0 clients and ! servers. The specification is split into two sections. Those ! features of HTTP for which implementations are usually consistent ! are described in the main body of this document. Those features ! which have few or inconsistent implementations are listed in ! Appendix D. *************** *** 271,276 **** considered a party to the HTTP communication, though the tunnel ! may have been initiated by an HTTP request. A tunnel is closed ! when both ends of the relayed connections are closed. Tunnels ! are used when a portal is necessary and the intermediary cannot, ! or should not, interpret the relayed communication. --- 292,297 ---- considered a party to the HTTP communication, though the tunnel ! may have been initiated by an HTTP request. The tunnel ceases to ! exist when both ends of the relayed connections are closed. ! Tunnels are used when a portal is necessary and the intermediary ! cannot, or should not, interpret the relayed communication. *************** *** 356,360 **** Not all responses are cachable, and some requests may contain ! modifiers which place special requirements on cache behavior. ! Historically, HTTP/1.0 applications have not adequately defined ! what is or is not a "cachable" response. --- 377,381 ---- Not all responses are cachable, and some requests may contain ! modifiers which place special requirements on cache behavior. Some ! HTTP/1.0 applications use heuristics to describe what is or is not ! a "cachable" response, but these rules are not standardized. *************** *** 370,379 **** ! Current practice requires that the connection be established by the ! client prior to each request and closed by the server after sending ! the response. Both clients and servers must be capable of handling ! cases where either party closes the connection prematurely, due to ! user action, automated time-out, or program failure. 
In any case, ! the closing of the connection by either or both parties always ! terminates the current request, regardless of its status. 2. Notational Conventions and Generic Grammar --- 391,410 ---- ! Except for experimental applications, current practice requires ! that the connection be established by the client prior to each ! request and closed by the server after sending the response. Both ! clients and servers should be aware that either party may close the ! connection prematurely, due to user action, automated time-out, or ! program failure, and should handle such closing in a predictable ! fashion. In any case, the closing of the connection by either or ! both parties always terminates the current request, regardless of ! its status. + 1.4 HTTP and MIME + + HTTP/1.0 uses many of the constructs defined for MIME, as defined + in RFC 1521 [5]. Appendix C describes the ways in which the context + of HTTP allows for different use of Internet Media Types than is + typically found in Internet mail, and gives the rationale for those + differences. + 2. Notational Conventions and Generic Grammar *************** *** 466,470 **** quoted-string), and between adjacent tokens and delimiters ! (tspecials), without changing the interpretation of a field. ! However, applications should attempt to follow "common form" ! when generating HTTP constructs, since there exist some implementations that fail to accept anything beyond the common --- 497,503 ---- quoted-string), and between adjacent tokens and delimiters ! (tspecials), without changing the interpretation of a field. At ! least one delimiter (tspecials) must exist between any two ! tokens, since they would otherwise be interpreted as a single ! token. However, applications should attempt to follow "common ! 
form" when generating HTTP constructs, since there exist some implementations that fail to accept anything beyond the common *************** *** 521,522 **** --- 554,560 ---- + Hexadecimal numeric characters are used in several protocol elements. + + HEX = "A" | "B" | "C" | "D" | "E" | "F" + | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT + Many HTTP/1.0 header field values consist of words separated by LWS *************** *** 537,538 **** --- 575,578 ---- fields containing "comment" as part of their field value definition. + In all other fields, parentheses are considered part of the field + value. *************** *** 616,618 **** forwarded; the proxy/gateway's response to that request must follow ! the normal server requirements. --- 656,658 ---- forwarded; the proxy/gateway's response to that request must follow ! the server requirements listed above. *************** *** 629,631 **** ! URIs in HTTP/1.0 can be represented in absolute form or relative to some known base URI [9], depending upon the context of their use. --- 669,671 ---- ! URIs in HTTP can be represented in absolute form or relative to some known base URI [9], depending upon the context of their use. *************** *** 656,658 **** ! pchar = uchar | ":" | "@" | "&" | "=" uchar = unreserved | escape --- 696,698 ---- ! pchar = uchar | ":" | "@" | "&" | "=" | "+" uchar = unreserved | escape *************** *** 660,670 **** ! escape = "%" hex hex ! hex = "A" | "B" | "C" | "D" | "E" | "F" ! | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT ! ! reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" ! safe = "$" | "-" | "_" | "." | "+" extra = "!" | "*" | "'" | "(" | ")" | "," ! national = --- 700,708 ---- ! escape = "%" HEX HEX ! reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" extra = "!" | "*" | "'" | "(" | ")" | "," ! safe = "$" | "-" | "_" | "." ! unsafe = CTL | SP | <"> | "#" | "%" | "<" | ">" ! national = *************** *** 683,685 **** ! 
http_URL = "http:" "//" host [ ":" port ] abs_path --- 721,723 ---- ! http_URL = "http:" "//" host [ ":" port ] [ abs_path ] *************** *** 696,698 **** present in the URL, it must be given as "/" when used as a ! Request-URI. --- 734,736 ---- present in the URL, it must be given as "/" when used as a ! Request-URI (Section 5.1.2). *************** *** 763,768 **** ! Note: HTTP/1.0 requirements for the date/time stamp format ! apply only to their usage within the protocol stream. ! Clients and servers are not required to use these formats ! for user presentation, request logging, etc. --- 801,806 ---- ! Note: HTTP requirements for the date/time stamp format apply ! only to their usage within the protocol stream. Clients and ! servers are not required to use these formats for user ! presentation, request logging, etc. *************** *** 790,791 **** --- 828,834 ---- + Note: This use of the term "character set" is more commonly + referred to as a "character encoding." However, since HTTP + and MIME share the same registry, it is important that the + terminology also be shared. + HTTP character sets are identified by case-insensitive tokens. The *************** *** 814,819 **** ! Note: This use of the term "character set" is more commonly ! referred to as a "character encoding." However, since HTTP ! and MIME share the same registry, it is important that the ! terminology also be shared. --- 857,862 ---- ! The character set of an entity body should be labelled as the ! lowest common denominator of the character codes used within that ! body, with the exception that no label is preferred over the labels ! US-ASCII or ISO-8859-1. *************** *** 845,849 **** "gzip" (GNU zip) developed by Jean-loup Gailly. This format is ! typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. Gzip is ! available from the GNU project at ! . --- 888,890 ---- "gzip" (GNU zip) developed by Jean-loup Gailly. This format is ! typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. 
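Reviewer note on the http_URL hunks above: abs_path becomes optional in the -05 grammar, and the accompanying text requires an absent abs_path to be given as "/" when the URL is used as a Request-URI (Section 5.1.2). The following is a minimal sketch of that rule, not text from either draft. The function name `request_uri_for_origin` is hypothetical, and the sketch assumes Python's standard `urllib.parse.urlsplit` is an acceptable stand-in for the draft's URL grammar.

```python
from urllib.parse import urlsplit


def request_uri_for_origin(url):
    """Return the abs_path form of an http URL for use as a Request-URI.

    Hypothetical helper: an absent abs_path is sent as "/".
    """
    parts = urlsplit(url)
    # Empty path means abs_path was omitted from the URL.
    path = parts.path or "/"
    # Keep the query component, which the grammar treats as part of abs_path.
    if parts.query:
        path += "?" + parts.query
    return path
```

For example, `request_uri_for_origin("http://www.w3.org")` yields "/", while a URL that already carries a path is passed through unchanged.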
*************** *** 863,871 **** field (Section 10.5) in order to provide open and extensible data ! typing. For mail applications, where there is no type negotiation ! between sender and recipient, it is reasonable to put strict limits ! on the set of allowed media types. With HTTP, where the sender and ! recipient can communicate directly, applications are allowed more ! freedom in the use of non-registered types. The following grammar ! for media types is a superset of that for MIME because it does not ! restrict itself to the official IANA and x-token types. --- 904,906 ---- field (Section 10.5) in order to provide open and extensible data ! typing. *************** *** 886,908 **** LWS must not be generated between the type and subtype, nor between ! an attribute and its value. ! Many current applications do not recognize media type parameters. ! Since parameters are a fundamental aspect of media types, this must ! be considered an error in those applications. Nevertheless, ! HTTP/1.0 applications should only use media type parameters when ! they are necessary to define the content of a message. ! If a given media-type value has been registered by the IANA, any ! use of that value must be indicative of the registered data format. ! Although HTTP allows the use of non-registered media types, such ! usage must not conflict with the IANA registry. Data providers are ! strongly encouraged to register their media types with IANA via the ! procedures outlined in RFC 1590 [13]. - All media-type's registered by IANA must be preferred over - extension tokens. However, HTTP does not limit applications to the - use of officially registered media types, nor does it encourage the - use of an "x-" prefix for unofficial types outside of explicitly - short experimental use between consenting applications. - 3.6.1 Canonicalization and Text Defaults --- 921,936 ---- LWS must not be generated between the type and subtype, nor between ! an attribute and its value. 
Upon receipt of a media type with an ! unrecognized parameter, a user agent should treat the media type as ! if the unrecognized parameter and its value were not present. ! Some older HTTP applications do not recognize media type ! parameters. HTTP/1.0 applications should only use media type ! parameters when they are necessary to define the content of a ! message. ! Media-type values are registered with the Internet Assigned Number ! Authority (IANA [15]). The media type registration process is ! outlined in RFC 1590 [13]. Use of non-registered media types is ! discouraged. 3.6.1 Canonicalization and Text Defaults *************** *** 909,958 **** ! Media types are registered in a canonical form. In general, entity ! bodies transferred via HTTP must be represented in the appropriate ! canonical form prior to transmission. If the body has been encoded ! via a Content-Encoding, the data must be in canonical form prior to ! that encoding. However, HTTP modifies the canonical form ! requirements for media of primary type "text" and for "application" ! types consisting of text-like records. ! HTTP redefines the canonical form of text media to allow multiple ! octet sequences to indicate a text line break. In addition to the ! preferred form of CRLF, HTTP applications must accept a bare CR or ! LF alone as representing a single line break in text media. ! Furthermore, if the text media is represented in a character set ! which does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use ! of whatever octet sequence(s) is defined by that character set to ! represent the equivalent of CRLF, bare CR, and bare LF. It is ! assumed that any recipient capable of using such a character set ! will know the appropriate octet sequence for representing line ! breaks within that character set. ! Note: This interpretation of line breaks applies only to the ! contents of an Entity-Body and only after any ! 
Content-Encoding has been removed. All other HTTP constructs ! use CRLF exclusively to indicate a line break. Content ! codings define their own line break requirements. ! A recipient of an HTTP text entity should translate the received ! entity line breaks to the local line break conventions before ! saving the entity external to the application and its cache; ! whether this translation takes place immediately upon receipt of ! the entity, or only when prompted by the user, is entirely up to ! the individual application. - HTTP also redefines the default character set for text media in an - entity body. If a textual media type defines a charset parameter - with a registered default value of "US-ASCII", HTTP changes the - default to be "ISO-8859-1". Since the ISO-8859-1 [18] character set - is a superset of US-ASCII [17], this has no effect upon the - interpretation of entity bodies which only contain octets within - the US-ASCII set (0 - 127). The presence of a charset parameter - value in a Content-Type header field overrides the default. - - It is recommended that the character set of an entity body be - labelled as the lowest common denominator of the character codes - used within a document, with the exception that no label is - preferred over the labels US-ASCII or ISO-8859-1. - 3.6.2 Multipart Types --- 937,978 ---- ! Internet media types are registered with a canonical form. In ! general, an Entity-Body transferred via HTTP must be represented in ! the appropriate canonical form prior to its transmission. If the ! body has been encoded with a Content-Encoding, the underlying data ! should be in canonical form prior to being encoded. ! Media subtypes of the "text" type use CRLF as the text line break ! when in canonical form. However, HTTP allows the transport of text ! media with plain CR or LF alone representing a line break when used ! consistently within the Entity-Body. HTTP applications must accept ! 
CRLF, bare CR, and bare LF as being representative of a line break ! in text media received via HTTP. ! ! In addition, if the text media is represented in a character set ! that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use ! of whatever octet sequences are defined by that character set to ! represent the equivalent of CR and LF for line breaks. This ! flexibility regarding line breaks applies only to text media in the ! Entity-Body; a bare CR or LF should not be substituted for CRLF ! within any of the HTTP control structures (such as header fields ! and multipart boundaries). ! The "charset" parameter is used with some media types to define the ! character set (Section 3.4) of the data. When no explicit charset ! parameter is provided by the sender, media subtypes of the "text" ! type are defined to have a default charset value of "ISO-8859-1" ! when received via HTTP. Data in character sets other than ! "ISO-8859-1" or its subsets must be labelled with an appropriate ! charset value in order to be consistently interpreted by the ! recipient. ! Note: Many current HTTP servers provide data using charsets ! other than "ISO-8859-1" without proper labelling. This ! situation reduces interoperability and is not recommended. ! To compensate for this, some HTTP user agents provide a ! configuration option to allow the user to change the default ! interpretation of the media type character set when no ! charset parameter is given. 3.6.2 Multipart Types *************** *** 964,975 **** each type in order to correctly interpret the purpose of each ! body-part. Ideally, an HTTP user agent should follow the same or ! similar behavior as a MIME user agent does upon receipt of a ! multipart type. ! As in MIME [5], all multipart types share a common syntax and must ! include a boundary parameter as part of the media type value. The ! 
message body is itself a protocol element and must therefore use ! only CRLF to represent line breaks between body-parts. Unlike in ! MIME, multipart body-parts may contain HTTP header fields which are ! significant to the meaning of that part. --- 984,996 ---- each type in order to correctly interpret the purpose of each ! body-part. An HTTP user agent should follow the same or similar ! behavior as a MIME user agent does upon receipt of a multipart ! type. HTTP servers should not assume that all HTTP clients are ! prepared to handle multipart types. ! All multipart types share a common syntax and must include a ! boundary parameter as part of the media type value. The message ! body is itself a protocol element and must therefore use only CRLF ! to represent line breaks between body-parts. Multipart body-parts ! may contain HTTP header fields which are significant to the meaning ! of that part. *************** *** 1085,1088 **** General-Header = Date ; Section 10.6 ! | MIME-Version ; Section 10.12 ! | Pragma ; Section 10.13 --- 1106,1108 ---- General-Header = Date ; Section 10.6 ! | Pragma ; Section 10.12 *************** *** 1092,1094 **** header fields if all parties in the communication recognize them to ! be general header fields. Unknown header fields are treated as Entity-Header fields. --- 1112,1114 ---- header fields if all parties in the communication recognize them to ! be general header fields. Unrecognized header fields are treated as Entity-Header fields. *************** *** 1148,1150 **** return the status code 501 (not implemented) if the method is ! unknown or not implemented. --- 1168,1170 ---- return the status code 501 (not implemented) if the method is ! unrecognized or not implemented. *************** *** 1174,1176 **** ! GET http://www.w3.org/hypertext/WWW/TheProject.html HTTP/1.0 --- 1194,1196 ---- ! GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0 *************** *** 1183,1185 **** ! 
GET /hypertext/WWW/TheProject.html HTTP/1.0 --- 1203,1205 ---- ! GET /pub/WWW/TheProject.html HTTP/1.0 *************** *** 1190,1192 **** The Request-URI is transmitted as an encoded string, where some ! characters may be escaped using the "% hex hex" encoding defined by RFC 1738 [4]. The origin server must decode the Request-URI in --- 1210,1212 ---- The Request-URI is transmitted as an encoded string, where some ! characters may be escaped using the "% HEX HEX" encoding defined by RFC 1738 [4]. The origin server must decode the Request-URI in *************** *** 1198,1201 **** information about the request, and about the client itself, to the ! server. All header fields are optional and conform to the generic ! HTTP-header syntax. --- 1218,1222 ---- information about the request, and about the client itself, to the ! server. These fields act as request modifiers, with semantics ! equivalent to the parameters on a programming language method ! (procedure) invocation. *************** *** 1204,1207 **** | If-Modified-Since ; Section 10.9 ! | Referer ; Section 10.14 ! | User-Agent ; Section 10.16 --- 1225,1228 ---- | If-Modified-Since ; Section 10.9 ! | Referer ; Section 10.13 ! | User-Agent ; Section 10.15 *************** *** 1211,1213 **** header fields if all parties in the communication recognize them to ! be request header fields. Unknown header fields are treated as Entity-Header fields. --- 1232,1234 ---- header fields if all parties in the communication recognize them to ! be request header fields. Unrecognized header fields are treated as Entity-Header fields. *************** *** 1320,1329 **** applications must understand the class of any status code, as ! indicated by the first digit, and treat any unknown response as ! being equivalent to the x00 status code of that class. For example, ! if an unknown status code of 421 is received by the client, it can ! safely assume that there was something wrong with its request and ! 
treat the response as if it had received a 400 status code. In such ! cases, user agents should present to the user the entity returned ! with the response, since that entity is likely to include ! human-readable information which will explain the unusual status. --- 1341,1352 ---- applications must understand the class of any status code, as ! indicated by the first digit, and treat any unrecognized response ! as being equivalent to the x00 status code of that class, with the ! exception that an unrecognized response must not be cached. For ! example, if an unrecognized status code of 431 is received by the ! client, it can safely assume that there was something wrong with ! its request and treat the response as if it had received a 400 ! status code. In such cases, user agents should present to the user ! the entity returned with the response, since that entity is likely ! to include human-readable information which will explain the ! unusual status. *************** *** 1333,1337 **** information about the response which cannot be placed in the ! Status-Line. These header fields are not intended to give ! information about an Entity-Body returned in the response, but ! about the server itself. --- 1356,1360 ---- information about the response which cannot be placed in the ! Status-Line. These header fields give information about the server ! and about further access to the resource identified by the ! Request-URI. *************** *** 1338,1341 **** Response-Header = Location ; Section 10.11 ! | Server ; Section 10.15 ! | WWW-Authenticate ; Section 10.17 --- 1361,1364 ---- Response-Header = Location ; Section 10.11 ! | Server ; Section 10.14 ! | WWW-Authenticate ; Section 10.16 *************** *** 1345,1348 **** header fields if all parties in the communication recognize them to ! be response header fields. Unknown header fields are treated as ! Entity-Header fields. --- 1368,1371 ---- header fields if all parties in the communication recognize them to ! 
be response header fields. Unrecognized header fields are treated ! as Entity-Header fields. *************** *** 1375,1377 **** fields cannot be assumed to be recognizable by the recipient. ! Unknown header fields should be ignored by the recipient and forwarded by proxies. --- 1398,1400 ---- fields cannot be assumed to be recognizable by the recipient. ! Unrecognized header fields should be ignored by the recipient and forwarded by proxies. *************** *** 1380,1383 **** ! The entity body (if any) sent with an HTTP/1.0 request or response ! is in a format and encoding defined by the Entity-Header fields. --- 1403,1406 ---- ! The entity body (if any) sent with an HTTP request or response is ! in a format and encoding defined by the Entity-Header fields. *************** *** 1531,1533 **** ! Applications must not cache responses to a POST request. --- 1554,1558 ---- ! Applications must not cache responses to a POST request because the ! application has no way of knowing that the server would return an ! equivalent response on some future request. *************** *** 1611,1617 **** taken by the user agent in order to fulfill the request. The action ! required can sometimes be carried out by the user agent without ! interaction with the user, but it is strongly recommended that this ! only take place if the method used in the request is GET or HEAD. A ! user agent should never automatically redirect a request more than ! 5 times, since such redirections usually indicate an infinite loop. --- 1636,1642 ---- taken by the user agent in order to fulfill the request. The action ! required may be carried out by the user agent without interaction ! with the user if and only if the method used in the subsequent ! request is GET or HEAD. A user agent should never automatically ! redirect a request more than 5 times, since such redirections ! usually indicate an infinite loop. 
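Reviewer note on the redirection hunk above: the -05 wording turns the earlier recommendation into an "if and only if" condition on the method of the subsequent request, and keeps the limit of 5 automatic redirections. A minimal sketch of that policy follows. The names `may_auto_redirect`, `SAFE_METHODS`, and `MAX_AUTO_REDIRECTS` are hypothetical and appear in neither draft.

```python
# Assumed constants taken from the prose of the hunk, not from any real library.
MAX_AUTO_REDIRECTS = 5
SAFE_METHODS = frozenset({"GET", "HEAD"})


def may_auto_redirect(method, redirects_so_far):
    """Return True if a 3xx response may be followed without asking the user.

    method: method of the subsequent request (case-sensitive token).
    redirects_so_far: number of automatic redirections already followed.
    """
    return method in SAFE_METHODS and redirects_so_far < MAX_AUTO_REDIRECTS
```

Under this sketch a POST is never redirected automatically, and a GET is refused once 5 redirections have already been followed.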
*************** *** 1648,1649 **** --- 1673,1678 ---- + Note: When automatically redirecting a POST request after + receiving a 301 status code, some existing user agents will + erroneously change it into a GET request. + 302 Moved Temporarily *************** *** 1663,1664 **** --- 1692,1697 ---- + Note: When automatically redirecting a POST request after + receiving a 302 status code, some existing user agents will + erroneously change it into a GET request. + 304 Not Modified *************** *** 1706,1708 **** The request requires user authentication. The response must include ! a WWW-Authenticate header field (Section 10.17) containing a challenge applicable to the requested resource. The client may --- 1739,1741 ---- The request requires user authentication. The response must include ! a WWW-Authenticate header field (Section 10.16) containing a challenge applicable to the requested resource. The client may *************** *** 1715,1717 **** should be presented the entity that was given in the response, ! since that entity may include relevent diagnostic information. HTTP access authentication is explained in Section 11. --- 1748,1750 ---- should be presented the entity that was given in the response, ! since that entity may include relevant diagnostic information. HTTP access authentication is explained in Section 11. *************** *** 2090,2115 **** ! 10.12 MIME-Version - HTTP is not a MIME-compliant protocol (see Appendix C). However, - HTTP/1.0 messages may include a single MIME-Version general-header - field to indicate what version of the MIME protocol was used to - construct the message. Use of the MIME-Version header field should - indicate that the message is in full compliance with the MIME - protocol (as defined in [5]). Unfortunately, some older versions of - HTTP/1.0 clients and servers use this field indiscriminately, and - thus recipients must not take it for granted that the message is - indeed in full compliance with MIME. 
Proxies and gateways are - responsible for ensuring this compliance (where possible) when - exporting HTTP messages to strict MIME environments. Future - HTTP/1.0 applications must only use MIME-Version when the message - is fully MIME-compliant. - - MIME-Version = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT - - MIME version "1.0" is the default for use in HTTP/1.0. However, - HTTP/1.0 message parsing and semantics are defined by this document - and not the MIME specification. - - 10.13 Pragma - The Pragma general-header field is used to include --- 2123,2126 ---- ! 10.12 Pragma The Pragma general-header field is used to include *************** *** 2139,2141 **** ! 10.14 Referer --- 2150,2152 ---- ! 10.13 Referer *************** *** 2167,2169 **** ! 10.15 Server --- 2178,2180 ---- ! 10.14 Server *************** *** 2191,2194 **** ! 10.16 User-Agent The User-Agent request-header field contains information about the --- 2202,2208 ---- ! Note: Some existing servers fail to restrict themselves to ! the product token syntax within the Server field. + 10.15 User-Agent + The User-Agent request-header field contains information about the *************** *** 2216,2219 **** ! 10.17 WWW-Authenticate The WWW-Authenticate response-header field must be included in 401 --- 2230,2236 ---- ! Note: Some existing clients fail to restrict themselves to ! the product token syntax within the User-Agent field. + 10.16 WWW-Authenticate + The WWW-Authenticate response-header field must be included in 401 *************** *** 2427,2428 **** --- 2444,2464 ---- + 12.5 Attacks Based On File and Path Names + + Implementations of HTTP origin servers should be careful to + restrict the documents returned by HTTP requests to be only those + that were intended by the server administrators. If an HTTP server + translates HTTP URIs directly into file system calls, the server + must take special care not to serve files that were not intended to + be delivered to HTTP clients. 
For example, Unix, Microsoft Windows, + and other operating systems use ".." as a path component to + indicate a directory level above the current one. On such a system, + an HTTP server must disallow any such construct in the Request-URI + if it would otherwise allow access to a resource outside those + intended to be accessible via the HTTP server. Similarly, files + intended for reference only internally to the server (such as + access control files, configuration files, and script code) must be + protected from inappropriate retrieval, since they might contain + sensitive information. Experience has shown that minor bugs in such + HTTP server implementations have turned into security risks. + 13. Acknowledgments *************** *** 2448,2449 **** --- 2484,2488 ---- + Paul Hoffman contributed sections regarding the informational + status of this document and Appendices C and D. + This document has benefited greatly from the comments of all those *************** *** 2465,2468 **** Larry Masinter Mitra ! Gavin Nicol Bill Perry ! Jeffrey Perry Owen Rees David Robinson Marc Salomon --- 2504,2508 ---- Larry Masinter Mitra ! Jeffrey Mogul Gavin Nicol ! Bill Perry Jeffrey Perry ! Owen Rees Luigi Rizzo David Robinson Marc Salomon *************** *** 2483,2490 **** Unifying Syntax for the Expression of Names and Addresses of ! Objects on the Network as used in the World-Wide Web." RFC ! 1630, CERN, June 1994. ! [3] T. Berners-Lee and D. Connolly. "HyperText Markup Language ! Specification - 2.0." Work in Progress ! (draft-ietf-html-spec-05.txt), MIT/W3C, August 1995. --- 2523,2529 ---- Unifying Syntax for the Expression of Names and Addresses of ! Objects on the Network as used in the World-Wide Web." ! RFC 1630, CERN, June 1994. ! [3] T. Berners-Lee and D. Connolly. "Hypertext Markup Language - ! 2.0." RFC 1866, MIT/W3C, November 1995. *************** *** 2560,2562 **** Cambridge, MA 02139, U.S.A. 
- Tel: +1 (617) 253 5702 Fax: +1 (617) 258 8682 --- 2599,2600 ---- *************** *** 2568,2570 **** Irvine, CA 92717-3425, U.S.A. - Tel: +1 (714) 824-4049 Fax: +1 (714) 824-4056 --- 2606,2607 ---- *************** *** 2577,2579 **** Cambridge, MA 02139, U.S.A. - Tel: +1 (617) 258 8143 Fax: +1 (617) 258 8682 --- 2614,2615 ---- *************** *** 2628,2631 **** However, we recommend that applications, when parsing such headers, ! recognize a single LF as a line terminator and ignore the leading ! CR. --- 2664,2666 ---- However, we recommend that applications, when parsing such headers, ! recognize a single LF as a line terminator and ignore the leading CR. *************** *** 2633,2635 **** ! HTTP/1.0 reuses many of the constructs defined for Internet Mail (RFC 822 [7]) and the Multipurpose Internet Mail Extensions --- 2668,2670 ---- ! HTTP/1.0 uses many of the constructs defined for Internet Mail (RFC 822 [7]) and the Multipurpose Internet Mail Extensions *************** *** 2636,2649 **** (MIME [5]) to allow entities to be transmitted in an open variety ! of representations and with extensible mechanisms. However, HTTP is ! not a MIME-compliant application. HTTP's performance requirements ! differ substantially from those of Internet mail. Since it is not ! limited by the restrictions of existing mail protocols and SMTP ! gateways, HTTP does not obey some of the constraints imposed by ! RFC 822 and MIME for mail transport. ! This appendix describes specific areas where HTTP differs from ! MIME. Proxies/gateways to MIME-compliant protocols must be aware of ! these differences and provide the appropriate conversions where ! necessary. C.1 Conversion to Canonical Form --- 2671,2691 ---- (MIME [5]) to allow entities to be transmitted in an open variety ! of representations and with extensible mechanisms. However, ! RFC 1521 discusses mail, and HTTP has a few features that are ! different than those described in RFC 1521. These differences were ! 
carefully chosen to optimize performance over binary connections, ! to allow greater freedom in the use of new media types, to make ! date comparisons easier, and to acknowledge the practice of some ! early HTTP servers and clients. ! At the time of this writing, it is expected that RFC 1521 will be ! revised. The revisions may include some of the practices found in ! HTTP/1.0 but not in RFC 1521. + This appendix describes specific areas where HTTP differs from RFC + 1521. Proxies and gateways to strict MIME environments should be + aware of these differences and provide the appropriate conversions + where necessary. Proxies and gateways from MIME environments to + HTTP also need to be aware of the differences because some + conversions may be required. + C.1 Conversion to Canonical Form *************** *** 2650,2733 **** ! MIME requires that an entity be converted to canonical form prior ! to being transferred, as described in Appendix G of RFC 1521 [5]. ! Although HTTP does require media types to be transferred in ! canonical form, it changes the definition of "canonical form" for ! text-based media types as described in Section 3.6.1. ! C.1.1 Representation of Line Breaks ! MIME requires that the canonical form of any text type represent ! line breaks as CRLF and forbids the use of CR or LF outside of line ! break sequences. Since HTTP allows CRLF, bare CR, and bare LF (or ! the octet sequence(s) to which they would be translated for the ! given character set) to indicate a line break within text content, ! recipients of an HTTP message cannot rely upon receiving ! MIME-canonical line breaks in text. ! Where it is possible, a proxy or gateway from HTTP to a ! MIME-compliant protocol should translate all line breaks within ! text/* media types to the MIME canonical form of CRLF. However, ! this may be complicated by the presence of a Content-Encoding and ! by the fact that HTTP allows the use of some character sets which ! 
do not use octets 13 and 10 to represent CR and LF, as is the case ! for some multi-byte character sets. If canonicalization is ! performed, the Content-Length header field value must be updated to ! reflect the new body length. ! C.1.2 Default Character Set ! MIME requires that all subtypes of the top-level Content-Type ! "text" have a default character set of US-ASCII [17]. In contrast, ! HTTP defines the default character set for "text" to be ! ISO-8859-1 [18] (a superset of US-ASCII). Therefore, if a text/* ! media type given in the Content-Type header field does not already ! include an explicit charset parameter, the parameter ! ;charset="iso-8859-1" ! should be added by the proxy/gateway if the entity contains any ! octets greater than 127. ! C.2 Conversion of Date Formats ! HTTP/1.0 uses a restricted subset of date formats to simplify the ! process of date comparison. Proxies/gateways from other protocols ! should ensure that any Date header field present in a message ! conforms to one of the HTTP/1.0 formats and rewrite the date if ! necessary. ! C.3 Introduction of Content-Encoding ! MIME does not include any concept equivalent to HTTP's ! Content-Encoding header field. Since this acts as a modifier on the ! media type, proxies/gateways to MIME-compliant protocols must ! either change the value of the Content-Type header field or decode ! the Entity-Body before forwarding the message. ! Note: Some experimental applications of Content-Type for ! Internet mail have used a media-type parameter of ! ";conversions=<content-coding>" to perform an equivalent ! function as Content-Encoding. However, this parameter is not ! part of the MIME specification at the time of this writing. ! C.4 No Content-Transfer-Encoding ! HTTP does not use the Content-Transfer-Encoding (CTE) field of ! MIME. Proxies/gateways from MIME-compliant protocols must remove ! any non-identity CTE ("quoted-printable" or "base64") encoding ! prior to delivering the response message to an HTTP client. !
Proxies/gateways to MIME-compliant protocols are responsible for ! ensuring that the message is in the correct format and encoding for ! safe transport on that protocol, where "safe transport" is defined ! by the limitations of the protocol being used. At a minimum, the ! CTE field of ! Content-Transfer-Encoding: binary ! should be added by the proxy/gateway if it is unwilling to apply a ! content transfer encoding. ! An HTTP client may include a Content-Transfer-Encoding as an ! extension Entity-Header in a POST request when it knows the ! destination of that request is a proxy/gateway to a MIME-compliant ! protocol. --- 2692,2877 ---- ! RFC 1521 requires that an Internet mail entity be converted to ! canonical form prior to being transferred, as described in Appendix ! G of RFC 1521 [5]. Section 3.6.1 of this document describes the ! forms allowed for subtypes of the "text" media type when ! transmitted over HTTP. ! RFC 1521 requires that content with a Content-Type of "text" ! represent line breaks as CRLF and forbids the use of CR or LF ! outside of line break sequences. HTTP allows CRLF, bare CR, and ! bare LF to indicate a line break within text content when a message ! is transmitted over HTTP. ! Where it is possible, a proxy or gateway from HTTP to a strict RFC ! 1521 environment should translate all line breaks within the text ! media types described in Section 3.6.1 of this document to the RFC ! 1521 canonical form of CRLF. Note, however, that this may be ! complicated by the presence of a Content-Encoding and by the fact ! that HTTP allows the use of some character sets which do not use ! octets 13 and 10 to represent CR and LF, as is the case for some ! multi-byte character sets. ! C.2 Conversion of Date Formats ! ! HTTP/1.0 uses a restricted set of date formats (Section 3.3) to ! simplify the process of date comparison. Proxies and gateways from ! other protocols should ensure that any Date header field present in ! 
a message conforms to one of the HTTP/1.0 formats and rewrite the ! date if necessary. ! ! C.3 Introduction of Content-Encoding ! ! RFC 1521 does not include any concept equivalent to HTTP/1.0's ! Content-Encoding header field. Since this acts as a modifier on the ! media type, proxies and gateways from HTTP to MIME-compliant ! protocols must either change the value of the Content-Type header ! field or decode the Entity-Body before forwarding the message. ! (Some experimental applications of Content-Type for Internet mail ! have used a media-type parameter of ";conversions=<content-coding>" ! to perform an equivalent function as Content-Encoding. However, ! this parameter is not part of RFC 1521.) ! ! C.4 No Content-Transfer-Encoding ! ! HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC ! 1521. Proxies and gateways from MIME-compliant protocols to HTTP ! must remove any non-identity CTE ("quoted-printable" or "base64") ! encoding prior to delivering the response message to an HTTP client. ! ! Proxies and gateways from HTTP to MIME-compliant protocols are ! responsible for ensuring that the message is in the correct format ! and encoding for safe transport on that protocol, where "safe ! transport" is defined by the limitations of the protocol being ! used. Such a proxy or gateway should label the data with an ! appropriate Content-Transfer-Encoding if doing so will improve the ! likelihood of safe transport over the destination protocol. ! C.5 HTTP Header Fields in Multipart Body-Parts ! In RFC 1521, most header fields in multipart body-parts are ! generally ignored unless the field name begins with "Content-". In ! HTTP/1.0, multipart body-parts may contain any HTTP header fields ! which are significant to the meaning of that part. ! D. Additional Features ! This appendix documents protocol elements used by some existing ! HTTP implementations, but not consistently and correctly across ! most HTTP/1.0 applications. Implementors should be aware of these !
features, but cannot rely upon their presence in, or ! interoperability with, other HTTP/1.0 applications. ! D.1 Additional Request Methods ! D.1.1 PUT ! The PUT method requests that the enclosed entity be stored under ! the supplied Request-URI. If the Request-URI refers to an already ! existing resource, the enclosed entity should be considered as a ! modified version of the one residing on the origin server. If the ! Request-URI does not point to an existing resource, and that URI is ! capable of being defined as a new resource by the requesting user ! agent, the origin server can create the resource with that URI. ! The fundamental difference between the POST and PUT requests is ! reflected in the different meaning of the Request-URI. The URI in a ! POST request identifies the resource that will handle the enclosed ! entity as data to be processed. That resource may be a ! data-accepting process, a gateway to some other protocol, or a ! separate entity that accepts annotations. In contrast, the URI in a ! PUT request identifies the entity enclosed with the request -- the ! user agent knows what URI is intended and the server should not ! apply the request to some other resource. ! D.1.2 DELETE ! The DELETE method requests that the origin server delete the ! resource identified by the Request-URI. ! D.1.3 LINK ! The LINK method establishes one or more Link relationships between ! the existing resource identified by the Request-URI and other ! existing resources. ! D.1.4 UNLINK ! The UNLINK method removes one or more Link relationships from the ! existing resource identified by the Request-URI. ! ! D.2 Additional Header Field Definitions ! ! D.2.1 Accept ! ! The Accept request-header field can be used to indicate a list of ! media ranges which are acceptable as a response to the request. The ! asterisk "*" character is used to group media types into ranges, ! with "*/*" indicating all media types and "type/*" indicating all ! subtypes of that type. 
The set of ranges given by the client should ! represent what types are acceptable given the context of the ! request. ! ! D.2.2 Accept-Charset ! ! The Accept-Charset request-header field can be used to indicate a ! list of preferred character sets other than the default US-ASCII ! and ISO-8859-1. This field allows clients capable of understanding ! more comprehensive or special-purpose character sets to signal that ! capability to a server which is capable of representing documents ! in those character sets. ! ! D.2.3 Accept-Encoding ! ! The Accept-Encoding request-header field is similar to Accept, but ! restricts the content-coding values which are acceptable in the ! response. ! ! D.2.4 Accept-Language ! ! The Accept-Language request-header field is similar to Accept, but ! restricts the set of natural languages that are preferred as a ! response to the request. ! ! D.2.5 Content-Language ! ! The Content-Language entity-header field describes the natural ! language(s) of the intended audience for the enclosed entity. Note ! that this may not be equivalent to all the languages used within ! the entity. ! ! D.2.6 Link ! ! The Link entity-header field provides a means for describing a ! relationship between the entity and some other resource. An entity ! may include multiple Link values. Links at the metainformation ! level typically indicate relationships like hierarchical structure ! and navigation paths. ! ! D.2.7 MIME-Version ! ! HTTP messages may include a single MIME-Version general-header ! field to indicate what version of the MIME protocol was used to ! construct the message. Use of the MIME-Version header field, as ! defined by RFC 1521 [5], should indicate that the message is ! MIME-conformant. Unfortunately, some older HTTP/1.0 servers send it ! indiscriminately, and thus this field should be ignored. ! ! D.2.8 Retry-After ! ! The Retry-After response-header field can be used with a 503 ! 
(service unavailable) response to indicate how long the service is ! expected to be unavailable to the requesting client. The value of ! this field can be either an HTTP-date or an integer number of ! seconds (in decimal) after the time of the response. ! ! D.2.9 Title ! ! The Title entity-header field indicates the title of the entity. ! ! D.2.10 URI ! ! The URI entity-header field may contain some or all of the Uniform ! Resource Identifiers (Section 3.2) by which the Request-URI ! resource can be identified. There is no guarantee that the resource ! can be accessed using the URI(s) specified. !
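
Reviewer illustration (not part of either draft): the D.2.8 text added above allows two value forms for Retry-After, an HTTP-date or a decimal number of seconds after the time of the response. The sketch below shows one way a client might normalize both forms to a delay in seconds. The function name `retry_after_seconds` and its parameters are hypothetical, the date parsing relies on Python's standard `email.utils.parsedate_to_datetime`, and only the RFC 1123 style of HTTP-date is exercised here. Treating a zone-less date as GMT and clamping past dates to zero are assumptions of this sketch, not requirements stated in the draft.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, response_time):
    """Return the number of seconds to wait, counted from response_time.

    value         -- raw Retry-After field value (hypothetical input)
    response_time -- timezone-aware datetime of the response
    """
    value = value.strip()
    if value.isascii() and value.isdigit():
        # Integer form: decimal seconds after the time of the response.
        return int(value)
    # Date form: an absolute HTTP-date that is converted to a delay.
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        # Assumption: a date without zone information is taken as GMT.
        when = when.replace(tzinfo=timezone.utc)
    delay = (when - response_time).total_seconds()
    # Assumption: a date in the past means no further wait is needed.
    return max(0, int(delay))


if __name__ == "__main__":
    received = datetime(1996, 2, 19, 12, 0, 0, tzinfo=timezone.utc)
    print(retry_after_seconds("120", received))
    print(retry_after_seconds("Mon, 19 Feb 1996 12:05:00 GMT", received))
```

With a response received at 12:00:00 GMT on 19 February 1996, the value "120" yields 120 and the date "Mon, 19 Feb 1996 12:05:00 GMT" yields 300. A fuller client would also need to accept the other HTTP/1.0 date formats from Section 3.3 and to handle malformed values.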