Copyright © 2002 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
An important principle of Web architecture is that all important resources be identifiable by URI. This finding discusses the importance of using GET for safe operations on the Web, so that those resources may be identified by a URI. The finding also discusses some practical limitations to this general principle.
Note: This document has been superseded by the 22 September 2003 version of this finding.
This document has been produced by the W3C Technical Architecture Group (TAG). This finding addresses TAG issue whenToUseGet-7.
This finding was accepted by the TAG at its 10 June 2002 teleconference. The TAG originally reached consensus on this finding at its 20 May 2002 teleconference. At its 16 Dec 2002 teleconference, the TAG agreed to add a publication date to this document, consistent with the TAG's expectation that findings no longer be modified in place.
Additional TAG findings, both approved and in draft state, may also be available. The TAG expects to incorporate this and other findings into a Web Architecture Document that will be published according to the process of the W3C Recommendation Track.
The terms MUST, SHOULD, and SHOULD NOT are used in this document in accordance with RFC 2119 [RFC2119].
Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).
It is possible to share information using Web technologies without giving that information a URI, but it's not optimal. For example, a product catalog can be built using an HTML form where the client provides a product number to the server in an HTTP POST request, and information about the product comes back in the response. But that design does not allow the client to make a link to the information about the product, bookmark it, or use it with any of the many Web technologies (e.g., XSLT's document() function, RDF assertions, XLink, etc.) that depend on information being URI-addressable.
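As a rough illustration of the alternative, the same catalog lookup can be expressed as a GET request whose URI carries the product number, so the result can be linked to and bookmarked. The sketch below is not part of the original finding; the host catalog.example.org, the path /product, and the parameter name number are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical catalog endpoint, used for illustration only.
CATALOG_BASE = "http://catalog.example.org/product"


def product_uri(product_number):
    """Build the URI that a GET form submission would produce for one product."""
    return CATALOG_BASE + "?" + urlencode({"number": product_number})


def product_number_from(uri):
    """Recover the product number from such a URI, as a server might."""
    return parse_qs(urlsplit(uri).query)["number"][0]


uri = product_uri("A-1042")
print(uri)                       # a URI that can be linked, bookmarked, or cited
print(product_number_from(uri))  # the input is fully recoverable from the URI
```

With POST, the product number travels in the request body instead, so no such URI exists for the client to keep.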
HTML forms that use the GET method provide a URI for each combination of inputs. Section 17.13.1 of the HTML 4.01 Recommendation [HTML401] states (and the text goes back to HTML 2.0):
The "get" method should be used when the form is idempotent (i.e., causes no side-effects). Many database searches have no visible side-effects and make ideal applications for the "get" method.
Unfortunately, the term idempotent is misused there, and the term side-effects is stretched from its use in the design of programming languages. Section 9.1.1 of the HTTP/1.1 specification [RFC2616] is more precise on the matter:
Implementors should be aware that the software represents the user in their interactions over the Internet, and should be careful to allow the user to be aware of any actions they might take which may have an unexpected significance to themselves or others.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
If you use GET for operations with side-effects, you make your system insecure. For example, a malicious Web page publisher outside a firewall might put a URI in a page that, when dereferenced unwittingly by someone inside the firewall, could activate a function on another system within the firewall.
To elaborate on the principle that following links is safe, consider the following two designs for mailing list subscription confirmation.
Design 1:
Design 2 (incorrect):
The latter design performed an unsafe operation (list subscription) in response to a request with a safe method (following the link from the mail message with GET). If the user's mail agent pre-fetched pages to speed up browsing, the subscription would be confirmed without the knowledge and consent of the user. The HTTP specification makes it clear that the fault is with the server in this case; the user's mail agent is free to follow links without incurring obligations.
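The distinction between the safe and the unsafe request can be sketched as a single request handler. This is a minimal illustration under stated assumptions, not the design from the finding: the function handle_confirmation, the in-memory subscribers set, and the page markup are all hypothetical.

```python
from html import escape

# Hypothetical in-memory state; a real list manager would use persistent storage.
subscribers = set()


def handle_confirmation(method, address):
    """Respond to a request on a hypothetical subscription-confirmation resource.

    GET and HEAD only retrieve a page. The subscription itself happens only
    when the user explicitly submits the form, which sends a POST.
    Returns a (status, body) pair.
    """
    if method in ("GET", "HEAD"):
        # Safe: nothing changes, so pre-fetching this URI is harmless.
        # (A real server would omit the body when answering HEAD.)
        page = (
            "<form method='post'>"
            "<p>Confirm subscription for " + escape(address) + "?</p>"
            "<input type='submit' value='Subscribe'>"
            "</form>"
        )
        return 200, page
    if method == "POST":
        # Unsafe operation, explicitly requested by the user.
        subscribers.add(address)
        return 200, escape(address) + " is now subscribed."
    return 405, "Method Not Allowed"
```

A mail agent that pre-fetches the link issues only the GET, which leaves subscribers unchanged; the address is added only after the user submits the form.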
This is not to say that there are never any obligations related to following links; only that the obligations must be accepted some other way than requesting to follow a link.
Obligations of confidentiality can be established in a straightforward manner as follows:
Web sites that say "by following the link to ABC, you agree to the following terms and conditions" do not account for the fact that anyone (in particular, a search service) can make another link to ABC, and anyone who follows this other link to ABC may never have seen the terms and conditions.
Web application design should be informed by the above principles, but also by the relevant limitations.
The W3C HTML validation service provides an example: the norm is that validation requests are done by reference; the form uses GET, which gives the results a URI for bookmarks, links, etc; but the service also allows clients to upload a document for validation. In that case, the form uses POST, since
Whether or not GET with HTTP is used for the initial access, supplying a URI for subsequent access to the same information, e.g., using Content-Location, is useful.
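One way to picture this is a service that accepts an uploaded document with POST and labels the response with a URI where the same report can be fetched again later. The sketch below is only an assumption about how such a service might be arranged; it does not describe the W3C validation service, and the /results/ path, the results store, and both function names are hypothetical.

```python
import hashlib

# Hypothetical store of reports, keyed by a digest of the uploaded document.
results = {}


def validate_upload(document):
    """Handle a POSTed document (bytes) and name the resulting report with a URI.

    Returns (status, headers, body). The validation itself is a placeholder.
    """
    key = hashlib.sha256(document).hexdigest()[:16]
    report = "Checked %d bytes." % len(document)  # placeholder for real validation
    results[key] = report
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        # URI at which the same report can later be retrieved with GET.
        "Content-Location": "/results/" + key,
    }
    return 200, headers, report


def get_result(path):
    """Serve a stored report for a later GET on the Content-Location URI."""
    key = path.rsplit("/", 1)[-1]
    if key in results:
        return 200, results[key]
    return 404, "Not Found"
```

The client that uploaded the document can then bookmark or share the Content-Location URI even though the initial request was a POST.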
The case of large parameters to a safe operation is not directly addressed by HTTP as it is presently deployed. A QUERY or "safe POST" or "GET with BODY" method has been discussed (e.g., at the December 1996 IETF meeting) but no consensus has emerged.
WebDAV [RFC2518] uses a different HTTP method, PROPFIND (section 8.1 PROPFIND), for querying properties of resources; unfortunately, this provides no URI for the results of these queries.
Designers of HTML forms that accept non-ASCII characters have been challenged by some implementation limitations and gaps in specifications. Implementation limitations are length-related. Section 17.13.4 of HTML 4.01 [HTML401] on multipart/form-data says:
The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters.
This inefficiency is due to the octet-to-%hh escape conversion, combined with the fact that many characters need more than one octet to be encoded. But while somewhat inefficient, this is not a real obstacle to using GET for non-ASCII characters.
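The size of the overhead can be seen with a short sketch that uses Python's standard urllib.parse.quote, which encodes characters as UTF-8 by default before escaping each octet. The helper escaped_length is hypothetical and exists only for this illustration.

```python
from urllib.parse import quote


def escaped_length(text):
    """Length of text after UTF-8 encoding and octet-to-%hh escaping."""
    return len(quote(text, safe=""))


print(quote("a", safe=""))   # a          (unreserved ASCII is left alone)
print(quote("é", safe=""))   # %C3%A9     (two UTF-8 octets, three characters each)
print(quote("日", safe=""))  # %E6%97%A5  (three UTF-8 octets)
print(escaped_length("日"))  # 9
```

One non-ASCII character can therefore occupy six to nine characters of the URI, which is wasteful but still workable.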
A more serious problem is that the mapping between characters and octets is not clearly specified beyond US-ASCII; refer to section 2.1 of the URI specification [RFC2396]. For query parts (parts after the '?') resulting from filling in an HTML form, the default is to use the character encoding of the form. The definition of the accept-charset attribute on the form element in HTML 4.01 [HTML401] says:
The default value for this attribute is the reserved string "UNKNOWN". User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element.
The general direction to address this limitation is to converge to using UTF-8 for the mapping between characters and octets. The use of UTF-8 is already defined in various specifications, and we expect it to be adopted in future specifications and further deployed in due course. For instance, we expect XForms to specify that the encoding to be used in query parts is always UTF-8.
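The ambiguity in the character-to-octet mapping can be made concrete with a small sketch. It assumes a form with a single field named q (a hypothetical name) and compares the query part produced under two character encodings, using Python's standard urllib.parse functions.

```python
from urllib.parse import urlencode, unquote

form_data = {"q": "café"}  # hypothetical form field and value

# The same characters produce different octets, and so different query parts.
latin1_query = urlencode(form_data, encoding="iso-8859-1")
utf8_query = urlencode(form_data, encoding="utf-8")
print(latin1_query)  # q=caf%E9
print(utf8_query)    # q=caf%C3%A9

# A receiver that guesses the wrong encoding cannot recover the text.
print(unquote("caf%E9", encoding="utf-8"))          # 'caf' + U+FFFD replacement character
print(unquote("caf%C3%A9", encoding="iso-8859-1"))  # cafÃ©
print(unquote("caf%C3%A9", encoding="utf-8"))       # café
```

If both sides always use UTF-8, the query part no longer depends on the encoding of the page that contained the form.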
While Web application design must take into account the limitations of technology that is widely deployed at present, it should not treat these as architectural invariants. Some limitations are likely to fade away as bugs are fixed and the scope of interoperable specifications expands.
The use of HTTP for typical safe remote operations is not addressed by SOAP specifications as of this writing. For instance, from section 8.4.1.1.1 Requesting State of SOAP Adjuncts [SOAPADJUNCTS]:
HTTP Method: POST (the use of other HTTP methods is currently undefined in this binding).
Initial investigations into requirements and a proposed solution (SOAP HTTP GET Binding Version 0.1, Orchard, May 2002) suggest this limitation is straightforward to address; meanwhile, "the oft-quoted stock quote example" (Overview section) is misleading, since it suggests that HTTP POST is appropriate for this safe operation.
WSDL 1.1 [WSDL] provides a binding to HTTP GET, which makes it possible to respect the principle of using GET for safe operations. However, to represent safety in a more straightforward manner, it should be a property of operations themselves, not just a feature of bindings.
Thanks to David Orchard, Larry Masinter, Paul Prescod, Roy Fielding, Martin Dürst, and others for their feedback in response to the 15 April 2002 call for review.
Last modified: $Date: 2003/09/22 20:28:18 $ by $Author: ijacobs $. $Revision: 1.38 $