An important principle of Web architecture is that all important resources be identifiable by URI. The finding discusses the relationship between the URI addressability of a resource and the choice between HTTP GET and POST methods. HTTP GET promotes URI addressibility so, designers should adopt it for safe operations such as simple queries. POST is appropriate for other types of applications where a user request has the potential to change the state of the resource (or of related resources). The finding explains how to choose between GET and POST for an application taking into account architectural, security, and practical considerations.
This is the 09 May 2003 revision of this finding. It does not represent the consensus of the TAG. The purpose of the revision is to improve the clarity of the document over the previous version in the light of feedback. This draft covers some security issues not covered in the previous version. This draft also shifts emphasis away from GET to a more balanced consideration of GET and POST. Dan Connolly edited the 10 June 2002 version of this finding, which the TAG has already approved. Some of the content from that revision is reused here, and Dan has been instrumental in subsequent discussions about the current revision.
Additional TAG findings, both approved and in draft state, may also be available. The TAG expects to incorporate this and other findings into a Web Architecture Document that will be published according to the process of the W3C Recommendation Track.
The terms MUST, SHOULD, and SHOULD NOT are used in this document in accordance with [RFC2119].
Please send comments on this finding to the publicly archived TAG mailing list firstname.lastname@example.org (archive).
1.2 Relevant Principles of Web Architecture
1.3 Quick checklist for choosing GET or POST
2 The benefits of URI addressability
3 Safe interactions
3.1 Examples of using GET and POST
4 Considerations for sensitive data
5 Practical considerations
5.2 Ephemeral limitations
6 Ongoing work on GET in Web Services
In this finding, we discuss when it is appropriate to use HTTP GET and when it is appropriate to use HTTP POST, and the architectural principles behind the separation of these two methods.
In the introduction, we present scenarios that illustrate some of the topics covered, summarize the relevant principles of Web Architecture, and provide a quick checklist for choosing GET or POST. Following that, we discuss the benefits of URI addressability, which is the primary reason for distinguishing GET from POST. We then characterize safe interactions, when the use of GET is encouraged. However, when considering GET or POST, designers should also remain aware of considerations for sensitive data such as passwords or credit card information, and other practical considerations.
Scenario 1: Dan purchases a new wireless telephone and wishes to share information about it with his colleagues. Following the clue "CAT. NO. 43-893" on the bottom of the phone, Dan visits the Web site of the phone manufacturer. Typing the catalog number in the site's search box, Dan pulls up the relevant information. However, since the site designers created a POST Web form instead of a GET Web form for this read-only operation, Dan cannot bookmark the query, or send the query to his colleagues (who are potential customers) as a URI, or gain any of the other benefits made possible by the proper use of HTTP GET. The Web site designer should have provided GET access to the catalog information; see the benefits of URI addressibility.
Scenario 2: My bank allows me to read current information about my checking account over the Web. The URI they provide me for this access includes my account number. This is sensitive information I do not want to appear in the URI since, for example, when I follow a link from the bank Web site to the site of their advertiser, my account information may appear in the advertiser's server logs. Rather than moving my account number from the URI into the body of a POST request, my bank allows me to establish a secure HTTP connection. Not only do I feel more confident retrieving information over a secure connection, my browser knows not to expose sensitive information to the next site I visit after the secure connection. As a result, I can still bookmark my account page without revealing sensitive information. For more information, see the considerations for sensitive data.
Web architecture starts with Uniform Resource Identifiers (URI), whose generic syntax is defined by [RFC2396]. The Web relies on global agreement to follow the rules of URIs so that we can refer to things on the Web, access them, describe them, and share them. Providing a URI for a resource affords many advantages, including:
As described in [TBL50K], "Great multiplicative power of reuse derives from the fact that all languages use URIs as identifiers: This allows things written in one language to refer to things defined in another language. The use of URIs allows a language to leverage the many forms of persistence, identity, and various forms of equivalence."
In this finding, the term "URI addressability" means that a URI alone is sufficient for an agent to carry out a particular type of interaction.
In some cases, it may be desirable to interact in different ways with the resource identified by a URI. For example, the W3C link checking service illustrates the usefulness of separating URIs from methods. The link checker works as follows (ignoring recursive link checking for this discussion): for each HTTP URI that is part of a link in an HTML or XHTML document, the link checker uses the HEAD method to gather information about the resource identified by the URI. As the HTTP/1.1 specification states, "The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response." By using HEAD instead of GET, the link checker does not ask for more bits than it requires to do its job.
The Web architecture thus allows for the separation of URIs and methods. A URI's scheme determines a set of interaction methods. For instance, the HTTP URI scheme is defined in HTTP/1.1 [RFC2616], which allows "an open-ended set of methods and headers that indicate the purpose of a request. It builds on the discipline of reference provided by the Uniform Resource Identifier (URI) ... for indicating the resource to which a method is to be applied."; refer to section 1.1.
There are costs to separating URIs and methods, however. The separation means that some metadata required for the interaction may not be part of the URI, and as a result, the URI alone is no longer sufficient for an agent to carry out the interaction. URI addressability is lost.
In the following section we look more closely at when to use HTTP GET (since it promotes URI accessibility) and when to use HTTP POST.
Section 9.1 of HTTP/1.1 discusses safe interactions. A safe interaction is one where the user is not to be held accountable for the result of the interaction. Safe interactions are important because these are interactions where users can browse with confidence and where software programs (e.g., search engines and browsers that pre-cache data for the user) can follow links safely. Users (or software agents acting on their behalf) do not commit themselves to anything by querying a resource or following a link. For example, Web sites that say "by following the link to ABC, you agree to the following terms and conditions" do not account for the fact that anyone (in particular, a search service) can make another link to ABC, and anyone who follows this other link to ABC may never have seen the terms and conditions. See the related TAG finding "Deep Linking in the World Wide Web" [DEEPLINK].
By distinguishing safe from "unsafe" interactions (i.e., those where the user should be accountable for a particular interaction) at the protocol level:
In the same section, HTTP/1.1 states, the convention is that GET is used for safe interactions and SHOULD NOT have the significance of taking an action other than retrieval. Indeed, if you use GET for interactions with side-effects, your make your system insecure. For example, a malicious Web page publisher outside a firewall might put a URI in an HTML page so that, when someone inside the firewall unwittingly follows the link, that person activates a function on another system within the firewall.
Users accept obligations through other mechanisms than requests to follow a link. Per the HTTP/1.1 specification, designers should use HTTP POST for those interactions.
HTTP GET is designed so that all information necessary for the interaction is part of the URI, thus promoting URI addressibility. With HTTP POST, some information intended to affect change to the resource state may be part of the protocol headers, not in the URI. With this approach, the resulting URI for identifying the resource may be shorter, but the advantages of URI addressability are lost.
Note that it is possible to use POST even without supplying data in an HTTP message body. In this case, the resource is URI addressable, but the POST method indicates to clients that the interaction is unsafe or may have side-effects.
By convention, when GET method is used, all information required to identify the resource is encoded in the URI. There is no convention in HTTP/1.1 for a safe interaction (e.g., retrieval) where the client supplies data to the server in an HTTP entity body rather than in the query part of a URI. This means that for safe operations, URIs may be long. The case of large parameters to a safe operation is not directly addressed by HTTP as it is presently deployed. A QUERY or "safe POST" or "GET with BODY" method has been discussed (e.g., at the December 1996 IETF meeting) but no consensus has emerged.
In section 9.5, the HTTP/1.1 specification lists some example applications where POST should be used due to side-effects, including annotation of existing resources; posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles; and extending a database through an append operation.
Whether and how one chooses between GET and POST depends on the format specification and the application context. When HTTP URIS are used for hyperlinks in HTML, SMIL, and SVG, for example, the application determines which method will be used (generally GET). However, for both HTML forms and XForms, the author can choose between GET and POST. One implication is that HTML application designers should implement "unsafe" operations (or others requiring POST) with a form, even if the application does not call for user input.
Consider the following two designs for mailing list subscription confirmation.
Design 2 (incorrect):
The latter design performed an unsafe operation (list subscription) in response to a request with a safe method (following the link from the mail message with GET). If the user's mail agent pre-fetched pages to speed up browsing, the subscription would be confirmed without the knowledge and consent of the user; the HTTP/1.1 specification makes it clear that the fault is with the server in this case; the user's mail agent is free to follow links without incurring obligations.
Some Web interactions involve sensitive data (e.g., passwords, credit card numbers, and social security numbers, and bank account numbers as in scenario 2). The choice for how to protect sensitive data is generally orthogonal to the choice of GET or POST. For either GET or POST, if you wish to protect sensitive data, use SSL rather than plain HTTP.
There are costs to using SSL, however, including:
Although the use of SSL means the loss of certain benefits (e.g., caching), designers retain other benefits of URI addressability by using GET for safe operations over SSL.
When the use of SSL is impractical, it may still be possible to keep some sensitive data out of a URI. For instance, designers can communicate authentication information in HTTP headers rather than in the query part of a URI.
Broad issues of data protection such as preventing unauthorized access to an individuals' bookmarks (or cookies, for that matter) lie beyond the scope of Web architecture.
Web application design should be informed by the above principles, but also by the relevant limitations. For instance, when designing a service that operates on a resource via URI, consider whether there are use cases where such operation would be inappropriate.
For example, the W3C HTML validation service allows for two scenarios (both using HTML forms). With the "safe" option, validation requests are done via a URI; the form uses GET, which gives the results a URI for bookmarks, links, etc. The second option allows clients to upload a document for validation. The form uses POST for that option, since:
Whether or not GET with HTTP is used for the initial access, supplying a URI for subsequent access to the same information, e.g., using Content-Location, is useful.
Designers of HTML forms that accept non-ASCII characters have been challenged by some implementation limitations and gaps in specifications. Implementation limitations are length-related. Section section 17.13.4 of HTML 4.01 [HTML401] on multipart/form-data says, "The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters."
This inefficiency is due to the octet-to-
conversion, combined with the fact that many characters need more
than one octet to be encoded. But while somewhat inefficient, this
is not a real obstacle to using GET for non-ASCII characters.
A more serious problem is that the mapping between characters and octets is not clearly specified beyond US-ASCII; refer to section 2.1 of [RFC2396]. For query parts resulting from filling in an HTML form, the default is to use the character encoding of the form. The definition of the accept-charset attribute on the form element in HTML 4.01 says, "The default value for this attribute is the reserved string 'UNKNOWN'. User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element."
The general direction to address this limitation is to converge to using UTF-8 for the mapping between characters and octets. The use of UTF-8 is already defined in various specifications, and we expect it to be adopted in future specifications and further deployed in due course. For instance, we expect XForms to specify that the encoding to be used in query parts is always UTF-8.
While Web application design must take into account the limitations of technology that is widely deployed at present, it should not treat these as architectural invariants. The following is a list of limitations have already been largely resolved, or are likely to fade away as bugs are fixed and the scope of interoperable specifications expands.
Section 3 of WSDL 1.2 Bindings [WSDL] provides a binding to HTTP GET, which makes it possible to respect the principle of using GET for safe operations. However, to represent safety in a more straightforward manner, it should be a property of operations themselves, not just a feature of bindings.
Thanks to David Orchard, Larry Masinter, Paul Prescod, Roy Fielding, Martin Dürst, and others for their feedback in response to the 15 April 2002 call for review. Thanks also to Roger Cutler, Joseph Reagle, and Stuart Williams for their suggestions.