Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document defines a mechanism to enable client-side cross-site requests,
for instance from the domain example.org to hello-world.invalid.
It defines a request algorithm for GET and non-GET requests
that specifications can use to enable cross-site requests in the technologies
they define.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the January 16 2008 Proposal to the Working Group of a rewrite of the Working Draft of the "Access Control for Cross-site Requests" document. It is expected that the Working Draft document will progress along the W3C Recommendation track. The WD document is produced by the Web Application Formats (WAF) Working Group. The WAF Working Group is part of the Rich Web Clients Activity in the W3C Interaction Domain.
Please send comments to the WAF Working Group's public mailing list public-appformats@w3.org with [access-control] at the start of the subject line. Archives of this list are available. See also W3C mailing list and archive usage guidelines.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
In the broad sense, a cross-site request is a request from domain A
to domain B initiated by some code in a Web page. An example is an HTML img
element on an example.org Web page that points to an image on the
hello-world.invalid domain.
Since these requests are made using the user's authentication information for
domain B, if any, there is a restriction, enforced by the user agent,
on accessing the data of the resource hosted on domain B from the Web
page on domain A. In certain situations, as with the initial version
of XMLHttpRequest, cross-site requests are prohibited. This is to prevent
information leakage as XMLHttpRequest allows access to the data of
the resource requested.
This document introduces an access control policy so that a resource can opt in
to allowing a Web site residing at another domain to request the data of the resource
using the HTTP GET method. Specifications that define technologies that allow
for such requests, such as XMLHttpRequest and XBL, will have to define
how the policy applies to their model.
This document also introduces an access control policy that allows a resource
to opt in to cross-site requests using an HTTP method other than GET.
When such a request is initiated from a Web page the user agent will first make
an authorization request (which will be cached) and then make the actual request
using the desired HTTP method. This will be useful for cross-site manipulation of
resources.
It is already possible to make cross-site requests using the HTML
form element with the HTTP POST method. However, this provides
fewer capabilities on the requesting side than XMLHttpRequest
and does not expose the response of the request. Therefore the authorization request
is required for the HTTP POST method too. The functionality described
in this specification does not affect the way the HTML form element
functions.
To summarize, the access control policy is defined in the resource that might be obtained and is expected to be enforced by the client that retrieves and processes the resource. Thus the client is trusted and acts as a policy enforcement point.
If you have a simple text resource residing at http://example.com/hello
which contains the string "Hello World!" and you would like the hello-world.invalid
domain to be able to access it, the resource would look as follows (including one
HTTP header that is significant):
Access-Control: allow <hello-world.invalid>

Hello World!
The hello-world.invalid domain can now access this document using
XMLHttpRequest, for instance with the following code:
var client = new XMLHttpRequest();
client.open("GET", "http://example.com/hello");
client.onreadystatechange = function() { /* do something */ };
client.send();
It gets slightly more complicated if you want your resource to be able to handle
cross-site requests using the HTTP DELETE and POST methods.
Your resource first needs to reply to an authorization request that uses the
GET method and has the Method-Check
HTTP header set and then needs to handle the request that uses the POST
or DELETE method and give an appropriate response. The reply to the
authorization request can have the following HTTP headers specified for instance:
Access-Control: allow <hello-world.invalid> method DELETE, POST
Method-Check-Expires: Tue, 06 Nov 2012 08:49:37 GMT
The Method-Check-Expires header
indicates how long the response can be cached, so that subsequent requests involving
the HTTP DELETE and POST methods do not need a new authorization request
until then. The response to the actual request can simply contain this header:
Access-Control: allow <hello-world.invalid>
Handling such a request takes some care, but making one is
not difficult, because performing the additional authorization request is
the task of the user agent. Using XMLHttpRequest again and assuming
the application were hosted at http://calendar.invalid/app you could
do something like the following:
function deleteItem(itemId, updateUI) {
  var client = new XMLHttpRequest()
  client.open("DELETE", "http://calendar.invalid/app")
  client.onload = updateUI
  client.onerror = updateUI
  client.onabort = updateUI
  client.send("id=" + itemId)
}
XMLHttpRequest Level 2 includes support for
cross-site access requests though it has not yet been published
as a W3C Working Draft.
It is also possible to specify which domains are allowed to access your resource using XML:
<?access-control allow="http://hello-world.invalid https://test.example.net"?>
<hello type="world"/>
You can even combine these two techniques:
Access-Control: allow <http://hello-world.invalid>

<?access-control allow="https://test.example.net"?>
<hello type="world"/>
This specification is applicable to both user agents and other specifications. This specification will only apply in certain contexts and specifications defining such contexts will define when and how this specification applies.
As well as sections marked as non-normative, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
In this specification, the words must, must not, should, should not and may are to be interpreted as described in RFC 2119. [RFC2119]
A conformant specification is one that implements all the requirements listed in this specification that are applicable to specifications. For instance, a specification needs to define what the source for the referrer root URI is.
A conformant user agent is one that implements all the requirements listed in this specification that are applicable to user agents, while also being consistent with the requirements listed in the specifications that use the access control read policy.
User agents may optimize any algorithm given in this specification, so long as the end result is indistinguishable from the result that would be obtained by the specification's algorithms. (The algorithms in this specification are generally written with more concern for clarity than efficiency.)
Terminology is generally defined throughout the specification. However, the few definitions that did not really fit anywhere else are defined here instead.
The term ToASCII algorithm means that the
ToASCII algorithm as described in RFC 3490 is applied with both the
AllowUnassigned and UseSTD3ASCIIRules flags set. [RFC3490]
There is a case-insensitive match of strings s1 and s2 if after mapping the ASCII character range A-Z to the range a-z both strings are identical.
U+0009, U+000A, U+000D and U+0020 are space characters.
A space-separated list is a string of which the items are separated by one or more space characters (in any order). The string may also be prefixed or suffixed with zero or more space characters.
To obtain the list of values from a space-separated list user agents must take the string, replace any sequence of space characters with a single U+0020 character, then drop any leading or trailing U+0020 character, then chop the resulting string at each occurrence of a U+0020 character, then drop all U+0020 values, and then return the list of values.
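The steps for obtaining the list of values can be sketched in JavaScript as follows. This is a minimal illustration and not a normative implementation; the function name splitOnSpaces is hypothetical, and returning an empty list for a string that holds only space characters is an assumption.

```javascript
// Space characters as listed in this specification:
// U+0009, U+000A, U+000D and U+0020.
var SPACE_RUN = /[\u0009\u000A\u000D\u0020]+/g;

// Returns the list of values of a space-separated list.
// (splitOnSpaces is a hypothetical name used for illustration.)
function splitOnSpaces(input) {
  // Replace every run of space characters with a single U+0020.
  var collapsed = input.replace(SPACE_RUN, " ");
  // Drop a leading or trailing U+0020, if any.
  var trimmed = collapsed.replace(/^ | $/g, "");
  // Assumption: a string without any value yields an empty list.
  if (trimmed === "") {
    return [];
  }
  // Chop the string at each remaining U+0020.
  return trimmed.split(" ");
}
```

For instance, splitOnSpaces("http://hello-world.invalid  https://test.example.net") yields a list with the two access items.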
An XML MIME type is text/xml, application/xml
or any MIME type ending in +xml.
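A check for XML MIME types could look like the sketch below. It assumes the input is the value of a Content-Type header, so any parameters after a semicolon are dropped first, and it compares case-insensitively; both steps are assumptions of this example, and the name isXMLMimeType is hypothetical.

```javascript
// Returns true for text/xml, application/xml and any type ending in +xml.
// (isXMLMimeType is a hypothetical helper name.)
function isXMLMimeType(contentType) {
  // Assumption: parameters such as "; charset=utf-8" are not part of the type.
  var type = contentType.split(";")[0].trim().toLowerCase();
  if (type === "text/xml" || type === "application/xml") {
    return true;
  }
  // Any MIME type ending in "+xml".
  return /\+xml$/.test(type);
}
```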
Two URIs are same-origin if after performing scheme-based normalization on both URIs as described in section 5.3.3 of RFC 3987 the scheme, ihost and port components are identical. If either URI does not have an ihost component the URIs must not be considered same-origin. [RFC3987]
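The same-origin comparison can be approximated with the URL class available in browsers and Node.js. This is only a sketch: the URL parser's own normalization (lowercase scheme and host, default port elided) stands in for the scheme-based normalization of RFC 3987, which is an assumption of this example, and the name isSameOrigin is hypothetical.

```javascript
// Approximates the same-origin test using the URL class.
// (isSameOrigin is a hypothetical name; URL parsing stands in for
// the RFC 3987 scheme-based normalization.)
function isSameOrigin(a, b) {
  var ua, ub;
  try {
    ua = new URL(a);
    ub = new URL(b);
  } catch (e) {
    // Unparseable input is never same-origin.
    return false;
  }
  // A URI without a host component is never same-origin.
  if (ua.hostname === "" || ub.hostname === "") {
    return false;
  }
  // The default port is reported as "" for both URIs, so it compares equal.
  return ua.protocol === ub.protocol &&
         ua.hostname === ub.hostname &&
         ua.port === ub.port;
}
```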
The access control mechanism defined in this specification allows for extension of the same-origin policy in contexts where the same-origin policy currently applies.
When making a cross-site access request user agents should ensure to:
User agents which implement this specification must also take care to properly normalize Unicode and to properly interpret IDNs to prevent URI spoofing attacks as outlined in the specification. [RFC3490]
Application authors should be aware that content retrieved from another site is not itself trustable. Authors should protect themselves against cross-site scripting attacks by not rendering or executing the retrieved content directly without validating that content.
Authors sharing content with domains that are on shared hosting environments should ensure that they do not allow access from arbitrary ports on those domains.
example.com could host some user-sensitive data protected by HTTP
authentication or cookies. It has an agreement with company.invalid
to share this data and therefore uses the following HTTP header:
Access-Control: allow <company.invalid:*>
Now company.invalid happens to be on a shared hosting environment
with evil.example.net (they share the IP address). Because of this
evil.example.net can host content at company.invalid:9999
and access user-sensitive data at example.com if the user, by phishing
for instance, goes to company.invalid:9999. This can be prevented
by clearly specifying the port or omitting it entirely letting it default to the
default port of the URI scheme:
Access-Control: allow <company.invalid:80>
Access-Control: allow <company.invalid>
Authors should ensure that GET requests on their
applications have no side effects. If by some means an evil application finds out
which applications a user is associated with, it might "attack" these applications
with GET requests that can affect the user's data if the user is already
authenticated with any of these applications by means of cookies or HTTP authentication.
Integrity protection of the access control policy statements may be required. This could be achieved by use of SSL/TLS for example.
The Access Control Protocol consists of request messages and response messages.
Access-Control-Message = request | response
The User Agent issues an HTTP GET request.
request = get-resource-request | get-authorization-request
The User Agent issues an HTTP GET request with the Referer-Root header set. The Referer-Root request HTTP header must match the following ABNF:
get-resource-request = "HTTP GET Request" Referer-Root
Referer-Root = "Referer-Root" ":" referrer-root-URI
The Referer-Root (sic) request HTTP header helps servers determine where the request originates in case the Referer header is not included in the request. It also indicates that the request is a cross-site access request.
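As an illustration, the header line could be built from the URI of the requesting page as sketched below. The shape of the referrer root URI (scheme, host and an explicit port) is inferred from the examples in this document, such as http://test.example.org:80, and is an assumption; the specifications that use this mechanism define its actual source. The names referrerRoot and refererRootHeader are hypothetical, and only the http and https default ports are known to this sketch.

```javascript
// Default ports known to this sketch (assumption: http and https only).
var DEFAULT_PORTS = { "http:": "80", "https:": "443" };

// Derives a referrer root URI of the form scheme://host:port.
// (Hypothetical helper; the form is inferred from the examples.)
function referrerRoot(pageUrl) {
  var u = new URL(pageUrl);
  // URL reports the default port as "", so fill it in explicitly.
  var port = u.port !== "" ? u.port : DEFAULT_PORTS[u.protocol];
  if (u.hostname === "" || port === undefined) {
    return null;
  }
  return u.protocol + "//" + u.hostname + ":" + port;
}

// Builds the request header line, or null when no root can be derived.
function refererRootHeader(pageUrl) {
  var root = referrerRoot(pageUrl);
  return root === null ? null : "Referer-Root: " + root;
}
```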
TBD, redirects
ednote: I understand that Referer-Root is required for Access Control, and inclusion of Referer does not change the operation of Access Control.
ednote: I don't know how in ABNF to indicate a full HTTP GET request with a particular header cardinality refinement, so I just guess at this.
The User Agent issues an HTTP GET Request with the Method-Check HTTP Header.
get-authorization-request = "HTTP GET Request" Method-Check
Method-Check = "Method-Check" ":" Method
The Method production is defined in RFC 2616. [RFC2616]
The Method-Check request HTTP
header informs the server that the client is making an authorization request and
will make a subsequent request to the same URI using a different method specified
in the Method-Check HTTP header
if that is allowed by the server per the semantics of authorization requests.
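The order of requests for a non-GET method can be sketched as plain data. The function below only plans the requests and performs no network access; it ignores the authorization request cache, and the name planRequests and the shape of the returned objects are hypothetical.

```javascript
// Lists the requests a user agent would issue, in order.
// (planRequests and the request object shape are hypothetical.)
function planRequests(method, url) {
  var actual = { method: method, url: url, headers: {} };
  // GET requests need no prior authorization request.
  if (method === "GET") {
    return [actual];
  }
  // Authorization request: a GET to the same URI that names the
  // method of the subsequent request in Method-Check.
  var authorization = {
    method: "GET",
    url: url,
    headers: { "Method-Check": method }
  };
  // The actual request is only made if the server allows it.
  return [authorization, actual];
}
```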
The server responds to an HTTP GET request.
response = resource-response | authorization-response
The get-resource-request will be followed by a resource-response. The get-authorization-request will be followed by an authorization-response.
Retrieved resources can have one or more Access-Control headers defined. These headers must match the following ABNF:
resource-response = "HTTP GET Response" *Access-Control-Header [XML-Media-Type XMLPayload]
Access-Control-Header = "Access-Control" ":" rule
rule = rule-allow | rule-deny
rule-allow = "allow" 1*(LWS pattern) [exclude (LWS "method" LWS [Method])]
rule-deny = "deny" 1*(LWS pattern) [exclude]
exclude = LWS "exclude" *(LWS pattern)
pattern = "<" access item ">"
XML-Media-Type = ; TBD
LWS and Method are used as defined by RFC 2616. The
pattern production above must not include implied
LWS. Implied LWS is allowed everywhere else. [RFC2616]
The syntax of access items when used in the Access-Control
HTTP header is restricted to internationalized domain names to which the ToASCII
algorithm has been applied as HTTP does not support Unicode.
In case resources on a domain are not all in the control of a single person "deny"
rules can be used by authors to deny read access from external resources to the
entire domain. Read access from other domains is by default disallowed but individual
resources on the domain could have <?access-control?>
processing instructions specified which can allow access from other domains. Although
files can contain processing instructions, HTTP headers can be set across an entire
server making them far more effective. The "exclude" clause can be used to list
exclusions to these "deny" rules.
"allow" rules can be used to allow read access from particular domains as long as those domains don't match any of the patterns listed in "exclude".
For cross-site non-GET access requests the server to which the request
is made can list which non-GET methods are allowed using a comma-separated
list after the "method" rule.
Access-Control: allow <*.example.org> exclude <*.public.example.org>
Access-Control: allow <webmaster.public.example.org>
This means that every subdomain of example.org can access the resource,
including webmaster.public.example.org, but with the exclusion of
all other subdomains of public.example.org.
Access-Control: allow <example.org>
This means that example.org and all its subdomains can access the resource.
Access-Control: allow <example.org> <example.invalid> method POST, PUT
This means that example.org and example.invalid can access
the resource and perform POST and PUT requests.
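Header values like the ones in these examples could be split into their parts as sketched below. The parser takes the value after "Access-Control:", is deliberately lenient about whitespace, and does not validate the access items themselves; the name parseAccessControl and the shape of the returned object are hypothetical.

```javascript
// Parses an Access-Control header value into a rule object.
// Returns null when the value does not follow the expected shape.
// (parseAccessControl and the result shape are hypothetical.)
function parseAccessControl(value) {
  // Whitespace separates tokens; commas only separate method names.
  var tokens = value.split(/[\s,]+/).filter(function (t) {
    return t !== "";
  });
  var kind = tokens.shift();
  if (kind !== "allow" && kind !== "deny") {
    return null;
  }
  var rule = { kind: kind, patterns: [], exclude: [], methods: [] };
  var target = rule.patterns;
  for (var i = 0; i < tokens.length; i++) {
    var token = tokens[i];
    if (token === "exclude" && target === rule.patterns) {
      target = rule.exclude;
    } else if (token === "method" && kind === "allow" &&
               target !== rule.methods) {
      target = rule.methods;
    } else if (target === rule.methods) {
      target.push(token);
    } else {
      // A pattern is an access item wrapped in "<" and ">".
      var m = /^<([^<>\s]+)>$/.exec(token);
      if (m === null) {
        return null;
      }
      target.push(m[1]);
    }
  }
  // At least one pattern is required after "allow" or "deny".
  return rule.patterns.length === 0 ? null : rule;
}
```

Since "method" is only accepted in "allow" rules, a value such as "deny <example.org> method POST" is rejected by this sketch.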
XML resources may include an
<?access-control?> processing instruction
within the XML Prolog to indicate, if the access control read policy
applies, from which domains their content can be accessed. [XML]
XMLPayload = *Access-Control-PI xml-document
xml-document = ; tbd
Access-Control-PI = XML-Processing-Instruction (( attribute-allow [attribute-exclude] [attribute-method]) | ( attribute-deny [attribute-exclude] 0*attribute-other ))
attribute-allow = "allow" "=\" 1*(access item) "\" ; not sure about ABNF escaping
attribute-deny = "deny" "=\" 1*(access item) "\"
attribute-exclude = "exclude" "=\" 1*(access item) "\"
attribute-method = "method" "=\" 1*(Method) "\"
attribute-other = ; not sure how to say strings other than (allow | deny | exclude | method ) in ABNF
An <?access-control?> processing
instruction that is part of the XML Prolog must be parsed using
the same syntax rules as described in the XML Stylesheet PI specification.
<?access-control?> processing instructions
outside the XML Prolog are ignored. [XMLSSPI]
The processing instruction takes three pseudo-attributes which each take a
space-separated list of
access items and one pseudo-attribute
which takes a space-separated list of HTTP methods.
These pseudo-attributes are allow, deny, exclude,
and method respectively. Either the allow or deny
pseudo-attribute must be specified. allow and
deny must not be specified at the same time. If
the deny pseudo-attribute is specified the method pseudo-attribute
must not be specified.
The allow, deny, and exclude pseudo-attributes,
when specified, must at least contain a single
access item. method must
at least contain a single HTTP method.
The above means that the following examples would be non-conforming and would make the user agent deny access to the resource:
<?access-control?>
<?access-control x?>
<?access-control x=""?>
<?access-control allow=""?>
<?access-control allow="http://example.org" x=""?>
<?access-control allow="allow.example.org" deny="deny.example.org"?>
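The stated requirements on the four known pseudo-attributes can be checked as in the sketch below. It assumes the pseudo-attributes have already been parsed into an object that maps names to string values, it does not validate individual access items or method names, and it does not cover pseudo-attributes with other names. The name isConformingPI is hypothetical.

```javascript
// Space characters as listed in this specification.
var SPACES = /[\u0009\u000A\u000D\u0020]+/;

// Counts the values in a space-separated list.
function countItems(value) {
  return value.split(SPACES).filter(function (v) {
    return v !== "";
  }).length;
}

// Checks the rules for allow, deny, exclude and method.
// (isConformingPI is hypothetical; attrs maps pseudo-attribute
// names to their string values.)
function isConformingPI(attrs) {
  var has = function (name) {
    return Object.prototype.hasOwnProperty.call(attrs, name);
  };
  // Either allow or deny must be specified, but not both.
  if (has("allow") === has("deny")) {
    return false;
  }
  // method must not be specified together with deny.
  if (has("deny") && has("method")) {
    return false;
  }
  // Every specified pseudo-attribute needs at least one value.
  var names = ["allow", "deny", "exclude", "method"];
  for (var i = 0; i < names.length; i++) {
    if (has(names[i]) && countItems(attrs[names[i]]) === 0) {
      return false;
    }
  }
  return true;
}
```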
An access item is either a single * character (always matches) or
a domain that can contain a wildcard at the start and can optionally have a scheme
and port specified. An access item must match the following
ABNF:
access-item = [scheme "://"] domain-pattern [":" port-pattern] | "*"
domain-pattern = domain | "*." domain
port-pattern = port | "*"
scheme and port are used as defined in RFC 3986.
domain is an internationalized domain name as defined in RFC 3490.
[RFC3986] [RFC3490]
In addition to matching the above ABNF the ToASCII
algorithm MUST apply successfully (without errors) to each
label component of the subdomain (if any) from the access
item.
Since HTTP syntax does not allow Unicode, domain has to be written
within the ASCII range, resorting to the Punycode syntax when necessary.
If the scheme is omitted it will match any scheme from
the referrer root URI. If port-pattern
is omitted the port for the access item will be the default
port for the scheme of the access item, or the default
port of the scheme of the referrer root URI if the access
item did not include a scheme. If port-pattern is
* it will match any port.
When * is used as part of domain-pattern it matches
any number of internationalized labels before domain. If just
domain is used it will match itself and any number of internationalized
labels before domain.
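The syntax and the port defaulting described above can be sketched with a regular expression. This example does not validate the labels of the domain and does not apply the ToASCII algorithm; it only knows the default ports of http and https, which is an assumption. The names parseAccessItem and originScheme are hypothetical, with originScheme standing for the scheme of the referrer root URI.

```javascript
// Default ports known to this sketch (assumption: http and https only).
var DEFAULT_PORTS = { http: "80", https: "443" };

// [scheme "://"] domain-pattern [":" port-pattern]
var ACCESS_ITEM =
  /^(?:([A-Za-z][A-Za-z0-9+.-]*):\/\/)?((?:\*\.)?[^\s:\/*]+)(?::(\*|[0-9]+))?$/;

// Parses an access item; returns null when it does not match the syntax.
// (parseAccessItem and its result shape are hypothetical.)
function parseAccessItem(item, originScheme) {
  // A single "*" always matches.
  if (item === "*") {
    return { any: true };
  }
  var m = ACCESS_ITEM.exec(item);
  if (m === null) {
    return null;
  }
  var scheme = m[1] ? m[1].toLowerCase() : null;
  // An omitted port defaults to the port of the item's scheme or,
  // without a scheme, to that of the referrer root URI's scheme.
  var port = m[3] || DEFAULT_PORTS[scheme || originScheme];
  if (port === undefined) {
    return null;
  }
  return { any: false, scheme: scheme, domain: m[2], port: port };
}
```

With this sketch, parseAccessItem("http://example.org") and parseAccessItem("http://example.org:80") produce the same scheme, domain and port.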
Several examples of conforming access items:
*
*.example.org
https://*.example.org
https://example.org:8443
The following access items would make the user agent deny access to the resource:
https://*.*:80
*://example.org
http://example.org/
http://example.org/example
http://example.org:
The following access items are identical:
http://example.org
http://example.org:80
The following access items would match http://foo.bar.example.org:80:
org
*.org
*.org:*
example.org
http://example.org
http://*.bar.example.org:80
Authorization Response
The Method-Check-Expires
HTTP response header indicates how long the results of an authorization request
can be cached in an authorization request cache.
authorization-response = "HTTP GET Response" *Access-Control-Header [XML-Media-Type XMLPayload] [Method-Check-Expires]
Method-Check-Expires = "Method-Check-Expires" ":" HTTP-date
The HTTP-date production is defined in RFC 2616. [RFC2616]
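Whether a cached authorization result may still be used can be decided by comparing the header value with the current time, as in the sketch below. It relies on Date.parse to read the HTTP-date and treats an unparseable value as already expired, which is an assumption; the name isAuthorizationFresh is hypothetical.

```javascript
// Returns true while the cached authorization result is still usable.
// (isAuthorizationFresh is hypothetical; "now" is a Date object.)
function isAuthorizationFresh(methodCheckExpires, now) {
  // Date.parse returns milliseconds since the epoch, or NaN on failure.
  var expires = Date.parse(methodCheckExpires);
  // Assumption: an unparseable date means the result cannot be cached.
  if (isNaN(expires)) {
    return false;
  }
  return now.getTime() < expires;
}
```

A user agent would call this with the stored Method-Check-Expires value and new Date() before deciding whether a new authorization request is needed.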
The Access Control Protocol specifies access for resources based upon the following:
[@@??] Need higher level function, to split into GET/Non-get, manage authorization cache, populate the rule set.
Note: taken from XACML 2.0 Appendix C.1. The inputs to the function still need to be defined, that is the referer-root, method, and access item. The evaluate function needs a definition. Do Indeterminate and NotApplicable apply?
The following specification defines the “Deny-overrides” rule-combining algorithm.
In the entire set of rules, if any rule evaluates to "Deny", then the result of the rule combination SHALL be "Deny". If any rule evaluates to "Allow" and all other rules evaluate to "NotApplicable", then the result of the rule combination SHALL be "Allow". In other words, "Deny" takes precedence, regardless of the result of evaluating any of the other rules in the combination. If all rules are found to be "NotApplicable" to the decision request, then the rule combination SHALL evaluate to "NotApplicable".
If an error occurs while evaluating the target or condition of a rule that contains an effect value of "Deny" then the evaluation SHALL continue to evaluate subsequent rules, looking for a result of "Deny". If no other rule evaluates to "Deny", then the combination SHALL evaluate to "Indeterminate", with the appropriate error status.
If at least one rule evaluates to "Allow", all other rules that do not have evaluation errors evaluate to "Allow" or "NotApplicable" and all rules that do have evaluation errors contain effects of "Allow", then the result of the combination SHALL be "Allow".
The following pseudo-code represents the evaluation strategy of this rule-combining algorithm.
Decision denyOverridesRuleCombiningAlgorithm(Rule rules[])
{
  Boolean atLeastOneError = false;
  Boolean atLeastOneAllow = false;
  for( i=0 ; i < lengthOf(rules) ; i++ )
  {
    Decision decision = evaluate(rules[i]);
    // A single "Deny" overrides every other result.
    if (decision == Deny)
      return Deny;
    if (decision == Allow) {
      atLeastOneAllow = true;
      continue;
    }
    if (decision == Indeterminate) {
      atLeastOneError = true;
    }
  }
  if (atLeastOneAllow)
    return Allow;
  if (atLeastOneError)
    return Indeterminate;
  // No rule allowed access.
  return Deny;
}
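The pseudo-code can be exercised with a small runnable sketch. Because the evaluate function is not yet specified, this version takes a list of decisions that have already been evaluated, which is an assumption of the example; it mirrors the pseudo-code above, including the final "Deny". The names Decision and denyOverrides are hypothetical.

```javascript
// Possible results of evaluating a single rule.
var Decision = {
  ALLOW: "Allow",
  DENY: "Deny",
  INDETERMINATE: "Indeterminate",
  NOT_APPLICABLE: "NotApplicable"
};

// Combines already evaluated decisions with deny-overrides.
// (Taking decisions instead of rules is an assumption, because
// the evaluate function is not specified.)
function denyOverrides(decisions) {
  var atLeastOneError = false;
  var atLeastOneAllow = false;
  for (var i = 0; i < decisions.length; i++) {
    // A single "Deny" overrides every other result.
    if (decisions[i] === Decision.DENY) {
      return Decision.DENY;
    }
    if (decisions[i] === Decision.ALLOW) {
      atLeastOneAllow = true;
    } else if (decisions[i] === Decision.INDETERMINATE) {
      atLeastOneError = true;
    }
  }
  if (atLeastOneAllow) {
    return Decision.ALLOW;
  }
  if (atLeastOneError) {
    return Decision.INDETERMINATE;
  }
  // As in the pseudo-code, the fallback result is "Deny".
  return Decision.DENY;
}
```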
[@@ The above algorithm is heavily borrowed from XACML. The 'evaluate' function needs to be specified, and one assumes that it is capable of accepting both allow and deny rules, and that it uses the AccessItemCheck function below... in which case this function also needs (at least) an origin and maybe a referrer-root upon which a decision is made.]
Response DoAccessControlledHTTPOperation(Method method, URI referrerRoot, URI origin, Body body, Headers h) {
  Response r = null;
  if( method == Method.GET ) {
    h.add(makeHeader("Referer-Root", referrerRoot));
    r = DoHTTPOperation(Method.GET, null /*body*/, origin, h);
  } else {
    // Non-GET: check the (possibly cached) authorization before the actual request.
    if( PreOperationAccessControlDecision(method, referrerRoot, origin) == Decision.DENY ) {
      r = new Response(); //New empty response
      r.setStatus(Status.ACCESSDENIED); //Or some other failure signal.
      return r;
    }
    r = DoHTTPOperation(method, body, origin, h); //The actual request uses the desired method.
  }
  Decision d = PostOperationAccessControlDecision(method, referrerRoot, origin, r);
  if( d == Decision.ALLOW )
    return r;
  r = new Response(); //New empty response
  r.setStatus(Status.ACCESSDENIED); //Or some other failure signal.
  return r;
}
/****************************************************************************************/
Decision PreOperationAccessControlDecision(Method method, URI referrerRoot, URI origin) {
  Rule rules[] = rulesCache.get(method, referrerRoot, origin);
  if( rules == null ) {
    rules = DoMethodCheckOperation(method, referrerRoot, origin);
    rulesCache.set(method, referrerRoot, origin, rules);
  }
  return denyOverridesRuleCombiningAlgorithm(method, rules, referrerRoot, origin);
}
/****************************************************************************************/
Decision PostOperationAccessControlDecision(Method method, URI referrerRoot, URI origin, Response response) {
  ResponseCode rc = response.responseCode;
  Headers h = response.headers;
  Body b = response.body;
  Rule rules[] = parseAccessRules( h.getHeaders("Access-Control") );
  // XML resources can carry additional rules in processing instructions.
  if( anXMLMediaType( h.getHeader("Content-Type") ) )
    rules.add( parseAccessPIRules(b) );
  rulesCache.set(method, referrerRoot, origin, rules);
  return denyOverridesRuleCombiningAlgorithm(method, rules, referrerRoot, origin);
}
The algorithm described below determines whether there is a match between a
referrer root URI (http://test.example.org:80
for instance) and an access item (*.example.org,
or * for instance).
To determine whether a referrer root URI (called origin) and an access item (called item) match, user agents must use the following or equivalent pseudo-code, which represents the Access Item Check matching algorithm:
Boolean AccessItemCheck(origin, item)
{
  //Match a wildcard item
  if (item == "*") // * == U+002A
    return true;
  // No origin, no match
  if (origin == null)
    return false;
  normalize(item);
  //Different item and origin schemes, no match
  if (scheme(item) != null && scheme(item) != scheme(origin))
    return false;
  //Non-wildcard item port and different origin port, no match.
  if (port(item) != "*" && port(item) != port(origin))
    return false;
  // Remove the scheme from item (if it has one specified) and origin including the :// sequence following it.
  if (scheme(item) != null)
    item = trunc(item, scheme(item)); //could define trunc so that it doesn't need to be guarded.
  if (scheme(origin) != null)
    origin = trunc(origin, scheme(origin));
  // Remove the port from item and origin including the U+003A (:) preceding it.
  [@@ TBD]
  origin_list = reverse(split_on_period(origin)); // removing scheme, port, and all "."
  item_list = reverse(split_on_period(item)); // removing scheme, port, and all "."
  for( i=0 ; i < lengthOf(item_list) ; i++ )
  {
    if (item_list[i] == "*")
      continue;
    // origin has fewer labels than item, no match
    if (i >= lengthOf(origin_list))
      return false;
    if (!case-insensitive-match(ToASCII(origin_list[i]), ToASCII(item_list[i])))
      return false;
  }
  return true;
}
Normalize(item): If item does not have a port-pattern
let the port of item be the default port for
the scheme of item or, if item does not have
a scheme, let it be the default port of the scheme
of origin.
[Normalize needs origin as a parameter as well.]
ToASCII(item): the ToASCII algorithm as described in RFC 3490 is applied with both the AllowUnassigned and UseSTD3ASCIIRules flags set. [RFC3490]
case-insensitive-match(s1, s2): return true if after mapping the ASCII character range A-Z to the range a-z both strings are identical.
scheme(), port(): extract the scheme and port parts of the URI respectively.
split_on_period(s1): split s1 into an array using U+002E(.) character as delimiter
reverse(list): reverse the order of a list
The following table gives some example outcomes of this algorithm with the referrer root URI in the first column, the access item in the second column, and the result in the final column.
| Referrer root URI | Access item | Result |
|---|---|---|
| null | * | Match |
| null | example.org | No match |
| http://example.org:80 | EXAMPLE.OrG | Match |
| http://example.org:81 | example.org | No match |
| http://example.org:80 | example.org | Match |
| http://site.example.org:80 | *.org | Match |
| http://xn--74h.example.org:80 | ☺.example.org | Match |
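Most rows of the table can be reproduced with the runnable sketch below. It is not the normative algorithm: it assumes that both inputs are already written in ASCII (the ToASCII step is left out, so the last row is not covered), and it only knows the default ports of http and https. The names accessItemCheck, splitParts and asciiLower are hypothetical.

```javascript
// Default ports known to this sketch (assumption: http and https only).
var DEFAULT_PORTS = { http: "80", https: "443" };

// Maps A-Z to a-z and leaves every other character untouched.
function asciiLower(s) {
  return s.replace(/[A-Z]/g, function (c) {
    return String.fromCharCode(c.charCodeAt(0) + 32);
  });
}

// Splits "[scheme://]host[:port]" into its parts, or returns null.
function splitParts(value) {
  var m = /^(?:([A-Za-z][A-Za-z0-9+.-]*):\/\/)?([^:\/]+)(?::(\*|[0-9]+))?$/
    .exec(value);
  if (m === null) {
    return null;
  }
  return {
    scheme: m[1] ? asciiLower(m[1]) : null,
    host: asciiLower(m[2]),
    port: m[3] || null
  };
}

// Returns true if the referrer root URI (origin) matches the access item.
function accessItemCheck(origin, item) {
  if (item === "*") return true;      // wildcard item always matches
  if (origin === null) return false;  // no origin, no match
  var o = splitParts(origin);
  var it = splitParts(item);
  if (o === null || it === null) return false;
  // Different item and origin schemes, no match.
  if (it.scheme !== null && it.scheme !== o.scheme) return false;
  // A missing item port defaults from the item's scheme or, without
  // one, from the origin's scheme.
  var itemPort = it.port || DEFAULT_PORTS[it.scheme || o.scheme];
  var originPort = o.port || DEFAULT_PORTS[o.scheme];
  if (itemPort !== "*" && itemPort !== originPort) return false;
  // Compare labels starting from the rightmost one.
  var originLabels = o.host.split(".").reverse();
  var itemLabels = it.host.split(".").reverse();
  for (var i = 0; i < itemLabels.length; i++) {
    if (itemLabels[i] === "*") continue;
    if (i >= originLabels.length) return false;
    if (itemLabels[i] !== originLabels[i]) return false;
  }
  return true;
}
```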
The editor would like to thank Arthur Barstow, Benjamin Hawkes-Lewis, Cameron McCormack, David Håsäther, Dean Jackson, Eric Lawrence, Frank Ellerman, Frederick Hirsch, Graham Klyne, Hal Lockhart, Henri Sivonen, Ian Hickson, Jonas Sicking, Lachlan Hunt, Maciej Stachowiak, Marc Silbey, Marcos Caceres, Mark Nottingham, Martin Dürst, Matt Womer, Mohamed Zergaoui, Sharath Udupa, Sunava Dutta, Thomas Roessler, and Zhenbin Xu for their contributions to this specification.
Special thanks to Brad Porter, Matt Oshry and R. Auburn who helped edit earlier versions of this document.
Thanks to Stuart Williams who helped edit parts of this proposal.