Resource protection

Jonathan Rees, notes in preparation for TAG face-to-face on June 23-25 2009

There has been recent discussion on www-tag (e.g. [thread]) concerning two aspects of web request security:

addressing known confused-deputy vulnerabilities (CSRF [csrf]) in cross-origin requests, and
the introduction of functionality meant to allow new kinds of cross-origin requests without introducing new vulnerabilities (e.g. CORS [cors] and Secure ECMAScript [ses]).

Cross-origin requests can originate from HTML content (e.g. IMG tags), from JavaScript, or in other ways.

Principal-specific vs. resource-specific credentials

A request on a protected resource must identify the resource that is the subject of the request, and present any credentials required to authorize the request. The credentials can be associated with a particular principal, and used with any resource (subject to access policy), or they can be associated (subject to access policy) with a particular resource, and used by any principal that has them.

Applications based on the principal-specific credential pattern are inherently vulnerable to CSRF attacks because an attacker can take a principal's credentials, intended to provide access to one resource, and cause them to be used (via an unwitting intermediary) with a different resource. The resource-specific credential pattern prevents such violations because each credential is bound to its resource before the attacker gets control, so it cannot be redirected to another resource.
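The difference between the two patterns can be shown with a toy model. This is only a sketch under stated assumptions: the class names, the paths "/inbox" and "/transfer", and the in-memory tables are all hypothetical, and the "browser" is reduced to an intermediary that attaches a stored cookie to whatever request it is asked to make.

```python
import secrets


class PrincipalServer:
    """Principal-specific pattern: a credential says who is asking."""

    def __init__(self):
        self.sessions = {}   # cookie -> principal
        self.acl = {}        # resource -> set of principals allowed

    def login(self, principal, resources):
        cookie = secrets.token_urlsafe(16)
        self.sessions[cookie] = principal
        for resource in resources:
            self.acl.setdefault(resource, set()).add(principal)
        return cookie

    def handle(self, resource, cookie):
        principal = self.sessions.get(cookie)
        return principal is not None and principal in self.acl.get(resource, set())


class ResourceServer:
    """Resource-specific pattern: a credential says which resource may be used."""

    def __init__(self):
        self.grants = {}     # token -> the one resource it authorizes

    def grant(self, resource):
        token = secrets.token_urlsafe(16)
        self.grants[token] = resource
        return token

    def handle(self, resource, token):
        return self.grants.get(token) == resource


class Browser:
    """Unwitting intermediary: attaches the stored cookie to every request."""

    def __init__(self, cookie):
        self.cookie = cookie

    def follow(self, server, resource):
        # The resource may have been chosen by an attacker's page.
        return server.handle(resource, self.cookie)


principal_server = PrincipalServer()
alice_cookie = principal_server.login("alice", ["/inbox", "/transfer"])
browser = Browser(alice_cookie)
# The attacker picks the resource; the browser supplies Alice's credential.
forged_ok = browser.follow(principal_server, "/transfer")

resource_server = ResourceServer()
inbox_token = resource_server.grant("/inbox")
# The attacker has no token for /transfer, and the /inbox token does not carry over.
guessed_ok = resource_server.handle("/transfer", secrets.token_urlsafe(16))
reused_ok = resource_server.handle("/transfer", inbox_token)
```

In this model the forged request succeeds against the principal-based server, while both attempts against the resource-based server fail.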

CORS uses principal-specific credentials, and works by in effect creating a new principal for each pair of origins known to the browser. Because the space of principals is carved up into fine pieces, this approach can act as an additional defense, but any principal-based system is inherently vulnerable to confused deputy attacks. This is explained in Tyler Close's paper "ACLs Don't" [acls].

The recommended defense against CSRF is the use of nonces or other kinds of unguessable tokens. This is a realization of the resource-specific credentials pattern.
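As an illustration, a server might issue a one-time nonce tied to a particular form action and require it on submission. The following is a minimal sketch, not a recommended implementation: the class name `NonceStore`, the session identifiers, and the action paths are hypothetical, and storage is a plain in-memory dictionary.

```python
import hmac
import secrets


class NonceStore:
    """Issues one-time nonces bound to a (session, action) pair."""

    def __init__(self):
        self._pending = {}  # (session_id, action) -> nonce

    def issue(self, session_id, action):
        # Unguessable value, to be embedded e.g. in a hidden form field.
        nonce = secrets.token_urlsafe(32)
        self._pending[(session_id, action)] = nonce
        return nonce

    def redeem(self, session_id, action, submitted):
        key = (session_id, action)
        expected = self._pending.get(key)
        if expected is None:
            return False
        # Constant-time comparison on bytes.
        if not hmac.compare_digest(expected.encode(), submitted.encode()):
            return False
        del self._pending[key]  # a nonce is good for one use only
        return True


store = NonceStore()
nonce = store.issue("session-1", "/transfer")
```

Because the nonce is looked up by action as well as by session, a nonce issued for one action does not authorize another, which is what makes it a resource-specific credential in the sense used here.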

Identification and credentials

Identification and credentials may be communicated in a variety of ways, and the choice of strategy can either encourage or prevent confused deputy attacks, and either support or erode web architecture. Channels for these two kinds of information include IP address, TCP connection identity, request-URI, headers, and content. Design patterns are in use that employ each of these channels for transmission of credentials.

We have learned the hard way that global credentials such as IP address and connection identity have vulnerabilities. As for payload-carried credentials, in HTTP there are three options for division of labor:

(1) The request-URI carries neither identification nor credentials (e.g. as in a SOAP request). Both are carried in headers or body.
(2) The request-URI carries identification only. Credentials are carried in headers (e.g. Origin, Authorization, cookies) or body (e.g. hidden nonce field in representation or in POST request).
(3) The request-URI carries both identification and credentials (unguessable URIs).
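The three divisions of labor can be pictured as three request shapes. The sketch below is purely illustrative: the `Request` type, the host `example.org`, the paths, the `token` query parameter, and the placeholder values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from urllib.parse import parse_qs, urlsplit


@dataclass
class Request:
    method: str
    uri: str
    headers: dict = field(default_factory=dict)
    body: str = ""


TOKEN = "PLACEHOLDER-UNGUESSABLE-TOKEN"

# Option 1: generic endpoint; identification and credentials travel outside the URI.
option1 = Request(
    "POST",
    "https://example.org/endpoint",
    headers={"Cookie": "session=PLACEHOLDER"},
    body="<getReport id='42'/>",
)

# Option 2: the URI identifies the resource; credentials travel in a header.
option2 = Request(
    "GET",
    "https://example.org/reports/42",
    headers={"Cookie": "session=PLACEHOLDER"},
)

# Option 3: the URI carries both identification and credentials.
option3 = Request("GET", f"https://example.org/reports/42?token={TOKEN}")


def uri_carries_credentials(request):
    """True if the (hypothetical) token parameter is present in the URI."""
    return "token" in parse_qs(urlsplit(request.uri).query)


def uri_identifies_resource(request):
    """In this toy layout, only the generic /endpoint path names no resource."""
    return urlsplit(request.uri).path != "/endpoint"
```

Only in the third shape can the URI be handed to another party and used as-is, with nothing else accompanying it.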

We can evaluate these options from a security perspective and from a web architecture perspective independently.

Credential placement and security

Any of these divisions can be applied to either principal-specific or resource-specific credentials. For resource-specific credentials, care must be taken to ensure that the credentials always accompany the identification wherever it goes. If they get separated, then either (a) the identification will be useless, (b) the application will succumb to the temptation to use a principal's credentials, or (c) a login step will be required. Keeping them together is easier if they form a contiguous string such as a URI.

URI-carried credentials may be a risk if the credentials carry a higher chance of being "leaked" than they would if placed elsewhere. Whether this is the case is highly dependent on application and container architecture.

Credential placement and identification

From a web architecture standpoint, the question is primarily whether the resource is being identified by a URI. Certainly case (1) above rules this out. In cases (2) and (3) the answer depends on what you consider the resource to be. If the URI carries credentials (3), then the resource can be used by the recipient without separate credential transfer. (The URI has to be kept private, but you wouldn't be passing credentials to the recipient under any model if you didn't want it to use them.) Because the recipient doesn't want to have to trust the sender, the recipient will not want to use any of its own credentials when accessing the resource, so credentials in the URI will be its *only* way to use the resource.

If the URI does not carry credentials (2), and the credentials become separated from it and lost, the URI can only be used for public purposes, and if the recipient can make use of it at all, a separate login step will be needed to gain access.

(3) may seem to encourage the creation of aliases for the same resource, as each different set of credentials for the resource will lead to a distinct private URI. But each credentialed version is in a sense a distinct resource ("resource" in the sense "something you can use"). The public resource should have its own public URI.

(3) might also be a threat to caching, and to content management systems that directly transmit static files, because content may have to carry different credentials for different sessions. Many sites (e.g. Amazon and those using nonces for CSRF defense) are already generating per-session or per-request content, so this is not always a show-stopper.

URI-carried credentials may be implemented either by an unguessable token used for both identification and credentials, or by adjoining such a token to guessable information (e.g. as a query parameter or extra path component). The latter has the advantage that it allows for straightforward conversion of a private URI to a public one. However, the token has to be specific to that resource. Otherwise, an attacker might form a new URI that combines the token with a string of its choice, gaining unauthorized access to another resource.
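One way to make an adjoined token specific to its resource is to derive it from the guessable part of the URI with a keyed hash. This is a sketch of one possible technique, not something these notes prescribe: the secret, the `token` query parameter, and the function names are hypothetical, and a real deployment would also need key management and expiry.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlsplit

# Placeholder only; a real secret would be random and kept server-side.
SECRET = b"placeholder server-side secret"


def token_for(path):
    """Derive a token that is valid for exactly one path."""
    return hmac.new(SECRET, path.encode(), hashlib.sha256).hexdigest()


def private_uri(path):
    """Adjoin the resource-specific token as a query parameter."""
    return f"{path}?token={token_for(path)}"


def public_uri(uri):
    """Converting private to public just drops the token."""
    return urlsplit(uri).path


def authorize(uri):
    """Accept the request only if the token matches the path it arrived with."""
    parts = urlsplit(uri)
    supplied = parse_qs(parts.query).get("token", [""])[0]
    expected = token_for(parts.path)
    return hmac.compare_digest(supplied.encode(), expected.encode())


private = private_uri("/reports/42")
stolen_token = parse_qs(urlsplit(private).query)["token"][0]
# An attacker recombines the token with a path of its own choosing.
forged = f"/reports/43?token={stolen_token}"
```

Here `authorize(private)` succeeds while `authorize(forged)` fails, because the token was computed over "/reports/42" and does not verify against any other path.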

Credential placement and scripting

In a scripting language, resources may be identified by URIs or in a variety of other ways. In Secure ECMAScript, resource references are encapsulated as objects. An object naturally carries whatever information is needed to access the resource, including if necessary a URI and/or credentials. Because the language provides no way to inspect the internal state of an object, there is no way for an attacker in the same container to extract credentials and violate security policy. Additional protection derives from the fact that all use of the resource is mediated by code in the object.
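The shape of this pattern can be sketched as an object that holds a URI and credentials in a closure and exposes only an operation. The sketch is in Python for consistency with the other examples, and that is an important caveat: unlike Secure ECMAScript, Python does not enforce the encapsulation (closure contents can be reached through introspection), so this shows the structure only, not the security property. The names `make_resource_handle` and `fake_fetch` are hypothetical.

```python
def make_resource_handle(fetch, uri, token):
    """Return an object that can use a resource without exposing how.

    `fetch` is any callable taking (uri, token); it stands in for whatever
    actually performs the request.
    """

    class Handle:
        __slots__ = ()  # no per-instance attributes to hold uri or token

        def read(self):
            # All use of the resource is mediated by this method.
            return fetch(uri, token)

    return Handle()


calls = []


def fake_fetch(uri, token):
    # Stand-in transport for the example; records what it was asked to do.
    calls.append((uri, token))
    return "report body"


handle = make_resource_handle(
    fake_fetch, "https://example.org/reports/42", "PLACEHOLDER-TOKEN"
)
body = handle.read()
```

Code that receives `handle` can call `read()` but is given no attribute named `uri` or `token`. In a language with enforced encapsulation, that absence is a guarantee and not merely a convention.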

Some attempts have been made to create principal-based security systems for programming languages, with Java being the most popular example. In these systems principals correspond to program modules, and there are elaborate mechanisms for granting and suppressing access to objects.

Pre-flight request chatter

Mark Nottingham has raised a concern [mnot] that the preflight request that CORS requires for each URI introduces a worrisome overhead for simple web requests. This presents a possible threat to web architecture, as a web application designer may choose to move identifying information out of the request-URI and into other parts of the request, such as a POST entity. (For example, they may decide to switch from a REST architecture to a SOAP architecture for efficiency reasons.) This would lead to the undesirable URI-free option (1) above.
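The incentive can be made concrete with a rough count of round trips. The model below is a deliberate simplification and rests on assumptions not stated in these notes: every request is taken to need a preflight, preflight results are remembered per URI for the lifetime of the client, and each preflight (an OPTIONS request) and each actual request costs one round trip. The class name and URIs are hypothetical.

```python
class PreflightingClient:
    """Counts round trips when a preflight is needed once per distinct URI."""

    def __init__(self):
        self.preflighted = set()
        self.round_trips = 0

    def send(self, uri):
        if uri not in self.preflighted:
            self.round_trips += 1      # OPTIONS preflight for a new URI
            self.preflighted.add(uri)
        self.round_trips += 1          # the actual request


# URI-per-resource design: five resources, five distinct URIs.
rest_client = PreflightingClient()
for report_id in range(5):
    rest_client.send(f"https://example.org/reports/{report_id}")

# Single-endpoint design: identification moved into the request body.
endpoint_client = PreflightingClient()
for report_id in range(5):
    endpoint_client.send("https://example.org/endpoint")
```

Under these assumptions the URI-per-resource design costs 10 round trips and the single-endpoint design costs 6, which is the kind of difference that could push a designer toward option (1).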

Terminology note

The usual terms for the two authorization patterns are "ACL-based" and "capability-based". Somehow those don't resonate in this context. I've used different words to try to trick you into thinking about the problem afresh.

Acknowledgment

Thanks to Tyler Close for explaining these issues to me in terms that I could understand.

References

thread: "Origin enables XSS to escalate to XSRF", email thread on www-tag list, http://lists.w3.org/Archives/Public/www-tag/2009Jun/0015.html
csrf: "The Cross-Site Request Forgery (CSRF/XSRF) FAQ", http://www.cgisecurity.com/csrf-faq.html
cors: "Cross-Origin Resource Sharing", W3C Working Draft 17 March 2009, http://www.w3.org/TR/access-control/
ses: "Secure ECMAScript", http://wiki.ecmascript.org/doku.php?id=ses:ses
acls: "ACLs Don't", paper by Tyler Close, http://waterken.sourceforge.net/aclsdont/current.pdf
mnot: "[cors] Review", Mark Nottingham, http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/0643.html