Confinement with Origin Web Labels

1. Introduction

This section is not normative.

Modern Web applications are conglomerations of JavaScript written by multiple authors. Authors routinely incorporate third-party scripts into their applications and share user data with third-party services (e.g., as part of a mashup). Unfortunately, in the existing model, the confidentiality and integrity of the user’s data are put at risk when one incorporates untrusted third-party code or shares data with untrusted third-party services.

Mechanisms such as CORS and CSP can be used to mitigate these risks by giving authors control over whom they share data with. But, once data is shared, these mechanisms do not impose any restrictions on how the code that was granted access can further disseminate the data.

This document specifies an extension to the current model called Confinement with Origin Web Labels (COWL). COWL provides authors with APIs for specifying (mandatory) access control policies on data, including content, in terms of origin labels. These policies are enforced in a mandatory fashion, transitively, even once code has access to the data. For example, with COWL, the author of https://example.com can specify that a password is confidential to https://example.com (and thus should only be disclosed to https://example.com) before sharing it with a third-party password strength checking service. In turn, COWL ensures that the third-party service, which necessarily computes on the sensitive password, is confined and respects the policy on the password: COWL disallows it from disclosing the password to any origin other than https://example.com.

COWL enforces such policies by confining code at the context-level, according to the sensitivity (i.e., the label) of the data the code has observed. To reap the greatest benefits of COWL, authors will need to compartmentalize applications into multiple contexts (e.g., iframes).

In the existing model, any page served from an origin has the ambient, implicit authority of that origin. This document generalizes this notion of authority and gives authors explicit control over it with privileges. For example, by default, a page whose origin is https://example.com has the privilege for https://example.com. This gives the page the authority to arbitrarily disseminate data sensitive to https://example.com; to be backwards-compatible, the page is not confined when reading data sensitive to https://example.com. However, COWL allows the author to run the page with "weaker" delegated privileges (e.g., one corresponding to the current user at https://example.com) or to drop the privilege altogether.

COWL is intended to be used as a defense-in-depth mechanism that can restrict how untrusted—buggy but not malicious—code handles sensitive data. Given the complexities of browser implementations and presence of covert channels, malicious code may be able to exfiltrate data. Authors should still use discretionary access control mechanisms, such as CSP and CORS, to restrict access to the data in the first place.

1.1. Goals

The goal of COWL is to provide authors with a means for protecting the confidentiality and integrity of data that is shared with untrusted code, whether third-party or their own. Existing mechanisms (e.g., CORS’s Access-Control-Allow-Origin header and the targetOrigin argument to postMessage()) provide a way for restricting which origins may access the shared data. But, once content has access to data, it can usually disseminate it without restrictions. While CSP can be used to confine code, i.e., restrict how confidential data is disseminated, setting a correct CSP policy (so as to confine code) is difficult and limited to content the author has control over. Indeed, sharing confidential data in the existing model almost always requires the sender to trust the receiver not to leak the data, accidentally or otherwise. COWL provides a defense-in-depth option for protecting data confidentiality and integrity. In particular, with COWL:

Authors should be able to specify confidentiality and integrity policies on data in terms of origin labels: the origins to whom the data is confidential and the origins that endorse the data. This allows authors to share sensitive data with third-party content and impose restrictions on the origins with which it can communicate once it inspects the sensitive data. Dually, it allows authors to share data via intermediate content while retaining its integrity.
Authors should be able to run code with least privilege by restricting the origins the code can communicate with and thus how it can disseminate sensitive data.
Authors should be able to privilege separate applications by compartmentalizing them into separate contexts that have delegated privileges.

1.2. Use Cases/Examples

1.2.1. Confining untrusted third-party services

An author wishes to use a service, loaded in the form of an iframe, without trusting it (or its dependencies) to not leak her sensitive data. To protect the data, the author associates a confidentiality label with the data, specifying the origins allowed to read the data. The author then shares the newly created labeled object with the untrusted code. In turn, COWL confines the untrusted code once it inspects the sensitive data, so as to ensure that it can only communicate according to the author-specified policy (the label).

The author of https://example.com wishes to use a third-party password strength checker provided by https://untrusted.com. To protect the confidentiality of the password, the https://example.com application can use COWL to associate a confidentiality policy, in the form of a label, with the password before sending it to the untrusted service:

// Create new policy using Labels that specifies that the password is sensitive
// to https://example.com and should only be disclosed to this origin:
var policy = new Label(window.location.origin);

// Associate the label with the password:
var labeledPassword = new LabeledObject(password, {confidentiality: policy});

// Send the labeled password to the checker iframe:
checker.postMessage(labeledPassword, "https://untrusted.com");

// Register listener to receive a response from checker, etc.

Once the checker inspects the protected object, i.e., the password, COWL limits the iframe to communicating with origins that preserve the password’s confidentiality (in this case, https://example.com). This policy is enforced mandatorily, even if the https://untrusted.com iframe sends the password to yet another iframe.

Note, until the checker actually inspects the labeled password, it can freely communicate with any origins, e.g., with https://untrusted.com. This is important since the checker may need to fetch resources (e.g., regular expressions) to check the password strength. This is also safe—the checker has not inspected the sensitive password, and thus need not be confined.

Other use cases in this category include password managers and encrypted document editors, for example, where an encryption/decryption layer and a storage layer are provided by mutually distrusting, but not malicious, services. The academic paper on COWL describes these use cases in detail [COWL-OSDI].

1.2.2. Sharing data with third-party mashups

A server operator wishes to provide third-party mashups access to user data. In addition to using CORS response headers to restrict the origins that can access the data [CORS], the operator wishes to restrict how the data is further disseminated by these origins. To do so, the operator sends a response header field named Sec-COWL (described in §3.5.2 The Sec-COWL HTTP Response Header Field) whose value contains the sensitivity of the data in the form of a serialized confidentiality label. In turn, COWL enforces the label restrictions on the third-party code.

The server operator of https://provider.com uses a CORS response header to grant https://mashup.com access to a resource. The operator also sets a COWL header to specify that the resource is confidential to https://provider.com and should not be disseminated arbitrarily:

Access-Control-Allow-Origin: https://mashup.com
Sec-COWL: data-confidentiality [ ["https://provider.com"] ]

COWL only allows a https://mashup.com context to read the sensitive response if the label restrictions of the response are respected, i.e., if the code can only communicate with https://provider.com.

Note, COWL only allows the code to inspect the response if the context labels, which dictate the context’s ability to communicate, are more restricting than the labels of the response. A more permissive approach, which does not require the context to give up its ability to communicate arbitrarily, is to use a labeled JSON response. The mashup XHR example shows how authors can accomplish this.
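To make the shape of the header value above concrete, the following is a minimal sketch of splitting a Sec-COWL value into a directive name and a label set. It is not the parsing algorithm of this specification: the helper name parseSecCowl is hypothetical, it handles a single directive only, and it assumes the serialized label is valid JSON (an array of arrays of origin strings), as the double-quoted example suggests.

```javascript
// Hypothetical helper, for illustration only. Assumes one directive per value
// and a JSON-serialized label set: an array of arrays of origin strings.
function parseSecCowl(value) {
  const trimmed = value.trim();
  const split = trimmed.search(/\s/); // first whitespace ends the directive name
  if (split === -1) {
    throw new SyntaxError("expected '<directive> <label>'");
  }
  const directive = trimmed.slice(0, split);
  const labelSet = JSON.parse(trimmed.slice(split));

  // Each inner array is one disjunction set; the outer array is their conjunction.
  const wellFormed =
    Array.isArray(labelSet) &&
    labelSet.every(
      (disjunction) =>
        Array.isArray(disjunction) &&
        disjunction.every((origin) => typeof origin === "string")
    );
  if (!wellFormed) {
    throw new SyntaxError("label must be an array of arrays of strings");
  }
  return { directive, labelSet };
}

const parsed = parseSecCowl('data-confidentiality [ ["https://provider.com"] ]');
console.log(parsed.directive); // data-confidentiality
console.log(JSON.stringify(parsed.labelSet)); // [["https://provider.com"]]
```

Here the single disjunction set containing https://provider.com corresponds to the policy that the response is confidential to that origin alone.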

1.2.3. Content isolation via privilege separation

A server operator wishes to isolate content (e.g., of different users) while serving it from a single physical origin. The operator can leverage privileges to ensure that content of one part of the site has different authority from another and, importantly, does not have authority of the physical origin. Concretely, when serving content, the operator can set the content’s context privilege to a weaker, delegated privilege. This ensures that the content is privilege separated.

Suppose https://university.edu wished to isolate different parts of their site according to users. The server operator can weaken the privilege of a page when serving user content by providing a response header field named Sec-COWL (see §3.5.2 The Sec-COWL HTTP Response Header Field) whose value contains a serialized delegated privilege. For example, for any content under https://university.edu/~user1, the following header is set:

Sec-COWL: ctx-privilege [ ['self', 'cowl://user1'] ]

Having this privilege can be understood as having the authority of user1’s part of the https://university.edu origin. COWL ensures that the content of this user cannot interfere with the content of https://university.edu or another user (e.g., user2). For example, the content cannot modify https://university.edu cookies or the DOM of another https://university.edu page.

This delegated privilege also ensures that the content cannot disseminate data sensitive to another user (e.g., user2) arbitrarily: without being confined, it can only disseminate user1’s data on https://university.edu. Of course, this requires the server operator to label sensitive data (e.g., when sending it to the user agent) appropriately (e.g., user2’s data is labeled Label("https://university.edu")._or("cowl://user2")).

The sub-origin isolation in JavaScript example shows how this can be implemented using the COWL JavaScript APIs.

Note, sub-domains should be used when possible to ensure that content is isolated using the Same-Origin Policy. But, even in such cases, COWL can provide a useful layer of defense.

1.2.4. Running content with least-privileges

An author wishes to use a library that is tightly coupled with the page (e.g., jQuery), but not trust it to protect the user’s confidentiality and integrity. With COWL, the author can do this by loading the untrusted library after dropping privileges (from the context’s default privilege). In doing so, the content (and thus the library) loses its implicit authority over the content’s origin.

The author of https://example.com can drop privileges in JavaScript:

// Drop privileges, by setting the context privilege to an empty privilege:
COWL.privilege = new Privilege();

// Load untrusted library

Or, by setting the content’s initial privilege to the empty privilege using the Sec-COWL response header:

Sec-COWL: ctx-privilege [ [] ]

Note, while this ensures that the context code cannot, for instance, access the origin’s cookies, the author must still associate a confidentiality label with resources (e.g., HTTP responses) to ensure that data is properly protected.

In some cases it is useful for a particular context to have the privilege to disseminate certain categories of data. (The or part of labels can be used to easily categorize differently-sensitive data.) To this end, the author should run the context with a delegated privilege instead of the empty privilege. The above §1.2.3 Content isolation via privilege separation shows one such example.

1.3. Trust Model

COWL provides developers with a way of imposing restrictions on how untrusted code can disseminate sensitive data. However, authors should avoid sharing sensitive data with malicious code, since such code may be able to exploit covert channels, which are present in most browsers, to leak the data. COWL can only prevent information leakage from code that (e.g., is buggy and) uses overt communication channels.

Similarly, COWL provides no guarantees against attacks wherein users are manipulated into leaking sensitive data via out-of-band channels. For example, an attacker may be able to convince a user to navigate their user agent to an attacker-owned origin by entering a URL that contains sensitive information into the user agent’s address bar.

COWL should always be used as an additional layer of defense to other security mechanisms such as CSP, SRI, CORS, and iframe sandbox.

2. Key Concepts and Terminology

2.1. Labels

An origin label, or more succinctly a label, encodes either a confidentiality or integrity security policy as conjunctive normal form (AND’s and OR’s) formulae over origins. Labels can be associated with contexts or with structurally clonable objects.

When associated with a context, the label restricts the origins that the context can communicate with, as detailed in §3.3 Labeled Contexts.

The confidentiality label Label("https://a.com")._or("https://b.com"), when associated with a context, restricts the context to sending data to https://a.com or https://b.com, but no other origins. This context label reflects the fact the context may contain data that is sensitive to either https://a.com or https://b.com; it is thus only safe for it to communicate to these origins.
Note, because the context can communicate data to either origin, another context associated with the more restricting label Label("https://a.com") cannot send it data. Doing so would allow for data confidential to https://a.com to be leaked to https://b.com.

The integrity label Label("https://a.com")._or("https://b.com"), when associated with a context, restricts the context to receiving data from a context or server that is at least as trustworthy as https://a.com or https://b.com. This context label ensures that the code running in the context can only be influenced by data which either https://a.com or https://b.com endorse.

When associated with an object, a confidentiality label specifies the origins to whom the object is sensitive, while an integrity label specifies the origins that endorse the object. Objects that have labels associated with them are called labeled objects. §3.4 Labeled Objects defines how labels are associated with objects.
Consider an https://example.com page that receives a labeled object (e.g., via postMessage()) with the following labels:
- Confidentiality: Label("https://example.com"). This label indicates that the object is sensitive to https://example.com.
- Integrity: Label("https://a.com"). This label indicates that the object has been endorsed by https://a.com. If https://example.com received the message from an intermediary https://b.com context, this label reflects the fact that the object (produced by https://a.com) was not tampered with.
Mathematically, a label is a conjunctive normal form formula over origins [DCLabels].

A label is in normal form if reducing it according to the label normal form reduction algorithm produces the same value.

Two labels are equivalent if their normal form values are mathematically equal.

A label A subsumes (or is more restricting than) another label B if the result of running the label subsumption algorithm on the normal forms of A and B returns true. Labels are partially ordered according to this subsumes relation.
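The normal form, equivalence, and subsumes definitions above can be sketched with a small stand-alone model. This is an illustration under stated assumptions, not the Label interface or the normative algorithms: labels are modeled as plain arrays of disjunction sets, the function names normalize, equivalent, and subsumes are hypothetical, and the reduction and subsumption rules follow the usual reading of conjunctive normal form (A subsumes B when every disjunction set of B contains some disjunction set of A).

```javascript
// Illustrative model, not the COWL Label interface: a label is an array of
// disjunction sets, and each disjunction set is an array of origin strings.
// Only labels with non-empty disjunction sets are modeled here.

// True if every origin in `inner` also appears in `outer`.
const isSubset = (inner, outer) => inner.every((o) => outer.includes(o));

// Assumed reduction: deduplicate origins, then drop every disjunction set
// that contains (or repeats) another one, since the smaller set implies it.
function normalize(label) {
  const sets = label.map((d) => [...new Set(d)].sort());
  const kept = sets.filter(
    (d, i) =>
      !sets.some(
        (e, j) => j !== i && isSubset(e, d) && (e.length < d.length || j < i)
      )
  );
  const key = (d) => d.join(" ");
  return kept.sort((x, y) => (key(x) < key(y) ? -1 : key(x) > key(y) ? 1 : 0));
}

// Two labels are equivalent if their normal forms are equal.
function equivalent(a, b) {
  return JSON.stringify(normalize(a)) === JSON.stringify(normalize(b));
}

// A subsumes B if each disjunction set of B contains some disjunction set of A.
function subsumes(a, b) {
  const na = normalize(a);
  return normalize(b).every((db) => na.some((da) => isSubset(da, db)));
}

const a = [["https://a.com"]];                      // Label("https://a.com")
const aOrB = [["https://a.com", "https://b.com"]];  // ..._or("https://b.com")
const aAndB = [["https://a.com"], ["https://b.com"]]; // ...and("https://b.com")

console.log(subsumes(a, aOrB));   // true: a is more restricting than a-or-b
console.log(subsumes(aOrB, a));   // false
console.log(subsumes(aAndB, a));  // true: a-and-b is more restricting than a
console.log(equivalent([["https://a.com"], ["https://a.com", "https://b.com"]], a)); // true
```

The second and third results show why the relation is only a partial order: a label and its disjunction or conjunction with another origin are ordered, while Label("https://a.com") and Label("https://b.com") are not comparable in either direction.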

The current confidentiality label is the confidentiality label associated with the current context. §3.3 Labeled Contexts specifies how labels are associated with contexts.

The current integrity label is the integrity label associated with the current context. §3.3 Labeled Contexts specifies how labels are associated with contexts.

When reading a labeled object, a context gets tainted, i.e., its context labels are updated by invoking the context tainting algorithm, to reflect that it has read sensitive data (of potentially different trustworthiness) and should be confined accordingly.
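As a rough intuition for tainting, the sketch below uses the conventional information-flow formulation: reading an object makes the context at least as confidential as the object (conjunction of the confidentiality labels) and at most as trustworthy as the object (disjunction of the integrity labels). This is an assumption for illustration, not the context tainting algorithm itself; privileges are ignored, labels are modeled as arrays of disjunction sets, and the names and, or, and taint are hypothetical.

```javascript
// Illustrative model only: a label is an array of non-empty disjunction sets
// (arrays of origin strings). Privileges are deliberately left out.

const isSubset = (inner, outer) => inner.every((o) => outer.includes(o));

// Deduplicate and drop disjunction sets made redundant by a smaller one.
function normalize(label) {
  const sets = label.map((d) => [...new Set(d)].sort());
  const kept = sets.filter(
    (d, i) =>
      !sets.some(
        (e, j) => j !== i && isSubset(e, d) && (e.length < d.length || j < i)
      )
  );
  const key = (d) => d.join(" ");
  return kept.sort((x, y) => (key(x) < key(y) ? -1 : key(x) > key(y) ? 1 : 0));
}

// Conjunction: keep the disjunction sets of both labels.
const and = (x, y) => normalize([...x, ...y]);

// Disjunction: merge every pair of disjunction sets (distribution over CNF).
const or = (x, y) => normalize(x.flatMap((dx) => y.map((dy) => [...dx, ...dy])));

// Assumed tainting rule: more confidential, less trustworthy.
function taint(context, object) {
  return {
    confidentiality: and(context.confidentiality, object.confidentiality),
    integrity: or(context.integrity, object.integrity),
  };
}

const context = {
  confidentiality: [["https://a.com"]],
  integrity: [["https://a.com"]],
};
const labeledObject = {
  confidentiality: [["https://b.com"]],
  integrity: [["https://b.com"]],
};

const tainted = taint(context, labeledObject);
console.log(JSON.stringify(tainted.confidentiality)); // [["https://a.com"],["https://b.com"]]
console.log(JSON.stringify(tainted.integrity));       // [["https://a.com","https://b.com"]]
```

After the read, the modeled context holds data sensitive to both origins, and its contents are only as trustworthy as something either origin might have produced.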

2.2. Privileges

A privilege is an unforgeable object that corresponds to a label. Privileges are associated with contexts and reflect their authority.

Privileges can be used to bypass confinement restrictions imposed by confidentiality labels. In particular, a privilege can be used to bypass the restrictions imposed by any label its corresponding label—the internal privilege label—subsumes.

Consider a context from https://a.com whose current confidentiality label is Label("https://a.com").and("https://b.com"). This label confines the context to only communicating with entities whose labels are at least as restricting as this label. For example, it restricts the context from communicating with a context labeled Label("https://b.com"), since doing so could leak https://a.com data to https://b.com. It similarly prevents the context from communicating with https://a.com.
But, suppose that the context’s current privilege corresponds to Label("https://a.com") (after all, the context originated from https://a.com). Then, the context would be able to bypass some of the restrictions imposed by the context label. Specifically, the context would be able to communicate with https://b.com; the privilege confers it the right to declassify https://a.com data to https://b.com. Indeed, when taking this privilege into consideration, the effective confidentiality label of the context is Label("https://b.com").

Note, the privilege does not allow the context to bypass arbitrary label restrictions. For example, it does not allow the context to communicate with https://a.com since doing so could leak https://b.com data.
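The confidentiality example above can be traced with a small model. This is a sketch under assumptions rather than the label downgrade algorithm itself: labels are plain arrays of disjunction sets, the names downgrade and maySendTo are hypothetical, the downgrade is assumed to drop exactly those disjunction sets that the privilege's label subsumes, and a receiver is assumed to be acceptable when its label subsumes the sender's effective label.

```javascript
// Illustrative model only: a label is an array of non-empty disjunction sets
// (arrays of origin strings). An empty array would stand for a label with no
// remaining restrictions; the COWL empty label is not modeled precisely.

const isSubset = (inner, outer) => inner.every((o) => outer.includes(o));

// A subsumes B if each disjunction set of B contains some disjunction set of A.
const subsumes = (a, b) => b.every((db) => a.some((da) => isSubset(da, db)));

// Assumed downgrade: drop every disjunction set the privilege's label subsumes.
function downgrade(label, privilegeLabel) {
  return label.filter((d) => !subsumes(privilegeLabel, [d]));
}

// Assumed send check: the receiver must be at least as restricting as the
// sender's effective confidentiality label.
function maySendTo(senderLabel, privilegeLabel, receiverLabel) {
  return subsumes(receiverLabel, downgrade(senderLabel, privilegeLabel));
}

const contextLabel = [["https://a.com"], ["https://b.com"]]; // a.com AND b.com
const privilege = [["https://a.com"]];                       // privilege for a.com
const noPrivilege = [];                                      // no authority at all

const effective = downgrade(contextLabel, privilege);
console.log(JSON.stringify(effective)); // [["https://b.com"]]

console.log(maySendTo(contextLabel, noPrivilege, [["https://b.com"]])); // false
console.log(maySendTo(contextLabel, privilege, [["https://b.com"]]));   // true
console.log(maySendTo(contextLabel, privilege, [["https://a.com"]]));   // false
```

The last line reflects the note above: the privilege for https://a.com declassifies only the https://a.com component, so the https://b.com restriction still applies.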

To be flexible, COWL uses the context privilege to remove certain restrictions imposed by the context label. To avoid accidentally leaking sensitive context data, authors should use LabeledObjects.

Privileges can also be used to bypass integrity restrictions imposed by integrity labels. In particular, a privilege can be used to endorse an otherwise untrustworthy labeled context (or labeled object) so as to allow it to communicate with more trustworthy end-points (another context or server).

Consider an https://a.com context whose current integrity label is Label("https://a.com")._or("https://b.com"). This label confines the context to only communicating with entities that are at most as trustworthy as this label. For example, it restricts the context from communicating with a context whose current integrity label is Label("https://a.com"), since doing so would potentially corrupt https://a.com data (e.g., by allowing https://b.com to influence the computation).
But, if the context’s current privilege corresponds to Label("https://a.com"), the context would be able to bypass some of these integrity restrictions. Specifically, the context would be able to communicate with the more-trustworthy context (labeled Label("https://a.com")) since the privilege confers it the right to endorse (or vouch for) its context on behalf of https://a.com. Indeed, when taking privileges into account, the effective integrity label of the context is Label("https://a.com").

Note, the privilege cannot be used to bypass arbitrary integrity restrictions. For example, it does not allow the context to communicate with a context whose integrity label is Label("https://b.com").
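The integrity example above can be traced the same way. Again this is a sketch under assumptions, not the label upgrade algorithm itself: labels are plain arrays of disjunction sets, the names upgrade and mayInfluence are hypothetical, the upgrade is assumed to be the conjunction of the integrity label with the privilege's label reduced to normal form, and a receiver is assumed to be acceptable when the sender's effective integrity label subsumes the receiver's.

```javascript
// Illustrative model only: a label is an array of non-empty disjunction sets
// (arrays of origin strings).

const isSubset = (inner, outer) => inner.every((o) => outer.includes(o));

// A subsumes B if each disjunction set of B contains some disjunction set of A.
const subsumes = (a, b) => b.every((db) => a.some((da) => isSubset(da, db)));

// Deduplicate and drop disjunction sets made redundant by a smaller one.
function normalize(label) {
  const sets = label.map((d) => [...new Set(d)].sort());
  return sets.filter(
    (d, i) =>
      !sets.some(
        (e, j) => j !== i && isSubset(e, d) && (e.length < d.length || j < i)
      )
  );
}

// Assumed upgrade: conjoin the label with the privilege's label.
const upgrade = (label, privilegeLabel) =>
  normalize([...label, ...privilegeLabel]);

// Assumed check: the sender must be at least as trustworthy as the receiver
// requires, after taking the sender's privilege into account.
function mayInfluence(senderLabel, privilegeLabel, receiverLabel) {
  return subsumes(upgrade(senderLabel, privilegeLabel), receiverLabel);
}

const contextLabel = [["https://a.com", "https://b.com"]]; // a.com OR b.com
const privilege = [["https://a.com"]];                     // privilege for a.com
const noPrivilege = [];

const effective = upgrade(contextLabel, privilege);
console.log(JSON.stringify(effective)); // [["https://a.com"]]

console.log(mayInfluence(contextLabel, noPrivilege, [["https://a.com"]])); // false
console.log(mayInfluence(contextLabel, privilege, [["https://a.com"]]));   // true
console.log(mayInfluence(contextLabel, privilege, [["https://b.com"]]));   // false
```

The disjunction set containing both origins disappears in the effective label because the endorsement by https://a.com alone already implies it.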

Note, browsing contexts have a current privilege that, by default, corresponds to the origin of the context, as described in §3.3 Labeled Contexts. But, authors should set the current privilege to a delegated privilege to follow the principle of least privilege.
The current privilege is the privilege associated with the current context. §3.3 Labeled Contexts specifies how privileges are associated with contexts.
The effective confidentiality label is the label returned by the label downgrade algorithm when invoked with the current confidentiality label and current privilege.
The effective integrity label is the label returned by the label upgrade algorithm when invoked with the current integrity label and current privilege.
Code can take ownership of a privilege priv by setting the current privilege to the privilege produced via the combination of the current privilege and priv. In doing so, it is said that the context owns the privilege.

3. Framework

This sub-section is not normative.

In a nutshell, the COWL framework provides:

Policy specification via origin labels: COWL provides a Label interface for specifying confidentiality and integrity policies in terms of origins. Labels can be associated with data and content using the JavaScript LabeledObject and COWL interfaces or the Sec-COWL HTTP headers.
Explicit authority via privileges: The COWL framework provides a JavaScript Privilege interface for operating on and minting new privileges. The COWL JavaScript interface and Sec-COWL HTTP response header can be used to explicitly control the authority of a context by setting the context privilege.
Confinement enforcement mechanism: COWL extends browsing contexts and Workers with labels and privileges, which are used when enforcing confinement, i.e., when restricting a context’s network and cross-context messaging communication. This document defines the necessary changes and extensions to existing browser constructs and algorithms to enforce confinement.

3.1. Labels

Each label is an immutable object represented by a Label object, the interface of which is defined in this section.

A Label MUST have an internal label set, which is a non-empty set of disjunction sets.

A disjunction set is a set of origin URLs.

A label is said to be an empty label if its label set contains a single, empty disjunction set.

[Constructor, Constructor(DOMString origin), Exposed=(Window,Worker)]
interface Label {
  boolean equals(Label other);
  boolean subsumes(Label other, optional Privilege priv);

  Label and((Label or DOMString) other);
  Label _or((Label or DOMString) other);

  object toJSON();
  [Throws] static Label fromJSON(object obj, optional DOMString self);
};

The current WebIDL implementation requires an underscore for certain identifiers. Can we rename _or to or?

3.1.1. Constructors

Label()

When invoking the Label() constructor, the user agent MUST return a new empty label.

Label(DOMString origin)

When invoking the Label(origin) constructor, the user agent MUST use an algorithm equivalent to the following:

If the origin argument is not a URL, the constructor MUST throw a TypeError exception [ECMA-262] and terminate this algorithm.
Else, it MUST return a new Label that contains a label set of a single disjunction set, which itself MUST contain the URL corresponding to the origin of the parameter.
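The two constructor cases above can be sketched outside a COWL-enabled user agent as follows. This is an illustration, not the Label interface: makeLabel is a hypothetical stand-in that returns a plain object, and it assumes that "is a URL" can be approximated by whether the standard URL constructor parses the string, and that the stored value is that URL's serialized origin.

```javascript
// Hypothetical stand-in for the Label constructors; returns a plain object
// holding a label set (an array of disjunction sets) instead of a real Label.
function makeLabel(origin) {
  if (origin === undefined) {
    // Label(): an empty label, i.e. a single empty disjunction set.
    return { labelSet: [[]] };
  }
  // Label(origin): the URL constructor throws a TypeError if the string
  // cannot be parsed as an absolute URL (assumed to match "is not a URL").
  const url = new URL(origin);
  // Keep only the origin (scheme, host, port) of the parameter.
  return { labelSet: [[url.origin]] };
}

console.log(JSON.stringify(makeLabel().labelSet));
// [[]]
console.log(JSON.stringify(makeLabel("https://example.com/login?next=%2F").labelSet));
// [["https://example.com"]]

try {
  makeLabel("not a url");
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```

Note that the URL constructor reports an opaque origin for non-special schemes, so this approximation is only meaningful for http and https style origins.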

3.1.2. Methods

equals(Label other)

The user agent MUST return true if the Label on which the method has been called is equivalent to the other parameter; otherwise it MUST return false.

subsumes(Label other, optional Privilege priv)