IG/a view on SOP

From Web Security

This page is dedicated to capturing some discussions about the limitations of the current web security model, which relies on the Same Origin Policy (SOP). It presents some use cases that may require some amendment to that principle, and tries to list the possible directions for addressing the specificities of those use cases.

- > To be completed

Mapping out the conceptual space

The Same Origin Policy must be evaluated with respect to a number of other criteria which it contributes to in different ways.

SOP and Modal Logic

SOP is an application of concepts from modal logic to the web. As explained in RFC6454 The Web Origin Concept, SOP is a way of naming the server agents with which a user agent is engaged when retrieving information from the web, of distinguishing which agent has sent what information, and of determining which rights/authority these agents have. The relation to modal logic was already made in 2009 by Dan Connolly in his entry to the TAG "A Model of Authority in the Web". The part of modal logic of import here is the one related to belief and assertion contexts, which distinguishes between who said what in such a way as not to merge statements made by different agents, but to keep them cleanly separated. This became especially important with the emergence of JavaScript running in the browser, allowing JS served by one origin to act in the browser, potentially with the credentials of the user. This led to the specification of Cross Origin Resource Sharing (CORS) as a method of controlling this type of interaction. Modal logic should provide a mathematically strong foundation to help clarify and formalise the issues in this space.

SOP and Linkability

Linkability between documents on different origins is admittedly the first and core feature of the web, allowing the web to become the global communication medium. This can present issues of information leakage and forms an obvious exception to the SOP as stated by RFC6454: The Web Origin Concept:

Whenever user agents allow one origin to interact with resources from another origin, they invite security issues. For example, the ability to display images from another origin leaks their height and width. Similarly, the ability to send network requests to another origin gives rise to cross-site request forgery vulnerabilities [CSRF]. However, user agent implementors often balance these risks against the benefits of allowing the cross-origin interaction. For example, an HTML user agent that blocked cross-origin network requests would prevent its users from following hyperlinks, a core feature of the web.

The browser displaying HTML (not containing JS) gives the user control over which links she clicks, and so some control over which origins she lands on. With a JS agent available in the browser, the JS served by an origin can take over actions of this type, as well as form submission, potentially connecting to other sites as the user. To protect against abuses, the functionality specified by CORS: Cross Origin Resource Sharing was developed. (It may be that COWL can provide further help here.)
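As a rough illustration of the kind of control CORS gives a server over cross origin JS requests, here is a minimal sketch of a server-side decision. The allow-list, the origin names and the function name `cors_headers` are assumptions made up for this example; only the header names come from the CORS specification.

```python
# Hypothetical allow-list of origins whose JS may read this server's responses.
ALLOWED_ORIGINS = {"https://app.example"}


def cors_headers(request_origin, allow_credentials=False):
    """Return the CORS response headers for a request carrying the given Origin header.

    An empty dict means no CORS headers are sent, so the browser will not
    expose the response to JS running on the requesting origin.
    """
    if request_origin not in ALLOWED_ORIGINS:
        return {}
    headers = {
        # Echo the single allowed origin rather than using the "*" wildcard.
        "Access-Control-Allow-Origin": request_origin,
        # Tell caches that the response depends on the Origin request header.
        "Vary": "Origin",
    }
    if allow_credentials:
        # Only with this header will the browser expose responses to
        # requests that were sent with the user's cookies or other credentials.
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers
```

The point of the sketch is that the decision stays with the origin being contacted: JS from another origin can only act with the user's credentials if that origin explicitly opts in.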

User Control ( a.k.a User Mediation )

In order for a user to be able to act, she has to have ways of steering her navigation through the web (web browsers used to be called navigators). This is done by providing the legally responsible human user with an interface with which she can control the navigation [1]. The user can only be deemed responsible for actions she can in some way control. The process of giving the user control is also known as User Mediation.

The user should be able to control information about her identity. Note that when interacting with an origin the user's actions will reveal information about herself which can potentially start identifying her, at least as the type of person who clicks certain links, just as the origin is revealing information about itself by publishing certain types of information.

User mediation does not necessarily mean that the user needs to decide for every single event. The user can also choose a policy to automate her choices. In important cases what matters is that the user be aware of the automation, and be able to change it easily.

[1] Different users with different skill levels will be able to steer different types of navigators. Programmers in this respect are like pilots of fighter aircraft, who can work with a large number of knobs and dials. Usual skill levels require much more fluid controls. Where Programming Interfaces explain how software can interact, Human Computer Interfaces are instead located in the space between technology, psychology and design.

(Cross Origin) Identity

Web Agents are identified by their Origin ( a global identifier ), and when the communication is secured by TLS this can be verified by Server Certificates and public key cryptography. A number of technologies enable the identification of the user across time, and origins. We can quickly list a number of them here:

  • anonymous browsing: browsing where the user makes sure not to divulge any identifying information, if possible not even enough to tie two requests to the same agent
  • cookies: specified by RFC6265, these implement a fuzzy notion of Same Origin weak identity that in effect is mostly controlled by the server, with legal obligations in some countries to make the setting of cookies visible and controllable by the user. The server can restrict a cookie securely to one origin, but can also set it more widely, across subdomains and across protocols (http and https).
  • username/password: this is essentially a Same Origin identity, since re-using a password across origins would create a security hole, i.e. it would enable another agent to pretend to be the first. This pattern is a major security problem as the restriction is not easily enforceable, and many people use similar username/password pairs across origins.
  • FIDO uses a public key as an indirect identifier of the agent in possession of the device that has the private key. Even though public key cryptography is usually used across origins, as it is with server certificates, FIDO by default restricts a public key to be used solely for one origin, though it can be extended across more than one with facets.
  • OpenID: OpenID 1.0 used a URL as an indirect identifier of the user, who could control the page referred to by that URL to point to an identity provider. This is a global identifier enabling cross origin linking of identity.
  • WebID is a direct `http` URL that refers to an agent globally via a description. This is also an identifier that enables cross origin linking of information about agents, to enable a cross origin social web.

Note that non cross origin identity does not guarantee privacy since:

  • even non cross origin identities ( cookies, username/password, FIDO keys) can be used by an origin to map interactions of a user over time which they can publish and share with others.
  • hypothetically, if there were only one large provider and all users had to use it, then that provider would be able to see all information flow between all users, assuming the information were not encrypted; encrypting it would create a completely different set of problems regarding key management and sharing.


( would be best to get some better input from the privacy working group. Here is a first attempt. )

Privacy is a moral and political concept, relating to how agents should acquire and pass on knowledge they have about an individual or group. Of course if no information is ever divulged ( were someone to live in a cave away from anyone ) then privacy will be guaranteed by default: by the mere fact that nobody else will have any information to transmit. But in a normal civil situation in which most people live, privacy comes down to certain legal and moral restrictions as to what others can do with information and how they can acquire it.

As such it is quite possible to have a cryptographically secure relation with a party that does not respect privacy.


( what type of concept is that of security ? )

Exceptions to SOP

A number of existing exceptions to SOP reveal how SOP is one of a constellation of requirements, which can be overridden if correct measures are taken to guarantee the other important principles such as privacy, user control, and security.

These exceptions should be studied in detail. This may well lead us to find a theory, likely based on reasoning in modal logic, of when such exceptions are admissible and what criteria need to be fulfilled when making them.

This would then make it easier to work on the security aspects of new protocols that need to work across origins.


Cookies, as specified by RFC6265: HTTP State Management Mechanism, implement a fuzzy notion of Same Origin. We can distinguish two notions of Same Origin:

  • a strong notion of Same Origin where two origins are identical only if they are named by the same protocol, domain, port triple
  • a weak notion of Same Origin where the two origins are identical if they refer to the same agent.
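The strong notion can be made concrete with a small sketch that reduces a URL to its (protocol, domain, port) triple and compares triples. This is only an illustration for http and https URLs: the function names are made up for this example, and RFC6454 covers further cases (other schemes, opaque origins) that are ignored here.

```python
from urllib.parse import urlsplit

# Default ports, so that https://example.com and https://example.com:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple of an http(s) URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """Strong Same Origin: the two triples must be identical."""
    return origin(url_a) == origin(url_b)
```

Under this definition `https://example.com/` and `https://admin.example.com/` are different origins even if both are run by the same agent, which is exactly where the weak notion diverges.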

Cookies allow the server to set state in the browser under either notion. Not everything goes for weak origins here: a cookie can only be widened to cover subdomains, and to be shared between the http and https protocols.
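The subdomain widening corresponds to the domain-match rule of RFC6265 (section 5.1.3). The following is a simplified sketch of that rule only: it ignores the Secure and Path attributes, and the public suffix checks that browsers apply when a cookie is set. The function names are chosen for this example.

```python
import ipaddress


def is_ip_address(host):
    """True if host is a literal IPv4 or IPv6 address rather than a host name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_match(request_host, cookie_domain):
    """Simplified RFC6265 domain-match: would a cookie with this Domain be sent to request_host?"""
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower()
    if cookie_domain.startswith("."):
        # A leading dot in the Domain attribute is ignored.
        cookie_domain = cookie_domain[1:]
    if request_host == cookie_domain:
        return True
    # Otherwise the cookie domain must be a suffix of the host, cut at a label
    # boundary, and the host must be a name, not an IP address.
    return request_host.endswith("." + cookie_domain) and not is_ip_address(request_host)
```

Note the asymmetry: a cookie set for `example.com` reaches `admin.example.com`, but not the other way round.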

For those who believe that strong Same Origin is the only correct interpretation of SOP, cookies form an exception to SOP ( and so would FIDO's facets, see below. )


Clicking URLs can leak identifying information. In this case it is admitted that if the user is in control of the clicking, he is as much in control as he can be of the leakage of information.

This feature is used by identity protocols such as OpenID, which enable the user to be redirected to an Identity Provider where, using a form, the user can send identity and other attributes to the Relying Party.

In the case of OpenID the user is in control of the identity he uses (the OpenID URL), which allows the Identity Provider to be found, and, via the Identity Provider, of what attributes he sends. Here CORS is important in that it limits what actions JS agents running in the browser can do across origins.

Form Auto Fill

Since filling in forms is often repetitive across sites, be it filling in an OpenID URL, shipping addresses or credit card details, many browsers provide form auto-fill functionality.

In a simple view this is using information from one web site to fill in information in another site, and so is cross origin.

On another view this is asking the user, which we can think of as an origin, for information, and then storing and automating the process of selecting that information. Interacting with humans requires user interface components, and these have to be protected from JS agents that may also be active in the browser page - ie. such form selection methods have to make the information visible to the user without making it visible to a JS agent that may be active on the same page.

Such User Interfaces always come with APIs, which would also allow software (owned by the user) to be written to automate the task further, by delegating the process to that software.


Though FIDO is by default restricted to a Same Origin, it seems to also allow a key to be used across more than one origin, as specified in the FIDO AppID and Facet Specification v1.0. This document contains the following text in the introduction:

While the FIDO approach is preferable for many reasons, it introduces several challenges.
What set of Web origins and native applications (facets) make up a same logical application and how can they be reliably identified?
How can we avoid making the user register a new key for each web browser or application on their device that accesses services controlled by the same target entity?
How can access to registered keys be shared without violating the security guarantees around application isolation and protection from malicious code that users expect on their devices?
How can a user roam credentials between multiple devices, each with a user-friendly Trusted Computing Base for FIDO?
This document describes how FIDO addresses these goals (where adequate platform mechanisms exist for enforcement) by allowing an application to declare a credential scope that crosses all the various facets it presents to the user.

In the next section we read

When a user performs a Registration operation [UAFArchOverview] a new private key is created by their authenticator, and the public key is sent to the Relying Party. As part of this process, each key is associated with an AppID. The AppID is a URL carried as part of the protocol message sent by the server and indicates the target for this credential. By default, the audience of the credential is restricted to the Same Origin of the AppID. In some circumstances, a Relying Party may desire to apply a larger scope to a key. If that AppID URL has the https scheme, a FIDO client may be able to dereference and process it as a TrustedFacetList that designates a scope or audience restriction that includes multiple facets, such as other web origins within the same DNS zone of control of the AppID's origin, or URLs indicating the identity of other types of trusted facets such as mobile apps.

In terms of logic what we see here is that FIDO provides the possibility of identifying various origin URLs as referring to the same actor, thereby enabling trust in statements made by an actor with a given name to be applied to the same actor identified with a different name (origin)

_THE ABOVE CLAIM IS INCORRECT_ "such as other web origins within the same DNS zone of control of the AppID's origin" essentially means that what is colloquially known as "domain lowering" (as in what is colloquially known as the "cookie same origin policy") is used to determine if a given web origin is "within the same DNS zone of control of the AppID's origin". How to determine this is specified in <https://fidoalliance.org/specs/fido-uaf-v1.0-ps-20141208/fido-appid-and-facets-v1.0-ps-20141208.html#determining-if-a-caller-s-facetid-is-authorized-for-an-appid> Step 14. See also 'HTTP cookie processing algorithm in terms of Same Origin Policy and “effective Top Level Domains (eTLDs)” aka “Public Suffixes”' <http://identitymeme.org/http-cookie-processing-algorithm-etlds/>. =JeffH
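To make the disagreement above more concrete, here is a rough sketch of a facet check along the lines JeffH describes: a web origin in the TrustedFacetList only counts if it shares its registrable domain with the AppID (the "domain lowering" step), and the caller's FacetID must then be listed. This is a simplification, not the algorithm of the specification. The `trustedFacets` and `ids` field names are taken to follow the example in the FIDO AppID and Facet Specification and should be checked against it; `PUBLIC_SUFFIXES` is a tiny stand-in for the real Public Suffix List, and all function names are made up for this example.

```python
from urllib.parse import urlsplit

# Illustrative stand-in for the Public Suffix List (an assumption, not real data coverage).
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def registrable_domain(host):
    """Return the public suffix plus one label, e.g. 'example.com', or None if unknown."""
    labels = host.lower().split(".")
    for i in range(len(labels)):
        # Try the longest candidate suffix first, so 'co.uk' wins over 'uk'.
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return ".".join(labels[i - 1:]) if i > 0 else None
    return None


def facet_authorized(facet_id, app_id, trusted_facet_list):
    """Simplified check: is facet_id an acceptable facet for the AppID URL?"""
    app_domain = registrable_domain(urlsplit(app_id).hostname or "")
    for entry in trusted_facet_list.get("trustedFacets", []):
        for candidate in entry.get("ids", []):
            parts = urlsplit(candidate)
            if parts.scheme == "http":
                continue  # plain http web origins are never trusted in this sketch
            if parts.scheme == "https":
                candidate_domain = registrable_domain(parts.hostname or "")
                if app_domain is None or candidate_domain != app_domain:
                    continue  # web origin outside the AppID's DNS zone of control
            # Non-web facets (e.g. mobile app identifiers) are compared as listed.
            if candidate == facet_id:
                return True
    return False
```

On this reading the list does not let an AppID declare arbitrary origins to be the same actor: web origins are limited to the same registrable domain, much as with cookies.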


The WebCrypto API is a JS API which allows an Origin to publish JavaScript which when executed in the browser gives the JS Agent access to cryptographic functions. This allows the JS agent to create public and private keys with which it can sign and encrypt information.

The Web Cryptography API defines a low-level interface to interacting with cryptographic key material that is managed or exposed by user agents. The API itself is agnostic of the underlying implementation of key storage, but provides a common set of interfaces that allow rich web applications to perform operations such as signature generation and verification, hashing and verification, encryption and decryption, without requiring access to the raw keying material.

This capability could then allow JS from an origin to use the crypto API to identify the JS running in the browser across origins, by signing tokens, tying them to a global identifier, and using these to communicate using XMLRPC.

The user in this case is in control of which origin he goes to. It is then up to the site owner to follow guidelines that would put the user in control of his identity when making cross origin requests, via some form of user selectable policy, as there is no tie-in to the user chrome.

Note that here there is a good argument that not much can be done against this seeming exception to SOP:

  • a JS cryptography library could be written and used even without the JS crypto API
  • Trying to protect against cross origin usage of a crypto library would probably cripple CORS, and still allow cross origin crypto apis to be developed by passing signed or cryptographic messages inside the body of the HTTP message
  • the same communication could be mediated via the Origin ( server to server) which could communicate to other origins, authenticate and return the information thus collected to the browser via a proxy

Going further, one could consider the browser itself to be an application downloaded from an origin. Seen that way, browsers that have keychains are not unlike a JS Web application that stores keys in local storage and can use them to communicate with other origins. If so, is SOP really an argument against keystores that put the user in control of cryptographic key material and credentials? A problem that is often raised is the difficulty of coming up with good, usable browser based User Interfaces to put the user in control of cryptographic functions. Could COWL: A Confinement System for the Web, which the WebAppSec WG is working on as a draft spec, help here by allowing the development of flexible yet secure User Interfaces?

Client Certificates over TLS

Client Certificates, used in browsers across the web for over 15 years, allow a browser to create a public/private key pair, store it in the browser's chosen keystore, and send the public key to an Origin that can then sign a certificate, which on returning to the browser will be added to the keystore and tied to the private key (see <keygen> in HTML5). The Certificate can only be used for authentication (using TLS client certificate requests) with the user's permission, as built into the chrome. This puts the user in control of the ability to identify himself with a global identifier - a public key in addition to an e-mail, or WebID - to any other origin.

This has the following advantages over OpenID type schemes:

  • An Identity can be selected by the user by point and click rather than having to type a global identifier into a form as required by OpenID ( or even BrowserId )
  • Public key cryptography allows the user to be identified without the need to be redirected to the identity provider
  • It removes the problem of passwords just as FIDO does
  • It can be tied to cryptographic hardware, as shown by the German Privacy Foundation's demonstration of their open source cryptokey.

Along these lines it would be worth exploring how WebCrypto could be extended to put the user in control of public keys for identity or for signing claims, and how this could be tied into the WebAppSec's Credential Management work in order to allow cross origin identity. In order to enable innovation on the UI side this may also require some way for origins to build authentication schemes that limit leakage of information, using a JS framework such as COWL.

Native Messaging and SOP

Google have introduced an extension scheme called Native Messaging which allows Chrome to talk to native applications (see Chrome Native Messaging).

This allows web pages to communicate with native extensions. The extension needs to specify the domains it can communicate with in its Manifest file. They give the following example:

"externally_connectable": {
  "matches": ["*://*.example.com/*"]
}

with the following explanation

This will expose the messaging API to any page which matches the URL patterns you specify. The URL pattern must contain at least a second-level domain - that is, hostname patterns like "*", "*.com", "*.co.uk", and "*.appspot.com" are prohibited. From the web page, use the runtime.sendMessage or runtime.connect APIs to send a message to a specific app or extension.

Here the extension can specify the domains it can communicate with. This is somewhat limited, but not by much. Since the extension can keep information, it would allow communication between Origins.
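To see how wide a single pattern such as `*://*.example.com/*` is, here is a simplified sketch of matching a URL against it. It assumes that `*` as a scheme stands for http or https, and that a `*.` host prefix covers the named domain and all of its subdomains; the path is matched with shell-style wildcards, which is only an approximation. The function name is made up for this example, and the sketch does not implement the rule prohibiting patterns like `*.com`.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def matches_pattern(url, pattern):
    """Simplified test of a URL against a '<scheme>://<host>/<path>' match pattern."""
    scheme_pat, rest = pattern.split("://", 1)
    host_pat, _, path_pat = rest.partition("/")
    path_pat = "/" + path_pat

    parts = urlsplit(url)
    host = parts.hostname or ""

    # Scheme: '*' is assumed to mean http or https only.
    if scheme_pat == "*":
        if parts.scheme not in ("http", "https"):
            return False
    elif parts.scheme != scheme_pat:
        return False

    # Host: '*.example.com' covers example.com and every subdomain of it.
    if host_pat.startswith("*."):
        base = host_pat[2:]
        if host != base and not host.endswith("." + base):
            return False
    elif host_pat != "*" and host != host_pat:
        return False

    # Path: shell-style wildcard match (an approximation).
    return fnmatchcase(parts.path or "/", path_pat)
```

Every page on every subdomain of the listed domain, over either protocol, can therefore reach the extension, which is considerably wider than a single strong origin.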

Problems posed for Future Applications

WebAppSec Credential Management

The WebAppSec's Credential Management Level 1 Working Draft has a section on cross origin leakage where it is written that:

Credentials are sensitive information, and user agents need to exercise caution in determining when they can be safely shared with a website. The safest option is to restrict credential sharing to the exact origin on which they were saved. That is likely too restrictive for the web, however: consider sites which divide functionality into subdomains: example.com vs admin.example.com.
As a compromise between annoying users, and securing their credentials,...

So here, as with FIDO and Cookies, there is pressure from usability requirements to move beyond a purely syntactic definition of an origin to one that defines an Origin in a more flexible way, as the referent of a set of URIs.

The Credential Management Working Draft also allows Federated Credentials, which in a certain sense are cross origin, as the identities the user may want to use are cross origin. But just as with UA help in filling in forms, this is acceptable as such information is user mediated: the user is in control of what kind of cross origin information is selected in the database and given to the JavaScript Origin. The browser chrome can use this database to build a UI so that the user (who can be thought of as an Origin) can select an OpenID, proof of age, or payment systems, or potentially even cryptographically based authentication protocols that are more in line with HTTP/2, such as the Signing HTTP Messages IETF draft proposal.

WebPayments and SOP

Payment resources, whether they reside in the Browser, in a client-platform Wallet, or in the Cloud, do not have any a priori relation with merchants. This requires browsers to enable an open world of sellers and payment systems. The importance of this feature has been underlined by Jeff Jaffe in an article in Davos.

What solutions could possibly work?

using redirects

It is possible to follow the OpenID pattern. The user could pay by entering his payment service's URL, and then being redirected to the payment provider where he could be authenticated using FIDO or similar.

The problem, as with OpenID, is that it requires the user to type a URL, which is very tedious on cell phones. To escape this, Relying Parties have instead suggested a small set of well known IDPs, which in turn gives rise to the NASCAR Problem. In practice this has reduced the number of identity providers to a few big, well known ones.

The minimum needed to overcome this limitation would be for the browser to have a database of identifiers that it can use to fill in forms automatically for the user. But this can be seen to be an exception to SOP. It also does not make for a very nice user interface as the form filler can only fill in text.

browser support

Building more advanced user interfaces on top of a browser database of information would require some browser agent, one that the user is comfortable giving access to the database, to provide a UI for identifier selection.

Clearly it cannot be the JS from a random Origin, since that JS could siphon off all the data from the database.

It could be built into the browser, which requires cross browser agreement on APIs and innovation at the UI layer; this is being explored in the Credential Management Specification. But for Cross Origin Identification systems this would need to go somewhat further than their proposed weak SOP, and allow cross origin identities to be selected by the user, in this case bank identities.

Would it be possible to script such a UI? Only if the user were able to be sure that the JS trusted to build the UI was only able to act in the user's interest. This would require something along the lines of COWL for JS in the browser, and a lot of research to understand the implications.

client applications

Further along we have solutions like Apple Pay, which sit between the untrusted merchant code and the payment resource and form a type of advanced user agent, allowing the user to opt in (accept) or opt out (block) payments to different sites/origins. These applications can either be thought of as SOP compliant, since the Origin making the decision is the Human User, or as Cross Origin, since that Human User needs to communicate with service providers of different origins, intermediated by the payment software.

Hardware Crypto and SOP

This topic should be interpreted as: Exposing cryptographic primitives like signature operations to ordinary (untrusted) Web-pages.

In this case SOP is called for, since it would be dangerous to let arbitrary Web sites access such resources. It would also be a privacy issue, since any site could for example enumerate certificates without the user's knowledge.

Couldn't this be solved with a user opt-in/opt-out solution? No, because the questions would be of a kind that not even an expert in cryptography could answer, because they lack context. Example:

somesite.com wants to enumerate keys, do you agree?

Why would SOP help here? By restricting access so that only the creator of a key can use it, you can often skip the UI part completely, because the only thing a bad service can do is harm its own clients (and thus itself), which isn't a very interesting attack and is, by the way, always possible no matter what technical solution you have.
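The restriction "only the creator of a key can use it" can be sketched as a keystore that records the creating origin with each key and refuses everyone else. This is a toy model under stated assumptions: the class and method names are invented for this example, and HMAC with a random secret stands in for a real (typically asymmetric, hardware-backed) signature operation.

```python
import hashlib
import hmac
import secrets


class OriginBoundKeyStore:
    """Toy keystore in which a key is usable only by the origin that created it."""

    def __init__(self):
        self._keys = {}  # key_id -> (creator_origin, secret)

    def create_key(self, origin):
        """Create a key on behalf of origin and return its identifier."""
        key_id = secrets.token_hex(8)
        self._keys[key_id] = (origin, secrets.token_bytes(32))
        return key_id

    def list_keys(self, origin):
        """An origin can enumerate only the keys it created itself."""
        return [key_id for key_id, (owner, _) in self._keys.items() if owner == origin]

    def sign(self, origin, key_id, message):
        """Sign message with key_id, but only for the origin that created the key."""
        owner, secret = self._keys.get(key_id, (None, None))
        if secret is None or owner != origin:
            # Same error for unknown and foreign keys, so nothing leaks about other origins.
            raise PermissionError("no such key for this origin")
        # HMAC-SHA256 is a stand-in here for a real signature primitive.
        return hmac.new(secret, message, hashlib.sha256).digest()
```

No prompt is needed in this model: a call from `somesite.com` can never see or use keys created by another origin, so there is no question for the user to answer.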

SOP and the traditional eID use-case are thus at odds.

{ bblfish: what are traditional eID use cases beyond authentication? Authentication alone does not require a complex UI. For example one could just use certificate selection or card selection user interfaces for those as demonstrated by the German Privacy Foundation's crypto key demo }

Note that WebPayments can use Hardware Crypto if there is a mediator application (as described above) which never exposes raw crypto APIs to merchants. This is how all existing Wallets work (although they have yet to make it to the web).

{ bblfish: this suggests that it is a question as to what types of interfaces are exposed. To enable new interfaces to emerge it may require JS from one Origin to be able to be usable by JS from another Origin in a very protected way. There are some thoughts on the drawing board about this, but it is not clear if it is the right thing. Something like COWL may be needed. }

Extended Theory of SOP

Here we should develop an extended theory of SOP, security, identity, etc, that makes sense of the exceptions above and enables open discussion to help answer future problems.