Bug 20965 - EME results in a loss of control over security and privacy.
Status: RESOLVED LATER
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: PC Windows NT
Importance: P2 blocker
Target Milestone: ---
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on: 21104 22909 22910
Blocks:
Reported: 2013-02-12 02:32 UTC by Fred Andrews
Modified: 2013-09-30 23:54 UTC

Description Fred Andrews 2013-02-12 02:32:50 UTC
EME is not designed with privacy in mind or to the principle that the user comes first.
Comment 1 Andreas Kuckartz 2013-02-12 10:29:29 UTC
A Google company (Widevine) offers "silent monitoring" in connection with DRM.

It is likely that they will try to implement this in an EME CDM.
Comment 2 Adrian Bateman [MSFT] 2013-02-12 13:51:16 UTC
I think all communication that could result in a privacy concern is out of scope of the EME spec. Please provide a pointer to spec text that causes this issue and propose new text.
Comment 3 Fred Andrews 2013-02-12 20:16:05 UTC
Sorry, but there is an expectation that specifications consider privacy and security in their design; in fact, there are separate working groups focused on these issues, and cross-origin protection is at the heart of web security.  EME should be held to the same standard, and it is expected that the security implications can be reviewed.  Please leave this bug open until the CDM is defined and the security and privacy implications can be reviewed.
Comment 4 Andreas Kuckartz 2013-02-14 19:51:00 UTC
EME is insecure by design because it cannot realistically be expected that the specification and/or the source code of practically relevant CDMs will be made available. The security of users therefore depends on the providers of the CDMs. This is not acceptable for systems connected to the Internet.

It is known that at least one DRM company (Widevine, a Google company) is offering/promoting "silent monitoring". It also is known that DRM has been used to install malware on user systems (Sony rootkit).
Comment 5 Henri Sivonen 2013-02-18 06:42:17 UTC
(In reply to comment #2)
> I think all communication that could result in a privacy concern is out of
> scope of the EME spec.

An overview of Adobe Access https://www.adobe.com/support/adobeaccess/pdfs/server/AdobeAccess_4_Overview.pdf and the PlayReady Compliance Rules http://download.microsoft.com/download/7/8/8/788478CC-74A3-4BFE-8CBE-07D80218658B/Compliance_Rules_for_PlayReady_Final_Products_19_December_2012.doc indicate that both systems have keys that are unique to a given computer or device (not just device model). 

To the extent such a unique key participates in some detectable way in the key exchange dance that happens in the messages that are opaque to EME itself, such a unique key could be used as an exceptionally strong super cookie (hyper cookie?). That is, serving a trivial media file that triggers key exchange could be used by Web sites to make browsers reveal uniquely identifying information to any site on the Web, enabling unprecedentedly reliable tracking of users across the Web.

Since such uniquely identifying keys seem to be common enough a characteristic of DRM systems that one can find such a characteristic in a couple of DRM systems with large installed bases by a quick inspection of public documentation, it seems reasonable to assume that the characteristic can be expected to be common to various CDMs that one might expect to live behind EME. Therefore, it seems reasonable for EME itself to address the privacy implications of such a probable characteristic of CDMs.
Comment 6 Joe Steele 2013-02-19 21:19:41 UTC
I do not believe that having a unique key or cookie is, in and of itself, a violation of privacy. Having such a key that the user cannot exercise any control over seems like a problem. I would expect CDMs to be subject to the same constraints that browsers are today, i.e. they should provide a "private" mode where such information is not retained and provide mechanisms for the user to remove such information if it already exists. There is nothing in the EME specification that prevents compliance with good privacy practices.
Comment 7 Henri Sivonen 2013-02-20 06:54:05 UTC
(In reply to comment #6)
> I do not believe that having a unique key or cookie is, in and of itself, a
> violation of privacy.

Exposing the same unique value to all sites is enough of an enabler of privacy violations that it should be addressed.

> Having such a key that the user cannot exercise any
> control over seems like a problem. I would expect CDMs to be subject to the
> same constraints that browsers are today, i.e. they should provide a
> "private" mode where such information is not retained and provide mechanisms
> for the user to remove such information if it already exists.

Private browsing modes primarily address privacy relative to other users of the same computing device that the browser runs on. They either aren't or are less about addressing privacy relative to the sites that are accessed or relative to third parties whose components (typically ads) are included on the sites.

Addressing privacy relative to third parties (such as ad aggregators) in particular is an issue that browsers seek to address in their normal mode of operation without requiring the user to enter a private browsing mode. For example, Safari, by default, outside the private browsing mode, tries to avoid honoring third-party cookies. Therefore, the issue of each CDM installation having unique key material whose uniqueness is detectable by Web sites is the kind of issue browsers care about addressing in the normal mode of operation.

Persistently storing content keys/licenses to last beyond the end of the current browsing session would be the kind of thing that would need addressing in order to address privacy relative to other users of the same computing device that the browser runs on. However, to the extent EME is meant to be about streaming, it should be possible to make EME or its CDMs not use permanent storage for content keys/licenses. (If the implementors of EME or CDMs are planning on addressing non-streaming use cases that involve writing content keys/licenses in permanent storage, I think it would be good for them to speak up about their intentions.)

> There is
> nothing in the EME specification that prevents compliance with good privacy
> practices.

EME should have some kind of privacy considerations section that points out the risks and suggests remedies so that each implementor doesn't need to discover the problems independently.
Comment 8 Joe Steele 2013-02-20 16:22:22 UTC
(In reply to comment #7)
> EME should have some kind of privacy considerations section that points out
> the risks and suggests remedies so that each implementor doesn't need to
> discover the problems independently.

[steele] Can you suggest some text that we could add to the spec?
Comment 9 Henri Sivonen 2013-02-21 08:20:19 UTC
(In reply to comment #8)
> [steele] Can you suggest some text that we could add to the spec?

## Privacy

### Persistent uniquely identifying key material in the CDM

If the Key System involves key material unique to a particular computer or device that the CDM runs on (e.g. device keys permanently included in the hardware and unique to a particular device instance as opposed to common to a device model or unique keys assigned to a particular computer during CDM installation or setup of a software CDM), it is possible to use these unique values to track users across multiple sites and over time by serving users a trivial media file (e.g. a minimal-length audio file consisting of silence) that triggers needkey and/or keymessage events with CDM-specific messages that show evidence of the possession of the unique key material.

When the user agent is to fire a needkey or keymessage event whose message contains information that can be used for uniquely identifying a particular device or computer and the user has not already authorized the origin of the needkey or keymessage event handlers to receive such uniquely identifying information, the user agent should display a non-modal notification asking the user to authorize the exposure of uniquely identifying information to the origin of the needkey and keymessage event handlers and the user agent should defer the delivery of the needkey or keymessage event until the user authorizes the exposure of uniquely identifying information. If the user chooses to dismiss the notification without authorizing the exposure of uniquely identifying information, the user agent should fire a keyerror event in place of a keymessage event or an error event in place of a needkey event, stop further processing of the media element (until the next media load on the element) and discard the deferred needkey or keymessage event. (XXX specify error codes in the previous sentence.)

Note: The circumstances that require user authorization can be avoided by using key material that is common to a large number of devices or computers (e.g. common for a given version of the CDM that's installed on a large number of devices or computers) and/or by using key material that is randomly generated upon need, specific to the origin of the needkey and keymessage event handlers and discarded after the user navigates away from the origin or at the end of the user agent session at the latest.
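
As a rough illustration of the deferral described above, the following Python sketch models a user-agent-side gate. It is only a sketch under stated assumptions: the names `IdentifierGate`, `KeyEvent` and `Decision` and the `exposes_unique_id` flag are hypothetical and are not part of EME or of any user agent. It assumes the UA can learn from the CDM whether a given message carries uniquely identifying information.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DISMISS = "dismiss"


@dataclass
class KeyEvent:
    kind: str                # "needkey", "keymessage", "keyerror" or "error"
    origin: str              # origin of the registered event handlers
    exposes_unique_id: bool  # hypothetical flag the UA learns from the CDM


@dataclass
class IdentifierGate:
    """Hypothetical UA-side gate that holds identifying events until consent."""
    authorized_origins: set = field(default_factory=set)
    deferred: dict = field(default_factory=dict)  # origin -> held events

    def on_cdm_event(self, event):
        """Return the events to fire now; hold identifying ones pending consent."""
        if not event.exposes_unique_id or event.origin in self.authorized_origins:
            return [event]
        self.deferred.setdefault(event.origin, []).append(event)
        return []  # the UA would show its non-modal notification here

    def on_user_decision(self, origin, decision):
        """Release held events on consent, or replace them with error events."""
        held = self.deferred.pop(origin, [])
        if decision is Decision.ALLOW:
            self.authorized_origins.add(origin)
            return held
        # Dismissed: keyerror replaces keymessage, error replaces needkey,
        # and the deferred events themselves are discarded.
        return [
            KeyEvent("keyerror" if e.kind == "keymessage" else "error", origin, False)
            for e in held
        ]
```

Events from a Key System that uses only common or per-origin ephemeral key material would carry `exposes_unique_id=False` in this model and pass through without any prompt, which corresponds to the case described in the note.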

### Persistently stored data

The CDM or the user agent must not write initialization data extracted from media files or data received through the update() and createSession() methods (or parts of such data) into persistent storage. Such data could be used by Web sites to track repeat visits and could be used by people who have access to the computing device that the CDM or the user agent run on to gain clues about what sites have been visited. Storing such data persistently is not required for streaming use cases.

(XXX If offline use cases, such as caching a movie on disk for watching in transit with no or bad connectivity, are to be addressed at all, they should be addressed using an explicit mechanism rather than being addressed as a CDM-dependent hand wavy magic side effect of update() and createSession(). The NavigationController work that is being proposed as a replacement for app cache could be used to cache movies and licenses so that the NavigationController would be in charge of caching both and handing them to the media stack and EME as if the requests happened online.)
Comment 10 Joe Steele 2013-02-21 22:38:06 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > [steele] Can you suggest some text that we could add to the spec?
> 
> ## Privacy
> 
> ### Persistent uniquely identifying key material in the CDM
> 
> If the Key System involves key material unique to a particular computer or
> device that the CDM runs on (e.g. device keys permanently included in the
> hardware and unique to a particular device instance as opposed to common to
> a device model or unique keys assigned to a particular computer during CDM
> installation or setup of a software CDM), it is possible to use these unique
> values to track users across multiple sites and over time by serving users a
> trivial media file (e.g. a minimal-length audio file consisting of silence)
> that triggers needkey and/or keymessage events with CDM-specific messages
> that show evidence of the possession of the unique key material.
> 
> When the user agent is to fire a needkey or keymessage event whose message
> contains information that can be used for uniquely identifying a particular
> device or computer and the user has not already authorized the origin of the
> needkey or keymessage event handlers to receive such uniquely identifying
> information, the user agent should display a non-modal notification asking
> for the user to authorize the exposure of uniquely identifying information
> to the origin of the needkey and keymessage event handlers and the user
> agent should defer the delivery of the needkey or keymessage event until the
> user authorizes the exposure of uniquely identifying information. If the
> user chooses to dismiss the notification without authorizing the exposure of
> uniquely identifying information, the user agent should fire a keyerror
> event in place of a keymessage event or an error event in place of a needkey
> event, stop further processing of the media element (until next media load
> on the element) and discard the deferred needkey or keymessage event. (XXX
> specify error codes in the previous sentence.)

This is good material. I have to think more about the specifics, but I would not object to a user notification like this. One thing you did not mention is the ability for a user to change their mind and remove such information once it has been created. I think both of these use cases fall under "private" mode I mentioned before.

> ### Persistently stored data
> 
> The CDM or the user agent must not write initialization data extracted from
> media files or data received through the update() and createSession()
> methods (or parts of such data) into persistent storage. Such data could be
> used by Web sites to track repeat visits and could be used by people who
> have access to the computing device that the CDM or the user agent run on to
> gain clues about what sites have been visited. Storing such data
> persistently is not required for streaming use cases.

I disagree with this. I think a more useful restriction here would be that information stored by the CDM related to a particular domain can only be retrieved (if at all) from applications hosted on that domain. This would be equivalent from a security perspective, since the application could simply store the license itself during acquisition if this restriction existed. I am not sure such a restriction is needed however, because there is no mechanism proposed to directly allow a web application to access this type of information.
Comment 11 Joe Steele 2013-02-21 23:21:02 UTC
*** Bug 20966 has been marked as a duplicate of this bug. ***
Comment 12 Fred Andrews 2013-02-22 00:18:46 UTC
(In reply to comment #11)
> *** Bug 20966 has been marked as a duplicate of this bug. ***

Could I suggest keeping discussion related to 'informing the user' and 'seeking consent' etc in bug 20966.  Even if this bug 20965 is marked as 'wont fix', there would still be some chance of addressing bug 20966.
Comment 13 Henri Sivonen 2013-02-22 08:58:59 UTC
(In reply to comment #10)
> One thing you did not mention
> is the ability for a user to change their mind and remove such information
> once it has been created.

Yeah, if the user chooses some kind of "Always Allow on This Site" option, this piece of data should be listed and revocable in the same UI that's used for other site permissions like Geolocation.

> I think both of these use cases fall under
> "private" mode I mentioned before.

I'm pretty sure you'll find that browser vendors treat the issue of "globally unique persistent identifier exposed to all sites" as an issue for all modes of operation, not just a "private" mode issue.

It's worth noting that a given CDM vendor could support two Key Systems: one that uses persistently individualized keys and another that randomly generates a per-origin session key pair and signs the public key using a private key common to all instances of the same CDM version. Content providers could then choose how they value the persistent device identity against the authorization UI. (This assumes, of course, that the UA is aware of the nature of the Key Systems and knows that one of them doesn't necessitate the authorization UI.)

> > ### Persistently stored data
> > 
> > The CDM or the user agent must not write initialization data extracted from
> > media files or data received through the update() and createSession()
> > methods (or parts of such data) into persistent storage. Such data could be
> > used by Web sites to track repeat visits and could be used by people who
> > have access to the computing device that the CDM or the user agent run on to
> > gain clues about what sites have been visited. Storing such data
> > persistently is not required for streaming use cases.
> 
> I disagree with this. I think a more useful restriction here would be that
> information stored by the CDM related to a particular domain can only be
> retrieved (if at all) from applications hosted on that domain.

What's your use case of persistent storage of CDM-related information? I thought it wasn't worthwhile to propose more complex requirements without knowing the use cases that the requirements were supposed to address.

> This would be
> equivalent from a security perspective, since the application could simply
> store the license itself during acquisition if this restriction existed.

It would be roughly equivalent for sites like netflix.com. Netflix requires the user to log in, so it can do longitudinal tracking of a given customer anyway. If the CDM exposes information that's unique to the device, then Netflix could also track which customers log in from the same device without persistent storage of licenses adding anything from the site POV. However, if the CDM is of the type that doesn't expose unique information to the Web, then persistent storage of licenses would allow the tracking of which customers sometimes share a device when such tracking wouldn't necessarily be enabled without persistent storage of licenses.

For sites like thedailyshow.com, it would not be equivalent. The site does not require login, so persistent storage of licenses would enable longitudinal tracking of users. Of course, opportunistic (non-login) cookies already enable such tracking, but if persistent storage of licenses is permitted, the spec needs to have provisions about enabling the user to wipe such data using the same UI that enables the wiping of cookies or IndexedDB.

In any case, persistent storage of licenses gives a person with access to the computing device information about what sites have been accessed. (Especially if the browser manages the bucketing of the licenses by origin because the browser wants to sandbox the CDM as the CDM runs mystery code and, hence, the origin data needs to be available to the browser and can't be encrypted for CDM consumption only.)
Comment 14 Fred Andrews 2013-02-22 13:46:21 UTC
(In reply to comment #13)
> (In reply to comment #10)
> > One thing you did not mention
> > is the ability for a user to change their mind and remove such information
> > once it has been created.
> 
> Yeah, if the user chooses some kind of "Always Allow on This Site" option,
> this piece of data should be listed and revocable in the same UI that's used
> for other site permissions like Geolocation.
> 
> > I think both of these use cases fall under
> > "private" mode I mentioned before.
> 
> I'm pretty sure you'll find that browser vendors treat the issue of
> "globally unique persistent identifier exposed to all sites" as an issue for
> all modes of operation, not just a "private" mode issue.
> 
> It's worth noting that a given CDM vendor could support two Key Systems: one
> that uses persistently individualized keys and another that randomly
> generates a per-origin session key pair and signs the public key using a
> private key common to all instances of the same CDM version. Content
> providers could then choose how they value the persistent device identity
> against the authorization UI. (This assumes, of course, that the UA is aware
> of the nature of the Key Systems and knows that one of them doesn't
> necessitate the authorization UI.)
> 
> > > ### Persistently stored data
> > > 
> > > The CDM or the user agent must not write initialization data extracted from
> > > media files or data received through the update() and createSession()
> > > methods (or parts of such data) into persistent storage. Such data could be
> > > used by Web sites to track repeat visits and could be used by people who
> > > have access to the computing device that the CDM or the user agent run on to
> > > gain clues about what sites have been visited. Storing such data
> > > persistently is not required for streaming use cases.
> > 
> > I disagree with this. I think a more useful restriction here would be that
> > information stored by the CDM related to a particular domain can only be
> > retrieved (if at all) from applications hosted on that domain.
> 
> What's your use case of persistent storage of CDM-related information? I
> thought it wasn't worthwhile to propose more complex requirements without
> knowing the use cases that the requirements were supposed to address.

Might it have been for revoking keys?

Some of the use cases do appear to require the CDM to have privileged storage not accessible or modifiable by the UA or user.

We should not be guessing at such requirements; bug 20963 needs
to be addressed so that the proposal can be reviewed.
 
...
 
> In any case, persistent storage of licenses gives a person with access to
> the computing device information about what sites have been accessed.
> (Especially if the browser manages the bucketing of the licenses by origin
> because the browser wants to sandbox the CDM as the CDM runs mystery code
> and, hence, the origin data needs to be available to the browser and can't
> be encrypted for CDM consumption only.)

This is a good point, as this requires the UA to have access to the CDM storage,
which seems to conflict with the requirement that the CDM storage be
privileged.

Some use cases do not appear to be supportable by running the CDM
in a UA-controlled sandbox.  For example, a sandbox would allow the UA
to capture the output and to access and modify the state.  I gather
that the proposal is that the OS vendor and CDM authors conspire
to create CDMs that the owner of the computer has little access to
or control over, and that the CDM authors license their use by OS
vendors and content authors under very restrictive terms to
enforce the integrity of their DRM model.  This is just a guess,
as the requirements have not been articulated (see bug 20963),
but if so, it would make this model inapplicable where the OS vendor
is the user (open source OS); see bug 20961.
Comment 15 Mark Watson 2013-02-22 16:37:22 UTC
First, I think it's clear that the UA needs to know the privacy properties of the CDM so that appropriate controls can be offered to the user (authorization, clearing of stored data etc.). This would immediately be an improvement over plugins. Again, I expect UA implementors will choose which CDMs they integrate with, rather than providing an open 'CDM plugin' API.

Regarding unique identifiers, hopefully the Privacy Interest Group can help us here. Identifiers that are not 'unique' may still have privacy/tracking implications (cf. ZIP codes). I'm not sure of the appropriate term for 'identifiers with privacy implications', so I'll just use 'identifier' in the following.

I see four separate privacy/tracking issues with identifiers:
1) The initial message (or part of it) for a dummy file may effectively form an identifier that any site* could use for tracking over time
2) The initial message (or part of it) for a dummy file may effectively form an identifier that any site* could use for tracking across sites, if those sites collaborate
3) An identifier available to the server side of the keysystem may be used for tracking over time by a single site
4) An identifier available to the server side of the keysystem may be used for tracking across sites, if those sites collaborate

* including sites which do not support the server side of any keysystem

Tracking across sites (2, 4) can be addressed if the identifier is origin-specific i.e. if netflix.com sees a different identifier to hulu.com.
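
One way to picture an origin-specific identifier is to derive it from a device secret and the origin with a keyed hash. The Python sketch below is an illustration only, not a description of how any existing CDM works: the function name `origin_scoped_identifier` is hypothetical, the random `device_secret` stands in for CDM key material, and HMAC-SHA-256 is my choice of derivation rather than anything proposed in this bug.

```python
import hashlib
import hmac
import secrets


def origin_scoped_identifier(device_secret: bytes, origin: str) -> str:
    """Derive an identifier that is stable per origin but differs across origins."""
    return hmac.new(device_secret, origin.encode("utf-8"), hashlib.sha256).hexdigest()


# Stand-in for per-device CDM key material (an assumption for illustration).
device_secret = secrets.token_bytes(32)

id_netflix = origin_scoped_identifier(device_secret, "https://netflix.com")
id_hulu = origin_scoped_identifier(device_secret, "https://hulu.com")
```

Under these assumptions the two sites cannot match their identifiers against each other without knowing the device secret, although each site still sees a stable value over time, which is issue (3).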

Tracking by arbitrary sites (1, 2) can be addressed if the initial message is not consistent. For example, if it is encrypted with a keysystem public key and contains information which changes every time a message is generated (salt, nonce, timestamp, etc.).
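
The effect of a per-message nonce can be sketched as follows. This is a toy model and NOT secure: Python's standard library has no public-key encryption, so a shared-key keystream built from SHA-256 stands in for encryption to the keysystem public key, and the function names are hypothetical. The only point is that two messages carrying the same identifier look unrelated to a site that lacks the key, while the key server can still recover the identifier.

```python
import hashlib
import secrets
import time


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (illustration only): SHA-256 over key, nonce and a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def build_initial_message(key: bytes, identifier: bytes) -> bytes:
    """Return nonce || ciphertext; the fresh nonce makes every message differ."""
    nonce = secrets.token_bytes(16)
    timestamp = int(time.time()).to_bytes(8, "big")
    plaintext = timestamp + identifier
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def read_initial_message(key: bytes, message: bytes) -> bytes:
    """Key-server side: recover the identifier (the timestamp is dropped here)."""
    nonce, ciphertext = message[:16], message[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    plaintext = bytes(c ^ s for c, s in zip(ciphertext, stream))
    return plaintext[8:]
```

In a real key system the site hosting the page would not hold the decryption key, which is what prevents sites that do not support the server side of the keysystem from using the message as an identifier.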

That leaves (3), where the considerations are very different depending on whether the UA can cause the identifier to be reset. If it can, then the situation is hardly different from cookies today. Indeed UAs may reset such identifiers whenever the user clears cookies. If it can't, then user authorization as described by Henri may be appropriate.
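
The resettable case can be modelled as a UA-managed table of per-origin identifiers that is wiped by the same action that clears cookies. A minimal Python sketch, assuming the UA (not the CDM) owns the storage; the class and method names are hypothetical:

```python
import secrets


class ResettableIdentifierStore:
    """Hypothetical UA-managed store of per-origin identifiers."""

    def __init__(self):
        self._ids = {}  # origin -> random identifier

    def identifier_for(self, origin: str) -> str:
        """Return the origin's identifier, creating a random one on first use."""
        if origin not in self._ids:
            self._ids[origin] = secrets.token_hex(16)
        return self._ids[origin]

    def clear_site_data(self, origin=None):
        """Forget one origin's identifier, or all of them when origin is None."""
        if origin is None:
            self._ids.clear()
        else:
            self._ids.pop(origin, None)
```

After `clear_site_data()` the next call to `identifier_for()` yields a fresh value, so the site sees what looks like a new device, much as it would after cookies are cleared.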

Whatever the situation, the UA implementor needs to know it and to provide information and control to the user. I don't think we should be prescriptive in the specification about what CDMs might do, either. Certain sites may not operate without certain kinds of identifier and users should be able to make informed choices about whether to use those sites, rather than the W3C attempting to proscribe them (though I do understand the difference between 'offering users information and the chance to make a choice' and 'informed choice').

I also don't think we should prescribe what UAs should do (W3C specifications don't generally mandate specific privacy dialogs etc.).

However we should describe the issues and make it clear that the UA implementor MUST have complete knowledge of the CDM privacy properties so that they can provide appropriate protections.

Regarding persistently stored information, there is one use-case in the specification: secure proof of key release. This requires the CDM to persistently store session identifiers - but not the licenses or keys - for MediaKeySessions that previously existed, until receipt of the key release information by the server is acknowledged. This is origin-specific and so only allows a given origin to re-discover session identifiers for sessions with that same origin. So, they provide the possibility to reintroduce device tracking if it's not already present - if the session identifiers have appropriate uniqueness properties.
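
The storage behaviour described for secure proof of key release can be sketched as a per-origin list of session identifiers that survives until the server acknowledges the release. The Python below is an in-memory illustration only (a real implementation would have to persist the data), and the class and method names are hypothetical rather than taken from the specification.

```python
from collections import defaultdict


class KeyReleaseLog:
    """Hypothetical per-origin record of sessions awaiting release acknowledgement."""

    def __init__(self):
        self._pending = defaultdict(set)  # origin -> session identifiers

    def session_closed(self, origin, session_id):
        """Remember that this origin's session still owes a key release proof."""
        self._pending[origin].add(session_id)

    def pending_for(self, origin):
        """Only the requesting origin's own session identifiers are returned."""
        return sorted(self._pending.get(origin, ()))

    def acknowledged(self, origin, session_id):
        """Drop the record once the server has acknowledged the release."""
        self._pending[origin].discard(session_id)
        if not self._pending[origin]:
            del self._pending[origin]
```

Because lookups are keyed by origin, one site cannot enumerate another site's sessions in this model; whether the identifiers allow tracking by the same site depends on how unique they are, as noted above.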

Again, the UA integrating with a CDM needs to know the privacy properties of the identifiers to provide appropriate choices and protections to the user.
Comment 16 Joe Steele 2013-02-22 18:00:16 UTC
> I also don't think we should prescribe what UAs should do (W3C
> specifications don't generally mandate specific privacy dialogs etc.).

This CR may offer some guidance as to how to word this -- http://www.w3.org/TR/wsc-ui/#pageinfosummary

> Regarding persistently stored information, there is one use-case in the
> specification: secure proof of key release. 

Agreed. However other use cases are implied by this bug: https://www.w3.org/Bugs/Public/show_bug.cgi?id=19208

Anonymizing user-specific information per domain in some manner (encryption, digesting, nonces, etc.) to avoid returning useful cross-site key requests works just as well for storing persistent data.
Comment 17 Joe Steele 2013-02-22 18:47:45 UTC
(In reply to comment #13)
> (In reply to comment #10)
> I'm pretty sure you'll find that browser vendors treat the issue of
> "globally unique persistent identifier exposed to all sites" as an issue for
> all modes of operation, not just a "private" mode issue.

Agreed. However, that is not required by EME. My point was about the persistence of unique identifiers, not how global they are. I am *not* arguing for the existence of a globally unique persistent identifier exposed to all sites, nor is it required for CDMs (at least not the one I am most familiar with).

> What's your use case of persistent storage of CDM-related information? I
> thought it wasn't worthwhile to propose more complex requirements without
> knowing the use cases that the requirements were supposed to address.

In cases where a license can have a longer lifetime than a single session, it is useful (and sometimes necessary) to not require the user to reacquire the license the next time they want to play. 

Here are some of the benefits:
* Allows the license provider to lower their cost (fewer network transactions required), which can result in lower costs for the user.
* Allows the user to request a license in a secure environment and then continue to play back content when they are in an insecure environment without having to reacquire the license over the insecure network. 
* Reduces the number of times the user needs to authenticate.

> In any case, persistent storage of licenses gives a person with access to
> the computing device information about what sites have been accessed.

This is dependent on how the information is secured on disk. The browser cache seems like a more likely target for snooping though, since the location you downloaded the movie from is probably much more informative. If I have local access to the computing device I can gather information on the user in any number of ways. 

Or is your point that the user can get access to the list when the DRM vendor might not want them to?
Comment 18 Mark Watson 2013-02-22 20:30:56 UTC
(In reply to comment #17)
> (In reply to comment #13)
> > (In reply to comment #10)

> > In any case, persistent storage of licenses gives a person with access to
> > the computing device information about what sites have been accessed.
> 
> This is dependent on how the information is secured on disk. The browser
> cache seems like a more likely target for snooping though, since the
> location you downloaded the movie from is probably much more informative. If
> I have local access to the computing device I can gather information on the
> user in any number of ways. 
> 
> Or is your point that the user can get access to the list when the DRM
> vendor might not want them to?

I think the point is that if the CDM has a secret persistent store, then the 'clear browsing history' function of the UA might not operate the way the user expects.

But again, I think we have to remember that the browser implementors have reputations to protect and privacy experts to help them with that. I expect they will make careful decisions as to what CDMs to integrate with based on detailed information about what those CDMs do and also about what the UA *allows* the CDM to do for the case where the CDM runs in some kind of UA sandbox.
Comment 19 Fred Andrews 2013-02-22 22:46:15 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > (In reply to comment #13)
> > > (In reply to comment #10)
> 
> > > In any case, persistent storage of licenses gives a person with access to
> > > the computing device information about what sites have been accessed.
> > 
> > This is dependent on how the information is secured on disk. The browser
> > cache seems like a more likely target for snooping though, since the
> > location you downloaded the movie from is probably much more informative. If
> > I have local access to the computing device I can gather information on the
> > user in any number of ways. 
> > 
> > Or is your point that the user can get access to the list when the DRM
> > vendor might not want them to?
> 
> I think the point is that if the CDM has a secret persistent store, then the
> 'clear browsing history' function of the UA might not operate the way the
> user expects.
> 
> But again, I think we have to remember that the browser implementors have
> reputations to protect and privacy experts to help them with that. I expect
> they will make careful decisions as to what CDMs to integrate with based on
> detailed information about what those CDMs do and also about what the UA
> *allows* the CDM to do for the case where the CDM runs in some kind of UA
> sandbox.

My understanding was that EME was a UA interface to the non-UA-CDM and
that the CDM had privileges above and beyond the UA, and thus the UA
has little opportunity to protect the user.  The relationship between
the UA and the CDM needs to be clarified.

Does EME even support the UA identifying the EME in a secure way
that the privileged CDM can not spoof?  If not then the UA has
absolutely no control.

Clearly if the UA is to be able to protect the user from the CDM
then the CDM must be subordinate to the UA, and then the UA is free
to capture the output of the CDM and EME has no value for DRM.
Comment 20 Mark Watson 2013-02-23 00:08:27 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > (In reply to comment #13)
> > > > (In reply to comment #10)
> > 
> > > > In any case, persistent storage of licenses gives a person with access to
> > > > the computing device information about what sites have been accessed.
> > > 
> > > This is dependent on how the information is secured on disk. The browser
> > > cache seems like a more likely target for snooping though, since the
> > > location you downloaded the movie from is probably much more informative. If
> > > I have local access to the computing device I can gather information on the
> > > user in any number of ways. 
> > > 
> > > Or is your point that the user can get access to the list when the DRM
> > > vendor might not want them to?
> > 
> > I think the point is that if the CDM has a secret persistent store, then the
> > 'clear browsing history' function of the UA might not operate the way the
> > user expects.
> > 
> > But again, I think we have to remember that the browser implementors have
> > reputations to protect and privacy experts to help them with that. I expect
> > they will make careful decisions as to what CDMs to integrate with based on
> > detailed information about what those CDMs do and also about what the UA
> > *allows* the CDM to do for the case where the CDM runs in some kind of UA
> > sandbox.
> 
> My understanding was that EME was a UA interface to the non-UA-CDM and
> that the CDM had privileges above and beyond the UA, and thus the UA
> has little opportunity to protect the user.  The relationship between
> the UA and the CDM needs to be clarified.

Sure. We don't say anything about that now.

But practically, some DRM vendor cannot magically integrate their CDM with a UA without support from that UA implementor. If the UA implementor opens up a public API for CDM integration (for example by defining EME extensions to NPAPI) then I agree with many of the points made, because all bets would be off about what kind of things the CDM might do and what kind of protections the UA could offer.

So I don't expect UA implementors to do that for exactly that reason. They will carefully choose what CDMs to integrate with taking all these issues into account and working with the CDM vendors.

> 
> Does EME even support the UA identifying the EME in a secure way
> that the privileged CDM can not spoof?  If not then the UA has
> absolutely not control.

I don't understand the question - do you mean 'UA identifying the CDM' ? If so, then it's nothing to do with the EME spec and everything to do with the UA implementation. The UA does need to know what CDM it is talking to.

> 
> Clearly if the UA is to be able to protect the user from the CDM
> then the CDM must be subordinate to the UA, and then the UA is free
> to capture the output of the CDM and EME has not value for DRM.

Not necessarily. There are use-cases where the output of the CDM is decrypted decoded media samples and the purpose of the CDM is primarily to protect the key and the encoded samples.

There are also scenarios where a UA integrates with a CDM that is provided as part of the Operating System or platform, through standard APIs. The CDM may then output media directly to the output device. In this case the UA implementor is relying on representations from the platform vendor about the privacy properties of the APIs they are using - but this is true for any platform APIs that the UA makes use of. For example, when clearing browsing data the UA relies on the APIs for deletion of data stored on disk to actually delete that data (in fact in this case those APIs probably don't expunge the data from the disk but simply delete the files - a UA implementor chooses whether this is adequate to protect the user or whether they need to use different APIs that will write random byte patterns over the disk sectors several thousand times etc.).
Comment 21 Fred Andrews 2013-02-23 01:17:43 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)
... 
> > My understanding was that EME was a UA interface to the non-UA-CDM and
> > that the CDM had privileges above and beyond the UA, and thus the UA
> > has little opportunity to protect the user.  The relationship between
> > the UA and the CDM needs to be clarified.
> 
> Sure. We don't say anything about that now.
> 
> But practically, some DRM vendor cannot magically integrate their CDM with a
> UA without support from that UA implementor. If the UA implementor opens up
> a public API for CDM integration (for example by defining EME extensions to
> NPAPI) then I agree with many of the points made, because all bets would be
> off about what kind of things the CDM might do and what kind of protections
> the UA could offer.
> 
> So I don't expect UA implementors to do that for exactly that reason. They
> will carefully choose what CDMs to integrate with taking all these issues
> into account and working with the CDM vendors.
>
> > Does EME even support the UA identifying the EME in a secure way
> > that the privileged CDM can not spoof?  If not then the UA has
> > absolutely not control.
> 
> I don't understand the question - do you mean 'UA identifying the CDM' ? If
> so, then it's nothing to do with the EME spec and everything to do with the
> UA implementation. The UA does need to know what CDM it is talking to.

So you are suggesting that the UA have a short-list of CDMs from which
it will select a CDM to 'load' based on the requested 'Key System string',
and that the details of how the UA identifies the CDM from this short list
is a UA implementation detail?

Or alternatively that a proprietary OS has a short-list and the user
trusts the OS to manage this, or that the system interfaces to proprietary
hardware that implements the CDM and the user trusts this?

> > Clearly if the UA is to be able to protect the user from the CDM
> > then the CDM must be subordinate to the UA, and then the UA is free
> > to capture the output of the CDM and EME has not value for DRM.
...
> There are also scenarios where a UA integrates with a CDM that is provided
> as part of the Operating System or platform, through standard APIs. The CDM
> may then output media directly to the output device. In this case the UA
> implementor is relying on representations from the platform vendor about the
> privacy properties of the APIs they are using - but this is true for any
> platform APIs that the UA makes use of. For example, when clearing browsing
> data the UA relies on the APIs for deletion of data stored on disk to
> actually delete that data (in fact in this case those APIs probably don't
> expunge the data from the disk but simply delete the files - a UA
> implementor chooses whether this is adequate to protect the user or whether
> they need to use different APIs that will write random byte patterns over
> the disk sectors several thousand times etc.).

Can we reach a consensus that a UA controlled CDM can not support DRM and
that DRM demands that the OS vendor conspire with the CDM author to limit
the control the user has over their computer?

Can we reach a consensus that DRM is not applicable to open source stacks
because it is not possible to limit the control the user has over their
computer?

Can we reach a consensus that DRM could be supported in an open source
web browser that uses EME to interface to a proprietary CDM that runs
in a context that limits the control the user has over their computer?

Can we reach consensus that DRM requires the user to trust a proprietary
operating system or CDM hardware module?

Can we reach consensus that the UA has no ability to control security
or privacy when using DRM?

If we can agree on some of these matters then it may help discussions progress,
and some non-normative text can be added to the EME specification.
Comment 22 Henri Sivonen 2013-02-25 10:54:24 UTC
(In reply to comment #14)
> Might it have been for revoking keys?

Surely the revocation of content keys will be based on communicating an expiry time together with the keys, so that the CDM will consider them revoked when they expire. The revocation of CDM keys will happen on the server side, so that the server will refuse to send content keys to a CDM whose keys appear on a CRL.
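
To make the two revocation paths concrete, here is a minimal sketch. The function names, the license dictionary and the CRL-as-a-set representation are all assumptions made for illustration; neither EME nor any real key system defines them.

```python
import time


def cdm_key_usable(expires_at, now=None):
    # CDM side: a content key counts as revoked once its expiry time has passed.
    now = time.time() if now is None else now
    return now < expires_at


def server_issue_content_key(cdm_key_id, crl, content_key, lifetime_seconds, now=None):
    # Server side: refuse any CDM whose key appears on the revocation list.
    if cdm_key_id in crl:
        return None
    now = time.time() if now is None else now
    # The expiry time travels together with the key (hypothetical license shape).
    return {"key": content_key, "expires_at": now + lifetime_seconds}
```

Neither path needs the server to push a revocation message to the client, which is the point being made above.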

> Some of the uses cases do appear to require the CDM to have privileged
> storage not accessible or modifiable by the UA or user.

What use cases would not be addressed if the storage was browser-mediated and the CDM encrypted the data that it asks the browser to store in the context of a given origin? (That is, the browser would store encrypted blobs keyed by origin and could detect that some data has been stored for a given origin and the browser could delete that data.)
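
As a rough sketch of what browser-mediated storage could look like: the class and method names below are hypothetical, and the CDM's own encryption is left out, so the blobs are just opaque bytes from the browser's point of view.

```python
from collections import defaultdict


class BrowserMediatedCdmStore:
    """Hypothetical browser-side store for opaque, CDM-encrypted blobs."""

    def __init__(self):
        # origin -> {record name -> encrypted blob}
        self._blobs = defaultdict(dict)

    def put(self, origin, name, encrypted_blob):
        # The browser never interprets the blob; it only files it by origin.
        self._blobs[origin][name] = bytes(encrypted_blob)

    def get(self, origin, name):
        # A CDM acting for one origin can only read that origin's records.
        return self._blobs.get(origin, {}).get(name)

    def origins_with_data(self):
        # Lets the browser show the user which origins have stored CDM data.
        return sorted(o for o, records in self._blobs.items() if records)

    def clear_origin(self, origin):
        self._blobs.pop(origin, None)

    def clear_all(self):
        # What a "clear browsing data" action would call.
        self._blobs.clear()
```

The browser cannot read the contents, but it can tell that data exists for an origin and it can delete that data.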

(In reply to comment #15)
> I see four separate privacy/tracking issues with identifiers:
> 1) The initial message (or part of it) for a dummy file may effectively form
> an identifier that any site* could use for tracking over time
> 2) The initial message (or part of it) for a dummy file may effectively form
> an identifier that any site* could use for tracking across sites, if those
> sites collaborate
> 3) An identifier available to the server side of the keysystem may be used
> for tracking over time by a single site
> 4) An identifier available to the server side of the keysystem may be used
> for tracking across sites, if those sites collaborate
> 
> * including sites which do not support the server side of any keysystem
> 
> Tracking across sites (2, 4) can be addressed if the identifier is
> origin-specific i.e. if netflix.com sees a different identifier to hulu.com.

Yes. EME should say this.
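
One way an origin-specific identifier could be derived, purely as an illustration: key an HMAC with a per-profile secret and use the origin as the message. The function names and the idea of a resettable "device secret" are assumptions here, not something EME or any CDM specifies.

```python
import hashlib
import hmac
import secrets


def new_device_secret():
    # Hypothetical per-profile secret; regenerating it resets every identifier.
    return secrets.token_bytes(32)


def origin_identifier(device_secret, origin):
    # HMAC-SHA256 keyed by the secret, with the origin as the message.
    return hmac.new(device_secret, origin.encode("utf-8"), hashlib.sha256).hexdigest()
```

With this shape, two origins see unrelated identifiers, so they cannot correlate them without knowing the secret, and resetting the secret gives cookie-like clearing.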

> Tracking by arbitrary sites (1, 2) can be addressed if the initial message
> is not consistent. For example if it is encrypted with a keysystem public
> key, and contains information which changes every time a message is
> generated (salt, nonce, timestamp) etc.

Yes, the case where the tracking server doesn't implement enough of the key system to locate a public key advertised by the CDM or to try verifying a signature generated with a private key is easy to address by salting.

> That leaves (3), where the considerations are very different depending on
> whether the UA can cause the identifier to be reset. If it can, then the
> situation is hardly different from cookies today.

Right.

> I also don't think we should prescribe what UAs should do (W3C
> specifications don't generally mandate specific privacy dialogs etc.).

The most relevant recent precedent is:
www.w3.org/TR/geolocation-API/#privacy_for_uas

Furthermore, the experience that informed the design of the geolocation API says that the points in the flow where user authorization may need to be checked have to be points at which the API flow is asynchronous. It seems that the relevant points in the EME flow are already asynchronous, but it would still be good to state explicitly which points are the ones where user authorization may need to be checked.

> Regarding persistently stored information, there is one use-case in the
> specification: secure proof of key release.

I think key release still needs more clarification in EME, but that's another bug.

> This requires the CDM to
> persistently store session identifiers - but not the licenses or keys - for
> MediaKeySessions that previously existed, until receipt of the key release
> information by the server is acknowledged.

Isn't it enough for the CDM to generate a signed message at the time of destroying the keys and for the browser to be responsible for storing this message and re-transmitting it until it has been acknowledged? Even if the browser manages the storage, making a server that intentionally defers the acknowledgment of key release messages would allow for cookie-like tracking functionality.
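
A sketch of the browser-managed alternative, with hypothetical names throughout. The `send` callback stands in for whatever transport the application uses and is assumed to return True once the server acknowledges receipt.

```python
class KeyReleaseOutbox:
    """Hypothetical browser-side queue of signed key release messages."""

    def __init__(self):
        # (origin, session_id) -> opaque signed message produced by the CDM
        self._pending = {}

    def enqueue(self, origin, session_id, signed_message):
        self._pending[(origin, session_id)] = signed_message

    def retransmit(self, send):
        # Re-send every pending message; drop the ones the server acknowledges.
        for key, message in list(self._pending.items()):
            origin, session_id = key
            if send(origin, session_id, message):
                del self._pending[key]

    def pending_for(self, origin):
        return [sid for (o, sid) in self._pending if o == origin]

    def clear_origin(self, origin):
        # Dropping unacknowledged records limits cookie-like tracking by a
        # server that deliberately withholds acknowledgments.
        for key in [k for k in self._pending if k[0] == origin]:
            del self._pending[key]
```

Because the queue lives in the browser, the usual data-clearing controls can reach it, which would not be the case for a store private to the CDM.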

(In reply to comment #17)
> (In reply to comment #13)
> > (In reply to comment #10)
> > I'm pretty sure you'll find that browser vendors treat the issue of
> > "globally unique persistent identifier exposed to all sites" as an issue for
> > all modes of operation, not just "private" mode issue.
> 
> Agreed. However that is not required by EME.

I think EME should address privacy concerns that would arise from CDM designs that can be realistically expected considering existing DRM systems even if the design decisions that result in privacy concerns are not required by EME.

> My point was about the
> persistence of unique identifiers, not how global they are. I am *not*
> arguing for the existence of a globally unique persistent identifier exposed
> to all sites, nor is it required for CDMs (at least not the one I am most
> familiar with)

Can you, please, elaborate on that? The Adobe Access 4 Overview document linked to from comment 5 says:
"The Flash Player or Adobe AIR runtime client acquires a unique digital certificate (called a machine certificate)
from an Adobe-hosted server.

This process of assigning a unique certificate is called individualization. Individualization uniquely identifies both
the computer and the Flash Player or Adobe AIR runtime used to playback content.

The individualization process allows the downloaded licenses to be bound to a specific computer on which the
client is installed. Every computer is given a unique machine credential (machine private key and machine
certificate). If a specific client were to become compromised, it can be revoked and barred from acquiring licenses
for new content."

> > What's your use case of persistent storage of CDM-related information? I
> > thought it wasn't worthwhile to propose more complex requirements without
> > knowing the use cases that the requirements were supposed to address.
> 
> In cases where a license can have a longer lifetime than a single session,
> it is useful (and sometimes necessary) to not require the user to reacquire
> the license the next time they want to play. 
> 
> Here are some of the benefits:
> * Allows the license provider to lower their cost (less network transactions
> required) which can result in lower costs for the user.

This seems like a wrong optimization. The network transactions for re-contacting the license server are tiny compared to the network transactions involved in the transfer of the media itself and even in the transfer of the HTML, CSS and JavaScript around the media.
 
> * Allows the user to request a license in a secure environment and then
> continue to play back content when they are in an insecure environment
> without having to reacquire the license over the insecure network. 

We have https for secure transactions over an insecure network.

> * Reduces the number of times the user needs to authenticate.

Can you elaborate on this? In a Netflix-like case, you need to log in to resume an interrupted movie anyway. On a site similar to thedailyshow.com, there is no user-facing authentication in the first place.

> > In any case, persistent storage of licenses gives a person with access to
> > the computing device information about what sites have been accessed.
> 
> This is dependent on how the information is secured on disk. The browser
> cache seems like a more likely target for snooping though, since the
> location you downloaded the movie from is probably much more informative. If
> I have local access to the computing device I can gather information on the
> user in any number of ways. 
> 
> Or is your point that the user can get access to the list when the DRM
> vendor might not want them to?

My point is that if the CDM manages its own storage, there can be snoopable data left there after a browser function to wipe browser-managed storage, such as the HTTP cache, has been used. To remedy this, persistent storage for the CDM, if needed at all, should be browser-mediated.

(In reply to comment #19)
> My understanding was that EME was a UA interface to the non-UA-CDM and
> that the CDM had privileges above and beyond the UA, and thus the UA
> has little opportunity to protect the user. 

If there is authorization by the user before the CDM gets to send a message to the site, or the browser vendor has had the opportunity to ensure that the messages generated by the CDM are not privacy sensitive, there are a couple of plausible designs that would give the browser the opportunity to protect the user:

Software-only case (plausible for SD; decoded frame data exposed to the browser):
The browser sandboxes the CDM into a separate process that can only perform memory allocation, computation or talk with the browser process. Encrypted media, EME messages and seek times go into the CDM. EME messages, pixels and audio samples come out of the CDM.

Hardware CDM case (plausible for HD; decoded frame data are not exposed to the browser):
The decryption and decompression function is performed by a discrete hardware component. The browser talks to this hardware component through an open-source driver. The browser vendor examines the hardware to be convinced that the hardware component cannot do IO except writing pixels to the GPU or talking through the interface that the driver exposes. Encrypted media, EME messages and seek times go through the driver into the hardware component. EME messages, references to frame data in GPU memory and audio samples come from the hardware component through the driver. The hardware component outputs pixels onto surfaces in the GPU memory that are marked as readback-disabled. The GPU hardware ensures that the surfaces marked as readback-disabled cannot be read back into software even though the software can designate where the GPU uses the surfaces.

Of course, it's easy to come up with designs that don't make it possible for the browser to protect the user in cases where the user doesn't trust the CDM (e.g. having CDMs run the way NPAPI plug-ins run).

> The relationship between
> the UA and the CDM needs to be clarified.

I agree.

> Does EME even support the UA identifying the EME in a secure way
> that the privileged CDM can not spoof?  If not then the UA has
> absolutely not control.

EME does not specify the API between the UA and the CDM. It would be possible to specify that API in such a way that the UA could authenticate the CDM with the same level of confidence with which content providers can authenticate the CDM. As the first communication between the UA and the CDM, the UA could randomly generate a nonce, hand it to the CDM and ask the CDM to encrypt it using the CDM's private key and hand the result back. The UA could then decrypt the result using the CDM's public key and compare it with the original nonce.
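
The nonce exchange can be sketched with textbook RSA. This is a toy under loud assumptions: the key is built from two small, publicly known Mersenne primes, there is no padding, and the names `cdm_respond` and `ua_authenticate` are invented for the example. A real design would use a vetted signature scheme with proper key sizes.

```python
import math
import secrets

# Toy key pair built from two known Mersenne primes. Illustration only:
# far too small, and unpadded RSA is not secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                      # public modulus
E = 65537                      # public exponent
PHI = (P - 1) * (Q - 1)
assert math.gcd(E, PHI) == 1   # E must be invertible modulo PHI
D = pow(E, -1, PHI)            # private exponent, held only by the CDM (Python 3.8+)


def cdm_respond(nonce):
    # CDM side: transform the nonce with the private key.
    return pow(nonce, D, N)


def ua_authenticate(respond):
    # UA side: fresh random challenge, then check the response with the public key.
    nonce = secrets.randbelow(N - 2) + 2
    return pow(respond(nonce), E, N) == nonce
```

Calling `ua_authenticate(cdm_respond)` succeeds, while a responder that lacks the private exponent (for example one that echoes the nonce back) fails except with negligible probability. Because the nonce is fresh each time, a recorded response cannot be replayed.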
Comment 23 Mark Watson 2013-02-25 17:58:33 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #19)
> > > (In reply to comment #18)
> ... 
> >
> > > Does EME even support the UA identifying the EME in a secure way
> > > that the privileged CDM can not spoof?  If not then the UA has
> > > absolutely not control.
> > 
> > I don't understand the question - do you mean 'UA identifying the CDM' ? If
> > so, then it's nothing to do with the EME spec and everything to do with the
> > UA implementation. The UA does need to know what CDM it is talking to.
> 
> So you are suggesting that the UA have a short-list of CDMs from which
> it will select a CDM to 'load' based on the requested 'Key System string',
> and that the details of how the UA identifies the CDM from this short list
> is a UA implementation detail?

That's one possibility, yes. See Henri's response on this as well.

> 
> Or alternatively that a proprietary OS has a short-list and the user
> trusts the OS to manage this, or the system interfaces to proprietary
> hardware that implements the CDM and the user trusts this.

No, I think more likely for the case of CDMs embedded in the OS or hardware there will be an API for each specific CDM that the UA will code to, if they decide to support that CDM on that platform.

> 
> > > Clearly if the UA is to be able to protect the user from the CDM
> > > then the CDM must be subordinate to the UA, and then the UA is free
> > > to capture the output of the CDM and EME has not value for DRM.
> ...
> > There are also scenarios where a UA integrates with a CDM that is provided
> > as part of the Operating System or platform, through standard APIs. The CDM
> > may then output media directly to the output device. In this case the UA
> > implementor is relying on representations from the platform vendor about the
> > privacy properties of the APIs they are using - but this is true for any
> > platform APIs that the UA makes use of. For example, when clearing browsing
> > data the UA relies on the APIs for deletion of data stored on disk to
> > actually delete that data (in fact in this case those APIs probably don't
> > expunge the data from the disk but simply delete the files - a UA
> > implementor chooses whether this is adequate to protect the user or whether
> > they need to use different APIs that will write random byte patterns over
> > the disk sectors several thousand times etc.).
> 
> Can we reach a consensus that a UA controlled CDM can not support DRM and
> that DRM demands that the OS vendor conspire with the CDM author to limit
> the control the user has over their computer?
> 
> Can we reach a consensus that DRM is not applicable to open source stacks
> because it is not possible to limit the control the user has over their
> computer?

No, I don't believe either of the above are obvious. We may have different definitions of DRM.

> 
> Can we reach a consensus that DRM could be supported in an open source
> web browser that uses EME to interface to a proprietary CDM that runs
> in a context that limits the control the user has over their computer?

We obviously can't limit what a user can do with their machine. DRM relies on providing software or hardware components which are hard to modify whilst retaining their intended functionality. That's a different thing.

Given that difference, then yes DRM could be supported in an open source web browser that integrated with components of that kind.

> 
> Can we reach consensus that DRM requires the user to trust a proprietary
> operating system or CDM hardware module?

No, there are also software solutions which are not part of the operating system.

> 
> Can we reach consensus that the UA has no ability to control security
> or privacy when using DRM?

No, this depends on the interface between the UA and the CDM and the security and privacy properties of the CDM.

> 
> If we can agree on some of these matters then it may help discussions
> progress,
> and can some non-normative text be added to the EME specification.
Comment 24 Joe Steele 2013-02-25 18:10:25 UTC
(In reply to comment #22)
> (In reply to comment #14)
> > My point was about the
> > persistence of unique identifiers, not how global they are. I am *not*
> > arguing for the existence of a globally unique persistent identifier exposed
> > to all sites, nor is it required for CDMs (at least not the one I am most
> > familiar with)
> 
> Can you, please, elaborate on that? The Adobe Access 4 Overview document
> links to from comment 5 says:
> "The Flash Player or Adobe AIR runtime client acquires a unique digital
> certificate (called a machine certificate)
> from an Adobe-hosted server.
> 
> This process of assigning a unique certificate is called individualization.
> Individualization uniquely identifies both
> the computer and the Flash Player or Adobe AIR runtime used to playback
> content.
> 
> The individualization process allows the downloaded licenses to be bound to
> a specific computer on which the
> client is installed. Every computer is given a unique machine credential
> (machine private key and machine
> certificate). If a specific client were to become compromised, it can be
> revoked and barred from acquiring licenses
> for new content."

The Access machine certificate that is acquired is unique per application, not per computer. That documentation is incomplete. Thanks for pointing it out. 

> > Here are some of the benefits:
> > * Allows the license provider to lower their cost (less network transactions
> > required) which can result in lower costs for the user.
> 
> This seems like a wrong optimization. The network transactions for
> re-contacting the license server are tiny compared to the network
> transactions involved in the transfer of the media itself and even in the
> transfer of the HTML, CSS and JavaScript around the media.

The network operations themselves are usually trivial (though non-zero). The bigger cost in this case can be the encryption/decryption operations which may take place on the licensing server. These can involve contacting other servers, HSMs, etc. It can be a significant cost factor. This is also a factor in not using HTTPS.

> > * Allows the user to request a license in a secure environment and then
> > continue to play back content when they are in an insecure environment
> > without having to reacquire the license over the insecure network. 
> 
> We have https for secure transactions over an insecure network.

That is correct, however in this case I am more concerned about a license server that is not accessible from the current location. For example - I acquire a license at work on a non-external server and then use that license when I am outside work. 

Also HTTPS is not appropriate for every situation requiring security. HTTPS is too heavyweight IMO if what you need is a REST-like API. This is not a concern for every protocol, but I don't see a need to mandate HTTPS when a lighter-weight channel will do. HTTPS is subject to MITM attacks when the application cannot verify the server's certificate directly. Currently EME does not provide for this and relies on the UA and the application to perform network operations.

> > * Reduces the number of times the user needs to authenticate.
> 
> Can you elaborate on this? In a Netflix-like case, you need to login to
> resume an interrupted movie anyway. On a site similar to thedailyshow.com,
> there is no user-facing authentication in the first place.

It is common in the VOD model for the user to acquire a license that is valid for longer than a single session. This license can be acquired once and then the video can be played back multiple times without having to re-authenticate, since authentication may only be required in the license acquisition phase.

> > > In any case, persistent storage of licenses gives a person with access to
> > > the computing device information about what sites have been accessed.
> > 
> > This is dependent on how the information is secured on disk. The browser
> > cache seems like a more likely target for snooping though, since the
> > location you downloaded the movie from is probably much more informative. If
> > I have local access to the computing device I can gather information on the
> > user in any number of ways. 
> > 
> > Or is your point that the user can get access to the list when the DRM
> > vendor might not want them to?
> 
> My point is that if the CDM manages its own storage, there can be snoopable
> data left there after of browser function to wipe browser-managed storage,
> such as the HTTP cache, has been used. To remedy this, persistent storage
> for the CDM, if needed at all, should be browser-mediated.

That is a valid approach.
Comment 25 Fred Andrews 2013-02-25 23:01:43 UTC
(In reply to comment #23)
> (In reply to comment #21)
> > 
> > Can we reach a consensus that a UA controlled CDM can not support DRM and
> > that DRM demands that the OS vendor conspire with the CDM author to limit
> > the control the user has over their computer?
> > 
> > Can we reach a consensus that DRM is not applicable to open source stacks
> > because it is not possible to limit the control the user has over their
> > computer?
> 
> No, I don't believe either of the above are obvious. We may have different
> definitions of DRM.

Could you please provide your definition of DRM?

It may be possible to avoid defining 'DRM' for the purpose of discussions
but we would need some other agreed definitions.

For example: distinguish between platforms for which the user is able to
implement their own web browser or OS that can store the decrypted output,
versus systems for which they can not. Perhaps call these 'open' and
'proprietary'?

I believe the ball is in your court. If you do not agree with the proposed
definitions then please make a proposal?
 
> > Can we reach a consensus that DRM could be supported in an open source
> > web browser that uses EME to interface to a proprietary CDM that runs
> > in a context that limits the control the user has over their computer?
> 
> We obviously can't limit what a user can do with their machine. DRM relies
> on providing software or hardware components which are hard to modify whilst
> retaining their intended functionality. That's a different thing.
> 
> Given that difference, then yes DRM could be supported in an open source web
> browser that integrated with components of that kind.

Please define 'integrated with'?

I object to any definition of DRM that constrains a user implementation
of a web browser or a user implementation of an operating system.

> > Can we reach consensus that DRM requires the user to trust a proprietary
> > operating system or CDM hardware module?
> 
> No, there are also software solutions which are not part of the operating
> system.

Yes, it could be in proprietary hardware.  If we expand this can we reach
consensus?
 
> > 
> > Can we reach consensus that the UA has no ability to control security
> > or privacy when using DRM?
> 
> No, this depends on the interface between the UA and the CDM and the
> security and privacy properties of the CDM.

The term 'control security' may have been poorly chosen, and from
above it seems we first need a definition of 'DRM' or similar
to even discuss this further.

Can we reach consensus that the UA has no ability to enforce security
or privacy when using DRM?

This would exclude a cooperative API between the UA and CDM for
controlling security and privacy.
Comment 26 Mark Watson 2013-02-26 00:56:34 UTC
(In reply to comment #25)
> (In reply to comment #23)
> > (In reply to comment #21)
> > > 
> > > Can we reach a consensus that a UA controlled CDM can not support DRM and
> > > that DRM demands that the OS vendor conspire with the CDM author to limit
> > > the control the user has over their computer?
> > > 
> > > Can we reach a consensus that DRM is not applicable to open source stacks
> > > because it is not possible to limit the control the user has over their
> > > computer?
> > 
> > No, I don't believe either of the above are obvious. We may have different
> > definitions of DRM.
> 
> Could you please provide your definition of DRM?
> 
> It may be possible to avoid defining 'DRM' for the purpose of discussions
> but we would need some other agreed definitions.
> 
> For example: distinguish between platforms for which the user is able to
> implement their own web browser or OS that can store the decrypted output,
> versus systems for which they can not. Perhaps call these 'open' and
> 'proprietary'?
> 
> I believe the ball is in your court. If you do not agree with the proposed
> definitions then please make a proposal?

What proposed definitions?

I don't think we need a definition of DRM to progress this specification. The specification doesn't propose a specific DRM system and this is deliberate. Different people have different requirements and requirements change over time.

I'm completely happy with the idea of non-normative text which describes the expected functionality of CDMs and the privacy and security issues associated with different CDM capabilities and behaviours. This material would inform UAs making decisions on whether and how to integrate with CDMs.

>  
> > > Can we reach a consensus that DRM could be supported in an open source
> > > web browser that uses EME to interface to a proprietary CDM that runs
> > > in a context that limits the control the user has over their computer?
> > 
> > We obviously can't limit what a user can do with their machine. DRM relies
> > on providing software or hardware components which are hard to modify whilst
> > retaining their intended functionality. That's a different thing.
> > 
> > Given that difference, then yes DRM could be supported in an open source web
> > browser that integrated with components of that kind.
> 
> Please define 'integrated with'?

http://en.wikipedia.org/wiki/Software_integration

> 
> I object to any definition of DRM that constrains a user implementation
> of a web browser or a user implementation of an operating system.
> 
> > > Can we reach consensus that DRM requires the user to trust a proprietary
> > > operating system or CDM hardware module?
> > 
> > No, there are also software solutions which are not part of the operating
> > system.
> 
> Yes, it could be in proprietary hardware.  If we expand this can we reach
> consensus?

Let me phrase it another way and see if you agree. It's certainly the intention of the EME proposal that there will be CDMs that are difficult for users to modify. There could be obfuscated software and/or there could be hardware mechanisms that make modification of certain software or firmware difficult.

And yes, these techniques probably make it difficult for the user to determine exactly what that software does. So there would be a certain amount of trust required, where the user has to trust that the software in question does only what is claimed by its implementor. (All assuming the user wants to access the service at all).

The EME architecture places the onus on UA implementors to exercise due diligence in the manner in which they integrate with CDMs. This offers an opportunity for solutions where the UA implementor vouches for the safety of the CDM based on their relationship with the CDM implementor. This is an improvement on the situation today where the user has to trust some arbitrary plugin with no guarantees from the UA.

>  
> > > 
> > > Can we reach consensus that the UA has no ability to control security
> > > or privacy when using DRM?
> > 
> > No, this depends on the interface between the UA and the CDM and the
> > security and privacy properties of the CDM.
> 
> The term 'control security' may have been poorly chosen, and from
> above it seems we first need a definition of 'DRM' or similar
> to even discuss this further.
> 
> Can we reach consensus that the UA has no ability to enforce security
> or privacy when using DRM?

No, I don't see how changing 'control' to 'enforce' changes the situation.

> 
> This would exclude a cooperative API between the UA and CDM for
> controlling security and privacy.

Control of security and privacy are important factors in the design of the API between UA and CDM and are things that UA implementors should explicitly *include* in their design, IMO.
Comment 27 Fred Andrews 2013-02-26 02:56:27 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > (In reply to comment #23)
> > > (In reply to comment #21)
> > > > 
> > > > Can we reach a consensus that a UA controlled CDM can not support DRM and
> > > > that DRM demands that the OS vendor conspire with the CDM author to limit
> > > > the control the user has over their computer?
> > > > 
> > > > Can we reach a consensus that DRM is not applicable to open source stacks
> > > > because it is not possible to limit the control the user has over their
> > > > computer?
> > > 
> > > No, I don't believe either of the above are obvious. We may have different
> > > definitions of DRM.
> > 
> > Could you please provide your definition of DRM?
> > 
> > It may be possible to avoid defining 'DRM' for the purpose of discussions
> > but we would need some other agreed definitions.
> > 
> > For example: distinguish between platforms for which the user is able to
> > implement their own web browser or OS that can store the decrypted output,
> > versus systems for which they can not. Perhaps call these 'open' and
> > 'proprietary'?
> > 
> > I believe the ball is in your court. If you do not agree with the proposed
> > definitions then please make a proposal?
> 
> What proposed definitions?

"For example: distinguish between platforms for which the user is able to
implement their own web browser or OS that can store the decrypted output,
versus systems for which they can not. Perhaps call these 'open' and
'proprietary'?"

...
> > 
> > > > Can we reach consensus that DRM requires the user to trust a proprietary
> > > > operating system or CDM hardware module?
> > > 
> > > No, there are also software solutions which are not part of the operating
> > > system.
> > 
> > Yes, it could be in proprietary hardware.  If we expand this can we reach
> > consensus?
> 
> Let me phrase it another way and see if you agree. It's certainly the
> intention of the EME proposal that there will be CDMs that are difficult for
> users to modify. There could be obfuscated software and/or there could be
> hardware mechanisms that make modification of certain software or firmware
> difficult.

I agree in part.  But a CDM running within a context controlled by a
user-controlled UA or a user-controlled OS can offer no real protection
against the decrypted output being recorded, hence the 'DRM' qualification
above.  I suggest that DRM can only be implemented in proprietary modules
running on proprietary stacks, hence the 'proprietary' qualification
above.  We need a definition of 'DRM' or some other term that allows
the distinct cases to be discussed.  Could you please take a look at
bug 21104 and see if a consensus can be reached?

 
> And yes, these techniques probably make it difficult for the user to
> determine exactly what that software does. So there would be a certain
> amount of trust required, where the user has to trust that the software in
> question does only what is claimed by its implementor. (All assuming the
> user wants to access the service at all).

This is part of the issue.

The other is the ability to enforce any security and privacy, and this
depends on the context in which the CDM runs, so can we please try to
define that first?

> The EME architecture places the onus on UA implementors to exercise due
> diligence in the manner in which they integrate with CDMs. This offers an
> opportunity for solutions where the UA implementor vouches for the safety of
> the CDM based on their relationship with the CDM implementor. This is an
> improvement on the situation today where the user has to trust some
> arbitrary plugin with no guarantees from the UA.

I don't think this model is practical for the open web standards because
the UA implementer may well be the user or an agent for the user and
their security and privacy can not depend on a relationship with the
CDM authors.

It may well be the only option for proprietary CDMs running on proprietary
stacks, and in this case the OS vendor and CDM author would be expected
to have a close relationship and probably restrictive licensing terms.

Thus I would like to see a separation between these distinct cases.

Henri has suggested focusing on only the DRM use cases, and this would
simplify the discussions.
Comment 28 Henri Sivonen 2013-02-26 08:31:06 UTC
(In reply to comment #24)
> The network operations themselves are usually trivial (though non-zero). The
> bigger cost in this case can be the encryption/decryption operations which
> may take place on licensing server. These can involve contacting other
> servers, HSMs, etc. It can be a significant cost factor. This is also a
> factor in not using HTTPS. 

The design of ISO Common Encryption and EME assumes that you contact the license server at least once per video title. Unless most users stop, exit the browser and later resume many times per title, trying to optimize away the license server contact when resuming the previously started title still seems like a totally wrong optimization, since one would assume that as far as the burden on the license server goes, the contacts arising from initially starting a title dominate compared to the contacts arising from resuming a title.

> For example - I
> acquire a license at work on a non-external server and then use that license
> when I am outside work. 

I am surprised that such a use case is considered to be in scope. I think we need an explicit enumeration of use cases and discussion about which of those are considered to be in scope, considering that fringe use cases such as intranet DRM where you can start playing only on the intranet but can resume playing outside the intranet bring complexity and privacy considerations that aren't necessitated by the Netflix-like scenario.

(I wouldn't be at all surprised if browser vendors that deemed the Netflix scenario worth supporting were unwilling to accept additional complexity or adverse privacy characteristics in order to support resuming intranet content outside the intranet.)

What mechanism do you assume for making sure that the media itself continues to be available when you take your device outside your intranet?

> Also HTTPS is not appropriate for every situation requiring security. HTTPS
> is too heavyweight IMO if what you need is a REST like API.

There are plenty of sites of all sizes that are able to deploy https. Moreover, browsers like Firefox and Chrome are trying to push the Web into an https-everywhere future. I think it would be inappropriate for EME to try to work against that direction or to postulate that the deployers of EME services would somehow be less able to deploy https for the license transactions than all the other sites that currently deploy https. In other words, I think we should be able to assume that sites that think a network MITM would jeopardize their license transactions are able to put the XHR traffic implied by EME over https.

Still, the notion that an insecure network would make the license transactions insecure seems bogus, because one of the underlying assumptions of EME is that the browser isn't trusted with the content keys (hence the need for the separate-trust-level CDM box). It seems implausible that a key system that has been properly designed not to be vulnerable against an instrumented browser dumping the content keys would be vulnerable against a network MITM dumping the content keys.
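
As a rough illustration of the https assumption above (it does not address the key-system argument), a deployment could simply refuse to send a license request to any URL that is not https. This is a minimal sketch in Python, not anything defined by EME; the function name and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def is_secure_license_url(url: str) -> bool:
    """Return True only if a license request to `url` would travel over https.

    Hypothetical helper for illustration; not part of EME or any key system.
    """
    parts = urlsplit(url)
    # Require the https scheme and a non-empty host; reject everything else,
    # including relative URLs and plain http.
    return parts.scheme == "https" and bool(parts.hostname)


if __name__ == "__main__":
    for candidate in ("https://license.example/acquire",
                      "http://license.example/acquire"):
        print(candidate, is_secure_license_url(candidate))
```

A check like this only covers transport; whether the key system itself is robust against an instrumented browser is a separate question.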

> > > * Reduces the number of times the user needs to authenticate.
> > 
> > Can you elaborate on this? In a Netflix-like case, you need to login to
> > resume an interrupted movie anyway. On a site similar to thedailyshow.com,
> > there is no user-facing authentication in the first place.
> 
> It is common in the VOD model for the user to acquires a license that is
> valid for longer than a single session. This license can be acquired once
> and then the video can be played back multiple times without having to
> re-authenticate, since authentication may only be required in the license
> acquisition phase. 

Why do you assume that an in-browser VOD service design would work like that instead of working like Netflix works? What mechanism do you assume for making sure that the video content stays cached in the browser? It seems to me that the current caching mechanisms in browsers are inappropriate for ensuring that several gigabytes worth of data stay unevicted and you haven't explained why the mechanisms currently being proposed wouldn't also be able to take care of caching the licenses so that the CDM itself wouldn't need to be able to persistently store them.
Comment 29 Adrian Bateman [MSFT] 2013-02-26 16:09:40 UTC
Added a note to the SOTD calling this out as an open issue.
https://dvcs.w3.org/hg/html-media/rev/1920147a47af
Comment 30 Mark Watson 2013-02-26 16:30:36 UTC
(In reply to comment #27)
> (In reply to comment #26)
> > (In reply to comment #25)
> > > (In reply to comment #23)
> > > > (In reply to comment #21)
> > > > > 
> > > > > Can we reach a consensus that a UA controlled CDM can not support DRM and
> > > > > that DRM demands that the OS vendor conspire with the CDM author to limit
> > > > > the control the user has over their computer?
> > > > > 
> > > > > Can we reach a consensus that DRM is not applicable to open source stacks
> > > > > because it is not possible to limit the control the user has over their
> > > > > computer?
> > > > 
> > > > No, I don't believe either of the above are obvious. We may have different
> > > > definitions of DRM.
> > > 
> > > Could you please provide your definition of DRM?
> > > 
> > > It may be possible to avoid defining 'DRM' for the purpose of discussions
> > > but we would need some other agreed definitions.
> > > 
> > > For example: distinguish between platforms for which the user is able to
> > > implement their own web browser or OS that can store the decrypted output,
> > > versus systems for which they can not. Perhaps call these 'open' and
> > > 'proprietary'?
> > > 
> > > I believe the ball is in your court. If you do not agree with the proposed
> > > definitions then please make a proposal?
> > 
> > What proposed definitions?
> 
> "For example: distinguish between platforms for which the user is able to
> implement their own web browser or OS that can store the decrypted output,
> versus systems for which they can not. Perhaps call these 'open' and
> 'proprietary'?"

I don't think that distinction is useful here, at least not the way you think it is (see below).

> 
> ...
> > > 
> > > > > Can we reach consensus that DRM requires the user to trust a proprietary
> > > > > operating system or CDM hardware module?
> > > > 
> > > > No, there are also software solutions which are not part of the operating
> > > > system.
> > > 
> > > Yes, it could be in proprietary hardware.  If we expand this can we reach
> > > consensus?
> > 
> > Let me phrase it another way and see if you agree. It's certainly the
> > intention of the EME proposal that there will be CDMs that are difficult for
> > users to modify. There could be obfuscated software and/or there could be
> > hardware mechanisms that make modification of certain software or firmware
> > difficult.
> 
> I agree in part.  But a CDM running within a context controlled by a
> user-controlled UA or a user-controlled OS can offer no real protection
> against the decrypted output being recorded, hence the 'DRM' qualification
> above.  I suggest that DRM can only be implemented in proprietary modules
> running on proprietary stacks, hence the 'proprietary' qualification
> above.  We need a definition of 'DRM' or some other term that allows
> the distinct cases to be discussed.  Could you please take a look at
> bug 21104 and see if a consensus can be reached?

The things that people consider 'DRM' - or more specifically the things that are considered useful for content protection - are more subtle than you imply.

I get the impression that you consider any solution where the UA has access to decrypted decoded media as equivalent to a solution with a 'Save As...' option that saves a compressed version of the media file. But these things are not the same. 'Possible to save the content' is not the same as 'Probable that many people will save the content'. If the standard build of the browser does not contain the 'Save As...' functionality (including the re-encoding that this entails) then most users will not in practice create or obtain such a build.

I'm not commenting here on the type of content for which such solutions would be appropriate, except to say that since content requiring no protection exists it's reasonable that there also exists content at various points on the 'level of protection' scale in between none and full hardware protection.

My point above is just that it's not correct to assume that because the UA has access to the decrypted media in some form that this is equivalent - in terms of use-cases - to having no protection.

Whether you call this 'DRM' or not, I don't really care - some people do - but it's in scope of EME.

> 
>  
> > And yes, these techniques probably make it difficult for the user to
> > determine exactly what that software does. So there would be a certain
> > amount of trust required, where the user has to trust that the software in
> > question does only what is claimed by its implementor. (All assuming the
> > user wants to access the service at all).
> 
> This is part of the issue.
> 
> The other is the ability to enforce any security and privacy, and this
> depends on the context in which the CDM runs, so can we please try to
> define that first?

I don't see why you expect this to be defined in the specification and not just by UA implementors. I don't think this is useful to describe in the specification.

> 
> > The EME architecture places the onus on UA implementors to exercise due
> > diligence in the manner in which they integrate with CDMs. This offers an
> > opportunity for solutions where the UA implementor vouches for the safety of
> > the CDM based on their relationship with the CDM implementor. This is an
> > improvement on the situation today where the user has to trust some
> > arbitrary plugin with no guarantees from the UA.
> 
> I don't think this model is practical for the open web standards because
> the UA implementer may well be the user or an agent for the user and
> their security and privacy can not depend on a relationship with the
> CDM authors.

Why not? You have to obtain the CDM somehow and so whatever means you use to obtain it implies some kind of relationship - even if that is only receiving representations from the CDM author in the documentation. If those aren't sufficient, don't use that CDM.

> 
> It may well be the only option for proprietary CDMs running on proprietary
> stacks, and in this case the OS vendor and CDM author would be expected
> to have a close relationship and probably restrictive licensing terms.
> 
> Thus I would like to see a separation between these distinct cases.
> 
> Henri has suggested focusing on only the DRM use cases, and this would
> simplify the discussions.

I don't really understand why such a constraint would be useful. Also, you have to define DRM, which is obviously tricky.
Comment 31 Andreas Kuckartz 2013-06-07 19:56:02 UTC
[Copied from a mail I sent to the W3C Restricted Media CG mailing list using the subject "PRISM and EME":]

I would like to add another reason why the W3C should not endorse EME.

As we all know EME depends on "Content Decryption Modules". These are
binary executables. The source code of those executables in practice
will not be made available to users. They can not verify what the
executables are doing.

It is now known that the U.S. government is involved in large-scale
surveillance directed against the world population (PRISM). It is also
widely assumed that this surveillance is supported by two of the three
companies which are proposing EME (Google and Microsoft). Those
companies have issued "denials", but the formulations used in these
denials are very suspicious.

It is also known that the same government has distributed malware (such
as Stuxnet) to foreign users.

This all taken together implies a significant danger that the CDM
binaries will not only enable "silent monitoring" (Google Widevine) on
behalf of media companies but that surveillance malware will be added on
behalf of the U.S. government. The persons involved likely would be
gagged by a gag order.

It is unacceptable for an Open Standards body to take part in this by
endorsing EME.
Comment 32 Glenn Adams 2013-07-30 16:23:24 UTC
I propose this bug be resolved by adding a new informative section or appendix (I prefer an appendix), entitled "Privacy Concerns", and that the material provided by Henri in comment#9 be used to populate it mutatis mutandis.
Comment 33 Glenn Adams 2013-07-30 18:02:27 UTC
(In reply to comment #32)
> I propose this bug be resolved by adding a new informative section or
> appendix (I prefer an appendix), entitled "Privacy Concerns", and that the
> material provided by Henri in comment#9 be used to populate it mutatis
> mutandis.

An example of such a section (on Privacy Considerations) is found at [1].

[1] https://dvcs.w3.org/hg/webcrypto-api/raw-file/tip/spec/Overview.html#privacy
Comment 34 Henri Sivonen 2013-08-13 10:13:17 UTC
(In reply to comment #32)
> I propose this bug be resolved by adding a new informative section or
> appendix (I prefer an appendix), entitled "Privacy Concerns", and that the
> material provided by Henri in comment#9 be used to populate it mutatis
> mutandis.

Seems good to me.
Comment 35 Adrian Bateman [MSFT] 2013-08-27 14:50:14 UTC
Draft section added to the spec per Bug 22910.
See https://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/encrypted-media.html#privacy
Comment 36 David Dorwin 2013-09-30 23:54:36 UTC
Per the discussion in the September 17th telecon (http://www.w3.org/2013/09/17-html-media-minutes.html), we will close this bug and track the security considerations in bug 22909 and the privacy considerations in bug 22910.