Bug 22910 - Needs non-normative Privacy Consideration section
Status: RESOLVED FIXED
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assigned To: Mark Watson
QA Contact: HTML WG Bugzilla archive list
Depends on:
Blocks: 17202 20965 20966 21869
Reported: 2013-08-09 20:09 UTC by Glenn Adams
Modified: 2013-11-14 08:30 UTC (History)

Description Glenn Adams 2013-08-09 20:09:00 UTC
A new, non-normative section on Privacy Considerations should be added to address a number of concerns persistently expressed about EME, such as enhanced fingerprinting capabilities, unauthorized disclosure of key information, etc.

A reasonable template for such a section is found in WebCrypto [1], section 6.

[1] https://dvcs.w3.org/hg/webcrypto-api/raw-file/tip/spec/Overview.html#privacy
Comment 1 Glenn Adams 2013-08-13 05:33:59 UTC
Propose the following draft text, to be added as a new top level section or a sub-section of the Introduction. Note that this proposal is little more than an outline intended to be elaborated after further discussion in the TF.

X Privacy Considerations

This section is non-normative.

Fingerprinting

Malicious applications may be able to fingerprint users or user agents by detecting or enumerating the list of key systems that are supported.

Tracking

If user agents permit keys to be re-used between origins, without performing any secondary operations such as key derivation that includes the origin, then it may be possible for two origins to collude and track a unique user by recording their ability to access a common key.

Super-cookies

With the exception of ephemeral keys, it's often desirable for applications to strongly associate users with keys. These associations may be used to enhance the security of authenticating to the application, such as using a key stored in a secure element as a second factor, or may be used by users to assert some identity, such as an e-mail signing identity. As such, these keys often live longer than their counterparts such as usernames and passwords, and it may be undesirable or prohibitive for users to revoke these keys. Because of this, keys may exist longer than the lifetime of the browsing context [HTML] and beyond the lifetime of items such as cookies, thus presenting a risk that a user may be tracked even after clearing such data. This is especially true for keys that were pre-provisioned for particular origins and for which no user interaction was provided.
Comment 2 Adrian Bateman [MSFT] 2013-08-27 14:47:24 UTC
(In reply to comment #1)
> Propose the following draft text, to be added as a new top level section or
> a sub-section of the Introduction. Note that this proposal is little more
> than an outline intended to be elaborated after further discussion in the TF.

I added this text verbatim and it needs further review and discussion by the TF.
https://dvcs.w3.org/hg/html-media/rev/6bedfa23739d
Comment 3 David Dorwin 2013-09-30 23:49:46 UTC
Reopening to track work on this section per the discussion in the September 17th telecon (http://www.w3.org/2013/09/17-html-media-minutes.html).
Comment 4 David Dorwin 2013-10-24 17:38:45 UTC
Web MIDI explicitly mentions prompting of the user, noting that it is "not currently required" [1], and has explicit support for such prompts [2]. We may want to consider text that at least notes that prompts may be displayed.

[1] http://www.w3.org/TR/webmidi/#security-and-privacy-considerations
[2] http://www.w3.org/TR/webmidi/#NavigatorMIDIAccess
Comment 5 Joe Steele 2013-10-28 18:28:49 UTC
I sent a proposal which has bearing on the key sharing question. 
http://lists.w3.org/Archives/Public/public-html-media/2013Oct/0031.html
Comment 6 Mark Watson 2013-11-05 20:24:32 UTC
Here is a detailed proposal for the privacy section. It is modeled on material in the WebCrypto Key Discovery draft, which is in turn modeled on material from IndexedDB / Web Storage, modified appropriately and taking into account the comments made above.

I'm open to suggestions for a better way to share this (e.g. with formatting etc.)

"8. Privacy considerations

The presence or use of Key Systems on a user’s device raises a number of privacy issues, falling into two categories: (a) user-specific information that may be disclosed by the EME interface itself, or within messages from Key Systems, and (b) user-specific information that may be persistently stored on the user’s device.

User Agents should take responsibility for providing users with adequate control over their own privacy. Since User Agents may integrate with third party CDM implementations, CDM implementors must provide sufficient information and controls to user agent implementors to enable them to implement appropriate techniques to ensure users have control over their privacy, including but not limited to the techniques described below.

8.1. Information disclosed by EME and Key Systems

Concerns regarding information disclosed by EME and Key Systems fall into two categories: concerns about non-specific information that may nevertheless contribute to the possibility of fingerprinting a user agent or device, and concerns about user-specific information that may be used directly for user tracking.

8.1.1 Fingerprinting

Malicious applications may be able to fingerprint users or user agents by detecting or enumerating the list of key systems that are supported and related information. If proper origin protections are not provided this could include detection of sites that have been visited and information stored for those sites. In particular, Key Systems should not share key or other data between sites that are not CORS-same-origin.
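
As an illustration only (not part of the proposed spec text), the following minimal Python sketch shows why enumeration matters: the set of supported key systems collapses into a stable value that a script could combine with other signals. The key system names and the key_system_fingerprint helper are hypothetical.

```python
import hashlib
from typing import Iterable


def key_system_fingerprint(supported: Iterable[str]) -> str:
    """Reduce a set of supported key system names to one stable value."""
    # Sorting and de-duplicating makes the result independent of the
    # order in which the key systems were probed.
    normalized = sorted(set(supported))
    return hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()


# Hypothetical probe results from two different user agents.
fp_one = key_system_fingerprint(["com.example.drm-a", "com.example.drm-b"])
fp_two = key_system_fingerprint(["com.example.drm-a"])

# Different supported sets give different values, so the list of key
# systems contributes distinguishing information to a fingerprint.
assert fp_one != fp_two
```

The value carries only as much information as the variation in key system support across user agents, which is why the text treats it as a contributing signal rather than an identifier in its own right.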

8.1.2 Tracking

User-specific information may be obtained over the EME API in two ways: through detection of stored keys and through Key System messages.

Key Systems may access or create persistent or semi-persistent identifiers for a device or user of a device. In some cases these identifiers may be bound to a specific device in a secure manner. If these identifiers are present in Key System messages, then devices and/or users may be tracked. If the mitigations below are not applied, this could include both tracking of users / devices over time and associating multiple users of a given device. If not mitigated, such tracking may take three forms depending on the design of the Key System:
- in all cases, such identifiers are expected to be available to sites and/or servers that fully support the Key System (and thus can interpret Key System messages), enabling tracking by such sites.
- if identifiers exposed by Key Systems are not origin-specific, then two sites and/or servers that fully support the Key System may collude to track the user.
- if a Key System message contains information derived from a user identifier in a consistent manner, for example such that a portion of the initial Key System message for a specific content item does not change over time and is dependent on the user identifier, then this information could be used by any application to track the device or user over time.

If a Key System permits keys to be stored and to be re-used between origins, then it may be possible for two origins to collude and track a unique user by recording their ability to access a common key.

Finally, if any user interface for user control of Key Systems presents data separately from data in HTTP session cookies or persistent storage, then users are likely to modify site authorization or delete data in one and not the others. This would allow sites to use the various features as redundant backup for each other, defeating a user's attempts to protect their privacy.

There are a number of techniques that can be used to mitigate these risks of tracking without user consent:

User deletion of persistent identifiers

User agents could provide users with the ability to clear any persistent identifiers maintained by Key Systems.

Use of (non-reversible) per-origin identifiers

The user / device identifier exposed by a Key System may be different for each origin, either by allocation of different identifiers for different origins or by use of a non-reversible origin-specific mapping from an origin-independent identifier.
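
As an illustration only (not part of the proposed spec text), a non-reversible origin-specific mapping could be sketched in Python as below. The choice of HMAC-SHA-256, the per_origin_identifier helper, the device secret and the origins are all assumptions made for this example; the proposal does not prescribe any particular construction.

```python
import hashlib
import hmac


def per_origin_identifier(device_secret: bytes, origin: str) -> str:
    """Derive an origin-specific identifier from an origin-independent secret.

    HMAC-SHA-256 is keyed with the device secret, so a site that sees the
    output cannot recover the secret or compute the identifier that another
    origin would see.
    """
    digest = hmac.new(device_secret, origin.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical device secret and origins, for illustration only.
secret = b"example-device-secret"
id_a = per_origin_identifier(secret, "https://video.example")
id_b = per_origin_identifier(secret, "https://ads.example")

# The same origin always sees the same identifier, while two origins see
# different values and so cannot join their records by comparing them.
assert id_a == per_origin_identifier(secret, "https://video.example")
assert id_a != id_b
```

Under these assumptions, two colluding sites would each hold a different value for the same device, which addresses the cross-origin collusion case but not tracking within a single origin.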

Encryption of user identifiers

User identifiers in Key System messages could be encrypted, together with a timestamp or nonce, such that the Key System messages are always different. This would prevent the use of Key System messages for tracking except by applications fully supporting the Key System.

Site-specific white-listing of access to each Key System

User agents could require the user to explicitly authorize access by each site to each Key System. User agents should enable users to revoke this authorization either temporarily or permanently.

Treating Key System persistent identifiers as cookies

User agents should present the presence of persistent identifiers stored by Key Systems to the user in a way that associates them strongly with HTTP session cookies. This might encourage users to view such identifiers with healthy suspicion.

Shared blacklists

User agents may allow users to share their Key System domain blacklists. This would allow communities to act together to protect their privacy.

User alerts / prompts

User Agents could ensure that users are fully informed and / or give explicit consent before identifiers are exposed in messages from Key Systems.

User controls to disable Key Systems or Key System use of identifiers

User Agents could provide users with a global control of whether a Key System is enabled / disabled and / or whether Key System use of user / device identifiers is enabled or disabled (if supported by the Key System).

While these suggestions prevent trivial use of this feature for user tracking, they do not block it altogether. Within a single domain, a site can continue to track the user during a session, and can then pass all this information to a third party along with any identifying information (names, credit card numbers, addresses) obtained by the site. If a third party cooperates with multiple sites to obtain such information, and if identifiers are not per-origin, then a profile can still be created.

It is important to note that identifiers that are non-clearable, non-origin-specific or hardware-bound exceed the tracking impact of existing techniques such as Cookies or session identifiers embedded in URLs.

Thus, in addition to the various mitigations described above, if a browser supports a mode of operation intended to preserve user anonymity, then User Agent implementers should carefully consider whether access to Key Systems should be disabled in this mode.

8.2. Information stored on user devices

Key Systems may store information on a user’s device, or user agents may store information on behalf of Key Systems. This could reveal information about a user to another user of the same device, potentially including the origins that have used a particular Key System (i.e. sites visited) or even the content that has been decrypted using a Key System.

If information stored by one origin affects the operation of the Key System for another origin, then the sites visited or content viewed by a user on one site may be revealed to another, potentially malicious, site.

There are a number of techniques that can be used to mitigate these privacy risks to users:

Origin-specific Key System storage

User agents may require that some or all of the Key System’s persistently stored data is stored in an origin-specific way.

User deletion of Key System storage

User agents may present the user with a way to delete Key System storage for a specific origin or all origins.
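
As an illustration only (not part of the proposed spec text), origin-specific storage and user deletion can be modelled with a toy Python class. The KeySystemStorage class, its method names, and the origins are hypothetical; a real user agent or CDM would use its own storage layer.

```python
from collections import defaultdict
from typing import Dict, Optional


class KeySystemStorage:
    """Toy model of Key System data partitioned by origin."""

    def __init__(self) -> None:
        # origin -> {record name -> stored bytes}
        self._by_origin: Dict[str, Dict[str, bytes]] = defaultdict(dict)

    def put(self, origin: str, name: str, data: bytes) -> None:
        self._by_origin[origin][name] = data

    def get(self, origin: str, name: str) -> Optional[bytes]:
        # Lookups only ever consult the caller's own partition.
        return self._by_origin.get(origin, {}).get(name)

    def clear(self, origin: Optional[str] = None) -> None:
        """Delete data for one origin, or for all origins if origin is None."""
        if origin is None:
            self._by_origin.clear()
        else:
            self._by_origin.pop(origin, None)


store = KeySystemStorage()
store.put("https://video.example", "license", b"opaque-license-data")

# Another origin cannot observe that the first origin stored anything.
assert store.get("https://other.example", "license") is None

# User-initiated deletion for a single origin removes its partition.
store.clear("https://video.example")
assert store.get("https://video.example", "license") is None
```

Keying every lookup by origin means that data stored by one site cannot influence what another site observes, and it gives the deletion control a natural per-origin granularity.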

Treating Key System stored data like cookies / Web Storage

User agents should present the presence of persistent data stored by Key Systems to the user in a way that associates it strongly with HTTP session cookies and/or Web Storage. This might encourage users to view such data with healthy suspicion.

Encryption or obfuscation of Key System stored data

User agents should treat data stored by Key Systems as potentially sensitive; it is quite possible for user privacy to be compromised by the release of this information. To this end, user agents should ensure that such data is securely stored and that, when data is deleted, it is promptly removed from the underlying storage."
Comment 7 Glenn Adams 2013-11-12 03:17:16 UTC
I notice the recent publishing of RFC 6973 Privacy Considerations for Internet Protocols [1]. While I haven't read this yet, there may be useful guidelines here that would help us construct an appropriate solution to this bug.

[1] http://tools.ietf.org/html/rfc6973
Comment 8 Adrian Bateman [MSFT] 2013-11-14 01:28:38 UTC
(In reply to Mark Watson from comment #6)
> Here is a detailed proposal for the privacy section. It is modeled on
> material in the WebCrypto Key Discovery draft, which is in turn modeled on
> material from IndexedDB / Web Storage, modified appropriately and taking
> into account the comments made above.

I have added this to the spec to make it easier to review:
https://dvcs.w3.org/hg/html-media/rev/cccd6d78bd9f
Comment 9 Adrian Bateman [MSFT] 2013-11-14 08:30:58 UTC
Updated the spec with ISSUE notes about this section needing more review as discussed at the F2F. Please open new bugs with any feedback instead of reopening this bug.
https://dvcs.w3.org/hg/html-media/rev/42d23ada7eca