This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 27268 - Add a definition of a distinctive identifier
Summary: Add a definition of a distinctive identifier
Status: RESOLVED MOVED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: David Dorwin
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard: Privacy
Keywords:
Depends on:
Blocks: 27166 27269 27270 27272
Reported: 2014-11-07 12:12 UTC by Henri Sivonen
Modified: 2015-10-29 20:21 UTC

Description Henri Sivonen 2014-11-07 12:12:40 UTC
Please add a definition of a "distinctive identifier" to the definitions section so that text requested in subsequent bug reports can refer to it. I suggest defining the term as follows:

(Start proposed spec text)

A distinctive identifier is a piece of data, an implication of the possession of a piece of data, or an observable behavior or timing for which all the following criteria hold:
 1) It is exposed outside the browsing device, or exposed to the application such that the application has the opportunity to send it (even in encrypted form, if decryptable outside the device) or information about it outside the browsing device.
 2) It is not shared across a large population of users or devices.
 3) It is used in more than one session or is potentially used in one persistent session across the point of persistence.

A distinctive identifier is typically unique to a user or device, but an identifier doesn't need to be strictly unique to be distinctive. (An identifier shared among a small number of users could still be distinctive.)

Examples of distinctive identifiers include but are not limited to:
 * A string of bytes that is included in key requests and that is different from the string included by other devices.
 * A public key included in key requests that is different from the public keys included in the requests by other devices.
 * Demonstration of possession of a private key (e.g. by signing some data) that other devices don't have.
 * A key id for such a key.

Examples of things that are not distinctive identifiers:
 * A public key shared among all copies of a given CDM version if the installed base is large.
 * A nonce that's unique but used in only one non-persistent session.
 * Device-unique keys used in attestations between e.g. graphics/video components and the CDM when the CDM does not let these attestations further flow to the application and instead makes a new attestation on its own using a key that does not constitute a distinctive identifier e.g. due to the first point on this list.
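
As a rough illustration of how the three criteria combine, the following sketch models them as boolean traits and classifies some of the examples listed above. The interface, field names, and function are hypothetical and chosen only for this illustration; they do not come from the EME specification, and deciding the traits for a real identifier would require knowledge of the CDM in question.

```typescript
// Hypothetical model of the three proposed criteria; not part of any spec.
interface IdentifierTraits {
  // Criterion 1: exposed outside the device, or to the application in a form
  // the application could forward (even encrypted, if decryptable off-device).
  exposedOutsideDeviceOrToApp: boolean;
  // Negation of criterion 2: shared across a large population of users or devices.
  sharedAcrossLargePopulation: boolean;
  // Criterion 3: used in more than one session, or carried by one persistent
  // session across the point of persistence.
  usedAcrossSessions: boolean;
}

// An identifier is distinctive only if all three criteria hold.
function isDistinctiveIdentifier(t: IdentifierTraits): boolean {
  return (
    t.exposedOutsideDeviceOrToApp &&
    !t.sharedAcrossLargePopulation &&
    t.usedAcrossSessions
  );
}

// A public key in key requests that differs from other devices' keys.
const perDevicePublicKey: IdentifierTraits = {
  exposedOutsideDeviceOrToApp: true,
  sharedAcrossLargePopulation: false,
  usedAcrossSessions: true,
};

// A public key shared by all copies of a CDM version with a large installed base.
const sharedCdmVersionKey: IdentifierTraits = {
  exposedOutsideDeviceOrToApp: true,
  sharedAcrossLargePopulation: true,
  usedAcrossSessions: true,
};

// A unique nonce used in only one non-persistent session.
const singleSessionNonce: IdentifierTraits = {
  exposedOutsideDeviceOrToApp: true,
  sharedAcrossLargePopulation: false,
  usedAcrossSessions: false,
};

// A device-unique attestation key that the CDM never lets flow to the application.
const internalAttestationKey: IdentifierTraits = {
  exposedOutsideDeviceOrToApp: false,
  sharedAcrossLargePopulation: false,
  usedAcrossSessions: true,
};
```

Under these assumed traits, only `perDevicePublicKey` satisfies all three criteria; each of the other examples fails exactly one of them.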
Comment 1 David Dorwin 2014-12-02 01:29:02 UTC
https://github.com/w3c/encrypted-media/commit/ce5d69ae56fc9cc890a02b132533431d54089780 adds the definition. It is mostly the proposed text from comment #0.

I have some questions for Henri below.

(In reply to Henri Sivonen from comment #0)
>  3) It is used in more than one session
By "session", do you really mean MediaKeySession? What about sessions within the same MediaKeys object?

> or is potentially used in one
> persistent session across the point of persistence.
Please clarify and/or explain the purpose of this text.

>  * A nonce that's unique but used in only one non-persistent session.
What is the importance of "non-persistent" here? (I did not include this in the change.)
Comment 2 David Dorwin 2014-12-02 01:29:52 UTC
https://github.com/w3c/encrypted-media/commit/0173c012d540991aa158c1ab45a2733af342775f replaces the persistentUniqueIdentifier member of MediaKeySystemConfiguration with distinctiveIdentifier, which is defined based on the definition added above.

distinctiveIdentifier is defined as being *persistent*, which excludes some scenarios, such as a Distinctive Identifier that is erased upon exiting a private browsing session. However, I do not think that is a scenario that would be used, and the general use of a Distinctive Identifier expects it to be persistent.
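
For illustration, an application that wants to avoid distinctive identifiers might build a configuration along these lines. This is only a sketch: the value names ("required", "optional", "not-allowed") and the persistentState member are taken from the later published EME specification and are an assumption relative to the commit above, and the local types and the function name are hypothetical stand-ins so that the example does not depend on browser type definitions.

```typescript
// Requirement values as named in the later published EME specification
// (an assumption relative to the commit discussed in this comment).
type RequirementSketch = "required" | "optional" | "not-allowed";

// Hypothetical local stand-in for a subset of MediaKeySystemConfiguration.
interface KeySystemConfigSketch {
  distinctiveIdentifier: RequirementSketch;
  persistentState: RequirementSketch;
  videoCapabilities: { contentType: string }[];
}

// Build a configuration that asks for neither a distinctive identifier
// nor persistent state.
function buildPrivacyConservativeConfig(contentType: string): KeySystemConfigSketch {
  return {
    distinctiveIdentifier: "not-allowed",
    persistentState: "not-allowed",
    videoCapabilities: [{ contentType }],
  };
}

const exampleConfig = buildPrivacyConservativeConfig('video/mp4; codecs="avc1.42E01E"');
```

In a browser that implements the published API, an object of this shape would be passed, inside an array, as the second argument to navigator.requestMediaKeySystemAccess(). Whether a given key system accepts such a configuration depends on the CDM.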
Comment 3 David Dorwin 2014-12-08 21:23:18 UTC
Henri, please see my questions in comment #1. Those are the only open issues for this bug.
Comment 4 David Dorwin 2015-01-07 18:38:31 UTC
Henri, please see comment #3.
Comment 5 Henri Sivonen 2015-01-15 10:34:27 UTC
(In reply to David Dorwin from comment #1)
> https://github.com/w3c/encrypted-media/commit/
> ce5d69ae56fc9cc890a02b132533431d54089780 adds the definition. It is mostly
> the proposed text from comment #0.
> 
> I have some questions for Henri below.
> 
> (In reply to Henri Sivonen from comment #0)
> >  3) It is used in more than one session
> By "session", do you really mean MediaKeySession? What about sessions within
> the same MediaKeys object?

I think I don't understand the implications of the distinction well enough to give an informed response at this time.

> > or is potentially used in one
> > persistent session across the point of persistence.
> Please clarify and/or explain the purpose of this text.

The purpose of this text is to close a loophole where a never-ending persistent session could carry around something that's seemingly a throw-away (and, therefore, presumptively not distinctive) value like a nonce, but it doesn't actually get thrown away in reasonable time and becomes a tracking id (i.e. distinctive for practical purposes).

> >  * A nonce that's unique but used in only one non-persistent session.
> What is the importance of "non-persistent" here? (I did not include this in
> the change.)

See above about using a never-ending persistent session for tracking users.
Comment 6 David Dorwin 2015-01-15 18:09:43 UTC
(In reply to Henri Sivonen from comment #5)
> (In reply to David Dorwin from comment #1)
> > https://github.com/w3c/encrypted-media/commit/
> > ce5d69ae56fc9cc890a02b132533431d54089780 adds the definition. It is mostly
> > the proposed text from comment #0.
> > 
> > I have some questions for Henri below.
> > 
> > (In reply to Henri Sivonen from comment #0)
> > >  3) It is used in more than one session
> > By "session", do you really mean MediaKeySession? What about sessions within
> > the same MediaKeys object?
> 
> I think I don't understand the implications of the distinction well enough
> to give an informed response at this time.

I want to understand the type of session you were referring to so that I can eliminate the ambiguity in the spec. I think the problem is that the identifier is the same between, for example, visits to a page. This could be a browsing session. I don't think you meant MediaKeySession, since MediaKeySessions share a MediaKeys object and CDM instance and thus likely share identifiers.
> 
> > > or is potentially used in one
> > > persistent session across the point of persistence.
> > Please clarify and/or explain the purpose of this text.
> 
> The purpose of this text is to close a loophole where a never-ending
> persistent session could carry around something that's seemingly a
> throw-away (and, therefore, presumptively not distinctive) value like a
> nonce, but it doesn't actually get thrown away in reasonable time and
> becomes a tracking id (i.e. distinctive for practical purposes).

Is the never-ending persistent session internal to the CDM or is it left and used by the application? Any persistent session provides tracking just like a cookie. It's unlikely that persistent sessions would be identical on two systems. What is your specific concern beyond that?

Note: The user should be able to clear persistent sessions (like cookies), which should erase such an ID.
> 
> > >  * A nonce that's unique but used in only one non-persistent session.
> > What is the importance of "non-persistent" here? (I did not include this in
> > the change.)
> 
> See above about using a never-ending persistent session for tracking users.

Okay. If I understand correctly, you consider any nonce in a persistent session to be a distinctive identifier.

I am arguing that any persistent session is likely to be a distinctive identifier by that logic. While such things could be used to track a user (unless/until the sessions are cleared by the user), I think this waters down the meaning of distinctive identifier and distracts from the far more concerning types of identifiers. Perhaps we should add a note somewhere explaining how persistent sessions could be used to track users.
Comment 7 Henri Sivonen 2015-01-19 11:14:43 UTC
(In reply to David Dorwin from comment #6)
> (In reply to Henri Sivonen from comment #5)
> > (In reply to David Dorwin from comment #1)
> > > https://github.com/w3c/encrypted-media/commit/
> > > ce5d69ae56fc9cc890a02b132533431d54089780 adds the definition. It is mostly
> > > the proposed text from comment #0.
> > > 
> > > I have some questions for Henri below.
> > > 
> > > (In reply to Henri Sivonen from comment #0)
> > > >  3) It is used in more than one session
> > > By "session", do you really mean MediaKeySession? What about sessions within
> > > the same MediaKeys object?
> > 
> > I think I don't understand the implications of the distinction well enough
> > to give an informed response at this time.
> 
> I want to understand the type of session you were referring to so that I can
> eliminate the ambiguity in the spec. I think the problem is that the
> identifier is the same between, for example, visits to a page. This could be
> a browsing session. I don't think you meant MediaKeySession, since
> MediaKeySessions share a MediaKeys object and CDM instance and thus likely
> share identifiers.

I meant anything that is a "session" from the DRM perspective and that the CDM can store information about to revive in a later browsing session (i.e. after the browser has been quit and relaunched). AFAICT, this means the features under https://w3c.github.io/encrypted-media/#session-storage

> > > > or is potentially used in one
> > > > persistent session across the point of persistence.
> > > Please clarify and/or explain the purpose of this text.
> > 
> > The purpose of this text is to close a loophole where a never-ending
> > persistent session could carry around something that's seemingly a
> > throw-away (and, therefore, presumptively not distinctive) value like a
> > nonce, but it doesn't actually get thrown away in reasonable time and
> > becomes a tracking id (i.e. distinctive for practical purposes).
> 
> Is the never-ending persistent session internal to the CDM or is it left and
> used by the application? Any persistent session provides tracking just like
> a cookie. It's unlikely that persistent sessions would be identical on two
> systems. What is your specific concern beyond that?

The concern I had was an application being able to ask the CDM to persist a DRM session and then, in a later browser session, ask the CDM to revive the DRM session, thereby correlating the two browser sessions as belonging to the same user.
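
To make the concern concrete, here is a small simulation. It does not use the real EME API: FakeCdmStore, browserRun, and the nonce strings are all hypothetical, with the store standing in for whatever CDM storage survives a browser restart. The sketch also assumes that the application remembers the session ID between runs by some other means.

```typescript
// Hypothetical stand-in for CDM storage that survives browser restarts.
class FakeCdmStore {
  private sessions = new Map<string, string>(); // sessionId -> nonce

  persist(sessionId: string, nonce: string): void {
    this.sessions.set(sessionId, nonce);
  }

  load(sessionId: string): string | undefined {
    return this.sessions.get(sessionId);
  }

  // Models the user clearing all persisted DRM data.
  clearAll(): void {
    this.sessions.clear();
  }
}

// One simulated browser run: revive the remembered DRM session if it still
// exists, otherwise create and persist a new one with a fresh nonce.
function browserRun(
  store: FakeCdmStore,
  rememberedId: string | undefined,
  freshNonce: string
): { sessionId: string; nonce: string } {
  if (rememberedId !== undefined) {
    const revived = store.load(rememberedId);
    if (revived !== undefined) {
      return { sessionId: rememberedId, nonce: revived };
    }
  }
  const sessionId = `session-${freshNonce}`;
  store.persist(sessionId, freshNonce);
  return { sessionId, nonce: freshNonce };
}

const store = new FakeCdmStore();
const firstRun = browserRun(store, undefined, "nonce-a");
const secondRun = browserRun(store, firstRun.sessionId, "nonce-b");

// The "single-use" nonce reappears in a later browser run, linking the two.
const correlated = firstRun.nonce === secondRun.nonce;

// After the persisted data is cleared, the link is broken.
store.clearAll();
const thirdRun = browserRun(store, firstRun.sessionId, "nonce-c");
const stillCorrelated = thirdRun.nonce === firstRun.nonce;
```

In this model the nonce is used in only one DRM session, yet it is observed in two browser runs, which is the loophole the "across the point of persistence" wording is meant to close.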

> Note: The user should be able to clear persistent sessions (like cookies),
> which should erase such an ID.

Do you mean that there should be UI that removes persistent sessions without also removing all DRM-related persistent data for a site?

> > > >  * A nonce that's unique but used in only one non-persistent session.
> > > What is the importance of "non-persistent" here? (I did not include this in
> > > the change.)
> > 
> > See above about using a never-ending persistent session for tracking users.
> 
> Okay. If I understand correctly, you consider any nonce in a persistent
> session to be a distinctive identifier.

Right, because even though the nonce is "used only once" from the point of view of being used in only one DRM session, it ends up being used in more than one browser session.

> I am arguing that any persistent session is likely to be a distinctive
> identifier by that logic. While such things could be used to track a user
> (unless/until the sessions are cleared by the user), I think this waters
> down the meaning of distinctive identifier and distracts from the far more
> concerning types of identifiers.

Fair enough.

> Perhaps we should add a note somewhere
> explaining how persistent sessions could be used to track users.

Works for me provided that the requirements that I proposed to trigger on distinctive identifiers are changed to trigger both by the presence of distinctive identifiers and the presence of persistent sessions.
Comment 8 David Dorwin 2015-10-19 19:07:45 UTC
(In reply to Henri Sivonen from comment #7)
> (In reply to David Dorwin from comment #6)

> > Note: The user should be able to clear persistent sessions (like cookies),
> > which should erase such an ID.
> 
> Do you mean that there should be UI that removes persistent sessions without
> also removing all DRM-related persistent data for a site?

I would expect the UI to cover all DRM-related data for a site.
I was just addressing the specific concern about persistent sessions.

> > Perhaps we should add a note somewhere
> > explaining how persistent sessions could be used to track users.
> 
> Works for me provided that the requirements that I proposed to trigger on
> distinctive identifiers are changed to trigger both by the presence of
> distinctive identifiers and the presence of persistent sessions.

What requirements are you referring to? Informing the user, etc.?
Comment 9 David Dorwin 2015-10-19 19:15:38 UTC
The remaining concerns appear to be related to various means of tracking a user. While these are important, I think it's important to consider whether and how certain mechanisms are different from cookies and whether there is language to avoid unexpected persistence (i.e. data in the CDM not truly being cleared). I worry about diluting the meaning of "Distinctive Identifier" and treating something like a cookie the same as a hardware[-based] identifier.
Comment 10 David Dorwin 2015-10-29 20:21:33 UTC
This has been migrated to https://github.com/w3c/encrypted-media/issues/117.