This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 21869 - Need clarity on stored keys for CDMs
Summary: Need clarity on stored keys for CDMs
Status: RESOLVED WORKSFORME
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on: 22909 22910
Blocks:
Reported: 2013-04-29 19:17 UTC by Joe Steele
Modified: 2013-11-14 08:49 UTC
5 users

See Also:


Description Joe Steele 2013-04-29 19:17:32 UTC
We have a few bugs that touch on the subject of retaining keys beyond the life of a session -- primarily bug 17750 ("Define the behavior MediaKeySession close() and clearing the keys attribute"), but bugs 19788 and 16616 also mention it.

We should provide some guidance to EME implementors as to how they can expect to store these keys. For example -- following normal UA guidelines I would expect this data to be per domain and per page URL. But is this based on the domain hosting the player (my preference) or on the domain for the key servers, or is it an entirely independent store? Or is it entirely up to the UA?

This is important for maintaining clean separation between players and allowing the user to clear state when they want to. 

BTW -- I could not find an exact match for this bug although it has commonalities with some existing bugs. If you would prefer to close it as a dup, I can just continue the conversation in the dup bug.
Comment 1 Henri Sivonen 2013-05-08 07:20:44 UTC
What are the use cases for retaining keys beyond the life of a session?
Comment 2 Joe Steele 2013-05-13 20:46:13 UTC
Retaining some keys allows for better performance and usability. Every key acquisition has a cost, both to the user (representing either user time spent or delay to first playback) and to the server operator (key retrieval on the back end).

Some keys may be required again and again - for example if a group of devices is associated with the same account, a common key can be used to request content licenses for those devices. 

Some keys may only be required for a particular piece of content, but you don't want to have to pay the cost of acquisition again just because you have put your machine to sleep briefly. 

And in some cases it may not be convenient to reacquire the key, for example if the key can only be acquired in a private environment but the content is available to key holders in public environments.

Also there may be metadata about the keys themselves which needs to be retained, for example how many times this piece of content has been played, when the first playback started, etc. This can be maintained on the server, but there can be a cost benefit to users and content providers to have this local.
Comment 3 Henri Sivonen 2013-05-23 15:18:56 UTC
(In reply to comment #2)
> Retaining some keys allow for better performance and usability. Every key
> acquisition has a cost, both to the user (representing either user time
> spent or delay to first playback)

Isn't this best solved by letting the video start with a few seconds of unencrypted content, even though the keys are declared up front, so that playback can start during the key acquisition?

> and to the server operator (key retrieval
> on the back end).

Considering how network chatty Web apps are in general fetching images, doing XHR, etc., it seems weird to me to try to optimize away key reacquisition. One would expect the messages to be small in terms of the number of bytes, the crypto operations no heavier than a TLS handshake and key lookups not more advanced than database lookups that many sites do all the time.

Is there something intuition-defying about the cost of talking to a license server that would justify the avoidance of key reacquisition? I had expected key reacquisition would happen even more often than it really has to in order to implement heartbeat.

> Some keys may be required again and again - for example if a group of
> devices is associated with the same account, a common key can be used to
> request content licenses for those devices. 

This seems to imply that DRM-level domains would be involved. Since the design of EME handles login using regular session cookies, it seems to me that it makes no sense for EME CDMs to have the concept of domains, since domains can be implemented entirely on the server side if desired.

> Some keys may only be required for a particular piece of content, but you
> don't want to have to pay the cost of acquisition again just because you
> have put your machine to sleep briefly. 

Why is this unwanted? Web apps like Gmail ping a server all the time. OTOH, Netflix's player times out when paused even if the computer is awake.

> And in some cases it may not be convenient to reacquire the key, for example
> if the key can only be acquired in a private environment but the content is
> available to key holders in public environments.

I have trouble seeing what sort of movie streaming service would work like this.

> Also there may be metadata about the keys themselves which needs to be
> retained, for example how many times this piece of content has been played,
> when the first playback started, etc. This can be maintained on the server,
> but there can be a cost benefit to users and content providers to have this
> local.

It seems like the client side could be simpler if this were handled by having keys expire often and keeping the connection to the license server chatty, re-requesting keys all the time as a form of heartbeat. If the server side doesn't like the complexity, maybe the server side should relax tracking requirements.

Considering privacy, it would be best if the CDM didn't store anything persistently and, therefore, didn't create a new class of cookie-like data. I think adding a class of cookie-like data in order to optimize round trips to the key server is a bad tradeoff.

I would prefer EME banning CDMs from writing anything to persistent storage as a result of talking with a key server of a content service (in order to avoid the creation of a new class of cookie-like data). This formulation would still allow downloading an IBX from the DRM vendor (as opposed to a key server of a content site) and storing it as part of CDM setup.
Comment 4 Joe Steele 2013-05-23 18:17:15 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Retaining some keys allow for better performance and usability. Every key
> > acquisition has a cost, both to the user (representing either user time
> > spent or delay to first playback)
> 
> Isn't this best solved by letting the video start with a few seconds of
> unencrypted content even though the keys are declared up front so that
> playback can start during the key acquisition.

[steele] This would be a good solution for the initial startup delay. However, this is not in compliance with long-term business agreements some content publishers have for how their content is protected for distribution. In practice I have found most publishers cannot use this technique currently. Also, this would not address the reacquisition problem after playback has started and you are past the initial unencrypted portion. The same holds true for live streams, where the content server has no idea whether this is the first time the user is viewing the stream.

> 
> > and to the server operator (key retrieval
> > on the back end).
> 
> Considering how network chatty Web apps are in general fetching images,
> doing XHR, etc., it seems weird to me to try to optimize away key
> reacquisition. One would expect the messages to be small in terms of the
> number of bytes, the crypto operations no heavier than a TLS handshake and
> key lookups not more advanced than database lookups that many sites do all
> the time.
> 
> Is there something intuition-defying about the cost of talking to a license
> server that would justify the avoidance of key reacquisition? I had expected
> key reacquisition would happen even more often than it really has to in
> order to implement heartbeat.

[steele] Remember there are two costs here: the cost on the client side and the cost on the license server side.

On the client side, the cost can be very expensive for platforms which use obfuscated software to do the cryptographic operations -- on the order of seconds, which is immediately visible to the end user as lag. This is a bad user experience. It also costs in terms of power/battery life, as the client potentially has to do a lot of crypto operations, which will hit mobile devices especially hard.

On the license server side, the cost of a TLS handshake is a good estimate, although that may be low for some protocols. That may seem cheap, but consider the "license storm" problem: a sporting event is about to start, and 100M people fire up their browsers and proceed to the website to watch the event.

If no keys are cached, all 100M send off a key request within a small window (say 15 minutes). To avoid server overload, the key server operator has a couple of choices. They can opt to provision a large number of servers all the time (lots of expense) or use elastic scaling to ramp up as the load increases. This is a balancing act for them: even if they have a large number of servers, they may guess wrong on the peak traffic and experience cascading failures. Elastic scaling may not be fast enough to match the peak either, so users still experience failures.

Now consider if keys can be cached. The application can cache a "subscription key" days ahead of time, which is used to open the keys for the actual event. The event streams can be encrypted with unique keys which are decryptable with the "subscription key" and embedded in the content stream. In this case some percentage of the 100M (potentially a large percentage) already have the keys they need. There are far fewer license requests at the time of the event, and the storm is much smaller.
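To make the caching argument concrete, here is a minimal sketch of the scheme described above. This is not a real EME or CDM API -- every name (keyCache, fetchSubscriptionKey, unwrapContentKey) is hypothetical, and the "keys" are placeholder strings standing in for real key material. The point it illustrates is that a cached subscription key means the license server is contacted once, ahead of time, instead of once per viewer at event time.

```javascript
// In-memory stand-in for whatever persistent store the CDM would use.
const keyCache = new Map();

let licenseRequests = 0; // count round trips to the (mock) license server

// Mock license server: only contacted on a cache miss.
function fetchSubscriptionKey(accountId) {
  licenseRequests += 1;
  return `sub-key-for-${accountId}`; // stand-in for real key material
}

function getSubscriptionKey(accountId) {
  if (!keyCache.has(accountId)) {
    keyCache.set(accountId, fetchSubscriptionKey(accountId));
  }
  return keyCache.get(accountId); // cache hit: no server round trip
}

// Stand-in for unwrapping a per-event content key carried in the stream.
function unwrapContentKey(wrappedKey, subscriptionKey) {
  return `${wrappedKey}:unwrapped-with:${subscriptionKey}`;
}

// Days ahead of the event: one license request warms the cache.
getSubscriptionKey('acct-1');

// At event time: three streams start, with no further license requests.
const contentKeys = ['ev1', 'ev2', 'ev3'].map(
  (w) => unwrapContentKey(w, getSubscriptionKey('acct-1'))
);
```

With 100M viewers the same logic applies: the storm of requests is absorbed days earlier, spread over whenever each client happens to warm its cache.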

> 
> > Some keys may be required again and again - for example if a group of
> > devices is associated with the same account, a common key can be used to
> > request content licenses for those devices. 
> 
> This seems to imply that DRM-level domains would be involved. Since the
> design of EME handles login using regular session cookies, it seems to me
> that the it makes no sense to for EME CDMs to have the concept of domains,
> since domains can be implemented entirely on the server side if desired.

[steele] EME does not need to have an explicit concept of domains. I was using domains as an example of an intermediate key. In DRM systems which support a hierarchy of keys, where the keys are acquired in stages (for example from different servers), it is useful to be able to retain the intermediate keys between sessions, to avoid the costs I outlined above where possible.

> 
> > Some keys may only be required for a particular piece of content, but you
> > don't want to have to pay the cost of acquisition again just because you
> > have put your machine to sleep briefly. 
> 
> Why is this unwanted? Web apps like Gmail ping a server all the time. OTOH,
> Netflix's player times out when paused even if the computer is awake.

[steele] As I mentioned above - this can introduce a delay of seconds or more on the client end. 

> 
> > And in some cases it may not be convenient to reacquire the key, for example
> > if the key can only be acquired in a private environment but the content is
> > available to key holders in public environments.
> 
> I have trouble seeing what sort of movie streaming service would work like
> this.

[steele] Imagine a developer for a movie streaming service trying to debug their player. They will need to test under various network conditions (say, at your local Starbucks). However, they do not want to expose their pre-production key servers to the open Internet. As a developer of these players, I run into this issue on a daily basis. Or to give a different example, what about a corporate video server streaming confidential content (e.g. a company meeting)? The key acquisition phase might have to be completed within the corporate firewall, but the playback could continue outside the firewall since the content itself is protected.

Having said that -- that is not the main problem I am hoping to address. It is the performance and usability issues I raised above. 

> 
> > Also there may be metadata about the keys themselves which needs to be
> > retained, for example how many times this piece of content has been played,
> > when the first playback started, etc. This can be maintained on the server,
> > but there can be a cost benefit to users and content providers to have this
> > local.
> 
> Seems like the client side could be simpler by handling this by the keys
> expiring often and the connection to the license server being chatty with
> re-requesting keys all the time as a form of heartbeat. If the server side
> doesn't like the complexity, maybe the server side should relax tracking
> requirements. 

[steele] See my above comments about why this could take a lot of client and server side time. 

> 
> Considering privacy, it would be the best that the CDM didn't store anything
> persisently and, therefore, didn't create a new class of cookie-like data. I
> think adding a class of cookie-like data in order to optimize round trips to
> the key server is a bad tradeoff.
> 
> I would prefer EME banning CDMs from writing anything to persistent storage
> as a result of talking with a key server of a content service (in order to
> avoid the creation of a new class of cookie-like data). This formulation
> would still allow downloading an IBX from the DRM vendor (as opposed to a
> key server of a content site) and storing it as part of CDM setup.

[steele] If the main concern here is the adding of a separate class of cookie-like data (as opposed to storing any data), I would say this is not a firm requirement. However, this has a clear benefit for the user, because it makes it less likely they will shoot themselves in the foot by clearing this data inadvertently.

I don't see the privacy concern here if this data is handled like web application data is today, segregated by domain and subject to CORS restrictions. Please articulate why you think this has privacy implications. 

I disagree on the trade-off. If the performance of players using EME is necessarily less than that of existing plugin-based or app-based solutions, this will be a roadblock to adoption and folks will continue to use the old solutions. 

I think adding restrictions here (beyond what normal web applications are subject to) would only result in a worse user experience and no additional privacy.
Comment 5 Henri Sivonen 2013-05-24 09:11:36 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > Retaining some keys allow for better performance and usability. Every key
> > > acquisition has a cost, both to the user (representing either user time
> > > spent or delay to first playback)
> > 
> > Isn't this best solved by letting the video start with a few seconds of
> > unencrypted content even though the keys are declared up front so that
> > playback can start during the key acquisition.
> 
> [steele] This would be a good solution for the initial startup delay.
> However this is not in compliance with long term business agreements some
> content publishers have for how their content is protected for distribution.
> In practice I have found most publishers cannot use this technique
> currently.

It's pretty sad that content licensors are *so* user-hostile. As if it mattered whether the opening credits have DRM when the rest of the title does. :-(

> On the client side, the cost can be very expensive for platforms which use
> obfuscated software to do the cryptographic operations. On the order of
> seconds

Wow. I had no idea the impact of obfuscation was so bad. Thanks. Since AES decrypt has to work in real time, I take it that operations with asymmetric keys are more obfuscated than the AES decryptor or the codec?

> Now consider if keys can be cached. The application can cache a
> "subscription key" days ahead of time which is used to open the keys for the
> actual event.

The concept of a "subscription key" isn't in EME, though, and the way to invoke CDM operations in an EME world is to attempt to play an encrypted track. How would the acquisition of the subscription key actually work with EME? Would the user have to navigate to the site days ahead of time and have the site programmatically play a trivial video file in order to do something that triggers EME and gives the CDM the opportunity to talk with the server and change its state?

> The event streams can be encrypted with unique keys which are
> decrypt-able with the "subscription key" and embedded in the content stream.

Is this an interoperable CENC feature, or are we now talking about a DRM scheme-specific feature that would defeat the CENC-based interoperability story on EME?

> In DRM systems which
> support a hierarchy of keys and the keys are acquired in stages (for example
> from different servers) it is useful to be able to retain the intermediate
> keys between sessions to avoid the costs I outlined above where possible.

Is the idea that intermediate keys are used with a less obfuscated copy of the crypto code than the key embedded in CDM itself?

> > > Some keys may only be required for a particular piece of content, but you
> > > don't want to have to pay the cost of acquisition again just because you
> > > have put your machine to sleep briefly. 
> > 
> > Why is this unwanted? Web apps like Gmail ping a server all the time. OTOH,
> > Netflix's player times out when paused even if the computer is awake.
> 
> [steele] As I mentioned above - this can introduce a delay of seconds or
> more on the client end. 

Actually: Why wouldn't the keys just stay in RAM during sleep if the browser and CDM process(es) don't terminate?

> > > And in some cases it may not be convenient to reacquire the key, for example
> > > if the key can only be acquired in a private environment but the content is
> > > available to key holders in public environments.
> > 
> > I have trouble seeing what sort of movie streaming service would work like
> > this.
> 
> [steele] Imagine a developer for a movie streaming service trying to debug
> their player. They will need to test under various network conditions (say
> at your local Starbucks). However they do not want to expose their
> pre-production key servers to the open Internet. As a developer of these
> players - I run into this issue on a daily basis.

In that case, all the rest of the preproduction service would need to be exposed to the Internet anyway as far as firewalls go. However, it could be behind login. Since with EME user identification isn't a DRM concern but a general Web app concern, the Web app could control whether messages are forwarded to the preproduction key server based on whether the login belongs to a developer. Since EME unifies how login works with how login works on the Web in general, it seems to me that it would make the most sense to control developer access to preproduction key servers exactly the same way as developer access to preproduction versions of the rest of the components of the service is controlled.

> Or to give a different
> example, what about a corporate video server streaming confidential content
> (e.g. a company meeting). The key acquisition phase might have to be
> completed within the corporate firewall but the playback could continue
> outside the firewall since the content itself is protected.

I think the W3C should reject EME spec adjustments motivated by intranet use cases. If the W3C does EME at all, I think it should only be done as an exceptional spec in response to the exceptional market power that the major Hollywood studios wield. If intranet-motivated use cases were acknowledged, then a precedent would be set that would open the door to adding DRM to all sorts of other media that companies might use to store confidential data, even though those forms of media aren't associated with the exceptional market power of the major Hollywood studios -- and then video DRM would no longer be an exception.

(Besides, DRM doesn't "protect" against an employee playing a corporate video somewhere where an unauthorized person can overhear the soundtrack, so DRM isn't a particularly appropriate mechanism for guarding corporate secrets anyway.)

> Having said that -- that is not the main problem I am hoping to address. It
> is the performance and usability issues I raised above. 

OK.

> [steele] If the main concern here is the adding of a separate class of
> cookie-like data (as opposed to storing any data),

The point is that any stored data is cookie-like data. From the point of view of someone examining the user's computing device, the stored data can serve as evidence that the user has visited a given Web site. From the point of view of a server, the stored data serves as evidence that the user has been seen before (if it didn't, the data would be useless for optimizing the flow). The latter concern is of course moot with services like Netflix that require login anyway but would be privacy-relevant in the case of services that don't require login.

Another privacy issue is that the CDM is Hollywood's agent on the user's computer, and the user may not trust the CDM. The browser is the user's agent and, therefore, the piece of software that the user trusts. From the user's perspective, it would be prudent if the user's agent kept Hollywood's agent in a sandbox that prevented Hollywood's agent from accessing the user's disk or network directly. EME is the mechanism for providing browser-mediated (and Web app-mediated) networking for the CDM, but since EME doesn't logically require CDM access to persistent storage, providing browser-mediated access to persistent storage is additional complexity from the browser's perspective.

Even without sandboxing, the ability of the browser to instruct the CDM to clear data related to a given site broadens the interface between the browser and the CDM from what's logically minimally necessary. (Though it might be a drop in a bucket in the big picture.)

> I would say this is not a
> firm requirement. However this has a clear benefit for the user, because it
> make it less likely they will shoot themselves in the foot by clearing this
> data inadvertently. 

If the user tells the browser "Forget about example.com" and the CDM's example.com-related records aren't deleted, that would be a notable violation of the user's privacy expectations.

> I don't see the privacy concern here if this data is handled like web
> application data is today, segregated by domain and subject to CORS
> restrictions. Please articulate why you think this has privacy implications. 

Unless there is the added complexity of a mechanism that allows CDM-stored data to be deleted together with other traces that a site might have left in the browser, a person examining the user's computer can use CDM-stored data as evidence of sites visited and servers can use CDM-stored data as evidence of a prior visit.

The easiest way to avoid this problem would be not to store anything persistently. However, crypto operations in the CDM alone taking multiple seconds does indeed put the user-experience trade-off in a different light: if storing a service-specific intermediate key would substantially speed up subsequent content key acquisitions from the same site, that might justify the complexity of having a mechanism for clearing the data that integrates with the other data-clearing mechanisms in the browser.

> I think adding restrictions here (above what normal web app applications are
> subject) would only result in a worse user experience and no additional
> privacy.

If the CDM-stored data is cleared together with the other storage mechanisms offered to sites, there's no new privacy issue. If not, there is.
Comment 6 Joe Steele 2013-06-08 00:07:36 UTC
(In reply to comment #5)
> The concept of a "subscription key" isn't in EME, though, and the way to
> invoke CDM operations in a EME world is to attempt to play an encrypted
> track. How would the acquisition of the subscription key actually work with
> EME? Would the user have to navigate to the site days ahead of time and have
> the site programmatically play a trivial video file in order to do something
> that triggers EME and gives the CDM the opportunity to talk with the server
> and change its state?

The way I have envisioned it working (and tried to move the spec in that direction) is pretty close to that model. The user would sign up with a streaming service. After the signup process, the application would download a bundle of keys that would be passed as initData during createSession(). This would trigger any initial setup of the CDM and cause supplementary keys to be acquired. This will take some time, but it can happen while the user is browsing the available options, so there is less impact on the user experience. Then, when the user selects a video to play, there may be no additional key acquisition required.
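A rough sketch of that flow, with the CDM mocked as a plain object: this is not the real EME API surface, and the bundle format and all names (MockCdmSession, hasKeyFor, contentId) are hypothetical. It only shows the shape of the idea -- keys arrive as initData at session creation, so later playback needs no request.

```javascript
// Mock CDM session: consumes an initData bundle at creation time and
// pre-provisions the keys it contains, so selecting a video later can
// play without a license request. All names here are hypothetical.
class MockCdmSession {
  constructor(initData) {
    this.keys = new Map();
    // Bundle format is invented for this sketch: a JSON array of
    // { contentId, key } pairs downloaded after signup.
    for (const entry of JSON.parse(initData)) {
      this.keys.set(entry.contentId, entry.key);
    }
  }
  // True if playback can start without contacting the license server.
  hasKeyFor(contentId) {
    return this.keys.has(contentId);
  }
}

// After signup, the app downloads a bundle and passes it as initData.
const bundle = JSON.stringify([
  { contentId: 'movie-1', key: 'k1' },
  { contentId: 'movie-2', key: 'k2' },
]);
const session = new MockCdmSession(bundle);
```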

> Is this an interoperable CENC feature, or are we now talking about a DRM
> scheme-specific feature that would defeat the CENC-based interoperability
> story on EME?

CENC is about how the content is encrypted, not how the keys are acquired. Different CDMs can provide the same key using different mechanisms. For example one CDM could require a license server request, while another could have the required key on hand already.

> > In DRM systems which
> > support a hierarchy of keys and the keys are acquired in stages (for example
> > from different servers) it is useful to be able to retain the intermediate
> > keys between sessions to avoid the costs I outlined above where possible.
> 
> Is the idea that intermediate keys are used with a less obfuscated copy of
> the crypto code than the key embedded in CDM itself?

That is a possible implementation but not what I was getting at. A performance boost from doing fewer license requests is what you gain here. 

> If the user tells the browser "Forget about example.com" and the CDM's
> example.com-related records aren't deleted, that would be a notable
> violation of the user's privacy expectations.
> 
> > I don't see the privacy concern here if this data is handled like web
> > application data is today, segregated by domain and subject to CORS
> > restrictions. Please articulate why you think this has privacy implications. 
> 
> Unless there is the added complexity of a mechanism that allows CDM-stored
> data to be deleted together with other traces that a site might have left in
> the browser, a person examining the user's computer can use CDM-stored data
> as evidence of sites visited and servers can use CDM-stored data as evidence
> of a prior visit.
> 
> The easiest way to avoid this problem would be not to store anything
> persistently. However, crypto operations in the CDM alone taking multiple
> seconds does indeed put the user experience trade-off in a different light
> if storing a service-specific intermediate key would substantially speed up
> subsequent content key acquisitions form the same site and might justify the
> complexity of having a mechanism for clearing the data that integrates with
> the other data clearing mechanisms in the browser.
> 
> > I think adding restrictions here (above what normal web app applications are
> > subject) would only result in a worse user experience and no additional
> > privacy.
> 
> If the CDM-stored data is cleared together with the other storage mechanisms
> offered to sites, there's no new privacy issue. If not, there is.

Sounds like we are in agreement then. I believe CDM-stored data should be clearable just like other storage mechanisms. My only caveat is that identifying the type of data would lead to a better user experience.
Comment 7 Henri Sivonen 2013-06-20 07:00:39 UTC
Compared to what I said above, David Dorwin had an even better idea according to the latest meeting minutes. (http://www.w3.org/2013/06/18-html-media-minutes.html#item03)

To expand:
If you want data to stay around on the client side beyond having it in RAM, have the CDM pass an encrypted data packet to the JavaScript application, have the JavaScript application store the encrypted packet in IndexedDB, and then next time have the JavaScript application take the encrypted packet from IndexedDB and push it back to the CDM. This way, the privacy controls that browsers need to provide for IndexedDB anyway would automatically apply to data that CDMs want to keep around.
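The round trip can be sketched as follows, with a mock CDM and a Map standing in for IndexedDB (MockCdm, exportBlob, runVisit, and clearSiteData are all hypothetical names for this sketch, and the "encryption" is omitted):

```javascript
// The CDM never touches disk itself: it hands the app an opaque
// (notionally encrypted) blob, the app persists it per origin, and
// feeds it back on the next visit.
class MockCdm {
  constructor(savedBlob) {
    // Restore prior state if the app supplied a stored blob.
    this.state = savedBlob ? JSON.parse(savedBlob) : { visits: 0 };
    this.state.visits += 1;
  }
  exportBlob() {
    // Opaque to the app; a real CDM would encrypt this.
    return JSON.stringify(this.state);
  }
}

const fakeIndexedDb = new Map(); // stand-in for IndexedDB

function runVisit(origin) {
  const cdm = new MockCdm(fakeIndexedDb.get(origin));
  fakeIndexedDb.set(origin, cdm.exportBlob());
  return cdm.state.visits;
}

// Clearing site data in the browser clears the blob too, so CDM state
// disappears along with cookies/IndexedDB for that origin.
function clearSiteData(origin) {
  fakeIndexedDb.delete(origin);
}
```

The design point is the last function: because the app, not the CDM, owns the storage, "forget about example.com" works with no new clearing mechanism.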
Comment 8 Joe Steele 2013-07-15 16:03:56 UTC
(In reply to comment #7)
> Compared to what I said above, David Dorwin had an even better idea
> according to the latest meeting minutes.
> (http://www.w3.org/2013/06/18-html-media-minutes.html#item03)
> 
> To expand:
> If you want data to stay around on the client side beyond having it in RAM,
> have the CDM pass an encrypted data packet to the JavaScript application,
> how the JavaScript application store the encrypted packet in IndexedDB and
> then next time have the JavaScript application take the encrypted packet
> from IndexedDB and push it back to the CDM. This way, the privacy controls
> that browsers need to provide for IndexedDB anyway would automatically apply
> to data that CDMs want to keep around.

This is not quite the right approach. It would require more CDM-specific code to be written. I believe this imposes an unnecessary burden on the app developer for no additional security or privacy. Consider the CDM to behave like a local server, and this data to be cookie data. When a web app contacts a server, it does not have to explicitly store each cookie that comes from that server. It can inspect them, and it can forbid them from being stored, but only if it wants or needs to. This is the kind of API I would like to see for storage.

I propose that to satisfy any perceived requirement for additional security, we add a mechanism to register a storage permission handler which looks something like this: Boolean AllowStorageHandler(DOMString path, Uint8Array data) 

It would be attached to the MediaKeys object like this: MediaKeys.onallowstorage

This would be called whenever the CDM wants to store data. If no handler is attached, the default handler would return True, allowing the data to be stored like a regular cookie would be. If a handler is attached, then the application can make the call. There does not seem to be a need for a parallel "AllowRead" handler, since there is no use case I can think of where the app would allow writing, but then subsequently choose to disallow reading.
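One way the proposed hook might look in practice -- a sketch only, with MediaKeys mocked and the AllowStorageHandler signature taken from the proposal above; cdmRequestsStore is an invented stand-in for the CDM's write path:

```javascript
// Default behavior: no handler attached, writes allowed (cookie-like).
// With a handler attached, the application makes the call per write.
class MockMediaKeys {
  constructor() {
    this.onallowstorage = null; // Boolean handler(path, data)
    this.store = new Map();
  }
  // Called by the (mock) CDM whenever it wants to persist data.
  cdmRequestsStore(path, data) {
    const allowed =
      this.onallowstorage === null || this.onallowstorage(path, data);
    if (allowed) this.store.set(path, data);
    return allowed;
  }
}

const mk = new MockMediaKeys();
const defaultAllowed = mk.cdmRequestsStore('keys/a', 'blob-a'); // no handler

// App attaches a handler that rejects writes outside "keys/".
mk.onallowstorage = (path, _data) => path.startsWith('keys/');
const ok = mk.cdmRequestsStore('keys/b', 'blob-b');     // allowed
const blocked = mk.cdmRequestsStore('tmp/x', 'blob-x'); // rejected
```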
Comment 9 Mark Watson 2013-07-16 15:58:08 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > (In reply to comment #2)
> > > > Retaining some keys allow for better performance and usability. Every key
> > > > acquisition has a cost, both to the user (representing either user time
> > > > spent or delay to first playback)
> > > 
> > > Isn't this best solved by letting the video start with a few seconds of
> > > unencrypted content even though the keys are declared up front so that
> > > playback can start during the key acquisition.
> > 
> > [steele] This would be a good solution for the initial startup delay.
> > However this is not in compliance with long term business agreements some
> > content publishers have for how their content is protected for distribution.
> > In practice I have found most publishers cannot use this technique
> > currently.
> 
> It's pretty sad that content licensors are *so* user-hostile. As if it
> mattered if opening credits have DRM or not of the rest of the title does.
> :-(

It's probably not so much that the licensors are user-hostile but that this use-case was not considered when contracts were written and happens to be prohibited by the contracts. Re-writing contracts is difficult because there are so many of them, they are very long and complex, and every time you re-open them people bring in other new issues. So, whilst this technique might be applicable in some cases, the specification design shouldn't rely on it being universally available.
Comment 10 Mark Watson 2013-07-16 16:37:00 UTC
(In reply to comment #3)

> Seems like the client side could be simpler by handling this by the keys
> expiring often and the connection to the license server being chatty with
> re-requesting keys all the time as a form of heartbeat. If the server side
> doesn't like the complexity, maybe the server side should relax tracking
> requirements. 
> 

We have found significant advantage for long-form content if a connection to the application servers is not required for streaming to continue. That is, once the key and content URLs have been acquired, streaming can continue even if the application server connection is lost. If a heartbeat is *required* to continue streaming, then the availability requirements on the application servers go up by an order of magnitude.

OTOH, relaxing tracking requirements would make it impossible to implement business rules like limitations on the number of concurrent streams on an account.

This is why we prefer a solution with secure proof of key release. This does require CDM storage for the 'last CDM state' - specifically the identities of the keys the CDM is currently using or for which secure proof of key release has not been acknowledged by the server and a reliable indication of the time.
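The "last CDM state" described above could be modeled as in the following sketch. All names are illustrative; no key system is known to use this exact shape, and the sketch only shows the bookkeeping, not the cryptographic proof itself.

```javascript
// Hypothetical model of the persistent state a secure-proof-of-key-release
// scheme would need: the keys in use, the keys released but not yet
// acknowledged by the server, and a reliable indication of the time.
function createReleaseState(nowMs) {
  return {
    activeKeyIds: new Set(),         // keys the CDM is currently using
    pendingReleaseKeyIds: new Set(), // released, awaiting server acknowledgment
    lastReliableTimeMs: nowMs,       // reliable indication of the time
  };
}

// When playback stops, a key moves from "active" to "pending release";
// the CDM would then send the server a secure proof of release.
function releaseKey(state, keyId) {
  state.activeKeyIds.delete(keyId);
  state.pendingReleaseKeyIds.add(keyId);
}

// When the server acknowledges the proof, the record can be dropped.
function acknowledgeRelease(state, keyId) {
  state.pendingReleaseKeyIds.delete(keyId);
}
```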
Comment 11 Mark Watson 2013-07-16 16:45:34 UTC
(In reply to comment #7)
> Compared to what I said above, David Dorwin had an even better idea
> according to the latest meeting minutes.
> (http://www.w3.org/2013/06/18-html-media-minutes.html#item03)
> 
> To expand:
> If you want data to stay around on the client side beyond having it in RAM,
> have the CDM pass an encrypted data packet to the JavaScript application,
> how the JavaScript application store the encrypted packet in IndexedDB and
> then next time have the JavaScript application take the encrypted packet
> from IndexedDB and push it back to the CDM. This way, the privacy controls
> that browsers need to provide for IndexedDB anyway would automatically apply
> to data that CDMs want to keep around.

This seems to place a bunch of complexity in the application and doesn't look significantly simpler for the UA than having the UA implement the storage and privacy handling of the encrypted blob internally.

I agree that our privacy considerations should properly address the question of clearing of CDM-stored data.
Comment 12 David Dorwin 2013-08-06 07:06:17 UTC
I think most use cases can be solved using application level storage. There may be a few use cases that really do rely on some type of "secure storage", but that should be a last resort. (These probably also require CDM-specific knowledge in the application.) Most use cases, particularly online streaming, should be accomplished without storage or key sharing (see bug 17202). This keeps the integration, privacy, security, etc. complexity lower for browsers and other UAs.

For implementations that do support those specific use cases, data should be stored in a per-origin, per-profile, and user-erasable (similar to cookies, etc.) way.

I think it would be better to solve as many use cases as possible in the application and potentially revisit the issue of storage later rather than burden all UAs/CDMs with storage requirements before we've explored alternatives.


(In reply to comment #2)
It seems most of the scenarios in comment 2 could be solved with application storage.

(In reply to comment #6)
> The way I have envisioned it working (and tried to move the spec in that
> direction) is pretty close to that model. The user would signup with a
> streaming service. After the signup process, the application would download
> a bundle of keys that would be passed as initData during createSession().
> This would cause any initial setup of the CDM and supplementary keys to be
> acquired. This will take some time, but this can happen while the user is
> browsing the options available so there is less impact on the user
> experience. Then when the user selects a video to play, there may be no
> additional key acquisition required. 

It's unclear what part of this use case requires storage in the CDM and couldn't just be done using application storage. This already sounds like a lot of application complexity, so I don't think using storage APIs is unreasonable. Note that initData is data from the container and cannot include keys.

(In reply to comment #8)
> (In reply to comment #7)
> > Compared to what I said above, David Dorwin had an even better idea
> > according to the latest meeting minutes.
> > (http://www.w3.org/2013/06/18-html-media-minutes.html#item03)
> > 
> > To expand:
> > If you want data to stay around on the client side beyond having it in RAM,
> > have the CDM pass an encrypted data packet to the JavaScript application,
> > how the JavaScript application store the encrypted packet in IndexedDB and
> > then next time have the JavaScript application take the encrypted packet
> > from IndexedDB and push it back to the CDM. This way, the privacy controls
> > that browsers need to provide for IndexedDB anyway would automatically apply
> > to data that CDMs want to keep around.
> 
> This is not quite the right approach. This will cause more CDM-specific code
> to need to be written.

Telling the CDM to store a key will require CDM-specific code somewhere (server or application). If applications are responsible for storing keys, the app behavior would be the same for all key systems (i.e. get message from server and store message in IndexedDB however the app wishes then later retrieve message from IndexedDB and provide to CDM in update() call). Without specifying key storage APIs, key storage and retrieval would be very CDM-specific.
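The application-side pattern described here (store the opaque key-system message, later replay it to the CDM via `update()`) can be sketched as follows. The storage backend is abstracted behind a Map-like interface; in a real app it would be an IndexedDB object store, and `saveLicense`/`loadLicense` and the key naming are illustrative, not from the spec.

```javascript
// Sketch: the app treats the key-system message as an opaque blob and
// stores it under an app-chosen key, however the app wishes.
// `store` is any Map-like backend (an IndexedDB object store in practice).
function saveLicense(store, contentId, messageBytes) {
  store.set("license:" + contentId, messageBytes);
}

// Later: retrieve the blob so it can be pushed back into the CDM session,
// e.g. session.update(blob) after createSession(). Returns null if absent.
function loadLicense(store, contentId) {
  return store.get("license:" + contentId) ?? null;
}

const backend = new Map(); // stand-in for an IndexedDB object store
saveLicense(backend, "movie-42", new Uint8Array([1, 2, 3]));
const blob = loadLicense(backend, "movie-42");
// blob would then be passed to MediaKeySession.update(blob)
```

Because the blob is opaque, this app code is identical for every key system; only the CDM interprets the contents.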

> I believe this imposes an unnecessary burden on the
> app developer for no additional security or privacy.

The alternative is burdening all clients with responsibility for providing storage as well as solving the related security and privacy issues. The clients would also be much less flexible and may not satisfy all applications. It is a lot easier to write, ship, and modify web app code than UAs or CDMs.

> Consider the CDM to
> behave like a local server, and this data to be cookie data. When a web app
> contacts a server, it does not have to explicitly store each cookie that
> comes from that server. It can inspect them and it can forbid them to be
> stored, but only if it wants or needs to. This is the kind of API I would
> like to se for storage. 

The purpose of the CDM is to decrypt content, not provide a local server. There are many other web APIs available for providing a "local server", and EME should not duplicate or reinvent them (including all of the corresponding privacy and security issues). UAs will have a much better chance of providing the appropriate controls and expected behaviors if storage is handled by the application using existing APIs.

(In reply to comment #11)
> (In reply to comment #7)
> This seems to place a bunch on complexity in the application and doesn't
> look significantly simpler for the UA than having the UA implement the
> storage and privacy handling of the of the encrypted blob internally.

Applications looking to optimize massive traffic will be complex. Web applications can be easily updated/tuned, and there are already many APIs, norms, user expectations, security & privacy settings/tools, etc. around web applications. On the other hand, browsers and CDMs are updated less often, and we'd have to invent new APIs and storage mechanisms, etc. Users understand clearing cookies, but it would be hard to explain clearing saved DRM keys. One of the goals of EME is to put the application in control, even if that means more logic is in the application. (For example, the application is responsible for authorization and communicating with the server.) I don't see why the philosophy should be any different for storage.

(In reply to comment #10)
> (In reply to comment #3)
> > Seems like the client side could be simpler by handling this by the keys
> > expiring often and the connection to the license server being chatty with
> > re-requesting keys all the time as a form of heartbeat. If the server side
> > doesn't like the complexity, maybe the server side should relax tracking
> > requirements. 
> > 
> 
> We have found significant advantage for long-form content if a connection to
> the application servers is not required for streaming to continue. That is,
> once the key and content URLs have been acquired, streaming can continue
> even if the application server connection is lost. If a heartbeat is
> *required* to continue streaming, then the availability requirements on the
> application servers go up by an order of magnitude.
> 
> OTOH, relaxing tracking requirements would make it impossible to implement
> business rules like limitations on the number of concurrent streams on an
> account.
> 
> This is why we prefer a solution with secure proof of key release. This does
> require CDM storage for the 'last CDM state' - specifically the identities
> of the keys the CDM is currently using or for which secure proof of key
> release has not been acknowledged by the server and a reliable indication of
> the time.

Likewise, this pushes app-/site-specific logic into the clients, requires CDMs to have access to storage even for basic online streaming, increases the complexity of the (closed-source, third-party, etc.) CDMs, and requires expanding the UA-CDM API. There are known solutions for increasing availability, and we should not try to push workarounds to those into every client.
Comment 13 Joe Steele 2013-08-06 15:17:05 UTC
(In reply to comment #12)
> I think most use cases can be solved using application level storage. There
> may be a few use cases that really do rely on some type of "secure storage",
> but that should be a last resort. (These probably also require CDM-specific
> knowledge in the application.) Most use cases, particularly online
> streaming, should be accomplished without storage or key sharing (see bug
> 17202). This keeps the integration, privacy, security, etc. complexity lower
> for browsers and other UAs.

[steele] The assumption that these use cases can be solved for a given DRM system without storage is simply not true. Some DRM systems rely on downloaded persistent keys for *ANY* operation and not having that ability severely restricts their utility relative to their implementation on other platforms. 

> 
> For implementations that do support those specific use cases, data should be
> stored in a per-origin, per-profile, and user-erasable (similar to cookies,
> etc.) way.
> 
> I think it would be better to solve as many use cases as possible in the
> application and potentially revisit the issue of storage later rather than
> burden all UAs/CDMs with storage requirements before we've explored
> alternatives.

[steele] It does not make sense to optimize for browser developers time (where there are only a handful) above web application developers time (where there are many, many more). We have explored alternatives and they are all less performant and require existing DRM systems to restructure to a much greater degree for no additional benefit.


> 
> (In reply to comment #2)
> It seems most of the scenarios in comment 2 could be solved with application
> storage.

[steele] Agreed
> 
> (In reply to comment #6)
> > The way I have envisioned it working (and tried to move the spec in that
> > direction) is pretty close to that model. The user would signup with a
> > streaming service. After the signup process, the application would download
> > a bundle of keys that would be passed as initData during createSession().
> > This would cause any initial setup of the CDM and supplementary keys to be
> > acquired. This will take some time, but this can happen while the user is
> > browsing the options available so there is less impact on the user
> > experience. Then when the user selects a video to play, there may be no
> > additional key acquisition required. 
> 
> It's unclear what part of this use case requires storage in the CDM and
> couldn't just be done using application storage. This already sounds like a
> lot of application complexity, so I don't think using storage APIs is
> unreasonable. Note that initData is data from the container and cannot
> include keys.

[steele] I am not arguing that this problem cannot be solved with application storage. I would accept that as a fallback position, given that it provides the necessary storage capability.

Given that this is a requirement for at least two DRM systems that have been mentioned (Adobe Access and Marlin V3), it should be addressed by the spec.

As a side note -- there is nothing in the CENC spec that says the initialization data cannot contain keys and in the case of some DRMs it does.

> 
> (In reply to comment #8)
> > (In reply to comment #7)
> > > Compared to what I said above, David Dorwin had an even better idea
> > > according to the latest meeting minutes.
> > > (http://www.w3.org/2013/06/18-html-media-minutes.html#item03)
> > > 
> > > To expand:
> > > If you want data to stay around on the client side beyond having it in RAM,
> > > have the CDM pass an encrypted data packet to the JavaScript application,
> > > how the JavaScript application store the encrypted packet in IndexedDB and
> > > then next time have the JavaScript application take the encrypted packet
> > > from IndexedDB and push it back to the CDM. This way, the privacy controls
> > > that browsers need to provide for IndexedDB anyway would automatically apply
> > > to data that CDMs want to keep around.
> > 
> > This is not quite the right approach. This will cause more CDM-specific code
> > to need to be written.
> 
> Telling the CDM to store a key will require CDM-specific code somewhere
> (server or application). If applications are responsible for storing keys,
> the app behavior would be the same for all key systems (i.e. get message
> from server and store message in IndexedDB however the app wishes then later
> retrieve message from IndexedDB and provide to CDM in update() call).
> Without specifying key storage APIs, key storage and retrieval would be very
> CDM-specific.

[steele] The application does not need to tell the CDM to retain any keys. This can be handled by the key responses themselves, which can contain a lifetime for the key. This places the burden on the key server and the CDM, which is where the key management should be occurring.

> 
> > I believe this imposes an unnecessary burden on the
> > app developer for no additional security or privacy.
> 
> The alternative is burdening all clients with responsibility for providing
> storage as well as solving the related security and privacy issues. The
> clients would also be much less flexible and may not satisfy all
> applications. It is a lot easier to write, ship, and modify web app code
> than UAs or CDMs.

[steele] It is our job to solve the security and privacy issues. I believe a simple statement that it is the UAs responsibility to provide this facility for CDMs, without specifying how, will give the flexibility needed. CDM code in general will be much more difficult to modify than UA code or web app code, and that is what should be driving the API discussion. 

> 
> > Consider the CDM to
> > behave like a local server, and this data to be cookie data. When a web app
> > contacts a server, it does not have to explicitly store each cookie that
> > comes from that server. It can inspect them and it can forbid them to be
> > stored, but only if it wants or needs to. This is the kind of API I would
> > like to se for storage. 
> 
> The purpose of the CDM is to decrypt content, not provide a local server.
> There are many other web APIs available for providing a "local server", and
> EME should not duplicate or reinvent them (including all of the
> corresponding privacy and security issues). UAs will have a much better
> chance of providing the appropriate controls and expected behaviors if
> storage is handled by the application using existing APIs.

[steele] This was an attempt to provide a mental model. Nothing more.

> 
> (In reply to comment #11)
> > (In reply to comment #7)
> > This seems to place a bunch on complexity in the application and doesn't
> > look significantly simpler for the UA than having the UA implement the
> > storage and privacy handling of the of the encrypted blob internally.
> 
> Applications looking to optimize massive traffic will be complex. Web
> applications can be easily updated/tuned, and there are already many APIs,
> norms, user expectations, security & privacy settings/tools, etc. around web
> applications. On the other hand, browsers and CDMs are updated less often,
> we'd have to invent new APIs and storage mechanism, etc. Users understand
> clearing cookies, but it would be hard to explain clearing saved DRM keys.
> One of the goals of EME is to put the application in control, even if that
> means more logic is in the application. (For example, the application is
> responsible for authorization and communicating with the server.) I don't
> see why the philosophy should be any different for storage.

[steele] I am not sure what you are arguing for or against here. 

> 
> (In reply to comment #10)
> > (In reply to comment #3)
> > > Seems like the client side could be simpler by handling this by the keys
> > > expiring often and the connection to the license server being chatty with
> > > re-requesting keys all the time as a form of heartbeat. If the server side
> > > doesn't like the complexity, maybe the server side should relax tracking
> > > requirements. 
> > > 
> > 
> > We have found significant advantage for long-form content if a connection to
> > the application servers is not required for streaming to continue. That is,
> > once the key and content URLs have been acquired, streaming can continue
> > even if the application server connection is lost. If a heartbeat is
> > *required* to continue streaming, then the availability requirements on the
> > application servers go up by an order of magnitude.
> > 
> > OTOH, relaxing tracking requirements would make it impossible to implement
> > business rules like limitations on the number of concurrent streams on an
> > account.
> > 
> > This is why we prefer a solution with secure proof of key release. This does
> > require CDM storage for the 'last CDM state' - specifically the identities
> > of the keys the CDM is currently using or for which secure proof of key
> > release has not been acknowledged by the server and a reliable indication of
> > the time.
> 
> Likewise, this pushes app-/site-specific logic into the clients, requires
> CDMs to have access to storage even for basic online streaming, increases
> the complexity of the (closed-source, third-party, etc.) CDMs, and requires
> expanding the UA-CDM API. There are known solutions for increasing
> availability, and we should not try to push workarounds to those into every
> client.

[steele] I was under the impression that the goal of this spec was to provide support for existing DRM systems to operate under a common API. I am telling you there are existing DRMs which will not function correctly or efficiently under the API as spec'd. Yes -- there *may* be workarounds, but those will increase the complexity of the CDMs, not decrease it, and decrease their utility when used through this API. But why should we fall back to workarounds when the API is still under discussion? This is a well understood, common use case which could be implemented with minimal changes. Preventing this use case is only a roadblock to CDM vendors which do not own a browser (Microsoft, Google, Apple) and will discourage implementation by other CDMs.
Comment 14 Joe Steele 2013-08-06 15:20:03 UTC
This should have read: 
Preventing this use case is only a roadblock to CDM vendors which do not own a browser (NOT Microsoft, Google, Apple) and will discourage implementation by other CDMs.
Comment 15 Henri Sivonen 2013-08-13 10:08:38 UTC
There are already two EME-integrated DRM schemes out there: Widevine and PlayReady. Do those implementations use persistent storage? If they do, what for?
Comment 16 Joe Steele 2013-10-28 18:43:45 UTC
I can't speak to what the other DRMs do and how persistent it is. I can speak to what Access does. Would the other vendors in the TF mind responding to this bug?
Comment 17 Joe Steele 2013-11-14 05:38:33 UTC
Reviewing the latest changes -- I believe this has been adequately addressed by the latest section 8.2. That section states "Key Systems may store information on a user's device, or user agents may store information on behalf of Key Systems."

That is explicit enough for me. I don't think any further changes are needed.
Comment 18 David Dorwin 2013-11-14 08:49:36 UTC
Resolving per previous comment.