This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 27093 - Support for proprietary/system-specific formats in initData should be discouraged/deprecated
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: David Dorwin
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard: Interoperability, Security, TAG
Keywords:
Depends on:
Blocks: 20944 26838 27053 27054
Reported: 2014-10-17 18:09 UTC by David Dorwin
Modified: 2015-04-21 21:59 UTC


Description David Dorwin 2014-10-17 18:09:34 UTC
(This bug focuses on CENC PSSH boxes because that is the only currently defined initialization data type that supports proprietary or key system-specific formats, but the arguments would apply to any such initialization data type.)

Proprietary PSSH Data formats have been and continue to be a source of interoperability and security problems. Their (publicly) undefined and inconsistent feature sets (because the ISO CENC spec allows them to contain anything) make discussing and defining interoperable normative behavior difficult, if not impossible. As was noted in bug 17673 and bug 20944, they are an avenue for vendor lock-in and a threat to the interoperability goals of this spec. Content packaged with only proprietary PSSH boxes has the effect of segmenting the platform (bug 27053). Also, requiring a PSSH box per key system does not scale, is a hassle for content providers and packagers, and locks out new key systems.

There are also technical issues with proprietary PSSH boxes:
* Some formats are encrypted, which likely prevents the user agent from validating the input before passing it to the CDM (bug 26838).
* User agents, especially open source ones, may not be able to validate or sanitize unpublished formats (bug 26838).
* Proprietary PSSH boxes could contain code (bug 22901).
* Validating proprietary PSSH boxes in the user agent requires additional code (and libraries, such as XML support). This is especially relevant should a user agent implementation wish to support multiple key systems.
* Proprietary formats can be used to abuse the initData parameter (see https://www.w3.org/Bugs/Public/show_bug.cgi?id=24082#c10).

In the interest of interoperability, openness, a consistent platform, and security, support for proprietary/key system-specific formats should be discouraged and deprecated.

While implementations will likely support proprietary PSSH formats until content (and devices) are updated to use standard format(s), the spec and standardization process should not be burdened by proprietary formats and use cases. (Even existing content need not be a limitation since PSSH boxes can be obtained from other sources.) I expect that CDMs will continue to support their proprietary format along with the standard one(s) and that content packagers will include some proprietary formats along with the standard ones.

We currently have a simple extensible format defined [1] that should support most use cases. If we are missing support for some use cases, we should discuss them and potentially add support for them both to the main spec and the CENC registry entry. We can also consider adding common formats standardized elsewhere to this section.
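
To illustrate why a published format is easier for a user agent to validate than an opaque one, here is a minimal sketch of parsing a single PSSH box and checking whether it uses the common format. It assumes the ISO CENC box layout (32-bit size, 'pssh' type, version, flags, SystemID, then KIDs for version 1, then DataSize and Data) and my reading of the registry entry [1], namely that a common-format box is version 1 with the common SystemID and a DataSize of 0. The names parse_pssh, is_common_format and build_pssh are hypothetical helpers, not part of any spec or library, and 64-bit box sizes are not handled.

```python
import struct
import uuid

# SystemID of the common PSSH box format in the EME "cenc" registry entry.
COMMON_SYSTEM_ID = uuid.UUID("1077efec-c0b2-4d02-ace3-3c1e52e2fb4b")


def parse_pssh(box: bytes) -> dict:
    """Parse exactly one 'pssh' box; raise ValueError if it is malformed."""
    if len(box) < 32:
        raise ValueError("too short for a pssh box")
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"pssh" or size != len(box):
        raise ValueError("not a single, complete pssh box")
    version = box[8]  # bytes 9..11 are the flags, ignored here
    system_id = uuid.UUID(bytes=box[12:28])
    offset = 28
    key_ids = []
    if version > 0:
        (kid_count,) = struct.unpack_from(">I", box, offset)
        offset += 4
        # The KID list plus the 4-byte DataSize must fit inside the box.
        if offset + 16 * kid_count + 4 > len(box):
            raise ValueError("KID list overruns box")
        for _ in range(kid_count):
            key_ids.append(box[offset:offset + 16])
            offset += 16
    (data_size,) = struct.unpack_from(">I", box, offset)
    offset += 4
    if offset + data_size != len(box):
        raise ValueError("DataSize does not match box size")
    return {
        "version": version,
        "system_id": system_id,
        "key_ids": key_ids,
        "data": box[offset:],
    }


def is_common_format(parsed: dict) -> bool:
    """Assumed rule: common SystemID, version 1, and no opaque Data."""
    return (
        parsed["system_id"] == COMMON_SYSTEM_ID
        and parsed["version"] == 1
        and parsed["data"] == b""
    )


def build_pssh(system_id, key_ids, data=b"", version=1) -> bytes:
    """Hypothetical helper that serializes a box, used only to try the parser."""
    body = bytes([version]) + b"\x00\x00\x00" + system_id.bytes
    if version > 0:
        body += struct.pack(">I", len(key_ids)) + b"".join(key_ids)
    body += struct.pack(">I", len(data)) + data
    return struct.pack(">I4s", len(body) + 8, b"pssh") + body
```

With a common-format box every byte is accounted for by the documented fields, so the check can be exhaustive. With a proprietary box the parser can only confirm the framing; the contents of Data remain opaque to the user agent.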

Advantages:
* Encourage use of and support for standardized formats as more content is generated and more content providers switch to HTML5.
* Eventual broad support for standard formats will:
 - Enable user agents to thoroughly validate the initData before sending it to the CDM.
 - Enable content providers, applications, and packagers to support multiple (and new) key systems without adding a PSSH box for each (new) key system. See also bug 27053.
* For content providers (and packagers), especially small ones, it will be simpler to support multiple clients.
* For the spec:
 - Deprecating proprietary formats allows us to focus on documented formats and models and make more of the initData validation text normative.
 - Use cases not supported by the existing standard format can be surfaced so they can be discussed and solved in an interoperable way if necessary.

The main disadvantage is that some implementations supporting proprietary PSSH boxes may not be fully spec compliant. For example, more normative validation may not be possible. I think the advantages definitely outweigh this disadvantage. Such PSSH boxes cannot be tested for compliance anyway.

[1] https://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/cenc-format.html#common-system
Comment 1 David Dorwin 2014-10-17 19:20:45 UTC
https://dvcs.w3.org/hg/html-media/rev/c61aba661aa6 implements this recommendation and the requirement to support the common format while allowing proprietary PSSH formats for existing content and devices.
Comment 2 Joe Steele 2014-10-24 03:58:07 UTC
This bug was opened and closed much too fast to get any feedback from the group. Given that this is a contentious subject, I don't feel that is appropriate.

I disagree with this text, and it is not clear to me: "Initialization Data should not contain Key System-specific data or values. Implementations must support the common formats defined [EME-REGISTRY] for each Initialization Data Type they support."

I realize the EME-REGISTRY has similar text, and I object to that text as well. I was not aware of the normative restriction in there until I read this bug. 

In both cases I think a SHOULD is more appropriate.
Comment 3 Joe Steele 2014-10-29 20:21:03 UTC
This text in the same section is also objectionable:

It must only contain information related to the keys required to play a given set of stream(s) or media data. It must not contain application data, client-specific data, user-specific data, key(s), or executable code.

I propose this text instead:

It must only contain information related to the keys required to play a given set of stream(s) or media data. It must not contain application data, client-specific data, user-specific data, or executable code.

Keys which are not application, user or client specific in no way compromise user privacy OR UA security. I do not see any justification for excluding them and they are used by some Key Systems.
Comment 4 Mark Watson 2014-10-29 21:06:09 UTC
I agree with Joe's comments. I don't see any reason why initData shouldn't contain keys.
Comment 5 David Dorwin 2014-10-30 18:25:04 UTC
(In reply to Joe Steele from comment #3)
> Keys which are not application, user or client specific in no way compromise
> user privacy OR UA security. I do not see any justification for excluding
> them and they are used by some Key Systems.

Key(s) are not currently allowed because a) there is no normative text that describes how to handle them and b) there is currently no interoperable way to include or use them in initialization data.

If you would like EME to support such a feature, please file a bug, preferably with proposed solutions/text to those issues.
Comment 6 Joe Steele 2014-10-30 18:47:32 UTC
(In reply to David Dorwin from comment #5)
> (In reply to Joe Steele from comment #3)
> > Keys which are not application, user or client specific in no way compromise
> > user privacy OR UA security. I do not see any justification for excluding
> > them and they are used by some Key Systems.
> 
> Key(s) are not currently allowed because a) there is no normative text that
> describes how to handle them and b) there is currently no interoperable way
> to include or use them in initialization data.
> 
> If you would like EME to support such a feature, please file a bug,
> preferably with proposed solutions/text to those issues.

This is correct. However, you did not restrict the text to the Common System ID format. For other PSSH formats, keys are allowed and present.

One approach to fixing this bug would be to explicitly limit the normative text to the Common System ID PSSH format. However, with the exception of keys, I think the restriction in the text is appropriate even for the more general PSSH format.

I think we would have stronger security/privacy protections by including the modified version of this text I proposed, rather than restricting the text to the Common format only.
Comment 7 Joe Steele 2014-10-30 18:50:12 UTC
One correction I forgot to add. The EME spec does not prohibit KeySystem-specific PSSH boxes as far as I am aware. It merely discourages use of them.
Comment 8 David Dorwin 2014-10-30 18:56:49 UTC
(In reply to Joe Steele from comment #7)
> One correction I forgot to add. The EME spec does not prohibit
> KeySystem-specific PSSH boxes as far as I am aware. It merely discourages
> use of them.

Correct. However, if embedding keys in the content is something you or others want to do and the only way to do it is using proprietary PSSH boxes and proprietary behavior, such content will never be available across clients and content providers will continue to use the proprietary PSSH boxes.
Comment 9 Mark Watson 2014-10-30 18:58:57 UTC
(In reply to David Dorwin from comment #8)
> (In reply to Joe Steele from comment #7)
> > One correction I forgot to add. The EME spec does not prohibit
> > KeySystem-specific PSSH boxes as far as I am aware. It merely discourages
> > use of them.
> 
> Correct. However, if embedding keys in the content is something you or
> others want to do and the only way to do it is using proprietary PSSH boxes
> and proprietary behavior, such content will never be available across
> clients and content providers will continue to use the proprietary PSSH
> boxes.

Not necessarily. One keysystem may simply be designed such that its PSSH contains an encrypted content key (and the license contains the decryption key for *that*) whereas another may put the content key itself in the license.

The externally visible behaviour is identical - there is no interoperability issue - it's just that the different keysystems made different choices about information distribution between initData and license.
Comment 10 David Dorwin 2014-10-30 23:23:23 UTC
Joe and I discussed this offline and he provided three use cases, including the one Mark describes in comment #9. (Maybe he'd like to describe them somewhere.) That one seems fine; supporting it and "traditional" initData might result in different server paths (i.e. one server uses a key server and the other does not), but that's probably out of scope. The other two seemed like they would have app-visible behavior and would require work to support.

Any suggestions on how to allow this use case without allowing the others unless/until they are spec'd?
Comment 11 Joe Steele 2015-01-12 21:25:00 UTC
I believe these are the use cases that David and I discussed.

Case #1 -- the content key is in the PSSH and is encrypted for a license key that the client can acquire. The PSSH may or may not also contain some identifier to let the key server know which key needs to be issued. I believe this does require app-visible consequences.

Case #2 -- the content key is in the PSSH and is encrypted for a key that is baked into the client (i.e. the client does not need to go to a license server to retrieve it). This could be done as an optimization to avoid the key acquisition cost for content whose policy allows this. The only required app-visible consequence of this is that a key request will not happen in this case. The key will simply be available.

Case #3 -- the content key is in the PSSH and is encrypted with a key known to the license server. This allows the license server to be independent of the key management system used during packaging. The only potential app-visible consequence of this is that if the key server the application is using does not have the appropriate key, the embedded content key cannot be used.

I can see two ways forward that would not negatively impact interop.

1. We could add support for these models as optional additional components of the Common PSSH. Some clients could consume the optional components and those that could not would still have the required base data i.e. the key id. 

2. We could continue to support streams with multiple PSSH boxes, one in the Common PSSH format and others in proprietary formats. Publishers which choose to use multiple PSSH boxes can leverage any optimization provided by the proprietary formats, and those who do not will still have the Common PSSH to fall back on. 
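
The fallback in option 2 can be sketched roughly as follows. This assumes 'cenc' initData is a concatenation of PSSH boxes with 32-bit sizes, and it selects by SystemID only. The names split_pssh_boxes, select_pssh and make_box are hypothetical, and a real client would likely also validate each box before using it.

```python
import struct
import uuid

# SystemID of the common PSSH box format in the EME "cenc" registry entry.
COMMON_SYSTEM_ID = uuid.UUID("1077efec-c0b2-4d02-ace3-3c1e52e2fb4b")


def split_pssh_boxes(init_data: bytes) -> list:
    """Split concatenated pssh boxes into (SystemID, raw box) pairs."""
    boxes = []
    offset = 0
    while offset < len(init_data):
        if offset + 32 > len(init_data):
            raise ValueError("truncated pssh box")
        size, box_type = struct.unpack_from(">I4s", init_data, offset)
        if box_type != b"pssh" or size < 32 or offset + size > len(init_data):
            raise ValueError("malformed pssh box")
        system_id = uuid.UUID(bytes=init_data[offset + 12:offset + 28])
        boxes.append((system_id, init_data[offset:offset + size]))
        offset += size
    return boxes


def select_pssh(init_data: bytes, preferred_system_id: uuid.UUID):
    """Prefer the key system's own box; otherwise fall back to the common box.

    Returns None if neither is present.
    """
    fallback = None
    for system_id, box in split_pssh_boxes(init_data):
        if system_id == preferred_system_id:
            return box
        if system_id == COMMON_SYSTEM_ID and fallback is None:
            fallback = box
    return fallback


def make_box(system_id: uuid.UUID, payload: bytes = b"") -> bytes:
    """Hypothetical helper: a version 1 box with no KIDs, for trying the code."""
    body = (
        b"\x01\x00\x00\x00"
        + system_id.bytes
        + struct.pack(">I", 0)             # KID_count
        + struct.pack(">I", len(payload))  # DataSize
        + payload
    )
    return struct.pack(">I4s", len(body) + 8, b"pssh") + body
```

A client whose key system has no box in the stream still gets the Common PSSH, which is the interop property this option relies on.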

In either case I think we should make the text changes I proposed in comment 2 and comment 3.
Comment 12 David Dorwin 2015-04-07 15:14:16 UTC
Support for the three use cases above is tracked in https://github.com/w3c/encrypted-media/issues/41.
Comment 13 David Dorwin 2015-04-21 21:59:04 UTC
Re-closing as agreed at the f2f [1]. Discussion of enabling specific scenarios will continue in GitHub issues [1].

[1] http://www.w3.org/2015/04/16-html-media-minutes.html#item06