Bug 20960 - EME is not limited to video.
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
Blocks: 20963
Reported: 2013-02-12 02:19 UTC by Fred Andrews
Modified: 2013-02-26 23:37 UTC (History)
Description Fred Andrews 2013-02-12 02:19:56 UTC
The EME CDM is not limited to just video and could well implement an entire HTML engine, defeating the good work of many to allow users to customize the presentation of HTML.  I suggest there is no way to achieve such a restriction within the space of solutions acceptable to the proponents.
Comment 1 Adrian Bateman [MSFT] 2013-02-12 13:35:48 UTC
The only way to consume MediaKeys is with the setMediaKeys method on an HTMLMediaElement. Therefore it is limited to media elements (including video and audio).
Comment 2 Fred Andrews 2013-02-12 20:27:45 UTC
The consumption of the media keys is not defined, so please leave this bug open until the specification is complete and this can be reviewed.  In particular, nothing stops the CDM from implementing an HTML engine, and you have failed to address this.
Comment 3 Mark Watson 2013-02-19 16:28:47 UTC
The architecture is such that the CDM is able only to output decoded media data to rendering/compositing functions. It does not have access to user input or control of the rendering/compositing of its output.

Therefore it would not be possible to implement an entire HTML engine within a CDM, according to the specification.
Comment 4 Joe Steele 2013-02-19 21:07:05 UTC
(In reply to comment #2)
> The consumption of the media keys is not defined so please leave this bug
> open until the specification is complete and this can be reviewed.  In
> particular nothing stops the CDM implementing a HTML engine and you have
> failed to address this.

Can you be more specific as to why you think this concern is unique to EME? Do you have a specific scenario in mind?
Comment 5 Fred Andrews 2013-02-19 21:16:37 UTC
(In reply to comment #3)
> The architecture is such that the CDM is able only to output decoded media
> data to rendering/compositing functions. It does not have access to user
> input or control of the rendering/compositing of its output.

The architecture of the CDM is not defined.  It does appear to have privileged access to the system, the path to the monitor pixels, and could well have access to all system resources including storage.  Please define the scope of the CDM privileges so that your claims can be assessed.
 
> Therefore it would not be possible to implement an entire HTML engine within
> a CDM, according to the specification.

This is not believable. The scope of the CDM is not defined, but at a guess it appears quite practical.
Comment 6 Mark Watson 2013-02-19 21:47:42 UTC
(In reply to comment #5)
> (In reply to comment #3)
> > The architecture is such that the CDM is able only to output decoded media
> > data to rendering/compositing functions. It does not have access to user
> > input or control of the rendering/compositing of its output.
> 
> The architecture of the CDM is not defined.  It does appear to have
> privileged access to the system, the path to the monitor pixels, and could
> well have access to all system resources including storage.  Please defined
> the scope of the CDM privileges to that your claims can be assessed.

What I wrote is the intention of the architecture. We can clarify this in the specification, if it helps. But what access the CDM has depends entirely on the UA implementation (for CDMs embedded in UAs) or the Operating System/platform (for CDMs embedded there).

>  
> > Therefore it would not be possible to implement an entire HTML engine within
> > a CDM, according to the specification.
> 
> This is not believable, the scope of the CDM is not defined, but taking a
> guess it appears quite practical.

I don't understand what you expect us to say in the specification to address this. Obviously, UA implementors can do what they like: for example, allowing CDMs to operate totally outside any kind of security sandbox. I don't expect any of them to do that, and it's not necessary to meet the goals, so presumably they will put the CDM in an appropriate sandbox or otherwise constrain what it can do.

It's a mistake to think that this is a generic plugin architecture for arbitrary code. I expect UAs to be very careful about which CDMs they support and what they do.
Comment 7 Fred Andrews 2013-02-20 15:29:12 UTC
(In reply to comment #4)
> (In reply to comment #2)
> > The consumption of the media keys is not defined so please leave this bug
> > open until the specification is complete and this can be reviewed.  In
> > particular nothing stops the CDM implementing a HTML engine and you have
> > failed to address this.
> 
> Can you be more specific as to why you think this concern is unique to EME?
> Do you have a specific scenario in mind?

This is unique to the EME because of the back channel that EME provides from the CDM to the server.  Neither an image decoder nor a video decoder requires a back channel.

If you want to ignore the CDM in the consideration of the EME and improve safety and privacy, then simply remove this back channel.

I expect the proponents would retort that EME would no longer support DRM without the back channel, but I would suggest this just shows that EME is designed for DRM and that the technical requirements should be specified so that we can all objectively review them rather than them being inherent in the design of the EME.

> Do you have a specific scenario in mind?

DRM would be attractive to a wider range of content authors than just video authors, and if a CDM can support DRM then there would be demand for more general HTML support within the CDM. I suggest it's inevitable that a CDM would be written that supports a relatively comprehensive HTML engine.  Even if the CDM is not given network access by the OS, it can still receive content via the EME channel.
Comment 8 Joe Steele 2013-02-20 17:43:45 UTC
> This is unique to the EME because of the back channel that EME provides from
> the CDM to the server.  An image decoder or a video decoder do not require a
> back channel.

Thank you. That makes the threat you are worried about more clear. 

Let me restate to make sure I understand:
* There is a bi-directional channel to the CDM from the web application
* This allows the web application to funnel all data and events the web application sees to the CDM. 
* Because the CDM can implement any code it wants, it could implement an HTML engine.
* The combination of these two things allows a CDM to render content from the web application in an alternate HTML engine and process events for it

If we replace "CDM" with "video codec" in the above argument, I believe we have the same situation. Video elements have a bi-directional channel as well in the HTMLMediaElement text track support (textTracks and addTextTrack).

> DRM would be attractive to a wider range of content authors than just video 
> authors, 

I agree with this.

> and if a CDM can support DRM then there would be demand for more 
> general HTML support within the CDM - I suggest it's inevitable that a CDM 
> would be written that supports a relative comprehensive HTML engine.  

I don't agree with this. 

The described threat would require the UA to include a CDM with this behavior. There is no requirement that any UA include any specific CDM other than ClearKey (which does not have this behavior). A much shorter path to this scenario is that the UA provides a direct non-standard method to turn on "DRM" for the web page and does not include an entire alternate HTML engine. Both scenarios require collusion by the UA implementer and both rely on behavior outside of the specification.

Having said this -- it sounds like it would satisfy your concern to have some spec text that says something along the lines of "The CDM must not implement a user agent." Do you have some alternate text to suggest?
Comment 9 Mark Watson 2013-02-20 18:03:05 UTC
(In reply to comment #8)
> 
> > and if a CDM can support DRM then there would be demand for more 
> > general HTML support within the CDM - I suggest it's inevitable that a CDM 
> > would be written that supports a relative comprehensive HTML engine.  
> 
> I don't agree with this. 
> 
> The described threat would require the UA to include a CDM with this
> behavior. There is no requirement that any UA include any specific CDM other
> than ClearKey (which does not have this behavior). A much shorter path to
> this scenario is that the UA provides a direct non-standard method to turn
> on "DRM" for the web page and does not include an entire alternate HTML
> engine. Both scenarios require collusion by the UA implementer and both rely
> on behavior outside of the specification.

An even simpler way would be for the UA to implement a generic plugin API allowing the user to install (deliberately, or as the result of social engineering) arbitrary code that integrates into the web page. Such a plugin could implement an entirely independent HTML-equivalent presentation engine, complete with DRM protection, if it so chose.

Since such a plugin API already exists in all browsers, it would be an improvement if the use-cases that cause browsers to maintain support for it could be supported in a way where the DRM code and what it can do is more under the control of the UA implementor. This is one of our goals.
Comment 10 Fred Andrews 2013-02-20 23:33:12 UTC
(In reply to comment #8)
> > This is unique to the EME because of the back channel that EME provides from
> > the CDM to the server.  An image decoder or a video decoder do not require a
> > back channel.
> 
> Thank you. That makes the threat you are worried about more clear. 
> 
> Let me restate to make sure I understand:
> * There is a bi-directional channel to the CDM from the web application
> * This allows the web application to funnel all data and events the web
> application sees to the CDM. 
> * Because the CDM can implement any code it wants, it could implement an
> HTML engine.
> * The combination of these two things allows a CDM to render content from
> the web application in an alternate HTML engine and process events for it

It is also important to note that:

* The web application can tunnel the secret communication between the CDM and the server.  The combination of the EME spec plus a simple web app tunnel, as shown in the EME examples, creates a secret bi-directional channel from the CDM to the server.

* Because the CDM can implement video DRM it is likely that it can implement DRM of general HTML content.
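
The tunnel described in the first point can be sketched as a small model. This is only an illustration under stated assumptions: `CdmLike`, `ServerLike`, `tunnel`, and the toy XOR encoding are hypothetical names invented here, not part of the EME specification or of any browser API, and the XOR step merely stands in for whatever protection a real CDM and server would share.

```typescript
// Hypothetical model of the message tunnel. None of these names come from
// the EME specification; they exist only to illustrate the data flow.

/** Opaque bytes: the application forwards them without interpreting them. */
type OpaqueMessage = number[];

/** Stand-in for a CDM: emits a request and later consumes the response. */
interface CdmLike {
  generateRequest(): OpaqueMessage;
  update(response: OpaqueMessage): void;
}

/** Stand-in for a server that the web application can reach. */
interface ServerLike {
  handle(request: OpaqueMessage): OpaqueMessage;
}

/** The web application's role: relay in both directions, seeing only sizes. */
function tunnel(cdm: CdmLike, server: ServerLike, log: string[]): void {
  const request = cdm.generateRequest();
  log.push("app relayed " + request.length + " bytes to server");
  const response = server.handle(request);
  log.push("app relayed " + response.length + " bytes to CDM");
  cdm.update(response);
}

// Toy endpoints sharing a secret. XOR is NOT real encryption; it is an
// assumption used only so the relayed payload is not readable text.
const SHARED_KEY = 0x5a;

function encode(text: string): OpaqueMessage {
  return text.split("").map((c) => c.charCodeAt(0) ^ SHARED_KEY);
}

function decode(bytes: OpaqueMessage): string {
  return bytes.map((b) => String.fromCharCode(b ^ SHARED_KEY)).join("");
}

class ToyCdm implements CdmLike {
  received = "";
  generateRequest(): OpaqueMessage {
    return encode("request:page-42");
  }
  update(response: OpaqueMessage): void {
    this.received = decode(response);
  }
}

const toyServer: ServerLike = {
  // The server understands the request and answers in the same encoding.
  handle: (request) => encode("content for " + decode(request)),
};

const log: string[] = [];
const cdm = new ToyCdm();
tunnel(cdm, toyServer, log);
```

In this model the application's log records only message sizes, while the two endpoints exchange arbitrary content. Whether a real CDM could use such a channel this way depends on what the UA permits, which is the point under dispute in this bug.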
 
> If we replace "CDM" with "video codec" in the above argument, I believe we
> have the same situation. Video elements have a bi-directional channel as
> well in the HTMLMediaElement text track support (textTracks and
> addTextTrack).

No, not at all. A video codec does not have a back channel, or a secret back channel, or the privilege to implement DRM.

This does appear to be a technical matter that we could explore further and likely come to some technical agreement on. Technical tests would be:

* demonstrate whether a video codec without a back channel can implement strong DRM.

* demonstrate that a CDM can implement a general DRM-enabled web engine.

> > DRM would be attractive to a wider range of content authors than just video 
> > authors, 
> 
> I agree with this.
> 
> > and if a CDM can support DRM then there would be demand for more 
> > general HTML support within the CDM - I suggest it's inevitable that a CDM 
> > would be written that supports a relative comprehensive HTML engine.  
> 
> I don't agree with this. 

You appear to agree that it would be attractive to content authors, and it appears to be technically possible, so could you elaborate on why you do not believe this would be done?

> The described threat would require the UA to include a CDM with this
> behavior.

Agreed, but this is rather obvious.  Perhaps this leads into CDM management.

> There is no requirement that any UA include any specific CDM other
> than ClearKey (which does not have this behavior).

ClearKey could be un-bundled from EME so the existence of 'ClearKey' is probably not a defense.

> A much shorter path to
> this scenario is that the UA provides a direct non-standard method to turn
> on "DRM" for the web page and does not include an entire alternate HTML
> engine. Both scenarios require collusion by the UA implementer and both rely
> on behavior outside of the specification.

And if someone proposes this then I will object to it too, but this is not the matter at hand.

If a DRM 'hole' is created by EME that can support more general DRM of content, then developers will exploit it.
 
> Having said this -- it sounds like it would satisfy your concern to have
> some spec text that says something along the lines of "The CDM must not
> implement a user agent.". Do you have some alternate text to suggest?

This constraint would need to be verifiable by the user.  It is not clear how this would be possible in a way that is consistent with the design of DRM. There have been suggestions that the CDM be standardized, independently implementable, support open source, etc., which would go some way.  Technically, removing the support for a back channel would go some way.  Moving the CDM within the scope of W3C standardization would allow review. Do you have other suggestions?
Comment 11 Fred Andrews 2013-02-21 02:01:01 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > 
> > > and if a CDM can support DRM then there would be demand for more 
> > > general HTML support within the CDM - I suggest it's inevitable that a CDM 
> > > would be written that supports a relative comprehensive HTML engine.  
> > 
> > I don't agree with this. 
> > 
> > The described threat would require the UA to include a CDM with this
> > behavior. There is no requirement that any UA include any specific CDM other
> > than ClearKey (which does not have this behavior). A much shorter path to
> > this scenario is that the UA provides a direct non-standard method to turn
> > on "DRM" for the web page and does not include an entire alternate HTML
> > engine. Both scenarios require collusion by the UA implementer and both rely
> > on behavior outside of the specification.
> 
> An even simpler way would be for the UA to implement a generic plugin API
> allowing the user to install (deliberately, or as the result of social
> engineering) arbitrary code that integrates into the web page. Such a plugin
> could implement an entirely independent HTML-equivalent presentation engine,
> complete with DRM protection, if it so chose.

Well said.

However, I dispute that such an API exists in all browsers,
and dispute that a plugin could implement strong DRM on
a very significant segment of the current web browser market.
 
> Since such a plugin API already exists in all browsers, it would be an
> improvement if the use-cases that cause browsers to maintain support for it
> could be supported in a way where the DRM code and what it can do is more
> under the control of the UA implementor. This is one of our goals.

I accept that this goal is reflected in some way by the EME proposal
but suggest that EME has focused on making it more convenient to
implement DRM video within an HTML web application, rather than
restricting the CDM to such usage, which is the issue at hand in
this bug.
Comment 12 Adrian Bateman [MSFT] 2013-02-26 16:16:56 UTC
Discussed in the telcon. Resolved Won't Fix.
http://www.w3.org/2013/02/26-html-media-minutes.html

We believe that the spec is as constrained to the HTMLMediaElement as it can be. The beginning of the abstract says: "This proposal extends HTMLMediaElement providing APIs to control playback of protected content." We don't believe there is any text that can be further added to the spec to constrain what implementers can code into their user agents.
Comment 13 Fred Andrews 2013-02-26 23:37:37 UTC
(In reply to comment #12)
> Discussed in the telcon. Resolved Won't Fix.
> http://www.w3.org/2013/02/26-html-media-minutes.html
> 
> We believe that the spec is as constrained to the HTMLMediaElement as it can
> be. The beginning of the abstract says: "This proposal extends
> HTMLMediaElement providing APIs to control playback of protected content."
> We don't believe there is any text that can be further added to the spec to
> constrain what implementers can code into their user agents.

Clearly the EME specification text can be tightened to exclude
content outside video and audio.  As a trivial example: "This
proposal extends HTMLMediaElement providing APIs to control
playback of ONLY protected VIDEO and AUDIO content."

There may well be technical restrictions that would be applicable.
For example, for content requiring only a low level of protection
it is likely possible to limit the back channel.  The CDM could
also be sandboxed to provide protection.

Closing this bug at this point hardly shows a good faith effort
to address the matter.