24082 – Several issues discussed in the TF point to the need for defined extensibility points in EME

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	Encrypted Media Extensions
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Adrian Bateman [MSFT]
QA Contact:	HTML WG Bugzilla archive list

Reported:	2013-12-13 00:47 UTC by Adrian Bateman [MSFT]
Modified:	2014-10-31 15:21 UTC
CC List:	6 users

Description Adrian Bateman [MSFT] 2013-12-13 00:47:26 UTC

The goal of this bug is to replace bug 17660, which was opened a LOOONNNNGGG time ago when the spec had a very different shape.

During the discussion of bug 17660, Joe proposed some different approaches that overload the URL attribute of the message event to allow the CDM to communicate to the application (without needing a round-trip to a server). If this is needed, I would prefer to provide an explicit mechanism to support this scenario rather than endorsing stuffing data into the URL that isn't the URL of a license service.

Similarly, at TPAC we discussed Microsoft's implementation, which allows the application to pass additional data to createSession that the CDM then includes in the message event data. While it could be argued that this is a transitional problem, we expect services using PlayReady to continue to want this capability, and blocking it in EME adds cost to service implementations.

In bug 24025, there is discussion of a new dictionary object that is passed to the MediaKeys constructor including the possibility of including CDM specific information here.

These issues and others suggest that we need to consider how to handle this while still keeping in mind the goal of maximizing interoperability as far as possible, through things like Common Encryption and application logic that is as common as possible across browsers/CDMs.

We could choose not to support this kind of extensibility explicitly in the spec, but this won't prevent implementations from adding the capabilities. If this is going to happen, I would prefer that the framework be defined in the spec so that extensions follow a common pattern. I don't think this undermines the overall goal of providing common players where possible.

Comment 1 Joe Steele 2014-02-18 06:42:41 UTC

I will provide some examples, since I raised this initially. The use cases that I am concerned about all involve passing information that is not known at the time the content is being packaged and therefore will not be part of the initData. 

I see two major areas here:
1) Information/instructions required to playback
2) Information/instructions to manage cached state

Some examples of #1 are:
- in-band authentication, where the authentication tokens are carried within the key request protocol
- domain join requests, where the domain being joined is selected by the web application
- web application authentication tokens

Some examples of #2 are:
- clear all cached state
- domain leave requests, where the domain being left is selected by the web application

It is desirable to be able to perform some of these operations prior to actually playing content, for example while the user is selecting a movie, in order to reduce the time to starting playback. However since the actions being requested may require network communications, it may make more sense to hang this additional parameter off of the createSession() call instead of the constructor to avoid adding another call/response channel directly to the MediaKeys.

I would imagine performing a domain join using this mechanism to look something like this:

// 'extras' carries CDM-specific data alongside the usual initData.
var extras = { 'action':'join', 'domain':'Lambda Lambda Lambda' };
var mediaKeys = new MediaKeys("com.example.cdm");
video.setMediaKeys(mediaKeys);
var session = mediaKeys.createSession("video/mp4", initData, extras);

The structure of the extra parameter can be completely CDM specific. If it is passed when not appropriate, it can be completely ignored. If a particular entry is not understood, we can decide whether that means an error or not. I am not sure where that error would go, but it would be useful to have an error thrown.
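
As a rough illustration of the handling described above, the sketch below shows one way an implementation might treat the extra parameter: ignore it when absent and throw when an entry is not understood. The function name validateExtras, the KNOWN_ACTIONS set, and the choice of TypeError are all assumptions made for this example; nothing here is defined by EME or by any CDM.

```javascript
// Hypothetical sketch: none of these names are part of EME or any real CDM.
// KNOWN_ACTIONS stands in for the entries one particular CDM understands.
const KNOWN_ACTIONS = new Set(["join", "leave"]);

// Returns a cleaned copy of the extras, or null when none were supplied.
// Throws a TypeError on an entry the CDM does not understand; silently
// ignoring such entries would be the other policy mentioned above.
function validateExtras(extras) {
  if (extras === undefined || extras === null) {
    return null; // nothing extra was passed, proceed as usual
  }
  if (typeof extras !== "object" || Array.isArray(extras)) {
    throw new TypeError("extras must be a dictionary");
  }
  if (!KNOWN_ACTIONS.has(extras.action)) {
    throw new TypeError("unknown action: " + String(extras.action));
  }
  return { action: extras.action, domain: String(extras.domain) };
}
```

Throwing synchronously is only one possible answer to the question of where the error would go; an error event on the session would be another.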

Comment 2 Adrian Bateman [MSFT] 2014-02-21 22:36:58 UTC

A significant number of PlayReady customers (I think around 50%) use the feature to transmit data within the license request to their licensing service. Since we expect these customers to continue to share backend infrastructure between EME-enabled web applications and native applications on other platforms, we needed to provide this capability in EME to avoid costly changes on the server side. See http://msdn.microsoft.com/en-us/library/ie/dn255041

While we understand the concern that supporting this kind of application specific data might encourage developers to adopt practices that make it harder to deploy CDM-agnostic applications, we don't believe that this will happen. One of the key goals of EME is to allow as much sharing of application code using HTML media elements as possible and this will be enough to encourage use of technologies like Common Encryption with multiple content protection systems.

Comment 3 Adrian Bateman [MSFT] 2014-02-22 00:19:11 UTC

We think the following extension points should be contemplated:

1) MediaKeys constructor should take an optional dictionary.
   The use cases for this are recorded in bug 24025.

2) createSession should include an additional data parameter as described in comment 1 and comment 2. Joe proposes a dictionary here too.

3) update should also include an additional data parameter. We believe the use cases for createSession map onto update as well, for example to influence the next message event.

Since bug 24025 is a subset of this issue, we recommend closing 24025 and tracking the work here so that we solve for this issue in one go.

Comment 4 David Dorwin 2014-03-03 22:54:56 UTC

(In reply to Joe Steele from comment #1)
> I will provide some examples, since I raised this initially. The use cases
> that I am concerned about all involve passing information that is not known
> at the time the content is being packaged and therefore will not be part of
> the initData. 
> 
> I see two major areas here:
> 1) Information/instructions required to playback
Should "playback" be replaced with "acquire a license"?

The examples below all seem related to authentication and authorization. From the EME Abstract:
“The common API supports a simple set of content encryption capabilities, leaving application functions such as authentication and authorization to page authors.”

> 2) Information/instructions to manage cached state
These sound like control messages, which are different than the message passing of update() and keymessage. I don’t think extending createSession() and/or update() is the right solution for control messages.

> Some examples of #1 are:
> - in-band authentication, where the authentication tokens are carried within
> the key request protocol
Can you provide more details? What authentication tokens? Where do they come from? How is this different from the third item below?

> - domain join requests, where the domain being joined is selected by the web
> application
How do domains work in an interoperable way and within origin boundaries? (It's probably better to discuss this in a separate thread rather than letting it get lost in this bug.) What parameters are necessary to join a domain?

> - web application authentication tokens
Do you mean cookies or the sign-in tokens that might be stored in them? If so, why do these need to be passed to the CDM, especially when the application is going to end up sending the message to the [license] server?
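
For comparison, a minimal sketch of the application-level approach implied by this question: the page attaches its own sign-in token to the HTTP request that carries the CDM's message, so the token never passes through the CDM. The name buildLicenseRequest, the Authorization header, and the Bearer scheme are assumptions for illustration, not anything the spec or a particular license server requires.

```javascript
// Hypothetical helper; the names and the header choice are illustrative only.
// message:        the opaque key request produced by the CDM
// destinationURL: the license server URL
// token:          the application's own sign-in token
function buildLicenseRequest(message, destinationURL, token) {
  return {
    url: destinationURL,
    method: "POST",
    headers: {
      "Content-Type": "application/octet-stream",
      "Authorization": "Bearer " + token,
    },
    body: message, // passed through untouched; the CDM never sees the token
  };
}
```

The trade-off raised elsewhere in this thread still applies: a token sent this way is protected by the transport, not signed or encrypted by the DRM subsystem.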

> Some examples of #2 are:
> - clear all cached state
Can you give an example where the application would need to do this? Couldn't it just release() the sessions it knows about?

> - domain leave requests, where the domain being left is selected by the web
> application
Can you provide more details? (Again, probably in a separate thread.) Why doesn't release() work?

> It is desirable to be able to perform some of these operations prior to
> actually playing content, for example while the user is selecting a movie,
> in order to reduce the time to starting playback. However since the actions
> being requested may require network communications, it may make more sense
> to hang this additional parameter off of the createSession() call instead of
> the constructor to avoid adding another call/response channel directly to
> the MediaKeys.
What specifically is missing? We have release() (and the ability to respond with messages) and loadSession(), so is it the creation that's a problem?

> I would imagine performing a domain join using this mechanism to look
> something like this:
> 
> var extras = { 'action':'join', 'domain':'Lambda Lambda Lambda' };
> var mediaKeys = new MediaKeys("com.example.cdm");
> video.setMediaKeys(mediaKeys);
> var session = mediaKeys.createSession("video/mp4", initData, extras);
Do you really want to combine a domain join request with the processing of meta data? Aren't these two separate operations, possibly with separate "sessions"?

> The structure of the extra parameter can be completely CDM specific. If it
> is passed when not appropriate, it can be completely ignored. If a
> particular entry is not understood, we can decide whether that means an
> error or not. I am not sure where that error would go, but it would be
> useful to have an error thrown.

Comment 5 David Dorwin 2014-03-03 23:07:25 UTC

(In reply to Adrian Bateman [MSFT] from comment #2)
> A significant number of PlayReady customers (I think around 50%) use the
> feature to transmit data within the license request to their licensing
> service. Since we expect these customers to continue to share backend
> infrastructure between EME-enabled web applications and native applications
> on other platforms, we needed to provide this capability in EME to avoid
> costly changes on the server side. See
> http://msdn.microsoft.com/en-us/library/ie/dn255041
Couldn't this be solved with a simple thin layer that 
> 
> While we understand the concern that supporting this kind of application
> specific data might encourage developers to adopt practices that make it
> harder to deploy CDM-agnostic applications, we don't believe that this will
> happen. One of the key goals of EME is to allow as much sharing of
> application code using HTML media elements as possible and this will be
> enough to encourage use of technologies like Common Encryption with multiple
> content protection systems.

We do believe this is a significant risk. Content providers are already targeting subset(s) of clients. Moving to EME will be a significant effort (or outsourcing) for some, and it seems unlikely that some of them will come back and make their solutions interoperable later, especially if they do not have the expertise in-house.

If we want true interoperability, providers will need to make some adjustments. We believe they will make them given the advantages of migrating to HTML5. If we think content providers will end up in the same place, what is the point of adding a permanent crutch for the short term?

Alternatively, what can PlayReady do to ease the transition for these customers?


(In reply to Adrian Bateman [MSFT] from comment #3)
> We think the following extension points should be contemplated:
So far, I'm not sure the use cases presented are as much extension points as they are support for non-interoperable legacy solutions. It could be argued that EME is already extensible. According to Wikipedia, extensibility means “New capabilities can be added to the software without major changes to the underlying architecture.” I believe that the MediaKeys and MediaKeySession architecture is extensible, as shown with the addition of loadSession() to add new capabilities. If EME is missing such capabilities, or we find that it is during implementation, we should file bugs and address them. However, I don't think adding what are essentially void* parameters is the right way forward.

> 1) MediaKeys constructor should take an optional dictionary.
>    The use cases for this are recorded in bug 24025.
I'm less certain about the dictionary now. I'll update that bug with more discussion.

> 2) createSession should include an additional data parameter as described in
> comment 1 and comment 2. Joe proposes a dictionary here too.
> 
> 3) update should also include an additional data parameter. We believe the
> use cases for createSession map onto update as well, for example to
> influence the next message event.
What use cases are those? The PlayReady additional data use case or something else? Is there data that must be provided in every message? Could the CDM store such data from MediaKeySession and/or MediaKeys creation?

> Since bug 24025 is a subset of this issue, we recommend closing 24025 and
> tracking the work here so that we solve for this issue in one go.
These seem to be tracking separate use cases at the moment, and I think it would be confusing to combine the discussions into one long bug. Let's see where each discussion goes before merging them.

Comment 6 Joe Steele 2014-03-03 23:58:32 UTC

(In reply to David Dorwin from comment #4)
> (In reply to Joe Steele from comment #1)
> > 1) Information/instructions required to playback
> Should "playback" be replaced with "acquire a license"?

Yes, but I want to be clear that this is not information necessarily associated with a specific piece of content. 

> The examples below all seem related to authentication and authorization.
> From the EME Abstract:
> “The common API supports a simple set of content encryption capabilities,
> leaving application functions such as authentication and authorization to
> page authors.”

This would in no way remove control from the hands of the page authors. It does remove some of the constraints around how authentication and authorization are conveyed to the server.

> 
> > 2) Information/instructions to manage cached state
> These sound like control messages, which are different than the message
> passing of update() and keymessage. I don’t think extending createSession()
> and/or update() is the right solution for control messages.

These are control messages from the application, not from the server. And these are specifically control messages which should require user interaction.

> 
> > Some examples of #1 are:
> > - in-band authentication, where the authentication tokens are carried within
> > the key request protocol
> Can you provide more details? What authentication tokens? Where do they come
> from? How is this different from the third item below?

This is an example I raised before in meetings. In the Access ecosystem, which handles standalone applications as well as web-based ones, authentication tokens (username+password, etc.) can be captured by the application and included in the license request. This way they are encrypted by the DRM subsystem and are not subject to in-transit attacks. I don't want to argue the merits of point-to-point SSL versus message-based encryption here. We can take that offline. My point is that this is something we would like to support.

> 
> > - domain join requests, where the domain being joined is selected by the web
> > application
> How do domains work in an interoperable way and within origin boundaries?
> (It's probably better to discuss this in a separate thread rather than letting
> it get lost in this bug.) What parameters are necessary to join a domain?

This is a long discussion and mostly captured by other threads. The parameters required in this case would be the domain name, a user-friendly name to associate with the application/device and potentially authentication tokens. 

> 
> > - web application authentication tokens
> Do you mean cookies or the sign-in tokens that might be stored in them? If
> so, why do these need to be passed to the CDM, especially when the
> application is going to end up sending the message to the [license] server?

I don't mean cookies specifically, although I suppose you could use it for that. The example I was thinking of was one-time fields in the page associated with the cookies. If they are sent to the server inside the license request, they can be signed/encrypted using the DRM sub-system which allows the server to then double-check them against the cookies or headers received. I can think of other examples as well. 
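
A minimal sketch of the server-side double-check described here, under heavy assumptions: the field name pageNonce, the function name, and the shape of both arguments are hypothetical, and the step where the DRM back end verifies or decrypts the license request is assumed to have already happened.

```javascript
// Hypothetical server-side check.
// verifiedFields: application data recovered from a license request AFTER the
//                 DRM back end has verified its signature (not shown here).
// cookies:        values parsed from the cookies or headers of the same request.
function oneTimeFieldMatches(verifiedFields, cookies) {
  const fromRequest = verifiedFields.pageNonce;
  const fromCookie = cookies.pageNonce;
  // Reject missing or empty values so two absent fields never count as a match.
  if (typeof fromRequest !== "string" || fromRequest.length === 0) {
    return false;
  }
  return fromRequest === fromCookie;
}
```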

> 
> > Some examples of #2 are:
> > - clear all cached state
> Can you give an example where the application would need to do this?
> Couldn't it just release() the sessions it knows about?

Yes. If the user is on a public computer and wants to clear any cached state before leaving. Releasing the sessions will not necessarily clear cached data. 

> 
> > - domain leave requests, where the domain being left is selected by the web
> > application
> Can you provide more details? (Again, probably in a separate thread.) Why
> doesn't release() work?

This is related to caching. It may be costly to join a domain (in terms of user time and/or effort). Therefore it makes sense to cache the results once you have done it. It does not make sense to throw that work away just because the user has left the page or closed the browser. 

> 
> > It is desirable to be able to perform some of these operations prior to
> > actually playing content, for example while the user is selecting a movie,
> > in order to reduce the time to starting playback. However since the actions
> > being requested may require network communications, it may make more sense
> > to hang this additional parameter off of the createSession() call instead of
> > the constructor to avoid adding another call/response channel directly to
> > the MediaKeys.
> What specifically is missing? We have release() (and the ability to respond
> with messages) and loadSession(), so is it the creation that's a problem?

I was responding to the extra parameter being on the constructor, rather than on the createSession() call itself. The constructor may be called prior to any content being downloaded, so a key session may not have been created. Since some of my use cases need network communications, I need a key session. Am I mistaken? I think with an extra parameter in both cases we are covered.

> 
> > I would imagine performing a domain join using this mechanism to look
> > something like this:
> > 
> > var extras = { 'action':'join', 'domain':'Lambda Lambda Lambda' };
> > var mediaKeys = new MediaKeys("com.example.cdm");
> > video.setMediaKeys(mediaKeys);
> > var session = mediaKeys.createSession("video/mp4", initData, extras);
> Do you really want to combine a domain join request with the processing of
> meta data? Aren't these two separate operations, possibly with separate
> "sessions"?

In our case the information about the domain to be joined is carried in the metadata. So it must be processed at the same time. That metadata may be entirely independent of a content stream however. 

> 
> > The structure of the extra parameter can be completely CDM specific. If it
> > is passed when not appropriate, it can be completely ignored. If a
> > particular entry is not understood, we can decide whether that means an
> > error or not. I am not sure where that error would go, but it would be
> > useful to have an error thrown.

Comment 7 David Dorwin 2014-03-04 01:08:12 UTC

(In reply to Joe Steele from comment #6)
> (In reply to David Dorwin from comment #4)
> > (In reply to Joe Steele from comment #1)
> > The examples below all seem related to authentication and authorization.
> > From the EME Abstract:
> > “The common API supports a simple set of content encryption capabilities,
> > leaving application functions such as authentication and authorization to
> > page authors.”
> 
> This would in no way remove control from the hands of the page authors. It
> does remove some of the constraints around how authentication and
> authorization are conveyed to the server.
Not only does this principle put applications in control, it keeps the EME APIs and specification simple and consistent across clients.

> > > 2) Information/instructions to manage cached state
> > These sound like control messages, which are different than the message
> > passing of update() and keymessage. I don’t think extending createSession()
> > and/or update() is the right solution for control messages.
> 
> These are control messages from the application, not from the server. And
Yes, that's what I meant. update() and message events are used to exchange information with a server, but these control messages don't fit that model.

> these are specifically control messages which should require user
> interaction.


> > > Some examples of #2 are:
> > > - clear all cached state
> > Can you give an example where the application would need to do this?
> > Couldn't it just release() the sessions it knows about?
> 
> Yes. If the user is on a public computer and wants to clear any cached state
> before leaving. Releasing the sessions will not necessarily clear cached
> data. 
This data should be clearable via browser UI that enables clearing of other such data: https://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/encrypted-media.html#privacy-storedinfo

> > > - domain leave requests, where the domain being left is selected by the web
> > > application
> > Can you provide more details? (Again, probably in a separate thread.) Why
> > doesn't release() work?
> 
> This is related to caching. It may be costly to join a domain (in terms of
> user time and/or effort). Therefore it makes sense to cache the results
> once you have done it. It does not make sense to throw that work away just
> because the user has left the page or closed the browser. 

As I understand it, you want to control the lifetime of a persistent license (even though it is not a content license). In theory, MediaKeySession provides this. As noted elsewhere, I'm not sure why you would want to tie (or piggyback) the two types of licenses together.

> > > It is desirable to be able to perform some of these operations prior to
> > > actually playing content, for example while the user is selecting a movie,
> > > in order to reduce the time to starting playback. However since the actions
> > > being requested may require network communications, it may make more sense
> > > to hang this additional parameter off of the createSession() call instead of
> > > the constructor to avoid adding another call/response channel directly to
> > > the MediaKeys.
> > What specifically is missing? We have release() (and the ability to respond
> > with messages) and loadSession(), so is it the creation that's a problem?
> 
> I was responding to the extra parameter being on the constructor, rather
> than on the createSession() call itself. The constructor may be called prior
> to any content being downloaded, so a key session may not be created. Since
> for some of my use cases, I need network communications, I need a key
> session. Am I mistaken? I think with an extra parameter in both cases we are
> covered.
I think the root of the issue is that you want to create sessions or exchange messages that are not related to any particular stream or content key ("for example while the user is selecting a movie", "prior to any content being downloaded") whereas the EME APIs have been designed around license exchanges for specific keys (based on Initialization Data). Adding an extra parameter (after |initData|) doesn't seem to solve the underlying issue.

> > > I would imagine performing a domain join using this mechanism to look
> > > something like this:
> > > 
> > > var extras = { 'action':'join', 'domain':'Lambda Lambda Lambda' };
> > > var mediaKeys = new MediaKeys("com.example.cdm");
> > > video.setMediaKeys(mediaKeys);
> > > var session = mediaKeys.createSession("video/mp4", initData, extras);
> > Do you really want to combine a domain join request with the processing of
> > meta data? Aren't these two separate operations, possibly with separate
> > "sessions"?
> 
> In our case the information about the domain to be joined is carried in the
> metadata.
I'm not sure what metadata you're referring to. Is it something in the Initialization Data (|initData|) or |extras|? If the former, is this metadata [in] a PSSH or something else?

> So it must be processed at the same time. That metadata may be
> entirely independent of a content stream however. 
Then why would it be combined with the license exchange for a particular stream (|initData|)? Or are you proposing that |initData| is not actually Initialization Data?

Comment 8 Joe Steele 2014-05-08 00:43:06 UTC

Where are we on this? The thread has gotten very long and hard to follow. 
Hopefully the new Use Cases wiki [1] will narrow down these discussions. :-)

I would like to see the following extensions to the spec (restating comment 1):

* Support for domain join and leave
I think this one can be dealt with later if we don't want to tackle that use case yet.

* Support for server communications not directly tied to a media session
This is to support cases where it would be advantageous to acquire keys before the user has actually selected the content to play. This can be supported today by passing in initData that is acquired from some source other than the media file. E.g. a "standard key package" initData provided to authenticated users of a service. I don't see a problem here unless someone objects to this mechanism. 

* Support for application data 
It would be very helpful to be able to add custom data from the application that could then be signed by the CDM as part of the key request. 
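
To make the second item above concrete, here is a minimal sketch of creating a session from initData that came from the service rather than from a media file. FakeMediaKeys is a stand-in so the sketch runs outside a browser, prefetchSession is a hypothetical helper, and the createSession(type, initData) shape is taken from the example in comment 1, not from any final API.

```javascript
// Stand-in for the browser's MediaKeys so the sketch runs outside a browser.
// A real page would use the user agent's MediaKeys object instead.
class FakeMediaKeys {
  constructor(keySystem) {
    this.keySystem = keySystem;
    this.sessions = [];
  }
  // Mirrors the createSession(type, initData) shape used in comment 1.
  createSession(type, initData) {
    const session = { type: type, initData: initData };
    this.sessions.push(session);
    return session;
  }
}

// Hypothetical helper: keyPackage is initData supplied by the service to an
// authenticated user (the "standard key package" idea), so a session can be
// created before any media has been selected or downloaded.
function prefetchSession(mediaKeys, keyPackage) {
  return mediaKeys.createSession("video/mp4", keyPackage);
}
```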

Would it be better to split these out into separate bugs? I think these could all be solved by adding a dictionary parameter to the MediaKeys and MediaKeySession constructors. But I am also OK with adding explicit functions for the first two. The last one seems like it would have to be a parameter on the MediaKeySession constructor. 

[1] https://www.w3.org/wiki/HTML/Media_Task_Force#Use_Cases

Comment 9 David Dorwin 2014-10-16 00:01:17 UTC

EME should not endorse vendor-specific extensions. See also https://github.com/w3ctag/spec-reviews/blob/master/2014/10/eme.md#author-facing-interoperability-between-key-systems.

In addition to interoperability, such unvetted extensions may also compromise the security and privacy properties of the spec. Likewise, supporting such extensions would make it difficult to reason about such properties.

I believe we have sufficient points to extend the standardized feature set in the future if necessary. For example, SessionType and MediaKeySystemOptions.

Comment 10 Joe Steele 2014-10-16 18:21:39 UTC

Most of the issues I raised here have been dealt with elsewhere. 

(In reply to Joe Steele from comment #8)
> Where are we on this? The thread has gotten very long and hard to follow. 
> Hopefully the new Use Cases wiki [1] will narrow down these discussions. :-)
> 
> I would like to see the following extensions to the spec (restating comment
> 1):
> 
> * Support for domain join and leave
> I think this one can be dealt with later if we don't want to tackle that use
> case yet.

Another bug was filed for this (bug 25217) and closed as RESOLVED LATER.
> 
> * Support for server communications not directly tied to a media session
> This is to support cases where it would be advantageous to acquire keys
> before the user has actually selected the content to play. This can be
> supported today by passing in initData that is acquired from some source
> other than the media file. E.g. a "standard key package" initData provided
> to authenticated users of a service. I don't see a problem here unless
> someone objects to this mechanism. 

This turned out to be a non-issue. The application can and should create a media session for this use case. That works for me. 

> * Support for application data 
> It would be very helpful to be able to add custom data from the application
> that could then be signed by the CDM as part of the key request. 

This is still an issue. It can be implemented (and will be, if the spec does not address it) either by applications modifying the initData passed to the CDM to contain the additional information, or by browsers supporting non-standard additional parameters to the createSession/generateRequest methods. Neither of these seems to add to interoperability. 

(In reply to David Dorwin from comment #9)
> EME should not endorse vendor-specific extensions. See also
> https://github.com/w3ctag/spec-reviews/blob/master/2014/10/eme.md#author-facing-interoperability-between-key-systems.

I will not go into detail here about the issues I have with this review. There are some interesting discussions going on in the restricted-media mailing list. For example - http://lists.w3.org/Archives/Public/public-restrictedmedia/2014Feb/0002.html

> 
> In addition to interoperability, such unvetted extensions may also
> compromise the security and privacy properties of the spec. Likewise,
> supporting such extensions would make it difficult to reason about such
> properties.

This is veering into the secure origin discussion in bug 26332. The channel between the application and the CDM already exists. Adding additional data from the same source to the channel does not change the security calculus. We are just talking about the semantics of how it gets added, not whether it can be added at all. 

> I believe we have sufficient points to extend the standardized feature set
> in the future if necessary. For example, SessionType and
> MediaKeySystemOptions.

I think we are getting closer. But this issue still exists. It can be worked around as I mentioned above, but I don't think that what I described is the right solution. The right solution is to either eliminate the need for this extension or have explicit support for it.

Comment 11 David Dorwin 2014-10-16 23:03:37 UTC

(In reply to Joe Steele from comment #10)
> (In reply to Joe Steele from comment #8)
> > * Support for application data 
> > It would be very helpful to be able to add custom data from the application
> > that could then be signed by the CDM as part of the key request. 
> 
> This is still an issue. It can (and will, if not addressed in the spec) be
> implemented by applications via modifying the initData passed to the CDM to
> contain the additional information desired or by browsers supporting
> non-standard additional parameters to the createSession/generateRequest
> methods. Neither of these seems to add to interoperability. 

This would be a misuse of the EME APIs and an abuse of initData, and would NOT be spec-compliant. Allowing such data to be passed via some proprietary key system protocols would itself inhibit interoperability. See comment #5.

More generally, it is concerning to (again [1]) see you imply implementation of non-standard APIs or behavior should the spec process not have a specific outcome.

> (In reply to David Dorwin from comment #9)
> > In addition to interoperability, such unvetted extensions may also
> > compromise the security and privacy properties of the spec. Likewise,
> > supporting such extensions would make it difficult to reason about such
> > properties.
> 
> This is veering into the secure origin discussion in bug 26332. The channel
> between the application and the CDM already exists. Adding additional data
> from the same source to the channel does not change the security calculus.
> We are just talking about the semantics of how it gets added, not whether it
> can be added at all. 

To clarify, my comment was general and did not apply to any specific extension(s).

This was a general statement about unspecified or non-normative functionality making such analysis and specification difficult, if not impossible. This is not necessarily related to the secure origin bug; it could also affect the user agent (i.e. origin considerations).

> > I believe we have sufficient points to extend the standardized feature set
> > in the future if necessary. For example, SessionType and
> > MediaKeySystemOptions.
> 
> I think we are getting closer. But this issue still exists. It can be worked
> around as I mentioned above, but I don't think that what I described is the
> right solution. The right solution is to either eliminate the need for this
> extension or have explicit support for it.

If you think there is a missing feature, we should discuss it (in a different thread) rather than just allowing or implementing proprietary extensions. However, I believe this specific feature has already been discussed multiple times.

[1] http://www.w3.org/2014/08/26-html-media-minutes.html#item08

Comment 12 Joe Steele 2014-10-17 00:12:56 UTC

(In reply to David Dorwin from comment #11)
> (In reply to Joe Steele from comment #10)
> > (In reply to Joe Steele from comment #8)
> > > * Support for application data 
> > > It would be very helpful to be able to add custom data from the application
> > > that could then be signed by the CDM as part of the key request. 
> > 
> > This is still an issue. It can (and will, if not addressed in the spec) be
> > implemented by applications via modifying the initData passed to the CDM to
> > contain the additional information desired or by browsers supporting
> > non-standard additional parameters to the createSession/generateRequest
> > methods. Neither of these seems to add to interoperability. 
> 
> This would be an inappropriate misuse of the EME APIs, be an abuse of
> initData, and NOT be spec-compliant. Allowing such data to be passed via
> some proprietary key system protocols would itself inhibit interoperability.
> See comment #5.

I don't agree that this is non-compliant, although I think it is not in the spirit of the spec and would certainly be non-interoperable. 

The definitions of Initialization Data [1] and generateRequest [2] do not preclude this usage. There is some non-normative text that talks about "sanitizing" the initialization data, but, as we have previously discussed, the current definition of initialization data for CENC [3] includes "protection system specific headers", which means that this data often cannot be sanitized by the user agent even if such a thing were desirable.

[1] https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html#initialization-data
[2] https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html#widl-MediaKeySession-generateRequest-Promise-void--DOMString-initDataType-ArrayBuffer-ArrayBufferView-initData
[3] https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/cenc-format.html#init-data
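
For illustration only, the initData work-around described above might look roughly like the following TypeScript sketch. The helper name `appendAppData`, the placeholder bytes, and the simple append layout are all assumptions made for this example; a real key system would define its own container format, and comment 11 regards this usage as a misuse of the API.

```typescript
// Hypothetical helper, not part of EME or of any key system.
// It treats both inputs as opaque bytes and simply appends the
// application data to the initialization data.
function appendAppData(initData: Uint8Array, appData: Uint8Array): Uint8Array {
  const combined = new Uint8Array(initData.length + appData.length);
  combined.set(initData, 0);               // original initData first
  combined.set(appData, initData.length);  // application data after it
  return combined;
}

// Placeholder bytes standing in for real initData and application data.
const originalInitData = new Uint8Array([0x01, 0x02, 0x03]);
const applicationData = new Uint8Array([0xaa, 0xbb]);

const modifiedInitData = appendAppData(originalInitData, applicationData);
// In a browser, the application would then pass `modifiedInitData`
// to generateRequest(initDataType, initData) in place of the original.
```

Because the user agent sees only an opaque buffer, it has no way to tell which bytes came from the media and which were added by the application.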

> 
> More generally, it is concerning to (again [1]) see you imply implementation
> of non-standard APIs or behavior should the spec process not have a specific
> outcome.

Regarding non-standard APIs -- 

I was referring to the API support that Microsoft currently has for passing in additional information that was mentioned in comment 0.

Regarding non-standard behavior -- 

I am trying to point out common (in my experience) usage patterns for plugin-based protected content players that should be implementable for native browser playback. Based on the feedback from Microsoft, it would seem that my experience is common beyond our players.

It is reasonable to say that developers will try to build work-arounds when the available services do not solve their problem. If enough people use the work-arounds, service implementors take the hint and start building explicit support. If people find that it is easy to avoid the problem altogether without a work-around, the work-arounds die out naturally. This is how we make progress. 

Adobe does not own a browser. We have to work within the environment that is provided to us. Within those constraints we are trying to provide services our developers want, rather than tell them to change the way they currently work. We would prefer to work with browser implementors rather than at cross purposes in this. 

> 
> > (In reply to David Dorwin from comment #9)
> > > In addition to interoperability, such unvetted extensions may also
> > > compromise the security and privacy properties of the spec. Likewise,
> > > supporting such extensions would make it difficult to reason about such
> > > properties.
> > 
> > This is veering into the secure origin discussion in bug 26332. The channel
> > between the application and the CDM already exists. Adding additional data
> > from the same source to the channel does not change the security calculus.
> > We are just talking about the semantics of how it gets added, not whether it
> > can be added at all. 
> 
> To clarify, my comment was general and did not apply to any specific
> extension(s).
> 
> This was a general statement about unspecified or non-normative
> functionality making such analysis and specification difficult, if not
> impossible. This is not necessarily related to the secure origin bug; it
> could also affect the user agent (i.e. origin considerations).
> 
> > > I believe we have sufficient points to extend the standardized feature set
> > > in the future if necessary. For example, SessionType and
> > > MediaKeySystemOptions.
> > 
> > I think we are getting closer. But this issue still exists. It can be worked
> > around as I mentioned above, but I don't think that what I described is the
> > right solution. The right solution is to either eliminate the need for this
> > extension or have explicit support for it.
> 
> If you think there is a missing feature, we should discuss it (in a
> different thread) rather than just allowing or implementing proprietary
> extensions. However, I believe this specific feature has already been
> discussed multiple times.

This bug/thread was opened specifically to discuss this issue, as far as I can tell. Yes, it has been discussed multiple times, but at no point has a resolution been reached.

> [1] http://www.w3.org/2014/08/26-html-media-minutes.html#item08

There may be some confusion here - your link points to a conversation about loadSession. The main thrust of this bug is whether the application should have a standard way to pass additional data either with the initData when the session is created (my proposal) or when the MediaKeys object is created (Microsoft's implementation).

Comment 13 Joe Steele 2014-10-17 00:34:06 UTC

I did not intend to reopen. I don't agree with closing it, but I can bring it up later if the expected work-arounds start to impact interoperability.

Comment 14 Jerry Smith 2014-10-31 15:21:33 UTC

We originally proposed providing standard extension points to support a historic practice of including custom application data in license server messages. This was discussed previously as a proposal to include optional CdmData in createSession, and that proposal was resolved WONTFIX.

We have a philosophical choice to make in this area. Do we expect companies to have legitimate reasons for extending the EME APIs and, if so, do we want to control how those extensions are made? Or do we believe that all extensions undermine interoperability and that any provision for extensions will just encourage usage and ultimately create more interoperability problems?

We believe in the former and have advocated for extensibility mechanisms to be defined. These can effectively confine where extensions are made and make it easier to deprecate the capability at some point in the future, should that be desired. Extensions made by modifying initData seem far less desirable.
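
As a rough sketch of what "confining" an extension could mean, the TypeScript below models a key request in which application data travels in its own optional field and initData is left untouched. Every name here (`KeyRequestInput`, `KeyRequestMessage`, `buildKeyRequest`, `appData`) is hypothetical and chosen for this example; none of it is taken from the EME spec or from any shipping implementation.

```typescript
// All names below are hypothetical illustrations, not EME API.
interface KeyRequestInput {
  initData: Uint8Array;  // passed through unmodified
  appData?: Uint8Array;  // explicit, optional extension point
}

interface KeyRequestMessage {
  initData: Uint8Array;
  appData: Uint8Array | null;  // null when the extension is unused
}

// Models a CDM-like component assembling a request: the application
// data stays in a separate, well-defined slot instead of being mixed
// into initData.
function buildKeyRequest(input: KeyRequestInput): KeyRequestMessage {
  return {
    initData: input.initData.slice(),
    appData: input.appData ? input.appData.slice() : null,
  };
}

const withExtension = buildKeyRequest({
  initData: new Uint8Array([1, 2, 3]),
  appData: new Uint8Array([9]),
});
const withoutExtension = buildKeyRequest({
  initData: new Uint8Array([1, 2, 3]),
});
```

With a dedicated field, removing the capability later would mean ignoring or rejecting one optional member, whereas data folded into initData cannot be separated from it after the fact.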