This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 27067 - Define what to do when CDM becomes unavailable
Summary: Define what to do when CDM becomes unavailable
Status: RESOLVED MOVED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: David Dorwin
QA Contact: HTML WG Bugzilla archive list
Reported: 2014-10-15 22:47 UTC by Chris Pearce
Modified: 2015-10-19 23:17 UTC

Description Chris Pearce 2014-10-15 22:47:20 UTC
What should we do if the CDM crashes? It looks like this could be covered by the session close algorithm. Do we also need an error event dispatched to the HTMLMediaElement?

Whatever we do, it should be explicitly defined in the spec.
Comment 1 Glenn Adams 2014-10-15 23:01:09 UTC
That is like asking what to do if the UA crashes. I believe this question is out of scope.
Comment 2 Chris Pearce 2014-10-15 23:06:45 UTC
No it is not. Our EME plugins run out of process, so if they crash we don't. What should we do when they crash?
Comment 3 Glenn Adams 2014-10-15 23:26:28 UTC
(In reply to Chris Pearce from comment #2)
> No it is not.

From inside your UA it may seem so, but from the perspective of an outside app, I would not agree.

> Our EME plugins run out of process, so if they crash we don't.
> What should we do when they crash?

The fact that you run EME out of process is an implementation detail of your UA. It seems more like behavior that would be defined in a contract between your UA and your CDM mechanism, a contract that EME should be silent about (IMO).
Comment 4 David Dorwin 2014-10-15 23:28:59 UTC
This might be better phrased as "when the CDM fails or becomes unresponsive." Since the CDM is a separate entity in the spec and many implementations, this could be a real possibility. This should also cover cases where the "context" is lost, such as on some mobile platforms when the device goes to sleep.

For contrast/comparison: The HTML5 spec does not say what to do when a hardware video decoder fails, but there is a generic mechanism for reporting decode errors:
  MEDIA_ERR_DECODE (numeric value 3)
    An error of some description occurred while decoding the media resource, after the resource was established to be usable.

We may want to identify a similar mechanism. For example, maybe the first paragraph of the Session Close algorithm section should be broadened. The Encrypted Block Encountered algorithm should already cover the behavior resulting from the keys being unavailable.
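
A minimal sketch of how an application might recognize the generic decode error mentioned above. The numeric codes are those of the HTML MediaError interface; the type name `MediaErrorLike` and the helper `mightBeCdmFailure` are hypothetical and exist only for illustration.

```typescript
// Numeric codes from the HTML media element's MediaError interface.
const MediaErrorCode = {
  ABORTED: 1,
  NETWORK: 2,
  DECODE: 3,
  SRC_NOT_SUPPORTED: 4,
} as const;

// Minimal structural stand-in for the value of a media element's `error`
// attribute (hypothetical type name).
interface MediaErrorLike {
  code: number;
}

// Hypothetical helper: true when the reported error is MEDIA_ERR_DECODE,
// the only generic signal that could also cover a failed decoder or CDM.
function mightBeCdmFailure(error: MediaErrorLike | null): boolean {
  return error !== null && error.code === MediaErrorCode.DECODE;
}
```

In a browser, a page could call this helper with the element's `error` attribute from an "error" event listener. The code alone cannot tell a CDM failure apart from any other decode failure.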
Comment 5 Glenn Adams 2014-10-15 23:36:14 UTC
(In reply to David Dorwin from comment #4)
> This might be better phrased as "when the CDM fails or becomes
> unresponsive." Since the CDM is a separate entity in the spec and many
> implementations, this could be a real possibility. This should also cover
> cases where the "context" is lost, such as on some mobile platforms when the
> device goes to sleep.
> 
> For contrast/comparison: The HTML5 spec does not say what to do when a
> hardware video decoder fails, but there is a generic mechanism for reporting
> decode errors:
>   MEDIA_ERR_DECODE (numeric value 3)
>     An error of some description occurred while decoding the media resource,
> after the resource was established to be usable.
> 
> We may want to identify a similar mechanism. For example, maybe the first
> paragraph of the Session Close algorithm section should be broadened. The
> Encrypted Block Encountered algorithm should already cover the behavior
> resulting from the keys being unavailable.

That sounds reasonable.
Comment 6 Joe Steele 2014-10-17 01:01:21 UTC
(In reply to David Dorwin from comment #4)
> This might be better phrased as "when the CDM fails or becomes
> unresponsive." Since the CDM is a separate entity in the spec and many
> implementations, this could be a real possibility. This should also cover
> cases where the "context" is lost, such as on some mobile platforms when the
> device goes to sleep.
> 
> For contrast/comparison: The HTML5 spec does not say what to do when a
> hardware video decoder fails, but there is a generic mechanism for reporting
> decode errors:
>   MEDIA_ERR_DECODE (numeric value 3)
>     An error of some description occurred while decoding the media resource,
> after the resource was established to be usable.
> 
> We may want to identify a similar mechanism. For example, maybe the first
> paragraph of the Session Close algorithm section should be broadened. The
> Encrypted Block Encountered algorithm should already cover the behavior
> resulting from the keys being unavailable.

I think the idea is right and I would expect both things to happen, i.e. a MEDIA_ERR_DECODE to be reported and the Session Close algorithm to run. The app could catch the error and respond appropriately then. 

However we may need more. In some cases the MediaKeys object itself may no longer be valid. Currently we don't have a MediaKeys Close algorithm defined. I think we may need that as well, to inform the application that it needs to go through the Key System selection again.
Comment 7 David Dorwin 2014-10-20 22:43:20 UTC
(In reply to Joe Steele from comment #6)
> (In reply to David Dorwin from comment #4)
> > This might be better phrased as "when the CDM fails or becomes
> > unresponsive." Since the CDM is a separate entity in the spec and many
> > implementations, this could be a real possibility. This should also cover
> > cases where the "context" is lost, such as on some mobile platforms when the
> > device goes to sleep.
> > 
> > For contrast/comparison: The HTML5 spec does not say what to do when a
> > hardware video decoder fails, but there is a generic mechanism for reporting
> > decode errors:
> >   MEDIA_ERR_DECODE (numeric value 3)
> >     An error of some description occurred while decoding the media resource,
> > after the resource was established to be usable.
> > 
> > We may want to identify a similar mechanism. For example, maybe the first
> > paragraph of the Session Close algorithm section should be broadened. The
> > Encrypted Block Encountered algorithm should already cover the behavior
> > resulting from the keys being unavailable.
> 
> I think the idea is right and I would expect both things to happen, i.e. a
> MEDIA_ERR_DECODE to be reported and the Session Close algorithm to run. The
> app could catch the error and respond appropriately then. 

Note that a MEDIA_ERR_DECODE would not occur in this case. The "waiting for a key" path would be executed, per my last sentence above.

> However we may need more. In some cases the MediaKeys object itself may no
> longer be valid. Currently we don't have a MediaKeys Close algorithm
> defined. I think we may need that as well, to inform the application that it
> needs to go through the Key System selection again.

We shouldn't over-engineer this corner case. It seems unlikely that applications would bother listening for such an event on MediaKeys anyway.

However, we could have createSession() reject the promise with "InvalidStateError". If all the sessions are closed and you can't create a new session, something is clearly wrong with the MediaKeys.
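
A rough sketch of how an application could react to the rejection suggested above, assuming createSession() returns a promise as this comment describes. `KeysLike`, `SessionLike`, and `createSessionWithRecovery` are hypothetical names, not part of any spec, and the retry-once policy is only an assumption.

```typescript
// Hypothetical stand-ins; these are not the real EME interfaces.
interface SessionLike {
  sessionId: string;
}
interface KeysLike {
  createSession(): Promise<SessionLike>;
}

// Try to create a session. If the promise rejects with "InvalidStateError",
// assume the keys object is unusable and build a replacement exactly once.
async function createSessionWithRecovery(
  keys: KeysLike,
  recreateKeys: () => Promise<KeysLike>,
): Promise<{ keys: KeysLike; session: SessionLike }> {
  try {
    return { keys, session: await keys.createSession() };
  } catch (err) {
    // Any other failure is passed through untouched.
    if (!(err instanceof Error) || err.name !== "InvalidStateError") {
      throw err;
    }
    const fresh = await recreateKeys();
    return { keys: fresh, session: await fresh.createSession() };
  }
}
```

The caller gets back whichever keys object ended up working, so it knows whether it must also attach the replacement to the media element.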
Comment 8 Joe Steele 2014-10-20 22:54:06 UTC
(In reply to David Dorwin from comment #7)
> (In reply to Joe Steele from comment #6)
> > However we may need more. In some cases the MediaKeys object itself may no
> > longer be valid. Currently we don't have a MediaKeys Close algorithm
> > defined. I think we may need that as well, to inform the application that it
> > needs to go through the Key System selection again.
> 
> We shouldn't over-engineer this corner case. It seems unlikely that
> applications would bother listening for such an event on MediaKeys anyway.
> 
> However, we could have createSession() reject the promise with
> "InvalidStateError". If all the sessions are closed and you can't create a
> new session, something is clearly wrong with the MediaKeys.

This sounds fine to me. Hopefully Chris Pearce will comment further on this though.
Comment 9 Chris Pearce 2014-10-21 04:20:40 UTC
(In reply to David Dorwin from comment #7)
> (In reply to Joe Steele from comment #6)
> > (In reply to David Dorwin from comment #4)
> > > This might be better phrased as "when the CDM fails or becomes
> > > unresponsive." Since the CDM is a separate entity in the spec and many
> > > implementations, this could be a real possibility. This should also cover
> > > cases where the "context" is lost, such as on some mobile platforms when the
> > > device goes to sleep.
> > > 
> > > For contrast/comparison: The HTML5 spec does not say what to do when a
> > > hardware video decoder fails, but there is a generic mechanism for reporting
> > > decode errors:
> > >   MEDIA_ERR_DECODE (numeric value 3)
> > >     An error of some description occurred while decoding the media resource,
> > > after the resource was established to be usable.
> > > 
> > > We may want to identify a similar mechanism. For example, maybe the first
> > > paragraph of the Session Close algorithm section should be broadened. The
> > > Encrypted Block Encountered algorithm should already cover the behavior
> > > resulting from the keys being unavailable.
> > 
> > I think the idea is right and I would expect both things to happen, i.e. a
> > MEDIA_ERR_DECODE to be reported and the Session Close algorithm to run. The
> > app could catch the error and respond appropriately then. 
> 
> Note that a MEDIA_ERR_DECODE would not occur in this case. The waiting for a
> key path would be executed per my last sentence above.

Well, if we were executing step 2.2 of the Encrypted Block Encountered algorithm, and the CDM tried to decrypt and decode something and crashed, then the spec says we should report a MEDIA_ERR_DECODE error. If the CDM crashed while it wasn't in the act of decoding, or if it didn't have a usable keyid, then we're supposed to wait for keys.

Those keys will never become usable if the CDM has crashed, unless script detects this situation and re-creates the MediaKeys (I assume).


> > However we may need more. In some cases the MediaKeys object itself may no
> > longer be valid. Currently we don't have a MediaKeys Close algorithm
> > defined. I think we may need that as well, to inform the application that it
> > needs to go through the Key System selection again.
> 
> We shouldn't over-engineer this corner case. It seems unlikely that
> applications would bother listening for such an event on MediaKeys anyway.

It seems to me that treating a CDM crash as a decode error makes sense: if the CDM crashes or is otherwise unusable, there's no way further video samples can be played, and no more key requests can be generated. Web developers already listen for "error" events on HTMLMediaElements, so I think it makes sense to fire that event here.

On the other hand, the video element could be made capable of decoding again if script creates a new MediaKeys object and sets it on the media element, but script would have to detect this condition, and it seems unlikely that script would make the effort given it's (hopefully!) an uncommon occurrence.

> However, we could have createSession() reject the promise with
> "InvalidStateError". If all the sessions are closed and you can't create a
> new session, something is clearly wrong with the MediaKeys.

Sure. If the CDM is in a state where it can't create a new session, MediaKeys.createSession should fail. But we should still have an explicit notification that the media element can't decode any more.

How much does what we're talking about here overlap with bug 26776?
Comment 10 David Dorwin 2014-10-21 20:33:39 UTC
As Chris noted, the crash could occur at a variety of points. If the crash occurs during, for example, the UA's Decrypt() call to the CDM, MEDIA_ERR_DECODE makes sense.

However, if the crash occurred elsewhere, the next time Encrypted Block Encountered is run, "cdm" will be invalid. Currently, there is no path in the algorithm to handle this. We could add the following step after the CDM step ("2. Use the cdm to execute the following steps"):
3. If the previous step failed, abort the media element's resource fetch algorithm, run the steps to report a MEDIA_ERR_DECODE error, and abort these steps.

That should ensure that the media element always reports MEDIA_ERR_DECODE if the attached MediaKeys becomes unusable. Separately, the MediaKeySession objects may/should also be closed.

However, that still doesn't tell the application it needs to create and attach a new MediaKeys object.

We could add a simple Event named "close" to MediaKeys to support unexpected loss of the CDM, including due to device resource constraints. Since any encrypted-event-driven application would need to see "encrypted" events again, the application will probably need to reload the source even if the media element is not in an error state.

Thus, if the CDM becomes unusable for whatever reason:
 1. The user agent should close all sessions.
 2. The user agent should fire a "close" event at the MediaKeys object.
 3. If the media is playing, a MEDIA_ERR_DECODE will almost certainly occur.
 4. The application should create a new MediaKeys object and call setMediaKeys().
 5. If the application relies on "encrypted" events or a MEDIA_ERR_DECODE event occurred, it must re-set .src.
  * If the CDM is also doing decoding and a MEDIA_ERR_DECODE event did not occur (likely because the media was not playing), the user agent needs to handle the change in decoders.
 6. The application should obtain new license(s).

Without the "close" event, the application may run step #5 but would not know to run #4.

This is overkill for crashes (refresh accomplishes 4-6). However, I think this could be an issue for some implementations that use platform-based CDMs on devices (especially mobile) that support multitasking. It's unclear whether applications would actually handle this, but a simple "close" event at least gives them the option.
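
The application-side steps (4 to 6) in the list above can be sketched as follows. This is only an illustration under stated assumptions: `PlayerLike`, `KeySystemHandle`, `RecoveryDeps`, and `recoverFromCdmLoss` are hypothetical names, and the "close" event that would trigger this is the proposal in this comment, not an existing API.

```typescript
// Hypothetical stand-ins; names and shapes are assumptions, not the EME API.
interface KeySystemHandle {
  label: string;
}
interface PlayerLike {
  src: string;
  setMediaKeys(keys: KeySystemHandle): Promise<void>;
}
interface RecoveryDeps {
  createKeys(): Promise<KeySystemHandle>; // step 4: new MediaKeys
  requestLicenses(keys: KeySystemHandle): Promise<void>; // step 6: new license(s)
}

// Runs application-side steps 4 to 6 and returns the steps taken, in order.
async function recoverFromCdmLoss(
  player: PlayerLike,
  deps: RecoveryDeps,
  mustReloadSource: boolean,
): Promise<string[]> {
  const steps: string[] = [];
  const keys = await deps.createKeys();
  steps.push("createKeys");
  await player.setMediaKeys(keys);
  steps.push("setMediaKeys");
  if (mustReloadSource) {
    // Step 5: re-set .src so that "encrypted" events are seen again.
    const current = player.src;
    player.src = current;
    steps.push("resetSrc");
  }
  await deps.requestLicenses(keys);
  steps.push("requestLicenses");
  return steps;
}
```

The `mustReloadSource` flag reflects the condition in step 5: it would be true when the application relies on "encrypted" events or has seen a MEDIA_ERR_DECODE error.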

(In reply to Chris Pearce from comment #9)
> How much does what we're talking about here overlap with bug 26776?

I don't think there is any overlap. Those errors are always on MediaKeySessions. MediaKeys only has the user agent-only createSession() and setServerCertificate().
Comment 11 Joe Steele 2014-11-07 01:08:09 UTC
I would be in favor of a "close" event.
Chris -- any issues for the Firefox implementation?
Comment 12 Chris Pearce 2014-11-07 01:17:32 UTC
(In reply to David Dorwin from comment #10)
> As Chris noted, the crash could occur at a variety of points. If the crash
> occurs during, for example, the UA's Decrypt() call to the CDM,
> MEDIA_ERR_DECODE makes sense.
> 
> However, if the crash occurred elsewhere, the next time Encrypted Block
> Encountered is run, "cdm" will be invalid. Currently, there is no path in
> the algorithm to handle this. We could add the following step after the CDM
> step ("2. Use the cdm to execute the following steps"):
> 3. If the previous step failed, abort the media element's resource fetch
> algorithm, run the steps to report a MEDIA_ERR_DECODE error, and abort these
> steps.
> 
> That should ensure that the media element always reports MEDIA_ERR_DECODE if
> the attached MediaKeys becomes unusable. Separately, the MediaKeySession
> objects may/should also be closed.


I think the idea of firing an error in the Encrypted Block Encountered algorithm if the CDM is no longer available is good, but I don't think your proposal will work. If the CDM becomes unavailable, presumably the MediaKeySessions will close, so the steps you propose won't have the desired effect; if the MediaKeySessions close, the Encrypted Block Encountered algorithm will skip to a "waiting" state, bypassing the step you'd add.

How about we inject a step immediately after 1.2 "Let cdm be the CDM loaded during the initialization of the media keys.":

"If cdm is no longer usable for any reason, abort the media element's resource fetch algorithm, run the steps to report a MEDIA_ERR_DECODE error, and abort these steps."



> However, that still doesn't tell the application it needs to create and
> attach a new MediaKeys object.
> 
> We could add a simple Event named "close" to MediaKeys to support unexpected
> loss of the CDM, including due to device resource constraints.

This sounds good to me.
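
To illustrate the ordering concern raised in this comment, here is a deliberately simplified model of the Encrypted Block Encountered algorithm. It is not the spec text: `CdmState`, `BlockOutcome`, and both function names are hypothetical, and the real algorithm has more steps than the two checks shown.

```typescript
type BlockOutcome = "decode-error" | "waiting-for-key" | "decrypted";

// Hypothetical snapshot of the state the algorithm inspects.
interface CdmState {
  cdmUsable: boolean; // false after a crash or lost context
  hasUsableKey: boolean; // false once all sessions have closed
}

// CDM check placed first, before any session or key lookup.
function onEncryptedBlockCheckFirst(s: CdmState): BlockOutcome {
  if (!s.cdmUsable) return "decode-error"; // proposed new step
  if (!s.hasUsableKey) return "waiting-for-key"; // existing wait path
  return "decrypted";
}

// For contrast: CDM check placed after the key lookup.
function onEncryptedBlockCheckLast(s: CdmState): BlockOutcome {
  if (!s.hasUsableKey) return "waiting-for-key";
  if (!s.cdmUsable) return "decode-error";
  return "decrypted";
}
```

With a crashed CDM whose sessions have already closed, the check-last variant never reaches the error step and waits for keys indefinitely, whereas the check-first variant reports the decode error.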
Comment 13 David Dorwin 2015-10-19 23:17:10 UTC
Moved to https://github.com/w3c/encrypted-media/issues/102.