This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 24368 - Define playback behavior when the key for an encrypted block is not available for a subset of streams
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Encrypted Media Extensions
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
Blocks: 18515
Reported: 2014-01-22 23:12 UTC by David Dorwin
Modified: 2014-02-24 20:07 UTC

Description David Dorwin 2014-01-22 23:12:53 UTC
This bug is half of what bug 18515 was originally. I'm breaking playback behavior out because it seems to be a sizable issue not related to the current text proposal in the other bug.

Bug 18515 covers the app-visible events and attributes when blocked waiting for the key for an encrypted block.

This bug tracks the user-visible playback experience in that case. Most importantly, it covers what happens when only a subset of the streams is blocked waiting for a key.

(From my original description in bug 18515)
> Some questions to address:
...
> 3) What should happen to playback if not all streams need a key to continue?
> 4) If playback is suspended in one media element, how does this affect an
> associated MediaController?
> 
> Related to the last two questions above, should user agents drop frames or
> suspend playback when at least one but not all streams need a key to
> continue? Dropping frames may be desirable if some content (e.g. audio) can
> be played without the missing key, but continuing one stream may not be
> possible in some implementations (e.g. if an audio stream needs a key).
> Choosing to drop frames may also mean that, for example, the initial video
> frames are never displayed in some use cases and media combinations.
> 
> Some specific scenarios to consider:
>  A) Video needs a key but audio does not.
>  B) Audio needs a key but video does not.
>  C) There is more than one audio or video stream.
>  D) A video or audio stream that is not being rendered needs a key.
New:
E) Do we need to worry about text track data? Is there a scenario where we should block for such data?
> 
> Note that if audio is driving playback (time) then playback will probably
> continue in the first scenario unless explicitly paused. This behavior would
> be different than the opposite (second) scenario.
> 
> In all cases, playback should resume when the key is provided.


We also need to decide where this behavior should be described. It seems this question might be better resolved outside of EME, especially in light of similar issues in MSE, which also makes such scenarios (audio frames but not video frames) possible. We may eventually file a bug against HTML to cover all such scenarios.
Comment 1 David Dorwin 2014-01-23 02:14:11 UTC
I looked at MSE's handling of such a condition and looked for relevant text in the HTML5 spec. (Thanks to Aaron for several pointers.) The summary is that the lack of media data for one or more tracks appears to be covered by MSE and HTML5, though it may make sense to add clarity to the HTML5 spec.

Details:
* An underflow should cause a transition to HAVE_CURRENT_DATA and playback should stall.
  - This happens when the current playback position goes into an area not covered by the TimeRanges reported by HTMLMediaElement.buffered (see [1]).
  - [1] and [2] appear to define the stalling behavior as well as the firing of the waiting event (relevant for bug 18515).
* In MSE, the behavior happens as part of the SourceBuffer monitoring algorithm [3].
  - Note: It is possible this section could be simplified to depend on or describe the interaction with the HTML5 algorithm.
* MSE is explicit about multiple active buffers but (unextended) HTML5 does not appear to be explicit about how to handle the case where the "media resource" and/or "media data" are available for one stream but not another.
  - This means that it is not obvious how to handle, for example, an underflow in video but not audio when using .src.
  - It is probably safe to interpret the HTML5 spec as implying 'for all selected video track(s), the enabled audio track(s), and the "showing" or "hidden" text track(s)' like the MSE spec explicitly states in [3].
* I was not able to find an explicit step for suspending playback in the HTML5 spec like the "Playback is suspended" statement in MSE [3].
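
To make the underflow condition above concrete, here is a minimal TypeScript sketch. It is an illustration only, not text from either spec: BufferedRange, ActiveTrack, hasDataAt, underflowingTracks and shouldStall are hypothetical names, and the per-track buffered ranges stand in for what a user agent tracks internally. It assumes the interpretation suggested above, namely that every active track (selected video, enabled audio, "showing" or "hidden" text) must have data at the current playback position.

```typescript
// Hypothetical stand-in for one TimeRanges entry, in seconds.
interface BufferedRange { start: number; end: number; }

// Hypothetical model of one active track (selected video, enabled audio,
// or "showing"/"hidden" text) and the media data buffered for it.
interface ActiveTrack { label: string; buffered: BufferedRange[]; }

// True when `position` lies inside one of the track's buffered ranges.
function hasDataAt(track: ActiveTrack, position: number): boolean {
  return track.buffered.some(r => position >= r.start && position < r.end);
}

// Labels of the active tracks that have no media data at `position`.
function underflowingTracks(tracks: ActiveTrack[], position: number): string[] {
  return tracks.filter(t => !hasDataAt(t, position)).map(t => t.label);
}

// Assumption: playback stalls as soon as any active track underflows,
// even if the other tracks still have data.
function shouldStall(tracks: ActiveTrack[], position: number): boolean {
  return underflowingTracks(tracks, position).length > 0;
}
```

With audio buffered for 0 to 10 seconds and video only for 0 to 4 seconds, this model reports no stall at position 2 and a stall caused by the video track at position 5.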


EME should have consistent behavior (stall if any track is blocked on decryption). However, since we decided that decryption should not affect the readyState in bug 18515 [4], EME cannot rely on those existing algorithms.

We might be able to address this in EME by adding a couple of things to the proposal in bug 18515.
For example (new text is surrounded by "***"):
   o  If media element was previously playing and had a waitingFor value of “notwaiting”: 
   ...
      -  If new mediaKeys are needed to continue ***playback for any selected video track(s), enabled audio track(s), or "showing" or "hidden" text track(s)*** and the element has not ended play back, the user agent must set the waitingFor attribute on the Media Element to “mediakeys”, queue a task to fire a simple event named timeupdate at the element, queue a task to fire a simple event named waiting at the element ***, and suspend playback***.
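
The proposed step can be read roughly as the following TypeScript sketch. This is a simplified model under stated assumptions, not the spec algorithm: TrackState, ElementState and applyMissingKeyStep are hypothetical names, "active" stands for a selected video track, an enabled audio track, or a "showing" or "hidden" text track, and queued events are recorded as strings rather than dispatched.

```typescript
type WaitingFor = "notwaiting" | "mediakeys";

// Hypothetical per-track state.
interface TrackState {
  kind: "video" | "audio" | "text";
  active: boolean;   // selected video, enabled audio, or showing/hidden text
  needsKey: boolean; // the next encrypted block has no key available
}

// Hypothetical slice of media element state relevant to the proposal.
interface ElementState {
  waitingFor: WaitingFor;
  ended: boolean;
  playbackSuspended: boolean;
  queuedEvents: string[]; // names of events queued so far
}

// Returns the state after the proposed step. The input is returned
// unchanged when the element was not playing, has ended, or no active
// track is blocked on a key.
function applyMissingKeyStep(el: ElementState, tracks: TrackState[]): ElementState {
  const wasPlaying = el.waitingFor === "notwaiting" && !el.playbackSuspended;
  const blocked = tracks.some(t => t.active && t.needsKey);
  if (!wasPlaying || el.ended || !blocked) {
    return el;
  }
  return {
    waitingFor: "mediakeys",
    ended: el.ended,
    playbackSuspended: true,
    queuedEvents: el.queuedEvents.concat(["timeupdate", "waiting"]),
  };
}
```

Under this reading, scenario A (video needs a key, audio does not) suspends playback, while scenario D (a track that is not being rendered needs a key) does not.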


[1] http://www.w3.org/TR/html5/embedded-content-0.html#event-media-waiting
[2] Second bullet at http://www.w3.org/TR/html5/embedded-content-0.html#handling-first-frame-available
[3] https://dvcs.w3.org/hg/html-media/raw-file/default/media-source/media-source.html#buffer-monitoring
[4] https://www.w3.org/Bugs/Public/show_bug.cgi?id=18515#c15
Comment 2 Adrian Bateman [MSFT] 2014-02-21 23:04:31 UTC
Assigning to me to implement as part of the changes for bug 18515. We agree with the proposal.
Comment 3 Adrian Bateman [MSFT] 2014-02-24 20:07:34 UTC
I think I've captured this correctly. Please review.
https://dvcs.w3.org/hg/html-media/rev/2346418fc472