25427 – Video element playback position and live content

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	Macintosh MacOS X

Importance:	P2 editorial
Target Milestone:	---
Assignee:	This bug has no owner yet - up for the taking
QA Contact:	HTML WG Bugzilla archive list

Reported:	2014-04-23 12:11 UTC by Jon Piesing (HbbTV)
Modified:	2015-06-21 05:03 UTC
CC List:	7 users

Description Jon Piesing (HbbTV) 2014-04-23 12:11:00 UTC

This issue is raised on behalf of HbbTV (see http://www.hbbtv.org), an organisation specifying the use of web technologies in television receivers. HbbTV is in the process of adding the HTML5 video element to its specification. The current HbbTV specification uses the <object> element for presenting video in an HTML page.

We are looking at the use of the HTML5 video element to play back live content delivered via IP. This could be adaptive using MPEG DASH (in our case a native DASH player as part of the UA, not a JavaScript DASH player as part of the web page). Alternatively, this could be non-adaptive using HTTP chunked transfer coding as defined by section 3.6.1 of RFC 2616.

It seems to us that the specification for the video element only partly considers live content. In particular, the description of the seekable property considers it but other places do not. For example, the following note from the description of seekable shows that live content has been considered to some extent:

"The range might be continuously changing, e.g. if the user agent is buffering a sliding window on an infinite stream. This is the behavior seen with DVRs viewing live TV, for instance."

With live content, typically the server in the network will only buffer a certain period behind the live edge. Using the terminology of this note, there would essentially be two sliding windows, one in the UA of tens of seconds and one in the network which might be (say) 10 minutes behind the live edge. In the case of MPEG DASH, the UA may know the size of the buffer in the network from the optional timeShiftBufferDepth attribute in the DASH MPD.

Typically, we would expect users to want to join live content at the live edge and not at the earliest time available in the network. In practice this would be achieved by the UA (media player) starting to read a small distance behind the live edge in order to fill the necessary buffers and start playing as quickly as possible.
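The relationship between the network window and the join position described above can be sketched as follows. This is only an illustration: the `LiveWindow` type, the function names and the 10-second startup buffer are hypothetical, and `timeShiftBufferDepth` is assumed to have already been converted from the MPD's duration format into seconds.

```typescript
// Hypothetical model of a live stream's time-shift window (all values in seconds).
interface LiveWindow {
  liveEdge: number;             // newest media time available from the server
  timeShiftBufferDepth: number; // how far behind the live edge the server keeps media
}

// Earliest media time still available from the network.
function networkWindowStart(w: LiveWindow): number {
  return Math.max(0, w.liveEdge - w.timeShiftBufferDepth);
}

// Position at which a player might join: an assumed startup buffer behind
// the live edge, but never before the start of the network window.
function liveJoinPosition(w: LiveWindow, startupBuffer: number): number {
  return Math.max(networkWindowStart(w), w.liveEdge - startupBuffer);
}

const exampleWindow: LiveWindow = { liveEdge: 3600, timeShiftBufferDepth: 600 };
console.log(networkWindowStart(exampleWindow));   // 3000
console.log(liveJoinPosition(exampleWindow, 10)); // 3590
```

With a 10 minute network buffer, the earliest position is 10 minutes behind the live edge, while the assumed join position is only 10 seconds behind it.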

Points concerning live content that we believe are ambiguous or not fully considered are the following:

1. The HTML5 specification can be interpreted in two ways regarding where playback of live content should begin:

In one interpretation, the specification reads as requiring the various playback positions to always be the earliest possible position. This is not what users will expect for live content, although we recognise this can be resolved by the application seeking to the live edge if required. 

In another interpretation, it could be considered that the statement “Establish the media timeline for the purposes of the current playback position, the earliest possible position, and the initial playback position, based on the media data” allows a UA to begin playback at the live edge if “media data” indicates that this is most appropriate.  

Which interpretation is expected (or are both of these incorrect)?  Can the specification be clarified specifically for the case of live content?

2. For live content, the earliest and latest possible positions are continually moving. It is unclear how the following statement is to be interpreted in these circumstances:

“When the earliest possible position changes, then: if the current playback position is before the earliest possible position, the user agent must seek to the earliest possible position; otherwise, if the user agent has not fired a timeupdate event at the element in the past 15 to 250ms and is not still running event handlers for such an event, then the user agent must queue a task to fire a simple event named timeupdate at the element.”

If the various playback position properties are set to the earliest possible position before playback has started, then this language might require the UA to seek continually as the earliest possible position moves ahead of what was the current playback position.

One example of when this could occur is if the HTML page calls the load method (but not play) on a piece of live content. What values are assigned to the various playback positions under these conditions?
- None?
- A static snapshot at the time the method was called?
- Dynamic updating values? 

If the UA is continually seeking as described above, then it would seem to be the last of these; however, we believe that this should be clearly stated in the specification.
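To make the concern in point 2 concrete, here is a minimal sketch of one reading of the quoted rule, applied to an element that has been loaded but not played. The names `PositionChangeAction`, `onEarliestPositionChange` and `simulatePausedLoad` are hypothetical, the 250 ms throttle is just one value from the 15 to 250ms range, and seeks are assumed to complete instantly, which a real UA would not guarantee.

```typescript
type PositionChangeAction = "seek" | "timeupdate" | "none";

// One reading of the rule for "when the earliest possible position changes".
// throttleMs is an assumed value picked from the 15 to 250ms range in the quote.
function onEarliestPositionChange(
  current: number,
  earliest: number,
  msSinceLastTimeupdate: number,
  handlersRunning: boolean,
  throttleMs: number = 250,
): PositionChangeAction {
  if (current < earliest) return "seek";
  if (msSinceLastTimeupdate >= throttleMs && !handlersRunning) return "timeupdate";
  return "none";
}

// Simulates a loaded-but-not-playing element on a live stream whose
// earliest possible position advances by stepSeconds on every update.
function simulatePausedLoad(steps: number, stepSeconds: number): { seeks: number; finalPosition: number } {
  let current = 0;
  let earliest = 0;
  let seeks = 0;
  for (let i = 0; i < steps; i++) {
    earliest += stepSeconds;
    if (onEarliestPositionChange(current, earliest, 0, false) === "seek") {
      current = earliest; // assumption: the seek completes before the next update
      seeks++;
    }
  }
  return { seeks, finalPosition: current };
}

console.log(simulatePausedLoad(5, 2)); // { seeks: 5, finalPosition: 10 }
```

Under this reading every window update triggers a seek, so the playback positions behave as dynamically updating values rather than as a static snapshot.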

3. It is unclear whether there is a mechanism for the HTML page to seek to the live edge. For seeks before the earliest possible position, it says “If the new playback position is less than the earliest possible position, let it be that position instead.” However, for seeks to the live edge, the equivalent language is less clear: “If the new playback position is later than the end of the media resource, then let it be the end of the media resource instead.”

If the live edge is to be considered the end of the media resource, then we believe that needs to be clearly stated. In general, we might expect the latest seekable position to be a few seconds before the current ‘end of the media resource’ for a live stream, since the UA will typically require some amount of buffering. Do you agree?

Perhaps the concept of a latest seekable position could be introduced explicitly to mirror the current definition of ‘earliest possible position’. It would correspond to the end time of the last range in the seekable attribute’s TimeRanges object, if any, or the current playback position otherwise. This could be further clarified as typically being the end of the media resource for on-demand content but the live edge for live content, taking into account any buffering requirements that the UA may have. If this were introduced, then the following text:

“If the new playback position is later than the end of the media resource, then let it be the end of the media resource instead.”

could be rewritten as:

“If the new playback position is later than the latest seekable position, then let it be the latest seekable position instead.”
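The proposed definition can be sketched as a small function. This models the proposal only, not existing specification text; `RangeList` is a hypothetical stand-in for the shape of a TimeRanges object, and `makeRanges`, `latestSeekablePosition` and `clampToLatestSeekable` are illustrative names.

```typescript
// Hypothetical stand-in for the shape of a TimeRanges object.
interface RangeList {
  length: number;
  start(index: number): number;
  end(index: number): number;
}

// Helper for building test data from [start, end] pairs in ascending order.
function makeRanges(pairs: [number, number][]): RangeList {
  return {
    length: pairs.length,
    start: (i) => pairs[i][0],
    end: (i) => pairs[i][1],
  };
}

// Proposed 'latest seekable position': the end of the last seekable range,
// if any, or the current playback position otherwise.
function latestSeekablePosition(seekable: RangeList, currentPosition: number): number {
  return seekable.length > 0 ? seekable.end(seekable.length - 1) : currentPosition;
}

// The rewritten seek step: positions later than the latest seekable
// position are replaced by the latest seekable position.
function clampToLatestSeekable(requested: number, seekable: RangeList, currentPosition: number): number {
  return Math.min(requested, latestSeekablePosition(seekable, currentPosition));
}

const liveRanges = makeRanges([[3000, 3590]]);
console.log(latestSeekablePosition(liveRanges, 3100));     // 3590
console.log(latestSeekablePosition(makeRanges([]), 3100)); // 3100
console.log(clampToLatestSeekable(Infinity, liveRanges, 3100)); // 3590
```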

Comment 1 Philip Jägenstedt 2014-04-24 12:15:14 UTC

(In reply to Jon Piesing (HbbTV) from comment #0)
 
> 1. The HTML5 specification can be interpreted in two ways regarding where
> playback of live content should begin:
> 
> In one interpretation, the specification reads as requiring the various
> playback positions to always be the earliest possible position. This is not
> what users will expect for live content, although we recognise this can be
> resolved by the application seeking to the live edge if required. 
> 
> In another interpretation, it could be considered that the statement
> “Establish the media timeline for the purposes of the current playback
> position, the earliest possible position, and the initial playback position,
> based on the media data” allows a UA to begin playback at the live edge if
> “media data” indicates that this is most appropriate.  
> 
> Which interpretation is expected (or are both of these incorrect)?  Can the
> specification be clarified specifically for the case of live content?

I think this covers it: "If either the media resource or the address of the current media resource indicate a particular start time, then set the initial playback position to that time"

In other words, playback should begin wherever the format defines that it should begin.

> 2. For live content, the earliest and latest possible positions are
> continually moving. It is unclear how the following statement is to be
> interpreted in these circumstances;
> 
> “When the earliest possible position changes, then: if the current playback
> position is before the earliest possible position, the user agent must seek
> to the earliest possible position; otherwise, if the user agent has not
> fired a timeupdate event at the element in the past 15 to 250ms and is not
> still running event handlers for such an event, then the user agent must
> queue a task to fire a simple event named timeupdate at the element.”
> 
> If the various playPosition properties are set to the earliest possible
> position before playback has started then this language might require the UA
> to continually be seeking as the earliest possible position moves ahead of
> what was the current play position.
> 
> One example of when this could occur is if the HTML page calls the load
> method (but not play) on a piece of live content. What values are assigned to
> the various playback positions under these conditions?
> - None?
> - A static snapshot at the time the method was called?
> - Dynamic updating values? 
> 
> If the UA is continually seeking as described above then it would seem to be
> the last of these however we believe that should be clearly stated in the
> specification.

Per the bit of the spec that you quoted, the UA would continuously be trying to seek to catch up. Depending on how often the earliest possible position changes, it could even end up in a situation where no seek is finished before the next starts, which would cause lots of seeking events to be fired, but no timeupdate or seeked events. That seems pretty weird, but is there a better alternative?

> 3. It is unclear whether there is a mechanism for the HTML page to seek to
> the live edge. For seeks before the earliest possible position, it says “If
> the new playback position is less than the earliest possible position, let
> it be that position instead.” However for seeks to the live edge, the
> equivalent language is less clear, “If the new playback position is later
> than the end of the media resource, then let it be the end of the media
> resource instead.”

The position will be clamped to the seekable ranges in a later step: "If the (possibly now changed) new playback position is not in one of the ranges given in the seekable attribute, then let it be the position in one of the ranges given in the seekable attribute that is the nearest to the new playback position."

Given that, is there still a problem?
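The clamping step quoted above can be sketched as follows. This is a simplified model, not the specification's algorithm verbatim: `SeekRanges`, `rangesOf` and `nearestSeekablePosition` are hypothetical names, ranges are assumed to be ordered and non-overlapping, an exact tie between two ranges is resolved in favour of the earlier range rather than by any rule from the specification, and an empty range list is modelled by returning `undefined`.

```typescript
// Hypothetical stand-in for the shape of a TimeRanges object.
interface SeekRanges {
  length: number;
  start(index: number): number;
  end(index: number): number;
}

// Helper for building test data from [start, end] pairs in ascending order.
function rangesOf(pairs: [number, number][]): SeekRanges {
  return {
    length: pairs.length,
    start: (i) => pairs[i][0],
    end: (i) => pairs[i][1],
  };
}

// Returns the position inside the seekable ranges that is nearest to the
// requested position, or undefined if nothing is seekable.
function nearestSeekablePosition(requested: number, seekable: SeekRanges): number | undefined {
  if (seekable.length === 0) return undefined;
  // Clamp to the overall extent first so that requests such as Infinity
  // produce finite distances in the loop below.
  const lo = seekable.start(0);
  const hi = seekable.end(seekable.length - 1);
  const target = Math.min(Math.max(requested, lo), hi);
  let best = lo;
  let bestDistance = Infinity;
  for (let i = 0; i < seekable.length; i++) {
    const candidate = Math.min(Math.max(target, seekable.start(i)), seekable.end(i));
    const distance = Math.abs(candidate - target);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

const liveSeekable = rangesOf([[3000, 3590]]);
console.log(nearestSeekablePosition(Infinity, liveSeekable)); // 3590
console.log(nearestSeekablePosition(0, liveSeekable));        // 3000
```

In a page, the same effect would be expected from setting `video.currentTime` to `video.seekable.end(video.seekable.length - 1)`, which requests the end of the last seekable range directly.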

Comment 2 Silvia Pfeiffer 2014-05-12 06:07:43 UTC

(In reply to Jon Piesing (HbbTV) from comment #0)
> In general, we might expect the latest
> seekable position to be a few seconds before the current ‘end of the media
> resource’ for a live stream since the UA will typically require some amount
> of buffering.  Do you agree?

Yes.

> Perhaps the concept of a latest seekable position could be introduced
> explicitly to mirror the current definition of ‘earliest possible position’.
> It would correspond to the end time of the last range in the seekable
> attribute’s TimeRanges object, if any, or the current playback position
> otherwise.

As Philip explained, the seek position is clamped to the available time ranges, so this is already in the spec.

It seems that all your requirements are already satisfied and that live content is sufficiently supported in the existing spec.

Comment 3 Michael[tm] Smith 2015-06-21 05:03:19 UTC

Closing per comment 2:

> As Philip explained, the seek position is clamped to the available time
> ranges, so this is already in the spec.
> 
> It seems that all your requirements are already satisfied and that live
> content is sufficiently supported in the existing spec.