24863 – Per-track metadata for video and audio tracks

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Summary: Per-track metadata for video and audio tracks

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	This bug has no owner yet - up for the taking
QA Contact:	HTML WG Bugzilla archive list

Reported:	2014-02-28 16:03 UTC by Jon Piesing (HbbTV)
Modified:	2015-06-21 09:05 UTC
CC List:	8 users

Description Jon Piesing (HbbTV) 2014-02-28 16:03:48 UTC

This issue is raised on behalf of HbbTV - see http://www.hbbtv.org, an organisation specifying the use of web technologies in television receivers. HbbTV is in the process of adding the HTML5 video element to its specification. The current HbbTV specification uses the <object> element for presenting video in an HTML page.

The HbbTV specification based on the object element provides a much richer set of information about videotracks and audiotracks than is found in HTML5. As well as kind, language and id, which are in HTML5, some examples that are not included in HTML5 are the following:
- encoding (e.g. AVC vs HEVC or HE-AAC vs Dolby vs DTS)
- number of audio channels (stereo vs 5.1 vs 7.1)
- whether the track is encrypted

Apps may use this information to decide what tracks to present. This is particularly critical for content streamed via MPEG DASH or pushed via the broadcast channel. In both cases the content provider may offer multiple video and audio tracks which have identical "kind" and "language" and differ only in one of the above. The 'id' will obviously differ between such tracks, but requiring encoding, channels or encryption to be determined from the 'id' would likely introduce significant issues in many content providers' workflows.

We have considered using the HTML5 extension mechanisms to add properties to VideoTrack and AudioTrack, but this is not popular among our members for a variety of reasons.

We would prefer to avoid having to explain to content providers that the video element can only be used for the simplest use-cases involving MPEG DASH and content pushed via the broadcast channel and that otherwise they must use the object element as defined in our current specification.

We would appreciate your feedback on whether there are other ways to expose this sort of information to apps. At what point in the future HTML5 process would it become possible to discuss adding additional properties to AudioTrack and VideoTrack?

Comment 1 Bob Lund 2014-02-28 16:50:54 UTC

The Inband tracks community group [1] is working to define proposals for addressing this and related issues. I encourage you to participate in the discussion [2].

Bob Lund

[1] http://www.w3.org/community/inbandtracks/
[2] http://lists.w3.org/Archives/Public/public-inbandtracks/

Comment 2 Silvia Pfeiffer 2014-03-10 07:23:10 UTC

(In reply to Jon Piesing (HbbTV) from comment #0)
> This issue is raised on behalf of HbbTV - see http://www.hbbtv.org, an
> organisation specifying the use of web technologies in television receivers.
> HbbTV is in the process of adding the HTML5 video element to its
> specification. The current HbbTV specification uses the <object> element for
> presenting video in an HTML page.
> 
> The HbbTV specification based on the object element provides a much richer
> set of information about videotracks and audiotracks than is found in HTML5.
> As well as kind, language and id, which are in HTML5, some examples that are
> not included in HTML5 are the following:
> - encoding (e.g. AVC vs HEVC or HE-AAC vs Dolby vs DTS)
> - number of audio channels (stereo vs 5.1 vs 7.1)
> - whether the track is encrypted

If a Web developer needs such information about files, they can always add it to a list that they provide to the app (e.g. a JSON list of file names and these properties). Then, once the Web page is loaded on the client, the app can check what the browser supports and pick the right file (e.g. which codecs via canPlayType(), and encryption support via MediaKeys from the Encrypted Media Extensions, see e.g. http://www.html5rocks.com/en/tutorials/eme/basics/).
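
As a rough illustration of that approach, the sketch below picks one entry from a provider-supplied list. The TrackInfo shape, the pickTrack name and the prefs fields are all hypothetical; only the three canPlayType() return values ("", "maybe", "probably") come from HTML5. The canPlayType function is passed in so that the logic does not depend on a DOM, and allowEncrypted stands in for whatever EME check the page performs itself.

```typescript
// Hypothetical shape of one entry in the provider-supplied JSON list.
interface TrackInfo {
  url: string;
  mimeType: string; // e.g. 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"'
  audioChannels: number; // e.g. 2 for stereo, 6 for 5.1
  encrypted: boolean;
}

// The three values HTMLMediaElement.canPlayType() can return.
type CanPlay = "" | "maybe" | "probably";

// Pick the entry the UA is most likely to play, preferring more audio
// channels when two entries are equally likely. Returns undefined if
// nothing in the list is playable under the given preferences.
function pickTrack(
  tracks: TrackInfo[],
  canPlayType: (type: string) => CanPlay,
  prefs: { allowEncrypted: boolean; maxChannels: number },
): TrackInfo | undefined {
  const rank: Record<CanPlay, number> = { "": 0, maybe: 1, probably: 2 };
  let best: TrackInfo | undefined;
  let bestRank = 0;
  for (const track of tracks) {
    if (track.encrypted && !prefs.allowEncrypted) continue;
    if (track.audioChannels > prefs.maxChannels) continue;
    const r = rank[canPlayType(track.mimeType)];
    if (r === 0) continue; // the UA says it cannot play this type
    const better =
      best === undefined ||
      r > bestRank ||
      (r === bestRank && track.audioChannels > best.audioChannels);
    if (better) {
      best = track;
      bestRank = r;
    }
  }
  return best;
}
```

In a page this might be called as pickTrack(list, (t) => video.canPlayType(t), prefs), where video is an HTMLVideoElement and list is the parsed JSON.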


> Apps may use this information to decide what tracks to present. This is
> particularly critical for content streamed via MPEG DASH or pushed via the
> broadcast channel. In both cases the content provider may offer multiple
> video and audio tracks which have identical "kind" and "language" and differ
> only in one of the above.

If you're using DASH, the developer has to download the DASH manifest file first and can get all the information about encryption, number of channels, encoding, kind, id and language from it. They can then decide which file to give to the UA to decode.
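
A minimal sketch of reading those properties from an already parsed manifest follows. The element and attribute names (AdaptationSet, Representation, ContentProtection, AudioChannelConfiguration, mimeType, codecs, lang, id) are those of the DASH MPD format; the XmlElementLike interface, the RepresentationSummary shape and the function names are assumptions made for illustration. The channel value is returned as the raw string, because its meaning depends on the descriptor's schemeIdUri.

```typescript
// Minimal structural view of an XML element; real DOM Elements satisfy it.
interface XmlElementLike {
  getAttribute(name: string): string | null;
  getElementsByTagName(name: string): ArrayLike<XmlElementLike>;
}

// Hypothetical summary of one Representation in the manifest.
interface RepresentationSummary {
  id: string | null;
  lang: string | null;
  contentType: string | null; // MIME type plus codecs parameter, if both are known
  audioChannels: string | null; // raw descriptor value; interpretation is scheme-specific
  encrypted: boolean;
}

// First descendant with the given tag name, or null if there is none.
function firstDescendant(parent: XmlElementLike, tag: string): XmlElementLike | null {
  const found = parent.getElementsByTagName(tag);
  return found.length > 0 ? found[0] : null;
}

function summarizeManifest(mpd: XmlElementLike): RepresentationSummary[] {
  const out: RepresentationSummary[] = [];
  for (const set of Array.from(mpd.getElementsByTagName("AdaptationSet"))) {
    for (const rep of Array.from(set.getElementsByTagName("Representation"))) {
      // Attributes may sit on the Representation or on its AdaptationSet.
      const attr = (name: string): string | null =>
        rep.getAttribute(name) ?? set.getAttribute(name);
      const mime = attr("mimeType");
      const codecs = attr("codecs");
      const channels =
        firstDescendant(rep, "AudioChannelConfiguration") ??
        firstDescendant(set, "AudioChannelConfiguration");
      out.push({
        id: rep.getAttribute("id"),
        lang: attr("lang"),
        contentType: mime && codecs ? `${mime}; codecs="${codecs}"` : mime,
        audioChannels: channels ? channels.getAttribute("value") : null,
        // Simplification: any ContentProtection descriptor anywhere inside
        // the set marks all of its Representations as encrypted.
        encrypted: set.getElementsByTagName("ContentProtection").length > 0,
      });
    }
  }
  return out;
}
```

In a browser the argument could be the documentElement of new DOMParser().parseFromString(mpdText, "application/xml"), where mpdText is the downloaded manifest.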

Note, however, that this doesn't mean that the UA will be able to decode that file, since different UAs support different formats. The developer will have to use the canPlayType() function to find out what the UA supports:
http://www.w3.org/html/wg/drafts/html/master/embedded-content.html#dom-navigator-canplaytype


> We have considered using the HTML5 extension mechanisms to add properties
> to VideoTrack and AudioTrack, but this is not popular among our members for a
> variety of reasons.

Agreed. I am not convinced there is a need for an extension yet.


> We would prefer to avoid having to explain to content providers that the
> video element can only be used for the simplest use-cases involving MPEG
> DASH and content pushed via the broadcast channel and that otherwise they
> must use the object element as defined in our current specification.

Does what I explained above satisfy your use case?

Comment 3 Silvia Pfeiffer 2015-06-20 03:20:31 UTC

I'd be inclined to close this. There is a spec at http://dev.w3.org/html5/html-sourcing-inband-tracks/