This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 24859 - Automatic Video and Audio Track Selection Based on User Preferences and Terminal Characteristics
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords: a11y, media
Depends on: 24895
Blocks:
Reported: 2014-02-28 15:44 UTC by Jon Piesing (OIPF)
Modified: 2014-05-12 11:48 UTC
CC List: 10 users

Description Jon Piesing (OIPF) 2014-02-28 15:44:52 UTC
This issue is raised on behalf of HbbTV - see http://www.hbbtv.org, an organisation specifying the use of web technologies in television receivers. HbbTV is in the process of adding the HTML5 video element to its specification. The current HbbTV specification uses the <object> element for presenting video in an HTML page.

The HTML5 spec defines automatic track selection based on user preferences for text tracks as follows:

http://www.w3.org/TR/html5/embedded-content-0.html#perform-automatic-text-track-selection

Specifically:

"If the user has expressed an interest in having a track from candidates enabled based on its text track kind, text track language, and text track label,"

There is no equivalent language for video and audio tracks. Further, for video tracks and audio tracks, the resource fetch algorithm includes the following:

"If either the media resource or the address of the current media resource indicate a particular set of audio or video tracks to enable, then the selected audio tracks must be enabled in the element's audioTracks object, and, of the selected video tracks, the one that is listed first in the element's videoTracks object must be selected. All other tracks must be disabled."

The last sentence of this can be read as prohibiting the UA from selecting video or audio tracks based on user preferences and/or terminal characteristics.
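
To make that reading concrete, here is a minimal sketch of the quoted rule taken literally. The `AudioTrackLike` and `VideoTrackLike` types, the `applyIndicatedTracks` function, and the `indicatedIds` parameter are hypothetical stand-ins, not names from the HTML5 spec, and the sketch assumes the media resource or its address has in fact indicated a set of tracks.

```typescript
// Simplified stand-ins for entries of the element's audioTracks and videoTracks lists.
interface AudioTrackLike { id: string; enabled: boolean; }
interface VideoTrackLike { id: string; selected: boolean; }

// Literal reading of the quoted rule: enable every indicated audio track,
// select only the first indicated video track in list order, and disable
// or deselect everything else.
function applyIndicatedTracks(
  audioTracks: AudioTrackLike[],
  videoTracks: VideoTrackLike[],
  indicatedIds: ReadonlySet<string>, // hypothetical: ids named by the resource or its address
): void {
  for (const track of audioTracks) {
    track.enabled = indicatedIds.has(track.id);
  }
  let videoChosen = false;
  for (const track of videoTracks) {
    const choose = !videoChosen && indicatedIds.has(track.id);
    track.selected = choose;
    if (choose) videoChosen = true;
  }
}
```

Under this reading, nothing outside `indicatedIds` can end up enabled, which is why a track the UA would prefer for other reasons appears to be ruled out.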

We have two use-cases where automatic track selection based on user preferences and other terminal characteristics for video and audio tracks is important.

1) In TV receivers, users can typically set a range of preferences in the receiver for preferred audio language, preferred subtitle language and other audio and subtitle preferences related to accessibility (e.g. a preference for so-called "clean" audio or for audio tracks including description). These preferences are held in the receiver and applied automatically by the receiver for broadcast television. The user can also change these dynamically, for example depending on the sound track of a particular TV show. In our current specification, the media player underlying the <object> element follows these preferences automatically unless the HTML page over-rides them. This particular use-case is closely related to audio. It is generally believed that a single consistent way of setting these kinds of preferences for TV (regardless of how it is delivered) benefits users.

2) Our current specification supports MPEG DASH for HTTP adaptive streaming. The MPEG DASH manifest file (MPD) can contain multiple "adaptation sets" (sets of interchangeable encoded versions of a media component) which can differ in terms of codec, DRM, language, kind (called "role" by MPEG) and for audio, number of channels, e.g. stereo / 5.1 / 7.1. In our current specification, the DASH player automatically selects which video and audio adaptation set to present based on user preferences and terminal characteristics such as supported codec / DRM and the number of audio channels on the output - stereo, 5.1 or 7.1. The adaptation sets are made visible to apps as our equivalent of VideoTracks and AudioTracks so that an app can over-ride the automatic selection if it so desires. We are deliberately not very prescriptive about how the automatic track selection works and there are a number of situations when an app might want to over-ride it.
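
As an illustration of the two use-cases above, the following sketch shows one possible automatic selection policy for audio. It is not taken from the HbbTV or HTML5 specifications, which deliberately leave the policy open. All names (`AudioCandidate`, `Terminal`, `autoSelectAudio`, `effectiveSelection`), the codec labels, the scoring weights, and the choice to treat output channel count as a hard limit rather than downmixing are assumptions made for the example.

```typescript
// Hypothetical model of one audio adaptation set exposed to the app as a track.
interface AudioCandidate {
  id: string;
  language: string;   // e.g. "en", "de"
  codec: string;      // illustrative labels such as "aac" or "eac3"
  channels: number;   // 2 for stereo, 6 for 5.1, 8 for 7.1
  described: boolean; // true if the track includes audio description
}

// Hypothetical receiver-wide user preferences and terminal characteristics.
interface Terminal {
  preferredLanguage: string;
  wantsAudioDescription: boolean;
  supportedCodecs: ReadonlySet<string>;
  outputChannels: number;
}

// Returns the id of the best candidate, or undefined if none is playable.
function autoSelectAudio(candidates: AudioCandidate[], terminal: Terminal): string | undefined {
  // Terminal characteristics act as hard filters (an assumption of this sketch).
  const playable = candidates.filter(
    (c) => terminal.supportedCodecs.has(c.codec) && c.channels <= terminal.outputChannels,
  );
  // User preferences act as soft scores; a higher channel count breaks ties.
  const score = (c: AudioCandidate): number =>
    (c.language === terminal.preferredLanguage ? 100 : 0) +
    (c.described === terminal.wantsAudioDescription ? 10 : 0) +
    c.channels;
  let best: AudioCandidate | undefined;
  for (const c of playable) {
    if (best === undefined || score(c) > score(best)) best = c;
  }
  return best?.id;
}

// An explicit choice by the app wins over the automatic one.
function effectiveSelection(automatic: string | undefined, appOverride?: string): string | undefined {
  return appOverride ?? automatic;
}
```

The split between hard filters and soft scores is one design choice among many. A real receiver might, for example, downmix 5.1 to stereo instead of rejecting the track.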

Given these use-cases, we would appreciate your feedback about whether the HTML5 specification does indeed exclude automatic track selection for video and audio tracks based on user preferences and terminal characteristics. If this is excluded, can you please say if this is deliberate or accidental? If it's deliberate, can you please share the reasons for this? 

If it is excluded, can you please consider removing this exclusion?
Comment 1 Silvia Pfeiffer 2014-03-02 03:38:21 UTC
I don't think any of what you are asking for is "excluded" in the HTML spec. In fact, some of it is explicitly mentioned.

For example, the spec encourages user agents to expose a user interface of some sort (e.g. a menu or a remote control) through which viewers can select between different available audio and video tracks for a video in
http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#expose-a-user-interface-to-the-user .

Re Use Case 1:
How are you delivering "clean audio" tracks and "audio tracks with audio descriptions" to TV receivers through the <object> tag in a way that they are uniquely identifiable right now?

Re Use Case 2:
I don't understand what you are asking for. I assume you're asking if the user agent according to the HTML spec can auto-activate specific audio and video tracks based on user preferences and terminal characteristics? If so, the answer is: yes, absolutely.
Comment 2 Jon Piesing (HbbTV) 2014-03-02 11:08:12 UTC
>I don't think any of what you are asking for is "excluded" in the HTML spec. In fact, some of it is explicitly mentioned.

That's great to hear but the text quoted in the issue suggests the opposite. If that's wrong then it should be fixed.

>For example, the spec encourages user agents to expose a user interface of some sort (e.g. a menu or a remote control) through which viewers can select between different available audio and video tracks for a video in
>http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#expose-a-user-interface-to-the-user .

This issue isn't about user input while the content is presenting but about user preferences set before the content starts being presented, perhaps even when the user was watching broadcast TV via a different part of the TV set.

>Re Use Case 1:
>How are you delivering "clean audio" tracks and "audio tracks with audio descriptions" to TV receivers through the <object> tag in a way that they are uniquely identifiable right now?

Clean audio and audio description (etc) are signalled as part of the media resource and then acted on by the media player in the TV by default. With the object element, we have our own JavaScript API for media playback using the object element that pre-dates the HTML5 video element. This exposes more metadata about all types of track than the HTML5 media elements (see #24863). 

>Re Use Case 2:
>I don't understand what you are asking for. I assume you're asking if the user agent according to the HTML spec can auto-activate specific audio and video tracks based on user preferences and terminal characteristics? If so, the answer is: yes, absolutely.

That's great news but as I said above, the text quoted in the issue suggests the opposite. If that's wrong then it should be fixed.
Comment 3 Silvia Pfeiffer 2014-03-03 00:12:49 UTC
From what you're saying, I assume you're concerned about audio and video tracks activated based on
1. Preferences given in-band in the media files
2. User preferences set in the browser


Re 1.
The second paragraph that you are citing covers that use case. It explicitly says (I only quote the relevant words):

"If the media resource indicates a particular set of audio or video tracks to enable, then the selected audio tracks must be enabled in the element's audioTracks object, and, of the selected video tracks, the one that is listed first in the element's videoTracks object must be selected. All other tracks must be disabled."


Re 2.
I agree that the spec could be more explicit on this. However, user preferences are a feature of browsers and are optional - browsers can be implemented without any user preference settings and would still be conformant to HTML. So, nothing about user preferences will be normative in the spec. The spec only has to be accurate in how it reacts to audio or video tracks that have been activated/deactivated by the browser. I can see how you might want to see this expressed more explicitly.
Comment 4 Jon Piesing (HbbTV) 2014-03-03 07:26:16 UTC
>From what you're saying, I assume you're concerned about audio and video tracks activated based on
>1. Preferences given in-band in the media files
>2. User preferences set in the browser

Specifically that the language for the first of these (preferences given in-band in the media files) can be interpreted as excluding the second.

"If the media resource indicates a particular set of audio or video tracks to enable, then the selected audio tracks must be enabled in the element's audioTracks object, and, of the selected video tracks, the one that is listed first in the element's videoTracks object must be selected. All other tracks must be disabled."

That does not refer to user preferences set in the browser and can be interpreted as excluding enabling tracks on that basis - "All other tracks must be disabled".

>So, nothing about user preferences will be normative in the spec.

Clear

> The spec only has to be accurate in how it reacts to audio or video tracks that have been activated/deactivated by the browser.

Yes.

> I can see how you might want to see this expressed more explicitly.

Avoiding a situation where reasonable people interpret this as being excluded would be a start.
Comment 5 Silvia Pfeiffer 2014-03-03 08:31:45 UTC
I've started a discussion in the WHATWG to get input from there: https://www.w3.org/Bugs/Public/show_bug.cgi?id=24895

I think I've represented your issue accurately.
Comment 6 Silvia Pfeiffer 2014-04-27 00:59:26 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:

   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description:
https://github.com/w3c/html/commit/ea7c2c9f1f90d6cd3e3f7296ac15bff9000834ea
and
https://github.com/w3c/html/commit/0dc300922bb823aa5bb51c1d55e222da370227ca

Rationale: More explicitly allow the possibility of the browser enabling specific audio or video tracks from UA preferences.
Comment 7 Jon Piesing (HbbTV) 2014-04-28 08:55:28 UTC
It's unclear to me from the Git reference whether this has been applied to HTML5.1 or to the HTML5 CR.

If it has not been applied to the HTML5 CR, can it please be applied there?
Comment 8 Silvia Pfeiffer 2014-04-28 09:46:23 UTC
Yes we can.
Comment 9 Silvia Pfeiffer 2014-05-12 11:48:55 UTC
(In reply to Silvia Pfeiffer from comment #8)
> Yes we can.

OK, the commits have been cherry-picked for CR:
https://github.com/w3c/html/commit/319d3bbd4b6bcd53cca64ceddf146f6d7048425d
and
https://github.com/w3c/html/commit/28da55976d3363d5ad3f94547bdcffd7085aab2a