This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 24860 - When can user agents honor the user preferences for automatic text track selection
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 editorial
Target Milestone: ---
Assignee: steve faulkner
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords: a11y, media
Duplicates: 24862
Depends on: 28828 28829
Blocks:
Reported: 2014-02-28 15:49 UTC by Jon Piesing (HbbTV)
Modified: 2016-04-25 15:30 UTC
CC List: 15 users

See Also:
chaals: needinfo? (robin)


Description Jon Piesing (HbbTV) 2014-02-28 15:49:21 UTC
This issue is raised on behalf of HbbTV - see http://www.hbbtv.org, an organisation specifying the use of web technologies in television receivers. HbbTV is in the process of adding the HTML5 video element to its specification. The current HbbTV specification uses the <object> element for presenting video in an HTML page.

The current HTML5 specification defines two circumstances under which automatic track selection for text tracks happens.

1) "When a media element is popped off the stack of open elements of an HTML parser or XML parser, the user agent must honor user preferences for automatic text track selection, populate the list of pending text tracks, and set the element's blocked-on-parser flag to false." and

2) "When a text track corresponding to a track element is added to a media element's list of text tracks, the user agent must queue a task to run the following steps for the media element:" ... "Honor user preferences for automatic text track selection for this element."

This language can be interpreted as only permitting user agents to honor "the user preferences for automatic text track selection" in these two circumstances. It can also be interpreted as saying that, while it is required in these two circumstances, it could happen at other times as well.

We have a use-case for user preferences to be honored at other times. Specifically, TV receivers normally come with a remote control that includes a subtitle button. The user can press this button at any time. When watching normal TV, this will either 1) toggle subtitles on or off or 2) bring up a TV-receiver-specific UI that lets the user set subtitle preferences. Many believe that the user should get a consistent experience when they press the subtitle button regardless of whether the video being shown is classic broadcast TV or video presented via an HTML5 video element. This requires that the user agent can disable and enable text tracks at the request of the user without the app being involved.

Given this use-case, we would appreciate your feedback about whether the HTML5 specification permits honoring the user preferences for automatic text track selection in circumstances other than the two listed in the specification, such as the end-user pressing a subtitle button on a TV remote control.

If it doesn't permit honoring the user preferences in other circumstances, can it be changed to permit this?
Comment 1 Silvia Pfeiffer 2014-03-02 03:14:41 UTC
(In reply to Jon Piesing (HbbTV) from comment #0)
> This issue is raised on behalf of HbbTV - see http://www.hbbtv.org, an
> organisation specifying the use of web technologies in television receivers.
> HbbTV is in the process of adding the HTML5 video element to its
> specification. The current HbbTV specification uses the <object> element for
> presenting video in an HTML page.
> 
> The current HTML5 specification defines two circumstances under which
> automatic track selection for text tracks happens.
> 
> 1) "When a media element is popped off the stack of open elements of an HTML
> parser or XML parser, the user agent must honor user preferences for
> automatic text track selection, populate the list of pending text tracks,
> and set the element's blocked-on-parser flag to false." and
> 
> 2) "When a text track corresponding to a track element is added to a media
> element's list of text tracks, the user agent must queue a task to run the
> following steps for the media element:" ... "Honor user preferences for
> automatic text track selection for this element."
> 
> This language can be interpreted as only permitting user agents to honor
> "the user preferences for automatic text track selection" in these two
> circumstances. It can also be interpreted as saying that, while it is
> required in these two circumstances, it could happen at other times as well.

It happens only where the spec explicitly tells it to happen. If there are other times when it should happen and those other times are not explained in the spec, then the spec needs updating.


> We have a use-case for user preferences to be honored at other times.
> Specifically, TV receivers normally come with a remote control that includes
> a subtitle button. The user can press this button at any time.

That is a different use case from the ones you list above. In HTML5, there are "user preferences" and there is "user interaction". What you are describing here is "user interaction". "User preferences" as described above typically map to browser settings.

What you are asking for, therefore, is a reaction to a user interaction via a control of some sort (in browsers it's typically an overlay menu from the video controls; for a TV it could be the remote control). That is covered in this paragraph:

"When a text track corresponding to a track element experiences any of the following circumstances, the user agent must start the track processing model for that text track and its track element:

* The track element is created.
* The text track has its text track mode changed.
* The track element's parent element changes and the new parent is a media element.
"

In particular, when toggling subtitles off, the browser will set the "text track mode" of all subtitle tracks to "disabled", and thus in the "time marches on" algorithm, the cues of all disabled tracks will not be shown.

Your use case is therefore already covered by the spec and no change is needed.
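
A minimal sketch of the toggle described above, written against a small structural type rather than the real DOM so it can run anywhere. The names `TrackLike`, `toggleSubtitles` and `preferredLanguage` are hypothetical and not taken from the spec; the only spec concepts used are the track kind and the "disabled"/"showing" text track modes.

```typescript
// Hypothetical minimal shape of a text track; a real TextTrack has more fields.
type TrackMode = "disabled" | "hidden" | "showing";
interface TrackLike {
  kind: string;
  mode: TrackMode;
  language: string;
}

// If any subtitle/caption track is showing, disable them all (subtitles off).
// Otherwise show one track, preferring the given language (subtitles on).
// Returns true when subtitles end up on.
function toggleSubtitles(tracks: TrackLike[], preferredLanguage?: string): boolean {
  const subs = tracks.filter(t => t.kind === "subtitles" || t.kind === "captions");
  const anyShowing = subs.some(t => t.mode === "showing");
  for (const t of subs) {
    t.mode = "disabled";
  }
  if (anyShowing || subs.length === 0) {
    return false;
  }
  // Fall back to the first subtitle/caption track when no language matches.
  const pick = subs.filter(t => t.language === preferredLanguage)[0] || subs[0];
  if (pick) {
    pick.mode = "showing";
  }
  return pick !== undefined;
}
```

In a browser, the argument could be something like `Array.from(video.textTracks)`; whether a TV receiver would run this logic in the user agent or in the page is exactly the question this bug discusses.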
Comment 2 Silvia Pfeiffer 2014-03-02 03:16:03 UTC
*** Bug 24862 has been marked as a duplicate of this bug. ***
Comment 3 Silvia Pfeiffer 2014-03-02 03:30:48 UTC
To be even clearer: your use case is satisfied by http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#expose-a-user-interface-to-the-user .
Comment 4 Jon Piesing (HbbTV) 2014-03-02 10:54:24 UTC
The URL given doesn't work. Is this a reference to section 4.7.10.13 "User interface" and the controls attribute?

If it's a reference to the controls attribute then that has numerous issues in a TV environment. TV apps generally do not include the controls attribute to avoid those issues and the subtitle key would need to work when that attribute is not present.

>In particular, when toggling subtitles on/off, the browser will set the "text track mode" of all subtitles tracks to "disabled" and thus in the "time marches on" algorithm, all disabled tracks cues will not be shown.

That is great but doesn't address the point of whether a browser mechanism to control subtitles is permitted when the controls attribute is not present. If the contributors to this part of the spec believe that this is supposed to be permitted then it should be made clearer in the spec.
Comment 5 Silvia Pfeiffer 2014-03-03 00:56:18 UTC
(In reply to Jon Piesing (HbbTV) from comment #4)
> The URL given doesn't work. Is this a reference to section 4.7.10.13 "User
> interface" and the controls attribute?

Try this one:
http://www.w3.org/html/wg/drafts/html/master/embedded-content.html#expose-a-user-interface-to-the-user

> If it's a reference to the controls attribute then that has numerous issues
> in a TV environment. TV apps generally do not include the controls attribute
> to avoid those issues and the subtitle key would need to work when that
> attribute is not present.


To avoid what issues?

I don't understand. How can you have UA provided controls when you disallow the @controls attribute?

Also, if you disallow the browser from creating controls, but then provide your own on top, then it's up to you to expose such a user interface and take care of providing the features that the @controls attribute would provide. "You" being whatever interface you are adding on top of the browser. If there is some sort of "middleware" that provides remote control mapping to the browser, but is not part of what you call an "application", then it's up to that part to make sure that the right tracks are activated/deactivated. From the HTML spec's point of view, that's still an "application".


> >In particular, when toggling subtitles on/off, the browser will set the "text track mode" of all subtitles tracks to "disabled" and thus in the "time marches on" algorithm, all disabled tracks cues will not be shown.
> 
> That is great but doesn't address the point of whether a browser mechanism
> to control subtitles is permitted when the controls attribute is not
> present.

Of course it is. The section that I referred to explicitly says:

"If the attribute is present, or if scripting is disabled for the media element, then the user agent should expose a user interface to the user. "

Notice that it explicitly mentions "scripting is disabled" as another situation when the browser should expose a UI.

The issue here is that with HTML pages and video elements that have no @controls attribute, you might find that the page author has created the controls through JavaScript and is rendering them on top of the video elements. I don't know how you want to handle that in the set-top box case. Do you want to disable such controls and render some default controls of the TV that include the remote control access? Or do you want to have the remote control also activate/deactivate the subtitles? Both are possible. Neither is disallowed in the spec.
Comment 6 Jon Piesing (HbbTV) 2014-03-03 15:11:40 UTC
>> If it's a reference to the controls attribute then that has numerous issues
>> in a TV environment. TV apps generally do not include the controls attribute
>> to avoid those issues and the subtitle key would need to work when that
>> attribute is not present.

>To avoid what issues?

TV app developers (almost?) always provide their own controls for several reasons;
a) they get to control the visual appearance (if any) so it fits with the rest of their UI
b) they can ensure that the app has control over the VCR-style features, pause/resume, fast-forwards/fast-reverse, jump forwards/backwards. This is particularly critical for broadcasters funded by advertising.

>I don't understand. How can you have UA provided controls when you
> disallow the @controls attribute?

That depends on what functionality provided by the UA would be disabled when the controls attribute is absent.

This may sound cryptic but an example is audio volume control. Audio volume control would probably work regardless of whether the controls attribute is present or not.

>Also, if you disallow the browser from creating controls, but then provide 
>your own on top, then it's up to you to expose such a user interface and
> take care of the providing the features that the @controls attribute
> would provide.

I agree 100% but an app has to make assumptions about what features the @controls attribute would provide in order to take care of providing those features. Also the UA needs to include the functionality to enable the app to provide those features. This may not be the case for all possible features that the UA could provide in which case setting @controls had probably better not disable those features from the UA.

> "You" being whatever interface you are adding on top of the browser.
> If there is some sort of "middleware" that provides remote control
> mapping to the browser, but is not part of what you call an "application",
> then its up to that part to make sure that the right tracks are 
> activated/deactivated. From the HTML spec's point of view, that's still
> an "application".

This is also true from HbbTV's point of view.

<snip>

>The issue here is that with HTML pages and video elements that have no 
>@controls attribute, you might find that the page author has created the
>controls through JavaScript and is rendering them on top of the video elements.

TV apps would normally handle the remote control keys for pause/resume/ff/frew/jump (etc) themselves.

They may do this without rendering any visible UI though.

> I don't know how you want to handle that in the set-top box case.
> Do you want to disable such controls and render some default controls
> of the TV that include the remote control access?

This would be completely unacceptable to TV app developers for the reasons given at the start of this comment.

> Or do you want to have the remote control also activate/deactivate
> the subtitles? Both is possible. Neither is disallowed in the spec.

Well the spec can be interpreted as disallowing control of subtitles. Please see the fragments of the text quoted in the original report.
Comment 7 Silvia Pfeiffer 2014-03-10 06:52:22 UTC
(In reply to Jon Piesing (HbbTV) from comment #6)
> >> If it's a reference to the controls attribute then that has numerous issues
> >> in a TV environment. TV apps generally do not include the controls attribute
> >> to avoid those issues and the subtitle key would need to work when that
> >> attribute is not present.
> 
> >To avoid what issues?
> 
> TV app developers (almost?) always provide their own controls for several
> reasons;
> a) they get to control the visual appearance (if any) so it fits with the
> rest of their UI

So you overrule the adaptation that Web developers have already made to make sure that their controls fit visually with the rest of the Web page?

Are we talking about Web pages or native TV applications here?


> b) they can ensure that the app has control over the VCR-style features,
> pause/resume, fast-forwards/fast-reverse, jump forwards/backwards. This is
> particularly critical for broadcasters funded by advertising.

All UA-provided controls provide these features.


> >I don't understand. How can you have UA provided controls when you
> > disallow the @controls attribute?
> 
> That depends on what functionality provided by the UA would be disabled when
> the controls attribute is absent.

None. The @controls attribute does not disable or enable any features. It only exposes some features visually with buttons and sliders to the user that otherwise are only available to script.


> >Also, if you disallow the browser from creating controls, but then provide 
> >your own on top, then it's up to you to expose such a user interface and
> > take care of the providing the features that the @controls attribute
> > would provide.
> 
> I agree 100% but an app has to make assumptions about what features the
> @controls attribute would provide in order to take care of providing those
> features.

What I don't understand is what an "app developer" is in this environment. To me, an "app developer" is the person who wrote the Web page and the layout and the graphical design in HTML, CSS and JavaScript and has made sure that all the functionality is there that is required to interact with the Web page. If they decide to use @controls, then the UA gets to render the controls. If they decide to do it themselves, they don't use @controls. Is that the same "app developer" you are talking about?

 
> TV apps would normally handle the remote control keys for
> pause/resume/ff/frew/jump (etc) themselves.
> 
> They may do this without rendering any visible UI though.

That's perfectly fine. A remote control is just a different input device to a keyboard. It will still create key events:
https://dvcs.w3.org/hg/d4e/raw-file/tip/source_respec.htm#constructor-keyboardevent
There is a whole section on media keys in that spec:
https://dvcs.w3.org/hg/d4e/raw-file/tip/source_respec.htm#key-media
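
A minimal sketch of handling a remote-control key as an ordinary key event. It assumes the receiver delivers the subtitle button to the page with a `key` value of "Subtitle" or "ClosedCaptionToggle"; which value a given device reports, and whether the key reaches the page at all, is an assumption. `KeyLike`, `SUBTITLE_KEYS` and `handleRemoteKey` are hypothetical names used only for illustration.

```typescript
// Hypothetical minimal shape of a key event; a real KeyboardEvent satisfies it.
interface KeyLike {
  key: string;
  preventDefault(): void;
}

// Key values assumed to be reported for a subtitle button on a remote control.
const SUBTITLE_KEYS = ["Subtitle", "ClosedCaptionToggle"];

// Calls onToggle for subtitle keys and reports whether the event was handled.
function handleRemoteKey(ev: KeyLike, onToggle: () => void): boolean {
  if (SUBTITLE_KEYS.indexOf(ev.key) === -1) {
    return false; // not a subtitle key; leave it to other handlers
  }
  ev.preventDefault();
  onToggle();
  return true;
}
```

In a page this might be wired up as `document.addEventListener("keydown", ev => handleRemoteKey(ev, toggle))`, where `toggle` is whatever the app uses to switch its text tracks.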
Comment 8 Jon Piesing (HbbTV) 2014-03-11 14:51:50 UTC
(In reply to Silvia Pfeiffer from comment #7)
> 
> Are we talking about Web pages or native TV applications here?

With very few exceptions, the only native TV apps come from manufacturers.
Hence what I mean by "TV app developer" is someone developing an interactive service offering for a TV set using HTML+JavaScript. 

> > b) they can ensure that the app has control over the VCR-style features,
> > pause/resume, fast-forwards/fast-reverse, jump forwards/backwards. This is
> > particularly critical for broadcasters funded by advertising.
> 

Let me try again. There are at least three reasons why people developing HTML+JavaScript for TV sets would not set the controls attribute.
1. They don't want the TV set to draw anything that is part of the page because it would disrupt the user experience and would likely be wildly inconsistent between manufacturers
2. They rely on video advertising to fund the content/app/service and want to ensure the end-user cannot fast-forward through the advert or jump past it.
3. The language relating to the controls attribute is all written as "should". That basically means it's not testable and experience shows that things which aren't testable in TVs cannot be relied on to work.

I apologise for being blunt but in my experience any solution that assumes HTML+JavaScript apps for TV sets will set the controls attribute is simply irrelevant.

> > That depends on what functionality provided by the UA would be disabled when
> > the controls attribute is absent.
> 
> None. The @controls attribute does not disable or enable any features. It
> only exposes some features visually with buttons and sliders to the user
> that otherwise are only available to script.

There are several different reasons why this doesn't work in TV ...
- there are gaps in the set of APIs that an app would need to be able to offer the functionality through script (e.g. reading previously set user preferences so the user doesn't have to re-enter them)
- the remote control buttons (or virtual buttons) that are used for the UI offered by the TV are typically hard-wired to the TV and never delivered to the browser. 
- it is widely felt to give a poor user experience if pressing volume up (or audio description or subtitle) gives a very different UI (not just colour but also conceptually) when the user is watching broadcast TV than when the end-user is watching web delivered video. This isn't an issue with play/pause/ff/frew/skip (etc) because they aren't used when watching broadcast TV.

<snip>

> To me, an "app developer" is the person who wrote the Web page and the
> layout and the graphical design in HTML, CSS and Javascript and has made
> sure that all the functionality is there that is required to interact with
> the Web page.

Yes - to me as well.

> If they decide to use @control, then the UA gets to render the
> controls. If they decide to do it themselves, the don't use @control. Is
> that the same "app developer" you are talking about?

Yes this is the same "app developer" but please see the assumptions I mention above.

I think the issue here is that typical HTML+JavaScript apps for TV will need to replace some of what is covered by the controls attribute but are not able to fully replace the rest of what is covered by that attribute and probably would not want to do so anyway.

If we look at the list of features in the spec for the UI provided if the controls attribute is present, an HTML+JavaScript app for TV would typically want to provide the following;
- begin playback
- pause playback and
- seek to an arbitrary position in the content (if the content supports arbitrary seeking), 
- (Also fast forward + fast rewind which aren't mentioned in the spec)

It would typically not want to provide the following basic TV functionality;
- change the volume
- change the display of closed captions or embedded sign-language tracks,
- select different audio tracks or turn on audio descriptions, and
- show the media content in manners more suitable to the user (e.g. full-screen video or in an independent resizable window).

Having the absence of the controls attribute block this basic TV functionality unless the app provides it would (IMHO) be very bad.
Comment 9 Simon Pieters 2014-03-18 06:07:38 UTC
"Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls), but such features should not interfere with the page's normal rendering. For example, such features could be exposed in the media element's context menu."

http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#user-interface

I think explicit buttons on a remote control fall under the above.

I don't think the spec allows disabling of fast-forward, but the app could react to 'ratechange' event and change playbackRate back to 1.0 or so.
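
A minimal sketch of the 'ratechange' workaround suggested above, for example to keep an advert at normal speed. `RateControlled` and `lockPlaybackRate` are hypothetical names; the sketch relies only on the `playbackRate` property and the 'ratechange' event of a media element.

```typescript
// Hypothetical minimal shape of a media element for this sketch.
interface RateControlled {
  playbackRate: number;
  addEventListener(type: "ratechange", listener: () => void): void;
}

// Resets playbackRate to allowedRate whenever something changes it.
function lockPlaybackRate(media: RateControlled, allowedRate: number = 1.0): void {
  media.addEventListener("ratechange", () => {
    // Assigning playbackRate fires 'ratechange' again, so only write when
    // the value actually differs; this avoids an endless loop.
    if (media.playbackRate !== allowedRate) {
      media.playbackRate = allowedRate;
    }
  });
}
```

This does not stop the user agent from offering fast-forward; it only undoes the change after the fact, so a brief rate change may still be noticeable.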
Comment 10 Silvia Pfeiffer 2014-03-18 11:41:24 UTC
(In reply to Jon Piesing (HbbTV) from comment #8)
> I apologise for being blunt but in my experience any solution that assumes
> HTML+JavaScript apps for TV sets will set the controls attribute is simply
> irrelevant.

Good. We have established that we mean the same thing.

> 
> > > That depends on what functionality provided by the UA would be disabled when
> > > the controls attribute is absent.
> > 
> > None. The @controls attribute does not disable or enable any features. It
> > only exposes some features visually with buttons and sliders to the user
> > that otherwise are only available to script.

To re-iterate: there is absolutely no functionality that would only be available through @controls. Everything is available to script.


> There are a several different reasons why this doesn't work in TV ...
> - there are gaps in the set of APIs that an app would need to be able to
> offer the functionality through script (e.g. reading previously set user
> preferences so the user doesn't have to re-enter them)

Yes, that's already possible. YouTube does that with their caption preferences and many other things purely in JS.
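
A minimal sketch of persisting caption preferences purely in script, assuming a key/value store shaped like `localStorage`. All names here (`StorageLike`, `CaptionPrefs`, `CAPTION_PREFS_KEY`, `saveCaptionPrefs`, `loadCaptionPrefs`) are hypothetical. Note that this only covers preferences the app itself has stored; it does not read preferences set in the TV receiver's own menus.

```typescript
// Hypothetical subset of the Storage interface that this sketch needs.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface CaptionPrefs {
  enabled: boolean;
  language: string;
}

// Storage key chosen for illustration only.
const CAPTION_PREFS_KEY = "captionPrefs";

function saveCaptionPrefs(store: StorageLike, prefs: CaptionPrefs): void {
  store.setItem(CAPTION_PREFS_KEY, JSON.stringify(prefs));
}

// Returns the stored preferences, or the fallback when nothing valid is stored.
function loadCaptionPrefs(store: StorageLike, fallback: CaptionPrefs): CaptionPrefs {
  const raw = store.getItem(CAPTION_PREFS_KEY);
  if (raw === null) {
    return fallback;
  }
  try {
    const parsed = JSON.parse(raw);
    if (parsed !== null && typeof parsed === "object" &&
        typeof parsed.enabled === "boolean" && typeof parsed.language === "string") {
      return { enabled: parsed.enabled, language: parsed.language };
    }
  } catch (e) {
    // Malformed JSON: fall through to the fallback.
  }
  return fallback;
}
```

In a browser, `window.localStorage` could be passed as the store.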

> - the remote control buttons (or virtual buttons) that are used for the UI
> offered by the TV are typically hard-wired to the TV and never delivered to
> the browser.

See Simon's reply.

> - it is widely felt to give a poor user experience if pressing volume up (or
> audio description or subtitle) gives a very different UI (not just colour
> but also conceptually) when the user is watching broadcast TV than when the
> end-user is watching web delivered video. This isn't an issue with
> play/pause/ff/frew/skip (etc) because they aren't used when watching
> broadcast TV.

See Simon's reply.


> I think the issue here is that typical HTML+JavaScript apps for TV will need
> to replace some of what is covered by the controls attribute but are not
> able to fully replace the rest of what is covered by that attribute and
> probably would not want to do so anyway.
> 
> If we look at the list of features in the spec for the UI provided if the
> controls attribute is present, an HTML+JavaScript app for TV would typically
> want to provide the following;
> - begin playback
> - pause playback and
> - seek to an arbitrary position in the content (if the content supports
> arbitrary seeking), 
> - (Also fast forward + fast rewind which aren't mentioned in the spec)
> 
> It would typically not want to provide the following basic TV functionality;
> - change the volume
> - change the display of closed captions or embedded sign-language tracks,
> - select different audio tracks or turn on audio descriptions, and
> - show the media content in manners more suitable to the user (e.g.
> full-screen video or in an independent resizable window).
> 
> Having the absence of the controls attribute block this basic TV
> functionality unless the app provides it would (IMHO) be very bad.

The absence of the controls attribute has no such effect.
Comment 11 Silvia Pfeiffer 2014-03-18 11:42:54 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:

   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: None.

Rationale: The functionality requested by the bug is already available and specified in the spec.
Comment 12 Jon Piesing (HbbTV) 2014-03-18 13:48:51 UTC
I think the specification would benefit from a link from the language on "honor user preferences for automatic text track selection" to the text mentioned in comment 9 from Opera.

Perhaps something like the following added immediately after "5. Set the element's did-perform-automatic-track-selection flag to true."

Note: User agents may provide controls to enable users to affect playback of text tracks at times other than when the user agent is required to honor user preferences for automatic text track selection. See http://www.w3.org/TR/html51/embedded-content-0.html#user-interface
Comment 13 Silvia Pfeiffer 2014-03-24 03:24:08 UTC
(In reply to Jon Piesing (HbbTV) from comment #12)
> I think the specification would benefit from a link from the language on
> "honor user preferences for automatic text track selection" to the text
> mentioned in comment 9 from Opera.

These are two completely different and orthogonal topics. "honor user preferences for automatic text track selection" is generally applicable, independent of there being a @controls attribute or not, just like almost everything else that relates to a media element.

> Note: User agents may provide controls to enable users to affect playback of
> text tracks at times other than when the user agent is required to honor
> user preferences for automatic text track selection. See
> http://www.w3.org/TR/html51/embedded-content-0.html#user-interface

That note makes no sense. The UA is always required to honor user preferences for automatic text track selection. "honor user preferences for automatic text track selection" is merely an algorithm that is called either after a media element has been created, or when a text track has been added - that's what the "required" in that paragraph refers to.
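
A simplified sketch of what such a selection algorithm might look like, loosely based on a reading of the spec's automatic text track selection steps and not a copy of the normative text. The user preference is reduced to a single preferred language, which is an assumption; `AutoTrack`, `isDefault` and `autoSelectTextTrack` are hypothetical names.

```typescript
type Mode = "disabled" | "hidden" | "showing";

// Hypothetical minimal track shape; isDefault stands in for the track
// element's default attribute.
interface AutoTrack {
  kind: string;
  language: string;
  mode: Mode;
  isDefault: boolean;
}

// Selects at most one track of the given kinds to show.
function autoSelectTextTrack(
  tracks: AutoTrack[],
  kinds: string[],
  preferredLanguage: string | null
): void {
  const candidates = tracks.filter(t => kinds.indexOf(t.kind) !== -1);
  if (candidates.length === 0) {
    return;
  }
  // Never override a track that is already showing.
  if (candidates.some(t => t.mode === "showing")) {
    return;
  }
  // The user's preference wins over the author's default.
  const preferred = preferredLanguage === null
    ? undefined
    : candidates.filter(t => t.language === preferredLanguage)[0];
  if (preferred) {
    preferred.mode = "showing";
    return;
  }
  // Otherwise fall back to the first disabled track marked as default.
  const fallback = candidates.filter(t => t.isDefault && t.mode === "disabled")[0];
  if (fallback) {
    fallback.mode = "showing";
  }
}
```

The early return when a candidate is already showing is what keeps a later run from undoing a choice the user or the page has already made.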

The user agent is able to provide controls at any time, but is particularly encouraged to do so when scripting is disabled or when @controls is present.

I believe you have a misunderstanding of what the word "controls" in this context means. The "controls" that we're talking about here are the visually represented playback and navigation controls - the actual buttons and sliders. In contrast, a remote control is not such a "control". From the browser's viewpoint, a remote control is merely an input device that provides events that the browser reacts to.

If you have the keyboard focus on the video element and you interact with the video element via a remote control, the browser will receive the keypresses as keyboard events with keyCodes for the pressed buttons. See e.g. http://dev.opera.com/articles/view/functional-key-handling-in-opera-tv-store-applications/ , http://samsungdforum.com/Guide/art00046/index.html , http://stackoverflow.com/questions/12421379/samsung-smart-tv-app-brightcove-sample-app-remote-control-issue/.


I think we might be able to make two changes that I hope will address your issues:

1. We could explicitly mention something about captions in the paragraph that Simon cited, e.g.:

"Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, caption track activation/deactivation and volume controls), but such features..."

2. We could explicitly mention the possible use of remote controls as input devices in the "User interface" section, e.g. in the same paragraph:

"...should not interfere with the page's normal rendering. For example, such features could be exposed in the media element's context menu or via a TV remote control."

Would that be satisfactory?
Comment 14 Jon Piesing (HbbTV) 2014-04-02 13:45:24 UTC
The proposal at the end of comment 13 is fine by us.
Comment 15 Charles McCathieNevile 2015-06-19 14:27:02 UTC
Make the changes suggested by Silvia in comment #13 to the 4th paragraph in http://www.w3.org/TR/html5/embedded-content-0.html#attr-media-controls when we can edit the spec again.

Robin, how do we do that?
Comment 16 Silvia Pfeiffer 2015-06-20 00:18:43 UTC
See https://www.w3.org/Bugs/Public/show_bug.cgi?id=28828 for the first change

and https://www.w3.org/Bugs/Public/show_bug.cgi?id=28829 for the second change.

I've added these requests to the WHATWG because realistically that's where the spec is currently updated. I believe changes there will make their way into the TR eventually, too.
Comment 17 Travis Leithead [MSFT] 2016-04-25 15:30:28 UTC
HTML5.1 Bugzilla Bug Triage: Fixed by porting two WHATWG changes:
1. whatwg/html@782f2d7
2. whatwg/html@df16ff3
which together may address the bug's concerns, at least from the cloned copy of this bug.

If this resolution is not satisfactory, please copy the relevant bug details/proposal into a new issue at the W3C HTML5 Issue tracker: https://github.com/w3c/html/issues/new where it will be re-triaged. Thanks!