This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 16823 - <track> Should cues at time 0 be active when the video loads but before playback? What about when there's a poster frame?
Status: RESOLVED WORKSFORME
Alias: None
Product: TextTracks CG
Classification: Unclassified
Component: WebVTT
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: This bug has no owner yet - up for the taking
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords: a11ytf
Depends on:
Blocks:
 
Reported: 2012-04-23 09:47 UTC by contributor
Modified: 2012-11-20 08:04 UTC
CC List: 10 users

Description contributor 2012-04-23 09:47:58 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html
Multipage: http://www.whatwg.org/C#media-playback
Complete: http://www.whatwg.org/c#media-playback

Comment:
<track> Should cues at time 0 be active when the video loads but before
playback? What about when there's a poster frame?

Posted from: 85.227.154.145 by simonp@opera.com
User agent: Opera/9.80 (Macintosh; Intel Mac OS X 10.7.3; U; en) Presto/2.10.229 Version/11.62
Comment 1 Simon Pieters 2012-04-23 09:50:58 UTC
Currently the spec activates cues as part of the "When the current playback position of a media element changes" algorithm, which, AFAICT, means cues at time 0 won't be shown when the video loads (before playback starts).

I'd like to hear what people think the spec should say about that case.

I think the presence of poster="" should not change the answer because poster="" is supposed to be a shorthand for making the video's first frame look pretty.
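
For illustration only, here is a minimal sketch of the behavior described in this comment. It is a plain JavaScript model, not the spec's algorithm: MediaModel and positionChanged are hypothetical names, and the model assumes cues are re-evaluated only when the playback position changes.

```javascript
// Simplified model, NOT the spec algorithm. MediaModel and positionChanged
// are hypothetical names invented for this sketch.
class MediaModel {
  constructor(cues) {
    this.cues = cues;        // [{ start, end, text }], times in seconds
    this.currentTime = 0;
    this.activeCues = [];    // loading alone activates nothing in this model
  }

  // Stand-in for "when the current playback position of a media element changes".
  positionChanged(newTime) {
    this.currentTime = newTime;
    // A cue is active when start <= position < end.
    this.activeCues = this.cues.filter(
      (cue) => cue.start <= newTime && newTime < cue.end
    );
  }
}

const model = new MediaModel([
  { start: 0, end: 13, text: "We are in New York City" },
]);
console.log("on load:", model.activeCues.map((cue) => cue.text));

model.positionChanged(60); // seek away while paused
model.positionChanged(0);  // seek back to the start
console.log("after seeking back to 0:", model.activeCues.map((cue) => cue.text));
```

In this model the cue at time 0 is absent on load but present after seeking back to 0, which is the case the bug asks about.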
Comment 2 Ian 'Hixie' Hickson 2012-04-25 23:00:00 UTC
Subtitles/captions are essentially an alternative form of the audio track, so I don't think it makes sense to show subtitles before any audio would have played.


> poster="" is supposed to be a shorthand for making the video's first frame look
> pretty

I disagree with this premise.
Comment 3 Glenn Maynard 2012-06-26 21:24:09 UTC
To users, captions are part of the video, not audio.  If a caption starts at time 0, it should be displayed over the video at any time frame 0 is displayed, as if it was part of the video.

Also, if you're at time 100 and you do "video.pause(); video.currentTime = 0", it's strange to show a different thing than if you paused the video to begin with.
Comment 4 John Foliot 2012-06-26 22:31:03 UTC
(In reply to comment #2)

> > poster="" is supposed to be a shorthand for making the video's first frame look
> > pretty
> 
> I disagree with this premise.

I agree with Ian: the @poster image can be any image; it need not be the first, 50th or *any* frame from the associated video. It can be a completely stand-alone graphic file, distinct from any imagery shown in the video.


***************
(In reply to comment #3)

I think the answer should be derived from the caption file itself: if the first line of captioning is:

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City

...then nothing is shown. If however it is marked as such:

00:00.000 --> 00:13.000
<v Roger Bingham>We are in New York City

...then yes, it should be active (render on screen) then.  This should be author-choice IMHO.
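
A rough sketch of the check implied by the two timing lines above, for illustration. parseTimestamp and isCueActiveAt are hypothetical helpers, not part of any browser API, and the parser assumes a bare "start --> end" line with no cue settings after it.

```javascript
// Hypothetical helpers for illustration; not a complete WebVTT parser.

// Convert "mm:ss.ttt" or "hh:mm:ss.ttt" into seconds.
function parseTimestamp(ts) {
  const match = /^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(ts);
  if (!match) {
    throw new Error(`Invalid timestamp: ${ts}`);
  }
  const [, hours = "0", minutes, seconds, millis] = match;
  return (
    Number(hours) * 3600 +
    Number(minutes) * 60 +
    Number(seconds) +
    Number(millis) / 1000
  );
}

// True when `time` (in seconds) falls inside the cue's [start, end) interval.
// Assumes the timing line carries no cue settings after the end time.
function isCueActiveAt(timingLine, time) {
  const [startText, endText] = timingLine.trim().split(/\s+-->\s+/);
  const start = parseTimestamp(startText);
  const end = parseTimestamp(endText);
  return start <= time && time < end;
}

console.log(isCueActiveAt("00:11.000 --> 00:13.000", 0));
console.log(isCueActiveAt("00:00.000 --> 00:13.000", 0));
```

Under the author-choice reading, only the second cue would count as active while the video sits at time 0.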
Comment 5 Glenn Maynard 2012-06-26 22:52:31 UTC
(In reply to comment #4)
> I think the answer should be derived from the caption file itself: if the first
> line of captioning is:
> 
> 00:11.000 --> 00:13.000
> <v Roger Bingham>We are in New York City
> 
> ...then nothing is shown. If however it is marked as such:
> 
> 00:00.000 --> 00:13.000
> <v Roger Bingham>We are in New York City
> 
> ...then yes, it should be active (render on screen) then.  This should be
> author-choice IMHO.

Of course, if you're paused at 0, then captions starting at 11s shouldn't be seen.
Comment 6 Ian 'Hixie' Hickson 2012-08-06 21:25:54 UTC
Subtitles are an alternative representation of an audio track. If someone takes a photograph, they don't include subtitles of what was being said while the photograph was taken, because there's no audio track with a photograph. A video, before it has started playing, is equivalent to a photograph, IMHO. No audio has yet played, so it makes no sense that any subtitles should show yet.
Comment 7 Glenn Maynard 2012-08-06 22:01:01 UTC
> Subtitles are an alternative representation of an audio track. If someone takes
> a photograph, they don't include subtitles of what was being said while the
> photograph was taken, because there's no audio track with a photograph. A
> video, before it has started playing, is equivalent to a photograph, IMHO. No
> audio has yet played, so it makes no sense that any subtitles should show yet.

Seeking while paused doesn't play any audio, either, but obviously seeking to the middle of a video while paused should display the captions at that point.  It's strange that seeking from 60s to 0s while paused shows a caption that wasn't displayed when the video loaded to begin with.  It'd be just as strange if seeking to position 0 was a special case and didn't behave the same as seeking to 60s.

Also, many captions aren't representations of audio at all; for example, signs (translations of text on screen, like road signs and book titles) and credits (transliterations of names in overlay credits).  Metadata tracks might be used to store YouTube-style clickable annotations; if these start at time 0 then they should be visible at the start, too.
Comment 8 Loretta Guarino Reid 2012-08-06 22:08:52 UTC
Ian, while this may be true for subtitle tracks, we have been experimenting with description tracks, and there is often need for preliminary description.
Comment 9 Simon Pieters 2012-08-13 10:45:55 UTC
(new comments, reopening)
Comment 10 Ian 'Hixie' Hickson 2012-08-21 00:37:49 UTC
I don't understand what description tracks have to do with this; they presumably don't render as graphical subtitles.

I disagree that seeking can't render audio. A good example of that would be Apple's iMovie, which renders audio during scrubbing. Showing subtitles in the same way seems quite reasonable.

Also we already have "seek to zero" do something different than "start at zero", so I don't think that's a problem.
Comment 11 Glenn Maynard 2012-08-21 01:14:16 UTC
(In reply to comment #10)
> I don't understand what description tracks have to do with this; they
> presumably don't render as graphical subtitles.

I don't know about description tracks, but metadata tracks may render as anything at all, depending on what they're being used for.

For example, YouTube supports clickable visual annotations displayed in boxes over the video, which would be a good use case for metadata tracks.  The scripts attached to the video would display annotations in the active metadata cues.  This will get in the way if you want to see captions at 0s when the video opens paused.

It's easy to see cases where not activating initial captions immediately causes problems.  When does activating them on load cause any actual problem?
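
A sketch of the kind of script meant here, assuming the annotations live in a metadata text track. watchAnnotations and the render callback are hypothetical names; the track members used (mode, activeCues, the cuechange event, cue.text) are the standard text track API.

```javascript
// Hypothetical helper: watchAnnotations and `render` are illustrative names.
// `track` is expected to behave like a TextTrack holding metadata cues.
function watchAnnotations(track, render) {
  // "hidden" keeps cues updating without the user agent rendering them.
  track.mode = "hidden";

  const update = () => {
    // activeCues is array-like; collect the text of every active cue.
    const texts = Array.from(track.activeCues || [], (cue) => cue.text);
    render(texts);
  };

  track.addEventListener("cuechange", update);
  update(); // show whatever is active right now, which may be nothing

  // Return a function that stops watching.
  return () => track.removeEventListener("cuechange", update);
}
```

A page would call this with something like video.textTracks[0] and a function that draws the annotation boxes. If cues at 0s are not active when the video loads, the first update() call has nothing to draw until playback starts or a seek occurs.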

> I disagree that seeking can't render audio. A good example of that would be
> Apple's iMovie, which renders audio during scrubbing. Showing subtitles in the
> same way seems quite reasonable.

This doesn't relate to my point.  In most video players, seeking to 12m30s while paused displays the frame at that time without playing any audio.  Clearly in that situation any subtitles at that time should be displayed (and that's what WebVTT does, as far as I understand).

> Also we already have "seek to zero" do something different than "start at
> zero", so I don't think that's a problem.

(This sounds like inconsistency as an argument for more inconsistency.)
Comment 12 Silvia Pfeiffer 2012-08-23 00:56:40 UTC
(In reply to comment #6)
> Subtitles are an alternative representation of an audio track. If someone takes
> a photograph, they don't include subtitles of what was being said while the
> photograph was taken, because there's no audio track with a photograph. A
> video, before it has started playing, is equivalent to a photograph, IMHO. No
> audio has yet played, so it makes no sense that any subtitles should show yet.

I disagree with this premise. A video that has not started playing is not equivalent to a photograph, but rather to a paused video. A photograph has no timeline, but a video does. That this timeline happens to be at point 0 when the video is loaded is irrelevant - it could be at any point, but it is paused, ready to go again. All the information that is expected to be displayed with the frame at that point in time needs to be displayed regardless of whether the video is paused or playing. This includes captions, subtitles, chapters, and metadata.

It doesn't apply as much to audio descriptions, because they indeed are more part of the audio track and would be rendered as the video playback starts. If the audio descriptions don't have sufficient time to be read out within the first cue, the video will get paused, but that's ok. So, I don't see Loretta's point as much.

But I certainly agree that all the visually rendered tracks need to be displayed as though the video was paused at time 0.
Comment 13 Ian 'Hixie' Hickson 2012-08-23 04:58:52 UTC
> A video that has not started playing is not equivalent to a
> photograph, but rather to a paused video

If that were so, we wouldn't have a poster frame when you hadn't yet started playing.
Comment 14 Silvia Pfeiffer 2012-08-23 06:35:10 UTC
(In reply to comment #13)
> > A video that has not started playing is not equivalent to a
> > photograph, but rather to a paused video
> 
> If that were so, we wouldn't have a poster frame when you hadn't yet started
> playing.

Indeed we wouldn't need one, if the first frame weren't so often a black frame.
Comment 15 Ian 'Hixie' Hickson 2012-09-11 23:57:32 UTC
We didn't add poster="" because the first frame is often black (which is not a premise I agree with), we added it because there's a qualitative difference between the <video> element being paused at t=0 and the <video> element having never played anything. The former is a video, the latter is more akin to a photograph, as per comment 6.

I think the spec as it stands right now is fine.