This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
http://dev.w3.org/html5/webvtt/#processing-model says:

"Let video be the media element or other playback mechanism."

"The viewport (and initial containing block) is video's rendering area."

This isn't explicit about what the rendering area is when the <video> element is not the same size as the video rendered within it. This can happen with object-fit: with the default, contain, and with any other value except fill.

Points to consider:

* Percentage positions in WebVTT ought to be relative to the video. Otherwise cues may end up obscuring what they were intended to avoid.
* Using the video rendering area is a bit more complicated implementation-wise.
* Using the video rendering area would make it impossible to deliberately use a <video> element which is taller than needed, so that the captions can be rendered beneath the video.
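To make the two candidate boxes concrete, here is a minimal sketch of the geometry, not taken from the spec or from any browser. It assumes object-fit: contain with the picture centred in the element (the default object-position). The function names and the 640x480 / 640x360 dimensions are made up for illustration.

```python
def contain_rect(elem_w, elem_h, video_w, video_h):
    """Box (x, y, width, height) of the video picture inside the element,
    assuming object-fit: contain and a centred picture."""
    # Scale the picture uniformly so that it fits inside the element.
    scale = min(elem_w / video_w, elem_h / video_h)
    w, h = video_w * scale, video_h * scale
    # Centre the picture; the leftover space becomes the letterbox bars.
    return ((elem_w - w) / 2, (elem_h - h) / 2, w, h)


def resolve_top(percent, box_top, box_height):
    """Resolve a percentage offset from the top of a box to element pixels."""
    return box_top + percent * box_height / 100


# Hypothetical case: a 16:9 picture (640x360) in a 4:3 element (640x480).
x, y, w, h = contain_rect(640, 480, 640, 360)

against_element = resolve_top(90, 0, 480)  # 432.0, inside the bottom bar
against_picture = resolve_top(90, y, h)    # 384.0, inside the picture
```

In this example the same 90% position lands 48 pixels apart depending on which box is chosen: in the bottom bar when the element box is used, and over the picture when the picture box is used.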
Fredrik, any thoughts on this?
When I've thought about it, I have mostly ended up thinking of the cases where the video ends up being letter-boxed. In those cases (with subtitles) I don't see why you wouldn't want to be able to place cues in the non-video area. I can definitely see how more strictly positioned cues would want to use the actual video area though. (I suppose that more often than not these days, the non-video area is empty/small.) So it almost seems like you'd like to use one for line-snapped cues and another for non-line-snapped...
Letter-boxed video often has the captions in the bottom "matte": http://en.wikipedia.org/wiki/Letterboxing_(filming) . I personally think that's a good idea, so I wouldn't want to restrict cues to rendering within the video picture, but rather use the whole video element's rendering area. On pillar-boxed video, the effect may be that cues end up further to the right or left than originally placed, but I doubt that would typically have much effect on obscuring objects.
Fredrik, I kind of agree, an argument can be made for using different boxes for snap-to-line and non-snap-to-lines layout. But then what about a (horizontal LTR) cue which is positioned in the right 80% of the cue but uses snap-to-lines layout? Presumably there's something to avoid in that left 20%, but widening the media element could cause it to become obscured.

How about clarifying that it is the media element's full extent that is used, and cautioning authors to make the media element match the video size if they're using absolutely positioned cues?

A bit of a cop-out, but if I have to pick one behavior this is definitely the one. Maybe if it becomes a problem some setting could be added to match the video box instead?
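The widening concern above can be put in numbers. This is a rough sketch under stated assumptions: the picture is horizontally centred in the element, the cue covers everything to the right of the region to avoid, and percentages resolve against the element box. The function name and the dimensions are hypothetical.

```python
def pillarbox_overlap(elem_w, picture_w, avoid_percent):
    """Pixels of the region meant to stay clear that the cue ends up covering.

    The author wants the left avoid_percent of the picture kept clear, but the
    cue's left edge is resolved against the (wider) element box instead.
    """
    picture_left = (elem_w - picture_w) / 2
    # Right edge of the region that should remain uncovered, in element pixels.
    protected_right = picture_left + avoid_percent * picture_w / 100
    # Left edge of the cue when the percentage is taken from the element box.
    cue_left = avoid_percent * elem_w / 100
    return max(0.0, protected_right - cue_left)


# Hypothetical case: a 960px-wide picture centred in a 1280px-wide element.
overlap = pillarbox_overlap(1280, 960, 20)   # 96.0 pixels obscured
no_bars = pillarbox_overlap(960, 960, 20)    # 0.0 when the element matches
```

With these numbers the cue starts at 256px while the protected region extends to 352px, so 96px of it is covered. When the element matches the video size the overlap is zero, which is what the proposed caution to authors relies on.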
(In reply to Philip Jägenstedt from comment #4)
> Fredrik, I kind of agree, an argument can be made for using different boxes
> for snap-to-line and non-snap-to-lines layout. But then what about a
> (horizontal LTR) cue which is positioned in the right 80% of the cue but
> uses snap-to-lines layout? Presumably there's something to avoid in that
> left 20%, but widening the media element could cause it to become obscured.

Yes, at the end of the day it's always hard to infer what the author intended...

> How about clarifying that it is the media element's full extent that is
> used, and cautioning authors to make the media element match the video size
> if they're using absolutely positioned cues?
>
> A bit of a cop-out, but if I have to pick one behavior this is definitely
> the one. Maybe if it becomes a problem some setting could be added to match
> the video box instead?

I suppose that's a reasonable resolution. Another option might be to allow making a choice between the two (are there more?), but keeping that on hold is probably better.
Rick, which box do you use in Gecko? Eric, is WebKit still using the whole of the video area (assuming no video controls)? If everyone already uses the media element dimensions, I should just make that clear in the spec.