Re: Roll-up captions in WebVTT from Silvia Pfeiffer on 2011-12-20 (public-texttracks@w3.org from December 2011)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Tue, 20 Dec 2011 11:51:28 +1100
To: Glenn Maynard <glenn@zewt.org>
Cc: David Singer <singer@apple.com>, Gal Klein <gal@plymedia.com>, public-texttracks@w3.org
Message-ID: <CAHp8n2mEyJg+ibbzUSk3xktJkDjiW1pDyf9_qxr3M8qBSQx-4g@mail.gmail.com>
On Tue, Dec 20, 2011 at 11:00 AM, Glenn Maynard <glenn@zewt.org> wrote:
> Just to sum up my opinion right now, since I think the conversation may have
> muddied it: I'm not convinced that a significant enough number of viewers
> actually prefer roll-up captions to pop-on captions to support it for that
> reason, given that almost all movie subtitles are pop-on.  I think roll-on
> captions *may* be required for live captioning, but Gal claims otherwise and
> I ask for more information below.  I'm not strongly for or against the
> rendering mode as a whole, but I'm strongly against certain methods of
> implementing them (especially the "repeat the cue over and over manually"
> one).

Just to be clear, since you may unknowingly be throwing out the baby
with the bath water:
If we do not provide a means to natively display scrolling text
overlaid on the video for captions, we cannot do so for subtitles
either (including karaoke), nor for any other kind of text that we
may want to scroll over the video in the future, including credits.
Nor can we faithfully represent existing TV rollup captions in WebVTT
except by copying text between cues, which you agree is inferior,
since it involves repeating text.
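
To make the cost of the "copy text between cues" workaround concrete,
here is a minimal Python sketch. It is illustrative only:
emulate_rollup and format_ts are hypothetical helper names, and the
two-line window is an assumption, not something any spec mandates.
Each emitted cue repeats the lines that are still on screen.

```python
def format_ts(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def emulate_rollup(timed_lines, max_lines=2):
    """Emulate rollup with plain pop-on cues by repeating text.

    timed_lines: list of (start, end, text) tuples, in display order.
    max_lines:   assumed number of visible rows (hypothetical default).
    """
    blocks = ["WEBVTT"]
    window = []
    for start, end, text in timed_lines:
        # Keep only the newest max_lines lines; older ones "scroll off".
        window = (window + [text])[-max_lines:]
        timing = f"{format_ts(start)} --> {format_ts(end)}"
        blocks.append(timing + "\n" + "\n".join(window))
    return "\n\n".join(blocks) + "\n"


print(emulate_rollup([(0, 2, "one"), (2, 4, "two"), (4, 6, "three")]))
```

With a two-line window, every line except the last is written into
two cues, so the file grows with the window size and any correction
has to be made in several places.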


> On Mon, Dec 19, 2011 at 6:18 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>
> wrote:
>> Not necessarily. Ian invented a <redoline> (redo-line) tag in this bug
>> https://www.w3.org/Bugs/Public/show_bug.cgi?id=14104 . That could be
>> used for rollup as well as popup captions and it is editing done
>> through markup.
>
> Which he labelled as a variant, and he said that WebVTT was designed for
> static resources.  WebVTT isn't a streaming format, and trying to bend it
> into one will likely make a mess of the whole thing.

It would be worse to have to define yet another format for live
captioning. And as I keep explaining, rollup is not used only in live
captioning; there are many other use cases, so you are basing this on
a wrong assumption.


>> How would the markup look such that you could render it to either
>> rollup or popup? Do you have an example markup that could be rendered
>> either way with using just UA settings?
>
> Any WebVTT text.  Again, there'd be work to do here and questions to answer,
> but it's solvable.  For example, render all text as roll-up, with cues with
> a cue line position of auto or of >= 50 being grouped at the bottom (rolling
> up) and other cues being grouped at the top rolling down.  (I'm not going to
> try to lay out a complete rendering algorithm for roll-up; I'm not at all
> familiar enough with that part of the spec to even try.  This is the part
> that would require a fair bit of work.)

It's the grouping part that you have to solve first. How do you
express that one cue influences another cue, i.e. that they are part
of the same "group"? Anything after that is simple, as you say. But
you have to solve the grouping first.

My suggestion is to group cues by giving them the same "class".
David's suggestion is to repeat text and mark it as a repetition, thus
identifying the cues with the repeated text as a group. Either will
work, but I prefer mine because it does not require repeating text.
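
As a rough sketch of the "class" grouping idea (not a rendering
algorithm from any spec), the Python below assumes each cue carries an
optional group label. The Cue type, its group field, render_frames and
the max_lines default are all hypothetical names chosen for
illustration. Cues that share a label accumulate into one rollup
region; with rollup switched off, the same cues are shown as pop-on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    start: float
    end: float
    text: str
    group: Optional[str] = None  # hypothetical shared "class" label


def render_frames(cues, rollup=True, max_lines=3):
    """Return one (start_time, visible_lines) pair per cue.

    rollup=True:  cues with the same group label accumulate, and the
                  oldest line is dropped once max_lines is exceeded.
    rollup=False: every cue is shown on its own (pop-on), which models
                  a user preference overriding the author's grouping.
    Assumes max_lines >= 1.
    """
    history = {}  # group label -> lines currently visible in that region
    frames = []
    for cue in sorted(cues, key=lambda c: c.start):
        if rollup and cue.group is not None:
            lines = history.setdefault(cue.group, [])
            lines.append(cue.text)
            del lines[:-max_lines]  # keep only the newest max_lines lines
            frames.append((cue.start, list(lines)))
        else:
            frames.append((cue.start, [cue.text]))
    return frames


cues = [
    Cue(0, 2, "first line", "r1"),
    Cue(2, 4, "second line", "r1"),
    Cue(4, 6, "third line", "r1"),
]
print(render_frames(cues, rollup=True, max_lines=2))
print(render_frames(cues, rollup=False))
```

Note that no cue text is duplicated in the source data; the repetition
only happens at render time, and only when rollup is enabled.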


>> My suggestion to solve this problem was to have a "class" on cues that
>> would group cues together such that they can be rendered as rollup
>> (see my first email in this thread). It is minimal additional markup
>> and would indeed allow to create rollup or popup captions from the
>> same content. It could be changed through preferences.
>
> Not if you really want to be able to see roll-ups in general, since you'd
> only be able to see roll-ups where the author has specified it.  Almost no
> authors will do that--you can disagree with that premise if you want, but I
> think I'm right--so you'd end up seeing pop-ons most of the time.

Those authors who care about rollup will use it. And if you dislike
the rollup display that an author has provided, you can override it
to be pop-on. That you cannot override pop-on to be displayed as
rollup is not important, since there does not seem to be a need for
that kind of conversion.


>>> Bitmap subtitles on DVDs (the
>>> form used by most movies) didn't even support roll-up (from what I
>>> recall when I implemented a decoder many years back).
>>
>> DVD is never used for live and captions for live recordings on DVD
>> would always have been reformatted. So, I understand why that never
>> existed.
>
> We're talking about the use of roll-up in general, including prerecorded
> captions.  You said that roll-up captions are more natural to US readers,
> and that pop-on captions are more common with anime than other genres.  That
> simply doesn't seem to be true; roll-up captions seem exceptionally rare
> outside of live captions and not even supported by many media.  I showed
> samples from several media formats and different countries to support this.

And I have shown counter-examples. I accept your premise, but would
like to ask you to be open to mine, too. I can also accept that
several publishers that have started publishing or streaming video
with captions online find it easier, and currently sufficient, to
just go with the pop-on model. Indeed, it took YouTube 5 years of
providing captions online for thousands of videos before the lack of
scrolling captions hurt enough for them to actually implement support
for them. But they are supporting them now and they are here to stay.

Since I have shown examples that are using scrolling text right now,
including captions, karaoke, and credits, both from existing TV and on
YouTube, I believe I have proven enough of a use case for rollup
support in WebVTT.


Silvia.
Received on Tuesday, 20 December 2011 00:52:15 UTC