Re: Tech Discussions on the Multitrack Media (issue-152)

On Wed, Feb 16, 2011 at 3:48 PM, David Singer <singer@apple.com> wrote:
> I think we should make normal things easy and complex things possible.

I very much agree with this idea.


> If we set the rule that <source>s are alternatives (exclusive or) and tracks are additional (inclusive or, including the primary source) then the content author can indicate what 'works' from the point of view of his program.
>
> For example, if you are doing audio description of video:
>
> a) and the timing does not need to change, and the description can be overlaid on top of the regular audio, offer audio description as an optional add-on track

Yes, that's one way to provide external resources as virtual new tracks.
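
To make the rule concrete, here is a rough sketch of how a player might resolve it. It assumes a made-up data model: the names `Source`, `Track`, `pickSource` and `resolvePlayback` are purely illustrative and are not proposed markup or API. Within any one list of sources exactly one is chosen (alternatives), while every enabled track is added on top of the primary resource (additional).

```typescript
// Hypothetical model for illustration only; not part of any spec or browser API.
interface Source { url: string; type: string; }
interface Track { label: string; sources: Source[]; }

// <source>s are alternatives: take the first one the UA says it can play.
function pickSource(
  sources: Source[],
  canPlay: (type: string) => boolean
): Source | undefined {
  for (const s of sources) {
    if (canPlay(s.type)) return s;
  }
  return undefined;
}

// Tracks are additional: each enabled track plays alongside the primary source.
function resolvePlayback(
  primary: Source[],
  tracks: Track[],
  enabledLabels: string[],
  canPlay: (type: string) => boolean
): string[] {
  const urls: string[] = [];
  const main = pickSource(primary, canPlay);
  if (main) urls.push(main.url);
  for (const t of tracks) {
    if (enabledLabels.indexOf(t.label) === -1) continue; // track not switched on
    const s = pickSource(t.sources, canPlay);
    if (s) urls.push(s.url);
  }
  return urls;
}
```

With an audio description offered as an add-on track (case a), enabling it simply adds one more resource to what is played; the primary source is untouched.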

> b) and the timing does not need to change, but the audio description has, as part of its mix-down, the appropriate portions of the main audio, then make the video the primary source, and offer a <track> which has multiple sources, one or more of which are the plain audio, others are the audio description

That would require the author to split the main video into two separate
resources: a video-only track and an audio-only track. While this is,
of course, possible, I don't think it's a common way to publish the
main video. The default should still be that the main video, including
its audio and video tracks, is a single resource coming directly from
the video element (possibly through a <source> therein). Is it really
asking too much of audio description providers to supply the
description speech as a separate resource without the main audio?

In the case of a mix-down audio description - which I regard as the 20%
use case - I would much prefer that the author offer the option to
load, through some JavaScript, a completely different media resource
into the <video> element: one that contains the video and the mix-down
audio together.
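
As a rough sketch of that kind of swap, the script could look something like the following. The `MediaLike` interface and the `swapResource` name are my own illustration; the interface only lists the few media element members the sketch touches (`src`, `currentTime`, `paused`, `play()`), so it can also be exercised outside a browser. Whether the playback position can be carried over depends on the two resources sharing a timeline, which is an assumption the caller has to make.

```typescript
// Minimal shape of the media element members used below (illustrative).
interface MediaLike {
  src: string;
  currentTime: number;
  paused: boolean;
  play(): unknown;
}

// Replace the current resource with a different one, e.g. a file that
// contains the video together with the mix-down audio description.
// keepPosition: pass true only if both resources share the same timeline.
function swapResource(media: MediaLike, url: string, keepPosition: boolean): void {
  const position = media.currentTime;
  const wasPlaying = !media.paused;
  media.src = url;
  // In a real browser it may be necessary to wait for the new resource's
  // metadata to load before seeking; that handling is omitted here.
  media.currentTime = keepPosition ? position : 0;
  if (wasPlaying) media.play();
}
```

For a mix-down with unchanged timing, `keepPosition` would be true; for a resource whose timing differs, it would be false and playback starts from the beginning of the new resource.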

> c) and the timing needs to change; offer two or more sources, one or more of which have the normal audio and normal timing, and one or more of which have the audio description with the revised timing

The problem here is that it's not just the audio that changes timing,
but also the video. Thus, where the timing needs to change, we in fact
have a completely different media resource, so I would suggest the same
as above for the mix-down audio description: use JavaScript to swap out
the resource. Alternatively, you can always provide it as a separate
video on the page.


> while it is technically true that the user-agent may be able to make all sorts of ingenious displays, it's not a great system design to assume that the UA and the user will have the time or skills to make the choices over lots of ingenious possibilities.

We do in fact have to discuss how the display of multiple videos would
work. Would they be expected to be displayed as picture-in-picture?
With the solution where <track> provides the resource (or the track
comes from within the resource), no extra screen area is specified
into which the second, third, etc. video stream would be displayed. So
I do wonder whether that means we need to do picture-in-picture, or
whether we have any chance of grabbing some extra screen space. That
extra screen space would trivially be available in options 6 and 7.


Cheers,
Silvia.


> On Feb 16, 2011, at 10:31 , Silvia Pfeiffer wrote:
>
>> On Wed, Feb 16, 2011 at 12:08 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>> On Tue, Feb 15, 2011 at 4:19 PM, Silvia Pfeiffer
>>> <silviapfeiffer1@gmail.com> wrote:
>>>> On Wed, Feb 16, 2011 at 5:36 AM, Mark Watson <watsonm@netflix.com> wrote:
>>>>> Hi Philip,
>>>>>
>>>>> Just a quick note that the "alternative" vs "additional" distinction is not always completely clear. Video with different camera angles (gimmicky or not) could be considered as an alternative, or could be rendered as picture-in-picture, or multiple thumbnail videos could show beside the main video (some sports sites already do this kind of thing).
>>>>>
>>>>> If you have the output capabilities (e.g. wireless headphones plus regular speakers) then simultaneous output of different audio languages might make sense.
>>>>>
>>>>> Not to say these are compelling use-cases, just that the markup should indicate what the media actually is such that the player can decide what to do, without any hard "alternative" vs "additional" distinction.
>>>>
>>>> I think I generally agree.
>>>
>>> Is there anything that prevents an implementation from displaying
>>> multiple "alternative" streams at the same time? Even if they are
>>> explicitly labelled and in spec called "alternative" I wouldn't think
>>> there is. Compare to how there are currently alternative CSS
>>> stylesheets. There is nothing preventing an implementation from
>>> displaying multiple windows which have different alternative
>>> stylesheets applied, as long as the DOM acts as though a specific one
>>> is applied (you can only return one value for .offsetTop). This is a
>>> model that works quite well IMHO since it doesn't require page authors
>>> to keep more exotic UAs in mind, allowing UAs to freely experiment.
>>
>> I think it all boils down to what the user/UA selects to activate. If
>> there are multiple tracks that should really be alternative but are
>> presented together, then it's up to the user to deactivate the one
>> they don't want to see/hear. Even with <track> elements there is
>> nothing prohibiting multiple tracks from being active at the same
>> time. It's up to the UA to present these and up to the user to decide
>> if that is the presentation they would like to see. So, I don't
>> actually think we need to put an attribute on the media track for
>> alternative/additional, since the UA will not actually use it as input
>> IMHO.
>>
>> I used to think that alternative/additional was a big thing, too, and
>> it made sense to tell the UA about it so it can act appropriately, but
>> it becomes very complex very quickly: which group of tracks is
>> alternative to which other (e.g. all the English tracks a+v+t plus
>> original video against all the German tracks plus original video?) and
>> the logic becomes almost impossible to represent, and even worse to
>> interpret. So, I would think it's easiest if the browser just tries to
>> display them and leaves the choice of managing the active tracks to
>> the user.
>>
>> Silvia.
>>
>
> David Singer
> Multimedia and Software Standards, Apple Inc.
>
>

Received on Wednesday, 16 February 2011 07:23:50 UTC