HTML A11Y TF FtF - Media sub-group -- 20 Mar 2011

HTML markup for the JS API that we discussed yesterday

<silvia> http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Media_API#.2810.29_HTML_Accessibility_Task_Force_proposal_.28.22The_San_Diego_Solution.22.29

<silvia> EC: we proposed to extend <track> to media resources, too

<silvia> EC: we suggest putting <source> inside the <track> element and running the same source selection algorithm on tracks as we run on video

<silvia> FO: so we have a source and a track selection algorithm

<silvia> .. and the regular source selection algorithm is also applied to the source elements in the track element
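
A minimal sketch of the selection step described above, assuming (as for <video>) that the first <source> child with a playable type wins. The helper name `selectSource` and the plain-object stand-ins for <source> children are illustrative only and not part of the proposal; `canPlayType` stands in for HTMLMediaElement.canPlayType(), which returns "", "maybe" or "probably".

```javascript
// Hypothetical helper, not part of any proposal: returns the first candidate
// whose MIME type the player does not reject, or null if none qualifies.
// `sources` is an array of { src, type } objects in document order.
// `canPlayType` mimics HTMLMediaElement.canPlayType(): "" means "cannot play".
function selectSource(sources, canPlayType) {
  for (const source of sources) {
    // Assumption: a <source> without a type is worth trying, since the
    // type can only be discovered by fetching the resource.
    if (!source.type || canPlayType(source.type) !== "") {
      return source;
    }
  }
  return null;
}

// Candidate sources for one <track>, taken from the audio description example.
const descriptionSources = [
  { src: "audesc.ogg", type: "audio/ogg" },
  { src: "audesc.mp3", type: "audio/mpeg" },
];

// A pretend user agent that can only decode MP3 audio.
const mp3Only = (type) => (type === "audio/mpeg" ? "maybe" : "");

const chosenSource = selectSource(descriptionSources, mp3Only);
```

With the pretend MP3-only user agent, the Ogg candidate is skipped and the MP3 candidate is chosen. The same function could be run once for the <video> children and once per <track>.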

<silvia> EC: this is similar conceptually to having separate audio and video elements

<silvia> scribe: silvia

<eric_carlson> http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Media_API#.2810.29_HTML_Accessibility_Task_Force_proposal_.28.22The_San_Diego_Solution.22.29

FO: what do I do if a track links to a resource, but that resource doesn't load?

EC: you set up all the tracks, but put an error on it

… I would assume that you don't render anything

SP: is it added to the list of available tracks on the menu?

<eric_carlson> A simple example:

EC: maybe not, or it's shown, but as disabled

<eric_carlson> <video id="v1" poster="video.png" controls>

<eric_carlson>

<eric_carlson> <source src="video.webm" type="video/webm">

<eric_carlson> <source src="video.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track id="a1" kind="descriptions" srclang="en" label="English Audio Description">

<eric_carlson> <source src="audesc.ogg" type="audio/ogg">

<eric_carlson> <source src="audesc.mp3" type="audio/mpeg">

<eric_carlson> </track>

<eric_carlson>

<eric_carlson> <track id="v2" kind="signings" srclang="asl" label="American Sign Language">

<eric_carlson> <source src="signlang.webm" type="video/webm">

<eric_carlson> <source src="signlang.mp4" type="video/mp4">

<eric_carlson> </track>

<eric_carlson> </video>

SP: this means we probably need to extend the @kind type

<eric_carlson> and here is an example with multiple caption formats:

<eric_carlson> <video id="v1" poster="video.png" controls>

<eric_carlson>

<eric_carlson> <source src="video.webm" type="video/webm">

<eric_carlson> <source src="video.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track id="c1" kind="captions" srclang="en" label="Captions">

<eric_carlson> <source src="captions.vtt" type="text/vtt">

<eric_carlson> <source src="captions.xml" type="application/ttml+xml">

<eric_carlson> </track>

<eric_carlson> </video>

SP: this markup is for the case where the caption formats differ, but the content is identical

… if you wanted them both to appear for user selection, you'd need separate <track> elements

FO: semantics look fine

… I like that we re-use things from the video element

… the track mechanics can basically be read from the JS API

… the supported track @kinds are left to the UA?

… so if e.g. the device doesn't support audio, then audio tracks make no sense

<JF> Bug around @kind: http://www.w3.org/Bugs/Public/show_bug.cgi?id=11593

SH: what if we need to replace the video because, e.g. we have extended audio descriptions or an alternative video with a sign language overlay?

… there is no markup way to do that?

EC/SP: you'd have to use script

SH: there is no way for the browser to understand this as an alternative representation?

EC: yes, and this is probably good, because the browser in the case of the sign language video would need to download the audio twice

SH: so the best way to set up your video element in this case would be to have an empty main video with just the timeline and the others as dependent tracks?

EC: yes, that is possible with MP4 - you need some kind of track, but it can e.g. just be a 1x1 pixel image with the duration of the video, since it defines the timeline

… it could also just contain the poster with that duration ;-)

SH: is there a way to attach metadata to a track to define the name of a track?

SP: Ogg and Matroska have this functionality AFAIK

SH: MP4 can do it, but it's not clear whether the MP4 baseline provides it

… might just be a convention

SP: are we happy with this kind of markup?

JF: what @kind attributes do we need now?

… see http://www.w3.org/Bugs/Public/show_bug.cgi?id=11593

<Sean> I'd like to propose a strawman group='foo' attribute

SP: for audio descriptions we can probably just use the same value @kind=descriptions as for text, because the content type will tell us whether it's audio or text

JS: is it necessary to distinguish between descriptions and extended descriptions?

SH: we have a problem if we specify both text resources and audio descriptions that are in the same <track>

… does the source selection require the same main mime type content to be listed in alternative <source> elements?

EC: no, source selection should continue to work in this situation

JF: what about extended audio descriptions?

SP: that means a change of the timeline and is a bit complicated

EC: the solution that Masatomo provided this morning with pauseOnExit may work if we use text tracks for extending

… seeking would be a problem

scribe: it's possible with a script - I am not sure we can do it without a script easily

… probably too hard to do it within the last call deadline

SH: we should at least address some mechanism of doing it

… text descriptions on the Web are more useful with snippets

EC: the requirement to split them into separate files is a bit onerous

SH: media fragments would work with a single audio file

EC: no plans to implement that

SH: you can do it with script

EC: the safest way is to separate them up front

JF: do we need @kind=signing and something for clear audio?

SH: I think we need to be able to group tracks into groups where we can do replacements of a track or additions

… we need to give the UA enough information to be able to make the choices

SP: it's up to the user to make that choice

SH: in the case of, e.g. clear audio, we need a toggle control and how would the browser get this information?
... I propose we need an attribute called @group in which multiple tracks can be put together and thus marked as alternatives to each other

SP: how do radio buttons work in HTML?

SH: we could do that and use the same @name on all of the track elements that are alternatives to each other
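
A minimal sketch of the radio-button behaviour suggested above, assuming that tracks sharing the same @name are mutually exclusive and that tracks without a @name are unaffected. The function `selectAlternative`, the `enabled` flag and the plain-object track records are hypothetical and only serve to illustrate the grouping idea.

```javascript
// Hypothetical helper: enables the track with the given src and disables
// every other track that carries the same name (radio-button semantics).
// Returns a new array and leaves the input untouched.
function selectAlternative(tracks, chosenSrc) {
  const chosen = tracks.find((t) => t.src === chosenSrc);
  if (!chosen) return tracks.slice(); // unknown src: nothing changes
  return tracks.map((t) => {
    if (t === chosen) return { ...t, enabled: true };
    // Only tracks that share a defined name with the chosen one form a group.
    const sameGroup = chosen.name !== undefined && t.name === chosen.name;
    return sameGroup ? { ...t, enabled: false } : { ...t };
  });
}

// Two groups of alternatives plus one ungrouped caption track.
const lectureTracks = [
  { name: "Lecture video", src: "lecture_video.mp4", enabled: true },
  { name: "Lecture video", src: "lecture_video_with_signing.mp4", enabled: false },
  { name: "Lecture audio", src: "lecture_audio.mp4", enabled: true },
  { src: "captions.vtt", enabled: true },
];

const afterSwitch = selectAlternative(lectureTracks, "lecture_video_with_signing.mp4");
```

Switching to the signing video turns off the plain lecture video, while the audio group and the ungrouped caption track keep their state.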

SP: for clear audio, we have two audio tracks being an alternative to one - how would that work?

<eric_carlson> an example of @name used to specify alternate video tracks:

<eric_carlson> <video id="v1" poster="lecture.png" controls>

<eric_carlson>

<eric_carlson> <source src="lecture_audio.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track name="Lecture video" src="lecture_video_with_signing.mp4" ></track>

<eric_carlson> <track name="Lecture video" src="lecture_video.mp4" ></track>

<eric_carlson> </video>

SP: I don't think that solves the clear audio case yet - show me the markup

<eric_carlson> an example with a clear-audio alternate:

<eric_carlson> <video id="v1" poster="lecture.png" controls>

<eric_carlson>

<eric_carlson> <source src="lecture_video.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track name="Lecture audio" src="lecture_audio.mp4" ></track>

<eric_carlson> <track name="Lecture audio" src="lecture_clear_audio.mp4" ></track>

<eric_carlson> </video>

SP: isn't clear audio a way to separate the foreground speech from the background noise in two different audio resources?

SH: you could do that, but that's not how it's typically done

EC: it only has the foreground speech, so you have the alternative between full audio and speech only

<eric_carlson> <video id="v1" poster="lecture.png" controls>

<eric_carlson>

<eric_carlson> <source src="lecture_video.mp4">

<eric_carlson>

<eric_carlson> <track name="Lecture audio" src="lecture_audio.mp4" label="Lecture audio"></track>

<eric_carlson> <track name="Lecture audio" src="lecture_clear_audio.mp4" label="Lecture clear-audio"></track>

… if you have separated the two, then you don't need the original and you remove the radio selector, because it makes sense to have both available

<eric_carlson> </video>

JF: how do we expose the different alternatives to the user?

EC: the @label does this, see the just posted example

MK: how do you do video or video with burnt-in sign language, together with audio or clear audio?

SP: do we need a @kind for clear audio?

SH: we probably do, because right now the default is @kind=subtitles

<eric_carlson> example with video/video with burned in signing alternates, and audio/clear-audio alternates

SP: we could have just @kind=audio and @kind=video to say that we have alternatives to the main audio/video

<eric_carlson> because we have alternates for video and audio, we need to include a file as the <video> @src to define the "timeline"

SH: do we have a @kind=dub for dubbed audio?

… we probably need to find an extension mechanism for @kind

<eric_carlson> eg. to give the element a duration:

<eric_carlson> <video id="v1" poster="lecture.png" controls>

<eric_carlson>

<eric_carlson> <source src="timeline.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track name="Lecture video" src="lecture_video_with_signing.mp4" label="Lecture video"></track>

<eric_carlson> <track name="Lecture video" src="lecture_video.mp4" label="Lecture video"></track>

<eric_carlson>

<eric_carlson> <track name="Lecture audio" src="lecture_audio.mp4" label="Lecture audio"></track>

<eric_carlson> <track name="Lecture audio" src="lecture_clear_audio.mp4" label="Lecture clear-audio"></track>

<eric_carlson> </video>

SP: what do you display by default?

EC: in the absence of a user preference, you display the first one

SH: we display nothing, since default is OFF

EC: might not make sense...

SH: it's showing the poster and since the author did the "trick" with the empty @src file, they would deal with that situation

MK: so the browser does not know which is the clear audio

EC: correct, we need a @kind

<eric_carlson> same example, with "mode=enabled" for the default tracks and with @kind so the browser can identify the track kinds:

<eric_carlson> <video id="v1" poster="lecture.png" controls>

<eric_carlson>

<eric_carlson> <source src="timeline.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track name="Lecture video" src="lecture_video_with_signing.mp4" mode="enabled" label="Lecture video"></track>

<eric_carlson> <track name="Lecture video" kind="signing" src="lecture_video.mp4" label="Lecture video"></track>

<eric_carlson>

<eric_carlson> <track name="Lecture audio" src="lecture_audio.mp4" mode="enabled" label="Lecture audio"></track>

<eric_carlson> <track name="Lecture audio" kind="clearaudio" src="lecture_clear_audio.mp4" label="Lecture clear-audio"></track>

<eric_carlson> </video>

SP: you put the @kind on the wrong sign language track ;-)

<eric_carlson> this time with fewer errors:

<eric_carlson> <video id="v1" poster="lecture.png" controls>

<eric_carlson>

<eric_carlson> <source src="timeline.mp4" type="video/mp4">

<eric_carlson>

<eric_carlson> <track name="Lecture video" src="lecture_video.mp4" mode="enabled" label="Lecture video"></track>

<eric_carlson> <track name="Lecture video" kind="signing" src="lecture_video_with_signing.mp4" label="Lecture video"></track>

<eric_carlson>

<eric_carlson> <track name="Lecture audio" src="lecture_audio.mp4" mode="enabled" label="Lecture audio"></track>

<eric_carlson> <track name="Lecture audio" kind="clearaudio" src="lecture_clear_audio.mp4" label="Lecture clear-audio"></track>

<eric_carlson> </video>

SP: so we now have @kind=signings and @kind=clearaudio

SH: can we have free-text in @kind, too, or some sort of registry?

EC: we need a defined set so that the UAs can map from user preferences
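
A minimal sketch of how a UA might map user preferences onto a defined set of @kind values within one group of alternatives. The function `pickTrack` and the `fallbackToFirst` switch are hypothetical: the switch only exists to show the two defaults argued for earlier (EC: display the first one; SH: display nothing).

```javascript
// Hypothetical helper: picks one track from a group of alternatives.
// `preferredKinds` lists the @kind values the user has asked for.
// When no track matches, either the first track or nothing is chosen,
// depending on `fallbackToFirst`.
function pickTrack(group, preferredKinds, fallbackToFirst) {
  const match = group.find((t) => preferredKinds.includes(t.kind));
  if (match) return match;
  return fallbackToFirst && group.length > 0 ? group[0] : null;
}

// The audio alternatives from the example above.
const audioGroup = [
  { src: "lecture_audio.mp4" },
  { src: "lecture_clear_audio.mp4", kind: "clearaudio" },
];

const forClearAudioUser = pickTrack(audioGroup, ["clearaudio"], true);
const forDefaultUser = pickTrack(audioGroup, [], true);
const defaultOff = pickTrack(audioGroup, [], false);
```

A user who prefers clear audio gets lecture_clear_audio.mp4; without a preference the result is either the first track or nothing, depending on which default is adopted.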

SP: I'd prefer to keep a set of pre-defined ones and leave it open to extension, but without prescribing any formatting for the extensions

… it would be better if it worked simply by convention rather than by specification, e.g. MS using ms-xxx as a new @kind value

JF: do we need to differentiate descriptions and extended descriptions?

SH: we need a semantic replacement for pauseOnExit first before we can decide on this

SP: let's move on from markup to rendering

SH: we haven't got a general rule of rendering text tracks either

SP: I'd like to see us define it in a way that without any extra markup we get a sensible default rendering

SH: start the rendering at the top left corner and render all tracks from there

<inserted> scribenick: frankolivier

SP: I started thinking about this based on what people do today
... Three different ways
... One: Videos are rendered in a tiled fashion (css box model)
... They are arranged so that they are all visible

SH: Two: Have a big video, the other videos are on the side

SP: Three: Picture in picture
... Would be nice if the UA could do one of the three by default and the user could switch to the others

SH: I prefer option 1
... Problem with option one is that it relies on 'magic' CSS; you have to size the boxes so that they all stay visible
... Until you know how many videos there are, you can't do the actual layout
... By default, if you put things into a flow layout, it will use the intrinsic height
... You would need a new css concept

EC: Another option would probably be best

SH: If we don't say video is a viewport - but a containing box, then you would get a vertical list. This makes the most sense.
... We do have to have it not clipped
... Video should not be a viewport, it should be a css containing box

Scribe correction: EC: We would not have these issues if we solved video a11y with multiple videos

SH: We should explore: A main video with another video (with no controls) as the a11y solution
... This does not give you the containing box design

EC: Page author now has to position video elements
... not a big burden
... You're asking somebody to include an extra element, write css, to add logic, to make it look like it usually looks (picture in picture)

SP: You have to position them in the <track> idea as well?

EC: No, if you want it in top-left, you are done
... You could set the right and bottom

SP: One other difference: <track> concept has the idea of automatically creating a caption/subtitles menu

EC,SH: No, you connect the videos together, so this should not be an issue

EC: If you have a file as the main resource and slaved resources, you could control the visibility with css and the audio with mute

SH: <track> concept and two <video> concept are very similar

EC: All the stuff we invented last night (audio track interfaces, video track interfaces, text track interfaces) go away

SH: We take <track> out of video, it can be a child of any element

SP: I would not go that far

SH: <track> (like slave <video>) would key off of main video timeline
... Link audio, video, track to main video element
... You *could* make <track> link to an animated gif

SP: Mozilla likes this approach
... I would like to have this approach explored

SH: I am in favor of this approach

EC: Not decided yet

SP: We need to submit a proposal by Tuesday

SH: I think we should submit #10 in the wiki
... It is inventing more stuff than we might need, but it covers all use cases

EC: I want to have discussion now around the best approach

(SH needs to leave for airport)

EC: Does not want to settle for less functionality

<silvia> FO: is there any example of an element in HTML5 where an element acts differently depending on whether it is related to something else?

<silvia> EC: in SVG there is "use"

<JF> EC: hesitant to pursue a re-examination of the past day and a half of work

<silvia> scribe: JF

EC: discuss potential problems with the APIs on slave elements

seeking on the thing that is slave is conceptually not a problem

FO: you should set the slave bit on the video element when you create it

There should be an attribute that indicates that from creation this is a slave element

SF: This is what Option 6 does

links to the id of the master video

<silvia> http://www.w3.org/WAI/PF/HTML/wiki/Media_Multitrack_Media_API#.286.29_Synchronize_separate_media_elements_through_attributes

EC: Would we disallow seeking until media samples are available for all slave content?

FO: agrees in principle

EC: So once it begins to load, it's the "and" of the seekable region

FO: the master element must have all samples from master and slave before you draw a frame

other option is "that's complex" so browser does best effort

SP: if separate elements give more leeway

FO: doing the "and" is significantly more difficult
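
A minimal sketch of the "and" of the seekable regions mentioned above, read here as the intersection of the seekable time ranges of the master and all slave resources. The function names and the array-of-[start, end] representation are assumptions for illustration; a real implementation would work on TimeRanges objects and would have to deal with ranges changing while resources load.

```javascript
// Intersects two lists of [start, end] ranges (seconds). Each list is
// assumed to be sorted and non-overlapping, as TimeRanges are.
function intersectRanges(a, b) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    const start = Math.max(a[i][0], b[j][0]);
    const end = Math.min(a[i][1], b[j][1]);
    if (start < end) out.push([start, end]); // keep non-empty overlaps only
    // Advance whichever range finishes first.
    if (a[i][1] < b[j][1]) i++;
    else j++;
  }
  return out;
}

// Hypothetical helper: the region in which every resource can be seeked.
function commonSeekable(allRanges) {
  if (allRanges.length === 0) return [];
  return allRanges.reduce(intersectRanges);
}

const masterSeekable = [[0, 60]];
const signingSeekable = [[0, 20], [30, 50]];
const clearAudioSeekable = [[10, 40]];

const seekableTogether = commonSeekable([masterSeekable, signingSeekable, clearAudioSeekable]);
// seekableTogether is [[10, 20], [30, 40]]
```

The combined region shrinks as soon as any one resource has a gap, which illustrates why a best-effort approach was raised as the alternative.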

EC: this is a quality of service issue. goal is to have tighter sync than you can do with script
... even if everything is in markup, we must accept that things get parsed in random order

flow control all needs to happen on the master clock

that must happen no matter what

allowing multiple controllers can be visually confusing, but won't really cause harm

SF: do we disable controls on the slave element?

EC: it is always controlling the clock of the master

duration is a good question: what do you do when a slave is longer than master

EC: one disadvantage/confusing - there is no mix-down volume control
... volume control on tracks will need to be handled by script
... we don't really need to worry about what happens if somebody shows multiple controls

is not a useful use case

<silvia> .. all interaction is forwarded through to the master

<silvia> … script is required to set volume and mute/unmute

discussion whether to make track a top level element - not much appetite

EC: volume and mute on the master controls the mix-down

SP: We've not talked about mode

EC: We don't need that

FO: inband is completely separate

EC: inband or container files on the web rarely contain more than 1 media file

If you have a file with inband 1 audio and 1 video, you can control both with one set of controls

FO: would a track element have multiple sources?

EC: The source element is useful in 2 situations. 1 is for multiple encodings.

leave source selection to mime-types

@kind attribute on slave (audio and video element) is important

rate and time apply to all of them

anything that affects the clock

(discussion around manifest files)

EC: without a concrete use case it's hard to define a solution

want to be able to filter by attribute
