W3C

Timed Text Working Group Teleconference

15 September 2022

Attendees

Present
Alexander_Flenniken, Andreas, Atsushi, David_Singer, François, Gary, Hew, Kim_Patch, Nigel, Pierre
Regrets
-
Chair
Gary, Nigel
Scribe
cyril, gkatsev, nigel

Meeting minutes

This meeting

nigel: welcome, this is Thursday at TPAC, first TTWG meeting
… earlier had a joint meeting with MEIG
… later on today we have an APA joint meeting
… Accessible Platforms Architecture
… to talk about Synchronization Accessible User Requirements
… Our agenda is primarily with the DAPT
… and there are a few TTML issues to cover as well
… Is there anything else for today's call?
… We will get to intros in a moment
… Tomorrow's meeting is much longer with a lot more topics
… Might come back to more topics
… Rechartering moved slightly to accommodate Philippe
… And joint with Media WG is tomorrow
… Anything else today or tomorrow?

Gary: Tomorrow, with Media WG may be worth bringing up text based audio descriptions

Nigel: From the breakout session yesterday?

Gary: Yes

nigel: Apple showed using WebVTT to drive speech synthesis to provide audio descriptions
… I also led a meeting about getting user data from the video element without breaking user privacy
… there should be minutes for those sessions
… Independent Interoperable Implementations meeting was yesterday as well
… should we mention both WebVTT and DAPT work?

Gary: why not?

nigel: Introductions
… you probably noticed that Gary and I are the chairs

Alexander: I work at Bocoup. I'm interested because I worked on a personal project and implemented captions
… and wanted to know what's up

atai: I'm Andreas Tai from Munich, I'm an independent consultant and previously worked for German broadcasters

atsushi: I'm a W3C contact for the TTWG

Francois: I work for the W3C where I track media related activity

Hew: Hew Maxwell, I work for Red Bee Media. We produce a lot of AD.

cyril: Cyril, I work for Netflix. Currently working on the DAPT document and WebVTT/TTML for Netflix

DAPT-REQs issues

nigel: our first agenda topic is DAPT reqs and issues
… I think the issues are fairly straightforward
… Should we start with the new one?
… Comment from Simon who works at Yella Umbrella
… wants to adjust process step 4
… about defining audio mixing instructions for AD (AD only)
… that can be useful for the dubbing workflow, where the original M&E track isn't available
… or it's desirable to mix over original voice
… Does that make sense?

all: yeah

nigel: I better open a PR for that
… issue #16, amount and rate of gain in 2.7
… in the same section as above
… [quotes from the document]
… I don't quite understand the ambiguity, but might need a rewrite to make it clearer
… issue 7, event will come up later, we better cover this
… in the reqs, we have set of events
… document defines list of intervals called an event

cyril: did you respond to nested events?

nigel: definition of event is list of events with begin and end times
… I struggle with events, good to open up for discussion
… from our discussion cyril, we weren't happy with the name "event"
… comes up with the data model
… there are timed entities at the top and we need a name for them
… calling them event is a new bit of terminology
… doesn't exist in TTML
… WebVTT has a distinct concept of cue
… and has a close analogy, but not quite the same

atai: I remember, I think, the concern is that the name overlaps with other things like JavaScript
… I could imagine integrating another programming language could get very confusing
… maybe if it is prefixed to make it special to the profile
… could still be event but prefixed

hew: speech event?

cyril: yeah, you may want to represent on-screen text, which won't be speech
… when you're describing the content, there may be text you want to represent

hew: is the output speech? input may not be speech
… but isn't the output always speech?

cyril: yes, but we have different types of scripts
… you can't script pre-recorded
… intermediate steps may not be speech events
… may not have content in it

atai: could call dapt:event
… but could be confusing because event in xml could also have a namespace prefix

nigel: although, we want to map to existing TTML rather than a new entity

cyril: maybe provide a description but still call it event?

nigel: still could cause confusion

cyril: like DOM events?
… I don't think there'll be a lot of overlap

nigel: I did some research into how script writing tools handle this
… and best I can find is paragraph
… we may want to make these divs, but if it's called paragraph someone might want to make it p

cyril: I think we should keep it event, I think overlap is small

nigel: would script event work?

cyril: still JavaScript script

atai: it's hard to know intuitively by the name alone
… a more descriptive name would be better

cyril: like script interval?

nigel: that's quite general purpose
… script interval sounds just like a time range
… but it needs to be associated with an element at the end

cyril: could have someone else

nigel: proposals:
… keep "event"
… "speech event"
… "script interval"
… "script event"

atai: I like "script event" because it's the closest to what it is

nigel: event sounds like a single moment in time
… it's a range and not a point in time

hew: events need time to happen
… script event sounds fine to me

nigel: we have a shared editorial goal to make it as easy to read as possible
… "script event" would be easiest to tweak to see how it works

gkatsev: do we need to answer it right now?

nigel: we do want a FPWD soon (today maybe?)
… we're going to try script event, but we're open to other suggestions

cyril: what about nesting?
… the fact that audio fade in and fade out
… are these two different events?

nigel: we need to be clear about nesting

cyril: I wouldn't call it nesting, but variations
… phases, maybe. Not nesting

gkatsev: I think that WebVTT calls time-overlapping cues nested cues
… as a data point

atai: maybe write it as one word?

nigel: I don't think this will be an element name, but a div or p
… part of the data model is to relate to the TTML
… script is made from script events
… and TTML is like this
… data model and data structure use different terminology, slightly

cyril: model defines the terms
… TTML defines the syntax, but no new terms

nigel: where we have these nested like in example 5
… of ADPT
… these could be timed steps

cyril: script event is the high level concept

nigel: entity?

cyril: for example, AD spoken at this time
… point 2 or point 3 seconds is a bit of a detail

nigel: we're not calling them sub events

cyril: I don't think we need to name them
… I think it's AD specific
… not applicable to dubbing

nigel: actually, we just considered something for dubbing, which was AD only previously

cyril: I understood it as mixing and not animation of gain

nigel: animation is part of it
… so if you need to duck audio for the dub
… you definitely would want to lower the gain

cyril: and certainly a curve

nigel: I think we've got an editorial action there we can do in the reqs and DAPT itself
… issue 6 from cyril

cyril: I created the issue based on your comment Nigel

nigel: this is a data modeling point
… maybe we can pass it by and not worry too much
… [showing some ADPT reqs]
… I think maybe not an issue

cyril: let's close it then

nigel: event doesn't need to have content initially
… could have an empty div with times in it

cyril: yes, especially in step 1, the first step is identifying gaps

nigel: identify a set of script events
… yeah, let's close it
… definition of script event doesn't require content
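
The point agreed above, that a script event is a timed range that does not need content yet, can be illustrated with a minimal sketch. The class names, field names and the use of seconds are assumptions made for illustration only; they are not taken from the DAPT draft, which maps the model onto TTML syntax rather than onto any programming API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScriptEvent:
    """Hypothetical model of a script event: a time range with optional text."""
    begin: float                 # seconds from the start of the media (assumed unit)
    end: float                   # seconds; must be later than begin
    text: Optional[str] = None   # may be absent, e.g. when only a gap is identified

    def __post_init__(self) -> None:
        # A script event is a range, so a zero or negative duration is rejected.
        if self.end <= self.begin:
            raise ValueError("end must be later than begin")

    @property
    def is_empty(self) -> bool:
        return not self.text


@dataclass
class Script:
    """Hypothetical container: a script is made from script events."""
    events: List[ScriptEvent] = field(default_factory=list)

    def events_without_content(self) -> List[ScriptEvent]:
        # Events that have timing but no text yet (e.g. output of "identify gaps").
        return [event for event in self.events if event.is_empty]


script = Script([
    ScriptEvent(begin=1.0, end=4.5),                        # gap identified, no text yet
    ScriptEvent(begin=10.0, end=12.0, text="A door slams."),
])
```

In this sketch the first event corresponds to the "empty div with times in it" case: it is valid even though no description has been written for it.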

cyril: last issue is the MAUR

nigel: I think we can just do that
… I'll take the action to update the DAPT reqs
… those are all the issues on dapt-reqs
… I'll try to do an editorial pass later this afternoon

DAPT issues

cyril: if we look on issues on the document, there's plenty

nigel: if you've been swamped, cyril and I have been doing lots of editorial work
… and created lots of issues for editorial work
… should we describe where we are with editing?

cyril: major changes we've made
… the introduction
… and adding step-by-step examples
… they can start reading top to bottom
… and will be less dry than directly into technical aspects
… Nigel has been porting structure of ADPT work
… by adding sections like Definitions and Conformance

nigel: worth saying why at this point
… When I drafted ADPT, I closely followed IMSC
… to make it a profile of IMSC2
… to make things required or options. That kind of thing.
… If need be, define additional features
… formally, I think that's what it is
… what's the best way to make it a usable spec, while making the required formal steps

cyril: I think we have an agreement that it should be easy for the reader
… and not need to be an expert in TTML
… but at the same time we want it to be Correct
… in the sense that it should be based on TTML
… and validate that the script is TTML
… and a TTML processor should be able to process this script
… that second part shouldn't be emphasised, but not hidden
… This document is targeting authors and implementors
… They shouldn't have to go to the TTML2 spec or TTML1 spec much to start implementing
… only if they wanted to better understand things

nigel: I'm very interested on your views on this, Pierre
… based on your experience on your work on IMSC
… can you describe your thoughts?

pal: I haven't had chance to look at the latest changes
… There's at least two audiences
… First, someone who just wants a quick start
… Someone who wants to try it and get a quick result as fast as possible
… That might be done
… As soon as you try doing something more esoteric, you'll need to read the spec
… I have the same relationship with the HTML spec
… Most of the time I just had a template
… but sometimes need to refer to the spec or MDN documentation
… I think to target different sets of readers
… I would have a really good set of examples and then a detailed spec

cyril: what do you call it?

pal: in the case of IMSC, the last version of IMSC has a pragmatic approach to making a profile of TTML
… what's missing in IMSC is a section on best practices or a quick start
… "how to do simple subtitles"

nigel: there's a bit of me over the years of doing IMSC
… there are almost annoying details where it's hard to write something that explains but is also formal
… and don't leave holes

atai: I agree with cyril that possibly this spec may be read more by non-technical people
… compared to IMSC
… or other TTML profiles
… A section that those people can read to understand DAPT and use it correctly would be very valuable
… the issue with IMSC and WebVTT is making it human readable so it's easily understandable
… WebVTT has lots of examples at the beginning explaining things
… I think this is what Pierre is proposing and I think that would be a good way to do it
… I would also agree with cyril that if you need to make a decision between making it more complex or easier to understand
… maybe choose making it easier

cyril: I'd like to make 3 points
… 1. this profile will be much simpler than IMSC

+1

cyril: I feel it should be possible to write the spec slightly differently from IMSC because you don't need to understand as many complex details
… 2. is the notion of processor
… validating and content processors, lots of concepts
… In order to make it simpler in DAPT
… I suggested we define which processors we envision here
… maybe a recorder or a converter to subtitling format
… we define what conformance means for those processors
… we can say they're TTML processors
… but basic usage doesn't need to know it's a TTML processor
… 3rd point is the notion of features
… we decided to demonstrate TTML implementation via features
… and we map each to a value and what not
… that's why in IMSC 1.1 we define the features and the new features defined in IMSC 1.1
… I think that's complex to understand for a non-expert
… I understand the value of defining features
… Makes IR and conformance and coverage easier
… I'm not disagreeing with defining features
… but we shouldn't necessarily define features in the way we've done it in IMSC/TTML
… that's hard to understand for those that don't already know TTML
… Folks don't understand the big features table in the specs
… We can do a simpler job in DAPT
… maybe push it into an annex, so that it isn't first to readers

nigel: as long as it's normative

cyril: yes

nigel: I think I've got a slight uncertainty on how useful the features list
… might be slightly more useful than you might think
… an implementor can work through that list one item at a time
… which might help implementors as a kind of TODO list

cyril: I'm willing to do a first pass and propose something for features

nigel: to me it makes little difference whether it's in the main spec
… or an annex that comes at the end
… makes little practical difference
… I understand the more you scroll the harder/more complex it feels

cyril: we agree it is simpler
… but we see lots of features, even if lots are prohibited, because you have to go through it all

atai: as a data point when we started EBU-TT
… we deliberately chose not to use the feature approach because we didn't understand it
… and once we thought it might be useful to have a features list
… we then put it at the end
… for those that aren't familiar with TTML, it's hard
… we need to put ourselves in the shoes of people who are just approaching the spec

<cyril> +1

pal: just two thoughts
… moving to an annex sounds like a good idea
… a casual reader shouldn't care
… in IMSC we unfortunately had to list permitted and prohibited, because about half is permitted/prohibited

<Zakim> nigel, you wanted to mention IMSC compat

pal: in DAPT, most things are prohibited, so might simplify the table by only listing permitted and say everything else is prohibited
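
The "list only what is permitted, everything else is prohibited" idea can be sketched as a simple allowlist check. The feature designators in the set below are placeholders chosen for illustration; which features DAPT actually permits was not decided in this discussion, and the function name is hypothetical.

```python
from typing import Iterable, List

# Hypothetical allowlist: NOT the actual DAPT feature set, just placeholders.
PERMITTED_FEATURES = frozenset({"#content", "#timing", "#metadata"})


def prohibited_features(used: Iterable[str]) -> List[str]:
    """Return the features a document uses that are not on the allowlist.

    Anything not explicitly permitted is treated as prohibited, so the
    spec only needs to enumerate the permitted set.
    """
    return sorted(set(used) - PERMITTED_FEATURES)


unexpected = prohibited_features(["#timing", "#animation", "#content"])
```

With an allowlist, adding a permitted feature is a one-line change, and a validator never needs a separate table of prohibited entries.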

nigel: I wanted to mention something that's directly related
… should we base this directly on IMSC?
… we couldn't do it, because IMSC prohibited some things that we wanted in DAPT
… but we should be able to turn DAPT document into a captions document
… by allowing everything allowed in IMSC
… conformance test could have optional
… or perhaps we can just have a sentence "everything permitted in IMSC that isn't mentioned here, is optional"
… optional means you can put it into the document but you can also leave it out
… only test would be a validator could complain about TTML syntax that is used that isn't even listed as optional
… we have 3 different versions of IMSC
… can say everything in latest IMSC as optional
… or mark each feature as optional

atai: when I thought about it today, I think the beauty of the profile is that it's styling agnostic
… beyond dubbing and AD even
… I'm not sure; by allowing all IMSC features, you risk this becoming an IMSC profile
… it's not about styling and layout but just about script events
… I wouldn't highlight this in the document

nigel: what you raised there is kind of interesting
… there's a whole other approach
… we can make an IMSC 1.3 or 2 and introduce an audio profile there
… and add in audio requirements along with text/image
… and in DAPT point at the IMSC audio profile
… would only need a note about how to fulfil reqs for IMSC audio profile

cyril: my first thought is it's probably not a good idea
… IMSC is for subs, I don't see why adding audio to it
… the title of the document is profile*s*
… how would you feel if we define two profiles in this document instead of one
… central one would be without audio/animation
… and other one will be with audio/animation support
… animation is complex and audio could be complex
… in dubbing you generally don't need mixing instructions
… it seems like this might be a good idea with nested profiles
… because someone could adopt first profile and then adopt second one later

atai: generally I agree another IMSC profile isn't good
… for nesting and having two profiles, a superset and a subset maybe that works

<Zakim> nigel, you wanted to mention profile levels

atai: each profile with distinct features could be confusing

nigel: one thought I had about it, it lets people who are making dubbing authoring tools
… allows those not to add unnecessary features but say it's conforming
… and vice versa
… one thought I had is not to call it two distinct things
… maybe a level 1 and level 2 things
… to show that one is building on top of the other

cyril: we never had this notion of levels before

nigel: it's just a name

cyril: but if you receive a document and you want to know if you want to process it

nigel: you'll need two profile designators

cyril: then you have two profiles

nigel: good point

atai: as I said before, the profile itself could be used outside of AD and dubbing
… maybe as first step for captioning
… and maybe won't need multiple profiles

[David_Singer leaves]

atai: I'm not sure could be solved in a way that the features could be documented separately

nigel: not sure I quite follow

atai: what's the driver for two profiles? Confusion by people having features they won't use?
… like dubbing folks not needing any AD? and having trouble implementing the spec?
… or parsers needing to conform to both but only caring about one?
… Maybe have one profile but separate the usecases in the spec

nigel: comes to defining classes of processors that cyril mentioned
… could have a class of processors that support AD or Dubbing

Gary: WebVTT has CSS extensions with a section that says a WebVTT processor
… without a web engine can still be conformant if it doesn't support the CSS extensions

nigel: hello kim

Kim: I work with Frontline to make interactive transcripts
… and we use WebVTT
… and there's some issues we've had to jury-rig
… and I wanted to let you know
… Don't know much else about this group
… if that would be useful or maybe connect to someone

nigel: yes definitely be useful
… we could create some time now
… or if there's some edge cases on DAPT reqs
… can we come back to it in a minute?

cyril: what could help decide the structure 1 profile/2 profile
… to identify features
… what's shared is script events and timing, script type identification, and optional styling

nigel: on-screen annotation should go in the middle too
… AD would read out on-screen things

cyril: not sure if character identification would be needed in AD
… contextual language, I'm unsure about, maybe useful to make AD in another language
… maybe should be shared too

nigel: but timing could be different there

cyril: audio and animations are those that you don't need in the first pass when doing dubbing
… we can continue working on the diagram that it isn't truly nested
… character identification isn't in AD

nigel: but it isn't required in either

cyril: so we can nest it

nigel: we're over time on this and we can schedule more time tomorrow

[AOB] Kim Patch project for Duke Uni

Kim: Ended up using WebVTT and Able player
… Goal of the project was to transcribe a series of interviews for usability.
… Then Frontline has a transparency project e.g. to make available transcripts of interviews for documentaries.
… Some connected to videos.
… More than 300 transcripts, I processed them all.
… [shows screen] The transcript is beside and you can click it to go to the point in the video.
… There are chapters, which are important for navigation.
… Also a "jump to current text" with the user being able to scroll independently from the video.
… Can select text and get a URL to that quote.

PBS interviews

Kim: Things missing in VTT: multiple chapters
… We ended up using Note to jury-rig chapters and sub-heads.
… Also wanted paragraphs.
… We jury-rigged that with a Note - Paragraph tag.
… Would be great if that was part of the spec.
… Really needed paragraph and sub-head tag, those were the two main things.
… The other thing is: it's important that people can read this, edit and fix it.
… The back-end needed to be easy to read, so WebVTT was a good format. But we can use HTML.
… Also added to Able Player to allow for search to show moments on the playback timeline for example
… to indicate the times for search results.
… WebVTT worked really well and there was nothing else that would have worked.

Cyril: Are you generating WebVTT on the fly from the text or some original script? What is the source?

Kim: A couple of sources. One is a template we made, another is a tool called Audio Notetaker, which is
… intended for dyslexic kids, I wish there were an open source equivalent.
… [shows app] The audio shows as chunks with pauses. You can section the text by sentence,
… You always want to come in at the beginning of a sentence.
… Then when we do the processing it is listening at double speed with a foot pedal,
… the text sentences are automatic, the foot pedal process lets you go around again really rapidly,
… then the tool lets me export into WebVTT.
… Shows NOTE paragraph entries in the VTT.
… We quickly processed these to get 60 interviews up at the same time as the documentary.

Cyril: This tool is VTT aware and separates Notes from text with timing?

Kim: Correct, [shows time and audio related to VTT cues] that's how we do this quickly.
… In general it's important to allow people to do transcripts quickly.
… These tools could be better.

<Zakim> gkatsev, you wanted to ask about paragraph tag

Gary: What exactly do you need from the paragraph tag? Just a long block of text?
… Is the regular cue not enough?

Kim: It's a blank space.

Gary: You're combining multiple cues and the paragraph note puts the cues together?

Kim: Yes
… [shows how the text transcript combines the cues between paragraph notes as single paragraphs]
… NOTE and then some special words is a good way to do it, it's how people are doing it already.
… Can also do subheads the same way
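
The convention Kim describes, where a NOTE block with a special word marks a paragraph boundary and the cues between markers are combined into one paragraph, can be sketched roughly as follows. The exact marker text ("NOTE Paragraph"), the function name and the sample file are assumptions for illustration; this is a simplified block splitter, not a conforming WebVTT parser, and it is not the tooling shown in the meeting.

```python
import re
from typing import List

PARAGRAPH_MARKER = "NOTE Paragraph"  # assumed spelling of the jury-rigged marker

SAMPLE_VTT = """WEBVTT

NOTE Paragraph

00:00:00.000 --> 00:00:02.000
First sentence.

00:00:02.000 --> 00:00:04.000
Second sentence.

NOTE Paragraph

00:00:04.000 --> 00:00:06.000
A new paragraph.
"""


def paragraphs_from_vtt(vtt_text: str) -> List[str]:
    """Join cue text into paragraphs, splitting at each paragraph marker."""
    # WebVTT blocks are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", vtt_text.strip())
    paragraphs: List[str] = []
    current: List[str] = []
    for block in blocks[1:]:  # blocks[0] is the "WEBVTT" header block
        lines = block.strip().splitlines()
        if lines[0].strip() == PARAGRAPH_MARKER:
            # A marker closes the paragraph collected so far, if any.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        if lines[0].startswith("NOTE"):
            continue  # ordinary comment block, ignored here
        # A cue block has an optional identifier line, then the timing line.
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue  # not a cue (e.g. STYLE or REGION block)
        payload = " ".join(line.strip() for line in lines[timing + 1:])
        if payload:
            current.append(payload)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


result = paragraphs_from_vtt(SAMPLE_VTT)
```

Because NOTE blocks are comments in WebVTT, a player that does not know this convention simply ignores the markers, which is why the approach works with existing tools. Sub-heads could be handled the same way with a second assumed marker.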

TTML2 Issues

TTML2 issues marked as Agenda

ttm:role values in the registry

github: https://github.com/w3c/ttml2/issues/1248

nigel: since we published this version of TTML2, there is now a Registry track in the Process with its own publication cycle
… it's recognized that it's just data values
… I'd like to modify this in TTML2 to reference the registry
… at the very least we need to assign a document (possibly the wiki) as the registry and reference it from the spec

cyril: we already do

nigel: it's informative

nigel: I'll look into reversing that
… and btw it's exactly the same in TTML1
… any thoughts or comments?

atai: currently the role registry is not normative
… there is no rule to add values, constraints on duplicate values ...
… if a new role is defined in a TTWG spec it would not go into TTML2 but only in the registry and be normative

nigel: correct
… it allows us to change that list without having to redo the entire spec

nigel: no objections?

(silence)

nigel: we need to determine the place (wiki or not) and rules for registry changes
… I take that as an action
… Proposal: Define a registry track document with the list of TTML role values and normatively reference it from TTML2

atsushi: I'm not sure about the registry

RESOLUTION: Define a registry track document with the list of TTML role values and normatively reference it from TTML2

Permit ttm:role attribute in ttm:desc elements w3c/ttml2#1247

<nigel> github: https://github.com/w3c/ttml2/issues/1247

nigel: this is a related, but different issue about TTML role
… currently the TTML2 role attribute talks about "content element" (content module, audio, ...)
… it specifically does not apply to a ttm:desc element
… however, this is in use in real world applications

<Zakim> atsushi, you wanted to discuss not sure but registry is normatively referencable???

nigel: so I'm proposing to allow role on the desc

atai: I agree
… why do we have the restrictions that metadata cannot be put on some elements
… why not allow all TTM attributes?
… why just for role?

nigel: there are only 2
… agent and role
… not sure an agent applies to metadata

atai: I do not see a harm
… TTML2 is usually very liberal
… even if it does not make sense now
… for example, agent on a br element

nigel: you're probably right
… it would make sense to be consistently liberal

hew: there is no high cost to allowing it

nigel: there would be an editorial action to double check the schema
… any objections?

(silence)

RESOLUTION: We update TTML2 to allow metadata attributes on metadata elements

Meeting close, for today

Nigel: Thanks everyone, we have an APA joint meeting in 2 hours, then we're back at 8am Vancouver time.
… [adjourns meeting]

Summary of resolutions

  1. Define a registry track document with the list of TTML role values and normatively reference it from TTML2
  2. We update TTML2 to allow metadata attributes on metadata elements
Minutes manually created (not a transcript), formatted by scribe.perl version 192 (Tue Jun 28 16:55:30 2022 UTC).