Timed Text Working Group Teleconference

27 Oct 2014

See also: IRC log


Present: nigel, courtney, Cyril, pal, glenn, fantasai, ddahl, tidoust, noriya, khoya, jdsmith, hiroto, erik_carlson
Regrets: dsinger, andreas, frans
Scribes: nigel, Cyril, courtney


<trackbot> Date: 27 October 2014

<nigel> scribeNick: nigel


All members introduce themselves. Observers: Jangmuk Cho, LG Electronics, interested in TTML and WebVTT

nigel: Summarises agenda, offers opportunity for other business

mike: There's an incoming liaison expected from MPEG

nigel: adds it to agenda

TTML Codecs Registry


nigel: We've agreed to host a registry and define a parameter
... We need to work out where to define the syntax normatively
... And where to note the media registration once we've updated it with IANA.

Cyril: I had two comments on the registry:
... 1. The discussion that talks about the first order detection of capabilities.
... 2. The editorial nature of stpp vs application/ttml+xml

Mike: We proposed to MPEG that they have their part and W3C has its part.

glenn: That's true - we need to remove the prefix requirement on the registry page.

Cyril: I'd rather delete the sentence "When an entry of this registry is used in a codecs parameter..."
... Actually the whole paragraph - it's up to MPEG to define any codecs parameter, and we can define the
... suffix.

group discusses whether any reference to RFC6381 is needed at all

glenn: I got rid of the RFC6381 references and put the combinatorial operators in.

pal: The AND and OR operators are normative, so it needs to be clearly defined.

cyril: +1

glenn: First we have to define where we're going to specify the normative definition of this new parameter - it shouldn't end up in the registry.

cyril: +1

glenn: When we have it somewhere else we can refer to then we can shorten the registry page.
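For illustration only - the short names and exact combination syntax were still under discussion at this point, and the names below are hypothetical - a codecs value built from registry short names might look like:

```
stpp.ttml.im1t          a single registered short name
stpp.ttml.im1t|etd1     OR operator: support for either profile suffices
```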

Cyril: Can we also discuss the 1st order aspect, where it says that the processor profile is guidance only and may not always be correct?
... We should be strict here.

glenn: My position is that what's in the TTML document is what's authoritative, because it stays with the content.

cyril: Both have to be the same.

glenn: Even if you say they have to be the same, it's possible for them to diverge. Elsewhere, type identifiers are always documented as hints
... and the actual data is where you determine the concrete type.

Cyril: I agree, but we need to state it more strongly: it is an error in general to have a mismatch.

glenn: It wouldn't be an error in the document.

Cyril: I agree - if the two differ then that's an error and the value in the document has precedence.

nigel: There's a lifecycle issue there - a new processor may come along that can process older documents.

glenn: So the outer parameter may reference a new superset profile?

nigel: exactly.

glenn: So the rule should be about consistency not identity.

nigel: The options for where to put it seem to be:
... 1. An erratum to TTML1
... 2. TTML2
... 3. A WG Note
... 4. A new Recommendation.

group seems to think TTML2 may be the best place

glenn: I'm willing to add it to TTML2 but do not wish to repeat the whole registration section.


Glenn: We need to tweak section 3.1 in TTML2 "Content Conformance"

Cyril: I would rather add an annex in TTML2 called Media Types Registration referencing TTML1 plus diffs.

glenn: That works for me.
... I think we can avoid updating the IANA registration
... since there's no wording to prohibit adding new parameters.

Mike: I'm not so sure about that.

nigel: I don't think we can do that either can we?

glenn: In TTML2 we should define the new parameter and syntax and then reference it in 3.1.

Cyril: We should put it in a Media Type Registration annex so that it's clear. This then uses a reference to TTML1 with
... the diffs.

glenn: I'd like to call the annex something like 'Additional parameters for use with TTML media type'.

Mike: let's check with plh too.

all agree to point back to TTML1 media registration.

nigel: It's an open question if we genuinely need to update the IANA registration.

Cyril: In the SVG WG we've made a comment on the charter about how documents and versions of documents in TR/
... should be handled. We have the same problem here: TTML1 doesn't talk about TTML2. It points only to the latest
... stable version of TTML1.

pal: They're two different specs - there's no absolute guarantee for backward compatibility.

Cyril: And you're using the same MIME type? SVG 2 is backward compatible with SVG 1.

glenn: TTML2 is backward compatible in the sense that a TTML1 processor would process a TTML2 document, practically speaking.
... I defined a new #version feature in TTML2 - if you want to author a document for TTML2 and prevent it from being
... processed by a TTML1 processor then you could do that by using the profile mechanism. If you don't do that then
... there's no reason that a TTML1 processor could not process it by ignoring things it doesn't understand.
... We explicitly stated that the XML namespace for TTML is mutable.

Cyril: The issue is that the group is giving the signal that TTML1SE is the latest version whereas actually everyone is
... working on TTML2. If a TTML1 document can be considered a TTML2 document...

pal: Setting aside the media registration, TTML1 and TTML2 are different specs, albeit with a common pedigree.

Cyril: If I search for TTML now, I'll find many documents and if I hit the TTML1 document it will look like the latest version
... but that's probably not what I want.

pal: But that's right - the latest stable version now is TTML1. Even when TTML2 is a Rec TTML1 will be valid.

glenn: Think about XML - XML 1.0 and XML 1.1 are not entirely compatible but are still being updated.
... In TTML1 we may publish a Third Edition incorporating the errata.

pal: IMSC 1 is based on TTML1 for example.

glenn: In the latest version we actually include "ttml1" in the URL - TTML2 will be a different URL.

<glenn> ACTION: glenn to update point (1) of section 3.1 in ttml2 to refer to a new annex that defines new processorProfiles MIME type parameter [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action01]

<trackbot> Created ACTION-343 - Update point (1) of section 3.1 in ttml2 to refer to a new annex that defines new processorprofiles mime type parameter [on Glenn Adams - due 2014-11-03].

<inserted> Returning to Cyril's point about spec versioning, backwards compatibility and the relationship between versions of TTML

mike: There are a number of W3C specs that have this problem.

Cyril: And that's a problem.

pal: It's even worse if we don't make it clear that there are two different specs.

Cyril: This is going to stay a problem for anyone searching for TTML

<pal> http://www.w3.org/XML/Core/

<glenn> http://www.w3.org/TR/CSS/

pal: The XML WG is explicit about its different publications.

glenn: The CSS WG explains this with "Levels" and we could do that too, with a top level TTML uber-document
... as a WG Note to explain the relationship.

Cyril: I'll volunteer to write a similar note.

<scribe> ACTION: cyril Draft a WG note explaining the differences and relationships between the various versions of TTML [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action02]

<trackbot> Created ACTION-344 - Draft a wg note explaining the differences and relationships between the various versions of ttml [on Cyril Concolato - due 2014-11-03].

nigel: That's helpful. Now what do we do about MIME types which may be the same for TTML1 and TTML2 documents?

glenn: That's a good question. In the TTML2 spec I defined a new set of baseline profiles, and a new ttp:version attribute.
... In §5.2.3.1 of TTML2 we list the 3 profiles from TTML1, plus SDP-US, plus 3 new profiles specific to TTML2 including newly defined features.
... One way to do it is to use different short names in the registry for each of these profiles.
... The ttp:version attribute is a little orthogonal. It states which version of TTML was used in authoring a document instance.
... It's required if the document requires support for a feature not present in TTML1.
... And the Note mentions that the computed value of the attribute is used by the construct default processor profile procedure.
... Omitting the attribute causes the default to be one of the TTML1 profiles.
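A sketch of what glenn describes (the ttp:version attribute name and value follow the TTML2 drafts; element content is elided):

```xml
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    ttp:version="2">
  <head><!-- profile, styling, layout --></head>
  <body><!-- content requiring a TTML2 feature --></body>
</tt>
```

Omitting ttp:version causes the default processor profile to be one of the TTML1 profiles, per the construct default processor profile procedure glenn mentions.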

Cyril: This makes the processorProfiles parameter in the MIME type even more important.

glenn: It's important for any kind of external filtering.

Cyril: So there's the version attribute, the profile designator and externally there's processorProfiles that would use
... different short names for TTML2 than for TTML1.

glenn: Yes. In the new annex we should mention that designating externally that something is a profile could result
... in a false negative, or a false positive. A false negative in the sense that the context may think it can't process, but
... within the document it could say 'start with a profile and make a feature optional' at a very fine grained level. This could
... add or remove feature requirements. So if the external parameter indicates that the processor can not process, but
... the internal parameter says it can, then it would be a false negative. The reverse is also true: the document may require
... an extension feature, giving a false positive.

nigel: But you could just locally define a new profile short name for an extension, and put that in the external processorProfiles parameter with the AND operator.
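Sketching nigel's suggestion (the short names here are hypothetical, and the parameter syntax was not yet normatively defined):

```
Content-Type: application/ttml+xml;processorProfiles="im1t+myorg-ext"
```

Here `im1t` stands for a registered short name, `myorg-ext` for a locally defined short name for the extension, and `+` for the AND operator: the processor must support both.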

glenn: There's also the question of, in the absence of the external parameter, what should the semantics be?

Cyril: it should be less restrictive.

glenn: It should certainly be no more restrictive than what's in the document. The problem is there may be some cases
... where there's no way to express the complexity in the external parameter.
... One of my observations is that people want to simplify the profiles mechanism in the external parameter. Is that what people are thinking?

pal: Unless we put something in then people will just define their own way of doing it.

courtney: From a parser-writing perspective there's no efficient way to look at the supported profile requirements. You have to parse the whole thing.

<pal> pal: doing it == "signal that a document conforms to one or more specifications"

glenn: It's not that bad - you can just parse the head.

courtney: You still have to parse the whole head.

glenn: It's much better than SVG 1.1 which mandates full parsing to determine if it is a well formed XML document, even down to the last closing tag.

courtney: So there's no goal to efficiently reject files?

glenn: We just used a similar mechanism to SVG. We assumed that the head would be parsed before the body.
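glenn's point - that a processor need only parse as far as the end of the head to find the profile declaration - can be sketched as follows. Element and attribute names follow TTML1; the function name is ours.

```python
import io
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/ns/ttml"
TTP_NS = "http://www.w3.org/ns/ttml#parameter"

def extract_profile(doc_bytes):
    """Return the ttp:profile designator from a TTML document,
    stopping as soon as the head element has been fully parsed,
    so the body is never examined."""
    profile = None
    events = ET.iterparse(io.BytesIO(doc_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start" and elem.tag == f"{{{TT_NS}}}tt":
            # attributes are available as soon as the start tag is seen
            profile = elem.get(f"{{{TTP_NS}}}profile", profile)
        if event == "end" and elem.tag == f"{{{TT_NS}}}head":
            break
    return profile
```

This is still streaming rather than constant-time, which is courtney's objection: the whole head must be parsed before the document can be rejected.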

courtney: It defeats the purpose of profiles to allow people to pick and choose features.

pal: If this group doesn't do it then others will.

nigel: But they probably wouldn't include the feature addition and subtraction semantics.

glenn: But the short-name registry doesn't define what a document conforms to.

group discusses the topic further (scribe misses details)

cyril: What does EBU-TT-D say?

nigel: It permits extensions by default, but wouldn't do anything with them. It doesn't use profiles.
... Andreas raised the query about how complex it is to register a short name - the email discussion seems to indicate
... that there's no requirement to create a full profile definition document; a short name can be registered with the
... details from the spec document in the absence of a full profile document.
... By the way, as I think Dave mentioned a while back, you can also just define a new short name if you don't want to hit
... the complexity of the internal profile mechanism.

pal: So we're okay to keep the same application/ttml+xml mime type and use the processorProfiles parameter to
... distinguish between TTML1 and TTML2, and the processors within them.

RESOLUTION: We will document the syntax for the external processorProfiles parameter in TTML2.
... We will reuse the same media type in TTML2 as in TTML1 but recommend using the processorProfiles external parameter to differentiate processor requirements.

Cyril: what about the registry? Where should it go? The Media Source Extensions registry is published not on a wiki but as a document.

glenn: the usual practice in W3C is to use a wiki page.

<Cyril> https://dvcs.w3.org/hg/html-media/raw-file/default/media-source/byte-stream-format-registry.html

Cyril: For MPEG the registry needs to have a stable URL.

nigel and glenn: We can keep a stable URL with a wiki.

RESOLUTION: We will host the registry on the wiki (subject to edits as discussed today)

nigel: Opens up MPEG response for those in the room to see.

Cyril: MPEG accepts the registry proposal from W3C.
... MPEG has no comments on IMSC at this time.
... There was also discussion of track header width and height in the ISO BMFF. In many cases it's intentionally unknown what the TTML extent is.
... So MPEG drafted corrigenda to 14496-12 and 14496-30.
... Also in the MPEG discussion: from an MPEG perspective you can usually extract the codecs value from the bitstream. For TTML that isn't exactly the case, because
... you have to produce the short name from a profile designator in the document, so you need extra knowledge to convert the long name into the short name.
... So MPEG is considering adding a new box just to contain the MIME type to remove the need for the extra knowledge. You'd be able to put the MIME type of a TTML document in a TTML track.
... So MPEG fixed this (or it's in the pipeline to fix) that the MIME type can be in the MP4 file.

glenn: Is there a suggestion that there should be an additional piece of metadata that includes the short names directly?

Cyril: I thought that would be redundant with the long profiles designator.

glenn: It is if you happen to know the details of the registry, but otherwise not.
... Adding such an attribute would make it more necessary to define the syntax in TTML2.

Cyril: That would be fine, but we've solved it separately in MPEG too.

nigel: I'd be concerned about putting the short codes in the TTML because that would encourage folk to extract the value from the TTML and put it into the (proposed) MP4 MIME attribute.
... That could prevent distribution mechanisms that create MP4 files from old TTML files from generating an up to date external processorProfiles parameter.

glenn: I understand that concern. Early binding could be worse than late binding, for document wrapping.

group discusses the possibilities, concludes that we will not propose to put the short codes into TTML documents.

Additional observer: Tatsuya Hayashi - interested in time synchronisation with audio

MPEG liaison re track header box width and height

nigel: shows on screen the early draft liaison. Describes an issue with signalling track header width and height fields.

glenn: I think I have an action for when no extent is specified and there is a related media object.

Cyril: In MP4 we had the problem that in adaptive streaming there may be an MP4 file with no video, just the text track.

mike: In the external context there's always a related media object somewhere, either in the file or in the MPD manifest in DASH.

glenn: We don't limit the scope of how the related media object is provided.

Cyril: We added a bit that says that the width and height can be the aspect ratio instead of the actual width and height, and can be zero if you don't know.
... From an MPEG perspective a video can have encoded pixel dimensions, or output pixel dimensions. In the end there's a presentation size expressed in pixels.
... Each video track can have a different presentation size, related to each other using transformation matrices.

pal: On the same dimensions?

Cyril: Yes, on the presentation coordinate system.
... We had two problems. Firstly, a document may be authored independent of resolution, second the aspect ratio may not be known, and be important.
... The two corrigenda deal with this.
... The -12 spec was too prescriptive about visual tracks - it turns out that it is track type dependent.
... There is a new definition of a reserved 0 0 size value. Concerns have been raised.
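For background on the fields under discussion: the ISO BMFF track header (tkhd) carries width and height as 16.16 fixed-point integers, which is why an aspect ratio (e.g. 16.0 x 9.0) can be carried even when pixel dimensions are unknown, with 0 0 reserved for "unknown" per the corrigendum. A minimal encoding sketch:

```python
def to_fixed_16_16(value: float) -> int:
    """Encode a dimension as the 16.16 fixed-point integer used by
    the ISO BMFF track header (tkhd) width/height fields."""
    return round(value * 65536) & 0xFFFFFFFF

def from_fixed_16_16(raw: int) -> float:
    """Decode a tkhd 16.16 fixed-point field back to a float."""
    return raw / 65536
```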

mike: Introduces the liaison and supporting documents. They will be available to the group in the next few weeks.

Cyril: This is equally applicable to WebVTT, SVG tracks, HTML tracks, any vector-graphics-based tracks [as well as TTML]
... Takes us through the proposed changes to 14496-30.

group discusses track selection behaviour based on aspect ratio.

WebVTT publication

nigel: We have a proposal still to publish WebVTT as a FPWD. The edits requested to the staged version were made.


group discusses the use of the term Living Standard and how forks are managed, bringing in CG changes into the WG document, and FSA requirements for doing so,
and notes that it's the editors' responsibility to maintain any differences between CG and WG versions, and update the WG version each time a new version is to be published by the WG.

Cyril: Anyone wanting to work with WebVTT will always look at the editor's draft.

glenn: I don't mind the process, but I'm not happy with the term Living Standard.
... It looks like it will set a precedent.

mike: Use of the term Living Standard needs to be something that the W3C expresses a view on.

Cyril: What if Living Standard were changed to Draft Community Group Report?

glenn: I could live with that. Editor's Draft may be even better, to make it clear that this is within the normal WG process for a Rec track document.
... Objects to the use of the term Living Standard. Acceptable proposals to resolve this are "Editor's Draft" or "Draft Community Group Report".

pal: How will reviewers of future WG versions of the document be able to trace back why any particular change was made? (noting that this is only possible on the CG version now)

<scribe> ACTION: nigel Make request to Philip and Silvia to change Living Standard to Editor's Draft. [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action03]

<trackbot> Created ACTION-345 - Make request to philip and silvia to change living standard to editor's draft. [on Nigel Megitt - due 2014-11-03].

nigel: Going from WD to CR for WebVTT?
... Dave planned to send notes out essentially to the same set of recipients as we sent to for IMSC 1, as well as socializing at TPAC.

TTML <--> WebVTT

nigel: Current status is that we have a git repo with some but not all of the tests we worked on in Geneva.

courtney: And I don't have a draft document ready.
... I've prioritised working on the document ahead of the code.

pal: What's the timescale for this? It would be a great addition to the IMSC 1 test suite, if that software could be part of it.

courtney: My goal is to have a version of the document ready by the beginning of December, as a fairly complete 1st draft for comments.
... I don't know exactly when I will have the software ready.

nigel: We don't seem to have all the tests from Geneva - is there a reason why they can't all be submitted?

group: no reason

nigel: Okay, the action to submit them to me remains.
... It would make sense to transfer the git repo to courtney at some point - no pressure on time.

courtney: Okay, I can do that.

nigel: We have no more substantive points to discuss on this so let's adjourn for lunch.
... We'll reconvene at 1345

IMSC 1 WD Review comments

pal: LC-2968 https://www.w3.org/2006/02/lc-comments-tracker/34314/WD-ttml-imsc1-20140930/2968?cid=2968
... This is a version of implied inline regions, which isn't supported in TTML1. There's an alternative solution available, so I propose to do nothing.
... Describes notes and resolution.
... LC-2971.
... LC-2969.
... LC-2970.
... LC-2967.

Plan for advancing IMSC 1 to CR

nigel: We have to think about what exit criteria to write into the CR; if we have enough evidence of wide review; what the license expectations are for test material.

pal: What are the licensing expectations for test material?

plh: The goal here is, if the group is using a test suite to demonstrate suitability for advancement, the Director will want to make that test suite available to all.
... For that what we've done so far is to provide tests under dual license: 1. If you're going to use those tests for conformance claims, you may not modify them.
... 2. For any other purpose we don't really care.
... The second one is the BSD license.
... What's the issue?

mike: The question is what W3C is asking for - which you just answered.

plh: W3C needs the rights to modify the files and make them available to the public under the W3C license.

group looks at the DECE license requirements

plh: It doesn't permit modification.

mike: I'll explore this with DECE to check that they can provide test documents that will be usable by W3C for this purpose.
... It would be useful if nigel or plh could respond to DECE's email asking if the example files can be issued under the standard W3C license (with the text of that license).

<plh> http://www.w3.org/2002/09/wbs/1/testgrants2-200409/

plh: The 'this is closed since 01 October 2013' is a bug - ignore it.
... The text you're looking for is in section 2 on this page.

nigel and philippe send DECE a note via Mike.

nigel: Next up - do we have enough evidence of wide review?

pal: The member submission itself was based on the work of 80 members, from both the CE and the content publishing space.
... This is a specification that was used in the implementation of playback devices. So it already received a significant amount of scrutiny.
... Since being brought to W3C it has received comments from SMPTE, EBU and DECE.

plh: On Sep 24 SMPTE said they have no outstanding comments on IMSC 1.

pal: Earlier today DVB stated that they have reviewed and have no comments.
... Plus we can point to the members of this group and their representation, and that it was reviewed by the a11y group of the HTML group.
... And we requested comments from a significant number of groups.

plh: Another group that would be important is the PFWG.
... At the end of the day it's a profile not a whole new spec.

glenn: I think this has wider review than TTML1 did, at least bringing in DVB, EBU and DECE.

plh: My feeling is that you've done enough.

nigel: Great, then the next point is CR exit criteria.

nigel takes group through slide pack on CR exit criteria, triggering much debate.

<scribe> ACTION: nigel Scribe notes on CR exit criteria for IMSC 1 based on meeting debate [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action04]

<trackbot> Created ACTION-346 - Scribe notes on cr exit criteria for imsc 1 based on meeting debate [on Nigel Megitt - due 2014-11-03].

<plh> http://www.w3.org/TR/2014/CR-html5-20140731/

WebVTT FPWD publication

plh: Calling it Living Standard will cause a problem. "Draft Community Group Report" would be acceptable.

RESOLUTION: We will publish the staged version of WebVTT as a FPWD when the edits to change the occurrences of "Living Standard" to "Draft Community Group Report" have been applied.

MIME type for TTML2

Cyril: summarises morning resolutions

plh: My understanding is that IANA needs a complete new registration to overwrite the previous one.

glenn: It seems confusing to have two different registrations for the same MIME type.

Cyril: We can link back from the TTML2 section on media registration to the TTML1 definition.

nigel: Mike volunteered to do the re-registration - is that okay for it not to be a member of W3C staff?

plh: Yes, it's fine for me to be out of the loop and not a bottleneck - I just have to tell the IESG that it's okay

IMSC 1 WD Review Comments

pal: the other comments are from Andreas Tai

nigel: Let's go through them. Actually it's too hard to put them all in the issue tracker - we did 5.2 but let's just talk through the others.

pal: 3. Conformance - subtitle document not being defined. I can just take that on - it should probably say 'document instance' like TTML does.
... 4.1 - I'm happy to add the proposal.
... 5.7.3 - I agree it would be useful to show all the combinations of the parameter and the attribute for forcedDisplay.
... 5.7.4 - I should be able to implement the proposal and the typo.
... 5.10 - this is a good suggestion a priori
... 6.3 - This is a good point - I should link to 9.3.1 in TTML1
... 8.2 - Andreas is suggesting that we use the terms presentation processor and transformation processor now that we have them.
... I have to think hard about this. I think the intent of the text is the same as the note on overflow, where we talk about authoring of the documents.
... Section 8.2 doesn't necessarily say that the presentation processor shall lay out text in this way. I'm pretty sure it refers to how it's authored.

glenn: If that were the intent I would have written it differently, e.g. "for the purpose of avoiding overflow, the author shall or should..." etc

nigel: Since this is in section 8 does it only apply to layout for the purpose of calculating the HRM values?

pal: 8.1 is Performance, 8.2 is Reference Fonts. Since the HRM section is referenced as 'documents shall conform to these constraints' I think it's about authoring not presentation.
... If this is the case then there isn't a strict constraint on processors at all. I'll study this and come up with a proposal for us to consider in addressing Andreas's comments.
... Annex B Forced Content - we're going to put some code snippets in there.
... 5.10 #length-cell - I have to think about this. What's the reason for needing cell metrics in documents for linePadding?

nigel: It makes it easier to make the padding distance a fraction of the font size, which is a typical use case.
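For example, in EBU-TT-D-style markup (values illustrative), linePadding is expressed in cell units, so the computed padding scales with the cell grid declared by ttp:cellResolution rather than being an absolute length:

```xml
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    xmlns:ebutts="urn:ebu:tt:style"
    ttp:cellResolution="32 15">
  <head>
    <styling>
      <!-- 0.5c = half a cell of padding at each end of every line -->
      <style xml:id="s1" ebutts:linePadding="0.5c"/>
    </styling>
  </head>
</tt>
```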

pal: 5.2 The current text was direct input from EBU so we shouldn't modify it lightly. Perhaps if EBU comes back and says something different this would be easier to change.

nigel: We've completed our agenda for today, so adjourning. Thanks all. We restart tomorrow at 0830.

trackbot, start meeting

<trackbot> Meeting: Timed Text Working Group Teleconference

<trackbot> Date: 28 October 2014

<scribe> chair: nigel

<scribe> scribeNick: nigel

<inserted> Day 2 - Tuesday 28th October


Observers: Noria Sakamoto - Toshiba, interested in broadcasting in TTML

Jerome Cho, LG Electronics - wants to meet FCC regulations for accessibility with TTML, WebVTT etc.

Francois Daoust, W3C, just observing.

Courtney Kennedy: Engineering Manager at Apple, responsible for subtitles.

Cyril Concolato, University in Paris, GPAC/MP4Box etc

<pal> Pierre Lemieux, supported by MovieLabs / IMSC1 editor

Debbie Dahl, Chair of Multimodal Interaction Group, observing. Interested in synergies between timed text and EMMA standard

Kazuhiro Hoya, Fuji TV interested in UHDTV, which will adopt TTML for closed caption

Nigel Megitt, BBC, Chair of TTWG

Glenn Adams, Skynav. Editor of TTML.

WebVTT draft updated


glenn: Looks good to me.

Cyril: LGTM. When can it be published?

nigel: Tuesday next week at the earliest, depending on staff

Cyril: How will we publicise it?

nigel: I expect Dave Singer to publicise it to the charter dependency groups, W3C and external, and to the W3C liaisons.
... Dave has also suggested that we publicise it socially at TPAC too.

Spec restyling work (fantasai)

fantasai: The spec templates and styling over time have become outdated - we can use some cool web technologies to make specs more readable
... The scope of the design is not a back-end web app, just HTML/CSS. We want a design for desktop, mobile and print, in that order. It will change the markup of the headings
... and the boilerplate in the Status text that is data, not paragraph text. We'll push legalese to the bottom. The abstract should be 2-3 sentences above the fold, then
... the TOC available without having to scroll it. Then URLs, issues, feedback etc should be at the top in a more compact format. So the scope of it is to
... redo markup, boilerplate, styling. We'll look at styling, clean up the stylesheet to be more readable, make sure that the document is still quick-scannable
... A lot of styling that is ad hoc, like fragments, example code, could be harmonised across all the W3C specs. This will take a while - it's a side project for me. We want to take into
... account what the WGs need.
... The functional questions are [the ones on the agenda]. We also want general feedback on what to consider, e.g. protocol-relative links so we can switch to https.
... or always-visible TOC.
... The first question is a subjective/emotional one - what should the style express, in terms of values.
... If we used primary colours and comic sans it would feel like a toy not a spec. But if we used a parchment background and old style font it would look old-fashioned.
... They're not appropriate for W3C - we want to know what is appropriate, though, in terms of how they feel.

glenn: Do you have any templates or ideas?

fantasai: We have a proof of concept but we don't know where it's going just yet. It's going to be experimental - design by consensus. We're asking for ideas from the community.

glenn: Part of the problem is that you have different audiences for different specs.

fantasai: We haven't had feedback on different styles for different document types/audiences. We want them to fit together and feel like they belong together.

glenn: One of the problems is that we have a lot of history.

fantasai: We want to make sure that every group has a working toolset - tell us what you use in response.

Cyril: Some specs have a developer view and an author view.

fantasai: Put that under question 6 'what else we should know/consider'

nigel: Can we answer the questions?
... Q1. 3-5 adjectives

Cyril: I have no idea.

ddahl: Authoritative

glenn: consistent

ddahl: comprehensive

glenn: One problem is that documents don't have the same styling. A lot of it is editor-specific.
... The variation may cause some problems. I've also worked with ISO and ITU which crank out format-consistent specs that are somewhat impenetrable.

nigel: open, welcoming?

glenn: It would be nice to use newer styling mechanisms. You can't push the envelope too far without hitting browser variations.

courtney: clean, modern.
... Additional considerations: Should be something that will work for low vision people, using magnifiers, voice-over etc.

nigel: URLs?

Cyril: TTML1, WebVTT, IMSC 1

nigel: +1
... Do we have documentation of our markup conventions?

glenn: Not really - we use XMLSpec as a technology (from 1999). Others use respec (WebVTT and IMSC 1).
... It's very unlikely that we will adopt respec for TTML.
... In TTML1 we have a conventions section in the document. XMLSpec and Respec are separately documented for markup.

nigel: Spec processing tools? We've looked at those already. Are there any more?

group: no more

nigel: Do we have any goals?

Cyril: To have the table of contents kept visible when scrolling, for easy navigation.

courtney: +1

Cyril: In some PDF viewers and Word, I like that searched-for words are listed as occurrences on a separate panel.

courtney: Better search would be good.
... When I find a page, it's hard to see the structure of the whole thing and relations with other specs.

nigel: I hate it when clicking on references takes you to the Reference section not the thing being referenced.

Cyril: +1 What's the point in it?

nigel: What about making defined terms links to other places where they're used?

Cyril: +1
... A way to list normative (testable) statements in the spec, to generate test suites automatically would be great. CSS has that I think.

tidoust: It was the packaging spec format.

glenn: That was done by adding markup to every paragraph. Extracting assertions from a spec is a hard to automate, complex process. Maintaining it becomes quite challenging.
... Plus it's not a science. Declarative statements (X is Y) can be read as normative in some places, so that (X is required) implies (Y is required). It's a nice idea, but hard to do.
... People offer paid services to do that, because it's complex.

nigel: Is there anything else we should know/consider?
... Cyril mentioned Developer view/Author view earlier.

<tidoust> [FWIW, I was referring to the fact that the Packaged Web Apps (Widgets) spec was written in a way that allowed the extraction of test assertions, see: http://www.w3.org/TR/widgets/ and the test suite: http://dev.w3.org/2006/waf/widgets/test-suite/#user-agent ]

Cyril: I think there's an HTML 5 spec (might be the WhatWG spec) that does differing views.

glenn: There's a lot of advantage to marking up specs to allow automatic extraction. For example, IDL fragments have conventions that allow some tools to automatically pull out all the IDL
... to generate a test generation process. We don't have APIs in TTML at the moment. If CSS had followed a similar convention for how properties were defined, and HTML had
... followed conventions for how elements and attributes were defined, then a similar tool could have been used. They didn't adopt a convention though, so it's a manual task.
... Those are the kinds of tools that it might be useful to consider. We could mark up elements and attributes. In the original markup I used a few syntactic conventions to assist.
... For example ID attributes. I use some specific conventions for how identifiers are presented.
... I have never documented it anywhere, though.

TTML2 Timing relationship with related media objects.

glenn: I can walk us through this. The issue originally came up from an example TTML document with some negative time expressions.
... I immediately pointed out that you can't do that! I did wonder why they were putting negative time expressions in a TTML document.

courtney: Caption authors may use different timecode from the video editors.

glenn: Based on that I thought it would be useful to have an offset from authored time expressions to some useful point in the media, and allow the player or processor
... to use that offset to achieve synchronisation rather than mandating precise synchronisation between TTML times and the media times.
... As I explored that some issues came up. One was the difference between the Origin of the document timeline and the Begin of the document timeline, and whether they
... are different times or the same time. I looked at SMIL and SVG time semantics to try to ascertain what was used there. I also reviewed the earlier TTML1 work.
... We have a concept in TTML that has its own terminology definition in TTML2, Root temporal extent.
... When this talks about beginning or ending, does it mean the beginning of the coordinate space or the first timestamp in the document?
... Say a TTML document has a body with begin="10:00:00". Is the origin "10:00:00" or 0s in the document? I eventually, tentatively, concluded that the origin of a document time coordinate
... space is always zero and the beginning of the document is always zero. That doesn't mean it's the timestamp of the first timed element in the document.
... Everything in SVG and SMIL is predicated on the default begin being zero, in the coordinate space of the document. Recently I came round to understanding that the document
... time origin and the document begin point in time are the same. Then if I want to synchronise a document with some media, then what point am I synchronising? The origin of the
... media timeline or the begin. Let's say for example, I have a related media object that starts at 5 hours into the media timeline. The first timestamp in the media is 5:00:00 (5 hours).
... What do I want to synchronise the 10 hour time in the document with, in the media? There are 2 options. One is to have zero in the document time coordinate space correspond to
... zero in the media time coordinate space. Another is to say that there's an offset between the document time coordinate space and the media time coordinate space, and that offset
... is between the two origins of the coordinate spaces. A 3rd option is to pin the origin of the document coordinate space (zero) to the begin of the media time coordinate space (5 hours).
... That latter one doesn't seem to be quite so correct.
... [draws a picture]

nigel: Is this predicated on timebase="media"?

glenn: Let's assume that. The general answer may extend to continuous SMPTE timebase too.
... I have two entities, a video and a document. Each has a timeline - the video content has a timeline and the document body has a timeline.
... The root temporal extent of the document is the timed beginning of the document to the timed ending of the document. The choices seem to be the origin or the start of the
... first timed element.
... I think the logical begin is always zero, if begin is not specified.
... So this is the document time reference synchronisation point. What do we want to tie it to - the beginning of the media or the origin of the video time coordinate space?
... My thinking has evolved on this. Originally I thought that Begin(body) would be synchronised with Begin(media). Then the offset would be between those two points.
... The more I thought about it the less viable it seemed to be. Eventually I came to the conclusion that we should synchronise Origin(document) and Origin(media).
... Then if they happen to correspond, and both the video and the body say 10h in their own coordinate spaces then they would line up.
... i.e. they would be isomorphic time spaces with zero offset.
... One of the interesting example issues is: what if the playback rates differ between the media and the document? What happens to dilations or contractions in the timelines?
... It seems like if they're both synchronised with the zero point then any modification of the playback speed, as long as they're coordinated, would work out pretty well.
... It means that you can simply multiply the coordinates with the playback rate.
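
<nigel> [A sketch of the model Glenn describes above - Origin(document) synchronised with Origin(media), any difference expressed as an offset. The function and parameter names are illustrative only, not from TTML:]

```python
def doc_to_media_time(doc_time, origin_offset=0.0):
    """Map a document time coordinate onto the media timeline.

    Assumes the model described above: Origin(document) is synchronised
    with Origin(media), and origin_offset is the (possibly zero)
    difference between the two origins. With a zero offset the two
    coordinate spaces are isomorphic, so 10h in the document lines up
    with 10h in the media.
    """
    return doc_time + origin_offset
```

<nigel> [A coordinated playback-rate change then scales both coordinate spaces uniformly, per Glenn's point above.]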

<Cyril> scribeNick: Cyril

nigel: (describing an email sent earlier)
... I analyzed it a different way
... with all possible combinations of timebase and related media
... 3 different timebases in TTML: media; SMPTE; and clock
... what "relationship" means in the temporal extent definition
... if there is no related media, there is no relationship, that's easy
... the root temporal extent is from begin to end of document
... they can be unconstrained
... begin is origin and end is infinity
... if clock time is used, there might be a relationship with some media
... nothing here contradicts Glenn
... example: tape with every frame with timecode
... if you are using media times, the origin of the document is the begin of the media
... you expect the origin of the document to be equivalent to the begin of the media
... 5s in the document means 5s in the media
... the root temporal extent is constrained by begin media/end media

glenn: SMIL and SVG make the difference between the specified and active time interval
... the question: is root temporal extent meant to express the active interval or the extent of the time coordinate space of the document?

nigel: the next limitation is when you have media with SMPTE timecode and SMPTE timecode in the document
... the document times and media times are actually the same
... so no offset applies here
... for marker mode = continuous, this is equivalent to saying origin(document)=origin(media)
... however the rule as stated also works for marker mode = discontinuous
... i.e. when document times = media times
... next: media with SMPTE time codes and clock in the document
... the only interpretation is that the document times are supposed to be equivalent to clock times when you play the media
... ex: document time says 10:05, but starts playing at 10:03
... the use cases for this are strange

glenn: wall clock values are converted to times by subtracting the wall clock start time of the document (according to SMIL)

nigel: consistent with my interpretation?

glenn: yes
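
<nigel> [The SMIL rule just confirmed, as a tiny illustrative sketch - names are illustrative:]

```python
def wallclock_to_doc_time(wallclock, doc_start_wallclock):
    # A wall-clock value becomes a document time by subtracting the
    # wall-clock start time of the document, per the SMIL behaviour
    # Glenn describes above.
    return wallclock - doc_start_wallclock
```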

nigel: then there is a category of media with no SMPTE time codes, but with media time in the document
... same as glenn, the origin of the document is equivalent to the origin of the media
... again the framing that glenn talked about applies
... the active time cannot go outside of the playback

glenn: the media active interval is as if it was a parent of the document active interval

nigel: if audio is continuing but video is not, the viewer is continuing, you should be presenting subtitles
... this is an implementation case
... any offset that needs to be applied will be externally
... I don't want to duplicate what is already in MP4 files for instance

glenn: the timeoffset I came up with makes it easier
... if the house that made the media did not have the media in hand, they can provide the offset

nigel: is that a hypothetical case?

courtney: no
... different houses will have different conventions
... when someone gives content to iTunes, they give a video
... later on they'll get European or Asian conventions

nigel: i don't understand the convention

courtney: sometimes people don't want to use zero

nigel: in SMPTE time code yes

<nigel> scribeNick: nigel

Cyril: I understand Courtney's use case. The TTML document doesn't reference the media itself. So it will be used in some external context with the media.
... For example an MP4 file, MSE, DASH. All of those have timestamp offset facilities, so I'm puzzled why we're talking about this here.

glenn: Those are all different systems with different ways to express the offsets. If it only can be carried outside the document it might get lost.
... It's useful to have it in the document as a reference point to express the intent of the author. We often need to export things from in the document to outside the document.

Cyril: So you want to export from the document some time reference?

glenn: yes

Cyril: That is fine.

glenn: Courtney isn't the only person to bring this up - I've had other reasons to add this over the years.

courtney: complex production workflows do mean that we sometimes need to do this.

Cyril: These examples seem to be overly complicated.

glenn: I think nigel wanted to cover all the cases, which is a useful exercise.
... Neither of us defined what BEGIN(document) and ORIGIN(document) actually meant, which is a problem talking about this!

glenn: Is it the time of the first thing in the time frame or the origin of the framing time?

nigel: the next row is where the media doesn't have timecode but the document has smpte timecode, which may start at some arbitrary point according to convention.
... In that case I can see that an offset would be useful, to say "the start of the media is at e.g. 10:00:00". I'm less comfortable doing this with media timebase, but it's quite closely related.

Cyril: The same problem will arise with WebVTT.
... The general problem is how to carry in-band the time value of the begin of the media in the document timeline.

nigel: Do houses really begin at 300s?

courtney: I only see this as a real world problem with TTML, not WebVTT.

glenn: I've seen examples with media timeBase. For example, taking into account a pre-roll of 13s.

courtney: This could also apply to WebVTT at some point in the future.

glenn: I added a few notes. I need 2 questions answered.
... My hypothesis is that the label BEGIN(document) means the origin of the document, i.e. zero on the document time coordinate space.
... I believe that's most accurate in relation to SMIL and SVG.
... This is not the time of the body.
... Now the question is what we call the Root Temporal Interval - does it also start at the origin, or at the body. We may need to distinguish the active root temporal interval
... from the overall unqualified root temporal interval.
... I want to see if the group can agree that hypothesis.
... Then I need a decision on which of the 3 models to use to describe the timing relationships.
... 1) The two origins sync up.

courtney: I don't see how that solves it.

Cyril: That's the only one that works!

nigel: +1

glenn: In that case the offset is the difference between the origins.
... 2) The origin of the document syncs up with the beginning of the media.
... This one seems more natural to me because 10 hours means 10 hours into the video.

nigel: Not if the timebase is smpte! You have to enumerate all the options.

glenn: 3) Begin(body) is begin(media). I don't think this one works too well.
... When you use media times instead of timestamps then you mainly want 2). But with SMPTE timecodes in the media it seems like 1) may be more applicable.

nigel: I think that's right.

Cyril: There's another way to do this - what happens if you have an audio track with some offset too?
... In the MP4 and DASH case, and all the others I know, you only care about the media itself. The TTML document has an anchor point, e.g. 10 hours if that's the begin of the media.
... Then you use that to anchor it onto the timeline. In MP4 if the video has a big gap at the beginning, you use an offset to say when the beginning should occur. Same with the audio.
... The TTML document should just give its anchor point as the time value in its coordinate space that corresponds to the beginning of itself.

nigel: +1 that's the proposal I made too.

glenn: So you're saying a media begin point as opposed to an offset in the document timeline?

nigel: yes.

glenn: So if zero in the document is zero in the media the media offset is 0
... And for SMPTE timecode with the 10:00:00 convention the value would be 10:00:00.
... I like that suggestion because it seems to work regardless of the timeBase. Have you worked through any play rate differences?

nigel: I'm not confident that I've worked through all the playrate consequences.
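
<nigel> [The anchor-point proposal just discussed reduces to a one-line mapping; the names here are illustrative, not from any spec:]

```python
def to_media_relative_time(doc_time, media_begin_anchor=0.0):
    # The document declares, in its own time coordinate space, the value
    # that corresponds to the beginning of the related media: 0 when
    # document zero is media zero, or e.g. 10 hours under the SMPTE
    # "10-hour" convention mentioned above. The wrapping context
    # (MP4, DASH, MSE, ...) then places the result on its own timeline.
    return doc_time - media_begin_anchor
```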

Cyril: The solution seems to be found - it needs to be liaised back to MPEG because it affects the carriage of TTML in MP4. You'd have to store it in the MP4 file.

glenn: Couldn't you just look in the TTML document?

Cyril: Let's say your document has an offset of 10 hours - will the first sample say 10 hours or zero - is an edit list required?

glenn: In SMIL you can have captions that start before the media and end after, but get effectively truncated. Why would it affect the carriage in MP4? You can still look inside the document.

Cyril: When you stream/seek/segment the document you don't want to look inside it.

glenn: I think I have enough guidance on this to move forward on resolving it.

nigel: Let's take a break - back at 11.

IMSC 1 CR Exit criteria


<trackbot> action-345 -- Nigel Megitt to Make request to philip and silvia to change living standard to editor's draft. -- due 2014-11-03 -- PENDINGREVIEW

<trackbot> http://www.w3.org/AudioVideo/TT/tracker/actions/345

close action-345

<trackbot> Closed action-345.

nigel: sorry that was the wrong action but it was done!


<trackbot> action-346 -- Nigel Megitt to Scribe notes on cr exit criteria for imsc 1 based on meeting debate -- due 2014-11-03 -- PENDINGREVIEW

<trackbot> http://www.w3.org/AudioVideo/TT/tracker/actions/346

close action-346

<trackbot> Closed action-346.

nigel: Goes through scribed notes - group makes edits
... Conclusion is:

Our criteria for exiting CR will be:

Provide an implementation report describing at least 2 independent implementations for every feature of IMSC 1 not already present in TTML1, based on implementer-provided test results for tests and sample content provided by this group.

We will not require that implementations are publicly available but encourage them to be so.

We will not exit CR before January 16th 2016 at the earliest.

pal: That's enough for me to edit the SOTD in the CR draft - I'll need to get respec.js to allow this custom paragraph.
... The next CR draft may include this text in a weird style just to get around respec.js.

nigel adjourns meeting for lunch - restart at 1300


group reconvenes

Change Proposals

Reviewing change proposals at https://www.w3.org/wiki/TTML/ChangeProposalIndex

Change Proposal 15


glenn: margin - this would be very easy to add since there's a straight mapping to CSS. I haven't had enough feedback that it's needed.

nigel: There's nothing from EBU

courtney: margin isn't permitted in WebVTT either.

glenn: The described use case, for indenting 2nd and subsequent lines, wouldn't be supported by margin anyway. It's really a hanging indent.
... We don't have any indent support, hanging or otherwise.

nigel: Is there a related issue for margin?

glenn: I don't think so.
... I'll edit this on the fly now.
... marked as WONTFIX.
... box-decoration-break. This got moved in CSS to 'fragmentation'

<glenn> http://dev.w3.org/csswg/css-break/

<glenn> http://www.w3.org/TR/css3-break/

glenn: but the short name is still css-break! It was last published as a WD in TR on January 16.

<glenn> http://www.w3.org/TR/2012/WD-css3-break-20120823/

glenn: Mozilla seems to have an implementation of this that's working.

nigel: Even in today's draft the property and value combination still exist.

glenn: We have two options in TTML2 syntax: either use box-decoration-break directly or go ahead and use something simpler like linePadding and map it to this CSS property.
... The latter disconnects it as a feature from this particular instantiation.

nigel: That's the normal way we do it, but I can see that with padding specified on content elements it would be duplication to add it a second time through linePadding.

glenn: In that case I proposed adding support for box-decoration-break in addition to padding on inline content elements that can now be specified.

PROPOSAL: support the EBU line padding proposal with the combination of padding on inline content elements and box-decoration-break.

RESOLUTION: We will support the EBU line padding proposal with the combination of padding on inline content elements and box-decoration-break.

issue-286: (TTWG F2F today) We will support the EBU line padding proposal with the combination of padding on inline content elements and box-decoration-break.

<trackbot> Notes added to issue-286 Extend the background area behind rendered text to improve readability.

glenn: border - we've added border and made it applicable to both region and certain content elements - body, div, p and span.
... One of the open questions is that border in CSS is a shorthand for specifying the width, style and colour of all the borders simultaneously, not each border separately.
... As well as this super-shorthand border property, there is the border-width, border-style and border-color properties, which allow those values to be specified on all or any from 1-4 borders separately.
... Then finally there are 12 long hand versions for each of these plus -top -right -bottom and -left.
... I've implemented the shorthand, but we could go for the more longhand versions.

nigel: We should check what's needed to match 708 window styles.

courtney: The FCC regulations don't go to the level of granularity of this.

glenn: I think this note came up when we were doing SDP-US - there has been a request in the past to describe which 708 features are supported in TTML.

courtney: That's not the same as a requirement. I don't know of any examples of subtitles with borders on them.
... 708 says borders may be raised, depressed, uniform or drop-shadow.
... I don't see anything about styling the different sides separately.

glenn: It's not clear what the mapping is for all those values. box-shadow in CSS may apply where drop-shadow is required in 708.
... They also introduced border-radius.

nigel: Let's move on from this - I think we've done enough.

glenn: line stacking strategy. I don't think we need to do anything on this right now - I put this in originally, so I'll mark it as under review by me.
... region anchor points - this was a proposal from Sean to have an auto keyword for the origin and extent properties on regions.
... I believe there's something like this in WebVTT.
... TTML doesn't have these at the moment. Sean was the champion and we don't have any other champion or requirement for this right now.
... I would say we should not take any action on this right now.

nigel: I agree - it's unclear even how the proposal maps to the WebVTT way of positioning and sizing regions.

glenn: text outline vs text shadow. When we defined textOutline in TTML1 CSS was also working on an outline property.
... the new CSS definition of drop shadow allows you to specify multiple shadows simultaneously.

courtney: the FCC regulation requires text edge attributes: normal, raised, depressed, uniform and drop-shadow.

glenn: TTML1's textOutline offers thickness and blur radius. You'd have to have multiple simultaneous shadows to achieve raised and depressed styles.

courtney: Authoring that would be complex.

glenn: XSL-FO defined a text shadow property even though CSS had not done so. We ended up calling it textOutline and we also limited it to just one dimension, not two.
... It's now officially defined in the CSS 3 text decoration module, called text-shadow. It takes 2 or 3 length specifications.

<glenn> http://dev.w3.org/csswg/css-text-decor-3/#text-shadow-property

glenn: What we could do is define some new keywords that the processor can map. That makes it easier for the author to choose amongst the different choices including raised and depressed.

courtney: That seems like a nicer way to do it.

nigel: +1

glenn: There are two questions: firstly, should we change the name from textOutline to textShadow? I would say no. We can just define the mapping semantics, and already have different naming anyway.

Proposal: retain the attribute name textOutline.

glenn: Proposal: add two new keywords for raised and depressed to meet FCC requirements and define mappings.
... There's a third proposal to add a 3rd optional length specification. This would allow separate definition of offset in x and y as well as blur.
... I see that textOutline doesn't offer a shadow, but a uniform outline that expands by the required length around the glyph.
... I need to think about this some more.
... Now I recall why we thought about adding a new attribute called textShadow, to allow this. I don't want to take away textOutline and remove backwards compatibility.
... either we enlarge the definition of textOutline to make it include shadow, or add a new textShadow property. I need to review it and see if I can come up with a proposal that works.

nigel: We're getting behind on the agenda. We'll come back to this later.

Change Proposal 25 - Distribution


<scribe> scribeNick: courtney

nigel: tab to autocomplete is great!

nigel: topic is combining groups of documents

cyril: in the tool mp4box, if you import ttml files to mp4 and concatenate more than one ttml file, then extracting the track should give you a combined ttml document.

glenn: xml:id uniqueness- a similar problem exists in ISD creation as described in the document combining proposal.
... btw, do you have an example of a specification for a merge algorithm? an xml syntax?

nigel: rules are laid out in presentation

glenn: so you wouldn't have some way for documents to specify a set of rules that it can follow?

nigel: no, there would be an external set of rules

glenn: would there be any content support required- additional metadata, etc?

nigel: no

glenn: you could exclude documents that contain elements with mixed content

nigel: perhaps, but that might be difficult because you could not use break spans within a sample.

<courtney_> nigel: normalize whitespace for comparison of samples.

<courtney_> nigel: to compare elements, need to transform their times into a common timeline.

<courtney_> glenn: you could translate to the isd space first prior to comparison.

<courtney_> nigel: not sure what is possible with that approach.

<courtney_> glenn: this is a transformation process, could be a separate spec from TTML.
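
<courtney_> [A minimal sketch of the merge rules scribed above - shift each document's cue times onto a common timeline, then drop duplicates after whitespace normalisation. The data shapes here are assumptions for illustration, not the actual proposal:]

```python
def combine_documents(segments):
    """segments: list of (offset, cues); cues are (begin, end, text) tuples."""
    seen, combined = set(), []
    for offset, cues in segments:
        for begin, end, text in cues:
            # Shift the cue onto the common timeline, and normalise
            # whitespace so textually-equivalent samples compare equal.
            key = (begin + offset, end + offset, " ".join(text.split()))
            if key not in seen:
                seen.add(key)
                combined.append(key)
    return sorted(combined)
```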

<glenn> ACTION: glenn to check if timeContainer explicitly has no semantics with timeBase smpte, markerMode discontinuous [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action05]

<trackbot> Created ACTION-347 - Check if timecontainer explicitly has no semantics with timebase smpte, markermode discontinuous [on Glenn Adams - due 2014-11-04].

<courtney_> glenn: does the ttp:documentGroup type <xsd:NCName> proposal match the syntax of Id in XML?

<courtney_> glenn: yes it does match

<courtney_> what's the motivating use case for this?

<courtney_> nigel: to archive live created subtitles documents and to be able to create distributable time constrained segmented documents for streaming

<courtney_> Cyril: I'm not convinced there is a need for standardization here yet.

<courtney_> pal: is there a need for a standard when dealing with a private archive where the owner controls what goes in and what comes out?

<courtney_> courtney_: ttml requires lots of small files for captioning live events, and this seems like a limitation to me

<courtney_> pal: no it doesn't have to be, streaming inherently involves lots of files

ISD formalisation

<nigel> scribeNick: nigel

glenn: shows a terminal window! Invokes some code (ttx.jar) with a command line specifying the external-duration and an input ttml file
... Looks at TTML input document, that would present as 0-1s: Foo, 1-2s: Bar, 2s-[unbounded]: Baz
... Code validates the input and then writes out 3 Intermediate Synchronic Documents (ISDs).
... looks at output documents. isd elements in new isd namespace, with begin and end on the top level element.

group questions status of this work.

glenn: TTML1 defines ISDs but no serialisation of them, nor are all semantics fully defined. In TTML2 ED there's an annex that defines these, with a syntax for ISD.
... This is a proposal with the option for change.
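
<nigel> [The time-flattening step being demonstrated can be sketched as computing synchronic intervals from the element begin/end times. This illustrates the idea only, not the TTML2 ISD algorithm:]

```python
def synchronic_intervals(timed_elements):
    # timed_elements: iterable of (begin, end) pairs; end may be
    # float("inf") for an unbounded interval, as in the Baz example.
    # Each adjacent pair of distinct boundary times delimits one ISD.
    boundaries = sorted({t for begin, end in timed_elements for t in (begin, end)})
    return list(zip(boundaries, boundaries[1:]))
```

<nigel> [With the demo content above (0-1s Foo, 1-2s Bar, 2s-unbounded Baz) this yields the three ISDs the tool wrote out.]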

pal: If we say any TTML document can be split into a number of ISDs, why isn't each ISD itself a TTML document? Why introduce a new structure?

glenn: Some good reasons. One: the constraints are different in an ISD document than in a TTML document. For example region elements have a body child.

nigel: You could create the ISDs as individual TTML documents prior to rehoming the body to each region and resolving the styles, as another option.

glenn: I explicitly wanted to put the ISD into a different namespace to reduce confusion. I realised when I started to formalise this that if I started with tt:tt and made it
... polymorphic then it would be much harder for people and parsers to understand.

pal: +1 for that
... More fundamentally, can the ISD format be mapped into a TTML document?

glenn: I don't know - that wasn't in my thought process.
... There are two reasons for doing this work. One is to create HTML versions - you have to convert into a time-flattened version of the original TTML and apply the styles, and resolve region references.
... I wanted to make sure that process was fully articulated, which is essential to move forward.
... The other strong reason is to make it easily mappable into the cue structure of HTML text track. My model for each of these ISDs is one cue.
... Microsoft in the past tried to put a TTML1 document into a cue. It wasn't standardised anywhere. I want to have a good story for generic mapping TTML into a sequence of cues
... that fit into the TextTrack model. So my motivation was that each ISD should be representable as a cue, and furthermore to be distributable as a sequence of ISDs.

pal: How can I turn this back into a TT document?

glenn: I don't know - I didn't want to do that.

pal: So you've effectively created a new format.

glenn: It was my intention to make this a new format that could be used for distribution.
... The other option would be to heavily profile TTML to allow it to be distributable. By the way, there are already more than one kinds of document that are specified by TTML.
... My proposal would be to use the same MIME type and a different document type within that.

pal: My initial feedback is this introduces a new format in a world that has too many already!

nigel: There are multiple steps here. The first is to formalise something that's only conceptual for describing an algorithm in TTML1; the second is to make it a serialisable format.
... If your end goal is a TextTrack cue why not go all the way?

pal: My feedback is that we should store these as TTML documents.

glenn: Not only has timing been flattened in this process but also styles. The only styles that are expressed here are those that are not the initial values.
... [shows ISD output] There's an attribute and element called "css" meaning computed style set. Coding wise, this has been an important step for validating our algorithm.
... Notice that it still uses the TTML namespace - it copies the body into the region element; there can be more than one region in the isd.

group expresses some reservations about defining a new format

glenn: There are some questions: 1. Is it important and useful to define a serialisation format for ISD?
... I think it's both. It would help in many ways and reduce the discussion about streamability.

Cyril: It's not TTML anymore, so it's not streaming TTML. It's streaming something else.

nigel: We have a wider environment in which organisations are creating and distributing TTML documents and writing players. There's no problem splitting temporally on the client side,
... so creating a new format where the temporal division happens server side doesn't seem to be necessary.

group adjourns for a break

back at 1600

Multimodal Interaction - Debbie Dahl

nigel: Introduces Debbie as an Observer who has some requirements for multimodal interaction and thought the solution space may involve TTML.

ddahl: Introduces EMMA 1.0


ddahl: Emma represents captured user input in different formats.
... Now considering capturing system output as well as user input. It's helpful to have inputs and outputs in the same format for processing, debugging and analytics.
... Could be static, defined ahead of time
... Generated dynamically by an intelligent system
... EMMA is an XML language. We're thinking about capturing output. [shows an example]
... This example happens to have an ssml message in it, the <speak> element. SSML has lots of available complexity, not used in this example.
... Then other multimedia things might go along with it, such as HTML and other kinds of multimedia output - SVG, whatever seems appropriate to the application.
... My original question was: if we want to speech synthesise some output or synchronise pre-recorded audio with some other kind of multimedia, e.g. video, an animation of
... planes going across a map etc.
... How could we take advantage of the work done in TTML to make life easier for us in multimodal interaction to synchronise multimedia outputs generated in real time by interactive systems.

pal: If what's generated is audiovisual, that's a possibility.
... maybe you want to provide captions.

courtney: there could be a series of responses with timings.

ddahl: You could say "There are flights to Boston from Denver..." and then ask a follow-up question. When you ask what time of day would you be interested in flying, at that point
... maybe you display a form.

courtney: If you had some animation that shows a map, and you know it will play for 3 seconds, and then in 3 seconds post your next question?

ddahl: Yes. Would it be as simple as incorporating TTML maybe wrapped around another element.

nigel: Thinking about the concepts in TTML, there's a timeline against which things could be synchronised, plus styled and positioned text. There is an issue raised for associating
... audio representations of text, but at present the spec describes visual rendering only.
... It could be that SMIL is a good place to go.

pal: It's certainly more flexible.
... Is this purely semantic or is there a playback requirement?

ddahl: In the vision, there's a system that renders the captured input into something human-understandable. In the end there would be playback.

pal: The more you're interested in playback the closer you are to TTML or WebVTT which are intended to be used for playback.

courtney: If you're synchronising other kinds of media it's a good choice.

ddahl: I guess you could use TTML and SMIL?

glenn: That's right - TTML is designed to be referenced by the <text> element in SMIL. In the abstract for TTML we say:
... "In addition to being used for interchange among legacy distribution content formats, TTML Content may be used directly as a distribution format, providing, for example, a standard content format to reference from a <track> element in an HTML5 document, or a <text> or <textstream> media element in a [SMIL 3.0] document. "

pal: If you'd like to display text or captions over audio or video you should use TTML.

nigel: If you want to display any text that changes over time then TTML is a good fit. Probably the time modes in TTML are rich enough to support any use case you're likely to have.

<Cyril> scribeNick: Cyril

glenn: you can refer to TTML1 today, because TTML1 is REC
... if you need features of TTML2, you'd have to wait

ddahl: we've done work on what we call "output timestamps", for when it is planned vs. when it happened
... when it actually happens is easier

glenn: TTML does not care about when it happens
... we say when we want it to happen
... we use presentation time stamp in the MPEG sense
... however we have one mode where we use the SMPTE time base
... using SMPTE Time Code along with the video
... you can think of them as labels
... when one of these labels appears in the video, that is when the matching TTML element is active
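
The "labels" idea can be mocked up roughly as below, under the assumption that the player surfaces SMPTE time codes as opaque strings; the cue texts and the callback shape are invented for this sketch.

```python
# Sketch of the "labels" idea: with timeBase="smpte" and
# markerMode="discontinuous", a cue becomes active when a matching SMPTE
# time-code label appears alongside the video, not at a computed media
# time. Cue texts and the hook are illustrative only.
cues = {
    "01:00:10:00": "First subtitle",
    "01:00:12:05": "Second subtitle",
}

def on_timecode(label):
    """Hypothetical hook a player would call for each SMPTE label."""
    return cues.get(label)

print(on_timecode("01:00:10:00"))  # First subtitle
print(on_timecode("01:00:11:00"))  # None
```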

(glenn explains the different timing modes: SMPTE, timestamps and clocks)

scribe: they derive from SMIL
... we use a subset of SMIL, not repeat for instance

ddahl: we might want to do some things that have nothing to do with text at all, like pictures and music

glenn: we plan to add support for images and possibly audio in TTML 2
... we will definitely not support video in TTML

nigel: the use case for audio is audio description
... generally created by the same company

glenn: we don't want to turn TTML into SMIL light

nigel: at the moment you have simple SSML
... but if you start having details in SSML
... this is closer to the processor than the human
... I wonder if we couldn't go in that direction in TTML adding emotion, pronunciation, ...
... like a format called PLS (Pronunciation Lexicon Specification)
... this wouldn't affect the TTML document structure at all
... that could guide audio synthesis
... There is also EmotionML that is interesting
... a big use of TTML is for caption and subtitles
... but they are just text, without expression
... EmotionML gives you some information
... but how do you present that emotion

glenn: like emoji

courtney: there are conventions also
... describing the way the text was spoken (not the emotion)
... that's an interesting idea to explore
... the most artful captions I have seen describe the way the text was pronounced

ddahl: emotionML has different vocabularies
... there is a standard vocabulary, but you can add your own

courtney: if you were too heavy-handed in the way you describe the emotion, it could be condescending to the hard of hearing
... you'd have to do it artfully

ddahl: some people may have a processor to process emotionML

nigel: currently the emotions are in the text, forcing everyone to view them
... perhaps capturing emotion and pronunciation would suffice to synthesize speech

ddahl: you would need prosody or other aspects

nigel: no one has brought this use case to TTML first

<nigel> http://www.w3.org/TR/emotionml/

ddahl: I had an example of annotating a video with EmotionML

glenn: TTML allows you to mix any content if it is in a different namespace

Cyril: Example 2 of annotation of videos in the EmotionML spec seems to have problems (use of ? instead of #, use of "file:" instead of "file:///")

(ddahl shows a demo)

nigel: you can either add external content to TTML or extend TTML

Cyril: you might want to consider using a separate track
... not merging it in the TTML document but using a separate track in the HTML sense

nigel: there does not seem to be any action on this for us at the moment

ddahl: I came looking for information and I'll bring that back to my group

<nigel> scribeNick: nigel

Change Proposal 14 Audio Rendering


nigel: I was going to propose as per CP14 that we consider adding PLS and EmotionML into TTML but it seems that we do not need to: foreign namespace content can already
... be added with no spec changes.


<trackbot> issue-10 -- Allowing pointers to pre-rendered audio forms of elements -- open

<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/10

nigel: Issue 10 proposes adding a pointer to an external audio file, which is the analogue to a pre-rendered graphic image.
... CP14 is a Priority 3 on our list, so I don't think we should spend any more time on it right now.
... Instead, we should go through the Priority 1 CPs and resolve any outstanding questions so that we can complete the TTML2 deliverable.

glenn: Let's return to CP15. We were up to shrink fit
... We don't have a champion for shrink fit and no issue, so I propose to do nothing.
... font face rule - we do have an issue for that. I'm not sure if we need the fontFaceFormat attribute.
... This implies that there's a fallback loading system that would pick the source that it knows how to process.
... That would introduce something new in TTML2, which is the ability to refer to resources outside the document.
... There's a way to get around it, which is to use data URL, i.e. embed the data as BASE64 encoded characters in the document.
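
The data URL workaround can be sketched as follows; the function name and the default MIME type are illustrative choices for the example, not anything defined by TTML.

```python
import base64

def font_data_url(font_bytes, mime="application/font-woff"):
    """Embed font data as a BASE64 data: URL, the workaround glenn
    mentions for avoiding external resource references. The function
    name and default MIME type are illustrative."""
    b64 = base64.b64encode(font_bytes).decode("ascii")
    return "data:{};base64,{}".format(mime, b64)

print(font_data_url(b"\x00\x01\x00\x00"))
# data:application/font-woff;base64,AAEAAA==
```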

nigel: I prefer external references for fonts because they allow caching.

glenn: There's a similar issue for backgroundImage resources.
... I don't have any open issues on this one.
... multiple row alignment
... I haven't worked through the possibility of using flexbox. I tried to generate some samples and they seemed to produce the same results.
... I don't want to introduce all of flexbox into TTML2. My current thinking is to define a TTML-specific property given the semantics according to the proposal.
... As part of the mapping to HTML it could potentially be mapped to flexbox.
... So I need to define a new property that is named appropriately and provides these semantics.

nigel: How will we reference the pre-existing similar feature in IMSC 1 and EBU-TT?

glenn: I don't mind drawing attention to this with a Note if people think that's useful - it's just editorial work.
... Superscript and subscript: I think I've already closed that.
... This issue is closed.

nigel: marks it as closed on the CP.

glenn: Change Proposal 16 - Style conditional
... I need to review this. I thought this change proposal had to do with an informal proposal I made where I described a condition attribute on some elements, where
... the value of the condition attribute is an expression in a simple expression language, whose evaluation, if false, would result in the element being excluded from presentation, otherwise
... treated as though there were no condition attribute present. This came out of the forcedDisplay discussion.
... I was going to have a simple expression language that at minimum looks like a list of functions in CSS where the names of those functions would be drawn from a list of
... predefined function list, e.g. "parameter(parameterName)" with some defined built-in parameters like "forced" so if you want to exclude some content based on this parameter
... being false then you would have a condition="parameter(forced)=true" that would be evaluated during the rendering and presentation process (specifically in the ISD generation process).
... I need to reread this CP and think about it - the proposal seems to have used something more like a media query expression. Sean wrote this originally I think.
... At minimum I want the condition mechanism to support forced semantics. Beyond that I don't have a real agenda.
... Other uses might include language.
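
glenn's sketch can be mocked up as a toy evaluator; the grammar, the parameter() function name and the evaluation strategy are all illustrative only, not the eventual TTML2 syntax.

```python
import re

def evaluate_condition(expr, parameters):
    """Toy evaluator for condition expressions of the shape glenn
    describes, e.g. condition="parameter(forced)=true". The grammar is
    invented for this sketch."""
    m = re.fullmatch(r"parameter\((\w+)\)=(true|false)", expr.strip())
    if not m:
        raise ValueError("unsupported condition: %r" % expr)
    name, expected = m.groups()
    return parameters.get(name, False) == (expected == "true")

# Content carrying this condition is kept only when the "forced"
# parameter is set at presentation time.
print(evaluate_condition("parameter(forced)=true", {"forced": True}))   # True
print(evaluate_condition("parameter(forced)=true", {"forced": False}))  # False
```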

nigel: Another is where you may want to preferentially display images vs text under some circumstances.

jdsmith: Are the only conditional inputs for this smooth animation and 4:3/16:9 video format? This looks like conditional styling.

glenn: I'm talking about more general conditional expressions
... It's an interesting idea to consider feature support conditionality.
... CP17 Default styles
... This is mostly closed. It remains to be defined what a pixel means.


<trackbot> issue-179 -- Interpreting the pixel measure -- open

<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/179

pal: I think most people with CFF and SMPTE-TT think that when they author the document the video object has a certain number of pixels, and those are the ones they refer to.
... They literally relate to the encoded pixels, those in the AVC stream for example.

glenn: Those pixels don't have a size at that point. Then they get mapped into a display pixel which does have a size.

pal: And my 640x480 then gets mapped to a display pixel on my 1280x720 display.

glenn: And at that stage the pixel has a concrete size.

pal: That's my understanding.

glenn: The rendered pixel is dependent on lots of other variables.

Cyril: But that's not the coded pixel either. In AVC for instance you code a pixel, an RGB or YUV value or whatever. Then you stretch that according to the pixel aspect ratio.
... and then you may apply a clean aperture, to cut out some of the image, to make it the right multiple of macroblock size. So my guess is that people authoring TTML base it on
... the result of this process, taking the output of this decoding process, then applying the sample aspect ratio, then any cropping.

glenn: I think this is an open question, it's not necessarily like that. For example in SD video you often have 720 pixels per line but you only display 704 pixels, so there's an 8 pixel buffer on either side
... to allow for overruns. So what we were describing is a 0-719 coordinate space whereas what you were describing was a 0-703 coordinate space.

Cyril: Yes, if the video was cropped then you'd have some invisible text.

pal: Some codecs have the ability to store a power of 2 number of pixels, internally. Then on the output it has internal cropping to put back the right value from the input.
... So is it literally the power of 2 internal array or the output of the decoder.

Cyril: Yes, to give an example, in an MP4 file you have 3 sizes:
... 1. The size of the buffer that needs to be allocated.
... 2. The result of applying sample aspect ratio and cropping.
... 3. Applying a possible scale to the result, usually not done.
... So in an MP4 file you have the sample entry width and height, the clap and pasp width and height, and track header width and height.
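
One plausible reading of the decode-time pipeline discussed here can be sketched in Python; the parameter names and the ordering of cropping versus aspect-ratio stretch are assumptions for illustration, not normative.

```python
from fractions import Fraction

def display_size(coded_w, coded_h, sar=Fraction(1, 1),
                 crop_l=0, crop_r=0, crop_t=0, crop_b=0):
    """Sketch of the decode-time pipeline: start from the coded sample
    array, apply cropping, then stretch the width by the sample aspect
    ratio. Names and ordering are assumptions for this example."""
    w = coded_w - crop_l - crop_r
    h = coded_h - crop_t - crop_b
    return (int(w * sar), h)

# 720x576 anamorphic SD with a 16:9 sample aspect ratio of 64:45
print(display_size(720, 576, Fraction(64, 45)))  # (1024, 576)
```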

pal: That first one you mentioned is what comes out of the decoder. That's what I think of when I think of stored pixels.

glenn: We want to pick one and go with it.

pal: I'm happy that we're not talking about a display pixel.

Cyril: I agree it's not the 3rd one. If you really want to use that you should apply the same metrics to the TTML result.
... The only choice is 'output of decoder' or output of sample aspect ratio plus cropping.

pal: I'd take out anything that's dependent on ISO BMFF.

glenn: +1

pal: I think people have been using decoder output pixels with no further transformation.

nigel: We need to find what's common across all formats.

glenn: The source buffer is common.

pal: I'd use that as a strawman.

glenn: Previously we said 'pixel as defined in XSL-FO'. But the definition there is ambiguous - it can be device dependent or what CSS says. CSS says 96 pixels per inch,
... but it doesn't say which inch applies. They have some angle-based visual model including distance, to compute that. It's complicated. I think the CSS people gave up on it and made
... it an absolute dimension. They did that because it's what most people actually use in implementations. We have a similar scenario - most people use a different interpretation
... from what's in the spec, for whatever reason.

pal: Yes, they see a video dimension size and go for that.

glenn: I think on TTML1 we should add an erratum that defines pixel, and then use it normatively in TTML2.

nigel: What's the proposal?

glenn: We have a tentative proposal to make pixel a 'stored pixel' (pal's term) or 'coded sample' (Cyril's term).

pal: Let me throw another one in the pot.

glenn: I like 'coded sample' because it avoids circularity of definition.

nigel: Can I clarify that we're talking about no tts:extent being specified on tt:tt?

glenn: I have to do something different based on whether or not there's a related media object.
... Otherwise there's another definition.

nigel: I'm worried that we end up with ambiguity between tt:tt@tts:extent and the related media object.

glenn: That's a different problem, that we also need to deal with at the same time.

nigel: This is exactly the same as the root temporal extent problem before - we need to relate the root spatial extent to an external display rectangle.

Cyril: I've checked H.264/AVC and HEVC and they both define a picture as an array of samples, and they both also define a message to carry sample aspect ratio.
... So the video may have one shape, and the TTML may define a different shape rectangle.

pal: That's right, it's also something we should talk about.

Cyril: [draws a picture] Decoder produces something with a Width and Height (within the decoder). Then you apply Sample Aspect Ratio (also within the decoder).
... and then, when you display something you may scale it, upsample/crop it etc.

glenn: The array size is the same before and after applying sample aspect ratio?

Cyril: No, the height is the same but the width may change.
... If you author using the W and H, and there's anamorphic conversion going on then you may need to apply some positioning of the TTML extent onto the video.
... The 'coded samples' are the ones before scaling with the sample aspect ratio.
... I don't know what to call the samples after scaling with the sample aspect ratio. 'scaled sample'?

courtney: I'd call them 'square pixels'

Cyril: that's not what AVC calls them, though it may make sense.

pal: As a strawman can we use 'coded sample'?
... the anamorphic 'pixels'

courtney: I think it makes sense to use the coded samples from the file because then the video in the file and the dimensions of the captions are consistent with one another, using the same metrics.

pal: my proposal is to use the term 'coded sample' and get feedback on that as an erratum.

glenn: I'll probably refer to some MPEG document for the definition of 'coded sample'. I'm going to say that a TTML pixel is a 'coded sample'.

Cyril: It would be good to provide examples.

glenn: one interesting scenario is that tts:extent doesn't match the coded sample size of the related video object. Another is where they do match.
... The third is where there's no tts:extent, but there is a new ttp:aspectRatio property.

pal: IMSC says either do 'matching pixel aspect ratio' or 'define aspect ratio' but not both.

Cyril: I think we agree but I want to check: If I take an anamorphic video, before capture by the camera an object has a particular shape. Then after capture it's 'squished' to be thinner.
... then after decoding it gets restretched to its original shape. And what's stored, from the perspective of the coding specification, is the squished shape.
... So 'square pixels' is something that depends on your perspective.

courtney: So where is the pixel aspect ratio square?

Cyril: It's after 'unsquishing'.

group agrees terminology

pal: In IMSC 1, I will make a revision based on this erratum. It should probably say that the goal is never to have to use tts:extent and always create resolution independent subtitles.

Cyril: Can I check that there's no impact caused by interlaced and progressive video?

glenn: We assume it's been deinterlaced.

courtney: +1

nigel: In TTML2 how does this impact on viewport-related widths and heights?
... Do we need to be concerned about the aspect ratio of the related video there too?

glenn: That's what I'm working on at the moment.

Cyril: SVG lets you specify a viewbox and relate viewport coordinates to that too.

glenn: I'm not sure if we need that too - maybe.
... That's all I need for CP17


<trackbot> Issue-210 -- The values for alpha in rgba() don't correspond to CSS3 Color definitions -- open

<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/210

glenn: CP17 - we allowed 0-255 alpha values but CSS3 defines a 0-1 scale. So there's an ambiguity if the value 1 is used.

courtney: Are the types different? Are the expressions differentiable by the decimal point?

glenn: In CSS3 you don't need the decimal if alpha value is 1. So you can't infer anything there.
... I think we just define the mapping into CSS, because then it's well defined.
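
A minimal sketch of such a 0-255 to 0-1 mapping, assuming a simple linear scale; the three-decimal rounding is an arbitrary choice for the example.

```python
def ttml_alpha_to_css(alpha):
    """Map a TTML1 rgba() alpha component (0-255) onto the CSS3 0-1
    scale. The three-decimal rounding is an example choice only."""
    if not 0 <= alpha <= 255:
        raise ValueError("alpha out of range")
    return round(alpha / 255, 3)

print(ttml_alpha_to_css(255))  # 1.0
print(ttml_alpha_to_css(128))  # 0.502
```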

nigel: I'd argue it's well defined already but just needs to be clarified.


<trackbot> issue-225 -- tts:fontSize as percentage of container dimensions -- open

<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/225

pal: TTML1 doesn't really say what you're supposed to do with pixelAspectRatio.

glenn: That's right - it's used to define authorial intent. It doesn't say how that should be applied.
... But we may need to reference that in the new verbiage.
... I put that in originally because PNG has a chunk that allows pixel aspect ratio to be defined.

pal: I think if we go down the path of coded samples then it might be good to make sure that pixelAspectRatio is set.

nigel: Can we resolve this by removing vmin and vmax?

pal: +1

issue-225: (f2f meeting) We agreed to remove vmin and vmax.

<trackbot> Notes added to issue-225 tts:fontSize as percentage of container dimensions.

nigel: CP25. At a minimum that comes down to adding a documentGroup identifier.

glenn: I'm happy to do that.

nigel: CP5?

glenn: There are too many details to discuss that. It involves converting to ISD! So I have to define that mapping to satisfy that.


nigel: Thanks everyone - we've covered a huge amount over two days, including:
... agreeing to publish WebVTT
... the MIME type extension
... Reviewing the IMSC 1 review comments and agreeing the CR exit criteria
... thinking about the feelings of our specs
... Considering the relationship between TTML and related video objects both spatially and temporally
... going through the TTML2 change proposals
... and we even had time to think about multimodal interaction!
... adjourns meeting




Summary of Action Items

[NEW] ACTION: cyril Draft a WG note explaining the differences and relationships between the various versions of TTML [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action02]
[NEW] ACTION: glenn to check if timeContainer explicitly has no semantics with timeBase smpte, markerMode discontinuous [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action05]
[NEW] ACTION: glenn to update point (1) of section 3.1 in ttml2 to refer to a new annex that defines new processorProfiles MIME type parameter [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action01]
[NEW] ACTION: nigel Make request to Philip and Silvia to change Living Standard to Editor's Draft. [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action03]
[NEW] ACTION: nigel Scribe notes on CR exit criteria for IMSC 1 based on meeting debate [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action04]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-10-29 01:38:26 $