See also: IRC log
<trackbot> Date: 27 October 2014
<nigel> scribeNick: nigel
All members introduce themselves. Observers: Jangmuk Cho, LG Electronics, interested in TTML and WebVTT
nigel: Summarises agenda, offers opportunity for other business
mike: There's an incoming liaison expected from MPEG
nigel: adds it to agenda
https://www.w3.org/wiki/TTML/CodecsRegistry
nigel: We've agreed to host a registry and define a parameter
... We need to work out where to define the syntax normatively
... And where to note the media registration once we've updated it with IANA.
Cyril: I had two comments on the registry:
... 1. The discussion that talks about the first order detection of capabilities.
... 2. The editorial nature of stpp vs application/ttml+xml
Mike: We proposed to MPEG that they have their part and W3C has its part.
glenn: That's true - we need to remove the prefix requirement on the registry page.
Cyril: I'd rather delete the sentence "When an entry of this registry is used in a codecs parameter..."
... Actually the whole paragraph - it's up to MPEG to define any codecs parameter, and we can define the suffix.
group discusses whether any reference to RFC6381 is needed at all
glenn: I got rid of the RFC6381 references and put the combinatorial operators in.
pal: The AND and OR operators are normative, so it needs to be clearly defined.
cyril: +1
glenn: First we have to define where we're going to specify the normative definition of this new parameter - it shouldn't end up in the registry.
cyril: +1
glenn: When we have it somewhere else we can refer to then we can shorten the registry page.
Cyril: Can we also discuss the 1st order aspect, where it says that the processor profile is guidance only and may not always be correct.
... We should be strict here.
glenn: My position is that what's in the TTML document is what's authoritative, because it stays with the content.
cyril: Both have to be the same.
glenn: Even if you say they have to be the same, it's possible for them to diverge. Elsewhere, type identifiers are always documented as hints
... and the actual data is where you determine the concrete type.
Cyril: I agree, but we need to state it more strongly: it is an error in general to have a mismatch.
glenn: It wouldn't be an error in the document.
Cyril: I agree - if the two differ then that's an error and the value in the document has precedence.
nigel: There's a lifecycle issue there - a new processor may come along that can process older documents.
glenn: So the outer parameter may reference a new superset profile?
nigel: exactly.
glenn: So the rule should be about consistency not identity.
nigel: The options for where to put it seem to be:
... 1. An erratum to TTML1
... 2. TTML2
... 3. A WG Note
... 4. A new Recommendation.
group seems to think TTML2 may be the best place
glenn: I'm willing to add it to TTML2 but do not wish to repeat the whole registration section.
https://dvcs.w3.org/hg/ttml/raw-file/default/ttml2/spec/ttml2.html
Glenn: We need to tweak section 3.1 in TTML2 "Content Conformance"
Cyril: I would rather add an annex in TTML2 called Media Types Registration referencing TTML1 plus diffs.
glenn: That works for me.
... I think we can avoid updating the IANA registration
... since there's no wording to prohibit adding new
parameters.
Mike: I'm not so sure about that.
nigel: I don't think we can do that either can we?
glenn: In TTML2 we should define the new parameter and syntax and then reference it in 3.1.
Cyril: We should put it in a Media Type Registration annex so that it's clear. This then uses a reference to TTML1 with the diffs.
glenn: I'd like to call the annex something like 'Additional parameters for use with TTML media type'.
Mike: let's check with plh too.
all agree to point back to TTML1 media registration.
nigel: It's an open question if we genuinely need to update the IANA registration.
Cyril: In the SVG WG we've made a comment on the charter about how documents and versions of documents in TR/ should be handled.
... We have the same problem here: TTML1 doesn't talk about TTML2. It points only to the latest stable version of TTML1.
pal: They're two different specs - there's no absolute guarantee for backward compatibility.
Cyril: And you're using the same MIME type? SVG 2 is backward compatible with SVG 1
glenn: TTML2 is backward compatible in the sense that a TTML1 processor would process a TTML2 document, practically speaking.
... I defined a new #version feature in TTML2 - if you want to author a document for TTML2 and prevent it from being processed by a TTML1 processor then you could do that by using the profile mechanism. If you don't do that then there's no reason that a TTML1 processor could not process it by ignoring things it doesn't understand.
... We explicitly stated that the XML namespace for TTML is mutable.
Cyril: The issue is that the group is giving the signal that TTML1 Second Edition is the latest version whereas actually everyone is working on TTML2. If a TTML1 document can be considered a TTML2 document...
pal: Setting aside the media registration, TTML1 and TTML2 are different specs, albeit with a common pedigree.
Cyril: If I search for TTML now, I'll find many documents and if I hit the TTML1 document it will look like the latest version, but that's probably not what I want.
pal: But that's right - the latest stable version now is TTML1. Even when TTML2 is a Rec TTML1 will be valid.
glenn: Think about XML - XML 1 and XML 1.1 are not entirely compatible but are still being updated.
... In TTML1 we may publish a Third Edition incorporating the errata.
pal: IMSC 1 is based on TTML1 for example.
glenn: In the latest version we actually include "ttml1" in the URL - TTML2 will be a different URL.
<glenn> ACTION: glenn to update point (1) of section 3.1 in ttml2 to refer to a new annex that defines new processorProfiles MIME type parameter [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action01]
<trackbot> Created ACTION-343 - Update point (1) of section 3.1 in ttml2 to refer to a new annex that defines new processorprofiles mime type parameter [on Glenn Adams - due 2014-11-03].
<inserted> Returning to Cyril's point about spec versioning, backwards compatibility and the relationship between versions of TTML
mike: There are a number of W3C specs that have this problem.
Cyril: And that's a problem.
pal: It's even worse if we don't make it clear that there are two different specs.
Cyril: This is going to stay a problem for anyone searching for TTML
<pal> http://www.w3.org/XML/Core/
<glenn> http://www.w3.org/TR/CSS/
pal: The XML WG is explicit about its different publications.
glenn: The CSS WG explains this with "Levels" and we could do that too, with a top-level TTML uber-document as a WG Note to explain the relationship.
Cyril: I'll volunteer to write a similar note.
<scribe> ACTION: cyril Draft a WG note explaining the differences and relationships between the various versions of TTML [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action02]
<trackbot> Created ACTION-344 - Draft a wg note explaining the differences and relationships between the various versions of ttml [on Cyril Concolato - due 2014-11-03].
nigel: That's helpful. Now what do we do about MIME types which may be the same for TTML1 and TTML2 documents?
glenn: That's a good question. In the TTML2 spec I defined a new set of baseline profiles, and a new ttp:version attribute.
... In §5.2.3.1 of TTML2 we list the 3 profiles from TTML1, plus SDP-US, plus 3 new profiles specific to TTML2 including newly defined features.
... One way to do it is to use different short names in the registry for each of these profiles.
... The ttp:version attribute is a little orthogonal. It states which version of TTML was used in authoring a document instance.
... It's required if the document requires support for a feature not present in TTML1.
... And the Note mentions that the computed value of the attribute is used by the construct default processor profile procedure.
... Omitting the attribute causes the default to be one of the TTML1 profiles.
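Glenn's description suggests document usage along these lines (a hedged sketch only: the attribute value follows the described rule that ttp:version marks a document needing post-TTML1 features, but the surrounding content is invented for illustration and is not normative):

```xml
<!-- Illustrative sketch, not normative: a document declaring it was
     authored against TTML2; omitting ttp:version would cause the
     default constructed processor profile to be a TTML1 profile -->
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    ttp:version="2">
  <body>
    <div>
      <p begin="0s" end="2s">Example subtitle text</p>
    </div>
  </body>
</tt>
```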
Cyril: This makes the processorProfiles parameter in the MIME type even more important.
glenn: It's important for any kind of external filtering.
Cyril: So there's the version attribute, the profile designator and externally there's processorProfiles that would use different short names for TTML2 than for TTML1.
glenn: Yes. In the new annex we should mention that designating externally that something is a profile could result in a false negative, or a false positive.
... A false negative in the sense that the context may think it can't process, but within the document it could say 'start with a profile and make a feature optional' at a very fine-grained level. This could add or remove feature requirements.
... So if the external parameter indicates that the processor can not process, but the internal parameter says it can, then it would be a false negative.
... The reverse, a false positive, is when the document requires an extension feature the external parameter doesn't reflect.
nigel: But you could just locally define a new profile short name for an extension, and put that in the external processorProfiles parameter with the AND operator.
glenn: There's also the question of, in the absence of the external parameter, what should the semantics be?
Cyril: it should be less restrictive.
glenn: It should certainly be no more restrictive than what's in the document. The problem is there may be some cases where there's no way to express the complexity in the external parameter.
... One of my observations is that people want to simplify the profiles mechanism in the external parameter. Is that what people are thinking?
pal: Unless we put something in then people will just define their own way of doing it.
courtney: From a parser-writing perspective there's no efficient way to look at the supported profile requirements. You have to parse the whole thing.
<pal> pal: doing it == "signal that a document conforms to one or more specifications"
glenn: It's not that bad - you can just parse the head.
courtney: You still have to parse the whole head.
glenn: It's much better than SVG 1.1 which mandates full parsing to determine if it is a well formed XML document, even down to the last closing tag.
courtney: So there's no goal to efficiently reject files?
glenn: We just used a similar mechanism to SVG. We assumed that the head would be parsed before the body.
courtney: It defeats the purpose of profiles to allow people to pick and choose features.
pal: If this group doesn't do it then others will.
nigel: But they probably wouldn't include the feature addition and subtraction semantics.
glenn: But the short registry doesn't define what a document conforms to.
group discusses the topic further (scribe misses details)
cyril: What does EBU-TT-D say?
nigel: It permits extensions by default, but wouldn't do anything with them. It doesn't use profiles.
... Andreas raised the query about how complex it is to register a short name - the email discussion seems to indicate that there's no requirement to create a full profile definition document; a short name can be registered with the details from the spec document in the absence of a full profile document.
... By the way, as I think Dave mentioned a while back, you can also just define a new short name if you don't want to hit the complexity of the internal profile mechanism.
pal: So we're okay to keep the same application/ttml+xml MIME type and use the processorProfiles parameter to distinguish between TTML1 and TTML2, and the processors within them.
RESOLUTION: We will document the syntax for the external processorProfiles parameter in TTML2.
... We will reuse the same media type in TTML2 as in TTML1 but recommend using the processorProfiles external parameter to differentiate processor requirements.
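To make the resolution concrete, an external parameter might eventually look something like the following (purely illustrative: the short names "ttml1-presentation" and "ttml2-presentation" and the use of whitespace as a combination operator are hypothetical, since the normative syntax was still to be defined in TTML2 at this point):

```
Content-Type: application/ttml+xml;processorProfiles="ttml1-presentation ttml2-presentation"
```

A receiver could use such a hint for first-order filtering, while treating the profile information inside the document as authoritative, per the discussion above.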
Cyril: what about the registry? Where should it go? The Media Source Extensions registry is published not on a wiki but as a document.
glenn: the usual practice in W3C is to use a wiki page.
<Cyril> https://dvcs.w3.org/hg/html-media/raw-file/default/media-source/byte-stream-format-registry.html
Cyril: For MPEG the registry needs to have a stable URL.
nigel and glenn: We can keep a stable URL with a wiki.
RESOLUTION: We will host the registry on the wiki (subject to edits as discussed today)
nigel: Opens up MPEG response for those in the room to see.
Cyril: MPEG accepts the registry proposal from W3C.
... MPEG has no comments on IMSC at this time.
... There was also discussion of track header width and height in the ISO BMFF. In many cases it's intentionally unknown what the TTML extent is.
... So MPEG drafted corrigenda to 14496-12 and 14496-30.
... Also in the MPEG discussion: from an MPEG perspective you can usually extract the codecs value from the bitstream. For TTML that isn't exactly the case, because you have to produce the short name from a profile designator in the document, so you need extra knowledge to convert the long name into the short name.
... So MPEG is considering adding a new box just to contain the MIME type to remove the need for the extra knowledge. You'd be able to put the MIME type of a TTML document in a TTML track.
... So MPEG fixed this (or it's in the pipeline to fix) so that the MIME type can be in the MP4 file.
glenn: Is there a suggestion that there should be an additional piece of metadata that includes the short names directly?
Cyril: I thought that would be redundant with the long profiles designator.
glenn: It is if you happen to know the details of the registry, but otherwise not.
... Adding such an attribute would make it more necessary to define the syntax in TTML2.
Cyril: That would be fine, but we've solved it separately in MPEG too.
nigel: I'd be concerned about putting the short codes in the TTML because that would encourage folk to extract the value from the TTML and put it into the (proposed) MP4 MIME attribute.
... That could limit the ability of distribution mechanisms creating MP4 files from old TTML files to generate an up-to-date external processorProfiles parameter.
glenn: I understand that concern. Early binding could be worse than late binding, for document wrapping.
group discusses the possibilities, concludes that we will not propose to put the short codes into TTML documents.
Additional observer: Tatsuya Hayashi - interested in time synchronisation with audio
nigel: shows on screen the early draft liaison. Describes an issue with signalling track header width and height fields.
glenn: I think I have an action for when no extent is specified and there is a related media object.
Cyril: In MP4 we had the problem that in adaptive streaming there may be an MP4 file with no video, just the text track.
mike: In the external context there's always a related media object somewhere, either in the file or in the MPD manifest in DASH.
glenn: We don't limit the scope of how the related media object is provided.
Cyril: We added a bit that says that the width and height can be the aspect ratio instead of the actual width and height, and can be zero if you don't know.
... From an MPEG perspective a video can have encoded pixel dimensions, or output pixel dimensions. In the end there's a presentation size expressed in pixels.
... Each video track can have a different presentation size, related to each other using transformation matrices.
pal: On the same dimensions?
Cyril: Yes, on the presentation coordinate system.
... We had two problems. Firstly, a document may be authored independent of resolution; secondly, the aspect ratio may not be known, and be important.
... The two corrigenda deal with this.
... The -12 spec was too prescriptive about visual tracks - it turns out that it is track type dependent.
... There is a new definition of a reserved 0 0 size value. Concerns have been raised.
mike: Introduces the liaison and supporting documents. They will be available to the group in the next few weeks.
Cyril: This is equally applicable to WebVTT, SVG tracks, HTML tracks, any vector-graphics-based tracks [as well as TTML]
... Takes us through the proposed changes to 14496-30.
group discusses track selection behaviour based on aspect ratio.
nigel: We have a proposal still to publish WebVTT as a FPWD. The edits requested to the staged version were made.
http://dev.w3.org/html5/webvtt/webvtt-staged-snapshot.html
group discusses the use of the term Living Standard and how forks are managed, bringing CG changes into the WG document, and FSA requirements for doing so; also that it's the editors' responsibility to maintain any differences between CG and WG versions, and to update the WG version each time a new version is to be published by the WG.
Cyril: Anyone wanting to work with WebVTT will always look at the editor's draft.
glenn: I don't mind the process, but I'm not happy with the term Living Standard.
... It looks like it will set a precedent.
mike: Use of the term Living Standard needs to be something that the W3C expresses a view on.
Cyril: What if Living Standard were changed to Draft Community Group Report?
glenn: I could live with that. Editor's Draft may be even better, to make it clear that this is within the normal WG process for a Rec track document.
... Objects to the use of the term Living Standard. Acceptable proposals to resolve this are "Editor's Draft" or "Draft Community Group Report".
pal: How will reviewers of future WG versions of the document be able to trace back why any particular change was made? (noting that this is only possible on the CG version now)
<scribe> ACTION: nigel Make request to Philip and Silvia to change Living Standard to Editor's Draft. [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action03]
<trackbot> Created ACTION-345 - Make request to philip and silvia to change living standard to editor's draft. [on Nigel Megitt - due 2014-11-03].
nigel: Going from WD to CR for WebVTT?
... Dave planned to send notes out essentially to the same set of recipients as we sent to for IMSC 1, as well as socializing at TPAC.
nigel: Current status is that we have a git repo with some but not all of the tests we worked on in Geneva.
courtney: And I don't have a draft document ready.
... I've prioritised working on the document ahead of the code.
pal: What's the timescale for this? It would be a great addition to the IMSC 1 test suite, if that software could be part of it.
courtney: My goal is to have a version of the document ready by the beginning of December, as a fairly complete 1st draft for comments.
... I don't know exactly when I will have the software ready.
nigel: We don't seem to have all the tests from Geneva - is there a reason why they can't all be submitted?
group: no reason
nigel: Okay, the action to submit them to me remains.
... It would make sense to transfer the git repo to courtney at some point - no pressure on time.
courtney: Okay, I can do that.
nigel: We have no more substantive points to discuss on this so let's adjourn for lunch.
... We'll reconvene at 1345
pal: LC-2968
https://www.w3.org/2006/02/lc-comments-tracker/34314/WD-ttml-imsc1-20140930/2968?cid=2968
... This is a version of implied inline regions, which isn't supported in TTML1. There's an alternative solution available, so I propose to do nothing.
... Describes notes and resolution.
... LC-2971.
... LC-2969.
... LC-2970.
... LC-2967.
nigel: We have to think about what exit criteria to write into the CR; if we have enough evidence of wide review; what the license expectations are for test material.
pal: What are the licensing expectations for test material?
plh: The goal here is, if the group is using a test suite to demonstrate suitability for advancement, the Director will want to make that test suite available to all.
... For that what we've done so far is to provide tests under a dual license: 1. If you're going to use those tests for conformance claims, you may not modify them.
... 2. For any other purpose we don't really care.
... The second one is the BSD license.
... What's the issue?
mike: The question is what W3C is asking for - which you just answered.
plh: W3C needs the rights to modify the files and make them available to the public under the W3C license.
group looks at the DECE license requirements
plh: It doesn't permit modification.
mike: I'll explore this with DECE to check that they can provide test documents that will be usable by W3C for this purpose.
... It would be useful if nigel or plh could respond to DECE's email asking if the example files can be issued under the standard W3C license (with the text of that license).
<plh> http://www.w3.org/2002/09/wbs/1/testgrants2-200409/
plh: The 'this is closed since 01 October 2013' is a bug - ignore it.
... The text you're looking for is in section 2 on this page.
nigel and philippe send DECE a note via Mike.
nigel: Next up - do we have enough evidence of wide review?
pal: The member submission itself was based on the work of 80 members, from both the CE and the content publishing space.
... This is a specification that was used in the implementation of playback devices. So it already received a significant amount of scrutiny.
... Since being brought to W3C it has received comments from SMPTE, EBU and DECE.
plh: On Sep 24 SMPTE said they have no outstanding comments on IMSC 1.
pal: Earlier today DVB stated that they have reviewed and have no comments.
... Plus we can point to the members of this group and their representation, and that it was reviewed by the a11y group of the HTML group.
... And we requested comments from a significant number of groups.
plh: Another group that would be important is the PFWG.
... At the end of the day it's a profile not a whole new spec.
glenn: I think this has wider review than TTML1 did, at least bringing in DVB, EBU and DECE.
plh: My feeling is that you've done enough.
nigel: Great, then the next point is CR exit criteria.
nigel takes group through slide pack on CR exit criteria, triggering much debate.
<scribe> ACTION: nigel Scribe notes on CR exit criteria for IMSC 1 based on meeting debate [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action04]
<trackbot> Created ACTION-346 - Scribe notes on cr exit criteria for imsc 1 based on meeting debate [on Nigel Megitt - due 2014-11-03].
<plh> http://www.w3.org/TR/2014/CR-html5-20140731/
plh: Calling it Living Standard will cause a problem. "Draft Community Group Report" would be acceptable.
RESOLUTION: We will publish the staged version of WebVTT as a FPWD when the edits to change the occurrences of "Living Standard" to "Draft Community Group Report" have been applied.
Cyril: summarises morning resolutions
plh: My understanding is that IANA needs a complete new registration to overwrite the previous one.
glenn: It seems confusing to have two different registrations for the same MIME type.
Cyril: We can link back from the TTML2 section on media registration to the TTML1 definition.
nigel: Mike volunteered to do the re-registration - is that okay for it not to be a member of W3C staff?
plh: Yes, it's fine for me to be out of the loop and not a bottleneck - I just have to tell the IESG that it's okay
pal: the other comments are from Andreas Tai
nigel: Let's go through them. Actually it's too hard to put them all in the issue tracker - we did 5.2 but let's just talk through the others.
pal: 3. Conformance - "subtitle document" not being defined. I can just take that on - it should probably say 'document instance' like TTML does.
... 4.1 - I'm happy to add the proposal.
... 5.7.3 - I agree it would be useful to show all the combinations of the parameter and the attribute for forcedDisplay.
... 5.7.4 - I should be able to implement the proposal and fix the typo.
... 5.10 - this is a good suggestion a priori
... 6.3 - This is a good point - I should link to 9.3.1 in TTML1
... 8.2 - Andreas is suggesting that we use the terms presentation processor and transformation processor now that we have them.
... I have to think hard about this. I think the intent of the text is the same as the note on overflow, where we talk about authoring of the documents.
... Section 8.2 doesn't necessarily say that the presentation processor shall lay out text in this way. I'm pretty sure it refers to how it's authored.
glenn: If that were the intent I would have written it differently, e.g. "for the purpose of avoiding overflow, the author shall or should..." etc
nigel: Since this is in section 8 does it only apply to layout for the purpose of calculating the HRM values?
pal: 8.1 is Performance, 8.2 is Reference Fonts. Since the HRM section is referenced as 'documents shall conform to these constraints' I think it's about authoring not presentation.
... If this is the case then there isn't a strict constraint on processors at all. I'll study this and come up with a proposal for us to consider in addressing Andreas's comments.
... Annex B Forced Content - we're going to put some code snippets in there.
... 5.10 #length-cell - I have to think about this. What's the reason for needing cell metrics in documents for linePadding?
nigel: It makes it easier to make the padding distance a fraction of the font size, which is a typical use case.
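As an illustration of the use case nigel describes (a hedged sketch: the element content and values are invented, and it assumes the EBU-TT linePadding style in the urn:ebu:tt:style namespace, where cell-based lengths are interpreted against ttp:cellResolution):

```xml
<!-- Illustrative only: padding of half a cell either side of each line;
     the cell size, and hence the padding, scales with cellResolution -->
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    xmlns:ebutts="urn:ebu:tt:style"
    ttp:cellResolution="32 15">
  <head>
    <styling>
      <style xml:id="paddedStyle" ebutts:linePadding="0.5c"/>
    </styling>
  </head>
  <body>
    <div>
      <p style="paddedStyle">Example subtitle text</p>
    </div>
  </body>
</tt>
```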
pal: 5.2 The current text was direct input from EBU so we shouldn't modify it lightly. Perhaps if EBU comes back and says something different this would be easier to change.
nigel: We've completed our agenda for today, so adjourning. Thanks all. We restart tomorrow at 0830.
trackbot, start meeting
<trackbot> Meeting: Timed Text Working Group Teleconference
<trackbot> Date: 28 October 2014
<scribe> chair: nigel
<scribe> scribeNick: nigel
<inserted> Day 2 - Tuesday 28th October
Observers: Noria Sakamoto - Toshiba, interested in broadcasting in TTML
Jerome Cho, LG Electronics - wants to meet FCC regulations for accessibility with TTML, WebVTT etc.
Francois Daoust, W3C, just observing.
Courtney Kennedy: Engineering Manager at Apple, responsible for subtitles.
Cyril Concolato, University in Paris, GPAC/MP4Box etc
<pal> Pierre Lemieux, supported by MovieLabs / IMSC1 editor
Debbie Dahl, Chair of Multimodal Interaction Group, observing. Interested in synergies between timed text and EMMA standard
Kazuhiro Hoya, Fuji TV interested in UHDTV, which will adopt TTML for closed caption
Nigel Megitt, BBC, Chair of TTWG
Glenn Adams, Skynav. Editor of TTML.
http://dev.w3.org/html5/webvtt/webvtt-staged-snapshot.html
glenn: Looks good to me.
Cyril: LGTM. When can it be published?
nigel: Tuesday next week at the earliest, depending on staff
Cyril: How will we publicise it?
nigel: I expect Dave Singer to publicise it to the charter dependency groups, W3C and external, and to the W3C liaisons.
... Dave has also suggested that we publicise it socially at TPAC too.
fantasai: The spec templates and styling over time have become outdated - we can use some cool web technologies to make specs more readable
... The scope of the design is not a back-end web app, just HTML/CSS. We want a design for desktop, mobile and print, in that order. It will change the markup of the headings and the boilerplate in the Status text that is data, not paragraph text. We'll push legalese to the bottom. The abstract should be 2-3 sentences above the fold, then the TOC available without having to scroll it. Then URLs, issues, feedback etc should be at the top in a more compact format.
... So the scope of it is to redo markup, boilerplate, styling. We'll look at styling, clean up the stylesheet to be more readable, and make sure that the document is still quick-scannable.
... A lot of styling that is ad hoc, like fragments and example code, could be harmonised across all the W3C specs. This will take a while - it's a side project for me. We want to take into account what the WGs need.
... The functional questions are [the ones on the agenda]. We also want general feedback on what to consider, e.g. protocol-relative links so we can switch to https, or an always-visible TOC.
... The first question is a subjective/emotional one - what should the style express, in terms of values.
... If we used primary colours and Comic Sans it would feel like a toy not a spec. But if we used a parchment background and an old-style font it would look old-fashioned.
... Neither is appropriate for W3C - we want to know what is appropriate though, in terms of how they feel.
glenn: Do you have any templates or ideas?
fantasai: We have a proof of concept but we don't know where it's going just yet. It's going to be experimental - design by consensus. We're asking for ideas from the community.
glenn: Part of the problem is that you have different audiences for different specs.
fantasai: We haven't had feedback on different styles for different document types/audiences. We want them to fit together and feel like they belong together.
glenn: One of the problems is that we have a lot of history.
fantasai: We want to make sure that every group has a working toolset - tell us what you use in response.
Cyril: Some specs have a developer view and an author view.
fantasai: Put that under question 6 'what else we should know/consider'
nigel: Can we answer the questions?
... Q1. 3-5 adjectives
Cyril: I have no idea.
ddahl: Authoritative
glenn: consistent
ddahl: comprehensive
glenn: One problem is that documents don't have the same styling. A lot of it is editor-specific.
... The variation may cause some problems. I've also worked with ISO and ITU, which crank out format-consistent specs that are somewhat impenetrable.
nigel: open, welcoming?
glenn: It would be nice to use newer styling mechanisms. You can't push the envelope too far without hitting browser variations.
courtney: clean, modern.
... Additional considerations: it should be something that will work for low-vision people, using magnifiers, voice-over etc.
nigel: URLs?
Cyril: TTML1, WebVTT, IMSC 1
nigel: +1
... Do we have documentation of our markup conventions?
glenn: Not really - we use XMLSpec as a technology (from 1999). Others use ReSpec (WebVTT and IMSC 1).
... It's very unlikely that we will adopt ReSpec for TTML.
... In TTML1 we have a conventions section in the document. XMLSpec and ReSpec are separately documented for markup.
nigel: Spec processing tools? We've looked at those already. Are there any more?
group: no more
nigel: Do we have any goals?
Cyril: To have the table of contents kept visible when scrolling, for easy navigation.
courtney: +1
Cyril: In some PDF viewers and Word, I like that searched-for words are listed as occurrences on a separate panel.
courtney: Better search would be good.
... When I find a page, it's hard to see the structure of the whole thing and relations with other specs.
nigel: I hate it when clicking on references takes you to the Reference section not the thing being referenced.
Cyril: +1 What's the point in it?
nigel: What about making defined terms links to other places where they're used?
Cyril: +1
... A way to list normative (testable) statements in the spec, to generate test suites automatically, would be great. CSS has that I think.
tidoust: It was the packaging spec format.
glenn: That was done by adding markup to every paragraph. Extracting assertions from a spec is a hard-to-automate, complex process. Maintaining it becomes quite challenging.
... Plus it's not a science. Declarative statements (X is Y) can be viewed as normative in some places, and then (X is required) implies (Y is required). It's a nice idea, but hard to do.
... People offer paid services to do that, because it's complex.
nigel: Is there anything else we should know/consider?
... Cyril mentioned the Developer view/Author view earlier.
<tidoust> [FWIW, I was referring to the fact that the Packaged Web Apps (Widgets) spec was written in a way that allowed the extraction of test assertions, see: http://www.w3.org/TR/widgets/ and the test suite: http://dev.w3.org/2006/waf/widgets/test-suite/#user-agent ]
Cyril: I think there's an HTML 5 spec (might be the WhatWG spec) that does differing views.
glenn: There's a lot of advantage
to marking up specs to allow automatic extraction. For example,
IDL fragments have conventions that allow some tools to
automatically pull out all the IDL
... to drive a test generation process. We don't have APIs
in TTML at the moment. If CSS had followed a similar convention
for how properties were defined, and HTML had
... followed conventions for how elements and attributes were
defined, then a similar tool could have been used. They didn't
adopt a convention though, so it's a manual task.
... Those are the kinds of tools that it might be useful to
consider. We could mark up elements and attributes. In the
original markup I used a few syntactic conventions to
assist.
... For example ID attributes. I use some specific conventions
for how identifiers are presented.
... I have never documented it anywhere, though.
glenn: I can walk us through
this. The issue originally came up from an example TTML
document with some negative time expressions.
... I immediately pointed out that you can't do that! I did
wonder why they were putting negative time expressions in a
TTML document.
courtney: Caption authors may use different timecode from the video editors.
glenn: Based on that I thought it
would be useful to have an offset from authored time
expressions to some useful point in the media, and allow the
player or processor
... to use that offset to achieve synchronisation rather than
mandating precise synchronisation between TTML times and the
media times.
... As I explored that some issues came up. One was the
difference between the Origin of the document timeline and the
Begin of the document timeline, and whether they
... are different times or the same time. I looked at SMIL and
SVG time semantics to try to ascertain what was used there. I
also reviewed the earlier TTML1 work.
... We have a concept in TTML that has its own terminology
definition in TTML2, Root temporal extent.
... When this talks about beginning or ending, does it mean the
beginning of the coordinate space or the first timestamp in the
document?
... Say a TTML document has a body with begin="10:00:00". Is
the origin "10:00:00" or 0s in the document? I eventually
tentatively concluded that the origin of a document time
coordinate space is always zero, and the beginning of the
document is always zero. That doesn't mean it's the timestamp
of the first timed element in the document.
... Everything in SVG and SMIL is predicated on the default
begin being zero, in the coordinate space of the document.
Recently I came round to understanding that the document
... time origin and the document begin point in time are the
same. Then if I want to synchronise a document with some media,
then what point am I synchronising? The origin of the
... media timeline or the begin. Let's say for example, I have
a related media object that starts at 5 hours into the media
timeline. The first timestamp in the media is 5:00:00 (5
hours).
... What do I want to synchronise the 10 hour time in the
document with, in the media? There are 2 options. One is to
have zero in the document time coordinate space correspond
to
... zero in the media time coordinate space. Another is to say
that there's an offset between the document time coordinate
space and the media time coordinate space, and that
offset
... is between the two origins of the coordinate spaces. A 3rd
option is to pin the origin of the document coordinate space
(zero) to the begin of the media time coordinate space (5
hours).
... That latter one doesn't seem to be quite so correct.
... [draws a picture]
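The three candidate models can be sketched as a hypothetical Python toy, using the figures from the discussion (media timeline begins at 5 hours, body begin="10:00:00"); the function names and constants are illustrative, not from any TTML specification:

```python
H = 3600  # seconds per hour

MEDIA_BEGIN = 5 * H   # first timestamp on the media timeline (5:00:00)
BODY_BEGIN = 10 * H   # body begin="10:00:00" in the document

def model_1(doc_time):
    """Option 1: Origin(document) syncs to Origin(media);
    the coordinate spaces line up directly (zero offset here)."""
    return doc_time

def model_2(doc_time):
    """Option 2: Origin(document) syncs to Begin(media)."""
    return MEDIA_BEGIN + doc_time

def model_3(doc_time):
    """Option 3: Begin(body) is pinned to Begin(media)."""
    return MEDIA_BEGIN + (doc_time - BODY_BEGIN)

print(model_1(BODY_BEGIN) / H)  # 10.0 hours on the media timeline
print(model_2(BODY_BEGIN) / H)  # 15.0 hours
print(model_3(BODY_BEGIN) / H)  # 5.0 hours
```

Under option 1 the 10-hour body time lands at 10 hours of media time; under option 2 it lands at 15 hours; under option 3 it is pinned to the media's 5-hour begin.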
nigel: Is this predicated on timebase="media"?
glenn: Let's assume that. The
general answer may extend to continuous SMPTE timebase
too.
... I have two entities, a video and a document. Each has a
timeline - the video content has a timeline and the document
body has a timeline.
... The root temporal extent of the document is the timed
beginning of the document to the timed ending of the document.
The choices seem to be the origin or the start of the
... first timed element.
... I think the logical begin is always zero, if begin is not
specified.
... So this is the document time reference synchronisation
point. What do we want to tie it to - the beginning of the media
or the origin of the video time coordinate space?
... My thinking has evolved on this. Originally I thought that
Begin(body) would be synchronised with Begin(media). Then the
offset would be between those two points.
... The more I thought about it the less viable it seemed to
be. Eventually I came to the conclusion that we should
synchronise Origin(document) and Origin(media).
... Then if they happen to correspond, and both the video and
the body say 10h in their own coordinate spaces then they would
line up.
... i.e. they would be isomorphic time spaces with zero
offset.
... One of the interesting example issues is: what if the
playback rates differ between the media and the document. What
happens to dilations or contractions in the timelines?
... It seems like if they're both synchronised with the zero
point then any modification of the playback speed, as long as
they're coordinated, would work out pretty well.
... It means that you can simply multiply the coordinates with
the playback rate.
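That scaling remark can be sketched as a trivial helper (illustrative only):

```python
def dilate(coordinate, rate):
    """With document and media both anchored at the zero point,
    a coordinated playback-speed change is a plain scaling of
    every time coordinate by the same factor."""
    return coordinate * rate

# a dilation factor of 2 stretches a 10s coordinate to 20s
print(dilate(10, 2.0))  # 20.0
```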
<Cyril> scribeNick: Cyril
nigel: (describing an email sent
earlier)
... I analyzed it a different way
... with all possible combinations of timebase and related
media
... 3 different timebases in TTML: media; SMPTE; and
clock
... what "relationship" means in the temporal extent
definition
... if there is no related media, there is no relationship,
that's easy
... the root temporal extent is from begin to end of
document
... they can be unconstrained
... begin is origin and end is infinity
... if clock time is used, there might be a relationship with
some media
... nothing here contradicts Glenn
... example: tape with every frame with timecode
... if you are using media times, the origin of the document is
the begin of the media
... you expect the origin of the document to be equivalent to
the begin of the media
... 5s in the document means 5s in the media
... the root temporal extent is constrained by begin media/end
media
glenn: SMIL and SVG make the
difference between the specified and active time interval
... the question: is root temporal extent meant to express the
active interval or the extent of the time coordinate space of
the document?
nigel: the next limitation is
when you have media with SMPTE timecode and SMPTE timecode in
the document
... the document times and media times are actually the
same
... so no offset applies here
... for marker mode = continuous, this is equivalent to saying
origin(document)=origin(media)
... however the rule as stated also works for marker mode =
discontinuous
... i.e. when document times = media times
... next: media with SMPTE time codes and clock in the
document
... the only interpretation is that the document times are
supposed to be equivalent to clock times when you play the
media
... ex: document time says 10:05, but starts playing at
10:03
... the use cases for this are strange
glenn: wall clock values are converted to times by subtracting the wall clock start time of the document (according to SMIL)
nigel: consistent with my interpretation?
glenn: yes
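A minimal sketch of that SMIL-style conversion, using the 10:03/10:05 example above (the names are illustrative):

```python
from datetime import datetime

def wallclock_to_doc_time(wallclock, doc_start):
    """Convert a wall-clock value to a document time by subtracting
    the wall-clock start time of the document (SMIL-style)."""
    return (wallclock - doc_start).total_seconds()

playback_start = datetime(2014, 10, 27, 10, 3, 0)  # playback began at 10:03
cue_wallclock = datetime(2014, 10, 27, 10, 5, 0)   # document says 10:05
print(wallclock_to_doc_time(cue_wallclock, playback_start))  # 120.0
```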
nigel: then there is a category
of media with no SMPTE time codes, but with media times in the
document
... same as glenn, the origin of the document is equivalent to
the origin of the media
... again the framing that glenn talked about applies
... the active time cannot go outside of the playback
glenn: the media active interval is as if it was a parent of the document active interval
nigel: if audio is continuing but
video is not, the viewer is continuing, you should be
presenting subtitles
... this is an implementation case
... any offset that needs to be applied will be
externally
... I don't want to duplicate what is already in MP4 files for
instance
glenn: the timeoffset I came up
with makes it easier
... if the house that made the media did not have the media in
hand, they can provide the offset
nigel: is that a hypothetical case?
courtney: no
... different houses will have different conventions
... when someone gives content to iTunes, they give a
video
... later on they'll get European or Asian conventions
nigel: i don't understand the convention
courtney: sometimes people don't want to use zero
nigel: in SMPTE time code yes
<nigel> scribeNick: nigel
Cyril: I understand Courtney's
use case. The TTML document doesn't reference the media itself.
So it will be used in some external context with the
media.
... For example an MP4 file, MSE, DASH. All of those have
timestamp offset facilities, so I'm puzzled why we're talking
about this here.
glenn: Those are all different
systems with different ways to express the offsets. If it only
can be carried outside the document it might get lost.
... It's useful to have it in the document as a reference point
to express the intent of the author. We often need to export
things from in the document to outside the document.
Cyril: So you want to export from the document some time reference?
glenn: yes
Cyril: That is fine.
glenn: Courtney isn't the only person to bring this up - I've had other reasons to add this over the years.
courtney: complex production workflows do mean that we sometimes need to do this.
Cyril: These examples seem to be overly complicated.
glenn: I think nigel wanted to
cover all the cases, which is a useful exercise.
... Neither of us defined what BEGIN(document) and
ORIGIN(document) actually meant, which is a problem when
talking about this!
glenn: Is it the time of the first thing in the time frame or the origin of the framing time?
nigel: the next row is where the
media doesn't have timecode but the document has smpte
timecode, which may start at some arbitrary point according to
convention.
... In that case I can see that an offset would be useful, to
say "the start of the media is at e.g. 10:00:00". I'm less
comfortable doing this with media timebase, but it's quite
closely related.
Cyril: The same problem will
arise with WebVTT.
... The general problem is how to carry in-band the time value
of the begin of the media in the document timeline.
nigel: Do houses really begin at 300s?
courtney: I only see this as a real world problem with TTML, not WebVTT.
glenn: I've seen examples with media timeBase. For example, taking into account a pre-roll of 13s.
courtney: This could also apply to WebVTT at some point in the future.
glenn: I added a few notes. I
need 2 questions answered.
... My hypothesis is that the label BEGIN(document) means the
origin of the document, i.e. zero on the document time
coordinate space.
... I believe that's most accurate in relation to SMIL and
SVG.
... This is not the time of the body.
... Now the question is what we call the Root Temporal Interval
- does it also start at the origin, or at the body. We may need
to distinguish the active root temporal interval
... from the overall unqualified root temporal interval.
... I want to see if the group can agree that hypothesis.
... Then I need a decision on which of the 3 models to use to
describe the timing relationships.
... 1) The two origins sync up.
courtney: I don't see how that solves it.
Cyril: That's the only one that works!
nigel: +1
glenn: In that case the offset is
the difference between the origins.
... 2) The origin of the document syncs up with the beginning
of the media.
... This one seems more natural to me because 10 hours means 10
hours into the video.
nigel: Not if the timebase is smpte! You have to enumerate all the options.
glenn: 3) Begin(body) is
begin(media). I don't think this one works too well.
... When you use media times instead of timestamps then you
mainly want 2). But with SMPTE timecodes in the media it seems
like 1) may be more applicable.
nigel: I think that's right.
Cyril: There's another way to do
this - what happens if you have an audio track with some offset
too?
... In the MP4 and DASH case, and all the others I know, you
only care about the media itself. The TTML document has an
anchor point, e.g. 10 hours if that's the begin of the
media.
... Then you use that to anchor it onto the timeline. In MP4 if
the video has a big gap at the beginning, you use an offset to
say when the beginning should occur. Same with the audio.
... The TTML document should just give its anchor point as the
time value in its coordinate space that corresponds to the
beginning of itself.
nigel: +1 that's the proposal I made too.
glenn: So you're saying a media begin point as opposed to an offset in the document timeline?
nigel: yes.
glenn: So if zero in the document
is zero in the media, the media offset is 0.
... And for SMPTE timecode with the 10:00:00 convention the
value would be 10:00:00.
... I like that suggestion because it seems to work regardless
of the timeBase. Have you worked through any play rate
differences?
nigel: I'm not confident that I've worked through all the playrate consequences.
Cyril: The solution seems to be found - it needs to be liaised back to MPEG because it affects the carriage of TTML in MP4. You'd have to store it in the MP4 file.
glenn: Couldn't you just look in the TTML document?
Cyril: Let's say your document has an offset of 10 hours - will the first sample say 10 hours or zero - is an edit list required?
glenn: In SMIL you can have captions that start before the media and end after, but get effectively truncated. Why would it affect the carriage in MP4? You can still look inside the document.
Cyril: When you stream/seek/segment the document you don't want to look inside it.
glenn: I think I have enough guidance on this to move forward on resolving it.
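A minimal sketch of the anchor-point idea Cyril and nigel converged on - the document carries the value, in its own coordinate space, that corresponds to the beginning of the media. The names are hypothetical, not proposed syntax:

```python
H = 3600  # seconds per hour

def to_media_time(doc_time, doc_anchor):
    """doc_anchor: the document-coordinate value that corresponds to
    the beginning of the media, e.g. 10:00:00 under the SMPTE
    10-hour convention, or 0 when the coordinates already line up."""
    return doc_time - doc_anchor

# SMPTE convention: a cue at 10:00:05 maps to 5s into the media
print(to_media_time(10 * H + 5, 10 * H))  # 5
# media-time documents with no convention: the anchor is simply 0
print(to_media_time(5, 0))  # 5
```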
nigel: Let's take a break - back at 11.
action-345?
<trackbot> action-345 -- Nigel Megitt to Make request to philip and silvia to change living standard to editor's draft. -- due 2014-11-03 -- PENDINGREVIEW
<trackbot> http://www.w3.org/AudioVideo/TT/tracker/actions/345
close action-345
<trackbot> Closed action-345.
nigel: sorry that was the wrong action but it was done!
action-346?
<trackbot> action-346 -- Nigel Megitt to Scribe notes on cr exit criteria for imsc 1 based on meeting debate -- due 2014-11-03 -- PENDINGREVIEW
<trackbot> http://www.w3.org/AudioVideo/TT/tracker/actions/346
close action-346
<trackbot> Closed action-346.
nigel: Goes through scribed notes
- group makes edits
... Conclusion is:
Our criteria for exiting CR will be:
Provide an implementation report describing at least 2 independent implementations for every feature of IMSC 1 not already present in TTML1, based on implementer-provided test results for tests and sample content provided by this group.
We will not require that implementations are publicly available but encourage them to be so.
We will not exit CR before January 16th 2016 at the earliest.
pal: That's enough for me to edit
the SOTD in the CR draft - I'll need to get respec.js to allow
this custom paragraph.
... The next CR draft may include this text in a weird style
just to get around respec.js.
nigel adjourns meeting for lunch - restart at 1300
group reconvenes
Reviewing change proposals at https://www.w3.org/wiki/TTML/ChangeProposalIndex
Change Proposal 15
https://www.w3.org/wiki/TTML/changeProposal015
glenn: margin - this would be very easy to add since there's a straight mapping to CSS. I haven't had enough feedback that it's needed.
nigel: There's nothing from EBU
courtney: margin isn't permitted in WebVTT either.
glenn: The described use case,
for indenting 2nd and subsequent lines, wouldn't be supported
by margin anyway. It's really a hanging indent.
... We don't have any indent support, hanging or otherwise.
nigel: Is there a related issue for margin?
glenn: I don't think so.
... I'll edit this on the fly now.
... marked as WONTFIX.
... box-decoration-break. This got moved in CSS to
'fragmentation'
<glenn> http://dev.w3.org/csswg/css-break/
<glenn> http://www.w3.org/TR/css3-break/
glenn: but the short name is still css-break! It was last published as a WD in TR on January 16.
<glenn> http://www.w3.org/TR/2012/WD-css3-break-20120823/
glenn: Mozilla seems to have an implementation of this that's working.
nigel: Even in today's draft the property and value combination still exist.
glenn: We have two options in
TTML2 syntax: either use box-decoration-break directly or go
ahead and use something simpler like linePadding and map it to
this CSS property.
... The latter disconnects it as a feature from this particular
instantiation.
nigel: That's the normal way we do it, but I can see that, with padding specified on content elements, adding it a second time through linePadding would be a duplication.
glenn: In that case I proposed adding support for box-decoration-break in addition to padding on inline content elements that can now be specified.
PROPOSAL: support the EBU line padding proposal with the combination of padding on inline content elements and box-decoration-break.
RESOLUTION: We will support the EBU line padding proposal with the combination of padding on inline content elements and box-decoration-break.
issue-286: (TTWG F2F today) We will support the EBU line padding proposal with the combination of padding on inline content elements and box-decoration-break.
<trackbot> Notes added to issue-286 Extend the background area behind rendered text to improve readability.
glenn: border - we've added
border and made it applicable to both region and certain
content elements - body, div, p and span.
... One of the open questions is that border in css is a short
hand for specifying the width height and colour of all the
borders simultaneously, not each border separately.
... As well as this super-shorthand border property, there is
the border-width, border-style and border-color properties,
which allow those values to be specified on all or any from 1-4
borders separately.
... Then finally there are 12 long hand versions for each of
these plus -top -right -bottom and -left.
... I've implemented the shorthand, but we could go for the
more longhand versions.
nigel: We should check what's needed to match 708 window styles.
courtney: The FCC regulations don't go to the level of granularity of this.
glenn: I think this note came up when we were doing SDP-US - there has been a request in the past to describe which 708 features are supported in TTML.
courtney: That's not the same as
a requirement. I don't know of any examples of subtitles with
borders on them.
... 708 says borders may be raised, depressed, uniform or
drop-shadow.
... I don't see anything about styling the different sides
separately.
glenn: It's not clear what the
mapping is for all those values. box-shadow in CSS may apply
where drop-shadow is required in 708.
... They also introduced border-radius.
nigel: Let's move on from this - I think we've done enough.
glenn: line stacking strategy. I
don't think we need to do anything on this right now - I put
this in originally, so I'll mark it as under review by
me.
... region anchor points - this was a proposal from Sean to
have an auto keyword for the origin and extent properties on
regions.
... I believe there's something like this in WebVTT.
... TTML doesn't have these at the moment. Sean was the
champion and we don't have any other champion or requirement
for this right now.
... I would say we should not take any action on this right
now.
nigel: I agree - it's unclear even how the proposal maps to the WebVTT way of positioning and sizing regions.
glenn: text outline vs text
shadow. When we defined textOutline in TTML1 CSS was also
working on an outline property.
... the new CSS definition of drop shadow allows you to specify
multiple shadows simultaneously.
courtney: the FCC regulation requires text edge attributes: normal, raised, depressed, uniform and drop-shadow.
glenn: TTML1's textOutline offers thickness and blur radius. You'd have to have multiple simultaneous shadows to achieve raised and depressed styles.
courtney: Authoring that would be complex.
glenn: XSL-FO defined a text
shadow property even though CSS had not done so. We ended up
calling it textOutline and we also limited it to just one
dimension, not two.
... It's now officially defined in the CSS 3 text decoration
module, called text-shadow. It takes 2 or 3 length
specifications.
<glenn> http://dev.w3.org/csswg/css-text-decor-3/#text-shadow-property
glenn: What we could do is define some new keywords that the processor can map. That makes it easier for the author to choose amongst the different choices including raised and depressed.
courtney: That seems like a nicer way to do it.
nigel: +1
glenn: There are two questions: firstly, should we change the name from textOutline to textShadow? I would say no. We can just define the mapping semantics, and already have different naming anyway.
Proposal: retain the attribute name textOutline.
glenn: Proposal: add two new
keywords for raised and depressed to meet FCC requirements and
define mappings.
... There's a third proposal to add a 3rd optional length
specification. This would allow separate definition of offset
in x and y as well as blur.
... I see that textOutline doesn't offer a shadow, but a
uniform outline that expands by the required length around the
glyph.
... I need to think about this some more.
... Now I recall why we thought about adding a new attribute
called textShadow, to allow this. I don't want to take away
textOutline and remove backwards compatibility.
... either we enlarge the definition of textOutline to make it
include shadow, or add a new textShadow property. I need to
review it and see if I can come up with a proposal that
works.
nigel: We're getting behind on the agenda. We'll come back to this later.
https://www.w3.org/wiki/TTML/changeProposal025
<scribe> scribeNick: courtney
nigel: tab to autocomplete is great!
nigel: topic is combining groups of documents
cyril: in the tool mp4box, if you import ttml files to mp4 and concatenate more than one ttml file, then extracting the track should give you a combined ttml document.
glenn: xml:id uniqueness- a
similar problem exists in ISD creation as described in the
document combining proposal.
... btw, do you have an example of a specification for a merge
algorithm? an xml syntax?
nigel: rules are laid out in presentation
glenn: so you wouldn't have some way for documents to specify a set of rules that it can follow?
nigel: no, there would be an external set of rules
glenn: would there be any content support required- additional metadata, etc?
nigel: no
glenn: you could exclude documents that contain elements with mixed content
nigel: perhaps, but that might be difficult because you could not use break spans within a sample.
<courtney_> nigel: normalize whitespace for comparison of samples.
<courtney_> nigel: to compare elements, need to transform their times into a common timeline.
<courtney_> glenn: you could translate to the isd space first prior to comparison.
<courtney_> nigel: not sure what is possible with that approach.
<courtney_> glenn: this is a transformation process, could be a separate spec from TTML.
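The rules sketched above (a common timeline, whitespace normalisation, xml:id uniqueness) might look something like this toy merge; the dict structure and the id-collision policy are invented for illustration, not a proposed algorithm:

```python
def merge(docs):
    """Merge timed documents onto a common timeline.
    docs: list of (offset, cues) pairs, where offset places each
    document on the common timeline and each cue is a dict with
    'id', 'begin', 'end' and 'text' keys."""
    merged, seen_ids = [], set()
    for offset, cues in docs:
        for cue in cues:
            cid = cue["id"]
            if cid in seen_ids:           # xml:id values must stay unique
                cid = f"{cid}_{offset}"   # one possible renaming policy
            seen_ids.add(cid)
            merged.append({
                "id": cid,
                "begin": cue["begin"] + offset,  # move to common timeline
                "end": cue["end"] + offset,
                "text": " ".join(cue["text"].split()),  # normalise whitespace
            })
    return sorted(merged, key=lambda c: c["begin"])

docs = [(0,  [{"id": "s1", "begin": 0, "end": 2, "text": " Hello  world "}]),
        (10, [{"id": "s1", "begin": 0, "end": 2, "text": "Bye"}])]
print(merge(docs))
```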
<glenn> ACTION: glenn to check if timeContainer explicitly has no semantics with timeBase smpte, markerMode discontinuous [recorded in http://www.w3.org/2014/10/27-tt-minutes.html#action05]
<trackbot> Created ACTION-347 - Check if timecontainer explicitly has no semantics with timebase smpte, markermode discontinuous [on Glenn Adams - due 2014-11-04].
<courtney_> glenn: does the ttp:documentGroup type <xsd:NCName> proposal match the syntax of Id in XML?
<courtney_> glenn: yes it does match
<courtney_> what's the motivating use case for this?
<courtney_> nigel: to archive live created subtitles documents and to be able to create distributable time constrained segmented documents for streaming
<courtney_> Cyril: I'm not convinced there is a need for standardization here yet.
<courtney_> pal: is there a need for a standard when dealing with a private archive where the owner controls what goes in and what comes out?
<courtney_> courtney_: ttml requires lots of small files for captioning live events, and this seems like a limitation to me
<courtney_> pal: no it doesn't have to be, streaming inherently involves lots of files
<nigel> scribeNick: nigel
glenn: shows a terminal window!
Invokes some code (ttx.jar) with a command line specifying the
external-duration and an input ttml file
... Looks at TTML input document, that would present as 0-1s:
Foo, 1-2s: Bar, 2s-[unbounded]: Baz
... Code validates the input and then writes out 3 Intermediate
Synchronic Documents (ISDs).
... looks at output documents. isd elements in new isd
namespace, with begin and end on the top level element.
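The time-flattening step glenn demonstrated - splitting a timeline into intervals over which the set of active elements is constant - can be sketched as a toy (not the TTML2 ISD construction algorithm):

```python
import math

def isd_intervals(events):
    """events: (begin, end, text) triples. Returns one interval per
    span over which the set of active elements does not change."""
    bounds = sorted({t for b, e, _ in events for t in (b, e)})
    isds = []
    for start, end in zip(bounds, bounds[1:]):
        active = [txt for b, e, txt in events if b <= start and end <= e]
        if active:
            isds.append((start, end, active))
    return isds

# the example from the demo: 0-1s Foo, 1-2s Bar, 2s-unbounded Baz
print(isd_intervals([(0, 1, "Foo"), (1, 2, "Bar"), (2, math.inf, "Baz")]))
```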
group questions status of this work.
glenn: TTML1 defines ISDs but no
serialisation of them, nor are all semantics fully defined. In
TTML2 ED there's an annex that defines these, with a syntax for
ISD.
... This is a proposal with the option for change.
pal: If we say any TTML document can be split into a number of ISDs, why isn't each ISD itself a TTML document? Why introduce a new structure?
glenn: Some good reasons. One: the constraints are different in an ISD document than in a TTML document. For example region elements have a body child.
nigel: You could create the ISDs as individual TTML documents prior to rehoming the body to each region and resolving the styles, as another option.
glenn: I explicitly wanted to put
the ISD into a different namespace to reduce confusion. I
realised when I started to formalise this that if I started
with tt:tt and made it
... polymorphic then it would be much harder for people, and
parsers, to understand.
pal: +1 for that
... More fundamentally, can the ISD format be mapped into a
TTML document?
glenn: I don't know - that wasn't
in my thought process.
... There are two reasons for doing this work. One is to create
HTML versions - you have to convert into a time-flattened
version of the original TTML and apply the styles, and resolve
region references.
... I wanted to make sure that process was fully articulated,
which is essential to move forward.
... The other strong reason is to make it easily mappable into
the cue structure of HTML text track. My model for each of
these ISDs is one cue.
... Microsoft in the past tried to put a TTML1 document into a
cue. It wasn't standardised anywhere. I want to have a good
story for generic mapping TTML into a sequence of cues
... that fit into the TextTrack model. So my motivation was
that each ISD should be representable as a cue, and furthermore
to be distributable as a sequence of ISDs.
pal: How can I turn this back into a TT document?
glenn: I don't know - I didn't want to do that.
pal: So you've effectively created a new format.
glenn: It was my intention to
make this a new format that could be used for
distribution.
... The other option would be to heavily profile TTML to allow
it to be distributable. By the way, there are already more than
one kinds of document that are specified by TTML.
... My proposal would be to use the same MIME type and a
different document type within that.
pal: My initial feedback is this introduces a new format in a world that has too many already!
nigel: There are multiple steps
here. The first is to formalise something that's only
conceptual for describing an algorithm in TTML1; the second is
to make it a serialisable format.
... If your end goal is a TextTrack cue why not go all the
way?
pal: My feedback is that we should store these as TTML documents.
glenn: Not only has timing been
flattened in this process but also styles. The only styles that
are expressed here are those that are not the initial
values.
... [shows ISD output] There's an attribute and element called
"css" meaning computed style set. Coding wise, this has been an
important step for validating our algorithm.
... Notice that it still uses the TTML namespace - it copies
the body into the region element; there can be more than one
region in the isd.
group expresses some reservations about defining a new format
glenn: There are some questions:
1. Is it important and useful to define a serialisation format
for ISD?
... I think it's both. It would help in many ways and reduce
the discussion about streamability.
Cyril: It's not TTML anymore, so it's not streaming TTML. It's streaming something else.
nigel: We have a wider
environment in which organisations are creating and
distributing TTML documents and writing players. There's no
problem splitting temporally on the client side,
... so creating a new format where the temporal division
happens server side doesn't seem to be necessary.
group adjourns for a break
back at 1600
nigel: Introduces Debbie as an Observer who has some requirements for multimodal interaction and thought the solution space may involve TTML.
ddahl: Introduces EMMA 1.0
ddahl: Emma represents captured
user input in different formats.
... Now considering capturing system output as well as user
input. It's helpful to have inputs and outputs in the same
format for processing, debugging and analytics.
... Could be static, defined ahead of time
... Generated dynamically by an intelligent system
... EMMA is an XML language. We're thinking about capturing
output. [shows an example]
... This example happens to have an ssml message in it, the
<speak> element. SSML has lots of available complexity,
not used in this example.
... Then other multimedia things might go along with it, such
as HTML and other kinds of multimedia output - SVG, whatever
seems appropriate to the application.
... My original question was: if we want to speech synthesise
some output or synchronise pre-recorded audio with some other
kind of multimedia, e.g. video, an animation of
... planes going across a map etc.
... How could we take advantage of the work done in TTML to
make life easier for us in multimodal interaction to
synchronise multimedia outputs generated in real time by
interactive systems.
pal: If what's generated is
audiovisual, that's a possibility.
... maybe you want to provide captions.
courtney: there could be a series of responses with timings.
ddahl: You could say "There are
flights to Boston from Denver..." and then ask a follow-up
question. When you ask what time of day would you be interested
in flying, at that point
... maybe you display a form.
courtney: If you had some animation that shows a map, and you know it will play for 3 seconds, and then in 3 seconds post your next question?
ddahl: Yes. Would it be as simple as incorporating TTML maybe wrapped around another element.
nigel: Thinking about the
concepts in TTML, there's a timeline against which things could
be synchronised, plus styled and positioned text. There is an
issue raised for associating
... audio representations of text, but at present the spec
describes visual rendering only.
... It could be that SMIL is a good place to go.
pal: It's certainly more
flexible.
... Is this purely semantic or is there a playback
requirement?
ddahl: In the vision, there's a system that renders the captured input into something human-understandable. In the end there would be playback.
pal: The more you're interested in playback the closer you are to TTML or WebVTT which are intended to be used for playback.
courtney: If you're synchronising other kinds of media it's a good choice.
ddahl: I guess you could use TTML and SMIL?
glenn: That's right - TTML is
designed to be referenced by the <text> element in SMIL.
In the abstract for TTML we say:
... "In addition to being used for interchange among legacy
distribution content formats, TTML Content may be used directly
as a distribution format, providing, for example, a standard
content format to reference from a <track> element in an
HTML5 document, or a <text> or <textstream> media
element in a [SMIL 3.0] document. "
pal: If you'd like to display text or captions over audio or video you should use TTML.
nigel: If you want to display any text that changes over time then TTML is a good fit. Probably the time modes in TTML are rich enough to support any use case you're likely to have.
<Cyril> scribeNick: Cyril
glenn: you can refer to TTML1
today, because TTML1 is REC
... if you need features of TTML2, you'd have to wait
ddahl: we've done work on what we
call "output timestamps", for when it is planned vs. when it
happened
... when it actually happens is easier
glenn: TTML does not care about
when it happens
... we say when we want it to happen
... we use presentation time stamp in the MPEG sense
... however we have one mode where we use the SMPTE time
base
... using SMPTE Time Code along with the video
... you can think of them as labels
... when one of these labels in the video appears, that is when
the matching TTML element is active
(glenn explaining the different modes SMPTE, Timestamps and clocks)
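A minimal sketch of the SMPTE time-base mode glenn describes, where begin/end values act as labels matched against time codes carried with the video (the frame rate, marker mode, and times are illustrative only):

```xml
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    ttp:timeBase="smpte" ttp:frameRate="25"
    ttp:markerMode="discontinuous" xml:lang="en">
  <body>
    <div>
      <!-- becomes active when SMPTE time code 00:00:10:05 appears in the video -->
      <p begin="00:00:10:05" end="00:00:12:00">Hello.</p>
    </div>
  </body>
</tt>
```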
scribe: they derive from SMIL
... we use a subset of SMIL, not repeat, for instance
ddahl: we might want to do something that has nothing to do with text at all, like pictures and music
glenn: we plan to add support for
images and possibly audio in TTML 2
... we will definitely not support video in TTML
nigel: the use case for audio is
audio description
... generally created by the same company
glenn: we don't want to turn TTML into SMIL light
nigel: at the moment you have
simple SSML
... but if you start having details in SSML
... this is closer to the processor than the human
... I wonder if we couldn't go in that direction in TTML, adding
emotion, pronunciation, ...
... like a format called PLS (Pronunciation Lexicon
Specification)
... this wouldn't affect the TTML document structure at all
... that could guide synthesized audio
... There is also EmotionML, which is interesting
... a big use of TTML is for captions and subtitles
... but they are just text, without expression
... EmotionML gives you some information
... but how do you present that emotion?
glenn: like emoji
courtney: there are conventions also
... describing the way the text was spoken (not the emotion)
... that's an interesting idea to explore
... the most artful captions I have seen describe the way the
text was pronounced
ddahl: EmotionML has different vocabularies
... there is a standard vocabulary, but you can add your own
courtney: if you were too heavy-handed in the way you describe
the emotion, it could be condescending to the hard of hearing
... you'd have to do it artfully
ddahl: some people may have a processor that handles EmotionML
nigel: currently the emotions are in the text, forcing everyone
to view them
... capturing emotion and pronunciation would suffice to
synthesize speech
ddahl: you would need prosody or other aspects
nigel: no one has brought this use case to TTML first
<nigel> http://www.w3.org/TR/emotionml/
ddahl: I had an example of annotating a video with EmotionML
glenn: TTML allows you to mix any content if it is in a different namespace
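A hedged sketch of what glenn describes: embedding EmotionML vocabulary in a TTML document via a foreign namespace. The category value and timings are illustrative, and conformant TTML processors would simply ignore the foreign-namespace element:

```xml
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:emo="http://www.w3.org/2009/10/emotionml"
    xml:lang="en">
  <body>
    <div>
      <p begin="0s" end="3s">
        <!-- foreign-namespace annotation, ignored by TTML processors -->
        <emo:emotion category-set="http://www.w3.org/TR/emotion-voc/xml#everyday-categories">
          <emo:category name="happy"/>
        </emo:emotion>
        What a great day!
      </p>
    </div>
  </body>
</tt>
```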
Cyril: Example 2 of annotation of videos in the EmotionML spec seems to have problems (use of ? instead of #, use of "file:" instead of "file:///")
(ddahl shows a demo)
nigel: you can either add external content to TTML or extend TTML
Cyril: you might want to consider
using a separate track
... not merging it in the TTML document but using a separate
track in the HTML sense
nigel: there does not seem to be any action on this for us at the moment
ddahl: I came looking for information and I'll bring that back to my group
<nigel> scribeNick: nigel
https://www.w3.org/wiki/TTML/changeProposal014
nigel: I was going to propose as
per CP14 that we consider adding PLS and EmotionML into TTML
but it seems that we do not need to: foreign namespace content
can already
... be added with no spec changes.
issue-10?
<trackbot> issue-10 -- Allowing pointers to pre-rendered audio forms of elements -- open
<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/10
nigel: Issue 10 proposes adding a
pointer to an external audio file, which is the analogue to a
pre-rendered graphic image.
... CP14 is a Priority 3 on our list, so I don't think we
should spend any more time on it right now.
... Instead, we should go through the Priority 1 CPs and
resolve any outstanding questions so that we can complete the
TTML2 deliverable.
glenn: Let's return to CP15. We
were up to shrink fit
... We don't have a champion for shrink fit and no issue, so I
propose to do nothing.
... font face rule - we do have an issue for that. I'm not sure
if we need the fontFaceFormat attribute.
... This implies that there's a fallback loading system that
would pick the source that it knows how to process.
... That would introduce something new in TTML2, which is the
ability to refer to resources outside the document.
... There's a way to get around it, which is to use data URL,
i.e. embed the data as BASE64 encoded characters in the
document.
nigel: I prefer external references for fonts because they allow caching.
glenn: There's a similar issue
for backgroundImage resources.
... I don't have any open issues on this one.
... multiple row alignment
... I haven't worked through the possibility of using flexbox.
I tried to generate some samples and they seemed to produce the
same results.
... I don't want to introduce all of flexbox into TTML2. My
current thinking is to define a TTML-specific property given
the semantics according to the proposal.
... As part of the mapping to HTML it could potentially be
mapped to flexbox.
... So I need to define a new property that is named
appropriately and provides these semantics.
nigel: How will we reference the pre-existing similar feature in IMSC 1 and EBU-TT?
glenn: I don't mind drawing
attention to this with a Note if people think that's useful -
it's just editorial work.
... Superscript and subscript: I think I've already closed
that.
... This issue is closed.
nigel: marks it as closed on the CP.
glenn: Change Proposal 16 - Style
conditional
... I need to review this. I thought this change proposal had
to do with an informal proposal I made where I described a
condition attribute on some elements, where
... the value of the condition attribute is an expression in a
simple expression language, whose evaluation, if false, would
result in the element being excluded from presentation,
otherwise
... treated as though there were no condition attribute
present. This came out of the forcedDisplay discussion.
... I was going to have a simple expression language that at
minimum looks like a list of functions in CSS where the names
of those functions would be drawn from a list of
... predefined function list, e.g. "parameter(parameterName)"
with some defined built-in parameters like "forced" so if you
want to exclude some content based on this parameter
... being false then you would have a
condition="parameter(forced)=true" that would be evaluated
during the rendering and presentation process (specifically in
the ISD generation process).
... I need to reread this CP and think about it - the proposal
seems to have used something more like a media query
expression. Sean wrote this originally I think.
... At minimum I want the condition mechanism to support forced
semantics. Beyond that I don't have a real agenda.
... Other uses might include language.
nigel: Another is where you may want to preferentially display images vs text under some circumstances.
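A hedged sketch of the condition mechanism glenn outlines above; the attribute name, the expression syntax, and the built-in parameter names were still under discussion at this point, so everything here is illustrative:

```xml
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <!-- always presented -->
      <p begin="0s" end="5s">On-screen sign: Bakery</p>
      <!-- excluded from presentation (during ISD generation) when the
           expression evaluates to false, i.e. when "forced" is not set -->
      <p begin="0s" end="5s" condition="parameter(forced)=true">
        [Translated sign: Boulangerie]
      </p>
    </div>
  </body>
</tt>
```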
jdsmith: Are the only conditional inputs for this smooth animation and 4:3/16:9 video format? This looks like conditional styling.
glenn: I'm talking about more
general conditional expressions
... It's an interesting idea to consider feature support
conditionality.
... CP17 Default styles
... This is mostly closed. It remains to be defined what a
pixel means.
issue-179?
<trackbot> issue-179 -- Interpreting the pixel measure -- open
<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/179
pal: I think most people with CFF
and SMPTE-TT think that when they author the document the video
object has a certain number of pixels, and those are the ones
they refer to.
... They literally relate to the encoded pixels, those in the
AVC stream for example.
glenn: Those pixels don't have a size at that point. Then they get mapped into a display pixel which does have a size.
pal: And my 640x480 then gets mapped to a display pixel on my 1280x720 display.
glenn: And at that stage the pixel has a concrete size.
pal: That's my understanding.
glenn: The rendered pixel is dependent on lots of other variables.
Cyril: But that's not the coded
pixel either. In AVC for instance you code a pixel, an RGB or
YUV value or whatever. Then you stretch that according to the
pixel aspect ratio.
... and then you may apply a clean aperture, to cut out some of
the image, to make it the right multiple of macroblock size. So
my guess is that people authoring TTML base it on
... the result of this process, taking the output of this
decoding process, then applying the sample aspect ratio, then
any cropping.
glenn: I think this is an open
question, it's not necessarily like that. For example in SD
video you often have 720 pixels per line but you only display
704 pixels, so there's an 8 pixel buffer on either side
... to allow for overruns. So what we were describing is a
0-719 coordinate space, whereas what you were describing was a
0-703 coordinate space.
Cyril: Yes, if the video was cropped then you'd have some invisible text.
pal: Some codecs have the ability
to store a power of 2 number of pixels, internally. Then on the
output it has internal cropping to put back the right value
from the input.
... So is it literally the power of 2 internal array or the
output of the decoder.
Cyril: Yes, to give an example,
in an MP4 file you have 3 sizes:
... 1. The size of the buffer that needs to be allocated.
... 2. The result of applying sample aspect ratio and
cropping.
... 3. Applying a possible scale to the result, usually not
done.
... So in an MP4 file you have the sample entry width and
height, the clap and pasp width and height, and track header
width and height.
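A sketch of the three sizes Cyril enumerates, using a hypothetical anamorphic SD stream; the numbers (720x576 buffer, 8-pixel crops, 16:11 sample aspect ratio) are illustrative, not taken from any particular file:

```python
from fractions import Fraction

def display_size(coded_w, coded_h, sar=Fraction(1, 1), crop_left=0,
                 crop_right=0, crop_top=0, crop_bottom=0):
    """Apply cropping (clean aperture) and then sample aspect ratio
    to the coded sample array, yielding square-pixel dimensions."""
    w = coded_w - crop_left - crop_right
    h = coded_h - crop_top - crop_bottom
    return (round(w * sar), h)  # width scales, height is unchanged

# 1. Sample entry size: the coded/decoder buffer, e.g. 720x576
# 2. After clap (720 -> 704) and pasp (SAR 16:11): 1024x576
print(display_size(720, 576, sar=Fraction(16, 11), crop_left=8, crop_right=8))
# 3. The track header width/height could apply a further scale (usually not done)
```

The question in the discussion is which of these stages TTML's "pixel" refers to; the strawman ('coded sample') corresponds to the input of this function, before cropping and aspect-ratio scaling.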
pal: That first one you mentioned is what comes out of the decoder. That's what I think of when I think of stored pixels.
glenn: We want to pick one and go with it.
pal: I'm happy that we're not talking about a display pixel.
Cyril: I agree it's not the 3rd
one. If you really want to use that you should use the same
metrics to the TTML result.
... The only choice is 'output of decoder' or output of sample
aspect ratio plus cropping.
pal: I'd take out anything that's dependent on ISO BMFF.
glenn: +1
pal: I think people have been using decoder output pixels with no further transformation.
nigel: We need to find what's common across all formats.
glenn: The source buffer is common.
pal: I'd use that as a strawman.
glenn: Previously we said 'pixel
as defined in XSL-FO'. But the definition there is ambiguous -
it can be device dependent or what CSS says. CSS says 96 pixels
per inch,
... but it doesn't say which inch applies. They have some
angle-based visual model including distance, to compute that.
It's complicated. I think the CSS people gave up on it and
made
... it an absolute dimension. They did that because it's what
most people actually use in implementations. We have a similar
scenario - most people use a different interpretation
... from what's in the spec, for whatever reason.
pal: Yes, they see a video dimension size and go for that.
glenn: I think on TTML1 we should add an erratum that defines pixel, and then use it normatively in TTML2.
nigel: What's the proposal?
glenn: We have a tentative proposal to make pixel a 'stored pixel' (pal's term) or 'coded sample' (Cyril's term).
pal: Let me throw another one in the pot.
glenn: I like 'coded sample' because it avoids circularity of definition.
nigel: Can I clarify that we're talking about no tts:extent being specified on tt:tt?
glenn: I have to do something
different based on whether or not there's a related media
object.
... Otherwise there's another definition.
nigel: I'm worried that we end up with ambiguity between tt:tt@tts:extent and the related media object.
glenn: That's a different problem, that we also need to deal with at the same time.
nigel: This is exactly the same as the root temporal extent problem before - we need to relate the root spatial extent to an external display rectangle.
Cyril: I've checked H264 AVC and
HEVC and they both define a picture as an array of samples, and
they both also define a message to carry sample aspect
ratio.
... So the video may have one shape, and the TTML may define a
different shape rectangle.
pal: That's right, it's also something we should talk about.
Cyril: [draws a picture] Decoder
produces something with a Width and Height (within the
decoder). Then you apply Sample Aspect Ratio (also within the
decoder).
... and then, when you display something you may scale it,
upsample/crop it etc.
glenn: The array size is the same before and after applying sample aspect ratio?
Cyril: No, the height is the same
but the width may change.
... If you author using the W and H, and there's anamorphic
conversion going on then you may need to apply some positioning
of the TTML extent onto the video.
... The 'coded samples' are the ones before scaling with the
sample aspect ratio.
... I don't know what to call the samples after scaling with
the sample aspect ratio. 'scaled sample'?
courtney: I'd call them 'square pixels'
Cyril: that's not what AVC calls them, though it may make sense.
pal: As a strawman can we use
'coded sample'?
... the anamorphic 'pixels'
courtney: I think it makes sense to use the coded samples from the file because then the video in the file and the dimensions of the captions are consistent with one another, using the same metrics.
pal: my proposal is use the term 'coded sample', get feedback on that as an errata.
glenn: I'll probably refer to some MPEG document for the definition of 'coded sample'. I'm going to say that a TTML pixel is a 'coded sample'.
Cyril: It would be good to provide examples.
glenn: one interesting scenario
is that tts:extent doesn't match the coded sample size of the
related video object. Another is where they do match.
... The third is where there's no tts:extent, but there is a
new ttp:aspectRatio property.
pal: IMSC says either do 'matching pixel aspect ratio' or 'define aspect ratio' but not both.
Cyril: I think we agree but I
want to check: If I take an anamorphic video, before capture by
the camera an object has a particular shape. Then after capture
it's 'squished' to be thinner.
... then after decoding it gets restretched to its original
shape. And what's stored, from the perspective of the coding
specification, is the squished shape.
... So 'square pixels' is something that depends on your
perspective.
courtney: So where is the pixel aspect ratio square?
Cyril: It's after 'unsquishing'.
group agrees terminology
pal: In IMSC 1, I will make a revision based on this erratum. It should probably say that the goal is never to have to use tts:extent and always create resolution independent subtitles.
Cyril: Can I check that there's no impact caused by interlaced and progressive video?
glenn: We assume it's been deinterlaced.
courtney: +1
nigel: In TTML2 how does this
impact on viewport-related widths and heights?
... Do we need to be concerned about the aspect ratio of the
related video there too?
glenn: That's what I'm working on at the moment.
Cyril: SVG lets you specify a viewbox and relate viewport coordinates to that too.
glenn: I'm not sure if we need
that too - maybe.
... That's all I need for CP17
Issue-210?
<trackbot> Issue-210 -- The values for alpha in rgba() don't correspond to CSS3 Color definitions -- open
<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/210
glenn: issue-210 - we allowed 0-255 alpha values but CSS3 defines a 0-1 scale. So there's an ambiguity if the value 1 is used.
courtney: Are the types different? Are the expressions differentiable by the decimal point?
glenn: In CSS3 you don't need the
decimal if alpha value is 1. So you can't infer anything
there.
... I think we just define the mapping into CSS, because then
it's well defined.
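A sketch of the mapping glenn suggests, converting a TTML1 rgba() alpha component in the 0-255 range onto CSS3's 0-1 scale (a minimal illustration, not normative text):

```python
def ttml_alpha_to_css(alpha_255):
    """Map a TTML1 rgba() alpha component (0-255) to CSS3's 0-1 scale."""
    if not 0 <= alpha_255 <= 255:
        raise ValueError("TTML1 alpha must be in 0..255")
    return alpha_255 / 255.0

# The ambiguity under discussion: in TTML1, rgba(0,0,0,1) is almost fully
# transparent, while in CSS3 rgba(0,0,0,1) is fully opaque.
print(ttml_alpha_to_css(255))  # fully opaque in CSS terms
print(ttml_alpha_to_css(1))    # nearly transparent
```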
nigel: I'd argue it's well defined already but just needs to be clarified.
issue-225?
<trackbot> issue-225 -- tts:fontSize as percentage of container dimensions -- open
<trackbot> http://www.w3.org/AudioVideo/TT/tracker/issues/225
pal: TTML1 doesn't really say what you're supposed to do with pixelAspectRatio.
glenn: That's right - it's used
to define authorial intent. It doesn't say how that should be
applied.
... But we may need to reference that in the new
verbiage.
... I put that in originally because PNG has a chunk that
allows pixel aspect ratio to be defined.
pal: I think if we go down the path of coded samples then it might be good to make sure that pixelAspectRatio is set.
nigel: Can we resolve this by removing vmin and vmax?
pal: +1
issue-225: (f2f meeting) We agreed to remove vmin and vmax.
<trackbot> Notes added to issue-225 tts:fontSize as percentage of container dimensions.
nigel: CP25. At a minimum that comes down to adding a documentGroup identifier.
glenn: I'm happy to do that.
nigel: CP5?
glenn: There are too many details to discuss that. It involves converting to ISD! So I have to define that mapping to satisfy that.
nigel: Thanks everyone - we've
covered a huge amount over two days, including:
... agreeing to publish WebVTT
... the MIME type extension
... Reviewing the IMSC 1 review comments and agreeing the CR
exit criteria
... thinking about the feelings of our specs
... Considering the relationship between TTML and related video
objects both spatially and temporally
... going through the TTML2 change proposals
... and we even had time to think about multimodal
interaction!
... adjourns meeting