Timed Text Working Group Teleconference -- 23 Oct 2018

scribe: nigel

Agenda for today

Nigel: Good morning everyone, let's do introductions.
... Nigel Megitt, BBC, Chair

Andreas: Andreas Tai, IRT

Glenn: Glenn Adams, Skynav, been working on TTML since 2003!

Nigel: Thank you, and observers.

Masaya: Masaya Ikeo, NHK - Yam_ACCESS

Geun: Geun Hyung Kim, HTML5 Converged Technology Forum (Korea)

Nigel: Today, we have Live subtitles and caption contribution, AC review feedback,
... future requirements, and Audio profiles.
... Welcome, we have another observer:

Hiroshi: Hiroshi Fujisawa, NHK

Toshihiko: Toshihiko Yamakami, Access Co., Ltd

Andreas: For the future requirements topic, after lunch, a colleague may want to join on
... the requirements for 360º subtitles and possibly other TPAC attendees may want to
... join so if we can figure out a specific slot that would be great.

Nigel: If there are timing preferences we can be flexible - probably any time after 11:30 we can do.

Andreas: Thanks, I'll get back to the group on that.

Live Subtitle and Caption contribution

Nigel: I uploaded a short presentation:

Presentation on live subtitles and captions

Pierre: Pierre Lemieux, Movielabs, Editor IMSC

Nigel: [presents slides]

Pierre: Question about the client device being unaware of live vs prepared source, and
... the system being designed with that as a constraint.

Nigel: Yes, assume that is the case.

Glenn: The distribution packager might assign DTS or PTS?

Nigel: Yes, I should have added MPEG2 Transport Streams as a possible output, and we
... should note that there is a DVB specification for insertion of TTML into MP2 TS.
... [slide on transport protocols] If there is timing information from the carriage
... mechanism then that might need to be understood in relation to processing any
... subtitle TTML document.

Glenn: Are you hoping an RTP packet will fit within a single UDP packet?

Nigel: In general that is likely to be true, but not necessarily.

Pierre: So you can't rely on the network providing you with ordered documents?

Nigel: Yes, that could be the case.

Pierre: So the protocol you use has to be able to handle non-sequential document transmission?

Nigel: Yes, potentially.
... You do need to resolve the presentation in the end, and some deployments may
... provide fixes for out of order delivery at the protocol level (WebSocket) or at the
... application level and we need to deal with the whole range of possible operational conditions.
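
Editorial illustration (not part of the minuted discussion): application-level handling of out-of-order delivery could look roughly like the sketch below. It assumes each transmitted document carries an integer sequence number starting at 1; that field and the class name `ReorderBuffer` are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReorderBuffer:
    """Releases documents in sequence-number order, holding back early arrivals."""
    next_seq: int = 1                                   # next sequence number expected (assumed to start at 1)
    pending: Dict[int, str] = field(default_factory=dict)

    def push(self, seq: int, document: str) -> List[str]:
        """Accept one document; return the documents that are now deliverable, in order."""
        if seq < self.next_seq or seq in self.pending:
            return []                                   # duplicate, or already released
        self.pending[seq] = document
        ready: List[str] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


# Documents 2 and 1 arrive swapped, and document 1 is delivered twice.
buf = ReorderBuffer()
released: List[str] = []
for seq, doc in [(2, "doc-2"), (1, "doc-1"), (1, "doc-1"), (3, "doc-3")]:
    released.extend(buf.push(seq, doc))
```

A live deployment would probably also need a policy for documents that never arrive, such as a timeout after which the buffer skips ahead. The sketch leaves that out.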

group: Discussion of options for defining the begin and end time of a TTML document.

Nigel: [proposal slide]

Glenn: I wouldn't object to using the ebu namespace as long as we don't normatively
... reference the EBU spec. I'm not willing to cross the Rubicon when it comes to bringing
... non-W3C namespaces into TTML.
... If it is published as a Rec track document and it refers to TTML and is a module that
... blesses these features, using EBU namespace to define them, then that's okay with me.
... If we have an assumption that we are going to pull that into TTML directly then I might
... start having some discomfort.

Andreas: I think we are not there yet at this point in the discussion. First we have a problem
... that we are trying to solve and we have a standard that is already out there. It is good
... practice not to duplicate. What Nigel has proposed addresses a good part of this
... scenario, and there has been a lot of discussion since 2012 on this with at least 3 years
... of regular active work on it, so I think it is worth looking at it. After reviewing this and
... deciding that this is how we want to solve it then we can look at how to adopt it.

Glenn: Right, I just wanted to give fair warning about the questions I might have.
... A question I have is why we need to do something in W3C?
... Is it a profile of EBU-TT?

Andreas: Good question. It is limited to certain vocabulary and mainly has the constraints
... from EBU-TT, which are not the same as for IMSC. It would be perfect to use the same
... mechanism for all IMSC documents.

Nigel: That was my answer, it makes sense to bring these key semantics into the home
... of TTML so that it can be applied to other profiles than EBU-TT.

Glenn: Is it an authoring guideline?

Nigel: Why would it be a guideline?

Glenn: It's not defining new technical features.

Nigel: It is indeed doing that.

Pierre: There might be technical features such as defining document times as mentioned.
... A lot of the guidelines could be in the model, but I suspect there would be some
... requirements and substantive features.

Nigel: [propose a break for 30 minutes]

Live subtitle contribution - discussion

Pierre: Is the proposal for an EBU Member Submission?

Nigel: It could be but I think it is not needed - the IPR can be contributed by EBU as
... a member based on any work that we do in this group.

Andreas: There is a question for a member submission if it will be superseded by a future
... W3C specification. The market condition is that people are pushing for implementation
... of EBU-TT Live so we should be clear about what we want to do in W3C.

Pierre: This sounds more like an EBU discussion, W3C cannot require implementation.

Andreas: It could affect adoption though since work on an alternative may change views.

Pierre: That's an EBU decision. Anything could happen when a member submission arrives here.

Andreas: We can review the document as it is and then review what is needed. I don't see
... a need for a member submission at the moment. What advantage do you see in EBU submitting one?
... The spec is out there, everyone can use it, IPR issues should not be a problem.

Pierre: I can't speak for EBU but I would think that a member submission clarifies
... significantly the scope of the effort, being live subtitles within the member submission
... scope rather than live subtitles in general.
... IMSC ended up different from CFF-TT for good reason, but the scope of the features
... for instance was set by the member submission. It would help.

Andreas: The arguments that led other W3C members to make submissions are more
... internal, about how to move on with some standardisation. In the past, submissions were
... made to W3C and then carefully reviewed, when W3C was to take over certain
... standardisation.

Pierre: For instance, CFF-TT - the Ultraviolet members and the larger community felt that
... it would be beneficial if something like that specification were to be standardised by an
... organisation like W3C. That was a decision by that community to do that. But it was not
... happenstance. Here, I think it is up to EBU and its members and community to have an
... opinion on whether or not standardisation by W3C helps or not.
... It might not help if it changes the specification in a way that is not good for that
... community. You tell me.

Andreas: We are not there yet. This group has not decided yet.

Pierre: Live is really important.

Andreas: Yes, this is something we need to discuss. What is in scope for this group?

Pierre: The industry is interested in live, period.
... It is a really important use case.

Nigel: [repeats goal from earlier]

Pierre: If the goal is to arrive at how to create a set of IMSC documents in a live environment...

Andreas: What Nigel said, and other EBU members, is there is support to make EBU-TT Live
... a subset similar to how EBU-TT-D is a subset of IMSC Text Profile.

Pierre: That works.
... You don't need a member submission for that. Deciding on the scope early is a good idea.

Andreas: Yes

Pierre: Both make sense. Picking one is going to be really key.

Nigel: I think I hear consensus that some kind of TTWG technical report that addresses
... the live contribution use case is worthwhile.

Glenn: Requirements would be useful to set the scope.

Pierre: Yes, a requirements document would be helpful.

Glenn: In general we should have requirements documents before new technical specifications.
... I make a motion to require that.

Andreas: I propose a joint meeting with EBU group to discuss this. We have January in Munich
... in mind. We wanted to bring this up and see what the availability of members is.

Pierre: Feb 1 in Geneva would work for me.

Andreas: That is good.

Pierre: Specifically the morning of Feb 1!

Andreas: Propose 31st and 1st.

Pierre: I'm busy Friday 1st in the afternoon but the joint meeting could be just in the morning.
... We don't need more than 3 hours.

Glenn: If we're having a face to face meeting it should be at least 2 days, if it is an official
... WG face to face meeting.

Pierre: I think we are just proposing a joint TTWG - EBU meeting.

Glenn: That would make it a TTWG f2f, I can't justify a journey to Geneva for half a day.

Andreas: If we make a one and a half day meeting, on Thursday and Friday.

Glenn: I'm available on Saturday too.

Pierre: I'd rather not, my preference would be 30th and 31st and part of the 1st.

Andreas: It would be good anyway to have the EBU and TTWG members in a room together.

Pierre: We can do it during PTS, why not?

Andreas: We need to ask Frans and EBU.
... I will ask Frans.

Nigel: Thanks, summarising the discussion:
... * A technical report on live subtitle contribution is a good idea
... * We need requirements for that
... * We will investigate a joint meeting with EBU at end of Jan/beginning of Feb
... Thank you.

Pierre: Thanks for bringing this up.
... At some point we will have a technical discussion about the details, based on the
... requirements, which will be crafted hopefully prior to that meeting, and that would be
... a good time to have a technical discussion.

Glenn: Does the current Charter cover this work?

Nigel: The requirements document would be a Note so that would certainly be covered.
... We don't have a specific deliverable for a Recommendation listed at present, so that
... may be something that we should consider for a Charter revision.
... By the way, if we proceed with David Singer's proposal from yesterday, that could be a
... good moment to revise the Charter in any case, since the WebVTT Rec deliverable would
... have to be pulled from the Scope.
... For example we could target a Charter revision in May 2019 for another 2 years, pulling
... the end date to 2021.

Glenn: 2023 will be the 20th anniversary of this WG.

Andreas: Noting that there are observers here who might be interested in this topic, if we
... proceed with this work we should make it possible for new members to join our meetings.

Nigel: As Chair, I would like to know if there are any potential members especially in
... different time zones and to be flexible about how we meet to allow them to participate.

Andreas: I also meant that it should be possible for non-members of TTWG to participate
... in the discussion.

Nigel: For a non-Rec track requirements document with no IPR, that is fine of course.
... To clear IPR rules when we get to a Rec track document obviously contributors do need
... to be WG members, effectively.

Glenn: If we publish a Rec track document that is based in large part on another spec
... outside of W3C then that may be precedent-setting.

Pierre: Like IMSC?

Nigel: It's not precedent setting.

Pierre: It's the same, it's based on TTML.

Nigel: I agree.

Pierre: From what I have read it's a how-to-interpret TTML document crafted in a particular way.

Glenn: That's reasonable.

AC Review feedback

Nigel: Reviews AC feedback. We don't have any comments to respond to.
... We have a reasonable number of responses now, some more would be good.

TTML1 3rd Edition Rec CfC

Nigel: I realised that in my CfC for publishing the TTML1 3rd Ed Recommendation, I did not
... include any consideration of superseding 2nd Edition. I don't think we need to do that
... for TTML2 or IMSC 1.1, because the previous Recs still stand, i.e. TTML1 3rd Ed and IMSC 1.0.1.
... Can I make it a condition of the CfC that we supersede TTML1 2nd Ed when we
... publish TTML1 3rd Ed.

Glenn: It would be inconsistent not to.

Pierre: Yes, supersede not obsolete.
... In the fullness of time we should probably make an Edited Recommendation of
... IMSC 1.0.1 to point to TTML1 3rd Edition too.

Andreas: Yes, superseding is okay.

Nigel: Thank you, that's a decision.

RESOLUTION: As part of the request to publish TTML1 3rd Ed as a Recommendation we will supersede TTML1 2nd Ed.

Nigel: We'll break for lunch now, back at 1300.

Future Requirements

Nigel: Since the break, we have a new observer and a new attendee:

Vladimir: Vladimir Levantovsky, Monotype, AC Rep, Chair of Web Fonts WG (awaiting re-charter)
... I have a very keen interest in anything relating to text matters, including composition,
... rendering, fonts and anything else you can imagine related to that.

mdjp: Matt Paradis, BBC, Chair of the Web Audio WG, and I run an accessibility and interactive
... work stream for BBC R&D, which is where my interest in this group lies.

Peter: I'm Peter tho Pesch, from IRT. I'm working on a project to do with accessibility of
... 360 and AR environments, particularly subtitles.

Nigel: Thank you, welcome.
... Can I first get a very quick list of the new requirements areas that we want to cover in
... this conversation?
... I already have 360º/AR/VR requirements.
... This morning we covered live subtitle use cases so we don't need to duplicate that work.
... I need to present some styling attributes for consideration, actually a bigger question
... about bringing in arbitrary CSS and how we might go about doing that.

Andreas: I recently came across a requirement for a TTML documents container element.

New requirements: 360º/AR/VR

Andreas: Just to start on this, yesterday we had at the Media and Entertainment IG a
... brief session where I showed some of the results of the work Peter has been doing.
... We did not get into the detail, I just showed the videos and we agreed there is a use
... case that needs to be solved, and there is not complete agreement, or it is not clear yet
... where it should be solved. The M&E IG action was to organise a telco where we get the
... necessary people from different groups together, discuss the problem scenario and then
... work out where the work will be done.
... Yesterday, because I walked through the different examples, I would like to repeat this
... with Peter's comments because he has the necessary input.
... Because Vladimir is working on a similar topic and yesterday brought up some additional
... issues we may want to make a list of all the things that could be in scope of the TTWG.

Nigel: Just to note, our Charter includes in the Scope: "Investigate caption format requirements for 360 Degree, AR and VR video content."

Vladimir: And "caption" doesn't necessarily mean subtitles, it could be any text label that
... is part of the content?

Glenn: We don't distinguish between subtitle and caption any more!

Vladimir: Would text label be considered in scope?

Glenn: Why not?

Andreas: The group name is Timed Text, which is very generic and doesn't say what it is
... used for. For general matters also there is the CSS group.

Vladimir: I understand we will not cover all the presentation cases.
... For example when you're in a 360º environment the text will be defined by timed text,
... but the composition might be defined by CSS.

Nigel: Consider this in scope.

Andreas: [shows examples]

Peter: I will start here at this slide. Yesterday you showed already a little bit of the scope.
... I often use this image because for me it was the easiest way to picture the coordinate system
... we are using.
... [world map, equirectangular projection]
... You also know how this would map onto a sphere. This is a common way to represent
... 360º videos, using this map and wrapping it round a sphere, putting the viewer at the
... centre looking out (the other way from the way you see a globe normally).
... Within the project I am working on, we are looking into ways of adding accessibility
... services to VR, focusing on 360º videos right now.
... There are some challenges, maybe we start with the videos to show you some of the
... thoughts we had on this.
... [always visible] This is the simplest mode, where the subtitles are always shown in the
... viewport where the viewer is looking.
... This is a basic implementation, you can see the subtitle text always sticks in one position.
... In this example the text is aligned to the viewport not to the video.
... [example with arrow pointing at the speaker]
... Here if the speaker is off screen an arrow points to the left or right to show where the
... speaker is located. It disappears when the speaker is in the field of view.
... It's a small help for people to find where the action is.
... The basic presentation mode is the same.
... [fixed positioned] This is a completely different approach.
... The subtitle is now fixed to the video not the viewport, like a burned in subtitle. The way
... it is shown here, I don't know where this is used in practice, but there is an example
... where the subtitle text is burned into the video at three different positions and fixed there.
... [Formats]
... A quick overview of how we implemented this.
... IMSC, DASH, H.264 video.
... Custom extensions to IMSC for providing the information we needed.
... In this example, imac:equirectangularLong and imac:equirectangularlat are specified on the p element.
... They specify a direction in the coordinate system, not really a position. You could specify
... a vector and where the vector hits the sphere, that is where the subtitle is located.
... This is used for the different implementations.
... This is the current status.
... Future thoughts: subtitles with two lines in each subtitle, belonging to different speakers
... at different positions, so different angles for each speaker. We could add the attributes
... at the span level but we did not do that yet.
... Also what information the author can add to indicate the suitable rendering style.
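
Editorial illustration (not part of the minuted discussion): the idea that a longitude/latitude pair gives a direction, and that the subtitle sits where that direction meets the sphere, can be sketched as below. The axis convention (x to the right, y up, z straight ahead, longitude increasing to the viewer's right) and the function names are assumptions made for this sketch, not something stated in the presentation.

```python
import math
from typing import Tuple


def direction_from_long_lat(long_deg: float, lat_deg: float) -> Tuple[float, float, float]:
    """Unit vector for a longitude/latitude pair given in degrees.

    Assumed convention: longitude 0, latitude 0 points straight ahead (+z),
    positive longitude turns to the right (+x), positive latitude looks up (+y).
    """
    lon = math.radians(long_deg)
    lat = math.radians(lat_deg)
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def point_on_sphere(long_deg: float, lat_deg: float, radius: float) -> Tuple[float, float, float]:
    """Point where the direction vector meets a viewer-centred sphere of the given radius."""
    x, y, z = direction_from_long_lat(long_deg, lat_deg)
    return (radius * x, radius * y, radius * z)


ahead = direction_from_long_lat(0.0, 0.0)       # straight ahead
to_right = point_on_sphere(90.0, 0.0, 10.0)     # 90 degrees to the right, on a sphere of radius 10
```

Because only a direction is stored, the same two values work for any sphere radius, which fits the description of the attributes as a direction rather than a position.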

Andreas: That's better than what I said yesterday! And it doesn't contradict it.
... Yesterday there was the generic question where should this gap be addressed.
... It was clear that TTWG comes into this. I think it's worthwhile first discussing if this kind
... of use case falls in scope, and if these two attributes would be something that could
... be added to TTML and IMSC, and what additional features are needed.

Glenn: Those are very long property names, and they embed a particular projection semantic.
... If they were to be put into TTML I would probably prefer shorter names as well as
... extracting the projection method to a separate parameter for the document level.
... As far as potential requirements, I think this is good and we should consider doing something in a standard.
... We would have to define in the spec the transformation from the spherical coordinate
... space to the projection coordinate space, for different projections, e.g. a projection method parameter.

<Zakim> nigel, you wanted to ask about distance and to ask about other presentation models and to ask about doing the projection based on a rectangular region

Nigel: Why not use a 2d coordinate like for the video image and then project the text in
... the same way as the video, rather than including the coordinates?

Peter: We thought about that. We have an additional mapping step. One way would be to
... base the IMSC file on the 2D texture and then use the mapping mechanism that is
... defined by the standard for mapping the video, also for the subtitle file, or to define
... information directly in the IMSC in the target coordinate system.
... We used this approach here because it is a lot easier to implement. This is the
... rendering coordinate system and it is easy to map the video texture on a sphere in the
... framework we are using. Then it is a lot easier to define the coordinates directly.

Glenn: Right now the x and y coordinate space in TTML is cartesian based and we have a
... great deal of semantics, for example the extent of a region, defined in x and y
... coordinate space. You could use a reverse transformation as long as you have the
... central meridian and standard parallels for doing a reverse projection to the
... equirectangular form. I think we should be hesitant to express coordinates in a
... coordinate space that is not based on our assumed cartesian space. I would rather do
... a reverse transformation, specify x and y and map to spherical coordinate space.
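
Editorial illustration (not part of the minuted discussion): the reverse transformation mentioned here can be sketched with the standard inverse equirectangular formulas, longitude = central meridian + x / (R cos(standard parallel)) and latitude = y / R. The helper for normalised top-left image coordinates assumes the image covers the full sphere; the function names and that coverage assumption are illustrative only.

```python
import math
from typing import Tuple


def inverse_equirectangular(x: float, y: float,
                            central_meridian_deg: float = 0.0,
                            standard_parallel_deg: float = 0.0,
                            radius: float = 1.0) -> Tuple[float, float]:
    """Map projected x/y (measured from the projection origin) to (longitude, latitude) in degrees."""
    phi1 = math.radians(standard_parallel_deg)
    lon = math.radians(central_meridian_deg) + x / (radius * math.cos(phi1))
    lat = y / radius
    return (math.degrees(lon), math.degrees(lat))


def from_normalised(u: float, v: float,
                    central_meridian_deg: float = 0.0,
                    standard_parallel_deg: float = 0.0) -> Tuple[float, float]:
    """Map normalised image coordinates (origin top-left, u to the right, v downwards)
    to (longitude, latitude), assuming the image spans 360 by 180 degrees."""
    phi1 = math.radians(standard_parallel_deg)
    x = (u - 0.5) * 2.0 * math.pi * math.cos(phi1)   # full image width is 2*pi*cos(phi1) at radius 1
    y = (0.5 - v) * math.pi                          # full image height is pi at radius 1
    return inverse_equirectangular(x, y, central_meridian_deg, standard_parallel_deg)


centre = from_normalised(0.5, 0.5)         # image centre
upper_right = from_normalised(0.75, 0.25)  # a quarter turn right, halfway up to the pole
```

With this approach a region's origin and extent would stay in the cartesian space the document already uses, and only the renderer would need to know about the sphere.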

Vladimir: A question. Everything so far seems to be related to flat 2D projections. How would
... that apply to a stereoscopic environment?

Nigel: That was one of my questions - how do you specify depth?

Vladimir: You can break the user perception by getting it wrong.

Nigel: We have disparity already but I don't know how disparity fits with the 3d coordinate system.

Peter: We also looked at MPEG OMAF (omnidirectional media application format) and the
... draft describes how to add subtitles to the 3d space, and it supports WebVTT and IMSC
... subtitles, and the IMSC subtitles are added in a way where the MPEG scope provides a
... rendering plane for the IMSC to be rendered onto. The information in the IMSC document
... is included in the OMAF format. There's an additional metadata track that contains that
... information and that handles the information in the way MPEG does it. There is a box,
... for regions, and for points in their coordinate system. You basically get a rectangular
... plane for rendering your subtitles onto.
... It also includes depth information for stereoscopic content.

Nigel: If there's depth information in the video then there must be depth in the subtitles,
... how do those two get aligned?

Peter: I didn't fully look into this, but the standard suggests a default depth and radius
... for the video sphere, and according to this you can either add depth information relating
... to radius or directly add disparity information. The disparity information is not connected
... to the video because it is connected to the presentation of the stereoscopic image, and
... you would need to provide a left eye and right eye video stream.

Andreas: I want to point to Vladimir and ask: yesterday you brought up some additional
... things. Apart from positioning, what other things may be useful or needed?

Vladimir: Yesterday I mentioned, speculatively, without a specific application in mind,
... text objects need some kind of perspective transform to be applied.
... How much detail we go into depends on how the responsibilities of text transform are
... split between different parts.

Andreas: I wondered if CSS WG are working on the same thing, or another WG.
... I think positioning of arbitrary HTML or whatever in this space could be in the long
... run in the requirements. I don't want to contradict here what is being done in other groups.

Vladimir: I haven't heard anything about CSS considering 3D layout issues.

Philippe: The Immersive Web WG was created last month.

Andreas: I spoke with Chris Wilson yesterday.

Philippe: He's one of the Chairs.

Andreas: I asked if we could present this use case tomorrow, he thinks it's not the right
... moment, and prefers that it gets discussed in the WebXR CG, which has a repository
... for requirements. If we open a requirement then we should open it there.

Philippe: We should ask the APA WG which is a coordination group for accessibility too,
... you should ask Janina. She might well say it came up on their radar. I don't think they
... have done any work on it.

Andreas: In this project we are also discussing user interfaces and this is definitely an
... issue for the APA WG, UIs for navigation and control of access services.

Philippe: It's not just UI!

Andreas: OK.

Philippe: We don't have an accessibility group for the 3d space right now but that is where
... the discussion should begin.

Vladimir: The Virtual Reality Industry Forum is another one outside W3C. We are still in the
... exploration stage. We know what needs to happen to do what needs to be done in the
... web, for example what to do with web fonts.
... [i.e. web fonts might need some work]

Andreas: That group could point to something in W3C?

Vladimir: Yes, it would be a huge help to point to something from W3C.

Peter: There's one thought I wanted to add. When we look at the scope of MPEG OMAF,
... keep in mind it is a distribution format, and it specifies how to bring the content to the
... consumer but when you look at the complete chain the content will probably not be
... described in OMAF. The subtitle workflow - it makes sense all the subtitle information is
... kept in one place. You can look at it in two ways - either the positional description being
... like a styling attribute or a kind of metadata to transport the information to the MPEG
... format to distribute it to the user. Maybe there are two different use cases. One to
... describe subtitles in a 3D space, something like an extended IMSC, or you could say
... we need additional metadata, just tunnel this information to the point where the complete
... format is mapped to a 3D space.

Nigel: Question: Do you need to describe the speaker position, the text position, or both?

Peter: That's a very good question. At the moment we are just pointing at the centre of the
... speaker with no height information. We don't differentiate the speaker position or the
... text position. They might be different.

Nigel: A follow-on question: what user information do you have about preferences? Which of
... these do people want to use, one in particular or different people prefer different ones?

Peter: It's too early to say, research is ongoing. There are different results from different
... tests pointing in different directions. For example a university in Munich found that half
... of the test users preferred fixed position, and half didn't like it. It has the advantage
... that it is more comfortable to view and induces less sickness but you can miss the
... subtitle if you are not looking the right way. We are still looking to find the best way.

Andreas: How does VR-IF Forum relate to MPEG OMAF?

Vladimir: I think they have a liaison or they are just the same members. I doubt there is a
... direct official relationship between the two.
... VR-IF doesn't specify anything but produces usage guidelines. It's a different level, not
... technical specifications.

Andreas: The other question is regarding font technology. Recently I have seen a lot of
... advancement of the use of variable fonts on the web, with one font file with a large number
... of font faces you could use. From the discussion I've heard this 3D space presents a
... different kind of graphical challenge, and I see good application of variable fonts in this
... space which I think should be explored.

Vladimir: I absolutely agree.
... The reality is when you rely on a particular font feature to be available it would be
... too optimistic to rely on the font that happens to be resident on the user's device.
... When you rely on a specific font feature your best/only bet is to serve the font to the
... user so you know the font is present.
... Same with variable fonts, which are in the early stages of deployment. If you want to use
... them then you need to provide the font.
... In VR-IF nothing is taken for granted, and if a particular font is needed, for feature or
... language support, then that font has to be provided. On the web the font can be downloaded,
... in ISOBMFF there is a way to provide a font.

Glenn: TTML2 supports font embedding now either directly in the TTML document or by
... URL reference to the environment somewhere which in the context of ISOBMFF could be
... part of the font carousel that's available.

Andreas: Is this in IMSC 1.1?

Nigel: I don't think so.
... [confirms this by looking at the spec]

Andreas: TTML2 has a wide feature, IMSC is a subset that doesn't support this. At the
... bottom line there should at least be a mechanism for the content provider to provide
... the font.

Vladimir: Absolutely. If you expect that variable fonts are useful in this environment then
... you have to provide them.

Andreas: As a proposal for the next steps, would it be a strategy to first try to fix the
... requirements and describe the use cases we are trying to solve?
... If this is ready then we can schedule the Web Media & Entertainment call on the IG and
... discuss it.

Nigel: Sounds good. Are there other members than IRT interested in this?

Vladimir: I am interested, I am learning more than I can contribute.

Masaya: Can TTML associate a piece of timed text with a point in space where the sound originated from?

Nigel: I think there is no standard way to do that now, no.

Vladimir: You're suggesting two independent spatial references, one for a specific location
... and the other for a location of the source so if we wanted to implement the arrows
... solution we would know the location of the source?
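
Editorial illustration (not part of the minuted discussion): if the location of the source were known separately from the text position, the arrow behaviour shown earlier could be derived roughly as below. The 90 degree horizontal field of view, the convention that longitude increases to the viewer's right, and the function names are all assumptions made for this sketch.

```python
from typing import Optional


def signed_angle_deg(from_deg: float, to_deg: float) -> float:
    """Smallest signed rotation from one longitude to another, in the range [-180, 180)."""
    return (to_deg - from_deg + 180.0) % 360.0 - 180.0


def arrow_hint(view_long_deg: float, source_long_deg: float,
               horizontal_fov_deg: float = 90.0) -> Optional[str]:
    """Return "left" or "right" when the source is outside the field of view, else None."""
    offset = signed_angle_deg(view_long_deg, source_long_deg)
    if abs(offset) <= horizontal_fov_deg / 2.0:
        return None                      # source is visible, so no arrow is needed
    return "right" if offset > 0 else "left"


in_view = arrow_hint(0.0, 30.0)          # source is 30 degrees right of centre, inside the view
off_right = arrow_hint(0.0, 120.0)       # source is well to the right
off_left = arrow_hint(0.0, -120.0)       # source is well to the left
across_seam = arrow_hint(170.0, -170.0)  # 20 degrees apart across the +/-180 seam
```

The wrap-around in `signed_angle_deg` matters because the equirectangular image has a seam at plus or minus 180 degrees; without it, a source just across the seam would be reported as being almost a full turn away.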

Masaya: yes, I'm just curious.

Nigel: I think that is for the requirements document to describe.
... Matt, do we have data for object based media pointing to where sound should be positioned in space?

Matt: We do have prototype metadata for azimuth, elevation and distance, but there's a long
... step between that prototype form and something that could be broadcast.

Nigel: Does it inform the data modelling?

Matt: It does, elsewhere we look at graph data for object based productions, and this is
... at a higher layer than something like the Audio Definition Model.
... It gives a reference for speaker or events or "sounding objects".

Nigel: I would suggest we should use the same coordinate system for things we can see
... and things we can hear. It could be an accessibility issue, to allow transformation between
... visual and auditory information.

Matt: It's fundamental to get the coordinate system right. For example in Web Audio WG
... we had to decide whether azimuth goes clockwise or anti-clockwise. Standardising on
... a common API is important.
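
Editorial illustration (not part of the minuted discussion): the importance of fixing the azimuth direction can be shown by converting the same azimuth, elevation and distance triple under both conventions. The axis layout (x right, y up, z front) and the `clockwise` flag are assumptions for this sketch and do not describe the prototype metadata or any Web Audio API.

```python
import math
from typing import Tuple


def to_cartesian(azimuth_deg: float, elevation_deg: float, distance: float,
                 clockwise: bool = True) -> Tuple[float, float, float]:
    """Convert azimuth/elevation/distance to x (right), y (up), z (front).

    With clockwise=True, positive azimuth turns towards the listener's right when
    seen from above; with clockwise=False it turns towards the left.
    """
    az = math.radians(azimuth_deg if clockwise else -azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = distance * math.sin(el)
    z = distance * math.cos(el) * math.cos(az)
    return (x, y, z)


# The same metadata value lands on opposite sides depending on the convention.
right_side = to_cartesian(90.0, 0.0, 2.0, clockwise=True)
left_side = to_cartesian(90.0, 0.0, 2.0, clockwise=False)
```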

Andreas: For gathering requirements, typically we would start to describe what we want
... to solve, and then all these questions will come up. We also learned from this discussion
... that a lot of things come to mind based on what has already been specified, which will
... come up when the requirements are clear and we are moving to a solution.
... Peter you said you are willing to put some work into the requirements?

Peter: Yes definitely.

Andreas: Vladimir also said you are interested. I can be involved but I'm not an expert in this.
... I can be a link and help out.
... That would be my proposed action that you two and anyone else who is interested tries
... to work out these use cases, and directly post it on the GitHub repository.

Nigel: What GitHub repo?

Andreas: The XR CG has a repo for requirements or proposals, that was Chris Wilson's
... proposal and it's a good start to get it out there for everyone to access.

Peter: OK, for my understanding what we provide first is the use cases and what we want
... to do, and the question is does it involve links to existing standards?
... What standards are there to help solve these issues?
... What is within the scope of the TTML WG?
... Or the other WGs.

Vladimir: At this point we should probably have a critical eye on the existing standards.
... If the standard exists it doesn't mean it was complete, correct or designed with the same
... use cases in mind. The existing standards may need to be amended to be useful.
... There may be something missing, which is useful information for the folks who
... developed those standards. For example just because OMAF exists, doesn't mean it is
... capable of supporting all possible use cases. If we find one that is not supported they
... would welcome the contribution.

Peter: +1

Andreas: What you say makes a lot of sense Vladimir. I would propose to systematically
... separate this so first we have a green field of what the use case is to solve, and the
... requirements, and open up the issue on GitHub, then immediately afterwards reply to
... it and say "these standards address this already" and then the discussion starts.

Peter: Yes

Vladimir: I have to leave now, thank you.

Peter: I will leave too, thank you.

Philippe: [went some time ago]

Nigel: Thank you all.

TTML Documents Container

Andreas: Recently a European broadcaster asked me if TTML can have multiple tracks,
... for example different languages per file.
... I said no that's not how it is defined, you have one document per track.
... For authoring and archiving they thought about one file system with all the representations
... for the same content in one file. I said no not now.
... Then I realised you can put the root element of each document in a parent container,
... and get this with a separate "TTMLContainer" element whose children are tt elements.


Andreas: I wondered if this is a more generic use case where you want to specify something.

Nigel: One option available in TTML2 is to use the condition mechanism to extract just
... the content for, say, a specific language, and put all the different content in a single TTML document.
... That's an alternative to what you suggested.
... Another is to use a different manifest format, like IMF etc to handle this kind of case.

Glenn: I would have answered "yes of course" and it's the responsibility of the application
... that's using TTML to define how to use it. It's something external to the TTML file.
... I would refocus the question on making that an application specific usage scenario.
... Like if you want a PNG, JPEG and SVG version of a single image, there's no requirement
... for each file to know about each other but the outside usage may have a manifest of
... potential realisations of that resource.
... This is like the semantics of URNs and URIs. URIs are abstract, and URNs more so, but they
... map to one or more URLs that realise the resource, and each URL might have a different
... aspect like language and so forth.

Andreas: I know that we delegate this. What Nigel said is to pick something out of the file
... but you want to store it without picking something. You don't want to say which one is
... preferred. You could specify the condition for a default to be selected.

Nigel: True

Andreas: The other storage scenarios are too big. It depends on the overall system
... environment if they use IMF or something else. I don't think it makes sense just to store
... subtitles in IMF without the video.

Nigel: It begs the question why localise subtitles only and not other resources like audio,
... and if you are localising audio, then it starts to make more sense to use something like IMF.

Andreas: You may have the problem that you want the different subtitle versions in one
... file. The condition attribute is an interesting thought to check out. It is not in IMSC?

Pierre: No.

Andreas: The easiest one is just to have multiple TTML documents in one file. Then you
... can easily access the complete document tree and switch easily between different documents.
... Then from one big file you can generate easily a separate document just for one version
... or language.
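
A minimal sketch of the idea Andreas describes, for illustration only. The "TTMLContainer" element is hypothetical and is not defined by any TTML specification; the sample content, the use of xml:lang as the selection key, and the function name are likewise assumptions.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/ns/ttml"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Serialise extracted documents with TTML as the default namespace.
ET.register_namespace("", TT_NS)

# Hypothetical container file: several complete tt documents under one parent.
CONTAINER = """<TTMLContainer>
  <tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
    <body><div><p begin="0s" end="2s">Hello</p></div></body>
  </tt>
  <tt xmlns="http://www.w3.org/ns/ttml" xml:lang="de">
    <body><div><p begin="0s" end="2s">Hallo</p></div></body>
  </tt>
</TTMLContainer>"""


def extract_document(container_xml: str, lang: str) -> str:
    """Return the tt document whose xml:lang equals lang, as standalone XML."""
    root = ET.fromstring(container_xml)
    for tt in root.findall(f"{{{TT_NS}}}tt"):
        if tt.get(XML_LANG) == lang:
            # strip() drops the whitespace that followed the element in the container.
            return ET.tostring(tt, encoding="unicode").strip()
    raise KeyError(f"no tt document with xml:lang={lang!r}")
```

Calling extract_document(CONTAINER, "de") yields a single-language document. The sketch does not address the questions raised in the discussion that follows, such as whether the grouped documents share a common begin time.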

Glenn: I don't like it at all.

Nigel: Don't like what?

Glenn: Multiple TTML documents as children of a parent element. It raises all sorts of
... questions about semantics, like do they all start at the same begin time.
... It is more reasonable that applications of TTML should define their own way to manage
... groups of TTML documents.

Nigel: That sounded contradictory - do you mean it's okay for an application but not for this group to do?

Glenn: Yes, for example you could just put them in a zip file.

Nigel: Yes and give each a language-specific filename.
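
A minimal sketch of the application-level approach suggested here, assuming a zip archive with one TTML document per language. The "subtitles.<lang>.ttml" naming pattern and both function names are illustrative assumptions, not an existing convention.

```python
import io
import zipfile


def bundle_documents(documents: dict[str, str]) -> bytes:
    """Pack one TTML document per language into a zip, named by language tag."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for lang, ttml in documents.items():
            # Assumed naming pattern: subtitles.<language tag>.ttml
            archive.writestr(f"subtitles.{lang}.ttml", ttml)
    return buffer.getvalue()


def available_languages(bundle: bytes) -> list[str]:
    """List the language tags found in the archive's file names."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
        return sorted(name.split(".")[1] for name in archive.namelist())
```

Each TTML document stays unmodified and unaware of the others; the grouping lives entirely in the archive.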

Glenn: Right [cites an existing example of this kind of technique]
... It seems too closely aligned to specific application requirements, for example what are
... the criteria for semantically grouping? Right now we define three different root level
... element types, actually four, that can appear in a TTML document: tt, ttp:profile, isd:sequence and isd:isd.
... The isd:sequence is a bit like what you're suggesting except you're suggesting a group
... not a sequence.

Andreas: The use case could be that you have one file and a player like VLC offers the choice
... of languages, and the same file would work in other players too. Two broadcasters
... mentioned this to me recently, and others before. The scenario exists, and operational
... people are looking for something like that. They can come up with their own solution,
... the question is if a common solution makes sense.

Glenn: In HTML there's something called a "web archive" that a lot of tools can work with,
... which saves all the page's files together in some form.
... I've never seen any proposal within W3C to define a standard container for a collection
... of HTML files, or PNG files or whatever basic content format file is being defined.

Andreas: The video element can have multiple text track child elements.

Nigel: I would push back against this because I think that the use case of localisation
... goes beyond just subtitles, and should include all media types as first class citizens,
... audio, video and anything else. It's detrimental to be too specific.
... Thanks, it seems like we don't have consensus to develop a requirements document
... for grouping TTML documents at this stage.

Additional styling

Nigel: I wanted to raise this because we have an interesting use case in the BBC that
... TTML cannot currently handle, even though it seems like it should be able to.
... [demonstrates some internal pages showing TTML presentation of narrative
... text captions in video styled with CSS, animations, borders, border gradients etc.]
... At the moment the CSS properties we would need are specific borders, clip-path and
... background linear gradients.
... I'm much more worried about future CSS properties that would be needed though.

Glenn: There are a couple of problems. One is testing - if we have a generic pass-through
... mechanism, like a "css" property, whose value is a CSS expression, what do you put in
... your profile? Right now we don't have a notion of parameterised set of values.

Andreas: In general I like the idea to use CSS features before they enter TTML properly.
... I don't know how exactly, but in general I would support figuring out how this could work.

Glenn: It is worth investigating.

Pierre: Since the alignment has been with CSS it is worth a longer discussion.
... Just in names there's friction for some folks, even though the gap is reducing. I also
... like the way it is clear you don't have to import all of CSS, which is a relief to others.
... For a computer, mapping a TTML name to a CSS name is a no-op. Alignment between
... TTML and CSS has served us well so we should continue doing it.
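
A minimal sketch of the kind of mechanical name mapping Pierre refers to, assuming a plain camelCase to hyphenated conversion. The function name is hypothetical. It covers names only: not every TTML style attribute has a same-named CSS property (tts:origin and tts:extent, for example), and value syntax and semantics can differ even where names line up.

```python
import re


def ttml_to_css_name(attribute: str) -> str:
    """Convert a TTML style attribute name such as tts:fontSize to font-size."""
    # Drop the namespace prefix, if any.
    local = attribute.split(":", 1)[-1]
    # Replace each upper-case letter with a hyphen plus its lower-case form.
    return re.sub(r"[A-Z]", lambda m: "-" + m.group(0).lower(), local)
```

For example, ttml_to_css_name("tts:backgroundColor") gives "background-color".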

Glenn: It would make it easier to expose CSS properties without the expense of a TTML
... style attribute. There may be a sacrifice of interoperability.

Andreas: This group would just define the mechanism and then it is the responsibility
... of the application if it supports it or not.

Glenn: Then there's the profile mechanism issue.

Andreas: Just say nothing about it.
... [leaves]

Pierre: [leaves]

Nigel: Thank you both.
... OK for this requirement, I think it is worth spending some time describing the
... requirement more fully, which I will try to do. Obvious solutions to this kind of thing
... include specifying CSS properties directly on content elements or style elements,
... and allowing a class attribute to define CSS classes that apply to a content element.
... I realise both of these could create clashes between TTML styling and CSS styling and
... we would need some mechanism for resolving those clashes. Especially class styling
... is very different to the applicative styling we have in TTML, since it goes the other way
... in terms of traversal.

Glenn: Class is a shorthand for id, and we already have id.

Nigel: It's not a shorthand for id

Glenn: You can have a CSS stylesheet associated with a TTML document and have #id styles
... that are associating elements in TTML with CSS. In that sense adding class is just a
... shorthand for aggregating multiple ids into one group.

Nigel: That's true.

Glenn: At application level you could put a CSS stylesheet on one side.
... There's a precedent here in WebVTT of applying a stylesheet on the outside, though it
... is not defined clearly. Then it becomes a player dependent function whether it ingests
... and uses the stylesheet during the formatting process.
... Especially if you are doing a process where you're converting TTML to HTML/CSS.
... I would be reluctant to buy into an approach that requires mapping to HTML and CSS.
... Provided that we can have native implementations or things that don't map to HTML/CSS
... and still use whatever we develop here that would be my mental model for acceptability.

Nigel: Just wondering about how big a problem space I'm opening up. If we map TTML
... to SVG do we have to define how any classes or styles are tunnelled through?

Glenn: It could be done, the implementation would need to do some book-keeping as it
... goes through the area mapping process, to get to the SVG elements that can be styled.
... One TTML element can generate multiple areas and you can have multiple TTML elements
... generating one area.

Nigel: In terms of spec work should we feel obliged to define the tunnelling into SVG?

Glenn: I don't think so.
... We just need to be careful not to impose a restriction to a particular mapping format.
... It should be possible to make a native implementation that doesn't use CSS or SVG.
... In such a situation the native player would have to interpret the CSS and do what CSS
... does in that circumstance. A lot of CSS semantics are based on the box model and there
... may be some minor impedance mismatches between our area model and the CSS box
... model.

Nigel: I take your word for that, but our model came from XSL-FO, which was at least once
... aligned with CSS.

Glenn: For example CSS doesn't allow width or height to be specified on non-replaced
... inline elements whereas we do allow that for ipd and bpd on a span, even if it does not
... have display "inline-block". I just wanted to mention that we have taken various
... decisions semantic-wise where if we just expose CSS into the mix we may have to deal
... with incompatibilities that might arise.
... One answer to the implementer is "do whatever makes sense" which is generally how
... implementers operate anyway, but then you get interop issues.

Nigel: That's the point, to make an extensible model that allows a greater variety of CSS
... styles to be applied in applications that can support them.
... For example we could put all the "CSS tunnelling" semantics behind a feature designator.

Glenn: Yes. The general approach for CSS is that implementations ignore what they do not
... recognise. There are no guarantees.

Nigel: Some implementations support @supports queries, but older ones might not.
... I think we have consensus to work this up in terms of requirements and head towards
... a solution in some future version of TTML.

Audio Profiles

Nigel: I presented something here to the joint meeting with the Media and Entertainment IG
... yesterday, and there's an Audio Description CG meeting on Thursday.
... For this group's benefit, the idea is to create a profile of TTML2 which supports the
... features needed for audio description.

Presentation to joint meeting

Nigel: [shows TTML2 feature list]
... I've just been told that the BBC implementation is live on github.io, but not quite working yet

BBC Adhere implementation

Nigel: It has some build issues to fix.
... My intent is that when the CG is settled on the profile we add it to the TTWG Charter
... as a Rec track document.

Glenn: During the drive up to the implementation report you mentioned some challenges
... and we made changes to some of the feature definitions - we removed embedded-audio
... from the audio-description feature. Was that due to an implementation constraint?

Nigel: We made different changes. The embedded audio was one where I wasn't sure if
... we would hit time limits. The other was text to speech in conjunction with web audio,
... which is an API limitation that web speech output is not available as an input to web audio.

Glenn: Can that be rectified?

Matt: I had a response about this a couple of weeks ago. Due to licensing of some of the
... recognisers and synthesisers in the Web Speech API they are not licensed for recording
... so there was little enthusiasm for making an API call that would capture speech output
... from the API. Of course there are other ways to do it, but making it a feature would
... open it up to licensing issues.

Nigel: The Web Speech API never got towards Rec, it's a Note I think.

Glenn: Generally IPR isn't an issue for W3C specs.

Matt: It has multiple implementations but is still a CG report.
... The "terms of service" for many voices allow use in real time but prohibit recording the
... audio and saving it for later playback.

Nigel: If we can encourage that to get to resolution then we could use it for AD.
... The other issue to note is that for embedded audio, there's a bit of a challenge
... implementing clipBegin and clipEnd. For normal audio resources you can use media
... fragments on URLs to make a time range request, but in our testing it didn't seem to
... honour the end time always, just the begin time. But more seriously for embedded audio,
... if you implement it as a data: URL then those URL media fragments seem to be completely ignored.

Matt: Range requests have to be supported by the server I think. Most do, but it's not a given.
... The data URL may not be supported at all.
... The response has to have the Accept-Ranges header set.
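
A minimal sketch of the HTTP byte-range mechanics Matt mentions, assuming response headers are available as a plain dictionary. Both function names are hypothetical, and no network request is made. Byte ranges select bytes of the resource, not a time range.

```python
def advertises_byte_ranges(headers: dict[str, str]) -> bool:
    """True if the response headers include Accept-Ranges with the bytes unit."""
    # Header names are case-insensitive, so compare in lower case.
    value = next(
        (v for k, v in headers.items() if k.lower() == "accept-ranges"), ""
    )
    return "bytes" in [token.strip().lower() for token in value.split(",")]


def range_header(first_byte: int, last_byte: int) -> dict[str, str]:
    """Build a Range request header for an inclusive byte span."""
    return {"Range": f"bytes={first_byte}-{last_byte}"}
```

A client would typically check the first function against an earlier response before sending the header built by the second.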

<mdjp> range requests https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests

Nigel: I think this is a different thing. It's byte ranges.

Media Fragments URI

Nigel: That's what I meant.
... It allows for a url#t=10,20 for example to give everything between 10 and 20s. In testing
... that doesn't seem to work with data: URLs.
... That's something that may need an explicit mention in a future edition of TTML2, for example.
... While we're on future editions of TTML2, and audio, I hope to be able to define the
... audio processing model more normatively than it is now.
... The Web Audio spec is in CR at the moment, isn't it?

Matt: Yes. Timeline to be discussed in the meeting on Thursday. No issues have been
... raised, we're not aware of any problems. On a similar note I should say we're meeting
... on Thursday and Friday, which conflicts with the AD CG but the main topic will be
... use cases, requirements and features that have been omitted from v1 so if there's anything
... around this work that would require Web Audio work to facilitate it now would be a good
... time to provide them.

Nigel: Thanks for that, if any arise I will let you know!
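
A minimal sketch of reading the temporal media fragment form mentioned above (url#t=10,20), assuming values given in plain seconds with an optional "npt:" prefix. Clock-time formats such as hh:mm:ss are not handled, and the function name is hypothetical. A native player that does not get fragment support from the platform, for example with data: URLs, would need logic of this kind to apply clipBegin and clipEnd itself.

```python
from urllib.parse import parse_qs, urlsplit


def parse_temporal_fragment(url: str):
    """Return (begin, end) in seconds from a #t=begin,end fragment, or None.

    end is None when the fragment gives only a begin time.
    """
    fragment = urlsplit(url).fragment
    values = parse_qs(fragment).get("t")
    if not values:
        return None
    # Use the last t= occurrence and drop the optional npt: prefix.
    spec = values[-1].removeprefix("npt:")
    begin_text, _, end_text = spec.partition(",")
    begin = float(begin_text) if begin_text else 0.0
    end = float(end_text) if end_text else None
    return begin, end
```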

Glenn: Back on the issue of speech, I had pointed out how in TTML we defined a special
... resource URL for output of the speech processor, and how that was intended to be
... potentially used as an input to the audio element, so you could say an audio element
... is the speech resource instead of a pre-defined clip, and that would be useful for mix
... and gain operations.

Nigel: It's unnecessary - we didn't need to use that in our implementation.

Glenn: The connection between the speech processor's output and the audio node
... hierarchy does not exist.

Nigel: We take it as an implied one.

Glenn: That's an implementation choice that I didn't intend in the spec.

Nigel: That seems to be unnecessary pain - if you bother to put tta:speak in as anything other than none
... then you obviously want to generate audio.

Glenn: You need it to be able to pan the speech output, for example.

Nigel: That's true, I didn't consider that.
... You could posit an implied anonymous audio element if the span's tta:speak is not "none" and there is no explicit audio element child.
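
A minimal sketch of the condition Nigel posits, assuming the TTML2 namespaces for tt and tta. It only inspects the attribute set directly on the span; inheritance and referenced styles are ignored, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

TT = "http://www.w3.org/ns/ttml"
TTA = "http://www.w3.org/ns/ttml#audio"


def needs_implied_audio(span: ET.Element) -> bool:
    """True if tta:speak is set to something other than none and the span
    has no explicit audio element child."""
    speak = span.get(f"{{{TTA}}}speak", "none")
    has_audio_child = span.find(f"{{{TT}}}audio") is not None
    return speak != "none" and not has_audio_child
```

A processor following this idea would treat such a span as if it had an anonymous audio child carrying the speech output, which would then be available for pan and gain.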

Glenn: That's a bit like putting origin and extent on a content element!

Nigel: I sort of see what you mean [scrunches eyes]

Glenn: In the definitions section I define a speech data resource.

Nigel: It doesn't seem clear what happens if tta:speak is not "none" and there is no
... audio element child.
... It is possible that we can tidy this up in a future edition.

Glenn: It could be improved - we could tie it to that binding mechanism more explicitly.

Nigel: +1
... However I would like to see a syntactic shortcut that avoids the need to have an audio
... element with a long string in it just for "mix this audio" when tta:speak is set, because
... that's obvious.

Glenn: I notice that it is not possible to add audio as a child of body, in TTML2. Why not? I don't recall my logic there, if there was any.

Nigel: I think it's clear that there's a bucket of audio-related potential improvements that
... are most likely to come out of work in the AD CG, which we should consider for a future
... edition of TTML2.

Meeting close

Nigel: Thank you everyone, we've reached the end of our agenda for today.
... We should take a moment to celebrate the success we've had in all the work we've done
... on TTML and IMSC over the past few years!
... Next week we have no weekly call, the week after I will send an agenda as usual.
... [meeting adjourned]
