W3C

Timed Text Working Group Teleconference

14 April 2022

Attendees

Present
Atsushi, Cyril, Gary, Mike, Nigel, Philippe, Pierre
Regrets
Andreas
Chair
Gary, Nigel
Scribe
gkatsev, nigel

Meeting minutes

This meeting

Nigel: Today we have IMSC HRM Wide Review update,
… DAPT-REQs update
… Timed Text in Low Latency Streaming applications,
… Behaviour with non-native controls (in case there's anything more to say about that today)

Gary: Probably not this week

Nigel: And Rechartering status update (for 2nd half of the hour only)
… Any other business, or points to make sure we cover?

Atsushi: I'd like to point to an error in the DAPT-REQs

Nigel: OK, let's cover that in that agenda topic

group: [no more AOB]

IMSC-HRM Wide Review update

Nigel: Quick update: I sent the comms out, and got some thank you acknowledgements back.
… So I think we can tick that action as being done.
… As is often the case with these occasional liaison messages, there are some updates to the
… liaisons page that came to light.
… I have not forwarded any of the thank you/acknowledgements to the member-tt list.
… Anyone think that's worth doing?

Pierre: I think you can safely not do it.

Nigel: I'm hoping for that!
… Ok, I won't then.
… Any questions or points on this before moving on?

DAPT-REQs

Atsushi: For now we have SVG images integrated in the HTML.
… That causes fatal errors in the HTML validator, which doesn't support XML namespaces in SVG.
… That error comes from the HTML Validator and will cause a fatal error during streamlined publication
… via the GitHub Action, so one way is to ask the HTML Validator to support integrated SVG images,
… which I believe is quite hard.
… Another is to separate that part as a distinct file, not integrated into HTML.
… If possible I'd like to propose the latter way.

Nigel: If I understand correctly that will break the page,

<atsushi> pubrules error

Nigel: because it will stop the click-through fragment links in the SVG from working.
… I think!

Cyril: Yes because inside the SVG you have href fragment ids that point to elements in the HTML,
… for navigation.
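
For illustration, a minimal sketch of the pattern being described, with invented ids and content: an SVG embedded inline in the HTML whose link targets a fragment elsewhere in the same page. Split out into a separate .svg file, the same fragment reference would resolve against the SVG document instead, so the click-through navigation into the HTML would break.

  # Hypothetical illustration only; element ids, text and geometry are invented.
  embedded_svg_html = """
  <section id="req-script-adaptation">
    <h3>Script adaptation</h3>
    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 220 60" role="img">
      <a href="#req-script-adaptation">   <!-- same-document fragment link -->
        <rect x="10" y="10" width="200" height="40" fill="none" stroke="black"/>
        <text x="20" y="35">Script adaptation</text>
      </a>
    </svg>
  </section>
  """
  # If the SVG is served as a standalone file, href="#req-script-adaptation"
  # no longer points at the section in the HTML requirements document.
  print(embedded_svg_html)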

Atsushi: Ah!

Nigel: Exactly

Atsushi: For now I have no idea then.

Cyril: Let's think offline. I don't know if it's a Must, it's a Nice to Have.

Pierre: Have we asked how much trouble it would be to update the validator?

Atsushi: No idea. I suppose there's not much activity on that now.

Nigel: I'd note that the W3C Process doc includes an embedded SVG with some behaviour as well.

Nigel: I think the action is to ask about updating the validator.

Pierre: At least.

Atsushi: Yes. In any case please note that there is a possibility of some errors being shown due to this.

Nigel: Yes, I can see the validator errors are terrible.
… But can we still publish?

Atsushi: Yes, for the initial publication of the FP Draft Note it is done manually, so we can state
… that these errors are okay.
… It should be possible to publish by hand.
… You may see in the page I pasted above that the Charter extension has not been announced yet,
… so there's an error about the group. I'm waiting for that - should happen today or tomorrow I believe.

Nigel: Right.

Nigel: So status now is Not Published Yet.
… Atsushi will you ask about fixing the HTML Validator?

Atsushi: I will write today or tomorrow to sys team or spec prod. Someone might know something.

Nigel: Ok, thanks.
… Any more on this agenda topic?

Timed Text in Low Latency Streaming applications

Thread that began this.

Gary: I want to note that everything that we're discussing here for IMSC would also apply
… to WebVTT in terms of HLS segmentation etc.
… It's something we have started to investigate too, how to do this.

Mike: It's a timely topic. I'm aware of one encoder vendor that implemented something.
… Still learning what they did exactly.

Nigel: Would be good to start this with a summary of the problem that needs solving.

Mike: The problem is that TTML documents can't really be incrementally parsed and displayed.
… You have to get to the end of the document and then you can start to present.
… You can make it better by omitting things like set and animation.
… In general the model is you get a document, you parse it and display it.
… That process introduces delay in packaging the segment.
… You can't start to emit the segment until you have a complete document,
… and if it's a 2s segment say, you can't emit anything until those 2s are up.
… This is only for low latency live. If you have VOD and don't care about low latency you don't care about this.
… In a TV studio today, for ATSC, the encoders are taking an MPEG transport stream with 608 captions
… and start transcoding into ATSC 3 including IMSC 1.
… For video and audio you can encode incrementally with random access points.
… In the case of IMSC 1 you have to wait until the segment time is over before shipping it off.
… In practice that imposes a delay in the audio and video today.

Nigel: And you could reduce the segment duration, but there's something limiting that?

Mike: Every time you make the segment duration smaller there's a much higher ratio of signalling to content.
… Some schemes like CMAF limit the minimum duration of a segment to 960ms.
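
A back-of-the-envelope sketch of that trade-off; the per-segment overhead and subtitle payload figures below are invented for illustration, since real numbers depend on the packaging and manifests.

  # Hypothetical numbers only: shows how the signalling-to-content ratio grows
  # as segments get shorter, for a fixed per-segment cost.
  PER_SEGMENT_OVERHEAD_BYTES = 800   # assumed fixed cost per segment
  CAPTION_BYTES_PER_SECOND = 200     # assumed subtitle payload rate

  for duration_s in (6.0, 2.0, 0.96, 0.5):
      content = CAPTION_BYTES_PER_SECOND * duration_s
      ratio = PER_SEGMENT_OVERHEAD_BYTES / content
      print(f"{duration_s:5.2f}s segment: overhead/content ratio = {ratio:.1f}")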

Cyril: Mike, you mentioned 2 things that got me thinking.
… In live low latency, what would be the constraint that would make the progressive rendering not work?
… As I mentioned on the thread, it is possible to have a progressive SAX parser and renderer.
… The 2nd question is: nothing prevents low latency chunks with documents. The overhead increases,
… but is it that significant compared to audio and video?

Mike: I'm coming at this from the desire to have a backward compatible solution with existing
… equipment in the field. One of the solutions is chunking IMSC 1.
… In theory a smarter decoder could parse that and present it.

Cyril: I don't follow - you're saying implementations in the field don't support progressive decode?

Mike: No, the incremental encode into packets...

Cyril: No, I mean 1 segment = 1 sample = 1 document but you deliver this with HTTP chunked transfer,
… progressively, and you let the renderer handle the SAX parsing and incremental rendering.
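
A minimal sketch of the progressive parse-and-render approach Cyril describes, using Python's incremental SAX interface; the TTML fragment and the chunk boundaries are invented for illustration.

  # Sketch: feed a TTML document to a SAX parser as it arrives in chunks and
  # handle each <p> as soon as its end tag is seen, rather than waiting for the
  # whole document. Content and chunking below are hypothetical.
  import xml.sax

  class ParagraphHandler(xml.sax.ContentHandler):
      def __init__(self):
          super().__init__()
          self.in_p = False
          self.begin = None
          self.text = []

      def startElement(self, name, attrs):
          if name == "p":
              self.in_p, self.begin, self.text = True, attrs.get("begin"), []

      def characters(self, content):
          if self.in_p:
              self.text.append(content)

      def endElement(self, name):
          if name == "p":
              self.in_p = False
              print(f"render at {self.begin}: {''.join(self.text)}")

  parser = xml.sax.make_parser()   # supports incremental feed()
  parser.setContentHandler(ParagraphHandler())

  chunks = [                       # hypothetical chunked delivery of one document
      '<tt xmlns="http://www.w3.org/ns/ttml"><body><div>',
      '<p begin="0s" end="2s">Hello</p>',
      '<p begin="2s" end="4s">world</p>',
      '</div></body></tt>',
  ]
  for chunk in chunks:
      parser.feed(chunk)           # each completed <p> is rendered immediately
  parser.close()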

Mike: Non-backwards compatible.

Cyril: Why?

Mike: A decoder today wouldn't know how to handle that.

Gary: The expectation is that the document is done after writing and won't get updated afterwards.

Mike: Yes, it's too late in the 1-2s timeframe.

Cyril: So to paraphrase, decoders do not support chunked download and parse?

Mike: They could, but they don't.

Cyril: And why not increase overhead with multiple small samples?

Mike: My reading of 14496-30 says there's 1 document per segment.

Cyril: I disagree, part 30 does not mention segmentation, it talks about carriage into samples.

Mike: I'll check that, maybe I was referring to CMAF. It would solve the problem.
… But today, a backwards compatible solution would be to start putting out documents incrementally,
… and a smart decoder could parse and decode as it goes along, whereas older ones have to wait.
… Need to take into account that there are some devices that can never be updated.
… Need to support their behaviour for a while until end of support period.
… Harder than software decoders, for example.
… Idea is that the sum of the samples would still be a legal segment.

Cyril: That's not compliant with part 30.

Mike: Not multiple samples, just chunked delivery in different packets, for a single sample.
… That's one idea that would keep compliance and backwards compatibility.

Nigel: Complicated interplay between standards here, with constraints coming from
… specifications not defined here. How can we constrain our discussion to things that we can control?

Mike: Need to include constraints from ISOBMFF.

Nigel: But that doesn't impose any issues - the key constraint here is from CMAF not ISOBMFF.

Mike: The two largest implementations are from DASH and HLS with CMAF.

Cyril: CMAF does not constrain anything here though - 960ms is a recommendation only.
… It's the same for audio and video and this is no different.
… To Nigel's point, what is needed in TTML to solve this use case?

Mike: Different ways to solve this.
… We could introduce non-backwards-compat features in TTML to help solve this.

Cyril: For example?

Mike: Like assumptions in the case of a partial document, for example.
… Not sure I want to go there, it's just an example
… I hadn't given it a lot of thought until our exchange.
… Based on a private conversation I thought of a couple of ideas of how it could work.
… I have a somewhat unique problem with respect to broadcast, which adds a layer
… in terms of what can and can't be expected to work.

<Zakim> gkatsev, you wanted to ask about whether short segments provide a good UX. Might have really short lines?

Gary: One of the potential solutions is to have really short segments, and a document per segment.
… My issue with that is a potentially bad user experience, if there is not enough text to
… provide a full line. Unless you did a whole bunch of work to position properly.
… Because you can't append to an existing cue.

Cyril: It's difficult to know the position because you don't know the device font.

Gary: It's like how to handle partial cues.
… You might want to send stuff to the client before you have all the information you want to have.

Nigel: Strict formatting makes this easier - at the BBC we use word by word live subtitles
… all the time in IMSC and it works fine as long as the text alignment is set correctly.
… In situations where, for example, text is centre aligned, it is unreadable.
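
A minimal, invented sketch of that word-by-word pattern: each successive document carries the text accumulated so far, and with start alignment the words already on screen keep their position as new ones arrive.

  # Hypothetical illustration: words, timing and cadence are invented.
  words = ["Good", "evening", "and", "welcome"]
  STEP_S = 0.5   # assumed interval between word updates

  for i in range(1, len(words) + 1):
      begin = (i - 1) * STEP_S
      text = " ".join(words[:i])
      # With tts:textAlign="start" the earlier words do not move when a word is
      # appended; with centre alignment the whole line shifts on every update.
      print(f'<p begin="{begin}s" end="{begin + STEP_S}s" '
            f'tts:textAlign="start">{text}</p>')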

Gary: Yes, that does make it easier.

Cyril: It would really help to have a concrete example, not talking about
… documents, chunking, segmentation, just what you want to display at what time,
… then people can propose ways to do it, and we can see where the spec compliance issues arise.

Mike: Happy to put that together.
… First and foremost, the encode delay.
… No problem with TTML syntax for timing. That's not the issue.
… Given a paint-on progressive display, within a single document you can add text in small numbers of letters.
… Most decoders work pretty well that way.
… Good question, so I'll try to explain the problem in more detail in writing, with a picture or two.
… I'll take that as an action item.

Cyril: Thank you.

Nigel: Just a clarification question about the requirements: is anyone worrying about data rates?

Mike: In general no, but one of the things we have had to do is reconstruct the screen for a
… random access point in every segment. The first thing to worry about is recreating the screen
… from scratch. That could be tedious. So far that hasn't hurt the processors but it gets
… quickly out of control, hence the focus on the HRM.
… The initial implementations to make this work were terrible and violated the HRM.
… Now the output is HRM conformant, which helped bound the complexity.

Nigel: Thanks.

<Zakim> gkatsev, you wanted to ask about character/word deletions from say 608

Gary: Maybe not to answer right now, but one thing brought up previously
… with regards to live captioning is that sometimes the captioner can delete words or
… characters in 608, and depending on how you're doing live captions you might,
… say if you're emitting a VTT or IMSC caption right before a delete command, how would you
… go about deleting that word?

Mike: This is an issue with live captioning. So far the encoders have ignored it.
… I haven't produced any tests with backspace.

Gary: This is probably a longer term thing, we don't need to worry in the short term.

Mike: For now it's a general problem not a low latency live one.

Nigel: In IMSC just as you can add text you can remove it, so I don't think that's a format problem per se.
… If you're in a cue based environment where cues cannot easily be changed, I don't know how to resolve that.
… For time, let's close off now unless there's anything else burning on this topic?

Rechartering status update

[plh joins]

Nigel: Atsushi already advised that we're expecting a Charter extension.

Philippe: It's on my list - your Charter runs out at the end of the month?

Nigel: No, two weeks ago!

Philippe: Oops - Atsushi, do you want to do it? It has approval already.
… I'll get on it - I have a bunch of announcements to AC to go too.

Nigel: The other side of this is dealing with the FOs.

Nigel: Last time we talked, we had tried to contact Apple to get more info for the FO
… and we got a response.
… Asked if Adobe's proposal about independence would be good/enough.
… Response was no, but a good change.
… Independence of implementation was the big issue for them.
… Could be that the way we defined facts and verifications doesn't look like implementations to Apple.
… No further follow-up from Apple yet.

Philippe: I had a talk with Apple last week.
… Big change in charter.
… From their point of view, the new change doesn't satisfy "adequate implementation experience".
… If the spec stays in CR for longer, that's fine.

Nigel: The AC rep from Google also objected,
… and asked what's defined as an implementation.

Nigel: He stated that treating content as an implementation isn't good enough.
… Ralph's view seemed to support our charter updates.

Philippe: They want to raise the bar.
… You don't have to agree with their views.
… People who read the specs should be able to reproduce them without talking to the spec maintainers.

Nigel: Following up on being able to implement from the spec:
… some specs do allow implementation choices, deliberately, e.g. CSS.
… Should we wrap it up for today?
… Should we adjourn for today?
… [adjourns meeting]

Minutes manually created (not a transcript), formatted by scribe.perl version 185 (Thu Dec 2 18:51:55 2021 UTC).