<cpn> scribenick: cpn
Giri: Thanks everyone for
attending.
... Let's go through the status of the document.
... Apologies for the timing confusion in the email, but 8AM
Pacific is the right time, 3rd Monday of the month.
... [reviews the agenda]
<kaz> Agenda for today
Giri: Following Mark's presentation
on the CTA in the last IG call, it would be good to get an update
on other SDOs
... The Qualcomm people are away at MPEG, so that may not be
possible today
... Any additions for today's agenda?
Chris: We should also cover the frame
accurate seeking issue in the M&E GitHub.
... https://github.com/w3c/media-and-entertainment/issues/4
Giri: OK. Looking at the open issues
against the document: https://github.com/w3c/me-media-timed-events/issues
... There's a note to add Terminology to the document.
Chris: What terminology should we define?
<kaz> issue 5 on terminology
Giri: There are in-band and out-of-band
events, and also media timeline
... Are there others?
Chris: I don't think so. I'm happy to do some of this myself.
Mark: Can I suggest that where possible we use existing terminology from W3C, MPEG, or elsewhere?
Giri: That's a good point, yes.
... In-band and out-of-band events are in the DASH spec
... Media timeline is in the MSE spec.
... Issue 3 relates to work in DASH-IF, arising from a request
from 3GPP to DASH-IF
... https://github.com/w3c/me-media-timed-events/issues/3
... I think there's been some issue with getting a copy of this
work. It's in the CTA whitepaper that Mark referenced
... Members may be aware of this. When documents are in the DASH-IF
or CTA repos, how can we get access to them from a W3C
perspective?
... If we take this to the WICG, we'll need to make people aware,
otherwise solutions will come up from elsewhere.
Mark: Which is the whitepaper?
Giri: It's "event messages in WAVE".
CTA may not be ready to distribute it, as it's work in
progress.
... Could CTA send to W3C under liaison?
Mark: I'll look into that.
<scribe> ACTION: Mark_Vickers to investigate sharing a CTA event message whitepaper with W3C
Giri: On the issue of DataCue API, we
had a lot of good discussion. But what do we want to give as input
to WICG?
... We have a description of the emsg structure, and how HbbTV
handles it.
<inserted> issue 2 - DataCue or a more specific API for DASH and emsg events
<kaz> Media Timed Events draft - section 4.1.1 DASH and ISO BMFF emsg events
Giri: We don't have to go into
solutions; we can point to some potential approaches and let WICG
handle it.
... Other info needed?
Chris: An agreed set of use cases; maybe the ones we have so far aren't enough.
Giri: We could close the last two issues.
Mark: I would recast this not as
DataCue versus a more specific API, but rather: are there any use
case / requirement gaps that could not be solved through DataCue?
... I think it's OK to present emsg as a requirement, but we
shouldn't come up with an alternative API here.
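The emsg structure mentioned above can be illustrated with a short parser sketch. This is an illustration only, based on the version 0 `emsg` box layout in ISO/IEC 23009-1 (DASH); the function and field names are assumptions, not part of the draft or any proposed API:

```typescript
// Sketch: parsing a DASH ISO BMFF 'emsg' box (version 0). Layout per
// ISO/IEC 23009-1: box size/type, version, 24-bit flags, then
// scheme_id_uri and value as null-terminated strings, followed by
// timescale, presentation_time_delta, event_duration, id, message_data.

interface EmsgEvent {
  schemeIdUri: string;
  value: string;
  timescale: number;
  presentationTimeDelta: number;
  eventDuration: number;
  id: number;
  messageData: Uint8Array;
}

function parseEmsgV0(box: Uint8Array): EmsgEvent {
  const view = new DataView(box.buffer, box.byteOffset, box.byteLength);
  const size = view.getUint32(0);
  const type = String.fromCharCode(box[4], box[5], box[6], box[7]);
  if (type !== "emsg") throw new Error("not an emsg box");
  if (box[8] !== 0) throw new Error("only version 0 handled in this sketch");
  let offset = 12; // skip size, type, version, flags

  const readString = (): string => {
    let s = "";
    while (box[offset] !== 0) s += String.fromCharCode(box[offset++]);
    offset++; // skip null terminator
    return s;
  };

  const schemeIdUri = readString();
  const value = readString();
  const timescale = view.getUint32(offset);
  const presentationTimeDelta = view.getUint32(offset + 4);
  const eventDuration = view.getUint32(offset + 8);
  const id = view.getUint32(offset + 12);
  const messageData = box.subarray(offset + 16, size);
  return { schemeIdUri, value, timescale, presentationTimeDelta,
           eventDuration, id, messageData };
}
```

An application-level protocol (e.g. for captions, as discussed below) would then be defined on top of `schemeIdUri` and `messageData`.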
Giri: Looking at issue #6 (use cases)
https://github.com/w3c/me-media-timed-events/issues/6
... We'll be able to improve this when we have the DASH-IF or CTA
documents
... I was hoping the people working on the MPEG effort would be
able to contribute to this.
<kaz> Media Timed Events draft - 2. Use cases
Giri: I wrote some text to contribute
to section 2.2 - https://github.com/w3c/me-media-timed-events/pull/8
... I mentioned rendering of social media feeds, banner ads, and
accessibility assets not addressed by current caption-track based
mechanisms
<kaz> latest changes
Chris: TAG invited MPEG people to a call to discuss embedding of web resources. Will report back when that happens.
Giri: Maybe my additions should go to
2.3?
... [discussion of where it best fits]
... The metadata cue could contain a URL to be requested and
rendered
... The DASH-IF and CTA papers describe in-band timing information.
I wonder if this is something that should be standardised, or
remain proprietary?
Andreas: A question about the use case for rendering captions. You mentioned large print rendering of captions. Do you see DataCue as a way to render captions, apart from WebVTT?
Giri: You'd like to have the media
player render the event. The emsg is a general data carriage
mechanism. You'd define a protocol within emsg to make that occur.
That's how I'd interpret it.
... With a TTML or WebVTT track, the TextTrackCues are identified
as captions. With emsg, that would have to be defined on top.
... The reason I mentioned large print captions: if TTML isn't
suitable for that, an emsg could refer to a large print
resource.
... If anyone would like to contribute to improve the use cases,
please do!
... If I can get the DASH-IF presentation for next time, we can
cover their use cases.
... I have a PR for subtitle timing accuracy. https://github.com/w3c/me-media-timed-events/pull/9
... I actually derived this from the BBC Subtitle Guidelines
http://bbc.github.io/subtitle-guidelines/
... Is this a suitable requirement?
<tidoust> scribenick: tidoust
Chris: I think these guidelines are
for content authoring. The information that you pulled out is more
to do with presentation, e.g. subtitles should not anticipate
speech by more than 1.5 seconds.
... This document contains useful links to EBU-TT-D, where Annex E
talks about when things are rendered in the media timeline, and how
timing should be preserved in the face of frame rate adjustments.
Rendering is expected to happen "as close as possible" to the
authored timestamps.
<kaz> EBU-TT-D file
Chris: The implication there is that
when you're authoring the captions, you can do things precisely,
with the player being able to play things in sync.
... This suggests a more stringent timing requirement than what
you define in the pull request. I'm actually wondering about
linking to EBU-TT-D.
<cpn> https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3380.pdf
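To make the frame accuracy question concrete, here is a minimal sketch (hypothetical helpers, not from EBU-TT-D or the draft) of what it means for a cue timestamp to be frame accurate: the time falls on a frame boundary for the content's frame rate, and a player could snap it to the nearest one.

```typescript
// Sketch: snapping a cue time to the nearest frame boundary, and testing
// whether a given time is already frame accurate. Helper names are
// illustrative assumptions, not part of any specification.

function nearestFrameTime(timeSeconds: number, frameRate: number): number {
  // Round to the nearest frame boundary, e.g. 25 fps -> multiples of 0.04 s
  const frame = Math.round(timeSeconds * frameRate);
  return frame / frameRate;
}

function isFrameAccurate(timeSeconds: number, frameRate: number,
                         epsilon = 1e-6): boolean {
  return Math.abs(nearestFrameTime(timeSeconds, frameRate) - timeSeconds)
    < epsilon;
}
```

For example, at 25 fps a cue at 1.03 s would be snapped to the frame boundary at 1.04 s.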
Andreas: The question in the thread opened by Francois is whether we need frame accuracy for subtitle rendering. This has been discussed for HbbTV as well, for example. I wonder how hard this requirement is.
<cpn> https://github.com/w3c/media-and-entertainment/issues/4
Andreas: How do we judge this requirement on actual TV screens today?
Chris: My feeling is that it is not so stringent, but I'm not an expert there.
Andreas: We have made some tests for
DVB. We also discussed with our editorial partners in Germany,
Austria, and Switzerland, and they did not report frame-accuracy
requirements.
... This may differ per country, but I do not see that people have
been chasing TV manufacturers about accuracy of captioning /
teletext rendering.
Giri: We can say that it is just a loose requirement. There are probably other events that require stricter synchronization.
Chris: Referring to the frame accuracy discussion on GitHub, the user @daiz discusses avoiding overlap of scene changes with subtitles. He wants to be able to align subtitles on scene changes, which suggests frame accuracy is needed.
<cpn> https://github.com/w3c/media-and-entertainment/issues/4#issuecomment-396762643
Mark: I think there is an important
distinction here. Traditionally, even for analog TV, things came
in-band, so fairly accurate. How long the device takes to process
that is indeed a requirement.
... I think we need to preserve the timing as accurately as
possible from the signal to the application, but how long it takes
for the application to display it is a separate issue.
Kaz: I personally agree with Mark. On the other hand, it's a bank holiday in Japan today, and we don't have experts from Japan. So we might want to ask them as well for opinions/feedback later.
Andreas: Thanks for the distinction, Mark. Nonetheless, I think it makes sense to check how accurate current devices are. Our experience shows that they are not frame-accurate at all. I'm not saying that's good, but we can take this as an indication that there may not be a strong incentive to get that frame accuracy.
Giri: This is something we struggled
with in ATSC as well. The device may make the events available
accurately, but the application may take time to process it.
... One workaround is to trigger the event earlier, and let the
application do fine adjustments.
... Another workaround is allowing the application to react
immediately to the event in a synchronous manner.
... In general, it's not an easy problem to solve. It's more
difficult when the signal is received by a set-top box and
distributed to various devices on the home network.
... I think I'm kind of agreeing with Andreas that if you can
avoid a frame accuracy requirement, that would make things easier.
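The first workaround Giri describes, triggering the event early and letting the application do fine adjustments, can be sketched as follows. The lead time value and function names are illustrative assumptions:

```typescript
// Sketch: dispatch an event to the application early by a fixed lead
// time, so the app has processing headroom, then let the app wait out
// the remainder against the media clock before acting.

const LEAD_TIME = 0.5; // seconds of headroom for app processing (assumption)

// Player side: when scheduling an event for the app, subtract the lead time.
function dispatchTimeFor(eventTime: number): number {
  return Math.max(0, eventTime - LEAD_TIME);
}

// App side: on receipt, compute how long to wait before acting,
// given the current media playback position.
function remainingDelay(eventTime: number, currentMediaTime: number): number {
  return Math.max(0, eventTime - currentMediaTime);
}
```

The second workaround, reacting synchronously to the event, would instead require the platform to deliver the event on (or very close to) the media timeline boundary itself.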
Mark: If it's the application that displays the captions, then that's not really an API issue. The scope of the API is only to send the events in a very timely manner.
Giri: One thing to discuss is the ability to send data to an application in as close to a synchronous manner as possible. That would serve as input to WICG to design a solution. I'll add that to the use cases.
<MarkVickers> FCC caption timing requirements: "(ii) Synchronicity. Captioning shall coincide with the corresponding spoken words and sounds to the greatest extent possible, given the type of the programming. Captions shall begin to appear at the time that the corresponding speech or sounds begin and end approximately when the speech or sounds end. Captions shall be displayed on the screen at a speed that permits them to be read by viewers."
<MarkVickers> https://www.ecfr.gov/cgi-bin/text-idx?SID=72eb5a624e8dc043293819a5663dff41&node=47:4.0.1.1.6.1.1.1&rgn=div8=47
<MarkVickers> Please note I just did a quick search for FCC requirements. There may be more numerical requirements elsewhere...
MarkVickers: I just pasted FCC requirements. They don't include numbers.
<Zakim> kaz, you wanted to confirm the next call will be held on August 20th
Kaz: Next call will be August 20th, right?
Giri: Yes, it is.
<kaz> [adjourned]