Meeting minutes
Announcements
<song> walkthrough of the use case and requirements document
Two topics: the WebTransport API is entering its final stage; the group is looking for wide review feedback
People with experience using it in web applications are encouraged to review
Suggestions for changes or improvements to WebTransport are very welcome
The group has set a deadline of the end of this month
https://
Video delivery and streaming in combination with WebCodecs for lower latency
https://
The spec is there; follow the spec's GitHub issues, and please send feedback
A couple of documents for review and feedback
If you're an HDR specialist, these documents will be interesting for you
They might need to be adapted to support HDR, including canvas and CSS
The processing model for it is going through the specification process
Proposed to move on, unless anybody has clarifying questions
For the color work, recommend joining the Color on the Web Community Group
We'll leave a bit of time at the end for AOB
Next Generation Audio
Wolfgang will lead this topic
Wolfgang: The document is a group draft note
thanks
Wolfgang: I suggest opening the document as I talk you through it
… I have a presentation too that summarises the document
… https://
… As some history, we decided at TPAC to make a Group Note
… Made a draft in December
… I'm hoping we can have a call for consensus
… The Note has use cases, requirements, gap analysis, and privacy considerations
… What use cases do we want to enable? Dolby and Fraunhofer have developed this together
… And what would an API need to provide?
… The requirements section describes cross-cutting concerns, e.g., it should work for all codecs
… The gap analysis answers why we think existing APIs can't be used to support the use cases
… The privacy considerations section is a collection of thoughts regarding privacy implications
… Please interrupt to ask questions
… The first use case is selecting a preselection, and interacting with the gain or volume, e.g., to increase the dialog
… Related is position interactivity, where you could put the dialog in a position where it doesn't overlap other audio elements
… Then, selecting individual audio elements in the mix. e.g., musical instruments to apply gain to them
… Another is where all these elements are controlled in conjunction
… The document has more detail
… Requirements - the first is to be codec agnostic. We're not asking for an API that's specific to one company's codec
… It should work for protected media. If the API only works in non-protected use cases, we think it doesn't solve the commercially relevant use cases
… It should work where there are multiple media streams. At a minimum, you'll have audio and video. (By the way, some of these concepts could also apply to video)
… With multiple audio streams, the personalisation can apply to one of those, so apply to the media stream, not just the device
… Controls should happen in realtime. If a user selects a specific preselection, they want it to be active right away, no perceivable latency
… Non-blocking hardware access: what we mean is that users will interact with the media, as it plays, and the APIs should be asynchronous, and not require waiting until a presentation is done. It should be async to the media playback
… Gap Analysis. In meetings so far, we were asked to explain why existing web APIs can't be used
… There's HTMLMediaElement, WebCodecs, e.g., in conjunction with AudioNode. Or implement it all in WASM or JS?
… HTMLMediaElement has an audioTracks attribute. It could conceivably be used to select audio preselections. But doing this confuses tracks and preselections. Some subtle problems are described in the Note
… Audio tracks have very few attributes, e.g., language and kind, which is not enough for the user to select by
… Selection semantics may not match: selecting audio tracks is mutually exclusive, which may not work for preselections
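[Scribe aside: the mutual-exclusion point above can be sketched with a plain-JavaScript mock of `HTMLMediaElement.audioTracks` semantics. The class, track ids, and attributes here are hypothetical illustrations, not the real DOM API:]

```javascript
// Mock of audioTracks semantics: enabling one audio track disables the
// others, i.e., track selection is mutually exclusive.
class MockAudioTrackList {
  constructor(tracks) {
    // Tracks carry only a few attributes (id, kind, language) — too little
    // for a user to meaningfully choose a preselection by.
    this.tracks = tracks.map(t => ({ ...t, enabled: false }));
  }
  enable(id) {
    // Mutually exclusive: exactly one track ends up enabled.
    for (const t of this.tracks) t.enabled = (t.id === id);
  }
}

const list = new MockAudioTrackList([
  { id: "main",       kind: "main",       language: "en" },
  { id: "commentary", kind: "commentary", language: "en" },
]);
list.enable("commentary");
// A preselection, by contrast, is a pre-authored combination of audio
// elements (dialog, music, effects) with per-element gain — something this
// flat, mutually exclusive model cannot express.
```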
… Use WebCodecs and AudioNodes to mix and process the result? A limitation here is that WebCodecs output is in the clear, and AudioNode input is in the clear
… So it would only work for non-protected content. Also, there's no object audio support. If we want to support spatial and object audio, moving it in space, that would have to be built in; it's not available today.
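[Scribe aside: the cleartext limitation above can be illustrated with a pure-JS mock of the WebCodecs-plus-AudioNode path. `mockDecode` and `applyGain` are hypothetical stand-ins, not real WebCodecs or Web Audio calls:]

```javascript
// Sketch of the gap-analysis point: a WebCodecs-style pipeline hands the
// application decoded PCM in the clear, and app-side mixing sees every sample.
function mockDecode(encodedBytes) {
  // Stand-in for AudioDecoder output: cleartext PCM the app can inspect.
  return Float32Array.from(encodedBytes, b => b / 128 - 1);
}

function applyGain(pcm, gain) {
  // What a GainNode would do — but because the app touches raw samples,
  // this path cannot work for protected content.
  return pcm.map(s => s * gain);
}

const pcm = mockDecode([128, 192, 64]); // "decoded" samples: 0, 0.5, -0.5
const louder = applyGain(pcm, 2);       // dialog boosted by the application
```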
… If the pipeline is built by the application developer, the content creator has no control of the end result. This is important to creatives to ensure their product is presented
… JS and WASM implementations have a performance concern. It might work on a PC, but not on a TV set. Battery life is related to performance limitations. Also this doesn't work with content protection
… Finally, the privacy considerations. It's hard to compartmentalise these. Everything comes down to fingerprinting mechanisms. Not all of these are really germane to this API; they apply to any new API. For example, if the API is supported on one platform but not others, that's fingerprinting surface. It might also provide information about the media the user consumes
… That might happen, through same-origin information leakage. If you open a media stream and query what's available, or what the default is, it might leak information about the user's default
… If the user sets preferences and those are shared between sessions, another session might query those set in the first session.
… This might happen implicitly, e.g., a smart implementation might pre-filter the personalisation options available against some preferences. So if you are able to get at the list of pre-filtered choices, it reveals something about what was filtered out
… These are considerations for implementers
… Data persistence, as preselection choices might persist beyond one session
… Any questions?
(none)
… Can we do the CfC?
Paul: Might need some wider exposure on the mailing list. So stakeholders can read through it
<RobSmith> Is there a relevant audio CG from which we could seek feedback?
In essence, an Interest Group cannot publish a specification; we could go through the same process as a Working Group
The mental model: if a browser implementer looks at it, the document should expose enough information
Chris: Is the document comprehensive, in terms of which codecs it includes? We've had a liaison in Media WG on 3GPP IVAS
In terms of the set of codecs it considers, the next generation audio codecs are described in the Group Note, like 3GPP IVAS
The API approach would work across all of those codecs
Chris: So we could reach out to those groups
Bernd: Could we make a list? We could put a first version out there?
RobSmith: A suggestion, would putting together a demo be a good idea, to show how it works? It invites others to review how it fits their own model
Wolfgang: We've made demos, e.g., one at a previous TPAC
Bernd: There are standards around the world using NGA with a certain toolset. We could do another demo, but I feel it doesn't make progress on the formal status of this Note
Wolfgang: We'll need support of browser vendors, so that will need demos. I don't think demos progress the Group Note
<tidoust> Support for content protection in WebCodecs
Paul: Worth reading the WebCodecs issue on protection in WebCodecs
Rob: WebVMT was published as a Note 3 years ago. The process we used was to set a 6 or 8 week review deadline and advertise it. I got some good feedback
cpn: I think the next step is getting stakeholder feedback
… and then do the formal publication later on.
wolfgang: I'll send this around and set a deadline for review. 4-6 weeks.
… It's understood that we need to have more discussions with implementers.
Kaz: As Rob mentioned, there's a chicken-and-egg question. Asking the Audio WG makes sense. However, the requirements in Section 4 are broader than Web Audio. So I suggest we also ask other W3C groups, like WoT and Voice Interaction, for comments. The Voice Interaction group organised a workshop on smart voice agents, and discussion included time synchronisation among multiple data streams.
… Talking with them would make sense
Next meeting
Chris: It's scheduled for 5 May, but I'll be away
[adjourned]