WebML WG Teleconference – 13 January 2022

Meeting minutes

Anssi: welcome to 2022!
… exciting year for this WG with lots of milestones ahead of us
… the 2021 work getting into the hands of developers

WebNN API feature and change requests

Integration with real-time video processing

Integration with real-time video processing #226

Anssi: this emerged from the Web real-time communications WG
… the request is to develop a prototype that integrates the new mediacapture transform API in a worker context with the WebNN API
… the most interesting use case driving this for WebRTC is background blur, a familiar feature in teleconferencing apps
… in this issue, we have already had a discussion with two of the WebRTC WG chairs providing their input
… WebGPU co-chair (Corentin) also provided input

Noise suppression (RNNoise) based on mediacapture-transform API

Semantic segmentation based on mediacapture-transform API (preview)

Semantic segmentation based on mediacapture-transform API (source)

Anssi: two initial prototypes were submitted, developed by Bin Miao and Wanming Lin - my thanks to them on behalf of the WG

anssik: both the prototypes currently do processing in the main thread

anssik: it seems that the WG should initially focus on developing further the semantic segmentation prototype as the place to evaluate performance risks
… proposed next steps as I understand them:
… - we should migrate expensive processing to worker context
… - build a full GPU-only pipeline to minimize GPU to CPU copies
… - document any API gaps that emerge
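
As a minimal sketch of the first proposed step, assuming frames reach the worker through the Streams API: the helper below builds a transformer object that could be passed to `new TransformStream()`. The names `FrameLike`, `FrameProcessor`, `EnqueueController` and `makeFrameTransformer` are hypothetical and do not come from the prototypes; only `close()` on a frame and `enqueue()` on the controller are assumed, mirroring WebCodecs `VideoFrame` and the Streams transform controller.

```typescript
// Hypothetical minimal frame shape. A real WebCodecs VideoFrame also has
// close(), which releases the media resource it holds.
interface FrameLike {
  close(): void;
}

// Hypothetical per-frame step (e.g. segmentation followed by background blur).
type FrameProcessor<F extends FrameLike> = (frame: F) => Promise<F>;

// Minimal view of the controller handed to a Streams API transformer.
interface EnqueueController<F> {
  enqueue(chunk: F): void;
}

// Builds an object usable as the transformer argument of `new TransformStream()`.
function makeFrameTransformer<F extends FrameLike>(process: FrameProcessor<F>) {
  return {
    async transform(frame: F, controller: EnqueueController<F>): Promise<void> {
      let output: F;
      try {
        output = await process(frame);
      } catch (err) {
        frame.close(); // do not leak the input frame if processing fails
        throw err;
      }
      // Release the input unless the processor returned it unchanged.
      if (output !== frame) {
        frame.close();
      }
      controller.enqueue(output);
    },
  };
}
```

In a worker, such a transformer might be wired as `readable.pipeThrough(new TransformStream(makeFrameTransformer(segment))).pipeTo(writable)`, where `readable` and `writable` would come from the mediacapture-transform track processor and generator, and `segment` is a placeholder for the actual inference step.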

dom: thanks Anssi for the summary
… processing in a worker is already covered, so I think there are no gaps there
… the real gaps are in the GPU-only pipeline approach, the gap seems to be in WebNN, interaction with WebGPU buffers
… specific questions on how to go back and forth between VideoFrame and GPUBuffer abstractions
… then, also stream-based processing
… in terms of specific next steps, would be great to get confirmation whether we can stay in the GPU-only pipeline
… the story for WebGPU and WebGL integration seems an important takeaway from this experimentation

chai: good scenario to stress-test WebNN-on-GPU
… already identified issues with CPU/GPU transitions, which come with a high cost
… as far as I know, the applications out there that are already doing background segmentation do it with GPU

Investigation: how WebNN / WebGPU interop could be happening #2500

ningxin: the issue I raised summarizes the current situation and our motivation for WebGPU integration
… two use cases we want to support: custom ops in WebGPU shaders
… and integration with real-time video processing

Define importing VideoFrame into WebGPU #412

ningxin: we can expect VideoFrame to live in the GPU, and may be importable as a GPUTexture

Where do we spec WebGPU-WebCodecs interaction? #2498

ningxin: this would allow using VideoFrame as input to compute
… as Corentin mentioned, more coordination is needed between the WebGPU and WebML WGs to deliver these capabilities
… a good opportunity for the two groups to make progress, which I hope this issue will help with

anssi: similar questions are being explored at the intersection of WebGPU and WebCodecs
… ningxin, you'll talk with WebGPU folks and report back to the group?

ningxin: the issue is the first step towards that
… I'm interested in input in how to best coordinate work across the two groups

anssik: if the topic would benefit from sync coordination, we could invite a subset of the gpu groups to our meeting (or vice versa)

dom: talked with WebGPU WG staff contact Francois, his recommendation was to first have discussion on the GH issue
… it might be good to have a sync meeting with editors and chairs participating from WebML and WebGPU WGs

<ningxin_hu> no, i didn't

RafaelCintron: I have represented Microsoft in the WebGPU WG since the beginning - they have two meetings, one for the shading language and one for the API, on a bi-weekly cadence
… they're focused on shipping ASAP based on interest e.g. from game engines, with a timeline toward May
… I think they'd be open to speak with us
… it could end up that we re-use some of their data types
… or go through import/export

Anssi: so we need to be considerate of their time given their roadmap

RafaelCintron: but it's clearly a very compelling use case to have an ML-in-GPU pipeline

dom: prototyping work might get stuck due to API gaps, but prototyping work needs to continue

anssik: should we explore the worker context processing?

dom: that probably will not have a direct impact on performance, but will inform the discussion in the WebRTC WG

ningxin_hu: in terms of perf evaluation, the current prototype is running on top of the polyfill
… the TF backend runs on top of the WebGL backend
… to see the real performance benefit, we would need the WebNN Chromium implementation, under engineering review (incl security & privacy)
… will also need progress on the WebGPU/WebNN integration

RafaelCintron: having working code will also help convincing people of the value of the work

anssik: wanming would likely be interested in continuing work on this

chai: we're interested in helping with the prototype as well, it's a very exciting topic

anssi: let's continue discussion in the issue for the most impactful next steps
… it can also help down the line with perf evaluation once the native implementation progresses

Should restrict the sync APIs to only exist in Workers?

Should restrict the sync APIs to only exist in Workers? #229

Anssi: in summary, increasingly popular WASM-based frameworks need sync APIs, but blocking the main thread should be avoided
… the proposed solution would be to restrict the sync build/compute APIs to the worker context

RafaelCintron: when the API is working in the mode of doing everything on the CPU, I agree that sync should be restricted to worker
… if you are working on a different device (GPU or NPU...), then the sync API is only queuing commands, not holding the main thread
… that being said, passing around many structured objects across threads is not great on the Web platform, as the babylon.js team has been reporting
… this would negatively impact a worker-only API

anssik: it's easier to add API surface than to remove it; so maybe we could start conservatively with a worker-only sync API
… and extend it to the main thread later, once we're more confident
… removing a sync API later might be tricky, as illustrated by XHR
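
The conservative option discussed here can be sketched from the point of view of a framework choosing which path to call. This is only an illustration under assumptions: `computeSync` and `computeAsync` are hypothetical method names, not taken from the spec, and worker detection relies on `importScripts` being a member of `WorkerGlobalScope` that window scopes do not define.

```typescript
// Anything that may or may not expose importScripts (a member of WorkerGlobalScope).
interface ScopeLike {
  importScripts?: unknown;
}

// Heuristic worker detection: window scopes do not define importScripts.
function isWorkerScope(scope: ScopeLike): boolean {
  return typeof scope.importScripts === "function";
}

// Hypothetical API surface: both method names are assumptions for illustration.
interface ComputeApiLike {
  computeSync?: unknown;
  computeAsync?: unknown;
}

type ComputePath = "sync" | "async" | "unsupported";

// Conservative policy: take the sync path only inside a worker,
// otherwise fall back to the async path when it exists.
function chooseComputePath(scope: ScopeLike, api: ComputeApiLike): ComputePath {
  if (isWorkerScope(scope) && typeof api.computeSync === "function") {
    return "sync";
  }
  if (typeof api.computeAsync === "function") {
    return "async";
  }
  return "unsupported";
}
```

Under this policy a WASM backend running on the main thread would never reach the sync path, which is the adoption concern that framework feedback would need to address.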

ningxin_hu: ML Frameworks are primary consumers of the API - we should incorporate their feedback
… some run their WASM backend in the main thread
… we should figure out the solution with them

anssik: most important frameworks with WASM-based backends?

ningxin: TF, ONNX
… Ping & Emma might help us

anssik: let's tag them in the issue, and invite them to a future call if needed

chai: it's somewhat related to the previous topic - performance is key
… the prototype on segmentation will be really important to drive progress on this
… high-frame-rate scenarios require as few CPU/GPU transitions as possible
… We also need to make sure that framework integration will work
… from the API standpoint, conceptually, if the interop between WebNN and WebGPU allows offloading data upload/download to the WebGPU spec, we can free up WebNN to be more flexible
… this would create an API that can be used in different contexts
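
The point about high frame rates and CPU/GPU transitions can be made concrete with simple arithmetic. The sketch below is an assumption-laden model, not a measurement: it treats each transition as a fixed cost added to the per-frame compute time, and all function names and any numbers passed in are illustrative placeholders.

```typescript
// Time available per frame, in milliseconds, at a given frame rate.
function frameBudgetMs(fps: number): number {
  if (!(fps > 0)) {
    throw new RangeError("fps must be positive");
  }
  return 1000 / fps;
}

// Simplified per-frame cost: sum of the stage times plus a fixed cost
// for every CPU/GPU transition (upload or readback).
function pipelineCostMs(
  stageMs: readonly number[],
  transitions: number,
  transitionCostMs: number,
): number {
  const compute = stageMs.reduce((sum, ms) => sum + ms, 0);
  return compute + transitions * transitionCostMs;
}

// True when the modelled pipeline fits within one frame interval.
function fitsFrameBudget(
  fps: number,
  stageMs: readonly number[],
  transitions: number,
  transitionCostMs: number,
): boolean {
  return pipelineCostMs(stageMs, transitions, transitionCostMs) <= frameBudgetMs(fps);
}
```

With placeholder stage times of 5 ms and 10 ms at 50 fps (a 20 ms budget), the model fits with no transitions but not with two transitions of 3 ms each, which is the kind of margin a GPU-only pipeline is meant to protect.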

dom: multiple aspects, CPU vs GPU, WebNN could work with CPU backend with sync API
… not sure if we can make that distinction from WebIDL perspective
… the way we describe the ops is not specific enough
… e.g. compute(), no way to interpret it as non-blocking for main thread
… waits for compute to return
… even on a GPU basis
… getting input from frameworks what they need is important
… there will be improvements the spec will need to spell out, what parts of the API need to be sync and which can be async
… even if we have a sync API in the worker, we'd like to have an async API on the main thread, which requires significant spec changes

dom: understanding the needs required to get the API into stable state

RafaelCintron: the spec is not final yet - I'm hopeful we can improve it iteratively
… e.g. a non-CPU backend would return a different object that allows sync operations
… WebGPU has the same issue, and creates two pipelines (sync & async)
… we might have room for solving the issue differently
… the problem with workers is that a lot of basic stuff doesn't work in a worker (e.g. the DOM)
… we could start with everything async; but with a heavily pipelined application, this will add a lot of latency in practice

anssik: so I hear let's check with framework folks on their perspectives before coming back to this issue

Should WebNN support async APIs?

Should WebNN support async APIs? #230

anssik: having the API available in main thread helps with adoption

ningxin: most of this issue has been discussed in the context of the sync API and intersections with GPU
… I've pinged Ping on this thread
… TF.js has a WebGPU backend with async APIs

dom: need to understand the requirements before making progress with this issue

WebNN API open pull requests

Open PRs

<chai> I need to work on my PR backlog ^^

WebNN API Candidate Recommendation

anssi: we're planning to reach Candidate Recommendation (CR) this year - CR is an important milestone:
… it means the spec is feature complete and can be implemented as is

Candidate Recommendation readiness tracker #240

anssi: I've created a CR tracker to allow us to move toward this milestone in a transparent and coordinated fashion
… it's managed as a github issue with checkboxes
… I've summarized the 3 top level key CR requirements - as set by the W3C Process Document
… the first requirement is to meet the WG's requirements, including the ones set in the WG charter
… we need to demonstrate "adequate" implementation experience
… there is room for interpretation there - formally the W3C Director will assess that
… it doesn't mean the API needs to ship to reach CR
… we need to show that wide review has been received

Wide review tracker #239

anssi: which I'm tracking in a separate tracker
… we already completed TAG review, we started privacy review
… this includes getting signals from other stakeholders, e.g. web developers
… or the recommendation from the workshop
… or signals from implementors

CR blocker issues (WIP)

dom: good summary, we may need to discuss whether we have already satisfied all the use cases, and we should perhaps also note the ethical considerations work that is ongoing
… the first people we need to convince is ourselves, what we want to demonstrate as "job well done"
… based on what I heard today about getting the WebRTC and GPU story covered, maybe we want to turn them into use cases

– DRAFT –

Attendees