WebML WG Teleconference – 5 May 2022

Meeting minutes

Context-based graph execution methods for different threading models

anssik: good news - the substantial pull request to address our discussions on threading models has been merged
… huge thanks to Chai, Ningxin, Rafael, and everyone involved in landing this, surely one of our more complex PRs

Should WebNN support async APIs?

anssi: can we now close #230?

ningxin_hu: #257 introduces the async compute API - well done!

<ghurlbot> Pull Request 257 [closed] Context-based graph execution methods for different threading models. (wchao1115)

ningxin_hu: there is a remaining discussion on graph compilation #263

<ghurlbot> Issue 263 Support asynchronous graph compilation (wchao1115)

Support asynchronous graph compilation

anssik: Jiewei Qian was asking:
… "Should we change build() to return Promise? Having the API default to async gives us lots of flexibility in the future."
… "Also, it's easy to write async calls in synchronous style with JS async-await. Converting sync / blocking calls to async / non-blocking calls is much harder."
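A minimal sketch of the point in Jiewei's comment, assuming a hypothetical Promise-returning `buildAsync()` helper (a stand-in defined here, not the WebNN API): with async/await, the caller still reads top to bottom like synchronous code.

```javascript
// Hypothetical stand-in for a Promise-returning build(); not the WebNN API.
// The timer simulates compilation finishing later without blocking the caller.
function buildAsync(outputs) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ outputNames: Object.keys(outputs) }), 0);
  });
}

// Reads like sequential code, but yields to the event loop at the await.
async function buildAndDescribe() {
  const graph = await buildAsync({ C: 'operand placeholder' });
  return `graph with outputs: ${graph.outputNames.join(', ')}`;
}

const pending = buildAndDescribe(); // a Promise; the caller is not blocked
pending.then((description) => console.log(description));
```

Going the other way, making an async call hand back its result synchronously, has no comparable language-level shortcut, which is the asymmetry the comment points at.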

MLGraphBuilder.build()

ningxin_hu: the build method implies non-trivial work for graph compilation
… this could block the main thread if the method can be called synchronously there
… thus #257 restricts the sync build method to workers
… the same way sync compute is limited to workers
… Jiewei's comment is about changing the build method to an async one instead of having both sync & async
… my feedback is that while that's possible, sync methods are needed when transpiling C++ code to WASM
… these existing codebases typically expect sync results
… and they would be hard to change to async paradigms
… Jiewei points to existing Emscripten tools like Asyncify to deal with this - we investigated this, but using it hurts performance
… at least that was the case for compute, which can be called at high frequency
… that's why I think we need both sync & async for the build method as well
… this also aligns with the pattern used e.g. in WebGPU
… I'll bring this in the issue discussion as well
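The split described above could look roughly like this. The sketch uses a hypothetical `SketchBuilder` class with `build()` and `buildSync()` methods and an explicit `inWorker` flag; these names and the flag are assumptions for illustration, not the API shape agreed in #257 or #263.

```javascript
// Hypothetical illustration only; names and shapes are not taken from the WebNN spec.
class SketchBuilder {
  constructor({ inWorker }) {
    // A real user agent would know its own execution context; here it is a flag.
    this.inWorker = inWorker;
  }

  // Stand-in for graph compilation and weight upload.
  compile(outputs) {
    return { outputNames: Object.keys(outputs) };
  }

  // Sync variant: only allowed where blocking is acceptable (a worker).
  buildSync(outputs) {
    if (!this.inWorker) {
      throw new Error('buildSync() is only available in a worker');
    }
    return this.compile(outputs);
  }

  // Async variant: usable anywhere; the result arrives through a Promise.
  // This sketch compiles inline; a real implementation would do the work
  // off the calling thread.
  async build(outputs) {
    return this.compile(outputs);
  }
}

const mainBuilder = new SketchBuilder({ inWorker: false });
const workerBuilder = new SketchBuilder({ inWorker: true });

let mainThreadError = null;
try {
  mainBuilder.buildSync({ C: 'operand placeholder' });
} catch (e) {
  mainThreadError = e.message; // sync build is rejected outside a worker
}
const workerGraph = workerBuilder.buildSync({ C: 'operand placeholder' });
const mainThreadPromise = mainBuilder.build({ C: 'operand placeholder' });
```

Code ported from C++ would call the sync variant from a worker and get a value back directly, while regular page script would await the async variant.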

RafaelCintron: is it possible for WebNN to validate the graph at build time?
… if so, I don't see an issue to making it sync
… I can see the reasoning for Model Loader
… In terms of the WebGPU example - the feedback we've gotten from game engine developers is that mapAsync wouldn't work for them when porting native code
… async poses problems and Asyncify doesn't address them

ningxin_hu: with regard to validating the graph, the build method has all the information to validate the graph
… that's what we have in the WebNN native implementation
… all the operators are in place
… WRT WebGPU discussions, this also reflects what we observed when running WASM ports of native code
… this is why the current version of build is sync
… The problem is that the build method includes graph compilation and weight uploading and initialization in the GPU (critical for performance)
… because that involves moving data, and implementations may need time to initialize weights, the method risks blocking the main thread
… so we would limit the sync method to workers, and provide an async equivalent for the main thread (typically to be used in regular JS code)

RafaelCintron: for the sync version of build, we wouldn't have to wait for the results if we're sure it can't fail
… we could return the model right away without waiting for the GPU to finish its work
… but that requires being sure it can't fail, e.g. for memory reasons

ningxin_hu: good point
… we probably need to investigate if native APIs can ensure that type of validation - I don't have an answer to that
… but I suspect this may vary across native implementations & drivers
… the build method returns an MLGraph on which compute runs; if that MLGraph is not yet optimized for executing the compute method, that may create an unexpected delay to that execution

dom: that's because in case of real-time processing, you don't want the first frame to be delayed, you want compute to run as efficiently as possible?

ningxin_hu: indeed - we should keep expectations of delay as clear as possible for developers

RafaelCintron: the delay needs to happen, whether at build or compute time

ningxin_hu: chai had mentioned the idea of pushing the compilation delay into the first compute call, but that breaks the expectation that compute runs as efficiently as possible whenever it is called
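The trade-off discussed here, paying the compilation cost at build time versus on the first compute call, can be sketched as follows. `makeGraph` and its `eager` option are hypothetical names used only for this illustration, and the event log stands in for real timing.

```javascript
// Hypothetical illustration: records when the one-time compilation cost is paid.
function makeGraph({ eager }) {
  const events = [];
  let compiled = false;
  const compile = (phase) => {
    compiled = true;
    events.push(`compile during ${phase}`);
  };

  // Eager: pay the cost before the graph is handed to the caller.
  if (eager) compile('build');

  return {
    events,
    compute(x) {
      // Deferred: the first compute call absorbs the compilation cost.
      if (!compiled) compile('first compute');
      events.push('compute');
      return x * 2;
    },
  };
}

const eagerGraph = makeGraph({ eager: true });
eagerGraph.compute(1);
eagerGraph.compute(2);

const deferredGraph = makeGraph({ eager: false });
deferredGraph.compute(1);
deferredGraph.compute(2);

console.log(eagerGraph.events);
console.log(deferredGraph.events);
```

In both cases the cost appears exactly once; the variants differ only in which call it is attributed to, which is what matters when the first frame of a real-time pipeline must not be delayed.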

Should MLCommandBuffer be MLExternalCommandBuffer?

anssik: #264

<ghurlbot> Issue 264 Should MLCommandBuffer be MLExternalCommandBuffer? (bbernhar)

anssik: Bryan commented:
… "If the intent of WebNN is to produce an immutable MLCommandBuffer that can be read-only by WebGPU, then I would suggest we consider 1) renaming it to MLExternalCommandBuffer and 2) avoid overloading WebGPU Interop with requirements WebGPU does not follow: a command buffer with GPU commands being equal to a command buffer with non-GPU commands - by moving MLExternalCommandBuffer into a WebNN Interop section."
… "Alternatively, we could keep MLCommandBuffer (internal usage) but allow WebNN a means to submit (ex computeAsync). This would avoid breaking WebGPU and allow WebNN to have consistent GPU async support experience on both native (standalone) and web."

RafaelCintron: would be fine with the current name; doesn't think we need to label it as external

ningxin_hu: +1
… unless we get strong pushback from the WebGPU WG

Accessibility and Internationalization responses, ethics feedback

Accessibility Checklist

anssik: #261

<ghurlbot> Issue 261 Accessibility Checklist (anssiko), cr

anssik: my assessment is that only one section of the checklist applies: "If technology defines an API"

"If the API can be used for structured content, it provides features to represent all aspects of the content including hidden accessibility features."

"Application programming interfaces allow programmatic manipulation and interchange of content, and are being used to create a more imperative Web. While typically APIs exchange data rather than user-focused content, this data ultimately is exposed to the user in some way. Some of the content richness can disappear if the API does not support features like content alternatives, control association, etc. Technologies that define APIs should ensure the API is rich enough to exchange all relevant accessibility information."

anssik: proposed response:
… "WebNN API is not used for structured content (data organized and structured in a particular way on a webpage in HTML)."

dom: +1

anssik: another checkpoint is the following:

"If the API relies on user agents to generate a user interface, the specification provides guidance about accessibility requirements needed to enable full interaction with the API."

"Content manipulated by an API is generally generated into a user interface. Technologies should provide guidance to ensure that user agents or dynamic content applications expose the full set of accessibility information available in the API."

anssik: proposed response is simply:

"WebNN API does not rely on user agents to generate a user interface."

anssik: proposed summary:

"Accessibility Checklist items don't apply to the Web Neural Network API."

<Geun-Hyung> present+

anssik: please check whether you agree or disagree with my assessment - I plan to submit this for Accessibility review before our next call

Internationalization Checklist

anssik: #262

<ghurlbot> Issue 262 Internationalization Checklist (anssiko), cr

anssik: based on my assessment, only the following checklist item applies to the WebNN API:

"If the spec (or its implementation) contains any natural language text that will be read by a human (this includes error messages or other UI text, JSON strings, etc, etc),"

anssik: my proposed response is the following:
… "WebNN API contains DOMStrings that are developer-defined and meant purely to improve web developer ergonomics, and not surfaced to users:"

https://www.w3.org/TR/webnn/#typedefdef-mlnamedoperands

https://www.w3.org/TR/webnn/#dom-mlgraphbuilder-input

https://www.w3.org/TR/webnn/#typedefdef-mlnamedinputs

https://www.w3.org/TR/webnn/#typedefdef-mlnamedoutputs

anssik: I gave an example to illustrate usage:
… For example, a web developer can create an operand for a graph input and assign it a name 'A':
… const inputs = { 'A': bufferA, 'B': bufferB };
… And later refer to this input using the name 'A':
… console.log(inputs.A);
… proposed summary of the i18n checklist exercise:
… "The only consideration that applies is 'If the spec (or its implementation) contains any natural language text that will be read by a human (this includes error messages or other UI text, JSON strings, etc, etc)'."

dom: WebNN is a low-level API, which is why we do not tick many checklist boxes

Ethics workshop feedback

Review Ethical WebML workshop feedback

PR: Incorporate feedback from the Ethical ML workshops

anssik: this PR is based on the 2 Ethical ML workshops we ran last month
… anyone interested, please take a look at the PR

anssik: when do you think we should make a W3C Note publication?

dom: chair to propose when to do that

WebNN integration with WebRTC APIs and WebGPU interop

WebNN integration with WebRTC APIs

WebRTC WG April 2022 meeting slides

WebRTC WG April 2022 meeting minutes

anssik: ningxin shared the following next steps for WebNN/mediacapture-transform integration:
… 1. enable WebGPU backend
… 2. new APIs that allow importing frames as GPU textures, to see whether that improves efficiency
… 3. Improve VideoFrame GC PR: we will try it out when it is merged in Chrome.

ningxin_hu: the 1st point is about the WebGPU-only pipeline - the current sample which has two main tasks (segmentation & image blending) has two backends: WebGL (doing both) & WebGPU (blending) + WebNN (ML segmentation)

anssik: thanks - I think this was a good and appreciated joint discussion

dom: we raised the need for WebRTC WG to get clarity from Media WG for WebCodecs and WebGPU interaction

WebGPU interop

Investigation: how WebNN / WebGPU interop could be happening

<ningxin_hu> +1 to give a quick update

Double-precision baseline implementation of WebNN operations for testing

WebML WG Teleconference – 10 February 2022 minutes

PR #1 webnn-baseline initial implementation

anssik: you may recall webnn-baseline is a CR requirement https://github.com/webmachinelearning/webnn/issues/240

ningxin_hu: I propose to merge this PR to unblock development on ULP tolerance
… baseline implementation is a tool to help define ULP tolerance, that will be part of test cases
… Bruce opened a related issue

Define ULP (unit of least precision) tolerances
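
As a rough illustration of how a double-precision baseline can be used to check a float32 result against a ULP tolerance, the sketch below measures the distance between two values in representable float32 steps. The helper names and the tolerance of 1 are assumptions for this example; the actual tolerances are what the linked issue is meant to define.

```javascript
// Map a number (rounded to float32) to an integer that preserves float32 ordering,
// so that adjacent float32 values map to adjacent integers.
function float32ToOrderedInt(x) {
  const f32 = new Float32Array(1);
  const i32 = new Int32Array(f32.buffer);
  f32[0] = x; // rounds to the nearest float32
  const bits = i32[0];
  // Negative floats are stored as sign-magnitude; negate the magnitude
  // so the integer ordering is monotonic across zero.
  return bits < 0 ? -(bits & 0x7fffffff) : bits;
}

// Number of float32 steps between a and b (Infinity if either is NaN).
function ulpDistance(a, b) {
  if (Number.isNaN(a) || Number.isNaN(b)) return Infinity;
  return Math.abs(float32ToOrderedInt(a) - float32ToOrderedInt(b));
}

// Compare a float32 result against a double-precision baseline value.
function withinUlp(baseline, actual, toleranceUlp) {
  return ulpDistance(baseline, actual) <= toleranceUlp;
}

const baseline = Math.exp(1);          // double-precision reference value
const exact = Math.fround(baseline);   // correctly rounded float32 result
const oneStepOff = exact + 2 ** -22;   // next float32 above, since exact lies in [2, 4)

console.log(withinUlp(baseline, exact, 0));
console.log(withinUlp(baseline, oneStepOff, 1));
```

A test case would then pair each operation with its own tolerance, computing the expected value in double precision and the actual value on the implementation under test.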

anssik: any concerns with merging the initial PR?
… please provide your feedback within the next 7 days

<ningxin_hu> the Chromium CL to reduce GC: https://chromium-review.googlesource.com/c/chromium/src/+/3586505
