WebML WG Teleconference – 22 September 2022

Meeting minutes

ghurlbot, this is webmachinelearning/webnn

<ghurlbot> anssik, OK

WebNN API Candidate Recommendation open issues

anssik: Review and discuss the current CR issues, work out a plan to address the issues prior to the expected CR publication.

Support asynchronous context creation

#272

<ghurlbot> Issue 272 Support asynchronous context creation (huningxin) cr

anssik: Awaits W3C TAG recommendation on the design, no WG action required.

Delta review (to CR) of Web Neural Network API

anssik: I submitted a "delta" TAG review request per our resolution

anssik: and asked TAG for a recommendation on three topics:
… 1) Naming of the sync and async methods
… 2) Sync/async API split design
… 3) Decision to drop support for WebGL and focus on WebGPU interoperability

<chai> brb

anssik: currently this TAG review issue is "untriaged" in the TAG tracker.
… please check that the request I authored is accurate and let me know any corrections.

<chai> back

anssik: I asked TAG to deliver its recommendation by the end of Oct 2022 at the latest.

<ningxin_hu> it's accurate, thanks anssi

Web platform tests

Test plan

anssik: WebNN CR readiness depends on web-platform-tests test suite to demonstrate implementation experience

anssik: Bruce published a test plan, this is a call for review
… I noted Chai is close to producing an initial list of recommended ULP tolerance for the ops
… any other contributions, please chime in on the issue.

chai: we are pretty close to producing a recommendation to fill in the blanks in Bruce's ULP table
… some work remains in complex math-related ops
… ops such as conv and gemm require a ULP tolerance that depends on the complexity of the environment
… for those ops it's not a single number but depends on the test case, for example the size of the conv filter; the tolerance is a function of the complexity
… we hope to follow a principle that stands the test of time
… even after the initial work, there may be cases where we need to tweak the tolerances to accommodate future changes
… tweaking too many times can be a bit messy; we're working to have this out by the end of this week or early next week
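
The ULP-based comparison chai describes can be sketched as follows. This is a minimal Python illustration, not the WPT harness: the helper names are hypothetical, and the linear scaling rule in `conv_ulp_tolerance` is an assumption made for the example, not a value from Bruce's table or Chai's recommendation.

```python
import struct


def float32_to_ordered_int(x: float) -> int:
    """Map a float32 value to an integer whose ordering matches float ordering."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Floats with the sign bit set are mapped to negative integers so that
    # adjacent float32 values always differ by exactly 1.
    return -(bits & 0x7FFFFFFF) if bits & 0x80000000 else bits


def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b."""
    return abs(float32_to_ordered_int(a) - float32_to_ordered_int(b))


def conv_ulp_tolerance(filter_elements: int, per_element_ulp: int = 2) -> int:
    """Hypothetical rule: tolerance grows with the number of accumulated terms.

    Both the linear form and the default of 2 are assumptions for illustration.
    """
    return per_element_ulp * filter_elements


def within_tolerance(actual: float, expected: float, tolerance_ulp: int) -> bool:
    return ulp_distance(actual, expected) <= tolerance_ulp
```

With this shape, an element-wise op can use a fixed tolerance, while a 3x3 conv test case would call `conv_ulp_tolerance(9)` to derive its tolerance from the filter size.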

anssik: test suite maintenance is a positive thing

ningxin_hu: question regarding the ULP table, the comment mentions webnn-baseline to generate a reference
… I'm not sure that table captures the gap in webnn-baseline implementation
… there's a column for WPT and polyfill, but no webnn-baseline status in there
… should we capture that in the table?

bruce_dai: webnn-baseline for 1st wave ops have been implemented, my plan for WPT is to cover these 1st wave ops first

ningxin_hu: thanks for the update
… we should include webnn-baseline in the plan to help give an overview

bruce_dai: I will update the table with a column for webnn-baseline

chai: ningxin_hu's question is important, tolerance is relative to the baseline, in our case when we have the tolerance out, it is the delta to the ideal baseline
… in DML, we implement the ideal baseline in double precision FP
… the diff against that baseline gives us a fixed point
… once done with the tolerances, we're open to collaborating on the implementation of the baseline
… this is an important point, we should not use an arbitrary impl on some platform as the baseline
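
The idea of measuring tolerance against a double-precision baseline can be sketched as below. This is an illustrative Python example under stated assumptions: it uses a dot product as a stand-in for an op such as gemm, simulates float32 arithmetic by rounding after every step, and all function names are hypothetical. It is not the DML or webnn-baseline implementation.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ordered_bits(x: float) -> int:
    """Integer view of a float32 value that preserves ordering."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return -(bits & 0x7FFFFFFF) if bits & 0x80000000 else bits


def dot_baseline_f64(a, b) -> float:
    """Baseline: float32 inputs, all arithmetic carried out in double precision."""
    return sum(to_f32(x) * to_f32(y) for x, y in zip(a, b))


def dot_under_test_f32(a, b) -> float:
    """Implementation under test: simulated float32 multiply and accumulate."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_f32(to_f32(x) * to_f32(y))
        acc = to_f32(acc + product)
    return acc


def ulp_error_vs_baseline(a, b) -> int:
    """ULP distance between the float32 result and the rounded baseline."""
    expected = to_f32(dot_baseline_f64(a, b))
    actual = dot_under_test_f32(a, b)
    return abs(ordered_bits(actual) - ordered_bits(expected))
```

Because the baseline is computed at higher precision than any implementation under test, the measured error does not depend on the quirks of a particular platform.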

<ningxin_hu> +1

anssik: bruce_dai clear on the next steps?

bruce_dai: yes thanks

Add method steps and normative algorithms to operations

#210

<ghurlbot> Issue 210 Add method steps to operations (anssiko) cr

#211

<ghurlbot> Issue 211 Define algorithms for dictionaries with lists as default values (anssiko)

ningxin_hu: I agree atomic PRs are preferred, so they will not conflict with other PRs
… a good start could be internal slots and MLOperand interface and MLOperator interface
… those internal slots we can define in the algorithm steps that describe how graph building happens
… this proposal is based on implementation experience
… we implement GraphBuilder in the Chromium open source project and there we define internal slots; in the implementation those are class members
… we can write code to use these private members and do the wiring
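
The relationship between internal slots and class members can be sketched as follows. This is a hypothetical Python illustration, not the Chromium code or spec text: the class names, the underscore-prefixed members standing in for internal slots, and the validation step are all assumptions made for the example.

```python
class Operator:
    """Graph node; private members play the role of internal slots."""

    def __init__(self, kind, inputs):
        self._kind = kind            # e.g. "add"
        self._inputs = list(inputs)  # operands consumed by this operator
        self._outputs = []           # operands produced by this operator


class Operand:
    """Graph edge; records which builder and operator it belongs to."""

    def __init__(self, builder, name=None, producer=None):
        self._builder = builder    # builder that created this operand
        self._name = name          # set for graph inputs only
        self._producer = producer  # operator that outputs this operand, if any


class GraphBuilder:
    def input(self, name):
        return Operand(self, name=name)

    def add(self, a, b):
        # Algorithm step: validate that both operands come from this builder.
        for operand in (a, b):
            if operand._builder is not self:
                raise ValueError("operand was created by a different builder")
        # Algorithm steps: create the operator and wire inputs to the output.
        operator = Operator("add", [a, b])
        output = Operand(self, producer=operator)
        operator._outputs.append(output)
        return output
```

Calling `builder.add(builder.input("x"), builder.input("y"))` returns an operand whose `_producer` links back to its two inputs, which is the kind of wiring the algorithm steps would describe in prose.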

anssik: starting with internal slot definitions sounds good

ningxin_hu: another suggestion: we want to use algorithm steps to specify the operations, and today we have declarative prose for them; while we add algorithm steps in this transition, we can probably keep both for a while to help reviewers compare the declarative to the algorithmic prose
… declarative should be ultimately replaced with algorithmic steps

anssik: some specs mark declarative as informative and make algorithmic steps normative, we could consider the same

Support for int8 quantized models

#128

<ghurlbot> Issue 128 WebNN should support int8 quantized models (wchao1115) cr

anssik: I wanted to discuss the two design alternatives:
… 1) quantized ops supported by a new device type (e.g. NPU) OR
… 2) add support for all device types.
… Dom recommended last time: "it is good idea to evaluate whether our API can work with new device types such as NPUs, not sure if NPUs should be in CR release scope due to required implementation experience"
… this suggests this feature might be too early to commit for CR

chai: I think we should at least try this feature; it is true quantized ops are implementable on CPUs today, the NPU device type is a new one, and if we run out of time we can drop it

anssik: can you help with the evaluation?

chai: can help with quantized ops and NPU device type

Jonathan: I need to follow up with folks internally at Google
… many teams at Google, some are interested in this feature and some are opposed
… I think the WebNN API is great, in particular WebGPU team has not been very much involved and their position is less clear, I will dig more into this

WebGPU Working Group review request and WebNN-WebGPU interop

anssik: I wanted to discuss WG's response to the WebGPU review request, focus on WebNN-WebGPU interop requirements and issues.

Review request
… review deadline is 15 Oct, see our earlier discussion:

11 August 2022 discussion
… silence is considered consent
… we agreed that our response should focus on WebNN-WebGPU interop issues

<Jonathan> about NPUs: some folks at Google had previously said that NPU support was the most compelling reason to create a new API like WebNN, and that CPU or GPU might not be sufficient given WASM+SIMD and WebGPU

#264

<ghurlbot> Issue 264 CommandBuffer usage clarification: internal, external, both? (bbernhar)

gpuweb/issues/2500
… Rafael, anything to report from discussion with Bryan on #264? Resolving that issue would help Ningxin deliver an update in https://github.com/gpuweb/gpuweb/issues/2500

RafaelCintron: not much progress with Bryan; the WebGPU WG discussed this last time and they are heads down with v1
… they said, any feedback, let us know
… the crux is, to implement WebNN, there needs to be a deep interaction between WebNN and WebGPU code
… to have the API take GPUBuffers, put them into the graph and work beautifully, these systems will need to work nicely together
… when we gain implementation experience we may need to change things, unclear how NPUs will work with GPUs, can we do transfers etc.
… the last time this came up in WebGPU WG they postponed discussion post-V1, IIRC

anssik: when will WebGPU API ship in Chrome?

RafaelCintron: no ship date, but "ASAP", new features are deferred to v.next now

anssik: all of WebGPU WG in agreement with v1 feature scope?

RafaelCintron: it seems so, it's a matter of how much to polish
… there are some things to resolve, but pressure to ship, issues with interop with other APIs

ningxin_hu: WebGPU interop is post-v1, our spec has CommandBuffer interface for this feature, not sure how this would impact the CR target for this year?
… should we make a branch without CommandBuffer interface to get to CR?

RafaelCintron: we should get the CR scope implemented and see what issues emerge, those things should block CR

ningxin_hu: my concern is this interop does not have consensus of the WebGPU WG that owns the WebGPU API
… for WebNN itself, we have control and can decide which features to add, this feature (CommandBuffer) has an external dependency
… if interop feature has a dependency but not resolved, what does this mean?

anssik: how big an effort would it be to produce a WebNN-WebGPU interop implementation?

ningxin_hu: we have prototyped this for real-time video processing use case
… we need the WebNN and WebGPU implementations to work closely together, not a one-way dependency, but bi-directional interactions to ensure they work together
… I believe I shared that earlier
… related, WebGPU WG is also rechartering

WIP PR for WebGPU WG charter 2022-2024 changes
… we can influence this charter with our feedback e.g. we could be more explicit with our WebNN-WebGPU coordination expectations, currently reads

Web Machine Learning Working Group

This Working Group develops a dedicated low-level Web API for enabling efficient machine learning inference in the browser.

– DRAFT –

Attendees