WebML WG Teleconference – 3 November 2022

Meeting minutes

ghurlbot, this is webmachinelearning/webnn

<ghurlbot> anssik, OK

WebNN API Candidate Recommendation open issues

Current CR issues

Support asynchronous context creation, naming issues

anssik: for background refresh, Support asynchronous context creation discussed in #272

<ghurlbot> Issue 272 Support asynchronous context creation (huningxin) cr

anssik: we asked TAG for advice on naming convention to use:

Delta review (to CR) of Web Neural Network API

anssik: TAG considered this topic important enough for inclusion into its Web Platform Design Principles document that contains a set of design principles to be used when designing web platform technologies

Web Platform Design Principles

New principle: Patterns for x() vs xSync()

anssik: the principle is yet to land in that doc, but per TAG issue discussion the recommendation is:

"I think both naming schemes are useful in different cases, basically x() should be the common case and xSync()/xAsync() the exception. I.e. if the main usage is sync, then x() and xAsync() seems preferable. If the main usage is async, then x() and xSync() seems preferable."

anssik: what follows is that if we agree that the main usage of the following methods is async as follows:
… ML.createContext() returns Promise<MLContext>
… MLGraphBuilder.build() returns Promise<MLGraph>
… MLContext.compute() returns Promise<undefined>

anssik: then per this TAG recommendation we are guided to merge PR #274, which uses the Sync postfix, and close PR #285, which proposed the Async postfix

<ghurlbot> Pull Request 274 Support async context creation and use sync postfix (huningxin)

<ghurlbot> Pull Request 285 Introduce MLContext.createContextAsync (huningxin)

anssik: any comments?

zkis: one possibility would be to make some constructors, to avoid name clashing e.g. in createContext

anssik: thanks for that input

anssik: thanks to everyone who contributed to this discussion, which ultimately helped the Web Platform Design Principles add new guidance for x() vs xSync() patterns
… naming is never easy because there is no absolute truth as in maths, this is why having a doc such as Web Platform Design Principles is important
… we should probably inform the WebGPU WG of this Web Platform Design Principle to allow them to consider it in the naming of WebGPU API interfaces

proposed RESOLUTION: Per TAG recommendation adopt x() and xSync() naming pattern for createContext(), build() and compute() methods.

<zkis> +1

RESOLUTION: Per TAG recommendation adopt x() and xSync() naming pattern for createContext(), build() and compute() methods.
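
The naming pattern in this resolution can be sketched as follows. This is a minimal illustration only, assuming the async variant is the main usage: ToyContext, ToyML and toyML are hypothetical names invented here, and the shapes below are not the WebNN IDL.

```typescript
// Hypothetical shapes illustrating the x() / xSync() pattern.
// These are NOT the WebNN interfaces; all names are invented for this sketch.
interface ToyContext {
  readonly deviceType: string;
}

interface ToyML {
  // Unadorned name: the asynchronous variant, assumed to be the common case.
  createContext(deviceType?: string): Promise<ToyContext>;
  // Sync postfix: the synchronous variant, treated as the exception.
  createContextSync(deviceType?: string): ToyContext;
}

const toyML: ToyML = {
  createContextSync(deviceType: string = "cpu"): ToyContext {
    return { deviceType };
  },
  async createContext(deviceType: string = "cpu"): Promise<ToyContext> {
    // The async variant resolves to the same kind of object.
    return toyML.createContextSync(deviceType);
  },
};
```

A caller would then use `await toyML.createContext()` by default and reach for `toyML.createContextSync()` only where blocking is acceptable.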

Web platform tests

anssik: I want us to discuss & resolve any blockers for w-p-t & webnn-baseline reference impl.
… web-platform-tests tracker issue #265 is kept up to date by Bruce (thanks!)

<ghurlbot> Issue 265 WPT tests tracker (BruceDai) cr

webnn-baseline implementation plan

anssik: as a reminder, webnn-baseline is the pure JS double-precision baseline implementation of WebNN operations for testing purposes, without 3rd party deps
… let's look at the recently updated wpt PRs:

Add WebNN API operations tests

anssik: in #34287, following Dwayne's precision-metrics suggestions, the existing data movement ops float32 tests, which use ULP metrics, were updated, and tanh op float32 tests, which use ATOL metrics, and gemm op float32 tests, which use IEPOE metrics, were added; this PR is under review

<ghurlbot> Issue 34287 [not found]

anssik: Bruce anything blocking this PR?

Bruce: no blockers from my side, thanks for the review and support
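
Two of the precision metrics named above, ULP and ATOL, can be sketched as below. This is a rough illustration under stated assumptions, not the code in the wpt PR: the function names are hypothetical, NaN and infinities are not handled, and the IEPOE metric is not covered.

```typescript
// Hypothetical helpers for float32 comparisons; not taken from the wpt tests.

// Map a float32 value to an integer whose ordering matches the float ordering.
function toOrderedInt(x: number): number {
  const f = new Float32Array(1);
  const i = new Int32Array(f.buffer);
  f[0] = x; // rounds x to float32
  const bits = i[0];
  // A negative int32 means the sign bit is set: negate the magnitude bits.
  return bits < 0 ? -(bits & 0x7fffffff) : bits;
}

// ULP distance: number of representable float32 steps between a and b.
function ulpDistance(a: number, b: number): number {
  return Math.abs(toOrderedInt(a) - toOrderedInt(b));
}

// ULP metric: pass when actual is within maxUlp float32 steps of expected.
function withinUlp(actual: number, expected: number, maxUlp: number): boolean {
  return ulpDistance(actual, expected) <= maxUlp;
}

// ATOL metric: pass when the absolute difference is within a fixed tolerance.
function withinAtol(actual: number, expected: number, atol: number): boolean {
  return Math.abs(actual - expected) <= atol;
}
```

A ULP bound scales with the magnitude of the values, while an ATOL bound is a fixed absolute margin, so the suitable metric depends on the operation.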

Add others first-wave ops tests for WebNN API

anssik: in #36202 other float32 tests for remaining first-wave ops have updated test data (float64 inputs + float32 baseline) and precision metrics

<ghurlbot> Issue 36202 [not found]

<bruce_dai> https://github.com/web-platform-tests/wpt/pull/36202

Bruce: no blockers for this PR either

API review questions in prep for normative algorithm definition

#298

<ghurlbot> Issue 298 API review, questions, brainstorming (zolkis)

anssik: The aim of this discussion is to process review questions to clear the path for algorithm updates.
… first, thanks to Zoltan for looking at the spec with fresh eyes and providing a number of questions and proposals to the WG for consideration
… I talked with Zoltan about this review and expectations, and we agreed we want to err on the side of being conservative with normative changes so as not to cause the spec to change underneath the ongoing implementations in unexpected ways
… to help guide this review, we put priority on clarifying questions and API design aspects where the changes are motivated by user needs per Priority of Constituencies codified in the Web Platform Design Principles

Put user needs first (Priority of Constituencies)

anssik: this TAG mandated principle helps Web API authors focus efforts on areas where the positive impact on users is maximized:

"If a trade-off needs to be made, always put user needs above all."

"User needs come before the needs of web page authors, which come before the needs of user agent implementors, which come before the needs of specification writers, which come before theoretical purity."
… we expect the majority of the WebNN API usage to be driven by JS ML frameworks and abstractions built atop them
… when we talk about WebNN, the users we want to put first are the JS ML frameworks

zkis: first ask for patience, I will step on previous discussions as I explore
… I might not have found everything; based on what I found, I try to find algos for different ops and reference internal slots, factor out algos, and follow WebGPU conventions where it makes sense
… I think we could eventually make the spec simpler, it has moved quite a lot in the past
… first I needed to understand why context is introduced, I figured out the way it is used provides an abstraction for things needed to implement and use the API, I had several questions and Ningxin answered some of them
… we could simplify things we are not using in the spec now
… MLGraph we are using to host methods such as compute() moved to context, we are using context as if it was hosting everything we need for building
… encode dispatch in the GPU context and compute
… please correct me in the issue and provide past context if I missed some prior discussions and decisions
… will make it easier for the next pair of fresh eyes

anssik: top 3 user-first improvements?

zkis: it would be simpler to be more flexible with future, graph could be an internal slot completely, have a lifecycle, allows to add APIs later

<zkis> https://github.com/webmachinelearning/webnn/issues/298#issuecomment-1299990149

Chai: I have not had time to look at this yet

zkis: I'd like to get feedback on the latest IDL proposal https://github.com/webmachinelearning/webnn/issues/298#issuecomment-1299990149

Chai: I'll need some time to go through this issue discussion, it is pretty long
… agree with Zoltan, breaking up into many issues might be harder due to interdependencies
… the way the thinking is summarized in the issue is good, it just takes time to go through
… concrete cases of impl difficulty will make for a stronger case to change things
… in almost all cases, when we made a case it was because Ningxin implemented either the polyfill or webnn-native, or the TF.js team gave concrete feedback
… that experience makes it easier to decide where to go
… what you see today is a result of all of those conversations
… there's a lot of them! no one can combine all of those historical discussions from the past few years
… e.g. one immutable context, why not go there, those are outcomes and mirror the implementation we've built
… changes may have side-effects for implementation, it is good to have the past, I just want to caution the way it is today there's a reason for it, to change it we need a good reason to do so

zkis: I totally agree with that and know the journey from impl feedback to the spec and back
… I'm interested in how to expose, what and why, need to go through that periodically
… it tells me I should talk more often with Ningxin, we have these bi-weekly and should have additional sync points maybe to understand impl impacts
… Chai, if you read the comments I'd appreciate any guidance you may have; my thought process is laid out in the issue, it is not a proposal to be clear

Chai: my point is, a concrete example makes it easier to demonstrate the benefit than a philosophical point does

zkis: I agree on Priority of Constituencies
… I take any feedback on the issue, I want to understand it and make consistent algos that refer to the things that are connected to use cases

anssik: also informative text contributions are valuable

zkis: we can move offline with this and you can add comments to the issue

WebNN-WebGPU interop

anssik: Review WebGPU interop mechanism and its normative WebGPU dependencies to assess whether WebGPU interop is a feasible CR target or a v2 feature.
… consider implementation feedback from the WebNN DirectML backend.
… we should make sure the WebNN API interop mechanism with the WebGPU API is specified in adequate detail and check that the normative WebGPU API dependencies are defined such that they can be implemented in an interoperable fashion
… the standard way to test interop is with a cross-browser test suite that runs against one or more implementations
… if we cannot yet produce a good set of tests that exercise the WebGPU interop mechanisms, it would be logical to make this a v2 feature, otherwise CR->PR advancement would be blocked until after the interop can be demonstrated
… assuming WebGPU interop is testable, we have a few options:
… 1) Keep WebGPU interop features in spec as is and mark them as "at risk" for CR purposes, improve algorithms
… 2) Move WebGPU interop features into a branch and stabilize the main branch for CR. Con: tedious logistics-wise to sync branches.

RafaelCintron: as far as I know there hasn't been any impl experience on this aspect
… if there's need to impl this in different process vs. GPU process, we need more impl experience on this aspect of the spec

ningxin_hu: some experience from DirectML prototype, so far the impl focuses on WebNN standalone usage
… compute takes ArrayBuffers as input and output
… the WebGPU team wants to focus on their v1 release; for WebNN-WebGPU interop, WebGPU is required to expose some interfaces
… currently no bandwidth for that work
… WebNN-WebGPU interop prototype might be postponed until both parties have bandwidth to spend on that

ningxin_hu: one clarification, the issues are related to WebNN-WebGPU interop

anssik: so what we haven't implemented yet is MLCommandEncoder https://www.w3.org/TR/webnn/#mlcommandencoder

ningxin_hu: right
… device type GPU is there and prototyped

anssik: can someone implement the spec ignoring https://www.w3.org/TR/webnn/#mlcommandencoder

Chai: yes, I specified it as such

<ningxin_hu> that's true

RafaelCintron: I've understood Brian is working on some other things currently, not actively working on WebNN

RafaelCintron: we haven't gotten official feedback from WebGPU WG, unofficial feedback only

anssik: are there any aspects of the WebGPU API that are going to ship that WebNN would not be happy with?

RafaelCintron: not aware of any, but impl experience is crucial in validating that

ningxin_hu: we are working on the WebNN main flow functionality in Chromium, one big CL to prototype the WebNN main flow, GPU device type via DML integration
… this is for main flow usage, WebNN-WebGPU interop would be postponed for later and focus is to get Mojo interface design and DirectML design and WebNN mapping, security review, moving forward with smaller CLs landed
… depending on WebGPU folks' bandwidth we can start WebNN-WebGPU interop work
… I think if someone has bandwidth to prototype WebNN-WebGPU earlier, feel free to review the big CL we sent for review and start from there
… experiment how to impl this interop capability, but this requires changes to WebGPU implementation and good to have experience in that space

<ningxin_hu> that's fine

– DRAFT –

03 November 2022

Attendees