Meeting minutes
webnn-native
anssik: I asked Ningxin to present the proposed standalone native implementation of the WebNN API spec called webnn-native that complements the JS implementation webnn-polyfill. Both implementations help inform the API spec development, validate implementability atop native APIs (initial targets OpenVINO and DirectML, extensible with other backends), and provide a performance benchmark.
… Discuss the proposal to adopt webnn-native as a CG deliverable, similarly to webnn-polyfill. No CG Charter changes are needed; it is considered "Other Software" per the existing charter text.
CG Charter: Test Suites and Other Software
"The group MAY produce test suites to support the Specifications. Please see the GitHub LICENSE file for test suite contribution licensing information."
ningxin_hu: I would like to add a note about the problems we want to solve with webnn-native
… many discussions in this group are around native API implementability and compatibility, e.g. what ops can be mapped to native APIs, what params and options can be implemented
… also performance prediction, what perf we can get by offloading to native APIs
… we haven't had a good tool to get answers to these questions, this proposal is to help get those answers and inform the API spec development
ningxin_hu: while webnn-polyfill is based on TF.js, this will be based on native APIs
… header file can be mapped to WebIDL from the WebNN API spec
… represented as C/C++ interfaces
… explainer enumerated backend targets, DirectML, OpenVINO initially
… sample code can be written in C and C++
… we target a few important models first -- that's the basic idea
… also this project should help with WebML-Wasm coordination
… the plan is to have the code be compiled into Wasm
… proposal here is to create a repo under the CG's GH org
… this would be Apache 2.0 licensed
… we can have some MVP deliverables for the first wave, e.g. minimal LeNet sample running with two backends
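As a sketch of what Ningxin describes, the spec's WebIDL graph-building interface might map onto C/C++ roughly like this. The type and function names below are illustrative assumptions, not the actual webnn-native headers; the toy builder tracks only operand shapes to show the idea of an opaque operand handle flowing through a graph builder.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// WebIDL `MLOperand` mapped to an opaque handle in a C-style interface.
using WebnnOperand = uint32_t;

// Hypothetical stand-in for a native graph builder.
struct WebnnGraphBuilder {
  uint32_t next_id = 0;
  std::map<WebnnOperand, std::vector<int32_t>> shapes;

  // WebIDL-style: MLOperand input(DOMString name, MLOperandDescriptor desc);
  WebnnOperand Input(const std::string& /*name*/, std::vector<int32_t> shape) {
    WebnnOperand op = next_id++;
    shapes[op] = std::move(shape);
    return op;
  }

  // WebIDL-style: MLOperand add(MLOperand a, MLOperand b);
  WebnnOperand Add(WebnnOperand a, WebnnOperand b) {
    assert(shapes.at(a) == shapes.at(b));  // toy shape check only
    WebnnOperand op = next_id++;
    shapes[op] = shapes.at(a);
    return op;
  }
};
```

A backend (e.g. DirectML or OpenVINO) would then compile the recorded graph; because the interface is plain C/C++, the same code could also be compiled to Wasm, per the coordination plan above.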
zkis: Could a container be provided as well?
ningxin_hu: good question, we can discuss in GH
… primarily a source code release
zkis: asking because OpenVINO is already available as a container
zkis: another reason for containerization, Wasm also prefers that deployment model
anssik: the Chromium fork was hard to maintain; this proposal solves that issue I suppose. Is there a plan to integrate this with Chromium, e.g. as a 3rd-party dependency?
ningxin_hu: once matured, other projects could adopt this, but that's beyond the initial goal of the project
proposed RESOLUTION: Adopt webnn-native as "Other Software" deliverable, repo hosted at webmachinelearning/webnn-native
<chai> +1 its a good idea
<zkis> +1
<RafaelCintron> +1
Resolution: Adopt webnn-native as "Other Software" deliverable, repo hosted at webmachinelearning/webnn-native
<chai> i can help with the reviews
WebNN conv2d layout parameter TensorFlow incompatibility
anssik: Discuss and gather further feedback on TF, TF Lite, and TF.js preference for input and filter layout approach in WebNN API
TensorFlow conv2d expects channel_last filter layout regardless of input layout format #125
anssik: there seems to be a PR for this by Chai
Support a separate filter layout formats and the optional bias tensor. #130
Chai: it's more flexible to have a separate filter layout for conv2d; the API also lacks an optional 1D bias tensor added as the last step of the output
… appreciate review from folks
… this is not a breaking change, adds to the existing API
… would like to move the optional parameters into the struct now; if we need to do this in the future it'd be a breaking change
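The direction Chai describes for PR #130 can be sketched as an options struct with a filter layout independent of the input layout and an optional 1D bias; the enum values and field names below are hypothetical, not the spec's.

```cpp
#include <optional>
#include <vector>

enum class InputLayout { Nchw, Nhwc };
enum class FilterLayout { Oihw, Hwio, Ohwi, Ihwo };

// Hypothetical options struct: new optional parameters can be added
// here later without breaking existing callers.
struct Conv2dOptions {
  InputLayout input_layout = InputLayout::Nchw;
  FilterLayout filter_layout = FilterLayout::Oihw;  // independent of input layout
  std::optional<std::vector<float>> bias;  // 1-D, added as the last step of the output
};

// Toy helper that only reports the output channel count implied by the
// filter shape under the chosen filter layout.
int OutputChannels(const std::vector<int>& filter_shape, const Conv2dOptions& opt) {
  switch (opt.filter_layout) {
    case FilterLayout::Oihw: return filter_shape[0];
    case FilterLayout::Hwio: return filter_shape[3];
    case FilterLayout::Ohwi: return filter_shape[0];
    case FilterLayout::Ihwo: return filter_shape[3];
  }
  return -1;
}
```

This illustrates why the change is additive rather than breaking: callers that ignore the new fields get the defaults.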
<ping_yu> I will review the PR as well
Numerical precision in conformance testing
anssik: Discuss and solicit insights on difference between TF.js WebGL and CPU/Wasm backends
#32
anssik: Ningxin has done some work and will give us an update?
ningxin_hu: the PR adds one first-wave model, SqueezeNet, as a test case
… adapts the SqueezeNet model; two opens for the group:
… discussed in detail in the PR
<ningxin_hu> https://
ningxin_hu: the second open regards the TF.js backend; TF.js supports multiple backends (CPU, Wasm, WebGL), but their numerical precision differs
… the CPU backend passes the two models in TF Lite and ONNX formats
… the GPU backend has some issues that lead to failures for the two models
… which backend should we use for the numerical accuracy test?
<ping_yu> we have two settings, for webGL 2 epsilon set to e-3 and webGL 1 as e-1
ningxin_hu: my proposal is to keep that accuracy setting and use the CPU backend to test numerical accuracy
Ping: Ningxin is right in that for CPU we have better accuracy
… WebGL 1 has no float32, e.g. on iOS
… which leads to precision loss; for WebGL 2 epsilon is set to 1e-3 and for WebGL 1 to 1e-1
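A minimal sketch of the per-backend absolute tolerance Ping describes: the epsilon used when comparing expected and actual outputs depends on the backend's precision. The WebGL values come from the discussion above; the CPU value is an assumed placeholder.

```cpp
#include <cmath>

enum class Backend { Cpu, WebGl2, WebGl1 };

double EpsilonFor(Backend b) {
  switch (b) {
    case Backend::Cpu:    return 1e-5;  // assumed placeholder, not stated in the minutes
    case Backend::WebGl2: return 1e-3;  // per Ping
    case Backend::WebGl1: return 1e-1;  // per Ping (no float32, e.g. iOS)
  }
  return 0.0;
}

// Absolute-distance comparison with a backend-dependent tolerance.
bool CloseEnough(double expected, double actual, Backend b) {
  return std::fabs(expected - actual) <= EpsilonFor(b);
}
```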
Chai: for polyfill testing it might be OK, but for the spec itself the reason to have a conformance test is to ensure accuracy; there are so many frameworks using different criteria, so with multiple backends this becomes an issue
… the whole point of conformance is universality, that's the statement from the principles
… my second point, a data point: when we test, we test on two levels: 1) model level, 2) op level
ningxin_hu: I'd like to comment about the test against the polyfill
<chai> 1. two level conformance -- models and operators
ningxin_hu: the polyfill is used as a tool to run the code; depending on the implementation of the polyfill, e.g. with the CPU backend, the accuracy satisfies the conformance test criteria
<ping_yu> +1 Good to have requirement on both tolerance on op level and model level
<chai> 2. for standard conformance, having a tolerance defined over standard double precision value ensures longevity
Ping: I agree with Chai that we should have different tolerance levels for op and model
… the target is to have accurate output for the model
<chai> 3. try to define tolerance in term of ULP instead of absolute distances
Ping: these ops may each have a certain tolerance; if people cannot pass the conformance test, they might be in trouble
<chai> 4. have different tolerance for different operations. don't use a single tolerance across everything -- that will be a least common denominator
ningxin_hu: I propose to merge #32
<chai> <away from keyboard>
ningxin_hu: this PR uses different tolerance for two different models
… tried to reference the respective native settings from ONNX and TF
<chai> <back on keyboard>
Chai: PR #32 is fine to merge
<ping_yu> I will take a closer look
anssik: after Ping's review, can merge
NSNet2 sample and TF.js memory leak
anssik: Check TF.js upstream blocker status and discuss any other opens blocking PR
anssik: the memory leak issue seems to be solved now, thanks to Ping and the TF.js folks!
… anything else blocking this PR?
<ping_yu> yes, I will do that today
<ping_yu> sorry, I have to drop off now
anssik: Ping to look at PR#22, can merge once that's completed
Proposals for future work
anssik: Continue discussing proposals submitted for consideration for future work:
Operation-specific APIs #2 by Jonathan
anssik: Chai had some feedback in this issue
Chai: I was just asking for more information so we can better understand the proposal
… exactly what is the proposal, one way to clarify this is to define some samples or prototype functions
… for example, looking at the WebNN graph API, there the currency is an Operand
… that will eventually construct a Tensor
… prototype functions would help us better understand what is the currency flowing through these functions
… or are we talking of e.g. WebGL buffers?
Jonathan: agree pseudo code or proto example would make this clearer
… started a conversation internally, we just haven't done that yet
… I take an action item to deliver a pseudo code or prototype example