Meeting minutes
webnn-native
anssik: I asked Ningxin to present the proposed standalone native implementation of the WebNN API spec called webnn-native that complements the JS implementation webnn-polyfill. Both implementations help inform the API spec development, validate implementability atop native APIs (initial targets OpenVINO and DirectML, extensible with other backends), and provide a performance benchmark.
… Discuss the proposal to adopt webnn-native as a CG deliverable, similarly to webnn-polyfill. No CG Charter changes are needed; it is considered "Other Software" per the existing charter text.
CG Charter: Test Suites and Other Software
"The group MAY produce test suites to support the Specifications. Please see the GitHub LICENSE file for test suite contribution licensing information."
ningxin_hu: I would like to add a note about the problems we want to solve with webnn-native
… many discussions in this group are around native API implementability and compatibility, e.g. what ops can be mapped to native APIs, what params and options can be implemented
… also performance prediction, what perf we can get by offloading to native APIs
… we haven't had a good tool to get answers to these questions, this proposal is to help get those answers and inform the API spec development
ningxin_hu: while webnn-polyfill is based on TF.js, this will be based on native APIs
… header file can be mapped to WebIDL from the WebNN API spec
… represented as C/C++ interfaces
… explainer enumerated backend targets, DirectML, OpenVINO initially
… sample code can be written in C and C++
… we target a few important models first -- that's the basic idea
… also this project should help with WebML-Wasm coordination
… the plan is to have the code be compiled into Wasm
… proposal here is to create a repo under the CG's GH org
… this would be Apache 2.0 licensed
… we can have some MVP deliverables for the first wave, e.g. minimal LeNet sample running with two backends
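As a sketch of what Ningxin describes, the spec's WebIDL graph-building interface might map onto C/C++ roughly like this. The type and function names below are illustrative assumptions, not the actual webnn-native headers; the toy builder tracks only operand shapes to show the idea of an opaque operand handle flowing through a graph builder.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// WebIDL `MLOperand` mapped to an opaque handle in a C-style interface.
using WebnnOperand = uint32_t;

// Hypothetical stand-in for a native graph builder.
struct WebnnGraphBuilder {
  uint32_t next_id = 0;
  std::map<WebnnOperand, std::vector<int32_t>> shapes;

  // WebIDL-style: MLOperand input(DOMString name, MLOperandDescriptor desc);
  WebnnOperand Input(const std::string& /*name*/, std::vector<int32_t> shape) {
    WebnnOperand op = next_id++;
    shapes[op] = std::move(shape);
    return op;
  }

  // WebIDL-style: MLOperand add(MLOperand a, MLOperand b);
  WebnnOperand Add(WebnnOperand a, WebnnOperand b) {
    assert(shapes.at(a) == shapes.at(b));  // toy shape check only
    WebnnOperand op = next_id++;
    shapes[op] = shapes.at(a);
    return op;
  }
};
```

A backend (e.g. DirectML or OpenVINO) would then compile the recorded graph; because the interface is plain C/C++, the same code could also be compiled to Wasm, per the coordination plan above.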
zkis: Could a container be provided as well?
ningxin_hu: good question, we can discuss in GH
… primarily a source code release
zkis: asking because OpenVINO is already available as a container
zkis: another reason for containerization, Wasm also prefers that deployment model
anssik: the Chromium fork was hard to maintain; this proposal solves that issue I suppose. Is there a plan to integrate this with Chromium, e.g. as a 3rd-party dependency?
ningxin_hu: once matured, other projects could adopt this, but that's beyond the initial goal of the project
proposed RESOLUTION: Adopt webnn-native as "Other Software" deliverable, repo hosted at webmachinelearning/webnn-native
<chai> +1 its a good idea
<zkis> +1
<RafaelCintron> +1
Resolution: Adopt webnn-native as "Other Software" deliverable, repo hosted at webmachinelearning/webnn-native
<chai> i can help with the reviews
WebNN conv2d layout parameter TensorFlow incompatibility
anssik: Discuss and gather further feedback on TF, TF Lite, and TF.js preference for input and filter layout approach in WebNN API
TensorFlow conv2d expects channel_last filter layout regardless of input layout format #125
anssik: there seems to be a PR for this by Chai
Support a separate filter layout formats and the optional bias tensor. #130
Chai: it's more flexible to have a separate filter layout for conv2d; the API also lacks an optional 1D bias tensor added as the last step of the output
… appreciate review from folks
… this is not a breaking change, adds to the existing API
… would like to move the optional parameters into the struct now; if we need to do this in the future it'd be a breaking change
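The direction Chai describes for PR #130 can be sketched as an options struct with a filter layout independent of the input layout and an optional 1D bias; the enum values and field names below are hypothetical, not the spec's.

```cpp
#include <optional>
#include <vector>

enum class InputLayout { Nchw, Nhwc };
enum class FilterLayout { Oihw, Hwio, Ohwi, Ihwo };

// Hypothetical options struct: new optional parameters can be added
// here later without breaking existing callers.
struct Conv2dOptions {
  InputLayout input_layout = InputLayout::Nchw;
  FilterLayout filter_layout = FilterLayout::Oihw;  // independent of input layout
  std::optional<std::vector<float>> bias;  // 1-D, added as the last step of the output
};

// Toy helper that only reports the output channel count implied by the
// filter shape under the chosen filter layout.
int OutputChannels(const std::vector<int>& filter_shape, const Conv2dOptions& opt) {
  switch (opt.filter_layout) {
    case FilterLayout::Oihw: return filter_shape[0];
    case FilterLayout::Hwio: return filter_shape[3];
    case FilterLayout::Ohwi: return filter_shape[0];
    case FilterLayout::Ihwo: return filter_shape[3];
  }
  return -1;
}
```

This illustrates why the change is additive rather than breaking: callers that ignore the new fields get the defaults.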
<ping_yu> I will review the PR as well
Numerical precision in conformance testing
anssik: Discuss and solicit insights on difference between TF.js WebGL and CPU/Wasm backends
#32
anssik: Ningxin has done some work and will give us an update?
ningxin_hu: the PR adds one first-wave model, SqueezeNet, as a test case
… adapts the SqueezeNet model; two opens for the group:
… discussed in detail in the PR
<ningxin_hu> https://
ningxin_hu: the second open regards the TF.js backend; TF.js supports multiple backends (CPU, Wasm, WebGL), but their numerical precision differs
… the CPU backend passes the two models in TF Lite and ONNX formats
… the GPU backend has some issues that lead to failures for the two models
… which backend should we use for the numerical accuracy test?
<ping_yu> we have two settings, for webGL 2 epsilon set to e-3 and webGL 1 as e-1
ningxin_hu: my proposal is to keep that accuracy setting and use the CPU backend to test numerical accuracy
Ping: Ningxin is right in that for CPU we have better accuracy
… WebGL 1 has no float32, e.g. on iOS
… which leads to precision loss; for WebGL 2 epsilon is set to 1e-3 and for WebGL 1 to 1e-1
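A minimal sketch of the per-backend absolute tolerance Ping describes: the epsilon used when comparing expected and actual outputs depends on the backend's precision. The WebGL values come from the discussion above; the CPU value is an assumed placeholder.

```cpp
#include <cmath>

enum class Backend { Cpu, WebGl2, WebGl1 };

double EpsilonFor(Backend b) {
  switch (b) {
    case Backend::Cpu:    return 1e-5;  // assumed placeholder, not stated in the minutes
    case Backend::WebGl2: return 1e-3;  // per Ping
    case Backend::WebGl1: return 1e-1;  // per Ping (no float32, e.g. iOS)
  }
  return 0.0;
}

// Absolute-distance comparison with a backend-dependent tolerance.
bool CloseEnough(double expected, double actual, Backend b) {
  return std::fabs(expected - actual) <= EpsilonFor(b);
}
```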
Chai: for polyfill testing it might be OK, but for the spec itself the reason to have a conformance test is to ensure accuracy; there are so many frameworks using different criteria, so with multiple backends this becomes an issue
… the whole point of conformance is universality, that's the statement from the principles
… my second point, a data point: when we test, we test on two levels: 1) model level, 2) op level
ningxin_hu: I'd like to comment about the test against the polyfill
<chai> 1. two level conformance -- models and operators
ningxin_hu: the polyfill is used as a tool to run the code; depending on the implementation of the polyfill, e.g. with the CPU backend, the accuracy satisfies the conformance test criteria
<ping_yu> +1 Good to have requirement on both tolerance on op level and model level
<chai> 2. for standard conformance, having a tolerance defined over standard double precision value ensures longevity
Ping: I agree with Chai that we should have different tolerance levels for op and model
… the target is to have accurate output for the model
<chai> 3. try to define tolerance in term of ULP instead of absolute distances
Ping: these ops may each have a certain tolerance; if people cannot pass the conformance test, they might be in trouble
<chai> 4. have different tolerance for different operations. don't use a single tolerance across everything -- that will be a least common denominator
ningxin_hu: I propose to merge #32
<chai> <away from keyboard>
ningxin_hu: this PR uses different tolerance for two different models
… tried to reference the respective native settings from ONNX and TF
<chai> <back on keyboard>
Chai: PR #32 is fine to merge
<ping_yu> I will take a closer look
anssik: after Ping's review, can merge
NSNet2 sample and TF.js memory leak
anssik: Check TF.js upstream blocker status and discuss any other opens blocking PR
anssik: the memory leak issue seems to be solved now, thanks to Ping and the TF.js folks!
… anything else blocking this PR?
<ping_yu> yes, I will do that today
<ping_yu> sorry, I have to drop off now
anssik: Ping to look at PR#22, can merge once that's completed
Proposals for future work
anssik: Continue discussing proposals submitted for consideration for future work:
Operation-specific APIs #2 by Jonathan
anssik: Chai had some feedback in this issue
Chai: I was just asking for more information so we can better understand the proposal
… exactly what is the proposal, one way to clarify this is to define some samples or prototype functions
… for example, looking at the WebNN graph API, there the currency is an Operand
… that will eventually construct a Tensor
… prototype functions would help us better understand what is the currency flowing through these functions
… or are we talking of e.g. WebGL buffers?
Jonathan: agree pseudo code or proto example would make this clearer
… started a conversation internally, we just haven't done that yet
… I take an action item to deliver a pseudo code or prototype example