Define the set of operations and their specification
anssik: let's discuss the status of the ops compatibility study, led by Daniel and Nikhil
Nikhil: still in planning phase, will start looking at this early next week
anssik: on our last call we agreed to create an ONNX namespace for WebNN ops and to define a process for adding to that
gregwhitworth: we started a conversation around this with rama
rama: ideally the ONNX ops namespace should be updated to reflect any extensions; with differing op definitions there's a risk of branching
daniel_smilkov: want to understand ONNX namespaces; my understanding was a namespace is a subset of ops tied to a certain version
rama: namespaces are like in any other language, a way to organize ops into groups; they can be used to group ops for a certain purpose, but in terms of standardization, whenever there's an existing namespace we propose to use it
Nikhil: resolving the diff between TF Lite and ONNX Lite requires changing the ONNX ops spec
Rama: we can do that extension in whatever namespace it currently is in
gregwhitworth: setting that issue aside, you identified a set of operations; is there another reason to have an ONNX namespace?
daniel_smilkov: we want to grow the namespace slowly, so we need a mechanism to do that
gregwhitworth: ONNX namespaces probably are not the best way to address that issue
… example: in CSS we've had a similar issue with referencing the Unicode spec
… in the WebNN spec we reference the ops in ONNX directly, not the namespace
daniel_smilkov: that makes sense
daniel_smilkov: as long as we're able to change an operation as a result of this spec, and the change gets done in ONNX, we'd be fine
… when an op changes, how does ONNX deal with backwards compat?
rama: the specs have version numbers associated with them, currently at v10 or v11
… in terms of runtime support, the expectation is that the runtime supports the most recent ops; opsets are a collection of ops
Nikhil: another proposal: if we don't spec the format at all, it allows more portable code across browsers with different channel layouts
… how do we resolve backwards compat in a situation where something gets removed from an ONNX op?
Rama: layout is only the semantics of the op; you cannot get away without specifying the dimensions
daniel_smilkov: our question is more abstract: what if we need to make a backwards incompatible change?
… we need mechanisms that allow us to do that in the context of WebNN
Rama: an ONNX namespace would allow us to do that; another option is to use attributes
gregwhitworth: propose to create an issue and continue the discussion there
PROPOSED RESOLUTION: Investigate backwards compatibility of the initial set of ops
Resolved: Investigate backwards compatibility of the initial set of ops
Graph-building syntax simpler for web developers #16
anssik: no comments, we take silence as consent with the proposal
… suggest we merge this PR and continue iteration on top of it
gregwhitworth: did not have time to review the PR
daniel_smilkov: reviewed the PR, step in the right direction
[hearing no objections]
Resolved: Merge PR #22
anssik: agreed to identify topics related to custom ops (sharing memory etc.) we need to understand better
… lead Ningxin, contributors Rafael, James, Kai
… Ningxin prototyped a WebNN backend for ONNX.js and provided findings.
WebNN backend for ONNX.js findings
Initial investigation of the WebGPU and WebNN memory sharing
Ningxin_Hu: welcome WebGPU experts to review the latter
Ningxin: The investigation is based on the WebGPU backend of TF.js, the WebGPU Dawn project, and the WebNN POC. In particular, I only touched the Metal backend of Dawn and the MPS backend of the WebNN POC.
Ningxin_Hu: Compilation - For sub-graph compilation, WebNN may need to support WebGPUDevice as a compilation target (Compilation.setGPUDevice(WebGPUDevice device)), so the framework can make sure WebNN allocates and compiles the sub-graph on the same GPU device as its WebGPU backend
Kai: that makes sense to me
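The setGPUDevice() proposal above could look roughly like the following sketch. Note this is an illustration of the discussed API shape only: the class names, the stub device, and the finish() behavior are assumptions standing in for the real WebGPU/WebNN interfaces, not shipped APIs.

```javascript
// Stub standing in for a WebGPU device handle (hypothetical).
class GPUDeviceStub {
  constructor(label) { this.label = label; }
}

// Minimal Compilation stub illustrating the proposed setGPUDevice() hook.
class Compilation {
  constructor(model) {
    this.model = model;
    this.device = null; // default: implementation-chosen device
  }
  // Proposed extension: pin compilation to the framework's WebGPU device,
  // so WebNN allocates and compiles the sub-graph on that same GPU.
  setGPUDevice(device) {
    this.device = device;
  }
  finish() {
    // A real implementation would compile the graph for this.device here;
    // the stub just reports which device was targeted.
    return { compiledFor: this.device ? this.device.label : "default" };
  }
}

const device = new GPUDeviceStub("frameworks-webgpu-device");
const compilation = new Compilation({ op: "conv2d" });
compilation.setGPUDevice(device);
const result = compilation.finish();
```

With this hook, a framework whose custom kernels run on its own WebGPU device can guarantee the WebNN sub-graph lives on the same device, avoiding cross-device copies.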
Ningxin_Hu: Execution - The framework implements custom kernels as WebGPU compute shaders and uses WebGPUBuffer for data input and output. To allow the framework to interleave execution of WebGPU kernels and WebNN sub-graphs, WebNN may need to support WebGPUBuffer objects as inputs and outputs of execution (Execution.setInput(unsigned long index, WebGPUBuffer buffer) and Execution.setOutput(unsigned long index, WebGPUBuffer buffer)).
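The interleaving idea could be sketched as below. Everything here is a hypothetical stub (the buffer class, the Execution class, and its identity-copy startCompute()); the point is only to show a custom kernel's output buffer being handed to the WebNN sub-graph without a GPU-to-CPU readback in between.

```javascript
// Stub standing in for a WebGPUBuffer (hypothetical).
class GPUBufferStub {
  constructor(size) { this.data = new Float32Array(size); }
}

class Execution {
  constructor() { this.inputs = new Map(); this.outputs = new Map(); }
  // Proposed overloads taking a WebGPUBuffer instead of a CPU array,
  // avoiding a GPU->CPU->GPU round trip between kernels and sub-graph.
  setInput(index, buffer) { this.inputs.set(index, buffer); }
  setOutput(index, buffer) { this.outputs.set(index, buffer); }
  startCompute() {
    // Stand-in for running the compiled sub-graph: copy input to output.
    const inBuf = this.inputs.get(0);
    const outBuf = this.outputs.get(0);
    outBuf.data.set(inBuf.data);
  }
}

// Framework runs a custom WebGPU kernel, then hands its output buffer
// directly to WebNN as the sub-graph's input.
const kernelOut = new GPUBufferStub(4);
kernelOut.data.set([1, 2, 3, 4]); // pretend a compute shader wrote this
const subgraphOut = new GPUBufferStub(4);

const exec = new Execution();
exec.setInput(0, kernelOut);
exec.setOutput(0, subgraphOut);
exec.startCompute();
```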
James(?): there are some inefficiencies that can be optimized further
Kai: we should use formats we have in WebGPU and not require extensions; this will end up requiring platform-specific code, e.g. using buffers rather than textures, but that's tricky
… instead of using setInput and setOutput, you'd be able to tell the WebNN API "this is the data input format I understand and this is what I will give you", and allow overloading so that apps can have performant code on different platforms
James: would be unfortunate if we have platform-specific shaders
… we don't have a way to write generic shaders that work over either buffers or textures
Kai: agree, finding a solution would be nice
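Kai's format-declaration idea could be sketched roughly as follows. The format names and the negotiation method are purely illustrative assumptions; the sketch only shows the shape of the idea, where the app declares the formats it understands and the implementation picks a mutually supported one per platform.

```javascript
// Formats the (hypothetical) WebNN implementation supports on this platform.
const SUPPORTED_FORMATS = ["texture-2d", "nchw-buffer", "nhwc-buffer"];

class ExecutionStub {
  // The app declares the formats it can produce/consume, in preference
  // order; the stub picks the first mutually supported one. A real
  // implementation could instead pick the fastest path for the platform.
  negotiateInputFormat(appFormats) {
    const chosen = appFormats.find(f => SUPPORTED_FORMATS.includes(f));
    if (!chosen) throw new Error("no common data format");
    return chosen;
  }
}

const exec = new ExecutionStub();
const format = exec.negotiateInputFormat(["texture-2d", "nchw-buffer"]);
```

This is one way to avoid baking a single layout into the API while still letting apps write performant, portable code.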
Ningxin_Hu: there's also a case for CPU with SIMD; internal data layout differs based on CPU architecture
anssik: next steps?
Ningxin_Hu: would like to get comments in the issue, this is a critical design point for the API
… I can work on a POC, extend it to support WebGPU buffers
… this might take some time, feedback?
… with that we could find the perf overhead of each approach
[no objections]
anssik: Ningxin_Hu to proceed with the investigation
https://github.com/webmachinelearning/webnn/issues/6
anssik: Ningxin prototyped a WebNN backend for ONNX.js and provided findings.
WebNN backend for ONNX.js findings
anssik: 1. If the framework is able to share the graph info with a backend, it would help the integration of a graph-building API
Ningxin_Hu: need to carefully design tensor layout echoing Kai and James
Ningxin_Hu: 2. If the framework is able to execute ops by multiple backends, it would help the integration of WebNN and custom ops of framework
… 3. Performance wise, if the WebNN sub-graph is as big as possible, there would be a good speedup for whole graph execution
… 4. The tensor layout conversions would take significant overhead. It should be minimized as much as possible
anssik: WebML CG F2F meeting takes place at TPAC, W3C's annual all-groups meeting
… location Fukuoka, Japan, 17 & 20 Sep 2019
… (TPAC itself runs for the whole week 16-20 Sep)
WebML CG F2F agenda (PRs welcome)
Nikhil: I will attend and also participate in the Wasm meeting
Ningxin_Hu: I'll attend Wasm and WebRTC
anssik: chairing Devices and Sensors, Second Screen, WebML
Maybe present: anssik, gregwhitworth, James, James(?), Kai, Nikhil, Ningxin, rama