Define the set of operations and their specification
anssik: let's discuss the status of the ops compatibility study, led by Daniel and Nikhil
Nikhil: still in planning phase, will start looking at this early next week
anssik: on our last call we agreed to create an ONNX namespace for WebNN ops and to define a process for adding to that
gregwhitworth: we started a conversation around this with rama
rama: ideally the ONNX ops namespace should be updated to reflect any extensions; with differing op definitions there's a risk of branching
daniel_smilkov: want to understand ONNX namespaces; my understanding was a namespace is a subset of ops tied to a certain version
rama: namespaces are like in any other language, a way to organize ops into groups; they can be used to group ops for a certain purpose, but in terms of standardization, whenever there's an existing namespace we propose to use it
Nikhil: resolving the diff between TF Lite and ONNX Lite requires changing the ONNX ops spec
Rama: we can do that extension in whatever namespace it currently is in
gregwhitworth: setting that issue aside, you identified a set of operations; is there another reason to have an ONNX namespace?
daniel_smilkov: we want to grow the namespace slowly, so we need a mechanism to do that
gregwhitworth: ONNX namespaces probably are not the best way to address that issue
… example: in CSS we've had a similar issue with referencing the Unicode spec
… in the WebNN spec we reference the ops in ONNX directly, not the namespace
daniel_smilkov: that makes sense
daniel_smilkov: as long as we're able to change an operation as a result of this spec, and the change gets done in ONNX, we'd be fine
… when an op changes, how does ONNX deal with backwards compat?
rama: the specs have version numbers associated with them, currently at v10 or v11
… in terms of runtime support, the expectation is that the runtime supports the most recent ops; opsets are a collection of ops
Nikhil: another proposal: if we don't spec the format at all, it allows more portable code across browsers with different channel layouts
… how do we resolve backwards compat in a situation where something gets removed from an ONNX op?
Rama: layout is only the semantics of the op; you cannot get away without specifying the dimensions
daniel_smilkov: our question is more abstract: what if we need to make a backwards incompatible change?
… we need mechanisms that allow us to do that in the context of WebNN
Rama: an ONNX namespace would allow us to do that; another option is to use attributes
gregwhitworth: propose to create an issue and continue the discussion there
PROPOSED RESOLUTION: Investigate backwards compatibility of the initial set of ops
Resolved: Investigate backwards compatibility of the initial set of ops
Graph-building syntax simpler for web developers #16
anssik: no comments, we take silence as consent with the proposal
… suggest we merge this PR and continue iteration on top of it
gregwhitworth: did not have time to review the PR
daniel_smilkov: reviewed the PR, step in the right direction
[hearing no objections]
Resolved: Merge PR #22
anssik: agreed to identify topics related to custom ops (sharing memory etc.) we need to understand better
… lead Ningxin, contributors Rafael, James, Kai
… Ningxin prototyped a WebNN backend for ONNX.js and provided findings.
WebNN backend for ONNX.js findings
Initial investigation of the WebGPU and WebNN memory sharing
Ningxin_Hu: welcome WebGPU experts to review the latter
Ningxin: The investigation is based on the WebGPU backend of TF.js, the WebGPU Dawn project, and the WebNN POC. In particular, I only touched the Metal backend of Dawn and the MPS backend of the WebNN POC.
Ningxin_Hu: Compilation - For sub-graph compilation, WebNN may need to support WebGPUDevice as a compilation target (Compilation.setGPUDevice(WebGPUDevice device)), so the framework can make sure WebNN allocates and compiles the sub-graph on the same GPU device as its WebGPU backend
Kai: that makes sense to me
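The setGPUDevice() proposal above could look roughly like the following sketch. Note this is an illustration of the discussed API shape only: the class names, the stub device, and the finish() behavior are assumptions standing in for the real WebGPU/WebNN interfaces, not shipped APIs.

```javascript
// Stub standing in for a WebGPU device handle (hypothetical).
class GPUDeviceStub {
  constructor(label) { this.label = label; }
}

// Minimal Compilation stub illustrating the proposed setGPUDevice() hook.
class Compilation {
  constructor(model) {
    this.model = model;
    this.device = null; // default: implementation-chosen device
  }
  // Proposed extension: pin compilation to the framework's WebGPU device,
  // so WebNN allocates and compiles the sub-graph on that same GPU.
  setGPUDevice(device) {
    this.device = device;
  }
  finish() {
    // A real implementation would compile the graph for this.device here;
    // the stub just reports which device was targeted.
    return { compiledFor: this.device ? this.device.label : "default" };
  }
}

const device = new GPUDeviceStub("frameworks-webgpu-device");
const compilation = new Compilation({ op: "conv2d" });
compilation.setGPUDevice(device);
const result = compilation.finish();
```

With this hook, a framework whose custom kernels run on its own WebGPU device can guarantee the WebNN sub-graph lives on the same device, avoiding cross-device copies.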
Ningxin_Hu: Execution - The framework implements custom kernels as WebGPU compute shaders and uses WebGPUBuffer for data input and output. To allow the framework to interleave execution of WebGPU kernels and WebNN sub-graphs, WebNN may need to support WebGPUBuffer objects as inputs and outputs of execution (Execution.setInput(unsigned long index, WebGPUBuffer buffer) and Execution.setOutput(unsigned long index, WebGPUBuffer buffer)).
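The interleaving idea could be sketched as below. Everything here is a hypothetical stub (the buffer class, the Execution class, and its identity-copy startCompute()); the point is only to show a custom kernel's output buffer being handed to the WebNN sub-graph without a GPU-to-CPU readback in between.

```javascript
// Stub standing in for a WebGPUBuffer (hypothetical).
class GPUBufferStub {
  constructor(size) { this.data = new Float32Array(size); }
}

class Execution {
  constructor() { this.inputs = new Map(); this.outputs = new Map(); }
  // Proposed overloads taking a WebGPUBuffer instead of a CPU array,
  // avoiding a GPU->CPU->GPU round trip between kernels and sub-graph.
  setInput(index, buffer) { this.inputs.set(index, buffer); }
  setOutput(index, buffer) { this.outputs.set(index, buffer); }
  startCompute() {
    // Stand-in for running the compiled sub-graph: copy input to output.
    const inBuf = this.inputs.get(0);
    const outBuf = this.outputs.get(0);
    outBuf.data.set(inBuf.data);
  }
}

// Framework runs a custom WebGPU kernel, then hands its output buffer
// directly to WebNN as the sub-graph's input.
const kernelOut = new GPUBufferStub(4);
kernelOut.data.set([1, 2, 3, 4]); // pretend a compute shader wrote this
const subgraphOut = new GPUBufferStub(4);

const exec = new Execution();
exec.setInput(0, kernelOut);
exec.setOutput(0, subgraphOut);
exec.startCompute();
```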
James(?): there are some inefficiencies that can be optimized further
Kai: we should use formats we have in WebGPU and not require extensions; this will end up requiring platform-specific code, e.g. using buffers rather than textures, but that's tricky
… instead of using setInput and setOutput, you'd be able to tell the WebNN API "this is the data input format I understand and this is what I will give you", and allow overloading so that apps can have performant code on different platforms
James: would be unfortunate if we have platform-specific shaders
… we don't have a way to write generic shaders that work over either buffers or textures
Kai: agree, finding a solution would be nice
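Kai's format-declaration idea could be sketched roughly as follows. The format names and the negotiation method are purely illustrative assumptions; the sketch only shows the shape of the idea, where the app declares the formats it understands and the implementation picks a mutually supported one per platform.

```javascript
// Formats the (hypothetical) WebNN implementation supports on this platform.
const SUPPORTED_FORMATS = ["texture-2d", "nchw-buffer", "nhwc-buffer"];

class ExecutionStub {
  // The app declares the formats it can produce/consume, in preference
  // order; the stub picks the first mutually supported one. A real
  // implementation could instead pick the fastest path for the platform.
  negotiateInputFormat(appFormats) {
    const chosen = appFormats.find(f => SUPPORTED_FORMATS.includes(f));
    if (!chosen) throw new Error("no common data format");
    return chosen;
  }
}

const exec = new ExecutionStub();
const format = exec.negotiateInputFormat(["texture-2d", "nchw-buffer"]);
```

This is one way to avoid baking a single layout into the API while still letting apps write performant, portable code.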
Ningxin_Hu: there's also a case for CPU with SIMD; internal data layout differs based on CPU architecture
anssik: next steps?
Ningxin_Hu: would like to get comments in the issue, this is a critical design point for the API
… I can work on a POC, extend it to support WebGPU buffers
… this might take some time, feedback?
… with that we could find the perf overhead of each approach
[no objections]
anssik: Ningxin_Hu to proceed with the investigation
https://github.com/webmachinelearning/webnn/issues/6
anssik: Ningxin prototyped a WebNN backend for ONNX.js and provided findings.
WebNN backend for ONNX.js findings
anssik: 1. If the framework is able to share the graph info with a backend, it would help the integration of a graph-building API
Ningxin_Hu: need to carefully design tensor layout echoing Kai and James
Ningxin_Hu: 2. If the framework is able to execute ops by multiple backends, it would help the integration of WebNN and custom ops of framework
… 3. Performance wise, if the WebNN sub-graph is as big as possible, there would be a good speedup for whole graph execution
… 4. The tensor layout conversions would take significant overhead. It should be minimized as much as possible
anssik: WebML CG F2F meeting takes place at TPAC, W3C's annual all-groups meeting
… location Fukuoka, Japan, 17 & 20 Sep 2019
… (TPAC itself runs for the whole week 16-20 Sep)
WebML CG F2F agenda (PRs welcome)
Nikhil: I will attend and also participate in the Wasm meeting
Ningxin_Hu: I'll attend Wasm and WebRTC
anssik: chairing Devices and Sensors, Second Screen, WebML
Maybe present: anssik, gregwhitworth, James, James(?), Kai, Nikhil, Ningxin, rama