<anssik> anssik has changed the topic to: https://www.w3.org/wiki/TPAC/2018/SessionIdeas#Machine_Learning_for_the_Web
<Kangz> Hi, this is Corentin Wallez
anssik: Welcome to the Machine Learning for the Web breakout session at TPAC! I'll be giving a quick background on this idea.
… We proposed and created a Community Group a few weeks ago. Before that, we incubated the idea in the Web Platform Incubator Community Group (WICG).
… This helped set the scope of the work.
… Basically, the work is scoped to neural networks: a low-level API that can be implemented across platforms to do client-side inference.
… Why do we want to do that?
… Getting closer to the platforms improves performance. For object recognition and immersive computation, that's crucial.
… But there are also many computations that you need to run on the client.
… I would like to start with a demo.
anssik: The demo works in your Web browser right now because we're using a polyfill built on top of WebGL, and later WebGPU.
ningxinhu: [going through demo]
… Model gets downloaded first. In this case, cached through Service Worker.
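The cache-first pattern mentioned here can be sketched as follows. This is a self-contained toy, not the demo's actual Service Worker: the browser's asynchronous Cache Storage and `fetch()` are replaced by a plain `Map` and a synchronous stub so it runs anywhere, and all names (`modelCacheFirst`, etc.) are illustrative.

```javascript
// Toy sketch of the cache-first strategy a Service Worker might use for
// model files. The real Cache Storage and fetch() APIs are asynchronous;
// this toy keeps everything synchronous for clarity. All names are
// illustrative, not part of any proposed API.
function modelCacheFirst(url, cache, fetchFromNetwork) {
  if (cache.has(url)) {
    return { body: cache.get(url), fromCache: true }; // no network trip
  }
  const body = fetchFromNetwork(url); // first load: hit the network...
  cache.set(url, body);               // ...then cache for next time
  return { body, fromCache: false };
}

// Simulated network that counts real downloads.
let downloads = 0;
const network = (url) => { downloads += 1; return `weights-for-${url}`; };

const cache = new Map();
const first = modelCacheFirst('/model/weights.bin', cache, network);
const second = modelCacheFirst('/model/weights.bin', cache, network);
console.log(first.fromCache, second.fromCache, downloads); // false true 1
```

The second request is served from the cache, so a large model file is only downloaded once per client.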
… The demo uses Koko(sp?) that can recognize about 100 objects.
… The WebML prototype uses Chromium. Two polyfill implementations: WebAssembly and WebGL.
… WebAssembly is much slower, because WebAssembly lacks the SIMD features that would allow parallelizing the computation.
… WebGL2 is faster. WebML is the fastest.
anssik: The prototype is quite convincing in terms of performance gains with WebML.
… We have a F2F meeting of the CG this Friday afternoon.
… We'd like to hear about your use cases to start with.
… The primary user of the current API is authors of frameworks, e.g. TensorFlow.js
ningxinhu: The scope is that we want to do Edge AI computing.
… [showing presentation slides]
… Sentiment analysis, etc.
… JS Machine Learning frameworks such as TensorFlow.js, Keras.js, OpenCV.js provide the capabilities to do the inference on the client.
… The models you've trained can be reused in different frameworks.
… Some advantages to edge AI: latency, availability, privacy, and also cost, as processing is delegated to the client. Users contribute their own computation capabilities.
… Performance gains also show up in power consumption. Current JS-based solutions consume a lot of power, especially on mobile devices.
… We prepared some benchmarks to compare performance.
… I will not go into much detail about the data, but the takeaway is that the gap between a Web-based and a native solution is huge.
… Even comparing a WebGPU-based solution with a WebML approach, there's still a 4x gain.
… Looking at existing solutions today, CPU Deep Learning (DL) optimizations are not exposed to applications. That's crucial, in particular on mobile devices.
… Neither are dedicated DL hardware accelerators.
?1: There are use cases where deep learning is not the right approach. Why not try to support other kinds of algorithms?
… Use case around high-res images
ningxinhu: My understanding is that DL typically needs low-res images, because training models on high-res images would not work well.
anssik: We wanted to be practical with the scope: a minimal approach that enables the use cases quickly.
?1: There are use cases where you don't have all of the data to train the models, and Web developers will run into the issue.
ningxinhu: Web developers won't have much data to train models, that's your concern?
ningxinhu: Web developers will be able to reuse other trained models and re-train them with additional data.
… Here, our focus is how to tackle the performance issue for the Web. With that, the Web developer can leverage existing trained models.
anssik: It should be made clear that training is considered out of scope of this work.
ningxinhu: Our proposed Web Neural Network (WebNN) API is meant to leverage existing hardware infrastructure. The OS would ship its own Machine Learning API, abstracted away by WebNN.
… WebNN provides low-level interfaces to these APIs to support different models, e.g. ONNX models, TensorFlow models, etc.
… Positioning the API on the Web: a Usage API uses a built-in model and is easy to integrate. It maps to different platform primitives, such as NLP, Skills, MLKit.
… The Model-level API (WebML) would map to CoreML, WinML, TensorFlow Lite, etc.
… WebNN is the acceleration API, close to hardware optimization. Maps to BNNS, MPS, DirectML, Android NN.
… WebNN is our current focus.
anssik: Right, the low-level WebNN proposal is the goal of the incubation in the Community Group. We're hoping this will lead to the standardization of the other higher-level APIs in the future.
Alex: WebNN is a set of JS functions to manipulate models, is that correct?
ningxinhu: We can review the exact design afterwards.
anssik: You can refer to the following scope statement
ningxinhu: On top of this, WebML (not a good name) would provide interfaces for models, which would ease model reuse. But that API can be built as a polyfill on top of the WebNN API in the near future.
?2: [performance figures question]
ningxinhu: WebNN is within 10-20% of pure native performance.
<anssik> [POC = proof of concept]
ningxinhu: We want to use a proof of concept as a starting point and explore cross-platform capability.
… The polyfill API is modeled on the Android NN API.
… You need to manipulate models, with operations and operands. You also need a compilation stage to initialize the hardware.
… Last, the execution stage, where you process inputs according to the model and run inference to produce outputs.
… We're using a promise-based API, so when your promise is resolved, you have the answer.
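The three stages described above — model building from operations and operands, compilation, and promise-based execution — can be sketched with a toy, pure-JS graph runner. This is not the proposed WebNN or NN API surface (those interfaces were still being designed); every name here is illustrative, and the "operations" are plain functions standing in for real graph nodes.

```javascript
// Toy illustration of the three stages: build a model from operations,
// compile it, then execute it asynchronously with a promise-based API.
// NOT the proposed WebNN API; all names are illustrative.
class ToyModel {
  constructor() { this.ops = []; }
  addOperation(fn) { this.ops.push(fn); return this; } // e.g. an "ADD" node
}

class ToyCompilation {
  constructor(model) { this.model = model; } // a real backend would lower to hardware here
  execute(input) {
    // Execution stage: run the input through the graph, resolve with the output.
    return Promise.resolve(this.model.ops.reduce((acc, op) => op(acc), input));
  }
}

function compile(model) {
  return Promise.resolve(new ToyCompilation(model)); // compilation stage
}

// Build a tiny "graph": multiply each element by 2, then add 1.
const model = new ToyModel()
  .addOperation((xs) => xs.map((x) => x * 2))
  .addOperation((xs) => xs.map((x) => x + 1));

compile(model)
  .then((compiled) => compiled.execute([1, 2, 3]))
  .then((output) => console.log(output)); // [ 3, 5, 7 ]
```

When the execution promise resolves, you have the inference result, matching the flow described in the minutes.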
anssik: If people want to see more demos, they can haz them!
ningxinhu: We started with a polyfill with WebAssembly and WebGL backends. Compute shaders.
… and then developed a Chromium prototype, starting with the MPS/BNNS APIs for macOS and the NN API for Android; it's cross-platform and hardware-agnostic.
… WebNN can leverage CPUs, GPUs and accelerators.
Myles: Wondering about the numbers. What explains the 24x gap between the CoreML (CPU) and WebAssembly implementations?
ningxinhu: Some missing threading and SIMD features explain the difference. The gap should shrink once those features land.
Iank: The number is a bit unfair to WASM, because these features are coming.
Myles: Let's pretend these features are available. How small would that gap become?
ningxinhu: We'd probably go down to 4x.
sangwhan: The comparison should be with TensorFlow. Nested loops to do convolution would be hard to achieve.
Deepti: A lot of feedback about WASM SIMD is that we may not be covering the right set of instructions.
ningxinhu: Multiply-add.
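The SIMD discussion above can be illustrated with a toy dot product — the inner loop of convolution. The "SIMD" version below accumulates 4 lanes per step, mimicking the 128-bit vector multiply-add the speakers mention; real WASM SIMD would execute the 4 lanes in a single instruction rather than a JS loop, so this only shows the data layout, not the speedup.

```javascript
// Toy illustration of why SIMD multiply-add matters for convolution-style
// inner loops. dotSimd4 processes 4 lanes per step, mimicking a 128-bit
// vector multiply-add; real SIMD does those 4 lanes in one instruction.
function dotScalar(a, b) {
  let acc = 0;
  for (let i = 0; i < a.length; i++) acc += a[i] * b[i];
  return acc;
}

function dotSimd4(a, b) {
  const lanes = [0, 0, 0, 0];           // one accumulator per SIMD lane
  let i = 0;
  for (; i + 4 <= a.length; i += 4) {   // 4 multiply-adds per "instruction"
    for (let l = 0; l < 4; l++) lanes[l] += a[i + l] * b[i + l];
  }
  let acc = lanes[0] + lanes[1] + lanes[2] + lanes[3];
  for (; i < a.length; i++) acc += a[i] * b[i]; // scalar tail
  return acc;
}

const a = [1, 2, 3, 4, 5, 6];
const b = [6, 5, 4, 3, 2, 1];
console.log(dotScalar(a, b), dotSimd4(a, b)); // 56 56
```

Both versions produce the same result; the point is that the 4-wide accumulation maps directly onto hardware vector units that WASM could not yet expose at the time.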
sangwhan: Question about the serialization of models, saw that in scope in the draft charter.
[draft charter shown]
sangwhan: To make interoperability less painful, it would be good to include that in the scope.
… Worried about the bytes in the array buffer.
anssik: I think you might want to open an issue on the GitHub repo
Corentin: There's a standard in the Khronos group for that.
sangwhan: Right, I just want to clarify what's in scope. Doesn't matter if it reuses an existing spec.
ningxinhu: WebNN is a very low-level API. Its main target is JS frameworks, which can add their own logic to deal with such issues.
sangwhan: Not entirely convinced, but OK.
anssik: Feel free to influence the scope on GitHub. The draft charter is a starting point. We want to get your input. Not an easy task to land on a scope that seems to work for people.
anssik: First face-to-face on Friday
… A 2-hour meeting. [going through agenda]
… We'd like to start not with the proposed solution, but with use cases and requirements. We got some good contributions from Tomoyuki here, for instance.
… Then we'll go into more details on the proposal. If people have other proposals in mind, you're welcome to submit them for consideration.
… I look forward to having a productive session. See you all on Friday!