13:47:20 RRSAgent has joined #webmachinelearning
13:47:20 logging to https://www.w3.org/2021/10/27-webmachinelearning-irc
13:47:22 RRSAgent, make logs Public
13:47:23 please title this meeting ("meeting: ..."), anssik
13:47:25 Meeting: WebML WG Virtual Meeting at TPAC 2021 - Day 2
13:47:29 Chair: Anssi
13:47:36 Agenda: https://github.com/webmachinelearning/meetings/issues/18
13:48:33 Scribe: Anssi
13:48:41 scribeNick: anssik
13:48:57 Present+ Anssi_Kostiainen
13:56:14 takio has joined #webmachinelearning
13:58:36 RafaelCintron has joined #webmachinelearning
13:58:38 Present+
13:58:54 scribe+
13:59:34 Present+ MingMing
13:59:40 Present+ BryanBernhart
13:59:41 ningxin_hu has joined #webmachinelearning
13:59:46 Present+ jlbirch
13:59:50 Present+ JunweiFu
13:59:54 Present+ MingqiuSun
13:59:59 Present+ RafaelCintron
14:00:04 Mingqiu has joined #webmachinelearning
14:00:04 Present+ SungpilShin
14:00:10 BryanBernhart has joined #webmachinelearning
14:00:10 Present+ ZoltanKis
14:00:11 jlb6740 has joined #webmachinelearning
14:00:13 Present+ BruceDai
14:00:38 Present+ DeeptiGandluri
14:00:43 Present+ Wanming
14:00:46 Present+ MattWilson
14:00:55 Present+ NingxinHu
14:01:29 wanming has joined #webmachinelearning
14:01:32 Sungpil_Shin has joined #webmachinelearning
14:01:34 Present+ BelemZhang
14:01:45 Present+ TakioYamaoka
14:01:51 Present+ PetrPenzin
14:01:58 Present+ RachelYager
14:02:00 Present+ WanmingLin
14:03:37 Present+ ChaiChaoweeraprasit
14:04:00 Anssi: [reviews agenda]
14:04:04 Chai has joined #webmachinelearning
14:04:36 belem_zhang has joined #webmachinelearning
14:05:14 Topic: ML JS framework performance, focus areas for WebNN
14:05:56 q+ to ask a question
14:06:13 [slide 1]
14:06:14 kangz has joined #webmachinelearning
14:06:23 zkis has joined #webmachinelearning
14:06:34 Ningxin: we'll be reviewing the performance of ML frameworks with WASM to investigate the integration of WebNN
14:06:36 [slide 2]
14:06:57 Ningxin: 3 frameworks we've investigated: ONNX Runtime Web, TF Lite Web, OpenCV.js
14:07:08 [slide 3]
14:07:30 Ningxin: for each framework, I'll talk about how we integrated WebNN and then talk about the prototype to support WebNN
14:07:38 ... and then the tools to collect performance numbers
14:07:44 ... and finally review the said numbers
14:08:24 ... ONNX Runtime has a mechanism called execution provider to allocate specific nodes or subgraphs in memory to be executed by a specific library
14:08:39 ... they have a CPU backend by default, a GPU execution provider (EP), a DirectML EP, and others
14:08:57 ... this architecture is compiled to the Web via Emscripten
14:09:07 ... with WASM, it only supports CPU
14:09:14 [slide 4]
14:09:25 ... the WASM module is compiled from the C++ code base
14:09:39 ... they also have a WebGL engine
14:09:52 ... we have prototyped the addition of a WebNN execution provider
14:10:20 ... it's written in C++, coming from the WebNN-native project
14:10:27 ... (which maps to the WebIDL definition)
14:10:32 ... it supports 14 ONNX ops
14:11:03 -> https://github.com/webmachinelearning/webnn-native/ webnn-native GitHub repo
14:11:04 ... when an op is not supported, it falls back to the default CPU EP
14:11:17 ... this allows using WebNN + CPU EPs to run a full graph
14:11:37 q?
14:11:47 ... for instance, in our test, our WebNN EP didn't support Softmax, so it falls back to WASM in that case
14:12:00 ... we compiled this with a customized Emscripten
14:12:15 ... everything is available on GitHub
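A rough sketch of how a page could opt into the prototype's WebNN execution provider through ONNX Runtime Web, falling back to the default WASM (CPU) EP for ops the WebNN EP does not cover. The 'webnn' EP name refers to the prototype build described above (not a stock onnxruntime-web option at the time), and the input name and tensor shape are placeholders that depend on the model:

```ts
import * as ort from 'onnxruntime-web';

async function runWithWebNN(modelUrl: string, input: Float32Array) {
  const session = await ort.InferenceSession.create(modelUrl, {
    // EPs are tried in priority order; nodes the WebNN EP cannot handle
    // (e.g. Softmax in the test above) fall back to the default WASM EP.
    executionProviders: ['webnn', 'wasm'], // 'webnn' is the prototype's EP
  });
  // Input name and shape are placeholders; they depend on the model.
  const feeds = { input: new ort.Tensor('float32', input, [1, 3, 224, 224]) };
  return session.run(feeds);
}
```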
14:12:20 [slide 5]
14:12:38 Ningxin: we used the ONNX Runtime Web Demo to collect benchmark data
14:12:58 ... we want to evaluate the performance of WebNN by accessing the native ML APIs
14:13:10 ... we did that with a Node.js add-on served in an Electron.js app
14:13:40 ... here we used DirectML for the GPU device for WebNN, and OpenVINO for the CPU device
14:13:41 [slide 6]
14:14:02 Ningxin: with that framework, we compare performance across devices
14:14:05 ack anssik
14:14:05 anssik, you wanted to ask a question
14:14:18 q+ to ask about Electron.js vs browser security sandbox overhead expectations
14:14:19 ... the charts show a great speedup compared to the baseline of WASM+SIMD
14:15:13 ... we also tested with SharedArrayBuffer enabled, which already gives an improvement over the baseline
14:15:39 ... but with WebNN native, we get e.g. a 9x speedup with SqueezeNet
14:16:07 ... WebNN is almost on par with the ONNX native execution provider
14:16:24 ... we get similar results on the GPU device, with WebGL being the baseline
14:16:35 Geun-Hyung has joined #webmachinelearning
14:16:40 present+
14:17:26 ... with again WebNN being on par with the native EP
14:17:39 [slide 7]
14:17:55 Ningxin: a similar review of what we did with TensorFlow, which has a similar mechanism
14:17:57 [slide 8]
14:18:05 Ningxin: in TF, this is known as a delegate
14:18:20 ... by default, TF Lite Web uses WASM
14:18:29 ... there again, we added support for WebNN
14:18:33 [slide 9]
14:18:47 Ningxin: we collected data on the TF Lite demo based on OpenVINO on a Linux laptop
14:18:49 [slide 10]
14:19:12 Ningxin: the chart shows the results, with WASM+SIMD as baseline
14:19:33 ... we can't do a device-by-device comparison since TF Lite doesn't have a GPU backend at the moment
14:20:01 ... without going into details, the WebNN delegate performance there again is pretty close to the native delegate
14:20:06 [slide 11]
14:20:18 Ningxin: we got similar results for OpenCV.js
14:20:19 [slide 14]
14:20:46 Ningxin: we did note a significant gap running with GoogleNet between WebNN & OpenVINO
14:21:01 ... this is because it needs to fall back on WASM for certain operations
14:21:20 [slide 15]
14:21:43 Ningxin: we've been working on integrating with frameworks in the past few months
14:22:04 ... falling back to the default backend has proved a good way to allow progressive enhancement
14:22:33 ... Separating build & compute has also worked out well to map to the frameworks
14:23:12 ... The sync API has proved quite important - the framework codebases are C++-based and mostly sync
14:23:28 ... to mitigate the concerns about blocking the main thread, that sync API might be moved to a worker
14:24:05 ... The design to produce results in standard layout in a preallocated output buffer is also working well to avoid memory copies & conversions
14:24:23 ... Fused operators have proved to give good performance thanks to the graph optimizers
14:24:25 [slide 16]
14:24:55 Ningxin: with this Electron/Node.js implementation, we're getting good results and it is helping us reach native performance
14:25:00 q?
14:25:03 ... and should help us gauge the performance of a browser implementation
14:25:29 ... We would be happy to see the WebNN API implemented in browsers; we can help with adapting the JS frameworks to add WebNN as a backend
14:25:36 ... WebNN also needs to be added to Emscripten
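A minimal sketch of the "sync API in a worker" mitigation mentioned above: the blocking compute call stays inside a dedicated worker and only result buffers cross back to the main thread. buildGraphSync and computeSync are assumed placeholder names for synchronous entry points, not confirmed WebNN methods:

```ts
// worker.ts - sketch only: keep a blocking, C++-style compute call off the
// main thread. `buildGraphSync` and `computeSync` are placeholder names.
declare function buildGraphSync(): {
  computeSync(inputs: Record<string, Float32Array>,
              outputs: Record<string, Float32Array>): void;
};

const graph = buildGraphSync();          // build once, reuse for each message

self.onmessage = (event: MessageEvent<Float32Array>) => {
  const output = new Float32Array(1001); // preallocated, standard layout
  graph.computeSync({ input: event.data }, { output });
  // Transfer the result buffer back to the main thread without copying.
  (self as unknown as Worker).postMessage(output, [output.buffer]);
};
```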
14:25:46 ack anssik
14:25:46 anssik, you wanted to ask about Electron.js vs browser security sandbox overhead expectations
14:25:46 q+
14:26:07 Anssi: do you have an estimate of the performance penalty of the browser security sandbox?
14:26:18 ... that wouldn't show with Electron.js
14:26:47 Ningxin: we had some conversations with the ChromeOS team as they're prototyping the model loader API, including with the browser security model
14:26:47 emeyer has joined #webmachinelearning
14:26:57 emeyer has left #webmachinelearning
14:26:59 q?
14:27:00 ... for the compute/inference part, that should be similar to WebNN
14:27:12 ... so I asked them and they indicated a pretty small overhead
14:27:23 ... so we should still be getting performance close to native
14:27:47 Present+ JonathanBingham
14:27:54 Present+ CorentinWallez
14:28:11 Jonathan: we're still early in our evaluation of what security overhead will be needed with WebNN
14:28:24 ... it will probably vary across hardware and drivers
14:28:43 ... hard to evaluate the performance penalty at the moment
14:28:52 ... maybe WebGPU can help shed some light on this
14:29:34 Corentin: for WebGPU, a bunch of the performance overhead comes from securing shaders
14:29:52 ... but that probably doesn't apply to the context of WebNN computation
14:30:06 ... there may be overhead in getting data from JS to the model runner and back to JS
14:30:14 npdoty has joined #webmachinelearning
14:30:19 ack RafaelCintron
14:31:00 Rafael: how does the WebNN CPU backend prototype compare to ONNX Runtime's native CPU backend?
14:31:34 Ningxin: we chose to use the same OpenVINO backend as the native backends in ONNX Runtime
14:31:43 ... to help with comparison
14:31:53 q?
14:31:58 ... but WebNN has other backends that could bring different results
14:32:44 Rafael: +1 to Corentin on WebGPU performance - the bottleneck is mostly in the CPU crossing the JS barrier
14:33:05 ... this may be minimal when the data starts on the GPU, e.g. with camera input
14:33:15 q?
14:33:34 Topic: Integrating an open-source cross-platform implementation of the Web Neural Network API into a web engine
14:34:33 [slide 1]
14:34:33 Junwei: presenting on implementing WebNN in Chromium
14:34:40 [slide 2]
14:34:42 -> https://docs.google.com/document/u/0/d/1KDVuz38fx3SpLVdE8FzCCqASjFfOBXcJWj124jP7ZZ4/ WebNN implementation in Chromium Design Doc
14:35:28 -> https://github.com/webmachinelearning/webnn-native Standalone native implementation of the Web Neural Network API
14:35:38 [slide 3]
14:36:26 Junwei: WebNN allows browsers to access hardware acceleration, with a set of basic operations, e.g. conv2d
14:37:10 ... browsers need to implement WebNN by plugging in native ML APIs to access hardware acceleration
14:37:25 ... e.g. DirectML accesses the GPU
14:37:43 [slide 4]
14:38:05 Junwei: the WebNN execution model is based on an MLContext with a device target (CPU or GPU)
14:38:18 ... which gives a way to create an MLGraphBuilder
14:39:02 ... to build a graph that is then compiled and can be used to compute named inputs and outputs into buffers (CPU or GPU)
14:39:06 [slide 5]
14:39:21 Junwei: the WebNN-native architecture builds on the Dawn project
14:39:51 ... WebNN-native uses a C API based on a 1:1 mapping of the WebIDL
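To make the execution model from slide 4 concrete, here is a rough sketch in the shape of the draft API at the time; method and dictionary names may have changed since, and the WebNN types are not in the standard TypeScript libs, hence the any casts:

```ts
// Sketch only: create an MLContext for a device, build a small graph with
// MLGraphBuilder, then compute named inputs into a preallocated output.
const ml = (navigator as any).ml;
const context = ml.createContext({ devicePreference: 'gpu' });
const builder = new (globalThis as any).MLGraphBuilder(context);

// c = a + b over 1x4 float32 tensors.
const desc = { type: 'float32', dimensions: [1, 4] };
const a = builder.input('a', desc);
const b = builder.input('b', desc);
const c = builder.add(a, b);
const graph = builder.build({ c });

// Results land in a preallocated buffer in standard layout (see the
// "lessons learned" earlier), avoiding an extra copy or conversion.
const outputBuffer = new Float32Array(4);
graph.compute(
  { a: { data: Float32Array.of(1, 2, 3, 4) },
    b: { data: Float32Array.of(5, 6, 7, 8) } },
  { c: { data: outputBuffer } });
// outputBuffer now holds [6, 8, 10, 12]
```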
14:40:25 Present+ NickDoty
14:40:31 present+
14:40:31 Present+ Sam
14:40:53 RRSAgent, draft minutes
14:40:53 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html anssik
14:41:20 [slide 6]
14:41:46 Petr: Late comment on the previous presentation (ML JS framework performance) - I believe that the Wasm baseline should be running with all its security checks enabled, therefore with full Web sandbox overhead we can probably expect the WebNN/Wasm ratio to reduce a bit.
14:42:25 [slide 7]
14:42:48 q?
14:45:06 [slide 8]
14:45:28 Junwei: WebNN is aligned with the WebGPU implementation by building on top of Dawn
14:46:33 ... this allows sharing buffers with WebGPU, with the same security mechanism
14:47:17 ... we're calling for review of the design document
14:47:28 ... there are still questions about WebGPU interoperability
14:47:32 [slide 9]
14:47:48 Junwei: we're planning to implement WebNN in Chromium based on iterations on the design doc
14:48:03 q?
14:48:07 ... and then follow the Chromium process, starting with a ChromeStatus entry
14:48:28 ... not sure if that should be under "new feature incubation" or "implementation of existing standard"
14:48:42 ... we're looking for mentors to help us through the process
14:48:46 RRSAgent, draft minutes
14:48:46 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html dom
14:48:57 RachelY has joined #webmachinelearning
14:49:26 q?
14:49:39 q: has there been an investigation on how to integrate with Firefox / WebKit as well?
14:50:01 ningxin_hu has joined #webmachinelearning
14:50:29 q+ RachelY
14:50:29 Corentin: very thorough explanation of integrating webnn-native in Chromium - any similar investigation for Firefox and WebKit?
14:51:03 Ningxin: we have mostly experience with Chromium, but we would also welcome mentors and contributions from other engines
14:51:15 ack RachelY
14:51:25 Rachel: how does this play with yesterday's presentation on the ONNX framework?
14:52:31 Ningxin: we investigated the Web version of ONNX Runtime, which is a WASM-compiled version of ONNX Runtime
14:53:09 q?
14:53:17 Topic: Privacy and security discussion continued
14:53:51 present+
14:54:07 Jonathan: the model loader API is a complementary API to WebNN - WebNN is focused on supporting ML frameworks in JS
14:54:15 -> https://github.com/webmachinelearning/model-loader/blob/master/explainer.md Model Loader API explainer
14:54:32 ... the model loader API allows passing models from JS directly, and the underlying implementation takes care of running them
14:54:46 ... Model loader could be layered on top of WebNN or use a different backend
14:55:01 ... ChromeOS is exploring the model loader API with a focus on getting the security to work
14:55:16 ... they're looking toward an origin trial in 2022, hopefully the 1st half of the year
14:55:39 ... we're working with Ningxin and others to align the APIs with WebNN (e.g. shared namespace, shared input/output)
14:55:53 ... ideally we would end up sharing implementation code
14:56:06 ... we would also like to be able to run performance comparisons at some point
14:56:18 ... we might be able to eke out some extra performance from the model approach
14:56:43 Anssi: is it easier to secure the model loader API?
14:57:04 Jonathan: in the short term, yes, if you ignore performance
14:57:30 ... one path could be to simply limit the model loader API to WASM, which is already hardened - but then you get no perf benefit
14:57:48 ... we're exploring another path but still CPU-only
14:58:06 ... to get the performance, we'll need to get into the more challenging spaces with hardware integration
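For contrast with the graph-building flow above, the "pass a whole model from JS" idea behind the Model Loader API could look roughly like the hypothetical sketch below; every name in it (createModelLoader, load, compute) is an illustrative placeholder, and the explainer linked above is the authoritative shape:

```ts
// Hypothetical sketch of the Model Loader idea: hand the implementation a
// complete model and let it run it. None of these names are confirmed API.
async function classify(modelUrl: string, input: Float32Array) {
  const ml = (navigator as any).ml;
  const loader = ml.createModelLoader({ devicePreference: 'cpu' }); // placeholder
  const modelBytes = await (await fetch(modelUrl)).arrayBuffer();
  const model = await loader.load(modelBytes);                      // placeholder
  const output = new Float32Array(1001);  // preallocated output, as with WebNN
  await model.compute({ input }, { output });                       // placeholder
  return output;
}
```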
14:58:20 q+
14:58:51 q+
14:59:11 weiler: I don't think I've heard enough details to evaluate anything from a privacy perspective at the moment
14:59:13 ack me
14:59:14 ack weiler
14:59:15 ack ningxin_hu
14:59:45 ningxin_hu: Junwei mentioned it in the design doc - WebNN targets multiple device backends (CPU, GPU, specialized accelerators)
14:59:57 I'm also curious about the DRM proposals (as mentioned in the Zoom chat)
14:59:57 q+ Mingqiu to ask what mechanism do you propose to protect ML models?
15:00:01 ... with WASM we have the CPU sandbox, with WebGPU, a GPU sandbox
15:00:27 ... I was wondering how to do that with WebNN, esp. when considering new specialized accelerators
15:00:32 q+ npdoty is curious about the DRM proposals
15:00:53 q?
15:01:04 ack Mingqiu
15:01:04 Mingqiu, you wanted to ask what mechanism do you propose to protect ML models?
15:04:35 RRSAgent, draft minutes
15:04:35 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html dom
15:04:45 q?
15:05:14 npdoty: concerned about DRM or protection of models, because users won't have the ability to inspect the code that the machine is running; losses of transparency and protection against biases
15:05:23 anssik: ethical issues to be discussed tomorrow
15:06:32 takio has left #webmachinelearning
15:10:39 RRSAgent, draft minutes
15:10:39 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html anssik
15:14:08 i|Topic: ML JS framework|Slideset: https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0014/WebNN_ML_JS_Framework_Performance.pdf
15:14:39 i|Topic: Integrating an|Slideset:
15:14:49 RRSAgent, draft minutes v-slide
15:14:49 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html dom
15:44:30 wanming has joined #webmachinelearning
16:06:17 myles has joined #webmachinelearning
17:33:45 Zakim has left #webmachinelearning
19:04:20 zkis has joined #webmachinelearning