13:47:20 RRSAgent has joined #webmachinelearning
13:47:20 logging to https://www.w3.org/2021/10/27-webmachinelearning-irc
13:47:22 RRSAgent, make logs Public
13:47:23 please title this meeting ("meeting: ..."), anssik
13:47:25 Meeting: WebML WG Virtual Meeting at TPAC 2021 - Day 2
13:47:29 Chair: Anssi
13:47:36 Agenda: https://github.com/webmachinelearning/meetings/issues/18
13:48:33 Scribe: Anssi
13:48:41 scribeNick: anssik
13:48:57 Present+ Anssi_Kostiainen
13:56:14 takio has joined #webmachinelearning
13:58:36 RafaelCintron has joined #webmachinelearning
13:58:38 Present+
13:58:54 scribe+
13:59:34 Present+ MingMing
13:59:40 Present+ BryanBernhart
13:59:41 ningxin_hu has joined #webmachinelearning
13:59:46 Present+ jlbirch
13:59:50 Present+ JunweiFu
13:59:54 Present+ MingqiuSun
13:59:59 Present+ RafaelCintron
14:00:04 Mingqiu has joined #webmachinelearning
14:00:04 Present+ SungpilShin
14:00:10 BryanBernhart has joined #webmachinelearning
14:00:10 Present+ ZoltanKis
14:00:11 jlb6740 has joined #webmachinelearning
14:00:13 Present+ BruceDai
14:00:38 Present+ DeeptiGandluri
14:00:43 Present+ Wanming
14:00:46 Present+ MattWilson
14:00:55 Present+ NingxinHu
14:01:29 wanming has joined #webmachinelearning
14:01:32 Sungpil_Shin has joined #webmachinelearning
14:01:34 Present+ BelemZhang
14:01:45 Present+ TakioYamaoka
14:01:51 Present+ PetrPenzin
14:01:58 Present+ RachelYager
14:02:00 Present+ WanmingLin
14:03:37 Present+ ChaiChaoweeraprasit
14:04:00 Anssi: [reviews agenda]
14:04:04 Chai has joined #webmachinelearning
14:04:36 belem_zhang has joined #webmachinelearning
14:05:14 Topic: ML JS framework performance, focus areas for WebNN
14:05:56 q+ to ask a question
14:06:13 [slide 1]
14:06:14 kangz has joined #webmachinelearning
14:06:23 zkis has joined #webmachinelearning
14:06:34 Ningxin: we'll be reviewing the performance of ML frameworks with WASM to investigate the integration of WebNN
14:06:36 [slide 2]
14:06:57 Ningxin: 3 frameworks we've investigated: ONNX Runtime Web, TF Lite Web, OpenCV.js
14:07:08 [slide 3]
14:07:30 Ningxin: for each framework, I'll talk about how we integrated WebNN and then talk about the prototype to support WebNN
14:07:38 ... and then the tools to collect performance numbers
14:07:44 ... and finally review the said numbers
14:08:24 ... ONNX Runtime has a mechanism called execution provider to allocate specific nodes or subgraphs in memory to be executed by a specific library
14:08:39 ... they have a CPU backend by default, a GPU execution provider (EP), a DirectML EP, and others
14:08:57 ... this architecture is compiled to the Web via Emscripten
14:09:07 ... with WASM, it only supports CPU
14:09:14 [slide 4]
14:09:25 ... the WASM module is compiled from the C++ code base
14:09:39 ... they also have a WebGL engine
14:09:52 ... we have prototyped the addition of a WebNN execution provider
14:10:20 ... it's written in C++, coming from the WebNN-native project
14:10:27 ... (which maps to the WebIDL definition)
14:10:32 ... it supports 14 ONNX ops
14:11:03 -> https://github.com/webmachinelearning/webnn-native/ webnn-native GitHub repo
14:11:04 ... when an op is not supported, it falls back to the default CPU EP
14:11:17 ... this allows using WebNN + CPU EPs to run a full graph
14:11:37 q?
14:11:47 ... for instance, in our test, our WebNN EP didn't support Softmax, so it falls back to WASM in that case
14:12:00 ... we compiled this with a customized Emscripten
14:12:15 ... everything is available on GitHub
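A rough sketch of how a page could opt into the prototype's WebNN execution provider through ONNX Runtime Web, falling back to the default WASM (CPU) EP for ops the WebNN EP does not cover. The 'webnn' EP name refers to the prototype build described above (not a stock onnxruntime-web option at the time), and the input name and tensor shape are placeholders that depend on the model:

```ts
import * as ort from 'onnxruntime-web';

async function runWithWebNN(modelUrl: string, input: Float32Array) {
  const session = await ort.InferenceSession.create(modelUrl, {
    // EPs are tried in priority order; nodes the WebNN EP cannot handle
    // (e.g. Softmax in the test above) fall back to the default WASM EP.
    executionProviders: ['webnn', 'wasm'], // 'webnn' is the prototype's EP
  });
  // Input name and shape are placeholders; they depend on the model.
  const feeds = { input: new ort.Tensor('float32', input, [1, 3, 224, 224]) };
  return session.run(feeds);
}
```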
14:12:20 [slide 5]
14:12:38 Ningxin: we used the ONNX Runtime Web Demo to collect benchmark data
14:12:58 ... we want to evaluate the performance of WebNN by accessing the native ML APIs
14:13:10 ... we did that with a Node.js add-on served in an Electron.js app
14:13:40 ... here we used DirectML for the GPU device for WebNN, and OpenVINO for the CPU device
14:13:41 [slide 6]
14:14:02 Ningxin: with that framework, we compare performance across devices
14:14:05 ack anssik
14:14:05 anssik, you wanted to ask a question
14:14:18 q+ to ask about Electron.js vs browser security sandbox overhead expectations
14:14:19 ... the charts show a great speedup compared to the baseline of WASM+SIMD
14:15:13 ... we also tested with SharedArrayBuffer enabled, which already gives an improvement over the baseline
14:15:39 ... but with WebNN native, we get e.g. a 9x speedup with SqueezeNet
14:16:07 ... WebNN is almost on par with the ONNX native execution provider
14:16:24 ... we get similar results on the GPU device, with WebGL being the baseline
14:16:35 Geun-Hyung has joined #webmachinelearning
14:16:40 present+
14:17:26 ... with again WebNN being on par with the native EP
14:17:39 [slide 7]
14:17:55 Ningxin: a similar review of what we did with TensorFlow, which has a similar mechanism
14:17:57 [slide 8]
14:18:05 Ningxin: in TF, this is known as a delegate
14:18:20 ... by default, TF Lite Web uses WASM
14:18:29 ... there again, we added support for WebNN
14:18:33 [slide 9]
14:18:47 Ningxin: we collected data on the TF Lite demo based on OpenVINO on a Linux laptop
14:18:49 [slide 10]
14:19:12 Ningxin: the chart shows the results, with WASM+SIMD as baseline
14:19:33 ... we can't do a device-by-device comparison since TF Lite doesn't have a GPU backend at the moment
14:20:01 ... without going into details, the WebNN delegate performance there again is pretty close to the native delegate
14:20:06 [slide 11]
14:20:18 Ningxin: we got similar results for OpenCV.js
14:20:19 [slide 14]
14:20:46 Ningxin: we did note a significant gap running with GoogleNet between WebNN & OpenVINO
14:21:01 ... this is because it needs to fall back on WASM for certain operations
14:21:20 [slide 15]
14:21:43 Ningxin: we've been working on integrating with frameworks in the past few months
14:22:04 ... falling back to the default backend has proved a good way to allow progressive enhancement
14:22:33 ... Separating build & compute has also worked out well to map to the frameworks
14:23:12 ... The sync API has proved quite important - the framework codebases are C++-based and mostly sync
14:23:28 ... to mitigate the concerns about blocking the main thread, that sync API might be moved to a worker
14:24:05 ... The design to produce results in standard layout in a preallocated output buffer is also working well to avoid memory copies & conversions
14:24:23 ... Fused operators have proved to give good performance thanks to the graph optimizers
14:24:25 [slide 16]
14:24:55 Ningxin: with this Electron/Node.js implementation, we're getting good results and it is helping us reach native performance
14:25:00 q?
14:25:03 ... and should help us gauge the performance of a browser implementation
14:25:29 ... We would be happy to see the WebNN API implemented in browsers; we can help with adapting the JS frameworks to add WebNN as a backend
14:25:36 ... WebNN also needs to be added to Emscripten
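A minimal sketch of the "sync API in a worker" mitigation mentioned above: the blocking compute call stays inside a dedicated worker and only result buffers cross back to the main thread. buildGraphSync and computeSync are assumed placeholder names for synchronous entry points, not confirmed WebNN methods:

```ts
// worker.ts - sketch only: keep a blocking, C++-style compute call off the
// main thread. `buildGraphSync` and `computeSync` are placeholder names.
declare function buildGraphSync(): {
  computeSync(inputs: Record<string, Float32Array>,
              outputs: Record<string, Float32Array>): void;
};

const graph = buildGraphSync();          // build once, reuse for each message

self.onmessage = (event: MessageEvent<Float32Array>) => {
  const output = new Float32Array(1001); // preallocated, standard layout
  graph.computeSync({ input: event.data }, { output });
  // Transfer the result buffer back to the main thread without copying.
  (self as unknown as Worker).postMessage(output, [output.buffer]);
};
```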
14:25:46 ack anssik
14:25:46 anssik, you wanted to ask about Electron.js vs browser security sandbox overhead expectations
14:25:46 q+
14:26:07 Anssi: do you have an estimate of the performance penalty of the browser security sandbox?
14:26:18 ... that wouldn't show with Electron.js
14:26:47 Ningxin: we had some conversations with the ChromeOS team as they're prototyping the model loader API, including with the browser security model
14:26:47 emeyer has joined #webmachinelearning
14:26:57 emeyer has left #webmachinelearning
14:26:59 q?
14:27:00 ... for the compute/inference part, that should be similar to WebNN
14:27:12 ... so I asked them and they indicated a pretty small overhead
14:27:23 ... so we should still be getting performance close to native
14:27:47 Present+ JonathanBingham
14:27:54 Present+ CorentinWallez
14:28:11 Jonathan: we're still early in our evaluation of what security overhead will be needed with WebNN
14:28:24 ... it will probably vary across hardware and drivers
14:28:43 ... hard to evaluate the performance penalty at the moment
14:28:52 ... maybe WebGPU can help shed some light on this
14:29:34 Corentin: for WebGPU, a bunch of the performance overhead comes from securing shaders
14:29:52 ... but that probably doesn't apply to the context of WebNN computation
14:30:06 ... there may be overhead in getting data from JS to the model runner and back to JS
14:30:14 npdoty has joined #webmachinelearning
14:30:19 ack RafaelCintron
14:31:00 Rafael: how does the WebNN CPU backend prototype compare to ONNX Runtime's native CPU backend?
14:31:34 Ningxin: we chose to use the same OpenVINO backend as the native backends in ONNX Runtime
14:31:43 ... to help with comparison
14:31:53 q?
14:31:58 ... but WebNN has other backends that could bring different results
14:32:44 Rafael: +1 to Corentin on WebGPU performance - the bottleneck is mostly in the CPU crossing the JS barrier
14:33:05 ... this may be minimal when the data starts on the GPU, e.g. with camera input
14:33:15 q?
14:33:34 Topic: Integrating an open-source cross-platform implementation of the Web Neural Network API into a web engine
14:34:33 [slide 1]
14:34:33 Junwei: presenting on implementing WebNN in Chromium
14:34:40 [slide 2]
14:34:42 -> https://docs.google.com/document/u/0/d/1KDVuz38fx3SpLVdE8FzCCqASjFfOBXcJWj124jP7ZZ4/ WebNN implementation in Chromium Design Doc
14:35:28 -> https://github.com/webmachinelearning/webnn-native Standalone native implementation of the Web Neural Network API
14:35:38 [slide 3]
14:36:26 Junwei: WebNN allows browsers to access hardware acceleration, with a set of basic operations, e.g. conv2d
14:37:10 ... browsers need to implement WebNN by plugging in native ML APIs to access hardware acceleration
14:37:25 ... e.g. DirectML accesses the GPU
14:37:43 [slide 4]
14:38:05 Junwei: the WebNN execution model is based on an MLContext with a device target (CPU or GPU)
14:38:18 ... which gives a way to create an MLGraphBuilder
14:39:02 ... to build a graph that is then compiled and can be used to compute named inputs and outputs into buffers (CPU or GPU)
14:39:06 [slide 5]
14:39:21 Junwei: the WebNN-native architecture builds on the Dawn project
14:39:51 ... WebNN-native uses a C API based on a 1:1 mapping of the WebIDL
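To make the execution model from slide 4 concrete, here is a rough sketch in the shape of the draft API at the time; method and dictionary names may have changed since, and the WebNN types are not in the standard TypeScript libs, hence the any casts:

```ts
// Sketch only: create an MLContext for a device, build a small graph with
// MLGraphBuilder, then compute named inputs into a preallocated output.
const ml = (navigator as any).ml;
const context = ml.createContext({ devicePreference: 'gpu' });
const builder = new (globalThis as any).MLGraphBuilder(context);

// c = a + b over 1x4 float32 tensors.
const desc = { type: 'float32', dimensions: [1, 4] };
const a = builder.input('a', desc);
const b = builder.input('b', desc);
const c = builder.add(a, b);
const graph = builder.build({ c });

// Results land in a preallocated buffer in standard layout (see the
// "lessons learned" earlier), avoiding an extra copy or conversion.
const outputBuffer = new Float32Array(4);
graph.compute(
  { a: { data: Float32Array.of(1, 2, 3, 4) },
    b: { data: Float32Array.of(5, 6, 7, 8) } },
  { c: { data: outputBuffer } });
// outputBuffer now holds [6, 8, 10, 12]
```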
14:40:25 Present+ NickDoty
14:40:31 present+
14:40:31 Present+ Sam
14:40:53 RRSAgent, draft minutes
14:40:53 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html anssik
14:41:20 [slide 6]
14:41:46 Petr: Late comment on the previous presentation (ML JS framework performance) - I believe that the Wasm baseline should be running with all its security checks enabled, therefore with full Web sandbox overhead we can probably expect the WebNN/Wasm ratio to reduce a bit.
14:42:25 [slide 7]
14:42:48 q?
14:45:06 [slide 8]
14:45:28 Junwei: WebNN is aligned with the WebGPU implementation by building on top of Dawn
14:46:33 ... this allows sharing buffers with WebGPU, with the same security mechanism
14:47:17 ... we're calling for review of the design document
14:47:28 ... there are still questions about WebGPU interoperability
14:47:32 [slide 9]
14:47:48 Junwei: we're planning to implement WebNN in Chromium based on iterations on the design doc
14:48:03 q?
14:48:07 ... and then follow the Chromium process, starting with a ChromeStatus entry
14:48:28 ... not sure if that should be under "new feature incubation" or "implementation of existing standard"
14:48:42 ... we're looking for mentors to help us through the process
14:48:46 RRSAgent, draft minutes
14:48:46 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html dom
14:48:57 RachelY has joined #webmachinelearning
14:49:26 q?
14:49:39 q: has there been an investigation on how to integrate with Firefox / WebKit as well?
14:50:01 ningxin_hu has joined #webmachinelearning
14:50:29 q+ RachelY
14:50:29 Corentin: very thorough explanation of integrating webnn-native in Chromium - any similar investigation for Firefox and WebKit?
14:51:03 Ningxin: we have mostly experience with Chromium, but we would also welcome mentors and contributions from other engines
14:51:15 ack RachelY
14:51:25 Rachel: how does this play with yesterday's presentation on the ONNX framework?
14:52:31 Ningxin: we investigated the Web version of ONNX Runtime, which is a WASM-compiled version of ONNX Runtime
14:53:09 q?
14:53:17 Topic: Privacy and security discussion continued
14:53:51 present+
14:54:07 Jonathan: the model loader API is a complementary API to WebNN - WebNN is focused on supporting ML frameworks in JS
14:54:15 -> https://github.com/webmachinelearning/model-loader/blob/master/explainer.md Model Loader API explainer
14:54:32 ... the model loader API allows passing models from JS directly, and the underlying implementation takes care of running them
14:54:46 ... Model loader could be layered on top of WebNN or use a different backend
14:55:01 ... ChromeOS is exploring the model loader API with a focus on getting the security to work
14:55:16 ... they're looking toward an origin trial in 2022, hopefully the 1st half of the year
14:55:39 ... we're working with Ningxin and others to align the APIs with WebNN (e.g. shared namespace, shared input/output)
14:55:53 ... ideally we would end up sharing implementation code
14:56:06 ... we would also like to be able to run performance comparisons at some point
14:56:18 ... we might be able to eke out some extra performance from the model approach
14:56:43 Anssi: is it easier to secure the model loader API?
14:57:04 Jonathan: in the short term, yes, if you ignore performance
14:57:30 ... one path could be to simply limit the model loader API to WASM, which is already hardened - but then you get no perf benefit
14:57:48 ... we're exploring another path but still CPU-only
14:58:06 ... to get the performance, we'll need to get into the more challenging spaces with hardware integration
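For contrast with the graph-building flow above, the "pass a whole model from JS" idea behind the Model Loader API could look roughly like the hypothetical sketch below; every name in it (createModelLoader, load, compute) is an illustrative placeholder, and the explainer linked above is the authoritative shape:

```ts
// Hypothetical sketch of the Model Loader idea: hand the implementation a
// complete model and let it run it. None of these names are confirmed API.
async function classify(modelUrl: string, input: Float32Array) {
  const ml = (navigator as any).ml;
  const loader = ml.createModelLoader({ devicePreference: 'cpu' }); // placeholder
  const modelBytes = await (await fetch(modelUrl)).arrayBuffer();
  const model = await loader.load(modelBytes);                      // placeholder
  const output = new Float32Array(1001);  // preallocated output, as with WebNN
  await model.compute({ input }, { output });                       // placeholder
  return output;
}
```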
14:58:20 q+
14:58:51 q+
14:59:11 weiler: I don't think I've heard enough details to evaluate anything from a privacy perspective at the moment
14:59:13 ack me
14:59:14 ack weiler
14:59:15 ack ningxin_hu
14:59:45 ningxin_hu: Junwei mentioned it in the design doc - WebNN targets multiple device backends (CPU, GPU, specialized accelerators)
14:59:57 I'm also curious about the DRM proposals (as mentioned in the Zoom chat)
14:59:57 q+ Mingqiu to ask what mechanism do you propose to protect ML models?
15:00:01 ... with WASM we have the CPU sandbox, with WebGPU, a GPU sandbox
15:00:27 ... I was wondering how to do that with WebNN, esp. when considering new specialized accelerators
15:00:32 q+ npdoty is curious about the DRM proposals
15:00:53 q?
15:01:04 ack Mingqiu
15:01:04 Mingqiu, you wanted to ask what mechanism do you propose to protect ML models?
15:04:35 RRSAgent, draft minutes
15:04:35 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html dom
15:04:45 q?
15:05:14 npdoty: concerned about DRM or protection of models, because users won't have the ability to inspect the code that the machine is running; losses of transparency and protection against biases
15:05:23 anssik: ethical issues to be discussed tomorrow
15:06:32 takio has left #webmachinelearning
15:10:39 RRSAgent, draft minutes
15:10:39 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html anssik
15:14:08 i|Topic: ML JS framework|Slideset: https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0014/WebNN_ML_JS_Framework_Performance.pdf
15:14:39 i|Topic: Integrating an|Slideset:
15:14:49 RRSAgent, draft minutes v-slide
15:14:49 I have made the request to generate https://www.w3.org/2021/10/27-webmachinelearning-minutes.html dom
15:44:30 wanming has joined #webmachinelearning
16:06:17 myles has joined #webmachinelearning
17:33:45 Zakim has left #webmachinelearning
19:04:20 zkis has joined #webmachinelearning