05:01:37 Meeting: WebML CG Teleconference – 12 Jan 2022
05:01:42 Chair: Anssi
05:01:46 Agenda: https://github.com/webmachinelearning/meetings/blob/master/telcons/2022-01-12-cg-agenda.md
05:01:51 Present+ Ningxin_Hu
05:01:51 Scribe: Anssi
05:01:57 scribeNick: anssik
05:02:14 Present+ Anssi_Kostiainen
05:02:31 Present+ Geunhyung_Kim
05:02:37 Present+ Bruce
05:02:42 Present+ Chai_Chaoweeraprasit
05:02:56 Present+ Honglin_Yu
05:03:03 scribe+ Jonathan_Bingham
05:03:22 Present+ Rafael_Cintron
05:03:28 Present+ Ganesan_Ramalingam
05:04:19 Present+ Ping_Yu
05:04:28 Topic: Introductions
05:05:17 anssik: Welcome to 2022! We're restarting CG calls given advances in the Model Loader API
05:05:28 Present+ Jon_Napper
05:06:01 anssik: WebML CG and WG chair, Intel
05:06:42 Jon_Napper: leading the ChromeOS ML intelligence team at Google
05:07:38 Present+ Andrew_Moylan
05:07:51 Andrew_Moylan: ChromeOS ML team at Google
05:08:21 Bruce: working with Ningxin on performance, Intel
05:09:29 Geunhyung_Kim: explainability of ML is my interest, working for Gooroomee
05:10:23 Chai: Windows AI team lead at Microsoft, also WebNN API co-editor
05:12:00 Honglin_Yu: ChromeOS ML team at Google, Model Loader API spec and impl, exploring this space actively
05:12:39 Present+ Jiawei_Qian
05:12:55 Jiawei_Qian: previously handwriting recognition at Google
05:13:39 Jonathan_Bingham: product manager at Google, have worked with this CG for a long time, interested in both the Model Loader API and the WebNN API
05:14:19 Mingming: working with Ningxin on the WebNN impl, Intel
05:15:16 Ningxin: WebNN co-editor, Intel
05:16:35 Ping_Yu: TF.js lead, Google, have worked with Ningxin and others
05:17:26 RafaelCintron: Edge team at Microsoft, also representing Microsoft in several other W3C groups, e.g. the Immersive Web, WebGPU, and Color on the Web groups
05:17:50 Present+ Raviraj_Pinnamaraju
05:18:08 Raviraj: enabling the stack on ChromeOS at Intel, working with Ningxin
05:21:45 Topic: Model Loader API
05:21:52 My intro: Product manager for Web ML at Google
05:22:37 https://github.com/webmachinelearning/meetings/blob/main/scribe-howto.md
05:23:12 Subtopic: Spec and implementation progress
05:23:26 -> https://github.com/webmachinelearning/model-loader/blob/main/explainer.md Updated explainer
05:23:33 -> https://webmachinelearning.github.io/model-loader/ Early spec draft
05:24:06 -> https://chromium-review.googlesource.com/c/chromium/src/+/3341136 Chromium prototype
05:24:55 Honglin: the Chromium CL is just my personal prototype, not for official review yet
05:25:31 ... folks are welcome to review and make comments, the final impl will be different
05:25:43 ... and will be split into multiple CLs
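For reference, a minimal sketch of the API shape from the explainer linked above, assuming the draft names (MLModelLoader, load(), compute()) and option fields, all of which are early-stage and may change as the spec evolves:

async function runModel(modelUrl: string, input: Float32Array) {
  // Create an MLContext; the entry point is shared with the WebNN API.
  // Option names like devicePreference follow the draft and may change.
  const context = await navigator.ml.createContext({ devicePreference: "gpu" });

  // The loader plays the role MLGraphBuilder plays in WebNN: it turns a
  // serialized model into something the browser can execute.
  const loader = new MLModelLoader(context);

  // The first version loads models from an ArrayBuffer, e.g. fetched over HTTP.
  const modelBuffer = await fetch(modelUrl).then(r => r.arrayBuffer());
  const model = await loader.load(modelBuffer);

  // Run inference: compute() takes the input tensor(s) and resolves with the output.
  return model.compute(input);
}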
05:27:52 Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Jan/att-0000/Update_on_Model_Loader_API.pdf
05:30:52 [slide 1]
05:31:17 Honglin: this is a brief update on the Model Loader API impl
05:31:21 [slide 2]
05:31:27 [slide 3]
05:32:08 [slide 4]
05:32:28 Honglin: the context is similar to WebNN's; one difference is that the user can set the number of threads
05:32:32 [slide 5]
05:32:55 Honglin: the ML model loader corresponds to the ML graph builder in WebNN
05:33:18 ... this design is meant to handle the complexity of loading a model
05:33:33 [slide 6]
05:34:05 [slide 7]
05:34:30 ... this is how the current prototype works, see the prototype CL
05:35:16 ... all ML inputs and outputs are relayed by the browser process
05:35:21 [slide 8]
05:36:26 ... we have benchmark results for MobileNet v2; even CPU-only, the Model Loader API shows better performance than TF.js
05:36:33 ... strong motivation to implement this API
05:37:55 RafaelCintron: why a separate process? The renderer process is the most secure one
05:38:31 Honglin: good question, we want this to be extensible to various hardware, e.g. the Pixelbook has ML-specific accelerators and we want to be able to use them; that's easier if we run this in the ML service
05:39:21 ... possibly safer than the renderer, needs to be validated; the renderer can do JIT compilation, in the ML service we can disable those system calls
05:41:08 Chai: understanding the execution path would help me better understand the relative inference performance
05:42:21 Honglin: this is inference time with 150 images, using a demo web site where we download the images, preprocess the data, and run inference in a for loop
05:43:04 Chai: usually when executing, you'd run the kernels; the question is which TF.js backend is used for the benchmark
05:46:10 Honglin: Wasm supports a limited set of CPU instructions, while the ML service is compiled natively; this is the main reason
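To make the benchmark setup concrete, a hedged sketch of the kind of timing loop described above (download images, preprocess, run inference in a for loop); it reuses the hypothetical model object from the earlier sketch, and preprocess() is an assumed helper, not part of any spec or the actual benchmark code:

// Assumed helper: turns an image into an input tensor (e.g. resize + normalize).
declare function preprocess(image: ImageBitmap): Float32Array;

// Hypothetical timing loop in the spirit of the MobileNet v2 benchmark described
// above; `model` is the object returned by loader.load() in the earlier sketch.
async function benchmark(
  model: { compute(input: Float32Array): Promise<unknown> },
  images: ImageBitmap[]
) {
  const start = performance.now();
  for (const image of images) {
    const input = preprocess(image); // prepare one input tensor per image
    await model.compute(input);      // one inference per image
  }
  const elapsed = performance.now() - start;
  console.log(`average per-image time: ${(elapsed / images.length).toFixed(2)} ms`);
}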
05:46:17 [slide 9]
05:46:40 Honglin: results with quantized models still outperform TF Lite
05:46:48 [slide 10]
05:47:12 ... the IPC cost is 7-10 ms, not small; we're considering improving this
05:47:16 [slide 11]
05:48:07 ... 8 todos identified
05:48:43 [slide 12]
05:49:30 ... the graph shows how to reduce the identified IPC cost
05:49:44 ... in theory this cuts the IPC cost in half, being explored
05:49:52 [slide 13]
05:50:17 ... 5 open questions
05:52:53 Ningxin: the IPC cost seems high, is it caused by marshalling and unmarshalling?
05:53:14 ... does your prototype use shared memory to transfer tensors between processes?
05:53:56 Honglin: the prototype currently marshalls and unmarshalls; considering alternatives
05:54:03 ningxin_hu: will follow up with you offline on this
05:54:37 RafaelCintron: how tied is ChromeOS to this ML service? Are you open to inference engines other than TF Lite on ChromeOS?
05:55:04 Andrew_Moylan: I think yes
05:55:21 RafaelCintron: how tied is this to ChromeOS, is this a ChromeOS-only API?
05:56:18 Jonathan: Honglin's work currently depends on ChromeOS, but we understand that is not a web standard and we are talking to the Chrome browser team; I think it is not in Chromium but in Chrome. We can coordinate with Microsoft to ensure the ML service can be implemented on other OSes as well
05:56:46 RafaelCintron: I'm somewhat familiar with the ML service and thought it is not so tied to TF Lite
05:57:06 RafaelCintron: that'd be good for any browser that is cross-process
05:57:20 ... even our first parties like Office care about cross-platform, not just Windows
05:57:38 ... how many processes can be created, thinking of malicious usage?
05:57:59 Honglin: we can limit the max number of processes
05:58:26 ... each model instance runs in a dedicated process; if a web page loads 10 models, there are 10 processes; we'll limit the max number of processes
06:00:25 Subtopic: Dependencies, coordination topics with WebNN API
06:00:40 Honglin: we have discussed shareable structs, haven't yet started code reuse
06:01:21 ... we have discussed the frontend; backends need to be explored
06:02:15 Topic: Meeting cadence
06:02:34 anssik: first, does this meeting slot work as a recurring one for folks?
06:02:42 https://github.com/webmachinelearning/meetings/blob/main/telcons/2022-01-12-cg-agenda.md
06:03:34 [agreement]
06:03:46 anyone who would object isn't here, lol
06:03:57 ... I propose we do either bi-weekly (to match the WG) or monthly?
06:04:05 ... or on an as-needed basis?
06:05:50 Honglin: no opinion yet on cadence
06:06:05 Jonathan: having a recurring meeting would be valuable
06:06:32 ... maybe next in two weeks and then once a month?
06:07:49 [agreement]
06:09:22 Chai: interop between the two APIs is a great target
06:09:48 ... reuse of API contracts is even more useful than reuse of code
06:15:58 anssik: Thanks for joining everyone, thanks Honglin for the great presentation!
06:42:16 Present+ Mingming
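On Chai's point about reusing API contracts: one concrete form already visible in the drafts is that both APIs start from the same MLContext. A minimal sketch, assuming the draft shapes of the WebNN and Model Loader specs:

// Both APIs consume the same MLContext, per the current drafts.
const context = await navigator.ml.createContext({ devicePreference: "gpu" });

// WebNN path: build a graph operation by operation.
const builder = new MLGraphBuilder(context);

// Model Loader path: load a pre-built, serialized graph with the same context.
const loader = new MLModelLoader(context);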