RRSAgent log: https://www.w3.org/2024/09/23-webmachinelearning-irc

Meeting: Web Machine Learning WG F2F – 23 September 2024
Chair: Anssi
Agenda: https://github.com/webmachinelearning/meetings/issues/25
-> Issue 25: WebML WG - TPAC 2024 agenda https://github.com/webmachinelearning/meetings/issues/25
Scribe: Anssi
scribeNick: anssik
scribe+ dom
Minutes: https://www.w3.org/2024/09/23-webmachinelearning-minutes.html

Present+ Anssi_Kostiainen
Present+ Dominique_Hazael-Massieux
Present+ Ningxin_Hu
Present+ Michael_McCool
Present+ Rafael_Cintron
Present+ Neil_Trevett
Present+ Austin_Sullivan
Present+ Dwayne_Robinson
Present+ Yuta_Hagio
Present+ Fredrik_Solenberg
Present+ Laszlo_Gombos
Present+ Kenji_Baheux
Present+ Rob_Kochman
Present+ Rachel_Yager
Present+ Reilly_Grant
Present+ Domenic_Denicola
Present+ Joshua_Bell
Present+ Deepti_Gandluri
Present+ Iris_Ren
Present+ Bryan_Bernhart
Present+ Lei_Zhao
Present+ Zoltan_Kis

Topic: Welcome

anssik: this is our first F2F as a WG, despite us having existed for a long time
... I'm Anssi Kostiainen from Intel, chair of the WG, supported by Dom as our staff contact; thanks to the TPAC organizers for making this happen
... great to see both long-time participants and new faces
... this WG now has all the major browser vendors as participants, with Mozilla joining recently
... we bring together a diverse set of experts at the intersection of AI-related fields, from different backgrounds
... including library makers who help us calibrate our work to real-world requirements
... people from industry and research backgrounds - let other people know they can get involved

Present+ Thomas_Steiner
Present+ Taylore_Givens
Present+ Mike_Wyrzykowski
Present+ Kunihiko_Toumura
Present+ Wonsuk_Lee
Present+ Ali_Spivak

[round of intros]

Subtopic: Charter orientation

Anssi: we are two groups: the Web Machine Learning Working Group and its eponymous Community Group
... the WG deliverables include the WebNN API (the core focus of the technical work) and ethical guidelines (a topic on which we will hear from an OpenAI researcher)
... work on the Model Loader API is blocked on a standardized model format
... the CG is responsible for incubating proposals, some of which may later graduate to standardization
Dom: note that the WG charter is expiring next year, so we'll need to start discussions about potential additions in the next few weeks/months
Anssi: the CG allows us to do exploratory work - that's how WebNN itself started
Anssi: we're also chartered to coordinate with other groups: the WebGPU WG (with related topics on our agenda, e.g. MLTensor)
... the WebAssembly WG is also an important related group (with Deepti in particular helping coordinate)
... there are also important integration questions around WebRTC, which Ningxin explored a couple of years ago
... we also work closely with the Technical Architecture Group, which helps us make sure our API fits well in the broader platform

Present+ Jay_Wang

Topic: Ethics

Subtopic: Democratizing Human-Centered AI with Visual Explanation and Interactive Guidance

Slideset: Jay_Wang_slides

Anssi: please welcome Jay, a safety researcher at OpenAI, who will explain his research focus on making AI more accessible through novel interfaces
... you're also part of Georgia Tech and have published many papers and open source tools

[slides 1-18]
[demo of CNN Explainer]
-> CNN Explainer https://bit.ly/cnn-explainer
[slides 19-27]
-> DiffusionDB Explorer https://bit.ly/diffusiondb-vis
[slides 28-29]
-> WizMap Embeddings https://bit.ly/wizmap-acl
[slides 30-35]
-> GAM Changer https://bit.ly/gam-changer
[slides 36-51]
-> WebSHAP https://bit.ly/webshap
[slide 52]
-> MeMemo https://bit.ly/mememojs
-> Wordflow https://bit.ly/wordflow-tool
[slide 53]

anssik: thank you for this very comprehensive presentation
... a specific intersection with our work you alluded to is the possible integration of some of these tools in browser developer tools

Kenji: how hard is it to get the model to explain its behavior (e.g. in the GAM Changer example around age/risk)?
Jay: this particular model was a simple regression model, where it is easier to identify the particular source of the model behavior
Rachel: why are human hands so problematic for AI image generators?
Jay: the geometry of hands has been really hard for models to capture, but they're improving
McCool: re WebNN, is there any gap related to your work?
Jay: my tools are mostly based on TensorFlow.js
... there may need to be different modalities of input, with different ways of embedding the vectors, to cater to the emerging needs of generative AI
Anssi: thanks again Jay, we hope to work more with you

Topic: Spec orientation

anssik: (on the queue) I want to propose next steps for #375, discuss the priority of #559, and bump the priority of #666
-> Issue 375: Support for transformers https://github.com/webmachinelearning/webnn/issues/375
-> Issue 559: Control flow operations: if, while https://github.com/webmachinelearning/webnn/issues/559
-> Issue 666: Reconsider `MLOperand` methods https://github.com/webmachinelearning/webnn/issues/666

Present+ Andrew_Nolan

-> Open issues https://github.com/webmachinelearning/webnn/issues
-> Triage guidance https://github.com/webmachinelearning/webnn/blob/main/docs/IssueTriage.md
-> "interop" issues https://github.com/webmachinelearning/webnn/issues?q=is%3Aissue+is%3Aopen+label%3Ainterop

I propose closing https://github.com/webmachinelearning/webnn/issues/11 as obsolete since we've decided to pursue a graph API.
two things that I noticed: int64 issues (relates to interop) and use of constants with MLTensor (a potentially useful mechanism for model management)
I propose closing https://github.com/webmachinelearning/webnn/pull/541 in favor of https://github.com/webmachinelearning/webnn/pull/754 (can this be merged?).
-> PR 754: Add MLTensor explainer https://github.com/webmachinelearning/webnn/pull/754
-> PR 541: Add MLBuffer exploration doc https://github.com/webmachinelearning/webnn/pull/541
-> Issue 374: Simplify `MLPool2dOptions` by removing the `outputSizes` option https://github.com/webmachinelearning/webnn/issues/374
-> Issue 474: Simplify `resample2d` op https://github.com/webmachinelearning/webnn/issues/474
-> Issue 470: Simplify `matmul` op https://github.com/webmachinelearning/webnn/issues/470
-> Issue 324: Simplify the operand layout support of conv2d and pooling 2d operations https://github.com/webmachinelearning/webnn/issues/324
-> Issue 749: MLContextOptions.deviceType seems unnecessary outside of conformance testing https://github.com/webmachinelearning/webnn/issues/749

The most recent ~5 issues don't have labels; I don't have permission to add them.

reillyg: issue #11 was opened a long time ago and seems like it can be closed
anssik: no one is objecting to closing #11, so let's do it
reillyg: there are two open pull requests in the MLTensor space; can we close the generic one and keep only the specific one? can we land the PR for the explainer?
Austin: I'll close #541
anssik: we should look at merging the explainer after our MLTensor discussion later today
jsbell: I propose closing this group of "simplify" issues unless someone strongly advocates for them soon: #474, #470, #374, #324 (sorry Ningxin!)
... we should either do them soon or abandon them
ningxin: I think we can close #324
reillyg: in the upcoming implementation, I've added automatic transposes
... I think we can do without that particular simplification
anssik: so let's close #324
(Has 2d vs. 2D casing been discussed? https://w3ctag.github.io/design-principles/#casing-rules)
anssik: re #470, do we need to retitle it? open a different issue?
ningxin: I'll retitle #470 to reflect its status
dwayner: I'll open a new issue instead, linking back to that one

I think we can close the old MLBuffer PRs in favor of the MLTensor explainer: #542 #543 #544
-> Issue 542: [MLBuffer] Creation and representing MLBuffer on XPU devices https://github.com/webmachinelearning/webnn/issues/542
-> Issue 543: [MLBuffer] Uploading/downloading tensor data https://github.com/webmachinelearning/webnn/issues/543
-> Issue 544: [MLBuffer] Support for MLBuffer in graph execution https://github.com/webmachinelearning/webnn/issues/544

dwayner: re #374, we should probably align pool with conv - I'll propose next steps in the issue
ningxin: closing #474 SGTM

Topic: New features

Subtopic: A refreshed analysis of popular models

#375
Slideset: dwayne_webnn_operator_update
-> 33 models, 12 operators, proposed IDL, data types https://github.com/webmachinelearning/webnn/issues/375#issuecomment-2292466613

[slides 2-4, 6-11]

Quick note: Memory64 for Wasm has been available on both Chrome and Firefox nightly behind a flag for about the last year; it is really close to being enabled by default, and the proposal is stable, just pending a phase 4 poll on closing out the last few spec issues. We'd love folks to try it out and let us know if something isn't working as expected.

ningxin: we already have some implementation experience with these proposed operators
dwayne: it has informed some of the proposals
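[Illustration: a minimal feature-detection sketch for one of the proposed wave-3 ops; scatterND is named in the issue #375 analysis, but these ops were not yet in the spec at the time, and the probing pattern below is an editorial assumption rather than group guidance.]

  // Probe the builder for a proposed op before relying on it
  // (assumes an async context).
  const context = await navigator.ml.createContext();
  const builder = new MLGraphBuilder(context);
  if (typeof builder.scatterND === "function") {
    // Build the graph with the native op.
  } else {
    // Fall back to a decomposition or a Wasm path.
  }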
anssik: I'm interested in general feedback on this approach to the wave 3 operators, given current prototyping efforts
ningxin: scatterND helps with performance, not just functionality
jsbell: I ♥ the wave nomenclature - it may be useful to use it for our issues in the repo
... how far along are we with the implementation? any sense of when the impl/spec will be ready to advance to origin trial?
dwayne: for op completeness in the DML backend, maybe two weeks of implementation work
... I hope to add the ops to the spec in the same timeframe
McCool: how many of these models are actually useful in the Web context?
dwayne: huge models are good for demonstrating viability, but they're so big they're likely not practical to use directly in the browser
McCool: +1
ningxin: they were identified through popularity in Transformers.js, so they are already used in the browser context
NeilT: any consideration of the set of operators proposed by Arm in their TOSA proposal? https://www.mlplatform.org/tosa/
dwayne: I've looked at it and have data on it that I'll share
anssik: hearing overall support for the approach; no specific plan for origin trial yet
... in terms of spec work, do you expect any specific challenges?
dwayne: it should be pretty straightforward given our experience; a few interesting questions around invalid index conditions
ningxin: the int4/uint4 data types will need attention and wide review

Subtopic: Quantization and dequantization

#93 #128 #623
-> Issue 93: Add QuantizeLinear and DequantizeLinear for mixed precision https://github.com/webmachinelearning/webnn/issues/93
-> Issue 128: WebNN should support int8 quantized models https://github.com/webmachinelearning/webnn/issues/128
-> Issue 623: WebNN should support NPU and QDQ operations https://github.com/webmachinelearning/webnn/issues/623
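[Background sketch for the discussion below: an explicit dequantize -> compute -> quantize chain, using the quantizeLinear/dequantizeLinear op names proposed in issue #93. These ops were not yet in the spec at the time of this meeting; the shapes and the `scale`/`zeroPoint`/`filterShape`/`int8Weights` operands are illustrative assumptions. A backend is then free to fuse or strip the Q/DQ pair, which is exactly the behavior question raised below.]

  // int8 weights are dequantized explicitly, compute runs in float32,
  // and the result is re-quantized; `scale` and `zeroPoint` are
  // MLOperands (e.g. created via builder.constant()).
  const wq = builder.constant(
      { dataType: "int8", dimensions: filterShape }, int8Weights);
  const w  = builder.dequantizeLinear(wq, scale, zeroPoint); // int8 -> float32
  const y  = builder.conv2d(input, w);                       // float compute
  const yq = builder.quantizeLinear(y, scale, zeroPoint);    // float32 -> int8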
reillyg: adding operators to explicitly dequantize int8 and int4 values to float16/float32 makes a lot of sense for the spec
... an open question: do we agree that explicit dequantization is the approach we want to take?
... backends can detect a dequantize/conv2d pair, so expressing this in the API itself may be unnecessary
dwayne: that matches the experience in ONNX
ningxin: quantize/dequantize also makes it easier for the backends to fall back
reillyg: if we expect that backends may strip out Q/DQ pairs, how do we want to specify the behavior of the full graph from a precision perspective?
... how useful would a requantize operator be?
dwayne: I have seen hundreds of models using the quantize operator
ningxin: the behavior is up to the implementation, but how do we specify this?
dwayne: does it change the overall precision?
reillyg: in TFLite, there is a top-level flag to control this
jsbell: one other approach we've seen is attaching the scale to the tensor
... I'm assuming we want to be explicit and not pursue this, but I want to make sure we know of the options
reillyg: that also leads to a huge explosion of the type system
dwayne: I'll go with explicit
reillyg: if we support quantize/dequantize, what data types do we support? int8/uint8 seem obvious, int4/uint4 are more complicated
... representing int4 on the Web and across backends is challenging
dwayne: from a Web API perspective, they would be exposed as Int8Array
... from an implementation perspective, I'm not sure how to handle int4 as input
reillyg: quantization is most useful for weights; do we know of any need for it on inputs/outputs?
McCool: I've seen quantization for activations as well
reillyg: i.e. using it as another kind of activation function
... I've linked the 3 related issues - should we triage them into a single issue?
... either #93 or #128 (#623 has a bunch of unrelated aspects)
reillyg: I'll clean them up now that I have the proper repo privileges

Subtopic: Platform capability detection

reillyg: there is a bit of overlap between this topic and the next one on future-proof device selection
... capability detection and device selection go hand in hand
... capabilities depend on the platform you're on and which device you pick
... I was looking at how WebGPU handles this; in WebGPU, the first step a developer goes through is requesting an adapter, at which point the system decides which adapter to use
... the question raised in Mike's proposal is whether we can have developers give us the set of features they want, and whether we can fulfill that
Mike: in WebGPU you get a set of limits, with defaults but also maximum limits
... the defaults are guaranteed to run everywhere; if you ask for something above the defaults, you can run on the particular device, but with no guarantee of running everywhere
... so instead of only describing what the device supports, this establishes a baseline of support that can run everywhere
reillyg: it's hard to have a default operator set that works across platforms, mostly because of data type support
... it seems it's not possible to give a baseline for models and data types with a guarantee to run everywhere
... any framework built on top of WebNN will have to have code that responds to the capabilities of the platform and tailors the graph to those capabilities as it is being built
... given the current landscape of hardware support, I'm not sure it makes sense to create a baseline set
Mike: it would be interesting to see what we're missing from this core set given the support of TensorFlow.js in WebGPU
asully: to get to a common set, one approach would be to relax the requirement that NPU matches with an NPU device
reillyg: with a restriction to GPU, could we identify a baseline operator set?
dwayne: probably
reillyg: if we said "if you're OK using the GPU and doing float32, you're OK in the baseline"
ningxin: we have a bunch of ONNX models in our tests that use int16; we've added support for int16 in opSupportLimits in a PR, with a Wasm fallback
asully: we could limit the size of indices to int32
... to avoid the performance penalty of falling back to Wasm
Mike: our goal is to avoid developers inadvertently not running on some platforms
... we want to make sure it happens with clear intent from the developers and with a sense that they will build a fallback
reillyg: given that this is something that will be intermediated by frameworks, I'm wondering if this is something we want to do through an API or through developer tooling
... e.g. a flag to enable a compatibility mode
(my comment was to suggest a compatibility mode for testing, now covered)
reillyg: how does framework intermediation apply to the WebGPU case?
Mike: the engines tend to handle it for developers
... a developer tool setting sounds like a good idea
reillyg: the frameworks have a backup (e.g. going back to Wasm), and it's relatively easy for them to detect when they go off limits
... we don't want frameworks to guess and check
Mike: we could still expose maximum limits, but keep the defaults to a baseline
ningxin: we could also work on promoting the most Web-compatible data types in the transformers tooling community; this would reduce situations where frameworks have to fall back
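[Illustration of the capability-detection shape being prototyped: a minimal sketch assuming the opSupportLimits() draft API of the time; the member names (gather.indices.dataTypes) are an assumption based on the draft and may differ.]

  const context = await navigator.ml.createContext();
  const limits = context.opSupportLimits();
  // e.g. check whether gather indices support int64 before building,
  // and cast indices to int32 up front if they don't.
  const useInt64 = limits.gather.indices.dataTypes.includes("int64");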
reillyg: so is the concern with opSupportLimits more about compatibility than fingerprinting?
dom: the Web platform is compelling enough as a distribution platform that it can drive convergence toward a baseline
reillyg: what we need to figure out is the minimal supported operator set and data types (assuming GPU execution)
... plus an opt-in parameter to request more than the default
anssik: this might also help with fingerprinting
RafaelCintron: e.g. implementors could choose not to provide this "upgrade" path in a privacy mode

Subtopic: Future-proof device selection abstractions

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0006/MLDeviceType.pdf
[slides 2-3]

reillyg: getting rid of the explicit device type makes sense
... the choice to me is between being very vague (with power preference) or a bit more specific: CPU-only, CPU+GPU, CPU+NPU, CPU+GPU+NPU
... to avoid the compatibility issues that come with the ambiguity of power preference
Mike: as more WebNN adoption occurs, it would be great to get data on the performance improvements actual apps would get if they had more guarantees about which device they run on
reillyg: we would like to run further experiments to see if developers can actually target NPUs in a cross-platform compatible fashion, given the level of diversity in NPUs on the market today
... so basically agree
Mike: if we had sufficient data to show that device type selection is necessary, we would be more open to it (as CoreML allows)
... our proposal is that initially we remove the concept, with an openness to reconsider it based on data
RafaelCintron: the Windows ecosystem is a lot more heterogeneous, so not having device selection feels even harder there
... it's hard for the browser to make a decision, since the model is only known once the data is buffered
... maybe this could be done with a different API shape
... re privacy, how different is this from WebGPU? it seems you could do the same with an increasingly complex shader
Mike: you can hide capabilities; you can't fully prevent it, but there are protections that have been added to WebGPU to mitigate the trivial privacy attacks, and we would like to see them in WebNN as well
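[For reference, the two context-creation shapes under discussion; both options existed in MLContextOptions at the time, and the direction proposed here is to drop deviceType and keep only intent-level hints:]

  // Current shape: the developer pins an explicit device type.
  const gpuContext = await navigator.ml.createContext({ deviceType: "gpu" });

  // Proposed direction: express intent only, let the UA place the work.
  const context = await navigator.ml.createContext({ powerPreference: "low-power" });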
Bryan: MLTensor heavily relies on the device type
... we have scenarios to re-use tensors between input/output
... I'm wondering how far heuristic control can go in allowing the proper allocation of memory resources
... I fear this will lead to unnecessary reallocations/copies
asully: the MLTensor explainer has an open question here; an MLTensor is tied to an MLContext, which is tightly bound to a device
... if we change this, we will have to change the semantics of MLTensor, and similarly of MLGraph
... so there are solutions for that if we rescope the tensor to the graph
RafaelCintron: how do you share input/output tensors in that situation?
asully: that might lead to data copies
ningxin: facial recognition typically uses face detection and then recognition, through 2 different models
asully: there may need to be a way to declare that graphs share a buffer
reillyg: if we switch to only a power preference and possibly a prefer-cpu setting
... and a WebGPU device for interop
... ensuring consistent data sharing with the GPU,
... then it's up to the UA to deal with data placement
... not forcing developer decisions on graph placement if they don't have to sounds like an improvement
... I think the UA can make the right choice
asully: if it has the right information, yes
reillyg: if you create two graphs in a single context, this would hint that they should run close to each other
kenji_baheux: there may be cases where you want to avoid using the GPU (e.g. because it's used for higher-priority tasks); would there still be a way to indicate that?
anssik: which issue should we continue this discussion in?
asully: #749 is a good candidate
-> Issue 749: MLContextOptions.deviceType seems unnecessary outside of conformance testing https://github.com/webmachinelearning/webnn/issues/749
reillyg: #302 also exists, but is more vague
-> Issue 302: API simplification: context types, context options, createContext() https://github.com/webmachinelearning/webnn/issues/302

Present+ Tianqi_Chen
Present+ BryanB

Topic: Customer feedback & collaborations

Subtopic: Universal Large-Language Model Deployment with ML Compilation

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0004/MLCTalk.pdf
[slides 2-3, 5, 9-11, 13-17, 19-25]
-> WebLLM Chat Demo https://chat.webllm.ai/
[slide 26]
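[For context, a minimal WebLLM usage sketch; the API names and prebuilt model id follow the WebLLM project docs of the time and are illustrative, not an endorsement:]

  import * as webllm from "@mlc-ai/web-llm";

  // Engine bound to a prebuilt model, running in-browser via WebGPU.
  const engine = await webllm.CreateMLCEngine("Llama-3-8B-Instruct-q4f32_1-MLC");
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize WebNN in one sentence." }],
  });
  console.log(reply.choices[0].message.content);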
reillyg: are you looking at implementing support for compiling to the WebNN API? any feedback on the API capabilities and what you might need to support it as a backend?
tianqi: so far we've been focusing on the WebGPU backend; interop between WebNN and WebGPU would help ensure one doesn't block the other
... we've been looking at getting the compiler to generate JS that uses the WebNN API
reillyg: re WebGPU and WebNN, is your goal to implement non-WebNN operators using WebGPU?
tianqi: we want to be able to partition the tasks flexibly across the two
Mike: as you continue your work toward adopting WebNN, it would be great to provide feedback to the group, incl. comparisons with other backends in terms of performance
tianqi: +1; would love to see more contributions; WebGPU has been very useful to us, and it would be great to see the same with WebNN
anssik: the open source projects can be found under the mlc GitHub org
-> MLC-LLM blog (Jun 7, 2024) https://blog.mlc.ai/2024/06/07/universal-LLM-deployment-engine-with-ML-compilation
-> WebLLM blog (Jun 13, 2024) https://blog.mlc.ai/2024/06/13/webllm-a-high-performance-in-browser-llm-inference-engine
-> [Unity][Tutorial] TVM Unity BYOC https://discuss.tvm.apache.org/t/unity-tutorial-tvm-unity-byoc/14561

Present+ dezell

Subtopic: Transformers.js WebNN backend

[@@@ pre-recorded video by Joshua Lochner]

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0003/WebML_WG_-_Transformers.js_update__23_September_2024_.pdf
[slides 1-6]

ningxin: WebNN support is a planned feature of Transformers.js v3, with the upcoming origin trial an opportunity to get feedback
mccool: should we prioritize the discussion on dynamic shapes based on that input?
ningxin: developers can override the dimensions to adapt e.g. to the camera size
... support for a static KV cache is an open issue in the Transformers project afaik

Subtopic: ONNX Runtime Web & WebNN EP

ningxin: one topic covered by MLTensor is the capability to support ONNX models with external data
... weights are kept in an external data file
... we're enabling the WebNN Execution Provider to support that
... we want to reduce peak memory consumption, since it is very high and sometimes hits the limits
... there are different approaches under discussion to solve this, up to streaming network data directly to memory
... ONNX with external data is a significant use case for supporting models with big weights
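[For context, a minimal sketch of selecting the WebNN execution provider in ONNX Runtime Web with a Wasm fallback; option names follow the ORT Web docs of the time and are illustrative:]

  import * as ort from "onnxruntime-web";

  const session = await ort.InferenceSession.create("model.onnx", {
    executionProviders: [
      { name: "webnn", deviceType: "gpu", powerPreference: "default" },
      "wasm", // fallback if WebNN is unavailable
    ],
  });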
RafaelCintron: the ONNX backend team is happy with the contributions; they expressed some concerns about the lack of dynamic shapes in WebNN
McCool: the MLTensor prototype has strong typing, which may impact the ability to stream data
reillyg: for CoreML / TFLite, the model is essentially streamed to a file
McCool: allowing streaming from disk is useful to limit the impact on system memory
Re streaming, see for example G10; another is NanoFlow

Subtopic: Google Chrome Feedback revisited #453

-> Issue 453: Google Chrome Feedback on WebNN: aiming for broad device coverage and maintainability https://github.com/webmachinelearning/webnn/issues/453

reillyg: a year ago, as we started looking at WebNN, we provided a high-level set of feedback on the API
... most of this feedback is either already integrated, tracked in other issues, or will be answered during the origin trial
... so we're OK with closing issue #453; we're happy to see the progress of implementations across platforms
... we'll see how well developers can leverage it cross-platform during the origin trial
asully: there is still some skepticism around the long-term viability of high-level operators
reillyg: (this is tracked in a specific issue)
anssik: could you link these specific issues from #453 and then close the issue?
... thanks for having provided that feedback with such clarity on the goals for the work
reillyg: we've seen the progress we want to see on the spec and our issues, so this encompassing issue no longer feels needed

Subtopic: Other standards positions

#763
-> Issue 763: Request standards positions from Mozilla and WebKit https://github.com/webmachinelearning/webnn/issues/763

Anssi: Mike, can you get a WebKit standards position on WebNN?
Mike: will do
Anssi: the more actionable the feedback, the better
Mike: we're reviewing the specification; the DeviceType was the main thing we had found objectionable
... we still need to do more work on mapping to our data framework
... work in progress
... I'll post a request for a WebKit standards position
Anssi: the Mozilla rep Tarek isn't at TPAC this year, but we can work with him to file this
Dom: we can also file this as a WG
RafaelCintron: speaking for Edge, we're fully supportive of the work
reillyg: we're supportive of the work; we can't commit to shipping yet, but are looking forward to lessons from the origin trial

Topic: Interop and cross-group coordination

Subtopic: Interop issues across different backends

-> open "interop" issues https://github.com/webmachinelearning/webnn/issues?q=is%3Aissue+is%3Aopen+label%3Ainterop
#739
-> Issue 739: Limited support for pad on CoreML backend https://github.com/webmachinelearning/webnn/issues/739
#180
-> Issue 180: Should dilated pooling be supported https://github.com/webmachinelearning/webnn/issues/180

ningxin: one category I want to highlight is the handling of failure behavior
... e.g. out-of-bounds indices for gather/scatter
... out-of-bounds errors may create memory issues
#486
-> Issue 486: Add "implementation consideration" about how out-of-bound indices of Gather/Scatter should be handled https://github.com/webmachinelearning/webnn/issues/486
... what should the behavior be in this situation? the underlying platforms may have different approaches (error, clamping, normalized)
... some native APIs describe this behavior as simply undefined
... (beyond memory safety)
#691
-> Issue 691: Divide-by-zero outcome should be standardized https://github.com/webmachinelearning/webnn/issues/691
... in some cases, this may vary across hardware vendors
#487
-> Issue 487: Should `axes` be a required parameter for the layerNormalization build method? https://github.com/webmachinelearning/webnn/issues/487
#481
-> Issue 481: Should `scale` and `bias` be required inputs for the `batchNormalization` op? https://github.com/webmachinelearning/webnn/issues/481
... these two issues are about optional attributes - we set a default value for them, but the actual default may vary across platforms
... and about optional operands, e.g. for batchNormalization in #481
... when they're not present, the implementation has to provide a default value, which increases the complexity of the implementation
... the question is whether we should make them required and leave the cost to the framework
... some native platforms support the optional operand concept, e.g. CoreML
... I propose to close #383
-> Issue 383: Need to restrict the value of alpha to be positive for elu operation https://github.com/webmachinelearning/webnn/issues/383
... with the input we got, we concluded that there is no such restriction
... the CoreML and DirectML docs say so explicitly
... TFLite doesn't support the alpha parameter, but it can be emulated
... we already proposed to close #324
anssik: any objection to closing #383?
... hearing none, we can close it
asully: we should try to avoid as much implementation-defined behavior in the spec as possible
... in particular for situations that would end up with very different behaviors across platforms, e.g. #486
... we should define a behavior implementable across platforms, even if it comes with some wrapper cost on some platforms
reillyg: the baseline is preventing any out-of-bounds memory access
... the spec should define a common behavior
reillyg: +1 to avoiding implementation-defined behaviors
... defaults in the graph builder API should be based on developer ergonomics
... there are cases where explicitly choosing NOT to have defaults provides a better developer experience
... this should be looked at on an op-per-op basis
... we shouldn't simply inherit the backend's default
RafaelCintron: +1 to both
-> WebGL out-of-range handling https://registry.khronos.org/webgl/specs/latest/1.0/#4.1
-> WebGPU shader security https://gpuweb.github.io/gpuweb/#security-shader
... if we can do it performantly, we should do it
... both WebGL and WebGPU have very similar provisions
... WebGL leaves implementations flexibility in how to deal with out-of-bounds situations
... WebGPU also has a list of potential behaviors
... for us, if we can clamp performantly, that's the best
reillyg: the priority of constituencies should be security, conformance, and performance-if-it-really-matters
McCool: looking at a couple of cases: scattering out of bounds doesn't matter much, but gathering out of bounds is problematic - we should discuss how it is handled
asully: re throwing runtime exceptions, there is no cross-platform way to throw that kind of exception, e.g. from a GPU
... I would be more supportive of using default values
Dwayne: is there any precedent for this we could use?
reillyg: for divide-by-zero, there was a suggestion to look at what different hardware does
asully: we probably want to ensure we always avoid runtime exceptions
reillyg: we need to measure the performance impact of clamping
dwayne: for out-of-bounds, the two options are clamping or returning 0/NaN
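[Illustration of the clamping option just mentioned: a decomposition sketch that rules out out-of-bounds reads by clamping indices before gather; this is one candidate behavior, not spec'd semantics, and `inputShape`/`axis` are assumed context.]

  // Clamp indices into [0, size-1] along the gathered axis.
  const lastIndex = inputShape[axis] - 1;
  const safeIndices = builder.clamp(indices, { minValue: 0, maxValue: lastIndex });
  const output = builder.gather(input, safeIndices, { axis });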
McCool: we should clarify what platforms mean by undefined behavior for scatter (is it undefined order?)
... i.e. non-deterministic atomic order
ningxin: for non-deterministic hardware, should we simply ensure safety and not try to define anything beyond that?
reillyg: we should see if that introduces a fingerprinting surface
mccool: which would be really expensive to fix performantly

Subtopic: Core operator set

#573
-> Issue 573: Core operator set https://github.com/webmachinelearning/webnn/issues/573

reillyg: it seems like there is a minimum core operator set we can define (with data types), with some additional research
... there is an underlying question about high-level vs. low-level operators (the former being decomposable into the latter)
... but that can be dealt with as we compose that core operator set following our discussions on opSupportLimits
... what we consider core will evolve based on what we see as needed by modern models
... we should have a good definition of what it takes to add an operator to the spec, including to the minimum supported set
-> Adding new ops https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation
dwayne: a primitive is an op that cannot be further decomposed
reillyg: but do we want to add all those that cannot be decomposed?
... the criteria for inclusion should be "can be implemented on at least two platforms across several data types"
dwayne: I want us to be proactive in adding cross-platform available operators
reillyg: we should also look at TOSA and LINALG
ningxin: these two are good targets since they're used by hardware vendors as compilation targets
NeilT: we should also look at MLIR
asully: the other side of the question is what to do with the non-core operators
(from the queue: are we going to subclass MLGraphBuilder to support multiple sets?)
jsbell: how do we reflect that in practice in the spec? do we categorize operators?
reillyg: @@@
dom: re beyond-core ops, I think this will be driven by the implementation pressure to limit operators to those that actually provide a performance boost on enough platforms/frameworks
reillyg: getting implementation feedback across platforms and engines will provide useful push-and-pull
... if an op can be implemented by two different engines, it can be in the spec; if it can be implemented everywhere, it can be in the core set
... it's likely that any op can be implemented everywhere, but not for all data types
anssik: implementability across backends may be more critical than across engines
reillyg: re high-level vs decomposed low-level operators, this will likely be based on collecting performance data in practice
ningxin: the high-level ops come from optimized support in some of the native platforms
... but this could be replaced by the backend compiler detecting and applying the optimized path
dom: for some operations, there may also be an engine-specific stance on implementability for other reasons (e.g. fingerprinting)

Subtopic: MLTensor

PR #754
-> Pull Request 754 Add MLTensor explainer https://github.com/webmachinelearning/webnn/pull/754
asully: open to merging this explainer soon, although the recent discussion on MLDeviceType will impact this
... since it affects the buffer allocation
... the big open questions are about WebGPU interop
... how to share a buffer between WebNN and WebGPU (e.g. for video feed processing)
... how efficient can this be made? on systems where this can be expressed as GPU commands, this is simple
... when it's not possible, we might have to use the CPU for synchronization (with a performance hit)
... the goal should be that the UA has enough information to appropriately allocate the buffer
dom: I don't think we should block on the device type discussion before merging the explainer
RafaelCintron: GPUExternalBuffer as a type is still an open question (in comparison to GPUBuffer)
asully: it would have more restrictions than a simple GPUBuffer
... I'd be happy to simply use GPUBuffer
McCool: rather than talking about device types, we should talk about which graphs they're communicating with
RafaelCintron: so long as we don't regress scenarios where connecting several graphs ends up creating copies
... being able to ensure graphs use the same resource domains
reillyg: with the device type discussion we had earlier, my proposal is that MLContext becomes that resource domain
... if you create an MLContext and create a bunch of graphs, they should share that resource domain, and the buffers should be accessible to these graphs
... are there really cases where you would have buffers before having your models ready to execute?
RafaelCintron: there is at least the constant buffer case as an exception; but for input/output you're probably right
reillyg: yeah, I would keep the MLConstant operand upload as a separate topic
ningxin: there are complex scenarios where a model output can be used by two models that are running on different devices
reillyg: an MLTensor can only be used by graphs created by the same context today
... we can move them together from one unit to another, but they're not expected to be split
... an MLContext represents resources that can be cheaply shared with each other
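To make the shared-resource-domain point concrete, a rough sketch following the MLTensor explainer (PR #754); the method names and descriptor fields track the explainer as discussed here and may well change:

    // All objects below come from the same MLContext, so they share one
    // resource domain. Assumes a `graph` already built from this context.
    const input = await context.createTensor({ dataType: "float32", shape: [1, 4] });
    const output = await context.createTensor({ dataType: "float32", shape: [1, 4] });
    context.writeTensor(input, new Float32Array([1, 2, 3, 4]));
    context.dispatch(graph, { x: input }, { y: output }); // names match the graph's I/O
    const result = await context.readTensor(output);      // copy back to script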
#760
-> Issue 760 Support building graphs from `MLTensor` containing constants https://github.com/webmachinelearning/webnn/issues/760
Anssi: so let's proceed with merging the explainer and iterate from there
reillyg: I'd like to understand the relationship between this and the work that Austin has been doing in the Chromium prototype to eagerly upload constants to the GPU process
... if the goal is to allow streaming constants into the graph builder before you call build()
... then that's an implementation detail that can be added to the existing constant() function
... what that doesn't cover is the case of a constant being reused by multiple graphs
... is that your specific use case?
BryanB: correct
reillyg: if you're compiling the graph with the constant values, that compilation may optimize the constant values and change them - can they still be shared then?
... each graph might optimize operators differently and use different constant values
BryanB: the unoptimized copy might be re-used for another build
RafaelCintron: the reason we want to do this specifically is because we found scenarios where buffers need to be shared across graphs
... this came up in a discussion about an MLGraphBuilder that would be used to create two graphs
... (which is the alternative approach to defining a new MLTensor type)
McCool: constant operands would also be useful to cache a model across different contexts
... if I have a big model that I want to run in a WebGPU execution *and* in a WebNN execution
... I wouldn't want to have to re-download the model
reillyg: there are two parts: one is sharing constant data between a graph you built with WebNN on one site and with WebGPU on another site (beyond SOP limitations); the other is that we already have an explicit assumption that graph building can create an optimized copy, destroying the original
McCool: how important is it to get interop between WebNN and WebGPU implementations?
asully: different platforms have different handling of constants (e.g. they're handled as a file on CoreML)
... MLTensor would be used for input/output; sharing that data between WebGPU and WebNN is out of scope
reillyg: right - they would have to handle it as an input, which may come with a performance cost
ningxin: this emerged when we restricted MLGraphBuilder to use a single graph
... the reason for this was to ensure the timely release of resources
... doesn't destroy() help with resource lifecycle?
reillyg: the ability to constrain how resources get consumed by frameworks is another optimization this allowed
reillyg: DirectML will copy any constant you give it
... if we started eagerly copying constants to the GPU process to have them in the processing pipeline ASAP, would it help DirectML?
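A hypothetical sketch of the constant-reuse use case tracked in issue #760; createConstantTensor() and the constant(tensor) overload are invented here purely to illustrate sharing weights across two builds without a second upload:

    // Invented API, illustration only (see issue #760): upload the weights
    // once, then reference them from two different graph builders.
    const weights = await context.createConstantTensor(
        { dataType: "float32", shape: [1024, 1024] }, weightData);
    const a = builderA.matmul(x, builderA.constant(weights));
    const b = builderB.matmul(y, builderB.constant(weights));
    // Both builds can point at the same uploaded copy instead of copying
    // the weight data into each graph separately.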
RafaelCintron: if you make this "owned by DirectML"…
reillyg: in the CoreML backend, we would stream constants into the weights file
... we could reuse the same file for multiple graphs
asully: so I think we can confirm there is a use case to allow re-use of constants across graphs
reillyg: e.g. an optional parameter to the creation of constants

Subtopic: MLConstantOperand

issue #668 and PR #747
-> Issue 668 Do we need an `MLConstantOperand`? https://github.com/webmachinelearning/webnn/issues/668
-> Pull Request 747 Introduce MLConstantOperand https://github.com/webmachinelearning/webnn/pull/747
reillyg: a couple of pieces to this: whether we want to encode the constant parameter requirement in the API (a note on the bias parameter for convolution saying it must be a constant, either through a dedicated type or with a property on the parameter)
... we could specify that implementations do constant folding - if you take a constant and pass it to the add node with another constant, the implementation would take care of making the result a constant for the backend framework
... having an encouragement to provide a constant for these parameters and an ability for the implementation to compute the constantness of a parameter would provide both interop and performance benefits
dwayner: what does it mean to be a constant to CoreML? is it limited to CPU or also on GPU?
reillyg: it needs to be present at the time the graph is created
dwayner: could it be created as an MLTensor?
asully: the content of the constant needs to be known at compile time
dwayner: I see the value of being able to query a constantness property, whether it's required or for perf improvement
... I'm not sure it needs to be exposed in the API
... I'm not entirely confident that constant folding would solve all the cases I saw in my research
... the emulation in most cases would be adding one op
... I would like to minimize the number of cases where we require constantness
reillyg: this ties in with the question of sharing constants
... my intuition would be to assume constantness until we find a reason not to
... it's easier to change a requirement from const to non-const than the reverse
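A toy sketch of the constant-folding idea reillyg describes above, using invented types rather than WebNN ones: when both inputs of an op are known at build time, the implementation can evaluate it then, so the result stays a constant for backends that require one:

    // Toy model of the folding rule discussed above; not WebNN types.
    type Operand =
      | { kind: "constant"; data: number[] }
      | { kind: "node"; op: string; inputs: Operand[] };

    function add(a: Operand, b: Operand): Operand {
      if (a.kind === "constant" && b.kind === "constant") {
        // Both inputs known at build time: fold, so the backend still sees
        // a constant (e.g. for parameters it requires to be constant).
        return { kind: "constant", data: a.data.map((v, i) => v + b.data[i]) };
      }
      return { kind: "node", op: "add", inputs: [a, b] };
    }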
dwayner: but again, except for lstm and gru, it's only one op to emulate
reillyg: if it's only limited to a small number of CoreML ops, I agree we could just decompose
asully: all the CoreML ops that require constantness are high-level ops that need decomposition in the spec in any case

Topic: Implementation plans and trials

anssi: we're iterating at CR stage, which is a call for implementations; we have implementations across 3 backends, one of which implements the full API, with the other backends a bit behind
... multiple OSes, with the API behind a flag in Chrome and Edge
Mike: we would be interested in data on which models run on the NPU across a wide range of devices, vs falling back on CPU/GPU
reillyg: is there a way to detect on CoreML whether something ran on the NPU?
Mike: I think so but will double check
anssik: figuring out the right metrics for the origin trial is important
rafael: perf comparison, compilation time, top 40 operators and spread of usage, context losses, memory usage
... we need to have an API that can remain stable for a few months to collect useful data
... the WebGPU OT lasted several months, with multiple breaking changes, which wasn't ideal
reillyg: the expectation is that we're not going to ship at the conclusion of the OT; it's a data-gathering experiment, and the API would be turned off at the end of the period
... we expect that most developers will be using frameworks, which will go back to using CPU or GPU delegates after the period
RobKochman: we have to think through what the developers would actually do and what success would look like for them
McCool: do we want to collect data on which models?
rafael: we wouldn't know it from telemetry, but through surveys
jsbell: we want developers to do A/B testing across the different execution providers, since I'm not sure we could tease that out on our end
mccool: maybe the frameworks could help with the A/B testing

Topic: Advancement on the W3C Rec Track

Subtopic: Wide review status

-> CLOSED Issue 933 Updated review of WebNN API (resolution: satisfied with concerns) https://github.com/w3ctag/design-reviews/issues/933
Anssik: I suggest we integrate the responses Rafael and reillyg gave as non-normative text in the spec
... the TAG is also asking about future-proofing the API against hardware evolution
reillyg: clearly we have thought about it; we still need implementation feedback to determine whether our solution works
anssik: the TAG is also asking about multi-platform/multi-engine implementations; we have multiple backend implementations, all major browser vendors in the group, 3-ish OS support
rafael: +1

Subtopic: W3C “living standards” expectations

dom: there's no "living standard" stamp at W3C; you can remain at CR stage as long as the WG is operating
... either you stay at CR, publish a CRS every two years, and go through wide review
... iterating on a 2-year cycle
... or take the more traditional path of going to Recommendation status, which requires demonstrating interop experience across all features in 2 or more implementations when going from CR to Recommendation
... rarely is everything perfect, but you need to demonstrate the standard is interoperable at this stage
... my personal perspective is that going that final step, as painful as it is, ensures convergence and reflects what end users need from the technology
... WebNN now has a significant number of ops; the risk if we stay iterating at CR is that we never take a knife to carve out those ops that don't get enough implementation experience
... two engines willing to ship across OSes and backends
... without going to REC, we could iterate on CR
anssik: WebRTC Recommendation experience?
dom: adding post-REC corrections is cheap from a process perspective
... I'm driving this in the WebRTC WG, where we made sufficient progress that I can say it works well for that group and could work here too
jsbell: thanks dom, this is what I wanted to learn
... we don't need to make this decision now

Topic: Incubations

Subtopic: Custom ops

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0007/Tensor_Primitive_Ops_Proposal_-_TPAC.pdf
[slide 1]
[slide 2]
[slide 3]
[slide 4]
[slide 5]
[slide 6]
[slide 7]
[slide 8]
[slide 9]
#559
-> Issue 559 Control flow operations: if, while https://github.com/webmachinelearning/webnn/issues/559
reillyg (wanting to ask whether any current backends support building custom ops using a subgraph): providing more structure to the underlying compiler is likely to produce good results
... however, the current backends we've been prototyping with don't currently provide this support
ningxin: I think CoreML does
dwayne: this maybe could be made to work with DirectML
reillyg: but this would require pattern matching
dwayne: but a very simple pattern matching
reillyg: but it risks creating performance cliffs; a smarter compiler would make me feel more confident
asully: from a Web platform perspective, the question is whether we need to be able to provide hints, e.g. reusable subgraphs, that can be passed to the lower level
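For concreteness, a hypothetical sketch of the "reusable subgraph" hint asully mentions; composite() is not a real WebNN API, and the shape below is loosely inspired by StableHLO's composite op, linked just after this exchange:

    // Invented composite() hint, illustration only: the callback defines the
    // decomposition, and the label lets a backend pattern-match the whole
    // cluster to a fused kernel, or else run the decomposition as-is.
    const square = (b: MLGraphBuilder, x: MLOperand) => b.mul(x, x);
    const y = builder.composite("example.square", square, input);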
asully: where that is implemented should remain an implementation detail
jsbell: very cool proposal
-> https://openxla.org/stablehlo/spec#composite
jsbell: the StableHLO approach is very similar and may be a way to annotate the subgraphs
asully: there are aspects of the StableHLO compositor we wouldn't want to expose to the Web
... we wouldn't want to have magic values everywhere - we would have to consider the decomposition, not doing string matching
gdti: cool proposal, two points; one minor: the PoC is comparing the WASM built-in function to standard WASM, so the perf isn't representative of the fallback path, given the cost of going from WebNN to WASM
... from a Web platform perspective, exposing this fully to the Web might be too challenging to maintain
asully: with this custom op proposal, would this open the way to removing the specification of many of the WebNN high-level operators which can be expressed in lower-level WebNN ops?
... I'm supportive of making the API more focused on lower-level APIs à la MLIR

Subtopic: Built-in APIs for translation and prompting

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0008/TPAC_2024_Built-in_AI_APIs.pdf
[slide 1]
[slide 3]
[slide 4]
[slide 5]
[slide 6]
Minor correction: the 13 partners [...] numbers are for what preceded the early preview program (a tighter early exploration with a few partners); the early preview program has orders of magnitude more participants
[slide 7]
[slide 9]
[slide 10]
[slide 11]
[slide 12]
[slide 13]
[slide 14]
[slide 15]
[slide 16]
[slide 17]
[slide 18]
[slide 19]
[slide 20]
[slide 21]
[slide 22]
[slide 23]
-> Translation API https://github.com/WICG/translation-api
-> Prompt API https://github.com/explainers-by-googlers/prompt-api/
[slide 24]
[slide 25]
[slide 26]
[slide 27]
[slide 28]
[slide 29]
-> https://docs.google.com/presentation/d/1QeVJ6gsE8_xy2Yui1KcTB75dsiKTI0dDFD7mgLITnmE/edit?usp=sharing
ningxin: can a developer expect to get the same level of performance from these task APIs and WebNN?
domenic: the task API would probably hit the hardware more directly, so it might have a bit of a perf advantage, but I assume the goal for WebNN is to make this imperceptible

Subtopic: Model management

Slideset: storage_slides
[slide 2]
[slide ?] (Alternative 5)
also, if it's rare enough, the likelihood of benefiting from sharing it across origins should be low (for most users)
Meeting: Web Machine Learning WG F2F – 23 September 2024 (Part 2)
McCool: I will be presenting a breakout session on this on Wednesday.

Topic: Wrap up

anssik: thank you for your active participation and great discussions
... interested folks are welcome to join us for dinner at the Anaheim Packing District, 2.5 miles from the meeting venue
-> https://www.anaheimpackingdistrict.com/merchants