RRSAgent log: https://www.w3.org/2024/09/23-webmachinelearning-irc

Meeting: Web Machine Learning WG F2F – 23 September 2024
Chair: Anssi
Agenda: https://github.com/webmachinelearning/meetings/issues/25
-> Issue 25: WebML WG - TPAC 2024 agenda https://github.com/webmachinelearning/meetings/issues/25
Scribe: Anssi
scribeNick: anssik
scribe+ dom
Minutes: https://www.w3.org/2024/09/23-webmachinelearning-minutes.html

Present+ Anssi_Kostiainen
Present+ Dominique_Hazael-Massieux
Present+ Ningxin_Hu
Present+ Michael_McCool
Present+ Rafael_Cintron
Present+ Neil_Trevett
Present+ Austin_Sullivan
Present+ Dwayne_Robinson
Present+ Yuta_Hagio
Present+ Fredrik_Solenberg
Present+ Laszlo_Gombos
Present+ Kenji_Baheux
Present+ Rob_Kochman
Present+ Rachel_Yager
Present+ Reilly_Grant
Present+ Domenic_Denicola
Present+ Joshua_Bell
Present+ Deepti_Gandluri
Present+ Iris_Ren
Present+ Bryan_Bernhart
Present+ Lei_Zhao
Present+ Zoltan_Kis

Topic: Welcome

anssik: this is our first F2F as a WG, despite us having existed for a long time
... I'm Anssi Kostiainen from Intel, chair of the WG, supported by Dom as our staff contact; thanks to the TPAC organizers for making this happen
... great to see both long-time participants and new faces
... this WG now has all the major browser vendors as participants, with Mozilla joining recently
... we bring together a diverse set of experts at the intersection of AI-related fields, from different backgrounds
... including library makers who help us calibrate our work to real-world requirements
... people from industry and research backgrounds - let other people know they can get involved

Present+ Thomas_Steiner
Present+ Taylore_Givens
Present+ Mike_Wyrzykowski
Present+ Kunihiko_Toumura
Present+ Wonsuk_Lee
Present+ Ali_Spivak

[round of intros]

Subtopic: Charter orientation

Anssi: we are two groups: the Web Machine Learning Working Group and its eponymous Community Group
... the WG deliverables include the WebNN API (the core focus of the technical work) and ethical guidelines (a topic on which we will hear from an OpenAI researcher)
... work on the Model Loader API is blocked on a standardized model format
... the CG is responsible for incubating proposals, some of which may later graduate to standardization
Dom: note that the WG charter is expiring next year, so we'll need to start discussions about potential additions in the next few weeks/months
Anssi: the CG allows us to do exploratory work - that's how WebNN itself started
Anssi: we're also chartered to coordinate with other groups: the WebGPU WG (with related topics on our agenda, e.g. MLTensor)
... the WebAssembly WG is also an important related group (with Deepti in particular helping coordinate)
... there are also important integration questions around WebRTC, which Ningxin explored a couple of years ago
... we also work closely with the Technical Architecture Group, which helps us make sure our API fits well in the broader platform

Present+ Jay_Wang

Topic: Ethics

Subtopic: Democratizing Human-Centered AI with Visual Explanation and Interactive Guidance

Slideset: Jay_Wang_slides

Anssi: please welcome Jay, a safety researcher at OpenAI, who will explain his research focus on making AI more accessible through novel interfaces
... you're also part of Georgia Tech and have published many papers and open source tools

[slides 1-18]
[demo of CNN Explainer]
-> CNN Explainer https://bit.ly/cnn-explainer
[slides 19-27]
-> DiffusionDB Explorer https://bit.ly/diffusiondb-vis
[slides 28-29]
-> WizMap Embeddings https://bit.ly/wizmap-acl
[slides 30-35]
-> GAM Changer https://bit.ly/gam-changer
[slides 36-51]
-> WebSHAP https://bit.ly/webshap
[slide 52]
-> MeMemo https://bit.ly/mememojs
-> Wordflow https://bit.ly/wordflow-tool
[slide 53]

anssik: thank you for this very comprehensive presentation
... a specific intersection with our work you alluded to is the possible integration of some of these tools in browser developer tools

Kenji: how hard is it to get the model to explain its behavior (e.g. in the GAM Changer example around age/risk)?
Jay: this particular model was a simple regression model, where it is easier to identify the particular source of the model behavior
Rachel: why are human hands so problematic for AI image generators?
Jay: the geometry of hands has been really hard for models to capture, but they're improving
McCool: re WebNN, is there any gap related to your work?
Jay: my tools are mostly based on TensorFlow.js
... there may need to be different modalities of input, with different ways of embedding the vectors, to cater to the emerging needs of generative AI
Anssi: thanks again Jay, we hope to work more with you

Topic: Spec orientation

anssik: (on the queue) I want to propose next steps for #375, discuss the priority of #559, and bump the priority of #666
-> Issue 375: Support for transformers https://github.com/webmachinelearning/webnn/issues/375
-> Issue 559: Control flow operations: if, while https://github.com/webmachinelearning/webnn/issues/559
-> Issue 666: Reconsider `MLOperand` methods https://github.com/webmachinelearning/webnn/issues/666

Present+ Andrew_Nolan

-> Open issues https://github.com/webmachinelearning/webnn/issues
-> Triage guidance https://github.com/webmachinelearning/webnn/blob/main/docs/IssueTriage.md
-> "interop" issues https://github.com/webmachinelearning/webnn/issues?q=is%3Aissue+is%3Aopen+label%3Ainterop

I propose closing https://github.com/webmachinelearning/webnn/issues/11 as obsolete since we've decided to pursue a graph API.
two things that I noticed: int64 issues (relates to interop) and use of constants with MLTensor (a potentially useful mechanism for model management)
I propose closing https://github.com/webmachinelearning/webnn/pull/541 in favor of https://github.com/webmachinelearning/webnn/pull/754 (can this be merged?).
-> PR 754: Add MLTensor explainer https://github.com/webmachinelearning/webnn/pull/754
-> PR 541: Add MLBuffer exploration doc https://github.com/webmachinelearning/webnn/pull/541
-> Issue 374: Simplify `MLPool2dOptions` by removing the `outputSizes` option https://github.com/webmachinelearning/webnn/issues/374
-> Issue 474: Simplify `resample2d` op https://github.com/webmachinelearning/webnn/issues/474
-> Issue 470: Simplify `matmul` op https://github.com/webmachinelearning/webnn/issues/470
-> Issue 324: Simplify the operand layout support of conv2d and pooling 2d operations https://github.com/webmachinelearning/webnn/issues/324
-> Issue 749: MLContextOptions.deviceType seems unnecessary outside of conformance testing https://github.com/webmachinelearning/webnn/issues/749

The most recent ~5 issues don't have labels; I don't have permission to add them.

reillyg: issue #11 was opened a long time ago and seems like it can be closed
anssik: no one is objecting to closing #11, so let's do it
reillyg: there are two open pull requests in the MLTensor space; can we close the generic one and keep only the specific one? can we land the PR for the explainer?
Austin: I'll close #541
anssik: we should look at merging the explainer after our MLTensor discussion later today
jsbell: I propose closing this group of "simplify" issues unless someone strongly advocates for them soon: #474, #470, #374, #324 (sorry Ningxin!)
... we should either do them soon or abandon them
ningxin: I think we can close #324
reillyg: in the upcoming implementation, I've added automatic transposes
... I think we can do without that particular simplification
anssik: so let's close #324
(Has 2d vs. 2D casing been discussed? https://w3ctag.github.io/design-principles/#casing-rules)
anssik: re #470, do we need to retitle it? open a different issue?
ningxin: I'll retitle #470 to reflect its status
dwayner: I'll open a new issue instead, linking back to that one

I think we can close the old MLBuffer PRs in favor of the MLTensor explainer: #542 #543 #544
-> Issue 542: [MLBuffer] Creation and representing MLBuffer on XPU devices https://github.com/webmachinelearning/webnn/issues/542
-> Issue 543: [MLBuffer] Uploading/downloading tensor data https://github.com/webmachinelearning/webnn/issues/543
-> Issue 544: [MLBuffer] Support for MLBuffer in graph execution https://github.com/webmachinelearning/webnn/issues/544

dwayner: re #374, we should probably align pool with conv - I'll propose next steps in the issue
ningxin: closing #474 SGTM

Topic: New features

Subtopic: A refreshed analysis of popular models

#375
Slideset: dwayne_webnn_operator_update
-> 33 models, 12 operators, proposed IDL, data types https://github.com/webmachinelearning/webnn/issues/375#issuecomment-2292466613

[slides 2-4, 6-11]

Quick note: Memory64 for Wasm has been available on both Chrome and Firefox nightly behind a flag for about the last year; it is really close to being enabled by default, and the proposal is stable, just pending a phase 4 poll on closing out the last few spec issues. We'd love folks to try it out and let us know if something isn't working as expected.

ningxin: we already have some implementation experience with these proposed operators
dwayne: it has informed some of the proposals
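[Illustration: a minimal feature-detection sketch for one of the proposed wave-3 ops; scatterND is named in the issue #375 analysis, but these ops were not yet in the spec at the time, and the probing pattern below is an editorial assumption rather than group guidance.]

  // Probe the builder for a proposed op before relying on it
  // (assumes an async context).
  const context = await navigator.ml.createContext();
  const builder = new MLGraphBuilder(context);
  if (typeof builder.scatterND === "function") {
    // Build the graph with the native op.
  } else {
    // Fall back to a decomposition or a Wasm path.
  }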
anssik: I'm interested in general feedback on this approach to the wave 3 operators, given current prototyping efforts
ningxin: scatterND helps with performance, not just functionality
jsbell: I ♥ the wave nomenclature - it may be useful to use it for our issues in the repo
... how far along are we with the implementation? any sense of when the impl/spec will be ready to advance to origin trial?
dwayne: for op completeness in the DML backend, maybe two weeks of implementation work
... I hope to add the ops to the spec in the same timeframe
McCool: how many of these models are actually useful in the Web context?
dwayne: huge models are good for demonstrating viability, but they're so big they're likely not practical to use directly in the browser
McCool: +1
ningxin: they were identified through popularity in Transformers.js, so they are already used in the browser context
NeilT: any consideration of the set of operators proposed by Arm in their TOSA proposal? https://www.mlplatform.org/tosa/
dwayne: I've looked at it and have data on it that I'll share
anssik: hearing overall support for the approach; no specific plan for origin trial yet
... in terms of spec work, do you expect any specific challenges?
dwayne: it should be pretty straightforward given our experience; a few interesting questions around invalid index conditions
ningxin: the int4/uint4 data types will need attention and wide review

Subtopic: Quantization and dequantization

#93 #128 #623
-> Issue 93: Add QuantizeLinear and DequantizeLinear for mixed precision https://github.com/webmachinelearning/webnn/issues/93
-> Issue 128: WebNN should support int8 quantized models https://github.com/webmachinelearning/webnn/issues/128
-> Issue 623: WebNN should support NPU and QDQ operations https://github.com/webmachinelearning/webnn/issues/623
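[Background sketch for the discussion below: an explicit dequantize -> compute -> quantize chain, using the quantizeLinear/dequantizeLinear op names proposed in issue #93. These ops were not yet in the spec at the time of this meeting; the shapes and the `scale`/`zeroPoint`/`filterShape`/`int8Weights` operands are illustrative assumptions. A backend is then free to fuse or strip the Q/DQ pair, which is exactly the behavior question raised below.]

  // int8 weights are dequantized explicitly, compute runs in float32,
  // and the result is re-quantized; `scale` and `zeroPoint` are
  // MLOperands (e.g. created via builder.constant()).
  const wq = builder.constant(
      { dataType: "int8", dimensions: filterShape }, int8Weights);
  const w  = builder.dequantizeLinear(wq, scale, zeroPoint); // int8 -> float32
  const y  = builder.conv2d(input, w);                       // float compute
  const yq = builder.quantizeLinear(y, scale, zeroPoint);    // float32 -> int8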
reillyg: adding operators to explicitly dequantize int8 and int4 values to float16/float32 makes a lot of sense for the spec
... an open question: do we agree that explicit dequantization is the approach we want to take?
... backends can detect a dequantize/conv2d pair, so expressing this in the API itself may be unnecessary
dwayne: that matches the experience in ONNX
ningxin: quantize/dequantize also makes it easier for the backends to fall back
reillyg: if we expect that backends may strip out Q/DQ pairs, how do we want to specify the behavior of the full graph from a precision perspective?
... how useful would a requantize operator be?
dwayne: I have seen hundreds of models using the quantize operator
ningxin: the behavior is up to the implementation, but how do we specify this?
dwayne: does it change the overall precision?
reillyg: in TFLite, there is a top-level flag to control this
jsbell: one other approach we've seen is attaching the scale to the tensor
... I'm assuming we want to be explicit and not pursue this, but I want to make sure we know of the options
reillyg: that also leads to a huge explosion of the type system
dwayne: I'll go with explicit
reillyg: if we support quantize/dequantize, what data types do we support? int8/uint8 seem obvious, int4/uint4 are more complicated
... representing int4 on the Web and across backends is challenging
dwayne: from a Web API perspective, they would be exposed as Int8Array
... from an implementation perspective, I'm not sure how to handle int4 as input
reillyg: quantization is most useful for weights; do we know of any need for it on inputs/outputs?
McCool: I've seen quantization for activations as well
reillyg: i.e. using it as another kind of activation function
... I've linked the 3 related issues - should we triage them into a single issue?
... either #93 or #128 (#623 has a bunch of unrelated aspects)
reillyg: I'll clean them up now that I have the proper repo privileges

Subtopic: Platform capability detection

reillyg: there is a bit of overlap between this topic and the next one on future-proof device selection
... capability detection and device selection go hand in hand
... capabilities depend on the platform you're on and which device you pick
... I was looking at how WebGPU handles this; in WebGPU, the first step a developer goes through is requesting an adapter, at which point the system decides which adapter to use
... the question raised in Mike's proposal is whether we can have developers give us the set of features they want, and whether we can fulfill that
Mike: in WebGPU you get a set of limits, with defaults but also maximum limits
... the defaults are guaranteed to run everywhere; if you ask for something above the defaults, you can run on the particular device, but with no guarantee of running everywhere
... so instead of only describing what the device supports, this establishes a baseline of support that can run everywhere
reillyg: it's hard to have a default operator set that works across platforms, mostly because of data type support
... it seems it's not possible to give a baseline for models and data types with a guarantee to run everywhere
... any framework built on top of WebNN will have to have code that responds to the capabilities of the platform and tailors the graph to those capabilities as it is being built
... given the current landscape of hardware support, I'm not sure it makes sense to create a baseline set
Mike: it would be interesting to see what we're missing from this core set given the support of TensorFlow.js in WebGPU
asully: to get to a common set, one approach would be to relax the requirement that NPU matches with an NPU device
reillyg: with a restriction to GPU, could we identify a baseline operator set?
dwayne: probably
reillyg: if we said "if you're OK using the GPU and doing float32, you're OK in the baseline"
ningxin: we have a bunch of ONNX models in our tests that use int16; we've added support for int16 in opSupportLimits in a PR, with a Wasm fallback
asully: we could limit the size of indices to int32
... to avoid the performance penalty of falling back to Wasm
Mike: our goal is to avoid developers inadvertently not running on some platforms
... we want to make sure it happens with clear intent from the developers and with a sense that they will build a fallback
reillyg: given that this is something that will be intermediated by frameworks, I'm wondering if this is something we want to do through an API or through developer tooling
... e.g. a flag to enable a compatibility mode
(my comment was to suggest a compatibility mode for testing, now covered)
reillyg: how does framework intermediation apply to the WebGPU case?
Mike: the engines tend to handle it for developers
... a developer tool setting sounds like a good idea
reillyg: the frameworks have a backup (e.g. going back to Wasm), and it's relatively easy for them to detect when they go off limits
... we don't want frameworks to guess and check
Mike: we could still expose maximum limits, but keep the defaults to a baseline
ningxin: we could also work on promoting the most Web-compatible data types in the transformers tooling community; this would reduce situations where frameworks have to fall back
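[Illustration of the capability-detection shape being prototyped: a minimal sketch assuming the opSupportLimits() draft API of the time; the member names (gather.indices.dataTypes) are an assumption based on the draft and may differ.]

  const context = await navigator.ml.createContext();
  const limits = context.opSupportLimits();
  // e.g. check whether gather indices support int64 before building,
  // and cast indices to int32 up front if they don't.
  const useInt64 = limits.gather.indices.dataTypes.includes("int64");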
reillyg: so is the concern with opSupportLimits more about compatibility than fingerprinting?
dom: the Web platform is compelling enough as a distribution platform that it can drive convergence toward a baseline
reillyg: what we need to figure out is the minimal supported operator set and data types (assuming GPU execution)
... plus an opt-in parameter to request more than the default
anssik: this might also help with fingerprinting
RafaelCintron: e.g. implementors could choose not to provide this "upgrade" path in a privacy mode

Subtopic: Future-proof device selection abstractions

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0006/MLDeviceType.pdf
[slides 2-3]

reillyg: getting rid of the explicit device type makes sense
... the choice to me is between being very vague (with power preference) or a bit more specific: CPU-only, CPU+GPU, CPU+NPU, CPU+GPU+NPU
... to avoid the compatibility issues that come with the ambiguity of power preference
Mike: as more WebNN adoption occurs, it would be great to get data on the performance improvements actual apps would get if they had more guarantees about which device they run on
reillyg: we would like to run further experiments to see if developers can actually target NPUs in a cross-platform compatible fashion, given the level of diversity in NPUs on the market today
... so basically agree
Mike: if we had sufficient data to show that device type selection is necessary, we would be more open to it (as CoreML allows)
... our proposal is that initially we remove the concept, with an openness to reconsider it based on data
RafaelCintron: the Windows ecosystem is a lot more heterogeneous, so not having device selection feels even harder there
... it's hard for the browser to make a decision, since the model is only known once the data is buffered
... maybe this could be done with a different API shape
... re privacy, how different is this from WebGPU? it seems you could do the same with an increasingly complex shader
Mike: you can hide capabilities; you can't fully prevent it, but there are protections that have been added to WebGPU to mitigate the trivial privacy attacks, and we would like to see them in WebNN as well
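[For reference, the two context-creation shapes under discussion; both options existed in MLContextOptions at the time, and the direction proposed here is to drop deviceType and keep only intent-level hints:]

  // Current shape: the developer pins an explicit device type.
  const gpuContext = await navigator.ml.createContext({ deviceType: "gpu" });

  // Proposed direction: express intent only, let the UA place the work.
  const context = await navigator.ml.createContext({ powerPreference: "low-power" });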
Bryan: MLTensor heavily relies on the device type
... we have scenarios to re-use tensors between input/output
... I'm wondering how far heuristic control can go in allowing the proper allocation of memory resources
... I fear this will lead to unnecessary reallocations/copies
asully: the MLTensor explainer has an open question here; an MLTensor is tied to an MLContext, which is tightly bound to a device
... if we change this, we will have to change the semantics of MLTensor, and similarly of MLGraph
... so there are solutions for that if we rescope the tensor to the graph
RafaelCintron: how do you share input/output tensors in that situation?
asully: that might lead to data copies
ningxin: facial recognition typically uses face detection and then recognition, through 2 different models
asully: there may need to be a way to declare that graphs share a buffer
reillyg: if we switch to only a power preference and possibly a prefer-cpu setting
... and a WebGPU device for interop
... ensuring consistent data sharing with the GPU,
... then it's up to the UA to deal with data placement
... not forcing developer decisions on graph placement if they don't have to sounds like an improvement
... I think the UA can make the right choice
asully: if it has the right information, yes
reillyg: if you create two graphs in a single context, this would hint that they should run close to each other
kenji_baheux: there may be cases where you want to avoid using the GPU (e.g. because it's used for higher-priority tasks); would there still be a way to indicate that?
anssik: which issue should we continue this discussion in?
asully: #749 is a good candidate
-> Issue 749: MLContextOptions.deviceType seems unnecessary outside of conformance testing https://github.com/webmachinelearning/webnn/issues/749
reillyg: #302 also exists, but is more vague
-> Issue 302: API simplification: context types, context options, createContext() https://github.com/webmachinelearning/webnn/issues/302

Present+ Tianqi_Chen
Present+ BryanB

Topic: Customer feedback & collaborations

Subtopic: Universal Large-Language Model Deployment with ML Compilation

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0004/MLCTalk.pdf
[slides 2-3, 5, 9-11, 13-17, 19-25]
-> WebLLM Chat Demo https://chat.webllm.ai/
[slide 26]
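[For context, a minimal WebLLM usage sketch; the API names and prebuilt model id follow the WebLLM project docs of the time and are illustrative, not an endorsement:]

  import * as webllm from "@mlc-ai/web-llm";

  // Engine bound to a prebuilt model, running in-browser via WebGPU.
  const engine = await webllm.CreateMLCEngine("Llama-3-8B-Instruct-q4f32_1-MLC");
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize WebNN in one sentence." }],
  });
  console.log(reply.choices[0].message.content);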
reillyg: are you looking at implementing support for compiling to the WebNN API? any feedback on the API capabilities and what you might need to support it as a backend?
tianqi: so far we've been focusing on the WebGPU backend; interop between WebNN and WebGPU would help ensure one doesn't block the other
... we've been looking at getting the compiler to generate JS that uses the WebNN API
reillyg: re WebGPU and WebNN, is your goal to implement non-WebNN operators using WebGPU?
tianqi: we want to be able to partition the tasks flexibly across the two
Mike: as you continue your work toward adopting WebNN, it would be great to provide feedback to the group, incl. comparisons with other backends in terms of performance
tianqi: +1; would love to see more contributions; WebGPU has been very useful to us, and it would be great to see the same with WebNN
anssik: the open source projects can be found under the mlc GitHub org
-> MLC-LLM blog (Jun 7, 2024) https://blog.mlc.ai/2024/06/07/universal-LLM-deployment-engine-with-ML-compilation
-> WebLLM blog (Jun 13, 2024) https://blog.mlc.ai/2024/06/13/webllm-a-high-performance-in-browser-llm-inference-engine
-> [Unity][Tutorial] TVM Unity BYOC https://discuss.tvm.apache.org/t/unity-tutorial-tvm-unity-byoc/14561

Present+ dezell

Subtopic: Transformers.js WebNN backend

[@@@ pre-recorded video by Joshua Lochner]

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0003/WebML_WG_-_Transformers.js_update__23_September_2024_.pdf
[slides 1-6]

ningxin: WebNN support is a planned feature of Transformers.js v3, with the upcoming origin trial an opportunity to get feedback
mccool: should we prioritize the discussion on dynamic shapes based on that input?
ningxin: developers can override the dimensions to adapt e.g. to the camera size
... support for a static KV cache is an open issue in the Transformers project afaik

Subtopic: ONNX Runtime Web & WebNN EP

ningxin: one topic covered by MLTensor is the capability to support ONNX models with external data
... weights are kept in an external data file
... we're enabling the WebNN Execution Provider to support that
... we want to reduce peak memory consumption, since it is very high and sometimes hits the limits
... there are different approaches under discussion to solve this, up to streaming network data directly to memory
... ONNX with external data is a significant use case for supporting models with big weights
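[For context, a minimal sketch of selecting the WebNN execution provider in ONNX Runtime Web with a Wasm fallback; option names follow the ORT Web docs of the time and are illustrative:]

  import * as ort from "onnxruntime-web";

  const session = await ort.InferenceSession.create("model.onnx", {
    executionProviders: [
      { name: "webnn", deviceType: "gpu", powerPreference: "default" },
      "wasm", // fallback if WebNN is unavailable
    ],
  });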
RafaelCintron: the ONNX backend team is happy with the contributions; they expressed some concerns about the lack of dynamic shapes in WebNN
McCool: the MLTensor prototype has strong typing, which may impact the ability to stream data
reillyg: for CoreML / TFLite, the model is essentially streamed to a file
McCool: allowing streaming from disk is useful to limit the impact on system memory
Re streaming, see for example G10; another is NanoFlow

Subtopic: Google Chrome Feedback revisited #453

-> Issue 453: Google Chrome Feedback on WebNN: aiming for broad device coverage and maintainability https://github.com/webmachinelearning/webnn/issues/453

reillyg: a year ago, as we started looking at WebNN, we provided a high-level set of feedback on the API
... most of this feedback is either already integrated, tracked in other issues, or will be answered during the origin trial
... so we're OK with closing issue #453; we're happy to see the progress of implementations across platforms
... we'll see how well developers can leverage it cross-platform during the origin trial
asully: there is still some skepticism around the long-term viability of high-level operators
reillyg: (this is tracked in a specific issue)
anssik: could you link these specific issues from #453 and then close the issue?
... thanks for having provided that feedback with such clarity on the goals for the work
reillyg: we've seen the progress we want to see on the spec and our issues, so this encompassing issue no longer feels needed

Subtopic: Other standards positions

#763
-> Issue 763: Request standards positions from Mozilla and WebKit https://github.com/webmachinelearning/webnn/issues/763

Anssi: Mike, can you get a WebKit standards position on WebNN?
Mike: will do
Anssi: the more actionable the feedback, the better
Mike: we're reviewing the specification; the DeviceType was the main thing we had found objectionable
... we still need to do more work on mapping to our data framework
... work in progress
... I'll post a request for a WebKit standards position
Anssi: the Mozilla rep Tarek isn't at TPAC this year, but we can work with him to file this
Dom: we can also file this as a WG
RafaelCintron: speaking for Edge, we're fully supportive of the work
reillyg: we're supportive of the work; we can't commit to shipping yet, but are looking forward to lessons from the origin trial

Topic: Interop and cross-group coordination

Subtopic: Interop issues across different backends

-> open "interop" issues https://github.com/webmachinelearning/webnn/issues?q=is%3Aissue+is%3Aopen+label%3Ainterop
#739
-> Issue 739: Limited support for pad on CoreML backend https://github.com/webmachinelearning/webnn/issues/739
#180
-> Issue 180: Should dilated pooling be supported https://github.com/webmachinelearning/webnn/issues/180

ningxin: one category I want to highlight is the handling of failure behavior
... e.g. out-of-bounds indices for gather/scatter
... out-of-bounds errors may create memory issues
#486
-> Issue 486: Add "implementation consideration" about how out-of-bound indices of Gather/Scatter should be handled https://github.com/webmachinelearning/webnn/issues/486
... what should the behavior be in this situation? the underlying platforms may have different approaches (error, clamping, normalized)
... some native APIs describe this behavior as simply undefined
... (beyond memory safety)
#691
-> Issue 691: Divide-by-zero outcome should be standardized https://github.com/webmachinelearning/webnn/issues/691
... in some cases, this may vary across hardware vendors
#487
-> Issue 487: Should `axes` be a required parameter for the layerNormalization build method? https://github.com/webmachinelearning/webnn/issues/487
#481
-> Issue 481: Should `scale` and `bias` be required inputs for the `batchNormalization` op? https://github.com/webmachinelearning/webnn/issues/481
... these two issues are about optional attributes - we set a default value for them, but the actual default may vary across platforms
... and about optional operands, e.g. for batchNormalization in #481
... when they're not present, the implementation has to provide a default value, which increases the complexity of the implementation
... the question is whether we should make them required and leave the cost to the framework
... some native platforms support the optional operand concept, e.g. CoreML
... I propose to close #383
-> Issue 383: Need to restrict the value of alpha to be positive for elu operation https://github.com/webmachinelearning/webnn/issues/383
... with the input we got, we concluded that there is no such restriction
... the CoreML and DirectML docs say so explicitly
... TFLite doesn't support the alpha parameter, but it can be emulated
... we already proposed to close #324
anssik: any objection to closing #383?
... hearing none, we can close it
asully: we should try to avoid as much implementation-defined behavior in the spec as possible
... in particular for situations that would end up with very different behaviors across platforms, e.g. #486
... we should define a behavior implementable across platforms, even if it comes with some wrapper cost on some platforms
reillyg: the baseline is preventing any out-of-bounds memory access
... the spec should define a common behavior
reillyg: +1 to avoiding implementation-defined behaviors
... defaults in the graph builder API should be based on developer ergonomics
... there are cases where explicitly choosing NOT to have defaults provides a better developer experience
... this should be looked at on an op-per-op basis
... we shouldn't simply inherit the backend's default
RafaelCintron: +1 to both
-> WebGL out-of-range handling https://registry.khronos.org/webgl/specs/latest/1.0/#4.1
-> WebGPU shader security https://gpuweb.github.io/gpuweb/#security-shader
... if we can do it performantly, we should do it
... both WebGL and WebGPU have very similar provisions
... WebGL leaves implementations flexibility in how to deal with out-of-bounds situations
... WebGPU also has a list of potential behaviors
... for us, if we can clamp performantly, that's the best
reillyg: the priority of constituencies should be security, conformance, and performance-if-it-really-matters
McCool: looking at a couple of cases: scattering out of bounds doesn't matter much, but gathering out of bounds is problematic - we should discuss how it is handled
asully: re throwing runtime exceptions, there is no cross-platform way to throw that kind of exception, e.g. from a GPU
... I would be more supportive of using default values
Dwayne: is there any precedent for this we could use?
reillyg: for divide-by-zero, there was a suggestion to look at what different hardware does
asully: we probably want to ensure we always avoid runtime exceptions
reillyg: we need to measure the performance impact of clamping
dwayne: for out-of-bounds, the two options are clamping or returning 0/NaN
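[Illustration of the clamping option just mentioned: a decomposition sketch that rules out out-of-bounds reads by clamping indices before gather; this is one candidate behavior, not spec'd semantics, and `inputShape`/`axis` are assumed context.]

  // Clamp indices into [0, size-1] along the gathered axis.
  const lastIndex = inputShape[axis] - 1;
  const safeIndices = builder.clamp(indices, { minValue: 0, maxValue: lastIndex });
  const output = builder.gather(input, safeIndices, { axis });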
McCool: we should clarify what platforms mean by undefined behavior for scatter (is it undefined order?)
... i.e. non-deterministic atomic order
ningxin: for non-deterministic hardware, should we simply ensure safety and not try to define anything beyond that?
reillyg: we should see if that introduces a fingerprinting surface
mccool: which would be really expensive to fix performantly

Subtopic: Core operator set

#573
-> Issue 573: Core operator set https://github.com/webmachinelearning/webnn/issues/573

reillyg: it seems like there is a minimum core operator set we can define (with data types), with some additional research
... there is an underlying question about high-level vs. low-level operators (the former being decomposable into the latter)
... but that can be dealt with as we compose that core operator set following our discussions on opSupportLimits
... what we consider core will evolve based on what we see as needed by modern models
... we should have a good definition of what it takes to add an operator to the spec, including to the minimum supported set
-> Adding new ops https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation
dwayne: a primitive is an op that cannot be further decomposed
reillyg: but do we want to add all those that cannot be decomposed?
... the criteria for inclusion should be "can be implemented on at least two platforms across several data types"
dwayne: I want us to be proactive in adding cross-platform available operators
reillyg: we should also look at TOSA and LINALG
ningxin: these two are good targets since they're used by hardware vendors as compilation targets
NeilT: we should also look at MLIR
asully: the other side of the question is what to do with the non-core operators
(from the queue: are we going to subclass MLGraphBuilder to support multiple sets?)
jsbell: how do we reflect that in practice in the spec? do we categorize operators?
reillyg: @@@
dom: re beyond-core ops, I think this will be driven by the implementation pressure to limit operators to those that actually provide a performance boost on enough platforms/frameworks
reillyg: getting implementation feedback across platforms and engines will provide useful push-and-pull
... if an op can be implemented by two different engines, it can be in the spec; if it can be implemented everywhere, it can be in the core set
... it's likely that any op can be implemented everywhere, but not for all data types
anssik: implementability across backends may be more critical than across engines
reillyg: re high-level vs decomposed low-level operators, this will likely be based on collecting performance data in practice
ningxin: the high-level ops come from optimized support in some of the native platforms
... but this could be replaced by the backend compiler detecting and applying the optimized path
dom: for some operations, there may also be an engine-specific stance on implementability for other reasons (e.g. fingerprinting)

Subtopic: MLTensor

PR #754
-> Pull Request 754 Add MLTensor explainer https://github.com/webmachinelearning/webnn/pull/754
asully: open to merging this explainer soon, although the recent discussion on MLDeviceType will impact this
... since it affects the buffer allocation
... the big open questions are about WebGPU interop
... how to share a buffer between WebNN and WebGPU (e.g. for video feed processing)
... how efficient can this be made? on systems where this can be expressed as GPU commands, this is simple
... when it's not possible, we might have to use the CPU for synchronization (with a performance hit)
... the goal should be that the UA has enough information to appropriately allocate the buffer
dom: I don't think we should block on the device type discussion before merging the explainer
RafaelCintron: GPUExternalBuffer as a type is still an open question (in comparison to GPUBuffer)
asully: it would have more restrictions than a simple GPUBuffer
... I'd be happy to simply use GPUBuffer
McCool: rather than talking about device types, we should talk about which graphs they're communicating with
RafaelCintron: so long as we don't regress scenarios where connecting several graphs ends up creating copies
... being able to ensure graphs use the same resource domains
reillyg: with the device type discussion we had earlier, my proposal is that MLContext becomes that resource domain
... if you create an MLContext and create a bunch of graphs, they should share that resource domain, and the buffers should be accessible to these graphs
... are there really cases where you would have buffers before having your models ready to execute?
RafaelCintron: there is at least the constant buffer case as an exception; but for input/output you're probably right
reillyg: yeah, I would keep the MLConstant operand upload as a separate topic
ningxin: there are complex scenarios where a model output can be used by two models that are running on different devices
reillyg: an MLTensor can only be used by graphs created by the same context today
... we can move them together from one unit to another, but they're not expected to be split
... an MLContext represents resources that can be cheaply shared with each other
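To make the shared-resource-domain point concrete, a rough sketch following the MLTensor explainer (PR #754); the method names and descriptor fields track the explainer as discussed here and may well change:

    // All objects below come from the same MLContext, so they share one
    // resource domain. Assumes a `graph` already built from this context.
    const input = await context.createTensor({ dataType: "float32", shape: [1, 4] });
    const output = await context.createTensor({ dataType: "float32", shape: [1, 4] });
    context.writeTensor(input, new Float32Array([1, 2, 3, 4]));
    context.dispatch(graph, { x: input }, { y: output }); // names match the graph's I/O
    const result = await context.readTensor(output);      // copy back to script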
#760
-> Issue 760 Support building graphs from `MLTensor` containing constants https://github.com/webmachinelearning/webnn/issues/760
Anssi: so let's proceed with merging the explainer and iterate from there
reillyg: I'd like to understand the relationship between this and the work that Austin has been doing in the Chromium prototype to eagerly upload constants to the GPU process
... if the goal is to allow streaming constants into the graph builder before you call build()
... then that's an implementation detail that can be added to the existing constant() function
... what that doesn't cover is the case of a constant being reused by multiple graphs
... is that your specific use case?
BryanB: correct
reillyg: if you're compiling the graph with the constant values, that compilation may optimize the constant values and change them - can they still be shared then?
... each graph might optimize operators differently and use different constant values
BryanB: the unoptimized copy might be re-used for another build
RafaelCintron: the reason we want to do this specifically is because we found scenarios where buffers need to be shared across graphs
... this came up in a discussion about an MLGraphBuilder that would be used to create two graphs
... (which is the alternative approach to defining a new MLTensor type)
McCool: constant operands would also be useful to cache a model across different contexts
... if I have a big model that I want to run in a WebGPU execution *and* in a WebNN execution
... I wouldn't want to have to re-download the model
reillyg: there are two parts: one is sharing constant data between a graph you built with WebNN on one site and with WebGPU on another site (beyond SOP limitations); the other is that we already have an explicit assumption that graph building can create an optimized copy, destroying the original
McCool: how important is it to get interop between WebNN and WebGPU implementations?
asully: different platforms have different handling of constants (e.g. they're handled as a file on CoreML)
... MLTensor would be used for input/output; sharing that data between WebGPU and WebNN is out of scope
reillyg: right - they would have to handle it as an input, which may come with a performance cost
ningxin: this emerged when we restricted MLGraphBuilder to use a single graph
... the reason for this was to ensure the timely release of resources
... doesn't destroy() help with resource lifecycle?
reillyg: the ability to constrain how resources get consumed by frameworks is another optimization this allowed
reillyg: DirectML will copy any constant you give it
... if we started eagerly copying constants to the GPU process to have them in the processing pipeline ASAP, would it help DirectML?
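A hypothetical sketch of the constant-reuse use case tracked in issue #760; createConstantTensor() and the constant(tensor) overload are invented here purely to illustrate sharing weights across two builds without a second upload:

    // Invented API, illustration only (see issue #760): upload the weights
    // once, then reference them from two different graph builders.
    const weights = await context.createConstantTensor(
        { dataType: "float32", shape: [1024, 1024] }, weightData);
    const a = builderA.matmul(x, builderA.constant(weights));
    const b = builderB.matmul(y, builderB.constant(weights));
    // Both builds can point at the same uploaded copy instead of copying
    // the weight data into each graph separately.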
RafaelCintron: if you make this "owned by DirectML"…
reillyg: in the CoreML backend, we would stream constants into the weights file
... we could reuse the same file for multiple graphs
asully: so I think we can confirm there is a use case to allow re-use of constants across graphs
reillyg: e.g. an optional parameter to the creation of constants

Subtopic: MLConstantOperand

issue #668 and PR #747
-> Issue 668 Do we need an `MLConstantOperand`? https://github.com/webmachinelearning/webnn/issues/668
-> Pull Request 747 Introduce MLConstantOperand https://github.com/webmachinelearning/webnn/pull/747
reillyg: a couple of pieces to this: whether we want to encode the constant parameter requirement in the API (a note on the bias parameter for convolution saying it must be a constant, either through a dedicated type or with a property on the parameter)
... we could specify that implementations do constant folding - if you take a constant and pass it to the add node with another constant, the implementation would take care of making the result a constant for the backend framework
... having an encouragement to provide a constant for these parameters and an ability for the implementation to compute the constantness of a parameter would provide both interop and performance benefits
dwayner: what does it mean to be a constant to CoreML? is it limited to CPU or also on GPU?
reillyg: it needs to be present at the time the graph is created
dwayner: could it be created as an MLTensor?
asully: the content of the constant needs to be known at compile time
dwayner: I see the value of being able to query a constantness property, whether it's required or for perf improvement
... I'm not sure it needs to be exposed in the API
... I'm not entirely confident that constant folding would solve all the cases I saw in my research
... the emulation in most cases would be adding one op
... I would like to minimize the number of cases where we require constantness
reillyg: this ties in with the question of sharing constants
... my intuition would be to assume constantness until we find a reason not to
... it's easier to change a requirement from const to non-const than the reverse
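A toy sketch of the constant-folding idea reillyg describes above, using invented types rather than WebNN ones: when both inputs of an op are known at build time, the implementation can evaluate it then, so the result stays a constant for backends that require one:

    // Toy model of the folding rule discussed above; not WebNN types.
    type Operand =
      | { kind: "constant"; data: number[] }
      | { kind: "node"; op: string; inputs: Operand[] };

    function add(a: Operand, b: Operand): Operand {
      if (a.kind === "constant" && b.kind === "constant") {
        // Both inputs known at build time: fold, so the backend still sees
        // a constant (e.g. for parameters it requires to be constant).
        return { kind: "constant", data: a.data.map((v, i) => v + b.data[i]) };
      }
      return { kind: "node", op: "add", inputs: [a, b] };
    }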
dwayner: but again, except for lstm and gru, it's only one op to emulate
reillyg: if it's only limited to a small number of CoreML ops, I agree we could just decompose
asully: all the CoreML ops that require constantness are high-level ops that need decomposition in the spec in any case

Topic: Implementation plans and trials

anssi: we're iterating at CR stage, which is a call for implementations; we have implementations across 3 backends, one of which implements the full API, with the other backends a bit behind
... multiple OSes, with the API behind a flag in Chrome and Edge
Mike: we would be interested in data on which models run on the NPU across a wide range of devices, vs falling back on CPU/GPU
reillyg: is there a way to detect on CoreML whether something ran on the NPU?
Mike: I think so but will double check
anssik: figuring out the right metrics for the origin trial is important
rafael: perf comparison, compilation time, top 40 operators and spread of usage, context losses, memory usage
... we need to have an API that can remain stable for a few months to collect useful data
... the WebGPU OT lasted several months, with multiple breaking changes, which wasn't ideal
reillyg: the expectation is that we're not going to ship at the conclusion of the OT; it's a data-gathering experiment, and the API would be turned off at the end of the period
... we expect that most developers will be using frameworks, which will go back to using CPU or GPU delegates after the period
RobKochman: we have to think through what the developers would actually do and what success would look like for them
McCool: do we want to collect data on which models?
rafael: we wouldn't know it from telemetry, but through surveys
jsbell: we want developers to do A/B testing across the different execution providers, since I'm not sure we could tease that out on our end
mccool: maybe the frameworks could help with the A/B testing

Topic: Advancement on the W3C Rec Track

Subtopic: Wide review status

-> CLOSED Issue 933 Updated review of WebNN API (resolution: satisfied with concerns) https://github.com/w3ctag/design-reviews/issues/933
Anssik: I suggest we integrate the responses Rafael and reillyg gave as non-normative text in the spec
... the TAG is also asking about future-proofing the API against hardware evolution
reillyg: clearly we have thought about it; we still need implementation feedback to determine whether our solution works
anssik: the TAG is also asking about multi-platform/multi-engine implementations; we have multiple backend implementations, all major browser vendors in the group, 3-ish OS support
rafael: +1

Subtopic: W3C “living standards” expectations

dom: there's no "living standard" stamp at W3C; you can remain at CR stage as long as the WG is operating
... either you stay at CR, publish a CRS every two years, and go through wide review
... iterating on a 2-year cycle
... or take the more traditional path of going to Recommendation status, which requires demonstrating interop experience across all features in 2 or more implementations when going from CR to Recommendation
... rarely is everything perfect, but you need to demonstrate the standard is interoperable at this stage
... my personal perspective is that going that final step, as painful as it is, ensures convergence and reflects what end users need from the technology
... WebNN now has a significant number of ops; the risk if we stay iterating at CR is that we never take a knife to carve out those ops that don't get enough implementation experience
... two engines willing to ship across OSes and backends
... without going to REC, we could iterate on CR
anssik: WebRTC Recommendation experience?
dom: adding post-REC corrections is cheap from a process perspective
... I'm driving this in the WebRTC WG, where we made sufficient progress that I can say it works well for that group and could work here too
jsbell: thanks dom, this is what I wanted to learn
... we don't need to make this decision now

Topic: Incubations

Subtopic: Custom ops

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0007/Tensor_Primitive_Ops_Proposal_-_TPAC.pdf
[slide 1]
[slide 2]
[slide 3]
[slide 4]
[slide 5]
[slide 6]
[slide 7]
[slide 8]
[slide 9]
#559
-> Issue 559 Control flow operations: if, while https://github.com/webmachinelearning/webnn/issues/559
reillyg (wanting to ask whether any current backends support building custom ops using a subgraph): providing more structure to the underlying compiler is likely to produce good results
... however, the current backends we've been prototyping with don't currently provide this support
ningxin: I think CoreML does
dwayne: this maybe could be made to work with DirectML
reillyg: but this would require pattern matching
dwayne: but a very simple pattern matching
reillyg: but it risks creating performance cliffs; a smarter compiler would make me feel more confident
asully: from a Web platform perspective, the question is whether we need to be able to provide hints, e.g. reusable subgraphs, that can be passed to the lower level
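For concreteness, a hypothetical sketch of the "reusable subgraph" hint asully mentions; composite() is not a real WebNN API, and the shape below is loosely inspired by StableHLO's composite op, linked just after this exchange:

    // Invented composite() hint, illustration only: the callback defines the
    // decomposition, and the label lets a backend pattern-match the whole
    // cluster to a fused kernel, or else run the decomposition as-is.
    const square = (b: MLGraphBuilder, x: MLOperand) => b.mul(x, x);
    const y = builder.composite("example.square", square, input);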
asully: where that is implemented should remain an implementation detail
jsbell: very cool proposal
-> https://openxla.org/stablehlo/spec#composite
jsbell: the StableHLO approach is very similar and may be a way to annotate the subgraphs
asully: there are aspects of the StableHLO compositor we wouldn't want to expose to the Web
... we wouldn't want to have magic values everywhere - we would have to consider the decomposition, not doing string matching
gdti: cool proposal, two points; one minor: the PoC is comparing the WASM built-in function to standard WASM, so the perf isn't representative of the fallback path, given the cost of going from WebNN to WASM
... from a Web platform perspective, exposing this fully to the Web might be too challenging to maintain
asully: with this custom op proposal, would this open the way to removing the specification of many of the WebNN high-level operators which can be expressed in lower-level WebNN ops?
... I'm supportive of making the API more focused on lower-level APIs à la MLIR

Subtopic: Built-in APIs for translation and prompting

Slideset: https://lists.w3.org/Archives/Public/www-archive/2024Sep/att-0008/TPAC_2024_Built-in_AI_APIs.pdf
[slide 1]
[slide 3]
[slide 4]
[slide 5]
[slide 6]
Minor correction: the 13 partners [...] numbers are for what preceded the early preview program (a tighter early exploration with a few partners); the early preview program has orders of magnitude more participants
[slide 7]
[slide 9]
[slide 10]
[slide 11]
[slide 12]
[slide 13]
[slide 14]
[slide 15]
[slide 16]
[slide 17]
[slide 18]
[slide 19]
[slide 20]
[slide 21]
[slide 22]
[slide 23]
-> Translation API https://github.com/WICG/translation-api
-> Prompt API https://github.com/explainers-by-googlers/prompt-api/
[slide 24]
[slide 25]
[slide 26]
[slide 27]
[slide 28]
[slide 29]
-> https://docs.google.com/presentation/d/1QeVJ6gsE8_xy2Yui1KcTB75dsiKTI0dDFD7mgLITnmE/edit?usp=sharing
ningxin: can a developer expect to get the same level of performance from these task APIs and WebNN?
domenic: the task API would probably hit the hardware more directly, so it might have a bit of a perf advantage, but I assume the goal for WebNN is to make this imperceptible

Subtopic: Model management

Slideset: storage_slides
[slide 2]
[slide ?] (Alternative 5)
also, if it's rare enough, the likelihood of benefiting from sharing it across origins should be low (for most users)
Meeting: Web Machine Learning WG F2F – 23 September 2024 (Part 2)
McCool: I will be presenting a breakout session on this on Wednesday.

Topic: Wrap up

anssik: thank you for your active participation and great discussions
... interested folks are welcome to join us for dinner at the Anaheim Packing District, 2.5 miles from the meeting venue
-> https://www.anaheimpackingdistrict.com/merchants