23:55:30 RRSAgent has joined #webmachinelearning
23:55:34 logging to https://www.w3.org/2025/11/09-webmachinelearning-irc
23:55:34 RRSAgent, make logs Public
23:55:35 please title this meeting ("meeting: ..."), anssik
23:55:47 RafaelCintron has joined #webmachinelearning
23:57:40 awafaa has joined #webmachinelearning
23:57:51 Present+
23:57:53 Scribe+
23:58:16 DenisD has joined #webmachinelearning
23:58:22 Present+ Andrew_Wafaa
23:58:50 Remote + Denis_DIDIER
23:58:55 Meeting: Web Machine Learning WG F2F – 10 November 2025
23:59:00 Chair: Anssi
23:59:06 Agenda: https://github.com/webmachinelearning/meetings/issues/35
23:59:10 Scribe: Anssi
23:59:14 ningxin has joined #webmachinelearning
23:59:16 scribeNick: anssik
23:59:21 scribe+ dom
23:59:25 gb, this is webmachinelearning/webnn
23:59:26 anssik, OK.
23:59:30 Present+ Anssi_Kostiainen
23:59:40 Present+ Dominique_Hazael-Massieux
00:00:51 Present+ Tarek_Ziade
00:00:59 Present+ Ningxin_Hu
00:01:13 Present+ Sushanth_Rajasankar
00:01:18 Present+ Erik_Anderson
00:01:33 Present+ Thomas_Steiner
00:01:57 Present+ Markus_Tavenrath
00:02:33 Present+ Mark_Foltz
00:02:52 big-screen has joined #webmachinelearning
00:04:23 hyojin has joined #webmachinelearning
00:04:55 acomminos has joined #webmachinelearning
00:04:59 Mike_Wyrzykowski has joined #webmachinelearning
00:07:29 DwayneR has joined #webmachinelearning
00:07:29 reillyg has joined #webmachinelearning
00:08:39 markafoltz has joined #webmachinelearning
00:09:58 ErikAnderson has joined #webmachinelearning
00:09:58 Mark_Foltz has joined #webmachinelearning
00:11:08 sushanth has joined #webmachinelearning
00:11:08 Tarek has joined #webmachinelearning
00:11:08 kush has joined #webmachinelearning
00:11:28 Topic: Welcome
00:12:11 mtavenrath has joined #webmachinelearning
00:12:11 Kenji_Baheux3 has joined #webmachinelearning
00:12:11 alispivak has joined #webmachinelearning
00:12:17 Anssi: welcome to the W3C Web Machine Learning WG F2F at TPAC 2025, this is our second physical F2F
00:12:25 ... I'm Anssi Kostiainen, Intel, the chair of the WG
00:13:08 ... with me is Dom, Dominique Hazael-Massieux, W3C Staff, helping run the meeting smoothly
00:13:26 ... again, great to see so many folks here in person and new people outside the usual WG participants, including participants and guests who represent Japanese W3C members and organizations
00:13:32 kbx has joined #webmachinelearning
00:14:12 ... Arigato gozaimasu!
00:14:18 ... this WG has continued to grow rapidly since last year, we have all major browser vendors on board and new folks are joining
00:14:26 ... the YoY growth is around +30% in both organizations and participants, for both this WG and its sister CG
00:14:32 ... a few new members who joined the WG since last F2F:
00:14:37 ... Hugging Face
00:14:40 RobKochman has joined #webmachinelearning
00:14:44 Present+ Rob_Kochman
00:14:56 ... Qualcomm
00:15:01 ... NVIDIA
00:15:07 ... ARM
00:15:14 ... Shopify
00:15:45 jets has joined #webmachinelearning
00:15:46 ... we are working at the intersection of Web & AI/ML technologies during this time of exponential growth in AI, and we're lucky to have such a diverse group of experts on board:
00:15:54 dezell2 has joined #webmachinelearning
00:15:54 BenGreenstein has joined #webmachinelearning
00:15:59 Present+ Reilly_Grant
00:16:12 ... all browser vendors, OS vendors, major semiconductor companies invested in AI, major platform providers, ISVs, distinguished researchers from academia, individuals, and more
00:16:55 ... if you registered as a WG participant, please join us at the table
00:17:16 ... observers are welcome to join the table too, subject to available space
00:17:20 Anssi: we use Zoom for a hybrid meeting experience, please join using the link in the meeting invite
00:17:34 Anssi: we use IRC for official meeting minutes and for managing the speaker queue
00:17:40 ... please join the #webmachinelearning IRC channel, link in the meeting invite and agenda:
00:17:46 -> https://irc.w3.org/?channels=#webmachinelearning
00:17:52 -> https://github.com/webmachinelearning/meetings/issues/35
00:17:53 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
00:18:20 Anssi: to put yourself on the queue, type "q+" in IRC
00:18:27 ... during the introductions round, we'll try to record everyone's participation on IRC with:
00:18:31 ... Present+ Firstname_Lastname
00:19:04 Ugur-Depixen has joined #webmachinelearning
00:19:04 MasaoG has joined #webmachinelearning
00:19:11 thelounge5 has joined #webmachinelearning
00:19:13 Present+ RafaelCintron
00:19:29 Ugur_Acar_Depixen has joined #webmachinelearning
00:19:47 ... please check that your participation is recorded on IRC so we're able to acknowledge your presence in the meeting minutes
00:19:49 Subtopic: Intros
00:19:58 Anssi: since we're many again, we'll do a quick round of introductions, 15 seconds each: full name, affiliation and key interest
00:20:39 phillis has joined #webmachinelearning
00:20:55 Dwayne: WebNN spec editor, with a focus on new operators, at Microsoft
00:21:22 hagio_nhk has joined #webmachinelearning
00:21:26 Rafael: also at Microsoft, on the Edge browser, working on all things AI and graphics rendering
00:21:48 MikeW: at Apple, involved in WebNN and also WebGPU
00:22:11 Denis: involved in the Sustainable Web Guidelines
00:22:56 Phillis: at Google, working with Reilly on the WebNN implementation in Chromium
00:23:06 Introduction: Denis DIDIER, from France - Company ITHENKA, contributor to the W3C Sustainable Web Guidelines, and to Sustainable AI with the French non-profit Institute for Sustainable IT.
00:23:22 handellm has joined #webmachinelearning
00:23:39 Reilly: implementing WebNN in Chromium
00:23:50 Sushanth: at Microsoft, built-in AI
00:23:58 ErikA: manager on the Edge browser team
00:24:21 Ugur: working on AI solutions for the construction industry as chief AI officer
00:24:50 AndrewW: at ARM, involved in our open source and standards strategy team
00:25:00 Tarek: from Mozilla, on the Firefox AI team
00:25:22 Markus: from NVIDIA, devtech supporting ISVs integrating ML in their apps, getting involved in standardization to make their lives easier
00:25:54 Ningxin: co-editor of the WebNN spec, at Intel
00:26:03 Dom: staff contact for the group and looking at the impact of AI on the Web
00:26:12 @@@: at Google, looking at the integration of WebNN in Chromium
00:26:33 MarkF: at Google, working in Chrome on AI & agentic features, involved in WebMCP
00:26:45 Kenji: Chrome, built-in AI
00:26:58 Ben: Chrome, similar to Mark on agentic AI
00:27:22 RobK: Chrome team, WebMCP
00:27:27 Thomas: devrel at Chrome
00:27:49 Ali: program manager at Google, supervising ML/GPU work
00:28:19 DavidEzell: Conexxus - excited by this group; we're a standards body hoping to turbocharge retail vendors with our standards
00:28:27 Brian: @@@
00:28:39 YutaHagio: working for NHK, the Japanese broadcaster
00:28:48 ChiaraCerretti: @@@
00:29:03 GuidoU: Google, WebRTC APIs in Chrome, exploring applications of AI
00:29:13 MarkusH: Google Meet, interested in AI/WebRTC
00:29:26 Diogo: Brazilian W3C Office
00:29:34 SamGoto: Google Chrome, platform APIs
00:29:40 @@@: Meta browser
00:30:07 Masao: @@@
00:30:31 Present+ Kenji_Baheux
00:30:34 Present+ Brian_McManus
00:30:39 present+ Mike_Wyrzykowski
00:30:40 Present+ Markus_Handell
00:30:57 Present+ Ben_Greenstein
00:31:12 chiace has joined #webmachinelearning
00:31:21 alispivak has joined #webmachinelearning
00:31:23 Ugur_Acar has joined #webmachinelearning
00:31:28 Present+ Chiara_Cerretti
00:31:36 Present+ Ugur_Acar_Depixen
00:31:41 Present+ Ali_Spivak
00:31:51 MasaoG has left #webmachinelearning
00:31:52 Subtopic: Agenda bashing
00:31:59 Anssi: the F2F agenda was built collaboratively with you, the WG participants, and is published on GitHub:
00:32:04 -> https://github.com/webmachinelearning/meetings/issues/35
00:32:05 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
00:32:36 Anssi: any last-minute proposals or updates?
00:32:44 Masao_Goho has joined #webmachinelearning
00:33:00 Present+ Hyojin_Song
00:35:34 Meet after the meeting for meat or no meat.
00:36:52 Subtopic: Charter orientation
00:36:59 Anssi: we have two groups, the Web Machine Learning Working Group (WG) and Community Group (CG)
00:37:29 ... the WG standardizes Web APIs for on-device inference, using CG incubations as its seeds
00:37:35 ... deliverables: WebNN API, Ethical Principles
00:37:41 -> WebML WG Charter https://www.w3.org/2025/03/web-machine-learning-charter.html
00:38:12 Anssi: we're looking to make the Ethical Principles a joint deliverable with the proposed Web & AI Interest Group
00:38:18 ... this informative document is referenced from the WebNN API spec
00:38:24 ... the CG is a group where new ideas are discussed, explored and incubated before formal standardization
00:38:30 ... past CG spec incubations include e.g. WebNN, Model Loader
00:38:36 ... since last year, we've expanded the scope of the CG to built-in AI APIs and agentic web capabilities
00:38:41 -> WebML CG Charter https://webmachinelearning.github.io/charter/
00:38:48 -> WebML CG Incubations https://webmachinelearning.github.io/incubations/
00:38:55 Anssi: current CG deliverables:
00:39:12 ... Prompt API
00:39:12 ... Writing Assistance APIs
00:39:12 ... Translator and Language Detector APIs
00:39:12 ... Proofreader API
00:39:12 ... WebMCP API
00:39:17 ... the CG technical scope is higher-level task-based APIs and the agentic web feature WebMCP
00:39:22 ... while the WG technical scope is the lower-level WebNN API, a graph builder abstraction
00:39:28 Anssi: the WG and CG work closely together and coordinate with other W3C groups, for example:
00:39:45 ... - WebGPU WG/CG for WebNN-WebGPU interop
00:39:52 ... - Wasm CG
00:40:06 ... - WebRTC for media processing-related integrations
00:40:13 ... - AI Agent Protocol Community Group for agentic protocols
00:40:20 ... - and with horizontals: privacy, security, a11y, plus emerging sustainability and ethics
00:41:19 Dom: we operate under the W3C CoC, the W3C antitrust guidance, and the W3C Patent Policy
00:41:58 q?
00:42:14 Topic: Spec orientation
00:42:26 Anssi: we have scheduled time before our first break to do a triage pass through open issues
00:42:31 ... the plan is to collaboratively look at our backlog of issues and PRs to:
00:42:52 -> https://github.com/webmachinelearning/webnn/issues
00:43:12 Anssi: - focus on breaking changes
00:43:26 ... - check priorities
00:43:33 ... - set next steps for the issues
00:43:46 ... let's use IRC to queue proposals, examples:
00:43:53 q+ to propose next steps for #573
00:43:53 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
00:44:23 q+ to discuss priority of #883
00:44:23 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
00:44:36 q+ to bump the priority of #861
00:44:37 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution] [Agenda+]
00:44:40 kzms2 has joined #webmachinelearning
00:44:44 Anssi: this is your last-minute opportunity to influence today's agenda
00:44:57 ... we'll first record triage results on IRC during the first ~15 mins, then review them as a group, and continue to discuss and refine on the hallway track with coffee/tea
00:45:02 guidou has joined #webmachinelearning
00:45:25 q+ to discuss the priority of #226 versus WebGPU interop work
00:45:26 https://github.com/webmachinelearning/webnn/issues/226 -> Issue 226 Integration with real-time video processing (by dontcallmedom) [use case]
00:46:07 Anna has joined #webmachinelearning
00:50:25 +q to discuss whether we have actionable next-steps on #6
00:50:25 https://github.com/webmachinelearning/webnn/issues/6 -> Issue 6 Custom operations (by dsmilkov) [v2] [device selection]
00:52:00 +q to discuss (likely in the context of the OT discussion) next steps on #763
00:52:01 https://github.com/webmachinelearning/webnn/issues/763 -> Issue 763 Request standards positions from Mozilla and WebKit (by reillyeon) [process]
00:53:05 +q to discuss #807 ahead of prototyping work that I believe will be starting soon
00:53:06 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
00:56:04 Ugur has joined #webmachinelearning
00:58:26 ningxin has joined #webmachinelearning
00:58:26 big-screen has joined #webmachinelearning
00:59:34 q?
00:59:36 q+ to discuss defining a process for the WG to accept or modify operators.
01:00:52 q?
01:01:00 ack anssik
01:01:00 anssik, you wanted to propose next steps for #573 and to discuss priority of #883 and to bump the priority of #861
01:01:01 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution] [Agenda+]
01:01:01 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
01:01:01 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
01:01:21 ack reillyg
01:01:21 reillyg, you wanted to discuss the priority of #226 versus WebGPU interop work, to discuss whether we have actionable next-steps on #6, to discuss (likely in the context of the OT discussion) next steps on #763, to discuss #807 ahead of prototyping work that I believe will be starting soon, and to discuss defining a process for the WG to accept or modify operators.
01:01:22 https://github.com/webmachinelearning/webnn/issues/226 -> Issue 226 Integration with real-time video processing (by dontcallmedom) [use case]
01:01:22 https://github.com/webmachinelearning/webnn/issues/6 -> Issue 6 Custom operations (by dsmilkov) [v2] [device selection]
01:01:48 hagio_nhk has joined #webmachinelearning
01:02:12 reillyg: identified a couple of issues worth discussing, including introducing future work
01:02:20 ... issue #6 on custom operator support
01:02:52 ... we're getting close to an origin trial in Chromium, so we should request standards positions from WebKit and Mozilla
01:03:39 q?
01:04:01 ... without discussing operator support in deep detail in this meeting, it might be useful to discuss a process for adopting new operators or modifying existing operators
01:05:26 This is what we have for the op change process: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation
01:06:57 Slideset: https://lists.w3.org/Archives/Public/www-archive/2025Nov/att-0000/WebNN_SLM_Optimization_-_TPAC.pdf
01:07:22 Topic: New features
01:07:32 Subtopic: WebNN Small Language Model (SLM) Performance Optimization Case Study
01:07:42 thelounge has joined #webmachinelearning
01:07:45 Anssi: I've asked Ningxin to present a WebNN Small Language Model (SLM) Performance Optimization Case Study conducted by our engineering team
01:07:46 alispivak has joined #webmachinelearning
01:07:49 ... thank you Yuheng, Wei, Wanming, Jonathan, Ningxin for producing this case study to inform the WebNN new features discussion, with considerations for topics such as:
01:07:52 ... - WebNN SLM support and challenges, operator fusion, on-device KV cache, tensor binding, dynamic shapes
01:07:56 ... after this case study, we will proceed with discussion
01:08:15 chiace has joined #webmachinelearning
01:08:20 ningxin: looking at how to optimize performance for a small language model, based on the Qwen small model
01:08:35 ... case study conducted by a team at Intel
01:08:56 [slide 3]
01:09:23 Ningxin: we reused the model from the native ORT-GenAI project
01:10:30 ... in this experiment, we focused on the WebGPU EP
01:11:35 ... and ran the same model in the WebNN-based stack using the WebGPU EP as well
01:11:41 [slide 4]
01:12:32 [slide 5]
01:13:54 chikamune has joined #webmachinelearning
01:14:13 [slide 6]
01:14:48 Ningxin: looking at the contribution to inference time of key operators, starting with the native stack
01:15:20 [slide 7]
01:15:43 Ningxin: the ONNX macro ops get decomposed into WebNN operators
01:16:11 ... GroupQueryAttention requires 24 WebNN operators
01:16:23 [slide 8]
01:17:27 [slide 9]
01:17:45 [slide 10]
01:18:10 ErikAnderson has joined #webmachinelearning
01:19:00 [slide 11]
01:19:20 [slide 12]
01:20:45 [slide 13]
01:22:03 [slide 14]
01:22:46 [slide 15]
01:23:36 [slide 16]
01:25:07 [slide 17]
01:26:00 [slide 18]
01:26:50 [slide 19]
01:27:07 [slide 20]
01:30:01 [slide 21]
01:30:39 [slide 22]
01:32:39 Anssi: thank you Ningxin & team for this insightful case study
01:33:06 ... Access to optimized macro ops is key for SLM performance
01:33:06 ... MatMulNBits, GQA etc.
01:33:06 ... Support in spec or fusion in implementation?
01:33:06 ... Support dynamic input shapes?
01:33:06 ... Allow same tensor for input and output?
01:33:06 q+
01:33:08 ... Decouple tensor binding and graph dispatch?
01:33:31 markafoltz has joined #webmachinelearning
01:33:31 q-
01:33:45 q+ to discuss next steps based on SLM presentation.
01:58:35 RobKochman has joined #webmachinelearning
02:00:01 markafoltz has joined #webmachinelearning
02:01:08 kbx has joined #webmachinelearning
02:01:19 Present+ Kenji_Baheux
02:04:18 dezell has joined #webmachinelearning
02:06:57 q?
02:07:36 reillyg: the biggest question on all these API change proposals is doing the research on what's possible with the current WebNN implementation
02:08:06 ack reillyg
02:08:06 reillyg, you wanted to discuss next steps based on SLM presentation.
02:08:14 ... we investigated dynamic shapes and tensor binding (backends have support for this)
02:08:16 mtavenrath has joined #webmachinelearning
02:08:30 ... this would also be useful for real-time applications
02:08:51 ... same tensor in input/output: the main question is whether the graph supports it
02:09:10 ... maybe a binding step could be used for validation
02:09:44 Dwayne: impressive speed-up identified in the case study
02:09:49 Rafael: +1 on the next steps identified
02:10:38 Ningxin: we could look at another backend, like LiteRT
02:10:55 ... and compare it to using ONNX
02:11:10 ... both would be based on the same PyTorch model
02:12:15 ... MatMulNBits allows setting an accuracy level which the underlying implementation can use to accelerate the inference
02:12:26 ... not clear how to encode this when doing fusion
02:12:44 q+
02:12:47 ack anssik
02:12:51 q?
02:13:09 Subtopic: Core operator set
02:13:16 Anssi: #573
02:13:17 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
02:13:49 chiace has joined #webmachinelearning
02:13:53 Anssi: in the case study discussion we learned many of the SLM building blocks are key for performance:
02:14:10 handellm has joined #webmachinelearning
02:14:43 ... - MatMulNBits
02:14:47 ... - GroupQueryAttention (GQA)
02:14:51 ... - SkipSimplifiedLayerNorm / SimplifiedLayerNorm
02:14:56 ... - RotaryEmbedding
02:15:00 ... the question to the group is: should we support them in the spec or with fusion in the implementation?
02:15:04 -> WebML WG Teleconference – 9 October 2025 https://www.w3.org/2025/10/09-webmachinelearning-minutes.html#e3a7
02:15:07 Anssi: the NVIDIA team reported they're collecting all the ops that'd benefit from being in the set, one class is various attentions, also gathers, MoE, TopK, and they're looking for other ops that'd benefit from not being composed
02:15:14 Markus: operator proliferation - some operators can't be decomposed into existing operators
02:15:42 ... part of what we need to consider is whether operators we expose in the spec are going to remain useful for long enough
02:16:18 There's a 3rd possibility too (not just built-in operator vs recognized fusion) - support subgraph composition, so that these complex operators (they are really entire graphs) are not permanently baked into the API, but can still be recognized easily and passed through to the backends.
02:16:32 q?
02:16:44 ... can we enhance WebNN to be multi-layered, so that decomposition can happen at the discretion of the browser or the backend, carrying optimizations down to the hardware as necessary
02:17:02 ... which operators are complex enough that they can't be decomposed into existing operators?
02:17:08 ... how to expose compound operators in WebNN?
02:17:14 ningxin has joined #webmachinelearning
02:17:42 Dwayne: TopK feels like a primitive operator that should be added to WebNN
02:17:56 ... MatMulNBits on the other hand is a subgraph
02:18:00 kbx has left #webmachinelearning
02:18:18 Anssi: Dwayne has done some extensive research on this topic and re-raised the concept of aggregate operators via subgraphs
02:18:23 -> https://github.com/webmachinelearning/webnn/issues/573#issuecomment-3386373261
02:18:24 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
02:18:25 ... another possibility is to support the concept of subgraphs that can be referred to as an operator later, as Ningxin described at TPAC last year
02:18:26 Pavan3 has joined #webmachinelearning
02:18:32 ... we don't have a specific issue for that atm
02:18:56 ... I'll open one
02:20:18 Anssi: Markus, could you capture your feedback on GitHub? maybe on a meta issue around optimization
02:20:38 Markus: we fully agree with what has been described so far
02:20:43 q?
02:21:20 Markus: if we had sub-graphs, it would also optimize the number of nodes and speed up computation, as Ningxin was describing
02:21:28 q?
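To make the decomposition cost concrete, here is a rough sketch (not from the meeting, and simplified: real MatMulNBits uses 4-bit block-wise quantization and therefore many more nodes) of composing a MatMulNBits-style quantized matmul from primitives that are in the current WebNN spec, assuming the MLGraphBuilder dequantizeLinear and matmul ops. A fused operator, or a named subgraph as discussed above, would let the backend recognize this whole pattern at once instead of individual nodes:

```js
// Rough sketch (inside an async function): compose a MatMulNBits-like
// weight-quantized matmul from WebNN primitives.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// Activations: [batch, K] float32 input.
const x = builder.input('x', { dataType: 'float32', shape: [1, 768] });

// Quantized weights plus scale/zero-point constants (placeholder data).
const qWeights = builder.constant(
  { dataType: 'uint8', shape: [768, 768] }, new Uint8Array(768 * 768));
const scale = builder.constant(
  { dataType: 'float32', shape: [1, 1] }, new Float32Array([0.02]));
const zeroPoint = builder.constant(
  { dataType: 'uint8', shape: [1, 1] }, new Uint8Array([128]));

// Decomposed form: dequantize the weights, then a plain matmul.
const w = builder.dequantizeLinear(qWeights, scale, zeroPoint);
const y = builder.matmul(x, w);
const graph = await builder.build({ y });
```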
02:23:21 RESOLVED: open a new issue for aggregate operators via subgraphs
02:23:41 Subtopic: Support flexible input sizes
02:23:46 Anssi: issue #883
02:23:47 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
02:24:10 ... this issue was initiated because many models, e.g. vision models, require flexible input sizes that are determined at inference time
02:24:17 ... a few scenarios mentioned:
02:24:22 ... - input images with different resolutions for vision models
02:24:35 ... a concrete example would be MODNet, a model for real-time portrait matting
02:24:40 -> https://huggingface.co/Xenova/modnet
02:24:59 ... - transformers with arbitrary input lengths
02:25:05 ... when using a KV cache, need to increase the KV cache length by 1 for each inference
02:25:15 ... - speech recognition
02:25:29 ... for example the Whisper encoder with arbitrary input lengths
02:25:35 ... the decoder increases the KV cache length by 1 per inference
02:25:42 ... - LLMs
02:25:51 ... for example Qwen2.5-0.5B-Instruct also increases the KV cache length at inference time
02:26:08 ... "Lack of the support for flexible input sizes increases the complexity of using WebNN for those models."
02:26:31 ... - complexity reduction: currently you need to modify the model and fix the input size before compiling
02:26:52 ... - at inference time: currently you need to resize or pad the image input
02:26:58 ... - native support: flexible input sizes are already supported by native frameworks
02:27:08 Anssi: Dwayne responded with considerations:
02:27:13 -> https://github.com/webmachinelearning/webnn/issues/883#issuecomment-3232188158
02:27:14 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
02:28:42 Dwayne: [voicing his thoughts written up in https://github.com/webmachinelearning/webnn/issues/883#issuecomment-3232188158 ]
02:29:08 q?
02:29:10 ... a step between binding and building the graph would help
02:29:58 Anssi: per our latest discussion, we're now exploring the following as a group:
02:29:59 anssik: reillyg raised questions around how this would get implemented on existing backends and the performance implications
02:30:03 ... - how this will be implemented by backends
02:30:08 ... - what is the role of WebNN in this decision, the framework could build multiple graphs
02:30:08 ... - understand performance bottlenecks (of multiple graphs)
02:30:14 reillyg: all the backends we target support some form of dynamic shapes
02:30:52 ... two API parts: where are the dynamic shapes identified in the graph? (either arbitrary or among a well-defined set)
02:31:12 ... what API to switch to dynamic?
02:31:20 q?
02:31:42 ... then figure out how to translate that to the backends while considering the performance impact
02:31:58 Anssi: MarkusT shared there are three different types of dynamic shapes usually used in neural network models:
02:32:04 ... - Completely Unknown Sizes
02:32:07 ... - Symbolic Sizes
02:32:11 ... - Tensor-Derived Sizes
02:32:17 -> TensorRT: Working with Dynamic Shapes https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/work-dynamic-shapes.html
02:32:33 Markus: [voicing https://github.com/webmachinelearning/webnn/issues/883#issuecomment-3386553391 ]
02:32:33 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
02:32:50 q?
02:33:58 ... Symbolic Sizes feels like the most interesting option for WebNN
02:34:39 ... it might be useful to have a way to define the ranges of sizes to optimize backend preparation
02:34:56 ... does anyone feel symbolic sizes wouldn't work for them?
02:35:50 Ningxin: if we define symbolic sizes, does that include calculations on these symbols?
02:36:14 Markus: that would need more discussion; the more maths we allow, the more complexity
02:36:16 q+
02:36:38 Ningxin: so we should start with the simplest approach and iterate as we identify the needs
02:36:58 q+
02:37:25 q?
02:37:29 Markus: dispatch would be the phase where the dimensions get updated
02:37:39 ack reillyg
02:37:43 Ningxin: this would match Chromium's implementation
02:38:27 reillyg: I wonder if in most cases the need to express complex mathematical functions for shapes goes away
02:39:08 ... with the risk that for some intermediate nodes it would not be possible to express the computed bounds of an operator
02:39:33 ... there are two separate pieces: API shape and validation
02:39:43 ... the WebNN API only has developers provide shapes for inputs
02:40:11 ... the API then provides back to developers the shapes of intermediate nodes, computed based on the operators
02:40:29 ... with static shapes, this can be computed statically at graph building time
02:40:54 ... if we move to dynamic, we either have to say "we don't know" or express it with symbols
02:41:20 Markus: just saying "it's dynamic" feels reasonable
02:42:07 reillyg: we use the computed shape of a graph as part of the validation internally
02:42:31 ack DwayneR
02:42:33 ... I would have to check; with dynamic shapes, this might have to be done by the developers themselves
02:42:57 DwayneR: we should distinguish flexible model size and dynamic shape
02:43:28 q?
02:43:35 Markus: that matches "tensor-derived sizes" in my taxonomy
02:44:25 Ningxin: some prototyping & study would usefully inform this feature
02:44:49 reillyg: looking at the existing graph validation code and how much more complicated it becomes with dynamic shapes
02:45:11 RESOLVED: study more backends and do prototyping before more formally specifying a solution
02:45:29 kbx has joined #webmachinelearning
02:45:53 Ningxin: once we have that prototype in the browser, we should look at whether it makes it easier to deploy existing language models without modification
02:46:29 q?
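As a strawman for the "Symbolic Sizes" direction, the following is purely illustrative: none of this syntax exists in the WebNN spec or has been agreed by the group. It sketches how an input could declare a named, bounded dynamic dimension so a backend can prepare for a range of sizes rather than one fixed shape:

```js
// Hypothetical sketch only: the current WebNN API requires fully static
// input shapes; this illustrates the "Symbolic Sizes" option discussed above.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// Strawman: a named symbolic dimension with a declared maximum, so the
// backend can pre-allocate and optimize for a bounded range of sizes.
const inputIds = builder.input('input_ids', {
  dataType: 'int32',
  shape: [1, { name: 'seq_len', max: 4096 }]  // hypothetical syntax
});
// ... remaining graph construction elided; the concrete size for each
// symbol (e.g. seq_len growing by one token per decode step with a KV
// cache) would then be supplied at dispatch time.
```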
02:46:42 Subtopic: API to Separate Graph Building from Weight Loading to Reduce Peak Memory Usage
02:46:54 Anssi: proposal in issue #901 by Markus
02:46:54 https://github.com/webmachinelearning/webnn/issues/901 -> Issue 901 Proposal: API to Separate Graph Building from Weight Loading to Reduce Peak Memory Usage (by mtavenrath)
02:47:02 ... a proposal to reduce peak memory usage
02:47:35 ... using Stable Diffusion as a reference, CPU memory used during graph building is more than 3x the actual model size using a dGPU, and also high on an iGPU
02:48:00 ... the proposal is to introduce an API that splits graph creation from weight passing, i.e. loading the constant data
02:48:11 ... to enable streaming weights directly into the graph during initialization
02:48:24 q+
02:48:29 ... do all current WebNN backends support weightless graph creation, where all tensor shapes and data types are known, but the actual weight data is not provided until a later step?
02:49:22 ... using a dGPU, limit the peak CPU memory overhead to 1x-2x the size of the largest single tensor
02:49:28 ... using an iGPU, no temporary CPU-side storage would be needed for the "upload" as it's shared memory, reducing the total peak CPU memory consumption down to roughly Model Size + Max Single Tensor Size
02:49:39 q?
02:50:28 Markus: my experience is that even desktops/notebooks still have limited memory, typically 16GB
02:50:36 ... even more so on mobile devices
02:51:52 q?
02:51:53 ack reillyg
02:51:56 ... right now, models in WebNN can't make use of all available memory due to the memory used for loading them
02:52:34 reillyg: model caching is related to this: how do we get the model weights the developer provides to the underlying framework as efficiently as possible?
02:52:50 ... the frameworks want to see the weights to repack them to match the memory layout
02:53:17 ... we are constrained by having to have the weights available during graph building
02:53:35 ... but clearly graph building isn't memory efficient for now
02:53:41 Could constant tensors solve this issue? https://www.w3.org/TR/webnn/#api-mlcontext-createconstanttensor
02:53:49 ... more an implementation issue, I believe
02:54:12 ... some of the changes we need for caching would help address this performance issue
02:54:37 q?
02:54:50 ... maybe the API-level piece would be for situations with a very large constant that we wouldn't want to load into memory at all
02:55:26 q+
02:55:39 ... which could be improved by streaming the constant
02:56:03 Markus: do we have the list of operators that would require the constant to be known at build time?
02:56:29 q+
02:56:55 reillyg: we can get that list of operators
02:57:19 ... it's a constraint we get from backends that we wish we didn't have
02:57:48 Markus: maybe this is something we can reach out to backend developers to change
02:57:55 sushraja has joined #webmachinelearning
02:58:05 q?
02:58:16 ack ningxin
02:58:35 ErikAnderson has joined #webmachinelearning
02:58:53 Ugur has joined #webmachinelearning
02:59:03 takaaki has joined #webmachinelearning
02:59:07 ningxin: would the constant tensor help with this? https://www.w3.org/TR/webnn/#api-mlcontext-createconstanttensor
03:00:08 ... the issue is that some underlying AI runtimes don't support this; with DirectML we can do this, but with ONNX Runtime, AFAIK there is no way, we have to put everything on CPU at session creation time
03:00:56 ... in terms of frameworks, ONNX Runtime Web needs all the weights to be on the CPU
03:01:20 ... we can have the API shape, but it needs adjustment both in underlying runtimes and in frameworks
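As context for this exchange, a rough sketch of the incremental-upload pattern, assuming the createConstantTensor() method linked above and an MLGraphBuilder.constant() overload that accepts the resulting tensor; the fetch URLs, descriptors and loadWeight() helper are placeholders, not part of any proposal:

```js
// Rough sketch: upload each weight as a constant MLTensor before graph
// building, so large CPU-side buffers can be released one at a time
// instead of keeping the whole model in memory at once.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

async function loadWeight(url, descriptor) {
  const data = await (await fetch(url)).arrayBuffer();
  // The browser can hand the data to the backend here; the ArrayBuffer
  // becomes collectable as soon as this function returns.
  const tensor = await context.createConstantTensor(descriptor, new Uint8Array(data));
  return builder.constant(tensor);
}

// Weights are fetched and uploaded sequentially rather than all at once.
const w0 = await loadWeight('weights/layer0.bin',
                            { dataType: 'float32', shape: [768, 768] });
const w1 = await loadWeight('weights/layer1.bin',
                            { dataType: 'float32', shape: [768, 768] });
// ... build the graph using w0, w1, ... then await builder.build(outputs).
```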
03:01:25 q?
03:01:28 ack RafaelCintron
03:01:30 Markus: right, this is an ecosystem effort in which WebNN is at the center
03:01:44 q+
03:02:01 Rafael: in the browser, this is a multi-process architecture; the model MUST run in a different process for security
03:02:59 ... being able to share memory across processes would be good for performance, but challenging for security
03:03:25 q?
03:03:29 ... maybe not insurmountable, but there was discomfort
03:04:15 Markus: zero-copy would be the dream, but reducing from 10 copies to far fewer is critical to make this usable in real-world conditions
03:04:19 ack reillyg
03:05:03 reillyg: looking at the implementation sketch Markus put together, that almost matches how we're doing it: when a dev gives us a constant, we create a handle to it, we start uploading that constant to the backend, and let the developer continue building the graph
03:05:24 ... the problem right now is less that the API doesn't allow you to keep only roughly the largest tensor's worth of memory - it does
03:05:36 ... but all the implementations on the browser side and on the JS side aren't handling this very well
03:05:47 ... ONNX Runtime Web requires everything to be in JS memory
03:06:04 ... similarly, today when we create a constant, we keep it in memory in the browser - but we don't have to
03:06:28 ... what Markus describes is how we intend the current API to work, but that's not how implementations exist today
03:07:50 Anssi: is there any spec change that we should make out of this conversation? any normative change for WebNN to enable this optimization?
03:08:18 reillyg: the only two things we might change: if we have an issue with very large constants, we may want to add a streaming constructor for constants
03:09:13 ... and a feedback mechanism to let developers know when they can start loading the next one, with a backpressure mechanism to manage peak memory
03:10:55 Dom: any cooperation we should facilitate with backends/frameworks?
03:11:07 reillyg: I assume the ONNX Runtime Web team is aware of the memory issue
03:11:54 Ningxin: the main issue I think is on the backend side, and whether this would work with the various hardware EPs
03:12:16 ... on the JS framework side, we can probably have a solution
03:12:23 q+
03:13:45 reillyg: adding a streaming constructor would also open the door to the backpressure feature I was describing
03:14:52 RESOLVED: explore streaming constructor for constants
03:15:46 q?
03:15:54 ack RafaelCintron
03:16:10 sushraja has joined #webmachinelearning
03:16:17 q?
03:16:17 RafaelCintron: +1 on the importance of getting this fixed, to be clear
03:16:39 ... I know the ONNX Runtime is trying to fix very similar issues
03:17:06 ... this will also be needed for WebGPU interop
03:17:08 q?
03:17:40 Subtopic: Device selection, state of the union
03:17:46 -> [device selection] https://github.com/webmachinelearning/webnn/labels/device%20selection
03:17:54 Anssi: a bag of issues
03:18:21 ... we have explored API surface-level enhancements for both "before graph compilation", tied to the MLContext, and "after graph compilation", tied to the MLGraph object
03:18:32 ... recently we reached consensus to add a simple accelerator selection mechanism
03:18:38 ... issue #815 was addressed by PR #895
03:18:38 https://github.com/webmachinelearning/webnn/issues/895 -> #895
03:18:39 https://github.com/webmachinelearning/webnn/issues/815 -> #815
03:18:45 ... the minimal design the group landed on is an `MLContext.accelerated` boolean:
03:18:52 ```
03:18:52 interface MLContext {
03:18:52   undefined destroy();
03:18:52 +  readonly attribute boolean accelerated;
03:18:52   readonly attribute Promise lost;
03:18:53 };
03:18:53 ```
03:19:01 Anssi: a corresponding explainer update was #884
03:19:01 https://github.com/webmachinelearning/webnn/pull/884 -> MERGED Pull Request 884 Update explainer with new proposal for simple accelerator mapping (by zolkis) [device selection]
03:19:18 ... we spun off issues for further discussion:
03:19:26 hagio_nhk has joined #webmachinelearning
03:19:26 ... #897 to define the "underlying execution device" concept
03:19:30 https://github.com/webmachinelearning/webnn/issues/897 -> Issue 897 Define "underlying execution device" concept (by anssiko) [device selection]
03:19:34 ... #900 for a CPU fallback hint
03:19:35 https://github.com/webmachinelearning/webnn/issues/900 -> Issue 900 CPU fallback hint (by anssiko) [device selection]
03:19:43 ... #902 for use case-driven scenarios
03:19:44 https://github.com/webmachinelearning/webnn/issues/902 -> Issue 902 Device selection criteria for usecase-driven scenarios (by fdwr) [device selection]
03:20:12 Anssi: we also have spec issue #836, PR #854 and a prototype implementation for the `MLGraph.devices` API
03:20:13 https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo) [device selection]
03:20:13 https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
03:20:35 ... the latest on this is that MarkusH and MikeW are exploring use cases with this design
03:20:44 ... privacy is the key concern with this proposed API enhancement
03:21:05 Anssi: issue #759
03:21:06 https://github.com/webmachinelearning/webnn/issues/759 -> Issue 759 MLOpSupportLimits should be opt-in with base functionality (by mwyrzykowski) [device selection]
03:21:25 ... this proposal from MikeW for providing an API for listing operator support limits is informed by a similar API in WebGPU:
03:21:30 -> WebGPU limits https://www.w3.org/TR/webgpu/#limits
03:21:55 Anssi: the proposed MLOpSupportLimits API returns all available devices with their operator support limits
03:22:00 ... using this information, the web app can choose one of them to initialize a context with
03:22:41 Subtopic: Device selection criteria for usecase-driven scenarios
03:22:45 Anssi: issue #902
03:22:59 Anssi: any device selection feature we design should be motivated by a real-world app scenario / use case
03:23:22 Dwayne: no concrete proposals here
03:24:03 ... the question is how to find the right balance between leaving more freedom to the UA and allowing situations where more device control is required
03:24:44 Markus: the problem is made complex because there are not only CPU/GPU/NPU, but several GPUs and NPUs, sometimes from different vendors
03:25:07 ... WebNN is a really good target for vendors seeking to deploy on the Web interoperably
03:25:31 ... one situation that is challenging is when they need to run multiple models at the same time
03:26:09 ... when professional users have multiple powerful GPUs, we wouldn't want the privacy protections to make it impossible to fully take advantage of their hardware
03:27:05 q?
03:27:06 ... I wondered if a permission prompt similar to camera/mic could be acceptable, which would then grant access to a full query of devices while avoiding silent fingerprinting
03:28:21 Rafael: WebGL and WebGPU have a way to pick a specific high-performance adapter
03:28:32 q+
03:28:49 ... with a restriction on iframes
03:29:22 ... wrt prompting, neither WebGL nor WebGPU has a prompt - how do you handle the situation where the user says no because of prompt fatigue?
03:29:46 vmpstr has joined #webmachinelearning
03:30:05 ... fingerprinting is a real issue - WebGL has been massively used for fingerprinting, based on telemetry
03:30:43 q?
03:30:46 ack reillyg
03:30:55 ... I'm OK with allowing access to high-performance GPUs, and maybe consider a permission prompt for super advanced use cases
03:31:06 reillyg: +1 to Rafael
03:31:25 ... a solution where by default you get the GPU the browser identifies as best
03:31:49 ... I don't think the current WebGPU implementation in Chromium allows using multiple GPUs
03:31:51 q+
03:32:12 ... maybe WebNN should allow querying for NPUs
03:32:45 RafaelCintron: as far as Chromium is concerned, the high-performance request is only supported on Mac
03:32:55 ... not supported on Windows - maybe coming in the future
03:32:57 q?
03:33:03 ack RafaelCintron
03:34:06 hagio_nhk has left #webmachinelearning
03:34:28 Itadakimasu 👋.
04:34:46 Mike_Wyrzykowski has joined #webmachinelearning
04:49:21 sushraja has joined #webmachinelearning
04:49:27 Present+
04:49:35 Ehsan has joined #webmachinelearning
04:49:58 mgifford2 has joined #webmachinelearning
04:50:06 IrisJ has joined #webmachinelearning
04:50:19 kbx7 has joined #webmachinelearning
04:50:19 handellm has joined #webmachinelearning
04:50:26 Tarek3 has joined #webmachinelearning
04:50:49 RobKochman has joined #webmachinelearning
04:51:52 Erik: how much do we need to explore the permission prompt, vs. an optional API gated behind enterprise policy?
04:52:13 Markus: my main point is to make sure we consider scenarios with more complex device selection than just per type
04:52:45 elena has joined #webmachinelearning
04:52:52 ... both because multiple devices of a given type might exist, or because you need to keep a particular job on a device where another job is happening (e.g. decoding)
04:53:09 Erik: how much does this need to be driven by the app vs. via hints?
04:53:21 Markus: I'd be fine with hints, but I'm skeptical they'll suffice
04:54:00 ErikAnderson has joined #webmachinelearning
04:54:23 ... another case might be benchmarking - including done by the app for device selection
04:55:05 Dom: on the web platform, we always need to balance use cases vs. privacy, the 80/20 rule; hints vs. direct control needs this consideration
04:55:10 Mike_Wyrzykowski has joined #webmachinelearning
04:55:27 ... we might be adding a huge new fingerprinting surface
04:55:46 q+
04:55:58 Markus: we'd put this behind a permission prompt
04:56:22 Dom: prompt fatigue and understandability are an issue with adding new permission prompts
04:56:23 q?
04:56:25 ack reillyg
04:56:47 q?
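For reference, an illustrative snippet (not discussed verbatim in the meeting) of the existing WebGPU adapter-selection mechanism Rafael referred to: it is a hint rather than an enumeration, which is part of why it avoids exposing the device list to the page:

```js
// Existing WebGPU precedent: the page asks for a high-performance adapter
// but never enumerates the machine's GPUs, limiting the fingerprinting
// surface while still steering the workload toward the faster device.
const adapter = await navigator.gpu.requestAdapter({
  powerPreference: 'high-performance'
});
const device = await adapter?.requestDevice();
```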
04:57:24 thomasN: +1 on the trade-off with privacy; one successful strategy has been to look at what data has already been exposed
04:57:29 reillyg: the question on benchmarking is a good one
04:57:40 ... we expect developers already do this to decide what they can run
04:58:07 ... if this is something we can provide them, instead of getting them to run benchmark workloads that are wasteful
04:58:30 ... the question is how to express capabilities as numbers, which is difficult in the same way hints are
04:58:45 q+
04:58:55 ... we've seen this as relatively successful in the WebGPU context and it might be useful here for NPUs
04:58:59 q?
04:59:01 ... but it's unclear which numbers to provide
04:59:01 ack Mike_Wyrzykowski
04:59:03 ErikAnderson has joined #webmachinelearning
04:59:35 MikeW: do we need to expose OpsSupportLimits by processing unit?
05:00:02 ... (as I commented on the issue https://github.com/webmachinelearning/webnn/issues/902)
05:00:14 q?
05:00:28 ningxin has joined #webmachinelearning
05:00:37 reillyg: this would be very helpful, but it's not information made available by platforms - e.g. CoreML doesn't provide stats on the capabilities of the NPU
05:00:47 ... similar situation on other platforms
05:01:14 ... this would be a great enhancement to the API
05:01:33 Anssi: how does this relate to #759?
05:01:34 https://github.com/webmachinelearning/webnn/issues/759 -> Issue 759 MLOpSupportLimits should be opt-in with base functionality (by mwyrzykowski) [device selection]
05:01:56 MikeW: they're related but different; as Reilly says, the challenge is making that information queryable
05:02:44 markafoltz has joined #webmachinelearning
05:02:58 reillyg: for #759, we recently updated the WPT to differentiate required and optional tests, to represent that idea of things developers can or cannot rely on
05:03:05 ... not sure this has been reflected in the spec
05:03:20 q?
05:03:48 +1 to hints for now.
05:04:03 reillyg: beyond choosing devices, there is also a scheduling aspect to this
05:04:33 ... e.g. if there are real-time vs. non-real-time workloads running in parallel, helping the UA schedule via hints would be useful
05:04:58 q?
05:05:00 Ugur has joined #webmachinelearning
05:06:06 Dom: in addition to a permission prompt, there's also a discussion about integrating permission management with page embedded permission control (PEPC)
05:06:28 ... it doesn't change the discussion, but it changes how this is embedded in the UX so we don't have prompts coming out of nowhere
05:06:58 ... for a more advanced query API we need to look at it in the context of this new proposal for permission management
05:07:03 q?
05:07:41 Markus: if we have hints, how do we validate that they work?
05:08:17 reillyg: a developer can't measure how their app runs if we don't provide the metrics
05:08:59 ... Phillis has a proposal to expose which device the model is running on
05:09:02 Subtopic: CPU fallback hint
05:09:13 Anssi: issue #900
05:09:13 https://github.com/webmachinelearning/webnn/issues/900 -> Issue 900 CPU fallback hint (by anssiko) [device selection]
05:09:36 ... the group has explored a "CPU fallback" hint, a flag to expose to web content whether a CPU fallback mechanism is active
05:09:39 ... spun off from the "accelerated" hint, a feature discussion that landed
05:10:29 MarkusH: we have use cases where knowing if the workload will be accelerated is critical to deciding whether to run it or not
05:10:56 ... we would want to abort if we detect CPU fallback, before or after compilation
05:11:05 ... before would help saving download cost
05:11:15 q?
05:11:18 q+
05:11:32 ack reillyg
05:11:44 reillyg: the previous discussion about OS support for GPU/NPU devices is helpful here
05:12:03 ... in general, the answer to "is CPU fallback active" before compilation is always "yes"
05:12:18 ... it's always supported
05:12:46 q+
05:12:52 ack ningxin
05:12:56 ... how do we help developers determine whether to use a faster vs. a better model based on GPU availability
05:12:57 q+
05:13:22 Ningxin: we should distinguish "CPU-only" vs "CPU fallback" - the latter is always available
05:13:43 ... what you want here is to avoid accelerated=false
05:14:23 q?
05:14:30 ... we can set context.accelerated=false if we detect the GPU/NPU won't work
05:15:05 reillyg: one question is "do you have a GPU/NPU?" - if not, this means we're in a CPU-only situation
05:16:03 ... if it's about fallbacks - do we want to provide an option to fail compilation if it will end up running on the CPU - but that only works after compilation, which you want to avoid
05:16:03 q+
05:17:22 reillyg: we should clarify that the issue is about detecting whether a GPU/NPU is available - for a pre-compilation situation
05:17:25 ack ErikAnderson
05:17:53 It's not just if a device has the GPU/NPU but if a user wants to have the LLM run on their device. It may be a matter of user preference, but also energy usage. Users may be happy running a GPU/NPU in some locations or times, and not others, based on things like local energy costs or reliability. Battery life as well.
05:18:05 ErikA: similar to the discussions in WebGL/WebGPU
05:18:35 MarkusH: we can always try it and check whether it runs well in real time
05:19:27 ErikA: in WebGL, you can create a context in a way that makes it fail if you hit performance challenges
05:19:36 q?
05:19:39 For context: https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext#failifmajorperformancecaveat
05:19:42 ack Tarek
05:19:46 reillyg: we should start simple - "can it run fast at all" - and look at more detailed evaluation in a later phase
05:20:38 Tarek: I had similar questions around concurrency: if the existing accelerated hardware is already in use, should that be exposed to the app?
05:21:15 reillyg: a given app might run separate models/graphics rendering in parallel - we should help the app negotiate to figure out which workloads to run where
05:21:28 Tarek: so the orchestration might happen on both sides?
05:21:29 kush has joined #webmachinelearning
05:22:00 reillyg: right - an app might have more workloads to run than are runnable in parallel on a given system
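Returning to the pre-download check MarkusH described above, a minimal sketch assuming the MLContext.accelerated attribute from PR #895 lands as shown earlier; the exact semantics are still under discussion, and runModel() plus the model URLs are placeholders:

```js
// Illustrative sketch: decide whether to fetch a large model before
// compilation, based on whether the context reports acceleration.
// Assumes the proposed MLContext.accelerated boolean (PR #895).
const context = await navigator.ml.createContext({
  powerPreference: 'high-performance'
});

if (context.accelerated) {
  // GPU/NPU-backed context: worth downloading the full-quality model.
  await runModel('models/segmenter-large.onnx', context);
} else {
  // CPU-only: skip the large download and fall back to a smaller model.
  await runModel('models/segmenter-small.onnx', context);
}
```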
05:22:47 https://github.com/webmachinelearning/webnn/issues/900 -> Issue 900 CPU fallback hint (by anssiko) [device selection]
05:23:10 MarkusH: another aspect is time-sensitivity: a video frame needs to be processed in real time, whereas the answer to a chat-bot query to an LLM is much less time-sensitive
05:23:21 big-screen has joined #webmachinelearning
05:25:21 MarkusH: a boolean flag on whether it is accelerated is probably a good enough starting point
05:25:27 phillis has joined #webmachinelearning
05:26:08 Subtopic: Get device selection information after graph compilation
05:26:09 mtavenrath4 has joined #webmachinelearning
05:26:13 Anssi: issue #836 and PR #854
05:26:13 https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo) [device selection]
05:26:13 https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
05:26:30 ... the group thinks we need the following two to advance:
05:26:35 ... 1) strong use cases
05:26:42 ... 2) check that the API design is privacy-preserving
05:26:55 Anssi: MarkusH from Google Meet shared his key use case, adaptation
05:27:10 `graph.devices` could help identify:
05:27:15 BenGreenstein has joined #webmachinelearning
05:27:23 a) what resources a misbehaving model is using, and
05:27:32 b) which models are candidates to stop that would help the situation
05:27:41 q+
05:28:00 Anssi: MikeW commented:
05:28:08 ... "Another way of achieving the same thing is the web app sorts its workloads in priority, terminating lower priority ones (1). Or some type of metric reporting that the model was stalled K ms waiting to run due to other work on the system and took S ms to complete (2)"
05:28:45 MikeW: the problem is that the information on which device has been selected isn't static
05:29:13 ... a workload that has run on a GPU may run on the NPU the next run, or fall back to the CPU
05:29:37 q?
05:29:41 ack Mike_Wyrzykowski
05:29:46 ... I can see the value in expressing that the graph can run on an accelerated unit, but reporting the last device on which it has run is not very reliable
05:30:09 q+
05:30:26 q+
05:30:30 reillyg: is there still value in reporting on which devices the workload might run? e.g. GPU or NPU; would that be good enough for applications?
05:30:55 MikeW: could we just return that it can run accelerated vs. a specific device value?
05:31:00 ack Mike_Wyrzykowski
05:31:28 ... the distinctions between specific hardware types are evolving, and it's not obvious they're needed by the app
05:31:28 ack ErikAnderson
05:31:33 MarkusH: I think that could work
05:32:09 Erik: an app author might want to know how much of the workload runs on which unit
05:32:15 q-
05:33:38 ... I'm not sure the proposal in the PR provides enough reliable context
05:33:41 dom: two aspects: things you want to operate and course-correct live, and things you want to monitor to know if you want to modify the system later; is a separation-of-concerns approach appropriate here?
05:34:06 MarkusH: when we detect that we're not operating in real-time compatible ways, we need to take action
05:34:23 q?
05:34:31 ... the proposal in the PR could help; an accelerated flag would probably suffice
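To illustrate the kind of adaptation loop MarkusH describes, a purely hypothetical sketch: neither graph.devices (PR #854) nor a graph-level accelerated status is in the spec, and stopBackgroundWorkloads() is a placeholder for whatever load-shedding the app does:

```js
// Hypothetical sketch: post-compilation adaptation based on a strawman
// graph-level "accelerated" status (the PR #854 design is still open).
const graph = await builder.build({ output });

if (graph.accelerated === false) {
  // e.g. stop a lower-priority background model or switch to a lighter
  // effect so the real-time pipeline keeps meeting its deadlines.
  stopBackgroundWorkloads();
}
```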
05:34:52 phillis: if it's hybrid, what do we report?
05:35:39 dom: would an enum be actionable?
05:35:50 MarkusH: in practice, it would depend on how much it runs on the CPU
05:36:14 Youngmin has joined #webmachinelearning
05:36:22 reillyg: there is a cost in using multiple units (even GPU + NPU) - so maybe "hybrid" is worth reporting in general
05:36:39 ... at some point, some of the performance detection can only be done by the app developers
05:37:47 RESOLVED: Phillis to refine the proposal to reflect an accelerated status, with discussions on hybrid still TBD
05:37:52 How much of this is inherent to hardware design? Will switching costs matter in 5 years? Probably. What influence might the W3C have on the future of what this technology makes available?
05:38:14 Subtopic: MLOpSupportLimits
05:38:21 Anssi: issue #759
05:38:21 https://github.com/webmachinelearning/webnn/issues/759 -> #759
05:38:42 MikeW: we should define limits that are supported across all devices
05:39:03 q?
05:40:02 reillyg: we have this in the tests; a goal for us implementation-wise is to make sure that the implementation we have can implement all operators, and make the operators that can't be implemented optional
05:40:16 ningxin has joined #webmachinelearning
05:40:39 Topic: Customer feedback & collaborations
05:42:05 Anssi: customer feedback - including from end-users, frameworks and independent software vendors - is extremely important throughout the process of developing new Web APIs, starting with use case identification, requirements gathering and hands-on feedback from early adopters, all the way to the maintenance phase when large-scale deployment happens
05:42:14 ... we have used a dedicated community-maintained repo, Awesome WebNN, to document various signals from customers and developers at large
05:42:20 -> https://github.com/webmachinelearning/awesome-webnn
05:43:07 Anssi: I recognize many customers are not comfortable speaking publicly about their future products' use of the WebNN API at this time, so I ask for sensitivity in this regard
05:43:23 ... that said, we have some brave early adopters who have worked with us in public
05:43:32 ... kudos to the Google Meet team and Markus in particular for sharing feedback, reviewing our proposals and also submitting new feature requests for consideration
05:47:22 Subtopic: RTC-style workloads with response time requirements
05:47:29 Anssi: issue #898
05:47:29 https://github.com/webmachinelearning/webnn/issues/898 -> Issue 898 Support for workloads with response time requirements (realtime) (by handellm) [Agenda+]
05:47:55 ... Markus provided customer feedback from the Google Meet product, where RTC-style workloads have strict response time requirements
05:47:58 ... the assumption is that the system, while not under load, is able to execute the workload
05:48:19 MarkusH: I see a future where we run more and more concurrent ML workloads on our systems
05:48:41 ... if the system can't detect what's real-time or not, it may not be able to orchestrate it
05:49:07 ... e.g. audio processing needs to run within certain time requirements, at the risk of audio glitches or robot voices otherwise
05:49:37 q?
05:49:38 ... if we can't rely on these deadlines being respected, this creates an adoption blocker
05:49:47 ... the same is true (at a different scale) for video processing
05:50:28 ... also, there is prioritization - not all audio processing may be as critical
05:50:35 q?
05:50:41 ... we've also documented situations of misbehaving concurrent workloads
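[For context, a minimal sketch of the "try it and check at run time" approach mentioned earlier, since the API gives no real-time guarantee today. performance.now() is a real API; runInference stands in for whatever execution call the app uses, and the budget numbers are illustrative.]

    // Measure worst-case inference latency over a few warm runs and bail out of
    // the real-time path if the budget is not met.
    async function meetsDeadline(runInference, budgetMs, samples = 10) {
      let worst = 0;
      for (let i = 0; i < samples; i++) {
        const start = performance.now();
        await runInference();
        worst = Math.max(worst, performance.now() - start);
      }
      return worst <= budgetMs;
    }

    // Example: a 10 ms audio frame leaves only a few milliseconds for ML work.
    // if (!(await meetsDeadline(() => runNoiseSuppression(frame), 5))) { useDspFallback(); }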
05:51:08 q+
05:51:35 ack mtavenrath
05:51:48 MarkusT: it feels like a hard-to-address problem in general
05:52:15 ... e.g. ONNX Runtime doesn't have a sense of real time
05:53:06 ... tasks get queued, so if a task gets queued behind a slow task (e.g. an LLM request), you can't really accelerate it
05:53:30 ... not sure there is a prioritization mechanism on all types of devices
05:53:56 q+
05:53:57 q+
05:54:00 ack Tarek
05:54:01 ... even getting this orchestrated natively is hard, because the frameworks don't support the infrastructure you would need to execute properly
05:54:17 Tarek: do we really want to do that in WebNN?
05:54:40 ... we're starting from this situation of wanting to run concurrent workloads via a background utility
05:54:54 q+
05:55:31 ... should this be done by the app or in the backend? does it even make sense to run several things on a GPU?
05:56:09 MarkusT: do you know how much of the available time window the task will take?
05:56:25 MarkusH: on CPU, this is a solved problem with OS priorities
05:57:15 ... when workloads get interleaved on GPUs, there is an opportunity for prioritization
05:57:37 MarkusT: pre-emption is now available on GPUs, but it is much more expensive than on a CPU
05:57:57 ... but overall, this gets us back to my device selection issue
05:58:08 ... e.g. audio processing you probably want on the CPU, where the data is anyway
05:58:33 ... conversely, video processing happens on the GPU, and you'd want to use the device used to render the video as well
05:58:57 q?
05:59:27 MarkusH: audio processing might be best run on the NPU for power efficiency
05:59:29 q+
05:59:36 ack RafaelCintron
05:59:59 MarkusT: that's where it's useful to know which devices are available, and to support benchmarking
06:00:35 Dingwei has joined #webmachinelearning
06:00:35 RafaelCintron: hints could communicate priorities that would map to processing queues
06:00:39 q-
06:00:50 ack dom
06:01:59 dom: a reflection - this need to orchestrate processing across latency and power efficiency comes up around many APIs on the web platform, and each time we create a hint we should align across APIs
06:02:36 ... you don't want to switch from one GPU to another GPU, you want continuity, but how to describe that in a declarative way is the question
06:03:13 ... these are more general questions; Google Meet is a good use case to look at for RTC-application requirements
06:03:15 q?
06:03:41 RRSAgent, draft minutes
06:03:43 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
06:03:52 MarkusH: I was discussing with Youenn the concept of worker priority that was proposed by Intel a couple of years ago
06:04:26 ... e.g. an "audio" worker priority would be exposed to WebNN
06:04:36 ... and influence how that job would run
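[For context, a hypothetical sketch of the worker-priority idea just mentioned; the Worker constructor has no priority option today (only type, name and credentials), and both the option value and the worker script name are illustrative - see the QoS breakout linked below.]

    // Hypothetical "audio" QoS priority on a worker hosting the audio ML pipeline.
    const audioWorker = new Worker('noise-suppression-worker.js', {
      type: 'module',
      priority: 'audio',  // hypothetical hint, not part of the Worker API today
    });
    // A WebNN context created inside that worker could inherit the priority,
    // letting the user agent schedule its inference ahead of best-effort work.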
06:05:14 AlexDawson has joined #webmachinelearning
06:05:24 -> https://www.w3.org/2023/Talks/TPAC/breakouts/web-worker-qos/ Web Worker Quality of Service breakout presentation at TPAC 2023
06:05:34 RRSAgent, draft minutes
06:05:35 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html dom
06:34:21 phillis has joined #webmachinelearning
06:37:03 Topic: Implementation plans and trials
06:37:17 Tarek has joined #webmachinelearning
06:37:20 Anssi: in this session we'll discuss and share the latest news on the next steps for implementations, Origin Trial or equivalent, and new requirements or feedback from updates in backends and frameworks
06:37:28 ... but first, we kick off with exciting demos to whet your appetite!
06:38:02 simone has joined #webmachinelearning
06:38:05 -> WebNN Developer Preview demos https://microsoft.github.io/webnn-developer-preview/
06:38:49 Present+ Tara_Whalen
06:38:49 Present+ Simone_Onofri
06:40:35 [showing WebNN Stable Diffusion Turbo running both on GPU and NPU]
06:41:12 Present+ Iris_Johnson
06:41:34 [showing WebNN Segment Anything demo running on GPU and on NPU]
06:41:57 handellm has joined #webmachinelearning
06:43:46 [showing WebNN Whisper Base on GPU and NPU]
06:44:23 -> WebNN via Transformers.js https://huggingface.co/webnn/spaces
06:45:11 [showing background removal based on MODNet, demo hosted on Hugging Face, on GPU & NPU]
06:45:16 Dingwei has joined #webmachinelearning
06:45:41 [showing real-time object detection with YOLO12n]
06:46:23 tara has joined #webmachinelearning
06:46:34 [real-time depth estimation with Depth Anything V2]
06:50:35 [demo of background blur done with WebNN in a full WebGPU pipeline, with 23% improved performance and 17% lower power consumption]
06:51:18 Anssi: thank you Ningxin for these compelling demos!
06:51:27 alispivak has joined #webmachinelearning
06:51:44 mgifford2 has joined #webmachinelearning
06:51:51 Subtopic: Browser vendors' trials
06:52:12 Ugur has joined #webmachinelearning
06:52:30 Anssi: we've discussed on our telcons that the Origin Trial in Chrome is getting closer, latest discussion:
06:52:37 -> https://www.w3.org/2025/10/23-webmachinelearning-minutes.html#1faa
06:52:41 ... we also discussed that Edge follows the upstream work with only a small 5-10 day delay and will launch an Origin Trial in sync
06:52:45 ... more information about Origin Trials will be made available at:
06:52:49 -> Chrome Origin Trials https://developer.chrome.com/origintrials/
06:52:53 -> Edge Origin Trials https://developer.microsoft.com/en-us/microsoft-edge/origin-trials
06:53:13 RRSAgent, draft minutes
06:53:15 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
06:53:41 reillyg: I imagine this landing in the 2nd or 3rd release of the new year in the Chrome OT
06:54:22 Topic: Horizontals
06:54:27 Anssi: in this session we get to know the experts behind horizontal groups
06:54:31 Subtopic: Ethics
06:54:50 Anssi: for ethics, I've proposed to make this group's Ethical Principles for Web Machine Learning a joint deliverable with the Web & AI Interest Group, a group that is being proposed
06:55:06 ... by doing this, we can tap into the expertise of that Interest Group to help advance this important deliverable on the W3C Note track
06:55:12 ... we currently refer to this document in the WebNN spec's Ethical Considerations section
06:55:32 -> https://www.w3.org/TR/webmachinelearning-ethics/
06:55:36 -> https://www.w3.org/TR/webnn/#ethics
06:56:30 dom: ethics has not received a lot of bandwidth, which is why we propose to make it a joint deliverable with the Web & AI IG; the document was written in 2022-23, a long time ago considering the rate of development in the AI space
06:56:35 q?
06:56:59 dom: also, the Ethical Web Principles have been endorsed as a W3C Statement
06:57:00 q?
06:58:07 Subtopic: Sustainability
06:58:14 Anssi: I've asked Mike Gifford, co-chair of the Sustainable Web IG, to talk about the work done in that group
06:58:18 -> https://www.w3.org/groups/ig/sustainableweb/
06:58:22 Anssi: per TAG review feedback, we're expected to evaluate the sustainability impact of WebNN, see issue #861
06:58:23 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution] [Agenda+]
06:58:36 https://www.w3.org/TR/web-sustainability-guidelines/
06:58:36 https://github.com/w3c/sustainableweb-wsg/issues/139
06:58:36 https://github.com/w3c/sustainableweb-wsg/issues/139 -> Issue 139 Adding a comment about what is or isn't included with AI (by mgifford) [enhancement] [editorial]
06:58:44 RRSAgent, draft minutes
06:58:45 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
06:58:57 MikeG: we're in an environmental and climate crisis, which we need to integrate into our work
06:59:17 ningxin has joined #webmachinelearning
06:59:28 ... the goal of the Web Sustainability Guidelines is to create a Web standard that other institutions can use as guidelines to evaluate how sustainable their technologies are
06:59:49 ... the size of the average Web page has grown beyond the scale of the information it's providing
06:59:50 q?
07:00:19 ... which relates to Web performance, although we have considerations that are completely separate - e.g. water consumption
07:00:52 present+
07:01:11 ... we're here because AI is changing a lot for the Web; we're seeing agentic browsers, the rise of AI in everything whether you want it or not, with a huge environmental impact, with data centers' growing impact on electricity, water and sound
07:01:33 present+
07:02:08 ... we're interested in evaluating the overlap between our groups: the environmental impact of decentralizing AI inference to devices, given their lower optimization compared to data centers
07:02:13 ... we have a few questions:
07:03:01 Present+ Mike_Gifford
07:03:02 ... - what advice can you give us as we're starting to write up guidance on accessibility [suspect AI was meant]
07:03:12 q?
07:03:43 Anssi: what is the best way for participants of this group to help with this? GitHub repo?
07:03:52 MikeG: yes
07:04:31 Anssi: are there AI-related issues we can help with? initially AI wasn't really part of the scope, as I understand it
07:04:53 MikeG: right - we can't not address this given the impact of AI
07:05:14 ... our guidelines are expected to address different contexts and audiences
07:05:31 ... we're not sure yet whether to include AI in a cluster or distribute it across the document
07:05:57 Anssi: for this group to help, having AI-focused content would make it easier
07:06:18 Present+ Alex_Dawson
07:06:20 q?
07:06:30 MikeG: we have lots of infrastructure to help navigate the guidelines and issues through a well-defined taxonomy, which will help with this
07:06:44 ... there is also the question of data centers, on which we could use expertise from people here
07:07:38 dom: sustainability is currently an IG that's working on a Note, and the direction is toward a horizontal group
07:08:01 ... the horizontal definition is more of a cultural one; ethical web considerations tell us to consider sustainability
07:08:02 q?
07:09:06 q?
07:09:07 MikeG: how does your group deal with a fast-evolving ecosystem such as AI?
07:09:50 Anssi: we try to find the right level of abstraction that stands the test of time, as Web standards have tried to do
07:10:05 ... similar to the discussion about to what extent the NPU/GPU distinction matters
07:10:20 W3C already has a Societal Impact self-review; there is scope for a potential sustainability self-review in the future.
07:10:34 reillyg: we also depend on what developers will want to use to provide the best UX
07:11:15 ... so we're more reactive, where the sustainability work would be more proactive in pushing in a given direction
07:11:31 MikeG: aligning incentives towards good sustainability is a key challenge we face
07:12:31 ... Small vs Large Language Models: the former seem more environmentally friendly, but will that distinction remain relevant over time?
07:12:41 Anssi: the Mobile Web Best Practices document had that very issue
07:13:52 q+
07:14:29 ack Tarek
07:14:52 Tarek: re SLM, at Mozilla the definition we used a year ago no longer works today
07:15:03 Ugur has joined #webmachinelearning
07:15:15 ... we're looking at device tiers: non-capable, devices with certain capabilities, high-end devices
07:15:27 ... we've found that more robust over time
07:16:02 MikeG: any suggestion on how to classify models instead of devices?
07:16:28 Tarek: anything that doesn't spit out a continuous stream of tokens is an SLM
07:16:30 q+
07:17:16 Sushanth: we put the boundary at 7B parameters
07:17:25 Tarek: but it's at risk of changing in a few months
07:17:52 Thomas: I don't think the number of parameters belongs in a guideline context: the guideline should be about "use the smallest possible model"
07:18:03 fershad has joined #webmachinelearning
07:18:14 ... with the caveat that an already-available model on device might be a better option
07:20:45 q+
07:22:39 Anssi: re model selection, this should be about selecting the right tool for the right job, but it is a complicated evaluation to make given the variety of toolboxes available to people
07:23:05 Subtopic: Privacy and Security
07:23:09 Anssi: late last year, the Privacy Working Group was launched, replacing the Privacy Interest Group
07:23:13 ... what's new in this transition?
07:23:17 -> https://www.w3.org/2024/10/wg-privacy-charter.html
07:23:20 Tara: the transition from the IG to the WG hasn't really changed much in terms of the review work
07:23:21 q-
07:24:10 Ugur has joined #webmachinelearning
07:24:23 AlexDawson has left #webmachinelearning
07:24:29 Tara: Simone and I are going to run a joint presentation
07:24:51 Anssi: I always struggle a bit to delineate between privacy and security
07:24:59 q-
07:25:05 Tara: they do have a lot in common
07:26:17 ... we have specialized guidance, but this shouldn't be a source of concern on your end
07:26:29 Anssi: the Security Interest Group was recently launched to reinvigorate work advising groups developing standards on how to avoid and mitigate security issues
07:27:01 Slideset: https://docs.google.com/presentation/d/11m1TXLVzhnIEimyqIjgs0VTXhhA4wWIGyATEw664Ei0/edit
07:27:12 [slide 1]
07:27:15 [slide 2]
07:28:09 elena has joined #webmachinelearning
07:28:27 [slide 3]
07:28:46 RRSAgent, draft minutes
07:28:47 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
07:29:53 [slide 4]
07:32:39 [slide 5]
07:33:11 ThomasNattestad has joined #webmachinelearning
07:33:58 [slide 6]
07:35:28 q+
07:35:51 RobKochman has joined #webmachinelearning
07:35:54 q?
07:35:57 ack ThomasNattestad
07:36:37 Ugur has joined #webmachinelearning
07:37:43 ThomasN: re fingerprinting, one of the perennial questions that keeps popping up is that there is already so much entropy that it's not obvious how much of it can still be mitigated
07:37:51 q+
07:38:30 ... is this a tractable problem, and something that is worth spending time mitigating at the spec level?
07:39:38 Tara: I think we're pushing towards a better space, and so we feel it's worth considering the trade-offs that keep that path open
07:40:25 ThomasN: it's hard to evaluate the cost to developing the API and, more importantly, to the ability of developers to fulfill their use cases
07:40:33 ack christianliebel
07:40:48 q+ christianliebel
07:40:55 [slide 7]
07:42:13 [slide 8]
07:43:01 [slide 9]
07:43:20 [slide 10]
07:43:51 q?
07:44:19 [slide 11]
07:44:48 [slide 12]
07:45:14 [slide 13]
07:45:34 [slide 14]
07:46:10 [slide 15]
07:46:49 ack christianliebel
07:47:15 christianliebel: the APIs we build in the CG/WG are on-device
07:47:40 ... how different would a trusted execution environment in the cloud be from the security/privacy perspectives?
07:47:24 Topic: Wrap up
07:47:31 Anssi: thank you everyone for your active participation and productive discussions
07:47:36 ... this day was packed and we managed to finish with gusto!
07:48:03 ... special thank you to our guests Mike, Tara and Simone, who joined to share important work happening across horizontals
07:48:07 ... also huge thanks to Ningxin & team for the case study and compelling demos that both inform our future direction and demonstrate the exciting web experiences we already enable today with WebNN
07:48:12 ... interested folks are welcome to join us for dinner
07:48:26 ... we're quite a large group, so the plan is to meet in the Portopia Hotel (adjacent to the Kobe International Conference Center) lobby at 18:15 to coordinate on transport and restaurants, likely splitting into multiple groups based on preferences
07:49:09 RRSAgent, draft minutes
07:49:10 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html dom
07:49:28 RRSAgent, draft minutes
07:49:30 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik