23:55:30 RRSAgent has joined #webmachinelearning
23:55:34 logging to https://www.w3.org/2025/11/09-webmachinelearning-irc
23:55:34 RRSAgent, make logs Public
23:55:35 please title this meeting ("meeting: ..."), anssik
23:55:47 RafaelCintron has joined #webmachinelearning
23:57:40 awafaa has joined #webmachinelearning
23:57:51 Present+
23:57:53 Scribe+
23:58:16 DenisD has joined #webmachinelearning
23:58:22 Present+ Andrew_Wafaa
23:58:50 Remote + Denis_DIDIER
23:58:55 Meeting: Web Machine Learning WG F2F – 10 November 2025
23:59:00 Chair: Anssi
23:59:06 Agenda: https://github.com/webmachinelearning/meetings/issues/35
23:59:10 Scribe: Anssi
23:59:14 ningxin has joined #webmachinelearning
23:59:16 scribeNick: anssik
23:59:21 scribe+ dom
23:59:25 gb, this is webmachinelearning/webnn
23:59:26 anssik, OK.
23:59:30 Present+ Anssi_Kostiainen
23:59:40 Present+ Dominique_Hazael-Massieux
00:00:51 Present+ Tarek_Ziade
00:00:59 Present+ Ningxin_Hu
00:01:13 Present+ Sushanth_Rajasankar
00:01:18 Present+ Erik_Anderson
00:01:33 Present+ Thomas_Steiner
00:01:57 Present+ Markus_Tavenrath
00:02:33 Present+ Mark_Foltz
00:02:52 big-screen has joined #webmachinelearning
00:04:23 hyojin has joined #webmachinelearning
00:04:55 acomminos has joined #webmachinelearning
00:04:59 Mike_Wyrzykowski has joined #webmachinelearning
00:07:29 DwayneR has joined #webmachinelearning
00:07:29 reillyg has joined #webmachinelearning
00:08:39 markafoltz has joined #webmachinelearning
00:09:58 ErikAnderson has joined #webmachinelearning
00:09:58 Mark_Foltz has joined #webmachinelearning
00:11:08 sushanth has joined #webmachinelearning
00:11:08 Tarek has joined #webmachinelearning
00:11:08 kush has joined #webmachinelearning
00:11:28 Topic: Welcome
00:12:11 mtavenrath has joined #webmachinelearning
00:12:11 Kenji_Baheux3 has joined #webmachinelearning
00:12:11 alispivak has joined #webmachinelearning
00:12:17 Anssi: welcome to the W3C Web Machine Learning WG F2F at TPAC 2025, this is our second physical F2F
00:12:25 ... I'm Anssi Kostiainen, Intel, the chair of the WG
00:13:08 ... with me is Dom, Dominique Hazael-Massieux, W3C Staff, helping run the meeting smoothly
00:13:26 ... again, great to see so many folks here in person and new people outside the usual WG participants, including participants and guests who represent Japanese W3C members and organizations
00:13:32 kbx has joined #webmachinelearning
00:14:12 ... Arigato gozaimasu!
00:14:18 ... this WG has continued to grow rapidly since last year, we have all major browser vendors on board and new folks are joining
00:14:26 ... the YoY growth is around +30% in both organizations and participants, for both this WG and its sister CG
00:14:32 ... a few new members who joined the WG since last F2F:
00:14:37 ... Hugging Face
00:14:40 RobKochman has joined #webmachinelearning
00:14:44 Present+ Rob_Kochman
00:14:56 ... Qualcomm
00:15:01 ... NVIDIA
00:15:07 ... ARM
00:15:14 ... Shopify
00:15:45 jets has joined #webmachinelearning
00:15:46 ... we are working at the intersection of Web & AI/ML technologies during this time of exponential growth in AI, and we're lucky to have such a diverse group of experts on board:
00:15:54 dezell2 has joined #webmachinelearning
00:15:54 BenGreenstein has joined #webmachinelearning
00:15:59 Present+ Reilly_Grant
00:16:12 ... all browser vendors, OS vendors, major semiconductor companies invested in AI, major platform providers, ISVs, distinguished researchers from academia, individuals, and more
00:16:55 ... if you registered as a WG participant, please join us at the table
00:17:16 ... observers are welcome to join the table too, subject to available space
00:17:20 Anssi: we use Zoom for a hybrid meeting experience, please join using the link in the meeting invite
00:17:34 Anssi: we use IRC for official meeting minutes and for managing the speaker queue
00:17:40 ... please join the #webmachinelearning IRC channel, link in the meeting invite and agenda:
00:17:46 -> https://irc.w3.org/?channels=#webmachinelearning
00:17:52 -> https://github.com/webmachinelearning/meetings/issues/35
00:17:53 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
00:18:20 Anssi: to put yourself on the queue, type "q+" in IRC
00:18:27 ... during the introductions round, we'll try to record everyone's participation on IRC with:
00:18:31 ... Present+ Firstname_Lastname
00:19:04 Ugur-Depixen has joined #webmachinelearning
00:19:04 MasaoG has joined #webmachinelearning
00:19:11 thelounge5 has joined #webmachinelearning
00:19:13 Present+ RafaelCintron
00:19:29 Ugur_Acar_Depixen has joined #webmachinelearning
00:19:47 ... please check that your participation is recorded on IRC so we're able to acknowledge your presence in the meeting minutes
00:19:49 Subtopic: Intros
00:19:58 Anssi: since we're many again, we'll do a quick round of introductions, 15 seconds each: full name, affiliation and key interest
00:20:39 phillis has joined #webmachinelearning
00:20:55 Dwayne: WebNN spec editor, with a focus on new operators, at Microsoft
00:21:22 hagio_nhk has joined #webmachinelearning
00:21:26 Rafael: also at Microsoft, on the Edge browser, working on all things AI and graphics rendering
00:21:48 MikeW: at Apple, involved in WebNN and also WebGPU
00:22:11 Denis: involved in the Sustainable Web Guidelines
00:22:56 Phillis: at Google, working with Reilly on the WebNN implementation in Chromium
00:23:06 Introduction: Denis DIDIER, from France - Company ITHENKA, contributor to the W3C Sustainable Web Guidelines, and to Sustainable AI with the French non-profit Institute for Sustainable IT.
00:23:22 handellm has joined #webmachinelearning
00:23:39 Reilly: implementing WebNN in Chromium
00:23:50 Sushanth: at Microsoft, built-in AI
00:23:58 ErikA: manager on the Edge browser team
00:24:21 Ugur: working on AI solutions for the construction industry as chief AI officer
00:24:50 AndrewW: at ARM, involved in our open source and standards strategy team
00:25:00 Tarek: from Mozilla, on the Firefox AI team
00:25:22 Markus: from NVIDIA, devtech supporting ISVs integrating ML in their apps, getting involved in standardization to make their lives easier
00:25:54 Ningxin: co-editor of the WebNN spec, at Intel
00:26:03 Dom: staff contact for the group and looking at the impact of AI on the Web
00:26:12 @@@: at Google, looking at the integration of WebNN in Chromium
00:26:33 MarkF: at Google, working in Chrome on AI & agentic features, involved in WebMCP
00:26:45 Kenji: Chrome, built-in AI
00:26:58 Ben: Chrome, similar to Mark on agentic AI
00:27:22 RobK: Chrome team, WebMCP
00:27:27 Thomas: devrel at Chrome
00:27:49 Ali: program manager at Google, supervising ML/GPU work
00:28:19 DavidEzell: Conexxus - excited by this group; we're a standards body hoping to turbocharge retail vendors with our standards
00:28:27 Brian: @@@
00:28:39 YutaHagio: working for NHK, the Japanese broadcaster
00:28:48 ChiaraCerretti: @@@
00:29:03 GuidoU: Google, WebRTC APIs in Chrome, exploring applications of AI
00:29:13 MarkusH: Google Meet, interested in AI/WebRTC
00:29:26 Diogo: Brazilian W3C Office
00:29:34 SamGoto: Google Chrome, platform APIs
00:29:40 @@@: Meta browser
00:30:07 Masao: @@@
00:30:31 Present+ Kenji_Baheux
00:30:34 Present+ Brian_McManus
00:30:39 present+ Mike_Wyrzykowski
00:30:40 Present+ Markus_Handell
00:30:57 Present+ Ben_Greenstein
00:31:12 chiace has joined #webmachinelearning
00:31:21 alispivak has joined #webmachinelearning
00:31:23 Ugur_Acar has joined #webmachinelearning
00:31:28 Present+ Chiara_Cerretti
00:31:36 Present+ Ugur_Acar_Depixen
00:31:41 Present+ Ali_Spivak
00:31:51 MasaoG has left #webmachinelearning
00:31:52 Subtopic: Agenda bashing
00:31:59 Anssi: the F2F agenda was built collaboratively with you, the WG participants, and is published on GitHub:
00:32:04 -> https://github.com/webmachinelearning/meetings/issues/35
00:32:05 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
00:32:36 Anssi: any last-minute proposals or updates?
00:32:44 Masao_Goho has joined #webmachinelearning
00:33:00 Present+ Hyojin_Song
00:35:34 Meet after the meeting for meat or no meat.
00:36:52 Subtopic: Charter orientation
00:36:59 Anssi: we have two groups, the Web Machine Learning Working Group (WG) and Community Group (CG)
00:37:29 ... the WG standardizes Web APIs for on-device inference, using CG incubations as its seeds
00:37:35 ... deliverables: WebNN API, Ethical Principles
00:37:41 -> WebML WG Charter https://www.w3.org/2025/03/web-machine-learning-charter.html
00:38:12 Anssi: we're looking to make the Ethical Principles a joint deliverable with the proposed Web & AI Interest Group
00:38:18 ... this informative document is referenced from the WebNN API spec
00:38:24 ... the CG is a group where new ideas are discussed, explored and incubated before formal standardization
00:38:30 ... past CG spec incubations include e.g. WebNN, Model Loader
00:38:36 ... since last year, we've expanded the scope of the CG to built-in AI APIs and agentic web capabilities
00:38:41 -> WebML CG Charter https://webmachinelearning.github.io/charter/
00:38:48 -> WebML CG Incubations https://webmachinelearning.github.io/incubations/
00:38:55 Anssi: current CG deliverables:
00:39:12 ... Prompt API
00:39:12 ... Writing Assistance APIs
00:39:12 ... Translator and Language Detector APIs
00:39:12 ... Proofreader API
00:39:12 ... WebMCP API
00:39:17 ... the CG technical scope is higher-level task-based APIs and the agentic web feature WebMCP
00:39:22 ... while the WG technical scope is the lower-level WebNN API, a graph builder abstraction
00:39:28 Anssi: the WG and CG work closely together and coordinate with other W3C groups, for example:
00:39:45 ... - WebGPU WG/CG for WebNN-WebGPU interop
00:39:52 ... - Wasm CG
00:40:06 ... - WebRTC for media processing-related integrations
00:40:13 ... - AI Agent Protocol Community Group for agentic protocols
00:40:20 ... - and with horizontals: privacy, security, a11y, plus emerging sustainability and ethics
00:41:19 Dom: we operate under the W3C CoC, the W3C antitrust guidance, and the W3C Patent Policy
00:41:58 q?
00:42:14 Topic: Spec orientation
00:42:26 Anssi: we have scheduled time before our first break to do a triage pass through open issues
00:42:31 ... the plan is to collaboratively look at our backlog of issues and PRs to:
00:42:52 -> https://github.com/webmachinelearning/webnn/issues
00:43:12 Anssi: - focus on breaking changes
00:43:26 ... - check priorities
00:43:33 ... - set next steps for the issues
00:43:46 ... let's use IRC to queue proposals, examples:
00:43:53 q+ to propose next steps for #573
00:43:53 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
00:44:23 q+ to discuss priority of #883
00:44:23 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
00:44:36 q+ to bump the priority of #861
00:44:37 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution] [Agenda+]
00:44:40 kzms2 has joined #webmachinelearning
00:44:44 Anssi: this is your last-minute opportunity to influence today's agenda
00:44:57 ... we'll first record triage results on IRC during the first ~15 mins, then review them as a group, and continue to discuss and refine on the hallway track with coffee/tea
00:45:02 guidou has joined #webmachinelearning
00:45:25 q+ to discuss the priority of #226 versus WebGPU interop work
00:45:26 https://github.com/webmachinelearning/webnn/issues/226 -> Issue 226 Integration with real-time video processing (by dontcallmedom) [use case]
00:46:07 Anna has joined #webmachinelearning
00:50:25 +q to discuss whether we have actionable next-steps on #6
00:50:25 https://github.com/webmachinelearning/webnn/issues/6 -> Issue 6 Custom operations (by dsmilkov) [v2] [device selection]
00:52:00 +q to discuss (likely in the context of the OT discussion) next steps on #763
00:52:01 https://github.com/webmachinelearning/webnn/issues/763 -> Issue 763 Request standards positions from Mozilla and WebKit (by reillyeon) [process]
00:53:05 +q to discuss #807 ahead of prototyping work that I believe will be starting soon
00:53:06 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
00:56:04 Ugur has joined #webmachinelearning
00:58:26 ningxin has joined #webmachinelearning
00:58:26 big-screen has joined #webmachinelearning
00:59:34 q?
00:59:36 q+ to discuss defining a process for the WG to accept or modify operators.
01:00:52 q?
01:01:00 ack anssik
01:01:00 anssik, you wanted to propose next steps for #573 and to discuss priority of #883 and to bump the priority of #861
01:01:01 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution] [Agenda+]
01:01:01 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
01:01:01 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
01:01:21 ack reillyg
01:01:21 reillyg, you wanted to discuss the priority of #226 versus WebGPU interop work, to discuss whether we have actionable next-steps on #6, to discuss (likely in the context of the OT discussion) next steps on #763, to discuss #807 ahead of prototyping work that I believe will be starting soon, and to discuss defining a process for the WG to accept or modify operators.
01:01:22 https://github.com/webmachinelearning/webnn/issues/226 -> Issue 226 Integration with real-time video processing (by dontcallmedom) [use case]
01:01:22 https://github.com/webmachinelearning/webnn/issues/6 -> Issue 6 Custom operations (by dsmilkov) [v2] [device selection]
01:01:48 hagio_nhk has joined #webmachinelearning
01:02:12 reillyg: identified a couple of issues worth discussing, including introducing future work
01:02:20 ... issue #6 on custom operator support
01:02:52 ... we're getting close to an origin trial in Chromium, so we should request standards positions from WebKit and Mozilla
01:03:39 q?
01:04:01 ... without discussing operator support in deep detail in this meeting, it might be useful to discuss a process for adopting new operators or modifying existing operators
01:05:26 This is what we have for the op change process: https://github.com/webmachinelearning/webnn/blob/main/CONTRIBUTING.md#proposing-and-adding-a-new-operation
01:06:57 Slideset: https://lists.w3.org/Archives/Public/www-archive/2025Nov/att-0000/WebNN_SLM_Optimization_-_TPAC.pdf
01:07:22 Topic: New features
01:07:32 Subtopic: WebNN Small Language Model (SLM) Performance Optimization Case Study
01:07:42 thelounge has joined #webmachinelearning
01:07:45 Anssi: I've asked Ningxin to present a WebNN Small Language Model (SLM) Performance Optimization Case Study conducted by our engineering team
01:07:46 alispivak has joined #webmachinelearning
01:07:49 ... thank you Yuheng, Wei, Wanming, Jonathan, Ningxin for producing this case study to inform the WebNN new features discussion, with considerations for topics such as:
01:07:52 ... - WebNN SLM support and challenges, operator fusion, on-device KV cache, tensor binding, dynamic shapes
01:07:56 ... after this case study, we will proceed with discussion
01:08:15 chiace has joined #webmachinelearning
01:08:20 ningxin: looking at how to optimize performance for a small language model, based on the Qwen small model
01:08:35 ... case study conducted by a team at Intel
01:08:56 [slide 3]
01:09:23 Ningxin: we reused the model from the native ORT-GenAI project
01:10:30 ... in this experiment, we focused on the WebGPU EP
01:11:35 ... and ran the same model in the WebNN-based stack using the WebGPU EP as well
01:11:41 [slide 4]
01:12:32 [slide 5]
01:13:54 chikamune has joined #webmachinelearning
01:14:13 [slide 6]
01:14:48 Ningxin: looking at the contribution to inference time of key operators, starting with the native stack
01:15:20 [slide 7]
01:15:43 Ningxin: the ONNX macro ops get decomposed into WebNN operators
01:16:11 ... GroupQueryAttention requires 24 WebNN operators
01:16:23 [slide 8]
01:17:27 [slide 9]
01:17:45 [slide 10]
01:18:10 ErikAnderson has joined #webmachinelearning
01:19:00 [slide 11]
01:19:20 [slide 12]
01:20:45 [slide 13]
01:22:03 [slide 14]
01:22:46 [slide 15]
01:23:36 [slide 16]
01:25:07 [slide 17]
01:26:00 [slide 18]
01:26:50 [slide 19]
01:27:07 [slide 20]
01:30:01 [slide 21]
01:30:39 [slide 22]
01:32:39 Anssi: thank you Ningxin & team for this insightful case study
01:33:06 ... Access to optimized macro ops is key for SLM performance
01:33:06 ... MatMulNBits, GQA etc.
01:33:06 ... Support in spec or fusion in implementation?
01:33:06 ... Support dynamic input shapes?
01:33:06 ... Allow same tensor for input and output?
01:33:06 q+
01:33:08 ... Decouple tensor binding and graph dispatch?
01:33:31 markafoltz has joined #webmachinelearning
01:33:31 q-
01:33:45 q+ to discuss next steps based on SLM presentation.
01:58:35 RobKochman has joined #webmachinelearning
02:00:01 markafoltz has joined #webmachinelearning
02:01:08 kbx has joined #webmachinelearning
02:01:19 Present+ Kenji_Baheux
02:04:18 dezell has joined #webmachinelearning
02:06:57 q?
02:07:36 reillyg: the biggest question on all these API change proposals is doing the research on what's possible with the current WebNN implementation
02:08:06 ack reillyg
02:08:06 reillyg, you wanted to discuss next steps based on SLM presentation.
02:08:14 ... we investigated dynamic shapes and tensor binding (backends have support for this)
02:08:16 mtavenrath has joined #webmachinelearning
02:08:30 ... this would also be useful for real-time applications
02:08:51 ... same tensor in input/output: the main question is whether the graph supports it
02:09:10 ... maybe a binding step could be used for validation
02:09:44 Dwayne: impressive speed-up identified in the case study
02:09:49 Rafael: +1 on the next steps identified
02:10:38 Ningxin: we could look at another backend, like LiteRT
02:10:55 ... and compare it to using ONNX
02:11:10 ... both would be based on the same PyTorch model
02:12:15 ... MatMulNBits allows setting an accuracy level which the underlying implementation can use to accelerate the inference
02:12:26 ... not clear how to encode this when doing fusion
02:12:44 q+
02:12:47 ack anssik
02:12:51 q?
02:13:09 Subtopic: Core operator set
02:13:16 Anssi: #573
02:13:17 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
02:13:49 chiace has joined #webmachinelearning
02:13:53 Anssi: in the case study discussion we learned many of the SLM building blocks are key for performance:
02:14:10 handellm has joined #webmachinelearning
02:14:43 ... - MatMulNBits
02:14:47 ... - GroupQueryAttention (GQA)
02:14:51 ... - SkipSimplifiedLayerNorm / SimplifiedLayerNorm
02:14:56 ... - RotaryEmbedding
02:15:00 ... the question to the group is: should we support them in the spec or with fusion in the implementation?
02:15:04 -> WebML WG Teleconference – 9 October 2025 https://www.w3.org/2025/10/09-webmachinelearning-minutes.html#e3a7
02:15:07 Anssi: the NVIDIA team reported they're collecting all the ops that'd benefit from being in the set, one class is various attentions, also gathers, MoE, TopK, and they're looking for other ops that'd benefit from not being composed
02:15:14 Markus: operator proliferation - some operators can't be decomposed into existing operators
02:15:42 ... part of what we need to consider is whether operators we expose in the spec are going to remain useful for long enough
02:16:18 There's a 3rd possibility too (not just built-in operator vs recognized fusion) - support subgraph composition, so that these complex operators (they are really entire graphs) are not permanently baked into the API, but can still be recognized easily and passed through to the backends.
02:16:32 q?
02:16:44 ... can we enhance WebNN to be multi-layered, so that decomposition can happen at the discretion of the browser or the backend, carrying optimizations down to the hardware as necessary
02:17:02 ... which operators are complex enough that they can't be decomposed into existing operators?
02:17:08 ... how to expose compound operators in WebNN?
02:17:14 ningxin has joined #webmachinelearning
02:17:42 Dwayne: TopK feels like a primitive operator that should be added to WebNN
02:17:56 ... MatMulNBits on the other hand is a subgraph
02:18:00 kbx has left #webmachinelearning
02:18:18 Anssi: Dwayne has done some extensive research on this topic and re-raised the concept of aggregate operators via subgraphs
02:18:23 -> https://github.com/webmachinelearning/webnn/issues/573#issuecomment-3386373261
02:18:24 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset] [Agenda+]
02:18:25 ... another possibility is to support the concept of subgraphs that can be referred to as an operator later, as Ningxin described at TPAC last year
02:18:26 Pavan3 has joined #webmachinelearning
02:18:32 ... we don't have a specific issue for that atm
02:18:56 ... I'll open one
02:20:18 Anssi: Markus, could you capture your feedback on GitHub? maybe on a meta issue around optimization
02:20:38 Markus: we fully agree with what has been described so far
02:20:43 q?
02:21:20 Markus: if we had sub-graphs, it would also optimize the number of nodes and speed up computation, as Ningxin was describing
02:21:28 q?
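To make the decomposition cost concrete, here is a rough sketch (not from the meeting, and simplified: real MatMulNBits uses 4-bit block-wise quantization and therefore many more nodes) of composing a MatMulNBits-style quantized matmul from primitives that are in the current WebNN spec, assuming the MLGraphBuilder dequantizeLinear and matmul ops. A fused operator, or a named subgraph as discussed above, would let the backend recognize this whole pattern at once instead of individual nodes:

```js
// Rough sketch (inside an async function): compose a MatMulNBits-like
// weight-quantized matmul from WebNN primitives.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// Activations: [batch, K] float32 input.
const x = builder.input('x', { dataType: 'float32', shape: [1, 768] });

// Quantized weights plus scale/zero-point constants (placeholder data).
const qWeights = builder.constant(
  { dataType: 'uint8', shape: [768, 768] }, new Uint8Array(768 * 768));
const scale = builder.constant(
  { dataType: 'float32', shape: [1, 1] }, new Float32Array([0.02]));
const zeroPoint = builder.constant(
  { dataType: 'uint8', shape: [1, 1] }, new Uint8Array([128]));

// Decomposed form: dequantize the weights, then a plain matmul.
const w = builder.dequantizeLinear(qWeights, scale, zeroPoint);
const y = builder.matmul(x, w);
const graph = await builder.build({ y });
```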
02:23:21 RESOLVED: open a new issue for aggregate operators via subgraphs
02:23:41 Subtopic: Support flexible input sizes
02:23:46 Anssi: issue #883
02:23:47 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
02:24:10 ... this issue was initiated because many models, e.g. vision models, require flexible input sizes that are determined at inference time
02:24:17 ... a few scenarios mentioned:
02:24:22 ... - input images with different resolutions for vision models
02:24:35 ... a concrete example would be MODNet, a model for real-time portrait matting
02:24:40 -> https://huggingface.co/Xenova/modnet
02:24:59 ... - transformers with arbitrary input lengths
02:25:05 ... when using a KV cache, need to increase the KV cache length by 1 for each inference
02:25:15 ... - speech recognition
02:25:29 ... for example the Whisper encoder with arbitrary input lengths
02:25:35 ... the decoder increases the KV cache length by 1 per inference
02:25:42 ... - LLMs
02:25:51 ... for example Qwen2.5-0.5B-Instruct also increases the KV cache length at inference time
02:26:08 ... "Lack of the support for flexible input sizes increases the complexity of using WebNN for those models."
02:26:31 ... - complexity reduction: currently you need to modify the model and fix the input size before compiling
02:26:52 ... - at inference time: currently you need to resize or pad the image input
02:26:58 ... - native support: flexible input sizes are already supported by native frameworks
02:27:08 Anssi: Dwayne responded with considerations:
02:27:13 -> https://github.com/webmachinelearning/webnn/issues/883#issuecomment-3232188158
02:27:14 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
02:28:42 Dwayne: [voicing his thoughts written up in https://github.com/webmachinelearning/webnn/issues/883#issuecomment-3232188158 ]
02:29:08 q?
02:29:10 ... a step between binding and building the graph would help
02:29:58 Anssi: per our latest discussion, we're now exploring the following as a group:
02:29:59 anssik: reillyg raised questions around how this would get implemented on existing backends and the performance implications
02:30:03 ... - how this will be implemented by backends
02:30:08 ... - what is the role of WebNN in this decision, the framework could build multiple graphs
02:30:08 ... - understand performance bottlenecks (of multiple graphs)
02:30:14 reillyg: all the backends we target support some form of dynamic shapes
02:30:52 ... two API parts: where are the dynamic shapes identified in the graph? (either arbitrary or among a well-defined set)
02:31:12 ... what API to switch to dynamic?
02:31:20 q?
02:31:42 ... then figure out how to translate that to the backends while considering the performance impact
02:31:58 Anssi: MarkusT shared there are three different types of dynamic shapes usually used in neural network models:
02:32:04 ... - Completely Unknown Sizes
02:32:07 ... - Symbolic Sizes
02:32:11 ... - Tensor-Derived Sizes
02:32:17 -> TensorRT: Working with Dynamic Shapes https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/work-dynamic-shapes.html
02:32:33 Markus: [voicing https://github.com/webmachinelearning/webnn/issues/883#issuecomment-3386553391 ]
02:32:33 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific] [Agenda+]
02:32:50 q?
02:33:58 ... Symbolic Sizes feels like the most interesting option for WebNN
02:34:39 ... it might be useful to have a way to define the ranges of sizes to optimize backend preparation
02:34:56 ... does anyone feel symbolic sizes wouldn't work for them?
02:35:50 Ningxin: if we define symbolic sizes, does that include calculations on these symbols?
02:36:14 Markus: that would need more discussion; the more maths we allow, the more complexity
02:36:16 q+
02:36:38 Ningxin: so we should start with the simplest approach and iterate as we identify the needs
02:36:58 q+
02:37:25 q?
02:37:29 Markus: dispatch would be the phase where the dimensions get updated
02:37:39 ack reillyg
02:37:43 Ningxin: this would match Chromium's implementation
02:38:27 reillyg: I wonder if in most cases the need to express complex mathematical functions for shapes goes away
02:39:08 ... with the risk that for some intermediate nodes it would not be possible to express the computed bounds of an operator
02:39:33 ... there are two separate pieces: API shape and validation
02:39:43 ... the WebNN API only has developers provide shapes for inputs
02:40:11 ... the API then provides back to developers the shapes of intermediate nodes, computed based on the operators
02:40:29 ... with static shapes, this can be computed statically at graph building time
02:40:54 ... if we move to dynamic, we either have to say "we don't know" or express it with symbols
02:41:20 Markus: just saying "it's dynamic" feels reasonable
02:42:07 reillyg: we use the computed shape of a graph as part of the validation internally
02:42:31 ack DwayneR
02:42:33 ... I would have to check; with dynamic shapes, this might have to be done by the developers themselves
02:42:57 DwayneR: we should distinguish flexible model size and dynamic shape
02:43:28 q?
02:43:35 Markus: that matches "tensor-derived sizes" in my taxonomy
02:44:25 Ningxin: some prototyping & study would usefully inform this feature
02:44:49 reillyg: looking at the existing graph validation code and how much more complicated it becomes with dynamic shapes
02:45:11 RESOLVED: study more backends and do prototyping before more formally specifying a solution
02:45:29 kbx has joined #webmachinelearning
02:45:53 Ningxin: once we have that prototype in the browser, we should look at whether it makes it easier to deploy existing language models without modification
02:46:29 q?
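As a strawman for the "Symbolic Sizes" direction, the following is purely illustrative: none of this syntax exists in the WebNN spec or has been agreed by the group. It sketches how an input could declare a named, bounded dynamic dimension so a backend can prepare for a range of sizes rather than one fixed shape:

```js
// Hypothetical sketch only: the current WebNN API requires fully static
// input shapes; this illustrates the "Symbolic Sizes" option discussed above.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

// Strawman: a named symbolic dimension with a declared maximum, so the
// backend can pre-allocate and optimize for a bounded range of sizes.
const inputIds = builder.input('input_ids', {
  dataType: 'int32',
  shape: [1, { name: 'seq_len', max: 4096 }]  // hypothetical syntax
});
// ... remaining graph construction elided; the concrete size for each
// symbol (e.g. seq_len growing by one token per decode step with a KV
// cache) would then be supplied at dispatch time.
```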
02:46:42 Subtopic: API to Separate Graph Building from Weight Loading to Reduce Peak Memory Usage
02:46:54 Anssi: proposal in issue #901 by Markus
02:46:54 https://github.com/webmachinelearning/webnn/issues/901 -> Issue 901 Proposal: API to Separate Graph Building from Weight Loading to Reduce Peak Memory Usage (by mtavenrath)
02:47:02 ... a proposal to reduce peak memory usage
02:47:35 ... using Stable Diffusion as a reference, CPU memory used during graph building is more than 3x the actual model size using a dGPU, and also high on an iGPU
02:48:00 ... the proposal is to introduce an API that splits graph creation from weight passing, i.e. loading the constant data
02:48:11 ... to enable streaming weights directly into the graph during initialization
02:48:24 q+
02:48:29 ... do all current WebNN backends support weightless graph creation, where all tensor shapes and data types are known, but the actual weight data is not provided until a later step?
02:49:22 ... using a dGPU, limit the peak CPU memory overhead to 1x-2x the size of the largest single tensor
02:49:28 ... using an iGPU, no temporary CPU-side storage would be needed for the "upload" as it's shared memory, reducing the total peak CPU memory consumption down to roughly Model Size + Max Single Tensor Size
02:49:39 q?
02:50:28 Markus: my experience is that even desktops/notebooks still have limited memory, typically 16GB
02:50:36 ... even more so on mobile devices
02:51:52 q?
02:51:53 ack reillyg
02:51:56 ... right now, models in WebNN can't make use of all available memory due to the memory used for loading them
02:52:34 reillyg: model caching is related to this: how do we get the model weights the developer provides to the underlying framework as efficiently as possible?
02:52:50 ... the frameworks want to see the weights to repack them to match the memory layout
02:53:17 ... we are constrained by having to have the weights available during graph building
02:53:35 ... but clearly graph building isn't memory efficient for now
02:53:41 Could constant tensors solve this issue? https://www.w3.org/TR/webnn/#api-mlcontext-createconstanttensor
02:53:49 ... more an implementation issue, I believe
02:54:12 ... some of the changes we need for caching would help address this performance issue
02:54:37 q?
02:54:50 ... maybe the API-level piece would be for situations with a very large constant that we wouldn't want to load into memory at all
02:55:26 q+
02:55:39 ... which could be improved by streaming the constant
02:56:03 Markus: do we have the list of operators that would require the constant to be known at build time?
02:56:29 q+
02:56:55 reillyg: we can get that list of operators
02:57:19 ... it's a constraint we get from backends that we wish we didn't have
02:57:48 Markus: maybe this is something we can reach out to backend developers to change
02:57:55 sushraja has joined #webmachinelearning
02:58:05 q?
02:58:16 ack ningxin
02:58:35 ErikAnderson has joined #webmachinelearning
02:58:53 Ugur has joined #webmachinelearning
02:59:03 takaaki has joined #webmachinelearning
02:59:07 ningxin: would the constant tensor help with this? https://www.w3.org/TR/webnn/#api-mlcontext-createconstanttensor
03:00:08 ... the issue is that some underlying AI runtimes don't support this; with DirectML we can do this, but with ONNX Runtime, AFAIK there is no way, we have to put everything on CPU at session creation time
03:00:56 ... in terms of frameworks, ONNX Runtime Web needs all the weights to be on the CPU
03:01:20 ... we can have the API shape, but it needs adjustment both in underlying runtimes and in frameworks
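As context for this exchange, a rough sketch of the incremental-upload pattern, assuming the createConstantTensor() method linked above and an MLGraphBuilder.constant() overload that accepts the resulting tensor; the fetch URLs, descriptors and loadWeight() helper are placeholders, not part of any proposal:

```js
// Rough sketch: upload each weight as a constant MLTensor before graph
// building, so large CPU-side buffers can be released one at a time
// instead of keeping the whole model in memory at once.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

async function loadWeight(url, descriptor) {
  const data = await (await fetch(url)).arrayBuffer();
  // The browser can hand the data to the backend here; the ArrayBuffer
  // becomes collectable as soon as this function returns.
  const tensor = await context.createConstantTensor(descriptor, new Uint8Array(data));
  return builder.constant(tensor);
}

// Weights are fetched and uploaded sequentially rather than all at once.
const w0 = await loadWeight('weights/layer0.bin',
                            { dataType: 'float32', shape: [768, 768] });
const w1 = await loadWeight('weights/layer1.bin',
                            { dataType: 'float32', shape: [768, 768] });
// ... build the graph using w0, w1, ... then await builder.build(outputs).
```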
03:01:25 q?
03:01:28 ack RafaelCintron
03:01:30 Markus: right, this is an ecosystem effort in which WebNN is at the center
03:01:44 q+
03:02:01 Rafael: in the browser, this is a multi-process architecture; the model MUST run in a different process for security
03:02:59 ... being able to share memory across processes would be good for performance, but challenging for security
03:03:25 q?
03:03:29 ... maybe not insurmountable, but there was discomfort
03:04:15 Markus: zero-copy would be the dream, but reducing from 10 copies to far fewer is critical to make this usable in real-world conditions
03:04:19 ack reillyg
03:05:03 reillyg: looking at the implementation sketch Markus put together, that almost matches how we're doing it: when a dev gives us a constant, we create a handle to it, we start uploading that constant to the backend, and let the developer continue building the graph
03:05:24 ... the problem right now is less that the API doesn't allow you to keep only roughly the largest tensor's worth of memory - it does
03:05:36 ... but all the implementations on the browser side and on the JS side aren't handling this very well
03:05:47 ... ONNX Runtime Web requires everything to be in JS memory
03:06:04 ... similarly, today when we create a constant, we keep it in memory in the browser - but we don't have to
03:06:28 ... what Markus describes is how we intend the current API to work, but that's not how implementations exist today
03:07:50 Anssi: is there any spec change that we should make out of this conversation? any normative change for WebNN to enable this optimization?
03:08:18 reillyg: the only two things we might change: if we have an issue with very large constants, we may want to add a streaming constructor for constants
03:09:13 ... and a feedback mechanism to let developers know when they can start loading the next one, with a backpressure mechanism to manage peak memory
03:10:55 Dom: any cooperation we should facilitate with backends/frameworks?
03:11:07 reillyg: I assume the ONNX Runtime Web team is aware of the memory issue
03:11:54 Ningxin: the main issue I think is on the backend side, and whether this would work with the various hardware EPs
03:12:16 ... on the JS framework side, we can probably have a solution
03:12:23 q+
03:13:45 reillyg: adding a streaming constructor would also open the door to the backpressure feature I was describing
03:14:52 RESOLVED: explore streaming constructor for constants
03:15:46 q?
03:15:54 ack RafaelCintron
03:16:10 sushraja has joined #webmachinelearning
03:16:17 q?
03:16:17 RafaelCintron: +1 on the importance of getting this fixed, to be clear
03:16:39 ... I know the ONNX Runtime is trying to fix very similar issues
03:17:06 ... this will also be needed for WebGPU interop
03:17:08 q?
03:17:40 Subtopic: Device selection, state of the union
03:17:46 -> [device selection] https://github.com/webmachinelearning/webnn/labels/device%20selection
03:17:54 Anssi: a bag of issues
03:18:21 ... we have explored API surface-level enhancements for both "before graph compilation", tied to the MLContext, and "after graph compilation", tied to the MLGraph object
03:18:32 ... recently we reached consensus to add a simple accelerator selection mechanism
03:18:38 ... issue #815 was addressed by PR #895
03:18:38 https://github.com/webmachinelearning/webnn/issues/895 -> #895
03:18:39 https://github.com/webmachinelearning/webnn/issues/815 -> #815
03:18:45 ... the minimal design the group landed on is an `MLContext.accelerated` boolean:
03:18:52 ```
03:18:52 interface MLContext {
03:18:52   undefined destroy();
03:18:52 +  readonly attribute boolean accelerated;
03:18:52   readonly attribute Promise lost;
03:18:53 };
03:18:53 ```
03:19:01 Anssi: a corresponding explainer update was #884
03:19:01 https://github.com/webmachinelearning/webnn/pull/884 -> MERGED Pull Request 884 Update explainer with new proposal for simple accelerator mapping (by zolkis) [device selection]
03:19:18 ... we spun off issues for further discussion:
03:19:26 hagio_nhk has joined #webmachinelearning
03:19:26 ... #897 to define the "underlying execution device" concept
03:19:30 https://github.com/webmachinelearning/webnn/issues/897 -> Issue 897 Define "underlying execution device" concept (by anssiko) [device selection]
03:19:34 ... #900 for a CPU fallback hint
03:19:35 https://github.com/webmachinelearning/webnn/issues/900 -> Issue 900 CPU fallback hint (by anssiko) [device selection]
03:19:43 ... #902 for use case-driven scenarios
03:19:44 https://github.com/webmachinelearning/webnn/issues/902 -> Issue 902 Device selection criteria for usecase-driven scenarios (by fdwr) [device selection]
03:20:12 Anssi: we also have spec issue #836, PR #854 and a prototype implementation for the `MLGraph.devices` API
03:20:13 https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo) [device selection]
03:20:13 https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
03:20:35 ... the latest on this is that MarkusH and MikeW are exploring use cases with this design
03:20:44 ... privacy is the key concern with this proposed API enhancement
03:21:05 Anssi: issue #759
03:21:06 https://github.com/webmachinelearning/webnn/issues/759 -> Issue 759 MLOpSupportLimits should be opt-in with base functionality (by mwyrzykowski) [device selection]
03:21:25 ... this proposal from MikeW for providing an API for listing operator support limits is informed by a similar API in WebGPU:
03:21:30 -> WebGPU limits https://www.w3.org/TR/webgpu/#limits
03:21:55 Anssi: the proposed MLOpSupportLimits API returns all available devices with their operator support limits
03:22:00 ... using this information, the web app can choose one of them to initialize a context with
03:22:41 Subtopic: Device selection criteria for usecase-driven scenarios
03:22:45 Anssi: issue #902
03:22:59 Anssi: any device selection feature we design should be motivated by a real-world app scenario / use case
03:23:22 Dwayne: no concrete proposals here
03:24:03 ... the question is how to find the right balance between leaving more freedom to the UA and allowing situations where more device control is required
03:24:44 Markus: the problem is made complex because there are not only CPU/GPU/NPU, but several GPUs and NPUs, sometimes from different vendors
03:25:07 ... WebNN is a really good target for vendors seeking to deploy on the Web interoperably
03:25:31 ... one situation that is challenging is when they need to run multiple models at the same time
03:26:09 ... when professional users have multiple powerful GPUs, we wouldn't want the privacy protections to make it impossible to fully take advantage of their hardware
03:27:05 q?
03:27:06 ... I wondered if a permission prompt similar to camera/mic could be acceptable, which would then grant access to a full query of devices while avoiding silent fingerprinting
03:28:21 Rafael: WebGL and WebGPU have a way to pick a specific high-performance adapter
03:28:32 q+
03:28:49 ... with a restriction on iframes
03:29:22 ... wrt prompting, neither WebGL nor WebGPU has a prompt - how do you handle the situation where the user says no because of prompt fatigue?
03:29:46 vmpstr has joined #webmachinelearning
03:30:05 ... fingerprinting is a real issue - WebGL has been massively used for fingerprinting, based on telemetry
03:30:43 q?
03:30:46 ack reillyg
03:30:55 ... I'm OK with allowing access to high-performance GPUs, and maybe consider a permission prompt for super advanced use cases
03:31:06 reillyg: +1 to Rafael
03:31:25 ... a solution where by default you get the GPU the browser identifies as best
03:31:49 ... I don't think the current WebGPU implementation in Chromium allows using multiple GPUs
03:31:51 q+
03:32:12 ... maybe WebNN should allow querying for NPUs
03:32:45 RafaelCintron: as far as Chromium is concerned, the high-performance request is only supported on Mac
03:32:55 ... not supported on Windows - maybe coming in the future
03:32:57 q?
03:33:03 ack RafaelCintron
03:34:06 hagio_nhk has left #webmachinelearning
03:34:28 Itadakimasu 👋.
04:34:46 Mike_Wyrzykowski has joined #webmachinelearning
04:49:21 sushraja has joined #webmachinelearning
04:49:27 Present+
04:49:35 Ehsan has joined #webmachinelearning
04:49:58 mgifford2 has joined #webmachinelearning
04:50:06 IrisJ has joined #webmachinelearning
04:50:19 kbx7 has joined #webmachinelearning
04:50:19 handellm has joined #webmachinelearning
04:50:26 Tarek3 has joined #webmachinelearning
04:50:49 RobKochman has joined #webmachinelearning
04:51:52 Erik: how much do we need to explore the permission prompt, vs. an optional API gated behind enterprise policy?
04:52:13 Markus: my main point is to make sure we consider scenarios with more complex device selection than just per type
04:52:45 elena has joined #webmachinelearning
04:52:52 ... both because multiple devices of a given type might exist, or because you need to keep a particular job on a device where another job is happening (e.g. decoding)
04:53:09 Erik: how much does this need to be driven by the app vs. via hints?
04:53:21 Markus: I'd be fine with hints, but I'm skeptical they'll suffice
04:54:00 ErikAnderson has joined #webmachinelearning
04:54:23 ... another case might be benchmarking - including done by the app for device selection
04:55:05 Dom: on the web platform, we always need to balance use cases vs. privacy, the 80/20 rule; hints vs. direct control needs this consideration
04:55:10 Mike_Wyrzykowski has joined #webmachinelearning
04:55:27 ... we might be adding a huge new fingerprinting surface
04:55:46 q+
04:55:58 Markus: we'd put this behind a permission prompt
04:56:22 Dom: prompt fatigue and understandability are an issue with adding new permission prompts
04:56:23 q?
04:56:25 ack reillyg
04:56:47 q?
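For reference, an illustrative snippet (not discussed verbatim in the meeting) of the existing WebGPU adapter-selection mechanism Rafael referred to: it is a hint rather than an enumeration, which is part of why it avoids exposing the device list to the page:

```js
// Existing WebGPU precedent: the page asks for a high-performance adapter
// but never enumerates the machine's GPUs, limiting the fingerprinting
// surface while still steering the workload toward the faster device.
const adapter = await navigator.gpu.requestAdapter({
  powerPreference: 'high-performance'
});
const device = await adapter?.requestDevice();
```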
04:57:24 thomasN: +1 on the trade-off with privacy; one successful strategy has been to look at what data has already been exposed
04:57:29 reillyg: the question on benchmarking is a good one
04:57:40 ... we expect developers already do this to decide what they can run
04:58:07 ... if this is something we can provide them, instead of getting them to run benchmark workloads that are wasteful
04:58:30 ... the question is how to express capabilities as numbers, which is difficult in the same way hints are
04:58:45 q+
04:58:55 ... we've seen this as relatively successful in the WebGPU context and it might be useful here for NPUs
04:58:59 q?
04:59:01 ... but it's unclear which numbers to provide
04:59:01 ack Mike_Wyrzykowski
04:59:03 ErikAnderson has joined #webmachinelearning
04:59:35 MikeW: do we need to expose OpsSupportLimits by processing unit?
05:00:02 ... (as I commented on the issue https://github.com/webmachinelearning/webnn/issues/902)
05:00:14 q?
05:00:28 ningxin has joined #webmachinelearning
05:00:37 reillyg: this would be very helpful, but it's not information made available by platforms - e.g. CoreML doesn't provide stats on the capabilities of the NPU
05:00:47 ... similar situation on other platforms
05:01:14 ... this would be a great enhancement to the API
05:01:33 Anssi: how does this relate to #759?
05:01:34 https://github.com/webmachinelearning/webnn/issues/759 -> Issue 759 MLOpSupportLimits should be opt-in with base functionality (by mwyrzykowski) [device selection]
05:01:56 MikeW: they're related but different; as Reilly says, the challenge is making that information queryable
05:02:44 markafoltz has joined #webmachinelearning
05:02:58 reillyg: for #759, we recently updated the WPT to differentiate required and optional tests, to represent that idea of things developers can or cannot rely on
05:03:05 ... not sure this has been reflected in the spec
05:03:20 q?
05:03:48 +1 to hints for now.
05:04:03 reillyg: beyond choosing devices, there is also a scheduling aspect to this
05:04:33 ... e.g. if there are real-time vs. non-real-time workloads running in parallel, helping the UA schedule via hints would be useful
05:04:58 q?
05:05:00 Ugur has joined #webmachinelearning
05:06:06 Dom: in addition to a permission prompt, there's also a discussion about integrating permission management with page embedded permission control (PEPC)
05:06:28 ... it doesn't change the discussion, but it changes how this is embedded in the UX so we don't have prompts coming out of nowhere
05:06:58 ... for a more advanced query API we need to look at it in the context of this new proposal for permission management
05:07:03 q?
05:07:41 Markus: if we have hints, how do we validate that they work?
05:08:17 reillyg: a developer can't measure how their app runs if we don't provide the metrics
05:08:59 ... Phillis has a proposal to expose which device the model is running on
05:09:02 Subtopic: CPU fallback hint
05:09:13 Anssi: issue #900
05:09:13 https://github.com/webmachinelearning/webnn/issues/900 -> Issue 900 CPU fallback hint (by anssiko) [device selection]
05:09:36 ... the group has explored a "CPU fallback" hint, a flag to expose to web content whether a CPU fallback mechanism is active
05:09:39 ... spun off from the "accelerated" hint, a feature discussion that landed
05:10:29 MarkusH: we have use cases where knowing if the workload will be accelerated is critical to deciding whether to run it or not
05:10:56 ... we would want to abort if we detect CPU fallback, before or after compilation
05:11:05 ... before would help saving download cost
05:11:15 q?
05:11:18 q+
05:11:32 ack reillyg
05:11:44 reillyg: the previous discussion about OS support for GPU/NPU devices is helpful here
05:12:03 ... in general, the answer to "is CPU fallback active" before compilation is always "yes"
05:12:18 ... it's always supported
05:12:46 q+
05:12:52 ack ningxin
05:12:56 ... how do we help developers determine whether to use a faster vs. a better model based on GPU availability
05:12:57 q+
05:13:22 Ningxin: we should distinguish "CPU-only" vs "CPU fallback" - the latter is always available
05:13:43 ... what you want here is to avoid accelerated=false
05:14:23 q?
05:14:30 ... we can set context.accelerated=false if we detect the GPU/NPU won't work
05:15:05 reillyg: one question is "do you have a GPU/NPU?" - if not, this means we're in a CPU-only situation
05:16:03 ... if it's about fallbacks - do we want to provide an option to fail compilation if it will end up running on the CPU - but that only works after compilation, which you want to avoid
05:16:03 q+
05:17:22 reillyg: we should clarify that the issue is about detecting whether a GPU/NPU is available - for a pre-compilation situation
05:17:25 ack ErikAnderson
05:17:53 It's not just if a device has the GPU/NPU but if a user wants to have the LLM run on their device. It may be a matter of user preference, but also energy usage. Users may be happy running a GPU/NPU in some locations or times, and not others, based on things like local energy costs or reliability. Battery life as well.
05:18:05 ErikA: similar to the discussions in WebGL/WebGPU
05:18:35 MarkusH: we can always try it and check whether it runs well in real time
05:19:27 ErikA: in WebGL, you can create a context in a way that makes it fail if you hit performance challenges
05:19:36 q?
05:19:39 For context: https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext#failifmajorperformancecaveat
05:19:42 ack Tarek
05:19:46 reillyg: we should start simple - "can it run fast at all" - and look at more detailed evaluation in a later phase
05:20:38 Tarek: I had similar questions around concurrency: if the existing accelerated hardware is already in use, should that be exposed to the app?
05:21:15 reillyg: a given app might run separate models/graphics rendering in parallel - we should help the app negotiate to figure out which workloads to run where
05:21:28 Tarek: so the orchestration might happen on both sides?
05:21:29 kush has joined #webmachinelearning
05:22:00 reillyg: right - an app might have more workloads to run than are runnable in parallel on a given system
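Returning to the pre-download check MarkusH described above, a minimal sketch assuming the MLContext.accelerated attribute from PR #895 lands as shown earlier; the exact semantics are still under discussion, and runModel() plus the model URLs are placeholders:

```js
// Illustrative sketch: decide whether to fetch a large model before
// compilation, based on whether the context reports acceleration.
// Assumes the proposed MLContext.accelerated boolean (PR #895).
const context = await navigator.ml.createContext({
  powerPreference: 'high-performance'
});

if (context.accelerated) {
  // GPU/NPU-backed context: worth downloading the full-quality model.
  await runModel('models/segmenter-large.onnx', context);
} else {
  // CPU-only: skip the large download and fall back to a smaller model.
  await runModel('models/segmenter-small.onnx', context);
}
```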
05:22:47 https://github.com/webmachinelearning/webnn/issues/900 -> Issue 900 CPU fallback hint (by anssiko) [device selection]
05:23:10 MarkusH: another aspect is time-sensitivity: a video frame needs to be processed in real time, whereas the answer to a chat-bot query to an LLM is much less time-sensitive
05:23:21 big-screen has joined #webmachinelearning
05:25:21 MarkusH: a boolean flag on whether it is accelerated is probably a good enough starting point
05:25:27 phillis has joined #webmachinelearning
05:26:08 Subtopic: Get device selection information after graph compilation
05:26:09 mtavenrath4 has joined #webmachinelearning
05:26:13 Anssi: issue #836 and PR #854
05:26:13 https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo) [device selection]
05:26:13 https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
05:26:30 ... the group thinks we need the following two to advance:
05:26:35 ... 1) strong use cases
05:26:42 ... 2) check that the API design is privacy-preserving
05:26:55 Anssi: MarkusH from Google Meet shared his key use case, adaptation
05:27:10 `graph.devices` could help identify:
05:27:15 BenGreenstein has joined #webmachinelearning
05:27:23 a) what resources a misbehaving model is using, and
05:27:32 b) which models are candidates to stop that would help the situation
05:27:41 q+
05:28:00 Anssi: MikeW commented:
05:28:08 ... "Another way of achieving the same thing is the web app sorts its workloads in priority, terminating lower priority ones (1). Or some type of metric reporting that the model was stalled K ms waiting to run due to other work on the system and took S ms to complete (2)"
05:28:45 MikeW: the problem is that the information on which device has been selected isn't static
05:29:13 ... a workload that has run on a GPU may run on the NPU the next run, or fall back to the CPU
05:29:37 q?
05:29:41 ack Mike_Wyrzykowski
05:29:46 ... I can see the value in expressing that the graph can run on an accelerated unit, but reporting the last device on which it has run is not very reliable
05:30:09 q+
05:30:26 q+
05:30:30 reillyg: is there still value in reporting on which devices the workload might run? e.g. GPU or NPU; would that be good enough for applications?
05:30:55 MikeW: could we just return that it can run accelerated vs. a specific device value?
05:31:00 ack Mike_Wyrzykowski
05:31:28 ... the distinctions between specific hardware types are evolving, and it's not obvious they're needed by the app
05:31:28 ack ErikAnderson
05:31:33 MarkusH: I think that could work
05:32:09 Erik: an app author might want to know how much of the workload runs on which unit
05:32:15 q-
05:33:38 ... I'm not sure the proposal in the PR provides enough reliable context
05:33:41 dom: two aspects: things you want to operate and course-correct live, and things you want to monitor to know if you want to modify the system later; is a separation-of-concerns approach appropriate here?
05:34:06 MarkusH: when we detect that we're not operating in real-time compatible ways, we need to take action
05:34:23 q?
05:34:31 ... the proposal in the PR could help; an accelerated flag would probably suffice
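To illustrate the kind of adaptation loop MarkusH describes, a purely hypothetical sketch: neither graph.devices (PR #854) nor a graph-level accelerated status is in the spec, and stopBackgroundWorkloads() is a placeholder for whatever load-shedding the app does:

```js
// Hypothetical sketch: post-compilation adaptation based on a strawman
// graph-level "accelerated" status (the PR #854 design is still open).
const graph = await builder.build({ output });

if (graph.accelerated === false) {
  // e.g. stop a lower-priority background model or switch to a lighter
  // effect so the real-time pipeline keeps meeting its deadlines.
  stopBackgroundWorkloads();
}
```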
05:34:52 phillis: if it's hybrid, what do we report?
05:35:39 dom: would an enum be actionable?
05:35:50 MarkusH: in practice, it would depend on how much it runs on the CPU
05:36:14 Youngmin has joined #webmachinelearning
05:36:22 reillyg: there is a cost in using multiple units (even GPU + NPU) - so maybe "hybrid" is worth reporting in general
05:36:39 ... at some point, some of the performance detection can only be done by the app developers
05:37:47 RESOLVED: Phillis to refine the proposal to reflect an accelerated status, with discussions on hybrid still TBD
05:37:52 How much of this is inherent to hardware design? Will switching costs matter in 5 years? Probably. What influence might the W3C have on the future of what this technology makes available?
05:38:14 Subtopic: MLOpSupportLimits
05:38:21 Anssi: issue #759
05:38:21 https://github.com/webmachinelearning/webnn/issues/759 -> #759
05:38:42 MikeW: we should define limits that are supported across all devices
05:39:03 q?
05:40:02 reillyg: we have this in the tests; a goal for us implementation-wise is to make sure that the implementation we have can implement all operators, and make the operators that can't be implemented optional
05:40:16 ningxin has joined #webmachinelearning
05:40:39 Topic: Customer feedback & collaborations
05:42:05 Anssi: customer feedback - including from end-users, frameworks and independent software vendors - is extremely important throughout the process of developing new Web APIs, starting with use case identification, requirements gathering and hands-on feedback from early adopters, all the way to the maintenance phase when large-scale deployment happens
05:42:14 ... we have used a dedicated community-maintained repo, Awesome WebNN, to document various signals from customers and developers at large
05:42:20 -> https://github.com/webmachinelearning/awesome-webnn
05:43:07 Anssi: I recognize many customers are not comfortable speaking publicly about their future products' use of the WebNN API at this time, so I ask for sensitivity in this regard
05:43:23 ... that said, we have some brave early adopters who have worked with us in public
05:43:32 ... kudos to the Google Meet team and Markus in particular for sharing feedback, reviewing our proposals and also submitting new feature requests for consideration
05:47:22 Subtopic: RTC-style workloads with response time requirements
05:47:29 Anssi: issue #898
05:47:29 https://github.com/webmachinelearning/webnn/issues/898 -> Issue 898 Support for workloads with response time requirements (realtime) (by handellm) [Agenda+]
05:47:55 ... Markus provided customer feedback from the Google Meet product, where RTC-style workloads have strict response time requirements
05:47:58 ... the assumption is that the system, while not under load, is able to execute the workload
05:48:19 MarkusH: I see a future where we run more and more concurrent ML workloads on our systems
05:48:41 ... if the system can't detect what's real-time or not, it may not be able to orchestrate it
05:49:07 ... e.g. audio processing needs to run within certain time requirements, at the risk of audio glitches or robot voices otherwise
05:49:37 q?
05:49:38 ... if we can't rely on these deadlines being respected, this creates an adoption blocker
05:49:47 ... the same is true (at a different scale) for video processing
05:50:28 ... also, there is prioritization - not all audio processing may be as critical
05:50:35 q?
05:50:41 ... we've also documented situations of misbehaving concurrent workloads
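[For context, a minimal sketch of the "try it and check at run time" approach mentioned earlier, since the API gives no real-time guarantee today. performance.now() is a real API; runInference stands in for whatever execution call the app uses, and the budget numbers are illustrative.]

    // Measure worst-case inference latency over a few warm runs and bail out of
    // the real-time path if the budget is not met.
    async function meetsDeadline(runInference, budgetMs, samples = 10) {
      let worst = 0;
      for (let i = 0; i < samples; i++) {
        const start = performance.now();
        await runInference();
        worst = Math.max(worst, performance.now() - start);
      }
      return worst <= budgetMs;
    }

    // Example: a 10 ms audio frame leaves only a few milliseconds for ML work.
    // if (!(await meetsDeadline(() => runNoiseSuppression(frame), 5))) { useDspFallback(); }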
05:51:08 q+
05:51:35 ack mtavenrath
05:51:48 MarkusT: it feels like a hard-to-address problem in general
05:52:15 ... e.g. ONNX Runtime doesn't have a sense of real time
05:53:06 ... tasks get queued, so if a task gets queued behind a slow task (e.g. an LLM request), you can't really accelerate it
05:53:30 ... not sure there is a prioritization mechanism on all types of devices
05:53:56 q+
05:53:57 q+
05:54:00 ack Tarek
05:54:01 ... even getting this orchestrated natively is hard, because the frameworks don't support the infrastructure you would need to execute properly
05:54:17 Tarek: do we really want to do that in WebNN?
05:54:40 ... we're starting from this situation of wanting to run concurrent workloads via a background utility
05:54:54 q+
05:55:31 ... should this be done by the app or in the backend? does it even make sense to run several things on a GPU?
05:56:09 MarkusT: do you know how much of the available time window the task will take?
05:56:25 MarkusH: on CPU, this is a solved problem with OS priorities
05:57:15 ... when workloads get interleaved on GPUs, there is an opportunity for prioritization
05:57:37 MarkusT: pre-emption is now available on GPUs, but it is much more expensive than on a CPU
05:57:57 ... but overall, this gets us back to my device selection issue
05:58:08 ... e.g. audio processing you probably want on the CPU, where the data is anyway
05:58:33 ... conversely, video processing happens on the GPU, and you'd want to use the device used to render the video as well
05:58:57 q?
05:59:27 MarkusH: audio processing might be best run on the NPU for power efficiency
05:59:29 q+
05:59:36 ack RafaelCintron
05:59:59 MarkusT: that's where it's useful to know which devices are available, and to support benchmarking
06:00:35 Dingwei has joined #webmachinelearning
06:00:35 RafaelCintron: hints could communicate priorities that would map to processing queues
06:00:39 q-
06:00:50 ack dom
06:01:59 dom: a reflection - this need to orchestrate processing across latency and power efficiency comes up around many APIs on the web platform, and each time we create a hint we should align across APIs
06:02:36 ... you don't want to switch from one GPU to another GPU, you want continuity, but how to describe that in a declarative way is the question
06:03:13 ... these are more general questions; Google Meet is a good use case to look at for RTC-application requirements
06:03:15 q?
06:03:41 RRSAgent, draft minutes
06:03:43 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
06:03:52 MarkusH: I was discussing with Youenn the concept of worker priority that was proposed by Intel a couple of years ago
06:04:26 ... e.g. an "audio" worker priority would be exposed to WebNN
06:04:36 ... and influence how that job would run
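[For context, a hypothetical sketch of the worker-priority idea just mentioned; the Worker constructor has no priority option today (only type, name and credentials), and both the option value and the worker script name are illustrative - see the QoS breakout linked below.]

    // Hypothetical "audio" QoS priority on a worker hosting the audio ML pipeline.
    const audioWorker = new Worker('noise-suppression-worker.js', {
      type: 'module',
      priority: 'audio',  // hypothetical hint, not part of the Worker API today
    });
    // A WebNN context created inside that worker could inherit the priority,
    // letting the user agent schedule its inference ahead of best-effort work.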
06:05:14 AlexDawson has joined #webmachinelearning
06:05:24 -> https://www.w3.org/2023/Talks/TPAC/breakouts/web-worker-qos/ Web Worker Quality of Service breakout presentation at TPAC 2023
06:05:34 RRSAgent, draft minutes
06:05:35 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html dom
06:34:21 phillis has joined #webmachinelearning
06:37:03 Topic: Implementation plans and trials
06:37:17 Tarek has joined #webmachinelearning
06:37:20 Anssi: in this session we'll discuss and share the latest news on the next steps for implementations, Origin Trial or equivalent, and new requirements or feedback from updates in backends and frameworks
06:37:28 ... but first, we kick off with exciting demos to whet your appetite!
06:38:02 simone has joined #webmachinelearning
06:38:05 -> WebNN Developer Preview demos https://microsoft.github.io/webnn-developer-preview/
06:38:49 Present+ Tara_Whalen
06:38:49 Present+ Simone_Onofri
06:40:35 [showing WebNN Stable Diffusion Turbo running both on GPU and NPU]
06:41:12 Present+ Iris_Johnson
06:41:34 [showing WebNN Segment Anything demo running on GPU and on NPU]
06:41:57 handellm has joined #webmachinelearning
06:43:46 [showing WebNN Whisper Base on GPU and NPU]
06:44:23 -> WebNN via Transformers.js https://huggingface.co/webnn/spaces
06:45:11 [showing background removal based on MODNet, demo hosted on Hugging Face, on GPU & NPU]
06:45:16 Dingwei has joined #webmachinelearning
06:45:41 [showing real-time object detection with YOLO12n]
06:46:23 tara has joined #webmachinelearning
06:46:34 [real-time depth estimation with Depth Anything V2]
06:50:35 [demo of background blur done with WebNN in a full WebGPU pipeline, with 23% improved performance and 17% lower power consumption]
06:51:18 Anssi: thank you Ningxin for these compelling demos!
06:51:27 alispivak has joined #webmachinelearning
06:51:44 mgifford2 has joined #webmachinelearning
06:51:51 Subtopic: Browser vendors' trials
06:52:12 Ugur has joined #webmachinelearning
06:52:30 Anssi: we've discussed on our telcons that the Origin Trial in Chrome is getting closer, latest discussion:
06:52:37 -> https://www.w3.org/2025/10/23-webmachinelearning-minutes.html#1faa
06:52:41 ... we also discussed that Edge follows the upstream work with only a small 5-10 day delay and will launch an Origin Trial in sync
06:52:45 ... more information about Origin Trials will be made available at:
06:52:49 -> Chrome Origin Trials https://developer.chrome.com/origintrials/
06:52:53 -> Edge Origin Trials https://developer.microsoft.com/en-us/microsoft-edge/origin-trials
06:53:13 RRSAgent, draft minutes
06:53:15 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
06:53:41 reillyg: I imagine this landing in the 2nd or 3rd release of the new year in the Chrome OT
06:54:22 Topic: Horizontals
06:54:27 Anssi: in this session we get to know the experts behind horizontal groups
06:54:31 Subtopic: Ethics
06:54:50 Anssi: for ethics, I've proposed to make this group's Ethical Principles for Web Machine Learning a joint deliverable with the Web & AI Interest Group, a group that is being proposed
06:55:06 ... by doing this, we can tap into the expertise of that Interest Group to help advance this important deliverable on the W3C Note track
06:55:12 ... we currently refer to this document in the WebNN spec's Ethical Considerations section
06:55:32 -> https://www.w3.org/TR/webmachinelearning-ethics/
06:55:36 -> https://www.w3.org/TR/webnn/#ethics
06:56:30 dom: ethics has not received a lot of bandwidth, which is why we propose to make it a joint deliverable with the Web & AI IG; the document was written in 2022-23, a long time ago considering the rate of development in the AI space
06:56:35 q?
06:56:59 dom: also, the Ethical Web Principles have been endorsed as a W3C Statement
06:57:00 q?
06:58:07 Subtopic: Sustainability
06:58:14 Anssi: I've asked Mike Gifford, co-chair of the Sustainable Web IG, to talk about the work done in that group
06:58:18 -> https://www.w3.org/groups/ig/sustainableweb/
06:58:22 Anssi: per TAG review feedback, we're expected to evaluate the sustainability impact of WebNN, see issue #861
06:58:23 https://github.com/webmachinelearning/webnn/issues/861 -> Issue 861 Evaluate sustainability impact (by anssiko) [tag-needs-resolution] [Agenda+]
06:58:36 https://www.w3.org/TR/web-sustainability-guidelines/
06:58:36 https://github.com/w3c/sustainableweb-wsg/issues/139
06:58:36 https://github.com/w3c/sustainableweb-wsg/issues/139 -> Issue 139 Adding a comment about what is or isn't included with AI (by mgifford) [enhancement] [editorial]
06:58:44 RRSAgent, draft minutes
06:58:45 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
06:58:57 MikeG: we're in an environmental and climate crisis, which we need to integrate into our work
06:59:17 ningxin has joined #webmachinelearning
06:59:28 ... the goal of the Web Sustainability Guidelines is to create a Web standard that other institutions can use as guidelines to evaluate how sustainable their technologies are
06:59:49 ... the size of the average Web page has grown beyond the scale of the information it's providing
06:59:50 q?
07:00:19 ... which relates to Web performance, although we have considerations that are completely separate - e.g. water consumption
07:00:52 present+
07:01:11 ... we're here because AI is changing a lot for the Web; we're seeing agentic browsers, the rise of AI in everything whether you want it or not, with a huge environmental impact, with data centers' growing impact on electricity, water and sound
07:01:33 present+
07:02:08 ... we're interested in evaluating the overlap between our groups: the environmental impact of decentralizing AI inference to devices, given their lower optimization compared to data centers
07:02:13 ... we have a few questions:
07:03:01 Present+ Mike_Gifford
07:03:02 ... - what advice can you give us as we're starting to write up guidance on accessibility [suspect AI was meant]
07:03:12 q?
07:03:43 Anssi: what is the best way for participants of this group to help with this? GitHub repo?
07:03:52 MikeG: yes
07:04:31 Anssi: are there AI-related issues we can help with? initially AI wasn't really part of the scope, as I understand it
07:04:53 MikeG: right - we can't not address this given the impact of AI
07:05:14 ... our guidelines are expected to address different contexts and audiences
07:05:31 ... we're not sure yet whether to include AI in a cluster or distribute it across the document
07:05:57 Anssi: for this group to help, having AI-focused content would make it easier
07:06:18 Present+ Alex_Dawson
07:06:20 q?
07:06:30 MikeG: we have lots of infrastructure to help navigate the guidelines and issues through a well-defined taxonomy, which will help with this
07:06:44 ... there is also the question of data centers, on which we could use expertise from people here
07:07:38 dom: sustainability is currently an IG that's working on a Note, and the direction is toward a horizontal group
07:08:01 ... the horizontal definition is more of a cultural one; ethical web considerations tell us to consider sustainability
07:08:02 q?
07:09:06 q?
07:09:07 MikeG: how does your group deal with a fast-evolving ecosystem such as AI?
07:09:50 Anssi: we try to find the right level of abstraction that stands the test of time, as Web standards have tried to do
07:10:05 ... similar to the discussion about to what extent the NPU/GPU distinction matters
07:10:20 W3C already has a Societal Impact self-review; there is scope for a potential sustainability self-review in the future.
07:10:34 reillyg: we also depend on what developers will want to use to provide the best UX
07:11:15 ... so we're more reactive, where the sustainability work would be more proactive in pushing in a given direction
07:11:31 MikeG: aligning incentives towards good sustainability is a key challenge we face
07:12:31 ... Small vs Large Language Models: the former seem more environmentally friendly, but will that distinction remain relevant over time?
07:12:41 Anssi: the Mobile Web Best Practices document had that very issue
07:13:52 q+
07:14:29 ack Tarek
07:14:52 Tarek: re SLM, at Mozilla the definition we used a year ago no longer works today
07:15:03 Ugur has joined #webmachinelearning
07:15:15 ... we're looking at device tiers: non-capable, devices with certain capabilities, high-end devices
07:15:27 ... we've found that more robust over time
07:16:02 MikeG: any suggestion on how to classify models instead of devices?
07:16:28 Tarek: anything that doesn't spit out a continuous stream of tokens is an SLM
07:16:30 q+
07:17:16 Sushanth: we put the boundary at 7B parameters
07:17:25 Tarek: but it's at risk of changing in a few months
07:17:52 Thomas: I don't think the number of parameters belongs in a guideline context: the guideline should be about "use the smallest possible model"
07:18:03 fershad has joined #webmachinelearning
07:18:14 ... with the caveat that an already-available model on device might be a better option
07:20:45 q+
07:22:39 Anssi: re model selection, this should be about selecting the right tool for the right job, but it is a complicated evaluation to make given the variety of toolboxes available to people
07:23:05 Subtopic: Privacy and Security
07:23:09 Anssi: late last year, the Privacy Working Group was launched, replacing the Privacy Interest Group
07:23:13 ... what's new in this transition?
07:23:17 -> https://www.w3.org/2024/10/wg-privacy-charter.html
07:23:20 Tara: the transition from the IG to the WG hasn't really changed much in terms of the review work
07:23:21 q-
07:24:10 Ugur has joined #webmachinelearning
07:24:23 AlexDawson has left #webmachinelearning
07:24:29 Tara: Simone and I are going to run a joint presentation
07:24:51 Anssi: I always struggle a bit to delineate between privacy and security
07:24:59 q-
07:25:05 Tara: they do have a lot in common
07:26:17 ... we have specialized guidance, but this shouldn't be a source of concern on your end
07:26:29 Anssi: the Security Interest Group was recently launched to reinvigorate work advising groups developing standards on how to avoid and mitigate security issues
07:27:01 Slideset: https://docs.google.com/presentation/d/11m1TXLVzhnIEimyqIjgs0VTXhhA4wWIGyATEw664Ei0/edit
07:27:12 [slide 1]
07:27:15 [slide 2]
07:28:09 elena has joined #webmachinelearning
07:28:27 [slide 3]
07:28:46 RRSAgent, draft minutes
07:28:47 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik
07:29:53 [slide 4]
07:32:39 [slide 5]
07:33:11 ThomasNattestad has joined #webmachinelearning
07:33:58 [slide 6]
07:35:28 q+
07:35:51 RobKochman has joined #webmachinelearning
07:35:54 q?
07:35:57 ack ThomasNattestad
07:36:37 Ugur has joined #webmachinelearning
07:37:43 ThomasN: re fingerprinting, one of the perennial questions that keeps popping up is that there is already so much entropy that it's not obvious how much of it can still be mitigated
07:37:51 q+
07:38:30 ... is this a tractable problem, and something that is worth spending time mitigating at the spec level?
07:39:38 Tara: I think we're pushing towards a better space, and so we feel it's worth considering the trade-offs that keep that path open
07:40:25 ThomasN: it's hard to evaluate the cost to developing the API and, more importantly, to the ability of developers to fulfill their use cases
07:40:33 ack christianliebel
07:40:48 q+ christianliebel
07:40:55 [slide 7]
07:42:13 [slide 8]
07:43:01 [slide 9]
07:43:20 [slide 10]
07:43:51 q?
07:44:19 [slide 11]
07:44:48 [slide 12]
07:45:14 [slide 13]
07:45:34 [slide 14]
07:46:10 [slide 15]
07:46:49 ack christianliebel
07:47:15 christianliebel: the APIs we build in the CG/WG are on-device
07:47:40 ... how different would a trusted execution environment in the cloud be from the security/privacy perspectives?
07:47:24 Topic: Wrap up
07:47:31 Anssi: thank you everyone for your active participation and productive discussions
07:47:36 ... this day was packed and we managed to finish with gusto!
07:48:03 ... special thank you to our guests Mike, Tara and Simone, who joined to share important work happening across horizontals
07:48:07 ... also huge thanks to Ningxin & team for the case study and compelling demos that both inform our future direction and demonstrate the exciting web experiences we already enable today with WebNN
07:48:12 ... interested folks are welcome to join us for dinner
07:48:26 ... we're quite a large group, so the plan is to meet in the Portopia Hotel (adjacent to the Kobe International Conference Center) lobby at 18:15 to coordinate on transport and restaurants, likely splitting into multiple groups based on preferences
07:49:09 RRSAgent, draft minutes
07:49:10 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html dom
07:49:28 RRSAgent, draft minutes
07:49:30 I have made the request to generate https://www.w3.org/2025/11/09-webmachinelearning-minutes.html anssik