14:51:32 RRSAgent has joined #webmachinelearning
14:51:37 logging to https://www.w3.org/2025/10/23-webmachinelearning-irc
14:51:37 RRSAgent, make logs Public
14:51:38 please title this meeting ("meeting: ..."), anssik
14:51:39 Meeting: WebML WG Teleconference – 23 October 2025
14:51:42 Chair: Anssi
14:51:51 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-10-23-wg-agenda.md
14:51:55 Scribe: Anssi
14:52:03 scribeNick: anssik
14:52:19 Present+ Anssi_Kostiainen
14:52:24 Regrets+ Dwayne_Robinson
14:52:33 RRSAgent, draft minutes
14:52:34 I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik
14:56:53 zkis has joined #webmachinelearning
14:57:07 Present+ Zoltan_Kis
14:57:31 Fabio has joined #webmachinelearning
14:59:13 Present+ Fabio_Bernardon
15:00:17 Ehsan has joined #webmachinelearning
15:00:57 Present+ Ningxin_Hu
15:01:24 Present+ Ehsan_Toreini
15:01:38 handellm has joined #webmachinelearning
15:01:46 Present+ Markus_Handell
15:02:52 Present+ Mike_Wyrzykowski
15:03:11 Mike_Wyrzykowski has joined #webmachinelearning
15:03:49 Present+ Rafael_Cintron
15:03:54 ningxin has joined #webmachinelearning
15:03:57 RRSAgent, draft minutes
15:03:58 I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik
15:04:24 Anssi: we'll start by acknowledging our new participants
15:04:31 ... please welcome to the WebML WG:
15:05:02 ... Sword Li from Cybozu, a Japanese company developing a web-based workplace collaboration platform
15:05:04 RafaelCintron has joined #webmachinelearning
15:05:20 ... Haoli Chen from ByteDance, familiar from its global social media platform TikTok
15:05:44 ... welcome on board Sword and Haoli
15:06:00 ... we look forward to your contributions and product-driven feedback on WebNN
15:06:03 qcomp has left #webmachinelearning
15:06:21 ...
while new participants are onboarding, with mixed emotions we will say goodbye to our long-standing participant Zoltan, who will be stepping away from this Working Group at the end of the month
15:06:52 ... Zoltan has a long track record of contributions as one of the first participants in this group; he plans to continue in the WebML Community Group in his future capacity, so we will get to benefit from his expertise in our incubator in the future
15:07:00 ... thank you Zoltan for all your contributions to this Working Group since its inception!
15:07:51 Zoltan: thank you everyone, it has been fun to be part of this effort, especially in the past few years we've gotten a lot of traction, and the momentum continues
15:07:56 Thanks much, Zoltan, your contribution is highly appreciated!
15:08:26 Topic: Incubations
15:08:42 Anssi: next, a quick recap of recent WebML Community Group developments
15:08:48 -> WebML CG Teleconference – 16 October 2025 https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-10-16-cg-agenda.md
15:09:02 gb, this is webmachinelearning/webmcp
15:09:02 anssik, OK.
15:09:10 Anssi: last week we focused on WebMCP, which is attracting a lot of attention; new participants interested in this agentic web capability are joining, eager to contribute
15:09:15 ... here's a brief summary of recent developments:
15:09:52 ... - we scheduled WebMCP TPAC F2F discussions on Tuesday Japan morning to allow US West Coast remote participants to join at better hours
15:10:16 ... - for WebMCP elicitation #21, we resolved that the proposed API should give the user an option to block abusive sites permanently but throw an error to developers so legitimate sites can implement fallback behaviour
15:10:20 https://github.com/webmachinelearning/webmcp/issues/21 -> Issue 21 Elicitation (by bwalderman)
15:10:30 ...
- for interleaving interaction #20, we did not identify a concrete use case for informing sites when users decide to take over in the middle of a tool execution, thus we closed this issue with no action
15:10:31 https://github.com/webmachinelearning/webmcp/issues/20 -> CLOSED Issue 20 Interleaving user and Agent interaction with the site (by khushalsagar) [Agenda+]
15:11:06 ... - for prompt injection #11, exploration continues by tracking MCP upstream developments and by developing the clipboard mitigation idea through prototyping
15:11:06 https://github.com/webmachinelearning/webmcp/issues/11 -> Issue 11 Prompt injection (by bwalderman) [Agenda+]
15:11:38 ... - lastly, declarative API PR #26 was discussed, the group agreed to continue to evolve this API together with the imperative API, as a learning from Web Components
15:11:39 https://github.com/webmachinelearning/webmcp/pull/26 -> Pull Request 26 add explainer for the declarative api (by MiguelsPizza)
15:11:53 ... questions, comments?
15:11:53 qcomp has joined #webmachinelearning
15:12:08 Topic: F2F deep dives
15:12:18 gb, this is webmachinelearning/meetings
15:12:18 anssik, OK.
15:12:32 Anssi: F2F Agenda issue #35
15:12:33 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
15:12:41 Anssi: I want to expand the plan for a few core sessions on the WebML WG F2F Agenda to make sure we can make these productive and interesting to you
15:12:52 ... there are the following buckets: (1) implementation experience, (2) new features, (3) customer feedback
15:12:58 Subtopic: Implementation experience
15:13:15 Anssi: my proposal is to kick off the "Implementation plans and trials" session with demos
15:14:39 Anssi: kick off this session with demos that exercise diverse hardware accelerators
15:15:12 ...
from the demo session excitement, we'll transition to discuss browser vendors' trial plans, and dissect the latest implementation experience across the layers to inform the WebNN spec development
15:15:16 ... I'm aware that implementers may not want to share their detailed plans ahead of product launches, so it is OK to abstract out any such details
15:16:42 Present+ Reilly_Grant
15:16:45 Rafael: we're doing all Edge work in upstream, 5-10 days delay from upstream
15:17:00 q+
15:17:03 ... any Origin Trials follow Chrome's schedule
15:17:05 ack reillyg
15:18:08 Reilly: I mean the question is, what is Chrome's schedule, we're waiting on finishing the integration with Windows ML API, so we have complete support on Windows, expecting this to land in stable in the next month or so, so likely in a position to do an OT hitting Stable for folks around the start of the year
15:19:12 Anssi: what are the tests and data you are looking for?
15:19:17 ... wpt pass rate perhaps?
15:19:58 Reilly: wpt is in good shape, the biggest blocker is to look at various stability metrics, what's the security risk of launching this
15:20:37 Anssi: Edge has its own OT frontend?
15:20:47 Rafael: correct
15:22:34 Subtopic: New features
15:22:39 gb, this is webmachinelearning/webnn
15:22:39 anssik, OK.
15:22:43 Anssi: in the "New features" session, the following have been proposed:
15:23:22 ... (1) Core operator set #573 - discuss a plan to extend with attentions, MoE, TopK, MatMulNBits ... or fuse
15:23:22 https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:24:03 q+
15:24:18 ack ningxin
15:24:55 Ningxin: we've investigated ops such as attention and MatMulNBits, decomp performance vs. fused in LLMs
15:25:29 ... if time allows, we can share a 10-min update on what we've learned, performance vs. code complexity
15:25:51 q?
15:26:36 ...
(2) Support flexible input sizes #883 - understand and drive consensus on feature details such as dynamic shape types, unknown size, symbolic size, tensor-derived size; Markus provided pre-reading via TensorRT docs
15:26:37 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific]
15:26:54 ... (3) bag of issues in "device selection" to seek consensus on
15:27:32 -> https://github.com/webmachinelearning/webnn/labels/device%20selection
15:27:51 Anssi: 1-2 topics can still fit in this "new features" session
15:28:02 -> https://github.com/webmachinelearning/meetings/issues/35
15:28:02 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
15:28:35 Subtopic: Customer feedback
15:28:53 Anssi: "Customer feedback and collaborations" is the session to share feedback from real-world users
15:29:12 ... I understand not everyone wants to speak to their future product features, but any feedback that is available either directly from customers, or through a proxy, with confidential details abstracted out, is welcome in this session
15:29:44 ... to clarify, we consider customers broadly, feedback from end-users, web developers, frameworks, anyone who is interfacing with the WebNN API, either directly or through an abstraction, is welcome
15:29:55 ... Belem has created a repo called Awesome WebNN to collect customer and user feedback signals into one place
15:30:01 ... this community resource is one place to look for feedback and signals, anyone is welcome to contribute to this repo
15:30:05 -> https://github.com/webmachinelearning/awesome-webnn
15:30:54 Topic: New features and operator specific issues
15:30:59 gb, this is webmachinelearning/webnn
15:30:59 anssik, OK.
15:31:15 Subtopic: The decomposition of lstm has issue for batch size 1 input
15:31:18 Anssi: issue #889
15:31:19 https://github.com/webmachinelearning/webnn/issues/889 -> Issue 889 The decomposition of `lstm` has issue for batch size 1 input (by fujunwei) [question] [operator specific]
15:31:36 ... Junwei opened this issue while working to enable Kokoro TTS model on TFLite backend
15:31:56 ... the implementation patch removed specific size 1 dimensions at axis 0 with the squeeze_dims option in the TFLite GraphBuilder implementation
15:32:01 ... are we clear on spec changes required?
15:32:21 q+
15:32:31 Reilly: I wasn't aware there's a spec side issue for this
15:32:40 ... I reviewed the Chromium CL
15:32:44 ack ningxin
15:33:28 Ningxin: I talked to Junwei offline, looks like our decompose sample code has an issue handling size 1 dimensions correctly: we use squeeze, which will remove a size 1 dimension unintentionally
15:33:51 ... we need to fix our sample code because the TFLite backend implementation follows the sample code; this is not an issue in the spec text itself
15:34:32 ... I can submit a PR to fix the sample code in the spec
15:35:20 Topic: Query supported devices
15:35:35 Subtopic: Before graph compilation
15:35:49 Anssi: spec PR #895 and explainer PR #884
15:35:49 https://github.com/webmachinelearning/webnn/pull/884 -> Pull Request 884 Update explainer with new proposal for simple accelerator mapping (by zolkis) [device selection]
15:35:49 https://github.com/webmachinelearning/webnn/pull/895 -> Pull Request 895 Add a simple accelerator selection mechanism. (by zolkis) [device selection]
15:36:02 ... thanks Zoltan for refining the spec PR since our last discussion
15:36:07 ... the PR was updated as follows since last review:
15:36:18 ... - add "poll CPU fallback status" algorithm
15:36:34 ... - in "create a context" algorithm, cpuFallbackActive initialized to undefined instead of false
15:36:41 ...
the IDL diff remains the same, two new boolean flags are added to MLContext:
15:36:49 ```
15:36:49 interface MLContext {
15:36:49 undefined destroy();
15:36:49 + readonly attribute boolean accelerated;
15:36:49 + readonly attribute boolean cpuFallbackActive;
15:36:49 readonly attribute Promise<MLContextLostInfo> lost;
15:36:50 };
15:36:50 ```
15:37:16 Anssi: Markus from Google Meet LGTM'd this PR (thanks!)
15:37:32 ... I requested review from Ningxin and Rafael because you had provided feedback earlier, others are welcome to review too
15:38:13 Zoltan: last time it was mentioned we could add a truth table to clarify the combinations, I think that'd be too static, a dynamic relationship is better captured by the algorithm
15:38:19 ... I will amend it per feedback
15:38:20 q?
15:38:20 OK, I will take a look.
15:39:00 Yes, I'll take a look
15:39:16 Zoltan: I've updated the explainer and we can merge it together with the spec PR
15:39:37 ... I will probably clean up the device selection explainer a bit
15:40:47 Subtopic: After graph compilation
15:41:26 Reilly: two pieces, we added graph.devices API to give developers visibility into what happened when they built the model, help with debugging, e.g. why the model is slow etc.
15:41:39 Anssi: issue #836 PR #854
15:41:40 https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo) [device selection]
15:41:40 https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
15:42:20 Reilly: we could switch this to the same attributes proposed for MLContext, for CPU fallback, because for some implementations it is per-graph behaviour
15:42:52 ... properties that come from interfaces could be logged, or information that could come up via developer tools, it is useful to understand how the system will behave in practice
15:43:17 q?
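[Editorial sketch, not part of the discussion] To illustrate how a page might read the two flags proposed in PR #895, here is a minimal TypeScript sketch. The `AccelerationFlags` interface, the `describeAcceleration` function and the status strings are hypothetical names made up for this example. The interpretation of each flag combination is an assumption for illustration only, since the PR defines the relationship through an algorithm rather than a fixed truth table. `cpuFallbackActive` is typed as possibly `undefined` because the minutes note it is initialized to undefined.

```typescript
// Hypothetical shape mirroring the two proposed MLContext flags.
// This is NOT the WebNN type definition; it only models the flags for illustration.
interface AccelerationFlags {
  readonly accelerated: boolean;
  // Assumption: undefined means the fallback status has not been determined yet.
  readonly cpuFallbackActive: boolean | undefined;
}

type AccelerationStatus =
  | "not-accelerated"
  | "accelerated"
  | "accelerated-with-cpu-fallback"
  | "accelerated-fallback-unknown";

// Maps the flag combination to a label an app could log for diagnostics.
// The mapping itself is an assumption made for this sketch.
function describeAcceleration(flags: AccelerationFlags): AccelerationStatus {
  if (!flags.accelerated) {
    return "not-accelerated";
  }
  if (flags.cpuFallbackActive === undefined) {
    return "accelerated-fallback-unknown";
  }
  return flags.cpuFallbackActive
    ? "accelerated-with-cpu-fallback"
    : "accelerated";
}

// Usage with a plain object standing in for a context:
console.log(describeAcceleration({ accelerated: true, cpuFallbackActive: false }));
```

If the same attributes were exposed per graph, as Reilly suggests, a function like this could take the graph object instead, since it only depends on the two flags.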
15:43:58 -> https://github.com/webmachinelearning/webnn/labels/device%20selection
15:45:16 q+
15:45:19 ack RafaelCintron
15:45:50 Rafael: with this after compile feature do we also need the before graph compilation feature?
15:46:21 Reilly: I think the after compile is more specific, whether or not the acceleration comes from NPU or GPU
15:47:20 ... there are two reasons for this specificity, models on macOS that include CPU fallback run significantly slower than those that only use GPU and NPU, so being able to detect that is useful
15:47:34 ... maybe fallback is not specifically CPU only but when we use more than one device?
15:48:10 ... about demos, we haven't had a chance to develop a demo yet for the after compile case, want to understand a real-world case with multiple ML workloads at the same time
15:48:30 ... want to understand load-balancing experience
15:48:32 q+
15:48:42 ack zkis
15:49:25 Zoltan: wanted to say, we can add a new property that discloses if any fallback is happening, there was a use case for CPU-specific fallback
15:50:04 ... I think Mike's proposal is also worth looking into, MLSupportLimits
15:50:30 Markus: CPU fallback was motivated by Google Meet feedback
15:50:59 ... I guess, GPU fallback could be also interesting, but there's not enough experience yet to tell whether that is an appropriate solution
15:51:01 q?
15:51:22 Zoltan: is that a consistent behaviour across platforms?
15:52:09 Reilly: given how the platforms work, that is macOS specific, architecturally we only see this on macOS, because it is the only platform that drives developers selecting three devices as where the model could run, others ask to pick CPU and another accelerator
15:52:16 s/CPI/CPU and another accelerator
15:52:44 Reilly: one or two ops falling back to CPU might be high cost due to context switching
15:52:58 q?
15:54:23 Markus: we have a demo app in the works that might inform this discussion
15:54:47 ... one thing is interop, packing tensors differently depending on the details of the device
15:55:52 ...
waiting for some input on interop issues before opening a new spec issue for this
15:55:53 q?
15:57:19 Fabio: we did discuss with Markus T, either end user or developers have control over where workloads execute, think agents running 4-5 different models, you want to know where they run
15:57:37 ... could have iGPU, dGPU, NPU, how to distribute the work so you can understand the performance you get
15:57:59 ... also understand the privacy considerations, we do not necessarily want the developer to know the exact details, but rather the performance level expected
15:58:26 ... no specific solution yet, but discussing how to best address this and how the end user could direct where to execute the workloads
15:58:27 q?
15:59:20 Zoltan: this is a multi-faceted issue, especially due to NPU device diversity, can we identify a simple solution that can be extended with more fine-grained information
15:59:22 q+
15:59:28 ack RafaelCintron
15:59:57 Rafael: I think you have to be a power user to know WebNN is used and care how to distribute the workload across devices
16:00:11 ... developers could be informed enough to do this
16:00:52 ... there may be systems where GPU is faster than NPU and the other way around
16:01:07 q+
16:01:16 ... as diagnostics information, this would be good to have
16:01:19 ack Fabio
16:01:53 Fabio: agree, my preference is to let the end-user figure these out, because they know the system, maybe one direction is to look at how to allow the user to make the selection?
16:02:19 ... example, Adobe Suite online, different tasks, in those cases you may have light-weight models that run better on NPU and heavier models that run better on GPU
16:02:37 ... we discussed internally if we can have implicit performance level to hang the devices off
16:02:51 ... the evolving ecosystem makes this challenging
16:02:52 q?
16:03:02 q+
16:03:16 Rafael: a supreme power user could get access to this data, but for normal users the platform should be able to pick the best devices
16:03:29 ...
most users are non-technical, and do not understand device selection details
16:03:54 q?
16:03:57 ack handellm
16:04:38 Markus: echoing Rafael, maybe we can identify the reasons for wanting to configure this and that, I'd like to use low-latency setup for my use case that is real-time collaboration
16:04:47 Rafael: I'm in favour of hints as a mechanism
16:04:47 q?
16:05:21 Topic: Cancel 6 Nov WG telcon due to F2F on 10-11 Nov
16:05:26 Anssi: I will cancel Thursday WebML WG 6 November Teleconference due to close proximity with the TPAC F2F meetings week
16:05:30 ... as a reminder:
16:05:34 ... please check you have the F2F in your calendar, and if not, export 10 Nov invite from:
16:05:38 -> https://www.w3.org/groups/wg/webmachinelearning/calendar/
16:05:49 Anssi: don't be confused that it says "Tentative", that's a "feature" of the tool and the F2F meeting is Confirmed
16:05:54 ... also, please check out the WebML WG agenda and group-specific instructions and share your suggestions:
16:05:58 -> https://github.com/webmachinelearning/meetings/issues/35
16:05:58 https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
16:06:12 Anssi: we can use our F2F time together more productively if folks make proposals ahead of the meeting for topics of interest to them
16:06:33 ... see you in Kobe in-person or virtually on 10-11 November 2025 (or 9-10 November if you're on the US West Coast!)
16:06:37 ... safe travels!
16:06:45 RRSAgent, draft minutes
16:06:46 I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik
16:12:56 s/now everyone/not everyone
16:13:47 s/spec side/a spec side
16:14:07 s/as an/has an
16:14:24 s/sample core/sample code
16:14:36 s/follow the/follows the
16:15:05 s/spec PRs/spec PR
16:17:06 s/new properly/new property
16:18:02 s/high due/high cost due
16:18:48 s/with Markus/with Markus T
16:19:56 s/power use to/power user to
16:22:21 RRSAgent, draft minutes
16:22:22 I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik