14:51:32 <RRSAgent> RRSAgent has joined #webmachinelearning
14:51:37 <RRSAgent> logging to https://www.w3.org/2025/10/23-webmachinelearning-irc
14:51:37 <Zakim> RRSAgent, make logs Public
14:51:38 <Zakim> please title this meeting ("meeting: ..."), anssik
14:51:39 <anssik> Meeting: WebML WG Teleconference – 23 October 2025
14:51:42 <anssik> Chair: Anssi
14:51:51 <anssik> Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-10-23-wg-agenda.md
14:51:55 <anssik> Scribe: Anssi
14:52:03 <anssik> scribeNick: anssik
14:52:19 <anssik> Present+ Anssi_Kostiainen
14:52:24 <anssik> Regrets+ Dwayne_Robinson
14:52:33 <anssik> RRSAgent, draft minutes
14:52:34 <RRSAgent> I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik
14:56:53 <zkis> zkis has joined #webmachinelearning
14:57:07 <anssik> Present+ Zoltan_Kis
14:57:31 <Fabio> Fabio has joined #webmachinelearning
14:59:13 <anssik> Present+ Fabio_Bernardon
15:00:17 <Ehsan> Ehsan has joined #webmachinelearning
15:00:57 <anssik> Present+ Ningxin_Hu
15:01:24 <anssik> Present+ Ehsan_Toreini
15:01:38 <handellm> handellm has joined #webmachinelearning
15:01:46 <anssik> Present+ Markus_Handell
15:02:52 <anssik> Present+ Mike_Wyrzykowski
15:03:11 <Mike_Wyrzykowski> Mike_Wyrzykowski has joined #webmachinelearning
15:03:49 <anssik> Present+ Rafael_Cintron
15:03:54 <ningxin> ningxin has joined #webmachinelearning
15:03:57 <anssik> RRSAgent, draft minutes
15:03:58 <RRSAgent> I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik
15:04:24 <anssik> Anssi: we'll start by acknowledging our new participants
15:04:31 <anssik> ... please welcome to the WebML WG:
15:05:02 <anssik> ... Sword Li from Cybozu, a Japanese company developing a web-based workplace collaboration platform
15:05:04 <RafaelCintron> RafaelCintron has joined #webmachinelearning
15:05:20 <anssik> ... Haoli Chen from ByteDance, familiar from its global social media platform TikTok
15:05:44 <anssik> ... welcome on board Sword and Haoli
15:06:00 <anssik> ... we look forward to your contributions and product-driven feedback on WebNN
15:06:03 <qcomp> qcomp has left #webmachinelearning
15:06:21 <anssik> ... while new participants are onboarding, with mixed emotions we will say goodbye to our long-standing participant Zoltan who will be stepping away from this Working Group at the end of the month
15:06:52 <anssik> ... Zoltan has a long track record of contributions as one of the first participants in this group, he plans to continue in the WebML Community Group in his future capacity, so we will get to benefit from his expertise in our incubator in the future
15:07:00 <anssik> ... thank you Zoltan for all your contributions to this Working Group since its inception!
15:07:51 <anssik> Zoltan: thank you everyone, it has been fun to be part of this effort, especially in the past few years we've gotten a lot of traction and the momentum continues
15:07:56 <ningxin> Thanks much, Zoltan, your contribution is highly appreciated!
15:08:26 <anssik> Topic: Incubations
15:08:42 <anssik> Anssi: next, a quick recap of recent WebML Community Group developments
15:08:48 <anssik> -> WebML CG Teleconference – 16 October 2025 https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-10-16-cg-agenda.md
15:09:02 <anssik> gb, this is webmachinelearning/webmcp
15:09:02 <gb> anssik, OK.
15:09:10 <anssik> Anssi: last week we focused on WebMCP that is attracting a lot of attention and new participants interested in this agentic web capability are joining eager to contribute
15:09:15 <anssik> ... here's a brief summary of recent developments:
15:09:52 <anssik> ... - we scheduled WebMCP TPAC F2F discussions on Tuesday Japan morning to allow US West Coast remote participants to join at better hours
15:10:16 <anssik> ... - for WebMCP elicitation #21, we resolved the proposed API should give user an option to block abusive sites permanently but throw an error to developers so legitimate sites can implement fallback behaviour
15:10:20 <gb> https://github.com/webmachinelearning/webmcp/issues/21 -> Issue 21 Elicitation (by bwalderman)
15:10:30 <anssik> ... - for interleaving interaction #20, we did not identify a concrete use case for informing sites when users decide to take over in the middle of a tool execution, thus we closed this issue with no action
15:10:31 <gb> https://github.com/webmachinelearning/webmcp/issues/20 -> CLOSED Issue 20 Interleaving user and Agent interaction with the site (by khushalsagar) [Agenda+]
15:11:06 <anssik> ... - for prompt injection #11, exploration continues tracking MCP upstream developments and by developing the clipboard mitigation idea through prototyping
15:11:06 <gb> https://github.com/webmachinelearning/webmcp/issues/11 -> Issue 11 Prompt injection (by bwalderman) [Agenda+]
15:11:38 <anssik> ... - lastly, declarative API PR #26 was discussed, the group agreed to continue to evolve this API together with the imperative API, as a learning from Web Components
15:11:39 <gb> https://github.com/webmachinelearning/webmcp/pull/26 -> Pull Request 26 add explainer for the declarative api (by MiguelsPizza)
15:11:53 <anssik> ... questions comments?
15:11:53 <qcomp> qcomp has joined #webmachinelearning
15:12:08 <anssik> Topic: F2F deep dives
15:12:18 <anssik> gb, this is webmachinelearning/meetings
15:12:18 <gb> anssik, OK.
15:12:32 <anssik> Anssi: F2F Agenda issue #35
15:12:33 <gb> https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
15:12:41 <anssik> Anssi: I want to expand the plan for a few core sessions on the WebML WG F2F Agenda to make sure we can make these productive and interesting to you
15:12:52 <anssik> ... there are the following three buckets: (1) implementation experience, (2) new features, (3) customer feedback
15:12:58 <anssik> Subtopic: Implementation experience
15:13:15 <anssik> Anssi: my proposal is to kick off the "Implementation plans and trials" session with demos
15:14:39 <anssik> Anssi: kick off this session with demos that exercise diverse hardware accelerators
15:15:12 <anssik> ... from the demo session excitement, we'll transition to discuss browser vendors' trial plans, dissect the latest implementation experience across the layers to inform the WebNN spec development
15:15:16 <anssik> ... I'm aware that implementers may not want to share their detailed plans ahead of product launches, so it is OK to abstract out any such details
15:16:42 <reillyg> Present+ Reilly_Grant
15:16:45 <anssik> Rafael: we're doing all Edge work in upstream, 5-10 days delay from upstream
15:17:00 <reillyg> q+
15:17:03 <anssik> ... any Origin Trials follow Chrome's schedule
15:17:05 <anssik> ack reillyg
15:18:08 <anssik> Reilly: I mean the question is, what is Chrome's schedule, we're awaiting on finishing the integration with Windows ML API, so we have complete support on Windows, expecting this to land in stable in the next month or so, so likely in a position to do an OT hitting Stable for folks around the start of the year
15:19:12 <anssik> Anssi: what are the tests and data you are looking for?
15:19:17 <anssik> ... wpt pass rate perhaps?
15:19:58 <anssik> Reilly: wpt is in good shape, the biggest blocker is to look at various stability metrics, what's the security risk of launching this
15:20:37 <anssik> Anssi: Edge has its own OT frontend?
15:20:47 <anssik> Rafael: correct
15:22:34 <anssik> Subtopic: New features
15:22:39 <anssik> gb, this is webmachinelearning/webnn
15:22:39 <gb> anssik, OK.
15:22:43 <anssik> Anssi: in the "New features" session, the following have been proposed:
15:23:22 <anssik> ... (1) Core operator set #573 - discuss a plan to extend with attentions, MoE, TopK, MatMulNBits ... or fuse
15:23:22 <gb> https://github.com/webmachinelearning/webnn/issues/573 -> Issue 573 Core operator set (by philloooo) [question] [opset]
15:24:03 <ningxin> q+
15:24:18 <anssik> ack ningxin
15:24:55 <anssik> Ningxin: we've investigated ops such as attention and MatMulNBits, decomp performance vs. fused in LLMs
15:25:29 <anssik> ... if time allows, we can share a 10-min update on what we've learned, performance vs. code complexity
15:25:51 <anssik> q?
15:26:36 <anssik> ... (2) Support flexible input sizes #883 - understand and drive consensus on feature details such as dynamic shape types, unknown size, symbolic size, tensor-derived-sized, Markus provided pre-reading via TensorRT docs
15:26:37 <gb> https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific]
15:26:54 <anssik> ... (3) bag of issues in "device selection" to seek consensus on
15:27:32 <anssik> -> https://github.com/webmachinelearning/webnn/labels/device%20selection
15:27:51 <anssik> Anssi: 1-2 topics can still fit in this "new features" session
15:28:02 <anssik> -> https://github.com/webmachinelearning/meetings/issues/35
15:28:02 <gb> https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
15:28:35 <anssik> Subtopic: Customer feedback
15:28:53 <anssik> Anssi: "Customer feedback and collaborations" is the session to share feedback from real-world users
15:29:12 <anssik> ... I understand now everyone wants to speak to their future product features, but any feedback that is available either directly from customers, or through a proxy, with confidential details abstracted out, is welcome in this session
15:29:44 <anssik> ... to clarify, we consider customers broadly, feedback from end-users, web developers, frameworks, anyone who is interfacing with the WebNN API, either directly or through an abstraction, is welcome
15:29:55 <anssik> ... Belem has created a repo called Awesome WebNN to collect customer and user feedback signals into one place
15:30:01 <anssik> ... this community resource is one place to look for feedback and signals, anyone is welcome to contribute to this repo
15:30:05 <anssik> -> https://github.com/webmachinelearning/awesome-webnn
15:30:54 <anssik> Topic: New features and operator specific issues
15:30:59 <anssik> gb, this is webmachinelearning/webnn
15:30:59 <gb> anssik, OK.
15:31:15 <anssik> Subtopic: The decomposition of lstm has issue for batch size 1 input
15:31:18 <anssik> Anssi: issue #889
15:31:19 <gb> https://github.com/webmachinelearning/webnn/issues/889 -> Issue 889 The decomposition of `lstm` has issue for batch size 1 input (by fujunwei) [question] [operator specific]
15:31:36 <anssik> ... Junwei opened this issue while working to enable Kokoro TTS model on TFLite backend
15:31:56 <anssik> ... implementation patch removed specific size 1 dimensions at 0 axis with squeeze_dims option in TFLite GraphBuilder implementation
15:32:01 <anssik> ... are we clear on spec changes required?
15:32:21 <ningxin> q+
15:32:31 <anssik> Reilly: I wasn't aware there's spec side issue for this
15:32:40 <anssik> ... I reviewed the Chromium CL
15:32:44 <anssik> ack ningxin
15:33:28 <anssik> Ningxin: I talked to Junwei offline, looks like our decompose sample code as an issue, to handle size 1 dimension correctly, we use squeeze, that will remove size 1 dimension unintentionally
15:33:51 <anssik> ... we need to fix our sample core because TFLite backend implementation follow the sample code, not spec text issue itself
15:34:32 <anssik> ... I can submit a PR to fix the sample code in the spec
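The squeeze problem discussed above can be illustrated with a minimal shape-only sketch. It assumes a single time step sliced from an `lstm` input of shape [steps, batchSize, inputSize]; the helper names `squeezeAll` and `squeezeAxes` are hypothetical and are not taken from the spec sample code or from the Chromium CL.

```typescript
// Hypothetical helpers operating on shapes only; no WebNN API is used.

// Removes every dimension of size 1, including ones the caller wanted to keep.
function squeezeAll(shape: number[]): number[] {
  return shape.filter((dim) => dim !== 1);
}

// Removes only the listed axes, and only if each one really has size 1.
function squeezeAxes(shape: number[], axes: number[]): number[] {
  for (const axis of axes) {
    if (shape[axis] !== 1) {
      throw new RangeError(`axis ${axis} has size ${shape[axis]}, expected 1`);
    }
  }
  return shape.filter((_, index) => axes.indexOf(index) === -1);
}

// One time step with batchSize = 1 and inputSize = 4 (illustrative values).
const stepShape = [1, 1, 4];

const squeezedAll = squeezeAll(stepShape);        // batch dimension is lost
const squeezedAxis0 = squeezeAxes(stepShape, [0]); // batch dimension is kept

console.log(squeezedAll, squeezedAxis0);
```

With batch size 1, squeezing everything collapses the step to rank 1, while squeezing only axis 0 keeps the rank-2 [batchSize, inputSize] shape that later steps expect. This appears to match the direction described for the TFLite backend patch.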
15:35:20 <anssik> Topic: Query supported devices
15:35:35 <anssik> Subtopic: Before graph compilation
15:35:49 <anssik> Anssi: spec PR #895 and explainer PR #884
15:35:49 <gb> https://github.com/webmachinelearning/webnn/pull/884 -> Pull Request 884 Update explainer with new proposal for simple accelerator mapping (by zolkis) [device selection]
15:35:49 <gb> https://github.com/webmachinelearning/webnn/pull/895 -> Pull Request 895 Add a simple accelerator selection mechanism. (by zolkis) [device selection]
15:36:02 <anssik> ... thanks Zoltan for refining the spec PRs since our last discussion
15:36:07 <anssik> ... the PR was updated as follows since last review:
15:36:18 <anssik> ... - add "poll CPU fallback status" algorithm
15:36:34 <anssik> ... - in "create a context" algorithm, cpuFallbackActive initialized to undefined instead of false
15:36:41 <anssik> ... the IDL diff remains the same, two new boolean flags are added to MLContext:
15:36:49 <anssik> ```
15:36:49 <anssik> interface MLContext {
15:36:49 <anssik>   undefined destroy();
15:36:49 <anssik> + readonly attribute boolean accelerated;
15:36:49 <anssik> + readonly attribute boolean cpuFallbackActive;
15:36:49 <anssik>   readonly attribute Promise<MLContextLostInfo> lost;
15:36:50 <anssik> };
15:36:50 <anssik> ```
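A rough sketch of how a page might read the two proposed flags follows. It assumes the attributes land as shown in the PR #895 IDL diff. The `ProposedContextFlags` interface, the `summarizeExecution` helper, and the meaning attached to each combination are hypothetical illustrations, not spec text; the PR's "poll CPU fallback status" algorithm is what would define the real behaviour.

```typescript
// Hypothetical local mirror of the two attributes proposed for MLContext.
interface ProposedContextFlags {
  readonly accelerated: boolean;
  readonly cpuFallbackActive: boolean;
}

type ExecutionSummary =
  | "accelerated"
  | "accelerated-with-cpu-fallback"
  | "not-accelerated";

// Assumed interpretation: cpuFallbackActive only matters when the context
// reports itself as accelerated.
function summarizeExecution(flags: ProposedContextFlags): ExecutionSummary {
  if (!flags.accelerated) {
    return "not-accelerated";
  }
  return flags.cpuFallbackActive
    ? "accelerated-with-cpu-fallback"
    : "accelerated";
}

// Mock objects stand in for a real MLContext so the sketch runs anywhere.
const mockContexts: ProposedContextFlags[] = [
  { accelerated: true, cpuFallbackActive: false },
  { accelerated: true, cpuFallbackActive: true },
  { accelerated: false, cpuFallbackActive: false },
];

for (const flags of mockContexts) {
  console.log(summarizeExecution(flags));
}
```

An application such as a real-time video call could use a summary like this to pick a lighter model when fallback is reported, which is the kind of use the Google Meet feedback motivated.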
15:37:16 <anssik> Anssi: Markus from Google Meet LGTM'd this PR (thanks!)
15:37:32 <anssik> ... I requested review from Ningxin and Rafael because you had provided feedback earlier, others are welcome to review too
15:38:13 <anssik> Zoltan: last time it was mentioned we could add a truth table to clarify the combinations, I think that'd be too static, a dynamic relationship is better captured by the algorithm
15:38:19 <anssik> ... I will amend it per feedback
15:38:20 <anssik> q?
15:38:20 <RafaelCintron> OK, I will take a look.
15:39:00 <ningxin> Yes, I'll take a look
15:39:16 <anssik> Zoltan: I've updated the explainer and we can merge them together with spec PR
15:39:37 <anssik> ... I will probably clean up the device selection explainer a bit
15:40:47 <anssik> Subtopic: After graph compilation
15:41:26 <anssik> Reilly: two pieces, we added graph.devices API to give developer visibility into what happened when they built the model, help with debugging, e.g. why the model is slow etc.
15:41:39 <anssik> Anssi: issue #836 PR #854
15:41:40 <gb> https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo) [device selection]
15:41:40 <gb> https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
15:42:20 <anssik> Reilly: we could switch this to the same attributes proposed for MLContext, for CPU fallback, because for some implementations it is per-graph behaviour
15:42:52 <anssik> ... properties that come from interfaces could be logged, or information that could come up via developer tools, it is useful to understand how the system will behave in practice
15:43:17 <anssik> q?
15:43:58 <anssik> -> https://github.com/webmachinelearning/webnn/labels/device%20selection
15:45:16 <RafaelCintron> q+
15:45:19 <anssik> ack RafaelCintron
15:45:50 <anssik> Rafael: with this after compile feature do we need also the before graph feature?
15:46:21 <anssik> Reilly: I think the after compile is more specific, whether or not the acceleration comes from NPU or GPU
15:47:20 <anssik> ... there's two reasons for this specificity, models on macOS that include CPU fallback run significantly slower than those that only use GPU and NPU, so being able to detect that
15:47:34 <anssik> ... maybe fallback is not specifically CPU only but when we use more than one device?
15:48:10 <anssik> ... about demos, we haven't had a chance to develop a demo yet for the after compile case, want to understand real-world case with multiple ML workloads at the same time
15:48:30 <anssik> ... want to understand load-balancing experience
15:48:32 <zkis> q+
15:48:42 <anssik> ack zkis
15:49:25 <anssik> Zoltan: wanted to say, we can add a new properly that discloses if any fallback is happening, there was a use case for CPU-specific fallback
15:50:04 <anssik> ... I think Mike's proposal is also worth looking into, MLSupportLimits
15:50:30 <anssik> Markus: CPU fallback was motivated by Google Meet feedback
15:50:59 <anssik> ... I guess, GPU fallback could be also interesting, but there's not enough experience yet to tell whether that is an appropriate solution
15:51:01 <anssik> q?
15:51:22 <anssik> Zoltan: is that a consistent behaviour across platforms?
15:52:09 <anssik> Reilly: given how the platforms work, that is macOS specific, architecturally we only see this on macOS, because it is the only platform that drives developers selecting three devices as where the model could run, others ask to pick CPI
15:52:16 <anssik> s/CPI/CPU and another accelerator
15:52:44 <anssik> Reilly: one or two ops falling back to CPU might be high due to context switching
15:52:58 <anssik> q?
15:54:23 <anssik> Markus: we have a demo app in the works that might inform this discussion
15:54:47 <anssik> ... one thing is interop, packing tensors differently depending on the details of the device
15:55:52 <anssik> ... waiting for some input on interop issues before opening a new spec issue for this
15:55:53 <anssik> q?
15:57:19 <anssik> Fabio: we did discuss with Markus, either end user or developers have control over where workloads execute, think agents running 4-5 different models, you want to know where they run
15:57:37 <anssik> ... could have iGPU, dGPU, NPU, how to distribute the work so you can understand the performance you get
15:57:59 <anssik> ... also understand the privacy considerations, not necessarily want the developer to know the exact details but performance level expected
15:58:26 <anssik> ... no specific solution yet, but discussing how to best address this and how the end user could direct where to execute the workloads
15:58:27 <anssik> q?
15:59:20 <anssik> Zoltan: this is a multi-faceted issue due to especially NPU device diversity, can we identify a simple solution that can be extended with more fine-grained information
15:59:22 <RafaelCintron> q+
15:59:28 <anssik> ack RafaelCintron
15:59:57 <anssik> Rafael: I think you have to be a power use to know WebNN is used and care how to distribute the workload across devices
16:00:11 <anssik> ... developers could be informed enough to do this
16:00:52 <anssik> ... there may be systems where GPU is faster than NPU and the other way around
16:01:07 <Fabio> q+
16:01:16 <anssik> ... as diagnostics information, this would be good to have
16:01:19 <anssik> ack Fabio
16:01:53 <anssik> Fabio: agree, my preference is to let the end-user figure these out, because they know the system, maybe one direction is to look how to allow the user to make the selection?
16:02:19 <anssik> ... example, Adobe Suite online, different tasks, in those cases you may have light-weight models that run better on NPU and heavier models that run better on GPU
16:02:37 <anssik> ... we discussed internally if we can have implicit performance level to hang the devices off
16:02:51 <anssik> ... the evolving ecosystem makes this challenging
16:02:52 <anssik> q?
16:03:02 <handellm> q+
16:03:16 <anssik> Rafael: supreme power user could get access to this data, but for normal users the platform should be able to pick the best devices
16:03:29 <anssik> ... most users are non-technical, and do not understand device selection details
16:03:54 <anssik> q?
16:03:57 <anssik> ack handellm
16:04:38 <anssik> Markus: echoing Rafael, maybe we can identify the reasons for wanting to configure this and that, I'd like to use low-latency setup for my use case that is real-time collaboration
16:04:47 <anssik> Rafael: I'm in favour of hints as a mechanism
16:04:47 <anssik> q?
16:05:21 <anssik> Topic: Cancel 6 Nov WG telcon due to F2F on 10-11 Nov
16:05:26 <anssik> Anssi: I will cancel Thursday WebML WG 6 November Teleconference due to close proximity with the TPAC F2F meetings week
16:05:30 <anssik> ... as a reminder:
16:05:34 <anssik> ... please check you have the F2F in your calendar, and if not, export 10 Nov invite from:
16:05:38 <anssik> -> https://www.w3.org/groups/wg/webmachinelearning/calendar/
16:05:49 <anssik> Anssi: don't be confused it says "Tentative", that's a "feature" of the tool and the F2F meeting is Confirmed
16:05:54 <anssik> ... also, please check out the WebML WG agenda and group-specific instructions and share your suggestions:
16:05:58 <anssik> -> https://github.com/webmachinelearning/meetings/issues/35
16:05:58 <gb> https://github.com/webmachinelearning/meetings/issues/35 -> Issue 35 WebML WG/CG F2F Agenda - TPAC 2025 (Kobe, Japan) (by anssiko)
16:06:12 <anssik> Anssi: we can use our F2F time together more productively if folks make proposals ahead of the meeting for topics of interest to them
16:06:33 <anssik> ... see you in Kobe in-person or virtually on 10-11 November 2025 (or 9-10 November if you're in the US West Coast!)
16:06:37 <anssik> ... safe travels!
16:06:45 <anssik> RRSAgent, draft minutes
16:06:46 <RRSAgent> I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik
16:12:56 <anssik> s/now everyone/not everyone
16:13:47 <anssik> s/spec side/a spec side
16:14:07 <anssik> s/as an/has an
16:14:24 <anssik> s/sample core/sample code
16:14:36 <anssik> s/follow the/follows the
16:15:05 <anssik> s/spec PRs/spec PR
16:17:06 <anssik> s/new properly/new property
16:18:02 <anssik> s/high due/high cost due
16:18:48 <anssik> s/with Markus/with Markus T
16:19:56 <anssik> s/power use to/power user to
16:22:21 <anssik> RRSAgent, draft minutes
16:22:22 <RRSAgent> I have made the request to generate https://www.w3.org/2025/10/23-webmachinelearning-minutes.html anssik