14:49:49 RRSAgent has joined #webmachinelearning
14:49:53 logging to https://www.w3.org/2025/09/11-webmachinelearning-irc
14:49:53 RRSAgent, make logs Public
14:49:54 please title this meeting ("meeting: ..."), anssik
14:49:54 Meeting: WebML WG Teleconference – 11 September 2025
14:49:57 Chair: Anssi
14:50:01 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-09-11-wg-agenda.md
14:50:08 Scribe: Anssi
14:50:10 scribeNick: anssik
14:50:23 Present+ Anssi_Kostiainen
14:52:12 RRSAgent, draft minutes
14:52:13 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
14:59:14 RafaelCintron has joined #webmachinelearning
14:59:17 DwayneR has joined #webmachinelearning
14:59:28 brwalder has joined #webmachinelearning
15:00:07 ningxin has joined #webmachinelearning
15:00:18 kush has joined #webmachinelearning
15:00:53 Present+ Khushal_Sagar
15:01:02 Present+ Alex_Nahas
15:01:09 Joshua_Lochner has joined #webmachinelearning
15:01:31 zkis has joined #webmachinelearning
15:01:33 Present+ Jason_McGhee
15:01:39 present+
15:02:08 Present+ Zoltan_Kis
15:02:10 phillis has joined #webmachinelearning
15:02:17 Present+ Rafael_Cintron
15:02:18 present+
15:02:19 Ehsan has joined #webmachinelearning
15:02:49 Present+ Brandon_Walderman
15:02:54 Present+ Dwayne_Robinson
15:02:55 Fabio has joined #webmachinelearning
15:03:24 Present+ Reilly_Grant
15:03:43 Leo has joined #webmachinelearning
15:05:03 jason has joined #webmachinelearning
15:05:31 Anssi: we'll start by welcoming our new participants
15:05:38 ... please welcome to the WebML WG:
15:05:42 ... Rick Viscomi from Google
15:05:45 tarek has joined #webmachinelearning
15:05:46 ... Fabio Bernardon and Sandeep Kumar from NVIDIA
15:06:08 AlexN has joined #webmachinelearning
15:06:11 ... Ilya Grigorik from Shopify
15:06:21 ... and please welcome to the WebML CG:
15:06:26 ... Uğur Toprakdeviren as an unaffiliated individual
15:06:37 Mike_Wyrzykowski has joined #webmachinelearning
15:07:02 ... for new folks joining, this is officially a Working Group call where we focus on the WebNN API, but by recent convention we've provided a quick update on incubations such as WebMCP and the Built-in AI APIs at the beginning of the meeting
15:07:18 ... for detailed discussion on incubations we have a separate call; we'll make some adjustments to that call schedule, to be discussed in a few minutes
15:08:02 Topic: Incubations
15:08:09 gb, this is webmachinelearning/webmcp
15:08:09 anssik, OK.
15:08:17 Anssi: first, an update on the recent WebML Community Group developments
15:08:35 Subtopic: WebMCP
15:10:06 Alex: thanks, I'm super excited to see this standards work starting; while working at Amazon I saw the need for this feature. In terms of what to get out of this group: fleshing out the spec; I'm excited to contribute and elevate the spec to make the web better
15:10:46 Jason: I saw the friction when MCP came out and started prototyping; being able to have the compute owned by the user and expose the value directly, I thought it was a great idea, so I put together an early implementation
15:11:18 ... we met with Alex a few months ago and have been hashing this space out together; security is very important to get right
15:13:29 Anssi: recent WebMCP feature discussions include:
15:13:35 ... - WebMCP for Service Workers explainer #19
15:13:36 https://github.com/webmachinelearning/webmcp/pull/19 -> MERGED Pull Request 19 Add new explainer for service workers (by bwalderman)
15:14:19 Brandon: feedback via dedicated issues is welcome for WebMCP for SW
15:14:53 Anssi: Capability discovery #8
15:14:54 https://github.com/webmachinelearning/webmcp/issues/8 -> Issue 8 Should tools be a means for capability discovery? (by bokand)
15:14:59 ... Elicitation #21
15:15:00 https://github.com/webmachinelearning/webmcp/issues/21 -> Issue 21 Elicitation (by bwalderman)
15:15:06 ... API design #15
15:15:06 https://github.com/webmachinelearning/webmcp/issues/15 -> Issue 15 API design (by bwalderman)
15:15:14 ... Interleaving interaction #20
15:15:15 https://github.com/webmachinelearning/webmcp/issues/20 -> Issue 20 Interleaving user and Agent interaction with the site (by khushalsagar)
15:15:24 ... Declarative API #22
15:15:25 https://github.com/webmachinelearning/webmcp/issues/22 -> Issue 22 Declarative API Equivalent (by EisenbergEffect)
15:15:44 ... Prompt injection #11
15:15:44 https://github.com/webmachinelearning/webmcp/issues/11 -> Issue 11 Prompt injection (by bwalderman)
15:16:10 ... API to list registered tools #16
15:16:10 https://github.com/webmachinelearning/webmcp/issues/16 -> Issue 16 Add API to list / execute tools? (by bokand)
15:16:28 Anssi: thank you everyone who contributed to these discussions
15:16:33 ... this formative stage of the WebMCP proposal is the right time to join the effort
15:17:59 Brandon: I encourage everyone to read the explainer, and I want to highlight prompt injection since security is something we must get right; safety is important for users
15:18:13 ... it's an unsolved problem in the MCP ecosystem as well, so all input is welcome
15:19:02 Subtopic: Community Group meeting schedule
15:19:24 Anssi: I'm proposing a change to the Community Group meeting schedule
15:19:43 I would be in favor of interleaving.
15:19:43 ... proposal to reuse this Thursday 15:00 UTC / 08:00 AM Pacific meeting slot for the Community Group call to better support the AMER geo during the WebMCP ramp-up phase
15:20:16 ... since this Working Group meets every other week, we could interleave the Community Group meeting with WebMCP focus on either even or odd weeks
15:20:33 ... I believe this would simplify your calendaring exercise
15:20:43 ... the trade-off is the time would not be optimal for APAC participation, especially from Japan
15:20:48 ... feedback, comments?
15:21:02 Anssi: I see Alex and Jason +1'd
15:22:32 ... Rafael, Brandon, Khushal also +1
15:22:53 Subtopic: Prompt API tool calling
15:23:02 Anssi: Prompt API tool calling issues under consideration
15:23:06 -> https://github.com/webmachinelearning/prompt-api/labels/tools
15:24:04 Topic: New features and operator specific issues
15:24:07 gb, this is webmachinelearning/webnn
15:24:07 anssik, OK.
15:24:15 -> [operator specific] issues https://github.com/webmachinelearning/webnn/labels/operator%20specific
15:24:21 -> [feature request] issues https://github.com/webmachinelearning/webnn/labels/feature%20request
15:24:33 Subtopic: Support dynamic tensor resizing for slice and resample2d
15:24:38 Anssi: issue #885
15:24:39 https://github.com/webmachinelearning/webnn/issues/885 -> Issue 885 Support dynamic tensor resizing for slice and resample2d (by Honry) [feature request] [operator specific]
15:24:49 ... first, noting this issue is related to the flexible input sizes issue we'll discuss after this one
15:25:12 ... Wanming reports: "Currently, tensor resizing via WebNN's slice and resample2d is limited to static parameters: slice must use static starts and sizes; resample2d must use static sizes and axes."
15:25:21 ... this causes fallback with a performance impact in certain models
15:25:43 ... the proposal is to enable dynamic tensor resizing with the following changes:
15:25:45 ... - change the slice starts and sizes argument types from unsigned long to MLOperand
15:25:53 ... - change the MLResample2dOptions.sizes and MLResample2dOptions.axes argument types from unsigned long to MLOperand
15:26:02 ```
MLOperand slice(MLOperand input,
                sequence<[EnforceRange] unsigned long> starts,
                sequence<[EnforceRange] unsigned long> sizes,
                optional MLSliceOptions options = {});

dictionary MLResample2dOptions : MLOperatorOptions {
  MLInterpolationMode mode = "nearest-neighbor";
  sequence<float> scales;
  sequence<[EnforceRange] unsigned long> sizes;
  sequence<[EnforceRange] unsigned long> axes;
};

partial interface MLGraphBuilder {
  MLOperand resample2d(MLOperand input, optional MLResample2dOptions options = {});
};
```
15:26:18 -> https://www.w3.org/TR/webnn/#api-mlgraphbuilder-slice
15:26:18 -> https://www.w3.org/TR/webnn/#api-mlgraphbuilder-resample2d-method
15:26:42 Anssi: Dwayne suggests ORT should resolve patterns of cast/gather/unsqueeze before they reach the dependent operator
15:27:05 ... if slice and resample would take dynamic *GPU* tensors as suggested, it would waste parallel shader resources
15:27:30 ... Dwayne further notes: "if the input parameters were moved into an MLTensor, there would need to be a requirement that any such tensors are CPU-restrained and do not execute on a remote device (GPU/NPU)."
15:27:49 ... also note the DML EP is able to do this, i.e. ensure such tensors stay on the CPU side
15:28:11 ... and asked whether this can apply to the WebNN EP too
15:28:30 ... Wanming asked for more information on how the DML EP implementation works, does it require graph recompilation?
15:28:57 Dwayne: to answer the question, it must recompile the graph
15:29:31 ... because dynamic inputs affect the shape, the values need to be known on the CPU side, transitively, not limited to the image; must consider traversal
15:29:57 ... these are tiny tensors; must consider all the overhead of element tensors, no HW benefit if these can be resolved by the caller
15:30:17 q+
15:30:19 ... not necessary to change the options knowing the DML EP can do this, there's a performance cost
15:31:02 q+
15:31:08 ... next step, can WebNN EP construction be delayed similarly to the DML EP? I'll chat with Wanming and Ningxin
15:31:24 ack reillyg
15:31:59 Reilly: Dwayne's elaboration answered my question, I was confused why only resize must support dynamic shapes, was expecting a trickle-down effect on the subgraph
15:32:19 ... we've discussed dynamic tensor shapes in general, helpful for models with KV caches in particular and LLMs in general
15:33:05 Dwayne: I think in general it is useful to support dynamic sizes, but if shape computation is affected it should be done before it reaches WebNN
15:33:53 Reilly: I think a dynamic sizing capability would require recompilation, it would have to be an intrinsic component of the API w/o going through the rebuilding
15:34:09 ... frameworks can rebuild the graph without starting compilation from scratch
15:34:17 ack ningxin
15:35:13 Ningxin: per Wanming's last comment the resize target tensor is used to reflect the exact original image size; for this model, if you check the SegmentAnything demo, the decoder runs many times on a single image, unless a new image is uploaded
15:35:51 ... I'm not sure this model is modelled correctly to use dynamic size; in the normal case even if the model can accept different image sizes, it is represented by flexible dimensions
15:36:30 ... for the latter case, if the model is rearchitected, the user can recompile the model if the image size changes, which does not happen often for each encoder inference
15:36:34 Dwayne: +1
15:36:49 q?
15:37:07 Subtopic: Flexible input sizes
15:37:17 Anssi: issue #883
15:37:17 https://github.com/webmachinelearning/webnn/issues/883 -> Issue 883 Support flexible input sizes (by huningxin) [feature request] [operator specific]
15:37:30 ... last time we introduced this proposal, well documented in the issue thanks to Dwayne's research
15:37:55 ... given this is a significant change to the implementation, we want to ensure the main API consumers', i.e. the ML JS frameworks', feedback is considered in this design phase
15:38:08 ... the group identified ONNX Runtime and Transformers.js as users of the WebNN API who have had this requirement
15:38:25 ... in ONNX Runtime, dynamic shape tensors are passed to Execution Providers
15:38:40 ... and currently, the WebNN EP falls back to the Wasm EP if dynamic shape tensors are passed to it
15:39:20 ... per Dwayne's comments the DirectML EP supports dynamic input shapes, albeit with some performance penalty due to on-demand creation of operators or delayed graph construction until shape information is known
15:39:35 ... I did not see direct feedback from Guenther of ORT or Joshua of Transformers.js in the issue
15:40:17 JoshuaL: I'd say dynamic input shapes are important for Transformers.js users' use cases
15:40:53 ... recently WebNN has been able to work with some vision models that can fix the input size and use static shapes; our current LLM implementation requires dynamic shapes on the decode side, getting a new token and adding it back
15:41:15 q+
15:41:25 ... we have thought about creating another implementation that allows static shapes, but the WebGPU implementation is working
15:41:55 ... great to hear there's movement in WebNN to address this
15:41:57 q?
15:42:02 ack reillyg
15:42:09 sounds good! And yes, a massive +1 from my side :)
15:42:41 Reilly: I agree on the value of doing this, there are versions of Whisper models that have a memory trade-off; the question is mostly for the folks familiar with the existing implementation of this in frameworks
15:42:51 ... recompilation, how is this implemented in e.g. ORT?
15:43:12 ... a light-weight compilation process, what's the signal that we can implement this across multiple platforms?
15:43:17 q?
15:43:41 Phillis: on Core ML, static and dynamic shapes are supported
15:44:01 ... dynamic shapes only get executed on the CPU
15:44:24 Anssi: can we get this information via a public API in Core ML?
15:44:32 Phillis: yes, we can test for this
15:44:40 q?
15:45:08 Dwayne: ORT supports dynamic shapes, but it is less efficient
15:45:19 ... must go down to individual nodes, cannot replan memory
15:45:25 ... the DML EP must recompile
15:45:40 Reilly: Transformers.js is using ORT Web, so it must be working in the WebGPU EP
15:45:54 Dwayne: I don't know the details about it, can ask Guenther
15:46:21 q?
15:47:06 Subtopic: Support uint8/int8 input for resample2d
15:47:12 Anssi: issue #872
15:47:12 https://github.com/webmachinelearning/webnn/issues/872 -> Issue 872 Support uint8/int8 input for resample2d (by huningxin) [operator specific]
15:47:21 ... I put this on the agenda as a last call for comments because I felt this was well fleshed out
15:47:35 ... current status is 2 of 3 backends support this, it is a candidate for optional data type support
15:47:41 ... Rafael signalled support at our last meeting
15:47:52 ... this is already implemented in Chromium and we have an agreement to make the corresponding spec change
15:48:09 ... given no further comments in the issue, I suggest this can be turned into a PR at the convenience of the editors
15:48:18 q?
15:48:32 Mike: +1
15:48:47 Topic: Privacy considerations
15:48:52 Anssi: issue #886
15:48:53 https://github.com/webmachinelearning/webnn/issues/886 -> Issue 886 Revise privacy considerations (by anssiko) [privacy-tracker]
15:49:04 ... to close on the privacy review feedback, there are two remaining tasks:
15:49:20 ... - review privacy considerations in the light of new information and spec updates
15:49:38 ... - migrate relevant parts of the standalone self-review questionnaire to the in-spec privacy considerations section
15:49:43 -> https://github.com/webmachinelearning/webnn/blob/main/security-privacy.md
15:49:48 -> https://www.w3.org/TR/webnn/#privacy
15:50:02 Anssi: I can take a look at this but would appreciate more folks reviewing the privacy considerations, in particular those who have not contributed to this section yet, to have fresh eyes on it
15:50:10 q?
15:50:27 Topic: MLGraph Cache
15:50:31 -> Explainer https://github.com/webmachinelearning/webnn/blob/main/cache-explainer.md
15:51:41 Anssi: explainer PR #862 was merged some time ago, the PR includes comments that provide context
15:51:42 https://github.com/webmachinelearning/webnn/pull/862 -> MERGED Pull Request 862 Add WebNN MLGraph Cache Explainer (by anssiko)
15:52:11 Anssi: to recap, the proposal is an explicit API for caching compiled graphs, allowing web applications to save and reuse them, thereby reducing the overhead of repeated compilation
15:52:18 ... this feature awaits experimental implementation experience
15:52:28 ... Wanming was planning to do an experiment on ONNX Runtime Web with WebNN model cache support
15:52:45 ... Ningxin, do you know the latest status, is this work planned or should we defer this for later?
15:53:14 Ningxin: Wanming's plan depends on the Chromium prototype, which in turn has a dependency on a related ORT API
15:53:50 ... for other backends like TFLite and Core ML, there's an opportunity for further implementation experience
15:54:43 Reilly: it is on our roadmap, we're currently focused on some other components of the system; that said, it is definitely on our radar
15:55:37 ... my last update on this is that Intel folks looked at building large models, and solutions for building large models and caching them
15:55:49 ... storing weights on disk is step 1 of caching the model
15:55:49 q?
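[Editor's illustration] The explicit save-and-reuse flow recapped above might look roughly like the sketch below. The `context.loadGraph()` / `context.saveGraph()` methods are hypothetical placeholder names, not a settled API surface from the explainer; the key-derivation helper is pure illustration of scoping cache entries to model identity plus compile options so an entry is never reused across incompatible configurations.

```javascript
// Derive a stable cache key from the model URL and context options.
// Sorting option keys means logically equal option bags yield the same key.
function graphCacheKey(modelUrl, options) {
  const canonical = Object.keys(options)
    .sort()
    .map((k) => `${k}=${options[k]}`)
    .join(";");
  return `${modelUrl}|${canonical}`;
}

// Hypothetical usage: try to restore a compiled graph, else compile and save.
async function loadOrCompile(context, modelUrl, options, build) {
  const key = graphCacheKey(modelUrl, options);
  let graph = await context.loadGraph(key); // hypothetical API
  if (!graph) {
    graph = await build();                  // full compilation path
    await context.saveGraph(key, graph);    // hypothetical API
  }
  return graph;
}
```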
15:57:16 Mike: I'll take a look at this issue and come back
15:57:46 Reilly: in the Chromium implementation we already store Core ML models because that is required, but reuse later is not yet implemented explicitly
15:58:36 ... there is a nice property in Core ML, ORT, TFLite: they work on high-level concepts; if the context is a CPU context initially, you can build on the CPU, and if the developer later wants high performance it can be reused without rebuilding the model
15:59:09 Topic: Query supported devices
15:59:24 Anssi: as you recall, we split this problem space in two to be more digestible: before and after graph compilation
15:59:47 ... the group needs to decide whether it wants to proceed with the "before" case, the "after" case, with both, or do neither
15:59:49 ... for any new feature we need both real-world use case validation and implementation experience
15:59:59 ... for the "before" case we have the use cases, but lack implementation experience
16:00:09 ... for the "after" case we're missing use cases, but have some implementation experience
16:00:13 ... let's discuss the "before graph compilation" case first
16:00:18 Subtopic: Before graph compilation
16:00:24 Anssi: the PR was updated by Zoltan to factor in feedback from the previous call, thanks!
16:00:33 ... I want us to discuss the implementability challenges that were brought up at our last meeting
16:00:45 ... the updated proposal introduces the following changes to the API surface:
16:00:58 ... 1) a new "accelerated" MLPowerPreference hint, this is developer-settable to true or false
16:01:12 ... 2) a getter to expose the post-context-creation confidence level that the requested "accelerated" context is available, returns one of:
16:01:22 ... "probably" -> the context will mostly use GPU/NPU, but CPU fallback may happen
16:01:34 ... "best-effort" -> NPU/GPU is supported by the platform but it cannot guarantee it for sure
16:01:41 q+
16:01:50 ... "no" -> the platform indicates it likely cannot provide NPU or GPU
16:01:55 ... I believe this proposal considers all the use cases discussed so far
16:02:08 ... we'd need implementation experience, preferably from multiple backends, e.g. OpenVINO, TFLite, Core ML?
16:02:12 q?
16:02:14 ack RafaelCintron
16:02:31 Rafael: I'd like to understand the implementability of this feature
16:02:55 ... it is unsettling to have more APIs with less explicit answers
16:03:28 q?
16:03:29 q+
16:03:34 ack zkis
16:03:49 q+
16:04:20 Zoltan: certainly we can do true or false, in that case we have the CPU fallback information to avoid an undefined or fuzzy space
16:04:56 ... should CPU fallback be a property or an event?
16:05:00 ack Mike_Wyrzykowski
16:05:15 MikeW: I effectively agree with Rafael's concerns
16:05:40 q+
16:05:44 ack RafaelCintron
16:06:16 Rafael: I think the main thing we're struggling with is that at context creation time we don't have a graph at all; some implementations have generous opSupportLimits
16:06:29 ... but later learn they cannot satisfy those limits
16:06:53 q+
16:07:02 ack zkis
16:07:15 Phillis: same as with the after graph compilation query?
16:07:28 Rafael: if there's no GPU, we can say not accelerated
16:08:03 Phillis: the reality is we cannot get any useful information before graph compilation, except whether there's an actual physical GPU or NPU on the system
16:08:26 Zoltan: ops and capabilities we haven't fleshed out yet; MikeW made a proposal for that, we can pick it up if this does not work out
16:08:29 q?
16:09:07 Zoltan: Rafael, do we also need the capability query?
16:09:32 Rafael: do you mean, is it sufficient to specify the op limits?
16:09:58 ... WebGPU allows asking "want HW accelerated", and the API responds "no HW acceleration for you, sorry!"
16:10:31 ... it is worth having the feature if the developer only cares about WebNN
16:10:45 q?
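[Editor's illustration] To make the proposed hint concrete, here is a sketch of how a page might branch on the confidence levels listed above. The `accelerated` context option and `context.accelerated` getter names follow the proposal summary, but all names and the decision policy are assumptions, not settled API.

```javascript
// Map the proposed post-creation confidence level ("probably" /
// "best-effort" / "no") to an application-level execution choice.
// The policy here is illustrative only.
function chooseExecutionPath(confidence) {
  switch (confidence) {
    case "probably":
      return "webnn";           // mostly GPU/NPU; accept occasional CPU fallback
    case "best-effort":
      return "webnn-monitored"; // try WebNN but watch inference latency
    case "no":
      return "wasm-fallback";   // platform likely cannot accelerate this workload
    default:
      return "wasm-fallback";   // unknown value: be conservative
  }
}

// Hypothetical usage (names are placeholders, not settled spec):
// const context = await navigator.ml.createContext({ accelerated: true });
// const path = chooseExecutionPath(context.accelerated);
```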
16:10:53 Subtopic: After graph compilation
16:10:57 anssik: issue #836 and PR #854
16:10:58 https://github.com/webmachinelearning/webnn/pull/854 -> Pull Request 854 define graph.devices (by philloooo)
16:10:58 https://github.com/webmachinelearning/webnn/issues/836 -> Issue 836 Get devices used for a graph after graph compilation (by philloooo) [device selection]
16:11:02 ... for this "after graph compilation" case we have demonstrated implementability with a Chromium prototype
16:11:07 ... however, as discussed last time, the use cases are not documented, so we wanted to work on those before making progress with this as a spec feature, to ensure the proposed solution targets real-world use cases
16:11:39 Phillis: the use cases are what Markus mentioned before; we have a use case in the abstract, we also considered an example app with that use case
16:11:51 ... it is not high priority currently, but is on our roadmap
16:12:44 RRSAgent, draft minutes
16:12:46 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
16:37:36 Present+ Ehsan_Toreini
16:37:52 Present+ Fabio_Bernardon
16:38:03 Present+ Leo_Lee
16:38:14 Present+ Ningxin_Hu
16:38:23 Present+ Phillis_Tang
16:38:35 RRSAgent, draft minutes
16:38:37 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
16:39:04 Present+ Mike_Wyrzykowski
16:39:07 RRSAgent, draft minutes
16:39:09 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
16:39:36 Present+ Joshua_Lochner
16:39:56 RRSAgent, draft minutes
16:39:58 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
16:57:50 RRSAgent, draft minutes
16:57:52 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
17:12:55 Regrets+ Markus_Handell
17:12:59 RRSAgent, draft minutes
17:13:00 I have made the request to generate https://www.w3.org/2025/09/11-webmachinelearning-minutes.html anssik
17:26:21 zkis has joined #webmachinelearning
19:05:18 Zakim has left #webmachinelearning