13:57:27 RRSAgent has joined #webmachinelearning
13:57:31 logging to https://www.w3.org/2025/04/10-webmachinelearning-irc
13:57:31 RRSAgent, make logs Public
13:57:32 please title this meeting ("meeting: ..."), anssik
13:57:32 Meeting: WebML WG Teleconference – 10 April 2025
13:57:39 Chair: Anssi
13:57:44 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-04-10-wg-agenda.md
13:57:51 Scribe: Anssi
13:57:56 scribeNick: anssik
13:58:03 gb, this is webmachinelearning/webnn
13:58:03 anssik, OK.
13:58:07 Present+ Anssi_Kostiainen
13:58:14 RRSAgent, draft minutes
13:58:15 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
13:58:22 Winston has joined #webmachinelearning
13:59:14 zkis has joined #webmachinelearning
13:59:24 Present+ Zoltan_Kis
13:59:42 Present+ Winston_Chen
14:00:15 Present+ Joshua_Bell
14:00:20 zkis_ has joined #webmachinelearning
14:00:23 Present+ Tarek_Ziade
14:00:30 Present+ Mike_Wyrzykowski
14:00:48 tarek has joined #webmachinelearning
14:00:52 ningxin has joined #webmachinelearning
14:00:55 Present+ Joshua_Lochner
14:01:20 Present+ Christian_Liebel
14:01:30 Present+ Ningxin_Hu
14:01:35 jsbell has joined #webmachinelearning
14:01:42 Present+ Elena_Zhelezina
14:02:35 Present+ Rafael_Cintron
14:02:48 RRSAgent, draft minutes
14:02:50 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
14:03:28 McCool has joined #webmachinelearning
14:04:05 Present+ Michael_McCool
14:04:17 Joshua_Lochner has joined #webmachinelearning
14:04:48 Mike_Wyrzykowski has joined #webmachinelearning
14:05:09 Topic: Incubations summary
14:05:18 anssik: we had an EU and APAC timezone friendly WebML CG Teleconference last week
14:05:22 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-03-31-cg-minutes.md
14:05:27 anssik: key takeaways:
14:05:45 ... Proofreader API was discussed, positive sentiment from the group
14:05:53 ... this API will be proposed for CG adoption: https://github.com/webmachinelearning/charter/pull/11
14:05:53 https://github.com/webmachinelearning/charter/pull/11 -> Pull Request 11 Add Proofreader API to Deliverables (by anssiko)
14:06:15 anssik: I will send a call for review to the group's list soon
14:06:40 ... note that proofreading was in scope of the Prompt API, now we want to add an explicit task-specific API for it
14:07:09 ... we also discussed Writing Assistance APIs review feedback, noted the spec is in good shape, the most advanced in terms of spec maturity of all task-based APIs
14:07:26 ... also Prompt API feature requests were discussed
14:07:39 ... exposing max image / audio limits, preference to leave this feature out for now
14:07:51 RafaelCintron has joined #webmachinelearning
14:08:25 ... multimodal real-time capabilities, we saw a demo from Christian using cloud-based APIs, noted a gap of around 1 year between cloud-based APIs and task-based APIs in browsers
14:08:46 ... reviewed DOM integration proposal, the group wanted to see motivating use cases for the feature
14:08:57 ... our upcoming WebML CG meeting schedule is as follows, note we agreed to skip the next week's AMER:
14:09:19 ... - 28 April EU
14:09:19 ... - 13/14 May AMER
14:09:19 ... - 26 May EU
14:09:19 ... - 10/11 June AMER
14:09:19 ... - 23 June EU (tentative due to vacation period in the Northern hemisphere)
14:09:20 ... - 8/9 July AMER
14:09:56 Topic: AI Agents
14:10:18 anssik: Dom hosted an AI Agents W3C Breakouts session a few weeks ago
14:10:22 -> How would AI Agents change the Web platform? - Mar 26 https://www.w3.org/2025/Talks/dhm-ai-agents/
14:10:46 anssik: most recently AI Agents was discussed at the W3C Advisory Committee meeting, Apr 8, in context of the AI Impact on the Web discussion
14:10:56 ... topics:
14:11:07 ... - AI Browsers such as OpenAI Operator
14:11:16 ... - Model Context Protocol (MCP)
14:11:20 ... - Web Automators
14:11:24 ... - Assistive technology
14:11:50 ... risks:
14:12:06 zkis has joined #webmachinelearning
14:12:06 ... - security, hallucination, break out of the sandbox with prompt injection
14:12:14 ... - privacy, with another party in the mix
14:12:27 ... ecosystem:
14:12:37 ... - user intent dilution
14:13:04 ... - monetization with attention
14:14:01 anssik: I'd like to share a few active W3C workstreams connected with AI Agents
14:14:21 ... the WebML CG discussed a Prompt API feature request to add tool/function calling
14:14:31 ... this function calling proposal from Jul 2024 is basically a predecessor for MCP, introduced Nov 2024
14:14:35 -> https://github.com/webmachinelearning/prompt-api/issues/7
14:14:36 https://github.com/webmachinelearning/prompt-api/issues/7 -> Issue 7 Support for tool/function calling (by christianliebel) [enhancement]
14:15:12 anssik: the proposal is to allow the browser (extensions?) to provide standard functions to be called to augment the capabilities of an LLM model
14:15:17 ... a simple example would be e.g. a calculator function provided by the browser
14:16:22 Christian: this is still relevant, you can do function calling without AI Agents and vice versa, calling a JS function with a well-defined schema, use cases e.g. with form filling, you want to make sure the data is well formed and can be used in non-AI contexts
14:16:59 ... re Agentic AI, this is very WIP, so we're not behind in terms of web capabilities, now is a good time to explore this
14:17:50 anssik: there's also a more recent feature request to add explicit MCP support to Prompt API
14:17:53 -> https://github.com/webmachinelearning/prompt-api/issues/100
14:17:54 https://github.com/webmachinelearning/prompt-api/issues/100 -> Issue 100 [FR] Add MCP Support (by christianliebel)
14:18:28 Christian: you could add MCP support to your browser and expose certain functionality via tools, this is why tool calling is important, I think it should be implemented
14:19:25 ... MCP story is early, would be great if you could interact with the web site, extension could talk to the tab, this could be one functionality, Playwright MCP Server is a good example
14:19:41 ... this type of use case would be nice, thinking about whether the Prompt API is the right place to extend
14:19:42 https://modelcontextprotocol.io/introduction
14:20:02 anssik: MCP, Model Context Protocol, is like function calling with superpowers
14:20:10 ... resembles the traditional client-server model:
14:20:34 ... - MCP Server -- where the tools live, e.g. local calculator or remote weather lookup, web search etc.
14:20:54 ... - MCP Client -- connector usually part of the AI Agent, finds available tools, formats requests, communicates with the MCP Server
14:21:36 Christian: MCP is still in flux, it is not too late to think about how to integrate this into browsers
14:21:52 ... you talk to external systems, maybe on your local system, maybe remotely
14:22:10 MCP in agentic orchestration and other topics: https://huggingface.co/blog/Kseniase/mcp
14:22:18 -> A collection of MCP Servers https://mcp.so/
14:22:38 q?
14:23:01 Topic: Tensors for graph constants
14:23:05 anssik: issue #760 PR #830
14:23:06 https://github.com/webmachinelearning/webnn/pull/830 -> Pull Request 830 Allow tensors for graph constants. (by bbernhar)
14:23:06 https://github.com/webmachinelearning/webnn/issues/760 -> Issue 760 Support building graphs from `MLTensor` containing constants (by bbernhar) [feature request]
14:23:34 ... this issue was opened Sep 2024 and I felt now is the right time to discuss it again on this call
14:23:38 ... thanks Bryan for iterating on the PR and prototyping, and Austin for all the Chromium work!
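[Editor's note: the tool/function-calling idea discussed above (issue #7) — a JS function with a well-defined schema, e.g. a browser-provided calculator — can be sketched roughly as below. The names `registerTool` and `callTool` and the schema shape are illustrative only, not any proposed Prompt API surface.]

```javascript
// Hypothetical sketch: a registry of named tools with declared parameter
// schemas. A model's structured tool-call request is validated against the
// schema before dispatch, so malformed data never reaches the function.

const tools = new Map();

function registerTool(name, { description, parameters, fn }) {
  tools.set(name, { description, parameters, fn });
}

// Minimal structural check standing in for real JSON Schema validation.
function validateArgs(schema, args) {
  for (const [key, type] of Object.entries(schema)) {
    if (typeof args[key] !== type) {
      throw new TypeError(`argument "${key}" must be a ${type}`);
    }
  }
}

function callTool(name, args) {
  const tool = tools.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  validateArgs(tool.parameters, args);
  return tool.fn(args);
}

// The calculator example mentioned on the call.
registerTool("calculator", {
  description: "Evaluate a basic arithmetic operation",
  parameters: { op: "string", a: "number", b: "number" },
  fn: ({ op, a, b }) => (op === "add" ? a + b : op === "mul" ? a * b : NaN),
});

// A tool-call request, e.g. parsed from the model's structured output:
const result = callTool("calculator", { op: "add", a: 2, b: 40 });
```

As Christian notes, the same validation path is usable in non-AI contexts (e.g. form filling), since nothing above depends on a model being the caller.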
14:24:53 jsbell: there's agreement on the idea, bikeshedding on whether this is a new IDL type or a new property on an existing interface
14:26:16 anssik: I see 3 open conversations in the PR, would like to see if we have agreement on them
14:26:55 Ningxin: I think we could defer this until Bryan is on the call
14:27:16 Topic: Caching mechanism for MLGraph
14:27:25 anssik: issue #807
14:27:26 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
14:27:47 Subtopic: Explainer updates & migration
14:27:59 anssik: since our last discussion, we agreed cross-origin model sharing use cases are out of scope
14:28:07 ... we agreed to focus on same-origin caching of MLGraphs
14:28:20 ... this tighter scope aligns with the implementation intent and avoids privacy risks
14:28:52 ... we will use Reilly's explicit API as a starting point for the explainer that will document Chromium implementation experience, using the successful MLTensor explainer-Chromium prototyping feedback loop as a blueprint
14:29:15 ... given Reilly's Chromium experience, I'd ask Reilly to submit a PR for the explainer skeleton focusing on the same-origin case that we can iteratively advance
14:29:26 q?
14:29:29 ... all exploratory work (cross-origin, adapters etc.) will happen in a separate hybrid-ai repo to keep this WG focused on what is being implemented in browser engines
14:29:32 ack zkis
14:29:40 https://github.com/webmachinelearning/hybrid-ai/pull/16
14:29:41 https://github.com/webmachinelearning/hybrid-ai/pull/16 -> Pull Request 16 Create localcache.md (by mmccool)
14:29:42 q+
14:29:47 ack McCool
14:31:24 McCool: I created a local cache explainer, it has labels for discussion items
14:32:14 q+
14:32:31 ack RafaelCintron
14:32:55 RafaelCintron: I like the concept of having a PR to gather comments, how to provide feedback on that?
14:32:58 +1 to send the localcache.md PR to WebNN repo
14:33:04 +1
14:33:07 anssik: there will be a new PR
14:34:43 q?
14:34:45 +1 to discuss the explainer in WebNN WG, as cross-origin was marked out of scope in the PR, so it is in line with the WG
14:35:35 Zoltan: I checked this discussion and it looks pretty good, and I like that cross-origin is out of scope, Reilly taking the first stab on the PR and explainer SGTM
14:35:50 q?
14:36:08 Subtopic: Requirements from web frameworks
14:36:27 anssik: it was proposed by Ningxin we should look at what WebNN key customers i.e. web frameworks need from the caching mechanism and design toward those requirements
14:36:42 ... one such customer is ORT native, which has an Execution Provider context cache feature
14:36:49 q+
14:37:22 ... EP context cache attempts to solve the exact problem of compilation cost, notes most backend SDKs provide a feature to dump the pre-compiled model into a binary file that can be directly executed on the target device and as such improves session creation time
14:37:27 -> OnnxRuntime EP context cache https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html
14:37:41 anssik: what can we learn from the OnnxRuntime EP context cache design?
14:37:45 ack ningxin
14:37:59 ningxin: wanted to clarify this design is only for native
14:38:58 ningxin: this means we need to coordinate with ORT folks, EP context is a possible way to move forward, Execution Provider is based on vendors' SDKs, can provide compiled blobs to native apps to use those exported models with this EP context mode
14:39:33 ... this is the native use case, I think there's an opportunity to have a similar thing on the web, EP context is one ONNX op in the ONNX opset
14:40:07 ... we cannot import an ONNX model with a native binary on the web, but can think about a saved MLGraph being used for EP context, I think there's an opportunity to explore with Reilly's proposal
14:40:30 ... we experimented with this feature on native to see what data we could project onto the web
14:40:59 ... on an Intel platform using GPU and NPU, using the EP context feature we can accelerate session creation time, a 7x speedup with SD Turbo
14:41:16 ... for NPU an even greater speedup, 25x in session creation time
14:41:28 ... this is very promising data
14:41:30 q?
14:42:54 ningxin: prototype in Chromium is our next step, will share Chromium CLs in the spec issue
14:43:03 Subtopic: Related API proposals
14:43:16 anssik: there are other explorations in this space
14:43:20 -> Cross-Origin Storage API: https://github.com/explainers-by-googlers/cross-origin-storage
14:43:37 anssik: I believe discussion on COS API use cases and user research might be useful due to some overlap
14:44:13 jsbell: high-level, this is very-very-experimental, no implementation commitment yet
14:45:19 q+
14:45:43 Christian: AFAICT, this is in an early feedback gathering phase, user research going on
14:46:25 q+
14:46:35 ... example.org and example.com could not share the same model file, so the goal is to only require it to be downloaded once, Joshua L provided positive feedback, also the WebLLM project, but challenges remain
14:46:35 ack jsbell
14:46:57 jsbell: parallel exploration about whether this can be done at a higher level, "want a model good at translation"
14:47:34 ... this exploration was discussed at BlinkOn and an interesting tidbit was that the file-based mechanism bleeds into the compiled model mechanism
14:48:33 ... would be difficult to see how that could be done for ML models similarly to what we can do for Wasm modules
14:48:52 q?
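[Editor's note: ningxin's EP context numbers above (7x/25x faster session creation) come from skipping backend recompilation on later sessions. The compile-once pattern behind that can be sketched as below; the names and in-memory store are stand-ins, since a real implementation would persist the blob to disk (native) or origin-scoped storage (web).]

```javascript
// Illustrative sketch of the caching pattern behind ONNX Runtime's EP
// context cache: compile a model once, keep the compiled blob under a key,
// and on later session creation load the blob instead of recompiling.

const compiledCache = new Map();
let compileCount = 0;

function expensiveCompile(modelBytes) {
  compileCount += 1; // stands in for seconds of backend graph compilation
  return { blob: `compiled(${modelBytes})` };
}

function createSession(modelKey, modelBytes) {
  let compiled = compiledCache.get(modelKey);
  if (!compiled) {
    compiled = expensiveCompile(modelBytes);
    compiledCache.set(modelKey, compiled); // dump the pre-compiled blob
  }
  return { model: modelKey, exec: compiled.blob };
}

// First session pays the compile cost; the second is a cache hit.
createSession("sd-turbo", "weights...");
createSession("sd-turbo", "weights...");
```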
14:49:11 anssik: another proposal is the Cross-Origin Model Cache exploration by Michael, looking at cross-origin reuse and adapters
14:49:15 -> Cross-Origin Model Cache https://github.com/webmachinelearning/hybrid-ai/blob/main/proposals/cache.md
14:49:18 ack McCool
14:50:11 McCool: one comment comparing my proposal with the COS API, a cache is non-deterministic, not equivalent, the advantage of hashing is you can't change the data without changing the hash
14:50:21 +1 to Michael... I think these approaches are complementary, not in competition.
14:51:15 McCool: need to figure out local caching first and how that impacts other things, I think we need to explore file store and caching, what is unique to caching
14:51:53 jsbell: the caching and cross-origin storage proposals are not in conflict, that said, no implementation commitment for the latter at the moment
14:52:32 q+
14:52:39 ack McCool
14:53:03 McCool: the adapters feature is a separate issue, step 3
14:54:40 Topic: Query mechanism for supported devices
14:54:47 anssik: issue #815
14:54:47 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
14:54:58 (adapters might be step 2 if we build them on local caches and/or model storage)
14:55:23 anssik: I wanted to discuss any new use case feedback, Apple's device privacy considerations, address questions on the capacity concept
14:55:28 ... and review the latest iteration of the API proposal
14:55:55 ... we discussed Markus' feedback last time, is there any new information, or questions to Markus?
14:55:58 q?
14:56:07 -> https://github.com/webmachinelearning/webnn/issues/815#issuecomment-2758753704
14:56:08 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
14:56:29 anssik: Mike responded to Markus and shared an example from the Apple device ecosystem, explaining that e.g. an 8-core and an 80-core Apple GPU are seen as identical devices for privacy reasons
14:57:06 Mike: for WebGPU and other Web APIs we limit exposure, so these look the same
14:58:45 anssik: this implies GPU capabilities may differ significantly, and on a lower core-count system an Apple framework might prefer to use the NPU for a better user experience, even if a developer would request a GPU
14:58:56 Mike_Wyrzykowski has joined #webmachinelearning
14:59:01 ... it is suggested the MLPowerPreference hint allows for this flexibility
14:59:21 ... another key piece of feedback is that excluding the CPU is not possible with Core ML APIs today
14:59:28 ... the suggestion is to allow Core ML to choose the device it thinks is the most suitable
14:59:37 ... lastly, there was a request for a sample that could be used to reproduce the case where the existing MLPowerPreference is not sufficient for this use case
14:59:48 anssik: Zoltan put together the latest iteration of the API proposal considering what Core ML supports:
14:59:53 ... - enumerate available compute devices (cpu, gpu, npu)
14:59:58 ... - limit the used compute devices (to cpu-only, cpu+gpu, cpu+npu, or auto).
15:00:08 -> https://developer.apple.com/documentation/coreml/mlcomputedevice/allcomputedevices
15:00:08 -> https://developer.apple.com/documentation/coreml/mlcomputeunits#Processing-Unit-Configurations
15:00:26 zkis: the latest proposal adds a devicePreference hint passed at context creation time:

```
const context = await navigator.ml.createContext({
  powerPreference: 'high-performance',
  devicePreference: "gpu-like", // or "cpu-only", defaulting to "auto"
});
```

15:00:56 zkis: two directions, Mike's proposal, or an even simpler version of what Reilly requested
15:01:34 q?
15:02:17 zkis: the original request is to be able to ask if the context has a GPU in any combination, I tried to satisfy that use case
15:02:31 ... if this is implementable on Apple platforms would be good to know
15:03:53 Mike: the need for an additional hint does not seem completely justified, adding it would mean we can't remove it ever, need stronger motivation to add it
15:04:15 q?
15:04:56 RRSAgent, draft minutes
15:04:57 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
15:40:10 RRSAgent, draft minutes
15:40:11 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
17:06:35 Zakim has left #webmachinelearning