13:57:27 RRSAgent has joined #webmachinelearning
13:57:31 logging to https://www.w3.org/2025/04/10-webmachinelearning-irc
13:57:31 RRSAgent, make logs Public
13:57:32 please title this meeting ("meeting: ..."), anssik
13:57:32 Meeting: WebML WG Teleconference – 10 April 2025
13:57:39 Chair: Anssi
13:57:44 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-04-10-wg-agenda.md
13:57:51 Scribe: Anssi
13:57:56 scribeNick: anssik
13:58:03 gb, this is webmachinelearning/webnn
13:58:03 anssik, OK.
13:58:07 Present+ Anssi_Kostiainen
13:58:14 RRSAgent, draft minutes
13:58:15 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
13:58:22 Winston has joined #webmachinelearning
13:59:14 zkis has joined #webmachinelearning
13:59:24 Present+ Zoltan_Kis
13:59:42 Present+ Winston_Chen
14:00:15 Present+ Joshua_Bell
14:00:20 zkis_ has joined #webmachinelearning
14:00:23 Present+ Tarek_Ziade
14:00:30 Present+ Mike_Wyrzykowski
14:00:48 tarek has joined #webmachinelearning
14:00:52 ningxin has joined #webmachinelearning
14:00:55 Present+ Joshua_Lochner
14:01:20 Present+ Christian_Liebel
14:01:30 Present+ Ningxin_Hu
14:01:35 jsbell has joined #webmachinelearning
14:01:42 Present+ Elena_Zhelezina
14:02:35 Present+ Rafael_Cintron
14:02:48 RRSAgent, draft minutes
14:02:50 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
14:03:28 McCool has joined #webmachinelearning
14:04:05 Present+ Michael_McCool
14:04:17 Joshua_Lochner has joined #webmachinelearning
14:04:48 Mike_Wyrzykowski has joined #webmachinelearning
14:05:09 Topic: Incubations summary
14:05:18 anssik: we had an EU and APAC timezone friendly WebML CG Teleconference last week
14:05:22 -> https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-03-31-cg-minutes.md
14:05:27 anssik: key takeaways:
14:05:45 ... Proofreader API was discussed, positive sentiment from the group
14:05:53 ... this API will be proposed for CG adoption: https://github.com/webmachinelearning/charter/pull/11
14:05:53 https://github.com/webmachinelearning/charter/pull/11 -> Pull Request 11 Add Proofreader API to Deliverables (by anssiko)
14:06:15 anssik: I will send a call for review to the group's list soon
14:06:40 ... note that proofreading was in scope of the Prompt API, now we want to add an explicit task-specific API for it
14:07:09 ... we also discussed Writing Assistance APIs review feedback, noted the spec is in good shape, the most advanced in terms of spec maturity of all task-based APIs
14:07:26 ... also Prompt API feature requests were discussed
14:07:39 ... exposing max image / audio limits, preference to leave this feature out for now
14:07:51 RafaelCintron has joined #webmachinelearning
14:08:25 ... multimodal real-time capabilities, we saw a demo from Christian using cloud-based APIs, noted a gap of around 1 year between cloud-based APIs and task-based APIs in browsers
14:08:46 ... reviewed DOM integration proposal, the group wanted to see motivating use cases for the feature
14:08:57 ... our upcoming WebML CG meeting schedule is as follows, note we agreed to skip the next week's AMER:
14:09:19 ... - 28 April EU
14:09:19 ... - 13/14 May AMER
14:09:19 ... - 26 May EU
14:09:19 ... - 10/11 June AMER
14:09:19 ... - 23 June EU (tentative due to vacation period in the Northern hemisphere)
14:09:20 ... - 8/9 July AMER
14:09:56 Topic: AI Agents
14:10:18 anssik: Dom hosted an AI Agents W3C Breakouts session a few weeks ago
14:10:22 -> How would AI Agents change the Web platform? - Mar 26 https://www.w3.org/2025/Talks/dhm-ai-agents/
14:10:46 anssik: most recently AI Agents was discussed at the W3C Advisory Committee meeting, Apr 8, in context of the AI Impact on the Web discussion
14:10:56 ... topics:
14:11:07 ... - AI Browsers such as OpenAI Operator
14:11:16 ... - Model Context Protocol (MCP)
14:11:20 ... - Web Automators
14:11:24 ... - Assistive technology
14:11:50 ... risks:
14:12:06 zkis has joined #webmachinelearning
14:12:06 ... - security, hallucination, break out of the sandbox with prompt injection
14:12:14 ... - privacy, with another party in the mix
14:12:27 ... ecosystem:
14:12:37 ... - user intent dilution
14:13:04 ... - monetization with attention
14:14:01 anssik: I'd like to share a few active W3C workstreams connected with AI Agents
14:14:21 ... the WebML CG discussed a Prompt API feature request to add tool/function calling
14:14:31 ... this function calling proposal from Jul 2024 is basically a predecessor for MCP, introduced Nov 2024
14:14:35 -> https://github.com/webmachinelearning/prompt-api/issues/7
14:14:36 https://github.com/webmachinelearning/prompt-api/issues/7 -> Issue 7 Support for tool/function calling (by christianliebel) [enhancement]
14:15:12 anssik: the proposal is to allow the browser (extensions?) to provide standard functions to be called to augment the capabilities of an LLM model
14:15:17 ... a simple example would be e.g. a calculator function provided by the browser
14:16:22 Christian: this is still relevant, you can do function calling without AI Agents and vice versa, calling a JS function with a well-defined schema, use cases e.g. with form filling, you want to make sure the data is well formed and can be used in non-AI contexts
14:16:59 ... re Agentic AI, this is very WIP, so we're not behind in terms of web capabilities, now is a good time to explore this
14:17:50 anssik: there's also a more recent feature request to add explicit MCP support to Prompt API
14:17:53 -> https://github.com/webmachinelearning/prompt-api/issues/100
14:17:54 https://github.com/webmachinelearning/prompt-api/issues/100 -> Issue 100 [FR] Add MCP Support (by christianliebel)
14:18:28 Christian: you could add MCP support to your browser and expose certain functionality via tools, this is why tool calling is important, I think it should be implemented
14:19:25 ... MCP story is early, would be great if you could interact with the web site, extension could talk to the tab, this could be one functionality, Playwright MCP Server is a good example
14:19:41 ... this type of use case would be nice, thinking about whether the Prompt API is the right place to extend
14:19:42 https://modelcontextprotocol.io/introduction
14:20:02 anssik: MCP, Model Context Protocol, is like function calling with superpowers
14:20:10 ... resembles the traditional client-server model:
14:20:34 ... - MCP Server -- where the tools live, e.g. local calculator or remote weather lookup, web search etc.
14:20:54 ... - MCP Client -- connector usually part of the AI Agent, finds available tools, formats requests, communicates with the MCP Server
14:21:36 Christian: MCP is still in flux, it is not too late to think about how to integrate this into browsers
14:21:52 ... you talk to external systems, maybe on your local system, maybe remotely
14:22:10 MCP in agentic orchestration and other topics: https://huggingface.co/blog/Kseniase/mcp
14:22:18 -> A collection of MCP Servers https://mcp.so/
14:22:38 q?
14:23:01 Topic: Tensors for graph constants
14:23:05 anssik: issue #760 PR #830
14:23:06 https://github.com/webmachinelearning/webnn/pull/830 -> Pull Request 830 Allow tensors for graph constants. (by bbernhar)
14:23:06 https://github.com/webmachinelearning/webnn/issues/760 -> Issue 760 Support building graphs from `MLTensor` containing constants (by bbernhar) [feature request]
14:23:34 ... this issue was opened Sep 2024 and I felt now is the right time to discuss it again on this call
14:23:38 ... thanks Bryan for iterating on the PR and prototyping, and Austin for all the Chromium work!
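[Editor's note: the tool/function-calling idea discussed above (issue #7) — a JS function with a well-defined schema, e.g. a browser-provided calculator — can be sketched roughly as below. The names `registerTool` and `callTool` and the schema shape are illustrative only, not any proposed Prompt API surface.]

```javascript
// Hypothetical sketch: a registry of named tools with declared parameter
// schemas. A model's structured tool-call request is validated against the
// schema before dispatch, so malformed data never reaches the function.

const tools = new Map();

function registerTool(name, { description, parameters, fn }) {
  tools.set(name, { description, parameters, fn });
}

// Minimal structural check standing in for real JSON Schema validation.
function validateArgs(schema, args) {
  for (const [key, type] of Object.entries(schema)) {
    if (typeof args[key] !== type) {
      throw new TypeError(`argument "${key}" must be a ${type}`);
    }
  }
}

function callTool(name, args) {
  const tool = tools.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  validateArgs(tool.parameters, args);
  return tool.fn(args);
}

// The calculator example mentioned on the call.
registerTool("calculator", {
  description: "Evaluate a basic arithmetic operation",
  parameters: { op: "string", a: "number", b: "number" },
  fn: ({ op, a, b }) => (op === "add" ? a + b : op === "mul" ? a * b : NaN),
});

// A tool-call request, e.g. parsed from the model's structured output:
const result = callTool("calculator", { op: "add", a: 2, b: 40 });
```

As Christian notes, the same validation path is usable in non-AI contexts (e.g. form filling), since nothing above depends on a model being the caller.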
14:24:53 jsbell: there's agreement on the idea, bikeshedding on whether this is a new IDL type or a new property on an existing interface
14:26:16 anssik: I see 3 open conversations in the PR, would like to see if we have agreement on them
14:26:55 Ningxin: I think we could defer this until Bryan is on the call
14:27:16 Topic: Caching mechanism for MLGraph
14:27:25 anssik: issue #807
14:27:26 https://github.com/webmachinelearning/webnn/issues/807 -> Issue 807 Caching mechanism for MLGraph (by anssiko) [question] [feature request]
14:27:47 Subtopic: Explainer updates & migration
14:27:59 anssik: since our last discussion, we agreed cross-origin model sharing use cases are out of scope
14:28:07 ... we agreed to focus on same-origin caching of MLGraphs
14:28:20 ... this tighter scope aligns with the implementation intent and avoids privacy risks
14:28:52 ... we will use Reilly's explicit API as a starting point for the explainer that will document Chromium implementation experience, using the successful MLTensor explainer-Chromium prototyping feedback loop as a blueprint
14:29:15 ... given Reilly's Chromium experience, I'd ask Reilly to submit a PR for the explainer skeleton focusing on the same-origin case that we can iteratively advance
14:29:26 q?
14:29:29 ... all exploratory work (cross-origin, adapters etc.) will happen in a separate hybrid-ai repo to keep this WG focused on what is being implemented in browser engines
14:29:32 ack zkis
14:29:40 https://github.com/webmachinelearning/hybrid-ai/pull/16
14:29:41 https://github.com/webmachinelearning/hybrid-ai/pull/16 -> Pull Request 16 Create localcache.md (by mmccool)
14:29:42 q+
14:29:47 ack McCool
14:31:24 McCool: I created a local cache explainer, it has labels for discussion items
14:32:14 q+
14:32:31 ack RafaelCintron
14:32:55 RafaelCintron: I like the concept of having a PR to gather comments, how to provide feedback on that?
14:32:58 +1 to send the localcache.md PR to WebNN repo
14:33:04 +1
14:33:07 anssik: there will be a new PR
14:34:43 q?
14:34:45 +1 to discuss the explainer in WebNN WG, as cross-origin was marked out of scope in the PR, so it is in line with the WG
14:35:35 Zoltan: I checked this discussion and it looks pretty good, and I like that cross-origin is out of scope, Reilly taking the first stab on the PR and explainer SGTM
14:35:50 q?
14:36:08 Subtopic: Requirements from web frameworks
14:36:27 anssik: it was proposed by Ningxin we should look at what WebNN key customers i.e. web frameworks need from the caching mechanism and design toward those requirements
14:36:42 ... one such customer is ORT native, which has an Execution Provider context cache feature
14:36:49 q+
14:37:22 ... EP context cache attempts to solve the exact problem of compilation cost, notes most backend SDKs provide a feature to dump the pre-compiled model into a binary file that can be directly executed on the target device and as such improves session creation time
14:37:27 -> OnnxRuntime EP context cache https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html
14:37:41 anssik: what can we learn from the OnnxRuntime EP context cache design?
14:37:45 ack ningxin
14:37:59 ningxin: wanted to clarify this design is only for native
14:38:58 ningxin: this means we need to coordinate with ORT folks, EP context is a possible way to move forward, Execution Provider is based on vendors' SDKs, can provide compiled blobs to native apps to use those exported models with this EP context mode
14:39:33 ... this is the native use case, I think there's an opportunity to have a similar thing on the web, EP context is one ONNX op in the ONNX opset
14:40:07 ... we cannot import an ONNX model with a native binary on the web, but can think about a saved MLGraph being used for EP context, I think there's an opportunity to explore with Reilly's proposal
14:40:30 ... we experimented with this feature on native to see what data we could project onto the web
14:40:59 ... on an Intel platform using GPU and NPU, using the EP context feature we can accelerate session creation time, a 7x speedup with SD Turbo
14:41:16 ... for NPU an even greater speedup, 25x in session creation time
14:41:28 ... this is very promising data
14:41:30 q?
14:42:54 ningxin: prototype in Chromium is our next step, will share Chromium CLs in the spec issue
14:43:03 Subtopic: Related API proposals
14:43:16 anssik: there are other explorations in this space
14:43:20 -> Cross-Origin Storage API: https://github.com/explainers-by-googlers/cross-origin-storage
14:43:37 anssik: I believe discussion on COS API use cases and user research might be useful due to some overlap
14:44:13 jsbell: high-level, this is very-very-experimental, no implementation commitment yet
14:45:19 q+
14:45:43 Christian: AFAICT, this is in an early feedback gathering phase, user research going on
14:46:25 q+
14:46:35 ... example.org and example.com could not share the same model file, so the goal is to only require it to be downloaded once, Joshua L provided positive feedback, also the WebLLM project, but challenges remain
14:46:35 ack jsbell
14:46:57 jsbell: parallel exploration about whether this can be done at a higher level, "want a model good at translation"
14:47:34 ... this exploration was discussed at BlinkOn and an interesting tidbit was that the file-based mechanism bleeds into the compiled model mechanism
14:48:33 ... would be difficult to see how that could be done for ML models similarly to what we can do for Wasm modules
14:48:52 q?
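[Editor's note: ningxin's EP context numbers above (7x/25x faster session creation) come from skipping backend recompilation on later sessions. The compile-once pattern behind that can be sketched as below; the names and in-memory store are stand-ins, since a real implementation would persist the blob to disk (native) or origin-scoped storage (web).]

```javascript
// Illustrative sketch of the caching pattern behind ONNX Runtime's EP
// context cache: compile a model once, keep the compiled blob under a key,
// and on later session creation load the blob instead of recompiling.

const compiledCache = new Map();
let compileCount = 0;

function expensiveCompile(modelBytes) {
  compileCount += 1; // stands in for seconds of backend graph compilation
  return { blob: `compiled(${modelBytes})` };
}

function createSession(modelKey, modelBytes) {
  let compiled = compiledCache.get(modelKey);
  if (!compiled) {
    compiled = expensiveCompile(modelBytes);
    compiledCache.set(modelKey, compiled); // dump the pre-compiled blob
  }
  return { model: modelKey, exec: compiled.blob };
}

// First session pays the compile cost; the second is a cache hit.
createSession("sd-turbo", "weights...");
createSession("sd-turbo", "weights...");
```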
14:49:11 anssik: another proposal is the Cross-Origin Model Cache exploration by Michael, looking at cross-origin reuse and adapters
14:49:15 -> Cross-Origin Model Cache https://github.com/webmachinelearning/hybrid-ai/blob/main/proposals/cache.md
14:49:18 ack McCool
14:50:11 McCool: one comment comparing my proposal with the COS API, a cache is non-deterministic, not equivalent, the advantage of hashing is you can't change the data without changing the hash
14:50:21 +1 to Michael... I think these approaches are complementary, not in competition.
14:51:15 McCool: need to figure out local caching first and how that impacts other things, I think we need to explore file store and caching, what is unique to caching
14:51:53 jsbell: the caching and cross-origin storage proposals are not in conflict, that said, no implementation commitment for the latter at the moment
14:52:32 q+
14:52:39 ack McCool
14:53:03 McCool: the adapters feature is a separate issue, step 3
14:54:40 Topic: Query mechanism for supported devices
14:54:47 anssik: issue #815
14:54:47 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
14:54:58 (adapters might be step 2 if we build them on local caches and/or model storage)
14:55:23 anssik: I wanted to discuss any new use case feedback, Apple's device privacy considerations, address questions on the capacity concept
14:55:28 ... and review the latest iteration of the API proposal
14:55:55 ... we discussed Markus' feedback last time, is there any new information, or questions to Markus?
14:55:58 q?
14:56:07 -> https://github.com/webmachinelearning/webnn/issues/815#issuecomment-2758753704
14:56:08 https://github.com/webmachinelearning/webnn/issues/815 -> Issue 815 Query mechanism for supported devices (by anssiko) [device selection]
14:56:29 anssik: Mike responded to Markus and shared an example from the Apple device ecosystem, explaining that e.g. an 8-core and an 80-core Apple GPU are seen as identical devices for privacy reasons
14:57:06 Mike: for WebGPU and other Web APIs we limit exposure, so these look the same
14:58:45 anssik: this implies GPU capabilities may differ significantly, and on a lower core-count system an Apple framework might prefer to use the NPU for a better user experience, even if a developer would request a GPU
14:58:56 Mike_Wyrzykowski has joined #webmachinelearning
14:59:01 ... it is suggested the MLPowerPreference hint allows for this flexibility
14:59:21 ... another key piece of feedback is that excluding the CPU is not possible with Core ML APIs today
14:59:28 ... the suggestion is to allow Core ML to choose the device it thinks is the most suitable
14:59:37 ... lastly, there was a request for a sample that could be used to reproduce the case where the existing MLPowerPreference is not sufficient for this use case
14:59:48 anssik: Zoltan put together the latest iteration of the API proposal considering what Core ML supports:
14:59:53 ... - enumerate available compute devices (cpu, gpu, npu)
14:59:58 ... - limit the used compute devices (to cpu-only, cpu+gpu, cpu+npu, or auto).
15:00:08 -> https://developer.apple.com/documentation/coreml/mlcomputedevice/allcomputedevices
15:00:08 -> https://developer.apple.com/documentation/coreml/mlcomputeunits#Processing-Unit-Configurations
15:00:26 zkis: the latest proposal adds a devicePreference hint passed at context creation time:

```
const context = await navigator.ml.createContext({
  powerPreference: 'high-performance',
  devicePreference: "gpu-like", // or "cpu-only", defaulting to "auto"
});
```

15:00:56 zkis: two directions, Mike's proposal, or an even simpler version of what Reilly requested
15:01:34 q?
15:02:17 zkis: the original request is to be able to ask if the context has a GPU in any combination, I tried to satisfy that use case
15:02:31 ... if this is implementable on Apple platforms would be good to know
15:03:53 Mike: the need for an additional hint does not seem completely justified, adding it would mean we can't remove it ever, need stronger motivation to add it
15:04:15 q?
15:04:56 RRSAgent, draft minutes
15:04:57 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
15:40:10 RRSAgent, draft minutes
15:40:11 I have made the request to generate https://www.w3.org/2025/04/10-webmachinelearning-minutes.html anssik
17:06:35 Zakim has left #webmachinelearning