14:58:14 RRSAgent has joined #webmachinelearning
14:58:19 logging to https://www.w3.org/2025/10/02-webmachinelearning-irc
14:58:19 RRSAgent, make logs Public
14:58:20 please title this meeting ("meeting: ..."), anssik
14:58:27 Meeting: WebML CG Teleconference – 2 October 2025
14:58:36 Chair: Anssi
14:58:45 Agenda: https://github.com/webmachinelearning/meetings/blob/main/telcons/2025-10-02-cg-agenda.md
14:58:50 kush has joined #webmachinelearning
14:58:57 Scribe: Anssi
14:59:07 scribeNick: anssik
14:59:26 Present+ Anssi_Kostiainen
14:59:39 present+
14:59:45 Leo has joined #webmachinelearning
14:59:45 Regrets+ Kenneth_Christiansen
14:59:49 Present+ Khushal_Sagar
14:59:57 brwalder has joined #webmachinelearning
15:00:10 Present+ Alex_Nahas
15:00:13 RafaelCintron has joined #webmachinelearning
15:00:13 Present+ Fabio_Bernardon
15:00:24 Present+ Leo_Lee
15:00:37 Present+ Rafael_Cintron
15:00:54 Present+ Thomas_Steiner
15:00:57 RRSAgent, draft minutes
15:00:58 I have made the request to generate https://www.w3.org/2025/10/02-webmachinelearning-minutes.html anssik
15:01:48 Present+ Reilly_Grant
15:02:17 Present+ Rick_Viscomi
15:02:40 Present+ David_Bokan
15:03:29 Present+ Mari
15:03:32 Ehsan has joined #webmachinelearning
15:03:54 RRSAgent, draft minutes
15:03:55 I have made the request to generate https://www.w3.org/2025/10/02-webmachinelearning-minutes.html anssik
15:04:05 Present+ Jason_McGhee
15:04:18 Present+ Ehsan_Toreini
15:04:18 RRSAgent, draft minutes
15:04:19 I have made the request to generate https://www.w3.org/2025/10/02-webmachinelearning-minutes.html anssik
15:04:30 Present+
15:04:39 Anssi: first, please welcome
15:04:54 ... Mari, Jason Mayes and Mark Foltz from Google,
15:04:59 ... Henrik Edstrom from Autodesk,
15:05:07 ... Ranjith Raj, Luca Del Puppo, as individual contributors
15:05:11 ... to the WebML Community Group!
15:05:17 bokan has joined #webmachinelearning
15:05:58 ...
also Rick Viscomi from Google
15:06:19 Rick: I joined recently to look into WebMCP, work on Chrome DevRel, interested in this space and happy to follow along
15:07:27 Mari: part of the Chrome technical team, working in the agentic space, want to understand the scope of the work and future roadmap
15:07:59 rviscomi has joined #webmachinelearning
15:08:01 jason has joined #webmachinelearning
15:08:11 Topic: Updated WebML CG Charter operational
15:08:19 gb, this is webmachinelearning/charter
15:08:19 anssik, OK.
15:08:23 -> Charter https://webmachinelearning.github.io/charter/
15:08:28 Anssi: new charter is now operational as of 2025-09-25
15:08:33 ... Changelog is simply "Add WebMCP API as a new deliverable"
15:08:37 ... thank you everyone for your support!
15:08:47 ... any questions?
15:09:02 -> https://webmachinelearning.github.io/incubations/
15:09:18 Topic: WebMCP API
15:09:23 gb, this is webmachinelearning/webmcp
15:09:23 anssik, OK.
15:09:47 Anssi: next, we will continue our brainstorming to build shared understanding of the key issues, solutions, and solicit new ideas
15:09:54 ... we touch on the issues we deferred from our last call, and also revisit global name bikeshedding time allowing
15:10:08 ... as a reminder, please apply Agenda+ GH label to issues and/or PRs you'd propose we discuss on our meetings, also feel free to remove the label as appropriate
15:10:14 -> Agenda+ label https://github.com/webmachinelearning/webmcp/labels/Agenda+
15:10:18 Anssi: I will consider all proposals and also update announced agendas when needed
15:10:29 Subtopic: Capability discovery
15:10:34 Anssi: issue #8
15:10:35 https://github.com/webmachinelearning/webmcp/issues/8 -> Issue 8 Should tools be a means for capability discovery? (by bokand) [Agenda+]
15:10:58 ... David asks should tools be a means for capability discovery for an agent?
15:11:05 ... it looks like the declarative API that would complement the imperative API would address this issue?
15:11:10 ...
are we ready yet to make a resolution?
15:11:12 q+
15:11:22 ack kush
15:12:13 Khushal: declarative makes it easy to index and crawl the site
15:12:34 dbokan has joined #webmachinelearning
15:12:58 ... wanted to have a separate issue to establish that goal, I had some comments on declarative API that adds attributes to existing markup, it helps but is not enough, does not help with JS specific functionality, Microsoft folks had a proposal to address that, a manifest-based proposal
15:13:19 ... I don't have an opinion on the API shape, but can take a decision that tools are also indexable
15:13:20 q?
15:13:37 David: I have a little bit of reservation on tools being context dependent, no concerns otherwise
15:15:09 +1
15:15:14 +1
15:15:15 yup, lgtm
15:15:16 +1
15:15:19 +1
15:15:27 RESOLUTION: The group wants to make the tools be part of the discovery mechanism and continues to explore and prototype API shapes that satisfy this requirement. This includes the declarative API proposal that complements the imperative API, as well as the JSON manifest, with pros/cons documented.
15:15:45 Subtopic: API to list registered tools
15:15:51 Anssi: issue #16
15:15:52 https://github.com/webmachinelearning/webmcp/issues/16 -> Issue 16 Add API to list / execute tools? (by bokand) [Agenda+]
15:16:17 q+
15:16:21 ... David proposes "an API to list out the registered tools and be able to execute them by name and argument dictionary would be useful for external agents (e.g. provided via extensions or third-party libraries)"
15:16:31 ... Khushal points out WebMCP for Service Workers session management intersects here
15:16:37 -> Session management https://github.com/webmachinelearning/webmcp/blob/main/docs/service-workers.md#session-management
15:16:43 Anssi: there's a concern multiple agents could stomp on each other
15:16:53 ... Brandon proposes a lock mechanism similar to Pointer Lock that only one user or agent can hold at a time
15:17:17 ...
Ilya suggests we need an API / listener that agent can subscribe to for updates to the list of tools, similarly to MCP's "notifications/tools/list_changed" method, this avoids the need to poll for changes in a loop
15:17:37 ... Jason proposes an alternative where (un)registerTool communicate tools changes, noting Ilya's proposal aligns with MCP and makes more sense
15:18:09 ... Khushal points out we haven't explored integration with non-browser agents that could interact with web pages via extension APIs, or Chrome DevTools Protocol (CDP) for automation use cases
15:18:29 q?
15:18:37 ack kush
15:18:59 Khushal: when I filed this issue, had implicit assumption how agents would use this API, my thinking has evolved since
15:19:18 ... two options exist:
15:19:23 ... "1. The Agent executes script on the web page and discovers tools using the same Web APIs through which the site is declaring them to the browser."
15:19:26 ... "2. The API surface the Agent is using (an extension API or chrome devtools protocol) provides higher level hooks to connect WebMCP with the Agent."
15:20:01 ... when this API is web-ified further, iframe-embedding might have specific constraints or requirements
15:20:25 ... these policies will be implemented by the engine, expecting user-land code to replicate this properly for security policy is risky
15:20:52 ... we haven't explored standard API surface how the browser exposes WebMCP to 3P
15:20:53 q?
15:21:08 q+ to ask a question
15:21:10 ack anssik
15:21:10 anssik, you wanted to ask a question
15:21:15 q+
15:21:18 q+
15:21:20 ack AlexN
15:21:25 dbokan has joined #webmachinelearning
15:21:39 rviscomi has joined #webmachinelearning
15:22:04 Alex: from my perspective, we inject JS and SOP violations exist with that approach, what the API for those tools would look like, listing tools would be compilation of all?
15:22:13 ...
can open a new issue
15:22:20 ack brwalder
15:22:54 Brandon: +1 to Khushal's point that having a well-defined API to expose 3P to manage security boundary is a great idea, rather than inject JS to the page
15:23:17 ... to Alex's point which tools the agent would get from the list, which Service Worker, I guess this hasn't been yet well refined
15:23:42 ... inspired by what VSCode's agent does, the user chooses what context the agent has, currently open file or currently open file + other relevant files
15:23:55 q+
15:24:07 ... translating that to the browser world, agents could choose which tabs they interact with and only get tools from those tabs
15:24:21 ... also consider Service Worker-based approach
15:24:26 ack jason
15:25:28 q+
15:25:31 Jason: I was thinking it could be request-based, if the agent requested access to the current tab vs. all the tabs, think collaborative apps across tabs, each with separate contexts, do you need three separate contexts or one that has access to all and asks the user
15:25:47 ... not necessarily specific tabs, more packaged versions we might expose for the same domain
15:25:48 q?
15:25:53 ack reillyg
15:26:30 Reilly: I'm not sure we need to do anything specific here besides providing agent tools, extension authors are familiar with crossing extension boundaries
15:26:35 ... I'm not sure this is any different
15:26:37 q?
15:26:49 q+
15:26:59 Reilly: browsers include UI granting extensions access to the page on user activation
15:27:00 q?
15:27:02 ack AlexN
15:27:28 Alex: only thing I'd say is there's a risk when multiple agents inject the same JS, things can get messy with multiple agents interacting at the same time
15:27:41 ... see MetaMask extension for similar incidents
15:27:55 ... there are race conditions and such
15:28:09 q+
15:28:11 q?
15:28:11 A simpler example might be multiple password managers
15:28:17 ack kush
15:28:45 Khushal: since we're discussing how well extensions interact with WebMCP, or purpose-built extensions or CDP APIs, is there adequate understanding to make a call on that?
15:28:58 ... Reilly thinks injecting JS is OK?
15:29:22 Reilly: having extension-specific APIs is fine, wanted to point out the SOP violation is not any different from what browsers already allow for extensions in general
15:29:35 Khushal: could we add this to the JS API is the question
15:29:47 q+
15:29:51 Reilly: not going to add APIs for injecting script
15:30:55 ... enumerate tools without inserting script, we'd rather use the non-script injection path, not add additional capabilities to make the script injection path easier
15:31:06 ack dbokan
15:31:29 q+
15:31:30 David: with extension API web-based libraries could add to your page and interact with your page via WebMCP, is that an important use case?
15:31:31 q?
15:31:33 ack brwalder
15:32:08 Brandon: for web frameworks like that there's a workaround that the framework can manage the tool set and maintain references, maybe the web framework problem is solved?
15:32:23 David: I suppose so
15:32:24 q?
15:32:31 Maybe I'm just missing something here - but IIUC reillyg (please correct me if i'm not understanding) isn't suggesting we use the extensions api for webmcp
15:32:56 +1 to extension and CDP APIs
15:32:59 +1
15:33:06 +1 for both
15:34:05 Reilly: to answer Jason, many APIs to access all the sites, built-in agent can see the tools, if there's a built-in agent, you can have extension that can enumerate tools, maybe CDP API, or the browser itself provides an MCP server for listing tools
15:34:25 Jason: makes sense, thanks
15:35:52 q+
15:35:59 ack brwalder
15:36:17 Brandon: suggest "connecting WebMCP with external agents"
15:36:32 q+
15:36:44 ack dbokan
15:37:06 David: is the resolution saying we as part of WebMCP will be doing that, it seems a bit external to the API itself?
15:37:12 q+
15:37:18 ack reillyg
15:37:19 +1
15:37:30 "Javascript injection by external Agents to interact with WebMCP is not supported."
15:37:33 q+
15:37:47 Reilly: I think that it's within our scope to define a WebDriver API or web extensions API to link these things
15:37:58 ... probably outside the WebMCP spec, closer to MCP spec
15:38:16 ... WebMCP server connecting with MCP, so agents don't have to act differently across browsers
15:38:17 q?
15:38:56 sohum has joined #webmachinelearning
15:39:05 proposed RESOLUTION: The group looks into higher-level hooks to connect WebMCP with external agents for listing tools. This reduces coupling with MCP and subsequently browser implementation complexity. Javascript injection by external Agents to interact with WebMCP is not supported.
15:39:42 +1
15:39:47 +1
15:39:51 +1
15:39:54 RESOLUTION: The group looks into higher-level hooks to connect WebMCP with external agents for listing tools. This reduces coupling with MCP and subsequently browser implementation complexity. Javascript injection by external Agents to interact with WebMCP is not supported.
15:39:54 +1
15:39:57 +1
15:40:06 Subtopic: Elicitation
15:40:10 Anssi: issue #21
15:40:11 https://github.com/webmachinelearning/webmcp/issues/21 -> Issue 21 Elicitation (by bwalderman) [Agenda+]
15:40:36 ... Brandon is "Gathering thoughts on supporting MCP elicitation since this would be a good way to bring the user's attention to a tab if the agent determines that their input is needed."
15:40:39 q+
15:40:41 -> MCP Elicitation https://modelcontextprotocol.io/specification/draft/client/elicitation
15:40:52 Anssi: missing feature is how to inform the agent elicitation is happening on-page
15:41:13 ... Khushal's initial idea was to use "needsUserInput" tool annotation, but this does not account for conditionality
15:41:35 ... how to mitigate abuse case where a site grabs user's attention too much a la popups
15:42:28 ... elicitation control flows differ between WebMCP (client) and MCP (remote server); client as an arbitrator knows where we're at with user input, while in MCP server case the remote server manages user input
15:42:45 Anssi: Alex proposes to align with the MCP spec and defer elicitation to call resolution to the client
15:42:58 ... deferred elicitation is better for non-human-in-the-loop use cases e.g. automation
15:43:32 ... we resolved earlier to focus on human in the loop use cases, nevertheless deferred elicitation would enable forwards compatibility when we get to those automation use cases
15:43:40 -> 2025-09-18 RESOLUTION: WebMCP focuses on human in the loop use cases initially. https://www.w3.org/2025/09/18-webmachinelearning-minutes.html#54e1
15:43:54 Anssi: Ilya from Shopify provides important e-commerce related feedback:
15:44:18 ... WebMCP must enable user input and allow review for e.g. terms, disclosures, liability & compliance requirements
15:44:36 ... in the most recent comment, Khushal summarized the latest high-level design, rephrased:
15:44:46 ... 1) Site <- WebMCP API -> Browser
15:44:56 ...
2) Browser <- another API -> Agent
15:45:18 ... 3) WebMCP mirrors MCP concepts, if they map well to Web API friendly abstractions
15:45:31 ... elicitation is needed for (1) for sure
15:45:41 q?
15:46:00 ack kush
15:46:25 Khushal: we recognize there are cases where the user needs to interact with the site in the middle of tool execution
15:46:39 ... only way to do that is when the user interacts with the site, then annotation is enough
15:47:02 ... but we realized that dynamically during execution we may find out the same
15:47:12 ... which entity needs to know user attention is required?
15:47:35 q+
15:47:37 ... built-in agent can background a tab, how to handle that case here?
15:48:09 ... Ilya noted we're seeing a usage pattern where there's browser usage in a VM, how to manage that, then the discussion went into what is the browser and agent connection looking like
15:48:27 ... whatever API is powering the interaction, it needs to be able to communicate if user interaction is happening
15:48:38 ... what is the connection between user and agent to use WebMCP
15:49:09 ... to minimize the problem space, can we avoid the second part, and decide how the WebMCP talks to browser when it needs elicitation
15:49:20 q?
15:49:25 ack brwalder
15:50:01 Brandon: parallels to popup issue, if an agent needs to elicit input from the user, and user's tab is backgrounded, that is attention-grabbing behavior we want to avoid
15:50:35 ... browser UI does not need to foreground the tab to do so, perhaps flashing the tab would work as a mechanism to let the tool call yield control to the user
15:51:05 ... maybe during the tool call an API on the WebMCP object, JS can tell the browser to give mouse and keyboard control back
15:51:49 ... lock mechanism that allows signaling to the agent that the human is giving input is one possible approach, not sure how that'd look for declarative API
15:51:50 q?
15:51:51 q+
15:51:54 ack AlexN
15:52:13 Alex: I have it in the declarative explainer PR, mark certain inputs as requiring human input
15:52:32 ... that pops up an alert, rough proposal, feedback welcome
15:52:46 ... PR #26
15:52:46 https://github.com/webmachinelearning/webmcp/pull/26 -> Pull Request 26 add explainer for the declarative api (by MiguelsPizza)
15:52:54 q?
15:53:14 q?
15:53:59 Khushal: not sure what the API shape to lock looks like, perhaps resolve with "Tool execution should be able to yield to the user."
15:54:17 q+
15:54:25 Brandon: there should be some way for the agents to yield control to user and agent should know when that happens so they can update their UI accordingly
15:54:42 Khushal: the user needs to take over and there's some browser UI "I'm taking over"?
15:55:13 Brandon: I can add more detail with a comment in this issue, thinking about an imperative approach, some sort of JS API that pauses the agent
15:55:27 ... when the user has clicked "submit" it resumes the agent
15:55:49 Khushal: I thought exec would be an async function
15:56:21 Brandon: tool functions are async, but they'd need multiple user interactions, dedicated API to pause and resume the tool call function and defer promise resolution might be necessary
15:56:47 Khushal: how about resolving on the high-level idea: "should be able to pause and resume during tool execution"
15:57:42 "Tool execution should be able to start/stop yielding to the user throughout its lifecycle."
15:57:55 Reilly: I'd add, the high-level need as you said is to have a way for the Agent to say I'd like to interact here, mitigate abuse including an option for the agent to tell the browser when I'm executing the tool to not let the tool ask for user input, up to agent to mitigate from having abuse
15:58:26 Khushal: for popups you have an option "do not let this site create popups", the user makes the judgment call
15:58:58 +1
15:59:03 +1
15:59:14 +1
15:59:18 RESOLUTION: Tool execution should be able to start/stop yielding to the user throughout its lifecycle.
16:00:03 Subtopic: Bikeshedding the global name
16:00:07 Anssi: issue #24
16:00:09 https://github.com/webmachinelearning/webmcp/issues/24 -> Issue 24 Bikeshedding the global name (by bwalderman) [Agenda+]
16:00:11 -> Earlier discussion from 2025-09-18 telcon https://www.w3.org/2025/09/18-webmachinelearning-minutes.html#a05d
16:00:43 Brandon: it seems we're converging on navigator as the home, should have a name .modelContext or .agentContext
16:00:55 ... not just "tools" or "resources" but others too
16:00:56 q+
16:01:15 ... which way to go, modelContext or agentContext
16:01:49 ... another issue #31 for agent-to-agent interaction
16:01:50 https://github.com/webmachinelearning/webmcp/issues/31 -> Issue 31 Support agent to agent interaction (by khushalsagar)
16:01:53 +1 to modelContext
16:02:02 ... modelContext might be better considering A2A future
16:02:09 +1
16:02:09 +1
16:02:13 +1
16:02:29 +1
16:03:25 Reilly: model as a term is overloaded
16:04:09 I have a coin if we need to flip for it
16:04:31 q+
16:04:38 ack kush
16:04:58 Khushal: the implementation in Chrome needs some name we can give to developers as a source of truth
16:05:10 q+
16:05:14 ... I've been writing the word "agent" and confuse it with "user agent"
16:05:17 agentModelContext :P
16:05:58 Reilly: I can live with navigator.modelContext
16:06:22 ...
I feel like the user agent containing the model is what the API is about
16:06:47 q-
16:06:52 q-
16:06:52 ack reillyg
16:07:10 +1
16:07:11 +1
16:07:11 +1
16:07:14 +100
16:07:15 +1
16:07:16 +1
16:07:39 RESOLUTION: navigator.modelContext is the "root" object name
16:07:46 RRSAgent, draft minutes
16:07:47 I have made the request to generate https://www.w3.org/2025/10/02-webmachinelearning-minutes.html anssik
16:07:54 Thanks all!
16:09:14 s/future looking /future
16:10:37 s/easy index/easy to index
16:11:06 s/markup helps/markup, it helps
16:11:19 s/Msft/Microsoft
16:11:44 s/no concern/no concerns
16:12:39 s/listered/listener
16:13:22 s/when filed/when I filed
16:13:45 s/implicit assumption/had implicit assumption
16:14:07 s/then this/when this
16:15:44 s/refined yet/refined
16:16:20 s/each separate/each with separate
16:16:34 s/saparate/separate
16:17:31 s/thinks can/things can
16:17:51 s/multiple agent interacting/multiple agents interacting
16:18:54 s/is any different from/is not any different from
16:20:39 s/external egants/external agents
16:22:44 s/device/decide
16:23:17 s/to so so/to do so
16:24:24 s/do in more detail/add more detail
16:25:14 s/for pause/to pause
16:26:27 RRSAgent, draft minutes
16:26:28 I have made the request to generate https://www.w3.org/2025/10/02-webmachinelearning-minutes.html anssik
18:30:26 Zakim has left #webmachinelearning
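
Editor's sketch (not part of the logged discussion): the issue #16 conversation asks for a way to list registered tools, execute them by name with an argument dictionary, and subscribe to list changes instead of polling. The TypeScript below is one minimal way those three ideas could fit together. Every name in it (ToolRegistry, onListChanged, executeTool, and so on) is hypothetical and chosen for illustration only; none of it is a WebMCP, MCP, or browser API, and the group has not resolved on any API shape.

```typescript
// Hypothetical in-memory model of a tool list with change notifications.
// All identifiers are illustrative assumptions, not a WebMCP or MCP API.
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<unknown>;
type ListChangedListener = (toolNames: string[]) => void;

class ToolRegistry {
  private tools = new Map<string, ToolHandler>();
  private listeners = new Set<ListChangedListener>();

  // Registering a tool notifies subscribers, so nobody has to poll.
  registerTool(name: string, handler: ToolHandler): void {
    this.tools.set(name, handler);
    this.notify();
  }

  // Only notify when a tool was actually removed.
  unregisterTool(name: string): void {
    if (this.tools.delete(name)) {
      this.notify();
    }
  }

  // Snapshot of the currently registered tool names.
  listTools(): string[] {
    return Array.from(this.tools.keys());
  }

  // Subscribe to list changes; the returned function unsubscribes.
  onListChanged(listener: ListChangedListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Execute a tool by name with an argument dictionary.
  async executeTool(name: string, args: ToolArgs): Promise<unknown> {
    const handler = this.tools.get(name);
    if (!handler) {
      throw new Error(`Unknown tool: ${name}`);
    }
    return handler(args);
  }

  private notify(): void {
    const names = this.listTools();
    this.listeners.forEach((listener) => listener(names));
  }
}
```

A caller would construct a ToolRegistry, call onListChanged once, and then react to each snapshot of tool names as tools are registered or unregistered. Per the resolution above, an external agent would reach such a list through higher-level hooks (for example an extension or CDP API) rather than by injecting script into the page.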