Meeting minutes
Repository: webmachinelearning/webmcp
Anssi: to start, we will welcome our latest new participants:
… Omri Belavad and Guilherme Gervasio joining as individual contributors
… welcome all to the WebML Community Group!
… as a reminder, we'll use IRC-based queue management in this meeting:
https://
Anssi: to suggest agenda topics, use the Agenda+ label -- we may also discuss the newly Agenda+ labeled issues #51 and #130 today
<gb> Issue 130 Tool unregistration design (by domfarolino) [Agenda+]
<gb> Issue 51 Define the API for in-page Agents to use a site's declared tools (by khushalsagar) [Agenda+]
WebMCP
Announcement: Awesome WebMCP
Anssi: we launched a new community-curated resource to recognize applicable contributions from the wider WebMCP ecosystem:
Tool infra for registerTool/unregisterTool
Anssi: PR #113
<gb> MERGED Pull Request 113 Add basic tool infra and complete `registerTool()` (by domfarolino) [Agenda+]
Anssi: we addressed all the review comments and landed the PR ahead of the meeting, great work
… incremental improvements will arrive in subsequent PRs
… to that end, I'll propose we move on to the next topic
… any comments?
Dominic: this has landed and other work is well under way
Declarative explainer
Anssi: PR #76
<gb> Pull Request 76 Declarative API Explainer (by domfarolino) [Agenda+]
Anssi: great review comments, thank you all
… does the group have specific proposals to discuss today?
… are there any objections to approving and merging the baseline explainer, on the condition that every open review comment gets a dedicated issue to ensure we follow them through?
Dominic: it seems like everything is resolved except the last few days' worth of comments, the JSON-LD discussion
… JSON-LD cross document ergonomics is an open issue for discussion
Alex: I did a prototype and found out there is probably a better primitive
Brandon: one thought I had, scraping the DOM for JSON-LD and using that as output carries with it an implicit assumption the agent is doing DOM scraping in addition to WebMCP
… possibly true for in-browser agents, but other agents don't want to do DOM scraping to get their output, they want to use WebMCP only
… having a native WebMCP way to get that state on a new page seems more friendly to agents who don't want to scrape the DOM
Alex: extension agents?
Brandon: also in-page agents
Dominic: that makes sense, I think the trick is that non-built-in browser agents will need some hack that tells them the context of the DOM is an answer to the previous tool call
… in JS the entire context gets reset after navigation
… JSON-LD only works for built-in agents, and even for that it is a bit hacky
… will file an issue for this
Anssi: issue #126 is a spin-off from PR #76, interest to discuss this issue today?
<gb> Issue 126 Where to fire `toolactivated` and `toolcanceled` events? (by domfarolino) [declarative]
Dominic: do we want to fire these on form elements or on the ModelContext object or Window
… to support these events on imperative tools, we cannot get away from this
Brandon: Dominic good points on your most recent comment, if imperative tools need to do anything they can just use the execute function
… putting the event on ModelContext object seems like the best design
<domfarolino> +1
RESOLUTION: The toolactivated and toolcanceled events are fired on the ModelContext object. (issue #126)
<gb> Issue 126 Where to fire `toolactivated` and `toolcanceled` events? (by domfarolino) [declarative]
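For illustration only, a minimal TypeScript sketch of what listening for these events on a ModelContext-like object could look like. `ModelContextSketch`, its `dispatch()` method and the `toolName` field are hypothetical stand-ins, not the specified interface; only the event names `toolactivated` and `toolcanceled` come from the resolution.

```typescript
// Hypothetical stand-in for the ModelContext object; NOT the specified interface.
type ToolEventName = "toolactivated" | "toolcanceled";

interface ToolEvent {
  type: ToolEventName;
  toolName: string; // assumed field: which tool the event refers to
}

type ToolEventListener = (event: ToolEvent) => void;

class ModelContextSketch {
  private listeners = new Map<ToolEventName, Set<ToolEventListener>>();

  addEventListener(type: ToolEventName, listener: ToolEventListener): void {
    const set = this.listeners.get(type) ?? new Set<ToolEventListener>();
    set.add(listener);
    this.listeners.set(type, set);
  }

  // Assumed hook: invoked by the agent integration when a tool run starts or is canceled.
  dispatch(event: ToolEvent): void {
    this.listeners.get(event.type)?.forEach((listener) => listener(event));
  }
}

// The page listens in one place, regardless of whether the tool is declarative or imperative.
const modelContext = new ModelContextSketch();
const seen: string[] = [];
modelContext.addEventListener("toolactivated", (e) => seen.push(`activated:${e.toolName}`));
modelContext.addEventListener("toolcanceled", (e) => seen.push(`canceled:${e.toolName}`));

modelContext.dispatch({ type: "toolactivated", toolName: "checkout" });
modelContext.dispatch({ type: "toolcanceled", toolName: "checkout" });
```

Firing on a single object means imperative tools need no form element to receive the events, which matches the point raised in the discussion.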
Agent allowlist use cases and requirements
Anssi: issue #116
<gb> Issue 116 Agent allowlist use cases and requirements (by anssiko) [Agenda+]
Anssi: I think the M:N problem illustrated by Yoav's comment informs this feature design:
webmachinelearning/
<gb> Issue 51 Define the API for in-page Agents to use a site's declared tools (by khushalsagar) [Agenda+]
Anssi: also Alex shared related learnings from the crypto space, see "Pointer Locks for Agents & Agent whitelisting"
webmachinelearning/
<gb> CLOSED Issue 43 Clarifying the scope of the proposal (by 43081j)
Anssi: for WebMCP too, we could have a future where we have in-page agents, iframe agents, extension-based agents and in-browser agents all living together on the web platform and talking WebMCP
… and all these agents would in principle know how to operate the tools provided by any website
… based on past learnings, it seems the agent allowlist mechanism should be considered together with an agent negotiation/lock mechanism #118
<gb> Issue 118 Agent negotiation (by yoavweiss) [backlog]
Anssi: I'll invite participants to share concrete use cases in this issue
Dominic: if we don't introduce an API that allows tools to be enumerated and listed, then monkey-patching will likely happen
… to avoid this registerTool problem, we can introduce a registry
… two sub-problems, A) do we want agent identifier or allowlist to register tools, only exposed to extension agent, in-browser agent etc.; B) or one registry that is visible to all tools
… second is concurrency problem, with pointer lock style as one possible solution
… blocking out all other agents when one agent is executing may be too heavy-handed a solution
… the simplest solution would be a global registry and design it so we can add on functionality to filter tools from specific agents
… when you call listTools(), you can view all global tools plus those tools exposed to your agent only
… I think this is a decent path forward
… that might help with concurrency management
Dominic: I think we're most excited to move forward with global registry of tools that can be extended with per-agent filter mechanism
… pointer lock where platform blocks other agents seems a bit heavy
Alex: that sums it up pretty well, want to highlight Khushal's proposal
… we will see many different agents interacting
Khushal: I was convinced it seems premature to solve filtering which tool goes to which agent; both things don't need to be solved in the same agent, and right now we just want to make sure this API is extensible to address both of these things: filtering, and ensuring only one agent is running at a time
Dominic: personally I'd like to see real concurrency handled by the web developer ultimately, basic global list of tools that extends to per-agent registry
Anssi: are we clear on the use cases?
Dominic: I think the use cases would need to be clarified when we augment the global registry, to understand multiple agents stomping on each other
<AlexN> +1
<kush> +1
Dominic: my plan is to spec this and make sure it is extensible for a filtering mechanism
Brandon: sounds good
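To make the direction above concrete, here is a hedged TypeScript sketch of a global tool registry that leaves room for a per-agent filter. Everything except the names `registerTool`, `unregisterTool` and `listTools` is an assumption for illustration: the `visibleTo` field, the `agentId` parameter and the duplicate-name error are not from the spec or the discussion.

```typescript
interface ToolEntry {
  name: string;
  description: string;
  // Assumed extension point: when present, only these agent ids may see the tool.
  visibleTo?: string[];
}

class ToolRegistrySketch {
  private tools = new Map<string, ToolEntry>();

  registerTool(tool: ToolEntry): void {
    // Assumption: duplicate names are rejected rather than silently overwritten.
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool "${tool.name}" is already registered`);
    }
    this.tools.set(tool.name, tool);
  }

  unregisterTool(name: string): void {
    this.tools.delete(name);
  }

  // Returns all global tools, plus any tools scoped to the calling agent.
  listTools(agentId?: string): ToolEntry[] {
    const result: ToolEntry[] = [];
    this.tools.forEach((tool) => {
      const isGlobal = tool.visibleTo === undefined;
      const isScopedToCaller =
        agentId !== undefined && (tool.visibleTo ?? []).indexOf(agentId) !== -1;
      if (isGlobal || isScopedToCaller) result.push(tool);
    });
    return result;
  }
}

const registry = new ToolRegistrySketch();
registry.registerTool({ name: "searchProducts", description: "Search the catalog" });
registry.registerTool({
  name: "checkout",
  description: "Complete the purchase",
  visibleTo: ["in-page-agent"], // hypothetical agent identifier
});

const forAnyone = registry.listTools().map((t) => t.name);
const forInPage = registry.listTools("in-page-agent").map((t) => t.name);
```

In this sketch the filter is purely additive: a registry that never sets `visibleTo` behaves as a plain global list, so the filtering mechanism can be introduced later without changing existing callers.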
Kryspin: we're talking about executing, how do we think about coordination, where there's hierarchy of agents, primary and flowing down from there
… we're talking about tools and concurrency, the tools themselves mutate state because the agent determines it needs to be called, e.g. call checkout function, that may be relevant for in-page agent to know it was called, both aware of the interaction happening
… shared tools are an intention and if in-browser agent calls a function it should bubble down, otherwise they're operating without knowledge of intention
… think shared table, where people are not seeing what others are doing
Dominic: if in-browser agent calls a checkout flow, maybe the in-page agent wants to know, it is useful, the same if the user clicks the checkout button in the UI
Kryspin: the in-page agent needs to know, to be able to act on the checkout function
Brandon: I think some of this is on web developer to solve, we have in-page agent, in-browser agent, to mitigate this is to build in-page agents in a way that's useful to both agents, if there's an action on the page they don't want in-browser agent doing, they shouldn't expose that tool
… need coordination to have the means to delegate between the two
Kryspin: ModelContext is not registering against an agent, it'd make sense for in-page agent to be given tools, that'd limit scope
… if no interaction between the two agents, shared prompt mechanism or means to play an assistant role to the other agent, not able to know what the in-browser is tasked to do, from the in-page agent side
… allowlist will solve this
Alex: on the point of sharing state across agents, there's a need for some context that does not need to be fetched via a tool call but is written from the app
… solves coordination issue, will open an issue about this
Mark: there are several topics here, one thing to think about is multiple agents concurrently modifying a page: allow feedback to be shared about what changed, to give direction for planning the next tool call
Kryspin: there should be a message bus, a messenger pattern, different agents having understanding what other agents are doing, more important than concurrency, in-page agents don't know what in-browser agent is doing
<AlexN> prior art: https://
Kryspin: people will do ambitious things, browser agent has more agency than in-page agents, and agents need to coordinate
Dominic: return value from the tool dispatched to all agents?
… all agents listening to that?
Kryspin: yes, this is the missing piece
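One way to picture the idea of dispatching a tool's return value to all listening agents is the small TypeScript sketch below. It is an illustration under stated assumptions, not a proposed API: `ToolCallBusSketch`, `subscribe`, `callAndBroadcast` and the agent identifiers are all hypothetical names.

```typescript
interface ToolCallNotice {
  toolName: string;
  calledBy: string; // assumed agent identifier
  result: unknown;
}

type NoticeHandler = (notice: ToolCallNotice) => void;

class ToolCallBusSketch {
  private handlers = new Map<string, NoticeHandler>();

  subscribe(agentId: string, handler: NoticeHandler): void {
    this.handlers.set(agentId, handler);
  }

  // Runs the tool, then reports the call and its return value to every subscribed agent.
  callAndBroadcast(calledBy: string, toolName: string, run: () => unknown): unknown {
    const result = run();
    this.handlers.forEach((handler) => handler({ toolName, calledBy, result }));
    return result;
  }
}

const bus = new ToolCallBusSketch();
const inPageAgentLog: string[] = [];

bus.subscribe("in-page-agent", (n) => inPageAgentLog.push(`${n.calledBy} called ${n.toolName}`));
bus.subscribe("in-browser-agent", () => {
  /* the caller may choose to ignore notices about its own calls */
});

// The in-browser agent runs checkout; the in-page agent learns about it without polling.
bus.callAndBroadcast("in-browser-agent", "checkout", () => ({ orderId: "demo-1" }));
```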
Victor: in-page agents, we're trying to solve the problems without good list of use cases for this class of problems
Kryspin: I agree with that, we started with tool calling and shared registry of tools
… a separate explainer would be great, help people coming from MCP space to WebMCP space
Dominic: we can add to an existing explainer
Kryspin: I will add to an existing explainer agent coordination section
Input schema validation ownership
Anssi: issue #92
<gb> Issue 92 Who owns the validation layer? (by MiguelsPizza) [Agenda+]
Anssi: Alex proposes that the browser do the input schema validation and return validation errors in a structured way
… the group seems to agree on that high-level point
… Dominic commented there are three validation layers: meta, input and output validation
… we seem to have an initial agreement that the browser should do three-layer validation
Anssi: do we want to discuss error behavior on the call today?
Dominic: meta validation, how strict do we want to be?
… JSON schema supports dynamic linking, for example
… what is the normal behaviour, for best alignment with other projects
Alex: standard schema is what the community has rallied around, it is the best resource
<Kryspin_Ziemski> https://
Dominic: what would the default look like?
Alex: you can write your own validator
Khushal: should ensure interop across all browser agents
Dominic: what keywords to support and expand from there
proposed RESOLUTION: Browser is responsible for input schema validation on all three layers: meta, input and output. TODO research JSON schema validation and codify a subset in the spec. (issue #92)
<domfarolino> +1
<kush> +1
<Kryspin_Ziemski> +1
<brwalder> +1
<AlexN> +1
RESOLUTION: Browser is responsible for input schema validation on all three layers: meta, input and output. TODO research JSON schema validation and codify a subset in the spec. (issue #92)
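As a rough illustration of the three layers named in the resolution, the TypeScript sketch below validates the schema itself (meta), the arguments (input) and the return value (output), and reports errors in a structured form. The supported keyword subset (`type`, `properties`, `required` with three primitive types), the error shape and all function names are assumptions made for this example; the actual subset is what the TODO asks to research.

```typescript
type Layer = "meta" | "input" | "output";
interface ValidationError { layer: Layer; path: string; message: string; }

type PrimitiveType = "string" | "number" | "boolean";
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: PrimitiveType }>;
  required?: string[];
}

const SUPPORTED_TYPES: string[] = ["string", "number", "boolean"];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Meta layer: is the schema itself well-formed within the supported subset?
function validateMeta(schema: unknown): ValidationError[] {
  const properties = isRecord(schema) ? schema.properties : undefined;
  if (!isRecord(schema) || schema.type !== "object" || !isRecord(properties)) {
    return [{ layer: "meta", path: "", message: "schema must be an object schema with properties" }];
  }
  const errors: ValidationError[] = [];
  Object.keys(properties).forEach((key) => {
    const prop = properties[key];
    if (!isRecord(prop) || typeof prop.type !== "string" || SUPPORTED_TYPES.indexOf(prop.type) === -1) {
      errors.push({ layer: "meta", path: key, message: "unsupported or missing type" });
    }
  });
  return errors;
}

// Input and output layers: does a value conform to an already meta-validated schema?
function validateValue(layer: "input" | "output", schema: ObjectSchema, value: unknown): ValidationError[] {
  if (!isRecord(value)) return [{ layer, path: "", message: "expected an object" }];
  const obj: Record<string, unknown> = value;
  const errors: ValidationError[] = [];
  (schema.required ?? []).forEach((key) => {
    if (!(key in obj)) errors.push({ layer, path: key, message: "missing required property" });
  });
  Object.keys(schema.properties).forEach((key) => {
    const prop = schema.properties[key];
    if (prop && key in obj && typeof obj[key] !== prop.type) {
      errors.push({ layer, path: key, message: `expected ${prop.type}` });
    }
  });
  return errors;
}

interface ToolSketch {
  inputSchema: unknown;
  outputSchema: unknown;
  execute: (input: Record<string, unknown>) => unknown;
}
type CallResult = { ok: true; output: unknown } | { ok: false; errors: ValidationError[] };

// Meta validation would more naturally run once at registration; it is inlined here for brevity.
function callWithValidation(tool: ToolSketch, input: unknown): CallResult {
  const metaErrors = validateMeta(tool.inputSchema).concat(validateMeta(tool.outputSchema));
  if (metaErrors.length > 0) return { ok: false, errors: metaErrors };
  const inputErrors = validateValue("input", tool.inputSchema as ObjectSchema, input);
  if (inputErrors.length > 0) return { ok: false, errors: inputErrors };
  const output = tool.execute(input as Record<string, unknown>);
  const outputErrors = validateValue("output", tool.outputSchema as ObjectSchema, output);
  return outputErrors.length > 0 ? { ok: false, errors: outputErrors } : { ok: true, output };
}

const addToCart: ToolSketch = {
  inputSchema: {
    type: "object",
    properties: { sku: { type: "string" }, quantity: { type: "number" } },
    required: ["sku", "quantity"],
  },
  outputSchema: { type: "object", properties: { cartSize: { type: "number" } }, required: ["cartSize"] },
  execute: (input) => ({ cartSize: input.quantity as number }),
};

const good = callWithValidation(addToCart, { sku: "abc", quantity: 2 });
const bad = callWithValidation(addToCart, { sku: "abc", quantity: "two" });
```

Because each error carries its layer and property path, an agent could tell a malformed tool definition apart from a bad argument and retry only in the latter case.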
provideContext overwrite
Anssi: issue #101
<gb> Issue 101 `navigator.modelContext.provideContext` allows overwriting of previously registered tools in the same environment (by beaufortfrancois) [Agenda+]
Anssi: since I put this on the agenda, Brandon shared motivation for the initial provideContext design:
webmachinelearning/
<gb> Issue 101 `navigator.modelContext.provideContext` allows overwriting of previously registered tools in the same environment (by beaufortfrancois) [Agenda+]
Anssi: and informed by this background and current focus on tools, Brandon proposed to drop provideContext/clearContext from the spec entirely
… and it seems the group agrees
Dominic: I put up a PR for this spec change and am also making the change in Chrome, I see no controversy in doing this
<domfarolino> webmachinelearning/
<gb> Pull Request 132 Remove `provideContext()` and `clearContext()` (by domfarolino)
<domfarolino> +1
<kush> +1
<Victor> +1
<AlexN> +1
RESOLUTION: Drop provideContext/clearContext from the spec as in PR #132. (issue #101)
Define the API for in-page Agents to use a site's declared tools
Anssi: issue #51 was proposed to the agenda yesterday
<gb> Issue 51 Define the API for in-page Agents to use a site's declared tools (by khushalsagar) [Agenda+]
<AlexN> +1
Anssi: this discussion was reignited by Yoav's illustration of the M:N problem, the complexity explosion that exists between "MCP Servers" and "MCP Clients" in the context of WebMCP when we use more loosely scoped definitions
… we revisited this in context of agent allowlist discussion:
webmachinelearning/
<gb> Issue 51 Define the API for in-page Agents to use a site's declared tools (by khushalsagar) [Agenda+]
Anssi: Brandon's proposal is to move listTools and executeTools to navigator.modelContext to allow non-browser agents to enumerate and execute tools
<kush> +1
<brwalder> +1
<domfarolino> +1
<qcomp> +1
<Victor> +1
<AlexN> +1
RESOLUTION: Move listTools and executeTools to navigator.modelContext to allow non-browser agents to enumerate and execute tools. (issue #51)
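A minimal sketch of how an in-page agent might use such methods, written in TypeScript against a mock object rather than the real `navigator.modelContext`. The class `PageModelContextSketch`, the singular method name `executeTool`, its signature, the synchronous return and the example tool are all assumptions for illustration; the minutes only establish that enumeration and execution move onto the model context object.

```typescript
interface PageTool {
  name: string;
  description: string;
  // Synchronous for brevity; a real tool would likely return a promise.
  execute: (input: Record<string, unknown>) => unknown;
}

// Mock of a modelContext-like object; NOT the real navigator.modelContext.
class PageModelContextSketch {
  private tools = new Map<string, PageTool>();

  registerTool(tool: PageTool): void {
    if (this.tools.has(tool.name)) throw new Error(`Tool "${tool.name}" is already registered`);
    this.tools.set(tool.name, tool);
  }

  // Exposes only metadata, so an agent can enumerate tools without touching their implementation.
  listTools(): { name: string; description: string }[] {
    const out: { name: string; description: string }[] = [];
    this.tools.forEach((t) => out.push({ name: t.name, description: t.description }));
    return out;
  }

  executeTool(name: string, input: Record<string, unknown>): unknown {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool "${name}"`);
    return tool.execute(input);
  }
}

// Site code declares a tool once.
const pageContext = new PageModelContextSketch();
pageContext.registerTool({
  name: "cartTotal",
  description: "Sum the prices of the items in the cart",
  execute: (input) => (input.prices as number[]).reduce((sum, p) => sum + p, 0),
});

// An in-page agent discovers and calls the same tool through the same object.
const available = pageContext.listTools().map((t) => t.name);
const total = pageContext.executeTool("cartTotal", { prices: [5, 7] });
```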
<reillyg> Great meeting!
<AlexN> Thanks anssik!