Semantics for the Agentic Web
Agenda
Chairs:
Penelope McLachlan
Description:
The concept of an "agentic web," where AI agents act on a user's behalf, is a growing topic of discussion. Currently, these agents often rely on brittle inference, parsing visual labels or class names to understand a website's functionality. This model is fragile: a simple A/B test or site redesign can break an agent, leading to high task failure rates, and for many tasks the acceptable failure rate is zero. The model can also be compute-expensive, as many implementations resolve ambiguities using a combination of DOM-tree and pixel scraping, greatly limiting the performance of agents that run on resource-constrained (e.g., on-device) models.
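To make that fragility concrete, here is a minimal sketch (illustrative only, not from the session materials) of how an agent heuristic keyed to class names or visible text breaks under a routine redesign:

```html
<!-- Before a redesign: an agent keyed to the class name or label finds the control. -->
<button class="checkout-btn">Buy now</button>

<!-- After an A/B test: same function, but both the class and the label changed,
     so a selector like ".checkout-btn" or a text match on "Buy now" silently fails. -->
<button class="cta-primary">Complete purchase</button>
```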
For human-facing assistive technologies, ARIA closed gaps in semantic HTML, creating a more robust experience for users of AT. Now, we face a similar question for machines.
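As a reminder of that pattern (a minimal sketch using standard ARIA attributes): a custom widget built from generic elements exposes no semantics until ARIA supplies a role, an accessible name, and focusability.

```html
<!-- A div styled as a button exposes no role or name to assistive technology. -->
<div class="btn" onclick="save()">Save</div>

<!-- ARIA closes the gap: role, accessible name, and keyboard focus are explicit. -->
<div class="btn" role="button" tabindex="0" aria-label="Save draft"
     onclick="save()">Save</div>
```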
This session is a general discussion to explore whether our existing semantic toolkit is sufficient for this potential use case.
- Is ARIA, which is designed for human accessibility, the right tool to serve machine agents? An interesting point of reference is this FAQ from ChatGPT, which states: "ChatGPT Atlas uses ARIA tags—the same labels and roles that support screen readers—to interpret page structure and interactive elements."
- What happens when we conflate these two distinct purposes?
- More importantly, are there semantic needs for agents that have no human-facing equivalent?
Consider hints that could improve agent reliability and safety, such as the following (a hypothetical markup sketch appears after this list):
- Explicitly flagging a button as a destructive action (e.g., distinguishing "Archive" from "Delete Permanently").
- Identifying page content as User-Generated Content (UGC) to allow agents to apply extra safeguards.
- Signaling transient states, like logged-in vs. logged-out status, to help an agent plan a task.
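For illustration only, here is one shape such hints might take as markup. The attribute names and values below (`agent-action`, `agent-content`, `agent-session`) are hypothetical: they are not part of HTML, ARIA, or any proposal under discussion.

```html
<!-- Hypothetical attributes; not part of HTML, ARIA, or any specification. -->

<!-- Flag a destructive action so an agent can pause and confirm with the user. -->
<button agent-action="reversible">Archive</button>
<button agent-action="destructive">Delete Permanently</button>

<!-- Mark a region as user-generated content so agents can apply extra safeguards. -->
<section agent-content="ugc">
  <p>Review text supplied by another user…</p>
</section>

<!-- Signal transient session state to help an agent plan a task. -->
<body agent-session="logged-out">
```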
This session is not about a specific proposal; it's about the core problem: does ARIA already cover all the hints a machine might need, or should we be considering an additional semantic layer for machines at all?
Let's come together to discuss the problem, the risks of both action and inaction, and the potential paths forward.
Additional co-chair (if registered): @chrishtr
Goal(s):
To collectively explore whether semantic HTML and ARIA are sufficient for AI agents, or whether a new, machine-facing semantic layer is needed for a reliable agentic web.