Semantics for the Agentic Web

Event details

Date:
Japan Standard Time
Status:
Tentative
Location:
R01
Participants:
Penelope McLachlan, Vincent Scheib
Big meeting:
TPAC 2025

The concept of an "agentic web," where AI agents act on a user's behalf, is a growing topic of discussion. Currently, these agents often rely on brittle inference, parsing visual labels or class names to understand a website's functionality. This model is fragile: simple A/B tests or site redesigns can break an agent, leading to high task failure rates, and for many tasks the acceptable failure rate is zero. The model can also be compute-expensive, as many implementations resolve ambiguities by combining DOM-tree parsing with pixel scraping, greatly limiting the performance of agents that run resource-constrained (e.g., on-device) models.
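
To make the fragility concrete, here is a minimal sketch of how a routine redesign can strand an agent that anchors on styling hooks. The markup and class names are invented for illustration:

    <!-- Before a redesign: an agent locates the purchase action
         by matching on a styling class. -->
    <button class="btn-buy-primary">Buy now</button>

    <!-- After the redesign: identical behavior for human users, but
         the class the agent matched on is gone, so the task fails. -->
    <button class="cta cta--filled">Buy now</button>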

For human-facing assistive technologies, ARIA closed the gaps in semantic HTML, creating a more robust experience for assistive-technology users. Now we face a similar question for machines.
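
As a reminder of how that worked for assistive technology, ARIA lets a custom control expose the role and state its markup lacks. A minimal sketch, using standard ARIA attributes on invented markup (toggleMute is a placeholder handler):

    <!-- Without ARIA: a styled <div> exposes no role or state, so a
         screen reader perceives only generic text. -->
    <div class="toggle" onclick="toggleMute()">Mute</div>

    <!-- With ARIA: the same widget reports a role and a toggle state. -->
    <div class="toggle" role="button" aria-pressed="false" tabindex="0"
         onclick="toggleMute()">Mute</div>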

This session is a general discussion to explore whether our existing semantic toolkit is sufficient for this potential use case.

  • Is ARIA, which was designed for human accessibility, the right tool to serve machine agents? An interesting point of reference is OpenAI's FAQ for ChatGPT Atlas, which states: "ChatGPT Atlas uses ARIA tags—the same labels and roles that support screen readers—to interpret page structure and interactive elements."
  • What happens when we conflate these two distinct purposes?
  • More importantly, are there semantic needs for agents that have no human-facing equivalent?

Consider hints that could improve agent reliability and safety, such as:

  • Explicitly flagging a button as a destructive action (e.g., distinguishing "Archive" from "Delete Permanently").
  • Identifying page content as User-Generated Content (UGC) to allow agents to apply extra safeguards.
  • Signaling transient states, like logged-in vs. logged-out status, to help an agent plan a task.
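
None of these hints exists in HTML or ARIA today. Purely as a discussion aid, here is a hypothetical sketch of what such hints could look like; the data-agent-* attribute names are invented and not part of any specification:

    <!-- Hypothetical: distinguishing destructive from reversible actions. -->
    <button data-agent-action="reversible">Archive</button>
    <button data-agent-action="destructive">Delete Permanently</button>

    <!-- Hypothetical: marking a region as user-generated content. -->
    <section data-agent-content="ugc">…</section>

    <!-- Hypothetical: exposing transient session state. -->
    <body data-agent-session="logged-out">…</body>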

This session is not about a specific proposal; it's about the core problem. Does ARIA already cover all the hints a machine might need, or should we be considering an additional semantic layer for machines at all?

Let's come together to discuss the problem, the risks of both action and inaction, and the potential paths forward.

Additional co-chair (if registered): @chrishtr

Agenda

Chairs:
Penelope McLachlan

Description:
See the session description above.

Goal(s):
To collectively explore whether semantic HTML and ARIA are sufficient for AI agents, or whether a new, machine-facing semantic layer is needed for a reliable agentic web.

