W3C

– DRAFT –
AGWG Teleconference

16 June 2026

Attendees

Present
Adam, alastairc, AshleeF, AWK, Azlan, BrianE, CClaire, Charles, Detlev, Eloisa, filippo-zorzi, Francis_Storr, graham, GreggVan, HaTheo, hdv, Heather, Jen_G, Jennie_Delisi, JeroenH, jtoles, julierawe, kevin, kirkwood, Laura_Carlson, LenB, LoriO, Makoto_U, nattarnoff, rayianna, sam-estoesta, scott, shadi, ShawnT, Stephanie, wendyreid
Regrets
BruceB, Gundula
Chair
alastairc
Scribe
Heather, Eloisa

Meeting minutes

<JeroenH> present

WCAG-EM 2 CFC

alastairc: Invites anyone who is new or has a change of affiliation to introduce themselves; no takers.

<alastairc> Diff version for those who appreciate it, from the previous draft: https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fw3.org%2FTR%2Fwcag-em-2&doc2=https%3A%2F%2Fw3c.github.io%2Fwai-wcag-em%2F

<JeroenH> New version to be found at https://www.w3.org/TR/wcag-em-2/

<alastairc> Change notes list: w3c/wai-wcag-em#since-publication-of-wcag-em-2-draft-note

hdv: Providing an overview of what people can expect from the changes. Working on updates to the WCAG evaluation methodology; currently published as a draft note. Small changes around language and spelling. Also bigger changes, such as words that are changing, to make it work for a broader range of things. The WCAG-EM was very focused on websites, it
… talks about views, so it can be more easily used with things that aren't websites, but are regulated, like apps, kiosks, etc. This document is an evaluation methodology; it explains how you can go about evaluating if something meets WCAG via sampling methodology. Draft was published in October. The name is now WCAG Evaluation Methodology. We changed how the view definition works, which is now defined in WCAG 3, so we don't
… have to manage that in two places. Plan is to publish this as a note, and there will be a CFC that you can respond to. The editors are working on the document currently. If there's anything here that people don't like, folks are actively working on it. Any further callouts the group is happy to address. CFC should come out this week for it. Happy
… to answer questions about any of this.

<hdv> https://w3c.github.io/wai-wcag-em/

AWK: Can someone post the link of the document?

alastairc: Links are posted in IRC.

Paths sub-group starting

alastairc: All links will be in the CRC as well.

Many levels results

alastairc: We looked at scoring, we've got the many levels. The next step on that was continuing meetings; Monday meetings. If anyone wants to put together a proposal for that, we are accepting. Using paths for scoping the scoring aspect. Email the chairs. If you have someone you want to contribute, email the chairs, the work will begin next week.

alastairc: Will be handing over to Adam; please remember to get into queue and provide a note for the question or topic.

AWK: Previous topic question. Has it been published as a note, or is it something that we are going to have a CFC coming up to publish it as a note?

alastairc: Both. It was published as a draft note and then the CFC will be to publish it as a full note.

alastairc: Pass over to Adam for the next topic.

<Adam> https://docs.google.com/presentation/d/1f0dX6sfgx-4yq7e7vRL_aFkfZBcgKdYVu7R6gfqZK9A/edit?usp=sharing

Slideset: https://docs.google.com/presentation/d/1f0dX6sfgx-4yq7e7vRL_aFkfZBcgKdYVu7R6gfqZK9A/edit?usp=sharing and archived PDF copy

[Slide 1]

[Slide 2]

Adam: Background: A few weeks ago, we had the epic 2-day meeting series focused on conformance. This was one of the survey questions leading up to that meeting. The primary question is "Is this a useful idea for conformance?" Survey results were evenly split. See slide 1.

[Slide 3]

Adam: In the 2-day meeting we resolved to set up a small team to explore the idea a little bit more and ask if we had a conformance model with multiple levels, how would we get there and what would it look like, and would it be viable?

Adam: [Reading slide 3] Asks the group to focus on the approach rather than the details.

[Slide 4]

Adam: Instead of answering a big question, we want to break it down with smaller, easier questions.

[Slide 5]

[Slide 6]

Adam: Breaking down questions doesn't mean they are easy, since the questions may also be loaded or subjective. Let's look at the worst case scenario to try to be protective.

[Slide 7]

Adam: Now let's turn those questions into rateable 'dimensions'. The first is called user need, with 4 ratings. Each of these ratings is assigned a numeric value. Similarly, user harm is a potential rating, going from high value to low value, but in the case of the author, we are now emphasizing things that are easy to do because they are free, or easy, or
… trivial.

[Slide 8]

Adam: Repeat the user need dimension for key user groups. Each table covers a 'big bucket' of disabilities.

Adam: The user need dimension is repeated across all the groups, and each group now can be rated separately.

[Slide 9]

Adam: We recognize that a single user can belong to multiple user groups.

Adam: Pausing to address questions in the queue

GreggVan: Is it your intention to add your user and vendor scores together to come up with a single rating, or subtract one from the other?

Adam: Will cover that in the next section.

[Slide 11]

Adam: Now let's try rating a provision: Audio shifting adjustable. This goes back to the point that the ratings are provided as an example, not as expertise.

<nattarnoff> Yes - Vestibular users will be affected by shifting sound.

Adam: [Reads normative text on slide 11]. Declared that users with physical and visual disabilities don't need this, so it gives a value of 0. For those with hearing disabilities, this is a critical need. For cognitive disabilities it is a need, given a value of 2. Relationship between harm and cost; for now we said there is no cost, and provided a value of 0.
… Author cost is 0. Total value of the provision is '10'.

[Slide 12]

<kirkwood> wouldn’t non-visual users be heavily considered with issues of orientation balance, potentially dependent on proprioception in conjunction with audio?

Adam: Callout: Two dimensions. One was concerned with harming the user, the other with costing the user. This doesn't pass the sniff test. These two things should not have an equivalent value.

[Slide 13]

Adam: Try 'weighting' the dimensions toward users. Initial weights defined on the slide.

[Slide 14]

Adam: When weights are applied, they seem more credible. User harm now has a weight of 8, and author cost is now 2.
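The weighting step described here can be sketched as a simple weighted sum. This is a minimal illustration only: the weights for user harm (8) and author cost (2) come from the minutes, while the `user_need` weight, the dimension names, and the example ratings are assumptions made up for the sketch and do not reproduce the totals on the slides.

```python
# Hypothetical sketch of the weighting step.
# Only the weights for "user_harm" (8) and "author_cost" (2) are stated
# in the minutes; the "user_need" weight is an assumed placeholder.
WEIGHTS = {
    "user_need": 4,    # assumed, not stated in the minutes
    "user_harm": 8,    # stated in the minutes
    "author_cost": 2,  # stated in the minutes
}


def weighted_total(ratings, weights=WEIGHTS):
    """Sum each dimension's rating multiplied by that dimension's weight."""
    return sum(weights[dim] * rating for dim, rating in ratings.items())


# Made-up ratings for an imaginary provision (not taken from the slides).
example = {"user_need": 3, "user_harm": 1, "author_cost": 2}
print(weighted_total(example))  # 4*3 + 8*1 + 2*2 = 24
```

Because user harm carries four times the weight of author cost in this sketch, one point of harm moves the total as much as four points of author cost.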

[Slide 15]

Adam: Now that weights are in place, and applied to all dimensions, the new total value is 26. The user-focused values are now in much greater proportion to the author values.

[Slide 16]

Adam: Now let's apply that to the next provision: "Copying supported" Total value is 21.

[Slide 17]

<HaTheo> Admittedly, this is a hard problem to solve, but I think user harm is one of the most difficult dimensions to score at the provision level because it depends so heavily on context. Is the goal to create a rating for each provision across all products, or for a specific product? It seems like the former, and I think that makes this weakness even

<HaTheo> more apparent.

Adam: Callout: The total values feel closer than they should between Audio shifting adjustable and Copying supported.

<kirkwood> copying is very important for Cognitive users. complete blocker without copy ability. critical

<Stephanie> Flagging a concern: if we assign numeric impact ratings per criterion by disability group, W3C may face pushback that it is ranking some disability needs as more important than others.

<Stephanie> Even if that is not the intent, it could be read that way. The scoring model needs to be very careful not to create, or appear to create, a hierarchy of disability impact.

[Slide 18]

Adam: Something that pops out is that one provision had a critical need for it, not just an important need. If you sum up the raw values, the total value is equal. We want to make sure we don't leave anyone behind.

[Slide 19]

Adam: Trying an approach, where we consider the collective value of user need. [see slide contents]

[Slide 20]

Adam: When we apply that to the two provisions, the numbers are adjusted one more time.

[Slide 21]

Adam: The balancing step is done prior to applying the weight. When we come back to all the dimensions for Audio shifting adjustable, the total is now 53.

[Slide 22]

Adam: When Copying supported goes through the same exercise, the new value is 39

[Slide 23]

Adam: This creates a greater separation between these two provisions, providing separation for critical needs.

<Zakim> graham, you wanted to say floor + points model would be better maybe?

graham: Like the general principle, one call out is to use a floor and a scoring system. If harm was a level 3, then automatically puts a floor score of 100 or something and then everything else can be used scoring-wise. Else this risks scores still being too low.
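A floor-plus-points model along the lines graham describes could look like the sketch below. The harm level of 3 and the floor of 100 are the figures graham mentions ("100 or something"); the function name and the idea of a lookup table of floors are assumptions for illustration, not an agreed design.

```python
# Hypothetical floor table: a harm rating of 3 forces a score of at least 100.
HARM_FLOORS = {3: 100}


def floored_score(points, harm_level, floors=HARM_FLOORS):
    """Return the points-based score, raised to the floor for this harm level if one exists."""
    return max(points, floors.get(harm_level, 0))


print(floored_score(26, harm_level=3))  # 100: the floor applies
print(floored_score(26, harm_level=1))  # 26: no floor for this harm level
```

The floor only ever raises a score, so provisions that already score above it are unaffected.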

<Zakim> alastairc, you wanted to comment on complication of platforms for certain provisions, as some might not support copying by default, so the cost would be different.

<kirkwood> demonstration of harm is well resourced in legal/disability ecosystem (financial, personal, etc)

Adam: Completely appreciate that sentiment very much, and will try to incorporate that later here, and figure out if it feels like it's handled sufficiently.

<Zakim> wendyreid, you wanted to comment on a potential copying business challenge

alastairc: Chair hat off, particularly for the copying provision, we need to think about scoring per technology if something isn't possible. Chair hat on, these are example ratings, we're trying to get after the principles.

wendyreid: Something that came up in the ranking exercise. There are considerations where there could be decisions about when something should be copyable, and a business consideration may be put into play. It might be mentioned in the provision as an exception, and it will help to further refine the provisions.

<kirkwood> +1 to barrier

<kevin> Barrier has a context aspect to consider

GreggVan: This is very interesting. The progress toward conformance has a lot of exactly the same elements. I think we should keep the users as four columns all the way to the end. You should combine the users together, and it blows the number up. The number of people with each disability may then be considered. We should also keep this as the
… average across the four groups. Next, I didn't see the word barrier in there (just harm). If we had barrier, it may represent the spirit better. Other than seizures, I don't know that anything else would warrant 'harm.' Having it as one of your major scoring points is important. Also, think about interpreting the word 'harm.'

<nattarnoff> Related to seizures is vestibular disorders as physical harm.

kirkwood: "Harm" could be interpreted as 'financial harm.' I agree with the blocker perspective. You see things like restricting copying due to business or legal reasons. Copying is a critical aspect for counteracting blockers around working memory.

alastairc: Remember, these are examples; we can go through details on the ratings later, for now we're only focused on the principles.

<kirkwood> good point about who it’s impacted ‘more’

Stephanie: Word of caution: would we be willing to openly collaborate with advocacy groups of different disabilities to work on the actual rankings? In previous experience with the Australian government, there was a lot of blowback.

<AshleeF> Sharing as one of the members of this spike — Harm/cost/barrier have quite a lot of overlap. I've pulled some research codes from Margaret Price's Crip Spacetime for reference. It's linked in a Research subtab of our Google Doc:

<AshleeF> https://docs.google.com/document/d/14truy5hwjOlL4TkE2JVy-no3S7ht2XvuhjtpcE8yN20/edit?tab=t.fis4u13bmsux#heading=h.5hl7a9mt213k.

alastairc: We wouldn't be publishing how each provision is ranked, we would be assigning the provisions to different levels. It's a behind the scenes methodology.

Stephanie: If we keep it behind the scenes, people will want to know the outcome.

<kirkwood> plain language question: why are we doing levels? what is the purpose?

julierawe: Coming back to the idea of the higher the score the higher impact on accessibility. Are we trying to figure out a scoring mechanism for telling users to go after the higher scored provisions. How are people going to decide which provisions to work on first?

<Zakim> Jennie_Delisi, you wanted to discuss harm inclusion - ways to distinguish impact to users

alastairc: Covered in the next section. It's a methodology for assigning things to different levels. This is how we would get there.

Jennie_Delisi: I appreciate the logic and the walkthrough of this. I hear echoes of Yanina saying it can be helpful, when describing harm or the impact to people with disabilities, to qualify it as a difference between the impact to those without disabilities and those with disabilities, and I think that could clarify the financial
… impact difference.

BrianE: When thinking through all the balancing mathematics, was there a point where we looked at, instead of having a linear increase, doing something more like an exponential increase?

nattarnoff: Love the direction in all of this, but wants to come back to 'harm.' "Harm" needs to be further defined: financial, legal, confusion, memory issues, physical, neurological. Where something is going to be more than just problematic, and there is a physical harm, that should be taken care of before anything else.

alastairc: If you look at the final result, are the things you would expect to be at the top, or at the base level depending on which way you think about it, in the right place?

<Zakim> GreggVan, you wanted to say AGREE ! with working with consumer advocacy groups -- in Progress to Conformance I did a first pass at scoring to have something to work with as a first pass - but that consumer advocacy groups should be the ones to decide the final numbers for their column -- and said that all calcs should use the same page -https://docs.google.com/spreadsheets/d/1tmh4NkJxur7zGCip3vtBd0utDrhN1nR5aZ_EAGnxDkE/edit?gid=1568275819#gid=1568275819

<nattarnoff> To GreggVan point, we'd need to build and maintain a method for keeping track and allowing such changes.

GreggVan: Completely agree that consumer groups have to be the ones to provide the values for their column for each provision. Put the URL in the sheet; go to the data tab; you can see the rankings. This is the first pass to see if the data collapsed or not. The second thing is that everyone should tie their calculations back. When we have user
… groups go in and change the scores, it changes our models accordingly. I was looking at barriers and harm together because I thought harm should be its own category versus consequential harm.

<Zakim> adam, you wanted to discuss linear vs exponential

alastairc: Asks Adam to respond to Brian's question of linear vs exponential. Believes you did use Gregg's spreadsheet for the rankings.

Adam: The user need ratings do resemble Gregg's because the group attended the meeting where we looked at that proposal.

GreggVan: did you adjust the one for copying, because it's mostly cognitive?

Adam: I don't recall

<Zakim> alastairc, you wanted to comment on ranking things rather than scoring

Adam: Back to Brian's question of linear versus exponential, we did experiment with other kinds of mathematical operations to solve this. The problem was we needed the balancing step to really identify the highest need, and we are happy for the interrogation to see if there are better ways.

<kirkwood> is financial harm ‘critical’?

<HaTheo> My concern is slightly different. I’m not sure we can meaningfully score user harm for a provision independent of the product and context in which it appears. The impact of the same violation can vary dramatically depending on how central that functionality is to the experience.

<HaTheo> For example, a shifting-audio issue might be relatively minor in one application where spatial audio is an optional enhancement, but could be a major barrier in another where audio positioning is a critical part of the workflow. In those cases, the harm isn’t really a property of the provision itself—it’s a property of the provision as

<HaTheo> implemented in a specific context.

<HaTheo> That’s what I was trying to call out. Admittedly, I don’t have a great solution; I just worry that assigning a single score at the provision level may be difficult to do consistently.

alastairc: If we get advocacy groups to refine the scoring ratings, worries that folks would score everything high. Need a ranking, not just all highly rated.

kirkwood: Good point, and it's something I've personally come across. Protocols have been surfaced by different user groups and different clinical groups. It becomes a political situation, and I am also wary of one disability group conflicting with another disability group. We need to move things forward in a more effective way.

alastairc: Thanks for the presentation and participation; asks for a new scribe.

[Slide 25]

Adam: We set out to answer the idea of multiple levels of conformance, please hold your judgment until we get there.

[Slide 26]

[Slide 27]

Adam: We've established a weighting and balancing system where every provision can be represented by this "value", but it leaves the question of what range of values we're dealing with.

[Slide 28]

[Slide 29]

[Slide 30]

[Slide 31]

[Slide 32]

[Slide 33]

[Slide 34]

Adam: If we are talking about a provision that every user group needs critically, and that could kill a user if an author didn't satisfy it, that represents the highest potential value for a provision. If we plug those ratings in, the maximum value would be 87. But what would be the minimum value? We were tempted to just do the inverse and give all of the lowest rating values. If we imagine a provision that no one actually needs, that is totally harmless if an author doesn't satisfy it, and that costs users nothing,
… but is incredibly expensive and difficult for authors to deliver, and we add up all those weights, we would get a value of 0.
… There is a question: if we've rated all of the user groups as having no need for this provision, then why does it exist? Should we maybe go back to the provision itself and consider whether it deserves to exist in WCAG?

Charles: One of the rows in this scoring mechanism said author cost, which had a difficulty level and a score value associated with it. I'm curious how the cost is being thought of. If my company provides closed captioning service, it's going to be a different cost than someone else procuring it. How are we approaching cost to the author?

<Zakim> alastairc, you wanted to comment on core vs supplemental would need adjusting, e.g. no-flashing, and no-flashing no-exception.

<Stephanie> Putting this in chat as it didn’t seem to be captured by the scribe: I raised a question on the call about whether WCAG 3.0 should be finalised first, before determining whether a ranking or scoring model is needed, and before deciding how disability advocacy groups would be consulted on that model. I’m flagging this for the record as my

<Stephanie> concern is that developing the scoring architecture before the underlying criteria are stable may make the weighting harder to explain or defend later.

alastairc: It would have to be some kind of estimated average. One of the things I noticed is no flashing and no flashing with no exceptions come very close together and kind of undoes what we've aimed to do, it indicates either we'd have to re-jig which are supplemental and intended to be at a higher level than equivalent, or indicate that there is more weighting that needs to be added to account for something like that.

<Zakim> GreggVan, you wanted to say that ranking all as high q+

<kirkwood> very good point about supplemental due to LOE because of technology advancement

GreggVan: The reason for that is the difficulty for authors/groups multiplied by 4 if it goes across groups. The difficulty in implementing should be given more weight, so that we're not saying things that get washed out by moderate needs across four groups.
… when you want to divide it into levels, you'll run into trouble. You'd have to end up in one level or take the middle one and have the exact same score…

<Zakim> Adam, you wanted to say “reasonable worst case scenario” as it relates to cost potential

<kevin> Stephanie, note that this exercise is really framed to try to explore the idea of Multiple Levels in conformance. The ranking or scoring model itself isn't necessarily something that is necessary in the long run, although there may be value in it.

Adam: Varying cost for authors — in our case, our team tried to anchor around a "reasonable" worst case scenario which is subjective on its own, but these dimensions are difficult to rate when you have to consider different cases of users and authors, especially in harm and cost and complexity. We tried to imagine the reasonable worst case we're asking an author to do.

<Zakim> graham, you wanted to say does this raise potential splitting of items - biometrics top makes no sense

graham: Biometrics makes us identify things that are a bit broad perhaps

<graham> yes but the principle was do we need to look if splits are right, that was the point alastair, not so much the position itself, was purely an example

<GreggVan> biometric in web is more likely to be VOICE which is a problem for deaf or severe physical disability.

Francis_Storr: Going back to potential author cost, there is also the cost of changing technology. Do we have to constantly revisit the author cost as tech changes, and if so, what happens to historic audits? They'll not be on the same level of author cost.

<kirkwood> +1

<GreggVan> +1

<Charles> +1 to Francis_Storr cost varies over time as well as markets and capabilities

<Zakim> shadi, you wanted to respond

<GreggVan> that is correct Shadi

shadi: We have a similar situation today, the way the requirements put into levels considered some of the feasibility aspects — tech changes and it's an existing problem we have, we're just making it clearer why we're putting something at a particular level

[Slide 36]

Adam: I've got the sense that it was expert judgment to consider the success criteria and what level it should go into — this is not to replace that judgment but make it more structured, predictable, transparent, and easier to compare across all of the requirements and provisions we must consider if we will put them in different levels of conformance.

[Slide 37]

Adam: We could stop at this ranking exercise; this could be a decision aid for us or some sort of data point for folding this rank into other aspects of conformance. Or we could take this to the next step and look at how we can use rank to determine level.

Adam: One thing we know about WCAG 3 is that it doesn't have a single type of SC, we have requirements, assertions, lingering best practice — different kinds of provisions we want to handle differently. Assertions feel risky potentially to try to demand conformance to.

Adam: We also know we want to protect against "obvious" underleveling — alastairc saw that flashing/no flashing exceptions bubble up to the list and are not far away from each other — if we had 2 very different levels of that provision, how can we protect from that happening.

Adam: There may be cases where something one user group needs critically is allowed to fall down the list. We also want to account for author feasibility without letting it dominate, be realistic about what people can actually deliver, and make sure whatever leveling approach we come up with has a way of surfacing these edge cases to make them easier to review and interrogate.

[Slide 38]

Adam: A conformance model with more levels than WCAG 2.0 — let's try with 5 levels — going from most important to least important: Fundamental, Essential, Expected, Enhanced, Advisory. The terms were for argument's sake.

[Slide 39]

<Stephanie> kevin Thanks, that context is helpful! I understand this is intended as an exploratory exercise, and there may be value in testing the concept. I’m trying to be mindful that this is now the second meeting focused on this topic, plus a survey, so it represents a meaningful amount of volunteer time and group effort for something that may or may not

<Stephanie> be part of the long-term approach. That's why I'm wondering aloud whether it would be helpful to first focus on stabilising the WCAG 3.0 criteria, and then revisit whether a scoring or ranking approach is needed.

<AshleeF> Related to WCAG 2.x leveling, a page I came across at some point with tabular views of Essential, Easy, Invisible, All Content for each SC: https://www.w3.org/WAI/GL/wiki/WCAG_2.x_Priority_levels_discussion

Adam: If we've got all these ranks calculated for the provisions, let's try to specify bands to use as a starting point for where a provision might fall into a level. If we ended up choosing 55 as the threshold, those are going into the Fundamental level of conformance.
… we don't have a lot of provisions that rank highly because that's where you'd be talking about many user groups needing it critically.
… most provisions are balanced in terms of who they affect.

[Slide 40]

Adam: these bands only pertain to provisions that are requirement types. All other types like assertions and best practices, we don't want to put them in the first two levels of conformance. We set our first threshold at the expected level of 60-100.

[Slide 41]

Adam: We probably still have problems identifying what we want this leveling to accomplish. We need guardrails for things we know that are important in isolation, like lethal harm to the user should not fall from the fundamental level.
… have a guardrail for provision/requirement that could injure a user and would not fall below fundamental.
… anything that can injure a user has got to be at least essential. If any one user group needs it critically, it can't fall below essential. The concept of "barrier" — as a separate dimension or when we were considering user need, it was mostly about the concept of user encountering barrier and whether they needed it or not.

<alastairc> Stephanie - these things are interlinked, how the levels/scoring work impacts the provisions, and vice versa. I'm afraid that for anything fundamental like this, we need the whole group to understand and agree.

Adam: For enhanced, we have the guardrail to protect authors from undue burden. Any provision that is both expensive for an author to satisfy and also requires a lot of effort because of complexity — if both are true, we don't want it to rise above Enhanced, with an exception if any of the guardrails are triggered, that voids this guardrail to make sure we're still favouring the user.
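The bands and guardrails described above can be sketched roughly as follows. The five level names, the 55 threshold for Fundamental, and the guardrail rules (lethal harm stays at Fundamental, injury or a critical need for any one user group means at least Essential, expensive-and-complex provisions are capped at Enhanced unless a user guardrail fires) follow the minutes; every other threshold and all function and parameter names are assumptions for illustration. The restriction of bands to requirement-type provisions is left out of this sketch.

```python
# Levels ordered from most important to least important (names from the minutes).
LEVELS = ["Fundamental", "Essential", "Expected", "Enhanced", "Advisory"]

# Band thresholds: only 55 for Fundamental is mentioned in the minutes;
# the remaining thresholds are assumed placeholders.
BANDS = [(55, "Fundamental"), (40, "Essential"), (25, "Expected"),
         (10, "Enhanced"), (0, "Advisory")]


def band_level(value):
    """Map a provision's total value to a starting level using the bands."""
    for threshold, level in BANDS:
        if value >= threshold:
            return level
    return "Advisory"


def assign_level(value, lethal=False, injurious=False,
                 critical_for_any_group=False, expensive=False, complex_=False):
    """Apply user-protecting guardrails first, then the author-burden cap."""
    index = LEVELS.index(band_level(value))
    user_guardrail = lethal or injurious or critical_for_any_group
    if lethal:
        index = LEVELS.index("Fundamental")
    elif injurious or critical_for_any_group:
        # Must not fall below Essential (a smaller index is more important).
        index = min(index, LEVELS.index("Essential"))
    if expensive and complex_ and not user_guardrail:
        # Must not rise above Enhanced; voided when a user guardrail fired.
        index = max(index, LEVELS.index("Enhanced"))
    return LEVELS[index]


print(assign_level(12, injurious=True))                  # Essential
print(assign_level(45, expensive=True, complex_=True))   # Enhanced
```

Checking the user guardrails before the author cap is what keeps the model favouring the user when both would apply.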

[Slide 42]

<kirkwood> I’m concerned if this becomes an exercise in judging the level of harm?

[Slide 43]

Adam: What if we run all of our provisions, ranks, ratings through the guardrails and the whole system? A bell curve across these five levels.

[Slide 44]

Adam: What if we still do all of this and look at a particular provision and see it doesn't feel like it's in the right level of conformance?
… How did we rate the dimensions for this provision, and do we still agree that those ratings are good? Is it possible this provision doesn't feel right because it's revealing a new user group?
… Let's say we put blind and low vision users into one big bucket of visual disabilities; some might need more granularity to identify critical needs more specifically.

Adam: Maybe a barrier dimension, liability dimension, copyright law or protection of intellectual content? A guardrail to add to our leveling system to make sure we slot that in the right place, or go back to the math?
… some of the provisions themselves may be questionable in some way, and need to be refined or rewritten

<kirkwood> What would the parameters be for assigning a level (manually)? [as said]

GreggVan: WCAG 2.0 has 2 levels, A and AA, not 3. AAA are all recommendations. At one point we wanted only 1 level, with requirements and recommendations, which are 2 categories, not levels. We ended up with A and AA because half of the group wanted everything AA to go into recommendation.

<AWK> From WCAG: For Level AAA conformance, the web page satisfies all the Level A, Level AA and Level AAA success criteria, or a Level AAA conforming alternate version is provided.

<shadi> +1 to AWK

GreggVan: The other half wanted to put everything into A. It was left as 2 levels because we could not get consensus. Regulators decided to put AA into A and now WCAG 2 essentially has one level of conformance and a second category, which are recommendations.

<Charles> +1 to AWK

GreggVan: AA had to be essential, not just convenience. AAA were technically not feasible.

GreggVan: We're going to end up in the same situation because what we learned then applies here. The important part was we had recommendations and only 2 levels.

GreggVan: As Adam pointed out, we would adjust by consensus of the group, if people felt it couldn't work and had long arguments for final placements, you'd always have to do some adjustments at the end with expert opinions and all consumers are experts representing themselves.

<kirkwood> And that it is why WCAG 2.1 AA it was adopted into US and State law

AWK: We definitely have 3 levels in WCAG. It says so in 5.2.1 Conformance Level. For editorial, in the table that shows the levels, the top row is the easiest; showing the hardest thing at the bottom was a bit confusing. I think we should consider assertions at earlier levels. At Essential, you're not doing as many things or measuring as many things, but can claim you're paying attention to certain things.

<julierawe> +1 to including assertions at earlier levels

<Jennie_Delisi> +1 to including assertions at earlier levels

AWK: I think there's value in including those earlier, being able to say we're doing all this work and we're considering this. With A, AA, AAA, we do need a published rationale for what goes into what level. To override it with a manual expert review, we should be cautious of that.

<Zakim> wendyreid, you wanted to talk a little bit about iterative provisions in levels

<AshleeF> +1 to need for published rationale

wendyreid: What we have in the provisions right now is iterative progress built in already, and one thing to consider for these groupings is that as well, where almost regardless of ranking some things have to come before others.
… the Fundamental level should have provision about captions, regardless of its rank, you can't do any of the other provisions related to captions unless you have captions.

wendyreid: We'd have to consider a qualifier on some provisions that this is a prerequisite provision where if you don't do this, you can't do the other things, without it you're fundamentally missing something that can't be done later. To consider the way this breaks down could be progressive and gives us levels to build towards something better.

wendyreid: I wouldn't want to think of these as encapsulating into each other, but the goal is to build up and not stop at any one place

<Zakim> alastairc, you wanted to suggest that we take the highest user-need across the user-groups and weight it, rather than combine them. and to also suggest that we adjust the criteria for ranking wherever possible, rather than manual overrides.

alastairc: I'd like to see what it would look like if we took the highest one across the user groups and didn't combine them; that might overcome some of the problems Gregg pointed out earlier
… and try and minimise the manual overrides whenever possible so that when we spot problems, we can tweak the algorithm

alastairc: Most legislators picked AA, and A and AA collapsed into one, and AAA became the optimal recommendation
… The only question we should focus on is whether something similar would happen.

<Zakim> kevin, you wanted to ask whether this helps us determine if using multiple levels is of value

kevin: I agree. I do wonder whether the Expected level becomes the de facto level that regulators will choose, and therefore anything below it is pointless.

kevin: I see value in the ranking system: demonstrating the rationale for the importance we've placed on particular provisions is useful and valuable. But multiple levels of conformance was proposed by a couple of people, and I'm not sure whether this highlights value in that as an idea

<kirkwood> +1 to Kevin

GreggVan: Keep in mind recommendations and make a note that all higher level items are recommendations at any level, because they're not requirements.

GreggVan: If the second category is called Essential and the next one Expected, the legislators will end up on Essential. This is all human rights, and what people expect is not what we put in the law, just the things that are essential to preserving non-discrimination. You should rethink the labels a bit; the line would be drawn at Essential, not Expected.

GreggVan: We can end up gaming the scoring; in the end, use that to sort them out and use consensus to get them into their final places.
… ease of implementation is key — we can't have it anywhere in the scoring because we have no idea what it is.
… we should keep the user groups separate because then people could see how things are working and balancing out.
… we'll be seeing smart browsers and OSes, a lot of our provisions need to be rewritten

<HaTheo> just a note... I do wonder whether we need a score at all (which is not where I started or felt originally). The score generation feels somewhat secondary to having an objective and repeatable way to evaluate the underlying dimensions. If we can’t consistently assess things like user harm or author cost because they’re highly context-dependent,

<HaTheo> then the resulting score may give a false sense of precision. To me, the more important question is whether we can reliably and consistently assign the inputs. I wonder if we would have gotten to a similar space with just what is defined in the guardrails.

wendyreid: A lot of the complaints about using WCAG 2 are that it's an all-or-nothing, intimidating specification. You have to adhere to 50 or more criteria, and coming in with little knowledge or experience, that is a lot.

<Zakim> wendyreid, you wanted to point out an alternative way of viewing lower levels

wendyreid: it's overwhelming. The benefit of the leveling approach, especially one that takes into account building up towards something, is that we can make that process easier, make the lower levels more achievable, and develop in a way that creates a positive user experience, which is the end goal

<AshleeF> +1 to wendyreid

wendyreid: we also want to make it possible for authors to achieve that goal and to do that we need to ease them into it. I also think we should be separating the conformance model and what regulators do with it from our policy discussions. The focus for the conformance model should be the end user.

<GreggVan> +1

<shadi> +1 to wendyreid

wendyreid: We can define on the policy side what we're looking towards. I want to separate these out because we can scare ourselves into making decisions that don't benefit people that need to use our specification

<JeroenH> +1 to Wendy

<hdv> +1 wendyreid

<Charles> +1 to wendyreid to keep regulatory audiences out of the how

<Stephanie> +1 wendyreid

alastairc: Adam and team have demonstrated that there is an approach we can use to create multiple levels, dimensions, and so on. Assuming we could come to agreement on the 5 levels (names to be discussed), and given the mechanisms or features we have to hand: would having levels in conformance that go underneath help people start somewhere and make progress?

<kirkwood> What is the plain language goal for levels?

alastairc: We went over scoring last time as one side of conformance, there are potentially guidance materials that could be produced to give people some sort of pathway

alastairc: Do you see value in having multiple levels? Calling it WCAG AA

<alastairc> Poll: Assuming we continued with this approach, do you see value in having multiple levels under the level of WCAG 2 AA

<wendyreid> +1

<Charles> +1

<hdv> +1

<AWK> +1

<Eloisa> +1

<JeroenH> +1!

<shadi> +1

<HaTheo> 0, I do but I worry the score is a red herring...

<Heather> +1

<Adam> 0 still unconvinced

<kirkwood> ai can do that

<Stephanie> 0

<AshleeF> 0

<alastairc> 0

<ShawnT> +1

<kirkwood> 0

<Eloisa> s/WCAG 2A/WCAG 2 A-AA

<graham> +1 yes - but maybe 2 levels

<Detlev> +1 because orgs can reach that lower step more easily, then continue - it is a waymark

<kevin> -1

alastairc: The idea is having more levels in conformance

<julierawe> Are "enhanced" and "advisory" the multiple levels below WCAG AA?

<GreggVan> -1

<kirkwood> -1

<CClaire> 0

<AshleeF> 0, because it's not clear to me what is being polled

GreggVan: I see a huge danger as soon as people say they conform. I think it's valuable for people working on conformance to have somewhere to start; maybe renaming would help? People will feel some sense of accomplishment for getting to the lower levels, so be careful how we name those levels

<julierawe> 0, in part because it's not clear to me what is being polled

<laura> -1 agree with Gregg and Kevin. Naming may help.

<jtoles> 0, willing to keep exploring, but the argument that the regulators will decide on the level is strong

<Adam> Currently: 11 in favor, 7 neutral, 4 opposed

AshleeF: Not sure of the connection between the poll and this proposal

<Adam> +1 to wendyreid’s clarification

wendyreid: Responding to Theo's point that the scores were a red herring: the rank is a tool for us as a decision-making mechanism, not for the outside world

<Makoto_U> +1 - step-by-step mechanism for achieving higher levels

julierawe: Confused about what we're voting on as well. Is it the idea of the 5 levels, and moving forward with 2 that are below Expected, which is roughly equivalent to WCAG A-AA?

alastairc: I didn't ask about the levels above that because they're less controversial, so no problems with 1 or more levels above what we would hope is the regulatory level

<Charles> note: it seems easy to be confused given we have just listened to math for 2 hours

<Detlev> we may need to restack the graphic to make sense of "above" and "below"...

<kirkwood> a "we're trying but not there yet" level? not sure that holds water, esp. from a regulatory perspective, and thus adoption

<Heather> +1 It helps organizations who have done a WCAG 2.x assessment, and measures their gaps to WCAG 3

alastairc: It might become ignored, like the difference between A and AA. To reframe the question: imagine WCAG 3 comes out, and designers, developers, and others are trying to work out what they need to do. Does it help them to have things organised so that there is a set of fundamental or essential requirements and a set of expected requirements?

GreggVan: What if we combine the two ideas? The purpose of having things below that is so people can mark their progress, but we don't want them to stop before what we think is the minimal level of conformance. What if we have levels in the progress toward conformance?

<kevin> +1 to presented as levels of progression

GreggVan: Take a progress-toward-conformance score and purposefully make some levels, in a separate doc from WCAG: if you're trying to get there, here are the levels and what you should do first. That shows that they can progress, not necessarily that they have reached a certain level of conformance

<kirkwood> progress towards conformance. but does that really help people with disabilities?

<Stephanie> +1 kirkwood

<HaTheo> wendyreid, I understood that, but I feel like we are getting caught up in how to assign a score and calculate it, and I wonder if we have actually just obscured the definitions of the levels...

AWK: This is a contentious topic and I'd hope that the direction we take on this is not determined on any single call we have. We have 12 people who said yes and 4 who said no, etc. I'd suggest having a poll around this and/or getting public review of this concept, to understand what is acceptable externally more so than what is acceptable on this call

<Zakim> alastairc, you wanted to react to AWK

<kevin> My concern is that any on-ramp that is less comprehensive than Level AA would result in more complacency

<kirkwood> yes

<Stephanie> +1 julie

<hdv> s/q- we're amolst at time //

julierawe: Idea of 5 levels compared to 3 levels in WCAG 2 — do we think people will just aim for the middle no matter what? The 5 is more complex and trying to understand the difference between enhanced and advisory — is this complexity necessary?

<HaTheo> +1 julie

GreggVan: Agree this can't happen in one or two meetings. This is huge progress in terms of our understanding of how complicated this is, and we need outside input; we need more input rather than putting things out for a vote

<AshleeF> -1 to poll (I'm not sure if there's a specific thing I need to type to change my vote from 0). I'm not convinced there is a way to assign multiple levels across all provisions; I'm writing up an idea for conformance within groups of provisions, sometimes referred to in the past as "modules" but that language isn't necessarily what I'm using. My in

<AshleeF> progress writing: https://docs.google.com/document/d/14truy5hwjOlL4TkE2JVy-no3S7ht2XvuhjtpcE8yN20/edit?tab=t.f8zqrtl6ws4q.

<kevin> +1 to Gregg on more input

<HaTheo> Big thanks to Adam and everyone who made this...

<julierawe> Thanks for all of this good work--helps us think it all through!

Minutes manually created (not a transcript), formatted by scribe.perl version 248 (Mon Oct 27 20:04:16 2025 UTC).