AGWG Teleconference -- 18 Oct 2022

<Chuck> meeting: AGWG-2022-10-18

<Chuck> I have joined the call early if anyone is interested in joining.

<Jay_Mullen> Greetings - what is PW for the Zoom today?

<Jay_Mullen> nvm

<Chuck> Hi Jay, no pwd should be required.

<scribe> scribe: bruce_bailey

Chuck: Please sign up to scribe!
... Any announcements or introductions?

Jay_Mullen: My first meeting, thanks for having me.

Chuck: New topics for later?

<Chuck> https://www.timeanddate.com/time/dst/events.html

Chuck: Reminder that daylight saving time affects meeting times for the next several weeks...
... AG meetings use Boston time; the U.S. shifts on November 6th,
... but your locality probably differs, see link.

Review WCAG2ICT Work Statement https://www.w3.org/2002/09/wbs/35422/WCAG2ICT_Work_Statement/

<Chuck> https://www.w3.org/2002/09/wbs/35422/WCAG2ICT_Work_Statement/results

The background is that AG did approve the activity, without a finalized work statement.

Chuck: Survey has 7 respondents to the revised work statement, all positive.
... There will be a CfC.

<Chuck> proposed RESOLUTION: Approve the WCAG2ICT Work Statement (Work Statement will then go to CfC)

<Rachael> +1

<GreggVan> +1

<Azlan> +1

<alastairc> +1. Ideally, please comment before the CFC, so we don't have to repeat the CFC...

<Jay_Mullen> +1

<jaunita_george> +1

<ShawnT> +1

<JenniferStrickland> +1

AWK: I had a question, if Mary Jo or others might clarify the stance on AAA SC.

Chuck: We have put AAA in the queue at the end, if time allows. We are not excluding AAA.
... but we are not committed to including it either.

AWK: That is a little different than the previous iteration.

GreggV: Last time around, AAA were excluded primarily due to external time limitations.
... Access Board was waiting to move forward on WCAG last time around.
... This time, there is a caveat: if time allows.

AWK: I am a little concerned, as with 2.1 there are almost 30 AAA SC, which seems like it could take a long time.

<alastairc> 19 new SCs at A & AA for 2.1 + 2.2

AWK: Does the work statement allow the note to be updated if, for example, the EU wants A and AA?
... Could note be updated with AAA at a later time?

<mbgower> From the working draft: "WCAG2ICT could potentially need to be published before addressing Level AAA due to regulatory schedule needs. If publication is needed prior to addressing Level AAA criteria, another update to the WCAG2ICT Note may be necessary"

GreggV: The other big tasks we need to address a little better this time are the "sets of" SC and application to closed functionality...
... those discussions took a long time last time, so we hope to be a little more efficient this time around...

<mbgower> "Phase 1 Goals are to incorporate new WCAG 2.1 and 2.2 Level A and AA success criteria..."

GreggV: I have taken a first pass on that, and facilitators are aware of that work

<laura> +1

<mbgower> The entirety of AAA is left until Phase 2

<mbgower> +1

RESOLUTION: Approve the WCAG2ICT Work Statement (Work Statement will then go to CfC)

Chuck calls on Wilco

Wilco shares screen (Google doc, see link).

Wilco: In response to feedback from the sub group, we have made several changes.
... One change is recommending 5-8 people instead of "small size".
... Another change is noting the commitment and asking people to sign up early. W3C policy prohibits exclusion, but these are 8 week sprints.
... Third point is emphasizing off-line work between live calls.
... There will be minutes and surveys and open work items. Meeting time should be used for resolving questions, not doing the writing.
... People should expect 4 hours per week per sub group, so that is 3 hours of off-line work.
... Fourth, facilitators are very much encouraged to reach out to the AG co-chairs for logistics or accessibility issues with tools.

Jennifer Strickland: This looks really good, thank you, I am very enthusiastic...

... But one thing which seems missing is ramping up to launching a sub group...
... From my past experience, I will recommend being sure people know when / where the group is meeting.

<Chuck> will get to Gregg shortly

... Also, a distinction between TF and sub group would be helpful.

Wilco: I think there is a glossary, which I will look for.

<Zakim> GreggVan, you wanted to suggest "Ideally, the subgroup member should reflect diverse opinions in the main group about the topic." to "Ideally, special efforts should be made to

JenniferStrickland: Please take as constructive criticism.

GreggV: This is very helpful, +1 to linking to definitions....
... I suggest an edit that the sub group should reflect the full range of interested participants, over what is stated in the handbook at present.

<Zakim> Rachael, you wanted to say that was added

Rachael: That sentiment was added at top recently.

[scribe notes that he may not have the wordsmithing exactly correct]

Review and discuss possible conformance models https://drive.google.com/drive/u/0/folders/1X3Paz3WuK4yn09_ZN99P5IFl2-5yn5U9

Chuck: Please do suggest comments or edits, but wordsmithing is out of scope for this call.
... This link is to a folder, not one specific file.

<Rachael> https://docs.google.com/presentation/d/15ZoKbczXw3JIoyDxAxKtBG0sWMKnB6lqAnV4V9xVsoM/edit#slide=id.p

Rachael: Screen sharing Possible Conformance Models
... Last week we walked through the other presentation in some detail.
... It is not too late to create new or additional conformance models from the template...
... if you have an option 7, please be encouraged to add.
... Q?

<Rachael> https://docs.google.com/document/d/1_D9ZB7G78m-t9lbqRCliYU8X-zN5dtXmO5eUvuz9BDs/edit#

Rachael: Turning to my document, these are just ideas which we might use or not use.
... the intention of Outcome Based Conformance is still to have tiers...

[Rachael moves to doc, rather than slide]

Rachael: This gets into more depth than we have before...

Errors only occur at Bronze level and there are very few Critical Errors

... This approach does have some tolerance for errors, for example missing alt on a decorative image...
... but it does allow for cumulative errors building to a critical error.
... At Bronze the approach is binary pass / fail.
... At Bronze scoring is discrete.
... At Silver this proposal allows for weighting and more refined scoring.
... There is also a threshold minimum baseline.

Rachael: Example at Bronze is included, checking that decorative images are appropriately coded.
... this is cumulative with other images having alternative text
... and check for functional images like images of text.

At Silver additional assurances are added, and an example includes a statement of compliance with organizational policy.

Chuck: Question, so there are multiple methods, and they can be added to a cumulative score?

Rachael: The methods might be technology specific, so it is the tests which aggregate to a score.

Chuck: Organizations can stop at Bronze with their attestations, even though they do some Silver and Gold things?

Rachael: Yes, but there is also a cumulative error mechanic. This does aim to transition from 2.

Mike Gower: I really like this, the hierarchy and structure is excellent...

... but I think we may need to revisit with regard to functional needs. This is great though.

Mike Gower: Possibility of growth too. Scalability, flexibility. Do the minimum first, then layer on additional details.

GreggV presents on Option 2

<Rachael> https://docs.google.com/document/d/1rWL_O0sFxsFd7CjILTNhb43c7645QNXK38LJIcsgVjE/edit#heading=h.q7a1p1s14gm5

GreggV: Outcomes are structured as Pass/Fail with adjectival ratings (very poor, poor, pass, good, very good)...

Protocols are what an organization makes assertions regarding, since outcomes are addressed by attestation...

... Guidelines would state what we want to be true.
... Silver is Bronze plus, with 50% more metrics. Gold will be beyond that.

[Gregg moves to longer doc]

GreggV: Not all scoring will be black and white.
... Example is contrast, where now we have 4.5:1 and 3:1...
... with this we might say "very good" (for 7:1), but with this approach we also have a set of things that must be done...
... example is that there must be alt text, but we really want to have an advisory about the quality of the alt text.

<Rachael> +1 to exploring the concept of outcomes and affirmations

GreggV: Outcomes are scored as Pass/Fail -- for shipping product.
... Affirmation would not be testable per se because they are the claims from the developer.
... Affirmations should address features of products, but could also list where there are gaps in the programs.
... We tried to make this work for "views" but could not make that work...
... at the end we returned to unit of conformance being "pages". We really did try to make views work.
... 4.1.2 Conditional tests address requirements which might not be applicable, for example blinking or flash rate.
... We do want to include guidance on evaluating how well the content meets the expectation.
... For type of requirements we added Recommendations.
... One item out of sync is we should have how to do the accessibility, but only if the outcome is met.
... Please see comment about Extendable Requirements not being quite the correct term.
... I am also concerned with the prescriptive requirements, which we do not have much of in WCAG2. Prescriptive is, for example, saying 1/8th jack rather than compatible with the user's headphones.
... Added Recommendations section as a summary, please do review.
... Also provided a run at a scoring section with specific numbers.
... Recommend several separate ratings over a single aggregated score, because a single score might be gamed.
... With Silver and Gold they are not strictly cumulative, but both require Bronze.
... also provided example of scoring

Chuck: Gregg, you mentioned experience with view versus page level reviews. Is page fundamental as to how this proposal worked?

GreggV: Idea of view is good, what is on the screen, but hard to translate that to pages and even harder for web apps. We resorted back to pages because it was too hard to know what was being tested.

<Zakim> Jay_Mullen, you wanted to say "is there any consideration to hybrid approach of 1/2 - in that you use the tree system to guide the adjectival rating in option 2 given adjectival

GreggV: There are an uncountable number of views except for very simple examples.

Jay_Mullen: Question about hybrid reviews and adjectival ratings...

<Zakim> jeanne, you wanted to say that I like large parts of this and some of it matches with the issues I have been reviewing for Option 6

Jay_Mullen: might be too far ahead, but adjectives seem hard to apply to both apps and pages.

jeanne: I am impressed, especially the innovation to apply percentages for adjectival ratings.

Katie: I am also impressed with this presentation and Rachael. I think they might be able to be combined.

GreggV: I want to acknowledge how much I borrowed from Option 1!

<Chuck> +1 to a sub-group reviewing "views"

<Ryladog> +1 to view taskforce

GreggV: It might be worth having a view TF (sub group). If we can solve it, it would be good.

Jaunita: No matter which approach we endorse, can we expect tooling to support ratings?

Wilco: There will be tools!

<Chuck> OOPS, sorry Bruce

Wilco: The ACT Task Force is looking into how to define state, and we think that will inform work on views.

<Wilco> scribe+

<Zakim> mbgower, you wanted to say I think it would have to be "...would mean passing 75% of them at GOOD or better"

<Jay_Mullen> +1 to taskforce btw

<Wilco> Chuck: Let's move on to the third option

<Wilco> Alastair: We discussed some form of gating. I wanted to focus on adjectival ratings

<Rachael> https://docs.google.com/document/d/1lPAl7mddnMnIK5PQbvQqSNcXi_pDorSVxwzratelYR4/edit#heading=h.q7a1p1s14gm5

<Wilco> ... Going through the test level, poor is non-passing, Okay and above is passing

<Wilco> ... I was focusing on test. From the guideline level down, I took text alternatives.

<Wilco> ... One thing I was thinking is if under the outcomes we could list the applicable tests as normative. So the name would be normative.

<Wilco> ... I was taking it from the test up, not sure if methods are needed.

<Wilco> ... Image has description would be a computational descriptive test, scoped on an item basis.

<Wilco> ... All images pass can have a high level, if not, poor - does not pass

Test pages for alternatives

https://docs.google.com/document/d/1lPAl7mddnMnIK5PQbvQqSNcXi_pDorSVxwzratelYR4/edit#heading=h.iqnotugv7p4c

<Wilco> ... The other half, equivalent meaning. In the original FPWD we had ratings with percentages.

<Wilco> ... Thinking about functional image, what would poor be? Any functional image that doesn't indicate the function would be failing, even at bronze.

<Wilco> ... If it was okay, indicate the function. It's not perfect but you can get the function.

<Wilco> ... Then very good names are appropriate for the function.

<Wilco> ... If you have poor through to very good you can build in the severity at the test level

<Wilco> ... That means you don't need to separate criticality.

<Wilco> ... An informative image rating could be less harsh.

<Wilco> ... If less than half indicate the content, that's a fail. If most indicate the content that's a pass

<Wilco> ... Then images of text, if the text isn't in the alt, that's a fail.

<Wilco> ... Then for this set of tests, all tests need to reach Okay for Bronze, Good for Silver

<Wilco> ... You can adjust the difficulty at the test level. Those tests would need to be normative.
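
[Illustration: a minimal sketch of the aggregation rule described above, assuming a four-step scale of Poor / Okay / Good / Very good, where every test in the set must reach Okay for Bronze and Good for Silver. The names Rating and level_for and the sample ratings are hypothetical and do not come from any of the proposal documents.]

```python
from enum import IntEnum


class Rating(IntEnum):
    """Adjectival ratings, ordered so they can be compared (assumed scale)."""
    POOR = 0       # non-passing
    OKAY = 1       # lowest passing rating
    GOOD = 2
    VERY_GOOD = 3


def level_for(ratings):
    """Return the level reached when every test must meet the bar.

    The lowest-rated test decides the result: all tests at Good or better
    gives Silver, all tests at Okay or better gives Bronze.
    """
    if not ratings:
        return "None"
    lowest = min(ratings.values())
    if lowest >= Rating.GOOD:
        return "Silver"
    if lowest >= Rating.OKAY:
        return "Bronze"
    return "None"


# Hypothetical ratings for the text alternatives tests discussed above.
text_alternative_ratings = {
    "functional images": Rating.GOOD,
    "informative images": Rating.OKAY,
    "images of text": Rating.VERY_GOOD,
}

print(level_for(text_alternative_ratings))  # Bronze
```

[In this sketch a single test rated Poor fails the whole set, which is the situation Jay's question below is about.]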

<jaunita_george> +1 to Gregg's question

<Zakim> Jay_Mullen, you wanted to ask a question about when any single fail, even out of 100 occurrences, could result in failure

<Wilco> Jay: At the Okay level, in a situation where language is important, if it's okay it doesn't drive much urgency.

<Wilco> ... If 9 images are good and 1 is bad, you'd give them an okay. But what if the 10th image means someone can't answer a question?

<Zakim> GreggVan, you wanted to ask NICE WORK a couple questions 1) are tests all technology agnostic? 2) if YES - then OK is like current SC -yes? 3) If not Tech agnostic how

<Wilco> Alastair: I think the same thing applies to where we had percentages. There's a balance between if you require 100% all the time, we have well documented instances where it doesn't work with sites at scale / complexity.

<Wilco> ... That seems to be a case where perfect is the enemy of the good

<Wilco> ... If you have 10 images all vital for that content this is an issue.

<Wilco> ... I think the key factor is some kind of counting or percentage. We could have some other metric, or require all informative images, or maybe a separate test for that scenario.

<Jay_Mullen> +1 to separate test

<Wilco> ... An image where content is relied on in a test of something could be a separate test, or part of a protocol.

<Zakim> bruce_bailey, you wanted to say that aggregating adjectival scores seems workable (e.g., All tests achieve at least ‘good’)

<Wilco> Gregg: I think that the test should be scored as failing, or the person could report it's accessible except for questions 8 and 10

<Wilco> ... If it's not accessible it should be labelled as such.

<Wilco> ... My question is, when you talk about tests, the tests look like the success criteria.

<Wilco> ... So your Okay is like a WCAG success criterion.

<Wilco> Alastair: The outcome is more equivalent to the WCAG criterion.

<Wilco> Gregg: You only had Poor, I also had Very poor. Is there a reason?

<Wilco> Alastair: I started off with very poor, I wasn't sure why to differentiate. These are both under the level.

<Zakim> alastairc, you wanted to comment on the reporting aspect, fine in some scenarios, infeasible in others.

<Wilco> Alastair: On the reporting aspect. That works fine at the scale where you have a test with 10 images, but when you have an e-commerce site where you have hundreds of images coming from different sources, that really doesn't scale.

<Wilco> Bruce: I like the adjectival. It addresses the issue I had with percentages.

<Wilco> Gregg: Any time we get into percentages we get into the trap with tests. If they think it's important enough for it to be there for some users, saying that as long as it's only 1 in 20 that's an issue, that bothers me.

<Wilco> Rachael: We'll get 3 more proposals next week. I encourage those working on that to prep for that.

<Wilco> ... For these three I think it'd be good to talk about the criteria.

<Rachael> https://docs.google.com/presentation/d/1yLYeNcybGxRu43KdrVUcOCL6iXsy6-gxl9-lbyr90dI/edit#slide=id.g15c635140bc_0_132

<Wilco> ... My guess is that no single proposal will go forward as is. What we're looking for is ideas of which ones fit better.

<Wilco> ... We may pick more than one.

<Wilco> ... The first criterion is multiple ways to measure. How do these support, or fail to support?

<Rachael> All WCAG 3.0 guidance has tests or procedures so that the results can be verified. In addition to the current true/false success criteria, other ways of measuring can be used where appropriate so that more needs of people with disabilities can be included.

<Wilco> Gregg: The multiple ways to measure, the goal is to allow more things to get in.

<Wilco> ... I think adjectival, and mixing requirements does that. When you talk about multiple ways to measure, if there are criteria, with multiple ways to measure, people could come up with different answers.

<Wilco> ... If there are multiple levels, then I think it works.

<Wilco> Rachael: The intent is that the spec has other things than pass/fail. I don't want to go into rewriting the requirements.

<Wilco> ... I want to encourage the three of us not to defend or argue for ours, it would be good for the group to discuss the differences.

<Wilco> Chuck: Those who didn't write proposals, if you have questions / observations, please make them.

<Wilco> ... It seems like option 1 and 2 had a lot of cohesiveness. It looks like there's an opportunity for them to complement each other.

<Wilco> GN: I noticed in Gregg's proposal, there was a rating of related requirements. Rating minimum contrast as good, and high contrast as better.

<Zakim> GreggVan, you wanted to say Option 2 drew from option 1 and I see Option 3 as providing more detail for parts of option 2 (and possibly 1?)

<Wilco> ... Providing minimum contrast is very good for people who need minimum contrast, but high contrast with lots of errors is not as good.

<Wilco> Gregg: I saw option 1 draw from option 2. Option 3 provided more elaboration.

<Wilco> ... On high and low contrast, excellent point. We talk about there needing to be sufficient contrast, but on high contrast I wonder if there needs to be a provision of low contrast, unless AT can take that contrast away.

<Wilco> Chuck: In general it's interesting that an outcome that's beneficial for a functional need might create barriers for other functional needs.

<Rachael> https://docs.google.com/presentation/d/1yLYeNcybGxRu43KdrVUcOCL6iXsy6-gxl9-lbyr90dI/edit#slide=id.g15c635140bc_0_132

<Wilco> Rachael: If you go to the conformance approaches, there are instructions there, and on the template.

<Rachael> https://docs.google.com/presentation/d/15ZoKbczXw3JIoyDxAxKtBG0sWMKnB6lqAnV4V9xVsoM/edit#slide=id.g165c944dd8c_0_11

<Wilco> ... Follow instructions, and add the template at the end.

<Wilco> Alastair: For those signed up to the other 3, if you can't talk about it next week, let us know. If anyone else has ideas please let us know.
