Silver Task Force & Community Group -- 31 Mar 2020

<CharlesHall> what else can you do?

<Chuck> Jake, are you able to join now?

<scribe> scribe: sajkaj

Headings

sl: Wanted to start arranging some of our content vis a vis headings

<jeanne> https://raw.githack.com/w3c/silver/ED-VF2F-js/guidelines/#headings

js: Looks for latest from JF ...

<jeanne> http://john.foliot.ca/demos/HeadingsTestTwo.html

<Lauriat> http://john.foliot.ca/demos/HeadingsTest2.html

jf: replace "two" with '2'

<jeanne> https://raw.githack.com/w3c/silver/ED-VF2F-js/guidelines/explainers/SectionHeading.html

<JF> Tests start here: http://john.foliot.ca/demos/HeadingsTestStart.html

js: go to scoring tab ...
... took our 3 user needs (Eval tab) and ...
... added an example addressing only one need, not all three
... Did a partial --

jf: Where are we?
... Asks how numbers were arrived at

js: Whatever percentage of what was done drove the scoring -- 1 of 3 = 33%
... 2 plus a partial, I applied partial credit
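
A minimal sketch of the arithmetic js describes, assuming every functional outcome counts equally. The function name `guideline_score` and the 0.5 value used for a partial are illustrative assumptions; the meeting did not fix either.

```python
def guideline_score(credits):
    """Return the percentage of functional outcomes met.

    `credits` holds one value per functional outcome: 1.0 for met,
    0.0 for not met, and a fraction for partial credit (assumed scale).
    """
    if not credits:
        raise ValueError("at least one functional outcome is required")
    return 100.0 * sum(credits) / len(credits)


# One of three functional outcomes addressed.
one_of_three = guideline_score([1.0, 0.0, 0.0])

# Two met plus a partial; 0.5 is a hypothetical partial-credit value.
two_plus_partial = guideline_score([1.0, 1.0, 0.5])

print(round(one_of_three))      # 33
print(round(two_plus_partial))  # 83
```

Equal weighting is the simplest choice and matches the "1 of 3 = 33%" figure; it says nothing yet about user impact, which is the question jf raises next.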

jf: Asks whether this really accounts for user impact

js: That's the purpose of this, so we can discuss
... This was the simple one .. any questions before we move on?
... Turns to jf ...

jf: I'm confused -- I see 7 functional requirements

js: No, 3 user needs for this guideline, on the Eval tab

jf: Still confused

js: Testing responsibilities for functional outcome on the Eval tab
... I should be saying functional outcome, not user need
... My proposal is how to score the 3 functional outcomes
... This is not EN301 ...

jf: Reason why not?

js: Because our earlier design was to get to them at the end of the process, not along the way
... I'm interested in exploring your alternate proposal
... Would like to go through JF's example via the various breakdowns, incl EN

jf: Hmmm, not considered scoring ...

<jeanne> http://john.foliot.ca/demos/HeadingsTest1.html

jf: Suggest we need to consider control page and evaluate for equal importance
... Proposes we need to score on the vertical columns, not just the horizontal rows

js: Can you do a proposal for Friday?
... with numbers?

jf: I can try
... Wondering do we even bother scoring these for someone sensitive to seizures on flash?
... Suggest we need to evaluate impact on pwd groups

sl: agree
... Do we overlay the needs with the outcomes at the guideline level? or the task completion level?

jf: Think there's room for both
... We've not identified task yet
... As we eval benefits, it starts to add functionality for certain users
... If we do it correctly, it should also benefit COGA
... Sees how this will interface with p13n spec work in APA
... Do we reward code that supports more tech?

<Zakim> CharlesHall, you wanted to discuss N/A (no known impact)

ch: assuming the impacts are correct, will that impact scoring? If it doesn't apply to a functional need, it shouldn't impact scoring

jf: Depends on whether additive or subtractive scoring
... What do we do for no impact?
... Don't know that answer
... postulates a scoring model on the fly -- --
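
One way to picture why the answer depends on additive versus subtractive scoring, as a hedged sketch only. The result labels ("pass", "fail", "n/a"), the point value and the penalty are all hypothetical; the model jf postulated on the call was not recorded.

```python
def additive(results, points=1):
    """Start from zero and earn points for each outcome met.

    An "n/a" result earns nothing, so it looks like a shortfall
    unless it is also removed from the maximum possible score.
    """
    return sum(points for r in results if r == "pass")


def subtractive(results, start=100, penalty=10):
    """Start from a full score and lose points for each failure.

    An "n/a" result costs nothing, so it leaves the score unchanged.
    """
    return start - penalty * sum(1 for r in results if r == "fail")


results = ["pass", "fail", "n/a"]
print(additive(results))     # 1 (out of a nominal 3)
print(subtractive(results))  # 90
```

Under these assumptions a no-impact item is neutral in the subtractive model but needs explicit handling in the additive one, which is the open question ch raised.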

<Chuck> janina: I'm a little concerned on the assumption that we are going to frequently have well nested headings. I think that won't be often the case...

<Chuck> janina: Like h2 to h4. Do we want to specify h's or be happy with aria? John asked personalization to consider. We should be able to fix these things.

<Chuck> janina: For the first time I'm liking divs.

<Chuck> janina: Your impact for people that don't use aria won't be there.

<Chuck> John: We need to be working at this granular a level... if we have h's with no aria, how do we score that?

sl: Recalling notion that could score at both levels --
... Needs to be expressed in task

<Chuck> welcome. I realized that I was only scribing your talking and not scribing the entire conversation. I will probably change my habit on that.

<JF> http://john.foliot.ca/demos/HeadingsTest4.html

sl: What if the content is a side note that isn't particularly relevant? That's less impact than problems in the main flow
... Agree we need to think at all these levels, though
... at guideline level, yes you met/didn't
... but testing level we get to impact on users
... how that math works, I don't know

jf: For this I like BB's multiple currencies metaphor
... suggesting the type of evaluation level used for a test be different from the eval category/type used for a guideline
... build up a total score from multiple parts

<Zakim> jeanne, you wanted to give a sample of scoring

<Zakim> Lauriat, you wanted to talk about applying at both levels.

js: Suggesting a look at one of the bad examples to see whether our approach would give us the scoring we want

<Lauriat> Test one: http://john.foliot.ca/demos/HeadingsTest1.html

js: Walks through her approach ...
... Suggests my attempt scored too high
... Too many scoring items watered down the problem pieces

sl: Think there's a way to avoid that ...
... By adding impact pass on top of the scoring
... A two-phased scoring model
... First phase at guideline level
... Then a contextualized task scoring analysis
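
A rough sketch of what a two-phased model along these lines could look like. The combining rule (scaling the guideline score by the worst realized impact), the `Failure` type and the impact values are assumptions made for illustration; the group had not settled on any math.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    description: str
    impact: float  # assumed scale: 0.0 = no realized impact, 1.0 = blocks the task


def two_phase_score(outcomes_met, outcomes_total, failures):
    """Combine a guideline-level score with a task-level impact pass."""
    # Phase 1: guideline level, the fraction of functional outcomes met.
    guideline = outcomes_met / outcomes_total
    # Phase 2: contextualized impact, driven by the worst failure in the task.
    worst = max((f.impact for f in failures), default=0.0)
    return guideline * (1.0 - worst)


# Same guideline result (2 of 3), different context for the failure.
side_note = two_phase_score(2, 3, [Failure("bad heading in a side note", 0.1)])
main_flow = two_phase_score(2, 3, [Failure("bad heading in the main flow", 0.8)])

print(round(side_note, 2))  # 0.6
print(round(main_flow, 2))  # 0.13
```

Using the worst impact rather than an average is one way to keep a serious problem from being watered down by many passing items; other combining rules are equally plausible.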

js: How to determine that systematically?

sl: Lots of definition
... Expressed in realized impact to users

js: Is the suggestion to require task completion testing?
... Don't think that would work

jf: Suggests it could work
... Walks a series of questions to satisfy ...

sl: Don't think this will impact small sites
... a small site would have just a few tasks
... Guideline scoring exposes what needs fixing
... e.g. did not meet functional outcome x for guideline a

js: Need to see this laid out ...
... Not seeing the path to the end result

sl: Yes, not yet clearly laid out
... don't have a map for how to apply

js: So what are the steps to getting this mapped? We'd need those to publish FPWD

<CharlesHall> one dependency here is “what is a task?”

sl: need to discuss with ACT
... Can we define a structure for this?
... Suspect it will start out very complicated, but then we'll be able to simplify
... Looks at hours of operation task for a pizza shop ...

jf: Wants to share screen shot to illustrate--with apologies
... Showing page and planning to describe ...

<CharlesHall> and i would define most of Shawn’s example as the path or wayfinding to the task

jf: example of ax pro
... running against w3.org
... notes a series of guided pages for various issues
... walks through a heading test with a series of questions for human response
... we're using act format but we step you through
... generally true/false
... wants feedback
... asks whether this illustrates our silver principles

ch: wanted to ask -- do they all appear to be headings -- if I say no because of 1 out of 29

jf: Don't know yet, actually. can investigate
... we need to normalize this across all scenarios
... believe this kind of approach is the way (the WAI?)

<jeanne> I don't think this is task completion testing, this is improved (greatly improved!) manual testing.

sl: believe it's in line with our guideline testing

<JF> Sign up to give Axe Pro (beta) a test drive: https://www.deque.com/axe-pro-sign-up/

sl: would like to walk the tester through creating task tests

jf: +1

<Zakim> jeanne, you wanted to say I don't think this is task completion testing, this is improved (greatly improved!) manual testing.

js: Very impressive, but not task completion testing, but hugely improved manual testing!

jf: wanted to show step through process, believing we will need tooling for scoring
... getting essentially the same score from various testers is a major goal
