Silver Task Force & Community Group -- 07 Apr 2020

<ChrisLoiselle> scribe: ChrisLoiselle

I can scribe until 9:50am. I need to drop off call for another meeting at that point. Sorry!

<bruce_bailey> Partial regrets, I have to drop off after the first half hour.

<sajkaj> I have a brief agendum request -- re current WBS for Challenges doc? Could we do that?

I'll scribe until 9:50am, need to drop at 10am

Jeanne: Scheduling change request. Adding in AGWG members to Silver calls.

Europe attendees would like to attend at reasonable time.

Conformance will be talked to in whole group.

<JF> 3 meetings/week = too many

<CharlesHall> +1 for change. no reco for what to change to.

<Lauriat> +1 (from all, I think)

<Chuck> +1 for change, +1 for poll

ShawnL: The later call on Tuesday on conformance blends with the earlier call.

mapping WCAG 2.0 to functional needs (from Bruce)

Jeanne: Mapping WCAG2 to functional needs, did you want to say anything Bruce?

<jeanne> https://kengdoj.github.io/WCAGTo508FPC/WCAG2FPC_datatablesnet_local.html

BruceB: This is our mapping from 508 functional performance criteria against wcag

The plan is to get to section508.gov but the timeline is not specific yet on global release. please comment via github or email etc.

JF: What were disagreements with EN 301 ?

BruceB: Primary vs. secondary vs. supports... "disagreement" is not the word I wanted to use, it is too strong a word.

E.g. video-based help, where captioning needs to be included in the help video.

JF: This will have impact on scoring discussion.

<CharlesHall> we have had other lists along the way as well – all to extend the EN list – looking for them presently

BruceB: We wanted to see where we were on interpretation of the functional criteria. The US talks to WCAG 2.0 rather than 2.1; Europe is moving toward WCAG 2.1 as soon as they can.

JF: Supports with exceptions could be measured on a scale, shades of grey discussion. Bruce: could talk to Primary, secondary and not so strong. JF: Deep conversation that may need to be discussed later.

BruceB: the bigger AGWG could review a functional table at a greater length to align with scoring mechanism.

<bruce_bailey> Here is the GitHub link for commenting:

<bruce_bailey> https://github.com/kengdoj/WCAGTo508FPC

Jeanne: This may be a great topic for another F2F meeting.

<CharlesHall> plus all the COGA functions (functional needs) like these: https://docs.google.com/document/d/1QsiD0Y0lLCXvbmOOC4-EPf-2lFEPoEMaqNomQtPzBQI/edit?usp=sharing

WBS for Challenges

<sajkaj> https://www.w3.org/2002/09/wbs/35422/Conformance-Challenges-FPWD2/

Janina: 3 issues for WBS challenges have been reviewed. Comments are welcome; we will discuss next week.

Schedule of meetings

<sajkaj> https://w3c.github.io/wcag/conformance-challenges/

Sites to test testing

Jeanne: Real testing with real sites is critical. We are looking at test results from real sites.

No procedures, rather just results

<CharlesHall> re: Agenda item #2, i found 2 of the spreadsheet docs where we extended the functional needs and mapped them to 2.x

<Chuck> scribe change!

If we have any diverse sites, that would be helpful.

I need to drop, scribe hat off.

thank you.

<Chuck> scribe: Chuck

Jeanne: Jake is up.

jake: I have been busy the last week with setting up an approach I would like to share in the near future. I thought you were asking for scoring examples and how we score it.
... But my q is 2-fold. First of all, we have test results, but everything is behind a login, you can't see the app or native apps. Still it is data I use to show how we score, and I take into account everything mentioned in silver.
... Would be great to share. But with the site behind login... and also it takes some time. I'm working on it complete days, it will be mature in near future, but I can't share right now. I need to set up in excel and word.
... It will be a good approach to gradually show this information.

Jeanne: Just to be clear, is this something we can make public, or parts public?

jake: yes.

Jeanne: I'm fine with private testing for now. Once we decide that we have consensus on a scoring mechanism, we need to have public tests. But for now private is fine.

jake: I can share on a daily basis. I don't put company specific information in there. But it's a process. I'm setting up my versions of tests.
... takes a long time to work out. I'm working on it. Wondering where or how we will share it. I'll send you the link.

Jeanne: That would be great. I'll take a look, we can put it on an agenda, when you are ready.
... ...needs for testing and other sites. Maybe when I take a look it will do everything we need, but I'd still like volunteers. Any volunteers to use Bruce or Rachael's proposals?
... Silence for volunteering. Bruce in q.

<Zakim> bruce_bailey, you wanted to say just use Rachaels

bruce: Let's just use Rachael's.
... It's light weight, it doesn't neglect core basic reqs, while still giving credit for people going beyond basics. We should pursue Rachael's.

Jeanne: Great, that helps the workloads.

<Zakim> Rachael, you wanted to say we likely need to iron some items out before using it

chuck: I think I can get Chris L. to test Rachael's approach.

Rachael: I think we need to iron out some details. Just to make it consistent for people testing, even if we change it later.

Jeanne: Great segue into our next topic.
... Anything else on testing before we move on?

Shawn: Does anybody have url handy? Can someone share?

Jeanne: Rachael do you have it handy?

<Rachael> https://docs.google.com/document/d/1D18qg5pvne94jNvUwvj_Of36E8re6AB4tubdLpmhWOw/edit#

rachael: Pasting in now (above).

Tasks and conformance

Jeanne: One of the things that Rachael put in her proposal is a difference between basic and advanced tests. What we discussed in the AGWG call was: how do you separate out basic from advanced?
... That was going to be a huge overhead for the wg, because it would have to be discussed per test, updated and changed often. Seems to be a big overhead. I proposed instead of doing basic and advanced...
... Was manual/automated testing. Guideline based vs task completion. Then we ran out of time. I'd like to discuss here.
... In other words, any test that is related to the individual guideline, like the headings test that can be automated, a headings test that can be a rubric against quality measures...
... would all be considered basic tests. But a task completion test would be considered advanced in Rachael's proposal.
... One thing that came up about that and is important was a caution from Kim that for many companies task completion testing comes at the design phase not at the q/a testing end of project phase.
... We don't want to be in a situation where we set companies up to put their usability $$ into q/a testing and taking away from design into q/a, would make situation worse. Usability needs to be in design not q/a.

jf: One of my concerns is that this type of functional user testing is basically mortgaging the future. A workflow or task completion flow that is designed in 2020 is expected to still be accessible in 2023.
... The task that they designed will morph over the 3 year period. I tossed out an idea of a decay rate. If you don't keep validating, the score you got when first designed will be different from the test score 3 years later.
... We gotta have some kind of decay rate, not sure what else to call it. This notion that time is a factor. An automated test gets immediate results, but when you start test workflows and completion, those will change over time.

<Zakim> Rachael, you wanted to agree with switching away from testing method as a divider and to suggest that we divide them by criticality of test, also to state that we could reduce the

Rachael: 2 things. I agree that test method doesn't make sense as a divider. One point worth calling out: if you test something manually that can be automated, it should not be advanced.
... Instead of task completion vs guideline test, how about level of test, like A, AA, AAA, but not quite. Headings, for example: is there a heading structure? Then a more advanced test:
... does the heading structure map to the content accurately? And a 3rd level: the AT doesn't require it, but there are still lots of benefits.
... An example of how these tests might be divvied up.
... To John, every 3 or 6 months, if you test and get a perfect test, and if you don't test for 6 months you lose '1' from the score.
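Illustration only, not something proposed in the meeting: Rachael's example of losing 1 point for every 6 months without retesting could be sketched as below. The function name, parameters, and the floor at zero are hypothetical assumptions; the group did not adopt a decay rule.

```python
from datetime import date


def decayed_score(initial_score: int, last_tested: date, today: date,
                  period_months: int = 6, penalty: int = 1) -> int:
    """Subtract `penalty` for each full period elapsed since the last test.

    Assumption: the score never drops below zero (not stated in the minutes).
    """
    # Whole calendar months between the two dates.
    months = (today.year - last_tested.year) * 12 + (today.month - last_tested.month)
    if today.day < last_tested.day:
        months -= 1  # the current month is not yet complete
    full_periods = max(months, 0) // period_months
    return max(initial_score - full_periods * penalty, 0)
```

With these assumptions, a score of 10 recorded on 7 Apr 2020 would read 9 on 7 Oct 2020 if no retest took place.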

jf: Yes, thinking about that, not sure if it will be that simplistic, we need balance. I understand this is illustrative. The idea that the longer you wait between verification impacts the score.
... We said our score would be additive, but failing to test regularly would introduce a punitive concept into the score.

<Zakim> Lauriat, you wanted to add to the shelf life idea

Shawn: I want to add to the shelf life idea, not in terms of decay, but in terms of when testing is done. Rather than have a decaying score, I echo concerns around that, I think it's important to note when the testing is done.
... Similar to elevator inspection... on THIS date we did these tests. With a very long software dev cycle, say COGA walkthroughs done in 2016, when we launch in 2020 and say that we tested 4 years ago, it doesn't look as good.
... "as of date" stamp is helpful. Comes down to context of how bad or good that is.
... If you have a lifecycle of a couple of years, having something that says "we were in good shape last year" may be fine, whereas with google docs shipping frequently, greater frequency is more important.

<Zakim> jeanne, you wanted to say that I disagree with decay of score over time from testing

<Rachael> +1 to date stamp

Jeanne: I'd like to... I disagree with score decay over time, I like the idea of a date stamp. Makoto proposed that months ago. I think about Lainey Feingold's site, she had the first AAA site.

<KimD> +1 to Jeanne - we have to consider "static" sites

Jeanne: Her website adds content, but doesn't change. To take a single website like hers, and she doesn't change it other than adding material, and she tests new material, nothing old changes...
... Many websites don't change. I know that there are examples where the scores should decline, but we can't say that across the board.

JF: Template files don't change, content does all the time.

<Zakim> sajkaj, you wanted to say I'm uncomfortable with "time decay" as a concept

sajkaj: I like the sensitivity of design at one point and tweaks along the way, you can't make the same conformance statement, but I don't like an automated decay system. It's not time, it's changes.

<KimD> +1 to Janina

sajkaj: Possibly changes in browsers. If you simply add more content, the Lainey Feingold example is a great one. We have to be sensitive to it, but not based on time, based on something else happening.
... Putting a timestamp and version on it becomes important documentation in validating the results you are claiming. Those change over time. There has to be some knowledge that there is a reason for testing again.
... and some guidance on when and how, but not based on time. Not a half-life.

<CharlesHall> so scope of conformance should indicate date and version of a thing

jf: I understand what people are saying, I don't disagree. Analogy: Pickup truck that was inspected in 90's, do you trust the results today? At some point we need to admit that over time there is a high probability that the old report is not accurate.
... We may get some sites that claim WCAG conformance when it is not true. When you talk to engineers, in any endeavor, you stress test what you have done. Failing to do so results in a situation...
... in 2K we were looking for cobol developers because of y2k.
... We need to timebox our conformance in some way shape or form.

<jeanne> +1 to use percentage of changes rather than time.

jf: To say that a conformance report that's older than 5 years is still accurate flies in the face of reality.

<Zakim> Lauriat, you wanted to -1 to Janina's point. Browser and AT changes year over year break things all the time.

shawn: +1 to John's point, not in the fact that I think we should timebox, but older assessments are misleading. I think that the L/F website that hasn't changed isn't bad, but changes in browsers and screen readers cause bad things, and we need to be more pro-active.
... It breaks constantly. Doing things today doesn't reflect user experience over the years.
... I think that a conformance statement should reflect date and version. "2021 we tested", not just level of tech support, but we want Silver to be supportive of tech changes.
... If we add a number of things that can be done by 2023, then 2020 is way out of date.

<JF> +1 to Shawn

<CharlesHall> another time-based consideration is determining top tasks.

Jeanne: Do we have some kind of agreement of using a time and datestamp? This is something we discussed months ago, and coming back up again.

<JakeAbma> +1 to time / date stamp

<KimD> +1 to time (date) of the conformance claim

<CharlesHall> +1 to scope of conformance including date or date range

<kirkwood> fully agree with date

shawn: Not an expiration date. It depends on tech and use cases, to JF's earlier points, needing cobol developers because cobol was still in use.
... There will be ancient web systems we interact with. A census page required netscape 6, which doesn't make sense in 2020.

jf: If we put a timestamp on a conformance report, for what purpose?

shawn: Transparency. Somebody can make a decision if the report is valid or ancient.

<CharlesHall> we are already asking for the claim to define the scope

john kirkwood: Date is a fantastic step forward, I've dealt with this a lot. How to understand conformance timelines and when lawsuits come into play.

<sajkaj> Want to suggest date stamp plus some kind of list of technology involved, incl browser versions

john kirkwood: Dates and showing that updates are occurring frequently makes big difference in courts. We could prove the date but it never was published. Dates would show that effort is being made.

Jeanne: JF, would you accept a date stamp?

jf: A date stamp is useful, but any kind of report that is created is already dated. If I write up a report in MS Word, there's a date stamp in the metadata. I agree with JK that it's useful for evaluators.
... Seems to me that with commercial market forces... they will want up-to-date conformance reports, old reports will lose trustworthiness. If we don't introduce decay, I will still not trust a 6 year old report.

Jeanne: I'd like to get to in the conversation, group consensus is to put a date on it. Will you accept this, or do we need to debate more?

<kirkwood> +1

jf: I'll accept the consensus, but it will probably come up in the larger AGWG group.

<sajkaj> +1

<Lauriat> +1

RESOLUTION: We will put a date stamp requirement on conformance.

Rachael: Is there somewhere to put down reasons why we aren't using decay? To share with larger group as to why decision was made?

Shawn: Perfect in meeting minutes summary. We can include things like "we considered this instead, and here's why we didn't go with it..."

Chuck: I don't have time to write up summary.

Shawn: One of the other things that will help illustrate this, as we run sites through overall conformance, illustrating how this will work with various kinds of sites. Very large and dynamic site.
... Doing a full regression and test is too cumbersome. We can use this concept, because the oldest test date across the overall functionality would be the date that would go on there.
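Illustration only: Shawn's point that the oldest test date across a site's functionality becomes the date on the overall claim could be sketched as below. The function name and the mapping of feature names to dates are hypothetical.

```python
from datetime import date
from typing import Dict


def overall_claim_date(test_dates: Dict[str, date]) -> date:
    """Return the date to stamp on the overall claim.

    The claim is only as current as its least recently tested part,
    so the oldest per-feature test date is used.
    """
    if not test_dates:
        raise ValueError("at least one tested feature is required")
    return min(test_dates.values())
```

For a large, dynamic site this avoids a full regression test before every claim: retesting only the stalest feature is enough to move the overall date forward.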

JF: I appreciate that we don't want to beat on this horse. In wcag 2.0, conformance is at the page level, has pros and cons.
... Do we want to consider putting in a clause that invalidates a conformance statement over time? A statement over x years old is no longer deemed accurate?

shawn: No, doesn't make sense. In some things 10 years is acceptable, but in other things 10 years is too old. Just having a time stamp would provide enough context to make the judgement call.

<KimD> +1 to SL

jf: You added a data point, of date and version. That presumes we will have different versions of wcag 3.x.

<jeanne> +1 to Shawn's proposal

jf: I don't disagree with that, we have solid data to make the determination. I know the date and version, I have more info to make decision.

Shawn: Same thing as if something points to wcag 2.1, "we conform to this", in 3 years that will probably be out dated.

jf: In your statement, what you did say was we would include a version, which confirms to me that we will continue to have versions.

Shawn: I think that's not relevant, because if you make a date, you know the version of wcag.

jf: Not true in our experience.

shawn: If we have a statement in 2020, just as telling as someone who says we tested to wcag 2.2 in 2020.

jf: I'm hearing that we will include version along with date. And WCAG 3.x will be 3.1, 3.2, 3.x.

shawn: Numbering is arbitrary. We will have some versioning, but we will have to have the conformance statement state what we are conforming to.

jf: If that's the consensus, that meets my model.

<JF> Date & Version
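Illustration only: a conformance statement that carries the date and version discussed above, plus the technology list suggested earlier by sajkaj, might look like the sketch below. The class and all field names are hypothetical; no format was agreed in the meeting.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class ConformanceClaim:
    scope: str                          # what the claim covers
    guidelines_version: str             # what is being conformed to
    tested_on: date                     # the date stamp from the resolution
    technologies: Tuple[str, ...] = ()  # e.g. browser and AT versions

    def age_in_days(self, today: date) -> int:
        """Let a reader judge whether the claim is recent or ancient."""
        return (today - self.tested_on).days

    def summary(self) -> str:
        return (f"{self.scope}: tested against {self.guidelines_version} "
                f"on {self.tested_on.isoformat()}")
```

The record deliberately has no expiration field: as Shawn said, the date gives context and the reader makes the judgement call.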

shawn: We are at time.

Jeanne: No Tuesday night meeting, we'll meet on Friday.
