W3C

Accessibility Conformance Testing Teleconference

09 Nov 2017

Agenda

Attendees

Present
MaryJo, Wilco, Anne, Shadi, Jenn, SteinErik, Romain, Kathy, Jeanne (Observer), James (Observer), Gian (Observer)
Regrets
Chair
MaryJo, Wilco
Scribe
Jenn, Gian, SteinErik, Anne, Shadi, Romain

Contents


SA: So Wilco was saying that we've received good feedback on the rules, which is a good sign.

W: So the agenda is to come up with more questions. So, let's see, we have... our next draft is due in January, so we have some time to work on this, but we should aim to get all these issues resolved by Draft 3. Is there additional work?

MJ: Draft 2 was back in March. So we didn't skip it.

W: Yes. The other big thing we want to do is hammer out this approval process. How do we decide when they're ready to be published? We've been ready for some time now and want to get this done by this year. In addition, there is the Test Cases format, which would be nice to get done this year. Or should we say, with two organisations moving on that, it's not as time-sensitive, so if this runs into next year that's fine too. At that p[CUT]
... is done, we can start to look at Test Cases and rules as we need approval process sorted, and in 2018 - start building these rules and getting them implemented.

MJ: And at least two organisations to sign up

W: Just use "..." to continue on. Just use a Donald Trump sound. :)

SA: What we need to understand across all these efforts is how we can leverage existing work... primarily in the testing process, but also in the rules in general. If someone wants to submit 100 or so test cases, how we handle that. Here are several test cases, we need a scalable way to work with this... or if it's sometimes one-by-one or a batch, which is more ideal.

W: Working with MJ before you guys came in - the most plausible way to work is to start auto-generating rules. I've seen how aXe can output rules... it will take some effort to write up additional documentation, but the best way to upload rules is to auto-generate them based on our source code

MJ: for example, if you have well documented and well-annotated rules, you can pull these up using a script... start the process of generating a text-based rule.

SA: Company A can upload and generate their own test cases, then they can throw a lot of their documentation our way. We want to make it easy for companies to send or auto-generate rules and easy for us to accept them

MJ: We can come up with a good way of documenting new rules and then auto-generate the initial code template

SA: Hoping when someone sends us the rules and requirements and others can run their test cases against these. At the end of the day, we want a way for people to share

MJ: We want our members to be contributing rules and not overburden the group

SA: Yes, the idea was for people to contribute techniques... but it was slow, so people are very wary of that. No - we don't want the W3C group to set up the rules, but make sure we communicate that our job is not to create test rules but to enable the community to drive the test cases

Qin: I pronounce it "T-sheen"

W: Qin, can you tell us more about the work you do and how this relates to the group?

Qin: we work with games, wifi is everywhere, we can go out with a phone and do everything to reach users. I want to know what you're talking about, is there some focus on the phone? What is the topic we are discussing?

W: We need to take the same stance on mobile as the rest of W3C. It's not a separate platform - the web is the web, regardless of screen or device. In terms of testing, it doesn't change things massively. The reqs for a11y testing are the same regardless of what device you're viewing the web on. There are new rules coming in WCAG 2.1 specifically for mobile, so how do we take those tests and use them?

MJ: you don't have to move to a mobile platform to test. There are new a11y reqs being added with mobile in mind - i.e. smaller screen devices - tablet, mobile. And the desire to meet user needs so people can interact well with mobile devices. WCAG 2.1 is planning to be complete in June 2018 and so we'll be wanting to make sure we have test cases ready to test those new reqs. We'll definitely be focusing on mobile in that aspect.

W: Qin - not sure I got that - do you work on a11y testing, or testing tools?

Qin: Not much.

SA: and specifically on games?

Qin: Yes, but i think everything connects with each other.

SA: We also want to make games testable, so you are on our radar!

W: Right, so we were talking about...

SA: Yes - the intent is to share the work and work together. Start developing these things separately.

W: Yes, the difficult part may be when we get to generating the rules - ensuring that IBM's and Deque's rules end up being the same thing... there may be duplicates with variations in them

MJ: We want consistent test results, yeah

SA: We want to work on that review process to ensure we have a workable model. 1) Update the working draft for the rules format 2) Have the review process put in place and 3) Nice to have would be a rules repository

W: so Test Cases is first on the agenda. We have outlined a pretty straightforward JSON format. Quite possibly - start figuring out, what do we want to do with these
... So the difference is: with explicit, every page has a list of selectors for passes and failures - we point to this page, we have A, B, C passes and D, E, F fails. With implicit, everything that matches the selector is a pass. Does that make sense?
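
[A rough sketch of the "explicit" test case format being described, in TypeScript for illustration; the field names are assumptions, not the task force's agreed format.]

```typescript
// Hypothetical "explicit" test case entry: the page is listed together with
// the selectors expected to pass and the selectors expected to fail.
// "Implicit" would instead mean everything matched by the selector passes.
interface ExplicitTestCase {
  url: string;                // page the test case lives on (placeholder)
  successCriterion: string;   // WCAG requirement the case maps to
  passed: string[];           // selectors expected to pass
  failed: string[];           // selectors expected to fail
}

const example: ExplicitTestCase = {
  url: "https://example.org/test-cases/img-alt.html",
  successCriterion: "wcag20:text-equiv-all",
  passed: ["#a", "#b", "#c"],  // elements A, B, C pass
  failed: ["#d", "#e", "#f"],  // elements D, E, F fail
};
```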

SA: I'll leave it to you guys which format is best.

W: We need both. For aXe-core, we already have this. What we do is point to every individual pass or fail. I know Alistair said he liked this initially, then he did something else, so he may have moved on from this. Not sure

MJ: When IBM gives test results, we only point out fails or potential fails ... we don't point out passes.

Anne: With Siteimprove, that's what we do at the moment

MJ: With this, we don't have the noise of pointing out passes - it's busy enough to sort out all the fails

W: Passes are useful to show if something's been caught by the selector

MJ: This can be an optional part of this

SES: I see your point. It's a pretty common practice - so not sure if we should make it mandatory to record passes

W: I understand. We want to make this available to everyone - figure out where these differences are between orgs ... where we miss things, where you miss things, so we can have a discussion

MJ: As a validation of the test rules, it's valuable to highlight passes

W: This wouldn't necessarily be for when we publish rules, .... it may come with a format like this (on screen, see slides) ... for each test case repository - and we've identified a lot - we flag these things so we can discuss it. Right now this data exists but we're not taking advantage of it

SA: This is German engineered, but some of the things that are in there are individual resources being pointed to. We just see here that it's only a pass or fail being shown.
... but I want to understand this more

W: There's a load event and some of the content gets built in. There's a timer needed to be built in to show it afterward

SA: For example, you go on to a form, then you tab, and you get an error saying invalid email address. Then there's colour contrast violation ... that's easy to show. But we need to show that tab has been activated. Not even AJAX done
... So are these things in or out of scope, what do we want to do?

W: We have test inside of frames. But how do you do selectors inside of frames? There's no standard way of doing that.

SES: So you descoped that?

W: Yes, we have a build step that generates this.
... The test cases format - explicit - does not support that. How do we write selectors that target inside frames or shadow DOM? There's no standard way of doing that. We need to align that but it adds a lot of work. The goal however was to make it usable today. If I have to update aXe to change how this happens... I may not get to it down the line or at all

SES: Can we go back to the purpose? Again, we're discussing what's in /out of scope...what's the purpose of the act? When we present our findings, a lot of things we do don't fit inside the framework, which challenges the purpose of the work we do. What is that?

W: Purpose of ACT?

SES: Yes... I hear you say - the purpose is to write rules that we can use today without re-writing...

W: No, sorry, talking about writing test cases.

SES: Ok, sorry, I get that.

Anne: This looks like our output format - why is this different when it's very much alike? Can one format cover both test cases and output? ... that's why we are confused
... Test cases are out of my league though, don't know much about it

W: I suppose it could be. I think it requires more work - for example, sometimes it does not record violations in a readable format so you have to generate that. It takes more time and complexity to build that

SA: There is actually also work by IBM years ago... the idea - the test case description language is more than a result: it tells you the resources, the actions to be done, the fix... all these things in steps - and then your result will be such-and-such. Just the result is the output format. Of course they should be compatible. The test case description has more information about how and what to do. You wait a few seconds, etc.. there is a link - first[CUT]

<shadi> https://www.w3.org/WAI/ER/tests/usingTCDL

SA: putting this in the IRC
... it was so elaborate and we took subpart of it. You could even outline test cases for manual testing. It was done in XML which is now a no-no. It shows what tech it relies on - HTML, etc. It outlines the purpose, preconditions, assumptions... we need to decide as a group - I hear what Wilco is saying - let's only focus on the simple test, find a page and run this test on it... but what we want to do.
... Says the rules format should cover everything. You type in something and a message appears in the form - the rules format should support that. but how should the test case support that?

W: I don't think we have that today. "We" is not Deque but ACT. We looked at this format a year ago, to see what the different members have today and how we can expose it in a way that means other members can use it.

SA: so we will start with a certain sub-set of what the rules format can do... a certain set of test cases

W: yes, I think that makes sense. To do it in a way that works for everyone today.
... I want to avoid us over-engineering a format to stop us progressing

SA: I agree - that was our problem last time. I want to make sure we support backward compatibility

W: Hi Jen :)

SA: We want to be as simple as possible but address easy (error messaging) but also complex ones

W: Direction is very tricky, especially on live sites. Not sure about that. there's always the risk of accidentally submitting a Delete request

Jen: The delete request needs to be confirmed by the user

W: I don't trust browsers to do that. or people to do that

Anne: We are at? No automation

SA: Test case description - is this sufficient?

Anne: not sure how it maps into what we do today.

SES: We have current issues with the output format. The factors that affect it - user agent, screen size - will affect test results. There are a lot of ways to get the selectors. If we have problems describing our rules in the tool ...

Anne: We don't have selectors.

SES: As an example - there is the landmarks one... because we check things both in the source code and the DOM interchangeably, to do different tests, we might add steps from both in a test.
... We don't want to stop the discussion, let's run through it later.

W: Ok. The concern on the table is - does this format cover enough, is it flexible enough? (Test case format (Explicit results))... the answer is easy - probably not. It's basic, so it doesn't take much work to upload or to start consuming test cases from other people who use the format... we can expand on this as we progress our work here.

SES: I don't see a problem as long as we're not hindering our expansion later... we can start here.

W: That's my hope anyway.

Anne: The test case or what it's called right now... it won't be part of the recommendation will it?

W: No.

Anne: That's good - we can start here and expand and not hinder innovation.

W: Yes, that's good. Consider what status this has. It's not part of the rules format. It begs the question though, what should it be?

SA: Test case description language?

W: I think that's a step too far. And do you know why it never went beyond where it is now?

SA: First of all, back then we didn't have you guys on board. It was more research-driven, or applied research. Carlos would kill me if I called it research. I think the issue was we tended to overengineer for all the possible scenarios.. TCDL was elaborate and heavy. Being pragmatic and starting with something low-level that people can contribute to now - that is good
... industry vendors are starting to share their work, which ... I'm more hopeful. These are things we can update more... the question is to make sure we don't put things here that will block us later on. The test case descriptions can change. We can change the JSON file
... the static stuff - we can start with those tests because they won't be wrong later on. We can have more comprehensive scenarios

W: One part of this is ... who is actually sharing their test cases today? So, Deque and IBM, and Level Access is interested in doing it... is SI considering it?

Anne: We're going to open source part of our check engine... but the state of it now, we don't want to share. it will be the new stuff we'll share.
... This was discussed just before we entered the task force. going back to see if devs have an opinion to share before tomorrow

W: One weakness is the success criteria field - we need more than just to point to it. We need a way to point to specific rules as well
... There is work to look into that

SA: And the difference between the WCAG requirement and the test case that I ran - e.g. 'missing alt attribute' - which is why there's a violation here

W: Not sure I follow...

<shadi> https://www.w3.org/TR/EARL10-Schema/#TestCriterion

SA: so the req is that non-text appears with an alt. So the test you run is several... one for the req and one for the actual test case you ran. Sharing in IRC...
... We have test requirement which is something like 1.1.1. ... and then the test case specifically that you ran, and the test case - you can have several test cases for a requirement. So you can say this test case X is part of requirement 1.1.
... you can return to the user the WCAG req that failed but also the test case that failed
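
[An illustrative sketch of the requirement/test-case relationship Shadi describes, in TypeScript; the TestRequirement and TestCase class names come from the EARL 1.0 Schema linked above, while the property names here are assumptions.]

```typescript
// Illustrative only: a result points at both the specific test case that was
// run and the WCAG requirement it is part of. Property names are assumptions,
// not the EARL vocabulary itself.
const requirement = {
  type: "TestRequirement",        // cf. earl:TestRequirement
  id: "WCAG20:1.1.1",
  title: "Non-text Content",
};

const testCase = {
  type: "TestCase",               // cf. earl:TestCase
  id: "mytool:image-alt",         // hypothetical tool-specific identifier
  isPartOf: requirement.id,       // a requirement can have several test cases
};

const result = {
  test: testCase.id,              // the test case that failed...
  requirement: testCase.isPartOf, // ...and the WCAG requirement it maps to
  outcome: "failed",
};
```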

W: That makes sense... and we may need to add a rules field - which rule or rules the test relates to. So Kathy... Jen... to you, I'm guessing you also have test cases that you work with?

Jen: Yep, 635.

W: Wow.

Jen: We need to see which can be tested automatically by the tool..

SES: We haven't met

Jen: I've spent six years with the W3C... OzART... consulting since 2002
... OzART is an automated testing tool. I'm only in Oz six months of the year but yes we should talk. 2am on a Tuesday I can't do...

MJ: You're in luck, we're changing the working group time!

Jen: it's a bit easier now the time zones are more closely aligned. I really want to work on this - the aim is important - tests pick up different things. My tool might do it quicker or with more pages but there should never be a discrepancy about what is agreed as a pass or fail
... I've had a lot of arguments with the group about testability - there shouldn't be a discrepancy either about what can be automatically tested

W: you can do the homework too! We've come up with a plan to define what the test cases are, identify the violations and the processes

Jen: I'm not a developer... that seems like code to me! (screen)

W: High-level - here's the page to get the test cases from... here are the passes and fails on the tested page...

Jen: ...should we be referencing techniques instead of success criteria? My thought is that techniques should be referenced instead

Kathy: Hi! I'm Kathy Wahlbin, accessibility

Jen: Siteimprove: You win work off us!
... Where's Level Access, where's Monsido?

SES: Monsido is copying us. (Laughter)

<shadi> https://www.w3.org/2000/09/dbwg/details?group=93339&public=1

Kathy: There are break refreshments downstairs so if we want to go... I don't care, I have my coffee...

Jen: So is everyone agreed?

W: Yes... we can certainly add a technique at the same time as the pass/failures
... We need rules and techniques

Anne: Do we need it both places? same thing with success criteria?

W: in principle you can, but do you know how to map the success criteria to a technique...? if we're doing this, please show me ... my rules don't match your rules today, but we always use WCAG

MJ: If we agree it's easier to find and compare
... We can figure it out from there.

[Taking a break]

"Jen" = Dian

Dian: What do you think Level Access is doing for $40m?

SA: ... We can't discuss that and we don't wish to speak about a member that isn't present

Dian: Are there rules for the group? Could I discuss things over lunch?

SA: I ask that in a W3C meeting we don't discuss these things. But at coffee.

Jen = Dian = Gian

<gianwild> GW taking over as scribe

<shadi> scribe: gianwild

<Jenn_Chadwick> SA = SAZ, MJ = MJM, W = WF, Kathy = KW, SES=SES

<Jenn_Chadwick> Anne = Anne :)

Discussing test case format

WF: How are we going to maintain it? Are we going to keep developing it?

WF: We have been working on Wiki - everyone agrees that this works

Anne: if we add a rule would it be "SiteImprove rule" or "IBM rule" or "ACT rule"

WF: It would initially be "Company rule" and if chosen it would be "ACT rule"

WF: Va11yS etc. does not have rules (looking at the Accessibility Testing Repository)

WF: Happy to add two fields to the test case - one for the rule and one for the technique

KW: Will they be mandatory? Is the WCAG standard mandatory?

WF: No to the rules, yes to WCAG

KW: We have clients that have their own rules - mapped to internal standards, not WCAG

GW: would like to map to outside WCAG2, for example BBC Mobile Accessibility Guidelines

MJM: There are also EN or ISO or other guidelines

JC to SAZ: How does the W3C feel about adaptations like EN and ISO

SAZ: fragmentation of WCAG is what we are trying to avoid here - harmonize adaptations.
... we want to be part of the open sharing of rules - this is what we are trying to do with ACT
... goes back to scope - is it web only? what requirements?
... we can discuss later

MJM: If there is a rule that does not map to WCAG then it makes us look at WCAG to see what is missing

KW: Looking at design, development, usability, user testing with PwD, so they go beyond the guidelines, they may also be customizing for their specific industry - there are some things applicable and some things that are not applicable

JC - WCAG is just web; there are non-web requirements

WF: change "successcriteria" to "accessibilityRequirements"

SAZ: SC, techniques and rules currently in the outline, maybe criteria, test case, rule

SES: is test rule relevant?

SES: would make sense to test technique or failure, not sure that we need to use WCAG wording explicitly

WF: let's simplify this into identifiers
... merge all three: SC, tech and rules into one called "test"

GW: could we reference all three in the one "test"

WF: yes

<Wilco> "test": ["wcag20:text-equiv-all", "WCAG-TECHS:G90.html", "mytool:image-alt"],

GW: Looks good to me

WF: avoid arguments about criteria and rule and technique etc

All approved new format
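
[A speculative fuller test case entry built around the merged "test" array Wilco pasted above; the other field names are illustrative, not the agreed format.]

```typescript
// Hypothetical test case entry using a single list of identifiers instead of
// separate success criterion / technique / rule fields.
const testCase = {
  url: "https://example.org/test-cases/img-alt-01.html", // placeholder URL
  test: ["wcag20:text-equiv-all", "WCAG-TECHS:G90.html", "mytool:image-alt"],
  passed: ["#good-img"],   // selectors expected to pass
  failed: ["#bad-img"],    // selectors expected to fail
};
```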

WF: We'll keep working on this through the Wiki
... need to map WCAG requirements

Everyone should review to make sure the format works

WF: we are looking to set up a test runner to consume formats, launch page using Selenium, execute script for the tool to work, will be doing this with IBM, will be an open source project so we can all contribute

SES: not sure I understand

WF: So we can each write our own parser to process this format to open a browser with a specific URL to run rule but I think we all need to run this through Selenium
... we haven't set a meeting for that yet - it will be outside of ACT but all are welcome

SAZ: so what does a developer have to do, or for IBM have to do to make their tool compatible

WF: what we will be building is something that will run in node or java so we can send this format to it and it will return a page in the correct state so you can run it against this page
... run your own tool against it
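
[A minimal sketch of the kind of Node-based runner Wilco describes, assuming the selenium-webdriver package; the test case shape and setup handling are assumptions.]

```typescript
// Sketch: put a browser page into the state described by a test case using
// Selenium, then hand the live page to whatever tool the vendor runs.
import { Builder, WebDriver } from "selenium-webdriver";

interface TestCase {
  url: string;           // page to load
  setupScript?: string;  // optional script: trigger events, wait for content
}

async function preparePage(testCase: TestCase): Promise<WebDriver> {
  const driver = await new Builder().forBrowser("chrome").build();
  await driver.get(testCase.url);
  if (testCase.setupScript) {
    await driver.executeScript(testCase.setupScript); // bring page into the correct state
  }
  return driver; // the vendor's own tool runs against this page
}
```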

SAZ: thought you meant running tools automatically

WF: not necessary

SES: we have this already - a pipeline

SAZ: what do you mean?

ANN: what is the outcome

SES: for testing purposes

SAZ: run the script and it will output a bunch of HTML pages which you can run over with the tool and you can see if your results are correct

WF: no, that's not it
... take this format, and do the setup: download the page, wait until it is loaded, trigger events, and it will give you back that page; then run the tool against the pages
... returns one page at a time
... not have to write your own parser for this format

Ann: looking for the purpose
... is the purpose for me to check that my tool gets the same results as someone else's tools

WF: no, it is so you don't need to write code

SES: may be about how the tools work
... we can render pages in the browser - we already have this

WF: which is why we are doing this outside of ACCT

SAZ: Level Access did not want this either
... what I'm interested in is if someone ran this tool against a test case, how can they report it so we can find out if they support certain test cases or not
... want tool vendors to report which test cases they support

SES: incentive to report on what test cases supported

SAZ: where do tool vendors report that they support a rule

SAZ: we need a certain threshold to approve the rule

WF: don't understand

SAZ: not just a snapshot when approving a test rule - a living thing

GW: modify as required,

SAZ: big matrix of tools

https://www.accessibilityoz.com/factsheets/images/automated-testing-tools/

http://www.accessibilityoz.com/ozplayer/ozplayer-support-matrix/

GW: I think we need three levels: automatic, facilitated by the tool, manual

SAZ: we can go into it later but is this the kind of data we need to collect?

SES: who else updates it?

SAZ: tool vendors update as required

GW: modify the support of test rules, not the test rules themselves

SAZ: Tool version 1 supports these, tool version 2 supports others, let's just talk about collecting this data and making it available
... I hope this creates more competition between vendors, create more dynamic tools by taking up these test cases

WF: hesitant to create a matrix, ie tool x meets these rules

SAZ: each of the test cases not just rules

WF: why to that level?

SAZ: because we can and transparency
... if we are only on the test rule level, tool vendors may say they support it when they don't
... it is easy for the client to say "you don't meet that test case"

SES: makes it easy for the client to check that the vendor meets what it says it does

GW: I agree

SES: need to keep the processes as simple as possible
... one entry point and no redundant information

SAZ: make it transparent

Anne: if there are updates, it will be clear who has updated to new test cases etc

WF: am concerned, is it a race to the bottom?
... it's about incentives, if we incentivize hitting as many test cases as we can to see if you can push your competitors out, concerned that it is quantity over quality

SES: isn't there a quality assurance program in the transparency

SAZ: general question of the validity of the test cases
... assuming all test cases are good, if we have one test case we are better at, isn't it better?

GW: master errors that need to be met first

SAZ: could have a hundred alt attribute variations and incorrectly inflate the score of a vendor

WF: there are some rules that are highly valuable that would catch a lot of problems, some rules are edge case, want to avoid a numbers game to see you can write the most rules

GW: perhaps mandatory vs edge case

SES: who decides

GW: I would think ACT

SAZ: don't like the idea of gatekeeper

Anne: it's difficult to decide vs important and not important

Ann: people are going to use it in their assessment

<skotkjerra> scribe: skotkjerra

WF: I want to approach this discussion from another angle: We could manage this by having draft rules, unapproved rules, which are rules that do not have enough implementations yet.

What should drive acceptance should primarily be number of implementations

<gianwild> https://www.slideshare.net/seankelly612/csun-2017-success-criteria-dependencies-and-prioritization

SAZ: maybe that is a part of the answer. If we have minimum number of implementations for acceptance this will be the "trick" to validate test cases.

GW's presentation gives examples of how fixing issues reveals other issues

Success criteria prioritization

SAZ: we have certain kinds of levels or designators for the test rules. Test rules have to have a certain implementation support to be accepted as ACT rules.
... We already have a mechanism to avoid edge cases

ATN: If we have 15 rules to test the same requirement you can just pass all of them.

SAZ: You would always be in this race condition, which may lead to quantity over quality

JC: The FAR model works well as a prioritisation for fixes

ATN: Would one tool vendor be able to add a test case without approval

GW: I would have thought the ACT group need to approve submissions

SAZ: with 10 rules with 100 test cases with 3 vendors, you have 3000 pieces of information in your dataset... we may run into maintenance issues

ATN: if W3C doesn't support collecting this information, different tool vendors will do it individually, which will not be comparable

WF: one thing we have been using is running against HTTP archives to document how many times we are finding violations. that might be a way to validate rules to identify rules that catch violations.
... we could check how common the problem the rule detects is

GW: Frequency of a problem doesn't necessarily reflect importance

it is not a numbers game. Test rules could be split up in groups to demonstrate the spread over e.g. success criteria

SES: We have the following suggestions:

1. Using the number of implementations to assess the usefulness of a rule

2. Dividing test rules into groups to better demonstrate spread

<shadi> https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/ACT_Review_Process

<gianwild> scribe: gianwild

SAZ: When we developed techniques to determine support
... we tested, which was put in uaag, was done in 2008 so out-of-date
... how do we keep it updated?
... continuously have developers disclose what they support, hopefully developers with good tools will lead the way
... the user has more information about which tools they can rely on
... some tools are focussed on one thing - like captioning, so it is a specialized tool in a particular area
... we have test cases we expect each tool vendor to run these test cases - can we get this information back?

WF: we can always go with self-reporting

SAZ: rather than having a central database, the tool vendor just publishes a file and maintains it at their own discretion; they just tell us where the file is and we pull the information

GW: I like that

Anne: manual labor

WF: could be built into the development process

SAZ: we provide the format for a JSON file: the test cases and whether they pass or fail
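
[A hypothetical shape for the self-published implementation report being discussed: a JSON file the vendor hosts and W3C pulls. Field names are illustrative, shown here as a TypeScript object.]

```typescript
// Illustrative vendor-maintained implementation report: which test cases the
// tool was run against and the outcome for each.
const implementationReport = {
  tool: "ExampleChecker",   // placeholder tool name
  version: "2.3.0",
  results: [
    { testCase: "act-rules/img-alt/case-01", outcome: "passed" },
    { testCase: "act-rules/img-alt/case-02", outcome: "failed" },
    { testCase: "act-rules/video-captions/case-01", outcome: "untested" },
  ],
};
```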

Judy Brewer enters

WF: lunch!

SAZ: one thing is to say rules are applicable to other standards; we are focusing on WCAG work, so we could just sort or list according to WCAG
... number of implementations are important, eliminate manual

WF: Can you claim implementation, if you're not using the rules
... you have test cases, run tool and pass all
... but if you are running one rule that tests five test cases

GW confused

SAZ: checking for 1.3.1: one rule that tests for ordered lists, another that tests for unordered lists
... tool vendor has a tool that checks for lists

SES: we would have the same issue, are we over-complicating things, specifying granularity

WF: this is essential, it's about how you report results
... if siteimprove tests exactly the same as axe but siteimprove is more granular, siteimprove will address more rules

SAZ: if our rules are extremely granular then it doesn't matter how the tool vendors implement as they can claim each rule
... this is what Alistair was advocating for
... do we want to be that granular

Anne: the outcome is to say how many ACT rules we meet - one-to-one mapping

SAZ: so ACT rules are ways to organize test cases; also, where there is manual testing, it is documentation
... add note that internal workings do not need to map
... as long as you meet the test cases

SAZ: you could also have one test that matches a number of ACT rules

JC: one rule with several criteria need example
... different example: kb accessible criteria that follow that rule are kfi, kfo correct etc

SAZ: the intention of the rule was to be granular, but how granular? A rule would be a subset of an SC

WF: is there value in having one-to-one mapping, e.g. every ACT rule has a corresponding rule in your tool

SAZ: can't see that is useful

MJM: from an outside perspective that would be helpful

Anne: if you don't know anything about accessibility then people will choose the tool meeting the most rules

MJM: better that what it is actually checking for is consistent

WF: what are the odds of interpretation differences if there isn't one-to-one mapping

SES: as long as you can document your test cases and fulfil them, then you can claim to follow it

WF: I do think there is value in one-to-one mapping, I think it will happen over time
... in the long term this will be useful - if we can point from our tool to the w3c documentation to say "this is how this works"

SAZ: assuming the ACT rules are the most granular, you can still point to the W3C documentation - the ACT rules - to say this is what I support

Anne: we used siteimprove and we used wave and it found different things - why?

WF: it will be easier to communicate if there is one-to-one mapping, but it would be internal

SAZ: imagine we have documented test rules and test cases and tool a reports differently from tool b - irrespective of internal design, it doesn't matter if you ran one rule or two rules, you have the freedom to run as many as you want, that's how you differentiate from the others, but if you both support the same test cases, regardless of implementation you should pick up the same issues consistently

GW: can we flesh out an example

SAZ: is checking an ordered list enough?

General discussion about a11y vs best practices

lunch!

<Jenn_Chadwick> scribe: Jenn Chadwick

<Jenn_Chadwick> WF: Overview of aXe - ACT on aXe

<Jenn_Chadwick> WF: Avoiding false positives is critical to us. We avoid heuristics-related errors. Semantic markup - therefore false positives are limited

<Jenn_Chadwick> WF: [WF presenting slides. Please see]

<Jenn_Chadwick> WF: Things that are hidden with display:none, or advanced logic that you can't cover with CSS selectors. Rules have three types of "checks" - any, all or none.

<Jenn_Chadwick> .. A check is a function that returns true or false on the element the selector found. For 'any', if any check returns true, the rule has passed. For 'all', all need to return true for the rule to pass. For 'none', it's the opposite.

<Jenn_Chadwick> ...drilling down to checks - they have an impact - 'evaluate' function that runs steps on it and for some of them, 'after' function that does some post-processing

<Jenn_Chadwick> We use tags to indicate our WCAG levels and to indicate the success criteria that's applicable, as well as a best practices tag. Some experimental, not sure if something is 100% false positive, but working on this

<Jenn_Chadwick> ... ends up with something similar to other tools - violations, failures and a 'need to review' that are not checked automatically. I.e. Colour Contrast rules.

<Jenn_Chadwick> ...Needs review is complete or incomplete. We break up the tests into different types: Core (js), Commons (js), Checks (js), Integrations - Rule (HTML), full (HTML)

<Jenn_Chadwick> ...All of these things map pretty well to ACT. The requirements come from tags, test mode --> "(Semi-)automatic". The selector is always the selector + matcher. Steps turn into checks. For example with 'any', we go through all the checks and the rule passes if any check returns 'true'
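
[A simplified sketch of an aXe-style rule descriptor matching what Wilco describes (selector, WCAG tags, and any/all/none checks); illustrative only, not copied from the aXe source.]

```typescript
// Sketch of a rule with a selector for applicability, tags for the WCAG level
// and success criterion, and three groups of checks.
const rule = {
  id: "image-alt",                     // hypothetical rule id
  selector: "img",                     // which elements the rule applies to
  tags: ["wcag2a", "wcag111"],         // level and success criterion tags
  any: ["has-alt", "has-aria-label"],  // rule passes if ANY of these is true
  all: [],                             // ALL of these must be true
  none: ["alt-is-whitespace"],         // NONE of these may be true
};
```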

<Jenn_Chadwick> ...aXe rules format: When people want to propose new rules, we ask them to submit a proposal. they create a ticket in GitHub. We discuss before we start implementing to work out potential false positives. when it's good enough, we code it in.

<Jenn_Chadwick> ... Currently all of our rules are documented in a way that allows us to turn them into ACT rules

<Jenn_Chadwick> ... So, in short - our new rules can easily be taken as ACT rules; for existing aXe rules, we need more documentation and External ACT rules can be built into aXe. Not sure how to do this just yet, haven't made that comparison, but still open to do.

<Jenn_Chadwick> ... Questions?

<Jenn_Chadwick> RD: in aXe, do you intend to do a 1:1 mapping at some point?

<Jenn_Chadwick> WF: We have checks that range from minor, serious to critical... that's not a thing that exists in ACT and I don't know if it should

<Jenn_Chadwick> ... it's been useful for us and clients expect it. It's worth looking at, but I suspect it's going to get a lot of discussions that we don't have to have. Concern that it would be a time suck with not a lot of benefits

<Jenn_Chadwick> SES: Curious to hear about checks where you need a full page. How would you describe that rule in the ACT format since you would have to tackle the whole page and not a sequence of steps, if you know what I mean?

<Jenn_Chadwick> WF: If we select the root element, i.e HTML, and if that has one title inside of it. Just that one element

<Jenn_Chadwick> WF: It's only applicable to the root element of a page. Everything else is inapplicable.

<Jenn_Chadwick> SES: Yes. You run the check only on the HTML element...

<Jenn_Chadwick> WF: Yes.

<Jenn_Chadwick> WF: The way we do it: we have a node that has the violation and the concept of related nodes - so the 'node' is the HTML element and the related node = the title element

<Jenn_Chadwick> WF:... we've intentionally kept things small - not part of what we define, but something that we want to leave open and not restrict the rules.

<Jenn_Chadwick> Anne: If we had to compare results across tools - we have a 'related nodes' field and SI has something else that is the same but named differently

<Jenn_Chadwick> WF: I think that's less relevant.

<Jenn_Chadwick> KW: It's a good example - take 'title' element for example and run a test to see how it actually pans out - a comparison of how it appears in each tool.

<Jenn_Chadwick> SES: Yes, that would be ideal and relevant.

<Jenn_Chadwick> WF: Any other questions, aXe related?

<Jenn_Chadwick> KW: If you've got a series of rules - how does that fit into the ACT format? I was struggling with that

<Jenn_Chadwick> WF: Two ways. In WCAG - we have rules with steps that either do 'fail' and continue to next step.. or 'pass' and continue. We have a group of rules where every check would be a rule and require at least one of those rules within the group achieved to pass the rule group.

<Jenn_Chadwick> KW: Perhaps we might have separate rules around this

<Jenn_Chadwick> [James enters the room]

<Jenn_Chadwick> JN: hello

<Jenn_Chadwick> [Siteimprove - Anne and Stein Erik presenting slides]

<maryjom> Agenda: https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/TPAC_2017

<Jenn_Chadwick> [please see slides]

<Jenn_Chadwick> Anne: When we say "check" we mean "rule"

<Jenn_Chadwick> Anne: Almost a 1:1 mapping of what we currently do. We see some limitations in the ACT rules format as it is. As we see it, it is not flexible enough without having to hack our way through it

<Jenn_Chadwick> ...overall, some things are too specific. The input types are not flexible enough. There are too many differences in input and these will cause differences in output.

<Jenn_Chadwick> We see a need for a shared glossary for all tools

<Jenn_Chadwick> ...To actually start using ACT rules, which are similar, we could just use them right away

<Jenn_Chadwick> ... Anne: For types not flexible enough - we have already created an issue for this. #109

<Jenn_Chadwick> ...Differences in input will cause difference in output: SI runs a check twice 1) in our pipeline, crawling the customer site every five days and saves the number of instances of an issue per page. 2) Then we run the check again when the user views and interacts with the Page Report. Difference can be when we used Chrome and they use FF for instance

<Jenn_Chadwick> ... one of the hopes was to see a solution - a standardized solution to issues we see internally. We've seen it before that issues we've struggled with - Deque doesn't see as a problem. [See screenshot in slides - Why all this trouble?]

<Jenn_Chadwick> ... [screen shot of customer's page and the report on it] - much more visual way of displaying issues.

<Jenn_Chadwick> KW: Why do you accept a tradeoff with issues?

<Jenn_Chadwick> Anne: Because our interface is so visual, it's better than sending a code snippet. This is why we can accept the tradeoff of the differences between the check outputs.

<Jenn_Chadwick> Anne: if there's a diff in the user agent (mobile, desktop, browser or OS) this will show difference in the second test. Screen resolution, states, interactions, dynamic content etc all produce differences. this is currently not addressed in the ACT rules

<Jenn_Chadwick> SES: Therefore we can expect it will affect other tool vendors

<Jenn_Chadwick> Anne: Our proposed solution: Add input context to the output format. Identify which step of the rule failed. Who performed the test - which tool version, name of tool, manual evaluator, etc. Knowing as much about the test as possible can assist this.
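
[An illustrative sketch of the proposed addition: carry the test environment and the failed step in the output. Field names are assumptions, shown as a TypeScript object.]

```typescript
// Illustrative result record extended with input context, so differences in
// user agent, viewport, or evaluator can explain differences in output.
const resultWithContext = {
  rule: "example:landmark-unique",   // hypothetical rule id
  outcome: "failed",
  failedStep: 2,                     // which step of the rule failed
  context: {
    tool: { name: "ExampleChecker", version: "2.3.0" },
    evaluator: "automated",          // or the name of a manual evaluator
    userAgent: "Chrome 62 on Windows 10",
    viewport: { width: 1366, height: 768 },
  },
};
```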

<Jenn_Chadwick> Anne: Question: do we need to report this as part of the results?

<Jenn_Chadwick> WF: I would say yes, very useful to have in your output data.

<Jenn_Chadwick> SES: This is a good discussion, back to the test description.

<Jenn_Chadwick> WF: We're not trying to write a data format. EARL covers a lot of this and I'd like to explore that after ACT.

<Jenn_Chadwick> WF: I think this is valuable data. Reproducibility is a challenge that's not going away and we need to account for and identify these things. There will always be these differences. What I'm curious about - I guess you're not testing in multiple browsers?

<Jenn_Chadwick> SES: No - it's running in a set environment for the first test.

<Jenn_Chadwick> WF: That's something aXe does differently. We run the test in all major browsers - including IE

<Jenn_Chadwick> SAZ: What extent is this input format and what is output format? What would be interesting is saying - "You should get this result in this context." This is why we said, ok - in order to trigger a test, we need to know the environment and context. We need to say - this test here, we need to run it using these parameters and we'll get this result.

<Jenn_Chadwick> SAZ: It adds complexity.

<Jenn_Chadwick> Anne: It remains as complex if we leave it out.

<Jenn_Chadwick> SAZ: Exactly.

<Jenn_Chadwick> WF: I think all of these are really useful. I'm always very hesitant to do too much at once - hit one problem at a time.

<Jenn_Chadwick> SES: We need to ensure we don't lock ourselves in and hinder future development. That's a good thing about this discussion because we're now detecting conceptual problems that we can take into account, and start small.

<Jenn_Chadwick> Anne: this is descoping, we just expect we'll still see differences.

<Jenn_Chadwick> MJ: Agreed

<Jenn_Chadwick> Anne: Starting out with a list doesn't always work for SI

<Jenn_Chadwick> SES: We use smarter algorithms and use other things to be efficient that we would need to tweak to describe these tests in the format. Example - test for text outside landmarks. we also hit challenges with content that doesn't necessarily reside within the DOM but we need to work with anyway.

<Jenn_Chadwick> [see slides for details]

<Jenn_Chadwick> Anne: Rules we cannot write as ACT rules if we follow the specs. ...[see slides]

<Jenn_Chadwick> RD: Why does this mean that you cannot write a rule? You could write a rule that addresses two titles on a page

<Jenn_Chadwick> Anne: We don't have the developers here, but it's about the fact that we don't have the same selector as the one we want to report to the user in the end.

<Jenn_Chadwick> SES: You don't want to report on the page elements

<Jenn_Chadwick> SES: Yes, SI would invent one way to describe these things and a

<Jenn_Chadwick> ... aXe/Deque would describe it another way...? Do we increase the level of abstraction? we want to start this discussion.

<Jenn_Chadwick> Anne: We don't know what "graph based checks" are - developers would know. :) Something that's not possible to write as a selector. But I cannot explain it.

<Jenn_Chadwick> WF: What it makes me wonder... It would take time to describe a content element - then the check would be, is it a descendant of a landmark?

<Jenn_Chadwick> SES: We see it that way. We could rewrite the check to the ACT format?

<Jenn_Chadwick> SAZ: that's why EARL has several types of pointers, i.e. a video - please review from timestamp to timestamp to test for captions

<Jenn_Chadwick> RD: The selector can be the base context for the test. The element you want to report on may not be the parent element

<Jenn_Chadwick> WF: Yes

<Jenn_Chadwick> RD: Question about the issue of things that operate outside the DOM?

<Jenn_Chadwick> Anne: We check for HTTP headers.

<Jenn_Chadwick> WF: You could probably go beyond that.

<Jenn_Chadwick> Anne: Also parsing - where you can run a check on a string of text.

<Jenn_Chadwick> Anne: i.e. checking parsing - we want to work on HTML as a string of text rather than a DOM.

<Jenn_Chadwick> Anne: ... if the HTML tags are not closed properly, but you don't want to run it through a browser that would fix everything.

<Jenn_Chadwick> WF: I think we've covered it with the different input types - you're testing the plain HTML or the rendered code

<Jenn_Chadwick> ... checking that closing tags are present. aXe covers that with input types.

<Jenn_Chadwick> WF: Selectors don't have to be CSS. That's a discussion that came up - we don't want them to be synonymous with CSS.

<Jenn_Chadwick> Anne: "Inapplicable" outcome - this is used in Deque / aXe. Need for inapplicable in some cases when doing a depth-first search. Something that SI sees a need for it in its test

<Jenn_Chadwick> WF: A rule is inapplicable to everything it didn't select.

<Jenn_Chadwick> KW: Ok. Understood.

<Jenn_Chadwick> SAZ: At least in WCAG 2.0 and you're looking for malformed lists... when they're not in there, just by virtue of not having it in there... "inapplicable" should be used with caution because it's hard to achieve and can be misused.

<Jenn_Chadwick> SES: You will need to have some way of saying that this sub-check is not relevant.

<Jenn_Chadwick> WF: That's what your selector should do. It throws out the elements it doesn't want to test.

<Jenn_Chadwick> SES: I think that's the problem - we are using other algorithms to go through them sequentially.

<Jenn_Chadwick> SAZ: Trying to think of an example.

<Jenn_Chadwick> SES: I'll come back with an example.

<Jenn_Chadwick> Anne: There's a need for a shared glossary - i.e. something "perceivable". what should be considered for a check? ie. an image that is 1x1 pixel will be perceived by a screen reader if it's only a spacer.

<Jenn_Chadwick> Anne: Therefore, 'visually perceivable'

<Jenn_Chadwick> WF: So your glossary... what we did in auto-wcag, we have certain defined terms called algorithms - a pretty low-level description of what that means. I.e. a 'content element' which means any element that should be considered content - text, images, etc... not intentionally hidden or decorative. Yes - that might make sense to do, but not sure if you need this to be predefined

<Jenn_Chadwick> Anne: If we find a bug, we need to know how many sites we need to fix it on. Having a consistent definition is important.

<Jenn_Chadwick> WF: I'm wondering about scope.

<Jenn_Chadwick> [James exits]

<Jenn_Chadwick> WF: I think it makes sense to have a glossary. [SAZ agrees.] [Anne & SES: 'rules repository']

<Jenn_Chadwick> SAZ: makes sense. somewhere to put these where it can be updated and maintained - rather than hardcoded into the tool

<Jenn_Chadwick> WF: I agree - there are things that are hard to get right; we need a program-level description, i.e. accessible name calculation. Some things are missing.

<Jenn_Chadwick> SES: Maybe later on we can provide examples

<Jenn_Chadwick> MJM: My input is superficial as I'm not familiar with the IBM tool... while she did give a pointer to the code, I need that help. I can see how a lot of things can fit into the format... the way we have things structured and the ACT rules are microparts of our rules. We have rules i.e. for ARIA-labelling. There are a bunch of things to test for this, and those ACT rules would be the smaller parts that make up ARIA-labelling. We'd have to [CUT]

<Jenn_Chadwick> ... not sure we'd be changing our structure. have similar kinds of outputs - potential violations, manual check type things - concept is similar. we also have recommendations in the tool

<Jenn_Chadwick> SAZ: What's the best way to capture these issues in the tool? Should we allow people to add to this?

<Jenn_Chadwick> WF: We could create a few issues based on questions we got.

<Jenn_Chadwick> SAZ: Is SI's presentation publicly shareable?

<Jenn_Chadwick> Anne: Yes, the only screenshot is of the Chrome plugin but we can always change a few things and make it public.

<Jenn_Chadwick> MJM: We tell the user how many violations there are and have links that point back to the code to fix them. Similar to other tools.

<Jenn_Chadwick> ... I looked at some of the rules - as long as they're documented, they fit in the context of our text-based rules. We would need to expand to ensure we have all the fields and it be complete.

<Jenn_Chadwick> ... some manual work involved in that.

<Jenn_Chadwick> SAZ: The reason - don't want it to end up as an action item.

<Jenn_Chadwick> MJM: Task for the developer to find out how well ACT can be mapped over what we have and how much work that would be.

<Jenn_Chadwick> KW: As I've been looking at this and talking to people - the benefit of the rules format is to help everyone understand what the rules are and how to do conformance testing. How we implement the rules is irrelevant. We've heard how the tool vendors do it and it will evolve over time

<Jenn_Chadwick> .... looking at AI - it doesn't fit into this model, for instance. Looking at the overall testing rules format and the framework - we need to focus on how to describe this to the real world. and not so much the details

<Jenn_Chadwick> SAZ: Would you suggest removing the selectors and procedures section?

<Jenn_Chadwick> KW: We need to focus on a common understanding of the rules. Take a failure, technique or criterion - and setting out a consistent rule. Worried that we're looking at things as we do them today and getting away from the real benefit

<Jenn_Chadwick> KW: The rules must be consistent, then the tool vendors and how they are presented are irrelevant

<Jenn_Chadwick> SAZ: How to package the output - report

<Jenn_Chadwick> SAZ: You might use this fail to stick an icon on the page. That's how you are interfacing with the user. But what we're trying to achieve - what to include

<Jenn_Chadwick> WF: I'm surprised we're still discussing whether or not to document rules

<Jenn_Chadwick> SAZ: My optimistic brain is saying - we either define whether we have HTML selectors and other selectors. Or do we scrap that - is the purpose of this rule whether or not to check whether there's an alt tag? Here are the set of rules to determine whether or not you did this correctly

<Jenn_Chadwick> Anne: I would say - what about manual testing - we need those steps in there

<Jenn_Chadwick> WF: I think transparency is a big part of the problem today. It's not clear why people are getting the results we're getting

<Jenn_Chadwick> ... I think AI - there are concerns, we may not be able to write selectors etc if generated by AI. It's not the norm right now. I don't see AI replacing the work that we're doing.

<Jenn_Chadwick> KW: I disagree. We can determine the probability of whether that's correct or not. A lot more can be done with AI. Our rules don't fit with ACT and Level Access is the same. My whole way of thinking is in AI and different. If we're being prescriptive, we're limiting what we're doing now and what we can do in future. I'd rather have a format to describe and give clarity on what the tests ARE... rather than how things are tested.

<Jenn_Chadwick> RD: The AI can do a different check to achieve the same result for ACT. the path can be different.

<Jenn_Chadwick> KW: I disagree.

<Jenn_Chadwick> RD: The selector is used to give context to the rule.

<Jenn_Chadwick> Anne: Having a suggested testing procedure - knowing that you could deviate from that suggestion.

<Jenn_Chadwick> SES: Why do we need the steps as part of the ACT rules when we may not follow them?

<Jenn_Chadwick> SAZ: Anne raises a good point about manual testing procedures, i.e. RGAA.

<Jenn_Chadwick> KW: Users are looking for a particular procedure and it doesn't work

<Jenn_Chadwick> RD: Algorithms define how a user agent should render a document.

<Skotkjer_> scribe: skotkjerra

<Skotkjer_> CW: We need to take a step back and write up some real world examples focusing more on making sure this format will actually be able to

<Skotkjer_> 1. accommodate automated rules including AI or other methods

<Skotkjer_> 2. Manual or semi-automated rules to fit in

<Skotkjer_> WF: We want to see where are the points where we are not able to transfer our code into ACT rules..

<Skotkjer_> Where are the points where we can't do it

<Skotkjer_> SAZ: Do you check for aria-describedby?

<Skotkjer_> that don't map in there

<Skotkjer_> CW: the reason there is value is that we have the same interpretation of what is required to conform to WCAG 2.0

<Skotkjer_> CW: it is listing out the requirements instead of how to test.

<Skotkjer_> Discussion on whether tools need to follow the exact logic of the rule steps, or if the test criteria can be met by documenting the same outcome.

<Skotkjer_> CW: I feel we are trying to fit everybody's way of doing rules into the format, rather than describing what they should be and what the outcomes are - rather than how to do it.

<Skotkjer_> WF: So what is the ask?

<Skotkjer_> CW: Step back and focus on how to translate a requirement into automated, semi-automated or manual testing, and how that would work in that scenario

<Skotkjer_> SAZ: To what degree are we not communicating the role of the test cases?

<Skotkjer_> WF: The only way is with examples or more details

<Skotkjer_> WF: Happy to work with you on sorting this out

<Skotkjer_> JS to discuss Silver

<Skotkjer_> JS: we are in the data collection phase, not in the solution phase

<Skotkjer_> I am mostly here to listen, to hear the ideas from this group.

<Skotkjer_> one of the goals of Silver is to be data driven, and one of the ways to do that is by testing

<Skotkjer_> We are trying to figure out the structure - levels, layers...how will conformance/structure be.

<Skotkjer_> Input from testing is important.

<Skotkjer_> Like to hear ideas on how testing is fitting in with Silver

<Skotkjer_> WF: one of the best things with WCAG 2 was it was specifically written for testability

<Skotkjer_> The fact that this taskforce exists proves they didn't get everything right.

<Skotkjer_> The idea that standard needs to be technology agnostic is probably a good one, but left a lot of gaps and a lot of room for harmonization on how to test specific requirements

<Skotkjer_> that would be one of the key questions for me that we should try to solve for Silver

<Skotkjer_> JS: do you think the problem is across platforms or within one platform?

<Skotkjer_> MJM: Both.

<Skotkjer_> there is always platform differences, depending on browser combinations, screen reader versions etc

<Skotkjer_> what is an issue in your code vs. a user agent issue.

<Skotkjer_> sorting this is not always simple

<Skotkjer_> WF: I would agree. One of the things that has hurt testability is the abstraction level in WCAG, arguments over heading levels.

<Skotkjer_> WCAG doesn't give you enough today to make it undeniable to anybody that skipping heading levels should be a violation

<Skotkjer_> part of this comes from abstraction, but part of it also from the working group wanting to only show the best examples

<Skotkjer_> JS: Do you do much with writing tests for the edge cases?

<Skotkjer_> JC: I've seen QA testers adding accessibility to test cases

<Skotkjer_> CW: there is a big difference between the testing methods when you are talking about edge cases

<anne_thyme> scribe: anne_thyme

WF: I think there are grey areas when it comes to assessing things like "is it descriptive enough?"
... and "is it easy enough?". That is not going to go away
... We want to create a repository of test cases with examples of things that pass and don't pass success criteria. We want it to grow over time and should help us get a shared understanding of what would be a WCAG failure or not

JS: That would be very useful
... We are depending on it. To have good accessibility testing, that is reproducible, is really important to help drive the acceptance of Silver, and drive what the content of Silver could be
... We really want Silver to be data driven, and if testers say that this is a condition that doesn't work for people with disabilities, we want to be able to capture this and tell it to the people who write code
... We use examples that are not web related, e.g. a home assistant. If their manufacturer came to W3C and said that they would like help to find out how to make that accessible. We have some general knowledge on accessibility that we can use for figuring this out
... We could ask people in a working group or we could look at the test results already performed and build requirements based on that. A bottom up approach to building requirements instead of a top-down
... this is not necessarily how we are going to do it, but one of the ideas
... I think that test driven development is the way we should go
... In many ways the test makes the standard. It's the perception of what the standard is
... This was just a very long way of saying that Silver think that testing is important
... We are getting results back from the experts looking into the pain points and needs around the current guidelines
... And one of the things emerging is the need for user testing
... This is one of the ways we can start to develop a process where we can start to bring on new technologies

CW: We have looked across the needs across different devices, and I think it was a good way to start looking at the general needs

JS: We haven't decided for a scope for Silver, we are just starting to think of ideas for structure

<Skotkjer_> scribe: skotkjerra

<scribe> scribe: anne_thyme

SES: Have you thought about making requirements for the process instead of for the end product?
... There is a lot of research on how user involvement is the best way to reach inclusive design in the end product
... Instead of testing the end product, you would document the process

<Skotkjer_> WF: to me a weakness of process testing is that you can't do it from the outside.

<Skotkjer_> Managing your processes is very powerful, but what it is about is the end result. Can people use it?

<shadi> BS8878 - ISO 30071

<Skotkjer_> So maybe a way forward might be a step in the middle... you are looking at a certain type of product, and for that product type you can come up with tests based on what your usability tests do

<Skotkjer_> JS: that is what we have been kicking around

<Skotkjer_> JS: one of the things I should also say: I do not anticipate that the work of this group is going away.

<Skotkjer_> JS: We are not going to throw away WCAG. There are ways where we would change the presentation, but we are not going to do that over.

<Skotkjer_> SAZ: one doesn't rule out the other...

<Skotkjer_> There are several success criteria in WCAG 2 that are written from the author perspective - testable from that perspective. In 2.1 this is looked at more closely.

<Skotkjer_> somebody from the outside should be able to evaluate

<Skotkjer_> SAZ: you need skills, structures and processes, and usable end products

<Skotkjer_> JC: it is more of a methodology... the roles and responsibilities have gone a long way toward supporting designers and UXers

<Skotkjer_> MJM: This is how I have been approaching teaching designers: splitting it up by responsibilities - researchers, interaction designers... what are your bits...

<Skotkjer_> WF: I think it is important to realize that everybody is struggling with WCAG compliance. One of the reasons is that we don't have bug-free software.

<Skotkjer_> MJM: when you have conformance statements, you can only make one if you are 100% conformant - that is unrealistic

<Skotkjer_> SAZ: It depends. Like the Dutch model - if you don't comply, what are you going to do about it?

<Skotkjer_> MJM: you cannot claim full conformance because you cannot test it all

<Skotkjer_> CW: One of the things that would be helpful in Silver is to make it much clearer what the impact is on the person with disabilities

<Skotkjer_> ...and what the priority should be for different things.

<Skotkjer_> Clients today are trying to figure out how to implement accessibility from a business perspective

<Skotkjer_> What they can and can't do - what is feasible while minimizing risk

<Skotkjer_> While we aim at conformance, what companies are struggling with is understanding the priorities

<Skotkjer_> It would be beneficial to look at it from that perspective:

<Skotkjer_> What would the priorities be?

<Skotkjer_> JS: one survey respondent said it would be helpful to prioritize by what is foundational... what is hard to do over

<Skotkjer_> MJM: this goes back to design - which ones will make you have to go back to design?

<Skotkjer_> JC: sometimes legal copy behind a button may lead to a customer complaint

<Skotkjer_> MJM: you can get keyboard accessibility, but with an administrator's installation... is your general user even going to see this problem? Possibly not in a given context

<Skotkjer_> SES: Context is important for assessing pass or fail in a given situation

<Skotkjer_> JS: I'd like to invite those of you who like to think about these things deeply to join a design meeting next spring, preferably before CSUN

<Skotkjer_> where we invite people to think about these issues, talk about them, and figure them out

<Skotkjer_> what we do about conformance and structure are the most important decisions we are making right now

<Skotkjer_> we would like to be revolutionary in structure and evolutionary in content

<shadi> scribe: shadi

DAISY Demo

Romain: building a tool called Ace
... a command-line tool
... generates reports in HTML and JSON formats
... extracts pieces of content for further manual checking
... we do not currently have information on the environment where the test was run, like the user agent, but could add that
... uses EARL Pointers, but also some extended types of pointers besides those proposed by EARL
... our rules are not yet in the ACT Format, but we are working on this
... we have another tool - the EPUB Accessibility Conformance and Reporting Tool
... provides instructions for manual checks
... helps generate a report based on input by evaluator
... would like to combine both tools into a cascade

SAZ: problems with EARL?

RDT: not really, just understanding EARL and JSON-LD
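
For context, a minimal sketch of what an EARL assertion serialized as JSON-LD could look like, shown as a TypeScript object literal; the prefixes and property names follow the EARL and Pointer Methods vocabularies as best understood here, and the tool, rule, and selector values are hypothetical:

  // A minimal EARL assertion as a JSON-LD object. The namespaces below are
  // assumed to be the EARL and Pointer Methods in RDF vocabularies; the IDs
  // and the CSS selector are made-up examples.
  const assertion = {
    "@context": {
      "earl": "http://www.w3.org/ns/earl#",
      "ptr": "http://www.w3.org/2009/pointers#"
    },
    "@type": "earl:Assertion",
    "earl:assertedBy": { "@id": "https://example.org/tools/some-checker" },
    "earl:subject": { "@id": "https://example.org/book/chapter-1.xhtml" },
    "earl:test": { "@id": "https://example.org/rules/image-has-alt" },
    "earl:result": {
      "@type": "earl:TestResult",
      "earl:outcome": { "@id": "earl:failed" },
      "earl:pointer": {
        "@type": "ptr:CSSSelectorPointer",
        "ptr:expression": "main > img:nth-child(2)"
      }
    }
  };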

WF: how did you proceed in writing your rules?

RDT: we had a Task Force in IDPF, which focused on EPUB Techniques
... picked from that spec to write rules that are not already part of WCAG

Rules Structure

SAZ: good discussion today, need to continue that
... also concerns from Alistair and Kathy
... need to know them more specifically
... good input from Siteimprove, need to log these as issues

ATN: could try to rewrite a more complex rule too

KW: "rewriting rule" is exactly the issue
... should not need to rewrite anything
... want to come to equal outcomes
... but not to get everyone to write the same

RDT: ACT Rules are a unit for communicating the capabilities of a tool
... each rule is accompanied by test cases (aka a test suite)
... these can be used to check whether tools perform as expected
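
As an illustration of that last point, a minimal sketch of running a rule's test cases against a tool and flagging mismatches; the data shapes and the runTool function are hypothetical stand-ins for whatever API a given checker exposes:

  // Hypothetical shapes for a rule's test case and a tool's verdict.
  type Outcome = "passed" | "failed" | "inapplicable";

  interface RuleTestCase {
    id: string;
    html: string;
    expected: Outcome;
  }

  // Stand-in for a real checker's API; assumed, not an existing library call.
  declare function runTool(html: string): Promise<Outcome>;

  // Run every test case through the tool and report any mismatches.
  async function checkImplementation(cases: RuleTestCase[]): Promise<boolean> {
    let consistent = true;
    for (const testCase of cases) {
      const actual = await runTool(testCase.html);
      if (actual !== testCase.expected) {
        console.log(`${testCase.id}: expected ${testCase.expected}, got ${actual}`);
        consistent = false;
      }
    }
    return consistent;
  }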

KW: for example, want to check alt-text, and that tools perform equally

SAZ: "want to check alt-text" is what we call "rule"

KW: don't want to rewrite code to match that

SAZ: where do you see a requirement to rewrite code?

KW: Siteimprove?

RDT: they want to follow that format, but they are not obliged to

WF: test procedures important for manual testers

KW: but that did not work well for Trusted Tester

SAZ: I see it as an elaboration of the WCAG Failures

KW: feel that the current format is too procedural and HTML-centric

SAZ: can you suggest a specific test to look through together?

KW: will work with Alistair to provide specific example

ATN: will also work through a specific example

SES: will share with group

WAI-Tools Project

https://www.w3.org/WAI/Tools/

<rdeltour> scribenick: rdeltour

SAZ: the WAI-Tools project is designed in a way to support the work of the ACT-TF
... implementing these in open source engines
... in order to demonstrate they are implementable
... the test rules will run through the validation process set up at the ACT-TF
... the project itself does not influence the design of the rules format or the test cases, or the validation process
... it just helps develop more rules, more output from this group
... I think we can also use resources to develop rules for WCAG 2.1
... covering new ground, not what already exists
... another part of the project focuses on demonstrating these rules in different settings
... like the national observatories in Portugal and Norway
... to show these test tools can be implemented in different contexts, to show the broader world
... project partners are listed on the home page, incl. Siteimprove, Deque, Accessibility Foundation
... other partners likely won't join this group
... they are involved in the test rule development, but not directly within the TF
... open meeting on Nov 29 in Brussels
... (a bit far from Austin)

SES: will there be a recording?

SAZ: I don't know yet
... we may have remote participation, but not sure about recording

M: can we have minutes?

SAZ: yes, we want to have minutes
... there are already a couple of registrants from interesting organizations
... we want to get people involved
... some tool developers too
... if you have any questions, please contact me!

SES: maybe we can also recruit for this group

SAZ: yes, also to develop rules

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.152 (CVS log)
$Date: 2017/11/10 01:27:48 $