W3C

– DRAFT –
Silver Task Force & Community Group

19 January 2021

Attendees

Present
Andy, Sheri-B-H, ToddLibby
Regrets
Makoto
Chair
jeanne, Shawn
Scribe
JustineP

Meeting minutes

<ChrisLoiselle> Sorry group, can't scribe today...restarted my computer 4 times already today :(

Update on publishing

Jeanne: Upcoming publication in the near future.

Janina: Have objections been handled?

<CharlesHall> regrets, I have to drop at 10am ET

Update on the approved Decision policy

Rachael: The CfC for Silver decision policy has passed.

<jeanne> https://lists.w3.org/Archives/Public/public-silver/2021Jan/0069.html

Rachael: one typo will be corrected.

Proposed change to next version of Requirements (take 3)

<jeanne> Minutes from where we left off

<jeanne> All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This includes particular

<jeanne> attention to people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, and cognitive and learning disabilities.

<Jemma> +1

Jeanne: If you agree, please enter +1

Janina: Suggest last "and" should be "or"

<KimD> +1

<Jemma> I agree with "or" instead of "and"

<Rachael> I agree with or

<Zakim> JF, you wanted to ask how user-research is measured

John F: Agree that user research would be incorporated, but not sure how it would be measured. We certainly use it, but how to measure?

<Jemma> may be "usability testing"?

Jeanne: When drafted, people who recommended its inclusion had recommendations on how to measure. Haven't written guidelines yet and as a result haven't delved into the details.

<CharlesHall> +1 to JF

John F: Suggest replacing "measure" with "evaluation"

<Lauriat> +1 to Jemma, should read "usability" rather than "user"

<JakeAbma> +1 to evaluation

<KimD> I'm fine with "measurement" because it's "such as"

Janina: Doesn't bother me, as it's a requirements document with examples. Not suggesting that all will be used.

<Zakim> sajkaj, you wanted to respond to jf

<Zakim> Rachael, you wanted to talk to user research question

<CharlesHall> user research and usability testing are both appropriate. user testing is not.

<Lauriat> +1 to Charles, that exactly.

Rachael: History on "user research"...usability testing does not include heuristic evaluation. May have landed on "user testing" as terminology but we should agree on terminology and use consistently.

Jeanne: We don't test users...

<Fazio> I feel like user research is an umbrella term

<Fazio> it includes multiple testing methods

Jake: I was wondering if current approach is set in stone.

<Rachael> Charles, I agree with you but we may want to put a discussion on the to do list. The current glossary (https://w3c.github.io/silver/guidelines/#glossary) includes "User testing Evaluation of content by observation of how users with specific functional needs are able to complete a process and how the content meets the relevant outcomes."

<JF> @ KimD - "such as" is not included in the sentence in question - implied, but not written

Jake: or do we have opportunity to propose Silver as a set of documents with technical documents, user needs, etc.? If so, sentence might be applicable to those documents.

Jeanne: This is an FPWD (First Public Working Draft), nothing is set in stone. We are in early stages.

Janina: I'd suggest that the subject of the sentence is guidelines.

Jake: First three words are for guidance.

Jeanne: Should we not include?

Jake: Needs to be part of it, but scope might be expanded.

Wilco: Wonder if we can remove "procedures". In ACT, procedures are problematic to include because there are many ways to test. What matters are the results.

<JF> +1 to Wilco

Wilco: Otherwise, we may need a conversation about why procedures are bad.

<sajkaj> No objection to Wilco here

<Fazio> Maturity!

Jeanne: I believe "procedures" is there to leave us an opening for more user research-oriented and maturity-oriented guidance. May not be in bronze level regulatory section, but could be in Silver or Gold levels.
… In that case, we could include "procedures".

Wilco: Would still recommend against it.

John F: How do we measure user research?

Janina: Why do we need to specify that in a requirements document?

<Fazio> User reseach includes usability testing

John F: "User research" is an important term in that sentence.

<Fazio> multiple ways of usability testing

Janina: If we specify a way to conform based on working with user testing, we'd need to specify many things. It's just an example of ways to evaluate.
… It's exemplary, no more.

John F: Don't know how you measure some of these things.

Janina: It's not what we are doing now.

David: The term includes brain wave mapping, eye tracking studies...abstract concepts whereas "user research" is more subjective. It's an umbrella term that covers many types of testing.
… Covers cognitive disabilities and more difficult concepts to handle.

Rachael: Acknowledge the concerns. We have concepts of measuring vs. evaluating. User research includes measurable pieces but also concepts that are more difficult to measure. We can change to "evaluating" or keep "measuring" but replace "user research" with "user testing".

<JF> +1 to Rachael, with a preference for using 'evaluate'

Jeanne: From academic viewpoint, what is the difference between measuring and evaluating? The terms seem synonymous.

Rachael: Evaluating doesn't have a clear scale.

<CharlesHall> evaluating = review. measure = tally the results of the review.

Rachael: Measuring is more specific than evaluating.

Andrew: Measuring can be done with a tool. Evaluating involves judgment/subjective perspective.

<JF> Oxford Dictionary: verb - form an idea of the amount, number, or value of; assess.

Jeanne: Thank you. I'm fine with changing to "evaluating".

<Rachael> s/ with "user testing / with "usability testing"

<jeanne> Proposed change: All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of evaluating (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This includes particular attention

<jeanne> to people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities.

Chuck: Entered change so that we can see suggested change. I'm fine either way.
… John, are you more comfortable with "evaluating"?

John F: Yes.

<Fazio> can't we use both terms?

<Zakim> preference, you wanted to discuss measure

Rachael: Would prefer "measuring" but it's a slight preference only. Evaluating doesn't indicate a comparable method of assessment.

Wilco: I share Rachael's preference. "Measure" is more concrete.

John F: How do you measure user research?

<CharlesHall> i agree with JF. if use measure, we have to indicate how that could be measured.

<Fazio> focus groups, eye tracking studies brain wave mapping

<Rachael> Alternative would be: All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring (for example: rubrics, sliding scale, task-completion, usability testing with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This includes particular attention to people whose needs may better be met

<Rachael> with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities.

Sarah: There are ways to measure. Depends on type of research being conducted.

<jeanne> All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring or evaluating (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This includes particular attention to

<jeanne> people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities.

<CharlesHall> have to drop

<CharlesHall> +1 to measuring or evaluating

<Zakim> Chuck_, you wanted to ask aren't we using and?

<jeanne> All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring and evaluating (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This includes particular attention to

<jeanne> people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities.

Jeanne: Will adjust to "measuring and evaluating"

<JF> +1 to *OR*, -1 to *AND* - which implies both must happen

<ChrisLoiselle> For what it is worth, for usability and ISO definition, usability requires us to measure effectiveness, efficiency and satisfaction. Two measures that will justify design changes...time on task and success rate... whether this adds to measure or evaluate is up for discussion.

<Jemma> +1 for the revised version

Jake: Sarah mentioned that scoring is a part of benchmarking which is a different approach than we've chosen. Are those still a part of scoring system? If so, current approach for scoring will probably change and may be completely different.

Jeanne: I have no objection to including "benchmarks".

Jake: May be a big change if we include in Bronze level scoring. Would be smaller change if included in Silver or Gold.

Jeanne: There are many definitions of benchmark testing. We might be coming from different perspectives.
… I'm fine with including "benchmarking" in examples though.
… any objections to including "benchmarking"?

Chuck: Slightly although not strongly against it. The more examples we include, the more detailed it gets.

Jake: When John F. was asking about ways to measure, Sarah gave benchmarks as an example of ways to measure. If we don't include, we should circle back to John's question.

Chuck: We have multiple considerations at this point. I'm fine with including "benchmark".

<Jemma> one thing is that "benchmark" can be a very broad concept with many different meanings by discipline or domain area - business or computer domain

<Zakim> Chuck_, you wanted to ask that as this is a first draft, it's ok to change, and even expected at some level

John F: No problem with "benchmark" although I do prefer "or" rather than "and".
… actually, I feel quite strongly about it.

<Jemma> if we want to add, can we add it as "benchmark testing"??

JaEun: Curious if we understand benchmark testing. From a business perspective or computer science perspective, it has a different meaning.

Jeanne: Good point.

Janina: We may not have the same understanding but this is an exemplary list.

Jake: We need to have consensus of what we understand the term to mean.

Jeanne: Would you be okay with not including it in the list, because it's a list of examples, but we can work on it in the future?

Jake: Sure...

<Fazio> User Research Example: Eye Tracking study pinpoints fixation points, duration of fixation, scan patterns. multiple sessions with multiple users provides actionable insight

<joconnor> This may be helpful regarding Metrics etc https://www.w3.org/TR/accessibility-metrics-report/

Sarah: Method that I mentioned was system usability scale. There's a question about what "benchmarking" actually means. I was using as an example of ways that we've measured accessibility via user research.

Jeanne: We are still at "measuring and evaluating" or "measuring or evaluating"
… Who is on the "and" vs. "or" side?

John F: Would like "or".
… struggling with term "user research".
… My concern is that we're suggesting we will measure user research.

<Rachael> We could still change "user research" to "usability testing" which is a bit more specific and has clear ways to measure (success or error rate, etc)

Jeanne: We added "evaluating" so that we can include user research.

John: Prefer "or" because we can't measure everything.

<jeanne> ORs +1?

<Sheri-B-H> +1 or

<Fazio> -1

<sajkaj> -1

<Rachael> 0

<Wilco> -1

0

<Jemma> -1

<Chuck_> 0

<JF> +1 for "or"

<ToddLibby> 0

<sarahhorton> -1

<Jemma> I think measuring is the foundation of evaluation and both are connected closely.

<Fazio> I agree with Janina

<ChrisLoiselle> For user research, you can analyze qualitative data from a field visit . You can analyze and interpret existing data. You observe the user doing "X".

Jeanne: We might have a connotation problem in that people are looking at the terms in different contexts.

<Fazio> maybe and/or will be the compromise

<JF> analyze and evaluate are not measure

<Jemma> + 1 to Fazio

<Rachael> Rarely in qualitative methodology do we "measure" per se. We do evaluate. Typically we pick qualitative because the situation lends itself to not measuring.

Chris: Adding context. User research can lead to quantitative and qualitative analyses. Research piece is by user tester and researcher, but it's getting caught up in the word "evaluate".

<JF> Evaluate: *CAN* the user click the button. Measure: *HOW LONG* does it take to click the button

Jeanne: Could everyone who voted +1 live with "and"?

<JF> No

Or everyone who voted -1, can you live with "or"?

<Chuck_> +1 I can live with "or"

<Fazio> not that big of deal so yes

<Sheri-B-H> +1 i can live with and

<Jemma> I can live with the opposite way.

John F: My preference remains with "or" and can't live with "and".

Jeanne: Can everyone live with "or"?

<Fazio> +1

<Chuck_> +1

+1

<JF> +1

<jeanne> Who can live with Or?

<Rachael> +1 to live with or

<ToddLibby> +1

Jemma: To confirm, "or" gives more flexibility?

John F: Yes.

<Jemma> +1

Sarah: I can live with "and/or".

John F: Same.

Jeanne: Let's go with "and/or".

<jeanne> All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring and/or evaluating (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This includes particular attention to

<jeanne> people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities.

<Zakim> Chuck_, you wanted to say there is a "live with" consensus for "or"

Wilco: Don't really like "and/or".

<Chuck_> +1 to live with tests or procedures

Jeanne: Let's try to wrap this up.
… Let's go with "tests or procedures". Straw poll please.

<Wilco> -1 to test or procedures

<jeanne> tests or procedures +1

<JF> 0

<sajkaj> +1

Jemma: Can you remind me of the issue?

<JF> +1 to evaluating outcomes

Wilco: Procedures are highly prescriptive when we only need repeatable outcomes.

<ChrisLoiselle> wouldn't you measure off of an evaluation method? ie. formative vs. summative, moderated vs. unmoderated, lab v. remote test, usability tests vs. expert review?

Jeanne: Opposite side is that the group envisioned "procedures" as something that could go in Silver or Gold conformance (a long time ago). If you follow the procedure, it's more of a maturity model that could lead to Silver or Gold conformance.

<ChrisLoiselle> benchmarking - wouldn't that be comparing one thing directly to another? not sure what the group decided on benchmarking.

Wilco: Still think that it's not a procedure, I'd rather use a different word.

Sarah: I agree with Jeanne re: keeping the door open to opportunities to demonstrate commitment and progress that extends beyond tests.
… wonder if we can say "Silver guidance includes tests..." if "procedures" feels too confining.
… or if we can generalize to give more flexibility.

Jeanne: That's fine with me. Can you draft a phrase that would capture your recommendation?

Sarah: Yes.

Jeanne: Sarah, can you send to the list by Friday?

<sarahhorton> Yep!

<jeanne> Here's where we are leaving off: Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring and/or evaluating (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This

<jeanne> includes particular attention to people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities.

<jeanne> Sarah will propose a phrase to substitute "procedures".

Jeanne: On Friday, we'll discuss subgroups and goals for January.

Wilco: Can't be on Friday's call, but okay if word "procedures" is taken out.

<jeanne> Corrected proposal for Friday "All Silver guidance includes tests or procedures. Some guidance may use true/false verification but other guidance will use other ways of measuring and/or evaluating (for example: rubrics, sliding scale, task-completion, user research with people with disabilities, and more) where appropriate so that more needs of people with disabilities may be included. This

<jeanne> includes particular attention to people whose needs may better be met with a broad testing approach, such as people with low vision, limited vision, or cognitive and learning disabilities. "

<jeanne> noting that we didn't discuss removing the first word "All".

Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).

Diagnostics

Failed: s/ with "user testing / with "usability testing"

Succeeded: s/not/no

Succeeded: s/bench mark testing/benchmark testing

Succeeded: s/a business perspective/a bussiness perspective or computer science perspective

Succeeded: s/it/both

Succeeded: s/it is connected to me/both are connected closely

Maybe present: Andrew, Chris, Chuck, David, JaEun, Jake, Janina, Jeanne, Jemma, John, Rachael, Sarah, Wilco