Research and Development Working Group Teleconference

28 Jul 2011

See also: IRC log


Simon, Yeliz, Shadi, Karl, Markel, Arun



SH is introducing himself, he works on web accessibility

Arun is introducing himself, they do technology for disabled people

Markel is introducing himself, he is from Manchester, he is interested in understanding behaviour of blind users

KarlGroves is from New York, he does training and evaluation

<KarlGroves> @yeliz Actually I'm from Washington DC area, thx

<sharper> yeliz, introduces herself from METU, looks at blind users

scribe, yeliz

SH: main topic today is choosing the topic to focus on
... main topics are benchmarking topics
... let's discuss these
... different opinions? objections?

Karl: What about doing another survey to choose the first one?

SH: We have two with the same ordering
... includes asking for editors
... should we hold a run-off, or choose now?

SAZ: Let's discuss more; maybe we can join some, etc.
... it's some people's idea, who put it on the Wiki
... but would be good to discuss them here

SH: Markel put these benchmarking items
... Markel, can we do either of them? Join them or do you prefer to do them together?

MV: would be good to do them separately, but I am not sure
... I prefer to choose one and then go for that

SH: is the ordering of these topics important?

MV: there is no specific order

SAZ: What is the expected output of such exploration? What do we hope will come out of that?
... What impact would such an outcome have?
... and based on the impact, how can we maximize the output for web accessibility
... we need to be more structured
... what kind of outcome would make a difference
... we would have 3/4 topics per year so we have to carefully choose the topic

MV: Metrics, the main outcome would be there are so many metrics out there, but we cannot say which one is the best or most appropriate in a defined scenario
... there is a paper that addresses these questions, we propose a framework that includes these metrics
... based on the requirements, this framework allows you to choose the metrics
... but there are many open questions
... the outcome would be a tool to support decision making

KG: I tend to agree, what would give the best benefit, having experience with clients
... having such a tool would be very useful
... some other topics would also be useful for my clients

SAZ: I don't really understand what metric means in this case. Aggregation? Ranking? Test participants (subjective)? This is a complex topic....
... We should focus the topic and what is the expected outcome, or if we decide to go in a certain direction, we have to look at it carefully

KG: I agree with Shadi, especially if you look at the description, it is very broad, would be good to explain which metrics we focus on

SZ: I agree with you, but instead of discussing these in the telecon, we propose a wide topic such that we can all contribute, and then as the telecons go, we can narrow down the topic
... this might be more useful and more appropriate
... it is hard to make this decision before looking at the background and finding editors

KG: I agree

SAZ: working out scientific questions with a background survey
... one of the concerns I have, WCAG working group put a lot of work into combining the requirements of disabilities, and there is a potential here to undo it
... instead we need a scientific approach, we come up with questions

SH: Markel, can you give us more info

MV: the purpose was to focus on automatic metrics, those that the automated evaluation tools include in their reports
... one question, how do these metrics reflect the experience of disabled people?
... not only using automatic tools or users, how about the opinion of experts?
... how do expert metrics compare to the user experience


SAZ: what is the impact of this on the lives of disabled people?
... how can we maximize the impact of our work?
... maybe we need to think how to improve the impact?

SH: It's difficult to think of that in massive detail because the process is new to all of us
... we can look at how different evaluation tools perform,
... evaluating pages for cognitive disabilities, metrics that would be useful for all users

SAZ: the metrics that you are referring to, would they be any different from WCAG?

SH: we have AI metric, or Barrier Walkthrough, Metrics from MV's thesis, or metrics that WAVE uses
... we also know that with WCAG we have metrics that cannot be automated
... and we know that getting 80% agreement between people is very difficult
... so what would be the effect of combining them

MV: metrics are not specifically attached to guidelines
... I tried to focus on automated metrics, because they are quicker to get, you don't need to wait for experts, or user evaluation
... automated metrics are much quicker to get
... you can also combine manual, user or automated tools

SH: I would also agree with that
... maybe we need to look at the pages that evolve, maybe some metrics can still be kept
... or maybe some evaluation aspects are still valid, but maybe some dynamic part of the page can be evaluated
... as MV says there are so many things like this that have not been investigated
... this is not a one shot deal
... there might be a lot of metrics that people use in different contexts

SAZ: why do we have so few references?
... we have to be careful because we do not have the resources to carry out experiments
... we mainly invite existing research
... how would we go about this


SH: there might be some stuff or might be nothing
... maybe we look into this and then bring them to the group
... published research, published papers, experiments conducted
... MV, have you already done some work on this?

MV: we already have a related work section on metrics in our paper
... we compared them, discussed them..
... maybe we missed some...

SAZ: what is the conclusion?

MV: all metrics are different, that is the main outcome
... some are good at distinguishing accessible/non-accessible pages
... most of them are weak, and some of them are strong

SH: we can invite people to give a seminar who actually created these metrics

SAZ: the problem is what is it that we want to achieve?
... do we want to recompare them? What is it that we want to do?
... I am not sure what we are going to do?

MV: we have downloaded a lot of pages, if we use LIFT, or WAVE, how does the score change? An example direction

KG: when you do that don't you have a lot of uncontrollable variables?
... for example, do they crawl pages properly? If you use experts, what is the method used?
... metrics and their strength, depend on the tool used, or depend on the expert used, their experiences
... this could be something much bigger than what we think or chew...

SAZ: we want evaluation or authoring tools to apply and use these metrics
... why do we want these metrics
... purpose is to improve the experience of disabled people, right

SH & MV: +1

SAZ: we want to advise people on the quality of metrics, right?
... I am trying to come up with a goal for this work

MV: a framework to support the decision making on metrics

SAZ: I am confused about the metrics and WCAG 2.0 techniques
... guide people to write techniques, maybe we can give guidance on writing techniques
... making the link between WCAG criteria and techniques to achieve them
... we can provide guidance for people to create good techniques

SH: not sure about techniques?
... WCAG 2.0 is one way of evaluating pages, but how do we express the result of an evaluation? All the tools out there are looking at WCAG 2.0, but the metrics used are all different
... not saying we should analyse the existing tools, but understand the metrics better and provide guidance on providing metrics at the end
... guidelines are there and the success criteria are fixed, but how do we look at these evaluation results
... as KG says, we don't really know how people do the evaluation
... maybe if we provide metrics, then you can better guide people to do and present the tests

SAZ: if we do testing with experts, or with users or with automated tools, do they present different measurements?

SH: some systems assume the best or worst case in applying a criterion? different kinds of users?
... if the users have different kinds of disabilities, maybe the metrics used would be very different?
... but if you have one disabled user the metrics would be different?

YY: what is a metric? I am confused!!!

SAZ: this is very different from what we have on Wiki

SH: maybe we should have an action on writing what would be the outcome?

YY: what is a metric?

<arun> have to leave now, bye

SAZ: what would be the goal of such exploration?
... to define or provide your definition of key terms such as metric
... think of a set of questions that can be investigated?
... elaborate the questions we have on Wiki
... it might be worthwhile to work on the page on Wiki and expand it based on the discussion we have

<sharper> ACTION: markel to capture the meaning (definition) and outcome expected of 'Benchmarking Web Accessibility Metrics'. [recorded in http://www.w3.org/2011/07/28-rd-minutes.html#action01]

<trackbot> Created ACTION-1 - Capture the meaning (definition) and outcome expected of 'Benchmarking Web Accessibility Metrics'. [on Markel Vigo - due 2011-08-04].

MV: what would be the right place for discussing definitions?


SAZ: some key issues/terms need to be further explained
... bring out the purpose/goal and research question
... if we go to the public and ask them to contribute, what is it that we want to ask people?

MV: OK, I got it

SAZ: it seems like we already have our first topic

SH: it seems like it,

SH & SAZ: please continue to contribute to Wiki

SH: anything else?
... thanks for joining, see you next week

Summary of Action Items

[NEW] ACTION: markel to capture the meaning (definition) and outcome expected of 'Benchmarking Web Accessibility Metrics'. [recorded in http://www.w3.org/2011/07/28-rd-minutes.html#action01]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2011/07/28 16:49:34 $