See also: IRC log
saz: any reactions to the requirements documents?
... it is a "will", "should" and "may" type of document
... is anything missing?
niq: not sure about F02 and F03
... validity of results (?)
saz: you are suggesting "the persistency of validity of results"
jim: good idea
saz: i agree
sorry: (
<JibberJim> mute JibberJim
<JibberJim> mute Jim_Ley
<JibberJim> aarrgh, stupid phone
saz: next stage is to publish as a working draft
... get some feedback from outside the group
... success of EARL will depend on whether these requirements are met
... this is an initial working draft - anybody willing to take up future editing of this?
... collect feedback and incorporate it?
jim: I will take that
saz: we can work together
... we can publish very soon, in the next week or two
saz: we have been discussing dropping confidence values
... use some form of percentage instead
... maybe an issue for the test case description rather than the test result
is this chris?
niq: may not be appropriate for some applications; allow heuristic values, and also keep high, medium, low
saz: do people agree to adopt numerical values, and work out exactly which values we take up?
<niq> I agree with what JibberJim just said, too
chris: I prefer to keep low-medium-high, maybe people can find it useful, it's useful for me as well
saz: I hesitate to make it an optional property; only useful for fully automated tests
... 75% of the time it works, which is not the case
... describe in the spec how we can calculate high-medium-low, otherwise we encourage less interoperability, tools not able to exchange this info
... numerical values can be an extension
... chris, how did you use it in your tool?
chris: if you have alt text and it is appropriate, then you say so with high confidence
<niq> Zakim: q+ to say Valet assigns confidence values to results, to determine how likely it is that a guideline has been violated
chris: if the user can make a decision, high, medium and low
... 90% certainty, it will be difficult to exchange the data
... we have to define how high medium and low relate to numerical values
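One way to picture the relation chris describes, as a minimal sketch only: the cut-off values, the representative scores and the function names below are illustrative assumptions, not values discussed or agreed on the call.

```python
# Hypothetical cut-offs, checked from highest to lowest.
THRESHOLDS = [(0.9, "high"), (0.5, "medium"), (0.0, "low")]

# Hypothetical representative score for each level, used for the reverse mapping.
REPRESENTATIVE = {"high": 0.95, "medium": 0.7, "low": 0.25}


def to_label(score: float) -> str:
    """Map a numerical confidence in [0, 1] to high, medium or low."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    for minimum, label in THRESHOLDS:
        if score >= minimum:
            return label
    raise AssertionError("unreachable: the last threshold is 0.0")


def to_score(label: str) -> float:
    """Map high, medium or low back to one representative number."""
    return REPRESENTATIVE[label]
```

If two tools shared a table like this, a "90% certainty" from one and a "high" from the other could be exchanged without loss of meaning; without a shared table the levels stay tool-specific.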
<Zakim> JibberJim, you wanted to say I think this is more of a test case, how reliably you can detect something is a function of the test case rather than of the result
jim: not sure it is a good idea to have a machine give a value for good or bad alt text
... but you can use it like "I am accurate 80% of the time"
<Zakim> niq, you wanted to say Valet assigns confidence values to results, to determine how likely it is that a guideline has been violated and to say I also use "certain"
niq: two different rules; no alt at all is a violation of the guideline for sure
saz: I think this is a good use case
... however, what about when there is alt text but it does not seem right?
niq: that would give a lower confidence result
... different test
<niq> s/low/lower/ :-)
thanx
saz: this is an important use case
... confidence is "how good is the test", but with results you should be careful
... would somebody be willing to abstract it in a more generic way?
chris: we have been looking at this, it is a bit loose, are you thinking of something even more abstract?
saz: no, the examples apply to WCAG 1.0, but we have to have something more generic
... if a human says pass, the confidence is high
... take things like that into account
chris: exam-result, collect response, is wrong with a high level of certainty
... the most generic way is to say high-medium-low
saz: two different developers should both produce confidence values in a similar way, so that results can be compared with each other
... high low is not so comparable
... define how you use the confidence level
chris: two cases of the same level will be interchangeable
who is speaking?
jim: interoperability, we want to know if two tests are equivalent
saz: we know success or fail, but high-medium-low is more granular
... e.g. one tool is fully auto and the other is semi, the second has high confidence
... the result will be the same, the confidence and the test mode will be different
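saz's example can be sketched as two records that agree on the outcome but differ in test mode and confidence. This is only an illustration: the class, field names and values are hypothetical and are not EARL vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """Hypothetical record for one test result."""
    subject: str     # what was tested
    test: str        # which test was applied
    outcome: str     # "pass" or "fail"
    mode: str        # e.g. "automatic", "semi-automatic", "manual"
    confidence: str  # "low", "medium" or "high"


def same_outcome(a: Assertion, b: Assertion) -> bool:
    """Compare two results while ignoring test mode and confidence."""
    return (a.subject, a.test, a.outcome) == (b.subject, b.test, b.outcome)


# Made-up example data: a fully automatic tool and a semi-automatic tool.
auto = Assertion("page.html", "img-alt", "fail", "automatic", "medium")
semi = Assertion("page.html", "img-alt", "fail", "semi-automatic", "high")
```

Here `same_outcome(auto, semi)` holds even though the two records are not equal, which is the distinction being drawn between the result itself and how it was obtained.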
<Zakim> niq, you wanted to say we must expect different tools to differ in some cases. Shouldn't be a problem
saz: we have to give more detail on how to use this property
chris: it will be up to the tools
saz: what we need to know is which factors influence confidence
... e.g. manually, automatically, heuristically, or other
very difficult to follow, but line
niq: we should not correlate test mode directly to confidence level
saz: does anyone want to check this? e.g. if we take the WCAG test suite
chris: each of the tests has an assigned confidence level?
<ChrisR> yes, each of the tests have an assigned confidence level
saz: do you take the confidence value from the test?
... if you have confidence values, do you want them in the report?
... thinking of a checklist - if human, if semi automatic test etc.
... would that be helpful?
chris: yes that's what we have
<niq> that was me, sorry
niq: that's what we have
Sorry about that
saz: is confidence level related to the test description, or do we want a way of "calculating" it?
... are we interested in developing something like that?
chris: not what's in the test description, it depends on the result
niq: it depends on the tool, unless there is a TDL, in which case it is a property of the test case
ci: it depends on how the test was done
... not of the test result
ca: not sure cannot tell..
jim: i am not sure, previously it was part of the test, rather than result. I am happy if we can find out later
<JibberJim> okay, cut me off mid call why don't you phone...
<JibberJim> "the conference is restricted at this time"
<JibberJim> 'cos it's after 6?
saz: on this call it seems the majority wants a confidence property, but optional
<niq> JibberJim: zakim told you that?
saz: because different developers use it differently
<JibberJim> yes
<niq> ok