See also: IRC log
<MichaelC_SJC> scribeNick: MichaelC_SJC
wa: testing helps everybody
figure out how to make best possible test suites
<plh> Wilhelm: I'd like to figure how to make the best possible test suite, how to make the Web better
I work for Opera as testmonkey, test manager
in various parts
jg: also work for Opera
<missed the rest>
ee: also known as fantasai
work on testing in CSS WG
jl: work on testing in Google
want to improve the ecosystem so it all works better
ss: created Webdriver, working on Selenium
very aware of the differences between browsers, would love to sort it out
kk: worked in testing at Microsoft
more recently on Web standards
jj: also at Microsoft
interested in automation, test suites
pl: co-chair of CSS WG
have contributed extensively to that test suite
and working on test shepherd for <missed>
ms: work for W3C, staff contact to HTML WG
work on testing for HTML, extensive contributions to framework
as: working for Adobe
interested in tests working across browsers
nm: learn what's up
<MikeSmith> https://browserlab.adobe.com/en-us/index.html <- Adobe BrowserLab
cs: represent adobe
kk: work for google, Webdriver
bs: AT&T, mobile data services
interoperability in various fora
want to understand the challenges browser vendors have in automation
and how to leverage tools in repeatable continuous framework
to certify new devices as they come out, get updated, etc.
jh: Mozilla, test automation
ct: Mozilla, testing
ta: Google, work on Chrome
not as closely involved in testing, but have worked in CSS on some
<plh> involved in WAI. staff contact for PF, developing ARIA. we're struggling in testing. hoping to contribute to the test framework
<plh> ... we have requirements that we'd like to bring as well
plh: W3C, Interaction Domain, lots of your favourite groups
want a common framework, common way to write tests
wa: first, want browser vendors to introduce how they do testing
then, presentations of a few testing approaches
finally, discussion of how to write tests for different types of functionality
90% of tests cover how something rendered to screen in a particular way
or script returns an expected result
or user fills out form and certain result
ss: WebDriver is an API for automation of WebApps
developer-focused, guides people to writing better tests
Merged with Selenium a couple years ago
fairly simple, load page, find element, perform actions like focus, click, read, etc.
kk: does it simulate user input at driver level, or elsewhere?
ss: in past user interactions were done by simulating events in DOM
but browsers inconsistent in how they handle those
when they do what etc.
so events at script level not feasible
so do events at OS level
that is high fidelity but terrible machine utilization
and wastes developer's time
so now, allow window not to have focus and send events via various OS APIs
but OS not designed to send high fidelity user input to background window
so now, Opera and Chrome pump events into event loop of browser
<scribe not sure that was caught right>
Webdriver has become a de facto standard for browser automation
most popular open source framework
as can be seen by job postings requiring familiarity with it
has reasonable browser support
Opera, Chrome, and Android add-on, Mozilla starting
uses Apache2 license
nm: tried on mobile browsers?
ss: yes, in various <lists>
it's a small team
covering wide range of browsers and platforms
see 3 audiences for automation
1) App developers are vast majority
need to test applications
hard to get developers to write tests, and can only get them to write to one API when you get it at all
first audience for WebDriver
2) browser vendors
desire to automate their testing as much as possible
bs: how does Webdriver relate to qunit <sp?>
ss: <didn't catch details>
bs: so Webdriver isn't a framework, it's an API for automating events
ss: clearly a browser automation API
e.g., understand Opera runs 2 million tests / day with this
3) Spec authors
some specs can be articulated entirely in script
and tested that way
others need additional support, this provides that
ee: more spec testers than authors?
ss: yes, those focusing on test
... user perspective
it's a series of controlled APIs
to interrogate DOM
execute script with elevated privileges
and provide APIs to interact, so not just read-only
jj: <question missed>
ss: <answer missed>
jj: avoids cross origin vulnerability?
bs: good, some complicated scenarios
ss: implementer view
neutral to transport and encoding
which bring clients that can handle immediately
<JohnJansen> My question was regarding the bypass of the x-origin security restriction
ss: automation and security are opposite concerns
<JohnJansen> answer: the jscript still honors that restriction, though webdriver itself ignores it.
generally, build support into browser
and enable it via an additional component
or command line features
<shows short script, then executes>
kk: how Opera?
ss: Watir on top of
... API designed to be extensible
expose capabilities via a simple interface or casting
jj: How are visual verifications handled?
ss: can take a screenshot, platform-dependent
Opera has extended with ability to get hash of the screenshot
attempt to capture entire area described by DOM, not just viewport
deals with difficulties like fixed positioning etc.
but very browser specific
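A minimal sketch of hash-based screenshot verification, for illustration only. It assumes the screenshot is already available as raw bytes and uses SHA-256; the function names are hypothetical, and the minutes do not say which hash Opera's extension uses.

```python
import hashlib


def screenshot_hash(image_bytes):
    """Return a hex digest that identifies this exact screenshot."""
    return hashlib.sha256(image_bytes).hexdigest()


def matches_baseline(image_bytes, expected_hash):
    """True only if the screenshot is byte-for-byte the one the baseline was taken from."""
    return screenshot_hash(image_bytes) == expected_hash


# Illustrative stand-ins for captured screenshots (not real image data).
baseline = screenshot_hash(b"pixels-from-known-good-run")
same_run = matches_baseline(b"pixels-from-known-good-run", baseline)
changed_run = matches_baseline(b"pixels-with-one-change", baseline)
```

An exact hash changes on any single-pixel difference, which is one reason such a check is browser and platform specific.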
jj: human comparison mechanism?
ss: in google, teams of people do that
we just provide the mechanism
don't want to over-prescribe how to process images, as state of the art continually changes
bs: to compare layout between different browsers
capture screens, or query position of elements?
ss: can do both
can get location of an element
bs: how about different screens sizes
interested in specifically how things rendered in various circumstances
ss: the locatable interface can provide various types of measures
kk: differences among browsers are wide for many reasons
it's part of the landscape
ss: was able to use same tests using same APIs
at rendering level can be different
plh: platform AAPIs use similar services
hope e.g., ARIA can use WebDriver
ss: have looked at AAPIs, can look at elements by ARIA role etc.
on relationship to AAPIs
sometimes they're enough, sometimes not
one of the next big things in hybridized apps, part native and part Web
may need to use AAPIs to test
plh: think ARIA can be tested using this
ss: have applied Webdriver to native app testing using AAPIs
kk: there has been a path starting with MSAA
ss: AAPIs are extremely low-level
e.g., a combobox is represented as a few different controls together
kk: developers create all kinds of crazy things
so UI automation allows patterns
mc: can speak to AAPI from WebDriver
ss: Webdriver sits on top of AAPI
but because of script interface, could talk back and forth a bit
wa: Opera has a layer "Watir" on top of WebDriver
test file looks like a manual test, e.g., a human could interact with it
<demos manual execution of test>
<that can also be executed using the script shown previously>
for each test file, there's a block in the automation script
ss: Webdriver similar
ss: <answer related to webelement.gettext>
jj: why wrapping in Watir
wa: was done before projects had merged
now doesn't matter as much
plan to submit Opera set of tests to HTML WG for official test suite
but want them in a format other browser vendors could use
Opera uses Ruby bindings, Mozilla uses Python bindings
need to automate in all browsers, Webdriver seems way to go
for official W3C tests, question of what language binding to use?
Python is the other one being explored by Mozilla and Chrome
also is "politically unencumbered"
vs some other candidates out there
wa: how complete are JS bindings?
js: still finalizing
kk: <something detailed>
js: API stable
loading script within browser is the part that still needs working on, to get around sandbox
it's usable now, but have debugging etc. to do
ss: so maybe Python preferable?
jg: having dependency on core could be a big stability issue
<^ not sure that's scribed right>
kk: dangerous to build on things that are changing
otoh, need bindings to be something that's available on all targets
ss: normally test and browser communicate like a client / server
can do over a web socket
and run test on machine independent of browser
wa: was able to test a mobile device on a different continent this way
plh: if we set up a test server on W3C site, could you allow it to just run tests at you?
ss: can connect from browser to a test server
so in theory, this works
but security concerns
need a manual intervention to put browser in testing mode
mc: have to trust W3C server from security POV
how we allow tests to be contributed needs to be careful
<general view of usefulness of this approach>
<JohnJansen> as: is there support for IME? how good is it?
ss: support varies by platform as we prioritize development
<mentions wherefores and whynots>
do support internationalized text input
for testing I18N but could be used to test other stuff
do: how well documented is JS API?
ss: fairly extensive
Facebook developed PHP bindings using this documentation
Selenium stuff hosted under software freedom conservancy
can use w/o the open source stuff, but also handy to use the open source stuff
wa: Just started browser tools and @@ WG
primary goal is to standardize Webdriver API at W3C
<jhammel> (i think)
welcome you all to join to make this happen
also want to explore whether all browser vendors can handle official test suites using Webdriver API
ss: aware of support from Google, Opera, Mozilla
explicit non-support from Microsoft, Apple, Nokia, HP
also support from RIM
plh: would Microsoft be able to accommodate tests using this?
standardization of the API will help a lot
<Another link for the WG is http://www.w3.org/testing/browser/>
also need tests structured in certain ways we can work with
<fantasai> kk: having the tests be self-describing is very important. If I was a TV browser vendor that doesn't support webdriver, I would want to be able to leverage the W3C tests as well
jg: tests always structured so you could run manually, though would be ridiculous to do so with them all in practice
ms: first thing we need is a spec
doesn't matter where editors draft hosted, can do at W3C
IP commitments kick in when we publish a Working Draft
ss, wa: ready to move right away on that
kk: W3C would own code?
ss: W3C would maintain spec
and a reference implementation
but there could be other implementations
mc: reference implementation doesn't necessarily have to be W3C
plh: spec is most important for W3C
ss: all Google testing in some way related to Webdriver
bs: supported in mobile?
ss: chrome and android
wa: also opera for mobile
bs: so other platforms is just lack of implementation?
ss: right; Nokia and Apple haven't implemented
just need a driver
kk: support IE6? want to get rid of that
ss: drop support when usage drops below a certain level
plh: support from Microsoft for Webdriver API will help HTML WG a lot
jj: even if Opera submits tests and HTML adopts, they're self-describing so still testable manually
plh: what does Nokia think?
nm: Nokia not really interested
focused on Webkit stuff
today is first time hearing about it
ss: it's not just about testing a spec, it's about ensuring users can use content in your browser
so that market force should drive interest even if internal interest is elsewhere
nm: how is performance?
ss: rapid on Android, but slow on emulator
iPhone is fast directly and in emulator
<something else> fast
<jhammel> ^ pixel verification
ss: haven't seen a lot of pixel verification on mobile devices
<scribe having a hard time hearing or understanding remainder of discussion>
<MikeSmith> agenda: http://lists.w3.org/Archives/Public/public-test-infra/2011OctDec/0014.html
<dobrien> Could we get the minutes updated again as well please?
jj: propose not requiring webdriver in first version of test suite
<bryan> Scribenick: bryan
kk: To walk thru testing of
... shows slides "Standards and Interoperability"
<fantasai> IE testing diagram: Standards, Customer Feedback, Privacy, Accessibility, Performance, Security
<fantasai> (these are pictured as hexagrams around a central "Internet Explorer" label)
kk: IE testing has various chunks as shown on the slide (slides to be shared)
<fantasai> "Internet Explorer Testing Lab" w/ photo
<fantasai> IE5 -> IE10
<fantasai> 948 Workstations
<fantasai> 119 servers
<fantasai> 1200 virtual machines
<fantasai> remotely configurable
<fantasai> 152 versions of IE shipped every "Patch Tuesday"
<fantasai> Green Lab Initiative saves ~218 tons of CO2/Year
kk: IE testing lab using a lot of machines with a lot of IE versions tested every week
<fantasai> "Standards Engagement"
<fantasai> TC39 (Ecmascript 5)
<fantasai> - CSS
<simonstewart> Slides for the webdriver notes: https://docs.google.com/present/edit?id=0AVrYfCxRNKUGZGc5Nm1ocGhfNzFnaGd2bmZnYw
<fantasai> cycle diagram: Testing -> spec editing -> implementations -> (loop back to Testing)
<fantasai> "Standard Contributions"
<fantasai> - Spec editing
<fantasai> -test case contributions w3c and ecma
kk: encourage standards engagement and participation in various groups
<fantasai> -- 14623 tests submitted
<fantasai> -- across IE9/IE9/IE10 features
<fantasai> - hardware (Mercurial server)
<fantasai> - IE Platform Preview Builds
kk: have contributed a lot of tests and hardware
... preview builds allow early access and feedback
<fantasai> "IE10 Standards Support"
<fantasai> CSS2.1, 2D Transforms, 3D Transforms, Animations, backgrounds and Borders, Color, Flexbox, Fonts, Grid alignment, hyphenation, image values gradients, media queries, multi-col, namespaces, OM Views, positioned floats, selectors, transitions, Values and Units
<fantasai> DOM element traversal, HTML, L3 Core, L3 Events, Style, Traversal and Range
<fantasai> ECMASCRIPT 5
<fantasai> File Reader API
<fantasai> File Saving
kk: IE 10 will support a lot of standards CSS, HTML5, Web APIs, ... http://ietestdrive.com
<fantasai> HTML5 appcache, async, canvas, drag and drop, forms and validation, structured clone, history API, parser, sandbox, selection, semantic elements, video and audio
<fantasai> ICC Color profiles
<fantasai> Indexed DB
<fantasai> Page Visibility
<fantasai> Selectors API L2
<fantasai> SVG Filter Effects
<fantasai> SVG standalone and in HTML
kk: also look at the IE blog
<fantasai> Web Sockets
<fantasai> Web Workers
<fantasai> XMLHttpRequest L2
<fantasai> "Items for Discussion"
<fantasai> * WG Testing Inconsistent
<fantasai> - when are test created? Before LC? CR?
<fantasai> - When are tests reviewed?
<fantasai> - vendor prefixes
<fantasai> - 2+ impl passing tests required for CR?
<fantasai> * Review Tools (none)
kk: issues are inconsistent testing across WGs
<fantasai> Note -- that's not quite true anymore, plinss wrote one for csswg :)
kk: when tests are created e.g. related to last call or earlier
... soft rules for how a spec is allowed to progress are maybe not enough
plh: these are soft rules currently
jj: test tools recently developed have helped with consistency, flushing out remaining inconsistencies is a goal
... different test platforms result in different tests as submitted to W3C
Michael_Cooper: experience has convinced that tests should be available by last call
Kris_Krueger: why would this not be a rec across W3C?
plh: it's not easy to
... some WGs will complain
jj: amping the expectations on testing will help
mc: it should be the rule, with exceptions allowed
<Zakim> MichaelC_SJC, you wanted to say I now believe tests need to be ready by Last Call
Elika_Etemad: implementations are needed to see how tests are working
James_Graham: the process does not map to browser development reality
Elika_Etemad: it's difficult to say when spec development is done thus making a hard deadline
John_Jansen: problems often cause the specs to move backward
Elika_Etemad: CR is the test-the-spec phase, not fixing bugs in browsers
... having to move CR back due to bugs is an issue, we need an errata process to allow edits in CR
plh: we are not here to fix the W3C process
John_Jansen: the more times you go thru the circle (edit/implement/test) the better, and also the earlier
James_Graham: when we implement we write the tests... test suites should not be closed
<fantasai> James_Graham: The state of the spec is irrelevant to when we write tests
Mike_Smith: the Testing IG is scoped broadly, perhaps too much so. The IG will decide what its products will be, e.g. a best practice on when test suites are
... writing this down even if we do not fix the process will help others avoid the same mistakes of the past
... it will still have some value
Wilhelm_Anderson: how do you run tests, what is automated, is development inhouse
Kris_Krueger: write our own tests
plh: from JQuery?
Kris_Krueger: no, customer feedback is also considered
... e.g. Gmail support provides feedback
... have a lot of automated tests, ship every Tuesday, and get quick feedback from users/developers
Narayana_Babu_Maddhuri: is there any review of the test cases to determine is the test a valid test, validation of the test results?
plh: the metadata of the test log should clarify what is being tested
Kris_Krueger: pointing to where the test relates to the spec is helpful
plh: we cannot force metadata into tests, but we can encourage this info to help ensure test value clarity
Narayana_Babu_Maddhuri: good reporting would be helpful
plh: knowing e.g. what property works across devices and platforms is a goal, and matching tests to specs would support that
James_Graham: knowing why something is failing is sometimes difficult, dependencies are not clear and why the test failed is unclear
<MichaelC_SJC> == Lunch break is 1 hour ==
<krisk_> Firefox Testing Presentation
<krisk_> clint: Tools automation lead at Mozilla
<krisk_> Clint: overview of their testing
<krisk_> Grown over the years
<krisk_> Test Harnesses
<fantasai> "Automation Structure: Test Harnesses"
<fantasai> - C++ Unit
<krisk_> C++ Unit testing, XPCShell, not too interesting for this group
<fantasai> - Reftest
<fantasai> -UI Automation Frameworks
<fantasai> - Marionette
<krisk_> Mochitest - tests dom stuff
<krisk_> New UI automation framework - Marionette
<krisk_> Reftest drill down
<fantasai> "Reftest: style and layout visual comparison testing"
<fantasai> Reference: <p><b>This is bold</b></p>
<fantasai> Test: <p style="font-weight: bold">This is bold</p>
<fantasai> clint: The test and the reference create the same rendering in different ways.
<fantasai> clint: Then we take screenshots and compare them pixel by pixel
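The pixel-by-pixel comparison described above can be sketched as follows. This is an illustration under stated assumptions, not Mozilla's reftest harness: screenshots are modelled as lists of rows of (r, g, b) tuples, and the function names are hypothetical.

```python
def first_pixel_difference(test_bitmap, reference_bitmap):
    """Return (x, y) of the first differing pixel, or None if the bitmaps are identical.

    Each bitmap is a list of rows; each pixel is an (r, g, b) tuple.
    Bitmaps of different dimensions raise ValueError.
    """
    if len(test_bitmap) != len(reference_bitmap):
        raise ValueError("bitmaps differ in height")
    for y, (test_row, ref_row) in enumerate(zip(test_bitmap, reference_bitmap)):
        if len(test_row) != len(ref_row):
            raise ValueError("bitmaps differ in width")
        for x, (test_px, ref_px) in enumerate(zip(test_row, ref_row)):
            if test_px != ref_px:
                return (x, y)
    return None


def reftest_passes(test_bitmap, reference_bitmap):
    """A reftest passes when test and reference render to the same pixels."""
    return first_pixel_difference(test_bitmap, reference_bitmap) is None


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
rendered_test = [[WHITE, BLACK], [BLACK, WHITE]]
rendered_reference = [[WHITE, BLACK], [BLACK, WHITE]]
rendered_broken = [[WHITE, BLACK], [WHITE, WHITE]]
```

Because both files go through the same engine, the comparison needs no stored baseline image; only the two renderings from the current build are compared.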
<fantasai> clint: One of the libraries it pulls in is the SimpleTest library.
<fantasai> clint: It has the normal asserts: ok, is, stuff to control whether asynchronous or not
<fantasai> clint: This other file here (in this example) turns off the geolocation security prompts
<fantasai> clint shows a geolocation test
<fantasai> plh: How does this route around the security checks?
<fantasai> clint: uses an add-on
<fantasai> clint: has a special powers api
<fantasai> "Marionette: Driving Gecko into the future"
<fantasai> This is a mechanism we can use to drive any gecko-based application either by UI or by inserting script actions into its various script contexts.
<fantasai> How it works -
<fantasai> 1. socket opened from inside gecko
<fantasai> 2. Connect to socket from test harness, either local or remote
<fantasai> 3. Send JSON protocol to it
<fantasai> 4. Translates JSON protocol into browser actions
<simonstewart> uses webdriver json protocol streamed over sockets directly
<fantasai> 5. Send results back to harness in JSON
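The five steps above can be sketched as a toy request/response loop. This is not the Marionette or WebDriver wire protocol: the newline-delimited framing, the "echo" action, and all function names are assumptions made for illustration, and an in-process socket pair stands in for the socket a real harness would connect to locally or remotely.

```python
import json
import socket
import threading


def handle_command(command):
    """Stand-in for step 4: turn a decoded JSON command into a (fake) browser action."""
    if command.get("action") == "echo":
        return {"status": "ok", "value": command.get("value")}
    return {"status": "error", "message": "unknown action"}


def serve_one_command(conn):
    """Browser side: read one JSON command line and write one JSON result line (steps 3-5)."""
    with conn, conn.makefile("rw", encoding="utf-8") as stream:
        command = json.loads(stream.readline())
        stream.write(json.dumps(handle_command(command)) + "\n")
        stream.flush()


def send_command(conn, command):
    """Harness side: send one JSON command and return the decoded reply."""
    with conn, conn.makefile("rw", encoding="utf-8") as stream:
        stream.write(json.dumps(command) + "\n")
        stream.flush()
        return json.loads(stream.readline())


def run_demo():
    # socketpair() gives two connected endpoints without binding a port.
    browser_end, harness_end = socket.socketpair()
    server = threading.Thread(target=serve_one_command, args=(browser_end,))
    server.start()
    try:
        return send_command(harness_end, {"action": "echo", "value": "hello"})
    finally:
        server.join()
```

Calling `run_demo()` performs one full round trip. Keeping the harness on the other end of a socket is what lets the same test drive a browser on another machine or device.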
<jhammel> wiki page: https://wiki.mozilla.org/Auto-tools/Projects/Marionette
<fantasai> clint: We run all of these test on every check in every tree we build on.
<fantasai> clint: Goes into a dashboard
<fantasai> slide: shows screenshot of TinderboxPushLog
<fantasai> wilhelm: Can we steal your Mochitests? What do we need to do to do so?
<fantasai> clint: Check them out of the tree and see how well they run in Opera
<fantasai> clint: Some of the stuff we did, e.g. special powers extension,
<fantasai> clint: but it's now a specific API (used to be scattered randomly throughout tests)
<fantasai> clint: If you had something similar and named it specialpowers, then you could use that to get into your secure system
<fantasai> clint: So should be possible.
<fantasai> clint: A lot of tests we have in the tree are completely agnostic; don't do anything special at all, should work today
<jhammel> mochitests are at http://hg.mozilla.org/mozilla-central/file/tip/testing/mochitest
<fantasai> wilhelm: Are there plans to release these tests to geolocation wg?
<fantasai> clint: I think they already did. The guy who wrote the tests is on that wg
<fantasai> kk: ... they're hard-coded to use the Google service. If you don't use it, they don't run...
<fantasai> kk: Not too many though
<fantasai> some discussion of sharing tests
<fantasai> Alan: I think WebKit is using some Mozilla reftests, but not using them as reftests
<fantasai> kk: I'm fine w/ reftests. But of course won't work for everything.
<fantasai> kk: CSS tests we wrote are self-describing.
<fantasai> Alan: do you have automation?
<fantasai> kk: Yes
<fantasai> rakesh: Do you run the tests every day?
<fantasai> clint: Every checkin
<fantasai> clint: Different trees run different numbers of tests.
<fantasai> clint: Our goal is to have test results back within 2 hours. Right now we're averaging 2.5hrs
<fantasai> fantasai: You're responsible for watching the tree and backing out if you broke something.
<fantasai> discussion of test coverage
<fantasai> discussion of subsetting tests during development
<fantasai> wilhelm: How much noise do you have?
<fantasai> clint: Don't know about false positives
<fantasai> clint: Probably not many; once we find one, we check for that pattern elsewhere
<jhammel> orange factor, for tracking failures: http://brasstacks.mozilla.com/orangefactor/
<fantasai> clint: Thing we really have is intermittent failures
<fantasai> clint: We're trying really really hard to bring it down
<fantasai> clint: Used to be on every checkin you'd get, on average, 8 intermittent failures
<fantasai> clint: we pushed it down to 2
<fantasai> clint: And then we added the Android tests
<fantasai> clint: trying to bring it down again
<fantasai> duane: Can I instrument Marionette today in FF7?
<fantasai> clint: No, code we're depending on now is landing currently on Nightly
<fantasai> clint: Released probably... May?
<fantasai> clint: Depending on work done by Developer Tools group
<fantasai> clint: They have a remote debugging protocol they're implementing
<fantasai> clint: Will be really nice; decided this would be great to piggyback on. Don't need two sockets in lower-level Gecko.
<fantasai> clint: So won't be available until that's released.
<fantasai> clint: Currently in a project repo... land in Nightly in ~2.5 weeks
<fantasai> plh: Marionette is only for Fennec, not for desktop version?
<fantasai> clint: For Fennec right now. Planning to go backwards and use for Desktop as well.
<fantasai> clint: My goal is to move all our infrastructure towards that
<fantasai> kk asks about reducing orange
<fantasai> clint: It's mostly a one-by-one effort of fixing the tests
<simonstewart> Interesting comment about avoiding using setTimeout in tests
<fantasai> kk: Are you going to take Mochitests into W3C? Anything preventing you?
<fantasai> clint: Nothing right now. We'd have to clean them up and make them cross-browser. Good for everyone, not opposed, just a matter of finding people and time
<fantasai> jgraham: there's a bug on making testharness.js look like Mochitest to Mozilla
<fantasai> "This looks vaguely familiar"
<fantasai> wilhelm: Say a few words about testing at Opera
<fantasai> wilhelm: We have a mainline, which is supposedly always stable, and then when we're developing a feature, it gets branched and at some point tests start passing (that's the yellow, b/c out of sync with mainline) and then we merge and that becomes mainline
<fantasai> diagram shows mainline with six green dots going forward
<fantasai> branch goes off, two red dots, one yellow
<fantasai> arrow from mainline to green dot on feature branch
<ctalbert_> The wiki page we(mozilla) wrote that details our "lessons learned" from fixing intermittently failing tests is here: https://developer.mozilla.org/en/QA/Avoiding_intermittent_oranges
<fantasai> arrow from green dot back to green dot on mainline
<fantasai> jgraham: ...
<fantasai> jgraham: Our setup's a bit different
<fantasai> jgraham: All the tests are in subversion in their own repository that's separate from the code. It's just a normal webserver: Apache, PHP
<fantasai> jgraham: When you ask for tests to be run, they get assigned from the server and we send them out to a couple hundred virtual machines
<fantasai> jgraham: not quite MSFT's setup
<fantasai> jgraham: And then we store every result of every test
<fantasai> jgraham: I think you just store whether all the tests passed... we store, in this build this test passed.
<fantasai> jgraham: We have a huge database of this information
<fantasai> jgraham: Theoretically we can delete stuff, but we store everything.
<fantasai> jgraham: In a mainline build from yesterday, we ran quarter of a million tests
<fantasai> jgraham: That's not quarter million files -- it's 60,000 files, some of which produce multiple results
<fantasai> jgraham: e.g. some tests from HTML5 test in W3C, one file might produce 10,000 results
<fantasai> jgraham: Typically it's a JS thing and it just runs a bunch of code and at the end it has some results
<fantasai> jgraham: Dumps them to the browser in some way
<fantasai> jgraham: The way we do that right now is pretty stupid, so I won't talk about it
<fantasai> slide: Visual tests, JS tests, Unit tests, Watir tests, Manual tests :(
<fantasai> jgraham: System was designed 7 years ago or sth
<fantasai> jgraham: For visual tests, you just take a screenshot, and then we store the screenshot.
<fantasai> jgraham: Someone manually marks whether that screenshot was a pass or fail.
<fantasai> jgraham: Don't do that. You have to do it once per test, and then once any time anything changes very slightly
<fantasai> jgraham: e.g. introduce anti-aliasing test, have to re-annotate all tests
<fantasai> jgraham: this format is deprecated
<fantasai> wilhelm: We have 20,000 tests on 3 different Opera configurations...
<fantasai> wilhelm: We want to kill these tests and use reftests instead
<fantasai> jgraham: Oh, reftests should be on that list too
<fantasai> jgraham: Recently we implemented reftests, and we're actively trying to move tests to reftests.
<fantasai> jgraham: You can't test everything with reftest, but when you can it's much better
<fantasai> Alan: Do you keep track of when the reference file bitmap changes?
<fantasai> Alan: What if both the reference and the test change identically such that the test should fail but doesn't?
<fantasai> plinss: In the case of the CSSWG when we have a fragile reference, we have multiple references that use different techniques
<fantasai> jgraham: We have a very lightweight framework we used to use for JS tests. Only allowed one test per page.
<fantasai> jgraham: Easy to use, but required a lot of convoluted logic for each pass/fail result.
<fantasai> jgraham: For new test suites, we're using testharness.js
<fantasai> jgraham: similar to Mozilla's MochiKit
<fantasai> jgraham: Unit tests are C++ level things not worth talking about here
<fantasai> jgraham: When things need automation, we use Watir -- discussed this morning
<fantasai> jgraham: When all else fails, we have manual tests
<fantasai> wilhelm: Notice that the monkey looks really unhappy
<fantasai> jgraham: For the core of Opera, we schedule a test day and just run tests
<fantasai> plh: How many manual tests do you have?
<fantasai> wilhelm: around 2000 before, less now...
<fantasai> wilhelm: Probably spend about a man-year on manual tests per year
<fantasai> wilhelm: Say some things about challenges we have, things we need to take into account when writing tests internally and for W3C
<fantasai> wilhelm: First thing is device independence
<fantasai> wilhelm: We run 3 different configurations of Opera: Desktop profile, Smartphone profile, and TV profile
<fantasai> wilhelm: Almost every time someone requests a build, it will be tested on those three profiles
<fantasai> wilhelm: We notice that if you have a static timeout in your test, e.g. wait 2s before checking result, that will break on stupid profile with low resources
<fantasai> wilhelm: On some platforms we automatically double or triple it, and we hope it works, but it's not really good solution
<fantasai> jgraham: How do you deal with ... ?
<fantasai> clint: we time out our tests after a set time period and mark it as failed
<fantasai> jgraham: Most assumption is don't depend on device size or speed -- test will randomly fail.
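One common way to avoid a static timeout is to poll for the expected condition with an upper bound, so fast devices finish early and slow ones get the full allowance. A minimal sketch; `wait_until` and its defaults are hypothetical and not taken from any harness discussed here.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns a truthy value or timeout seconds elapse.

    Returns True as soon as the condition holds, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Illustrative condition: becomes true on the third check.
checks = {"count": 0}


def result_is_ready():
    checks["count"] += 1
    return checks["count"] >= 3


became_ready = wait_until(result_is_ready, timeout=2.0, interval=0.01)
never_ready = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

The timeout then only bounds the failure case, so it can be set generously for low-resource profiles without slowing down passing runs.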
<fantasai> wilhelm: Brings me to the next problem: random
<fantasai> wilhelm: If you have so many tests and even small percentage fail randomly, going to spend man-years investigating those failures
<fantasai> wilhelm: When we add new configurations, when we steal tests from source of unknown quality, we spend many man-years stamping out randomness in the tests
<fantasai> wilhelm: The more complex the test, the more likely to randomly fail
<fantasai> wilhelm: Simplest tests are JS.
<fantasai> wilhelm: For imported tests from random sources, could be very bad
<fantasai> wilhelm: Then comes visual tests
<fantasai> wilhelm: Sometimes complexity is needed, but if can simplify will do that
<fantasai> wilhelm: We have a quarantine system: run 200 times on test machines first to make sure it's good
<fantasai> wilhelm: Still, sometimes things slip through.
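The quarantine idea can be sketched as running a candidate test repeatedly and admitting it only if it never fails. The function names, and the choice to treat an exception as a failure, are assumptions for illustration; this is not Opera's system.

```python
def count_failures(test, runs=200):
    """Run test() the given number of times and return how many runs failed.

    A run fails if test() returns a falsy value or raises an exception.
    """
    failures = 0
    for _ in range(runs):
        try:
            passed = bool(test())
        except Exception:
            passed = False
        if not passed:
            failures += 1
    return failures


def passes_quarantine(test, runs=200):
    """Admit a test only if every one of the runs passed."""
    return count_failures(test, runs) == 0


# Illustrative flaky test: fails on every 50th invocation.
invocations = {"count": 0}


def flaky_test():
    invocations["count"] += 1
    return invocations["count"] % 50 != 0


flaky_failures = count_failures(flaky_test, runs=200)
```

A test that fails once in a few thousand runs would usually pass 200 runs, which is consistent with the remark that some still slip through.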
<fantasai> wilhelm: We steal your tests. Thank you.
<fantasai> slide: jQuery, Opera, Chrome, Microsoft, mozilla, W3C
<fantasai> wilhelm: Keeping in sync with the origin of the test is difficult
<fantasai> wilhelm: When someone updates a test elsewhere, we don't automatically get that
<fantasai> wilhelm: When we muck about the test to get it to work on our system, we have to maintain patches
<fantasai> wilhelm: If we fix bad tests, sometimes easy to contribute back, but sometimes not
<fantasai> wilhelm: Automating tests to use our Watir scripts, can also become a problem.
<fantasai> wilhelm: Our current approach is not usable
<fantasai> wilhelm: need a better way for us all to keep in sync
<fantasai> kk: This is why we have submitted and approved folders
<fantasai> jgraham: The problem from our POV is really .. part of it is version control problem on our
<fantasai> jgraham: Don't have a good way to keep our patches separate from upstream changes
<fantasai> jgraham: If we have W3C tests, and we pull new version, don't have a way to say "these are bits we changed to make it work on our version"
<fantasai> jgraham: ... reporting and script file separate
<fantasai> jgraham: if we pull some tests from Mozilla, say, and they're JS engine tests and they update them, if we try and merge them.. someone has to work out how to do that by hand. It's kind of a nightmare.
<fantasai> wilhelm: Last thing about randomness, esp imported
<fantasai> wilhelm: Some tests rely on external servers.
<fantasai> wilhelm: Great when we only had a few tests
<fantasai> wilhelm: But now it's a problem. Servers go down, etc.
<fantasai> wilhelm: Conclusion there is: don't do that. :)
<fantasai> wilhelm: That's it!
<fantasai> jhammel: Wrt upstream tests, standardizing on formats and standardizing on process
<fantasai> wilhelm: We set up time at 3:15 today to discuss this exact issue
<fantasai> mc: You say you have to fix tests to work on your product.
<fantasai> mc: Question is how do you separate fixing test to be not random, vs. making them work on a particular product
<fantasai> jgraham: When we pull in tests, we try not to change anything to do with the test.
<fantasai> jgraham: We don't require the tests to pass to be in our system.
<fantasai> jgraham: The thing we need to change is, can this test report back to our servers.
<fantasai> jgraham: But external tests are usually not designed that way.
<fantasai> wilhelm: I think testharness.js approach is good, because those are separated.
<krisk_> That is the end of Opera
<MichaelC_SJC> 's presentation
<krisk_> The next person up is peter from HP on css wg update (10 minutes)
<krisk_> Then a discussion on rendering tests for about 1 hour
<krisk_> has lots of information on CSS WG testing
<krisk_> Tests are 'built' from xml into multiple formats - html, xhtml, etc...
<krisk_> Test harness is a wrapper around the tests that are loaded in an iframe
<krisk_> It loads the tests that have the least number of tests
<krisk_> The harness has a filter for spec section, etc..
<krisk_> The harness has meta-data description for each of the tests
<stearns> test format requirements: http://wiki.csswg.org/test/css2.1/format
<krisk_> The harness also has test results that can be shown for each of the browser/engine versions
<krisk_> Build process has requirements that will be improved over time - meta data, ref test, title, etc...
<krisk_> Adding meta-data helps review process, though most submitters don't like to add this data
<krisk_> Multiple refs for the same test exist and a negative ref test as well
<krisk_> You can have two ref tests if the spec has two different results - for example margin collapsing
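A minimal sketch of how multiple references and a negative reference could combine into one verdict. The rule used here is an assumption, not confirmed by the minutes: the test must match at least one "match" reference (when any are given) and must differ from every "mismatch" reference. Renderings are modelled as plain strings standing in for bitmaps, and the name `reftestPasses` is hypothetical.

```javascript
// Hypothetical helper: decide a reftest verdict from several references.
// Assumed rule: match ANY "match" reference, differ from ALL "mismatch" ones.
function reftestPasses(testRendering, references) {
  const matchRefs = references.filter((ref) => ref.relation === "match");
  const mismatchRefs = references.filter((ref) => ref.relation === "mismatch");
  // With no "match" references, only the negative references constrain the result.
  const matchesOne = matchRefs.length === 0 ||
    matchRefs.some((ref) => ref.rendering === testRendering);
  const differsFromAll = mismatchRefs.every((ref) => ref.rendering !== testRendering);
  return matchesOne && differsFromAll;
}

// Two acceptable results (e.g. two margin-collapsing outcomes) plus one negative ref.
const marginRefs = [
  { relation: "match", rendering: "collapsed" },
  { relation: "match", rendering: "not-collapsed" },
  { relation: "mismatch", rendering: "overlapping" },
];
console.log(reftestPasses("collapsed", marginRefs));
console.log(reftestPasses("overlapping", marginRefs));
```

A real harness would compare screenshots rather than strings; only the combination logic is illustrated here.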
<krisk_> If a ref test can't be used then in some cases a self-describing test works
<krisk_> Spec annotations are used that map back to the annotated spec
<krisk_> The annotated spec has total tests and results for each section of the spec
<krisk_> Now on to the test review system
<krisk_> Very tight coupling to the css test metadata
<krisk_> Tracks history and other information about a test case
<krisk_> jgraham: is this tied to the test file?
<krisk_> peter: no it's possible to have this information in another file
<krisk_> jgraham: can this handle a case when multiple files are used to create a lot of tests
<krisk_> peter: yes we have the same issue for the media query test cases
<krisk_> Wilhelm: So does css still use visual non-ref tests?
<krisk_> fantasai: for css3 we require ref-tests, so no
<krisk_> peter: The system is built to save time and automate parts
<krisk_> peter: for example when a test is approved it is moved from submitted to approved
<krisk_> Michael: Does the system have access control checks for approval?
<krisk_> peter: yes
<krisk_> Ken: Chrome Testing Information
<simonstewart> kk: works on the chrome automation team
<simonstewart> kk: not an automation group in the same sense as mozilla
<simonstewart> chrome depends on webkit
<krisk_> kk is not krisk
<simonstewart> webkit layout tests, pixel-based tests
<simonstewart> kk == ken_kania
<simonstewart> kk: dom dump tree tests
<simonstewart> kk: not got a lot of insight into the specifics of the webkit tests. Focuses mainly on the chrome browser
<simonstewart> kk: couple of layers of testing
<simonstewart> kk: lowest layer is the c++ browser tests
<simonstewart> kk: probably more than other browsers do. Special builds of chrome which will run C++ in the ui thread
<simonstewart> kk: relatively low level, though
<simonstewart> kk: beyond those, there are the ui test framework. Based on the automation proxy (AP)
<simonstewart> kk: ap is pretty old, but is an ipc mechanism
<simonstewart> kk: very much internal facing
<simonstewart> those tests are still fairly low level, despite being called ui tests
<simonstewart> kk: higher than that, Ken's team work on something called the chrome bot
<simonstewart> kk: runs on real and virtual machines
<simonstewart> kk: a large number of sites held in a cache. Often used for crash testing. Also includes tests that perform random ui actions
<simonstewart> kk: a little bit smarter than pure random, but that's the gist
<simonstewart> kk: qa level tests. Tests that are done by manual testers. Piggy back off the ui test automation framework. things like creating bookmarks, installing extensions, etc
<simonstewart> kk: break down manual testing to test parts. First app compat. Push a new release of chrome it continues to work, and testing chrome at the ui level
<simonstewart> Most of the ui is "based on the web"
<simonstewart> For the chrome specific native widgets there are manual tests
<simonstewart> kk: app compat depends on webdriver
<simonstewart> kk: lots of google teams depend on webdriver to verify that sites work.
<simonstewart> kk: guess that at a high level, the testing strategy tends to be developer focused.
<simonstewart> kk: devs should write the tests in whatever tool and harness is most expedient for their purpose
<simonstewart> kk: piggy back a lot on the fact that chrome does rapid releases. 4 channels release to users (canary, dev, beta, stable)
<simonstewart> kk: different release schedules
<simonstewart> kk: depend a lot on user feedback from the canaries
<simonstewart> kk: that's the gist of it
<simonstewart> tab: sounds good to me
<simonstewart> jhammel: do chrome do performance testing?
<simonstewart> kk: we do. Using the AP and the ui testing framework mentioned earlier
<simonstewart> to view the tests that have been run
<simonstewart> plh: do we run jquery tests
<jhammel> ^ correction: http://build.chromium.org
<simonstewart> kk: not really. webkit guys might, and we pick that up
<simonstewart> krisk_: do you create tests and feed them back
<simonstewart> TabAtkins: we don't do much, but we do
<simonstewart> krisk_: is that because it doesn't fit with the systems
<simonstewart> TabAtkins: the ways we write and run tests isn't really compatible with the existing w3 systems.
<simonstewart> TabAtkins: would like to change that!
<simonstewart> TabAtkins: some tests are html/js. which might be used where possible. Doesn't happen that regularly
<simonstewart> krisk_: how do you know that you're interoperable?
<simonstewart> TabAtkins: in terms of webkit stuff, it's a case of testing being done by different browser vendors
<simonstewart> kk: lots of c++ tests that are specific to chrome
<jhammel> simonstewart: np :)
<simonstewart> krisk_: v8?
<simonstewart> TabAtkins + kk: v8 team live in europe. Who knows?
<simonstewart> wilhelm: also has legacy stuff for opera. New tests written in a way that (in theory) is usable outside. Can chrome do the same thing?
<simonstewart> TabAtkins: will agitate for that. Involved in spec writing rather than active dev, so might be tricky
<simonstewart> wilhelm: This is a great forum to raise those issues. Opera happy to share with Chrome if Chrome does the same :)
<simonstewart> krisk_: do chrome try and pass a bunch of the w3c test suites?
<simonstewart> TabAtkins: yes. Some of them might be integrated into the chromium waterfall. Some of them might be run by hand
<simonstewart> ?? does anyone know about webkit testing
<simonstewart> TabAtkins: the people I'd like to ask aren't around
<simonstewart> webkit does seem to take in test suites from mozilla. They're running against a bitmap that's different from the moz rendering
<simonstewart> TabAtkins: we don't have a good infrastructure for ref tests
<simonstewart> TabAtkins: the test infrastructure people _do_ want to fix that
<simonstewart> TabAtkins: every time a new port is added to webkit, there are more pixel tests. Provides pressure to do better
<simonstewart> plh: any other questions?
<simonstewart> 15 minute break coming up
Info available from webkit: https://trac.webkit.org/wiki
<krisk_> Next agenda Item jgraham talking about testharness.js
<MichaelC_SJC> scribe: testharness.js
<MichaelC_SJC> scribe: krisk_
<MichaelC_SJC> s/topic: krisk_//
<fantasai> scribenick: fantasai
jgraham: testharness.js is something I wrote to run tests.
... It runs JS tests specifically
... It's a bit like MochiTest or QUnit which JQuery uses, or various things
<plh> --> http://w3c-test.org/resources/testharness.js testharness.js
jgraham: Every JS framework has invented its
... This has slightly different design goals
... The overarching goal is that it's something we can use to test low-level specs like HTML and DOM
... So it can't rely on lots of HTML and DOM :)
... The design goals were to provide some API for writing readable and consistent tests
jgraham: Our previous harness at Opera, as I mentioned, didn't result in very readable
jgraham: The other is to support testing the entire DOM level of behavior
... There are 2 test types : asynchronous tests and synchronous tests
... second is purely syntactic sugar
... Another design goal was to allow possibility of the test to have multiple assertions, and all have to be true for test to pass
... typical example might be checking that some node has a set of children.
... Might want to first test for any children before testing that 4th child is a <p>
... Multiple tests per file was a requirement; learning from Opera's 1/file, which was painful for test writers and discouraged many tests
... ... runs everything in try-catch blocks
... One feature of that is that every bit of the test is like a function, basically
... it tries to handle some housekeeping.
... if you have 1000 tests in a file, nice if you can time out those tests individually
... Uses setTimeout(); can override that if you want, e.g. if running on slow hardware
... and a design goal was easy integration with browsers' existing test systems
... Should be easy to use on top of MochiKit or whatever you use for reporting results
... next thing I thought I'd do is go through creating a test.
jgraham's text editor:
jgraham: By default testharnessreport.js is blank. It's for you to integrate into your testing
... the order is not at the moment relevant
... we might later check in testharness.js that testharnessreport.js was included
added to file:
(at the top)
<title> Dispatching custom events</title>
(at the bottom)
var t = async_test("Custom event dispatch");
jgraham: Each test has a number of steps, and each step is a function that gets called
... It gets called inside a try-catch block, and we can check if the test failed. We don't put anything as top-level code.
(added at the bottom)
(ok, that's too much to type)
jgraham: Here it's adding an event listener before the second step
... When it gets called, it'll call this other function here, which will run this other step, which is another function. Can get a bit verbose.
... There's a convenience method that will make this easier.. all documented in testharness.js
... Simple assert_equals() with value we get, value we expect, and then you can optionally have a string that describes what it is you're asserting.
... At this point everything we want done is done, so we say t.done();
... If you load this in a browser, because we have div#log, it will show whether it passes or fails and what assert failed
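The step model described above can be sketched without a browser. The code below is a hypothetical stand-in written for illustration; it is not testharness.js and does not reproduce jgraham's on-screen example, which the scribe did not capture. The names `miniAsyncTest`, `stepFunc` and `assertEquals` are assumptions that only mirror the shape mentioned in the minutes (a named async test, steps run inside try-catch, an equality assert with an optional description, and `done()`).

```javascript
// Hypothetical stand-in for illustration only; this is NOT testharness.js.
function assertEquals(actual, expected, description) {
  if (actual !== expected) {
    throw new Error(`${description}: expected ${expected} but got ${actual}`);
  }
}

function miniAsyncTest(name) {
  const test = { name, status: "NOTRUN", message: null };
  // Run one step inside try/catch so a failed assert fails only this test.
  test.step = (fn) => {
    if (test.status !== "NOTRUN") return; // already passed or failed
    try {
      fn();
    } catch (e) {
      test.status = "FAIL";
      test.message = e.message;
    }
  };
  // Convenience wrapper: turn a callback (e.g. an event listener) into a step.
  test.stepFunc = (fn) => (...args) => test.step(() => fn(...args));
  // Mark the test as passed unless a step has already failed.
  test.done = () => {
    if (test.status === "NOTRUN") test.status = "PASS";
  };
  return test;
}

// Usage: a "custom event" delivered through a plain list of callbacks,
// standing in for addEventListener/dispatchEvent in a real page.
const listeners = [];
const t = miniAsyncTest("Custom event dispatch");
t.step(() => {
  listeners.push(t.stepFunc((evt) => {
    assertEquals(evt.type, "foo", "event type");
    t.done();
  }));
});
listeners.forEach((listener) => listener({ type: "foo" }));
console.log(t.name, t.status);
```

Because every step runs inside its own try/catch, a failing assert is recorded against the test that owns it and other tests in the same file can still run.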
<plh> --> http://w3c-test.org/webapps/ElementTraversal/tests/submissions/W3C/Element-childElementCount.html Example of testharness.js
jgraham: That's all
jj: Is there an id on the steps, so that you can say you failed step 4 of test foo?
jgraham: If there's demand, there could be a second argument there.
jj: would be nice to know where it failed so I can set a breakpoint there
jgraham: If you get a huge number of tests per file, it's usually auto-generated
... if it's failing in an assert, then it'll tell you which assert failed
plh shows his example
plh: everything shown here is generated by testharness.js
jgraham: There's a failure in this, and it seems everyone fails that.
plh: Bug in testharness.js
jj: Easiest way to debug the test. Is there an error in the test, error in testharness.js, or error in browsers
jgraham: There are various types of assertions. Usually corresponds to webIDL
... But what's in webIDL isn't always the same
kk: It's pretty well-written, only 700 lines or so
clint: If it's synchronous, you don't have to do t.step()
jgraham: A test that is synchronous implicitly creates a step
wilhelm: Opera currently uses this tool for all the new tests that we write. Can others use this?
clint: Yeah, I think so
kk: There used to be some nunit or something that W3C had
... Was in IE, but some browsers couldn't run it.
... Very complicated
plinss: Are tests grouped by section into files?
jgraham: In this case, it checks reflection section, plus section of each part of the spec that defines a reflected attribute
wilhelm: plh wanted to talk about test harness, fantasai wanted to talk about syncing problem
MikeSmith: This is an instance of the framework peter demoed
Mike: I'm going to show you what has been added here to make it easier for test suite maintainers to add data to the system.
... There's this area called Maintainer Login
... It'll give you an http_auth, which authenticates against W3C's user database
... Email me if you want access to the system
... Once you go in there you'll see 2 options: add metadata, change metadata
... Can add a specification
... one early piece of feedback I got was they have tests they want to run that are not associated with a spec.
... So in this instance of the system, it's not a requirement to have a spec for your test suite
... You can give it an arbitrary ID as long as not a duplicate
... Title of the spec
... URL for the spec
... It expects you'll point it to a single-page version of the spec
... If you have a multi-page spec, don't point it at the TOC. You need the full version of the spec.
... Could change later, but initially set up this way 'cuz easier
... This will get added to the list here
... Next thing you can do is needed if you want to do what Peter was demoing earlier, which was associating testcases with specific sections of the spec -- or specific IDs in the spec
... Structured around idea that you put your IDs per section
... But some WGs like WOFF WG they're putting assertions at the sentence level
... They don't actually have section titles, so needed to accommodate that too
Peter: Alan and fantasai did some work on
... Shepherd tool will be able to parse out spec to find test anchors
... and then can report testing coverage of the spec, so this is something we will automate
Alan: What fantasai and I worked out was based on WOFF work, but will be simpler for spec editors. A bit harder to automate, though
Mike: This part add spec metadata.
... Instead of a form to fill out, it lists existing specs in the system
... once you go here, if there's already data in the system, will show you data in the system already
... otherwise it'll show you generated data
... This parses the spec and pulls out the headings. If it looks ok, you press submit
... It'll put these section titles into the database.
... If you have IDs below the section title level, then you'll have to use a different way to get it into the DB
... You might have to get me to do it for now :)
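A rough sketch of the heading extraction Mike describes, assuming a single-page HTML spec whose headings carry `id` attributes. The function name `extractHeadings` and the sample markup are hypothetical, and a regular expression is used only to keep the example short; the actual system's parser is not described in the minutes.

```javascript
// Hypothetical sketch: pull section headings (level, id, title) out of
// single-page spec HTML. A regex is brittle for real HTML; a proper parser
// would be the safer choice in practice.
function extractHeadings(html) {
  const pattern = /<h([1-6])\b[^>]*\bid="([^"]+)"[^>]*>([\s\S]*?)<\/h\1>/gi;
  const headings = [];
  for (const match of html.matchAll(pattern)) {
    headings.push({
      level: Number(match[1]),
      id: match[2],
      // Drop inline markup such as <span> wrappers around section numbers.
      title: match[3].replace(/<[^>]+>/g, "").trim(),
    });
  }
  return headings;
}

// Made-up spec fragment; the heading without an id is skipped.
const specHtml =
  '<h2 id="intro">1. <span>Introduction</span></h2><p>text</p>' +
  '<h3 id="terms">Terms</h3><h3>No id</h3>';
console.log(extractHeadings(specHtml));
```

Headings without an `id` are ignored here, which matches the idea that test cases are associated with specific IDs in the spec.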
... Those steps are optional right now.
... What is necessary is going in and giving info about the test suite itself.
... you can give it an arbitrary ID
... Title, longer description
... to explain better the test suite
... base URL of where your test suites are stored
... Difference from CSS is, that one requires format subdirectories
plinss: it's optional
Mike: This one doesn't expect subdirectories. Expects all tests in this one directory
... If you have separate subdirectories...
... Need to make different test suites or ...
... Simplest case you have all tests in one directory
plinss: The code's actually a lot more flexible wrt formats. We'll talk offline.
MikeSmith: Then you have contact information for someone who can answer questions about test suites
... Then you indicate format of the test suite
... Then you have a list of flags, you can select which ones indicate optional tests
... There are ways to add flags to the system
... No ui for it, so contact me
... Last thing you then do is upload a manifest file
... You have to have a test suite
... You select a test suite
... and then what I have it do right now is that you need to point it to the url for a manifest file, and it'll grab that and read it in
... Right now two forms of manifest files that it will recognize
... second one here is just a TSV that expects path/filename, references, flags, links, assertions
... links are the spec links
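A minimal sketch of reading such a TSV manifest. Only the column order (path/filename, references, flags, links, assertions) comes from the description above; the comma separator inside a column, the `#` comment lines, the absence of a header row, and all file names in the sample are assumptions.

```javascript
// Hypothetical parser for the TSV manifest described above.
// Assumed: one test per line, tab-separated columns, comma-separated
// values within the references/flags/links columns, "#" starts a comment.
function parseManifest(text) {
  const toList = (value) =>
    value ? value.split(",").map((item) => item.trim()).filter(Boolean) : [];
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "" && !line.startsWith("#"))
    .map((line) => {
      const cols = line.split("\t");
      return {
        path: cols[0],
        references: toList(cols[1]),
        flags: toList(cols[2]),
        links: toList(cols[3]), // spec links
        assertion: cols[4] || "",
      };
    });
}

// Made-up sample rows for illustration.
const sampleManifest =
  "margin-001.xht\tmargin-001-ref.xht\t\tbox.html#margins\tMargins do not overlap\n" +
  "font-002.xht\t\tahem,may\ttext.html#fonts\tFont fallback\n";
console.log(parseManifest(sampleManifest));
```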
... The other big change is, I was talking with some people e.g. annevk and ms2ger
... the format they're using is just listing the filenames
... it marks support files as support files
kk: Mozilla guys wanted to know what files were needed to pull to run a test case
plinss: In the CSSWG, the large manifest file with metadata -- that gets built by the build system
MikeSmith: This form expects the full filename, not just the extensionless filename
... Because that's what they had
... Once you have that, you should be able to get your test cases into the test database
... and it'll show up on the welcome page
... Long way to go on this.
... Goal when I started on this was to get it to the point where I didn't have to manually do INSERT in SQL to get specs into the database
... What would be really nice is if ppl start using this and getting more test suites in there so that we can ..
plinss: But right now only limited set of ppl can contribute to that code
MikeSmith: I created two groups in our
... I created a group for developers -- anyone who wants to contribute to framework
... That'll give you write access to hg repo for the source code for this
... Take a look at source code and see problems, send me patches or I'll give you direct access
... Second thing is if you want to have access to use this UI to submit test suite data, I'll have to add you to a particular group
fantasai: how is this code related to plinss's code?
MikeSmith: It's forked from that.
... I've just been pulling the upstream changes
... been able to merge everything without it breaking.
... Think it's in good enough shape that we could port it back upstream
plinss: This system and the Shepherd share a lot of the same base code
... Lots of things I was going to port Shepherd system back into this system, and then pull your stuff in too
... Mike also has code that ties into the testharness.js code, and will automatically submit results from that
MikeSmith: If you go to enter data, it gives you some choices about whether you want to run full test suite
... There's a button here that will pull automatic results where possible
... Be careful, this will submit the data publicly!
jgraham: Not saying it's a bad idea, but from our POV, we're not going to use it offline.
(Brian was talking about trying out the system privately offline)
plinss: The system tracks who's submitting the data. By login if you're logged in, by IP if not
Brian: Privacy is useful
plinss: goal is for pulling data from as many sources as possible
wilhelm: fantasai wanted to talk about keeping things in sync
<dobrien> Is someone scribing? I can't keep up on the iPad
<ctalbert_> This is the writeup that we are planning to set up at Mozilla for the CSS tests specifically: https://wiki.mozilla.org/Auto-tools/Projects/W3C_CSS_Test_Mirroring
<krisk_> Mozilla has a way to move tests from mozilla -> w3c -> mozilla
<ctalbert_> wilhelm: how will this cope with local patches?
<krisk_> fantasai: The master copy only lives in one place...
<ctalbert_> jgraham: probably not a problem with the css tests
<krisk_> fantasai: approved is the master in w3c
<krisk_> fantasai: submitted is the master for submissions
<ctalbert_> jgraham: opera is thinking of having the master from w3c which is intact, and our checkout from that master will have the local patches, and when we pull we'll rebase our patches atop the w3c master
<ctalbert_> this should be possible now that hg is in the w3c side and our (opera) side
<ctalbert_> fantasai: we'll probably have to do something similar
<krisk_> wilhelm: how does this handle local patches?
<ctalbert_> jhammel: is there a technical limitation to not have people editing the w3c tests
<ctalbert_> fantasai: no
<krisk_> fantasai: this is only for css which don't seem to have this problem
<ctalbert_> jgraham: probably make it a commit hook
<ctalbert_> ctalbert_: agreed
<ctalbert_> peter: if someone pushes to the approved directory without actually being approved then the system just automatically denies them
<ctalbert_> that may be incorrect ^ (scribe error)
<ctalbert_> wilhelm: might be an idea to split test suites down at lower granularity levels so that you can have test suites with different levels of maturity
<ctalbert_> jgraham: don't think that would make a difference tbh
<ctalbert_> peter: our repo would keep all the data from all the suites in the repo so that our build system could build any version of them from any suite
<ctalbert_> wilhelm: are there other things we can do to make it easier to contribute test suites?
<ctalbert_> fantasai: one problem on the mozilla side - there's no place to put tests that should go to the w3c - we depend on a manual process to sort out which should be submitted and then it is done later
<ctalbert_> fantasai: these tests just sit in a random place and are forgotten
<ctalbert_> fantasai: once we have a directory that goes to w3c and we tell the reviewers, then it will help quite a bit.
<ctalbert_> fantasai: the basic idea is to make the process obvious what developers need to do with that test to indicate that it is appropriate and ready for w3c then it should "just happen"
<ctalbert_> jgraham: we have a similar problem. it's hard to surface those tests and bugfixes without a policy and a place for those tests
<ctalbert_> peter: if we have a standard format among the test writers then it will be easier to help developers to upload the tests to the w3c. If the developers have to convert the tests it's too difficult and people won't expend the effort to make it happen
<ctalbert_> krisk_: sometimes it depends on the editors as to when they allow tests into the spec, and you find that tests sometimes lag the spec by quite a bit
<ctalbert_> fantasai: we found that with the css - the person writing the spec is often nominally tasked with also writing the test suite but because the skill sets are different and the spec editor is usually swamped, then the tests get neglected
<ctalbert_> fantasai: we really need a dedicated person to manage these tests and testing effort for each spec
<ctalbert_> MikeSmith: is there some way to motivate people to do that?
<ctalbert_> MikeSmith: maybe we should publicly track the testsuite owner?
<ctalbert_> fantasai: we can do that, but the burden is on getting resources for that, really.
<ctalbert_> MikeSmith: yeah, the question is how do you encourage the managers allow their people to spend times on w3c work
<ctalbert_> MichaelC_SJC: you might be able to convince your company to do that, but we also need to have the working group chairs understand that this needs to happen
<ctalbert_> jgraham: if we have them already in an interoperable format then it's pretty easy, but for our existing tests that are in a different format, we aren't going to spend the time to convert them
<ctalbert_> fantasai: we might just have a place at w3c to take those tests, and just post them publicly and have someone else do the conversion work
<ctalbert_> jgraham: I suspect that's a wide problem
<ctalbert_> krisk_: if you get in the habit of submitting stuff as you're doing development, that seems reasonable.
<ctalbert_> krisk_: keeping things not super complex is a win, and being consistent will pay dividends
fantasai^: Because for Opera it may not be valuable to do the conversion, but e.g. Microsoft might want those tests, and decide that the cost of converting is less than the cost of rewriting tests from scratch, so to them it'll be worth it to do the conversion
<ctalbert_> fantasai: thanks, I'm not too good at this :/
<ctalbert_> (scribe note ^)
<ctalbert_> wilhelm: the more I think of this, the more I realize that facilitating the handover of tests is a full time job
<Zakim> MichaelC_SJC, you wanted to ask how much should there be a "W3C format" vs how much does W3C framework need to format (nearly) any format?
<ctalbert_> wilhelm: if we could get every browser vendor to commit one person to do this work on their team then that would be good.
<ctalbert_> fantasai: the problem we're at now, people haven't adopted the w3c formats internally
<ctalbert_> it will be less work once that happens
<ctalbert_> it's not w3c's responsibility to convert your tests to w3c
<ctalbert_> fantasai: you can write a conversion script to convert your test to w3c format
<ctalbert_> better to do that than to have w3c to accept all different formats
<ctalbert_> jgraham: the problem is that many of these harnesses are not built for portability
<ctalbert_> MichaelC_SJC: the problem with a common format (and I may be wrong) is that you run into things you can't test
<ctalbert_> jgraham: if we run into that, then in that case maybe we can find some lightweight format for those tests, or in that case maybe we use a different type of harness
<ctalbert_> scribe: ctalbert has to step out
<ctalbert_> fantasai: ^
kk: If you can write it with testharness.js, do that. If not, try reftest, if not, try self-describing
... In your case you have the difficulty of needing a screenreader or something
jgraham: If you can get ppl to contribute in one format, at least you solve the problem once per platform rather than once per test
mc: I think there's a hierarchy of
... The framework should have at least the possibility of hooking in new formats
wilhelm: For the Watir cases, we noticed areas where we'd want to add tests for something very obscure and specific. What we've done is add support at a low level in Opera and use an API
... Such things could be later added to WebDriver
<MichaelC_SJC> s/I think there's/I can agree with the idea that/
Alan: For tests where there isn't a w3c version, but browsers have something, is there a list of most-wanted specs that need tests on the w3c site
fantasai: All of them? :)
Alan: We were talking about poking ppl, committing ppl to translating browser tests to w3c tests
... Would be more successful to getting resources if we have a specific list of things we need
jj: Also possibility to ask specific
... Rather than saying, please all submit tests for HTML5
... Say, can you submit tests for WebWorkers
... need a specific ask to get things done
... It might not cause immediate surge in test submissions, but for me from outside to inside, the idea of submitting tests was impossible to me. Didn't know where to submit them, figured they'd be rejected, didn't know what a reftest was, etc.
... So process was hard, and not being specific
... Better way to get things done is asking
... Would like Opera to submit WebWorker tests
wilhelm: Can I get that in writing so I can show it to my manager?
Alan: Identify the tests, see who has those tests, then request them
plh: We've been corresponding on testing framework a little bit, but part of task is also going out there in the wild and finding tests and getting them to
... Need to get to point where we have framework and start on asking tests
Alan: Use framework to identify areas, since it annotates the spec
jj: We have no idea how much coverage those 47 tests have -- number isn't meaningful from a coverage
... 1 is better than 0, but maybe 100 is needed not 47
?: Test coverage is a negative thing. You only know when something is not covered, not how well something is covered
jj: Even if you say you have 100% on that normative statement, still doesn't tell you if you got all the edge cases
jgraham: At the moment for HTML we have nothing, though.
<simonstewart> ^^ simonstewart: test coverage is a negative thing. It'll only say what's not covered, not how well the covered areas are tested
jgraham: We have our tests organized by section in the repo, but it's not explicit
... Being able to say per normative statement, do we have a test for this, is pretty nice
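A small sketch of the per-statement check discussed here: given the anchors found in a spec and the spec links declared by each test, count tests per anchor and list anchors with none. The function name `coverageReport`, the data shapes and the sample anchors are hypothetical. As noted in the discussion, a nonzero count only shows that something points at the statement, not how well it is tested.

```javascript
// Hypothetical sketch: count tests per spec anchor and list untested anchors.
// Each test is assumed to carry a `links` array of spec anchors it covers.
function coverageReport(specAnchors, tests) {
  const counts = new Map(specAnchors.map((anchor) => [anchor, 0]));
  for (const test of tests) {
    for (const link of test.links) {
      // Links to anchors not present in the spec are ignored.
      if (counts.has(link)) counts.set(link, counts.get(link) + 1);
    }
  }
  const uncovered = specAnchors.filter((anchor) => counts.get(anchor) === 0);
  return { counts, uncovered };
}

// Made-up anchors and tests for illustration.
const anchors = ["#dom-a", "#dom-b", "#dom-c"];
const knownTests = [
  { path: "a-001.html", links: ["#dom-a"] },
  { path: "a-002.html", links: ["#dom-a", "#dom-c"] },
  { path: "stale.html", links: ["#removed-section"] },
];
const report = coverageReport(anchors, knownTests);
console.log(report.uncovered);
```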
<plh> --> http://www.w3.org/2011/10/timer.html (annoying) timer
jgraham: If you look somewhere, there's an annotation per sentence in the spec showing tests for section
... But that's really complicated, because spec isn't marked up to make that easy
... and testing dozens of disconnected statements
kk: The problem we're struggling with is not how to get perfect coverage. There's a spec, and there's no coverage.
... Browsers all have this feature, and they don't work the same. So having some is a good start.
Bryan: If you look at most of WebAPIs near LC or at LC, only 1/3 have tests available
<jhammel> fantasai: setup a process for getting tests from *your* organization to w3c, and *going forward*, you should write w3c-submittable tests *and* submit the tests. Once that is in place, we can go back and convert legacy tests
<jhammel> fantasai: we need to get the webkit people to commit to this
<jhammel> fantasai: you can require that when checked into repo, they become reftests
<jhammel> fantasai: plan going forward is to convert to reftest
<jhammel> jgraham: if you're comparing to something bitmap-based, it may take 2x time, but it will save time going forward
fantasai^: Because then the number of legacy tests that are not w3c-formatted stops growing, and we can work on making that number smaller
example of a test that has to be self-describing: This tests that the blurring algorithm produces results within 5% of a Gaussian blur
bryan: We developed a number of specs for
... We recognize these APIs are quite sophisticated, and it'll take some time, but we're continuing the development of these capabilities for web runtimes
... We have developer program, global ... ecosystem
bryan (from AT&T): wanted very briefly ...
bryan: show you these links to the specs, the APIs, but more importantly the test framework
... Test framework is based on QUnit
... Pulls in a file from a test directory, which has the list of tests associated with this particular API.
... Tests individual JS files in the same directory
... will run them one by one
... This is packaged up as a widget file, which is available for download
... So we can run all the tests for example using this widget framework.
bryan shows pie charts of results
bryan: Automatically uploaded and made available to vendor
plh: Say 1000 tests for core web standards?
bryan: No for APIs
... What comes for underlying platform is inherently tested by that community
... We need to cover device variation
... identify things that we reference
... We have individual tests for these, test scripts
... this is more than acid level test, but not what we hope to see from W3C in long run
... We don't want to develop and maintain this level of detail in WAC. Want to leverage W3C test suites
... If you look at the tests, you can see for example the geolocation test suite, which we reference.
... We want to auto-generate the tests as widget
jj: So if the test suite changes, do you update your widget?
bryan: Our goal is to create frameworks where we can pull in tests and run them in this runtime environment without having to necessarily maintain the tests ourselves
... We would benefit from a common test framework
... What exactly these tests are is basically just a JS procedure
... We test existence of methods, call qunit functions for pass/fail, not necessarily married to this format, but it was the most common one at the time we developed this.
... So to summarize our goal is to have the scalability to support this widget-based ecosystem across dozens of devices across the world
... So we have to have scalability
... To depend on the core standards as something we don't spend a lot of effort on
... Duplicate things that eventually come from W3C.
... We'd like to see this developed at W3C so we can directly leverage it.
fantasai comments on how this shows having a few common formats is better than having w3c accept many similarly-capable formats -- it better supports reuse of the tests
1. Vendors commit to running W3C tests
2. Vendors push internally to adopt W3C test formats
plh says W3C should make it easier for vendors to import suites
fantasai: what does that entail?
plh: make guidelines for WG
jgraham: I feel the problem is more on our side than on W3C side
wilhelm, jgraham: but of course, using hg instead of cvs is important for tests
wilhelm: W3C should commit resources to get tests from vendors
plh: start with webapps
wilhelm: Any conclusions on WebDriver?
... We commit to work on the spec, and get that into our browser
plh: MS and Apple should look into that
Mike: normal people at Apple are interested, but they're not the ones who sign off on things
kk: Using testharness.js seems to me a very low-hanging fruit, rather than writing a whole bunch of APIs
<jhammel> "not buy Apple" would be more effective
wilhelm: There should be a spec that talks about it, for the IP stuff, we need to get a spec out so there's less risk for those implementing
jgraham: There was some discussion, but no decision, about which bindings W3C would accept tests in
wilhelm: I'd list that as an open issue
MikeSmith: We want to follow up with testing IG, [other group]
MikeSmith: Spec discussion would go to [... mailing list ...]
wilhelm: Dumping ground for non-W3C-format tests
kk: You can put whatever you want in submitted folder
jgraham: It would be nice if people dump random test suites in random formats, to separate those out from things that would be approved in roughly their current form
kk: We should have an old_stuff directory
jgraham: And encourage people to dump stuff there
<MikeSmith> for the Testing IG, http://lists.w3.org/Archives/Public/public-test-infra/ and firstname.lastname@example.org
plh: We can associate a repo with the testing IG, and then anyone in that IG can push to the repo
<plh> ACTION: Mike to create mercurial repositories for Web Testing IG and Browser Tools WG [recorded in http://www.w3.org/2011/10/28-testing-minutes.html#action01]
fantasai: Should be clear that dumping things here is not the same as submitting to an official W3C test suite
bryan: Should also have a wiki that documents what's there
<ctalbert_> TabAtkins_: I accidentally locked myself on the patio, could you come rescue me?
jj: Right, should be clear these are not submitted for review; they're there, and someone can take them and convert them and submit them
jgraham: Come up with a prioritized list of things that need tests
jj: anything that's in CR? :)
plh: I'll take an action item to do that
<scribe> ACTION: plh to make a list of things that need tests [recorded in http://www.w3.org/2011/10/28-testing-minutes.html#action02]
bryan: Need a list of what's available, what are the key gaps, what do we need to get there
kk: Identify specs that are in a bad situation.
fantasai: Also want to track not just what needs testing, but ask vendors whether they have tests for any
... Can then go pester people to submit those tests
<scribe> ACTION: MikeSmith to Create repos for testing IG and testing framework group [recorded in http://www.w3.org/2011/10/28-testing-minutes.html#action03]
plh: Need places to dump tests for groups that don't have repos atm
... more and more groups have their own test repo
<plh> ACTION: plh to convince the geolocation WG to use mercurial for their tests [recorded in http://www.w3.org/2011/10/28-testing-minutes.html#action04]
3. Vendors commit to finding a person to facilitate submission and use of W3C tests
wilhelm: need to make a formal request to each organization
bryan: Someone should pull together format descriptions and include the guidelines
<plh> --> http://www.w3.org/html/wg/wiki/Testing/Authoring/ Authoring Tests
discussion of where to collect this information
<plh> --> http://www.w3.org/testing/ Testing
jgraham: should be in a place not specific to a given working group
plinss: There's a lot to be gained by standardizing metadata
jgraham: hard to do the CSS way for an HTML
... Could have n ways to do it, where n is a small number
Alan: It would be nice to have everything on a wiki so we don't have to go through a staff member
... What if this page was a redirect to a wiki?
jgraham: Could have that page be a link to a wiki
MikeSmith: I like redirect idea, minimizes work I have to do :)
wilhelm: So when should we meet again?
jj: I think we should definitely make this a
... Seems like everyone in every WG is going to be solving the same problems
plh: WebDriver will be under browser tools WG
mc: Who's "we"?
wilhelm: I don't know, but this crowd is great.
plh: We can put under the IG
fantasai: We can say at least meet again next TPAC
plh: Would be in France next year
fantasai: Since not everyone will be travelling to TPAC, would we want to meet at another place at a different time as well?
jj: Does everyone agree we should meet?
kk: Depends on deliverables.
MikeSmith: If we meet 6 months from now, when would that be?
mc: Just want to be sure who the "we" is the invite would go out to
wilhelm is designated in charge
RRSAgent: make minutes