08:37:37 <RRSAgent> RRSAgent has joined #testing
08:37:37 <RRSAgent> logging to http://www.w3.org/2013/11/13-testing-irc
08:37:40 <leif> ScribeNick: leif
08:37:54 <leif> jgraham: This session is more on the nitty-gritty, unlike previous on policy etc.
08:38:07 <shan> shan has joined #testing
08:38:12 <leif> … The state of testing for those who are not aware:
08:38:16 <Ms2ger> RRSAgent, make logs public
08:38:21 <simonstewart> simonstewart has joined #testing
08:38:26 <leif> … Tests on GitHub, accepting pull requests
08:38:29 <leif> … Decent docs
08:38:30 <shoko> shoko has joined #testing
08:38:56 <simonstewart> simonstewart has changed the topic to: Testing Tech - discussion of current and future plans;
08:39:01 <leif> …We now own t…twf.org, have docs there instead of a dozen wiki pages
08:39:16 <leif> SimonSapin: Which groups use that repo?
08:39:26 <leif> tobie: (lists groups)
08:39:43 <leif> jgraham: all but XML-oriented groups
08:39:48 <yuka_o> yuka_o has joined #testing
08:39:48 <simonstewart> Or CSS
08:39:50 <simonstewart> :)
08:39:52 <leif> tobie: Hopefully soonish CSS
08:40:11 <leif> … takes some time because CSS tied itself to hg and Shepherd
08:40:18 <leif> … it's a bit complicated
08:40:31 <leif> jgraham: As for actually running the tests…
08:40:43 <kkubota2> kkubota2 has joined #testing
08:40:45 <leif> … changes coming soon to run themmore easily
08:40:56 <dom> dom has joined #testing
08:40:58 <leif> …a script coming soon to identify which files are tests in repo
08:41:03 <tobie> q+: on running the tests using webdriver
08:41:16 <leif> …and what kind of test file
08:41:37 <tobie> s/q+: on running the tests using webdriver//
08:41:42 <leif> …Show what files do what
08:41:44 <ijongcheol> ijongcheol has joined #testing
08:41:52 <tobie> q+ on running the tests using webdriver
08:42:08 <tobie> s/q+ on running the tests using webdriver//
08:42:12 <leif> …Other change is: Previously running tests req'd Apache and PHP, not fun to get CORS tests running on w3.org
08:42:27 <darobin> darobin has joined #testing
08:42:30 <leif> …Individual contributors had to install heavyweight software
08:42:49 <leif> …At Moz we didn't want PHP on every single test slave. Sysadmins would never have spoken to us again.
08:42:52 <AutomatedTester> AutomatedTester has joined #testing
08:43:00 <shepazu> shepazu has joined #testing
08:43:03 <zcorpan> zcorpan has joined #testing
08:43:11 <leif> …We have a custom PYthon-based solution replicating the dynamic things from the PHP solution, but with testing in mind
08:43:33 <leif> …easy to make HTTP response, but doesn't force you to stick to the standards, useful to diverge in testing
08:43:39 <leif> …currently in review, about 2/3 done
08:43:56 <leif> …Anyone who worked on XHR testsuite could help review
08:44:00 <zcorpan> https://critic.hoppipolla.co.uk/r/368
08:44:06 <leif> …Should be days < weeks away
08:44:11 <leif> …<< months
08:44:23 <SimonSapin> SimonSapin has joined #testing
08:44:26 <edoyle> edoyle has joined #testing
08:44:26 <leif> (i.e. much less than months)
08:44:26 <kurosawa> kurosawa has joined #testing
08:44:37 <leif> jgraham: Still haven't mentioned running tests :)
08:44:43 <leif> …ppl are working on it
08:44:44 <zcorpan> also https://critic.hoppipolla.co.uk/r/364
08:44:55 <leif> …Would like some discussion now on some issues
08:45:06 <leif> …One is the enormous code-review backlog. Need a strategy
08:45:18 <leif> …Another is working out whether we have tests for a certain thing
08:46:24 <cwdoh> cwdoh has joined #testing
08:46:33 <leif> …Very interesting for a lot of reasons. One of the long-term ways of using the testsuite is, instead of stability marker for spec or going to caniuse.com, could map tests to spec parts. Req's us to obtain data from vendors on test results.
08:46:39 <leif> …Thoughts on Code review?
08:46:53 <leif> tobie: Some stuff I'd like to do if I had time.
08:46:59 <leif> …A system to easily run tests
08:47:10 <leif> …w3c-test.org (?)
08:47:21 <leif> …Run on different browsers automatically and report back to pull request
08:47:34 <simonstewart> Interestingly, this is what we do with the selenium project already for our own tests.
08:47:45 <leif> …Struck a deal with SalsLabs (?) that they can do that
08:47:48 <leif> …want to do asap
08:47:57 <simonstewart> s/SalsLabs/SauceLabs/
08:48:14 <leif> David Burns: They don't run nightlies
08:48:36 <leif> tobie: Right now just hooking the whole thing up, hopefully ask them to do nightlies later
08:48:44 <leif> …don't know how feasible, but this is a first step
08:48:58 <leif> jgraham: For unreviewed stuff, it's interesting. Security concerns though
08:49:08 <leif> …full test run data, you really want to leave that to vendors
08:49:27 <leif> …If you want your impl considered as an impl for parsing whatever, you should really be running tests.
08:49:35 <leif> …Not true today, but in the long term.
08:50:00 <leif> …We don't necessarily need a system for running every test every day in SauceLAbs, but these tests once is useful.
08:50:25 <leif> tobie: Both use cases are valuable. Review is obvious. But also aggregate into WebPlatform.org and feed to devs.
08:50:29 <leif> …Lot of value for devs.
08:50:58 <leif> David Burns: "dev" means "webdev"
08:51:37 <leif> jgraham: A problem with code review is that we try to do too much upfront. We should work out what fails, then come back and say that, vendor should look at the test.
08:51:53 <leif> …There's a tension between getting a quality standards and quick review.
08:51:57 <leif> tobie: and quantity
08:52:06 <leif> jgraham: Hard work, and nobody's paid to do it
08:52:20 <leif> tobie: [missed]
08:52:24 <leif> …it's a bottleneck
08:52:40 <leif> …I often see in CR that metadata is missing, other formalities. Could be automated.
08:52:53 <leif> …Immediate comment in CR.
08:53:00 <leif> (CR= code review)
08:53:16 <leif> zcorpan: Trailing whitespace. Don't bother whining about it myself
08:53:24 <leif> rebecca: [missed]
08:53:35 <leif> zcorpan, tobie: can't solve all the problems
08:54:04 <leif> tobie: But saves reviewers from going nuts over details. Reviewer is engaged immediately, instead of 2-6 months and then whitespace complaints. Encourages rude replies!
08:54:21 <leif> zcorpan: Test writers can be encouraged to run checks before submitting
08:54:36 <leif> simonstewart: Would like to edit pull requests
08:54:47 <leif> (may have been a joke)
08:55:11 <leif> simonstewart: I've only seen people volunteering at TTWF event, but afterwards engagement drops.
08:55:38 <leif> rhauck2: Yes, a problem … Shanghai and Baidu people stayed engaged
08:56:00 <leif> tobie: Want to set up assigning Pull Reqeusts to people
08:56:07 <leif> … have a test coordinator for a spec
08:56:22 <leif> …Some automation plus finding the right people…it's my best offer at this point.
08:56:33 <leif> jgraham: I definitely agree that that's valuable
08:56:46 <ijongcheol> ijongcheol has joined #testing
08:56:46 <leif> …GitHub's are one set of solutions, there are others
08:56:58 <leif> …Lots that I could review if it wasn't hard and boring
08:57:12 <leif> …checking that assertion about spec are correct etc.
08:57:30 <leif> tobie: Intersection of skill sets often empty
08:57:35 <darobin> darobin has joined #testing
08:57:52 <leif> rhauck2: (?) has good policies on this
08:57:58 <leif> jgraham: They have salaries
08:58:21 <leif> …Need someone to employ them essentially to review tests part time
08:58:42 <cwdoh> cwdoh has joined #testing
08:58:51 <leif> wilhelm: Both …and review question are about resources
08:58:57 <leif> …Don't know what the right form is.
08:59:05 <leif> …Can go to employers rather than guilting people
08:59:10 <leif> tobie: Non-trivial
08:59:15 <leif> …as seen over the past year
08:59:31 <leif> jgraham: If non-trivial to the level that it won't happen, we need a different strategy
08:59:58 <leif> tobie: Right. Instead of thinking in terms of "making reviews happen quickly", "if not reviewed in 2 weeks, it's out"
09:00:12 <leif> …Build a toolset that makes quality possible with those constraints
09:00:27 <leif> …Quality on one side, quantity on the other. Put the cursor on the right place.
09:00:57 <leif> jgraham: A countdown timer incentivizes people. You might just find a issue to extend the timer
09:01:04 <leif> tobie: "It's no longer my problem"
09:01:06 <lmcliste_> lmcliste_ has joined #testing
09:01:33 <leif> jgraham: This is the reason i like to track the progress of reviews. Mark files reviewed. If it always says 100 % remaing…
09:01:57 <leif> rhauck2: Are there test submissions that can be special-cased? From vendors, e.g., that are scuritinized more on beforehand
09:02:03 <leif> tobie: We've changed the process for this
09:02:21 <leif> David Burns: Would Chromium people be happy with Mozilla's? Being devil's advocate here.
09:02:26 <leif> rhauck2: It's a compromise
09:02:32 <leif> David Burns: More politics
09:03:07 <leif> jgraham: … could we have magic that turns a patch into a PR
09:03:33 <leif> tobie: Yeah, explicitly changed process. Two employees can write and review, as long as process is public.
09:03:47 <leif> rhauck2: Not quite the same. Same company.
09:03:57 <leif> … [missed]
09:04:08 <kawada> kawada has joined #testing
09:04:11 <leif> …Can we special case certain things, like working together on Flexbox suite?
09:04:24 <leif> David: I personally don't see issues
09:04:35 <leif> jgraham: e.g. if one company has completely wrong model of a spec
09:04:43 <leif> … We would have accepted wrong specs
09:04:48 <leif> tobie: Not politics, mistakes
09:05:00 <leif> jgraham: Also can't prevent it completely
09:05:24 <leif> …There are issues, but might be worth it, otherwise we won't accept anything. A vendor could potentially submit 1000 tests
09:05:31 <leif> …Could have to wait a while
09:05:37 <leif> zcorpan: I noticed :)
09:06:00 <leif> tobie: Could be more valuable to just have it public and accessible. If it has a problem, just take it out!
09:06:20 <leif> jgraham: Yeah. If we're happy to automatically forward tests, makes it easier to work with the repo
09:06:30 <Ms2ger> Ms2ger has joined #testing
09:06:38 <leif> Burns: If it's internal to Mozilla, should it be public to all?
09:06:54 <leif> rhauck2: Good question. Would be great if not too much trouble
09:07:08 <leif> jgraham: …
09:07:37 <leif> tobie: Two different things. One is acceptable, CR was in the open, you can track it, valuable info. Doesn't mean we shouldn't special-case tests coming from trusted people or orgs.
09:07:54 <leif> …Maybe we shouldn't, but keep in mind that they're different
09:08:01 <leif> …No-one questions open reviews
09:08:12 <leif> Burns: Review doesn't have to be on GitHub.
09:08:20 <leif> …Keep wording fluffy
09:08:43 <leif> …If a vendor doesn't already have an open process, have to submit PR.
09:08:53 <Ms2ger> RRSAgent, make minutes
09:08:53 <RRSAgent> I have made the request to generate http://www.w3.org/2013/11/13-testing-minutes.html Ms2ger
09:08:58 <leif> jgraham: …
09:09:19 <leif> tobie: The policy says that same company can approve PR if review in the open
09:09:39 <leif> jgraham: Review in the open could be just look at internal bug tracker and marking reviewed.
09:09:49 <leif> Burns: …
09:09:59 <leif> tobie: You want a paper trail
09:10:00 <leif> ?
09:10:18 <leif> rhauck2: Private reviews doesn't work very well
09:10:28 <leif> tobie: Yeah, no need to discuss it
09:10:34 <Ms2ger> s/PYthon/Python/
09:10:47 <leif> …One thing is if Microsoft brings in 1000 tests, and someone else in MS OKs
09:11:16 <leif> …Is the use-case MS and Opera? How does Opera work re. Blink?
09:11:26 <leif> zcorpan: Going forward Opera uses Chromium testing infra, not old Opera infra.
09:11:44 <leif> lmcliste_: Do you have tests upstream of Chromium?
09:12:23 <leif> jgraham: Q is, if there was an Opera-developed feature, web-interacting and needing tests, and you submitted tests, would it be an open patch or reviewed behind closed doors?
09:12:31 <leif> zcorpan: Not sure how it works right now?
09:12:38 <leif> s/?/.
09:12:47 <leif> …the only changes were made by philipj
09:12:56 <leif> …I reviewed them in the open using Critic
09:13:12 <leif> jgraham: My feeling is that Opera is not the problem case here
09:13:32 <leif> tobie: How do Microsoft contribs usually work? My impression is that they are bulks coming in from time to time.
09:13:58 <MichaelC> MichaelC has joined #testing
09:14:03 <leif> … Maybe better to decide that they need review from someone else, but that someone can communicate with them to ask how much internal review was done etc.
09:14:24 <leif> Burns: TBH, better if merge was instantaneous. Paper trail is the valuable thing.
09:14:38 <leif> jgraham: From vendors, we're willing to have forgiveness, not permission
09:14:58 <leif> …and if we find out a lot of crap tests have been coming, we want review from other vendor in the future.
09:15:05 <leif> tobie: That's exactly my intention
09:15:24 <leif> …We're all trying to do the right thing. If ppl behave like idiots, we have to deal with that
09:15:50 <leif> …This a minor policy question when problems arise.
09:15:59 <leif> …Don't mean to pick on MS, but I mean closed impls.
09:16:07 <leif> …Can we make a decision on this now?
09:16:18 <leif> jgraham: Want to ask ML, might be dissent.
09:16:34 <leif> zcorpan: I don't mind blessing vendor tests, but I'd like some time window for review.
09:16:49 <leif> …In case people do want to review and they find problems.
09:17:04 <leif> …At least some way of identifying recently merged, unreviewed tests
09:17:19 <leif> …If they're just merged, I wouldn't normally look at them.
09:17:38 <leif> tobie: At this point, want to move to ML. We've nailed down problem, just need to make a decision.
09:18:00 <leif> action jgraham to ask mailing list
09:18:28 <leif> rhauck2: When you migrated to GitHub, did you …
09:18:51 <leif> jgraham: We moved everything that was obvious where to move
09:19:05 <leif> …some had a different hierarchy, didn't know how to reorg
09:19:11 <leif> …1000s of tests
09:19:23 <yuanyan> yuanyan has joined #testing
09:19:30 <leif> …'old' directory not reviewed
09:19:49 <leif> ???: Some tests contain errors, that we found on Saturday. AT some point someone needs to review
09:19:56 <leif> rhauck2: Can we run them and use them?
09:19:59 <leif> jgraham: yes
09:20:22 <leif> …Now that I can run tests in Gecko automatically (on the python server branch) I fixed a lot of broken tests
09:20:39 <leif> …Broken tests become very obvious
09:20:57 <leif> tobie: You're bringing that up because of worry about state of CSS testsuite
09:21:42 <leif> …I wouldn't worry too much, it's a fact of life. Start running on WebDriver, SauceLabs. You'll quickly see what's going on. If something works on 4 browsers, it's probably good, if fails everywhere, probably a broken test
09:21:55 <leif> rhauck2: Right. We're going to refactor the dir strucuture
09:22:08 <leif> …krit and others are writing scripts that assume a structure
09:22:16 <leif> …How do we address this
09:22:24 <leif> tobie: This gets addressed by running them
09:22:32 <leif> rhauck2: Running them all over the place
09:22:44 <leif> tobie: This is why I push for time and money for doing this.
09:22:57 <leif> …SauceLabs runs on 10s of combos of OS and browsers
09:23:46 <leif> jgraham: Obviously need to take the effort. If you're a vendor and see failures, you need to look at them because they might be impl or spec bugs. If you're not looking at fails, you're not getting value out of testsuite.
09:23:57 <leif> …It's worth always running tests, even if failing.
09:24:07 <leif> rhauck2: FAiling tests are a good things to me.
09:24:29 <leif> tobie: Running the tests and analyzing the results are two different steps
09:24:31 <leif> rhauck2: …
09:24:41 <leif> tobie: Try to enforce using short names
09:24:48 <leif> s/try/trying
09:24:54 <Ms2ger> Short names?
09:24:58 <ken> ken has joined #testing
09:24:59 <leif> jgraham: That's what we wanted to discuss about CR
09:25:13 <leif> …Also want to discuss coverage, but worried tobie will kill himself
09:25:40 <leif> tobie: No :)
09:26:06 <leif> …Spec coverage, usually it's easier to measure that code has coverage rather than specs
09:26:28 <leif> …Specs are harder because two stakeholders with diff. requirements and interests
09:27:09 <leif> …Some are interested in broad but shallow want to parse the spec for normative requirements, specs and algorithms, webidl, propdefs, assume you need a certain number of tests for each
09:27:22 <leif> …Not a fantastic solution, but gives you a good idea of what you've tested and not
09:27:33 <leif> …estimate fairly precisely the engineering time needed
09:27:43 <leif> …some data points show that these estimates are rather solid
09:28:12 <leif> …The other is for robustness and interop, you wanted to test known-brittle areas, know-non-interoperable
09:28:15 <cwdoh> cwdoh has joined #testing
09:28:28 <leif> …We don't have good solutions to measure coverage for this. jgraham has some strategies
09:29:11 <leif> jgraham: I have two interests: one a bit like what you talked about as coarse-grained. Eventually we have a spec and we can believe we don't have major interop problems.
09:29:18 <leif> …reasonable number of tests given a spec.
09:29:20 <cwdoh> cwdoh has joined #testing
09:29:25 <leif> …not perfect, obvs.
09:29:59 <leif> …The other is working out what you've missed. Hard to do: combos of features (web sockets and workers) rarely obvious…
09:30:09 <leif> rhauck2: ? was perplexed for the same reason
09:30:36 <leif> jgraham: That's just hard, and requires afaict people who know that there are interactions
09:30:43 <lmcliste_> s/?/Mihai Balan/
09:30:51 <leif> …One thing that might work is look at ? data from vendors
09:30:59 <leif> …You have 80 % of what we need
09:31:20 <leif> …Would be nice if for specific specs we could say what is covered.
09:31:25 <leif> …"We need worker tests"
09:31:37 <rhauck2> http://lists.w3.org/Archives/Public/public-css-testsuite/2013Nov/0000.html
09:31:45 <leif> …want to look at it in the feature. Looked at it for an afternoon, clearly non-trivial
09:32:04 <leif> …(e.g. killing browser all the time loses data)
09:32:12 <leif> tobie: Points at where complexity of impl is
09:32:26 <leif> …Gives the big picture
09:32:31 <leif> …about codebase
09:32:39 <leif> jgraham: Points out places most likely to be brittle
09:32:59 <leif> …no good solutions atm, out of time
09:33:06 <leif> …anyone else want to say something briefly?
09:33:19 <leif> ???: Would be useful to document the state of discussion
09:33:35 <ijongcheol> ijongcheol has joined #testing
09:33:40 <leif> …coverage analysis etc.
09:33:44 <mizuman> mizuman has joined #testing
09:34:16 <leif> tobie: The tool you saw in the previous meeting [at TPAC plenary], I'm planning to release probably in the next week the coverage tool that makes estimates for effort and cost
09:34:37 <leif> … [missed] ttwf….org/coverage
09:34:45 <leif> …some flaws, stale data, only shallow coverage
09:34:50 <leif> …but will at least be public.
09:35:05 <leif> …if somebody doesn't want me to do that, stop me now!
09:35:25 <leif> Meeting closed.
09:35:29 <leif> RRSAgent, make minutes
09:35:29 <RRSAgent> I have made the request to generate http://www.w3.org/2013/11/13-testing-minutes.html leif
09:35:58 <leif> RRSAgent, excuse us
09:35:58 <RRSAgent> I see no action items