08:37:37 RRSAgent has joined #testing
08:37:37 logging to http://www.w3.org/2013/11/13-testing-irc
08:37:40 ScribeNick: leif
08:37:54 jgraham: This session is more on the nitty-gritty, unlike the previous one on policy etc.
08:38:12 … The state of testing, for those who are not aware:
08:38:16 RRSAgent, make logs public
08:38:26 … Tests on GitHub, accepting pull requests
08:38:29 … Decent docs
08:38:56 simonstewart has changed the topic to: Testing Tech - discussion of current and future plans
08:39:01 … We now own t…twf.org, have docs there instead of a dozen wiki pages
08:39:16 SimonSapin: Which groups use that repo?
08:39:26 tobie: (lists groups)
08:39:43 jgraham: all but XML-oriented groups
08:39:48 Or CSS
08:39:50 :)
08:39:52 tobie: Hopefully soonish CSS
08:40:11 … takes some time because CSS tied itself to hg and Shepherd
08:40:18 … it's a bit complicated
08:40:31 jgraham: As for actually running the tests…
08:40:45 … changes coming soon to run them more easily
08:40:58 … a script coming soon to identify which files in the repo are tests
08:41:16 … and what kind of test file
08:41:42 … Show what files do what
08:42:12 … Other change is: previously running tests required Apache and PHP; not fun to get CORS tests running on w3.org
08:42:30 … Individual contributors had to install heavyweight software
08:42:49 … At Moz we didn't want PHP on every single test slave. Sysadmins would never have spoken to us again.
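The file-classification script jgraham mentions could work roughly like the sketch below, based on common test-repo conventions (a `-manual` suffix, `<link rel=match>` reftests, a testharness.js include). The rules and names here are hypothetical illustrations, not the actual tool:

```python
import re

def classify(path, source):
    """Guess what kind of test a repo file is from its name and contents."""
    if "/support/" in path:
        return "support"       # helper file, not itself a test
    if re.search(r"-manual\.x?html?$", path):
        return "manual"        # needs a human to judge the result
    if re.search(r'<link\s+rel=["\']?(match|mismatch)', source):
        return "reftest"       # compared visually against a reference page
    if "testharness.js" in source:
        return "testharness"   # script test that reports its own results
    return "helper"

print(classify("dom/nodes/Node-cloneNode.html",
               '<script src="/resources/testharness.js"></script>'))
# prints "testharness"
```

A manifest built from such a classifier is what lets a runner know which files to load and how to judge each one.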
08:43:11 … We have a custom Python-based solution replicating the dynamic things from the PHP solution, but with testing in mind
08:43:33 … easy to make HTTP responses, but doesn't force you to stick to the standards; useful to diverge in testing
08:43:39 … currently in review, about 2/3 done
08:43:56 … Anyone who worked on the XHR testsuite could help review
08:44:00 https://critic.hoppipolla.co.uk/r/368
08:44:06 … Should be days < weeks away
08:44:11 … << months
08:44:26 (i.e. much less than months)
08:44:37 jgraham: Still haven't mentioned running tests :)
08:44:43 … people are working on it
08:44:44 also https://critic.hoppipolla.co.uk/r/364
08:44:55 … Would like some discussion now on some issues
08:45:06 … One is the enormous code-review backlog. Need a strategy
08:45:18 … Another is working out whether we have tests for a certain thing
08:46:33 … Very interesting for a lot of reasons. One of the long-term uses of the testsuite: instead of a stability marker for the spec, or going to caniuse.com, we could map tests to spec parts. Requires us to obtain data from vendors on test results.
08:46:39 … Thoughts on code review?
08:46:53 tobie: Some stuff I'd like to do if I had time.
08:46:59 … A system to easily run tests
08:47:10 … w3c-test.org (?)
08:47:21 … Run on different browsers automatically and report back to the pull request
08:47:34 Interestingly, this is what we do with the Selenium project already for our own tests.
08:47:45 … Struck a deal with SauceLabs that they can do that
08:47:48 … want to do asap
08:48:14 David Burns: They don't run nightlies
08:48:36 tobie: Right now just hooking the whole thing up; hopefully ask them to do nightlies later
08:48:44 … don't know how feasible, but this is a first step
08:48:58 jgraham: For unreviewed stuff, it's interesting. Security concerns though
08:49:08 … full test run data, you really want to leave that to vendors
08:49:27 … If you want your impl considered as an impl for parsing whatever, you should really be running tests.
08:49:35 … Not true today, but in the long term.
08:50:00 … We don't necessarily need a system for running every test every day in SauceLabs, but running these tests once is useful.
08:50:25 tobie: Both use cases are valuable. Review is obvious. But also aggregate into WebPlatform.org and feed to devs.
08:50:29 … Lots of value for devs.
08:50:58 David Burns: "dev" means "webdev"
08:51:37 jgraham: A problem with code review is that we try to do too much upfront. We should work out what fails, then come back and say that; the vendor should look at the test.
08:51:53 … There's a tension between getting a quality standard and quick review.
08:51:57 tobie: and quantity
08:52:06 jgraham: Hard work, and nobody's paid to do it
08:52:20 tobie: [missed]
08:52:24 … it's a bottleneck
08:52:40 … I often see in CR that metadata is missing, other formalities. Could be automated.
08:52:53 … Immediate comment in CR.
08:53:00 (CR = code review)
08:53:16 zcorpan: Trailing whitespace. Don't bother whining about it myself
08:53:24 rebecca: [missed]
08:53:35 zcorpan, tobie: can't solve all the problems
08:54:04 tobie: But it saves reviewers from going nuts over details. Reviewer is engaged immediately, instead of 2-6 months and then whitespace complaints. Encourages rude replies!
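The key property of the Python server described above is that each handler gets raw control of the response, so a test can deliberately diverge from the standards. A minimal standard-library sketch of that idea follows; the routes and behaviour are illustrative, and this is not the actual code under review:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class TestHandler(BaseHTTPRequestHandler):
    """Toy handler in the spirit of the custom Python server: full
    control over status, headers, and body, unlike static Apache+PHP."""

    def do_GET(self):
        if self.path == "/cors":
            # a CORS test sets exactly the headers it needs
            self.send_response(200)
            self.send_header("Access-Control-Allow-Origin", "*")
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"PASS")
        else:
            # nothing forces standard behaviour: a test may want an
            # unusual status line to probe a browser's error handling
            self.send_response(299, "Unusual Status For Testing")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep test runs quiet

# To serve: HTTPServer(("127.0.0.1", 8000), TestHandler).serve_forever()
```

Because handlers are plain Python, a test slave needs only a Python interpreter, which is the sysadmin objection to PHP being addressed.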
08:54:21 zcorpan: Test writers can be encouraged to run checks before submitting
08:54:36 simonstewart: Would like to edit pull requests
08:54:47 (may have been a joke)
08:55:11 simonstewart: I've only seen people volunteering at TTWF events, but afterwards engagement drops.
08:55:38 rhauck2: Yes, a problem … Shanghai and Baidu people stayed engaged
08:56:00 tobie: Want to set up assigning Pull Requests to people
08:56:07 … have a test coordinator for a spec
08:56:22 … Some automation plus finding the right people… it's my best offer at this point.
08:56:33 jgraham: I definitely agree that that's valuable
08:56:46 … GitHub's are one set of solutions; there are others
08:56:58 … Lots that I could review if it wasn't hard and boring
08:57:12 … checking that assertions about the spec are correct etc.
08:57:30 tobie: Intersection of skill sets often empty
08:57:52 rhauck2: (?) has good policies on this
08:57:58 jgraham: They have salaries
08:58:21 … Need someone to essentially employ them to review tests part time
08:58:51 wilhelm: Both … and the review question are about resources
08:58:57 … Don't know what the right form is.
08:59:05 … Can go to employers rather than guilting people
08:59:10 tobie: Non-trivial
08:59:15 … as seen over the past year
08:59:31 jgraham: If it's non-trivial to the level that it won't happen, we need a different strategy
08:59:58 tobie: Right. Instead of thinking in terms of "making reviews happen quickly", "if not reviewed in 2 weeks, it's out"
09:00:12 … Build a toolset that makes quality possible with those constraints
09:00:27 … Quality on one side, quantity on the other. Put the cursor in the right place.
09:00:57 jgraham: A countdown timer incentivizes people.
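The automatable checks discussed above (trailing whitespace, missing metadata or includes) are the kind of thing a pre-submission lint can report immediately on a pull request. A hypothetical sketch, not the project's actual lint tool:

```python
def lint(path, source):
    """Report mechanical problems a human reviewer shouldn't have to catch.
    Rules here are examples from the discussion, not an official rule set."""
    errors = []
    for n, line in enumerate(source.splitlines(), 1):
        if line != line.rstrip():
            errors.append("%s:%d: trailing whitespace" % (path, n))
    # illustrative metadata check: a script test should pull in the
    # harness's reporting file alongside testharness.js itself
    if path.endswith(".html") and "testharness.js" in source \
            and "testharnessreport.js" not in source:
        errors.append("%s: testharness.js without testharnessreport.js" % path)
    return errors

for err in lint("t.html",
                '<script src="/resources/testharness.js"></script>  \n'):
    print(err)
```

Run as a commit hook or a CI bot, this gives the submitter feedback in minutes rather than a whitespace complaint months later.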
… You might just find an issue to extend the timer
09:01:04 tobie: "It's no longer my problem"
09:01:33 jgraham: This is the reason I like to track the progress of reviews. Mark files reviewed. If it always says 100% remaining…
09:01:57 rhauck2: Are there test submissions that can be special-cased? From vendors, e.g., that are scrutinized more beforehand
09:02:03 tobie: We've changed the process for this
09:02:21 David Burns: Would Chromium people be happy with Mozilla's? Being devil's advocate here.
09:02:26 rhauck2: It's a compromise
09:02:32 David Burns: More politics
09:03:07 jgraham: … could we have magic that turns a patch into a PR
09:03:33 tobie: Yeah, explicitly changed the process. Two employees can write and review, as long as the process is public.
09:03:47 rhauck2: Not quite the same. Same company.
09:03:57 … [missed]
09:04:11 … Can we special-case certain things, like working together on the Flexbox suite?
09:04:24 David: I personally don't see issues
09:04:35 jgraham: e.g. if one company has a completely wrong model of a spec
09:04:43 … We would have accepted wrong tests
09:04:48 tobie: Not politics, mistakes
09:05:00 jgraham: Also can't prevent it completely
09:05:24 … There are issues, but it might be worth it; otherwise we won't accept anything. A vendor could potentially submit 1000 tests
09:05:31 … Could have to wait a while
09:05:37 zcorpan: I noticed :)
09:06:00 tobie: Could be more valuable to just have it public and accessible. If it has a problem, just take it out!
09:06:20 jgraham: Yeah. If we're happy to automatically forward tests, it makes it easier to work with the repo
09:06:38 Burns: If it's internal to Mozilla, should it be public to all?
09:06:54 rhauck2: Good question. Would be great if not too much trouble
09:07:08 jgraham: …
09:07:37 tobie: Two different things. One is acceptable: the CR was in the open, you can track it, valuable info.
… Doesn't mean we shouldn't special-case tests coming from trusted people or orgs.
09:07:54 … Maybe we shouldn't, but keep in mind that they're different
09:08:01 … No-one questions open reviews
09:08:12 Burns: Review doesn't have to be on GitHub.
09:08:20 … Keep the wording fluffy
09:08:43 … If a vendor doesn't already have an open process, they have to submit a PR.
09:08:53 RRSAgent, make minutes
09:08:53 I have made the request to generate http://www.w3.org/2013/11/13-testing-minutes.html Ms2ger
09:08:58 jgraham: …
09:09:19 tobie: The policy says that the same company can approve a PR if the review is in the open
09:09:39 jgraham: Review in the open could be just looking at an internal bug tracker and marking it reviewed.
09:09:49 Burns: …
09:09:59 tobie: You want a paper trail?
09:10:18 rhauck2: Private reviews don't work very well
09:10:28 tobie: Yeah, no need to discuss it
09:10:47 … One thing is if Microsoft brings in 1000 tests, and someone else in MS OKs them
09:11:16 … Is the use-case MS and Opera? How does Opera work re. Blink?
09:11:26 zcorpan: Going forward Opera uses the Chromium testing infra, not the old Opera infra.
09:11:44 lmcliste_: Do you have tests upstream of Chromium?
09:12:23 jgraham: The question is: if there was an Opera-developed feature, web-interacting and needing tests, and you submitted tests, would it be an open patch or reviewed behind closed doors?
09:12:31 zcorpan: Not sure how it works right now.
09:12:47 … the only changes were made by philipj
09:12:56 … I reviewed them in the open using Critic
09:13:12 jgraham: My feeling is that Opera is not the problem case here
09:13:32 tobie: How do Microsoft contribs usually work? My impression is that they come in bulk from time to time.
09:14:03 … Maybe better to decide that they need review from someone else, but that someone can communicate with them to ask how much internal review was done etc.
09:14:24 Burns: TBH, better if the merge was instantaneous.
… The paper trail is the valuable thing.
09:14:38 jgraham: From vendors, we're willing to have forgiveness, not permission
09:14:58 … and if we find out a lot of crap tests have been coming in, we want review from another vendor in the future.
09:15:05 tobie: That's exactly my intention
09:15:24 … We're all trying to do the right thing. If people behave like idiots, we have to deal with that
09:15:50 … This is a minor policy question when problems arise.
09:15:59 … Don't mean to pick on MS, but I mean closed impls.
09:16:07 … Can we make a decision on this now?
09:16:18 jgraham: Want to ask the ML; there might be dissent.
09:16:34 zcorpan: I don't mind blessing vendor tests, but I'd like some time window for review.
09:16:49 … In case people do want to review and they find problems.
09:17:04 … At least some way of identifying recently merged, unreviewed tests
09:17:19 … If they're just merged, I wouldn't normally look at them.
09:17:38 tobie: At this point, want to move to the ML. We've nailed down the problem, just need to make a decision.
09:18:00 action jgraham to ask mailing list
09:18:28 rhauck2: When you migrated to GitHub, did you …
09:18:51 jgraham: We moved everything where it was obvious where to move it
09:19:05 … some had a different hierarchy, didn't know how to reorg
09:19:11 … 1000s of tests
09:19:30 … 'old' directory not reviewed
09:19:49 ???: Some tests contain errors that we found on Saturday. At some point someone needs to review
09:19:56 rhauck2: Can we run them and use them?
09:19:59 jgraham: yes
09:20:22 … Now that I can run tests in Gecko automatically (on the Python server branch) I've fixed a lot of broken tests
09:20:39 … Broken tests become very obvious
09:20:57 tobie: You're bringing that up because of worry about the state of the CSS testsuite
09:21:42 … I wouldn't worry too much; it's a fact of life. Start running on WebDriver, SauceLabs. You'll quickly see what's going on.
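The triage rule of thumb tobie goes on to describe, once tests run across many browsers (passes everywhere → probably a good test; fails everywhere → probably a broken test), can be sketched as a simple classifier over per-browser results. Purely illustrative, not any project's actual tooling:

```python
def triage(results):
    """results maps browser name -> True (pass) / False (fail).
    Returns a rough verdict following the heuristic from the discussion."""
    passes = sum(1 for ok in results.values() if ok)
    if passes == len(results):
        return "probably good"
    if passes == 0:
        return "probably broken test"
    return "possible impl or spec bug"  # worth a human look

print(triage({"firefox": True, "chrome": True, "opera": True, "ie": True}))
# prints "probably good"
```

The mixed case is the interesting one for vendors: a test that fails in only some engines points at an implementation bug, or at a spec ambiguity.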
… If something works in 4 browsers, it's probably good; if it fails everywhere, it's probably a broken test
09:21:55 rhauck2: Right. We're going to refactor the dir structure
09:22:08 … krit and others are writing scripts that assume a structure
09:22:16 … How do we address this
09:22:24 tobie: This gets addressed by running them
09:22:32 rhauck2: Running them all over the place
09:22:44 tobie: This is why I push for time and money for doing this.
09:22:57 … SauceLabs runs on 10s of combos of OS and browsers
09:23:46 jgraham: Obviously need to make the effort. If you're a vendor and see failures, you need to look at them because they might be impl or spec bugs. If you're not looking at fails, you're not getting value out of the testsuite.
09:23:57 … It's worth always running tests, even if failing.
09:24:07 rhauck2: Failing tests are a good thing to me.
09:24:29 tobie: Running the tests and analyzing the results are two different steps
09:24:31 rhauck2: …
09:24:41 tobie: Trying to enforce using short names
09:24:54 Short names?
09:24:59 jgraham: That's what we wanted to discuss about CR
09:25:13 … Also want to discuss coverage, but worried tobie will kill himself
09:25:40 tobie: No :)
09:26:06 … Spec coverage: usually it's easier to measure coverage of code than of specs
09:26:28 … Specs are harder because there are two stakeholders with different requirements and interests
09:27:09 … Some are interested in broad but shallow coverage: parse the spec for normative requirements, algorithms, webidl, propdefs, and assume you need a certain number of tests for each
09:27:22 … Not a fantastic solution, but gives you a good idea of what you've tested and not
09:27:33 … estimate fairly precisely the engineering time needed
09:27:43 … some data points show that these estimates are rather solid
09:28:12 … The other is for robustness and interop: you want to test known-brittle, known-non-interoperable areas
09:28:28 … We don't have good solutions to measure coverage for this. jgraham has some strategies
09:29:11 jgraham: I have two interests: one a bit like what you talked about as coarse-grained. Eventually we have a spec and we can believe we don't have major interop problems.
09:29:18 … a reasonable number of tests given a spec.
09:29:25 … not perfect, obvs.
09:29:59 … The other is working out what you've missed. Hard to do: combos of features (web sockets and workers) rarely obvious…
09:30:09 rhauck2: Mihai Balan was perplexed for the same reason
09:30:36 jgraham: That's just hard, and requires, afaict, people who know that there are interactions
09:30:51 … One thing that might work is to look at ? data from vendors
09:30:59 … You have 80% of what we need
09:31:20 … Would be nice if for specific specs we could say what is covered.
09:31:25 … "We need worker tests"
09:31:37 http://lists.w3.org/Archives/Public/public-css-testsuite/2013Nov/0000.html
09:31:45 … want to look at it in the future. Looked at it for an afternoon, clearly non-trivial
09:32:04 … (e.g. killing the browser all the time loses data)
09:32:12 tobie: Points at where the complexity of the impl is
09:32:26 … Gives the big picture
09:32:31 … about the codebase
09:32:39 jgraham: Points out places most likely to be brittle
09:32:59 … no good solutions atm, out of time
09:33:06 … anyone else want to say something briefly?
09:33:19 ???: Would be useful to document the state of discussion
09:33:40 … coverage analysis etc.
09:34:16 tobie: The tool you saw in the previous meeting [at TPAC plenary]: I'm planning to release, probably in the next week, the coverage tool that makes estimates for effort and cost
09:34:37 … [missed] ttwf….org/coverage
09:34:45 … some flaws: stale data, only shallow coverage
09:34:50 … but it will at least be public.
09:35:05 … if somebody doesn't want me to do that, stop me now!
09:35:25 Meeting closed.
09:35:29 RRSAgent, make minutes
09:35:29 I have made the request to generate http://www.w3.org/2013/11/13-testing-minutes.html leif
09:35:58 RRSAgent, excuse us
09:35:58 I see no action items
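[Scribe's note: the "broad but shallow" coverage estimate tobie describes above (parse the spec for normative requirements, assume some number of tests per requirement) might look like this rough, hypothetical sketch; the released tool's parsing and cost model are certainly more elaborate.]

```python
import re

# RFC 2119 key words mark normative requirements in spec prose
RFC2119 = re.compile(r"\b(MUST(?: NOT)?|SHOULD(?: NOT)?|SHALL(?: NOT)?|"
                     r"REQUIRED|RECOMMENDED|OPTIONAL|MAY)\b")

def estimate_tests_needed(spec_text, tests_per_requirement=2):
    """Shallow coverage target: count normative requirements and
    assume each needs a fixed number of tests (assumed ratio)."""
    requirements = RFC2119.findall(spec_text)
    return len(requirements) * tests_per_requirement

spec = ("The user agent MUST fire a load event. "
        "It SHOULD NOT block the event loop. "
        "Implementations MAY cache the result.")
print(estimate_tests_needed(spec))  # prints 6
```

Comparing such a target against the number of tests actually present per spec section is what yields the effort and cost estimates mentioned in the meeting.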