16:55:21 RRSAgent has joined #webdriver
16:55:21 logging to http://www.w3.org/2016/07/13-webdriver-irc
16:55:59 Excellent question!
16:56:01 https://www.w3.org/2002/03/RRSAgent
16:56:10 Meeting: WebDriver F2F meeting July 2016
16:56:15 RRSAgent: draft
16:56:15 I'm logging. I don't understand 'draft', ato. Try /msg RRSAgent help
16:56:36 RRSAgent: listen
16:56:44 RRSAgent: Meeting: WebDriver F2F meeting July 2016
16:56:44 I'm logging. I don't understand 'Meeting: WebDriver F2F meeting July 2016', ato. Try /msg RRSAgent help
16:56:50 Meeting: WebDriver F2F meeting July 2016
16:56:55 RRSAgent: draft
16:56:55 I'm logging. I don't understand 'draft', ato. Try /msg RRSAgent help
16:57:13 RRSAgent, where am i?
16:57:13 See http://www.w3.org/2016/07/13-webdriver-irc#T16-57-13
16:57:31 RRSAgent: draft minutes
16:57:31 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html ato
16:58:41 MikeSmith: Why would I not have access to https://www.w3.org/2016/07/13-webdriver-minutes.html,access?
16:59:05 RRSAgent: bookmark
16:59:05 See http://www.w3.org/2016/07/13-webdriver-irc#T16-59-05
17:00:11 RRSAgent: please make these logs world-visible
17:00:32 RRSAgent: start a new log
17:00:40 RRSAgent: do not start a new log
17:00:43 hey lukeis
17:00:52 RRSAgent: please draft the minutes
17:00:52 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html ato
17:01:58 MikeSmith: nm, solved it!
17:02:12 JohnJansen: present+
17:02:23 present+ ato
17:02:31 present+ JimEvans
17:02:32 present+ JohnJansen
17:02:32 present+ ClayMartin
17:02:36 present+ jgraham
17:02:44 RRSAgent: generate minutes
17:02:44 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html jgraham
17:02:44 present+ brrian
17:02:55 Chair: AutomatedTester
17:03:09 present+ SamUong
17:03:20 present+ AutomatedTester
17:03:25 chair AutomatedTester
17:04:23 RRSAgent: generate minutes
17:04:23 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html AutomatedTester
17:05:47 https://www.w3.org/wiki/WebDriver/2016-July-F2F#Agenda
17:13:05 Scribe: ato
17:13:08 Topic: State of the union
17:13:31 AutomatedTester: We are mostly done with the writing of the specification into a format that is more precise and actionable.
17:13:45 jgraham: Citation needed on that count.
17:13:50 AutomatedTester: But we’ve come a long way.
17:14:11 AutomatedTester: We have a client that can be used for testing, which allows us to write tests for the specification.
17:14:20 AutomatedTester: It can speak to the HTTPDs that the vendors are producing and shipping.
17:14:30 AutomatedTester: As far as I know no one is running the tests yet, but it gives us a good starting point.
17:14:54 AutomatedTester: A key part that still needs to be written is the actions API, and that is one of the major topics for discussion today.
17:15:18 AutomatedTester: Where are we unnecessarily divergent from the open source project?
17:15:50 JohnJansen: The tests you speak of, are they the Web Platform Tests?
17:15:51 AutomatedTester: Yes.
17:16:12 jimevans: As far as this WG is concerned, they are the only tests that matter.
17:17:47 ato: [elaborates a bit on the new tests in WPT]
17:17:59 jgraham: There are some tests there.
17:18:34 ato: We test the protocol and main loop extensively. It’s test complete.
17:18:46 ClayMartin: I have a new agenda item. Where should I add it?
17:20:15 ato: https://sny.no/2016/05/wdspec
17:22:53 brrian, jimevans: https://github.com/w3c/web-platform-tests/pull/2752
17:24:28 Topic: Agenda review
17:24:34 jgraham: We should review the agenda.
17:25:13 jimevans: I think the most important thing to get done is the actions.
17:25:40 Topic: Actions
17:25:55 https://github.com/jgraham/webdriver-actions
17:26:10 RRSAgent: draft
17:26:10 I'm logging. I don't understand 'draft', ato. Try /msg RRSAgent help
17:26:20 RRSAgent: draft minutes
17:26:20 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html ato
17:26:59 jgraham: There is some text in the spec that I think is not right.
17:27:13 jgraham: There are many implementations that are incompatible, and not conformant to the spec.
17:27:25 jgraham: I have written a draft of what I _think_ we were aiming for.
17:27:30 JohnJansen: Is it a PR?
17:27:39 jgraham: It’s two text documents
17:27:44 jgraham: (See link above)
17:27:44 https://github.com/jgraham/webdriver-actions
17:28:26 https://github.com/jgraham/webdriver-actions/commit/ae33aa579605ee215e8a0b3dc1b2182c3b6de074
17:43:37 ClayMartin: Can you mix and match the input type per action item?
17:43:54 jgraham: The idea is that each “track” can only represent one device type.
17:44:11 ClayMartin: So why does each action item also have a type?
17:44:16 jgraham: It’s a sub-type.
17:45:07 ClayMartin: Where do you specify the parallelisation?
17:52:11 jgraham: If all implementations were perfect they would run top-down and left-right.
17:52:22 brrian: Are the sources implicitly ordered?
17:52:39 jgraham: They are ordered by the natural ordering of the array.
17:52:53 samuong: Should the order matter?
17:53:22 jgraham: Yes.
17:53:36 samuong: Within the sequence, the order matters.
17:53:46 jgraham: For example, you can compress this.
17:54:33 jgraham: There’s no real parallelism here.
17:54:50 jgraham: Mouse move is interesting.
17:54:55 jgraham: An open question.
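The track and ordering model jgraham describes above can be sketched roughly as follows. This is a minimal illustration, not the draft's algorithm: the dict layout (`id`, `type`, `actions`) and the function name `sources_to_ticks` are assumptions made for the example. Each source is one "track" with a single device type, sources keep their array order, and tick N collects the Nth action of every track.

```python
from itertools import zip_longest


def sources_to_ticks(sources):
    """Group the Nth action of every source into tick N, keeping source order.

    `sources` is a hypothetical list of dicts: {"id", "type", "actions"}.
    """
    tracks = [
        [(src["id"], src["type"], action) for action in src["actions"]]
        for src in sources
    ]
    ticks = []
    # zip_longest pads shorter tracks with None; those slots are dropped.
    for column in zip_longest(*tracks):
        ticks.append([entry for entry in column if entry is not None])
    return ticks


# Hypothetical input: one keyboard track and one pointer track.
sources = [
    {"id": "keyboard1", "type": "key",
     "actions": [{"type": "keyDown", "value": "a"},
                 {"type": "keyUp", "value": "a"}]},
    {"id": "mouse1", "type": "pointer",
     "actions": [{"type": "pointerDown", "button": 0}]},
]
ticks = sources_to_ticks(sources)
```

Here the per-action `type` plays the role of the sub-type mentioned in the discussion, while the source-level `type` is the device type of the whole track.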
17:55:21 jgraham: Let’s say you have a pointer that’s at point A, then you want to move it to some element.
17:55:28 jgraham: It has to move along some path.
17:55:44 jgraham: Along that path there will be other elements, and it is not a priori obvious what should happen here.
17:56:04 jgraham: The easiest thing to imagine is a mouse. A finger can teleport.
17:56:24 jgraham: On the elements you hover over, you might or might not see events.
17:58:22 jgraham: I think the implementation should calculate the path at the start of the action and divide it into sub-points. How many there should be is probably implementation-defined, and I don’t know the right number, but we should give a hint. One per requestAnimationFrame, for example.
17:58:27 samuong: What should a tick be?
17:58:39 jgraham: By default a tick happens as fast as you can process it.
18:00:30 jgraham: It is possible in the API to specify a pause duration.
18:02:31 jgraham: There should be a null device for specifying a pause so that you can spread out a mouse move.
18:03:19 ato: But what if you don’t have a duration?
18:03:35 ClayMartin: Yeah, there should be a default duration.
18:03:43 jgraham: A minimum duration is not a bad idea.
18:04:59 jgraham: If you haven’t specified a pause, maybe it should really teleport.
18:05:07 ato: But it’s not something the user would do.
18:05:20 jimevans: But for the open source project users it doesn’t matter.
18:05:50 ClayMartin: If you do the movement for everything, you could introduce intermittents.
18:06:06 ClayMartin: It’s doing the shortest path possible, and the path could trigger something that interrupts the sequence.
18:06:22 jimevans: Some content on the page could interfere with the movement.
18:06:51 JohnJansen: Didn’t we decide to defer some of this stuff?
18:07:00 jgraham: I think touch is actually conceptually easier.
18:09:56 JohnJansen: We don’t want to tell the browsers what to fire.
18:10:01 JohnJansen: We want to describe what the user wants to do.
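As a rough illustration of the pointer-move idea discussed above (calculate the path up front and divide it into sub-points), here is a minimal sketch assuming a straight-line path and a caller-supplied step count. The function name `interpolate_path` and the choice of linear interpolation are assumptions for the example; the discussion leaves the number of sub-points implementation-defined.

```python
def interpolate_path(start, end, steps):
    """Return `steps` points on the straight line from start to end.

    The last point is always exactly `end`; with steps == 1 the pointer
    effectively "teleports" to the target in a single move.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]


# Hypothetical move from the viewport origin to (100, 50) in four sub-moves,
# e.g. one per animation frame.
points = interpolate_path((0, 0), (100, 50), 4)
```

An implementation could pick `steps` from the requested duration (for instance one sub-point per animation frame), which is one way to read the hint suggested in the discussion.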
18:10:12 JohnJansen: And then you expect the browser to fire the right events.
18:10:35 jgraham: There should be a note saying “if you don’t follow these other specs, then you probably shouldn’t try to implement this”.
18:10:55 samuong: pointerDown, pointerUp, pointerDown, pointerUp (double click), is the double click event fired?
18:11:08 samuong: What does "primary" mean here?
18:11:20 jgraham: It’s basically a shorthand way of saying “fire normal mouse events”.
18:11:36 samuong: There was also sub-type, wouldn’t that tell you?
18:11:51 jgraham: It tells you the type of event.
18:12:11 jgraham: [explains how the pointer events spec works]
18:12:27 https://w3c.github.io/pointerevents/#
18:13:53 JohnJansen: So we’re going to have a normative reference to this spec.
18:13:58 JohnJansen: Can’t we just delegate to it?
18:14:08 jgraham: Yeah, except it’s very hand wavy.
18:15:15 samuong: We have people in Blink using this for pointer events testing: If we’re not feeding in user input, if we’re just specifying the events, that’s not what we want.
18:17:01 ato: WebDriver is adding additional value here.
18:17:14 ato: mouseMove, keyboard layout/modifier keys
18:17:27 jimevans: There used to be a mandate that you use an OS-level input.
18:17:33 jimevans: But we’ve stripped that out.
18:18:11 jimevans: I think what we want to say is that we want the browser to react as if this input occurred. This implies that certain DOM events get generated by the browser on certain elements.
18:18:31 jimevans: This is difficult, if not impossible, to specify for the reasons jgraham gave earlier.
18:18:43 JohnJansen: Double click could be controlled by the OS.
18:19:40 ato: But these are all emulated, virtual devices.
18:20:22 jgraham: If we implement at the level of DOM events, then the advantages are that it’s consistent across browsers.
18:20:28 jgraham: And that we actually know how to write that as a spec.
18:20:46 jgraham: It has the disadvantage that if you literally implement the spec, it gives you different behaviour.
18:21:12 samuong: Is anyone implementing this? In JS?
18:21:16 AutomatedTester: We do in Marionette.
18:21:30 AutomatedTester: But we generate trusted events, so it’s not content DOM events.
18:22:24 RRSAgent, make logs public
18:22:29 RRSAgent, make minutes
18:22:29 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html MikeSmith
18:23:19 ato: [explains about synthetic events in Gecko]
18:23:23 samuong: We have something similar.
18:25:39 brrian: We generate an appropriate platform event.
18:26:22 brrian: We synthesise events, doing it more level has problems.
18:26:37 ato: FirefoxDriver tried native input too.
18:26:43 jimevans: IEDriver same.
18:27:18 jgraham: There might be a hand-wavy way of doing this. “Generate platform events that eventually cause the following DOM events to be generated”
18:27:21 jimevans: That’s what I meant.
18:31:03 ato: There are three levels here: The spec describes the output you should expect in the DOM after performing the actions; all the different UAs have different input stacks, so we can’t specify that. Instead we have a more general abstraction that describes a more general input approach to this.
18:32:18 samuong: At Google they wanted to test tab completion (?)
18:32:55 jgraham: I think there’s a tension between the features needed to test a browser and those needed to test a content page.
18:33:09 JohnJansen: I want to test the ability to create a new tab, as a browser vendor.
18:36:47 lukeis has joined #webdriver
18:37:54 RRSAgent: draft minutes
18:37:54 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html lukeis
18:38:04 samuong: As a browser vendor having it at the OS level is what they want.
18:38:13 jimevans: New tab/new window is something users want
18:38:21 ClayMartin: [explains how to do UI automation in Edge]
18:39:13 jgraham: Why don’t we have a command to open a new window?
18:40:44 scribe automatedtester
18:41:08 ato: let me describe how we do things in Marionette
18:41:42 ato: if you have right-click that will create a context menu
18:42:00 ato: and then we have a command called set_context and switch to browser chrome
18:42:37 ato: we use this context to test addons and Firefox UI testing. Update and localization testing.
18:43:02 ato: we should be careful not to put ourselves into a state that webdriver can't return from
18:43:27 JohnJansen: We have that with EdgeDriver and we want to addons/chrome
18:44:34 ato: If we describe that stuff is in other specs but as long as the end state is what we expect.
18:45:00 brrian: What if there was a browser flag for cross platform handling?
18:45:31 jgraham: my thinking is for now, this is the algorithm with the events that should be generated, but implementations may inject them at a higher level
18:46:20 jgraham: we expect this to do the "hand waving" thing and try to use pointer events.
18:47:12 ato: [reads out http://w3c.github.io/webdriver/webdriver-spec.html#algorithms]
18:47:38 jgraham: we should describe what we think it should do
18:47:46 ClayMartin: there might be an interop bug
18:48:01 jgraham: but then the interop bug is in a different specification and not in webdriver
18:48:58 jgraham: there are reasons why we should describe what events we should do. [explains example with shift+key]
18:49:04 jgraham: going back to mouse movement
18:50:34 jgraham: an interesting implementation detail is how to handle pinch, we should asynchronously do each of the items on each finger interleaved
18:50:46 ato: how would we describe the micromovements?
18:51:00 jgraham: not sure what problems could come up as I haven't written this.
18:51:47 jimevans: for pointermove event specifically, is there any mileage in adding a duration for how a tick (not micromovement)
18:53:29 jimevans: as far as specifying goes, we will need to have a default value for duration
18:54:02 jgraham: we will have a pause(0) which is requestAnimationFrame duration
18:55:26 lukeis1 has joined #webdriver
18:56:01 jimevans: we could add a duration to pointer move and this is where the duration has a meaningful impact
18:56:17 jgraham: and this would be different to pause which is wait for something or pad
18:56:50 ClayMartin: for a move what data would we know? can I see the action
18:58:10 jgraham: The default of the pointer for the start coordinates is 0,0
18:58:59 jgraham: how do we describe coordinates? [draws a box]
18:59:19 jgraham: if you pass in x,y that will be x, y of the viewport
18:59:50 jgraham: if you pass in an element it would be the visible centre of the element
19:00:01 http://w3c.github.io/webdriver/webdriver-spec.html#dfn-pointer-interactable-element
19:01:02 jgraham: if you pass in an element and x,y we should take the top/left of the element and move to x,y from that point
19:01:30 jimevans: [explains how this is similar to the OSS project]
19:02:20 jgraham: should we do MoveBy or MoveTo?
19:02:34 [group votes for MoveTo]
19:02:59 RRSAgent: draft minutes
19:02:59 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html AutomatedTester
19:46:58 lukeis has joined #webdriver
20:03:05 samuong has joined #webdriver
20:06:02 jimevans has joined #webdriver
20:18:53 scribe automatedtester
20:19:09 jgraham: yesterday, Mozillians were discussing the following
20:19:13 Scribe: AutomatedTester
20:19:22 AutomatedTester: (You need the colon.)
20:19:26 ahh :)
20:20:59 jgraham: we wondered if there should be a scroll to an element command for actions
20:22:26 automatedtester: how can we help prevent footguns for users.
If there are test beds that are 800x600 screens how can we prevent their tests from randomly breaking?
20:23:01 ato: takeElementScreenshot doesn't screenshot to the element as requested by Microsoft in the past
20:24:21 ato: part of me is uncertain that we have specialisation in certain commands where we could have a generalisation on all commands for scrolling
20:25:29 ato: if we have a separate command for scrolling, what would we do for normal commands? If we have a separate command we can see actions are more of a pipeline for commands
20:25:42 microserf
20:25:57 ato: we can then, in a later version, see about batching other commands via the pipeline
20:26:42 jgraham: if we move to a pipeline we would then need to work out a storage system
20:27:27 ato: we can use this to save on bandwidth between local and remote end points
20:28:40 automatedtester: back to scroll to element
20:29:00 ato: we should not hamper ourselves with our design if we wanted a pipeline later...
20:29:19 jgraham: scrolling is an interesting case, high level does scrolling implicitly
20:30:16 automatedtester: there is an end point from the OSS project called Location in view
20:30:46 JohnJansen: we don't necessarily want to just scroll the element into view, we might want to scroll X pixels
20:31:04 ato: we need to have the implicit scroll in high level
20:32:02 lukeis: yes, but I'd like to accomplish it without requiring any script execution.
20:32:03 jgraham: if I wanted a 2 finger touch, there is currently no way to make sure the elements are in view. I would need to run script and then do the actions
20:34:00 ClayMartin: Should we fire scroll event when moving element into view?
20:34:15 ClayMartin: Should we fire scroll event when moving element into view?
20:34:33 automatedtester: that is a different question.
We need to see if we want the command before we see what events we want to do
20:35:21 ClayMartin: in touch we don't do mouse scroll
20:35:35 automatedtester: on those we are going to do flick type events
20:36:13 jimevans: I recognise the usefulness of an action for scrolling, is it something that we can add in L2?
20:36:19 jgraham: yes
20:36:31 JohnJansen: implicitly scroll?
20:36:45 jimevans: no, we defer to L2
20:37:25 yup :)
20:40:20 brrian: what would happen in interweaved actions and a scroll?
20:40:41 jgraham: [Described what could happen in that scenario]
20:41:20 ato: The document could do items to the page which could break things
20:41:51 samuong: in ChromeDriver we get all the coords at the beginning of the action so scrolling could cause issues
20:42:58 samuong: should we get the coordinates at the beginning of the tick?
20:43:25 jgraham: yes, you could have a pinch zoom that would move things
20:45:26 resolution: defer scroll to L2
20:45:57 JohnJansen: high level commands cannot be described in low level commands
20:48:33 jgraham: after the tick finished, should we add an event to the event loop? (postMessage or setTimeout) or it waits for an animation frame?
20:48:55 brrian: I want it so that it will yield to the event loop
20:49:09 brrian: if it's a timer or requestAnimationFrame...
20:50:37 jgraham: vertical items in the actions should be done as fast as possible and then a vsync for the next horizontal item
20:50:59 samuong: what about dialogs? We stop the event loop if the alert happens
20:51:30 ato: it might cycle for the current vertical
20:52:44 ato: should we check for the dialogs at the beginning of each action?
20:53:33 jgraham: suppose I have 2 key press events, if the first one causes the alert we can't check for a dialog because we don't know the 2nd item has been processed
20:54:16 jgraham: because the event loop is paused and then you are...
20:56:36 samuong: if we are putting a lot of checks on each check then it could cause issues with the processing of the event loop
20:58:16 samuong: if we have to return to the user we could take longer than the tick was supposed to happen
20:59:57 Guest3 has joined #webdriver
21:00:01 resolved: Defer scroll to L2
21:00:15 errant_rider has joined #webdriver
21:00:30 RRSAgent: draft minutes
21:00:30 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html AutomatedTester
21:01:24 jgraham: [describing how, if an alert appears, we block the event loop]
21:03:27 samuong has joined #webdriver
21:03:35 ato: If I were to implement it in Marionette, we have a user prompt service that we can use, and then do the alert and check a global state
21:04:43 jgraham: you would need to check that state before moving on to the next tick and then dismiss the dialog before moving to the next tick
21:05:39 jgraham: we inject [keydown, pointerdown] into the event loop. We have to do something to say do the next tick e.g. setTimeout
21:06:32 jgraham: if we do [keydown, pointerdown, setTimeout] and the pointerdown causes the alert, we can't reach the setTimeout
21:07:02 ato: [writes pseudo code on whiteboard]
21:09:27 ato: for (let action of tick) { event.sendSyntheticpointerdown(action)} yield content.executeScript("window.setTimeout")
21:09:40 jgraham: but we can't reach the executeScript
21:10:21 jgraham: the pointerDown will cause the timeout to never be reached
21:13:02 jgraham: to deal with alerts, you would need to have an event (a non content event that isn't blocked by the event loop)
21:14:04 jgraham: the difference between this and what is in the spec.
We can't always check in actions
21:14:43 ato: if I were implementing it, if this was in a thread I would check during it and then shut down the thread and abort processing the following steps
21:16:15 jimevans: if at any point of the action sequence, we need to make sure we don't hang the driver
21:16:41 jimevans: how it is handled is totally up to the person writing webdriver code
21:17:47 jgraham: if the page injects something that does an alert it would be good to remove the current writing
21:18:37 smccarthy has joined #webdriver
21:19:17 ato: we need to have something that keeps track that the dialog has appeared
21:19:42 jimevans: bottom line, you either expect the dialog or not
21:20:01 jimevans: I am clicking on a button that will create an alert or not
21:20:11 RRSAgent: draft minutes
21:20:11 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html AutomatedTester
21:23:01 jgraham: we should have a basic bit of text on how to handle alerts, not on each command.
21:23:15 ato: no, we need it on all commands
21:23:45 ato: there are special cases where we don't want to check for alerts
21:26:26 resolution: for commands that spin the event loop, prompt handling should be invoked if an alert appears at any time
21:26:34 RRSAgent: draft minutes
21:26:34 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html AutomatedTester
21:27:45 samuong: Should we remove element references and only use X, Y coords in actions?
21:28:36 jimevans: we need to be wary of backwards compat
21:28:58 jgraham: I prefer the idea of checking the coords before the start of the tick
21:29:16 ato: there might be a race condition or the other actions have meant to move it.
21:30:34 Topic: keyboards
21:30:47 JohnJansen: does this really matter?
21:31:39 jgraham: what is the keycode you get when you pass in a key? The options are as follows:
21:31:43 jgraham: Should we just hard code a 104-key keyboard and just use that keyboard.
21:32:04 jgraham: use the keyboard attached, but if in a test farm there are not always keyboards
21:33:01 jgraham: option 3: defaults to a US keyboard but the ability to change
21:33:22 ato: we could add a new command, "setThisKeyboard"
21:34:04 jgraham: option 4: set a per session state for the keyboard
21:34:31 JohnJansen: for L1 we default to 104 US QWERTY keyboard
21:35:05 brrian: how would I input Japanese into a page
21:35:55 jgraham: that works, currently you send through Unicode code points. We may not do the right keycode and they might be using IME
21:36:32 brrian: do we do it as a string?
21:36:45 jgraham: no, per Unicode code point
21:37:12 brrian: not all Japanese strings are divisible to code points
21:38:18 brrian: if you deliver it in some OSes like this, it might not handle this properly. There are dead keys that only activate when you press the next letter
21:38:52 brrian: safaridriver uses grapheme cluster boundaries
21:39:12 brrian: I would like the spec to say to split on these boundaries
21:41:02 ato: what would you do with ü?
21:41:07 brrian: We would send it as 2 code points
21:41:28 jgraham: what happens in the DOM for keycodes
21:41:53 jgraham: do you get 2 events or 1?
21:41:57 brrian: you get 1
21:44:01 [ato and jgraham discussing example of dead keys]
21:44:19 AutomatedTester: This key has passed on.
21:44:22 AutomatedTester: It is deceased.
21:44:25 AutomatedTester: It is no more.
21:44:36 http://unixpapa.com/js/testkey.html
21:45:15 the keyUp is registered for the umlaut but not the keypress
21:45:19 ato the keyUp is registered for the umlaut but not the keypress
21:54:10 http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules
21:54:44 brrian: it would be better if the spec split on grapheme clusters
21:55:49 ato: I am unclear on the benefit here. Would the client split on grapheme cluster?
21:56:30 jgraham: no, you would send the Unicode string in a decomposed form. e.g. "u.
21:56:53 jgraham: you would send over 2 code points
21:57:12 ato: this is different to how everyone is currently doing it
21:58:24 jgraham: I think we should investigate more and discuss again in Lisbon
21:58:54 action: brrian to come up with more use cases for splitting
21:59:52 jgraham: for actions, and maybe sendKeys, what should [shift, a ] do?
21:59:58 ato: it should be an A
22:00:41 ato: should this only happen on sendKeys or should it work with Actions?
22:00:48 jgraham: there are 2 options
22:01:41 jgraham: 1) if you have modifier key pressed you get the next letter
22:02:20 jgraham: 2) if you can't get a char with a modifier, e.g. modifier is pressed but we release, do the other char, and then do the modifier
22:02:35 JohnJansen: [does an example with caps lock]
22:05:06 jimevans: for sendKeys, and only with shift modifier, you would send the string you wanted, and it should be string as is
22:06:23 jimevans: if you send shift + 1 results in !. In sendKeys we implicitly release the modifier
22:07:34 jimevans: [looks up some data in the OSS code base]
22:08:50 jimevans: the current OSS version does have sendKeys actions end point
22:11:06 ato: what would happen in the current clients to handle this against the spec version
22:12:39 jimevans: most Selenium users are using A and not shift + a
22:14:24 ato: the language bindings might have to do a lot of extra work here
22:14:36 jimevans: that is fine, we are doing a non-trivial amount of work here
22:15:35 jimevans: I am not worried about backwards compat here in actions because it is "Do what I say". People will want to be combining keys where in sendKeys they will send the result
22:19:41 JohnJansen has joined #webdriver
22:25:07 simonstewart has joined #webdriver
22:31:12 resolution: for keyUp/keyDown actions we won't do implicit conversion of shift. e.g.
for shift + 7 we do 7 with shift modifier set
22:40:23 simonstewart has joined #webdriver
22:43:38 samuong has joined #webdriver
22:43:42 scribe: samuong
22:47:56 topic: new session
22:49:16 JohnJansen has joined #webdriver
22:49:38 jgraham: want to have top-level capabilities for non-feature-matrix capabilities
22:50:23 jgraham: current api is designed for Sauce Labs/Google-style use case with grid of test machines
22:50:57 jgraham: however for many use cases, having desired and required capabilities is confusing
22:52:02 jgraham: can we simplify this, can we standardize new session data?
22:54:08 ato: intermediary could handle selecting a host from the pool with correct features, but driver doesn't need to worry about matrix selection
22:54:38 “I have opinions"
22:54:54 ato: e.g. proxy settings doesn't make sense as a "desired" capability
22:54:58 ato is correct
22:55:03 Until then :)
22:55:34 The “new session” data is what’s required to successfully set up the browser instance
22:55:38 So the proxy is required
22:55:49 Whether it’s honoured or not is a different thing entirely
22:56:01 Which is why they’re “desired” capabilities
22:56:21 (services such as Sauce Labs and Browser Stack may choose to ignore the setting, for example)
22:56:56 My personal view?
22:57:24 We need a minimum set of routing data (browser, OS, version numbers) for intermediary nodes
22:57:36 And then each browser can figure out what it wants to do with the data
22:58:07 We also need to support multiple “profiles” (to use Mozilla’s term) in the same new session request
22:58:08 ato: distinction between desired and required capabilities is a CI-level concept that might be out-of-scope for the spec
22:58:19 I disagree
22:58:28 “desired” == optional, and can be ignored
22:58:39 “required” == must be set, or fail the new session
22:59:01 Is there a facetime audio number I can call into?
22:59:07 Or a Hangout?
22:59:40 The difference between the two is within scope for the spec
22:59:46 And also what the current drivers do
22:59:54 So we’re just speccing existing behaviour
23:00:02 Which is good, right?
23:00:35 Whoever is scribing, please continue to do so
23:01:56 jgraham: is that we should discuss this at lisbon
23:02:10 That sentence makes no sense?
23:02:21 We should discuss “new session” in Lisbon?
23:02:29 yes
23:02:34 Thanks :)
23:03:49 ok, let's kill this. we need to discuss in lisbon with Simon
23:04:08 I can dial in, if there’s a number I can call over wifi without SIP
23:04:22 Or skype. Nothing I have has skype in :)
23:06:11 ato: there was some discussion about having a capability that gets returned to clients to allow feature detection for w3c features
23:06:32 jimevans: simon's email lays out the handshake
23:06:49 jimevans: this is consistent with how current bindings do it
23:06:51 Feature detection is definitely the way forward
23:06:52 ato: only c#, not node.js
23:07:19 The JS mob got into a horrible mess with assuming certain capabilities based on version numbers
23:07:20 ato: marionette returns a capability with marionette=true
23:07:26 jimevans: this probably isn't how it should be done
23:07:33 I’m looking at Opera here in particular
23:07:46 marionette saying it’s marionette is cool
23:07:56 I’d expect browsers to report themselves
23:08:00 Additional metadata is fine
23:08:13 jimevans: you can construct new session command that is valid for both dialects
23:08:14 But saying “I’m level 1 compliant” is incredibly dangerous
23:08:22 jimevans: the response should tell you what dialect to continue speaking
23:09:04 jimevans: this can be done by looking at the status field
23:09:30 for reference: https://lists.w3.org/Archives/Public/public-browser-tools-testing/2016JulSep/0001.html
23:10:09 Search that for “handshake"
23:10:39 Hahaha :)
23:10:47 https://www.irccloud.com/pastebin/FnoqIzbO/s4b
23:11:39 Hang on.
I need to download skype
23:12:01 you should be able to just call the phone number
23:12:11 or go here
23:12:15 https://join.microsoft.com/meet/clmartin/IA7XCEF4
23:12:20 that will join you in the browser
23:12:48 Skype for Business != Skype so don't bother downloading that I believe
23:13:57 I’m not allowed to install the plugin for all users, and it won’t install for just me
23:14:01 Switching to Chrome
23:16:47 :|
23:16:53 I need to install the plugin as root for my local user account
23:17:00 JohnJansen: I have a bug report for you
23:17:08 i was against this from the start
23:17:10 :-)
23:17:16 I can verify
23:17:51 “The organiser will let you in soon"
23:17:53 Apparently
23:17:57 :)
23:18:42 simonstewart: are you on mute?
23:18:48 I’m speaking
23:19:09 Can you see me?
23:19:46 simonstewart: We can hear you now
23:20:35 Can you hear me as I speak?
23:21:17 simonstewart: No, so there is some terrible hack going on here
23:21:25 Ok
23:21:33 https://www.youtube.com/watch?v=htobTBlCvUU
23:22:22 simonstewart: let's discuss this now instead of in lisbon
23:23:02 johnjansen: conversation so far is that this is configuration we want to set that's not related to the browser we get back
23:23:35 simonstewart: if you've got safari, proxy is set at os level, firefox is at browser level, edge is os-level
23:23:43 simonstewart: some things that are browser-specific, some things are os-specific
23:24:07 simonstewart: for every case, it's a case of "i want a session that fits in these parameters"
23:24:28 simonstewart: if local end requests (e.g.)
IE on linux, remote end can give back IE on windows
23:24:49 simonstewart: response from new session command is the set of capabilities you've got (not what you asked for)
23:25:07 simonstewart: in open-source project, proxy has been omitted, since it's hard for remote ends to sniff proxy settings
23:26:43 simonstewart: but some tests might absolutely require certain proxies, otherwise session is not useful
23:26:59 simonstewart: need to find balance, e.g. can't serialize entire browser profile and send back
23:27:21 jgraham: from my pov, makes sense to talk about desired/required capabilities, in terms of keys that you can have in new session command
23:27:33 jgraham: not clear that spec has to specify what intermediary nodes should do with that
23:27:53 jgraham: if nodes want to be compatible with each other, they should have a separate shorter spec
23:28:05 I am listening jgraham
23:28:08 Keep going, please :)
23:28:18 jgraham: this is the only thing that's specific to intermediary nodes
23:28:45 jgraham: current spec just says what capabilities are there, doesn't say how to resolve between desired/required capabilities
23:28:51 Give me a signal when I should start replying
23:29:13 jgraham: this is a legitimate desire, but it's not useful to have a desired/required distinction for configuration that gets sent to browsers
23:29:43 jgraham: don't want to have to distinguish this in gecko driver
23:30:02 simonstewart: idea behind new session command is that the request is the allocation of the resource
23:30:16 simonstewart: originally, everything was a desired capability
23:30:40 simonstewart: and then users would inspect those capabilities and fail the test if the capabilities don't meet requirements
23:30:57 simonstewart: it turned out that people believed that "desired" meant "required", which is unfortunate
23:31:25 simonstewart: so some googlers (jleyba) pushed for required capabilities
23:31:43 simonstewart: browser name is a good candidate for being in
required capabilities
23:31:53 simonstewart: preferences, proxies, etc. are good candidates for being in desired capabilities
23:32:26 simonstewart: capabilities (local end -> remote end) is a list of requests
23:33:05 jgraham: i agree this is what the current system is, but this pushes complexity onto gecko driver and other drivers
23:33:36 jgraham: e.g. gecko driver needs to have code to handle the binary
23:33:45 simonstewart: this is simple, the binary is either there or not
23:34:00 jgraham: but we need different code depending on whether it's desired or required
23:34:12 simonstewart: are there any other capabilities we can use as an example?
23:34:24 ato: we should only treat browser name, version, platform specially
23:35:00 ato: according to spec, we need to create a "third" capabilities object
23:35:32 simonstewart: binary is a browser-specific feature, it's common but not global (e.g. on android we need an android package)
23:35:51 jgraham: it should be interoperable for each of those common cases
23:36:13 ato: chrome has chromeoptions, gecko driver has something similar, no reason why chrome and firefox couldn't share some of those keys
23:36:51 simonstewart: user could request "a browser on windows"
23:37:27 claymartin: drivers and browsers have a one-to-one relationship
23:37:51 claymartin: selenium should handle desired/required capabilities, i don't get why servers should have to care about it
23:38:09 claymartin: why can't selenium handle the complexity, and only pass the needed capabilities?
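
A minimal sketch of the desired/required split as described so far, assuming capabilities are flat key/value maps. The function name `match_capabilities` and the sample keys and values are hypothetical, chosen for illustration; nothing here is taken from the spec text or from any driver.

```python
def match_capabilities(required, desired, available):
    """Return (granted, unmet_desired) for a new session request, or raise.

    required, desired: what the local end asked for (hypothetical flat dicts).
    available: what this remote end can actually provide.
    """
    # A required capability that is not met means the session must not start.
    for key, value in required.items():
        if available.get(key) != value:
            raise ValueError(f"required capability not met: {key}={value!r}")

    # Desired capabilities are best effort: unmet or unknown ones are
    # recorded and skipped rather than treated as an error.
    unmet = [key for key, value in desired.items() if available.get(key) != value]

    # The response reports what the session actually got, not what was asked for.
    return dict(available), unmet


# Example request: the browser name is required, the proxy is only desired.
granted, unmet = match_capabilities(
    required={"browserName": "firefox"},
    desired={"proxy": "http://proxy.example"},
    available={"browserName": "firefox", "platformName": "linux"},
)
```

In this sketch the local end would still inspect `granted` (and `unmet`) to decide whether the session is good enough for the test, which is the behaviour simonstewart describes for desired capabilities.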
23:38:32 simonstewart: because existing intermediate nodes would then need to track versions and features
23:38:46 simonstewart: intermediary nodes need a base set of capabilities to do routing (which is already in spec)
23:39:01 simonstewart: (browser name, version, os version)
23:39:18 simonstewart: we also should have a way to specify ranges for versions
23:39:45 simonstewart: could do a translation in the intermediary node, but then this would make them very complicated, and prone to breakage
23:39:55 simonstewart: this limits browser vendors' ability to innovate and experiment
23:40:32 jgraham: i agree, but don't think that browser-specific information should end up in capabilities
23:41:12 jgraham: current system makes it hard to specify binary path depending on os
23:41:26 simonstewart: that's true, we need to allow differentiation at the os-level
23:42:14 jgraham: current design is poor
23:43:27 simonstewart: we shouldn't redesign this because there's a very large existing userbase, and we don't want to cause unnecessary churn
23:43:50 simonstewart: it's hard for users to do updates, they've spent a lot of time and money to build webdriver tests
23:43:59 simonstewart: we should keep changes to a minimum, although we can do tweaks
23:44:10 simonstewart: e.g. we should allow specifying multiple version numbers
23:44:28 simonstewart: intermediary nodes shouldn't need to care, they just need to select a vm or host to run on
23:45:10 simonstewart: then ie driver, chromedriver, etc. can have their own config
23:45:30 jgraham: i'm not proposing we change anything with routing
23:45:41 jgraham: change would only be in the clients
23:45:49 simonstewart: but there are many clients
23:45:57 jgraham: but they have to update anyway, to be spec compliant
23:46:42 jgraham: my proposal is that we should have a set of keys that browsers agree on, so that it's easier to implement remote ends
23:47:44 jgraham: if it is necessary to have version-dependent fields (e.g.
set a certain preference for firefox on linux version X), should be able to express that
23:48:10 simonstewart: but why can't this be separated into desired/required capabilities?
23:48:14 johnjansen: did we lose you?
23:48:22 Can you still hear me?
23:48:30 No
23:48:33 simonstewart: I think the connection dropped.
23:48:37 I was “removed from the meeting”
23:48:47 simonstewart: skype became very cross and hung itself up
23:48:52 Joining again
23:49:08 Can you hear me?
23:49:19 Of course, it always starts muted
23:49:21 Can you hear me now?
23:50:02 jgraham: my point is that for things that aren't about matrix-selection, desired vs required requires extra code
23:50:18 jgraham: it's not clear because this isn't spec'd
23:50:42 simonstewart: spec should just say that if a required capability is not met, then it should fail
23:50:50 jgraham: i don't want to implement that
23:51:20 ato: existing drivers conflate desired and required capabilities
23:51:46 simonstewart: we should have it in the spec that new session should fail if a required capability is not met
23:52:43 jgraham: if i can't start the binary, and the binary is "desired", what should it do?
23:53:12 This is Sam speaking now
23:53:51 samuong: what if firefox driver gets a desired binary that points to a chrome binary
23:54:17 jimevans: (you are right)
23:54:36 simonstewart: if a driver doesn't know how to handle a desired capability, it's ok
23:55:02 jgraham: this pushes complexity into the drivers
23:55:12 simonstewart: the other way pushes complexity into the clients
23:56:52 simonstewart: we only need to specify browser name, version, etc.
(to allow routing)
23:57:05 and this should be specified
23:57:13 browser name, browser version, platform name, platform version
23:58:24 simonstewart: don't want to require intermediary nodes to do processing, they should pass through payloads without interpretation
23:58:52 simonstewart: otherwise we need to specify how to transform blobs of data for keys that we don't know about
23:59:02 simonstewart: blob of data from local end should make it to remote end?
23:59:12 jgraham: i don't disagree, but we disagree about how that blob should be structured
23:59:31 jimevans: the stuff that an intermediary node cares about is encapsulated in desired/required capabilities
23:59:44 jimevans: the stuff that a terminal node cares about is also encapsulated there, but i think james is saying it shouldn't be
00:00:02 simonstewart: my position is the opposite - you either have something that is optional, or it's mandated
00:00:50 jgraham: we don't do this now
00:01:13 jgraham: who implements this?
00:01:17 samuong: chromedriver doesn't
00:01:22 claymartin: i'd have to check
00:01:26 jimevans: ie driver doesn't
00:01:48 ato: distinction doesn't make sense - e.g. for the binary, either it runs or it doesn't
00:01:58 simonstewart: then it's not required
00:02:12 simonstewart: if it's required and you can't run the binary, it should fail
00:03:00 jimevans: if path to binary is a desired capability, and the path doesn't exist, should we have a "reasonable default"?
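
A rough sketch of the routing idea discussed above, in which an intermediary node looks only at browser name, browser version, platform name and platform version, and forwards everything else untouched. The key spellings, the `requiredCapabilities` wrapper, the node list shape and the `vendorOptions` key are all assumptions made for illustration, not names confirmed by this discussion.

```python
# The only keys the intermediary inspects (spellings are an assumption).
ROUTING_KEYS = ("browserName", "browserVersion", "platformName", "platformVersion")


def pick_node(payload, nodes):
    """Choose a node from the routing keys alone; return (node_url, payload).

    The payload is handed back as the very same object: the intermediary
    does not rewrite or interpret keys it does not know about.
    """
    required = payload.get("requiredCapabilities", {})
    wanted = {key: required[key] for key in ROUTING_KEYS if key in required}

    for node in nodes:
        offered = node["capabilities"]
        if all(offered.get(key) == value for key, value in wanted.items()):
            return node["url"], payload

    raise LookupError(f"no node matches {wanted}")


# Hypothetical grid with two nodes and a request carrying an opaque blob.
nodes = [
    {"url": "http://node-a.example", "capabilities": {"browserName": "chrome"}},
    {"url": "http://node-b.example", "capabilities": {"browserName": "firefox"}},
]
request = {
    "requiredCapabilities": {"browserName": "firefox"},
    "desiredCapabilities": {"vendorOptions": {"anything": ["goes", "here"]}},
}
url, forwarded = pick_node(request, nodes)
```

Because the blob under `desiredCapabilities` is never read, the intermediary needs no knowledge of browser-specific configuration, which is the property simonstewart argues for.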
00:03:30 jimevans: this would be inferred from the browser, version, platform
00:03:39 jimevans: but james is saying he doesn't want to have to implement that logic
00:04:16 brrian: we have use cases where we test different safari binaries
00:04:34 brrian: if we get a required binary and it isn't the right version, it'll fail
00:05:07 brrian: we don't use intermediate nodes, so we can't say "find me a node that has this binary"
00:05:38 simonstewart: intermediary node needs to look at browser name and platform, and spin up the appropriate node
00:05:57 simonstewart: it's up to the intermediary node to decide when a set of required capabilities is overly specific
00:06:33 brrian: clarifying use case: no intermediary node, want to test stable and beta version of browser
00:06:44 brrian: encode version number, binary for browser, in required capabilities
00:07:19 simonstewart: so it's perfectly legitimate if the new session command fails if the binary doesn't exist
00:07:43 jgraham: that's how firefox works too - it doesn't go and find a different binary to the one that was desired
00:07:51 jgraham: does anyone care about this use case?
00:09:18 simonstewart: example use case is when a grid auto-scales out to aws or another provider, and binaries are not in standard locations
00:09:48 simonstewart: local ends need to sniff capabilities, and make sure that browser name and version meet the requirements, and fail if it doesn't
00:10:18 simonstewart: current open source implementation doesn't work very well if you need to sniff out proxy, for example
00:10:48 claymartin: edge driver does fail if required caps are not met, but will continue if desired caps aren't met
00:12:07 jgraham: if someone requires an extension in edge, and it fails, what happens?
00:12:20 johnjansen: it would fail to create a session
00:12:55 automatedtester: if you pass in a binary location for edge (desired) and the location doesn't exist on endpoint, what happens?
00:13:45 simonstewart: if binary is desired but not present, then it falls back to other methods of finding the binary (e.g. browser name, browser version, platform name, platform version)
00:13:53 jimevans: or it could look in the registry for edge's location, or something
00:14:36 simonstewart: missing desires are not such a big deal
00:15:19 (simonstewart has an analogy about quitting work and hang gliding, and the nature of desire, but claims this isn't deeply philosophical)
00:16:46 jgraham: not sure this conversation is productive
00:16:48 samuong++
00:17:05 johnjansen: the other thing we wanted to talk about is handshakes, versioning
00:18:00 now we're talking about the handshake described in https://lists.w3.org/Archives/Public/public-browser-tools-testing/2016JulSep/0001.html
00:19:27 jimevans: in a successful session creation, open source protocol has an integer status
00:19:38 ato: so in your c# code, it should check for that field, right?
00:19:45 jimevans: yes
00:20:09 simonstewart: if you are w3c compliant, you only send w3c responses, remote end should respond appropriately
00:20:27 simonstewart: from a spec point of view, we shouldn't have to care about this, we just assume everyone's w3c-compliant
00:21:09 simonstewart: there's enough info in the new session request and the response to determine whether we're speaking the open source dialect or the w3c protocol
00:21:48 jimevans: w3c doesn't send an integer status code in the response
00:21:51 ato: ok
00:23:15 johnjansen: we don't need to do anything
00:23:35 simonstewart: yes, we don't need to modify spec, only changes are in open source code
00:24:29 RRSAgent: please draft the minutes
00:24:29 I have made the request to generate http://www.w3.org/2016/07/13-webdriver-minutes.html samuong
00:25:16 RRSAgent: please track the action items
00:25:16 I'm logging. I don't understand 'please track the action items', samuong.
Try /msg RRSAgent help
00:25:28 RRSAgent: please track action items
00:26:21 RRSAgent: stop
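
A small sketch of the handshake check discussed at 00:19 to 00:21, assuming the new session response body has already been parsed into a dict and that the integer lives under a top-level `status` field. The function name and the sample response shapes are hypothetical; only the rule itself (integer status present in the open source dialect, absent in W3C responses) comes from the discussion.

```python
def speaks_open_source_dialect(response_body):
    """Guess the dialect of a parsed new session response.

    Returns True when an integer status is present (open source protocol),
    False otherwise (treated here as a W3C-style response).
    """
    status = response_body.get("status")
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(status, int) and not isinstance(status, bool)


# Hypothetical response bodies, for illustration only.
legacy_response = {"status": 0, "sessionId": "abc", "value": {"browserName": "firefox"}}
w3c_response = {"value": {"sessionId": "abc", "capabilities": {"browserName": "firefox"}}}

is_legacy = speaks_open_source_dialect(legacy_response)
is_w3c = not speaks_open_source_dialect(w3c_response)
```

A local end could run a check like this once per session and pick its command encoding accordingly, which matches the conclusion that the spec itself needs no change.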