Browser Tools and Testing WG, day 2

26 Oct 2018


Present: ato, whimboo, AutomatedTester, simonstewart, johnchen, JohnJansen, brrian, crouleau, ChristianBromann, jgraham, thomasboyles, calebrouleau, titus, corevo, sahazel, johnchen_, gsnedders
Scribe: Andreas Tolfsen, simonstewart, automatedtester, ato


<scribe> Scribe: Andreas Tolfsen

<scribe> ScribeNick: ato

AutomatedTester: My suggestion is that we set the agenda for the day first.
... Any preference? Otherwise we could knock out a few easy ones.
... I was going to suggest project governance.

simonstewart: Endpoint for window state?
... Scroll to, action primitives for Element Send Keys.
... Hopefully we should be able to do these with minimal bloodshed.

<simonstewart> https://www.w3.org/wiki/WebDriver/2018-TPAC-F2F

Project governance

AutomatedTester: I guess it's best to understand where Sauce comes from, then I can do my view from the W3C side.

thomasboyles: We don't really have an issue. We want a clean way to tell our folks to make proposals. To document this would be helpful.

ChristianBromann: I also struggled to prepare for TPAC. What does it mean to make a proposal? I can add an issue, but what does it mean when I put something on the issue list, and how does one move this proposal forward? What are the next steps?

AutomatedTester: OK
... For historical reasons there is a lot of tribal knowledge. There are groups at W3C that prevent people from making proposals [?]. If we work with you to make sure we document the life cycle of it, then this can be beneficial to everyone.

thomasboyles: I can help you with that.

AutomatedTester: When it comes to proposing new things, it's a simple case of raising a new issue. If you have a concrete proposal in spec prose, that gives something to look at.
... Our focus has always been on driving a browser. And what that looks like. When it comes to things like Appium it is slightly out of scope, but it doesn't mean that it has to stay out of scope. We are trying to solve the website first, because through the W3C it is easier to get browser vendors involved with that work.
... It's hard to standardise mobile stuff because [it's slightly out of scope of W3C].

thomasboyles: If there's something that seems off topic, then tell us that.

AutomatedTester: Obviously we also have a public mailing list which is easy to join.
... Communication should go through that because I have issues with GitHub. I struggle to keep up with GitHub comments.
... It becomes impossible for me to keep up [with GitHub notifications]. Going through the mailing list tends to be easier. But if there's an issue, raising issues on GitHub keeps things as transparent as possible. I am not in favour of making secret discussions.

thomasboyles: That would be great.
... Do you want to give me write access to the wiki so I can write this up?

<JohnJansen> email address: public-browser-tools-testing@w3.org

AutomatedTester: If we draft this, we should put it up as a PR so we can have proper review process.
... This will keep things transparent.
... In reality it tends to be three people who do the actual work.

<AutomatedTester> mailing list https://lists.w3.org/Archives/Public/public-browser-tools-testing/

ato: [explained what it means to get buy-in from vendors]

simonstewart: If in one of these meetings a vendor says they won't implement something, we simply can't do it.

thomasboyles: Is there a formal voting process?

AutomatedTester: If you don't respond, we assume tacit agreement.

jgraham: In practice, if you don't respond and it goes in the spec, and you later come along and say you're not going to implement it after all, it will probably be removed again.
... The idea is that there needs to be general agreement.

ChristianBromann: I was reading the model of TC39 and I think this should be applied to this WG.
... They have different stages proposals go through. Stage 2 could be writing a shim in actual drivers. This mental model could help us understand what the requirements are for getting changes into the specifications.

<calebrouleau> https://github.com/w3c/webdriver is where requests for features and bugs go

ChristianBromann: A lot of people make proposals but don't follow up on them. If you really want to have something in the protocol, we should have steps documented of what it takes to get the proposals into the specification.

AutomatedTester: We're not TC39 and we're not specifying a language. We are significantly smaller than TC39.
... Typically it's the same person doing the spec prose, the tests, and the implementation.
... It sounds a little heavyweight for this WG.

ChristianBromann: I would rather say, if you make a proposal you are also responsible for doing the spec prose and writing the tests.

jgraham: People writing other specs add WebDriver features to their spec. For example Permissions. That model is a bit unproven.
... It makes it harder to centralise WebDriver decision process, if there's an element of decentralisation where other WGs can do their own thing out of band of this WG's process.
... Appium came up: it feels philosophically difficult to argue why the W3C should define automation of native applications. I am personally skeptical whether that is work this WG should concern itself with.
... Not convinced it comes under our charter.

ChristianBromann: New charter?

jgraham: Well getting a charter that grants you the right to write specification concerning automation of native software would be hard to swing by the W3C.

AutomatedTester: It would be hard to describe the semantics as well, especially when you go cross platform. Android's UI structure is very different from Apple's UI model. Describing these semantics in a way that makes sense for both OSes would be hard. I could see how either vendor might push back on this.
... Ignoring the philosophical part concerning W3C, it would be technically difficult.
... But getting Apple and Google to the table might be worth it. It could happen with enough momentum.

ChristianBromann: Coming back to the proposal life cycle.
... I would like to draft something that [???]

AutomatedTester: Yes I would be happy to look at a proposal for a life cycle.

JohnJansen: I want to clarify two things. There has to be unanimous agreement. If someone says they're not going to do it, it can't go into the spec. We reach agreement through long, hard discussions. People value this process.
... For proxy capabilities, we use the Windows proxy and you log on as a user, so if you start messing with the user you could also mess up the system.
... But I believe it should be in the spec, even if EdgeDriver does not implement it.

Action primitive for scrollTo

<simonstewart> scribe: simonstewart

ato: I looked into this and had a discussion with some users at SeConf

ato: supports the idea of having an action that scrolls the viewport

<whimboo> https://github.com/w3c/webdriver/issues/1005

ato: with the existing actions, it's impossible to do something with an element outside of the viewport

jgraham: technically not true, in reality it's true

contrived example follows

ato: so what users want is a way of arbitrarily scrolling the viewport.
... I think they want to scroll by element reference and by a delta
... what should those values look like? CSS pixels? View height? Something else?

jgraham: all the platform features are in CSS pixels

<AutomatedTester> https://drafts.csswg.org/cssom-view/#dom-window-scroll

jgraham: should be like the pointer move stuff.

<JohnJansen> https://developer.mozilla.org/en-US/docs/Web/API/Window/scrollTo

jgraham: you would say "either scroll relative to the viewport, or from a particular element"

ato: yes, but the two DOM primitives we have are "scrollTo" and it takes a dictionary
... proposes using the same defaults as existing commands
... it's not clear how we define the action element's input

jgraham: it should look like pointer move. Basically, there's a switch that defines behaviour
... seems like we need exactly the same thing here
... what about an offset from an element

ato: reads the spec for pointer move

<AutomatedTester> https://w3c.github.io/webdriver/#dfn-perform-a-pointer-move

jgraham and ato debate what "start position" means

jgraham: three options: 1/ you have a document that's already scrolled. Coordinate system relative to current x/y, 2/ relative to the origin of the document (an absolute position), 3/ you have an element, and if it's possible to implement in a reasonable way you can scroll it into view, and then move by a delta
... part of the point of this is so you can model the high level command using the underlying Actions api
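
The three options jgraham lists can be sketched as a small function that resolves a requested offset into an absolute document scroll position. This is only an illustration: the origin names ("current", "document", "element") and the function itself are hypothetical and were not proposed in the meeting.

```python
def resolve_scroll_target(origin, dx, dy, current=(0, 0), element_pos=None):
    """Return the absolute (x, y) document position to scroll to, in CSS pixels.

    origin is a hypothetical switch, not spec text:
      "current"  - offset from the current scroll position (option 1)
      "document" - offset from the document origin, i.e. absolute (option 2)
      "element"  - offset from an element's document position (option 3)
    """
    if origin == "current":
        return (current[0] + dx, current[1] + dy)
    if origin == "document":
        return (dx, dy)
    if origin == "element":
        if element_pos is None:
            raise ValueError("element origin needs the element's document position")
        return (element_pos[0] + dx, element_pos[1] + dy)
    raise ValueError(f"unknown origin: {origin!r}")
```

For example, with the document already scrolled to (0, 300), a delta of (0, 200) resolves to (0, 500) under option 1 and to (0, 200) under option 2.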

ato: it used to be you'd scroll the element into the top of the viewport, which led to lots of elements being obscured, so we changed that to "scroll to the bottom", since sites are less likely to place elements at the bottom, but we didn't define scrolling to the centre of the view

jgraham: perhaps it'd be fine to just pass in the same values as supported by scrollintoview

ato: guess it's an orthogonal discussion about changing the defaults of the original method

jgraham: we can't as that would be a breaking change

ato acknowledges this

ato: however users may not notice

whimboo: element click should also use the action primitives

jgraham: what we do is make this possible, and then scroll into view would be defined by what we do in the actions primitive

ato: which of the vendors want this thing most

AutomatedTester: I think it's the selenium community

titus: there's been a lot of work done to hack around this in the ruby community, and it's all in JS, and it's not nice
... this is a regular user request

jgraham talks about implementation details

AutomatedTester: we could just look at how titus has done it, and build off that

ato: it should be modelled on pointer move. I do see the point of using the same defaults as the existing scroll into view

jgraham: discusses the DOM calls that would need to be made

ato: has one question. Possibly selenium. Does selenium expose this primitive already

<AutomatedTester> scribe: automatedtester

simonstewart: the selenium bindings don't have it because it's not in the actions API

jgraham: we don't have it because people can't execute script in the middle of the actions

simonstewart: that is correct

<titus> https://github.com/p0deje/watir-scroll/

simonstewart: implementing it would be near zero work

ato: I am worried that Selenium users would only be able to use it with geckodriver

JohnJansen: Edge has this supported

calebrouleau: also coming to chromedriver

<simonstewart> scribe: simonstewart

ato: what do we want to call the primitive?

jgraham and AutomatedTester: "scroll"

RESOLUTION: Add general "scroll" action primitive that takes the same input as pointerMove, except for origin it will not take the "pointer" variant, but it will have one called "relative" that is relative to the viewport. x/y offset will be given in pixels, and it will take the same ScrollIntoViewOptions defaults as the high level Element Click command.
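
A minimal sketch of what one such "scroll" action item might look like, modelled on the fields of pointerMove. The field names, the reading that origin accepts "viewport", "relative", or an element reference, and the helper function are all assumptions made for illustration; the resolution does not fix a wire format.

```python
# Key that identifies a web element reference in WebDriver JSON.
WEB_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def scroll_action(origin="viewport", x=0, y=0, duration=0):
    """Build one hypothetical "scroll" action item (not spec text)."""
    if isinstance(origin, str):
        # Unlike pointerMove, the "pointer" origin is deliberately rejected.
        if origin not in ("viewport", "relative"):
            raise ValueError(f"unsupported origin: {origin!r}")
    elif not (isinstance(origin, dict) and WEB_ELEMENT_KEY in origin):
        raise ValueError("origin must be a string or a web element reference")
    return {"type": "scroll", "origin": origin, "x": x, "y": y, "duration": duration}
```

Under these assumptions, `scroll_action("relative", y=200)` would describe scrolling by 200 pixels vertically, and passing `{WEB_ELEMENT_KEY: "<id>"}` as the origin would describe an offset from an element.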

<ato> Scribe: ato

Opening a new top-level browsing context

<whimboo> https://github.com/w3c/webdriver/issues/1138

ato: [explains reasoning for having a new top-level command for opening a new top-level browsing context]

gsnedders: [explains difference between top-level browsing context and auxiliary browsing context]

jgraham: If you like detail…

<brrian> Safari extension command PR for Selenium bindings https://github.com/SeleniumHQ/selenium/pull/6430

jgraham: Depending on the browser, there may be more than one way to do that.
... New window/tab.
... It would be a hint, because it does not have a meaning on mobile.

simonstewart: No.

jgraham: Yes.

gsnedders: Resize will affect every browsing context inside that window.

jgraham: There are use cases that depend on that difference, but obviously not something you can expose in a completely browser agnostic way.

simonstewart: If you're hinting that you want tab or window, then you need to have a return value indicating if the driver obeyed that request.

jgraham: Should we have a type argument "this is a hint", then if we need to extend it in the future we could do that. E.g. new headset or holo or whatever.

whimboo: Why wouldn't we return an error?

jgraham: On mobile you can only really open tabs. Windows are not a thing.

brrian: Not true on the iPad.

simonstewart: It gets complicated.

simonstewart: People write their script once and they will expect it to always work.

jgraham: It feels more philosophically consistent with WebDriver.
... If you care, check the return value.

calebrouleau: I wonder what test developers would need this hint for?

sahazel: People don't do a lot of it at the moment because they don't have the ability.
... But if they had, there are test cases where users want this.
... E.g. having two users interacting.

simonstewart: Bad example.

jgraham: When you have a window, document.hidden state will be different from when you have a tab.
... This could affect testing scenarios.

<gsnedders> ato: as to whether or not it should have a hint, I think it's more philosophically consistent, given most people write their tests once and have the expectation that the tests aren't too device specific, so I think it's a good idea

brrian: My motivation for implementing this in Safari, I needed to actually test the strange permutations of window manipulation.
... There's no obvious way to test this behaviour without having two unrelated windows.
... Then I later realised that users had been asking for this, and it works really well in Safari. Although I've implemented it without a hint.

[brrian clarified Safari _does_ have a hint.]

simonstewart: Users currently hit Control + N.
... I'm agreeing with jgraham on this one that taking a hint and not raising an error makes sense.

jgraham: Did you write a proposal for this, simonstewart?

simonstewart: I didn't, but I suggested a URL endpoint.

<AutomatedTester> https://github.com/w3c/webdriver/issues/1138

jgraham: Is the Safari behaviour written down anywhere?

brrian: Not as such, but there is a patch.

jgraham: Can you provide a link?

<AutomatedTester> https://github.com/SeleniumHQ/selenium/pull/6430

jgraham: Maybe we have enough to write a spec for this, actually.

sahazel: People want to test a single-page app that is supposed to reload data on interaction from a different user. E.g. add a message to a conversation, then have another user in another window see it appear.

<AutomatedTester> ato: to sahazel's example, it matters whether you have a tab or a window because of the value of document.hidden

<AutomatedTester> gsnedders: is adding more granularity than that

<AutomatedTester> ato: simonstewart was saying ctrl+N

<AutomatedTester> ato: that is only possible on desktop and in drivers that support it

<AutomatedTester> ato: the spec doesn't allow that

JohnJansen: But geckodriver used to allow Control + T on the document?

ato: FirefoxDriver used to, but not geckodriver.

JohnJansen: Right.

AutomatedTester: Should we make a resolution on this?

calebrouleau: Sure.

RESOLUTION: Add new endpoint as documented in the issue, but additionally take a hint as input indicating tab/window/potentially thing in the future, and return the window handle that was opened and the context type that was created.
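
A minimal sketch of the resolved behaviour from the remote end's point of view: the hint is advisory, no error is raised when it cannot be honoured, and the response reports what was actually created. The parameter name "type", the response fields "handle" and "type", and the fallback choice are assumptions made for illustration.

```python
import uuid


def new_window(params, supported=("tab", "window")):
    """Open a new top-level browsing context, treating params["type"] as a hint.

    `supported` lists the context types this hypothetical driver can create;
    the first entry is used when the hint is missing or cannot be honoured.
    """
    hint = params.get("type")
    created = hint if hint in supported else supported[0]
    # A real driver would open the tab or window here; this only fakes a handle.
    return {"handle": str(uuid.uuid4()), "type": created}


# A mobile-like driver that can only open tabs ignores the "window" hint.
response = new_window({"type": "window"}, supported=("tab",))
hint_obeyed = response["type"] == "window"
```

A script that cares about the difference checks the returned type, as jgraham suggests; a script that does not care keeps working on every platform.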

[discussion about gitbot]

Add endpoint for Get Window State

<whimboo> https://github.com/w3c/webdriver/issues/1023

<AutomatedTester> scribe: automatedtester

ato: context, we used to have an additional field for {get,set} window rect that would return window state and geckodriver implemented it
... it was removed because it would not allow us to get to REC
... the reason why we need this is because we can not let web APIs tell us that we are in the correct window state before we move on

brrian: does this use the visibility APIs like document.hidden et al

ato: Window managers don't always have the same semantics, e.g. what does maximise actually mean?
... <discusses the differences between different platforms>
... whimboo's proposal is to have a separate endpoint

<ato> simonstewart: If the proposal is to have a separate endpoint rather than adding it to the window rect, would we then deprecate maximize window/minimize window/fullscreen window?

simonstewart: if the proposal is to have a separate endpoint, would we obsolete the separate commands?

<inserted> Scribe: ato

whimboo: When you call fullscreen/minimize/maximize you can call it multiple times. They are idempotent.

simonstewart: But would we make these as obsoleted?

ato: I think probably we wouldn't have had separate commands originally.
... But I don't have a strong opinion on this.

JohnJansen: Are you talking about making a breaking change?

simonstewart: Yes, the minimize/maximize/fullscreen commands will sort of be superseded by a Set Window State command.

JohnJansen: As ato pointed out, if we thought about this more carefully in the past we probably would've done this differently, but I'm not sure I'm in favour of obsoleting commands. We could just have a getter, Get Window State.

simonstewart: What is the process of removing features from the specification?

AutomatedTester: I'm happy to find out what the process is with MikeSmith.

simonstewart: The separate command endpoints would redirect to the Set Window State endpoint.

<scribe> ACTION: For AutomatedTester to speak with MikeSmith about obsoleting features in a specification

brrian: I see this as problematic.
... On macOS there is no maximized state.
... If you maximize a browser, it will look at the web content window and try to figure out how big it can make it.

simonstewart: How do you handle this at the moment?

brrian: We simulate hitting the plus button.

AutomatedTester: I guess it's problematic to decide if that succeeded or not?

brrian: You can check if something is fullscreen using DOM APIs.
... On iOS there's no concept of minimized.
... On iPads you can have side-by-side windows, does that mean it's maximized?
... I would prefer being able to return 501 Not Implemented for Minimize/Maximized, because there's currently no way of querying the capabilities.
... There could also be other window states.
... I'm not sure what this is solving. It's highly WM specific.
... I know there's work ongoing in other web platform APIs to define "how active" a window is, is it in the foreground, and so on.

<AutomatedTester> ato: there is already a way to kinda query the capabilities

<AutomatedTester> ato: the webdriver only really considers desktop. There is a setWindowRect capability

<whimboo> scribe: AutomatedTester

brrian: I am not sure if those are coupled

ato: perhaps we should have a capability for the different types that can be supported on the platform
... in firefox we simulate the maximise button being pressed
... we use an API to do that and we know that won't work on iOS
... the API sets the state internally
... we need a way to get the state info and set the state of the window
... on windows a 40000x40000 window would set it to the max size of the window. On Linux some WM would set it to 40000x40000
... it is good that we can try to establish the similarities between different platforms

brrian: I am not sure if this is testable. If I click maximize in Safari it's supposed to be idempotent, but it doesn't always behave that way

gsnedders: do we want to be idempotent?

ato: it is already in the spec

brrian: <discusses what a window manager might do>

<inserted> Scribe: ato

simonstewart: Philosophical discussions about platform differences are futile, we already have these commands.
... The Maximize Window definition is generous in what it permits a driver to do.
... We have weasel wording in there already.

<whimboo> scribe: ato

simonstewart: The unsupported operation error indicates that the command couldn't be carried out for whatever reason.

brrian: That's going to be very brittle.

JohnJansen: Yes, that seems kind of weird.

brrian: Users couldn't rely on it.

simonstewart: That's one of the reasons we return the setWindowRect capability.

brrian: [some example about maximized and resizing?]

ato: Factual comment: If you're in maximized state and resize the window, you are taken out of maximized state.

simonstewart: I think clients call endpoints blindly, I don't think they test the HTTP status code of whether something is not implemented.

brrian: I don't understand why you would test the maximized state. You would just do it.

simonstewart: If someone tries to maximize on iOS it would be totally legit for the driver to return and say it's as maximized as it gets.
... We are allowed to say "natural size" for example.
... Who knows what that means?
... "fixed" size might make more sense.
... Advertise what window states you got, then when someone passes in an unsupported window state we could error.
... I think we can articulate all the things you're trying to say already with the infrastructure we have in the specification.

brrian: Then I don't understand why we're removing the explicit high level commands.

simonstewart: It's also about extensibility in the future.

brrian: The direction of the web platform is to expose fullscreen/hidden or not, because everything else is very vague and not consistent.

simonstewart: True. But we already have these endpoints.

sahazel: I'm concerned about returning an error on Maximize Window, because this is something users do all the time.

JohnJansen: Yes, all of our internal tests do this.

sahazel: I think it should always be infallible.

whimboo: About the window being active or not active: I wonder if Get Window State could also return whether it's active or not.

ato: When is a window active?

<gsnedders> https://github.com/WICG/page-lifecycle/blob/master/README.md

whimboo: I mean document.hidden.

<gsnedders> is the new Page Lifecycle stuff, the README of the repo contains a pretty diagram unlike the spec

ato: Yes, if we called it "hidden" that would be clearer, as "active" could be misinterpreted as "level of activity in window", as brrian mentioned earlier, or possibly if the window is in the front.

JohnJansen: I'm not sure we need those.
... You have to switch to a window before you can use it regardless.
... Ultimately it does not matter to the test whether you can see this state as a human being or not.

sahazel: Regarding some platforms not having a maximized state: You could return a boolean.
... At least that would mean it doesn't lie to you.

gsnedders: Is there any platform that allows a window to be maximized but not otherwise resized?

simonstewart: Yes, some of the tiling window managers on Linux.

JohnJansen: On Windows you're not really minimizing the window, you put it into a cache.

ato: Actually what WebDriver says in this case is that you must first restore it before you resize. This is in order to have some level of platform interoperability.

sahazel: Maximize Window could return [?] it could say "I've maximized something, but I didn't take this action".
... The response could contain advisory information about whether an action was taken. You could also have a field indicating what operations are available on the system.

AutomatedTester: There seems to be broad agreement that we should add this new endpoint.
... But not necessarily obsolete the other high level commands.

<AutomatedTester> brrian: I don't see the need to have symmetrical endpoints. I am happy with get, though it might not give the best info

<AutomatedTester> brrian: I don't see value in obsoleting the older commands

<sahazel> to clarify, my idea was that setting the maximize state might return advisory info like {"actionTaken": true, "isStateOnPlatform": false} on macOS (where maximize does do something, but does not toggle any special window status)

calebrouleau: In Chrome you can't set the window to an extremely small size.

jgraham: What's the limit?

johnchen__: It's around five hundred.

ato: If you ask WebDriver to resize the window to 10x10 px it will just do its best. The command won't fail if it can't reach those dimensions.
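
The best-effort behaviour ato describes can be sketched as clamping the requested size to whatever the platform permits and returning the size actually reached, without raising an error. The limits below are made-up values for illustration only; the 500 echoes johnchen's rough figure, whose unit was not stated.

```python
def resize_best_effort(width, height, minimum=(500, 300), maximum=(1920, 1080)):
    """Clamp a requested window size to hypothetical platform limits.

    Returns the size actually applied; never fails because the exact
    dimensions could not be reached.
    """
    applied_width = min(max(width, minimum[0]), maximum[0])
    applied_height = min(max(height, minimum[1]), maximum[1])
    return {"width": applied_width, "height": applied_height}
```

A client that needs exact dimensions compares the returned size with the requested one instead of expecting an error.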

brrian: Given the behaviour of Maximize Window it is not really "speccable". I'm not sure there's much you can test with this command.

<AutomatedTester> ato: the concrete problem is marionette will return from the maximise command before it has finished maximising

<AutomatedTester> ato: to pass the WPT currently we had to add heuristics to make it work

<AutomatedTester> ato: we can move these to only be in Firefox tests and then we would lose interop

calebrouleau: Could we return a no/maybe/yes, then Firefox can say "no", then "yes". And Safari could always say "maybe"?
... Is it maximized? yeah… maybe.

JohnJansen: What problem does it solve?

calebrouleau: We could write Firefox WPT tests that go from "no" to "yes", then for Safari they could still pass on "maybe".
... Who has a problem with that?

<inserted> Scribe: AutomatedTester

ato: <describes how Firefox works>
... the reason I wanted window state is to make more reliable WDSpec tests
... do users really want this feature

JohnJansen: we don't actually check the state even though we run that command

ato and whimboo discuss how this could be done as a shim in the client

jgraham: how do you tell from the contentJS that the command has completed successfully

whimboo: in this time we won't get resize events

jgraham: yes, we would fail this intermittently rather than all the time

jgraham: if we have a size, and we call maximise and that returns the size, then you can see if the size changed

ato: I have written the heuristics to make this work in the tests

jgraham: we already have this in the spec where maximise returns the window size

ato: <describes how the test would work>

<gsnedders> [break until 13:15]

RESOLUTION: We will not add a Get Window State command because not all remote end implementations have a notion of maximized window state. This means the tests for Maximize Window will be weak and that we will have to introduce a thread suspend to make sure the window size doesn’t keep changing after the command has returned.

<jgraham> jgraham: We already have all the primitives we need to write a test to show that the update is synchronous. It will be a weak test and could fail if the returned size is the target size rather than the actual size, but it seems like given the constraints it's good enough
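
The weak test jgraham describes could be sketched as follows: call Maximize Window, then re-read the window size a few times after a short pause and check that it still matches what the command returned. The session classes are stand-ins written for this illustration, not a real client API, and the sizes, pause, and sample count are arbitrary.

```python
import time

MAXIMIZED = {"x": 0, "y": 0, "width": 1920, "height": 1080}


class FakeSession:
    """Stand-in whose maximize completes before the command returns."""

    def __init__(self):
        self._rect = {"x": 10, "y": 10, "width": 800, "height": 600}

    def get_window_rect(self):
        return dict(self._rect)

    def maximize_window(self):
        self._rect = dict(MAXIMIZED)
        return dict(self._rect)


class LaggySession(FakeSession):
    """Returns the target size at once, but the window catches up one read later."""

    def __init__(self):
        super().__init__()
        self._pending = None

    def maximize_window(self):
        self._pending = dict(MAXIMIZED)
        return dict(MAXIMIZED)

    def get_window_rect(self):
        rect = dict(self._rect)
        if self._pending is not None:
            self._rect, self._pending = self._pending, None
        return rect


def size_of(rect):
    return (rect["width"], rect["height"])


def maximize_is_synchronous(session, settle=0.05, samples=3):
    """Weak check: the size returned by maximize stays the observed size."""
    returned = session.maximize_window()
    for _ in range(samples):
        time.sleep(settle)  # the "thread suspend" mentioned in the resolution
        if size_of(session.get_window_rect()) != size_of(returned):
            return False
    return True
```

The LaggySession case is the failure mode jgraham mentions: a driver that returns the target size rather than the actual size fails the check even though the window eventually reaches that size.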

<ato> Scribe: ato

Element send keys (action primitives)

<whimboo> https://github.com/w3c/webdriver/issues/1322

whimboo: I recently implemented action primitives in Element Send Keys.
... I got to handling modifier keys.

<whimboo> https://w3c.github.io/webdriver/#dfn-dispatch-the-events-for-a-typeable-string

whimboo: We have some specific steps where we handle shift modifiers. Users can say they want to have upper-case letter, then they send Keys.SHIFT again, but we also have the case when there are special keys.
... Here we check if shift is pressed, then reset the shifted state. This is in conflict with the shift key being pressed.
... There is an example in the GitHub issue I have filed.
... There is no way for users to press shift then issue lower-case letters.
... I don't know how this worked in the past.
... But there are two ways of putting the input source into shifted state, and they conflict with one another.

jgraham: You are implementing Element Send Keys on top of action primitives?
... Is the problem that it shares the global input source state?

whimboo: We don't share the input state, we use a unique input source every time.

jgraham: Shift works magically for uppercase letters in the high-level command.

JohnJansen: In the high level command we want to do what you mean.

<AutomatedTester> whimboo what would happen if you had "ab" + key.shift + "cd"

<AutomatedTester> ato: what would happen if you had "ab" + key.shift + "CD"

<AutomatedTester> simonstewart: we would have "abCD"

<AutomatedTester> JohnJansen: we have discussed this in the past

<AutomatedTester> jgraham: yes, it was a breaking change in the past

<AutomatedTester> JohnJansen: <repeats what is in the issue>

jgraham: It's supposed to act as if the key event is generated for the OS for the current pressed keys.
... If you have shift depressed, then you press "a" you would get an uppercase "A".

whimboo: The key modifier state is currently only reset at the end.

simonstewart: "if it’s a shifted character, hold down the left shift"
... [reasons about the algorithm]
... So we never release the shifted key.
... We never undo the work done in step 1.

jgraham: Is this a bug?
... If you explicitly send a SHIFT key.

gsnedders: Should this also apply to control?

jgraham: There's no model in WebDriver holding down control in this particular case.

simonstewart: Unless you accumulate shifts.

jgraham: Reference counting number of shifts held down is not a good idea.
... We don't want to reset the whole state.
... We're saying that you need to handle implicit shifts differently to explicit shifts.
... The question is if it should last for one character or for a duration of multiple characters.

simonstewart: We know each keyboard has its own state, if you have held down the shift key and you have a character that is uppercase you can skip step 1.
... There's a missing AND statement in here.
... You set an implicit flag.
... Then after each character, if this flag is set, you release the implicit shift key state.

jgraham: Question is still if it should apply to just the next character or for multiple chars.

ato: Will users be surprised by this behaviour?

JohnJansen: Edge is not compliant with this, but we implement the spec as it is.

<JohnJansen> new Actions(Driver).KeyDown(Keys.Shift).SendKeys("abcdef").KeyUp(Keys.Shift).Perform();

webElement.sendKeys("ab" + Keys.SHIFT + "cd")

webElement.sendKeys("ab" + Keys.SHIFT + "CD")

What does this produce?

Do we accumulate shift state, or do we reset it?

simonstewart: The very first thing you would do in the algorithm is to store the current shifted state.

<scribe> ACTION: jgraham to propose a patch for handling accumulated shift state in the high-level Element Send Keys command

RESOLUTION: In Element Send Keys, once a modifier is held down it is considered to be held down until the null key is sent. Duplicate modifiers are ignored, implicit modifiers don't override an explicit one, and for implicit modifiers only, we coalesce adjacent same-case characters.
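
One possible reading of this resolution, sketched for the shift key only. The function, the event tuples, and the simplification that a "shifted character" is any uppercase letter are assumptions made for illustration; the spec uses a table of shifted characters, and the patch from the action item may differ. The code points for the null key and shift are the ones WebDriver assigns.

```python
NULL = "\ue000"   # WebDriver null key: releases all held modifiers
SHIFT = "\ue008"  # WebDriver (left) shift key


def key_events(text):
    """Translate Element Send Keys text into (event, key) pairs, shift only."""
    events = []
    explicit = False  # shift held because the user sent Keys.SHIFT
    implicit = False  # shift held only to type a run of uppercase characters
    for ch in text:
        if ch == NULL:
            if explicit or implicit:
                events.append(("keyUp", "Shift"))
            explicit = implicit = False
        elif ch == SHIFT:
            if explicit:
                continue  # duplicate modifiers are ignored
            if not implicit:
                events.append(("keyDown", "Shift"))
            explicit, implicit = True, False
        else:
            if not explicit:  # implicit shift never overrides an explicit one
                if ch.isupper() and not implicit:
                    events.append(("keyDown", "Shift"))
                    implicit = True
                elif not ch.isupper() and implicit:
                    events.append(("keyUp", "Shift"))
                    implicit = False
            key = ch.lower()
            events += [("keyDown", key), ("keyUp", key)]
    if explicit or implicit:
        events.append(("keyUp", "Shift"))  # state is reset when the command ends
    return events
```

Under this reading, "ab" + Keys.SHIFT + "cd" and "ab" + Keys.SHIFT + "CD" generate the same key events, and "AB" holds shift once across both characters rather than pressing and releasing it per character.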

Improving interactability checks for hidden file upload controls

<scribe> Scribe: AutomatedTester

<whimboo> https://github.com/w3c/webdriver/issues/1230

ato: conceptually this is really simple

simonstewart: oh dear [....]

ato: you have a file upload and in modern JS frameworks they set <input type=file display:none> and then have a fancy div elsewhere
... that then sets the value across magically
... so when you try sendKeys to the proper input we throw an ElementNotVisibleError because of interactability checks

Chrome does not treat this as ElementNotVisibleError but Geckodriver does

scribe: so how do we go about solving this that we can reach interop because this is a "special" type
... the proposal in the github issue is to treat this as a special issue where we don't do interactability checks
... the way you upload files with webdriver is <describes the spec prose> and this is not how the user interacts with that element
... we have 2 issues here

brrian: I fixed this last week in safari driver
... and the issue was fixed by removing some of the checks and making sure we can click on it
... in the current safari driver with display none

whimboo: we changed geckodriver to solve this issue

simonstewart: to model the users we should keep the interactability checks and we can see a new issue

ato: that doesn't solve the interop issue
... geckodriver could move to chrome driver because if chromedriver moved then chromedriver would get errors

brrian: I would like the endpoint because that way it would just be a W3C endpoint

ato: I agree philosophically but that doesn't solve the issue we need to agree on now

simonstewart: in 5 years time we won't care

jgraham: we definitely need to fix that now unless chrome solves it right now
... we should move this to be more like what people are using now and their expectations

simonstewart: we definitely need to have the discussion about the endpoint

jgraham: sure but we need to solve the interop issue now as that is important
... it might be worth discussing the future endpoints
... but we need to discuss this point now
... we need to agree on the interop now

simonstewart: can we caveat this in w3c mode

<gsnedders> ato: I'm next on the speaker list so I get to speak

<ato> ack

<ato> (Sorry)

ato: even if chrome and safari move to geckodriver because we don't solve how to handle the JS frameworks

jgraham: yes, we need to do both cases at the same time

simonstewart: I have opinions
... we need to discuss the endpoint
... if we could add a special capability to allow Mozilla to continue to be spec compliant but turn it off and then have the endpoint created and allow other vendors to catch up and then switch to the spec

jgraham: no but we have a queue

calebrouleau: is there any workaround for users?

simonstewart: the correct workaround is the endpoint. the temp workaround is to model after chrome

calebrouleau: to clarify, users have no workaround for geckodriver?

jgraham: no

simonstewart: <describes implementation details on how people can do it>

brrian: could the client bindings test it

simonstewart: yes we can do that
... we can add any shim we want to do that

ato: we were kind of complicit in making this
... I saw that on stackoverflow about accessibility because it breaks accessibility
... we need to make the element transparent then accessibility tools can get it
... we can approach frameworks to fix this to stop using display:none
... the other approach is for the client bindings to set the element to be opacity

<simonstewart> It'd be easy to add a new file detector

ato: we need to either have a transition path

jgraham: we can't change the way the element looks as that causes other issues

<ato> a+

jgraham: we could have the approach of breaking users twice which isn't good

<ato> s/a\+//

jgraham: we need to see about moving the spec to where the users are and where other implementations are
... we can also move to a capability to allow people to be more strict in the future

calebrouleau: <describes how chromedriver does it>

simonstewart: if we add the endpoint for file uploads would chrome adopt?

johnchen: yes we would

simonstewart: great, we should have a capability for strict file checks

jgraham: I am not happy with that solution, we need geckodriver to be like selenium

simonstewart: people upgrade everything at once and then cry because it breaks

jgraham: we have non-selenium clients but they are smaller. Chrome isn't going to change so let's move to theirs

ato: we need to clean up the file upload and we can't use \n because some file systems allow that in file names

brrian: can't we just tell them to update

ato: that is still going to be showing as broken

brrian: If there is path forward to upgrade so we can get users to upgrade and it should be fine

<discussion about upgrade paths and clients>

simonstewart: we are aiming at modern selenium users and we can totally update this

calebrouleau: chromedriver won't change its behaviour but we can have a new endpoint, and then local ends can update and make sure users have a good solution

JohnJansen: is there a test case about this?

<whimboo> example page for hidden file upload: https://jsbin.com/zopicoxeba/1

JohnJansen: if angular broke on this I would rather do the right thing, like ato said, and not do what the spec says

<ato> ack

<ato> ack

<simonstewart> YES!

brrian: my answer is contingent on what it means on the code
... and iOS is a weird case

simonstewart: my proposal is a new capability and default it to be like chromedriver default. We define the endpoint in this meeting so we can have a path forward
... and when the new endpoint is deployed in the future local ends will switch the other way in the future

jgraham: I am happy with this proposal

simonstewart: the capability should be phrased positively

jgraham: you write some test cases with no default set then you have a test with that capability not the default and then selenium can wait until everyone passes those 2 tests and then swap the default at the same time. Issues would then be on the selenium side and not implementor side

ato: this sounds like something you would want to obsolete in the future
... it feels like upload via sendKeys is not good anyway and special casing it is being a wort on a turd

jgraham: in the future we can have a capability where it turns off the weirdness on sendKeys

ato: although the default behavior is up in the air

jgraham: no, the default behaviour is chrome
... we could instrument the commands and then see about obsoleting these in the future

ato: so what do we do with tests?

jgraham: we write tests to pass on chrome

ato: and what would the client look like

simonstewart: <describe how we can do that>

jgraham: the transition path for local ends is out of scope

<gsnedders> [discussion about removing worts from turds]

simonstewart: we are only talking about keyboard interactability here

all: agreed

RESOLUTION: modify spec to make chrome's behaviour the expected behaviour for file upload (eg. to skip keyboard interactivity test for file input elements). Add a new "strictFileInteractivity" capability that defaults to "false". Add a new file upload endpoint.
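
The resolution above could be sketched as the following remote-end check. This is a simplified model under stated assumptions: the `Element` dataclass, the `element_send_keys` function, and its return values are hypothetical illustrations, while the capability name and its default of `false` come from the resolution.

```python
from dataclasses import dataclass


@dataclass
class Element:
    tag: str
    type: str = ""
    keyboard_interactable: bool = True  # False for e.g. display:none


class ElementNotInteractable(Exception):
    """Stand-in for the WebDriver "element not interactable" error."""


def element_send_keys(element, text, capabilities=None):
    """Return the action a remote end might take for Element Send Keys (sketch)."""
    caps = capabilities or {}
    strict = caps.get("strictFileInteractivity", False)  # default per resolution
    is_file_input = element.tag == "input" and element.type == "file"

    # Skip the keyboard-interactability check for file inputs unless strict.
    must_check = strict or not is_file_input
    if must_check and not element.keyboard_interactable:
        raise ElementNotInteractable(element.tag)

    return ("set files", text) if is_file_input else ("type", text)
```

With the default capability value a hidden `<input type=file>` accepts the file path, matching chromedriver's behaviour as described in the discussion; setting the capability to `True` restores the stricter check.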

New command endpoint for file uploads

<ato> Scribe: ato


ato: [explains proposal for a new file upload endpoint]

simonstewart: Looks good, but I have one question.
... People spend forever finding the file upload element.
... They explore the UI to find the <div> they can drop the element on.
... They have an expectation that dropping a file on the <div> redirects to the upload to the <input type=file> element hidden somewhere else.
... I'm using "drop" in the colloquial sense.
... Should this command apply only to <input type=file> element or should it apply generally?

whimboo: Where should the file list be sent?

simonstewart: People don't like the styling of <input type=file> and this causes people to struggle writing tests.
... They will find the locator of the stylish element hiding the file upload element and they expect sending a file to this element will work.

ChristianBromann: If we limit the command to <input type=file> we could improve the error message.

ato: Error message strings are entirely implementation-defined.

AutomatedTester: But we could have an error?

simonstewart: We could.

ato: How do we tell if the <div> redirects to an <input type=file>?

simonstewart: I don't know.

brrian: I don't think we should deal with drag-and-drop.

ato: I don't think we care where the file comes from.
... The <div> could synthesise a click on the <input type=file> causing the file chooser dialogue to appear.
... I don't think we care where the file comes from.

brrian: You pre-set files and the next time a file chooser appears you upload these files.

simonstewart: That's very unfamiliar.
... You could allow users to switch to the modal where you select the dialogue.

calebrouleau: You can't do that.

simonstewart: Well, no. But that would conceptually be the Right Approach.

gsnedders: The file picker UI can potentially be caused by _any_ user input.
... There are many trusted user events that can trigger a modal dialogue.
... Do we want to restrict what we're doing to just the click case?

gsnedders: We want to handle something that can occur from a script at some point.
... What browsers treat as user-initiated events differ.
... It isn't even triggered synchronously.

simonstewart: I have a plan.

gsnedders: Fantastic!

simonstewart: Simplest thing that could possibly work, in a way that possibly doesn't harm us in the future: People find the file upload element, they call the upload endpoint on that, anywhere else we trigger an error.
... Clicking on another <div> is already broken; if we later figure out how to handle the redirection, we can add support for that then.

RESOLUTION: create the file upload endpoint for elements, make this apply to file input elements only. Have this follow the parameters outlined by ato in his GH bug comment. https://github.com/w3c/webdriver/issues/1230#issuecomment-376989364 Add a new error type for "file uploads not allowed" (or something like that, eh?)
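
A rough sketch of the element restriction in this resolution. The parameters of the real endpoint are defined in the linked GitHub comment and are not reproduced here; the `files` list, the dictionary-based element, and the error name `FileUploadNotAllowed` are all hypothetical placeholders.

```python
class FileUploadNotAllowed(Exception):
    """Hypothetical error; the minutes leave the final name open."""


def upload_files(element, files):
    """Attach files to a file input; reject any other element (sketch)."""
    if element.get("tag") != "input" or element.get("type") != "file":
        # Styled <div> wrappers and other elements are rejected for now.
        raise FileUploadNotAllowed("not an <input type=file> element")
    element["files"] = list(files)
    return None  # success with data null
```

Rejecting everything except file inputs keeps the door open for supporting redirecting elements later without breaking existing callers.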

[break for coffee]

Clarify the expected behavior when an alert is shown in the middle of a script execution

Review all commands to ensure if user prompt handler has to be called or not

<simonstewart> Outcome of conversations so far: the check is needed to be done on every command, other than those that don't have an associated session, quit, and the commands that specifically handle alerts.

<AutomatedTester> scribe: automatedtester

AutomatedTester: I don't think that findElement should have alert checks

simonstewart: Some implementations use javascript to find the elements
... if an alert is present then the test will halt

AutomatedTester: what implementations does this affect?

simonstewart: I don't know

ChristianBromann: what about getAttribute et al using that?

simonstewart: anything that uses JS will be blocked by this
... it's about adding 1 more step

AutomatedTester: no that's not true

jgraham: I thought that we had to have text that if a prompt was hit at any point we should error

simonstewart: there are use cases where we don't want to error e.g. a click that forces an alert to come

ato: there is a special case with executeScript

[missed comments]

<discussion about scoping where these checks could be>

<brrian> WebKit uses JS to implement Find Element

<whimboo> https://github.com/w3c/webdriver/issues/1086

simonstewart: we would want to have this in the spec on all commands that are not New Session

RESOLUTION: Go read the spec issue and follow up on that one
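
The outcome noted at the start of this topic (check on every command except those without a session, quit, and the alert commands) might look like this in a command dispatcher. It is a sketch only: the `dispatch` function is hypothetical, and it assumes the check always errors rather than consulting the session's user prompt handler.

```python
# Commands exempt from the check, following the outcome noted above.
NO_SESSION = {"New Session", "Status"}
ALERT_COMMANDS = {"Dismiss Alert", "Accept Alert", "Get Alert Text", "Send Alert Text"}
EXEMPT = NO_SESSION | ALERT_COMMANDS | {"Delete Session"}


class UnexpectedAlertOpen(Exception):
    """Stand-in for the WebDriver "unexpected alert open" error."""


def dispatch(command, prompt_open, handler):
    """Run handler unless a user prompt blocks this command (sketch)."""
    if command not in EXEMPT and prompt_open:
        raise UnexpectedAlertOpen(command)
    return handler()
```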

HTTP Authentication dialogs

<ato> The remaining topics on the agenda will be timeboxed.

<ato> We will spend 20 minutes on this topic.

<ato> gsnedders: How do we respond? How do you provide username/password?

<ato> simonstewart: It's not just user/pass.

<ato> simonstewart: How do you authenticate with the website.

<ato> simonstewart: Basic, digest, NTLM, but there are more out there.

<ato> simonstewart: Prior art in Selenium is that we used to have an authenticate endpoint with credentials serialised any way people took a fancy to.

<ato> simonstewart: You would extend that to handle NTLM, but it handled basic.

<ato> simonstewart: You could add new authentication handlers.

<ato> simonstewart: You can imagine doing biometric.

<ato> simonstewart: I've performed an action, a dialogue has appeared, you switch to the dialogue to interact with it.

<ato> brrian: It typically handles on navigation?

<ato> simonstewart: Get authenticated.com, then they know there's an alert opening.

<ato> simonstewart: "unhandled authentication present" error, like "unhandled user prompt".

<ato> brrian: And if you try to switch to it without it existing, it would say there is no unhandled auth present as well.

<ato> simonstewart: Right.

<ato> simonstewart: You have navigate, then you have authenticate-as. This is what the remote end would see.

<ato> AutomatedTester: In step 4 of the navigation command, we have a step where we handle user prompts.

<ato> AutomatedTester: At this point we're going to have to pause the navigation, handle authentication dialogues, then move on.

<ato> simonstewart: No.

<ato> simonstewart: We handle this the same way as user prompts.

<ato> simonstewart: You don't pick up where you left off, you wait until the next command comes in.

<ato> AutomatedTester: If you navigate and hit an authentication dialogue, they authenticate, then you navigate again?

<ato> simonstewart: The navigation completes, but then the next command will be blocked by an error.

<ato> jgraham: What's the proposal to serialise the authentication information?

<ato> jgraham: Why not just provide this with the navigation command?

<ato> simonstewart: We could do that, but it wouldn't cover the Element Click case.

<ato> simonstewart: It would return somewhere around step 7 in the Navigate To.

<ato> simonstewart: If the navigation ends prematurely due to an auth dialogue, return success to user.

<ato> jgraham: That doesn't work in detail, which I don't know if is a problem.

<ato> jgraham: We don't always go through this algorithm.

<ato> simonstewart: True.

<ato> ato: Yes.

<ato> simonstewart: There are hook points in the spec that allow us to do this in a clean way, and navigate itself would probably have a special case because auth dialogues are encountered all the time.

<ato> jgraham: is it necessary that a local end knows that an auth prompt happens?

<ato> simonstewart: No.

<ato> jgraham: Is it acceptable to set auth information in the session?

<JohnJansen> spec being referenced: https://w3c.github.io/webdriver/#navigate-to

<ato> simonstewart: This is analogous to handling user prompts, which is already well defined.

<ato> ato: Question: Are there implementations of this?

<ato> simonstewart: There were two implementations that behaved this way.

<ato> jgraham: Can we make this work the way the unhandled user prompt open error does?

<ato> simonstewart: Yes.

<ato> jgraham: Yes I think that could work, we could make it more specific.

<ato> simonstewart: The normal flow is that we don't look into the future.

<ato> AutomatedTester: There should probably be a capability like unhandledPromptHandler.

<ato> AutomatedTester: Nine out of ten times they are going to treat it the same way.


<ato> ato: Are you saying there are two ways? One defining some auth info upfront in capability, and one at runtime by switching to the alert?

<ato> AutomatedTester: They can prebundle the info as part of the session, or we can create another endpoint that you can provide this info to.

<ato> simonstewart: There is a deliberate auth endpoint.

<ato> AutomatedTester: It seems reasonable.

<ato> AutomatedTester: If it's not set, where do we return the errors?

<ato> simonstewart: Same places we handle alerts.

<ato> AutomatedTester: Yeah OK, we just need to augment that.

RESOLUTION: Add new capability for default authentication credentials. Add handling of authentication dialogues wherever it is that we have unhandled prompt prose. Add endpoint to allow deliberate authentication, and this is probably going to be in section 18.

<ato> ato: My only concern is that the proposal should specify the serialisation of the authentication info.

<ato> simonstewart: Does that sound implementable?

<ato> brrian: [nods]
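
The flow discussed in this topic could be modelled roughly as below: navigation returns success even when an authentication dialogue interrupts it, the next command is blocked by an error, and a deliberate authenticate call clears the dialogue. Everything here is a hypothetical sketch; the capability name `defaultCredentials`, both error names, and the credential format are assumptions, since the serialisation was left unspecified.

```python
class UnhandledAuthentication(Exception):
    """Hypothetical error, modelled on "unexpected alert open"."""


class NoAuthenticationPresent(Exception):
    """Hypothetical error for authenticating when no dialogue is open."""


class Session:
    def __init__(self, capabilities=None):
        caps = capabilities or {}
        # Hypothetical capability name for default credentials.
        self.default_credentials = caps.get("defaultCredentials")
        self.auth_pending = False

    def navigate(self, url, requires_auth=False):
        # Navigation returns success even if an auth dialogue interrupted it.
        if requires_auth and self.default_credentials is None:
            self.auth_pending = True
        return None

    def authenticate(self, credentials):
        # Deliberate authentication endpoint: answers the open dialogue.
        if not self.auth_pending:
            raise NoAuthenticationPresent("no authentication dialogue present")
        self.auth_pending = False

    def run(self, handler):
        # Any other command is blocked while a dialogue is unanswered.
        if self.auth_pending:
            raise UnhandledAuthentication("unhandled authentication present")
        return handler()
```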

Asynchronous New Session

<ato> sahazel: The concept is for cloud services, like Selenium Grid, to have a way to start a new session without it happening in the span of a single HTTP request.

<ato> sahazel: Currently we handle this by sending an HTTP redirect every 10 seconds to avoid timeouts, which works OK but depends on the clients following redirects. They all have a maximum number of redirects, which limits how long you can do this.

<ato> sahazel: We used to have a custom header.

<ato> sahazel: We want a New Session that will immediately respond with an ID.

<ato> sahazel: We would be happy to implement this.

<ato> AutomatedTester: The only concern I have is that the rest of the API is considered blocking, that if this part is async then you're going to have a slew of other commands coming in.

<ato> sahazel: If you send Find Element before the session is up, then the response would be the same as if the session didn't exist.

<ato> simonstewart: The problem is if the session takes a really long time to set up, then there are too many redirects.

<ato> simonstewart: This is why we have chunked HTTP encoding.

<ato> simonstewart: You keep the session alive because it keeps sending data.

<ato> jgraham: Why do we not make this a separate endpoint?

<ato> sahazel: Proxies may try to wait for the entire body before relaying it, rather than relaying it byte-by-byte.

<ato> sahazel: We wanted to use a proxy that waited for the body to be complete before it returned it.

<ato> sahazel: The proxy would have to relay the chunked encoding, which might work, I would have to go try it out.

<ato> jgraham: HTTP long polling is a thing and I think it does something like this.

<ato> jgraham: It sounds like a tremendous hack, which is why I'm wondering what problem you're seeing with implementing this explicitly with a new command.

<ato> jgraham: It sounds like you have a technical objection to doing this properly.

<ato> simonstewart: It assumes that the local end is properly addressable [?] from the remote end.

<ato> simonstewart: The session ID is normally what you would use.

<ato> jgraham: You get back a session ID for an incomplete session.

<ato> simonstewart: None of the local ends know how to switch session ID.

<ato> jgraham: Commands to that session ID would not work.

<ato> ato: Why couldn't this be done in the intermediary node? You could have the intermediary node start geckodriver and wait for it to finish. The intermediary node would return a temporary session ID.

<ato> simonstewart: Everything assumes that the session ID is stable, which is not the case if you have a temporary assigned ID.

<ato> sahazel: Two thoughts: Selenium could select the session ID, and then once it gets a real session ID it could keep a table/map between these. Second is for the async session endpoint to return a temporary ident until it gets the correct one.

<ato> simonstewart: You're right, I don't like the first idea.

<ato> simonstewart: This breaks so much stuff with regards to backwards compatibility.

<ato> sahazel: It's a strictly new addition.

<ato> jgraham: It sounds like the second option is like getting a token back that you can translate into a session: "at some point this will resolve into a session".

<ato> jgraham: The question is, when do you end up with a session?

<ato> sahazel: You would hit a second endpoint polling for whether the session is ready.

<ato> jgraham: This would be easy with a bi-directional protocol.

<ato> jgraham: Adding a polling hack on top of "I want to hear an event".

<ato> simonstewart: It's not a problem for anyone but cloud providers.

<ato> jgraham: It does seem like a real problem.

<ato> simonstewart: I've not met anyone who says this is a problem.

<ato> brrian: Emulators and iOS could take a long time to start.

<ato> AutomatedTester: And Android.

<ato> brrian: The session creation would _not_ be instant, like with desktop Safari.

<ato> simonstewart: With chunked HTTP encoding we wouldn't have to do anything special.

<ato> jgraham: But then you make assumptions about how the HTTP stack works.

<ato> ato: But there's still a hack, like the async-like-sync thing.

<ato> calebrouleau: Delaying the full response… where does that break? If we can't find a way it will cause problems, then I don't see why we need to invent a complicated engineering solution.

<ato> jgraham: Everyone needs to know how to do it. It's hard to document. It would become tribal knowledge.

<ato> jgraham: You need to make sure that something keeps the connection alive.

<ato> jgraham: What does the spec for "you need to send a few bytes to keep the connection alive" look like?

<ato> jgraham: You can't set the timeout to something large, because individual components might have different timeout durations.

<ato> calebrouleau: I guess I see the stateful solutions to be worse.

<ato> ato: I think both solutions here are awful.

<ato> ato: We are actually over time.

<ato> ChristianBromann: Would it be a good idea to document it? [?]

<ato> jgraham: We could have it as a note.

<ato> ato: I suggest we continue this discussion on GitHub or on the mailing list.

<ato> [some talk about geckodriver and whether it would break Sauce if it did it]
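
The token-then-poll option raised in this discussion can be illustrated with a small in-memory model. No such endpoint was agreed; the class `AsyncSessionBroker`, its method names, and the response fields are hypothetical and only show the shape of the idea.

```python
import itertools


class AsyncSessionBroker:
    """In-memory model of the token-then-poll idea (all names hypothetical)."""

    def __init__(self):
        self._tokens = itertools.count(1)
        self._pending = {}  # token -> session ID, or None while starting

    def new_session_async(self):
        # Respond immediately with a token instead of blocking on startup.
        token = f"token-{next(self._tokens)}"
        self._pending[token] = None
        return {"token": token}

    def session_started(self, token, session_id):
        # Called by the intermediary once the real session exists.
        self._pending[token] = session_id

    def poll(self, token):
        # Second endpoint: the local end polls until the session is ready.
        session_id = self._pending[token]
        if session_id is None:
            return {"ready": False}
        return {"ready": True, "sessionId": session_id}
```

Because the token is distinct from the eventual session ID, local ends never have to switch session IDs mid-session, which was one of the compatibility concerns raised.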

Command pipelining

<ato> sahazel: An advantage of JS test frameworks is that they execute everything in the browser, because they don't have a roundtrip time between the test runner and the browser.

<simonstewart> This seems more like command batching....

<ato> sahazel: A sequence of commands where you're not checking for the outcome, for each of these things you could return immediately.

<ato> sahazel: You could do five click interactions in sequence.

<ato> sahazel: The goal is to reduce test runtime.

<ato> simonstewart: This is batching, not pipelining.

<ato> sahazel: I disagree.

<ato> gsnedders: You are technically correct.

<ato> sahazel: [reads from the Internet]

<ato> simonstewart: This ties into ato's comment about labelling up commands with particular things, like alert handling so we can special case them and stuff like that.

<ato> simonstewart: There are read commands and write commands.

<ato> simonstewart: When you're doing reads you have to do them serially.

<ato> AutomatedTester: How would you classify Find Element?

<ato> simonstewart: It would be read, but Element Click would be write.

<ato> AutomatedTester: Find Element + Click could be batched together?

<ato> simonstewart: There are ways that require more user intervention.

<ato> AutomatedTester: If you batch click + type, you could do a click that then causes navigation, which would then cause stale element references.

<ato> jgraham: In the simple model where you just return the result of the last command, you need to find some way of passing on information from the previous command to the next command.

<ato> ato: It sounds like reinventing Unix shell pipes.

<ato> jgraham: It's actually even more difficult, because you sometimes want to pass the return type from the first command to the third or fourth command.

<ato> ChristianBromann: If you fetch a lot of information through HTTP requests, the driver could cut out? [?]

<ato> jgraham: Are you opening multiple HTTP requests on the same session?

<ato> ChristianBromann: Yes.

<ato> jgraham: That's explicitly not allowed.

<ato> ChristianBromann: This would also be solved with bi-di.

<ato> brrian: It queues them.

<ato> sahazel: Well, enqueuing on the driver side is great! Because then you can send off a lot of requests without waiting for the response.

<ato> jgraham: If all you want to do is queue them up, then that sounds plausible. But also scary.

<ato> jgraham: The simple thing to do is to have an endpoint to which you can send a list of command name and payload to, and you get back a list of results. This allows the same thing you did, but without opening 5,000 HTTP requests and without seeing if the server falls over.

<ato> jgraham: It's not a perfect system.
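
The "list of commands in, list of results out" endpoint described above could be sketched like this. The function `run_batch`, the handler table, and the stop-on-first-error policy are assumptions for illustration; the minutes do not define error handling for a batch.

```python
def run_batch(commands, handlers):
    """Run (name, payload) pairs in order and collect one result per command."""
    results = []
    for name, payload in commands:
        try:
            results.append({"value": handlers[name](payload)})
        except Exception as exc:
            # Assumption: stop at the first failure, since later commands
            # may depend on state the failed command was meant to create.
            results.append({"error": str(exc)})
            break
    return results
```

This simple model does not address the harder problem raised in the discussion, namely passing the result of one command as input to a later one.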

<ato> simonstewart: [exhales]

<ato> simonstewart: This would be a lot easier with bi-di, e.g. with async callbacks.

<ato> simonstewart: We are never going to get the same efficiency as running the whole thing in a script.

<simonstewart> ato: as a script that is sent all the way to the server before being executed

<ato> jgraham: … JS remote end API

<ato> jgraham: Sorry I jumped the queue.

<ato> sahazel: I think it is viable.

<ato> sahazel: It would help us to save a fraction.

<ato> sahazel: … of time of requests.

<ato> sahazel: List of commands/list of responses would improve the situation further.

<ato> sahazel: Exposing a JS API in the browser would potentially open up competition with other testing frameworks that can do this sort of thing faster.

<ato> sahazel: But if this is going to take a long time getting to implementation, then [?]

<simonstewart> scribe: simonstewart

<jgraham> q close

ato: actually having a special command where webdriver is emulated in the browser
... what would it look like if we coupled webdriver to the dev tools panel?

<jgraham> close the queue

<AutomatedTester> ack

<jgraham> Zakim: close the queue

ato: when you say it's viable, "citation needed"
... this would require significant effort to implement, and it's not clear that this is the "killer thing" for this WG to focus on

brrian: bidi doesn't buy a whole lot for speed.
... It doesn't let you pile in 100 things simultaneously

<ato> jgraham: I think the idea of implicitly batching in the client is terrifying because there's lots of scripts that depend on the delay between commands.

jgraham: implicit batching sounds terrifying because people rely on implicit timing problems

<ato> Scribe: simonstewart

jgraham: there's an opportunity cost here

ato: mozilla won't have time to work on this

simonstewart: offers to work with sahazel to put together a proof of concept to help lend some data to the conversation

Clarify the expected behavior with multiple alerts executing a script

<whimboo> https://github.com/w3c/webdriver/issues/1153

whimboo: we have tests for this in wpt. There is a difference between browsers.
... should we return once a user prompt is shown. IEDriver can't do this.

ato: you have an execute script, and an alert happens. That causes the script to be interrupted and the command returns with null. At this point, another alert happens. Then the unhandled prompt handler gets called, right?

whimboo: clarifies
... that case is okay and not a problem
... it's when execute script causes two alerts to happen serially in the same script
... IEDriver can't break the script

jgraham: is very confused
... if an alert opens, the event loop is halted.

simonstewart: the IE driver is unable to stop script execution, whereas geckodriver can do that

ato: if this is only an IEDriver problem, then it may be fine that this is a known edge case....

whimboo: how do other browser vendors handle this?

Discussion about how browsers are hard work

<gsnedders> wd.execute_script("window.alert(); window.alert()") essentially, right?

gsnedders: so you call execute script. It causes an alert to occur.

jgraham: so IE can't kill the script halfway through

JohnJansen: is what gsnedders posted the right thing?

<AutomatedTester> https://github.com/w3c/webdriver/issues/1153


jgraham: I think this may be something you need a whiteboard for
... the key piece of information is that he needs to close the alert to continue handling the command

Debate about which test is this in wpt

frantic typing fails to ensue

puzzled faces do appear

jgraham: apart from Edge and IE, everyone appears to have the same behaviour as geckodriver

<AutomatedTester> https://wpt.fyi/results/webdriver/tests/execute_script/execute.py?label=experimental

ato: the spec prose for execute script is pretty clear --- the script should be aborted. In IE's case, the script cannot be aborted

jgraham: comes up with an edge case. People laugh

calebrouleau: "This is why we need bidirectional"

<AutomatedTester> all: laughter

jgraham: what about clicks that cause alerts in IEDriver?

<whimboo> https://wpt.fyi/results/webdriver/tests/execute_async_script/execute_async.py?label=experimental

simonstewart: those are fine

<ato> https://searchfox.org/mozilla-central/source/testing/web-platform/tests/webdriver/tests/execute_script/execute.py#45

ato: I don't think this is right, because the script should have been aborted

"If at any point during the algorithm a user prompt appears, abort all subsequent substeps of this algorithm, and return success with data null."

JohnJansen: this test has a bug in it
... I'd like to know the use case....

jgraham: this is pretty edge-case stuff.
... There's another case where an alert appears because of a timer. It's not inconceivable that this might happen
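
The difference under discussion can be shown with a toy model of the `window.alert(); window.alert()` case. It treats a script as a list of statements and is not how any driver is implemented; the `can_abort` flag is a hypothetical stand-in for whether the remote end is able to stop the script when the first prompt appears.

```python
def execute_script(statements, can_abort=True):
    """Model a script as a list of statements; "alert" opens a user prompt."""
    prompts_opened = 0
    for statement in statements:
        if statement == "alert":
            prompts_opened += 1
            if can_abort:
                # Spec prose: abort remaining substeps, success with data null.
                return {"value": None, "prompts_opened": prompts_opened}
            # Otherwise the script resumes once the prompt is closed.
    return {"value": None, "prompts_opened": prompts_opened}
```

With `can_abort=True` only the first prompt ever opens; with `can_abort=False` the second one opens as well once the first is closed, which is the situation described for IEDriver.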

<scribe> ACTION: whimboo to fix the multiple user prompt tests with Execute (Async) Script

ato: we have about 10 minutes left

Shadow DOM support in WebDriver

<AutomatedTester> https://github.com/w3c/webdriver/pull/1320

AutomatedTester: easy mental model is how we treat iframes. If there's a shadowroot, you can peer into it and go down
... It's turtles all the way down
... the limitations are that from the css stuff, it's limited to only looking at open shadow roots. Closed ones offer no way into them.
... we can't tell the difference between a closed shadowroot and no shadowroot
... If people like the approach in that GH issue, I can start writing tests

jgraham: so the idea is you find an element, and then there's a stateful switch into finding.... do we have a global... do we have a "this is the bit of the dom subtree we're in at the moment" switch
... a shadowroot is not a browsing context

AutomatedTester and gsnedders: agreed

jgraham: the current sessions, current context, current frame, current shadowroot


jgraham: execute script don't care about the shadowroot (because everything shares the same JS global)

AutomatedTester: you can't use a queryselector to do that

corevo: you can queryselector into a shadowroot. There's a pseudo-class for that

<corevo> https://www.w3.org/TR/css-scoping-1/#shadow-pseudoelement

corevo: It's in css scoping 1
... "::shadow"

jgraham: let's assume there's a possibility we need to do some work here.
... I wonder if it would be sufficient to make findElement be the only place we change things?

AutomatedTester: raises concerns about getElementFromPoint. You can't look through to the shadowroot

ato: but the scoping would be to the document
... if you get back a webelement, that element belongs to the shadowdom. How do you interact with it?
... what happens when I do "click"?

jgraham: there's a question about interactability requests
... with DOM you can peer through
... concerned about having global state

RESOLUTION: we did not reach consensus, but it's an interesting idea to scope retrieving shadow dom elements to the findElements command
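
As an illustration of scoping shadow DOM lookup to the find command, here is a toy tree search that descends into open shadow roots only. The `Node` model and `find_elements` function are hypothetical; a closed shadow root is simply never entered, which matches the point above that it cannot be told apart from having no shadow root.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    shadow_children: List["Node"] = field(default_factory=list)
    shadow_mode: Optional[str] = None  # "open", "closed", or None


def find_elements(root, name):
    """Depth-first search that descends into open shadow roots only."""
    found = []
    # Light DOM children are always searched.
    for child in root.children:
        if child.name == name:
            found.append(child)
        found.extend(find_elements(child, name))
    # Shadow trees are searched only when the root is open.
    if root.shadow_mode == "open":
        for child in root.shadow_children:
            if child.name == name:
                found.append(child)
            found.extend(find_elements(child, name))
    return found
```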

Summary of Action Items

[NEW] ACTION: For AutomatedTester to speak with MikeSmith about obsoleting features in a specification
[NEW] ACTION: jgraham to propose a patch for handling accumulated shift state in the high-level Element Send Keys command
[NEW] ACTION: whimboo to fix the multiple user prompt tests with Execute (Async) Script

Summary of Resolutions

  1. Add general "scroll" action primitive that takes the same input as pointerMove, except for origin it will not take the "pointer" variant, but it will have one called "relative" that is relative to the viewport. x/y offset will be given in pixels, and it will take the same ScrollIntoViewOptions defaults as the high level Element Click command.
  2. Add new endpoint as documented in the issue, but additionally take a hint as input indicating tab/window/potentially thing in the future, and return the window handle that was opened and the context type that was created.
  3. We will not add a Get Window State command because not all remote end implementations have a notion of maximized window state. This means the tests for Maximize Window will be weak and that we will have to introduce a thread suspend to make sure the window size doesn’t keep changing after the command has returned.
  4. In Element Send Keys, once a modifier is held down it is considered to be held down until the null key is sent. Duplicate modifiers are ignored, implicit modifiers don’t override the explicit one, and for implicit modifiers only, we coalesce adjacent same-case characters.
  5. Modify spec to make Chrome's behaviour the expected behaviour for file upload (e.g. to skip keyboard interactivity test for file input elements). Add a new "strictFileInteractivity" capability that defaults to "false". Add a new file upload endpoint.
  6. create the file upload endpoint for elements, make this apply to file input elements only. Have this follow the parameters outlined by ato in his GH bug comment. https://github.com/w3c/webdriver/issues/1230#issuecomment-376989364 Add a new error type for "file uploads not allowed" (or something like that, eh?)
  7. Go read the spec issue and follow up on that one
  8. Add new capability for default authentication credentials. Add handling of authentication dialogues wherever it is that we have unhandled prompt prose. Add endpoint to allow deliberate authentication, and this is probably going to be in section 18.
  9. we did not reach consensus, but it's an interesting idea to scope retrieving shadow dom elements to the findElements command
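
Illustration of resolution 4: a minimal sketch of the explicit modifier bookkeeping, not proposed spec text. It covers only the "held until the null key" and "duplicates are ignored" parts; implicit modifiers and character coalescing are left out. The code points are the ones in the WebDriver key table; the function name and return shape are hypothetical.

```python
# Sketch only. Code points come from the WebDriver key table; the function
# name and the (character, modifiers) return shape are hypothetical.
NULL_KEY = "\uE000"
MODIFIERS = {
    "\uE008": "shift",
    "\uE009": "control",
    "\uE00A": "alt",
    "\uE03D": "meta",
}


def track_modifiers(text):
    """Return (character, held modifiers) pairs for each non-modifier character."""
    held = set()
    typed = []
    for char in text:
        if char == NULL_KEY:
            # The null key releases every modifier that is currently held.
            held.clear()
        elif char in MODIFIERS:
            # Adding to a set makes a duplicate modifier a no-op.
            held.add(MODIFIERS[char])
        else:
            typed.append((char, frozenset(held)))
    return typed
```

With the input Shift, "a", Shift, "b", null key, "c", both "a" and "b" are typed with shift held and "c" is typed with nothing held.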
[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2018/10/29 11:36:40 $
