TPAC 2022 - Browser Testing and Tools

Meeting minutes

<BrandonWalderman> Hi folks. Zoom wants a passcode but I don't see one at https://www.w3.org/events/meetings/d7ec73b1-f9b3-4748-a508-eec1f9751d9a how can I join the meeting?

https://us02web.zoom.us/j/3919679751?pwd=SGY5WFpSSFBRN0lyVFo0MGhFT1R3QT09

that worked for me

<sadym> I'm still in Berlin

<BrandonWalderman> "meeting host will let you in soon"

<BrandonWalderman> thanks im in

https://www.w3.org/wiki/WebDriver/TPAC-2022

<JimEvans> I’ve been caught away from my office. I will join once I return. Perhaps in about 45 minutes.

ARIA-AT progress update

<zcorpan> https://docs.google.com/document/d/1zALOhEIOn8BIdId3oO8nqjdgDz-hWjjLSj7Do7db1gU/edit?usp=sharing

sethtompson: I would like to start presenting
… [introductions of team and working on ARIA]
… the problems with testing ARIA at scale are similar to those of wpt
… and test262
… we joined the BTT wg last December to explain our plan and ideas
… [explains what they wanted and the protocol]
… we have a draft protocol document that is very similar to webdriver bidi
… we want to reuse as much as possible for ease and interoperability
… there is a link to a bikeshed preview

<karlcow> https://github.com/w3c/aria-at-automation

<zcorpan> Spec preview https://pr-preview.s3.amazonaws.com/w3c/aria-at-automation/pull/25.html

sethtompson: we've added protocol, the vendor specific settings, and then the bulk of the work is a milestone on how we capture the spoken word
… and the last milestone we have reached is methods to simulate keypresses
… this is the bare minimum required to be able to do automation
… there is more work that we would like to do
… the remaining milestones are activating commands, internal states, and headless mode
… this is the overview so far
… and we have started working on NVDA
… and seeking a 2nd implementation
… questions for BTT: soliciting feedback from this wg
… working relationship between the two specs

zcorpan (IRC): that covers it... the integration between ARIA-AT and Webdriver is an open question
… it seems useful to be able to use both protocols in a test

jgraham (IRC): having read the spec there are a number of areas that copy webdriver
… it would be good to have a way for you to reference ours instead of duplicating things
… and since we're going to be speaking to 2 different applications at the same time, there will be 2 ports open in 2 applications and I can't see there being any major issues

zcorpan (IRC): I agree this seems possible
… I guess the question is how we would set this up in practice
… how would you set up a session that you can use both tools at the same time... it could be tricky

jgraham (IRC): if you wanted to multiplex this over 1 connection then this could get tricky
… I am assuming that you want to talk to a service on 1 port, and then it can get tricky

<jgraham> AutomatedTester: I think part of the problem is possibly not solvable by this group, but it might be useful to chat to the Selenium community who have experience of building out this kind of scalable system. Ports are fairly cheap, so it might not be a major issue [to use two]. I'm happy to facilitate such a discussion if you want.

zcorpan (IRC): to jgraham (IRC)'s point on duplication: since we only have 1 implementation that seems fine, but in the long term we should have it live in this WG

automatedtester: I don't see an issue with this living in this wg but I will need to speak to Mike

jgraham (IRC): we will definitely take patches on how to handle the references and work from there. I think we will need to just figure out the logistics

WebDriver-classic review process

<jgraham> Github: https://github.com/w3c/webdriver/issues/1681

jgraham (IRC): I will take this. The topic is about getting review to the webdriver spec
… there are 2 reasons this is relevant
… 1) we are adding features and bug fixes to the classic support
… 2) and this can block bidi
… in bidi where there are things duplicated we should reference things in the classic
… so we have a few issues that are waiting
… and there are PRs blocking working on bidi
… as it says in the issue
… there are only 3 codeowners... are we only expecting these people? How can we add new reviewers?

<jgraham> AutomatedTester: Are those the only people who can review? No. If there are other people we should add them, and help them get started. One suggestion is that if these items are blocked and discussed in the BiDi editorial meeting and there's consensus then we could use that approval to get them landed. I'm happy to help make the process simpler.

jgraham (IRC): so my question for sadym (IRC) and Brandon Walderman: what can we do to make you feel comfortable in reviewing these PRs?

Brandon Walderman: for my role in Edge on bidi, things have been scaled back at MS
… I am happy to help with reviews but can't commit to more

sadym (IRC): I have the opposite issue, I can get more into the reviews... the issue is trying to get up to speed but I am happy to get up to speed more

patrickangle: I would also be happy to review classic as I have been working on that implementation (at Apple)

Sam Sneddon [:gsnedders]: some of the reason I haven't done this is because there are items that have made things just sit there and I haven't felt like doing more... but if we are actively doing things then I am happy to help too

jgraham (IRC): I think we have a comms issue here and we need to make sure that we flag these things in editorial meetings moving forward
… maybe we set a process for review priority
… the other question is around the codeowners: do we want to keep it or delete it?

<jgraham> AutomatedTester: Happy to move towards a BiDi approach

<jgraham> AutomatedTester: We should try to keep things as simple as possible and use the same processes for both specs

sadym (IRC): a bit off topic: what is the process? Do I try to prototype it and then give a review?

jgraham (IRC): from the classic point a lot of things are refactors in the queue, so we don't need to have people trying to implement it

ACTION: remove CODEOWNERS from the repo

ACTION: add label for items that need to be reviewed in the next editorial meeting

Actions IME support

github: https://github.com/w3c/webdriver/issues/1683

jgraham (IRC): This is also relevant to webdriver classic
… the issue here is a proposal on how to handle ime input in webdriver
… IME is input method editor
… it is commonly used in languages where you can't type the characters directly
… [describes examples]
… there are a lot of web compat issues in editor libraries because they can't test IME
… [describes input breakage in Gecko]
… for those who have heard of Interop 22... part of that is working on interop in input
… in webdriver, the lowest-level input is actions, which allow you to send keyboard events, pointer events and so on
… with IME you press a key, the IME intercepts it, and a different event is fired, e.g. A would change it to the keycode and then do composition
… the webpage gets composition events
… [explains different composition methods]
… the proposal is we add a new input type called IME
… this has 2 actions, `compositionUpdate`
… the other action is `compositionEnd`
… so the webdriver-specific thing that's not clear is how these things hook together
… [explains IME and Keyboard]
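
A minimal sketch of how the proposed input source might look inside an actions payload. Only the action names `compositionUpdate` and `compositionEnd` come from the discussion above, and the outer shape (a list of input sources, each with `type`, `id` and `actions`) follows the existing WebDriver actions format. The source type string `ime`, the `data` field, the id `ime0` and the helper `ime_source` are assumptions made for illustration and are not taken from the proposal text.

```python
import json


def ime_source(source_id, steps):
    """Build one hypothetical 'ime' input source from composition steps.

    Every step except the last becomes a compositionUpdate; the last step
    is the committed text and becomes a compositionEnd.
    """
    actions = [{"type": "compositionUpdate", "data": text} for text in steps[:-1]]
    actions.append({"type": "compositionEnd", "data": steps[-1]})
    # "ime" and "data" are assumed names, not confirmed by the proposal.
    return {"type": "ime", "id": source_id, "actions": actions}


# Typing "ka" with a Japanese IME: the composition string goes "k", then "か",
# and is then committed as "か".
payload = {"actions": [ime_source("ime0", ["k", "か", "か"])]}
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The point of the sketch is that the client describes the composition steps directly and the browser emits the matching composition events, so no IME has to be installed on the machine.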

<jgraham> AutomatedTester: Historically WebDriver (Selenium) had IME support built in. It was handled by actions trying to inject directly into the event queue. There was special C++ code required to handle it. That's why we didn't do this and focused on US keyboard input. We did allow actions to handle sending specific unicode characters so you could input final composed characters.

<jgraham> AutomatedTester: Required specific install on the machine.

<jgraham> AutomatedTester: Is it easier to implement now?

<jgraham> AutomatedTester: High level actions seem OK, but is it implementable?

jgraham (IRC): this is a case that benefits from being supported directly in the browser
… the proposal as it is at the moment... it's a mid-layer proposal
… we won't go to the OS IME
… we will provide enough data to the browser so it could inject the relevant events
… this should be implementable and it can be implemented in gecko
… [explains how we need to maintain some states]

Brandon Walderman: I support this feature request
… we had an intern do some of this in Chromium for CDP
… the building blocks are already in chromium so it's a case of adding this to chromedriver

Lan Wei: I was working on the actions implementation
… we have looked at this and it's very hard to implement
… could you explain the client API?

jgraham (IRC): so from the point of view of a webdriver user
… it doesn't ever interact with an IME on the machine
… we will emulate it

Lan Wei: do you have language type as an input?

jgraham (IRC): the proposal doesn't have a way to handle any configurations... e.g. different IMEs handle different combinations to get a different order of events

Lan Wei: do you have any plan on when we want to work on this API?

jgraham (IRC): since this is part of Interop 2022, there is pressure to get this done quickly
… we would love feedback now

<Zakim> karlcow, you wanted to ask about gecko on different platforms

karlcow (IRC): I wanted to ask jgraham (IRC) ...do we need a different test per platform?

jgraham (IRC): if platform IMEs handle things differently, then a test per platform?

karlcow (IRC): how do you make this universal?

jgraham (IRC): this is very hard...
… it won't address all cases but it's an improvement since we have zero way to test

Break for 15 minutes

Shared element references

github: https://github.com/w3c/webdriver/issues/1594

jgraham (IRC): the title of this issue is very misleading
… for webdriver when we return an element or send an element
… we check whether the element is in the DOM and attached
… if not we return `StaleElementReference`
… when working on trying to make sure that the cache of elements is shared between classic and bidi
… for bidi we have decided that we don't want to add this
… [explains example of across windows]
… [explains that we shouldn't be allowed to share things across origins]
… the question is how do we explain this in the spec

automatedtester: what does this mean by 'accessible'?

jgraham (IRC): this means browser context group I think
… the questions are "Do we disagree with the assumptions in there?"
… and "how does this relate to PR 180 in the bidi spec"

<jgraham> https://github.com/w3c/webdriver-bidi/pull/180

<jgraham> AutomatedTester: If we remove StaleElementReference, and make it per browsing context group, what is the impact on compat?

<jgraham> ... this could impact Selenium users.

jgraham (IRC): I forgot to explain, this wouldn't have any normative changes. WebDriver would still work as it does today

<jgraham> But WebDriver-BiDi would allow a superset of the WebDriver behaviour

Jim Evans: as I am thinking about it and I will talk mostly from Selenium use case
… if we look at a SPA where doing an action on the page that causes a refresh
… people are using that stale element as a synchronization point
… in the classic we aren't changing things but how would we create this type of test in the bidi world

jgraham (IRC): the first thing: classic doesn't change
… at a high level we should be encouraging people to look for specific events that they can use
… instead of using polling
… if people pass in an element id and it's been GC'ed then it wouldn't be able to deserialise and would error
… and since we want to implement classic on top of bidi we need to make sure it works

Sam Sneddon [:gsnedders]: so would the connected check be an extra bidi call?

jgraham (IRC): yes as things are now

Sam Sneddon [:gsnedders]: so it looks like we need to implement it but it's implementable
… [discusses a possible solution with tree walking]

jgraham (IRC): I'm not concerned about our ability to support these use cases
… the question is what we want bidi to do and the state of where sadym (IRC) is at

Jim Evans: This does answer my question. I wonder if there is mileage in having an event sent when an element is removed from the DOM

jgraham (IRC): in the spec there is serialization of a node
… in terms of getting an event when a node is removed
… I would expect people to install a script to do mutation observers
… it might not be a primitive but a combination of features

sadym (IRC): so when I wrote the spec for shared ID
… I wrote it that it didn't care if the element was in the DOM but where it is now

jgraham (IRC): my understanding of shared id is that as long as it hasn't been GC'ed it will be able to send the element back
… so you can see if the element is there by trying to deserialise it, and if you can't then we can error
… or if we return something we can see if the parent is a document to see if it is in the document
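
A minimal sketch of the two behaviours being contrasted, modelled in plain Python rather than against a real browser. The classes `Node` and `NodeCache`, the function `classic_lookup` and the id `shared-1` are hypothetical names for illustration; the weak-reference cache is an assumption standing in for "the id resolves as long as the node hasn't been GC'ed", and `is_connected` mimics the DOM's `Node.isConnected` by walking up to the root.

```python
import weakref


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def is_connected(self):
        # Walk up the parent chain; connected means the root is a document.
        node = self
        while node.parent is not None:
            node = node.parent
        return node.name == "#document"


class NodeCache:
    """Maps shared ids to nodes without keeping the nodes alive."""

    def __init__(self):
        self._nodes = weakref.WeakValueDictionary()

    def add(self, shared_id, node):
        self._nodes[shared_id] = node

    def resolve(self, shared_id):
        # BiDi-style lookup: succeeds for any node that still exists,
        # whether or not it is attached to a document.
        node = self._nodes.get(shared_id)
        if node is None:
            raise LookupError("no such node")
        return node


def classic_lookup(cache, shared_id):
    # Classic-style lookup: the same resolution plus a connectedness check.
    node = cache.resolve(shared_id)
    if not node.is_connected:
        raise RuntimeError("stale element reference")
    return node


document = Node("#document")
body = Node("body", document)
button = Node("button", body)
cache = NodeCache()
cache.add("shared-1", button)

print(classic_lookup(cache, "shared-1").name)
button.parent = None  # detach the node, as an SPA re-render might
print(cache.resolve("shared-1").name)
```

After the detach, `cache.resolve` still returns the node while `classic_lookup` would raise, which is the observable difference discussed here: a classic implementation layered on BiDi has to add the connectedness check itself.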

<gsnedders> https://developer.mozilla.org/en-US/docs/Web/API/Node/isConnected

<jgraham> AutomatedTester: A question I have is to get to the same state as StaleReference might require 3 wire calls. That's a lot of round trips for cloud providers. Is that correct?

jgraham (IRC): if you are implementing element selected? In this case you could do this as an execute
… but if it's returned from a different method classic would fail quickly but bidi would need to do an extra call
… but in the common case it should hopefully be equal between the two modes

sadym (IRC): what is the scenario when the user wants to get the stale state in bidi
… to me it seems exotic
… there is a web api

jgraham (IRC): [repeats Jim Evans example and explains]
… there is a DOM api that people can use to check the connectedness
… the observable difference is in classic, if you find an element and refresh, it would error
… in bidi it wouldn't error without doing a number of checks
… we could not implement this info but it makes classic on top of bidi slightly more complex
… and it could be racy

<Zakim> gsnedders, you wanted to mention why we have the connectedness test originally

Sam Sneddon [:gsnedders]: why do we have the connectedness tests?
… provided we aren't concerned with implementing this without weakmaps then we don't need to worry
… [explains example web component in a SPA]
… can we change the semantics of stale reference? I expect no

automatedtester: no we can't

<jgraham> gsnedders' use case of a SPA implementing a component cache is useful.

Jim Evans: for clarification I will reiterate
… there are a number of APIs for element state and element interaction that will require an extra round trip

jgraham (IRC): not necessarily

Jim Evans: in the case of element rect example, if people do that and it's not attached they will get a stale reference and then people do that extra test
… specifically for webdriver classic on bidi

Sam Sneddon [:gsnedders]: if we need to walk the tree then in bidi it would be very expensive

jgraham (IRC): in the script we can just add 1 more line for connectedness
… but in the executescript case it will be a lot more that needs to be done
… [discussion around executeScript and how this could impact networks etc]

<jgraham> Question is "can we ever end up running user code during JSON serialization in classic that would allow us to observe the difference between bailing early with a stale element reference and doing the full serialization and then later checking for stale elements?"

sadym (IRC): you can get a stale reference when you try to access that element, is that correct?

<gsnedders> jgraham: Oh, no, we do serialise accessor properties, not just data properties, so "yes".

jgraham (IRC): I think it might also error when doing a serialisation

sadym (IRC): so chromedriver keeps a map of the guid and tries to access it on cdp and if it's not there it throws stale element

jgraham (IRC): if chromedriver isn't following the spec exactly here we might be able to change the semantics slightly and get away with it

<jgraham> https://w3c.github.io/webdriver/#dfn-json-clone is the serialization algorithm from classic and it does check if the element is stale

jgraham (IRC): sadym (IRC), do you have a status update on PR 180?

<jgraham> https://github.com/w3c/webdriver-bidi/pull/180

sadym (IRC): I have an action to look at how chromedriver should work
… I have changed priorities since then so would need to check on it

jgraham (IRC): do you have any outstanding items on that pr?

sadym (IRC): I can't remember the state of this
… I think there are actionable items there that I need to work on

Closing the browser

github: https://github.com/w3c/webdriver-bidi/issues/119

jgraham (IRC): The question on this topic is "should we allow people to close the browser?"
… forcible quits might not clean up things as people expect
… since we aren't stopping multiple sessions then 1 session could kill other sessions
… the second question is "Should closing the last browsing context shut down the browser?"

Sam Sneddon [:gsnedders]: on macOS, applications are singletons, so safari automation is running in the same safari as users
… but quitting the entire browser kills all sessions and I think that is unacceptable

Jim Evans: I want this primitive
… in the last 2 days I have been working on a bidi client
… I am able to use Firefox to get things working
… but I don't have a mechanism to kill the browser that could leave things in an incomplete state

sadym (IRC): my question is why shouldn't we just close the windows that automation has control over

Sam Sneddon [:gsnedders]: when the session is closed we close the windows under automation control?

sadym (IRC): if there is more than 1 session?

Sam Sneddon [:gsnedders]: we only allow 1 session
… but I will defer to patrickangle

patrickangle: we have things we need to untangle to support multiple sessions

sadym (IRC): for quit we only need to close the automation, that should be fine?

Sam Sneddon [:gsnedders]: it depends on how we spec the semantics

jgraham (IRC): I agree what people want for desktop is that the browser process has shut down
… and it makes sense that this is not always implementable
… in bidi one could observe all the windows
… but I would expect safari would split what you can and can't observe
… for most use cases it would be sufficient that we just say that all automation windows are closed and shutdown
… I would expect that we spec out that we close all the automation windows
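
A minimal sketch of the semantics described here, as a toy model rather than any browser's implementation: ending automation closes only the windows under automation control and drops the WebDriver state, and the browser process can only go away if no user windows are left. The class `Browser`, the method `end_automation`, the window ids and the `processMayExit` field are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Browser:
    # Maps a window id to True when the window is under automation control.
    windows: dict = field(default_factory=dict)
    sessions: set = field(default_factory=set)

    def end_automation(self):
        """Close automation windows and drop all WebDriver session state."""
        closed = sorted(w for w, automated in self.windows.items() if automated)
        for window_id in closed:
            del self.windows[window_id]
        self.sessions.clear()
        # The process can only exit when no user windows remain open.
        return {"closed": closed, "processMayExit": not self.windows}


browser = Browser(
    windows={"user-1": False, "auto-1": True, "auto-2": True},
    sessions={"session-a"},
)
result = browser.end_automation()
print(result)
```

In the singleton-application case raised for macOS, the user's own window survives and the process keeps running, so a client cannot assume that ending automation means the browser has exited.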

Sam Sneddon [:gsnedders]: it's not just safari that is a singleton app... all applications are on macOS
… so all browsers can have this problem, and the ways you can get around it would be weird

Jim Evans: I think your assessment that the close on the browser closing all the automation windows is fine
… but from a test author point of view
… I want to ensure that when the test is finished the machine is restored to the same state
… if this isn't the case then it will cause issue reports if we leave anything behind
… and those issues will just be forwarded to the spec

jgraham (IRC): I think the semantics are that we drop the state of anything related to webdriver
… I think I agree that people will find it weird if we don't kill things
… but there are times where we can't fully kill things e.g. FirefoxOS/ChromeOS/KaiOS
… I think we can add something to help clients "clean up the webdriver state"

<Zakim> gsnedders, you wanted to mention alternative suggestion

Sam Sneddon [:gsnedders]: my proposal is we have a flag that opts into closing all browsers that is fallible in an implementation specific way
… [explains how this could look using Safari based on state of different things]

jgraham (IRC): yea. that's fine, but it also doesn't need a flag
… we need to think about mobile and we might not be able to get PID etc and wait for that
… or return that info

<Zakim> JimEvans, you wanted to discuss a quick poll of implementors before we end

Jim Evans: the bidi spec has a definition of a bidi only session
… Firefox has an implementation
… can we expect bidi only sessions in all browsers?

Sam Sneddon [:gsnedders]: we can't comment on what is going into a future safari release

sadym (IRC): technically you can do that now in chrome

<sadym> https://github.com/GoogleChromeLabs/chromium-bidi

Brandon Walderman: edge will follow chromium

Sam Sneddon [:gsnedders]: going back to the comment on the option: a capability to request a completely isolated instance would make sense
… for most that's an easy capability to satisfy

<sadym> to @JimEvans: WebDriver BiDi in Chromium could be used via ChromeDriver (classic+BiDi) or via Mapper (BiDi only)

<sadym> https://github.com/GoogleChromeLabs/chromium-bidi

jgraham (IRC): I am trying to think how a client would do things. Clients would have to understand what they are speaking to
… we need them to expect to handle all of this and not just call end and assume something else has handled it
… in the first case we don't have a return value or similar
… there are differences between OSes and people need to be aware
… I think we should just say that it cleans up the automation state
… we can figure out the wording
… we should say something about that we don't guarantee events of things being shutdown will be sent over

<Zakim> gsnedders, you wanted to also mention non-browser cases

Sam Sneddon [:gsnedders]: I wanted to point out that we can't guarantee that it will be a full browser, it could be a web view in an app
… it would be good to make sure that we can connect to an embedded webview

sadym (IRC): how do we test it with wpt? I am guessing we won't?

– DRAFT –
15 September 2022

Attendees