WebDriver – 08 December 2021

Meeting minutes

<jgraham> RRSAgent: make logs public

<jgraham> RRSAgent: make minutes

<foolip> The meeting has not started?

<brwalder> doesn't look like it

<whimboo> david might not be around yet

<foolip> jgraham: I think I'll need to bail after not too long, in case you want to front load anything you'd expect me to have useful feedback on.

ARIA-AT Automation API

s3thThompson: Hi everyone, I am a product manager at Bocoup

<s3ththompson> https://aria-at.w3.org/

s3thThompson: ARIA-AT is a CG project to help with
… the automation of a11y tools
… we have written a large set of manual tests on how screen readers should work
… and we would like to automate as much as possible here
… and we want to see if we can have a "screen reader driver"
… we have created ways to do interactions and voice capture from the readers
… but what we want to do is get screenreader vendors to join and help with making interop better
… this project is focused on conformance and interop testing
… What we wanted to share quickly is an API explainer
… and what the scope of the AT Driver could be

<s3ththompson> https://github.com/bocoup/aria-at-automation

zcorpan: As Seth has said, the use case is to automate screenreader testing so that screenreader vendors can make sure that things work
… [discusses the states that it would need to hook into]
… and we don't want to duplicate things that webdriver can already do
… and we want to simulate interactions and other capabilities at an OS level.
… right now, this is just an explainer, and we want to see if there is multivendor interest in getting this standardised

jugglingmike: Simon has already covered most of the things I wanted to discuss
… and we have a lot of experience with webdriver and can see some crossover

jgraham: I glanced over the explainer and there are a few things

jugglingmike: The tool we are making is focused around a websocket and we haven't got the whole creation sorted
… we are watching the webdriver-bidi spec with interest

automatedtester: What's the difference between this and AOM, since AOM was supposed to allow programmatic access to the browser?

zcorpan: AOM is only about the browser-level part of the a11y stack and we want to test everything, including the OS level and what the screen reader actually says

jugglingmike: I can imagine ARIA-AT doing a good enough job that we might not need to worry about the variation between tools
… there is a lot of leeway around what verbiage screenreaders can use

foolip: Simon, you asked if there was multiple vendor interest. What do we need for that? What's the role of Browser Vendors here?

zcorpan: The screenreader and the browser are going to be run together in an OS and we are going to need a number of people here. The most affected vendors are the screenreader vendors, and we want to make sure that we have stronger signals of interest from them
… but we still need browser vendors interested too

foolip: Do we need browsers to change anything or just prioritise the bugs that are likely to come out of this?

zcorpan: I don't think that browser vendors are missing capabilities right now to get things started

<foolip> My question sounded rather skeptical, so I want to say this seems like a reasonable approach, hope it works!

automatedtester: What is the ask of this group that we can help with?

zcorpan: we have a few questions so Mike, can you help?

jugglingmike: we keep talking about keypresses. One thing we would like to see is whether there is a way that we can get proper OS events
… and then there are session capabilities
… [discusses different binaries and window and sessions and how they can map together]

simonstewart: re: keypress at OS level. We tried to get there as close as possible but it's described as best we can in W3C language.
… the concept of a session. We created sessions the way we did so we can better utilise computer resources, so people can have a lot of tests running in parallel on a single machine

jgraham: re keypress
… things are currently done in the browser event loop instead of the OS event loop
… we have sandboxing and it's going to be scary for browsers to change that
… and sessions: there is a bit of state around a session that needs to be maintained
… and it makes sense to do what you said

s3ththompson: This has been useful and it would be great if people can read this

Communication channel from page scripts to the automation client.

github: https://github.com/w3c/webdriver-bidi/issues/157

sadym: The first topic I wanted to discuss was around what James brought up around the comms channel
… there is an example in the discussion linked
… [discusses the example] and James had some ideas that I liked

jgraham: To put this into context, we discussed having a sandbox value and [discusses generation of custom events]
… this would be an alternative to the other way, as we can have a callback that is called
… this will be simpler from a spec prose perspective as we don't have to get webidl to do things it doesn't normally do
… one concern that could be there is "what happens to scripts that are loaded on page load". It's not going to be impossible, just going to be "fun"
… I will write on the issue for deeper discussion
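A minimal sketch of the callback idea discussed here: the client calls a function with a callback argument, and each call to that callback on the remote end is turned into an event. All names below (`FakeRemoteEnd`, `ChannelEvent`, `callFunction`, the `channel-N` ids) are hypothetical and are not taken from the WebDriver BiDi spec; the "remote end" is simulated in-process.

```typescript
// Hypothetical sketch: none of these names come from the WebDriver BiDi spec.
type ChannelEvent = { channel: string; data: unknown[] };

class FakeRemoteEnd {
  private listeners: Array<(event: ChannelEvent) => void> = [];
  private nextChannelId = 0;

  // The client subscribes to events emitted by the remote end.
  onEvent(listener: (event: ChannelEvent) => void): void {
    this.listeners.push(listener);
  }

  // Runs `fn` "in the page", passing it a callback. Every call to that
  // callback becomes an event that carries the call's arguments.
  callFunction(fn: (callback: (...args: unknown[]) => void) => void): string {
    const channel = `channel-${this.nextChannelId++}`;
    const callback = (...args: unknown[]): void => {
      for (const listener of this.listeners) {
        listener({ channel, data: args });
      }
    };
    fn(callback);
    return channel;
  }
}

const remote = new FakeRemoteEnd();
const received: ChannelEvent[] = [];
remote.onEvent((event) => {
  received.push(event);
});

// The page script reports back twice through the same callback.
const channelId = remote.callFunction((notify) => {
  notify("dom-ready");
  notify("clicked", 3);
});
```

In this reading, no new page-visible API is needed: the callback is an ordinary function argument, and the event stream is keyed by the call that created it.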

Discussion: need for a browsingContext.findElement command

github: https://github.com/w3c/webdriver-bidi/issues/150

<jgraham> If people have concerns about that model it would help to post them on the issue

<jgraham> I think we're likely to get a PR soon

<simonstewart> That looks like it's based on https://chromedevtools.github.io/devtools-protocol/tot/Runtime/#method-addBinding, right?

sadym: apparently in Selenium, we have a number of ways to find elements but these can be done via script eval
… do we really need a find element moving forward?

<jgraham> simonstewart: it's a different approach to the same problem, you do callFunction(function, args=[callback]) and whenever the remote end calls callback it generates an event with the arguments

automatedtester: What would happen on things like closed shadow dom ?

sadym: It probably doesn't cover the shadow dom when it's closed

jgraham: I disagree, we can return a shadow root value to the element and then just go down that way

simonstewart: the thing that findElement gives us is separation of concerns
… and in clients we can get people building their own locator types
… e.g. FindElementByARIA
… separating the interface is generally good design and we can have people return element ID to be passed into other things later on
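The two positions above can be combined in a client: a find-element interface that keeps locators separate, implemented on top of script evaluation. The sketch below is only an illustration; `Locator`, `Evaluate`, `findElement`, `byCss` and `byAriaLabel` are hypothetical names, and the evaluator is a fake that records scripts instead of talking to a browser.

```typescript
// Hypothetical sketch: these names are illustrative, not a real client API.
type ElementId = string;

// A locator only knows how to turn itself into a script expression.
interface Locator {
  toScript(): string;
}

// Stand-in for a script-evaluation command that returns an element reference.
type Evaluate = (script: string) => ElementId | null;

const byCss = (selector: string): Locator => ({
  toScript: () => `document.querySelector(${JSON.stringify(selector)})`,
});

// A client-defined locator type, in the spirit of "FindElementByARIA".
const byAriaLabel = (label: string): Locator =>
  byCss(`[aria-label=${JSON.stringify(label)}]`);

// The find-element interface stays separate from how lookup is done.
function findElement(evaluate: Evaluate, locator: Locator): ElementId | null {
  return evaluate(locator.toScript());
}

// Fake evaluator: records each script and returns a made-up element id.
const sentScripts: string[] = [];
const fakeEvaluate: Evaluate = (script) => {
  sentScripts.push(script);
  return `element-${sentScripts.length}`;
};

const saveButton = findElement(fakeEvaluate, byAriaLabel("Save"));
```

The returned id can then be passed to later commands, which is the separation of concerns argued for above, while the lookup itself is just an evaluated script.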

<jgraham> I note again that closed shadow roots should work with the script execution approach :)

<simonstewart> I was looking for an example that might work :)

<simonstewart> Point noted, jgraham!

Remote Object Lifecycle

github: https://github.com/w3c/webdriver-bidi/issues/90

jgraham: We now have the ability to execute script and return references to objects that are owned by the ECMAScript engine
… at the moment, the way the spec is written, if you return an array we return a reference to it
… and if that is GC'ed, then when we try to access it this could cause problems, as things could have been thrown away
… we can have strong references that are associated with the global, so that as long as the global exists that's fine
… or we have strong references to root objects and weakmaps for the rest
… or the CDP way of object groups
… or the CDP way of object groups
… I have written up a comment on this
… Please can people read through this and write it up in the issue
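As a rough illustration of the first option (strong references tied to the global), a remote end could keep a per-realm table of handles that is dropped when the global goes away. This is a sketch under assumptions: `HandleTable`, the `realm-1` id and the `handle-N` format are hypothetical names, not spec text, and an explicit `release` is included with non-GC clients in mind.

```typescript
// Hypothetical sketch: `HandleTable` and its methods are illustrative names only.
class HandleTable {
  // Strong references, grouped by the global (realm) that owns the objects.
  private byRealm = new Map<string, Map<string, object>>();
  private nextId = 0;

  // Store a strong reference and hand back an id the client can use later.
  hold(realmId: string, value: object): string {
    const handle = `handle-${this.nextId++}`;
    let handles = this.byRealm.get(realmId);
    if (handles === undefined) {
      handles = new Map<string, object>();
      this.byRealm.set(realmId, handles);
    }
    handles.set(handle, value);
    return handle;
  }

  // Look up the object behind a handle, if it is still held.
  resolve(realmId: string, handle: string): object | undefined {
    return this.byRealm.get(realmId)?.get(handle);
  }

  // Explicit release, for clients that manage lifetimes by hand.
  release(realmId: string, handle: string): boolean {
    return this.byRealm.get(realmId)?.delete(handle) ?? false;
  }

  // When the global goes away, every reference it owned is dropped at once.
  destroyRealm(realmId: string): void {
    this.byRealm.delete(realmId);
  }
}

const table = new HandleTable();
const items = [1, 2, 3];
const arrayHandle = table.hold("realm-1", items);
const beforeDestroy = table.resolve("realm-1", arrayHandle);
table.destroyRealm("realm-1");
const afterDestroy = table.resolve("realm-1", arrayHandle);
```

The trade-off of this option is that nothing is collected until the global is destroyed or the client releases handles explicitly; the weakmap and object-group options mentioned above address that differently.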

simonstewart: we need to make sure we think about clients that are written without GC in mind when doing the work here

<jgraham> RRSAgent: make minutes
