WebDriver June 2023 – 14 June 2023

Meeting minutes

https://www.w3.org/wiki/WebDriver/2023-06-BiDi-roadmap

WebDriver BiDi status in browsers

whimboo: I will give a brief overview of the Firefox implementation
… we have implemented network logging events
… we also implemented the preload scripts API
… this can be used for script pinning in selenium
… we have also added the full user interaction APIs
… we support all inputs except pen at the moment
… puppeteer has added capability matching for session.new
… we are next going to be working on ending a session
… and jgraham is working on the network interception part of the specification
… and will also start working on modal dialogs

sadym (IRC): from Google, we have been working on screenshots and PDF generation
… we are working actively on network events and network interceptions
… we are also working on proper serialisation
… and we are working on the user inputs APIs

Sam Sneddon [:gsnedders]: we, webkit, have not shipped anything yet and we do not comment on any future releases.

WebDriver BiDi implementations in browsers

whimboo: before we start on the roadmap I wanted to talk about the following, as it will help with the implementation
… I have created the following spreadsheet https://docs.google.com/spreadsheets/d/1bkiPU5eDBCqFkx5p_VSBx_OK8gy9TeHRKQVPHKMATGQ/edit?pli=1#gid=0
… we have been working on this and documenting which features are implemented and in which version each was implemented
… so would others find it helpful to keep this up to date?

sadym (IRC): we have an internal document that is similar and we are happy to add more details to this to keep it up to date

Puppeteer for Firefox over WebDriver BiDi

whimboo: this spreadsheet is more targeted at puppeteer and firefox https://docs.google.com/spreadsheets/d/1jFODscDeaqqnXC3xzMNt2biX0eY7kZ9e-KTs5GPhjG4/edit#gid=0
… we are tracking all the work that is needed to support all versions of puppeteer over bidi instead of CDP
… if the Selenium folks want to have similar discussions and documents that would be good but can be done later

Roadmap

github: w3c/webdriver-bidi#403

mathiasbynens (IRC): I will get us started on this
… There are some draft PRs against the spec of things that need working on
… we have some good use cases of what people want

jgraham: one weakness of what we have done in the past is that we have looked at abstract use cases and not always the client use cases
… e.g. setEmulation is needed for puppeteer to work and isn't on our roadmap
… so I think we need to focus on what we can do to get clients using bidi in the short term

simons: one thing that we should have is a reformulation of webdriver classic on top of webdriver bidi
… but figuring out what is missing would be a rich seam of work for us moving forward

whimboo: the last time we did this we started with Selenium and puppeteer needs and worked out the priority from there

mathiasbynens (IRC): the puppeteer needs are documented in this thread here. The top of the issue has things in the order of what we need

simons: could you reformat what you're saying into a use case
… the use case is mostly for those writing clients and not the end user writing tests
… in terms of missing functionality, I don't know what's missing but we should try to figure out if there are things

mathiasbynens (IRC): but that makes a new audience moving forward

simons: yes I agree

jgraham_ (IRC): the classic spec has a lot of good use cases that people need
… and there is a 2nd level of detail, which is lower priority: the ability to make chromedriver use bidi on its own
… and I do think that "it must be an end user use case" is ok but I think we can do better
… I think that making the clients work is a much better approach
… the priority here is to make sure that we can work and not all use cases come into it. e.g. setEmulation

mathiasbynens (IRC): should we add simons' item on classic reformulation?

jgraham_ (IRC): perhaps but maybe that should be more of a process
… I think we should use this and perhaps spreadsheets

<whimboo> https://docs.google.com/spreadsheets/d/1fmPugULnRlsuUGHLnjTdlUWU01X4kkVfrLjj96w_Tcc/edit?usp=sharing

Sam Sneddon [:gsnedders]: when we are working on things defined in webdriver classic, are we discussing the spec proper or also the extension points to that spec in other specifications?

jgraham_ (IRC): yes
… some of it is purely browser testing rather than web app testing but I do believe that it is in scope

<jgraham> https://docs.google.com/spreadsheets/d/1Cg3rifrBZClIitU3aFW_WDv64gY3ge8xPtN-HE1qzrY/edit#gid=0

<gsnedders> whimboo: you changed "Print screenshot" to "Capture screenshot", but that has different sets of existing implementations in Classic, so we probably need to split them?

<jgraham_> Both print and capture are already in BiDi, so if it's specific subfeatures we should call them out

<jgraham_> Aren't atoms already handled by preload scripts?

<cb> WebdriverIO has a lot of Appium users that run mobile web tests in emu/simulators

<gsnedders> jgraham_: Ah, I missed them being there 🙃

mathiasbynens (IRC): Why don't Selenium/WebDriverIO need device emulations?

automatedtester: Selenium works with those mobile browsers either on real devices or via emulators/simulators?

jgraham_ (IRC): are we talking about viewport or device px or?... I have added a new column to make sure that we have the same understanding of terms

AlexRudenko: for device emulation the minimum is viewport shaping
… users struggle with getting the correct window size because the browser chrome impacts the window size
… in puppeteer it is set as an initial capability
… window size can impact all tabs in the main window
… there is a spec proposal

AlexRudenko: it allows you to make a "temp profile"
… for sandboxing. The issue here is to help handle browser start-up: it creates a "private/incognito" tab, which is much faster than restarting the browser

jgraham: this could be interesting to spec as there is no standard way of describing items between browsers. this might not map to profiles... e.g. containers in firefox might be the same here but we don't know

sadym (IRC): webextensions: we suggest adding the ability to install extensions and access background pages

jgraham_ (IRC): I suggest we split this into 2 tasks
… loading is 1 and what people might do
… and debugging is a very different thing

whimboo: do we also need uninstall and enable/disable?

simons: we definitely want install/uninstall as a priority for this item. Enable/disable are nice to haves

AlexRudenko: puppeteer only allows the installation of extensions

sadym (IRC): get/set/delete of cookies for a specific origin

automatedtester: do we want access to the cookie jar from outside the origin?

jgraham_ (IRC): yes, we need to rethink this as cookies have grown since we did this originally

whimboo: window manipulation: this is similar to device emulation in that we need to manipulate the browser window
… and then we have the window manager impacting this
… I'm not sure if we should split this between resizing and window movement
… I am not sure how useful maximise and minimise are for the window

whimboo: in bidi we can run any commands in any context
… but there are some commands that need that context to be in focus
… e.g. puppeteer needs it for screenshots
… and browsers may throttle what is happening in non-focused browsers, as firefox does

Sam Sneddon [:gsnedders]: related to throttling, there are browsers following OS "low power modes" which can impact the frequency of different events
… this is kinda detached from focus but has similar impacts to being out of focus

whimboo: Firefox doesn't have this impact

mathiasbynens (IRC): Sam Sneddon [:gsnedders] should we add this to the queue?

Sam Sneddon [:gsnedders]: maybe but I'm not sure of the priority here

whimboo: we could maybe add this as a capability

Sam Sneddon [:gsnedders]: more importantly going into lower power mode may break websites so we want to be able to do that somewhere

simons: Frame handling: this boils down to getting a browsing context of a frame. Executing JS might not always work
… classic has commands to do this and it would be good to make them work

jgraham_ (IRC): in bidi you can do this via executeScript
… but there might be a missing part: returning an iframe should return the context. I believe there is an open issue for that

Jim Evans: there is some functionality from the browsing context as that has child browsing context shared

simons: as long as we can formulate classic over bidi here then we should be fine

whimboo: clients should be able to track all of this

sadym (IRC): do we have a use case for finding a node that could be in a child context?

AlexRudenko: we might have a use case in puppeteer especially around clicks but not sure

<sadym> do we have a use case for the finding a node that is a root for the child context?

<sadym> *specific child node

jgraham: a11y inspection: in classic we have getComputed{Label|role}
… there are going to be more requests probably from other groups
… we should at least reimplement classic over bidi here

mathiasbynens (IRC): there is also querying the a11y tree instead of just dumping it

Sam Sneddon [:gsnedders]: There are some investigations into the tree querying by other wg's and querying parent/child relationships might make it into classic at some point

simons: Proxy settings: I can see that we add this and delegate down to new session

whimboo: we don't have wdspec tests atm

simons: Timeouts: we have these in classic but no analogues in bidi yet

jgraham_ (IRC): from my point this is by design as it can be implemented in the spec

simons: not all languages can do this in a sane way
… and then going across the web might cause other issues

unknown: if a person is connecting over long distances and that is causing issues, that's outside scope

simons: not everyone has good UAT systems

unknown: most languages can handle timeouts

simons: a lot of languages can do this, but that then creates a situation where behaviour is not the same in all clients

simons: we can put it anywhere; in classic we put it in the spec as that is where we felt the complexity should live

whimboo: we need to make sure that for timeouts we have all the same items in classic as they have script/navigations etc timeouts
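
Editor's illustration of the client-side option raised in this discussion: a minimal sketch, assuming the client keeps one awaitable per pending command and applies its own deadline. `fake_command` is a hypothetical stand-in for a command sent over the connection, and the error dictionary is an invented shape, not something defined by the spec.

```python
import asyncio


async def with_timeout(pending, seconds):
    """Await a pending command response, giving up after `seconds`."""
    try:
        return await asyncio.wait_for(pending, timeout=seconds)
    except asyncio.TimeoutError:
        # The client decides what a timeout means; the remote end is not told.
        return {"type": "error", "error": "client timeout"}


async def fake_command(delay, result):
    """Hypothetical stand-in for a command sent over the BiDi connection."""
    await asyncio.sleep(delay)
    return {"type": "success", "result": result}


async def demo():
    # One command answers in time, the other exceeds its deadline.
    fast = await with_timeout(fake_command(0.01, "done"), seconds=1.0)
    slow = await with_timeout(fake_command(1.0, "late"), seconds=0.05)
    return fast, slow


fast, slow = asyncio.run(demo())
```

A design consequence, in line with simons' concern: each client library would pick its own deadline and error shape unless these are agreed somewhere.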

jgraham_ (IRC): file upload: for classic we handle input type=file and we probably need to figure out something similar for bidi
… we may also need to handle sending the file to the remote machine to run the file upload

simons: one small note: for uploading on remote machines an intermediary node should handle that and selenium has prior art for that now

sadym (IRC): element location: is this for find element?

jgraham_ (IRC): there are commands for handling the location of the element and we should probably add that to the end

whimboo: isShown: So this is an API to see if an element is visible... but this might also be shared in the atoms line

automatedtester: I think chromedriver uses the atoms here
… and maybe safari

Sam Sneddon [:gsnedders]: I don't think we do

jgraham_ (IRC): form elements: this is specifically for setting complex form input type e.g. <input type=color>
… and we don't have means to set that
… and I am assuming puppeteer has a way

orkon: we do this via script execute and set the value property

mathiasbynens (IRC): there are ways to do it via setting the value property, which works for most things
… but does not work everywhere
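
Editor's illustration of the "set the value property via script" approach: a minimal sketch that only builds the JSON message. It assumes the `script.callFunction` command with a `sharedId` node reference as described in the WebDriver BiDi draft. The IDs are placeholders, and dispatching `input`/`change` events is an assumption about what a page might need, not something stated in the meeting.

```python
import json


def set_value_command(command_id, context, shared_id, value):
    """Build a script.callFunction message that sets an element's value."""
    return {
        "id": command_id,
        "method": "script.callFunction",
        "params": {
            # Runs in the page: assign the value, then notify listeners.
            "functionDeclaration": (
                "(el, v) => { el.value = v; "
                "el.dispatchEvent(new Event('input', {bubbles: true})); "
                "el.dispatchEvent(new Event('change', {bubbles: true})); }"
            ),
            "target": {"context": context},
            "arguments": [
                {"sharedId": shared_id},             # placeholder node reference
                {"type": "string", "value": value},  # serialized string argument
            ],
            "awaitPromise": False,
        },
    }


# Placeholder identifiers; a real client would get these from earlier responses.
message = json.dumps(set_value_command(1, "ctx-1", "node-1", "#ff0000"))
```

As noted in the discussion, this works for most inputs but not everywhere, since it bypasses the browser's own picker UI.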

jrandolf: we might be able to do things with interaction with drag and drop

Sam Sneddon [:gsnedders]: drag and drop doesn't always work on mobile

whimboo: element interaction: this covers all elements
… we have most of this covered by the user interaction APIs
… we probably don't want to carry over element send keys from classic
… and we can remove from this discussion as it's being implemented

jgraham_ (IRC): browser process info: the request is to have system information being sent over so that we can return PID and system usage info

Jim Evans: geo location: the idea that someone wants to be able to set the browser location different to where they are currently testing
… e.g. testing differences between US and EU
… not currently available in classic
… and this also goes for the timezone

mathiasbynens (IRC): yes agreed
… but this also comes to the idea of permissions for geolocation. We should set the value so the prompt doesn't appear to the user if it is already set

sadym (IRC): User Agent: If we split it from device emulation then we would need to set the user agent for the given browser context
… and it is required by puppeteer

jgraham_ (IRC): is getting it different from window.navigator.userAgent?

AlexRudenko: there can be a need for original user agent vs browsing context user agent

whimboo: locale/lang emulation: it is important to be able to change the locale/language so that the relevant content is sent
… this is a request from the playwright team

jgraham_ (IRC): history traversal: This is implemented in classic for moving back/forward through pages

jgraham_ (IRC): user gesture during navigation: it's not gesture in the terms of input

AlexRudenko: in script there are requests to make sure scripts are available before a page load for gestures

<jgraham_> web-platform-tests/rfcs#128

Sam Sneddon [:gsnedders]: there are things in html for handling some of these cases

<jgraham_> whatwg/html#8609

whimboo: screenshots: we have the basic feature but don't have all of it, e.g. we don't have full screen or element screenshots

jgraham_ (IRC): so let's split this to element and others

orkon: I wanted to mention that this could be solved by cropping images

Sam Sneddon [:gsnedders]: what happens if the element is larger than the viewport?

automatedtester: good question but not part of this discussion now :)
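
Editor's illustration of the cropping idea and the viewport question: a minimal sketch, assuming the client already has a viewport screenshot and the element's bounding rectangle in page coordinates. `crop_box` is a hypothetical helper; it shows that a crop can only ever return the visible part of an element, which is why an element larger than the viewport needs separate handling.

```python
def crop_box(element, viewport):
    """Intersect an element rect with the viewport.

    Both rects are (x, y, width, height). The result is relative to the
    top-left corner of the viewport screenshot, or None if nothing is visible.
    """
    ex, ey, ew, eh = element
    vx, vy, vw, vh = viewport
    left = max(ex, vx)
    top = max(ey, vy)
    right = min(ex + ew, vx + vw)
    bottom = min(ey + eh, vy + vh)
    if right <= left or bottom <= top:
        return None  # element lies entirely outside the viewport
    return (left - vx, top - vy, right - left, bottom - top)


inside = crop_box((10, 20, 100, 50), (0, 0, 800, 600))
clipped = crop_box((700, 500, 200, 200), (0, 0, 800, 600))
outside = crop_box((900, 0, 10, 10), (0, 0, 800, 600))
```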

whimboo: click and wait: this is a feature for classic
… if you click an element that has the potential to load a page, classic captures that the page is navigating
… it's tricky for browsers to do this when it could be handled in the client

sadym (IRC): atoms: we use selenium atoms in chromedriver
… do we want to be able to handle things purely in browser or do we want to allow the use of atoms for things?

mathiasbynens (IRC): this goes to prioritization and whether we can do things simply in the client
… imho it’s more important to unlock brand new use cases than it is to spec + implement things that can already be done in a slightly less nice way (on the client or via script evaluation)

jgraham_ (IRC): the spec already supports this in bidi via sandbox and preloadScripts

simons: the atoms are only needed for element displayedness (and get text on top of that)
… so we can probably see if we still really need them
… or a different proxy?
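
Editor's illustration of the preload-script route mentioned above: a minimal sketch that only builds the JSON message, assuming the `script.addPreloadScript` command and its `sandbox` parameter as described in the WebDriver BiDi draft. The function body is a toy visibility check standing in for a real atom, and the sandbox name `atoms` is an arbitrary placeholder.

```python
import json


def add_preload_script(command_id, function_declaration, sandbox=None):
    """Build a script.addPreloadScript message, optionally sandboxed."""
    params = {"functionDeclaration": function_declaration}
    if sandbox is not None:
        # A sandbox keeps helper code isolated from the page's own globals.
        params["sandbox"] = sandbox
    return {
        "id": command_id,
        "method": "script.addPreloadScript",
        "params": params,
    }


# Toy placeholder for an atom; not the actual Selenium displayedness atom.
command = add_preload_script(
    7,
    "() => { window.isDisplayedAtom = (el) => el.getClientRects().length > 0; }",
    sandbox="atoms",
)
wire = json.dumps(command)
```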

jrandolf: keyboard layouts: the main goal is to support internationalised keyboards
… I have a design doc on this on classic
… the MVP description: the ability to test international keyboards
… there are multiple ways to implement this moving forward

<jgraham_> 100% we should do something better for permissions

automatedtester: permissions: is this not just a classic over bidi implementation?

Sam Sneddon [:gsnedders]: there are more complex use cases to handle it

jgraham_ (IRC): I think we should improve the way we do it via events, to know it's there and then handle it, rather than try to pre-handle it

jrandolf: drag and drop capabilities: This is different to user interactions
… there are issues between OSes (Windows/Linux). Mouse events need to handle drag and drop as you can't do it
… let's add a property to know if a client can handle drag and drop

<simons> Are we having the monthly meeting later? I'm guessing not, but I'm checking now

<jgraham_> Yes, it's happening as normal

<jgraham_> Yeah, I didn't remember how to end the meeting and start a new one, so it makes more sense to re-invite them

<orkon> but fighting the video conf software

<sadym> couple of mins please

<mathiasbynens> (I'll unfortunately need to drop off at the :25 mark, just FYI)

AT Driver

jugglinmike (IRC): Today, I am going to show off the work that I am doing around normalising the behaviour around screenreaders and hopefully the future of assistive tech
… we want to make sure that we set what we believe is the behaviour that we need
… and today I will be going through the community group report

slides https://docs.google.com/presentation/d/1cEonR8cMkQz4WgCc6PlPypLLZdo_K2a3Z_gO167XU6s/edit
… automation in this area is unnecessarily hard
… and we want to simplify this for developers
… and allow them to make sure they give users good user experiences
… this is a protocol to introspect and remotely control assistive tech via a bidi protocol
… so today I want to show what we have done and where we are going so we can collab
… and using your knowledge of compat issues
… so we are wanting to see improvements from our proposal and what would it take to get it into the BTT charter
… current status (see slide 5)
… (discusses what has been specified)
… what we need to do next is starting/stopping sessions
… and so on
… (discusses current implementations)
… and these implementations are trying to connect to the OS
… If you're starting webdriver bidi from scratch, what would you do differently?

jgraham_ (IRC): I don't think that there are any large landmines here
… there are some things that we would do differently but only on the 2nd review
… and hindsight is good
… capabilities is one area that you might want to avoid... maybe

simons_: I would follow what jgraham_ (IRC) says generally, but for capabilities, see them as a config item too

automatedtester: just be aware of how chatty your protocol is

jugglinmike (IRC): for capabilities we won't need all of it
… since for most cases it's mapped to 1 machine always
… then if there are more questions later I will email this wg with more details
… then the 2nd item is "what would it look like to promote this proposal to an editors draft in the BTT charter"

<jgraham_> I think the question I have on this is the tradeoffs between a new WG and a different WG given that the degree of overlap in expertise might be quite small i.e. few people would be involved in both things

automatedtester: I think for this question it's mostly a process question that I and a couple of others would need to work through: figure out if this wg is the best place or if a CG is sufficient etc

Sam Sneddon [:gsnedders]: the current charter runs out in August 2023 so we would need to sort that
… and then the ACs will need to check that there would be multiple implementations, which seems less of a concern, and we need to understand how the ACs will view this

<jrandolf> +1

Roadmap followup - setting priorities

jgraham: for those who weren't here earlier we came up with a spreadsheet https://docs.google.com/spreadsheets/d/1Cg3rifrBZClIitU3aFW_WDv64gY3ge8xPtN-HE1qzrY/edit#gid=0
… and I have set a couple priorities
… and my proposal goes in and assigns a priority to each feature
… and this will hopefully allow us to figure out short term goals and the larger long term goals

Should we default Wdspec tests to make use of HTTPS by default

github: web-platform-tests/wpt#28847

jgraham_ (IRC): WPT in general uses http as the default. This doesn't match the web anymore
… this is not going to impact the webdriver so much
… so we should move that to be more like the web

automatedtester: as long as we still make sure that we test misconfigured servers then I don't see this being an issue

jgraham: that will be fine

Targeting documents by ID

github: w3c/webdriver-bidi#434

jgraham_ (IRC): in bidi we route commands to a specific browsing context
… that is fine but it can cause problems and is subject to race conditions
… e.g. if the page is navigating when the command is sent then it would be executed in the wrong place
… the proposal is that, in addition to targeting a context/navigable, you would be able to send commands to an id for a document
… and if it couldn't go to that document then it would error instead of just going ahead
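
Editor's illustration of the race described above: a toy, in-memory sketch with no real browser or protocol involved. `FakeContext`, `StaleDocumentError` and the `doc-N` IDs are all hypothetical names invented for this example; the proposal itself does not define them.

```python
class StaleDocumentError(Exception):
    """Raised when a command targets a document that is no longer active."""


class FakeContext:
    """Toy browsing context that gets a new document ID on each navigation."""

    def __init__(self):
        self._counter = 0
        self.document_id = self._next_id()

    def _next_id(self):
        self._counter += 1
        return f"doc-{self._counter}"

    def navigate(self):
        # A navigation replaces the active document with a new one.
        self.document_id = self._next_id()

    def run(self, command, document=None):
        # Targeting only the context runs against whatever document is current.
        # Targeting a document ID fails instead of silently hitting a new page.
        if document is not None and document != self.document_id:
            raise StaleDocumentError(document)
        return (self.document_id, command)


ctx = FakeContext()
first = ctx.document_id
ctx.navigate()
ran_in = ctx.run("script.evaluate")  # context-only targeting lands in the new document
```

Calling `ctx.run("script.evaluate", document=first)` after the navigation raises `StaleDocumentError`, which is the "error instead of just going ahead" behaviour.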

sadym (IRC): this document idea, do we want it to be used elsewhere?

jgraham_ (IRC): we would want to make sure that it is sent back during the serialisation
… I haven't thought about scoping events down to a document
… or network interception down to the document

simons: document IDs are a chromium idea from what I can see... is this browser agnostic?
… or can we make it that way

jgraham_ (IRC): we can define it for our purposes

Update of key codes - wpt tests landed before spec changes

github: w3c/webdriver#1740

jgraham_ (IRC): there is a pr on the webdriver spec
… this is more a process question
… so this is more of a "we shouldn't merge tests until the spec changes have been merged"

jrandolf: I did check everywhere
… and it did work everywhere but I waited a month for the merge to happen

jgraham: we should have checked it quicker

jrandolf: there are some keycodes that we saw were missing
… and should match behaviour on all platforms

<jgraham_> https://github.com/web-platform-tests/wpt/runs/13660763249

jgraham_ (IRC): FYI: WPT not showing failures doesn't mean it has tests passing
… it has some failures. We should also see about deferring if there is something there.

jgraham: we should see if there is a better source for the keycodes than what we are currently using and see if it makes sense to use that

<gsnedders> The other question about key codes is whether they vary on different OSes (for the same browser). There's a lot of historic complexity here.
