W3C

– DRAFT –
WebDriver-BiDi

12 May 2021

Attendees

Present
AutomatedTester, brwalder, cb, Honza, jdescottes, jgraham, jimevans, JohnChen, sadym, shengfa_, simonstewart, whimboo
Regrets
-
Chair
AutomatedTester
Scribe
AutomatedTester

Meeting minutes

<jgraham> ScribeNick: AutomatedTester

<jgraham> RRSAgent: make logs public

<jgraham> RRSAgent: make minutes v2

https://www.w3.org/wiki/WebDriver/2021-05-BiDi

Bidi client

github: https://github.com/web-platform-tests/wpt/pull/28381

jgraham: there is a PR up for the bidi client. The structure is there but it doesn't have any functionality
… could people please look and see if there are issues. The sooner we get this in the sooner we can write tests
… it can send/receive messages but there are no predefined helper APIs
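A minimal sketch of what "send/receive without helper APIs" could look like, assuming JSON messages where a command carries an `id` that its response echoes back, and anything without a matching `id` is treated as an event. The class name, the `send` callable, and the `example.*` method names are hypothetical and are not taken from the PR.

```python
import itertools
import json


class BidiMessageRouter:
    """Hypothetical router: matches responses to commands by id, queues the rest as events."""

    def __init__(self, send):
        self._send = send              # callable taking a JSON string (e.g. a websocket send)
        self._ids = itertools.count(1)
        self._pending = {}             # command id -> callback awaiting the response
        self.events = []               # messages that did not match a pending command

    def send_command(self, method, params, on_response):
        """Serialize a command, remember who is waiting for it, and return its id."""
        command_id = next(self._ids)
        self._pending[command_id] = on_response
        self._send(json.dumps({"id": command_id, "method": method, "params": params}))
        return command_id

    def on_message(self, raw):
        """Route one incoming JSON message to its waiting callback or to the event queue."""
        message = json.loads(raw)
        callback = self._pending.pop(message.get("id"), None)
        if callback is not None:
            callback(message)
        else:
            self.events.append(message)
```

Helper APIs for specific commands would then be thin wrappers over `send_command`.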

Navigation commands

github: https://github.com/w3c/webdriver-bidi/pull/93

jgraham: There is a PR up and a PR for the integration to HTML. brwalder and foolip have done a first pass
… this is nearly ready to land
… I have summarised the first "issues" related to this
… When we navigate and it's initiated from webdriver we have the response from the command and the events that are returned
… foolip had the question that events happen and then we return the navigate request response
… and does this make sense or should we respond for the command and then do the events
… [describes how the promises can be responded to and the order]
… the 2nd question is do we handle relative urls
… and the 3rd thing is that the proposal waits for 1 event; do we want to be like puppeteer and wait for a group of events?
… in html the event order is always guaranteed
… but should we care about the network idle?
… so I could do with help understanding the use case
… if we are going to support an array we need to do it from the start and not patch it in later
… and the last topic is around page reloading
… I don't think this will be controversial and we can copy parts of CDP here
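To make two of the open questions concrete, here is a rough sketch under stated assumptions: if relative URLs were accepted, they could be resolved against the current document URL, and a wait condition could be given either as one event name or as an array of names. The function names and the event names in the usage note are hypothetical; nothing here reflects a decision by the group.

```python
from urllib.parse import urljoin


def resolve_url(current_url, requested_url):
    """Resolve a possibly relative URL against the current document URL."""
    return urljoin(current_url, requested_url)


def navigation_complete(wait_for, seen_events):
    """Return True once every awaited event has been seen.

    wait_for may be a single event name or a list of names, so a client
    that only cares about one event and a client that wants a group of
    events can share the same code path.
    """
    if isinstance(wait_for, str):
        wait_for = [wait_for]
    return set(wait_for) <= set(seen_events)
```

For example, `navigation_complete(["domContentLoaded", "load"], ["domContentLoaded"])` is false until a `load` event is also recorded. Accepting a list from the start is what avoids patching array support in later.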

github-bot: end topic

Script execution

github: https://github.com/w3c/webdriver-bidi/issues/63

jgraham: After we have something for navigation the next item on the list is script execution
… as it can open up the rest of what we can do
… what are the requirements around this.

jgraham: I have written up in the issue some of the relevant things
… e.g. selectors against the DOM
… and being able to overload APIs on the page e.g. Random giving a response that isn't random
… and so on
… the next point is there are 2 points here on where we can execute script
… this can be on the page/worker
… and the 2nd one is sandboxing
… so that clients can install their own helpers that don't pollute the page
… we want it to be a lot like executeAsyncScript from webdriver but with more features
… and do we want to have it like using `arguments` or just execute as is
… and the sandboxing seems really cool
… and do we want to have something like webextensions where we have a dedicated WebDriver DOM API surface and allow people to inject using something like a `postMessage`

automatedtester: One of the things that might be good is that Jason Leyba, a few years ago, suggested moving webdriver execute script to being more promise-based

jgraham: the way that CDP works now is that it has an `awaitPromise` type event
… and we can do something like that

simonstewart: re: arguments... as long as we can reformulate webdriver on top then it's fine. Ideally we want to update things in the simplest ways

jgraham: [describes how call function works in CDP with arguments]
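For readers unfamiliar with the CDP shape being described, a sketch of a `Runtime.callFunctionOn` message follows. The method and its `functionDeclaration`, `objectId`, `arguments`, and `awaitPromise` parameters are part of CDP; the builder function, the ids, and the example values are placeholders for illustration and were not shown in the meeting.

```python
def cdp_call_function(command_id, object_id, function_declaration, args, await_promise=True):
    """Build a CDP Runtime.callFunctionOn command as a plain dict.

    Arguments are passed as a list of call arguments rather than through
    an implicit `arguments` object, and awaitPromise asks the browser to
    resolve a returned promise before replying.
    """
    return {
        "id": command_id,
        "method": "Runtime.callFunctionOn",
        "params": {
            "functionDeclaration": function_declaration,
            "objectId": object_id,  # placeholder: a remote object id obtained earlier
            "arguments": [{"value": arg} for arg in args],
            "awaitPromise": await_promise,
        },
    }
```

Here the function receives its inputs as named parameters, e.g. `function(a, b) { return Promise.resolve(a + b); }` called with `[1, 2]`, which is one alternative to the `arguments`-style interface of executeAsyncScript.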

brwalder: re: CDP something like `addBinding` or webextensions
… then we would have more network roundtrips
… where if we went for webextensions with a more `postMessage` structure then we can save a round trip as it can post straight away
… and if it were the `postMessage` then it would simplify sharing snippets of code
… it might be more work but it would be easier for developers and offer more functionality

github-bot: end topic

closing a session

github: https://github.com/w3c/webdriver-bidi/issues/104

jgraham: Last time we talked about bidi-only sessions (no http then upgrade process)
… and corresponding to that we need to end the session
… and this should shut down the UA
… and it should hopefully fit the requirement of implementing webdriver on top of this

automatedtester: as long as it is like webdriver in cleaning up after itself that should be sufficient

simonstewart: my understanding is we have 1 session limit... and if we shut the browser that should fully end the session

jgraham: there are use cases where you would want to close the session [describes a use case around network monitoring]
… and I am reluctant to limit the lifetime of the session to that of the browser

github-bot: end topic

Multiple concurrent sessions

github: https://github.com/w3c/webdriver-bidi/issues/103

jgraham: in webdriver http it never made sense to have more than 1 session per browser due to the blocking nature of the API
… however in bidi there are APIs where that limitation is not needed
… and use cases
… and we could have multiple tools connected to the browser
… there is more complexity supporting more than 1 session
… however multiple tools connected seems like a really appealing direction to go in

sadym: from the practical side it makes sense but from implementation it could be very difficult

brwalder: a question, from the client's pov, how do you tell the difference between new session and new session via attach

jgraham: so from the client it is always connecting to the existing client (even if it started the browser)
… from the protocol point of view
… there are cases where we can set prefs/cli arguments but those are browser specific

brwalder: <paraphrasing jgraham to understand>

jgraham: [describes the differences between client starting the browser and then connecting and then just connecting the browser]

brwalder: I just want to see how this is in practical terms for the browser
… and if it joins an already running browser and it connects, how did that happen

jgraham: I don't think it matters, it could be another tool or the user did the right commands to get it but we connected
… and we just assume the browser started the appropriate mode
… so you could connect to the ws and then get a session id

<shengfa_> I think supporting multiple concurrent sessions would have practical value for users, but we need to be careful about the extremes a client could take it to, e.g. could they start 200 concurrent sessions?

jgraham: and you could ask again and not worry if there is a session or not

gsnedders: without adding complexity, in safaridriver we only have 1 browser and 1 safari
… there are a variety of different ways we could connect to a single browser

jgraham: to clarify the state of the browser would be accessible to all sessions
… e.g. 1 session is active (doing things) and another session is passive (doing monitoring)
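The active/passive split can be pictured with a small model, offered only as a sketch: every session attached to one browser sees the same events, and ending one session leaves the browser and the other sessions alone. The class, its method names, and the decision that ending a session does not shut down the browser are all assumptions for illustration; the group had not settled these points.

```python
import itertools


class SharedBrowser:
    """Hypothetical model of one browser whose state is visible to every attached session."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.sessions = {}  # session id -> list of events delivered to that session

    def new_session(self):
        """Attach a new session; the caller does not need to know whether others exist."""
        session_id = f"session-{next(self._ids)}"
        self.sessions[session_id] = []
        return session_id

    def emit(self, event):
        """Deliver a browser event to every attached session, active or passive."""
        for inbox in self.sessions.values():
            inbox.append(event)

    def end_session(self, session_id):
        """Detach one session; the browser and any other sessions keep running (assumption)."""
        del self.sessions[session_id]
```

A driving tool and a monitoring tool would each call `new_session()`, and both would receive whatever the driving tool's actions cause the browser to emit.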

gsnedders: I am aware, just pointing out our differences to firefox and chrome

cb: at Sauce we allow customers to connect to the browser for automation and we connect to do monitoring
… and we want to continue supporting that
… via bidi

<gsnedders> github-bot: end topic

Minutes manually created (not a transcript), formatted by scribe.perl version 131 (Sat Apr 24 15:23:43 2021 UTC).

Diagnostics

Maybe present: github-bot, gsnedders, RRSAgent