W3C

– DRAFT –
WebDriver-BiDi

09 February 2022

Attendees

Present
brwalder, jdescottes, jgraham, patrickangle, sadym
Regrets
-
Chair
-
Scribe
jgraham

Meeting minutes


<simonstewart> I'm in the office today, so will be on mute with no video unless there's something that I actually need to say :)

<simonstewart> Big building with basically no-one in it.

<simonstewart> I'm more socially distanced here than I am at home

TPAC 2022 - Thoughts on in person vs virtual TPAC vs hybrid in Vancouver in September

<sadym> should we make a vote?

brwalder: I could drive to vancouver, so would support hybrid

<sadym> +1 for hybrid

simonstewart: +1 to hybrid

jgraham: Let AutomatedTester know if you have other preferences

Host/Origin checks

github: https://github.com/w3c/webdriver/pull/1634

jgraham: Allowing HTTP/WebSocket connections comes with some security concerns, e.g. if you don't check the Origin header then websites might be able to start a session. The spec didn't cover this before, but the PR adds some checks when establishing a connection.

sadym: There is a use case with docker, because the host header for the container is different to the actual host. We need to account for this in the spec.

jgraham: Agreed, we should let implementations accept an implementation-defined list of hosts to handle these cases.

jgraham: We also need to perform the checks on the WebSocket upgrade request for BiDi.
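
As a rough illustration of the kind of check discussed above (not the text of the PR), a server might validate the Host and Origin headers along the following lines. The names `is_request_allowed` and `allowed_hosts`, the default loopback allowlist, and the policy of rejecting any request that carries an Origin header are all assumptions made for this sketch.

```python
from typing import Iterable, Mapping

# Hosts accepted by default. Treating loopback names as the default set is an assumption.
DEFAULT_ALLOWED_HOSTS = frozenset({"localhost", "127.0.0.1", "[::1]"})


def host_without_port(host_header: str) -> str:
    """Strip an optional :port suffix, keeping bracketed IPv6 literals intact."""
    host = host_header.strip().lower()
    if host.startswith("["):
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    return host.rsplit(":", 1)[0] if ":" in host else host


def is_request_allowed(headers: Mapping[str, str], allowed_hosts: Iterable[str] = ()) -> bool:
    """Return True if Host is on the allowlist and no Origin header is present."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    # Browsers attach Origin to WebSocket handshakes and cross-origin requests;
    # this sketch assumes a local WebDriver client does not send one.
    if "origin" in normalised:
        return False
    host = normalised.get("host")
    if host is None:
        return False
    # Extra hosts cover cases such as a Docker container name (hypothetical example).
    accepted = DEFAULT_ALLOWED_HOSTS | {h.lower() for h in allowed_hosts}
    return host_without_port(host) in accepted
```

The `allowed_hosts` parameter stands in for the implementation-defined list mentioned for the Docker case; the same function could be applied to the WebSocket upgrade request, since that is an ordinary HTTP request carrying the same headers.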

New session setup procedure

sadym: We've discussed this internally and can remove it from the agenda.

Specify closing a browsing context

github: https://github.com/w3c/webdriver-bidi/issues/170

jdescottes: Want to be able to close a browsing context. We have a PR, but there are some open questions.

jdescottes: How to handle closing the last window? Should that always close the browser even if it doesn't have to (e.g. on macOS)? Or should it be implementation defined?

jdescottes: What should the command return? Classic command returns a list of open windows, should we do that?

jdescottes: Puppeteer has the ability to run some code right before closing the window, do we need that capability?

simonstewart: Closing windows is an interesting problem. We left it so that if you want to close a session you call "quit", but if you close the final window we try to quit. So behaviour is defined in original webdriver and I suggest we follow that.

simonstewart: Running something before close: maybe? Could be done by allowing JS to execute on page load and that could register an unload event handler.

brwalder: With puppeteer it's a common pattern to run in headless mode and reuse a "browser context" and open/close within that context. It might be useful to be able to keep the browser alive when the last browser "window" is closed.


simonstewart: Depends on the browser/WM. Some will probably quit the browser when there are no windows, so we can't rely on this.

brwalder: Could we say it's implementation dependent? For e.g. Edge in headful mode you should expect the browser to close when the last window closes, but for e.g. Chrome in headless mode the browser might still be alive.

simonstewart: I am worried about people depending on specific behaviour and not being able to port tests to other browsers. Maybe this could be a capability?

simonstewart: A capability would make this deterministic

brwalder: That makes sense

simonstewart: Predictability and determinism are important. Easy to force quit.

jgraham: I'm worried about regressing the performance of tests that depend on the no-quit-on-last-window-close behaviour of e.g. headless Chrome in Puppeteer. We should check with those teams if it's an acceptable regression or if they could use a workaround like keeping a dummy tab around.

simonstewart: What's the use case for keeping the browser open when the last window is closed?

brwalder: e.g. Puppeteer users open the browser but create a Page per test without quitting the browser.

simonstewart: Does BiDi session map to the Page or the process?

brwalder: Session in the spec maps to a browser process. Puppeteer Page is more like a tab.

jdescottes: Would the solution with exposing a capability work in all cases?

jgraham: We could use a capability, but it's not clear that people will pay attention to the capability value if they wouldn't also pay attention to e.g. an event indicating that the browser's closing after the window closed.

jgraham: Could also have "browserAboutToQuit" in the return value if we wanted.

simonstewart: Thinking about parallelism. We want every call to new session to be a new isolated browser session. Could make each a new browser, or could partition a single browser process into multiple sessions. So as long as closing the final window in the session closes the session, that would be reasonable. Then you could support different behaviours in the same way, even if in some cases there is a process that outlives the session.

brwalder: I agree the session doesn't have to map specifically to a process. I'm not sure what the best alternative is at this point.

jgraham: Our sessions currently do map to processes, so I don't want to depend on that not being 1:1 to support important use cases.

simonstewart: I think the safest thing to do is close the session when the final window is closed. Then if the process is still running you could start a new session for subsequent tests.
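
The behaviour proposed here could be modelled roughly as follows. This is a toy sketch rather than spec text: the `Session` class, its field and method names, and the error raised are hypothetical, and returning the remaining handles simply mirrors what the Classic close command does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Session:
    """Toy model: a session that ends when its final window is closed."""

    windows: List[str] = field(default_factory=list)
    active: bool = True

    def close_window(self, handle: str) -> List[str]:
        """Close one window and return the handles that remain open."""
        if not self.active:
            raise RuntimeError("session has already ended")
        self.windows.remove(handle)  # raises ValueError for an unknown handle
        if not self.windows:
            # Closing the final window ends the session, whether or not
            # the browser process itself keeps running.
            self.active = False
        return list(self.windows)
```

Under this model a client that wants to keep testing after the last window closes would start a new session, which is where the overhead and state-reset trade-offs raised in the discussion come in.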

jgraham: There is still some overhead there, and it's different to classic WebDriver. But it would mean that you throw away all session state between tests, which could be better.

brwalder: Puppeteer's browser context is close to a session. Those contexts are always explicitly closed. So this wouldn't support Puppeteer's existing behaviour, which could be a problem for them.

simonstewart: One of the reasons that WebDriver has Quit and Close is we wanted to make the distinction between being totally done with all of a session, and maybe retaining some state. For Puppeteer, is each Page an isolated instance of the browser? So, apart from performance, is it indistinguishable from shutting down the browser?

brwalder: Not quite, BrowserContext provides the state isolation. Page is like a tab. BrowserContext has many Pages. Page doesn't provide any isolated state. BrowserContext is like a state container; it holds cookies, localStorage etc.

simonstewart: In Selenium we tell people to open a new window. You could implement the puppeteer behaviour by having a window that doesn't get closed.

brwalder: Placeholder window would still be visible in API.

simonstewart: Do we think this is behaviour we should support? Is it just a performance optimisation?

brwalder: It's considered the correct way to use Puppeteer.

simonstewart: In Selenium people want to use a single session for multiple tests to retain state (e.g. cookies) and avoid starting a new browser process. It comes at the cost of breaking test isolation. Do we want to provide an easy mechanism to break isolation? Is a new BiDi session in Chrome actually an isolated session?

brwalder: We want to make it possible to break isolation e.g. for setting up cookies. I didn't mean to imply that in Chromium a new BiDi session should reuse state. Each new session should have new state. Puppeteer doesn't have anything quite like a WebDriver session. Nearest is BrowserContext. WebDriver session should have isolated state.

Simulated user input

github: https://github.com/w3c/webdriver-bidi/pull/175

jgraham: (describes the proposal in the issue and the known open questions)

brwalder: A question: the actions model for HTTP might have been very influenced by HTTP. It's very declarative and requires precomposing the actions and sending them in a batch. What alternatives were considered? Does BiDi open up different models?

simonstewart: Main constraint is network latency. Want to allow e.g. click, pause of known length, click. If you need to send each command one at a time something that works on localhost will break when there's higher latency. Several different options were discussed at the time.
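
For reference, the click, pause, click example can be sent as one batch in the Classic actions format, so the pause length does not depend on network round trips. The following sketch builds such a payload; the helper name `click_pause_click`, the source id `mouse1`, and the coordinates are made up for illustration.

```python
import json


def click_pause_click(x: int, y: int, pause_ms: int) -> dict:
    """Build a Classic-style actions payload: click, wait pause_ms, click again."""
    click = [
        {"type": "pointerDown", "button": 0},
        {"type": "pointerUp", "button": 0},
    ]
    return {
        "actions": [
            {
                "type": "pointer",
                "id": "mouse1",  # arbitrary input source id (hypothetical)
                "parameters": {"pointerType": "mouse"},
                "actions": [
                    {"type": "pointerMove", "x": x, "y": y, "duration": 0},
                    *click,
                    {"type": "pause", "duration": pause_ms},
                    *click,
                ],
            }
        ]
    }


# The whole sequence travels in one request; the remote end handles the timing.
payload = click_pause_click(100, 200, 500)
print(json.dumps(payload, indent=2))
```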

brwalder: That makes sense.

simonstewart: I think adopting the Classic actions model into BiDi makes sense. We still want to handle high latency problems.

simonstewart: We want to support the use case of dragging between windows. Not sure how that interacts with the state. If we can't do that having per-top-level-bc state makes sense.

jgraham: Can always decompose actions into multiple commands if that's easier, but this model allows simultaneous manipulation of multiple inputs, which is important for e.g. pinch zoom. State would at least be per top-level context, but dragging outside the browser window (for example) is hard to support.
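
To illustrate the point about simultaneous inputs: in the Classic actions format each input source carries its own list of actions, and entries at the same index across sources are dispatched in the same tick. A pinch gesture can therefore be sketched as two touch pointers moving apart. The helper names `touch_source` and `pinch_out`, the source ids, and the geometry are assumptions for this example.

```python
def touch_source(source_id: str, start: tuple, end: tuple, duration_ms: int) -> dict:
    """One finger: touch down at start, slide to end, lift."""
    return {
        "type": "pointer",
        "id": source_id,
        "parameters": {"pointerType": "touch"},
        "actions": [
            {"type": "pointerMove", "x": start[0], "y": start[1], "duration": 0},
            {"type": "pointerDown", "button": 0},
            {"type": "pointerMove", "x": end[0], "y": end[1], "duration": duration_ms},
            {"type": "pointerUp", "button": 0},
        ],
    }


def pinch_out(cx: int, cy: int, start_gap: int, end_gap: int, duration_ms: int = 300) -> dict:
    """Two fingers moving apart horizontally around the point (cx, cy)."""
    return {
        "actions": [
            touch_source("finger1", (cx - start_gap, cy), (cx - end_gap, cy), duration_ms),
            touch_source("finger2", (cx + start_gap, cy), (cx + end_gap, cy), duration_ms),
        ]
    }
```

Both sources have the same number of actions, so each tick moves both fingers together; splitting this into separate commands per finger would lose that alignment.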

simonstewart: I haven't seen many people asking for the ability to drag between windows. Original Selenium handled this by returning an additional set of screen coordinates which would allow interfacing with external tools that could do things like OS-level drag and drop. With headless browsers that doesn't work well.

<simonstewart> Thank you for running the meeting, @jgraham


Minutes manually created (not a transcript), formatted by scribe.perl version 185 (Thu Dec 2 18:51:55 2021 UTC).

Diagnostics

Maybe present: RRSAgent, simonstewart