WebDriver BiDi – 12 June 2024

Meeting minutes

<jgraham> RRSAgent: make minutes

<jgraham> RRSAgent: make logs public

Should the "New Session" command return default values for capabilities that have not been specified?

jgraham: should we return the defaults for capabilities that are not specified? At the moment the classic WebDriver spec says you always return certain capabilities, such as browser name, and otherwise the capabilities that you specified, with their resolved values.

jgraham: this question came up around the unhandled prompt behavior and if you should get something back if you did not specify anything. Safari follows the spec, and Chrome and Firefox return the values always

jgraham: should we continue handling this per capability or handle it in a general way?

orkon: We would prefer to always return resolved capabilities but don't feel strongly. It seems like it might be easier for the users to return what's actually used by the session.

simonstewart: the returned capabilities are usually for the local end to figure out what the remote end supports

simonstewart: the criteria to decide should be if the client would need the information and correct the request based on the response

simonstewart: the safe way is to return everything. Otherwise, we need to decide on a case-by-case basis

jgraham: so returning it always seems to be the only backward compatible way to move forward. The only thing is that we do not return the full list of Firefox specific properties. It is possible that there are cases where the list of possible values/capabilities is too large

jgraham: returning the defaults makes for a more uniform API

simonstewart: there are some capabilities that you do not want to return in full, like proxy configuration. For properties it makes sense to exclude them or provide a subset

simonstewart: perhaps we should put in the spec how to return the value if you need to return it

jgraham: the spec has this but for many capabilities we do not default to returning values
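
To illustrate the two options discussed under this topic, here is a minimal Python sketch. The `resolve_capabilities` helper, the `DEFAULTS` table and the values in it are assumptions made for illustration; they are not taken from the spec text or from any browser implementation, and the sketch ignores capabilities that are always returned, such as browser name.

```python
# Hypothetical defaults a remote end might apply; illustrative values only.
DEFAULTS = {
    "acceptInsecureCerts": False,
    "unhandledPromptBehavior": "dismiss and notify",
}


def resolve_capabilities(requested, always_return_defaults=True):
    """Return the capabilities a remote end would report for a new session."""
    if always_return_defaults:
        # Start from every default, then overlay what the client asked for.
        resolved = dict(DEFAULTS)
        resolved.update(requested)
        return resolved
    # Otherwise only echo back the capabilities the client specified.
    return dict(requested)
```

With `always_return_defaults=True` the client sees the value actually used by the session for every known capability, which is the behaviour orkon and simonstewart describe as easier and safer for clients.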

Update spec to define things in terms of Navigables - anyone who has time to work on that?

<jgraham> github: w3c/webdriver-bidi#91

jgraham: a while ago the html spec moved away from the browsing context concept

jgraham: the concept is now navigables

jgraham: unfortunately, we have not yet updated the spec to be in terms of navigables

jgraham: some time ago we decided not to change the api but change the spec text to use navigables in the spec prose

jgraham: this came up because orkon was updating the HTML spec to integrate navigables with the WebDriver BiDi

jgraham: from my side it is unlikely that I will have time any time soon but can help someone who wants to update the spec

orkon: We discussed this. It will be really nice to update, but we don't have a lot of time at the moment. There's already a PR that hasn't landed which puts notices to be aware of the terminology. Proposal is to make incremental updates rather than change it all at once.

jgraham: makes sense, the question is what we need to get that PR landed

<jgraham> w3c/webdriver-bidi#565

orkon: there are pieces from the PR that relate to navigables and they can probably be landed separately

ACTION: orkon to extract the relevant pieces from PR 565 and create a new PR

Prompt Handler for beforeunload

<jgraham> github: w3c/webdriver-bidi#681

jgraham: the next topic is about prompt handling, not only beforeunload. The context is: there is a PR to add prompt handling support based on the WebDriver spec changes. It allows specifying defaults for BiDi-only sessions. In WebDriver the handler applied to all prompt types except for beforeunload. In BiDi, you can get an event that there was a beforeunload prompt and the client can handle it. But that means that, the way things are specified in Classic, the default prompt handler does not apply to beforeunload. The question is if we want to carry this over to BiDi and what to do when we start treating file dialogs and auth prompts. The default in Classic is to always ignore. And for file dialogs, accept does not make sense as a value. What should we do about that?

<jgraham> https://github.com/w3c/webdriver-bidi/pull/681#discussion_r1634703120

jgraham: there are multiple options. For example, we can overwrite the behavior in Classic. We could change the semantics but it might mean that things might break when updating from Classic to BiDi. We could also only allow ignore and dismiss in BiDi or remove the default.

orkon: We discussed on the PR. From our perspective changing the semantics and having the default apply to all prompt types seems best for BiDi. But not sure about compatibility with classic. If you're updating from classic, users might expect the default to apply to all dialogs including beforeUnload, so maybe it would fix expectations.

orkon: In general it's not a blocker for this proposal. We could live with accept not working for all prompt types.

jgraham: we could just make it completely breaking if you switch to BiDi. We could make it so that we support only what we currently support. You can still make it work for a classic+bidi session so that the syntax changes then.

jgraham: question to the selenium users: do you want to do more with the prompt handling compared to classic?

simonstewart: in the selenium project, we will be migrating the users to WebDriver BiDi as soon as possible

simonstewart: everyone knows that the long term future will be BiDi. The only thing we need Classic for is the browsers that do not support BiDi

jimevans: also in selenium, we can, for a mixed session, tailor the behavior of the prompt handler for the prompts that are not part of the classic spec to match the existing behavior. Because once we enable BiDi in Selenium, we can in one go update our methods that do prompt handling to accommodate the behavior so that the user experience is no different

jimevans: so I do not see a problem with having the prompt handlers being different for bidi than for classic

jgraham: ok, what I should try for classic is: you can either provide a string value or you can provide an object, and if you provide a default key on the object, then it is the default for everything. Specific keys need to be defined to override the default. Then any navigation will fail if you do not re-define the default for beforeunload. If we cancel the prompt, the navigation is also cancelled. I think it is compatible with classic but also has better semantics for BiDi
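
The string-or-object shape jgraham describes can be sketched in Python as follows. This is only an illustration under stated assumptions: the `handler_for` helper and the key names (`default`, `beforeunload`, `alert`) are hypothetical, and treating a plain string as not covering beforeunload follows the description of Classic given earlier in this topic.

```python
def handler_for(config, prompt_type):
    """Pick the handler for a prompt type from a string or object config."""
    if config is None:
        # Nothing configured: no handler resolved by this sketch.
        return None
    if isinstance(config, str):
        # Classic-style string: applies to every prompt type except beforeunload.
        return None if prompt_type == "beforeunload" else config
    # Object form: a specific key overrides the "default" key.
    if prompt_type in config:
        return config[prompt_type]
    return config.get("default")
```

For example, `handler_for({"default": "dismiss"}, "beforeunload")` resolves to `"dismiss"`, which is the case where navigations would be cancelled unless the beforeunload key is overridden, as in `{"default": "dismiss", "beforeunload": "accept"}`.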

Bypass network cache behaviour

<jgraham> github: w3c/webdriver-bidi#721

jgraham: this one is about request interception and how we force bypassing the network cache. I put up a PR which was designed around the idea of setting the cache mode on requests. So when you have a request, it will set the cache mode to no-store, which means no reading or writing to the HTTP cache. Unfortunately, it does not quite match what browsers do at the moment when you disable caches in devtools: 1) the CORS preflight cache, which is always used 2) the image cache and possibly other caches. So the question is how much flexibility do we care about here? Should we try to work with low-level primitives? Or do we actually only care about bypassing all caches?

jgraham: is that going to be the easier way to solve this?

orkon: We also found that Firefox uses no-store by default but Chrome uses reload, so Chrome still writes responses to the cache. We couldn't find use cases which require controlling different kinds of caches individually. Maybe there are some wpt use cases, but we haven't run into them yet. For testing scenarios you mostly care about bypassing everything. Also for interception to catch every request. So it sounds like a single bypass cache setting would be sufficient. We could extend in the future with additional parameters if we needed. Question is which cache mode should the bypass use, or how important is that?

jgraham: I think it is important that we specify a specific cache mode. bypassing the cache meaning no writing to cache makes more sense.

jgraham: there is the question of whether we empty the CORS cache or just bypass it. But then when you disconnect the WebDriver session, the cache will still be populated

jgraham: I noticed that the spec has primitives for clearing the cache but not for bypassing it

jgraham: and what happens with preflight requests if you toggle the bypass setting

orkon: I need to look at what we do for the CORS cache. We can change the cache mode; if we use reload we don't need to change but Firefox would. For memory caches we clear the cache. I need to double check on bypass vs clear. For HTTP we definitely don't clear, but I'm not sure about the other cache types

jdescottes: in Firefox we only bypass but do not actively clear anything, at least with DevTools

jgraham: I think bypassing makes more sense than clearing. It sounds like at least for some of the caches it is hard to implement and memory caches can be cleared at random. So perhaps it is less important.

orkon: I think what happens to the cache after you've done your tests is less important. it's more important to be able to test a cold start of the application and measure the time taken, data transferred, etc.

jgraham: we should change the command to just set cache bypass to true or false and for now we should set the specific cache mode and it should probably bypass the cors cache and be a bit more handwavy about memory caches

orkon: I think there's an open HTML issue about the image cache, but implementations may also have more caches than that.

jgraham: it might be that for image cache we could specify better but we could also mention implementation-specific caches
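
The idea of a single bypass flag that maps to one specific cache mode can be sketched in Python. The `effective_cache_mode` helper is hypothetical and is not the command proposed in the PR; the mode names are the request cache modes from the Fetch standard. Per the discussion above, no-store neither reads nor writes the HTTP cache, while reload still writes responses to it. The sketch does not model the CORS preflight cache, the image cache or implementation-specific caches.

```python
# Request cache modes defined by the Fetch standard.
FETCH_CACHE_MODES = {
    "default", "no-store", "reload", "no-cache", "force-cache", "only-if-cached",
}


def effective_cache_mode(request_mode, bypass_cache, bypass_mode="no-store"):
    """Return the cache mode to use for a request given a bypass flag."""
    for mode in (request_mode, bypass_mode):
        if mode not in FETCH_CACHE_MODES:
            raise ValueError(f"unknown cache mode: {mode}")
    # When bypassing, every request gets the same fixed mode;
    # otherwise the request keeps whatever mode it already had.
    return bypass_mode if bypass_cache else request_mode
```

Keeping `bypass_mode` as a parameter only reflects the open question in the minutes about which mode the bypass should use; a spec would fix one value.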

Network events for data URIs

<jgraham> github: w3c/webdriver-bidi#727

jdescottes: Firefox is implementing network events for data URIs. Currently Firefox only emits responseCompleted, but the BiDi spec expects a beforeRequest event before responseCompleted.

jdescottes: that is also related to network interception and what we should do for data URIs: if we only emit responseCompleted, that is fine since the request cannot be intercepted at this stage

jdescottes: do we want the full or limited set of events? do we want to have the interception for those requests?

orkon: Chrome emits those events but doesn't support interception for data requests. Data requests don't come from the network, so how much it's applicable to data URLs is a question. It would be great to have the full set of events from the client perspective so you get the same workflow as for other requests. At the same time if there's one event with all the data then it's probably acceptable.

jgraham: we could have a completely separate event for data URIs, and if that is just a single event it might break fewer invariants

jdescottes: so a separate event would be a good solution. We would still need to modify the spec so that data URIs do not trigger responseCompleted. The DevTools UI usually shows data URIs in the same place as other requests, so users are likely to treat data URIs as regular URIs. So I would prefer to have the full set of events as well.

jgraham: sending one event instead of three would be more efficient

orkon: I'm wondering if there's a use case for data urls. Potentially an app could fetch some data and insert it as data uris, and maybe there's some scenario where you want to modify that. But it's an argument for having three events because that would make it easier to cover data uris for interception vs a single event.

jgraham: it sounds like people are in general in favor of making 3 events

jgraham: it sounds like we should change the spec to make sure we are emitting them
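
A client-side check for the "full set of events" expectation might look like the Python sketch below. The `missing_events` helper and the `(event name, request id)` tuple shape are assumptions for illustration, and the three event names are assumed to be the ones meant by "3 events" in this discussion.

```python
# The three network events a client would expect per request (assumed names).
EXPECTED_EVENTS = [
    "network.beforeRequestSent",
    "network.responseStarted",
    "network.responseCompleted",
]


def missing_events(events, request_id):
    """Return the expected events never seen for the given request id.

    `events` is an iterable of (event_name, request_id) tuples.
    """
    seen = {name for name, rid in events if rid == request_id}
    return [name for name in EXPECTED_EVENTS if name not in seen]
```

A data URI load that only produced a completed event would report the other two as missing, which is the invariant a client relying on the usual workflow would trip over.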

<simonstewart> Cheerio, everyone

<jgraham> RRSAgent: make minutes
