W3C

– DRAFT –
WebDriver-BiDi

12 March 2025

Attendees

Present
jdescottes, jgraham, jimevans, lauromoura, orkon, sadym, sasha, simons, tidoust, whimboo
Regrets
-
Chair
tidoust
Scribe
tidoust

Meeting minutes

sadym: Same question as for another topic. Should it live in the core spec or in an extension? I'm in favor of putting it in the main spec.
… If we put it somewhere else, we don't have a way of extending CDDL with a specific action type for example. Do we want a generic mechanism for that?

jgraham: We discussed Gamepad at some previous meeting. The problem we're trying to tackle is that the Gamepad model is different. Poll-based instead of event-based.
… I would be happy to have it in the main spec. I would be happy either way, if the experts are in other places.
… If it were integrated in actions, that would be an argument to have it in the main spec.
… And I think we should add the extension points to the spec that we need, yes.
… We talked about that a long time ago for WebDriver classic in the context of WebXR. In the end, they didn't use standard actions.

sadym: They're looking at the best way to make things testable. No decision yet.

tidoust: I'm hearing support for integrating it into the main spec.

Restrict supported formats for captureScreenshot to an allow-list

<jgraham> github: w3c/webdriver-bidi#875

jdescottes: The original GitHub issue was filed because, according to the spec, the browser should fall back to PNG if an unsupported format is provided.
… Chrome throws instead.
… After discussion in Firefox, we agree that throwing is the right behavior.
… We'd like to hear from the group whether people have concerns with the approach.

simons: Seems reasonable to me

tidoust: Silence suggests that people agree with the approach!

jgraham: We also need a discussion about what happens if you don't support one of the formats that gets added to the allow-list. Should it throw an error or fall back to PNG?
… Falling back to PNG has some advantages as you get an image.

orkon: Having a list of basic formats sounds fine. It would be difficult to bring in encoders for formats that are otherwise not supported by browsers.
… For example, AVIF or JPEG-XL, there was at some point no need for encoders. We want to avoid WebDriver BiDi being an all-purpose image converter.
… Having a list of known supported formats. Throwing an error for unsupported formats should be supported as well.
… If we want a fallback, we should have a way of signaling the format that gets returned together with the screenshot data.

jgraham: I think it's safe to assume that there will be cases where some browser supports a format that others don't. Either an error, or PNG as fallback. I don't think we should allow either behavior though. We need just one of these.
… An error seems preferable to me, not to surprise clients with some unexpected format.
… Either way, the user eventually has to handle it by dealing with a different format or, for that browser, asking for a different format in the first place.
… It's probably better to let the test run with a fallback image than just failing the test.

tidoust: Does the screenshot already come with an indication of the image format it is in?

jgraham: No.

jdescottes: Two use cases we're trying to address here. Initial question came from a typo in the image format that was provided to the command. In that case, it's good to throw an error to signal the error.
… The proposed allow-list would fix this use case.
… And then with image formats generally supported but not for screenshots, PNG fallback seems acceptable.
… Error and different formats. Two different use cases. The allow-list and the fallback should address both.

jgraham: We could allow a list of formats that you try in order and first one that works gets returned. The advantage of that is that the client can then hardcode a list. It doesn't have to know about specific browsers and their supported formats. It feels pretty over-engineered though.
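
As an illustration of the ordered-list idea, here is a minimal sketch in Python. The function name `pick_format` and the per-browser format sets are hypothetical and not part of any proposal; the sketch only shows that a client could hardcode one preference list and get a different result per browser.

```python
from typing import Iterable, Optional, Set


def pick_format(preferred: Iterable[str], supported: Set[str]) -> Optional[str]:
    """Return the first requested format the browser can encode, else None."""
    for mime_type in preferred:
        if mime_type in supported:
            return mime_type
    return None


# Hypothetical encoder sets for two browsers, for illustration only.
browser_a = {"image/png", "image/jpeg", "image/webp"}
browser_b = {"image/png"}

# The client hardcodes one list and does not need to know about browsers.
client_preference = ["image/avif", "image/webp", "image/png"]

choice_a = pick_format(client_preference, browser_a)
choice_b = pick_format(client_preference, browser_b)
```

With these assumed sets, the same request would resolve to WebP in one browser and PNG in the other, which is why the returned format would need to be signaled.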

simons: If we are implementing fallback, then we absolutely need to put the image type as part of the response value. Tests need to decide what to do with the data, including saving as a file with the right extension.
… The list of formats seems very content-type negotiation-ish to me.
… It would probably be acceptable for us in Selenium to throw an error each time.

jimevans: I think you're correct there.

orkon: From a Puppeteer perspective, we have a wrapper on top of this, separate validation. What Julian says makes sense to me. Detecting a typo would be great. But then I don't think it affects many users directly.
… Useful to return the image type.

simons: If there is any ambiguity at all about what's going to be returned, we absolutely need a type in the response.
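
To make the point concrete, a rough sketch of what a client could do if the result carried the image type. The `type` field is hypothetical (the result currently only has base64-encoded `data`), and the extension mapping is an assumption for illustration.

```python
import base64

# Hypothetical mapping from returned MIME type to file extension.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def screenshot_filename(result: dict, stem: str = "screenshot") -> str:
    """Build a file name from a result that carries a hypothetical `type` field."""
    mime_type = result["type"]
    if mime_type not in EXTENSIONS:
        raise ValueError(f"unexpected image type: {mime_type}")
    return stem + EXTENSIONS[mime_type]


def screenshot_bytes(result: dict) -> bytes:
    """Decode the base64-encoded image data."""
    return base64.b64decode(result["data"])


# Example result, as it might look if PNG were returned as a fallback.
result = {
    "data": base64.b64encode(b"\x89PNG").decode("ascii"),
    "type": "image/png",  # hypothetical field
}
filename = screenshot_filename(result)
```

Without such a field, a client that asked for JPEG and received PNG would have to sniff the bytes to pick the right extension.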

jdescottes: Would it be ok to say that, on the BiDi side, we always return the type of the image and leave any validation to the client itself, since that seems already done by Puppeteer and Selenium?

jgraham: Then without allow-list?

jdescottes: Yes. Completely without it.

<jgraham> So we'd be back in the situation where `image/jpg` returns `image/png`

<jgraham> (just so people are clear on the consequences)

simons: Not having an allow-list would be fine with Selenium, I think.
… I'd say returning an error is the easiest so that clients can detect typos rapidly. And for unsupported formats, you can easily have a loop with different formats to try something else.
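
A minimal sketch of the client-side loop described here, assuming the remote end always throws for a format it cannot produce. `UnsupportedFormatError`, `capture_with_retries` and `fake_capture` are hypothetical names standing in for a real client's error type and command call.

```python
from typing import Callable, Sequence, Tuple


class UnsupportedFormatError(Exception):
    """Stand-in for whatever error the remote end would report."""


def capture_with_retries(
    capture: Callable[[str], bytes], formats: Sequence[str]
) -> Tuple[str, bytes]:
    """Try each format in order; return (format, data) for the first success."""
    for mime_type in formats:
        try:
            return mime_type, capture(mime_type)
        except UnsupportedFormatError:
            continue  # try the next format in the list
    raise UnsupportedFormatError(f"none of {list(formats)} is supported")


def fake_capture(mime_type: str) -> bytes:
    """Pretend browser that can only encode PNG."""
    if mime_type != "image/png":
        raise UnsupportedFormatError(mime_type)
    return b"png-bytes"


used_format, data = capture_with_retries(fake_capture, ["image/webp", "image/png"])
```

In this model a typo such as `image/jpg` surfaces as an error right away, and the client always knows which format it received because it asked for it explicitly.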

<simons> Yes. That's what I'm suggesting

jgraham: It seems that Simon is proposing a solution where you always throw. I'm still not sure about the option we want to implement.

jdescottes: 3 options. Allow-list, throwing dynamically, and the last one with fallback. I can summarize that in the issue, highlighting pros and cons, so that we can get back to it next time and make a decision.

Handling of navigations interrupted in JavaScript

<jgraham> github: w3c/webdriver-bidi#763

jdescottes: This issue is about the behavior of browsingContext.navigate when it is interrupted by a new navigation.
… We are fine to update the implementation in Firefox to align with the behavior that Chrome has, which also matches the spec.
… If you were navigating with waitComplete, the command will return as soon as the first navigation is interrupted.
… That might be confusing for clients when the navigation is interrupted, as they may still expect a complete response.
… Is that still fine for clients?

orkon: From a Puppeteer perspective, it would be preferable not to throw, I think. Throwing makes it difficult to figure out the reason: did something go wrong or did a new navigation happen?
… Also, navigation can only be canceled before document is created, then it can only be aborted.
… We want to throw before document is created, we don't want to throw afterwards when events can be reported.

jdescottes: My main issue was if clients were not properly using events. But you seem to suggest that Puppeteer is doing that. If it's clear that consumers need to rely on a combination of things including events to figure out what happened, that seems fine with me.

<sadym> w3c/webdriver-bidi#799

sadym: It looks like we already discussed that and resolved in December. We wanted to throw if the navigation was aborted.

jgraham: I sort of thought that we had reached consensus on not throwing. You load example.com, but then JavaScript updates the location. From the point of view of the user, the page is not done loading. From the point of view of the spec, you return and say that the navigation is successful.
… Any page can always kick off a navigation at any moment, for sure.
… From the point of view of the user, if you say "please navigate and wait until you get a load event", it's confusing if you end up with a page that is actually loading another page.
… If the navigate command is basically not usable without having to check and follow events, then it sounds that the command should be doing that itself. I know that it's hard to specify. But that seems more like what the user would probably expect.

sadym: We can extend the navigateResult to show the user what exactly happened. At which point the wait condition was met.

orkon: Adding some sort of indication of what happened might be useful, but events already cover that. It's a question on whether clients care enough to listen to events.
… Puppeteer listens to events because we are doing some more things such as network idleness. From experience, if you want to automate a third-party web site, some of them will never load, never be completed. Events are useful to get a signal and get the response back.
… We also do some things around what happens with iframes.
… Iframes could have a client-side navigation thing.
… Many users want to make sure that the page is somewhat stable.

jgraham: The load event is delayed by the iframe in usual cases. But of course, the page can do whatever.
… I kind of like the idea of returning more information. If you could return that a new navigation started. But you would still have to listen to events to get to completion.
… I'm interested in how other clients handle this.
… Simplicity vs. complexity of keeping all the maps of the navigation.

jimevans: Traditionally, Selenium has tried to, upon navigation, simply wait for everything to happen, whatever that means.
… If the page kicks off some redirect to another page and relocates that way, Selenium has treated that as "as soon as first navigation is loaded, navigation is complete".
… I think we would probably want to avoid having to deal with complexities when we switch to BiDi. At the moment, we make a good-faith effort to wait for the navigation to complete. If that's not enough, the user is left to figure out the best way forward.

jgraham: I'm hearing that there's not a lot of enthusiasm for changing things here. It seems that we should align Firefox behavior with what the spec currently says. We could add additional info. And we could consider some extra wait value in the future such as waitForFullCompletion. I'm certainly happy not to have to spec things too much!
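
For readers following the discussion, a rough sketch of how a client could combine the navigate result with events to notice that the navigation it started was superseded. The event dictionaries are simplified, the ids and URLs are made up, and `superseding_navigation` is a hypothetical helper, not an existing client API.

```python
from typing import Iterable, Optional

NAVIGATION_STARTED = "browsingContext.navigationStarted"


def superseding_navigation(
    context: str, navigation_id: str, events: Iterable[dict]
) -> Optional[str]:
    """Return the id of a navigation started in `context` after ours, if any."""
    seen_own = False
    for event in events:
        params = event["params"]
        if event["method"] != NAVIGATION_STARTED or params["context"] != context:
            continue
        if params["navigation"] == navigation_id:
            seen_own = True
        elif seen_own:
            # A different navigation began after the one we requested.
            return params["navigation"]
    return None


# Simplified event log: the page redirects itself from JavaScript.
events = [
    {"method": NAVIGATION_STARTED,
     "params": {"context": "ctx-1", "navigation": "nav-1", "url": "https://example.com/"}},
    {"method": NAVIGATION_STARTED,
     "params": {"context": "ctx-1", "navigation": "nav-2", "url": "https://example.com/next"}},
    {"method": "browsingContext.load",
     "params": {"context": "ctx-1", "navigation": "nav-2", "url": "https://example.com/next"}},
]

follow_up = superseding_navigation("ctx-1", "nav-1", events)
```

A client that finds a follow-up navigation would still need to wait for that navigation's own load event, which is the bookkeeping the group discussed keeping out of the command itself.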

Emulation: Support geolocation override

<jgraham> github: w3c/webdriver-bidi#343

sasha: Since the last meeting, I created a PR with a proposed API. I spoke with Geolocation folks.
… Please take a look. I created a new module. A command and end point here. Simon already provided feedback that we should maybe not have everything in emulation.

simons: My feedback is that I think we're going to end up with a geolocation module anyway.
… An emulation package module may make sense for things that don't fit anywhere else.
… My main concern is that it would be quite easy for the emulation module to grow quite large.
… But I'm broadly in favor of the idea!

orkon: I have not looked into the PR. Having an emulation name makes sense.
… I don't think that there is anything other than a command for Geolocation.
… I think it would also be interesting to see the bits in the Geolocation spec. It would be interesting to see how the algorithms are updated there to see how they match the proposal.

sasha: I've also linked both sides in the PR. Not a perfect match yet, but it mostly aligns already.
… Not yet fully up-to-date on their side.

Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).
