Browser Testing and Tools WG - 15 Jan

Meeting minutes

Extend log.entryAdded types to more than just console API and JavaScript errors

whimboo: This is about having better characterisation of the types of logging
… we currently only have console and JS errors
… in CDP the range of log entry types is much wider than what we have
… this can lead to differences between browsers.
… the question is what types of errors we can support. We can see what Google can do with CDP
… but it would be good to see what the other clients would want. We can do this incrementally, so we don't need to do all of them at once

sadym: we are fine with extending the logs in Chrome. There is some functionality in Puppeteer that shows the owner of the error. We can't do this in BiDi, but in CDP you can have a handle to the object that caused the error
… this could be a different topic to discuss maybe

Consensus: We are happy to extend this.
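
As an illustration of what incremental extension could mean for clients, here is a minimal Python sketch of a handler that summarises `log.entryAdded` params and tolerates entry types it does not know yet. The "network" type used in the usage note is purely hypothetical; only "console" and "javascript" were named in the discussion.

```python
def describe_entry(params: dict) -> str:
    """Return a one-line summary of the params of a log.entryAdded event."""
    entry_type = params.get("type", "unknown")
    level = params.get("level", "info")
    text = params.get("text") or ""
    if entry_type == "console":
        # Console entries also carry the console method that was called.
        return f"[console.{params.get('method', 'log')}] {level}: {text}"
    if entry_type == "javascript":
        return f"[js error] {level}: {text}"
    # Any type added later falls through here instead of breaking the client.
    return f"[{entry_type}] {level}: {text}"
```

A client written this way would keep working if a browser started sending, say, `{"type": "network", "level": "error", "text": "blocked"}`.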

Scroll Into View [Actions]

github: w3c/webdriver-bidi#544

hbenl: we want to bring in an action to scroll an element into view
… this currently can't be done in actions
… This is done by default in Playwright. They don't use a DOM API, they use a CDP method for this
… their scroll can do text nodes and display nodes and can also handle which part of the element needs to be in the viewport
… though they do sometimes use the DOM API. If the first way doesn't work then they use the DOM method to take advantage of the alignment options
… hoping that the element will eventually be visible
… CDP doesn't have the alignment methods
… q1: Do we want this action?
… q2: what options do we want to have on this action
… q3: do we want to have a behaviour option, alignment options, container options, also if needed to have an option to have fallback to try again to bring it into view

jgraham: there is one more question, do we need a separate top level method here or should this be part of actions?
… if we put it in actions we can interleave it with other actions
… it's not related to user actions...

hbenl: if you have multiple clicks in an action chain you want to scroll them into view before the next actions

shs: I think there are cases to have it in actions. There could be a drag and drop where the first element is in view but the 2nd option isn't
… and we discussed this yesterday with double click
… and in the "do what I mean" click we will want to have the element pulled into view before the click so that we can have something like what we have in the classic spec

sadym: I am afraid we will have race conditions as the scroll could take a while
… my concern is we can have race conditions with other actions

jimevans: I believe that in the classic spec the element click has auto scrolling. Are there cases where the element can't be scrolled into view?

<whimboo> https://w3c.github.io/webdriver/#element-click step 5 is scroll into view the element

jimevans: as in they can't be scrolled into view by a user but can be done via execute script
… <describes a scenario about overflow in a container where a person can't but JS can>
… if we are doing this we need to make sure that we handle this case

orkon: I think the important part of the method is "if needed". This isn't always going to be needed. We probably need to have this as it's in Playwright/Puppeteer? Should it be in the client or protocol?
… I don't think we need to have a top level method as people can just call execute script

jgraham: re: race conditions. In other actions we have a way to extend action ticks, so it seems a reasonable way to be able to handle this without
… as for top level or action, I think just having it in actions is sufficient as you can send an action with just a scroll

<Zakim> sadym, you wanted to ask if extending https://www.w3.org/TR/webdriver-bidi/#cddl-type-inputelementorigin with bool param "scrollIfNeeded" be sufficient?

sadym: will it be sufficient to add a param scrollIfNeeded?
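
To make the question concrete, here is a minimal Python sketch of an `input.performActions` command whose element origin carries the proposed flag. The `scrollIfNeeded` field is the hypothetical extension under discussion, not part of the current specification, and the context and node ids are placeholders.

```python
import json


def click_with_scroll(context_id: str, shared_id: str) -> dict:
    """Build an input.performActions command that clicks an element."""
    origin = {
        "type": "element",
        "element": {"sharedId": shared_id},
        "scrollIfNeeded": True,  # HYPOTHETICAL: the flag floated above
    }
    return {
        "method": "input.performActions",
        "params": {
            "context": context_id,
            "actions": [{
                "type": "pointer",
                "id": "mouse-1",
                "actions": [
                    {"type": "pointerMove", "x": 0, "y": 0, "origin": origin},
                    {"type": "pointerDown", "button": 0},
                    {"type": "pointerUp", "button": 0},
                ],
            }],
        },
    }


command = click_with_scroll("ctx-1", "node-42")
print(json.dumps(command, indent=2))
```

With this shape the scroll would stay inside the action chain, which is where the race-condition and "when is the scroll done" questions would still need an answer.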

orkon: I think there could be a challenge knowing if a scroll has completed
… there could be things that happen during a scroll
… there could be issues if we are trying to do a smooth scroll.
… I think there could be a case to scroll the element or a page and that could be a top level method

jgraham: I think limitations apply equally if it is actions or a top level method
… we would need to define "done"
… the other thing is if it should be a param? I agree with orkon that it should be able to do this independently and it could have it work
… I think it should be a top level method
… and should it be an immediate or smooth scroll

orkon: the difference between actions and top level is that one can wait for the scroll to happen. I think that we can have a way

orkon: The other part is whether we're prepared to spec scrollIntoViewIfNeeded, that seems important to the use cases

whimboo: If we're unsure that we can check if the scroll has ended should we make it instant-only for actions? Smooth is definitely something we want to have, but classic only has instant. People are asking for smooth

jgraham: In both actions and a top-level command we have to decide when the scroll is complete; I think it's fine to have smooth scroll for both. Are there concerns with specifying the if needed part for scrolling?

hbenl: I wanted to reiterate the use case of specifying which part of the element should be in view e.g. a rectangle that's inside the element. This is especially important for the if needed part e.g. for drag and drop you might want a specific rect that's in-view. I think at least a position and probably a rect are needed.

jgraham: I think that's reasonable; it's a small extension of what we already support for pointerMove where you can specify a point relative to an element. Being able to specify where the element ends up in the viewport would be more novel.
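
One way to picture the "if needed" part together with a target rect is the following Python sketch. It assumes the rect is given in viewport coordinates and returns the smallest scroll delta that brings it fully into view. The alignment policy (nearest edge, leading edge when the rect is larger than the viewport) is an assumption for illustration, not something the group decided.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float


def _axis_delta(start: float, size: float, view_size: float) -> float:
    """Minimal scroll on one axis so [start, start + size] fits in [0, view_size]."""
    end = start + size
    if start >= 0 and end <= view_size:
        return 0.0  # already fully visible, so no scroll is needed
    if size > view_size or start < 0:
        return start  # align the leading edge with the viewport edge
    return end - view_size  # align the trailing edge with the viewport edge


def scroll_delta_if_needed(target: Rect, view_w: float, view_h: float) -> Tuple[float, float]:
    """Return (dx, dy) to scroll by; (0, 0) means the rect is already in view."""
    return (_axis_delta(target.x, target.width, view_w),
            _axis_delta(target.y, target.height, view_h))
```

Passing a sub-rect of the element rather than its full box covers the drag and drop case where only a specific region has to be visible.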

whimboo: there are cases where people try to move to a target that is a text node. Do we want to handle that by moving up to the parent element and scrolling that? There are cases where the parent is too large, so what do we want to do there?

jgraham: I think we want to figure out a way to get a bounding box for a text node. I am not sure if CSS already handles this for us?
… I don't think we want to go up to the element to do this
… I think the requirement is that we move the text

whimboo: we could do <describes way to get the position of a text node> to handle this

<whimboo> as example this might work: document.createRange().selectNodeContents(textNode)

sadym: since we have consensus above, do we want a dedicated command for this?

jgraham: I think the consensus above was that it made sense in actions due to use cases and the top level commands can be composed of actions

jimevans: as a follow up to jgraham: you can't do getBoundingClientRect on text nodes. whimboo has given an example in the chat
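
Expanding on the chat example, here is a Python sketch of a `script.callFunction` command that measures a text node through a Range, since `Range.getBoundingClientRect()` exists where the text node itself has no such method. The context id and shared id are placeholders, and this is only one possible way to get the box.

```python
# JavaScript run in the page: wrap the text node in a Range and measure it.
TEXT_NODE_RECT_FN = """
(node) => {
  const range = document.createRange();
  range.selectNodeContents(node);
  const box = range.getBoundingClientRect();
  return { x: box.x, y: box.y, width: box.width, height: box.height };
}
"""


def text_node_rect_command(context_id: str, shared_id: str) -> dict:
    """Build a script.callFunction command returning a text node's box."""
    return {
        "method": "script.callFunction",
        "params": {
            "functionDeclaration": TEXT_NODE_RECT_FN.strip(),
            "target": {"context": context_id},
            "arguments": [{"sharedId": shared_id}],
            "awaitPromise": False,
        },
    }
```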

Meeting to continue at 6pm UTC

Mobile emulation

github: w3c/webdriver-bidi#772

sadym: First naive approach would be to make a flag that would be "emulate mobile" whatever that means.
… All the things that Chromium does when emulating mobile are listed in the issue. We propose to address them individually.
… and then introduce common device configuration perhaps.

jgraham: I think dealing with each thing separately makes a lot of sense.
… I'm not sure it makes a lot of sense to explain what, e.g., Firefox does because that's going to change over time.
… There may be a way to publish somewhere the set of settings that would take you close to something.
… I would want to be in control if I wanted to test something, and not leave the browser in control and possibly changing the interpretation.

sadym: The main scenario is for users to emulate specific devices. It would be good to check that Safari and Firefox are working with the settings, as a way not to introduce regressions in tests.

jgraham: Where would you maintain these configurations?

sadym: First: list of features. We want to make sure we didn't miss anything.
… Second: list of known devices, but it can be done on the side, perhaps as part of the test suite.
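
A rough Python sketch of how a device list maintained on the side could expand into individual commands. The preset values are made up and do not describe a real device. `browsingContext.setViewport` is an existing command; the extra command name is a placeholder for whatever per-feature commands end up being specified.

```python
from typing import List

# Illustrative preset only; not a real device description.
EXAMPLE_PHONE = {
    "width": 390,
    "height": 844,
    "devicePixelRatio": 3.0,
    # HYPOTHETICAL per-feature commands, keyed by a placeholder method name.
    "extra_commands": {
        "emulation.hypotheticalTouchOverride": {"maxTouchPoints": 5},
    },
}


def preset_to_commands(context_id: str, preset: dict) -> List[dict]:
    """Expand a device preset into one command per emulated feature."""
    commands = [{
        "method": "browsingContext.setViewport",
        "params": {
            "context": context_id,
            "viewport": {"width": preset["width"], "height": preset["height"]},
            "devicePixelRatio": preset["devicePixelRatio"],
        },
    }]
    for method, params in preset.get("extra_commands", {}).items():
        commands.append({
            "method": method,
            "params": {"contexts": [context_id], **params},
        })
    return commands
```

Keeping the preset as plain data means the test author, not the browser, decides what "emulating this device" means.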

jgraham: OK. First check is to assess whether the list in the agenda matches RDM in Firefox devtools, then.

jdescottes: [going through the list]. Some are not used in Firefox. List needs to be refined, I guess.

( we don't support client hints at all at the moment )

sasha: The media features are not necessarily bound directly to the mobile mode. Some of them, you can turn on or off in RDM.
… The scroll by emulation, media features, but not text sizing, etc.

bburg: Catching up. Some of the things are confusing to me. In Webkit, WebInspector, there is a way to override things such as accessibility contrast.
… We don't really know what the device is or not.
… It sounds like we're talking about a desktop browser emulating a mobile browser, not a mobile browser directly.
… Some clarity on what is being solved would be appreciated.
… Understanding the user scenario would also be super useful.
… I prefer to use existing capability mechanisms whenever possible.
… Media stuff, there are lots of things you specifically need. I don't really know what the spread of features and needs is.
… How are we going subfeatures in web pages, I'm interested to know. Are we talking about WebViews? Emulators?

jgraham: My understanding is that the use case is basically testing in a desktop browser in such a way that it's a reasonable approximation of a mobile browser.
… For example, Playwright allows you to tweak the desktop browser to be like a mobile one, but not to be directly a mobile browser.
… Another slight problem might be KVM access, which may not be possible.
… Goal is to find a sweet spot in the simplicity vs fidelity curve.

jdescottes: I don't believe Firefox RDM mode enables something outside of the list. From that perspective, it does not miss anything.

sadym: I would like to ask Safari folks to look at the list and see if there's something else.
… The goal for us is to enable mobile emulation on desktop devices. That's the easiest way for users to have a good perspective on what the site would be on mobile devices.
… Of course, that's imperfect.

<bburg> I see, so this sounds more like RDM mode options.

Scrollbar emulation

github: w3c/webdriver-bidi#1050

sadym: Scrollbar types. We want to allow for switching between classic and overlay scrollbars to emulate mobile devices.
… With yesterday's discussion in mind, it should perhaps be a dedicated command.
… I believe the decision here would be: would people support setViewport together with scrollbarType, or should they be separate?

sasha: Setting viewport may not be only about mobile emulation. You would not need to set the scrollbar type in such cases.
… The setViewport command was introduced early on and doesn't fit the new way we introduce emulation commands these days. List of user contexts and list of browser contexts are missing. Not the same semantics.
… Device pixel ratio would also be changing the screen settings rather than the viewport.
… I would want setViewport to only change the viewport and have other settings somewhere else.

sadym: I don't think it belongs to emulation. We're switching the scrollbar type, not emulating some of them.
… I'm going to add a dedicated command.

<Zakim> orkon, you wanted to say that desktop systems can have different scrollbar types AFAIK

orkon: Scrollbar types can be relevant for desktop. I think on macOS, you can change that in the settings in particular.

whimboo: How should we behave by default? The resulting viewport may be different and a test could fail as a result.

<sadym> Currently, it is done by an optional param `scrollbarType` on `browsingContext.setViewport`. One can specify it, or keep it default, up to the user

jgraham: I think the default should be whatever the browser does by default.
… If you're using it in a scenario where it matters, then you just have to configure that at the start of your test.
… We shouldn't change anything by default just because we start a WebDriver BiDi session.

sadym: The consensus is to move it to a dedicated command, and to let browsers use their own default.
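
A small Python sketch of what that consensus might look like from a client. The command name `emulation.setScrollbarTypeOverride` is hypothetical; only the `scrollbarType` parameter name and the "classic" and "overlay" values come from the discussion. Returning `None` stands for "send nothing and keep the browser default".

```python
from typing import Optional

SCROLLBAR_TYPES = ("classic", "overlay")


def scrollbar_command(context_id: str, scrollbar_type: Optional[str] = None) -> Optional[dict]:
    """Build the command for a scrollbar type, or None to keep the browser default."""
    if scrollbar_type is None:
        return None  # nothing is sent, so the browser default applies
    if scrollbar_type not in SCROLLBAR_TYPES:
        raise ValueError(f"unknown scrollbar type: {scrollbar_type!r}")
    return {
        "method": "emulation.setScrollbarTypeOverride",  # HYPOTHETICAL name
        "params": {"contexts": [context_id], "scrollbarType": scrollbar_type},
    }
```

A test that depends on the viewport size would call this once at the start; every other test leaves the browser untouched.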

Text sizing

github: w3c/webdriver-bidi#1051

<sadym> the consensus: move it to the dedicated command and allow for setting the default and returning "unsupported" if so

sadym: Minimum readable text size. It's different in different browsers. It tells what is the smallest a readable text can be.
… I'm going to update the issue description.

sadym: The relevant specification is a CSS specification, and it can take a long time to change. Are we fine with landing WebDriver BiDi first with hooks to it and waiting for CSS to adopt it?
… Or do we want to land the BiDi command only when the CSS hooks are approved or pre-approved.

jgraham: I think we'll need to look what implementations do here. I don't know if minimum text size is a good characterization, or if you need text inflation, or if it's more complex than that.
… I don't know enough details to make a call on this.

sadym: We want this unified across the browsers. For sure, we have different engines working differently. We can use that command to do additional logic to emulate things further, which isn't required by CSS.

jgraham: Very quickly, there seem to be 4 different values in the code. The command could be setting mobile text layout for example. By default, the browser has to pick something that is appropriate to the context.

sadym: No split between specific answers.

jgraham: That does not seem like a good idea.
… For the font size inflation that browsers do on mobile, that seems fine.
… I don't think it should affect other mobile emulation stuff.

jdescottes: If we don't have a strong spec that we can hook into, we should not try to add behavior that only Chrome would implement.
… I think this command should still do something that can be matched to a spec and by all browsers as much as possible.

sadym: So consensus is to introduce a command, emulateMobileTextLayout with implementation-specific logic behind it.

Viewport Meta

github: w3c/webdriver-bidi#1053

sadym: Another thing that mobile browsers do but desktop browsers do not is handle the viewport meta tag.
… I didn't read the issue that Sam raised in detail.

sasha: That would be fine for us. We do this in RDM as well. We were planning to have it as a feature as well.

AutomatedTester: Consensus: let's do this as a separate command

Safe area insets

github: w3c/webdriver-bidi#1054

sadym: Insets, I'm not sure how well they are supported by all the browsers. It can be the camera area on the phone.
… It's something that can be emulated, the question is: do we want to have it specified?

AutomatedTester: I think we probably would want to have something like this. I'm not sure how common people would have it but I can see the watch example being quite useful for people to test on a smaller area.

sasha: Firefox also supports it, so we could add it as well. I'm not sure whether it's a priority at the moment. Maybe in the future.

AutomatedTester: Consensus is we'll look into it at some point.

Mobile testing - show / hide virtual keyboard

github: w3c/webdriver-bidi#1059

jgraham: This has come up out of the interop mobile testing.
… People want to test in web apps whether the virtual keyboard is hidden or displayed.
… There's also an API proposal to allow people to show/hide the virtual keyboard.
… It would be useful if WebDriver gave you the control of whether the virtual keyboard is displayed or not.
… See the current proposal in the issue.
… Of course, on some platforms, it may not work. UnsupportedOperation in such cases.

sadym: Are there any other observable effects on top of navigator.virtualKeyboard and viewport changes?

jgraham: I'm not 100% sure how this integrates with the layout API.
… I think that the virtual keyboard is overlaying but I'm not sure that's the case in 100% of all cases.

jdescottes: I want to mention that we have started working on VK for RDM
… we have also started working on dynamic toolbars for mobile
… so I think we should think about handling overlays, e.g. toolbars or keyboards

jgraham: that's the next topic and I think that's harder

sadym: I have 2 main concerns
… on devices that support it, I don't think there is a mechanism for handling an element that launches it
… I don't think there is a scenario where the user expects the VK to appear if there are no forms or inputs, which is a perfect use case for emulation
… and for devices that handle this natively it could use the native device and not the emulated device

jgraham: coming from interop mobile testing, the use cases for devices that have a VK are strong enough
… I think the same API can be used for vk
… I think that it is a bit much for implementors to have to create their own VK and should be allowed to return unsupported operation if they don't support it
… however there have been requests for this in RDM

sadym: to be clear, I don't propose that implementations have their own
… it should just create the observable effects

orkon: I think if we want to run this on real devices that have support for VK, we should be able to do this already. We have concerns about this being in a dedicated command

jgraham: I am not sure that I have understood the concerns other than an extra command. It seems like it would be convenient for users to be able to have this. If we don't have it in WebDriver then we can't test that API at all
… as for emulation, I think one problem is that a device could have touch and have emulated a smaller screen, and you call the VK. Should you get a system keyboard or a smaller VK?
… so should we have a command that only targets the system vk?

orkon: putting emulation aside, can you clarify where this isn't testable if we can already do it via script evaluation? What is not possible?

jgraham: You end up testing the API with the API. If I use the API, do I get the events? It would be nicer to have an automation command to make sure the events are fired
… that way you can test the side effects without calling the API
… I don't know if that will be consistent enough between browsers to make sure it was interoperable
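
For reference, the script-evaluation route orkon mentions could look like this Python sketch, which builds a `script.evaluate` command calling the page-exposed `navigator.virtualKeyboard.show()`. It assumes a browser that implements the VirtualKeyboard API and a page where a suitable element already has focus; the context id is a placeholder.

```python
def show_virtual_keyboard_command(context_id: str) -> dict:
    """Build a script.evaluate command that asks the page API to show the VK."""
    return {
        "method": "script.evaluate",
        "params": {
            "expression": "navigator.virtualKeyboard.show()",
            "target": {"context": context_id},
            "awaitPromise": False,
            # The page API is generally gated on user activation.
            "userActivation": True,
        },
    }
```

This is exactly the "testing the API with the API" situation: the only trigger available is the feature under test.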

orkon: The API says you can show the keyboard. In practice you need an element to make it work properly. That way we need to have a way to pick an element, make sure it gets focus etc
… we don't have a way of hiding the VK and the events?
… so do we need an anchor element?

jgraham: that is a good question. I don't know if, in practice, there is a way to trigger the keyboard without having an anchor element?
… there should be a way to say use this element and set focus and load the vk

bburg: you can use a keyboard on iOS and we won't show the keyboard unless there is a caret for it

consensus: there are cases that are different to the DOM API. If you have a caret, you should freely show/hide the VK. There is something worth pursuing here. jgraham to look into this further

Mobile testing - toggle browser UI state

github: w3c/webdriver-bidi#1060

jgraham: This is for testing on mobile browsers that allows us to set a UI state
… there are bugs on sites between transitions of toolbars
… this feels more problematic than the previous topic
… is there a minimal API that we can create here that we can extend in the future
… the API can start with setting it to visible and then setting it to hidden
… I am not sure if that is going to be sufficient

orkon: I am assuming this isn't web exposed and wonder if there is a way to do this via browsingContext.activate

jgraham: this is about showing/hiding browser UI, e.g. in Firefox for Android, if you scroll to the top of the page it will show the URL bar
… having a browser that scrolls in a test can be difficult. It would be easier to have a "show me the browser with the toolbar enabled and then show me without it enabled"

orkon: I am not sure what the implementation would look like. How would the client learn what to set?
… is there a list of side effects people would expect
… or get not supported if the browser can't do it

jgraham: the site would see the viewport changing
… a site would have an element that appears in one place when the URL bar is hidden and then jumps up when the URL bar is shown; the test checks it's in the correct place
… they should be able to check that the viewport has changed and react accordingly, otherwise it could lead to bugs in the site

orkon: I think this sounds useful but I think we need to do more research to understand the problem better

jgraham: I think this is exposed in CSS via viewport units
… I think there is already an understanding on the platform of how to deal with this, and it makes sense for WebDriver to understand it so that we can test this better
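
As a worked example of the viewport units mentioned here, this Python sketch computes the small, large and dynamic viewport heights for a browser with a single retractable toolbar. The pixel numbers are invented, and real browsers may update the dynamic value with some delay during a transition.

```python
def viewport_heights(full_height: float, toolbar_height: float, toolbar_visible: bool) -> dict:
    """Heights that 100lvh, 100svh and 100dvh resolve to in this simple model."""
    large = full_height                   # toolbar retracted
    small = full_height - toolbar_height  # toolbar fully shown
    dynamic = small if toolbar_visible else large
    return {"100lvh": large, "100svh": small, "100dvh": dynamic}
```

For example, `viewport_heights(800, 56, True)` gives 744 for `100dvh` and `viewport_heights(800, 56, False)` gives 800, which is the change a page would observe when the toolbar state is toggled.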

orkon: so this would only be for what is supported on the devices and not for emulation.

jgraham: I think it would be good to start with what is on the device but there may be use cases for emulation at some point

consensus: Leaning towards adding this but more research to be done

Log argument ownership

github: w3c/webdriver-bidi#214

sadym: with the discussion around log entry
… at the moment, when the log.entryAdded event is sent the owner is always null
… it would be better to have better ownership of the error
… the only issue is that if they want the error they need to make sure that if it is no longer being used they will need to clean it up
… so what is the best way for us to handle this

sadym: is the log.config with the ownership details the one that we would agree on? This was proposed by jgraham

<sadym> `log.configure({ownership})`

<sadym> perhaps with `contexts` and `userContexts`
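
A Python sketch of what such a command payload might look like. `log.configure` is the proposal under discussion, not an existing command. Reusing the "root" and "none" values that script commands accept for result ownership is an assumption, as are the placeholder ids.

```python
from typing import List, Optional


def log_configure_command(ownership: str,
                          contexts: Optional[List[str]] = None,
                          user_contexts: Optional[List[str]] = None) -> dict:
    """Build the PROPOSED log.configure command, optionally scoped."""
    if ownership not in ("root", "none"):
        raise ValueError(f"unknown ownership: {ownership!r}")
    params = {"ownership": ownership}
    if contexts is not None:
        params["contexts"] = contexts
    if user_contexts is not None:
        params["userContexts"] = user_contexts
    return {"method": "log.configure", "params": params}
```

Calling it with no scope would correspond to enabling ownership globally; passing `user_contexts` narrows it to specific user contexts.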

jgraham: I think the reason we landed there before was because of the reasons you mentioned
… as it could lead to leaking of objects
… for devtools console, if you wanted to have those items then you would want to keep them alive
… chrome keeps them alive until they are discarded
… we didn't want the user to be having to know when to free objects
… we have avoided this type of config command in the protocol
… it's not ideal as it creates global state
… with logging, the other use case is to filter what log events to send (e.g. debug, info)
… I don't have a better idea than 4 years ago
… from a client you are setting global state. You can't isolate state

sadym: the lifetime of the object is not longer than the lifetime of the realm
… we can config via context and userContext
… this would allow for precise enough control
… in Puppeteer we would enable it globally

jgraham: I agree that scoping it to a user context is useful
… but there can be a case where a test fixture decided to own objects
… and the test redoes the config which can cause problems
… or a test wants one level of errors but the fixture wants another
… that is the isolation that I had in mind

jgraham: I guess puppeteer doesn't give people the choice and just does it globally but that's not ideal

sadym: it is related to any other state of the browser that can be controlled by 2 different clients. It is a valid concern but we need to make sure that people will only do that if they have a reason
… or we turn it on for everyone and don't allow them to configure

jgraham: no. I think the default is that objects get GC'ed unless they ask for it

jdescottes: would it help to disown everything log related?

sadym: I wonder what Firefox does when it has devtools open? Do you GC it? And for Safari?
… or is it GC'ed when the realm is destroyed

jdescottes: we GC on navigation

sadym: which means you configure to global for devtools

jdescottes: yes for devtools but BiDi might be different

sadym: and for classic?

jdescottes: we don't for classic

sadym: I don't think it's much to wait for the realm to be destroyed. It can default to global or have a configure

jgraham: I would prefer the configure

sadym: the consensus is for us to have a configure for this

orkon: it might also be a good place to set the log levels that you expect to get

ACTION: Create a new issue to handle the log levels as a config
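
To illustrate the kind of filtering a log level setting would imply, here is a minimal Python sketch that applies a minimum-level threshold. The ordering debug < info < warn < error is an assumption about how such a setting would compare levels.

```python
LEVELS = ["debug", "info", "warn", "error"]  # assumed order, lowest first


def should_forward(entry_level: str, minimum_level: str) -> bool:
    """True if an entry at entry_level passes a minimum_level threshold."""
    return LEVELS.index(entry_level) >= LEVELS.index(minimum_level)


def filter_entries(entries: list, minimum_level: str) -> list:
    """Keep only the entries at or above the configured minimum level."""
    return [e for e in entries if should_forward(e["level"], minimum_level)]
```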

Status of the WebDriver BiDi roadmap planning spreadsheet

whimboo: we have this spreadsheet for the BiDi roadmap. I was wondering if it is worth maintaining, as it's not always up to date, or should we do things via GitHub?

sadym: I haven't looked at this in ages. I think github should be a source of truth
… and have a mechanism for clients to use

orkon: we also have a roadmap.md on github. It is 99% complete and we could move the spreadsheet to it
… so do we need a new roadmap or just use github issues?

sadym: how we prototype things is puppeteer things over and mobile emulation as the main priorities

jdescottes: the added value is that it gives the priorities from clients but this may not be reality
… I think the labels on the github would give us the same value

ACTION: create an issue on how to have GitHub as the single source of truth

– DRAFT –
15 January 2026

Attendees