Browser Testing and Tools Working Group update

Presenter: James Graham
Duration: 11 min
Slides: download

Learn how the Browser Testing and Tools Working Group is adding bidirectional capabilities to WebDriver to make it a better fit for testing modern web applications.

Slides & Video

Slide 1

So for those of you who are unfamiliar with Browser Testing and Tools, the purpose of the working group is to create technologies for automating web browsers.

In particular, our focus is on enabling automated testing of web applications running inside those browsers.

Slide 2

So you might ask why do I care?

Why's it important we provide these browser automation APIs?

Well, the modern web is used to deploy complex application software, and shipping reliable software requires testing.

Modern development processes focus on extensive automated test suites, usually as part of a continuous integration and deployment pipeline.

So if we want application authors to write software for the web, it must be easy for them to write automated tests for those applications.

And being an open platform with multiple independent implementations, the web presents some additional complexities compared to other competing platforms.

Despite our best efforts at standardization, and the good intentions of browser engineers and extensive test suites, we do have differences between engines.

So developers want to test that their web app works across multiple engines.

And this isn't always easy.

Indeed, the 2020 MDN Developer Needs Assessment shared that cross-browser testing was the overall number four pain point identified by web developers.

And this really is one of the problems that Browser Testing and Tools is trying to address.

Slide 3

So until recently, the focus of the working group has been on the WebDriver specification.

WebDriver provides an HTTP-based protocol for simulating user interaction with a website, for example, clicking on elements or filling in forms.

It can, of course, also do some things that users can't do, like executing a script. But it was originally based on the Selenium testing framework, and really the model of WebDriver is to simulate user interaction to provide end-to-end testing, the kind of thing that you could ask a real human to do.

And the HTTP-based protocol means that WebDriver is basically command and response, which is actually really good in many ways.

It's a very simple linear control flow, and it means that writing tests is very easy in a wide variety of different programming languages.
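As a rough sketch of that command-and-response flow, the snippet below builds the HTTP requests a client might send for a short test. The endpoint paths and the `css selector` strategy follow the WebDriver Recommendation; the helper names, the session ID, the element ID, and the URL are hypothetical placeholders, and nothing is actually sent over the network.

```python
import json
from typing import NamedTuple, Optional


class Command(NamedTuple):
    """One WebDriver HTTP request: the client sends it, then waits for the reply."""
    method: str
    path: str
    body: Optional[dict]


def navigate(session_id: str, url: str) -> Command:
    # Navigate To: POST /session/{session id}/url
    return Command("POST", f"/session/{session_id}/url", {"url": url})


def find_element(session_id: str, css: str) -> Command:
    # Find Element: POST /session/{session id}/element
    return Command("POST", f"/session/{session_id}/element",
                   {"using": "css selector", "value": css})


def click(session_id: str, element_id: str) -> Command:
    # Element Click: POST /session/{session id}/element/{element id}/click
    return Command("POST", f"/session/{session_id}/element/{element_id}/click", {})


# Hypothetical IDs; a real client reads them from earlier responses.
SESSION = "session-1"
ELEMENT = "element-1"

# A test is a strictly linear list of commands, each answered before the next is sent.
script = [
    navigate(SESSION, "https://example.org/login"),
    find_element(SESSION, "button[type=submit]"),
    click(SESSION, ELEMENT),
]

for command in script:
    print(command.method, command.path, json.dumps(command.body))
```

Because every step is one request followed by one response, a client like this can be written in almost any language that has an HTTP library.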

And WebDriver is kind of done.

It went to Rec in 2018.

It's widely implemented.

And whilst it's still undergoing updates, for example, adding shadow DOM support, it isn't likely to undergo substantial revision at this point.

So the fact that we're still seeing problems with developers writing cross-browser tests suggests that there's more to be done, and more fundamental change is needed, rather than just an incremental update to WebDriver.

Slide 4

So to understand what's required going forward, we need to look at the limitations of the WebDriver protocol.

So modern web applications often don't follow the historic model of interconnected pages with limited internal state.

Instead, they have rich interaction inside the page itself, often with a lot of behavior driven by scripting.

Those scripts might invoke additional I/O in the form of network requests.

And this introduces possible sources of non-determinism.

In the face of all this complexity, it's quite hard to write a reliable test in the command-response model of WebDriver.

And indeed, flaky tests are one of the most common things that users complain about.

So for a testing tool to meet the needs of modern applications, we really want it to have more of a browser's eye view of what's going on, instead of just concentrating on simulating user interaction.

We want to be able to observe internal state changes that are happening in the browser, just like we can, for example, in dev tools.

And the command and response model of HTTP-based WebDriver just isn't a good fit for these requirements.

And these problems are already having an effect in the real world.

Slide 5

So modern testing tools have started to use non-WebDriver based protocols to enable low-level access and control.

These other protocols aren't based on open standards, but they're often things like the browser dev tools protocols or other custom protocols that are invented by the individual automation tools.

So that means even these tools that developers are using aren't cross-browser, or they have to do a lot of additional implementation work for each browser they want to support.

And that's a problem for the open web.

When automation tools are tied to specific browsers, web developers have to make a decision on whether to opt for advanced features or to have cross-browser support.

This increases the opportunity cost of making a site work across multiple browser engines, and so it's important to the health of the web that we change this, and return the browser automation ecosystem to a basis in standards.

Slide 6

So as a result of this, the Browser Testing and Tools Working Group is currently focusing on WebDriver BiDi.

This is a bidirectional protocol for browser automation.

Bidirectional in this context means the browser is able to send events directly to the automation tool, rather than only responding to commands in the way that it could in the traditional WebDriver model.
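To make the difference concrete, here is a minimal sketch of how a client might tell command replies apart from browser-initiated events on a single connection. It assumes the JSON message shape of the WebDriver BiDi draft (commands carry an `id`, events carry a `method` and no `id`) and uses the `session.subscribe` command and `log.entryAdded` event from that draft; the `BidiDispatcher` class is hypothetical, the incoming messages are simulated rather than read from a real connection, and error replies are not handled.

```python
import json
from typing import Callable, Dict, List


class BidiDispatcher:
    """Routes incoming messages to pending commands or to event listeners."""

    def __init__(self) -> None:
        self._next_id = 1
        self._pending: Dict[int, str] = {}   # command id -> method name
        self._listeners: Dict[str, List[Callable[[dict], None]]] = {}
        self.results: Dict[int, dict] = {}   # command id -> result payload

    def command(self, method: str, params: dict) -> str:
        """Serialize a command; a real client would write it to the connection."""
        command_id = self._next_id
        self._next_id += 1
        self._pending[command_id] = method
        return json.dumps({"id": command_id, "method": method, "params": params})

    def on(self, event: str, listener: Callable[[dict], None]) -> None:
        """Register a callback for an event the browser may send at any time."""
        self._listeners.setdefault(event, []).append(listener)

    def receive(self, raw: str) -> None:
        message = json.loads(raw)
        if message.get("id") in self._pending:
            # A reply to a command we sent earlier.
            del self._pending[message["id"]]
            self.results[message["id"]] = message.get("result", {})
        elif "method" in message:
            # An event pushed by the browser, not tied to any command.
            for listener in self._listeners.get(message["method"], []):
                listener(message.get("params", {}))


dispatcher = BidiDispatcher()
console_lines: List[str] = []
dispatcher.on("log.entryAdded", lambda params: console_lines.append(params["text"]))

sent = dispatcher.command("session.subscribe", {"events": ["log.entryAdded"]})

# Simulated traffic from the browser: one reply, then one unsolicited event.
dispatcher.receive(json.dumps({"id": 1, "result": {}}))
dispatcher.receive(json.dumps(
    {"method": "log.entryAdded", "params": {"level": "info", "text": "page ready"}}))

print(console_lines)
```

The second simulated message is the part that HTTP-based WebDriver cannot express: it arrives without the client having asked for it at that moment.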

So that allows providing features that are currently only possible using non-standard automation protocols that we've previously discussed.

For example, common user requests, like access to console log events and network request monitoring and interception.

WebDriver BiDi is aiming to be a slightly lower level protocol than the existing WebDriver.

Instead of starting from the premise that we should replicate the interaction of a real user with a single browser page at a time, WebDriver BiDi will provide the ability to run commands and receive events from all the loaded browsing contexts and different scripting realms, including, for example, running script in the context of a worker.

This will allow people writing tests to observe and interact with the full internal state of their web application.

Now, the problem with inventing a technology like this is that you need to transition the existing ecosystem to your new technology, and that can be hard.

In order for WebDriver BiDi to be a success, it must be possible for existing WebDriver users to transition seamlessly.

And it should also be possible for existing automation tools that aren't using WebDriver to move to the WebDriver BiDi standard with that functionality.

So in order to help existing WebDriver-based tools transition to WebDriver BiDi, we're making sure that it's possible to use BiDi as a sort of value add on top of the existing HTTP-based protocol.

For example, it will be possible to take a list of elements that you got from a WebDriver test using say the find element HTTP command and use that directly with the BiDi protocol.

For example, if over BiDi you get a console log message that includes an element ID, you can compare that directly to the elements that you got through the HTTP protocol.

Or you can use that element ID as the argument to a script that you're going to execute using the BiDi protocol.
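A minimal sketch of that interoperability follows, assuming the serialization used by the WebDriver Recommendation (an element reference stored under the `element-6066-11e4-a52e-4f735466cecf` key) and the `sharedId` field that the WebDriver BiDi draft uses for nodes. The helper names and the sample payloads are hypothetical, and the exact BiDi shapes were still under discussion at the time of the talk.

```python
# Key under which HTTP-based WebDriver serializes an element reference.
WEB_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def element_id_from_http(response_value: dict) -> str:
    """Extract the element ID from a Find Element response value."""
    return response_value[WEB_ELEMENT_KEY]


def node_ids_from_log_event(params: dict) -> list:
    """Collect the IDs of any DOM nodes passed to console.log()."""
    return [
        arg["sharedId"]
        for arg in params.get("args", [])
        if arg.get("type") == "node" and "sharedId" in arg
    ]


def as_script_argument(element_id: str) -> dict:
    """Wrap an element ID so it can be passed as a BiDi script argument."""
    return {"sharedId": element_id}


# Hypothetical payloads: one from the HTTP protocol, one from a BiDi log event.
http_value = {WEB_ELEMENT_KEY: "node-42"}
log_params = {
    "type": "console",
    "method": "log",
    "text": "clicked",
    "args": [
        {"type": "string", "value": "clicked"},
        {"type": "node", "sharedId": "node-42"},
    ],
}

element_id = element_id_from_http(http_value)
logged_same_element = element_id in node_ids_from_log_event(log_params)
argument = as_script_argument(element_id)

print(logged_same_element, argument)
```

The point of the sketch is only that the same opaque ID can be compared and reused across both protocols, with no translation step in between.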

Slide 7

So what's the status of this work?

Well, the spec's currently an Editor's Draft and it's under very active development.

We've got agreed spec text covering the basic transport layer, events related to console logs and commands related to navigation.

This should already make the specification useful for some of the WebDriver clients, which are using browser dev tools protocols or other custom protocols, just as a value add on top of the HTTP WebDriver.

For example, those that are just looking at log events.

Next, we're going to add the fundamental features required to implement a full test client on top of WebDriver BiDi, starting with script execution, which is obviously a very high-leverage feature. We're also looking at including the ability to run scripts in a sandbox, similar to WebExtension content scripts.

We'll be prioritizing features that will add value for existing WebDriver users, for example, the ability to monitor and intercept network requests.

Now, obviously, it's a lot of work to create a full browser automation protocol, so if you'd like to get involved, then the spec development happens on GitHub, and we're really happy to welcome and mentor new contributors.

So, so far, I've talked a lot about why WebDriver BiDi is needed to help web developers with cross-browser testing.

Slide 8

But for this audience, it's also useful to see how it's going to help write tests for the browser engines themselves.

Currently, most testing of browser engines happens with web-platform-tests, and the web-platform-tests runner uses WebDriver to schedule tests and provide features, such as the ability to generate trusted click events through the testdriver API.

So this means that limitations of the WebDriver HTTP protocol show up as limitations of what we can do in web-platform-tests.

For example, it's currently quite difficult to use testdriver across multiple browsing contexts, and even more difficult when there's multiple browsing context groups involved.

And though this is possible to solve, the fact that WebDriver has a strict command-response structure and only allows interacting with a single window at a time makes this kind of feature, involving multiple browsing context groups, very difficult to implement technically.

With WebDriver BiDi, many of these restrictions will be lifted, and it'll be possible to write tests that do things like inspect network requests or send messages between different browsing contexts without resorting to server-side Python code.

It should also make it easy to add features for testing other feature specifications where the nature of the specification (for example, if it involves some kind of hardware) means that it's not well suited to a command-response API.

And so the WebDriver BiDi work should improve our ability to test the platform and make sure that new complex APIs can be implemented by all vendors with as few interop issues as possible.

Slide 9

Okay, so that's the end of my status update for the Browser Testing and Tools Working Group.

If you have any further questions, please contact us on the WebDriver channel on W3C IRC, which I note is also available via a Matrix bridge if you happen to be a Matrix user.

Or come and join the specification discussions on GitHub.

Thank you.
