W3C

– DRAFT –
Browser Testing and Tools WG @ TPAC - Day 2

27 October 2020

Attendees

Present
(jodvarko in the invite), AutomatedTester, brwalder, cb, drousso, jgraham, jimevans, shengfa, simonstewart, whimboo
Regrets
-
Chair
AutomatedTester
Scribe
AutomatedTester, David Burns

Meeting minutes

RRSAgent: nolisten

<jgraham> RRSAgent: make logs public

<jgraham> RRSAgent: make minutes v2

https://www.w3.org/wiki/WebDriver/2020-TPAC

BiDi Bootstrap scripts

https://github.com/w3c/webdriver-bidi/issues/63

jgraham: before we start on that, it might be worth noting that I raised an issue from yesterday's discussion issue 63

<jgraham> GitHub Topic: https://github.com/w3c/webdriver-bidi/issues/65

brwalder: Bootstrap scripts run as the first thing executed in a realm as it is created. They can introspect the realm or set up the page with whatever is needed for testing
… in a bidi world it would be good to have a mechanism for bootstrapping scripts and for sending messages back to the local end
… there is a proposal that lets us assign a bootstrap script to a URL pattern; when the pattern is matched, the script is injected into that realm
… if we had to inject it based on a realm ID, that would create a race condition that we would like to avoid
… The script would allow you to work on newly created realms
… [describing how a JS port and JS works in this context]
… one scenario this enables is listening for events on a page and sending them from the bootstrap script to the local end
… and it allows us to support items without having to specifically add support for each of them
… and we can have a scenario for handling JS errors and causing a test to end quicker
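
A minimal sketch of the URL-pattern registration idea described above, not taken from the proposal itself. It assumes glob-style patterns, and the names `BootstrapRegistry`, `register` and `scripts_for` are hypothetical. The point it illustrates is that scripts are registered before any realm exists, so nothing has to be keyed on a realm ID.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Tuple


@dataclass
class BootstrapRegistry:
    """Hypothetical remote-end registry of bootstrap scripts keyed by URL pattern."""

    scripts: List[Tuple[str, str]] = field(default_factory=list)

    def register(self, url_pattern: str, source: str) -> None:
        # Registration happens up front, before any matching realm is created.
        self.scripts.append((url_pattern, source))

    def scripts_for(self, url: str) -> List[str]:
        # On realm creation, collect every script whose pattern matches the
        # realm's URL, in registration order. Glob matching is an assumption.
        return [src for pattern, src in self.scripts if fnmatchcase(url, pattern)]


registry = BootstrapRegistry()
registry.register("https://example.com/*", "window.__errors = [];")
registry.register("*", "window.__bootstrapped = true;")
```

With this shape, a realm created for `https://example.com/app` would receive both scripts, while a realm for any other URL would receive only the second.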

jgraham: Overall this is a good idea, and it's a powerful idea. I have had cases where webdriver has not been able to do this.
… there are a number of open questions; these are partly about script execution and partly about this feature
… is this for scripts associated with documents only?
… e.g. worklets could be very difficult
… The other item to wonder about is that this executes before the page has loaded. Do we need to have guarantees around the shape of the document/DOM?
… and are scripts sandboxed?
… A precedent here is extensions
… The final thing is around the value communication issue. Message ports are OK but they only work with certain values
… and we should maybe model it like a DOM API?

brwalder: Should this work in a document or all realms? The proposal doesn't mention this just yet
… and this touches slightly on the discussion yesterday on how script execution works
… so yes we need to have more expressive pattern matching
… re: sandboxing there are cases where we want to be limited to the sandbox and sometimes we don't
… the proposal already handles this
… re: message port... using message ports as specified won't work here. I think we need to have a better way to do the serialisation

simonstewart: I second the proposal on the serialization
… is the message port just 1-way comms?

brwalder: I missed mentioning this earlier, there would be a specific command for this

simonstewart: with worklets... they are supposed to be high performance... how will this impact their performance?

jgraham: my thought is that we start by making this work on JS realms that are part of a Window global and then extend to other realms as the use cases present themselves

simonstewart: I can see people wanting workers, and we can add them as the need arises, but worklets are less likely
… though we can see about extensions in later versions of the specification

jgraham: yes, we can easily do that.
… so the question to implementors, are there any concerns in this area?
… and hopefully it will be like extensions

brwalder: this is certainly possible... CDP has the bootstrapping but it's for ALL realms
… and is possible

brrian: and in webkit it is possible

Resolution: brwalder to create a formal PR for this area.

jgraham: we want to make sure that this works with general script execution
… the open issues on general script execution are :
… a) how will this work on sandboxing
… b) what the API for comms will look like? Message Ports? DOM API?

github-bot: end topic

Navigation

<jgraham> GitHub Bot: https://github.com/w3c/webdriver-bidi/issues/43

Github topic: https://github.com/w3c/webdriver-bidi/issues/43

jgraham: the point of this topic is how navigation should work in bidi
… [reads from the comment in the issue]

simonstewart: From the comments in the issue, when the navigation starts it returns automatically and then people will need to listen for events
… I think that it would be good to have WebDriver HTTP reformulated to work on top of bidi here.

brwalder: From reading the comments, I imagine this as a command that doesn't return instantly, and then listening for events
… so people who want to use this can then use the original page load strategies from WebDriver HTTP
… as doing it as instant return and listening for events is making things very complex for what should be thought of by users as simple
… and I agree with simonstewart here

simonstewart: I like the ability to pass in the page load strategy

<mathiasbynens> +1 global state (setting a default page load strategy) is evil

jgraham: global state is evil and we should avoid it

brwalder: I wasn't thinking of a global state; the client would be sending this per command

jimevans: is there mileage in leveraging the DOM events, like puppeteer?
… It would be relatively easy for a client library to describe its page load strategy based on DOM events
… WebDriver HTTP kinda does this for its page load strategy
… would allow clients to become opinionated in this way

simonstewart: brwalder highlighted that if you wanted to avoid the dance of setting things up correctly
… and that it would be a lot of work for clients

jgraham: my initial take is page load strategy does things simply
… and as whimboo will attest, navigation is full of edge cases
… [describes a use case]
… and we need to make sure that the load event matches where you're expecting
… and what does navigate return?
… in CDP you get a loader event
… I don't know if this is described in the platform

shengfa: I would like to comment on how ChromeDriver does it
… there are loads of edge cases here
… we try to keep track of all the frames
… and we look at frame start/stop loading events
… and we check readyState
… I don't know if we need to specify things better in bidi
… so do we want to expose the page loading strategy or all the events
… for exposing the events, we can specify the webdriver events
… or expose the events we think are relevant to the client
… and regarding the workflow, the navigation command would just ack and the client would then listen for events
… and the navigation is just a "subscribe to these events" type call

brrian: from Safari's side there is a lot of junk in this area
… and people do want events
… and the happy path should look nice to them
… and I don't want to be exposing the loaders to the clients
… and we should make sure that we solve the correct use cases

brwalder: It sounds like there are 2 legitimate paths for end users
… a happy path -> the page is just loaded
… and the more exotic use cases, which allow people to do more "advanced" things
… and we can accept the page load strategy and we do "best case"
… and then provide an escape hatch and allow immediate return and listen for events
… and I think we should support both

jgraham: there is definitely a consensus on making sure th
… what WebDriver HTTP already does and then for what ee

events.
… [describes CDP implementation]
… we should look into whether it is feasible to tie the events you subscribe to back to a navigation, so that we know they relate to the initial command
… and we don't want people to have to guess that an event came from the command... hopefully
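
A minimal sketch of the correlation idea raised here, under the assumption that the navigate response and every related event carry the same navigation id. The `navigation` field, the `nav-N` id format and the `NavigationTracker` class are all hypothetical and not from the issue.

```python
import itertools
from collections import defaultdict


class NavigationTracker:
    """Hypothetical local-end helper that groups events by originating navigation."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.urls = {}
        self.events = defaultdict(list)

    def start(self, url):
        # Assumed: the navigate command's response carries an id like this.
        nav_id = f"nav-{next(self._ids)}"
        self.urls[nav_id] = url
        return nav_id

    def on_event(self, event):
        # Only events that name a known navigation are attributed to a command;
        # anything else is ignored rather than guessed at.
        nav_id = event.get("navigation")
        if nav_id in self.urls:
            self.events[nav_id].append(event["type"])


tracker = NavigationTracker()
nav = tracker.start("https://example.com/")
tracker.on_event({"type": "domContentLoaded", "navigation": nav})
tracker.on_event({"type": "load", "navigation": None})  # unrelated event
tracker.on_event({"type": "load", "navigation": nav})
```

The client never has to infer from timing which command an event belongs to; events without a matching id are simply not attributed.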

simonstewart: one thing we have forgotten so far... what if this is coming from a Selenium cloud provider?
… it would be good to make sure that we don't send too much data back for those clients as we have 2 internets in the way

Resolution: Specify navigation taking a page load strategy parameter to match WebDriver HTTP and investigate the page load life cycle events and exposing those
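
A minimal sketch of what a per-command page load strategy could look like, assuming the same mapping WebDriver HTTP uses between strategies and document readiness states ("none" waits for nothing, "eager" for "interactive", "normal" for "complete"). The `navigate` signature and the simulated stream of readiness states are hypothetical; a real remote end would observe the document instead.

```python
# WebDriver HTTP maps each page load strategy to a document readiness state.
READINESS_FOR_STRATEGY = {"none": None, "eager": "interactive", "normal": "complete"}
READINESS_ORDER = ["loading", "interactive", "complete"]


def navigate(url, strategy, ready_states):
    """Return the readiness state at which the (hypothetical) command completes.

    `ready_states` stands in for the readiness changes observed on the page.
    """
    target = READINESS_FOR_STRATEGY[strategy]
    if target is None:
        # "none": return immediately; the client listens for events itself.
        return None
    for state in ready_states:
        if READINESS_ORDER.index(state) >= READINESS_ORDER.index(target):
            return state
    raise TimeoutError(f"{url} never reached readiness state {target!r}")


states = ["loading", "interactive", "complete"]
eager_result = navigate("https://example.com/", "eager", iter(states))
normal_result = navigate("https://example.com/", "normal", iter(states))
none_result = navigate("https://example.com/", "none", iter(states))
```

Because the strategy is an argument of each call, no session-wide default is needed, which matches the concern about global state raised earlier in this topic.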

github-bot: end topic

Logging

jgraham: One of the value add module proposals is a logging module
… we know that Selenium have asked for this as a priority
… so it would be good to get requirements

github topic: https://github.com/w3c/webdriver-bidi/issues/45
… is this good enough or do we need to extend it?
… do we need to get browser internal logs? Intermediary logs?
… and for clarification network logging is not part of this proposal and we can add it to a newer API

AutomatedTester: I have a few questions around this: first on filtering of logging, as cloud providers could DDoS clients, and then how granular is the data?

jgraham: on the 2nd question: the spec currently allows people to handle this per browser context or realm
… and on the first question, we don't have a mechanism for this at the moment
… and we don't have a precedent in this area to learn from

AutomatedTester: I was thinking of the difference between console.log vs console.error and only getting the latter

jimevans: At a high level, what I hear from Selenium users is
… they want to get console.log between 2 time points
… and if there are unhandled JS errors, notify me

<jgraham> RRSAgent: make minutes v2

jimevans: less frequently, what I hear is: I want to get the performance logging that I see in my devtools console
… at a bare minimum we need to do the console logs and unhandled exceptions
… and I agree that there are concerns around chattiness
… from my experience of using CDP, those 2 things are similar to me
… it would be good to be able to find out from a driver what logging it supports
… and we need to agree on the general shape of the log entries

jgraham: The performance is in scope for the spec but not high priority
… since perf is browser specific... it's important but not low hanging fruit
… the main issue that I am hearing is what type of filtering is there

cb: from my experience implementing perf tooling it would be hard for us to spec
… it would be good to have the log entry be more generic about where it's coming from

simonstewart: this is bikeshedding. There shouldn't be a logging module. It should be on a JavaScript or Console module

<simonstewart> Or a `Console` module

simonstewart: and happy to take this bikeshedding to the proposal later

drousso: as a friendly warning there are always new console messages... and filtering should be done via an event
… this is how webinspector works

jgraham: luckily this is what is in the proposal, but thank you for the warning on new items being added

Resolution: Take proposal in issue and turn into spec prose
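
A minimal sketch of the two filtering requests voiced in this topic (only certain levels such as console.error, and entries between two time points). The `LogEntry` shape and the `select` helper are hypothetical; the group had not yet agreed on the general shape of log entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log entry shape; field names are assumptions."""

    level: str        # e.g. "log", "warn", "error"
    text: str
    timestamp: float  # milliseconds, on whatever clock the remote end uses


def select(entries, levels=None, start=None, end=None):
    """Keep entries matching the given levels and inclusive time window."""
    selected = []
    for entry in entries:
        if levels is not None and entry.level not in levels:
            continue
        if start is not None and entry.timestamp < start:
            continue
        if end is not None and entry.timestamp > end:
            continue
        selected.append(entry)
    return selected


entries = [
    LogEntry("log", "page ready", 10.0),
    LogEntry("error", "Uncaught TypeError", 20.0),
    LogEntry("error", "Failed to fetch", 40.0),
]
errors_only = select(entries, levels={"error"})
in_window = select(entries, start=15.0, end=30.0)
```

Whether such filtering runs on the remote end (to reduce chattiness over the wire) or on the local end is exactly the open question discussed above; the sketch takes no position on that.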

github-bot: end topic

Putative modules

<simonstewart> https://github.com/SeleniumHQ/selenium/blob/trunk/java/client/src/org/openqa/selenium/devtools/idealized/Domains.java

simonstewart: In the Selenium project we need to support multiple CDP implementations in each of the Chromium drivers
… and I have put the link above that shows this "idealised" API
… jimevans has correctly pointed out that the naming isn't the best but we are homing in
… and we need to think of ways of naming the modules so that they don't clash with CDP or webinspector

jgraham: In terms of implementing this in gecko, we agree it's an issue.
… we have a partial implementation of CDP
… and we want to reuse as much as possible for WebDriver BiDi
… and we want to make sure that we don't end up dancing around names that are obvious to use but that we can't use because CDP is using them
… and we don't want anything that will make migration hard
… and we also don't want to have to be super careful about how we name things

<simonstewart> https://trac.webkit.org/browser/webkit/trunk/Source/JavaScriptCore/inspector/protocol

simonstewart: a lot of the conversation is around CDP and I have put a link to the JSC protocol
… it doesn't have the sprawl that CDP has but doesn't have all the things
… jgraham, have you found a way to not clash and still reuse?

jgraham: it's a question that is up in the air; we are currently discussing it
… we see this as a real problem. We know we can't always reuse code but want to where possible
… we can see where we get. I want to make sure we don't get to a place where we can't use Console where it is the best name
… and I think from a Firefox point of view is that you can opt into a specific protocol
… I can see where people would want to make sure that they can use both protocols because implementations are not complete
… we need to make sure it's not too hacky either

simonstewart: as a concrete proposal we don't constrain ourselves and we just namespace it. That way we can have the name we want and then they get mapped internally to the correct place
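
A minimal sketch of the namespacing idea, assuming each command is identified by a (namespace, name) pair that the implementation maps to an internal handler. The namespace labels, the command name `Console.enable` as used here, and the handler names are all hypothetical illustrations.

```python
# Hypothetical routing table: the same public name can exist in two
# namespaces and still map to different internal handlers.
INTERNAL_HANDLERS = {
    ("bidi", "Console.enable"): "bidi_console_enable",
    ("cdp", "Console.enable"): "cdp_console_enable",
}


def route(namespace, method):
    """Resolve a namespaced command to the name of its internal handler."""
    try:
        return INTERNAL_HANDLERS[(namespace, method)]
    except KeyError:
        raise ValueError(f"unknown command {namespace}:{method}") from None


bidi_handler = route("bidi", "Console.enable")
cdp_handler = route("cdp", "Console.enable")
```

The obvious name stays available to both protocols because the clash is resolved by the namespace rather than by renaming.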

<drousso> +1, we should use good names for the sake of good naming, not because something else already uses it

<jgraham> Also +1

brwalder: I was going to say what simonstewart said
… if we keep proprietary and bidi on separate sockets then things will be simple
… we shouldn't reuse the same channel
… [describes how we could do extension commands]

Navigation during commands for WebDriver HTTP

<jgraham> fission == site isolation

whimboo: With the Mozilla Fission project (site isolation) that we are doing, there are some unknown issues
… so when there is a click or actions
… and in execute script
… and when things like these happen, what should we be doing?

jgraham: for clarity fission is site isolation
… so we want to know what to do if there is a long-running task and the page loads, as the state could then all be lost
… so should we store in the parent process or abort

whimboo: execute script is the worst as we inject the whole script, so we don't know where it was when the actor is "flipped"

simonstewart: it depends... some commands need to survive a new page
… but with actions we never wanted to support it and it's legitimate to error out to the user
… we may want to store state around mouse and keyboard
… in execute script we should error because people are probably not expecting it
… but I think we need to go through the commands and do it case by case, which I am happy to do at a later stage

jgraham: I think this makes sense and we need to update the spec text
… and we may want to see how other implementations handle this
… if people are not aware of use cases then we can go have a look

<jgraham> RRSAgent: stop

Summary of resolutions

  1. brwalder to create a formal PR for this area.
  2. Specify navigation taking a page load strategy parameter to match WebDriver HTTP and investigate the page load life cycle events and exposing those
  3. Take proposal in issue and turn into spec prose
Minutes manually created (not a transcript), formatted by scribe.perl version 123 (Tue Sep 1 21:19:13 2020 UTC).

Diagnostics

Maybe present: brrian, github-bot, RRSAgent