gsnedders: hey :) what's the github bot name?
<Honza> * jodvarko in the meeting invite
State of the Union
<jgraham> RRSAgent: make logs public
<jgraham> RRSAgent: make minutes v2
AutomatedTester: The WebDriver specification is mostly in maintenance mode, and wpt has been improving. Thanks to all vendors for their work in this area
<cb> David, should we record this session?
jgraham: Since TPAC last year, the WebDriver BiDi specification has been created. It adds APIs that are not readily available in WebDriver. The new APIs are chosen based on what people want and on other work that builds on proprietary APIs
… We have seen clients like Selenium adding new APIs that we think people will need, and clients such as Puppeteer/Playwright/Cypress that use the proprietary APIs directly
foolip: Not much to add... the justification from Google is that cross-browser support and ergonomics are not where we need them to be for anyone
… coming out of TPAC, it would be good to extend our near-term roadmap into a longer-term one
brwalder: In addition to how to start sessions, we have started using CDDL to allow us to generate clients for the new spec
simonstewart: The Selenium project has started work on an "idealised" API for how we use domains, and it would be good to discuss the modules later
<cb> jgraham: too much work is missing from my side to be properly prepared, I will follow up on it soonish but it doesn't need to be part of our TPAC meetings
BiDi priorities - what do clients require to move features over
jgraham: This is a question about the transition process from CDP to BiDi. If we want people to move over from CDP, as an example, to WebDriver, what APIs should we start with?
AutomatedTester: We need to make sure that we don't break Selenium, Puppeteer, Cypress
jgraham: which features of CDP are you using and are planning to use?
simonstewart: The first thing we do, during session creation, is find where the CDP endpoint is and possibly rewrite that endpoint
… we have been looking at the use cases people have
… we created some domains to get an idealised API and have built our own APIs on top
… we took a very use-case-driven approach
jgraham: That's great. When thinking of the spec, there are 2 ways to do this. There is creating a session and script execution, or there are the value-added items on top of WebDriver 1 and 2
simonstewart: this goes back to TPAC 2019, we had use cases discussed
foolip: we could approach this by aiming to be feature complete, and that would be fun, but I suggest doing what you suggested
jgraham: This is going back to a discussion at a previous F2F call
… basically the way the spec is currently worded, you will always get a response to a command that is sent.
… Events can always come at any point
… and if I remember correctly, at a previous meeting Apple suggested that these responses could come out of order
… in the spec we can easily fix this but I want to check if people are ok with this
brwalder: We discussed this in the chromium implementation recently and this makes sense
… especially for people who are using CDP
… and this would ease the transition for people moving from CDP to BiDi
… this allows for scenarios where people could get a status update on a command and then get a full response when it's done
… and there are network interception scenarios
… and by not enforcing the order we make the protocol more powerful
drousso: Web Inspector will always return responses in the order the commands are sent unless a command is explicitly async
<brrian> Example async commands in Web Inspector: network callbacks, IndexedDB commands, anything that could take a long time.
jgraham: my question is then... should we make them async explicitly?
simonstewart: the programming model for the original selenium is not great for the async work
… and I don't think we need to explicitly mark it as async
jgraham: I think the models are isomorphic
… if you want to represent a response that could take a long time, you could return an empty placeholder and have an event come later to replace that placeholder, or match it up via an ID
simonstewart: I don't see value in adding this to webdriver http, so we need to add it as purely event driven
brrian: I would prefer to have it all async, but if we can't then we definitely need to show it clearly
foolip: In terms of preferences I would go for everything to be async
… I think where order is guaranteed we need to make sure that we highlight it
simonstewart: It's worth calling out that commands are executed in the order they are sent but responses might not come back in that order
jgraham: The natural way of writing this will guarantee that
… the commands will be executed in order
Resolution: Agreement that we should have commands async and any further discussion can happen in the PRs for this work
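A minimal sketch of what the resolution above implies for a client, assuming each command carries a numeric `id` that its response echoes back while events carry no `id`. The message shape, the `PendingCommands` class and the `example.*` method names are illustrative assumptions, not taken from the spec or from any PR.

```python
import json
from typing import Any, Callable, Dict, List


class PendingCommands:
    """Matches responses to commands by id, so arrival order does not matter."""

    def __init__(self) -> None:
        self._next_id = 0
        self._callbacks: Dict[int, Callable[[Any], None]] = {}
        self.events: List[dict] = []

    def send(self, method: str, params: dict, on_result: Callable[[Any], None]) -> str:
        """Serialise a command and remember who is waiting for its response."""
        self._next_id += 1
        self._callbacks[self._next_id] = on_result
        return json.dumps({"id": self._next_id, "method": method, "params": params})

    def receive(self, raw: str) -> None:
        """Route an incoming message: responses carry an id, events do not."""
        message = json.loads(raw)
        if "id" in message:
            callback = self._callbacks.pop(message["id"])
            callback(message.get("result"))
        else:
            self.events.append(message)


client = PendingCommands()
results: List[Any] = []
# Hypothetical method names; commands are sent in this order.
client.send("example.slow", {}, lambda r: results.append(("slow", r)))
client.send("example.fast", {}, lambda r: results.append(("fast", r)))

# The second command's response arrives first, with an event in between.
client.receive(json.dumps({"id": 2, "result": "done quickly"}))
client.receive(json.dumps({"method": "example.event", "params": {}}))
client.receive(json.dumps({"id": 1, "result": "done slowly"}))
```

Because the client looks responses up by `id`, it does not depend on them arriving in the order the commands were sent, and an event arriving between two responses is simply queued.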
Targets, contexts, realms
jgraham: One of the things we need in an automation protocol is to know where commands are addressed
… and we wanted to address commands to specific areas
… e.g. for resizing a window we need to make sure we are in the correct area
… so the question is what is the shape of the API here?
… we are taking our cues from CDP here and we want to be close to CDP to maximise moving the ecosystem over to the Bidi work
… There is an anomaly here from CDP, likely due to its history
… so using a browser context here would be good but we need to discuss it
… I have put a PR up at https://github.com/w3c/webdriver-bidi/pull/62
<gsnedders> github topic: https://github.com/w3c/webdriver-bidi/pull/62
… I agree that we need a much higher concept here
… and a browser context makes a lot of sense for that
<brrian> Enumerating all the context types seems like a fool's errand, new ones get added all the time. Web Inspector now supports JSContext, AudioWorklets, Workers, Web Workers, Service Workers, Extensions/content scripts, and normal pages. I'm sure there will be something next year.
<brrian> IMO, for the purpose of restricting which commands work for which contexts, it would make more sense to focus on capabilities of a context (has JS, has DOM, etc) and allow introspection of the context type.
<drousso> there's also new types of contexts being created, like worklets
<drousso> and more annoyingly, different worklets have different behaviors about what/where the execution context lives
brwalder: it makes sense that we have them as addressable
foolip: James, could you direct us to the problems that there might be here?
<gsnedders> that just implies we need to be able to extend the set, rather than it being impossible to enumerate them? we see similar with IDL which enumerates contexts
jgraham: My main concern is the migration path for clients trying to support both versions of the protocol. E.g. Puppeteer supporting CDP and bidi
… and I don't want to hit complex pitfalls around multiprocesses
jgraham: I see the concerns but I don't see what their concrete implications are
<simonstewart> It almost sounds like realms have capabilities
brwalder: we need to make commands forward compatible so new commands just fit in
jgraham: I might be misusing the terms
… and the context is a browsing context and not a JS context
… I was thinking of them as discrete items based on their capabilities
brrian: Other concern (maybe already addressed), are we trying to specify what contexts are top-level and which are not? And the relationship (i.e., a ServiceWorker can't be subcontext of an iframe)
jgraham: yes, so the service worker wouldn't have a browsing context
simonstewart: is it possible to reframe the idea of the context in terms of capabilities, so that you send a command to something that matches the capabilities?
jgraham: for clarity, "capabilities" here does not mean WebDriver capabilities
foolip: I assume solutions would be isomorphic
<simonstewart> Getting the list of targets that match a given "feature" seems like a new command
foolip: if we are targeting a command to a realm that matches "features"
simonstewart: so then a command takes a union, not 2 targets
foolip: that makes sense
… what is the complexity we are trying to reduce here?
simonstewart: [reads what brrian mentioned earlier in this topic]. So we have the problem of forward compatibility
… and realms seem like a specialised feature here
… and brwalder mentioned earlier that we would do feature matching (pattern matching)
<brwalder> was paraphrasing brrian
jgraham: I think these are all describing the same thing
… a realm has the ability to execute JS
… and a browsing context has everything, including the realm
… so you could execute JS in a browsing context and it automatically finds the realm
… so the features model is what we want to follow
… and there are likely to be "specialisations"
… so it will do the thing or it will error loudly. E.g. Get a DOM node or error out saying you can't
… do we think that we have a model here that we can actually execute on?
simonstewart: When we have something to play with we can have more of a discussion
foolip: what is the decision that we are facing here?
jgraham: The current design looks fine. There might be changes in the future when we get more concrete use cases
foolip: I am struggling to think through the hierarchy of things, how to go down it, and then the targeting
jgraham: there is no doubt that we need to inform the user what the realm is.
… and this is more of a spec organisation issue
… so if another spec adds a new realm, we can handle it because it fits into this set of features
Resolution: Treat browsing context and realms as 2 concepts. Ensure the model is extensible to future items. This is not a great summary
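A minimal sketch of the "features" model discussed above, assuming each target advertises a set of features (for example "has JS", "has DOM") alongside an introspectable type, and a command either runs or errors loudly when a required feature is missing. The `Target` class, the feature names and the `example.*` command names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Target:
    target_id: str
    kind: str                 # introspectable type, e.g. "page" or "service-worker"
    features: FrozenSet[str]  # what the target can do, e.g. {"js", "dom"}


# Hypothetical commands and the features each one needs.
REQUIRED_FEATURES: Dict[str, FrozenSet[str]] = {
    "example.evaluate": frozenset({"js"}),
    "example.getNode": frozenset({"js", "dom"}),
}


class UnsupportedCommand(Exception):
    """Raised when a target lacks a feature the command needs."""


def dispatch(command: str, target: Target) -> str:
    """Run the command on the target, or fail loudly if it cannot apply."""
    missing = REQUIRED_FEATURES[command] - target.features
    if missing:
        raise UnsupportedCommand(
            f"{command} needs {sorted(missing)} but {target.kind} lacks them"
        )
    return f"{command} -> {target.target_id}"


def targets_with(features: FrozenSet[str], targets: List[Target]) -> List[Target]:
    """List the targets that offer every requested feature."""
    return [t for t in targets if features <= t.features]


page = Target("page-1", "page", frozenset({"js", "dom"}))
worker = Target("sw-1", "service-worker", frozenset({"js"}))
```

A new kind of target added later only needs to declare its features; commands that require something it lacks are rejected without the command table having to enumerate every target type.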
Github Issue: https://github.com/w3c/webdriver-bidi/issues/18
github-bot: end topic
<brwalder> https://github.com/w3c/webdriver-bidi/issues/16 looks like it's related to the PR
jgraham: We have just been discussing realms
… one of the items that we need to do to prove out the spec is add support for executing script
… There are a few questions:
… How do we serialise data?
Github Issue: https://github.com/w3c/webdriver-bidi/pull/57
… Should we follow the WebDriver HTTP approach of using `arguments`?
simonstewart: how adrift are we from the WebDriver HTTP serialisation?
jgraham: it is different to how devtools approaches it
… it will return a handle to a value so you pass that on and then carry on
… and something like CDP has the N+1 problem around `querySelectorAll`: it needs an iterator that does N+1 loops
… and there are improvements to CDP that would help here
… and this fronts it
<foolip> https://chromedevtools.github.io/devtools-protocol/tot/Runtime/#method-evaluate is what I've been looking at. `returnByValue` is what turns JSON serialization on/off. I'm not sure what `generatePreview` does, maybe that has something
simonstewart: If you return a list via Execute Async it would return a handle to the list and then a handle to each item in the list that you need to collect
jgraham: that is correct.
brwalder: The `generatePreview` just gives a "shape" of what can be returned
simonstewart: doing N+1 can be a lot of work if you have a service provider like BrowserStack/Saucelabs
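A minimal sketch of the N+1 cost described above, assuming a handle-based model where one request returns a handle to the list and each item then needs its own request, compared with a by-value model where the whole list comes back in one response. `CountingRemote` and its methods are hypothetical stand-ins, not CDP or WebDriver calls.

```python
from typing import List


class CountingRemote:
    """Stand-in for a remote browser that counts protocol round trips."""

    def __init__(self, texts: List[str]) -> None:
        self._texts = texts
        self.round_trips = 0

    def query_all_as_handle(self) -> dict:
        """One round trip: returns a handle to the list plus its length."""
        self.round_trips += 1
        return {"handle": "list-1", "length": len(self._texts)}

    def get_item(self, list_handle: str, index: int) -> str:
        """One round trip per item fetched through the list handle."""
        self.round_trips += 1
        return self._texts[index]

    def query_all_by_value(self) -> List[str]:
        """One round trip: the whole list is serialised in the response."""
        self.round_trips += 1
        return list(self._texts)


texts = ["a", "b", "c"]

# Handle model: 1 request for the list, then N requests for the items.
by_handle = CountingRemote(texts)
listing = by_handle.query_all_as_handle()
fetched = [by_handle.get_item(listing["handle"], i) for i in range(listing["length"])]

# By-value model: a single request returns everything.
by_value = CountingRemote(texts)
copied = by_value.query_all_by_value()
```

With a remote service provider every round trip pays network latency, so the gap between N+1 requests and one request grows with the size of the list.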
simonstewart: is there a reason why we aren't using the WebDriver HTTP serialisation?
jgraham: yes, there are times we want to return a handle to JSON objects as well
… we want to try to have the best of CDP and WebDriver HTTP here
foolip: could someone explain the webdriver http approach?
simonstewart: the tl;dr is: if it is a basic type, return that; if it's a web element, return the UUID that represents it. It expands objects to JSON objects.
… there is special casing for windows
<foolip> Is it https://w3c.github.io/webdriver/#dfn-json-clone?
jgraham: it's `JSON.stringify` but with special handling for windows and elements?
foolip: that feels different to James's proposal
jgraham: WebDriver HTTP would fail in cases where values can't be serialised, such as cyclic structures etc
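A minimal sketch of the serialisation rules as summarised in this exchange: basic types are returned as-is, an element is replaced by a UUID reference, lists and objects are expanded recursively, and cyclic values fail. The `Element` class, the `ELEMENT_KEY` name and the `Serializer` class are assumptions made for illustration; this is not the spec's algorithm and omits the special casing for windows.

```python
import uuid
from typing import Any, Dict, Optional, Set


class Element:
    """Stand-in for a DOM element; not a real WebDriver type."""

    def __init__(self, tag: str) -> None:
        self.tag = tag


ELEMENT_KEY = "element-ref"  # hypothetical key naming an element reference


class Serializer:
    def __init__(self) -> None:
        self._elements: Dict[str, Element] = {}  # reference -> element

    def _ref_for(self, element: Element) -> str:
        """Return a stable UUID for the element, creating one on first use."""
        for ref, known in self._elements.items():
            if known is element:
                return ref
        ref = str(uuid.uuid4())
        self._elements[ref] = element
        return ref

    def clone(self, value: Any, _path: Optional[Set[int]] = None) -> Any:
        """Convert a value to JSON-compatible data, failing on cycles."""
        path = set() if _path is None else _path
        if value is None or isinstance(value, (bool, int, float, str)):
            return value
        if isinstance(value, Element):
            return {ELEMENT_KEY: self._ref_for(value)}
        if isinstance(value, (list, dict)):
            if id(value) in path:
                raise ValueError("cyclic value cannot be serialised")
            path.add(id(value))
            try:
                if isinstance(value, list):
                    return [self.clone(item, path) for item in value]
                return {str(k): self.clone(v, path) for k, v in value.items()}
            finally:
                path.discard(id(value))
        raise TypeError(f"cannot serialise {type(value).__name__}")


serializer = Serializer()
button = Element("button")
out = serializer.clone({"count": 2, "items": [button, button]})
```

Everything comes back in a single response, which keeps round trips low, but a value that refers to itself cannot be represented, which is one of the cases where a handle would be needed instead.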
simonstewart: we need to avoid as many round trips as possible with cloud providers
jgraham: there are cases we need to think about, like WASM, in the future
… how do protocols return multiple values async?
drousso: in Web Inspector you have 1 opportunity to return a value
… it gets the completion value as in ECMAScript
… if you need multiple returns you need to do something special
<foolip> https://chromedevtools.github.io/devtools-protocol/tot/Runtime/#method-evaluate indeed has an `awaitPromise` parameter
drousso: but you could await promises
foolip: that's the same model that CDP uses
drousso: there are ways we can try to solve it but it's never been a use case that people wanted
<foolip> the sort of model that exists in https://streams.spec.whatwg.org/ might be worth looking at
drousso: you could have your code return a generator