WebDriver-BiDi – 01 December 2022

Meeting minutes

Roadmaps

<whimboo> https://docs.google.com/document/d/1fVe0U1DWUTYJkj2dj57_xVKzVU3FoDp_ASw4mZo89F4/edit#

whimboo: We're currently adding Network Event Logging, with the goal of being able to generate HAR files. The code is being shared with our devtools implementations. The spec PR is open, but hasn't landed yet. We are also adding additional navigation events, and event subscription to a given browsing context.

whimboo: We're also adding (de)serialization of complex js objects, specifically nodes, which is also on the agenda later.

whimboo: Our next goals are based on the earlier discussions on priorities. First is bootstrap scripts, since that seemed to be highest priority for everyone. We will also add shadowroot support, also for WebDriver Classic. We will add support for input actions, and PDF, and screenshots. IsStale on Nodes to help with Classic on BiDi.

whimboo: Also the BrowsingContext created event, so we can send out events for existing tabs, and the missing browsingContext.contextDestroyed event.

In Q2 we're hoping to get to Network Request interception and HTTP Auth support.

Other features we're considering include user prompts, cookies, and some others (see linked document)

sadym: The milestones seem like they're from different areas.

whimboo: We checked the roadmap that we talked about last time, considering the need to parallelise work.

whimboo: We can also rearrange some things

jgraham: The features are to enable full e2e test automation with screenshots and so on. Features that are close to classic are relatively easy for us to add to BiDi.

AlexRudenko: Support for shadowroot is interesting. Nothing standard exists for interacting inside the shadow DOM in user space. We should discuss this on WebDriver level.

foolip: Milestone 6 includes bootstrap scripts. That doesn't seem as useful for automated testing

whimboo: Client that supports most of the API for WebDriver BiDi would be helpful. Puppeteer for example is using bootstrap scripts. It's not in the spec yet, so it might take longer to fully implement it.

jgraham: Also useful to other clients. It allows installing functionality without worrying about conflicts with the page.

sadym: Selenium people mentioned using these for DOM mutation observation

https://github.com/GoogleChromeLabs/chromium-bidi/milestones?direction=asc&sort=title&state=open

sadym: We consider the minimal scenario which involves navigating to a page and running some scripts closed. We consider the bootstrap scenario very important for both Puppeteer and Selenium. We think it makes sense to also work on bindings, that's on the agenda for later.

sadym: We also consider console.log important; we're almost done with that. For each milestone we're creating an example script; we might want to share those.

sadym: Next is P2 items, haven't decided the order. Capturing screenshot and input emulation will probably be first.

sadym: Our milestones are based entirely on scenarios we want to support.

mathiasbynens: About the examples. These are standalone scripts that issue the raw BiDi commands to a specific websocket. We can test across multiple browsers. If we can agree on e2e scenarios we can upstream these examples, and have a shared roadmap at the spec level.
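
A minimal sketch of what such a standalone script might send, assuming the `session.subscribe`, `browsingContext.navigate` and `script.evaluate` commands. The context id `ctx-1` and the helper `make_command` are placeholders for illustration; a real script would obtain the context id from the browser and write each message to the WebSocket, which is omitted here.

```python
import itertools
import json

# Every command carries a unique id so responses can be matched to it.
_ids = itertools.count(1)


def make_command(method, params):
    """Serialize one BiDi command as the JSON text sent over the WebSocket."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Placeholder: a real id would come from e.g. browsingContext.getTree.
CONTEXT_ID = "ctx-1"

commands = [
    # Ask for console messages to be reported as events.
    make_command("session.subscribe", {"events": ["log.entryAdded"]}),
    # Navigate and wait for the load to complete.
    make_command(
        "browsingContext.navigate",
        {"context": CONTEXT_ID, "url": "https://example.com/", "wait": "complete"},
    ),
    # Run a script that logs to the console.
    make_command(
        "script.evaluate",
        {
            "expression": "console.log(document.title)",
            "target": {"context": CONTEXT_ID},
            "awaitPromise": False,
        },
    ),
]

for text in commands:
    print(text)
```

Because the script only builds raw messages, the same sequence could in principle be replayed against any browser that exposes a BiDi WebSocket.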

whimboo: Do we know what's happening in clients?

simons: In Selenium people are implementing the spec as it happens, particularly in Java and .Net. We're very keen to remove the dependency on CDP and move to BiDi.

simons: Patches from JimEvans to the spec; he's owning .Net and raising issues as he implements

<AlexRudenko> https://puppeteer.github.io/ispuppeteerwebdriverbidiready/

AlexRudenko: On the Puppeteer side we have some basic testcases that work with BiDi in Firefox e.g. starting the browser and doing some basic script execution (?). Next we're going to add Chromium support. We have a website to track test coverage support. We want to work out how to convert the CDDL types to TypeScript, and then work on fixing the tests. We have things in place, and then it should be incremental. We'll bring up missing features.

AlexRudenko: Few tests passing right now from whole of Puppeteer testsuite

sadym: Are there specific parts of CDP that Selenium wants to prioritise removing?

simons: Would like to go to a website and search. In this scenario CDP we use for capturing logs and for network interception. Those are the biggest use cases for us.

sadym: Bootstrap scripts?

simons: Would be nice for Selenium internally to use for the atoms to minimise the amount of data we're sending, but not a big user request.

simons: It's a QoL improvement, but not something we're using in CDP

Roadmap / Milestone definitions

Github: https://github.com/w3c/webdriver-bidi/pull/265

whimboo: Want to mention that this was discussed before. mathiasbynens did some work here to define user-centric milestones. The PR is stuck. Would be good to make some progress here. We could add the milestones to GitHub. Would make it easier for implementors to understand.

mathiasbynens: The point of the PR was to generate discussions, not to enforce a list of things. In the last meeting, we agreed on a list of features. Some of those are in the PR in an end user scenario. Would like all the features to be captured in that way.

mathiasbynens: The main idea is to bring the features into concrete scenarios that we can communicate to developers. <details of what's in the PR>

mathiasbynens: Recording navigation performance is a compelling example of bootstrap scripts.

mathiasbynens: Does the format of the document make sense? Does the priority list work? Start with the format: does it work for outreach?

simons: This looks good to me

hablich: This helps us communicate with tooling vendors and others we want to talk to about WebDriver BiDi

hablich: It will also help us synchronise and work on the most effective things first

patrickangle_: Do we have a sense of which of these aren't possible today with classic? Request interception stands out. We want to communicate why BiDi over classic.

mathiasbynens: Good point. One specific thing that's not in the list is logging: that's a motivating feature for BiDi over classic so we should add that scenario.

I like this format. We could make it more granular. For each subitem we could split into different browser implementations and show the status of each thing (tests, implementations). A question is where we want to put this. Do we want to use Github milestones?

whimboo: I'm happy to help with setting all this up

sadym: We have milestones which are similar, but this is somewhat higher level. We could link from the readme to milestones, but the readme is about showing what bidi's capable of, and how (e.g. via example scripts). This is more for external users than for us.

whimboo: We could reference wdspec tests to show how to implement this.

sadym: We had a discussion at TPAC about where to host example scripts.

hablich: I'm also happy to help set this up. Using GH tooling will be useful. We want this document to be consumable by users. There will be a lot of ways to evolve that. We should try to agree to land this and then make changes, rather than trying to make it perfect in the first instance. Land & iterate is better.

jgraham: We should land, but I think some of the existing comments should also be considered e.g. bootstrap scripts aren't a blocker for any use.

hablich: Yes, we should consider the feedback, and then make further changes via smaller PRs

whimboo: Regarding the explainer, you have everything on one page. It means you can search for things. We could link each section to the milestone page. Then users can dig into the details.

mathiasbynens: Can someone summarise discussion on where to host example scripts from TPAC?

mathiasbynens: Once we have a roadmap we can provide more resources to go with it.

<sadym> https://www.w3.org/wiki/WebDriver/2022-10-BiDi

https://www.w3.org/2022/10/12-webdriver-minutes.html#t05

https://github.com/w3c/webdriver-bidi/issues/309

mathiasbynens: If we agree on end to end usecases, we could consider adding these as a different kind of wdspec test instead of building new infrastructure

jgraham: It could work, but writing async python tests might be confusing for frontend webdevelopers

mathiasbynens: We're using something like this in the chromium implementation repo, where we just send raw commands rather than using the libraries

simons: For developers there are going to be a lot of projects with good examples of the APIs e.g. Puppeteer, Selenium. I don't know if we need wpt to contain things that developers might want to look at.

sadym: In our examples we don't even assert the result, we're just showing what to run to get the result. We don't have any CI.

jgraham: Examples that can't be run seem like a mistake; it can lead to people not being able to use the example code. We could consider using MDN for documentation of the protocol and we could in theory write examples there that would connect to a local browser instance (but you'd have to teach people to opt in to allowing the connection from that origin, which could be hard)

hablich: We have two kinds of users. Tooling vendors might find example scripts useful. MDN might be more useful for web developers, but they're more likely to be using the tools. Maybe the conversation about example scripts for web developers is too early.

jgraham: Should work out which features on the roadmap we are in a good place for spec wise and which ones still require work.

<foolip> Is it https://github.com/w3c/webdriver/pull/1653?

jgraham: ... there is another PR open in webdriver classic about obscure handle serialization. The way we serialize elements in Classic is different than BiDi

foolip: Should we land https://github.com/w3c/webdriver/pull/1653?

jgraham: Yes, should have a serious conversation about this.

jgraham: There is a clear way forward here, we need to agree on the details though

jgraham: Let's add this review to the end of the agenda. Next one is capturing screenshots. Nothing controversial there IMHO. Nothing spec-blocked there

jgraham: PDF, there is nothing in the spec. Should be a short conversation though. It should be like in classic. Let's reuse the classic definition exactly.

jgraham: Request interception is more interesting. There is a lot of spec work here. Logging (?) is a prerequisite. OMG this is going to be a lot of work.

jgraham: On measuring performance: We have cookies. Nothing in the spec though. We discussed cookies a bit at TPAC. Similar to classic. Should cover cookies in different domains and partitioned cookies.

jgraham: Cache, nothing there yet. Same for bootstrap scripts, for that we have an agenda item for today.

mathias: I tried to link in the roadmap doc to the issues. We don't have proper issues for the unlinked ones.

jgraham: do you also have a list of all the features we discussed last time?

mathias: Yes. *presents list*

jgraham: element interactability is missing. Video recording we have not talked about at all.

hablich: Natural next step would be to create issues for all the things that we don't already have an issue for. Logging is missing from the roadmap doc, we should create an entry for that. It should perhaps be first since it's the biggest feature request.

whimboo: Could make logging scenario A and split script execution into the next topic

jgraham: I can file issues on the things that don't already have them

mathiasbynens: For the PR I'd like to resolve the concerns first and then land. If people want to create patches to the PR that would help get this across the finish line. Making concrete suggestions would be a useful contribution.

sadym: Another option to proceed is to remove the specific milestones and then create a PR to add it to the roadmap. This would reduce the amount of noise in the conversations.

mathiasbynens: Which scenarios are controversial?

jgraham: Not much controversial. Could split the first scenario to split out the things required to allow isolation (sandboxes, bootstrap scripts) and maybe add a scenario for network request logging / HAR generation.

mathiasbynens: I'd like to take another pass at this and maybe make it smaller so we can get it landed.

hablich: Are there open comments we should discuss right now?

mathiasbynens: I think it's fine. I'll try to do that today.

<Sam Sneddon [:gsnedders]> who is in Berlin? James & Henrik, who is opposite them and there's someone closer to the camera?

<whimboo> Sam Sneddon [:gsnedders]: There is Sasha (Mozilla) and Maksim + Michael (Google)

Shared IDs

Github: https://github.com/w3c/webdriver-bidi/pull/180

sadym: I started a PR describing the shared reference [de]serialization. Several open questions. One is naming, but that's minor. Bigger topic is approach. Our approach is to keep the map on the Window. Another approach is to use the reference to the Element ID from WebDriver classic. That would mean specifying something different for elements, shadowroots, etc. How do others think this should work?

jgraham: We should provide them with an upgrade path from classic/cdp to bidi incrementally. Migration should be straightforward for selenium users. This is the user-facing requirement.

jgraham: Then the functional requirement. What classic does not fulfill is the following example: You have a same-origin frame and on the top level page you return window-frame-0.body. You should be able to take that reference and run it anywhere in the doc

jgraham: There are different options on how to make this happen. The webdriver spec is a bit broken on this topic. for Bidi we should do better.

jgraham: We put the element cache on the window. When you get an element you not only look at that window but you need to iterate over every window you have access to, not only the window that is in the same realm as the script is running in.

jgraham: The current browsing context group should have these references. We can store the cache per process, rather than having a per-window cache.

sadym: If users have the requirement to use the same element ids between classic and bidi this holds true. We are considering having a bridge method too, that can migrate an element id between classic and bidi

sadym: Having access to the parent window: Change would need to be done in WD classic. This needs to be done in the implementation and can be tricky on our side

whimboo: We have a lot of classic tests in order to make sure we don't break classic stuff on the way.

sadym: How can we make sure that if this should work automatically, that it works?

jgraham: Node caches have these smarts built in.

jgraham: The idea is to update classic to have it understand and handle the concept of a browsing context node cache

sadym: Sounds reasonable, we will implement it like this too.

whimboo: We have two caches right now, one for elements, one for shadow roots. Can we combine them? It might happen with window ids where this is relevant

sadym: The current implementation says that it's just a platform object. See https://pr-preview.s3.amazonaws.com/sadym-chromium/webdriver-bidi/pull/180.html

<sadym> https://pr-preview.s3.amazonaws.com/sadym-chromium/webdriver-bidi/pull/180.html#platform-objects-map

sadym: This map will go to the classic specification. It will go to the top level window.

jgraham: IMHO it is theoretically unobservable whether there are one or two caches.

jgraham: There should not be conflicts. So it should be fine.

whimboo: In the link each window has a shared ID map ... is this a separate cache that you need for BiDi?

jgraham: I think we should hoist the cache out of the PR and put it into the spec. Also, this is claiming this is a strong map; for DOM nodes we always wanted to have a weak map because we don't want WD to keep nodes alive

sadym: Yeah, need to change, should be weak
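
A minimal sketch of the weak-map idea, assuming Python's `weakref` module as a stand-in for the browser's own weak references. The `Node` and `NodeCache` classes and the `node-N` id format are hypothetical and only illustrate that the cache must not keep nodes alive.

```python
import gc
import itertools
import weakref


class Node:
    """Stand-in for a DOM node; real nodes live inside the browser."""

    def __init__(self, name):
        self.name = name


class NodeCache:
    """Maps opaque string ids to nodes without keeping the nodes alive."""

    def __init__(self):
        self._by_id = weakref.WeakValueDictionary()  # id -> node (weak value)
        self._ids = weakref.WeakKeyDictionary()      # node (weak key) -> id
        self._counter = itertools.count(1)

    def id_for(self, node):
        # Reuse an existing id so the same node always gets the same id.
        if node not in self._ids:
            node_id = f"node-{next(self._counter)}"
            self._ids[node] = node_id
            self._by_id[node_id] = node
        return self._ids[node]

    def resolve(self, node_id):
        # Returns None for unknown ids and for nodes that were collected.
        return self._by_id.get(node_id)


cache = NodeCache()
body = Node("body")
body_id = cache.id_for(body)
print(cache.resolve(body_id) is body)

# Dropping the last strong reference lets the node be collected, after
# which the id no longer resolves.
del body
gc.collect()
print(cache.resolve(body_id))
```

With a strong map the second lookup would still return the node, which is the behaviour the discussion wants to avoid.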

jgraham: For us, it would be more challenging to make the IDs behave different between classic and BiDi. You always would need to make sure that all the methods work with both IDs, which is not great.

sadym: In WD we store a magical property in the global document to hold the reference to the node. We don't want the same approach in BiDi. We are thinking about using CDP node IDs instead.

sadym: It would technically be easier to split them

jgraham: If it is about ease of implementation, we should have that conversation. I don't think people care about the format, but it should be a unique string. Add another item for later about this topic?

simon: We really don't want people doing conversions on their own

simon: The IDs are treated as IDs without any local meaning. Technically a breaking change in the spec, realistically it will not have any impact.

sadym: So you are saying it is fine to switch from UUID to string?

simon: More or less, yes.

sadym: We don't want to keep the UUID map in our backend. we only increase the counter there to generate a new ID

simon: We had a similar mechanism in the past. We moved on because it was not stable enough.

jgraham: Do we have more to discuss about element IDs?

<hablich_> sadym: no, thanks, I am fine.

sadym: A goal of BiDi is interop with WebDriver classic. In the spec we have it for browsing context. Classic window handle must be the same as browsing context id. We need to add some wpt tests for this.

sadym: For Window it should be quite easy to test, but we'll have to also add tests for Element and ShadowRoot.

whimboo: For browsing context / window handle we can write the tests today. We are already implementing the spec here, and can write the tests

sadym: I already have a test here

whimboo: For sharedId: we don't currently support shadow roots. For Elements we can write a BiDi test that uses both the HTTP protocol and BiDi. We can use child nodes to ensure that it's not only explicitly referenced nodes that work. I've written tests.

sadym: Do we have to split sharedId into ElementId / ShadowRootId?

sadym: This would make it easier to reuse in classic.

jgraham: I don't think we need to divide stuff up more. The type annotations should be enough to distinguish elements and shadow roots.

sadym: We have a distinction in terminology between classic, bidi and CDP, this might make things less confusing.

<whimboo> https://github.com/w3c/webdriver-bidi/issues/235

whimboo: Right now we serialize nodes independently of which node type they have. We could serialize Element differently

sadym: From an implementation perspective it's very expensive for us to change the serialization. That happens internally in the browser, and we'd prefer to have as few iterations as possible. We'd prefer to have a single serialization for Node.

jgraham: I'm also happy to keep it as is. There are probably some cases where it will be worse, but it's not obviously a terrible tradeoff.

sadym: We are still working on checking if the current spec is implementable for us, but we are hopeful.

Refactoring of RemoteValue tests.

Github: https://github.com/web-platform-tests/wpt/issues/37160

[previous topic]

VladimirNechaev: Do the element ids need to actually be the same, not just interchangeable?

simons: Yes, local ends will depend on the ids being the same; people will cache the ids in order to avoid network traffic where possible

[new topic]

whimboo: [de]serialization is quite heavy. In the tests it doesn't make sense to have exactly the same tests for all the different commands that can return js objects. We are currently returning objects in callFunction/evaluate/log.entryAdded. A question is how to make the testing more efficient. We could have a top-level directory for all the serialization testing, and then for each command we would only have basic testing

whimboo: sadym suggested having parameterised tests so we run each serialization test for each command that can generate them. This works for now but might not work for all the commands in the future. session.[un]subscribe has some similar concerns; we could use this as a template for how to handle that

sadym: If we want to reduce the runtime costs, we could move all the serialization into a single test which would reduce the overhead of creating new tabs. Serialization shouldn't be the bottleneck. Checking serialization in all commands is important to us; we implement the serialization slightly differently per command, especially for complex objects. So I think we want to check all the combinations.

For maintenance costs, it doesn't seem that hard to add another line in parameterisation. Checking all the cases in one go might make debugging harder, but it would be acceptable I think.

jdescottes: I agree with what sadym said. Since our implementation handles serialization the same way for every command, we could assume all implementations work the same way, but we shouldn't. It also avoids the discussion of what's a basic test vs an advanced case. Changing the code organisation sounds fine, but we should keep testing everything.

patrickangle: I think this sounds reasonable. We want tests to just test a single command as far as possible.

whimboo: It sounds like we should have tests that are parameterised across commands so we can test all the cases with one test. For events more setup is needed to ensure we are in the right state to generate the event (e.g. if it requires a page load)
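
A minimal sketch of parameterising serialization cases across commands, using only the standard library; in a pytest-based suite the same cross product could be expressed with `pytest.mark.parametrize`. The case list, the `ctx-1` context id and the helpers `build_request` and `all_cases` are illustrative assumptions, not the actual wdspec tests.

```python
import itertools

# Commands that return serialized values directly (events such as
# log.entryAdded need extra setup and are left out of this sketch).
COMMANDS = ["script.callFunction", "script.evaluate"]

# (JS expression, expected serialized value) pairs, shared by all commands.
SERIALIZATION_CASES = [
    ("undefined", {"type": "undefined"}),
    ("null", {"type": "null"}),
    ("'foo'", {"type": "string", "value": "foo"}),
    ("42", {"type": "number", "value": 42}),
]


def build_request(command, expression, context="ctx-1"):
    """Build the command that should produce the value of `expression`."""
    params = {"target": {"context": context}, "awaitPromise": False}
    if command == "script.evaluate":
        params["expression"] = expression
    else:
        # callFunction takes a function, so wrap the expression in one.
        params["functionDeclaration"] = f"() => {expression}"
    return {"method": command, "params": params}


def all_cases():
    """Yield every (command, expression, expected) combination."""
    for command, (expression, expected) in itertools.product(
        COMMANDS, SERIALIZATION_CASES
    ):
        yield command, expression, expected


for command, expression, expected in all_cases():
    print(build_request(command, expression)["method"], expression, expected)
```

Adding a new command that returns values then only means adding one entry to `COMMANDS`, which matches the point about low maintenance cost.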

<hablich_> Lunch break until 13:35 Berlin local time

Demo

https://github.com/firefox-devtools/bidi-webconsole-prototype

Bootstrap scripts

Github: https://github.com/w3c/webdriver-bidi/issues/65

sadym: We want to start implementing bindings to send custom events, because this is used by puppeteer. This is straightforward if it was an argument of callFunction. Bootstrap scripts could be a function that accepts an argument that would be called to emit custom events.

jgraham: We discussed this at TPAC. We considered both the possibility of passing in a special kind of remote value as a function argument and putting a WebDriver object with an emitEvent method in the scope of WebDriver scripts. The first option makes routing more flexible, but the general consensus at that meeting seemed to be that the second is simpler.

AlexRudenko: Do you intend to expose emitEvent only for bootstrap scripts or for any kind of script evaluation? It seems useful in both cases.

jgraham: For any WebDriver-BiDi script including those run via callFunction and evaluate

sadym: A concern with this is that implicitly adding the script into every scope is some overhead for us. Do we want to add it always or just when the user wants the capability?

jgraham: From a user point of view this would just look like a normal WebIDL interface if it's on the scope, whereas a special kind of value you get as a function argument is more unusual.

sadym: I think the opposite; something that's just in scope for WebDriver seems confusing compared to something that's explicitly passed in.

sadym: From the implementation perspective, if we provide this in the scope we'd have to wrap the script to provide that scope, which makes it harder to provide the stack trace.

jgraham: For our implementation it doesn't matter either way. I don't know what's easier to spec

foolip: For event handlers we had something like this, but it changed. We couldn't do this with WebIDL, it would have to be done in terms of ECMAScript.

foolip: I prefer the arguments version, since it means we don't have to pick a name.

jgraham: In the previous discussion, Apple were in favour of the implicit global, is that still the case?

patrickangle: The object that's just implicitly in scope makes it easier to work with cases where we don't have an implied function wrapper e.g. script.evaluate. But if we don't care about that, just being able to use an argument does have advantages in that we don't have to pick a name, etc.

sadym: For now we have two ways to evaluate script. Bootstrap will be a third method. We don't expect to introduce new script execution mechanisms. So we might be able to limit this to just working with bootstrap scripts and callFunction.

jgraham: So how would bootstrap scripts work in this model?

sadym: It would take a functionDeclaration like in callFunction.

sadym: We could provide information about the page as arguments to bootstrap script

jgraham: I think you'd already have access to window.location &c. at this point so it's not clear what the extra information would be.

jgraham: Any input from client authors?

AlexRudenko: In puppeteer bootstrap scripts are more like evaluate(). From a Puppeteer perspective there's not a big difference, as long as you can send the message back and choose when the binding is available e.g. should be able to use setTimeout and still use the callback. For puppeteer we call the same bindings from multiple parts of the code, but that might be quite design specific.

sadym: I'm not sure if Puppeteer is using CDP bindings?

AlexRudenko: Yes. We don't inject bindings into the bootstrap script, but the bootstrap script removes the bindings from the actual context. But BiDi could be an improvement here.

jgraham: Are we assuming that there's a different channel per RemoteValue in this model or something global?

sadym: One per instance, not global

simons: No strong opinion from Selenium side

JimEvans: Agreed.

AlexRudenko: Bootstrap scripts are usually upfront for a specific browsing context. How are we going to provide arguments that survive across multiple contexts?

AlexRudenko: If you install bindings in bootstrap script and then navigate what happens? In CDP the bootstrap script is run in the new page, and the binding also survives.

AlexRudenko: RemoteValue would be bound to a context?

sadym: The binding wouldn't stick to a realm. For each browsing context you call with the same bindings.

jgraham: I think the event channels would be owned by the session. Every evaluation would get the channel from the session, so you'd always get the same one.

jgraham: this could also work with other argument types, although if you passed e.g. a node that was gc'd then you'd get an error trying to run the script.

jdescottes: What do you expect to do with the bindings that survive? Are you keeping state in it? How much do you expect to be persisted?

AlexRudenko: We don't expect anything to be persisted. I had a different model in mind, but that doesn't seem to be what's proposed. We map each binding name to a specific function on the client side and use that to handle state, nothing is shared in the context.

jgraham: Seems like we want to adopt the argument-based approach, which is a change from TPAC. The next step is to make a PR so that we can evaluate the technical details.
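
A minimal sketch of how a client might handle the argument-based approach, assuming session-owned channels that are passed as arguments to `script.callFunction`. Nothing here is specified yet: the `{"type": "channel", ...}` argument shape, the `{"channel": ..., "data": ...}` event shape and the `ChannelRegistry` class are all hypothetical.

```python
import itertools


class ChannelRegistry:
    """Client-side bookkeeping mapping channel ids to handler functions."""

    def __init__(self):
        self._handlers = {}
        self._ids = itertools.count(1)

    def create(self, handler):
        # State lives in the client-side handler, not in the page.
        channel_id = f"channel-{next(self._ids)}"
        self._handlers[channel_id] = handler
        return channel_id

    def as_argument(self, channel_id):
        # Hypothetical wire shape for passing a channel as a script argument.
        return {"type": "channel", "value": channel_id}

    def handle_event(self, event):
        # Hypothetical event shape: {"channel": <id>, "data": <payload>}.
        self._handlers[event["channel"]](event["data"])


received = []
registry = ChannelRegistry()
channel_id = registry.create(received.append)

# The script receives the channel as its first argument and may call it
# later, e.g. from a setTimeout callback.
params = {
    "functionDeclaration": "(emit) => setTimeout(() => emit('ready'), 0)",
    "arguments": [registry.as_argument(channel_id)],
    "target": {"context": "ctx-1"},  # placeholder context id
    "awaitPromise": False,
}
print(params["arguments"])

# Simulate the message the browser would send back when emit() is called.
registry.handle_event({"channel": channel_id, "data": "ready"})
print(received)
```

Because the same channel id can be passed to every evaluation, the binding survives navigation from the client's point of view without any state being kept in the page.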

Make session.unsubscribe more flexible

Github: https://github.com/w3c/webdriver-bidi/issues/326

whimboo: As mentioned during the roadmap presentation, the Firefox implementation can only globally [un]subscribe from events. When we subscribe to a specific context, and then unsubscribe everywhere we throw an error: it puts the burden on clients. We should make it possible to unsubscribe globally even if you only subscribed in certain contexts. Also for subscribe all and then unsubscribe from just some.

Sasha: Currently the spec only allows you to exactly undo what you originally did: if you subscribed globally you have to unsubscribe globally. If you subscribe to specific contexts then you have to unsubscribe from specific contexts.

jgraham: For unsubscribe after subscribing globally there's a clear use case, currently for cleanup you need to know the intersection of existing browsing contexts and the ones you subscribed to. That's also easy to spec. For unsubscribe from specific contexts after subscribing globally I haven't yet heard a compelling use case, and it's quite hard to spec.

jdescottes: Trying to unsubscribe from a specific context after subscribing globally: at the moment we throw an error in that case. That might force consumers to always handle errors when trying to unsubscribe. It might be better to not throw an error.

hablich: When I subscribe to everything and then to one context, what happens?

Sasha: Currently that's a noop

hablich: If I subscribe to everything and want to downgrade, what do I need to do? Unsubscribe from everything and then subscribe to some?

Sasha: Yes

hablich: Is it atomic?

Sasha: No.

hablich: that seems bad.

hablich: I was hoping that you could reduce subscriptions without dropping events, but I don't have a clear use case at the moment.

<Vladimir Nechaev> I can answer Alex's question

AlexRudenko: A related question is: when the client subscribes globally to some event, and then the client subscribes again to some events, you get one event. But clients might want to handle events differently in different parts of the code. Is there a way to get events multiple times? Every subscribe call could return a subscription id that would allow subscribing multiple times.

Vladimir Nechaev: You can use different connections to provide isolation.

AlexRudenko: That adds overhead.

Vladimir Nechaev: You can also have channels. If this becomes part of the standard it could also solve the problem. Then you get messages per channel.

<Vladimir Nechaev> nechaev: channels or multiple connections might be used for multiple subscriptions

sadym: We didn't figure out scenarios where we need channels. My proposal wasn't about subscription. We might want to subscribe to just new contexts.

sadym: That could reduce traffic overhead.

jgraham: Do we agree that if you subscribe in certain contexts and then unsubscribe globally, that should remove all the per-context subscriptions?

(general agreement)
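
A minimal sketch of the agreed behaviour, as a toy model of the subscription state rather than the spec algorithm. The `Subscriptions` class is hypothetical. Unsubscribing from a specific context after a global subscription is left as a no-op here, since that case was not settled.

```python
class Subscriptions:
    """Toy model of one session's event subscriptions."""

    def __init__(self):
        self.global_events = set()
        self.by_context = {}  # context id -> set of event names

    def subscribe(self, events, contexts=None):
        if contexts is None:
            self.global_events |= set(events)
        else:
            for context in contexts:
                self.by_context.setdefault(context, set()).update(events)

    def unsubscribe(self, events, contexts=None):
        events = set(events)
        # With no contexts given, clear every per-context subscription for
        # these events as well as the global one.
        targets = list(self.by_context) if contexts is None else contexts
        if contexts is None:
            self.global_events -= events
        for context in targets:
            remaining = self.by_context.get(context, set()) - events
            if remaining:
                self.by_context[context] = remaining
            else:
                self.by_context.pop(context, None)

    def is_enabled(self, event, context):
        return (
            event in self.global_events
            or event in self.by_context.get(context, set())
        )


subs = Subscriptions()
subs.subscribe(["log.entryAdded"], contexts=["ctx-1", "ctx-2"])
subs.unsubscribe(["log.entryAdded"])  # global unsubscribe
print(subs.is_enabled("log.entryAdded", "ctx-1"))
```

With this behaviour a client doing cleanup does not need to track which of its subscribed contexts still exist.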

whimboo: If you subscribe to a specific event and want to unsubscribe to all events in a module, should that work?

Sasha: The spec will currently get all the events and then error if you weren't already subscribed to all of them.

jgraham: Should we also change things so that subscribing to some events in a module, and then unsubscribing from all events in that module unsubscribes to the events that you already subscribed to, even if it wasn't all of them?

Sasha: I think the previous change will automatically change this scenario too

jgraham: That's progress. For the issue of clients wanting to have multiple copies of events, please file an issue. My first thought is that it might be better for those clients to implement it rather than making it part of the protocol.
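
A minimal sketch of what client-side handling could look like, assuming events arrive as messages with `method` and `params` fields. The `EventFanOut` class and its token scheme are hypothetical; the protocol subscription is made once and the client fans each event out to its own listeners.

```python
from collections import defaultdict


class EventFanOut:
    """Delivers each received event to every registered listener."""

    def __init__(self):
        self._listeners = defaultdict(dict)  # event name -> {token: callback}
        self._next_token = 0

    def add_listener(self, event, callback):
        # The token plays the role of a client-local subscription id.
        self._next_token += 1
        self._listeners[event][self._next_token] = callback
        return self._next_token

    def remove_listener(self, event, token):
        self._listeners[event].pop(token, None)
        # True when no listeners remain, i.e. the client could now send
        # session.unsubscribe for this event.
        return not self._listeners[event]

    def dispatch(self, message):
        # Copy the callbacks so listeners may remove themselves while running.
        for callback in list(self._listeners.get(message["method"], {}).values()):
            callback(message["params"])


fan_out = EventFanOut()
logger_seen, reporter_seen = [], []
fan_out.add_listener("log.entryAdded", logger_seen.append)
fan_out.add_listener("log.entryAdded", reporter_seen.append)

fan_out.dispatch({"method": "log.entryAdded", "params": {"text": "hello"}})
print(logger_seen, reporter_seen)
```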

jdescottes: What about subscribing globally and trying to unsubscribe from a specific context? Should we make this a noop?

sadym: Another question is what happens if we subscribe to a browsing context and it's then deleted: unsubscribing will cause an error.

whimboo: Unsubscribe all might solve this.

jgraham: Could we solve this class of problems by having the command return something about the current subscription status or what the command achieved? Then the client would be able to tell whether what it did had any effect or not.

sadym: For the case I described, having a specific error (no such frame) seems sufficient.

jgraham: For jdescottes' case it seems like we'd want to tell the client what happened.

jdescottes: The fact that it's currently asymmetric between what happens in subscribe and unsubscribe is what I'd like to change.

<* jgraham Topic> Trains

<hablich> choochoo

Clarify extensibility rules

Github: https://github.com/w3c/webdriver-bidi/issues/274

sadym: We didn't reach consensus on this; we should clarify the open issues.

How extensible should BiDi be? Should it be possible to add random fields and still say it matches the spec? For example when you execute some script you get an object with realm/context/sandbox. This makes the result compatible with two different ways of specifying a target for script execution. If this is accepted it could be confusing for users, but it's useful to be able to return the values we got from script execution directly.

jgraham: I see three options. 1 is to do what we do currently, which could be an interop hazard, particularly if people are using deserialization libraries that make different choices around ambiguities. 2 is to say that for these types we aren't extensible and need to match the fields of one variant exactly. 3 is to require each variant to have a type field so that we always match on a specific type name. We are already using approach 1 in tests, but it could be problematic in the future.
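
To make the three options concrete, here is a small Python sketch of how a receiver might pick a variant for a script target that is either `{realm}` or `{context, sandbox?}`. The function names are hypothetical, and the `type` field in option 3 is an assumption for illustration; it is not something the spec defines for this type.

```python
def parse_lenient(target):
    """Option 1: extra fields are allowed, so the first matching variant wins."""
    if "realm" in target:
        return ("realm", target["realm"])
    if "context" in target:
        return ("context", target["context"])
    raise ValueError("not a script target")


def parse_exact(target):
    """Option 2: the fields must match exactly one variant, with nothing extra."""
    keys = set(target)
    if keys == {"realm"}:
        return ("realm", target["realm"])
    if "context" in keys and keys <= {"context", "sandbox"}:
        return ("context", target["context"])
    raise ValueError(f"fields {sorted(keys)} match no variant exactly")


def parse_tagged(target):
    """Option 3: a (hypothetical) "type" field names the variant explicitly."""
    kind = target.get("type")
    if kind in ("realm", "context") and kind in target:
        return (kind, target[kind])
    raise ValueError(f"unknown or incomplete variant: {kind!r}")
```

Given an object carrying both `realm` and `context`, `parse_lenient` silently picks one (another library could just as well pick the other), `parse_exact` rejects it, and `parse_tagged` follows the tag regardless of which extra fields are present.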

sadym: I have two concerns about restrictions. I think it's very useful for people to be able to send back what they receive. I already mentioned "channels" in our implementation that allow splitting; we want to do that.

jgraham: Yes, top level commands/events need to be extensible. One problem with allowing extensions of data without a type tag is that names have to be globally unique: if people are relying on the fact that the result of some command can be sent as the payload to some other command, we can't add conflicting fields at any point in the future. That might not be a big problem, but it is a worry. Type tagging is much less convenient when writing in an untyped language, but it's at least very safe.

whimboo: If we have a type here where the client can specify context or realm, how should the browser handle this? We might want to return one or the other?

jgraham: That doesn't work, it would be on the client to reform the data into the correct type. We could theoretically change the responses so that they always return types packaged in a way that we could directly extract a field and return it to the server, but that would be a breaking change.

[general looks of concern]

JimEvans: As someone working on client bindings, and in strongly typed languages, having the spec be as unambiguous as possible in terms of what's returned across the wire is a big plus. As is called out on the issue, if the response happens to contain valid properties for more than one of the possible return types, the client has to guess between them and make a judgement call, and removing that ambiguity would make life easier for us.

sadym: Is the issue mostly about browsers?

jgraham: Yes, but a browser could in theory use extensibility to return something to a client that's ambiguous and then the client has to guess because there's no spec to tell it how to resolve the ambiguity.

JimEvans: The realm-vs-context case is interesting, but more interesting is a script.evaluate response that contains both a result and an exceptionDetails property. How do you actually know what's the intended result? Browser vendors might say "we're never going to do that", but for clients it would be better if the spec made it impossible to do.

<sadym> Agree on the response (server-client message) to be not extendable, including the root level

jgraham: exceptionDetails vs result is a top level response. We had a case where we were sending an evaluate response with no result field and so we assumed it was an exception but actually it was a missing field. That obviously shouldn't happen, but having an explicit field to tell us what the type was would have been helpful.

jgraham: Maybe we could look for all the cases where a client has to introspect fields to determine types and change those, even if we don't do the same for the case where browsers have to introspect.

jgraham: I think that's just ScriptEvaluateResult at the moment. Fixing that doesn't solve the problem for browsers, but it would make things unambiguous for clients.
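
The missing-field case jgraham describes can be sketched in Python. Both function names and the `type` values `"success"` and `"exception"` are assumptions for illustration, not what the spec said at the time of this discussion. The first function classifies by introspecting fields; the second relies on an explicit tag and checks that the matching payload field is actually present.

```python
def classify_by_fields(response):
    """Field introspection: anything without "result" is assumed to be an exception."""
    return "success" if "result" in response else "exception"


def classify_by_tag(response):
    """Explicit (hypothetical) type tag: a missing payload field is reported as malformed."""
    kind = response.get("type")
    required = {"success": "result", "exception": "exceptionDetails"}.get(kind)
    if required is None or required not in response:
        raise ValueError(f"malformed evaluate response of type {kind!r}")
    return kind
```

With field introspection, a buggy response that simply omits `result` is misread as an exception. With the tag, the same response fails loudly instead of being misclassified.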

<JimEvans> also script.Target type?

<JimEvans> Ah. Right. Sorry. I withdraw my comment.

Support for shadow roots/DOM * in WebDriver spec?

<whimboo> RRSAgent: make minutes

AlexRudenko: I want to get more context about support for shadow roots.

whimboo: In the classic spec we have get shadow root, find element from shadow root, etc. In Firefox we only have get element from shadow root. For BiDi Firefox doesn't implement anything yet. But the spec parts are there in node serialization

AlexRudenko: We did some exploration. We see that users are implementing a selectors syntax that allows querying across selector boundaries. I was wondering if that could potentially be specified in WebDriver BiDi. Currently clients create their own implementation. Some clients are using arrays of selectors. Some are redefining descendant selectors.

<sadym> Classic CSS selector: https://www.w3.org/TR/webdriver/#css-selectors

jgraham: I think that the use cases make sense, but it's hard to see how the spec would work. We'd at least need to talk to the CSSWG to ensure that we don't define some syntax that clashes with their plans. I think we'd need a proposal with use cases and so on.

AlexRudenko: OK, we'll open an issue and start to investigate what's possible

AlexRudenko: There's also a Selector Extensions proposal in CSS to provide custom combinators and pseudoclasses, so maybe that would allow implementing this in some more standard way, but I'm not sure what the status of that proposal is.

<AlexRudenko> http://tabatkins.github.io/specs/css-aliases/#issues-index

<AlexRudenko> https://www.w3.org/TR/2014/WD-css-scoping-1-20140403/ (deep combinator, removed now)

AlexRudenko: There might be a newer revision of this spec

(the css-aliases one)

jdescottes: Could we implement this as a different find element strategy

AlexRudenko: It's currently possible to implement what we need via js evaluation. If we come up with a syntax we need to make sure it's compatible with CSS.

jdescottes: I was thinking of closed shadow roots; you can't pierce that with js, but you can with WebDriver. In Fx devtools we have an array of selectors and we run one in each level of shadow tree

AlexRudenko: Array of selectors is also something we're considering for Puppeteer, but there might be a better syntax that's easier to write by hand. Closed shadow roots are a problem since you can't traverse via js. Open shadow roots are more common, so that's the case we're targeting first.
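
The "array of selectors" approach mentioned above can be sketched in Python against a toy tree. Everything here is an assumption for illustration: `Node` is a stand-in for DOM nodes, and selectors are simplified to plain tag names rather than real CSS. Each selector is run at one level, then the search continues inside the shadow roots of the matches.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a DOM node; shadow_root is the root of its shadow tree, if any."""
    tag: str
    children: List["Node"] = field(default_factory=list)
    shadow_root: Optional["Node"] = None


def query_all(root, tag):
    """Stand-in for querySelectorAll: matching descendants, not crossing shadow boundaries."""
    found = []
    stack = list(reversed(root.children))
    while stack:
        node = stack.pop()
        if node.tag == tag:
            found.append(node)
        stack.extend(reversed(node.children))
    return found


def find_through_shadow(root, selectors):
    """Run one selector per shadow-tree level, descending into the matches' shadow roots."""
    scopes = [root]
    for depth, selector in enumerate(selectors):
        matches = [m for scope in scopes for m in query_all(scope, selector)]
        if depth == len(selectors) - 1:
            return matches
        scopes = [m.shadow_root for m in matches if m.shadow_root is not None]
    return []
```

For a document where `my-app` hosts a shadow tree containing `my-button`, which in turn hosts a shadow tree containing `span`, the selector list `["my-app", "my-button", "span"]` reaches the inner `span`, while a single `query_all(document, "span")` does not.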

Add the Element Origin concept

Github: https://github.com/w3c/webdriver/pull/1653

jgraham: The idea is to reuse Actions in BiDi. But we can specify an origin relative to an element. In BiDi we don't want to reuse the classic form of the WebElement Reference type. So the proposal is that we allow a BiDi RemoteValue type instead. BiDi would _only_ accept that (the CDDL will reject the other form), classic would accept both (to avoid the algorithm having to know if it was called by classic or BiDi)

JimEvans: from the local end perspective in classic, it's not a huge burden from the selenium side to have multiple things as the trigger that says this object is a web element reference. This is doable on the local end as long as the alternative data structure returned has unique enough properties to identify it as the web element reference. This applies to other places where we return shadow roots, for example. I think there is the opaque key for frames too. It's not a huge burden to be able to accommodate that in Selenium

jgraham: nothing here that changes the requirements for clients. Client never has to interpret this and works as it works today. It only changes the requirement on remote ends

whimboo: what properties do we need on the properties object to know it's an element? we need node at least? and the shared id? shared id can link to an element or shadow root

jgraham: you take the shared id and look it up; if it is not an element, throw an error

jgraham: it just matters at lookup time

whimboo: when we serialize the element, we check the node type, and we add the shared ID if it is an element. So that is what the client gets. Based on the sharedID, we can implement the lookup? ok
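
A rough Python sketch of the lookup described above, under several assumptions: the function name, the `nodes` table, and the `"kind"` field are hypothetical, and the exceptions are generic Python ones rather than spec error codes. It accepts either a BiDi-style reference carrying `sharedId` or the classic web element reference key, looks the id up, and fails if the node is not an element.

```python
# The classic WebDriver web element identifier key.
WEB_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def resolve_element_origin(origin, nodes):
    """Resolve an element origin to a node record.

    `nodes` is a hypothetical table mapping ids to records like
    {"kind": "element"} or {"kind": "shadow-root"}.
    """
    if "sharedId" in origin:
        node_id = origin["sharedId"]          # BiDi-style reference
    elif WEB_ELEMENT_KEY in origin:
        node_id = origin[WEB_ELEMENT_KEY]     # classic reference
    else:
        raise ValueError("origin is not an element reference")

    node = nodes.get(node_id)
    if node is None:
        raise LookupError(f"no node with id {node_id!r}")
    # A shared id can point at an element or a shadow root; only elements are valid here.
    if node["kind"] != "element":
        raise TypeError(f"node {node_id!r} is a {node['kind']}, not an element")
    return node
```

The type check happens only after the lookup, which matches the point that the distinction between element and shadow root just matters at lookup time.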

sadym: Why do we need to work on the concept of element origin, rather than deserializing and then noticing the result is an element?

<whimboo> https://w3c.github.io/webdriver/#dfn-dispatch-a-pointermove-action

jgraham: It's mostly easier from a spec point of view since we don't deserialize first and then use the deserialized object; we look at the object fields as we go on

sadym: This change would change the behaviour of ChromeDriver for classic?

jgraham: Yes

jgraham: If we want to make the classic behaviour unchanged we should pass in a flag to choose one behaviour or the other to the caller

sadym: Where do we use this?

whimboo: For pointerMove we can specify three different origins, either viewport, the current position, or relative to an element
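
As a small illustration of the three origins, here is a Python sketch that computes the target point of a pointer move. The function name and parameters are hypothetical, and the element case is simplified: the element is assumed to be already resolved to its in-view center point, which is what offsets are measured from in the classic Actions algorithm.

```python
def pointer_move_target(origin, x, y, current=(0, 0), element_center=None):
    """Return the viewport coordinates a pointerMove with offset (x, y) should reach.

    origin: "viewport", "pointer" (the current position), or "element".
    current: the pointer's current position, used for the "pointer" origin.
    element_center: in-view center point of the already resolved element.
    """
    if origin == "viewport":
        return (x, y)
    if origin == "pointer":
        return (current[0] + x, current[1] + y)
    if origin == "element":
        if element_center is None:
            raise ValueError("element origin needs a resolved element")
        return (element_center[0] + x, element_center[1] + y)
    raise ValueError(f"unknown origin: {origin!r}")
```

Only the element case needs a reference to be resolved first, which is where the element origin concept comes in.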

sadym: I will review.

<whimboo> https://github.com/w3c/webdriver-bidi/pulls

<sadym> https://github.com/w3c/webdriver-bidi/pull/208

Github: https://github.com/w3c/webdriver-bidi/pull/208

sadym: I wanted to find things in the spec, but found that line breaks interfered with that. I reformatted the spec, but there was pushback that it wasn't automatically enforced. jdescottes suggested a script to detect the issue, but I'm not sure how to set it up.

(for CI)

<whimboo> https://github.com/w3c/webdriver-bidi/blob/master/scripts/build.sh

jdescottes: I proposed the script but I don't know what the preferred tooling is. Happy to try something like this if there is nothing to lint the spec. I can see how to set up github actions to lint

<whimboo> https://github.com/w3c/webdriver-bidi/blob/master/scripts/test.sh

jgraham: in terms of CI, we already have a job set up that is installing node and running the validation stuff. Because CDDL extraction is still running in node. I think the script will be easy to add to this job. The only problem is that the lint does not automatically fix it and increases the feedback time. If it is irritating enough we can probably write a script to fix it automatically

sadym: I can take another look at that PR and try to implement one of the options. I can take it over.

<whimboo> https://github.com/w3c/webdriver-bidi/pull/337

whimboo: a PR about capabilities: difference between classic and bidi. Should make it consistent.

whimboo: anything to make CDDL naming more consistent?

<JimEvans> https://github.com/w3c/webdriver-bidi/pull/109

jgraham: need to fix conflicts