<ato> Chair: AutomatedTester
<mmerrell> everyone should mark themself present
<simonstewart> I hast marked mineself as being present
<mmerrell> AutomatedTester: continuation of the bi-di talk. centered around proper examples
<mmerrell> ... need to start from implementation and go backward
<mmerrell> AutomatedTester: start with "loading", how we'd do navigation
<mmerrell> CalebRouleau: first thing would be to target a navigation (across tabs)
<mmerrell> JohnChen: we start with how nav is initiated. the DevTools method is that every tab is a separate target, and choosing which tab needs to navigate is significant
<mmerrell> .. page.navigate() has 2 params, the tab (the target) and the URL
<simonstewart> https://chromedevtools.github.io/devtools-protocol/tot/Page#method-navigate
<mmerrell> ... 3 other events: page.frameStartedLoading(), happens when the HTML is received. before that, nav is tentative (not committed yet), but once loading actually begins, a page.load event is fired
<mmerrell> ... page.frameStoppedLoading event indicates the loading is done
<jgraham> scribenick: mmerrell
UNKNOWN_SPEAKER: chrome monitors for these events, and makes decisions based on that. This is how loading happens with CDP
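A minimal sketch of the flow JohnChen describes, assuming the message shapes in the CDP documentation linked above (Page.navigate with a url and an optional frameId, and frameStartedLoading / frameStoppedLoading events carrying a frameId). The helper names navigate_command and LoadTracker are hypothetical and only illustrate how a client might follow these events.

```python
import json
from itertools import count

_ids = count(1)  # every outgoing command gets a fresh numeric id


def navigate_command(url, frame_id=None):
    """Build a CDP-style Page.navigate command as a JSON string."""
    params = {"url": url}
    if frame_id is not None:
        params["frameId"] = frame_id
    return json.dumps({"id": next(_ids), "method": "Page.navigate", "params": params})


class LoadTracker:
    """Tracks which frames are loading, based on a stream of CDP-style events."""

    def __init__(self):
        self.loading = set()

    def on_event(self, event):
        method = event.get("method")
        frame_id = event.get("params", {}).get("frameId")
        if method == "Page.frameStartedLoading":
            self.loading.add(frame_id)
        elif method == "Page.frameStoppedLoading":
            self.loading.discard(frame_id)

    def is_loading(self, frame_id):
        return frame_id in self.loading
```

A client would send the navigate command, then feed every incoming event to the tracker and treat the navigation as finished once the frame is no longer loading.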
brrian: why are we talking about this?
AutomatedTester: we're trying to use an example of a command, and working backward
brrian: is this part of the use case we talked about yesterday?
JohnChen: yes
brrian: this seems like a duplicate
jgraham: if the use cases don't include this the use cases are wrong
jgraham: the point isn't to
discuss every command, the point is to cover things that are
fundamental to the protocol, of which navigation is one
... there should be some way in the bi-di protocol to initiate
navigation, the requests, responses, etc. We need to discuss
this, because it's the base part of the framework for the whole
conversation
brrian: this isn't part of the use cases
CalebRouleau: we should be discussing navigation because it's more contentious, while we're here in the room, rather than discussing something like logging, where we'll probably agree on everything
lukebjerring: it's easier to start with a use case that involves every part of the protocol
jgraham: nav is a necessary component of a rewrite of the protocol
simonstewart: it's not necessary, and it's already in there
jgraham: but we've already demonstrated that the discussion has been incomplete
simonstewart: if the purpose here is a re-do of the bi-di protocol, then it makes sense to do load. if it's to discover the shape of how to do it, that discussion is worth having
JohnChen: one use case was the
request modification (intercept), which has been covered in the
CDP
... we could insert an event in front of the loading call,
which would allow this kind of interception in a way that is
non-blocking, which fosters an async loading of the page in
parallel with the intercept
drousso: the idea of loading a
page, and intercepting a request, are two separate things
entirely. It's not necessary to talk about these things at the
same time... the only thing that overlaps is that they're going
across the network
... a lot of what's being proposed comes down to data, and we
need to discuss the shape of what it is and how it goes, not
about the combinations of these concerns in a specific
implementation
JohnChen: whatever script is running on the page disappears once the load event changes
drousso: you should be able to send a script to the driver, which executes just prior to a load event
CalebRouleau: but JavaScript can't do everything we want, so we need to talk about it as a separate issue
jgraham: this comes down to script execution, though, not bi-di fundamentals. We need to understand the nature of the bi-di communication in order to agree on how to proceed in this conversation
CalebRouleau: this is getting too
meta, so I suggest getting back to concrete examples
... we should talk about logging, which will give us the
context that fosters agreement on a framework or bi-di
jgraham: talking about logging risks too deep a dive into a simple use case, where we'll get distracted from discussing the nature of bi-di communication
AutomatedTester: with logging, we *won't* discuss the particulars of logging, we'll stick to the fundamentals of the packets and how they look
bwalderman: that will include transports, correct?
jgraham: yes, but there's a risk that we're leaving too much out
[general agreement on moving forward]
simonstewart: we should also include the handshake
<JohnJansen> I will paste this again to make sure everyone has read Google's description: https://docs.google.com/document/d/1eJx437A9vKyngOQ49lYYD3GspDUwZ6KpKDgcE2eR00g/edit#
jgraham: has the position changed since TPAC 2018? the conclusion was around a capability that included "bi-di". Has that changed?
JohnChen: our prototype allows for that capability, which "upgrades" the connection to one that goes via a websocket, but which doesn't exactly hand back a websocket connection. It keeps the websocket connection between the client and the browser, not exposing it further
simonstewart: the top-level return payload should include the "upgrade URL", but which can be re-written
RESOLUTION: Bi-di is always enabled. An optional capability, defaulting to true, indicating that bi-di is desired. When a new session is established, the return value of the new session contains the new top-level property of the bi-directional URL
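One possible reading of the resolution, sketched in Python. The capability name "bidi" and the response property "bidiUrl" are hypothetical placeholders, since the minutes do not fix either name; only the defaulting-to-true behaviour and the top-level placement of the URL come from the resolution.

```python
def bidi_requested(capabilities):
    """The capability is optional and defaults to true when absent."""
    return capabilities.get("bidi", True)  # "bidi" is a placeholder name


def new_session_response(session_id, capabilities, bidi_url):
    """Build a New Session return value with the bi-di URL as a top-level property."""
    value = {"sessionId": session_id, "capabilities": capabilities}
    if bidi_requested(capabilities):
        value["bidiUrl"] = bidi_url  # "bidiUrl" is a placeholder name
    return {"value": value}


def bidi_url_from_response(response):
    """Return the bi-di URL from a New Session response, or None if it is absent."""
    return response.get("value", {}).get("bidiUrl")
```

Per simonstewart's comment, an intermediary node could rewrite the URL in the response before passing it on to the client.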
<AutomatedTester> scribenick: automatedtester
CalebRouleau: are we going to be doing 1 URL or multiple
simonstewart: 1
<brrian> https://www.jsonrpc.org/specification
<mmerrell> scribenick: mmerrell
brrian: I propose that we send
commands in events. We can work through examples, but it needs
to go via JSON, which can include binary data
... there are a lot of tools available for generating code
that can use this, incl C, JS, C#, etc. We should discuss
jgraham: is this close to existing implementations?
bwalderman: we should start with
JSON RPC. I second
... CDP is basically already JSON RPC, only missing a couple
things
brrian: one difference is that JSON RPC is very particular about one request, one response. You have to batch things together. We should write tests for this, but ultimately adhere to the JSON RPC protocol
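For reference, a small sketch of the three JSON-RPC 2.0 message kinds under discussion, following the specification brrian linked. A request carries an id and expects exactly one response with the same id; a notification has no id and gets no response. The method names used in any call are hypothetical and not taken from a proposal.

```python
def request(method, params, request_id):
    """A JSON-RPC 2.0 request: the id ties it to exactly one response."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def notification(method, params):
    """A JSON-RPC 2.0 notification: no id, so no response is sent."""
    return {"jsonrpc": "2.0", "method": method, "params": params}


def success(request_id, result):
    """A JSON-RPC 2.0 success response echoing the request id."""
    return {"jsonrpc": "2.0", "result": result, "id": request_id}


def classify(message):
    """Classify a decoded message by its shape."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "invalid"
```

Events pushed from the browser would most naturally map onto notifications, which is the area JohnChen flags as a challenge.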
bwalderman: it's proposed that the bi-directional WebDriver protocol uses JSON RPC
jgraham: we would need to study the standard more before agreeing to this
JohnChen: there are some pieces to this that would be a challenge for us to conform to, particularly around notifications and identification of these responses
ato: I don't think we can use
JSON RPC. We are fundamentally constrained by existing clients.
We need to be able to proxy existing RPC clients, which would
require a fundamental rewrite of all clients
... we shouldn't change the fundamental transport
protocol
... there's a corpus of clients which already use a protocol.
This would be an unnecessary change
jgraham: what's the advantage of using JSON RPC, as opposed to the CDP version of the JSON that's being carried over?
brrian: it's a spec. I don't want to be required to conform to weird CDP bugs
ato: I would like to use a more well-defined message formatting, but I'm not sure the JSON RPC is the right answer. We should nail down the specifics before we do this. JSON RPC is a good guidepost, but we shouldn't assume it will solve all our problems, and may require additional spec prose for how to define these things, and it may get very complicated very quickly
lukebjerring: can we have a translation layer? this would allow us to transition clients over time
CalebRouleau: what problems are there in the CDP right now that would prevent us from using it as a guideline?
simonstewart: we shouldn't start from an existing implementation and work backward, we should start with what we need and define the protocol as required
jgraham: the mistake we've made
is "making small changes to things that work, and spending
years fostering adoption", when we should instead focus on
solving user problems, as evidenced by the heavy usage of
CDP
... we're making a mistake by talking about the transport
layer, in that we're missing the actual use case. Let's defer
making a resolution at the moment, because we haven't moved
through the conversation enough, guided by real knowledge of
the existing issues
Hexcles: the JSON RPC spec is still a single-direction protocol. We haven't started to talk about how to make them truly bi-di, but there's nothing in the JSON RPC spec that directly encourages bi-directional communication
cb: adoption of tools is a whole other problem. Cypress, Puppeteer, etc., all use the CDP itself, so making the spec more friendly to the CDP would be a benefit to the spec, and we'd be missing a lot of use cases by ignoring it
brrian: there's a lot of concern
about the difference between what we want and what JSON RPC
offers. I've personally found it to be quite easy to follow,
and made no practical difference to the amount of change. The
benefit was that we can say we conform, and that our packets
will be predictable
... having written 3 implementations in 3 languages, I can say
it's trivial
jgraham: JSON RPC is roughly the shape of how the transport protocol should go, but there are particulars to its adoption
ato: we have concerns about version pinning as well as the server-side events as they come across. I have ideas for a transition plan, but we need to address the concerns before we can resolve
jgraham: we agree that we can't adopt the JSON RPC spec without having studied it further
<inserted> scribenick: MikeSmith
AutomatedTester: start from .. transports roughly agreed
simonstewart: How to send
references to browsing contexts, frames
... something like ExecuteScript
CalebRouleau: WD currently has a notion of "you are attached to this specific handle"
simonstewart: I think we will
allow to communicate with multiple handles
... [example of communication with ServiceWorker]
CalebRouleau: send to any browsing context, and get events back from any?
simonstewart: yes
<CalebRouleau> ack
ato: "control" messages, example,
if you want to get browser-internal logs, that might not
require a target ID
... some commands make sense in a global scope, some make sense
in scope of a single browsing context
jgraham: JS realms, global object
is a JS realm
... important to distinguish between browsing contexts and
targets
... it's important to be able to target more than just window
globals
... [inject script case]
... should be possible to specify either a browsing context, or
a target, or
drousso: similar to how Web
Inspector already works
... we include a target ID in every message
brrian: SafariDriver has an internal protocol in which the browsing context is passed around with every command
ato: what we have now requires a
lot of context-switching
... that makes sense in WebDriver's view of the world
... [which maps to how a user sees and does things]
... but I have a sense that the bidi protocol is a bit lower
than that
... so some of the things we have held true so far might no
longer hold true in the bidi protocol
... ability to associate a message going over bidi with a
specific browsing context, without needing to switch into that
context
RESOLUTION: It should be possible for command request messages to target a particular target/browsing context.
brrian: for random clients, let's make it as foolproof as possible
simonstewart: we want to
reformulate WebDriver on top of the bidi communication
thing
... have a world where we can do everything in the same
protocol
ato: we should continue to bear
in mind that we enable that kind of programming model that is
being used by, for example, Puppeteer
... message indexing is really important
... we would agree that every request of this bidi protocol
should have exactly one response
... that case is not implicit in JSON-RPC
... additional complication of CDP, the fact that it has a target
that is not a browsing context but is instead an execution
context
... you get an event back telling you that a new execution
context has been created
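A rough sketch of two points from this part of the discussion: a command may optionally name the context it targets, and every command is matched with exactly one response by its id. The "context" field and the CommandChannel class are hypothetical; the minutes only agree that such targeting should be possible, not how it is spelled.

```python
class CommandChannel:
    """Assigns ids to outgoing commands and pairs each with exactly one response."""

    def __init__(self):
        self._next_id = 1
        self._pending = {}

    def command(self, method, params=None, context=None):
        """Build a command; global commands simply omit the context."""
        message = {"id": self._next_id, "method": method, "params": params or {}}
        if context is not None:
            message["context"] = context  # hypothetical field name
        self._pending[message["id"]] = message
        self._next_id += 1
        return message

    def resolve(self, response):
        """Return the command this response answers; a second response is an error."""
        request = self._pending.pop(response["id"], None)
        if request is None:
            raise ValueError(f"no pending command with id {response['id']!r}")
        return request
```

Because the context travels with each message, a client can talk to several browsing contexts without first switching into them, which is the property ato asks for.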
AutomatedTester: what is left to do?
jgraham: so a bunch of stuff we've
been doing by reference to CDP
... wrapping messages to root them to a target
... we too want to do it that way?
simonstewart: new thing in CDP, where you have a session ID and you prepare a message and send it to a single WebSocket connection
drousso: we are similar to that
jgraham: do we want to replicate that design?
simonstewart: no
jgraham: artifact of the way that
devtools needed to operate
... the reason they added that wrapper way is because they did
not want to change the existing protocol they had, which
assumed a single browsing context
simonstewart: looks definitely
like a historical artifact
... double-encoding JSON
CalebRouleau: gross
... we don't want that
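To make the "double-encoding JSON" complaint concrete, here is a sketch contrasting the two shapes. The wrapped form is modeled on CDP's Target.sendMessageToTarget, where the inner command is serialized to a string and carried inside another JSON message; the parameter names should be checked against the CDP documentation. The flat form, with the session id next to the command fields, is shown only as an illustrative alternative.

```python
import json


def wrapped(session_id, inner, outer_id):
    """Wrapper style: the inner command is JSON-encoded, then embedded as a string."""
    return json.dumps({
        "id": outer_id,
        "method": "Target.sendMessageToTarget",
        "params": {"sessionId": session_id, "message": json.dumps(inner)},
    })


def flat(session_id, inner):
    """Flat style: the session id sits beside the command fields, encoded once."""
    return json.dumps({**inner, "sessionId": session_id})
```

With the wrapped form a receiver has to parse twice, once for the envelope and once for the message string; the flat form needs a single parse.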
brrian: we have a similar
implementation detail
... I don't think we should expose any of that
CalebRouleau: target will be
consistent across a browsing context
... not just doing what CDP does
jgraham: do we adopt the syntactic pattern that already exists? or do we do something more sane?
brrian: the existing devtools
mechanism comes from things that are specific to
debugging
... and we are not making this feature for debugging needs
bwalderman: process-switching is
an implementation detail
... I don't see any reason to abandon the current model of the
browser as we are using for WebDriver now
ato: conflation of targets and execution contexts
<ato> https://firefox-source-docs.mozilla.org/remote/Architecture.html
ato: Firefox is working on an
implementation of CDP
... see the doc at the URL above
... a target can be a tab (as opposed to a browsing
context)
... you can route individual messages using the session ID
simonstewart: so we connect, and
the thing we want to do is, register a listener (say)
... we will have a command name, a list of arguments
... how do we say which execution context it will be run
in?
JohnChen: WebDriver process is
normally implicit
... but in bidi it is different
simonstewart: context ID
ato: historically CDP did not
support site isolation
... so there are artifacts in it that were based on assuming
that
... important thing is for the context ID to be a serializable
value that can be passed around
simonstewart: we have a window handle, but not for a frame
jgraham: we do
ato: we have this in the spec but
not implemented
... CDP is both an HTTP API and a socket API
... can auto-attach
... (which changes the implicit target, btw, and not sure we
want that part of it)
... a service worker is not a browsing context
... we are inventing a super-abstraction above browsing
contexts
bwalderman: getting an event back
to the client when a mutation occurs
... maybe need a way to pass in a function to get message back
to client
ato: how to identify JS object,
we should talk about
... in CDP you can return anything; for example, a JS
object
simonstewart: element ID in the
WebDriver spec is because we were limited by the serialization
mechanism
... element IDs are the JS object reference
... window handle is the target ID for browser context
bwalderman: message
passing?
... supply a postMessage ID
ato: is that in CDP?
<mmerrell> we're also introducing a new id for the frame
CalebRouleau: is there a reason CDP does not have this?
drousso: in devtools we have direct access to the engine
CalebRouleau: how does Puppeteer do this?
<mmerrell> Simon, can you summarize the last point about the frame ID so that we have your telling on record?
ato: you can pass in functions, inner functions, or Promises
bwalderman: CDP pollutes the global namespace [to do the similar thing]
<simonstewart> Summary: Add a new "get context" function to existing webdriver. If the "current context" is a top level browsing context, this will return the current window handle. For a frame, it's "something" that we create. In both cases, the "context id" can be passed to bidi to use as a target id
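A hedged sketch of the "get context" idea in the summary above. The BrowsingContext class and the "frame-" id format are hypothetical; the summary only says that a top-level context returns its window handle and that a frame returns "something" created for it. The sketch assumes that the frame id should stay stable across calls so it can be reused as a bidi target id.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrowsingContext:
    window_handle: Optional[str] = None     # set only for top-level contexts
    frame_context_id: Optional[str] = None  # created lazily for frames


def get_context_id(ctx):
    """Return an id usable as a bidi target id for the current context."""
    if ctx.window_handle is not None:
        return ctx.window_handle
    if ctx.frame_context_id is None:
        # Hypothetical id format; any serializable, stable value would do.
        ctx.frame_context_id = "frame-" + uuid.uuid4().hex
    return ctx.frame_context_id
```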
<mmerrell> thanks
ato: current CDP primitive is
conceptually very similar to what we are already doing in
WebDriver
... primitive for script execution without a lifetime
simonstewart: there is no synchronous thing for this in CDP
jgraham: there is no blocking
ato: another model is the Promise style [in addition to return-by-value]
simonstewart: get the communication part sorted out first
jgraham: existing clients provide a way to create a custom event stream in JS?
ato: there is a way in Puppeteer
[discussion of handling of bootstrap scripts]
jgraham: this is how extensions
work, basically
... so conceptually it already exists
ato: connections in the API, for
a script injection, for a single browsing context, there could
be multiple execution contexts
... some execution contexts may be privileged
... each service worker can have multiple JS realms
jgraham: theoretically, yes, but in practice, no
simonstewart: if you send an element from the remote end to local end, how do you know which context it has come from?
ato: I don't think it does in CDP, but there is a way to query for it
simonstewart: ... which is very inefficient
bwalderman: included in the event, is how we should do it
CalebRouleau: to implement this in ChromeDriver will be a pain and inefficient
[JohnChen explains why]
JohnChen: have not found an efficient way to map element ID
CalebRouleau: in JS land
JohnChen: low-level, the IDs that devtools know about are not exposable to JS
jgraham: when you do runscript in
CDP, you get back a reference to an object
... but what WebDriver wants is not that
... so you would need to query again to get what we would
need
JohnChen: so we could do what we need but it will require additional roundtrips
drousso: similarly for us in Safari
<simonstewart> https://github.com/GoogleChrome/puppeteer/blob/master/lib/ExecutionContext.js#L142
<jgraham> https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Xray_vision
<mmerrell> scribenick: mmerrell
<AutomatedTester> https://docs.google.com/document/d/1gUm7Be-akW2-4mjr15cnZlzwoAfOlfL7b3tWCDrb1Jg/edit#heading=h.f9zxnd3oxxm9
ato: good progress was made this
morning, but we need to make sure follow-up actions are taken
in order to prevent a repeat conversation next year
... [ato takes an action to make some detailed proposals around
these decisions]
simonstewart: context: new session is synchronous--request new session, and the wait can be forever
<ato> ACTION: ato to draft proposal for the bi-di protocol interop terminology we discussed this morning
simonstewart: this can take unreasonably long, and given that it's a blocking call, this can be "bad"
cb: queuing and throttling from grid/vendors is another use case
simonstewart: networks are [not very good]. We need to have an async new session
<ato> Unrelated to the current topic, here is an example of some CDP protocol chatter: https://taskcluster-artifacts.net/FQEINPSIQ-CWhvboeeJTbg/0/public/logs/live_backing.log
simonstewart: request a new session, a token is returned, which you can use to track on your own to see the progress
mmerrell: similar to the async nature of the AWS API
simonstewart: I have a draft implementation for this
<simonstewart> https://gist.github.com/shs96c/108f5313eae54b94658ee018e37926d2
jgraham: use cases all seem to be for intermediary nodes, right? drivers are usually on local machine, so is this really a concern for non-local nodes, or can it just be implemented on intermediary nodes?
simonstewart: I want it to be
consistent, so we should only have to write the code
once.
... might not need to be the highest priority, but this would
be a benefit
ato: is this how VM requisition works in the cloud?
simonstewart: that seems to be the case
ato: we should model this kind of API on known-good usages of such a library
simonstewart: a good usecase for this on a local machine would be for queueing multiple requests
<ato> titusfortner: https://w3c.github.io/webdriver/#dfn-readiness-state
AutomatedTester: what are the session creation events?
<ato> titusfortner: Also https://w3c.github.io/webdriver/#dfn-active-session
simonstewart: the first stage is that the request is queued, then being created, then created
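The token-and-stages idea can be sketched as a small state tracker on the remote end. This is an illustration only and is not taken from the draft implementation linked above; the class, method, and state names are assumptions based on the three stages simonstewart lists (queued, being created, created).

```python
import secrets
from enum import Enum


class State(Enum):
    QUEUED = "queued"
    CREATING = "creating"
    CREATED = "created"


_ORDER = [State.QUEUED, State.CREATING, State.CREATED]


class SessionRequests:
    """Tracks asynchronous new-session requests by an opaque token."""

    def __init__(self):
        self._states = {}

    def request(self):
        """Queue a new session request and hand back a token for tracking it."""
        token = secrets.token_hex(16)
        self._states[token] = State.QUEUED
        return token

    def advance(self, token):
        """Move a request to its next stage; CREATED is terminal."""
        current = self._states[token]
        if current is not State.CREATED:
            self._states[token] = _ORDER[_ORDER.index(current) + 1]
        return self._states[token]

    def status(self, token):
        """Look up progress by token; unknown tokens reveal nothing."""
        return self._states.get(token)
```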
<ato> “A remote end that is not an intermediary node has at most one active session at a given time.”
<ato> Also:
<ato> “A remote end has an associated maximum active sessions (an integer) that defines the number of active sessions that are supported. This may be “unlimited” for intermediary nodes, but must be exactly one for a remote end that is an endpoint node.”
simonstewart: in the creation process, there are events we'd like to know about that might be interesting for the client to discover
<titusfortner> oh, yes, so a single driver, but most drivers can have multiple processes
<ato> You may have multiple processes on systems that support that.
<titusfortner> right; except safaridriver, which is why I thought this might be more interesting for Safari, but doing it in series is probably better than parallel. anyway confusion resolved, thank you!
mmerrell: would it be good to expound on the AWS example?
simonstewart: roughly, though it
will require some more investigation. But we're heading in the
right direction, with some open questions about how to return
lists of events, etc
... this would be greatly helped with a bidi implementation
JohnChen: you'd see a connection token that you could use to track the status of various events
CalebRouleau: which in Chrome's case will likely be nothing
JohnChen: there will probably be a few interesting cases, in the case of queuing in particular
simonstewart: yeah, when you want to query capacity
JohnChen: that's actually one place where Chrome is not spec-compliant--it won't queue sessions, it will just create them on-demand
cb: there is already an end-point to query the state of a session. why couldn't we just use that, rather than creating a whole new mechanism for async session management?
simonstewart: there will be other cases where you'd need this kind of information, e.g. different versions of grid, or SL, etc
cb: as an implementer you'd want to use the async version, correct?
simonstewart: yes
cb: would you then get rid of the synchronized method?
simonstewart: we'd likely reformulate the sync to use the async, and just appear to be synched
jgraham: should this be a parameter to the existing endpoint? why create a new endpoint?
brrian: what's the fallback if you have the extra param and it doesn't come back correctly?
jgraham: it would make the return type polymorphic
ato: question regarding security: with WebDriver, you can't query open sessions. Will this break that?
simonstewart: you can't get a
list of sessions--you get a request key, and you query on that
request key
... you don't get access to others that you don't already know
about
AutomatedTester: would this be implicitly handled in client bindings?
[general assent]
AutomatedTester: the end user won't know or care?
[general yes]
titusfortner: chromedriver.new() currently blocks on that call. How do we handle that?
simonstewart: the user will never know
titusfortner: the token we get back--should that be the same UUID?
simonstewart: no, there should be no requirement around that... that should be able to be determined by the implementing bindings
jgraham: another argument for
adding a param rather than a new endpoint
... this would make it even easier to foster backward
compatibility
ato: but you'd still need an
endpoint to query the status
[general yes]
ato: you have to account for whether or not you're hitting an older version of the server, without the new endpoint
simonstewart: there needs to be a mechanism to query for the async support. extra parameter is one way, new endpoint is another
jgraham: this needs to be considered a session capability, not a browser capability
ato: sending as an "alwaysMatch"
capability would risk rejection on an older driver
... there would have to be merging of the capabilities dict on
all the drivers, resulting in combinatorial explosion of logic
for processing them all
jgraham: you could keep retrying with the capabilities
ato: this is why it should be a HEAD request, to query the server's capability
simonstewart: based on the agreement from yesterday, we decided we should keep passing the capabilities through
titusfortner: from a user standpoint, I'd rather see a new endpoint, and if it fails, hit the older sync version
simonstewart: this is why ato requested a HEAD to gauge capability
titusfortner: so every session creation request is 3 separate requests?
simonstewart: yes, similar (but better) to an older version of this
titusfortner: what are the bindings going to do to optimize this?
simonstewart: 3 requests: one to
get the token, one to query readiness, and one to engage?
... the middle request would give you the state of the
request
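A sketch of how client bindings might hide this from the user: probe for async support first (for example with the HEAD request ato suggests), fall back to the existing blocking call on older servers, and otherwise run the three requests described above. Every method name on the server object is hypothetical, and FakeServer exists only so the sketch runs without a network.

```python
def create_session(server, capabilities, max_polls=10):
    """Create a session, preferring the async flow when the server supports it."""
    if not server.supports_async():                    # capability probe
        return server.new_session_sync(capabilities)   # older server: blocking call
    token = server.request_session(capabilities)       # request 1: get a token
    for _ in range(max_polls):
        if server.session_state(token) == "created":   # request 2: query readiness
            return server.claim_session(token)         # request 3: engage
    raise TimeoutError("session was not created within the polling budget")


class FakeServer:
    """In-memory stand-in for a remote end, recording the calls it receives."""

    def __init__(self, async_supported, polls_until_ready=2):
        self.async_supported = async_supported
        self._remaining = polls_until_ready
        self.calls = []

    def supports_async(self):
        self.calls.append("probe")
        return self.async_supported

    def new_session_sync(self, capabilities):
        self.calls.append("sync")
        return {"sessionId": "sync-1"}

    def request_session(self, capabilities):
        self.calls.append("request")
        return "token-1"

    def session_state(self, token):
        self.calls.append("poll")
        self._remaining -= 1
        return "created" if self._remaining <= 0 else "queued"

    def claim_session(self, token):
        self.calls.append("claim")
        return {"sessionId": "async-1"}
```

The caller sees a single blocking create_session call either way, which matches simonstewart's point that the sync form can be reformulated on top of the async one.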
diemol: how does the grid keep track of all the tokens?
simonstewart: through the hub
ato: how do you get the capabilities of the session?
simonstewart: you hit the endpoint that returns the capabilities?
ato: but we don't get that yet
simonstewart: we should add
that
... we used to have that, but we removed it
<scribe> ACTION: Simon to finish PR around the async session request
AutomatedTester: suggestion from Microsoft for scrollwheel, from Samsung
JohnJansen: pointer spec
modification request is to allow MS to create new request in
this repo to add new tests for this repo
... writing new tests is basically impossible
... like to merge this into the new tests... e.g. how fat is
the pen, what tilt is the pen
... like to create new bindings for this, to TestActions in the
WD spec. WinAppDriver can't run in WPT, which is very
challenging for engineers
... MS would like to write tests for this once it's merged
AutomatedTester: does this handle proximity cases for a pen?
JohnJansen: "tangential pressure" is what that's called. One of the new events
jgraham: what's the issue regarding running the tests?
JohnJansen: they all fail, but
they currently can't actually run
... it's frustrating to write the tests now when we know
they're going to fail. Need to update the Python bindings in
order to make them run
jgraham: why would spec changes block you from making progress on this? it's ok to make changes as a result of creating a proposal for an implementation
JohnJansen: we want to merge this now so we don't have to continue to wait
ato: we can't make this process completely atomic
CalebRouleau: this process shouldn't block you from making progress on something that will eventually be a proposal
JohnJansen: the priorities won't be written until after the changes are merged
jgraham: working group policy is that we don't make changes until the tests are written or passing
JohnJansen: the team we're relying on to write these tests is blocked awaiting official movement on the spec
CalebRouleau: WPT doesn't use selenium--it shouldn't be required that the WD spec changes in order to write WPT
AutomatedTester: this might dovetail into the discussion around the scrollwheel, but for the moment we need to decide whether to merge this or not
ato: the important thing is that it can't land in something we publish until the tests are complete
jgraham: should we provide an exception to the spec policy to provide for this?
ato: we should find out from JohnJansen's colleague what the particular blocker is to making progress. They should be able to move forward without breaking any rules or feeling any risk of having to redo or lose work
AutomatedTester: we should meet next week
<JohnJansen> ACTION: JohnJansen schedule meeting with AutomatedTester to chat with Timotius re: Test for Pointer Modification
AutomatedTester: we agreed last
year that we need a scrolling action of some sort, but we
haven't made progress
... we have Lan here to help with use cases
Lan: devtool protocol has an Action for mouse wheel, but this shouldn't be part of the Pointer actions. It should be its own kind of action
ato: why is mouse wheel not associated with the mouse device?
Lan: it is associated with mouse, but it should be decoupled from the mouse--scrolling actions can happen without input from the mouse
ato: something like mouse wheel introduces questions: what does the wheel actually do? most things are definite (XY coordinates, etc), but mouse wheel motions are vague, and based on preferences set in the system or browser
jgraham: but we have this kind of ambiguity in plenty of other input devices, and these things break across systems on occasion
ato: compared to a mouse click or button press, the wheel is less deterministic
<simonstewart> https://w3c.github.io/uievents/#event-type-wheel
jgraham: you still end up
generating DOM events that are predictable with a mouse wheel,
so we should be able to model this interoperable behavior like
anything else
... it's the same as measuring the mouse motion itself--we
measure by deltas. This is just a slightly different mechanism
for measuring a slightly different input device
Lan: we propose extending the action API, by adding the delta of the mouse wheel, or one "tick" as defined by the mouse hardware
simonstewart: right, as ato says,
we don't measure movement, we measure deltas by CSS
pixels
... looking at the events generated by the wheel in the spec
above, it's still measured by CSS pixels
<ato> https://developer.mozilla.org/en-US/docs/Web/API/Element/wheel_event
<ato> Funky example.
simonstewart: there should still be a new type, and we would allow these deltas to be recorded, which would map well down to the expected event
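As an illustration of what a separate wheel action type carrying deltas might look like, here is a sketch that builds an actions payload. The "wheel" source type, the "scroll" action name, and the x/y fields are assumptions made for this example and may differ from the PR under discussion; deltaX and deltaY mirror the attributes of the DOM wheel event and are treated as CSS pixels, as discussed above.

```python
def wheel_scroll(x, y, delta_x, delta_y):
    """One scroll action at viewport position (x, y), with deltas in CSS pixels."""
    for name, value in (("x", x), ("y", y), ("deltaX", delta_x), ("deltaY", delta_y)):
        if not isinstance(value, int):
            raise TypeError(f"{name} must be an integer number of CSS pixels")
    return {"type": "scroll", "x": x, "y": y, "deltaX": delta_x, "deltaY": delta_y}


def wheel_source(source_id, actions):
    """An input source of a new 'wheel' type, separate from pointer sources."""
    return {"type": "wheel", "id": source_id, "actions": list(actions)}


def actions_payload(*sources):
    """Wrap input sources in the shape used by the Perform Actions command."""
    return {"actions": list(sources)}
```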
<jgraham> https://github.com/w3c/webdriver/pull/1410/files
jgraham: the above PR for "pointer wheel" might be good enough as-is
simonstewart: this needs a test
jgraham: with tests, if Lan agrees, this should be good enough
Lan: would a WPT be good enough?
[room says yes]
JohnJansen: for implementation of this, Windows cares whether you're swiping or turning the wheel here. How do we mimic the hardware here?
ato: we don't
AutomatedTester: if you're using a finger, wouldn't that be a pointer gesture?
JohnJansen: yes, we already account for these swiping gestures. The wheel is different
CalebRouleau: this is just creating the web events, it has no hardware interaction
jgraham: does scrolling with your finger create web events?
AutomatedTester: no, it's a touch-and-drag
<scribe> ACTION: ask for tests on the above PR
<ato> ScribeNick: ato
brrian: This doesn’t have a scroll wheel.
jgraham: It seems totally reasonable for the driver to complain if the scroll wheel is not supported on the platform.
<JohnJansen> (brrian was pointing to his phone re: no mouse)
https://developer.mozilla.org/en-US/docs/Web/API/Element/wheel_event
ato: "wheel" event is being emulated on macOS.
brrian: There’s no way to scroll on these devices using WebDriver.
jgraham: Is the proposal that we also have a generic scroll API?
simonstewart: Last year we did
the scrollToElement in the middle of an action chain,
... scrollIntoView in the middle of actions.
<AutomatedTester> scribenick: AutomatedTester
brrian: we are going to need a mechanism that allows us to scroll
jgraham: yes, we agreed on this last year that would allow us to move to an element and we can move that
ato: yes, but no one has actually specified that
jgraham: there is the argument, why do scrollwheel or just normal scroll
ato: we need both so that we can have the wheel DOM events
RESOLUTION: create scroll as discussed last year as well as item for something that gives off wheel events
<ato> ScribeNick: ato
JohnJansen: We have large teams
around building PWAs, like Twitter.
... Testing these things isn’t just about testing the DOM as
WebDriver exposes it.
... But also things like service worker.
... It spans in a weird way the area between the web platform
and other things around you.
... Is testing service worker in WebDriver in scope? Or should
we look into other solutions?
simonstewart: We have alluded to it in our earlier bi-di discussion, that automating different execution contexts and JS realms are in scope.
AutomatedTester: What would be different from testing Outlook as a PWA, as opposed to Outlook in a browser? What are the expectations that would be different?
JohnJansen: There’s no address bar, the frame of the browser is different, and the features that PWAs access could be in service workers which we can’t yet access from WebDriver.
... Media queries would be interesting.
... They could be, for example, completely full screen.
bwald_: They are likely to be productivity apps, they might be using the native file system.
... There’s no way for WebDriver to mock these things so the page thinks it’s interacting with a real file system.
AutomatedTester: Take files as an example: are those APIs only going to be available to PWAs and not websites?
JohnJansen: There’s talk about extending the Permissions API to cover PWAs.
ato: Permissions API already has a WebDriver extension, and it has an implementation in Chrome.
bwald_: Let’s say you use this API to grant geolocation to a page.
... Why grant it permission if it can’t interact with it?
... Are there external tools that drive these tools?
JohnChen: A website wants to access my web cam, a popup will ask me if I want to grant permissions.
jgraham: In bi-di you would get an event that this popup appeared.
ato: In WebDriver you have this strange API similar to the unhandled prompt behaviour.
CalebRouleau: But we don’t have a way to mock out the devices at all.
bwald_: File API lets you access the native file system and this might pop up a native widget for selecting a file.
... This is not automatable with WebDriver.
jgraham: We can do interaction with <input type=file> but it’s just very hard to model this in a nice way over a command-response based API.
... Let’s hope we get bi-di before we really need to implement this.
AutomatedTester: Conclusion is that this falls within the scope of the WG.
... Making sure that people actually use us for wider review (horizontal review) whenever these changes are coming up, especially for Fugu, is important.
JohnJansen: I didn’t know that the Permissions WG had extended WebDriver.
... Is there a reference from WebDriver to their spec?
ato: No, the relationship is the other way around.
<jgraham> https://github.com/w3c/webdriver/pull/1410#issuecomment-533424647
<scribe> ScribeNick: ato
<boaz> scribenick: boaz
reillyg: I work on chrome on apis like web bluetooth and usb. These apis are difficult to test because they don't follow the normal web testing workflow. They rely on a hardware device and need a mock of that. Or if you are a web developer, you might have real hardware and want to use web driver to drive your app with the peripheral attached.
... I struggle with adding web driver commands because I have to land patches in 7 places.
AutomatedTester: are your spec prose changes to webdriver, or to an automation section of an existing spec?
reillyg: an existing automation section, now that I am aware of that.
AutomatedTester: now that you have extension points, is there anything else that we can make simpler?
reillyg: plumbing a new command through web driver layers of the web driver spec, cdp, chrom/saf/ff-driver, etc.
... anything we could do to reduce the number of places would be great. and any guidelines that you could offer would be helpful.
AutomatedTester: I think some of this is browser specific.
... for a spec author, we want to focus on how easy it is to write spec prose. and potentially any plumbing that may need to change to the client. what you need to do in your own browser is somewhat orthogonal.
reillyg: when I write my spec I can take my idl and copy/paste that into a file in my browser codebase and then there are tools I have to plumb the interfaces into my engine.
... I'd like to see a similar tool for my chromedriver code.
brrian: is there anything else reducible about this?
reillyg: I'm complaining about how many projects I have to touch.
simonstewart: you should only need to define the endpoints, and put it into one implementation
jgraham: it is legitimate that we don't have a lot of idl and you need to do a lot of extra work around that. I don't think having that would help you write tests.
brrian: I could make an ingester
... but it hasn't been a pain point
jgraham: I think it is a bit of a pain point.
brrian: it's not a pain point because there are ~12 endpoints, and not 100s of js apis
ato: the concrete feedback to this working group is that it is daunting to add new web driver commands, and we should take that seriously
... historically it has been difficult to map things you want to test onto the req/resp flow. we are now working on a bidirectional protocol. I think this will make it a lot easier to add extensions for specs that currently aren't tested.
... permissions for example, we had to contort ourselves and make a weird api to fit into the current model.
... however, this is all complicated in different ways depending on the various tech stacks. chrome may be harder than safari.
... I think we should take the step to document our expectations of how people are going to use web driver
... this is connected to what I said before this topic started. we should gather these things.
brrian: I agree
CalebRouleau: we have that
ato: we could use that
CalebRouleau: good
reillyg: we only have a few dozen web driver commands because it is confusing for people
AutomatedTester: yah
... linking to prior art would be helpful. if we looked at the docs for wdspec, this would help things.
ato: and this would have solved microsoft's problem of not understanding the process.
simonstewart: is it obvious who to reach out to for help?
reillyg: well, we have a problem inside chrome, because I don't know who owns chromedriver.
CalebRouleau: he is sitting here (points to right)
simonstewart: myself and david are the webdriver editors
... we can help you
... what made this so hard?
reillyg: I will take as an action item to reach out to people in my org.
simonstewart: but what could we have done?
ato: I think we are bad at following up on issues being filed.
jgraham: I think there are also some issues with extensions.
... also, wrt web idl, I think there is a possible opportunity for using a schema language when we write down bidi
<simonstewart> https://swagger.io/specification/
<jgraham> https://json-schema.org/
RESOLUTION: research having a more formalized schema for defining the transport layer
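As a rough illustration of what a more formalized schema for the transport layer could buy, the sketch below describes one command as data and checks a message against it. The schema layout, the "navigate" method name, and the validate helper are all hypothetical; a real effort would more likely use an existing schema language such as the two linked above.

```python
# Hypothetical schema for a single command message; names are illustrative only.
NAVIGATE_SCHEMA = {
    "required": {"id": int, "method": str, "params": dict},
    "params": {"url": str},
}


def validate(message, schema):
    """Return a list of problems; an empty list means the message conforms."""
    problems = []
    # Check the top-level fields every command message must carry.
    for key, expected in schema["required"].items():
        if key not in message:
            problems.append(f"missing field: {key}")
        elif not isinstance(message[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    # Check command-specific parameters only when params is a dict.
    params = message.get("params")
    if isinstance(params, dict):
        for key, expected in schema["params"].items():
            if not isinstance(params.get(key), expected):
                problems.append(f"params.{key} should be {expected.__name__}")
    return problems


good = {"id": 1, "method": "navigate", "params": {"url": "https://example.org/"}}
bad = {"id": "1", "method": "navigate"}
print(validate(good, NAVIGATE_SCHEMA))
print(validate(bad, NAVIGATE_SCHEMA))
```

With commands described as data like this, a new command could in principle be added in one place and consumed by each driver and client, which is the pain point reillyg raises.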
reillyg: there is also this issue where web driver is overloaded, both web driver and selenium, and I don't know if I need to add my work to both, and it would be nice to have tools for this
ato: in the interest of making progress, what we should do
<scribe> ACTION: cb to draft some high level documentation containing who is owning the driver, how to add wpt tests, putting the web driver api we have now and putting it in swagger yaml and publish it so it can be consumable
ato: I think the most important thing for us to do in terms of our expectations is to say what web driver cannot do
CalebRouleau: are you saying you would have discouraged permissions?
ato: if bi-di were a serious conversation when permissions was happening, I would have discouraged it.
<jgraham> close the queue
Present: AutomatedTester CalebRouleau Hexcles JohnChen JohnJansen Lan MikeSmith ato brrian bwalderman cb diemol drousso jgraham mmerrell scheib simonstewart titusfortner zghadyali zcorpan boaz
Scribes: mmerrell, automatedtester, MikeSmith, ato, boaz
Agenda: https://docs.google.com/document/d/1gUm7Be-akW2-4mjr15cnZlzwoAfOlfL7b3tWCDrb1Jg/edit#