Meeting minutes
Recording: https://
Slideset: https://
TPAC 2022 🎞︎
Dom: TPAC is being considered as a hybrid event this year - please indicate whether you think you might attend such an event in person
[from online poll: 3 Yes, 4 No, 4 don't know]
WebRTC-SVC 🎞︎
Bernard: issue #68 relates to behavior of getParameters() - unclear about re-negotiation (vs before/after negotiation)
… PR #69 has proposed text that clarifies that we're talking about **initial** negotiation (before/after)
… if you re-negotiate, you'll still get the currently configured scalability mode
Harald: wfm
Jan-Ivar: is this correct? getParameters() algos are very explicit about what you get based e.g. on localDescription
… some come from pending, others from current
Bernard: let's say you change preference order for codecs, and you renegotiate (e.g. from VP8 with L1T2 to H264 that doesn't support scalability) - what happens then?
… at what point do things change?
JIB: even without setCodecPreferences, getParameters() may return different values depending on whether re-negotiation is happening or not
… e.g. if you have a local offer, it might affect the results
Bernard: looking at the VP8→H264 case, what should happen?
HTA: as long as you're sending VP8, you should get L1T2 back
… when you switch to H264, you get L1T1 back
Bernard: that's what I would expect and what the text tries to convey
… nothing changes until the new codec starts being used
… JIB, could you write up your concern in #68 ?
RESOLUTION: Continue discussion in issue #68
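To illustrate the VP8 to H264 case discussed above, here is a minimal sketch of the behavior Harald and Bernard describe: the reported scalability mode follows the codec actually being sent. The function name, the support table, and the fallback to L1T1 are assumptions made for illustration; this is not the algorithm from the spec or from PR #69.

```typescript
// Hypothetical model of the behavior discussed in the minutes, not the spec algorithm.
type ScalabilityMode = "L1T1" | "L1T2" | "L1T3";

// Assumed support table for illustration: VP8 offers temporal layers, H264 does not.
const supportedModes: Record<string, ScalabilityMode[]> = {
  "video/VP8": ["L1T1", "L1T2", "L1T3"],
  "video/H264": ["L1T1"],
};

// Returns the mode a getParameters()-like call would report, given the codec
// that is being sent right now and the mode the application configured.
function reportedScalabilityMode(
  codecInUse: string,
  configuredMode: ScalabilityMode,
): ScalabilityMode {
  const modes = supportedModes[codecInUse] ?? [];
  // The configured mode is reported for as long as the codec in use supports it.
  // Once a codec without scalability support takes over, fall back to L1T1.
  return modes.indexOf(configuredMode) >= 0 ? configuredMode : "L1T1";
}
```

In this model nothing changes at renegotiation time itself; the reported value only changes when `codecInUse` changes.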
WebRTC-Extensions 🎞︎
Bernard: Fippo gathered a list of hardware acceleration bugs that have been encountered
… which raises the question of allowing developers to disable hardware acceleration
… WebCodecs provides an enum to hint about whether or not to use hardware acceleration
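For reference, the WebCodecs hint mentioned above takes one of three values. The sketch below mirrors those values in a local type and shows how an application might fall back to software once it suspects a hardware problem. The `EncoderConfigSketch` interface, the `buildEncoderConfig` helper, and the `hardwareSuspect` flag are hypothetical names introduced here, not part of WebCodecs.

```typescript
// Mirrors the values of the WebCodecs HardwareAcceleration enum.
type HardwareAcceleration = "no-preference" | "prefer-hardware" | "prefer-software";

// Local stand-in for the subset of VideoEncoderConfig used in this sketch.
interface EncoderConfigSketch {
  codec: string;
  width: number;
  height: number;
  hardwareAcceleration: HardwareAcceleration;
}

// Hypothetical helper: prefer software when the app has detected a hardware issue,
// otherwise leave the choice to the user agent.
function buildEncoderConfig(
  codec: string,
  width: number,
  height: number,
  hardwareSuspect: boolean,
): EncoderConfigSketch {
  return {
    codec,
    width,
    height,
    hardwareAcceleration: hardwareSuspect ? "prefer-software" : "no-preference",
  };
}
```

The value is only a hint: a user agent may still pick a different implementation than the one requested.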
Bernard: I looked into 2 approaches: setParameters, setCodecPreferences
… the first one doesn't really work since the envelope of changes may not include hardware alternatives
… it also only makes sense if mid-stream switch is necessary
… the second approach goes through re-negotiation via setCodecPreferences()
… How would you discover this?
… Media capabilities may need amendment https://
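As a rough illustration of the second approach, the sketch below reorders a codec list so that one codec comes first, which is the kind of list an application would pass to setCodecPreferences() before renegotiating. The `CodecCapabilitySketch` interface is a local stand-in for the codec capability dictionary, and `preferCodec` is a hypothetical helper. Note that nothing in these entries says whether a codec is hardware or software backed, which is the discovery problem raised above.

```typescript
// Local stand-in for the fields of a codec capability entry used here.
interface CodecCapabilitySketch {
  mimeType: string;
  clockRate: number;
  sdpFmtpLine?: string;
}

// Hypothetical helper: moves every entry matching preferredMimeType to the front,
// keeping the relative order of all other entries unchanged.
function preferCodec(
  codecs: CodecCapabilitySketch[],
  preferredMimeType: string,
): CodecCapabilitySketch[] {
  const wanted = preferredMimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const others = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return preferred.concat(others);
}
```

In a browser, the reordered list would be handed to a transceiver's setCodecPreferences(), and the change would take effect after the next offer/answer exchange.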
Dom: should this be managed by the browser rather than left for developers to detect and manage?
Bernard: this would be useful *when* developers detect a problem so that they don't need to wait for browsers to react to it
Florent: there are also cases where a decoder interacts badly with a specific encoder
JIB: for setParameters, there are read-only properties
… putting it in codeccapability (which is returned to developers) means doubling the number of entries
Bernard: you may not have to return it from Capability
JIB: but then it doesn't fit very well with a notion of codec preference
… we've also moved fingerprinting surface to media capabilities
… I wouldn't want to reintroduce concerns without good reasons
… it doesn't seem necessary to include that info if it is tackled as a preference
Johannes: I understand this as developer wanting to disable hardware encoding as a short-term patch to the browser getting it fixed
… it sounds like a recovery mode, more than a capability
… also agree it's hard for developers to use it, but that it would have its uses
Harald: routing around bugs is for specific implementations of the codec, which requires they know the specific implementation
… does that point toward media capability as the right way to go?
Bernard: that's where you'd find out if it's "smooth", "power efficient", "supported"
Harald: if it's X's hardware encoder with software version Y, that may be the information you need to know whether or not to use it
… not sure that fits with the Media Capabilities model
Johannes: it would seem challenging
… Also, the bugs that have been identified seem to be browser-specific
… there are block-lists for this or that hardware; it may be worth investigating the possibility of moving towards dynamic blocklists from browsers
Riju: we share the GPU blocklist defined in Chrome with our driver team to get them to be fixed platform by platform
Harald: no clear resolution, but some suggested paths worth exploring
Harald: issue #99 about RTP header extension
… if an implementation supports an extension, it doesn't show up in Capabilities at the moment
… is this problematic? if not, no change needed; if it is, we may need to surface that it exists but is disabled by default
… you can get the information by inspecting the offer, so this may not be needed
Bernard: it's a convenience in the use case; there will be scenarios where you don't want to set it on by default
Dom: is anyone asking for it?
JIB: if this is for debugging, looking at the SDP is fine; if it's to control running code, it should be an API
Harald: the most likely example would be if transport-cc is not supported, I fallback to another congestion control
… I think it can be shimmed by creating an offer and dancing with a throw-away peer connection
Dom: not hearing a lot of pushback, nor a lot of demand either; maybe wait until we have more demand if it can be designed in a way that is backwards compatible
Harald: yes, it can be done later in a backwards compatible way
RESOLUTION: close #99 with no change
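The "inspect the offer" workaround mentioned above can be sketched as a small SDP scan for `a=extmap` lines. The helper names and the sample SDP are made up for illustration; in a real page the SDP string would come from an offer created on a throw-away peer connection, as Harald describes.

```typescript
// URI commonly used for the transport-wide congestion control header extension.
const TRANSPORT_CC_URI =
  "http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01";

// Hypothetical helper: collects the distinct header extension URIs found in an SDP blob.
function offeredHeaderExtensions(sdp: string): string[] {
  const uris: string[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    // a=extmap:<id>[/<direction>] <URI> [attributes]
    const match = /^a=extmap:\d+(?:\/\S+)?\s+(\S+)/.exec(line);
    const uri = match ? match[1] : undefined;
    if (uri !== undefined && uris.indexOf(uri) < 0) {
      uris.push(uri);
    }
  }
  return uris;
}

// Hypothetical helper: true if the given extension URI appears in the SDP.
function offersExtension(sdp: string, uri: string): boolean {
  return offeredHeaderExtensions(sdp).indexOf(uri) >= 0;
}

// Hand-written sample for illustration only, not output from a real browser.
const sampleSdp = [
  "v=0",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid",
  "a=extmap:3/sendrecv " + TRANSPORT_CC_URI,
].join("\r\n");
```

An application could use the result of `offersExtension(sampleSdp, TRANSPORT_CC_URI)` to decide whether to fall back to another congestion control mechanism.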
Avoiding the “Hall of Mirrors” 🎞︎
Elad: the proposal would be to add a new member to the DisplayMediaStreamConstraints à la includeCurrentTab to hint to the UA whether or not to include the current tab
Elad: influencing the user decision in picking display surfaces has security implications
… but I argue that in this case, it is not problematic: the risks of selection are of two kinds:
… - the attacker influences the user to share a surface under the attacker's control
… - the attacker influences the user to share a tab with sensitive content (e.g. their bank account)
… but excluding-self is orthogonal to these
Elad: if we agree this is worth solving, the question becomes what the default value should be
… if we make it optional, this could be left as a UA dependent default
Elad: a potential expansion would cover additional surfaces (e.g. screen)
JIB: #209 has the detailed discussion - what is the proposal we're reviewing?
Elad: I suggest adding a dictionary member (either include or exclude) that serves as a hint, with no change to current behavior
JIB: I like this API, but would want the default to be "false"
… I don't think this is so much about hall of mirrors - a symptom that the UA could address either way
… the real issue is that in many cases, self-capture is NOT the intent
… long term, self-capture would be getViewportMedia
… some sites want self-capture to be part of the selection - they would need to opt in
… also, TAG guidance is that undefined maps to false
Elad: re default true - agree
… re alternative approaches Youenn suggests, I don't think it works for current tab (it would work for current screen)
… I agree with your characterization that the root cause is if you're not ready to self capture
… I suggest we don't take getViewportMedia into account since there is little visibility in terms of its adoption
… I think we should avoid breaking apps, even if briefly
JIB: I think we should keep that separate from what implementations do
… here the question is what's the most frequent case, most sites wouldn't want it
Elad: lots of self-capture happening every year; assume a lot of it not accidental
Youenn: re security, the current spec doesn't deal much with tab capture in that regard
… we're bringing more and more control to what UAs will show, and that means we need to strengthen the guidance to UAs
… Chrome has some mitigations in this space that might serve as a starting point
… If this is a hint, this is fine
… Some implementations might remove entirely the possibility to select the tab, that's something new
… hints make it possible to push users towards the more meaningful choice, but leave the user in charge of the final choice
… re hall of mirrors - I don't think this is solving it
… some native apps have implemented current-app blurring to solve the issue
… cropping would be another way to solve the issue
… if it's only a hint, it's fine; but if it brings a required behavior, I don't think we should go there
… also want more security guidance
… and keep issue open on addressing other aspects of hall of mirrors
Elad: could you help with the security guidance?
Youenn: Ideally would like to get the work that Chrome has done
Dom: +1 on a hint; if boolean is problematic, we can use an enum to avoid the default value fallback
Elad: happy to help with getting the security considerations with guidance from Youenn on what he wants to see
Harald: hearing overall support to continue in that direction, towards a hint
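The default-value question discussed above can be made concrete with a small sketch contrasting a boolean member, where a missing value is treated as false, with an enum member, where a missing value leaves the decision to the UA. All names except `includeCurrentTab` are hypothetical, and neither function is taken from a spec or from issue #209.

```typescript
// Hypothetical enum shape for the hint; the values are illustrative only.
type SelfCaptureHint = "include" | "exclude";

// Boolean member: a missing value is indistinguishable from false.
function offerCurrentTabBoolean(includeCurrentTab?: boolean): boolean {
  return includeCurrentTab === true;
}

// Enum member: a missing value falls back to whatever the UA prefers.
function offerCurrentTabEnum(
  hint: SelfCaptureHint | undefined,
  uaDefault: boolean,
): boolean {
  if (hint === undefined) {
    return uaDefault;
  }
  return hint === "include";
}
```

In both shapes the result only models what the UA offers in its picker; since this is a hint, the user stays in charge of the final choice.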
Display Surface Hints 🎞︎
Elad: similar to previous issue, but distinct
… some apps want to hint to the UA that they are geared toward a particular display surface type
… I think there is agreement that this is worth supporting
… but we've struggled to find an approach that everyone likes
… I'm suggesting a compromise based on the discussion which would be:
… - use constraints as a mechanism
… - make it a hint with UA dependent behavior
Youenn: hint is fine; it could be a constraint as a model, but with an improved simpler WebIDL surface
Elad: reject on "exact"?
Youenn: "exact" would be ignored
Harald: -1 on integrating this in the proposal - I hate irregularities
JIB: +1 to Harald; "exact" is already a type error in getDisplayMedia which already narrows down the constraint mechanism
… agree with reusing displaySurface
… I have concerns with an app asking for a monitor - I don't think we should provide this level of control
… I proposed text to steer away users from monitor capture
Elad: this is a hint - UAs can decide not to follow it
Dom: with a hint, UAs can provide the best experience they can
… not sure the SHOULD would achieve much if the main target isn't interested in SHOULD
Youenn: the SHOULD would be useful for new implementors
Elad: there is merit to that
… non-normative language pointing to the risk would be good
JIB: the SHOULD already allows for this; given Chrome has a good motivation, this feels like an exact reason why SHOULD would be used
RESOLUTION: modulo discussion on SHOULD guidance, we adopt the displaySurface constraint proposal to manage Surface Hints
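A minimal sketch of how a displaySurface constraint could be read as a hint, consistent with JIB's point that "exact" is already a type error in getDisplayMedia. The types and the `displaySurfaceHint` helper are local, hypothetical stand-ins, and what the UA does with the returned hint is left open, as agreed.

```typescript
// Illustrative subset of display surface types.
type DisplaySurface = "monitor" | "window" | "browser";

// A bare value or an object with "ideal"; "exact" is modeled only so it can be rejected.
type DisplaySurfaceConstraint =
  | DisplaySurface
  | { ideal?: DisplaySurface; exact?: DisplaySurface };

// Hypothetical helper: extracts the surface type the app hints at, if any.
function displaySurfaceHint(
  constraint?: DisplaySurfaceConstraint,
): DisplaySurface | undefined {
  if (constraint === undefined) {
    return undefined;
  }
  if (typeof constraint === "string") {
    // A bare value is treated the same way as "ideal".
    return constraint;
  }
  if (constraint.exact !== undefined) {
    // Mirrors getDisplayMedia, which rejects "exact" constraints.
    throw new TypeError("exact is not allowed for display capture constraints");
  }
  return constraint.ideal;
}
```

Because the result is only a hint, a UA remains free to ignore it, for example to steer users away from monitor capture.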
getViewportMedia update 🎞︎
JIB: FYI, there is a PR up to describe getViewportMedia which we hope to bring to a call for adoption soon
Viewport Capture Unofficial Draft
Youenn: we probably need a different set of constraints than the ones for getDisplayMedia
… re audio, we need to think about whether to include system level audio or just current tab
JIB: currently restricted to current tab
Harald: if it can't be isolated, no audio should be captured
JIB: there are pending PRs that I hope will be merged before we start the call for adoption
Elad: the general intent of this work is awesome; looking forward to seeing it implemented
… that said, until we see it adopted, we need to be careful in basing our decisions on this work, or consider relaxing some of the restrictions
Youenn: has there been any outreach to web developers re x-origin isolation?
Elad: the feedback I got from developers was this was a blocker for them
Bernard: ditto
JIB: I agree this is taking the long view here
… hence the flexibility we're showing on getDisplayMedia
… re using different constraints, we can change it when it shows as needed
Youenn: displaySurface would be one case where this is needed
MediaCapture Extensions proposals 🎞︎
Riju: this is follow up from a conversation that started at TPAC
Riju: PR #48 is allowing in-browser face detection
… when we showed this last time, the feedback included:
… - tie it to VideoFrame rather than MediaStreamTrack, which the PR reflects
… - future-proofing the bounding box approach - this is addressed with the Contour described in the PR, with a way for the developer to request something other than the default 4
… - another request was to have a face mesh - which is now exposed as an additional property (although there is no native support for it today)
… - face expression was raised as a concern, so we removed it
… - making face detection work with transform stream
Riju: we've put up an example to show how they would work together
… we've done early testing that shows improved power consumption - more specific numbers to be shared soon
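To make the contour idea above concrete, here is a minimal sketch that derives a bounding box from a list of contour points, which is one thing an application could do with the default 4-point contour. The `Point2DSketch`, `DetectedFaceSketch`, and `BoxSketch` shapes and the `boundingBox` helper are hypothetical and are not copied from PR #48.

```typescript
// Hypothetical shapes for illustration; not taken from the PR.
interface Point2DSketch {
  x: number;
  y: number;
}

interface DetectedFaceSketch {
  contour: Point2DSketch[];
}

interface BoxSketch {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Computes the axis-aligned box enclosing all contour points,
// or undefined when the contour is empty.
function boundingBox(face: DetectedFaceSketch): BoxSketch | undefined {
  if (face.contour.length === 0) {
    return undefined;
  }
  let minX = Infinity;
  let minY = Infinity;
  let maxX = -Infinity;
  let maxY = -Infinity;
  for (const point of face.contour) {
    minX = Math.min(minX, point.x);
    minY = Math.min(minY, point.y);
    maxX = Math.max(maxX, point.x);
    maxY = Math.max(maxY, point.y);
  }
  return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
}
```

The same helper works unchanged if a more detailed contour with more points is requested.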
Youenn: good to expose it on VideoFrame; but would also be good to expose in requestVideoFrame callback e.g. for use with canvas
… re using "exact" constraints - I would expect "exact" not to be allowed in this
… There seem to be switches to give hints to cameras - do we need several switches to allow per-algo enabling, or could we have a single "face detection" switch?
Riju: e.g. "is face detection supported"?
Youenn: why multiple switches if a single one is good enough, leaving it to the Web app to deal with what they're obtaining
Riju: for instance, contour points would allow future support for additional more detailed contours
Youenn: since the camera is doing the work, not clear we need to give more hints to the driver
Riju: contour/mesh were added for extensibility
Youenn: maybe reduce to what's implementable, while future-proofing it
Bernard: high level questions about the API surface
… I understand the supported constraints & capabilities are used to provide the basic parameters for the algorithm in the driver
… videoFrame.detectedFaces is already done by the driver
… as opposed to have a promise-based method to which the parameters would be given
… if your camera driver doesn't support it, you wouldn't have it
Riju: going through promises would impact performance and redo work the driver has already done
… OS level face analysis would duplicate computation already done in the driver
JIB: so, it's a camera API - only available to sources that are camera?
Riju: right
JIB: my concern is that there is another effort in the WICG, the shape detection API - how does it relate to it?
… would be unfortunate to have to deal with face detection differently depending on the source
Riju: shape detection works on images, can be called multiple times
… no face tracking available, which helps detecting face across frames efficiently
… face detection is based on OS level face analysis, which duplicates the driver work and is less power efficient / robust
… we started from that API in our effort in this space - we feel this new approach gives much better results
… FaceDetector is only supported in Windows atm; the work has stopped afaict
Bernard: so you're saying the WICG work is not going ahead?
Riju: I can check the status with Reilly (but my team was the one behind the implementation)
Harald: I share some of JIB's worries
… we have functions today that depend on high quality face detection e.g. background blur
… I'm worried about having these different interfaces to solve the same problem
… esp if some interfaces end up proprietary
… if the proprietary interfaces provide much higher quality than what standard interfaces can provide
… hence my pushback on making contours and meshes available in the API
… I'm still not happy with the design that seems to be totally focused on centering this on hardware/driver resources rather than a representation API
… it has a bit of that flavor, but there is still a lot of a sense of configuring the camera
… also I'm surprised this only gives a 50% factor over media pipe
… but in general, this feels like a major new way of treating media information
… I'd like to see this put forward as a proposal, not as a set of API patches
… with an explainer, use cases, examples - that we typically put together before agreeing to take it up
Riju: no need to configure the driver
… the PR includes examples
Harald: I'm thinking of what applications would use this for, what problems to solve
Dom: what an explainer would cover
Riju: I can come up with that
Dom: happy to help with the logistics of making it happen
Riju: is the question about whether this is useful or not?
Harald: yes
Bernard: or rather whether it handles all the use cases people want
Jan-Ivar: e.g. tying this with camera may become obsolete or too limiting
… having an API that isn't as strongly tied to hardware acceleration
Harald: I'd like to have a better understanding of which apps want a rectangle around a face
Youenn: encoders actually optimize around faces if such metadata are available
… +1 on defining API that can obtain metadata from the hardware or a TransformStream
JIB: among other things, having less hardware-dependency allows UAs to step in
Riju: backgroundBlur has more platform API support than replacement
Youenn: iOS has the ability to switch on & off background blur, fully outside of the Web app, and fully dynamic
… the Web app could not unblur if the user has set this up at the OS level
… (but not vice versa)
… that situation is not well supported by constraints
… we may need a way to surface whether a constraint *can* be changed (and to signal when it can no longer be changed)
JIB: this is a case where constraints work very well - the app states its ideal
… background blur is popular, would be good to support it
Youenn: I don't think "ideal" suffices to expose the situation
… re backgroundBlur level - it's not settable on iOS; are there platforms that would benefit from it?
Riju: no platform API supports this, but some software models have that parameter
… but I understand some platforms are working towards making it settable
Youenn: but without knowing the algorithm, setting a particular value would be hard for developers
… we may need a boolean instead
JIB: part of the question is whether this needs to be controllable by apps vs the UA
Harald: in audio, we've encountered cases where it's valuable to be able to tell that settings manipulated in the driver, which are supposed to be useful, actually create issues
… e.g. double echo cancellation control
… the most important control we have is to turn platform effects off; the second was to detect the situation to ask the user to turn it off
Riju: on the last three proposals (lighting correction, face framing, eye gaze correction), any sense of interest?
… the goal is to give options to developers on whether or not to use hardware capabilities
Bernard: should we get back to this in April?
JIB: from Mozilla's perspective, we don't have strong interest in this approach given possible interop cross-OS issues
… we don't see any urgency
Harald: for face detection, we have a pretty solid way forward via the explainer with use cases and justifications to support adoption
… some of these additional camera controls may fit into that new document
… if we accept constraints as a way to control camera drivers, grouping them together makes sense
JIB: but adding individual constraints is something we've used mediacapture-extensions for in the past
Youenn: the complexity of a boolean constraint is very different from the more complex face detection API
Dom: I'll work with the chairs to agree on a clearer path forward then :)