Meeting minutes
Recording: https://
Slideset: https://
Screen capture 🎞︎
getDisplayMedia: Distinct “Error” for Cancellation 🎞︎
Jan-Ivar: this seems like a valid use case to solve, if all browsers implement transient activations
… Firefox already returns a "NotFoundError" when hitting an OS limitation
… in an iframe with a policy limitation, it does return NotAllowedError which might indeed be improved
… [NB a mistake with the constraint attribute on slide 14]
… -1 on relying on prototype, would be preferable to have an additional attribute (e.g. boolean "userInitiated")
Youenn: could we use the Permission API to determine this already?
… it tells you if the user persistently denied access - which would be the same situation if denied by iframe policy
… I'm not sure if they need to be distinguished
… I think NotFoundError + persistent denied should be sufficient
Elad: the permission API can't be used for this: it's asynchronous, and it requires asking all the time
Youenn: if the user has put a setting to always deny getDisplayMedia, it will not be "ask" again, it will be systematically "deny"
Elad: I don't think this is possible in any browser at the moment, not even sure it should be exposed
… but that's an edge case compared to the majority of cases
… the asynchronous nature of the permission API doesn't allow tying a rejection to a specific call to getDisplayMedia
Youenn: the same issue exists with getUserMedia - what is specific to getDisplayMedia here?
Elad: getDisplayMedia() will always involve a new prompt - each call has its own state
… getUserMedia can persist in a given session or even across sessions
… likewise, usually a choice on denying camera is unlikely to be changed, whereas canceling a screen share can reflect a temporary decision
Youenn: not seeing a big difference between the two
Harald: you listed 3 sources of failures - that point towards a string rather than a boolean (e.g. "user-denied", "internal-failure", "os-disallowed")
Elad: I tried this with mute reason before, but this wasn't too well received, hence I'm focusing on the narrowest need
Jan-Ivar: +1 to solving the use case; no strong opinion on the approach, but Youenn's point made sense to consider
Elad: the spec doesn't give any suggestion for persistent denial at the moment
Jan-Ivar: the Permission API allows adding a track handler which could probably solve this
Guido: the normal behavior is "always ask" - right after "deny" it becomes "ask" again
Jan-Ivar: it would have to be set to ask for all situations that aren't user initiated
Elad: that would still make it a lot more complex than integrating it in the error message
Jan-Ivar: but there is value in not shipping new API surface
TimP: not convinced that synchronous handling is needed in the examples you gave (stats, prompt strategy)
Elad: one example: getDisplayMedia called from a button - should that button stay disabled while checking the permission state?
TimP: doesn't feel like a hard requirement
… I think the simplicity argument is more compelling
Harald: I don't see how Youenn's suggestion would work
Youenn: let's explore on github what use cases can or cannot be addressed with the two approaches
… we need to see which situations would be distinguishable based on permission/error type
Guido: synchronicity is needed for correctness checking
RESOLUTION: continue discussion on github issue
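To make the two approaches discussed above easier to compare, here is a minimal sketch of what an application could infer today from the error name combined with a permission state, along the lines Youenn suggested. The function name, the `FailureGuess` labels and the idea of passing the state in as a parameter are all assumptions made for illustration; they come from neither the slides nor any spec. In a browser, the error name would come from the rejected getDisplayMedia() promise and the state from a separate, asynchronous permission query, which is exactly the coupling Elad objected to.

```typescript
// Hypothetical helper: the labels below are illustrative only.
type PermissionStateLike = "granted" | "denied" | "prompt";
type FailureGuess =
  | "os-limitation"
  | "persistently-denied"
  | "possibly-cancelled"
  | "other";

function guessDisplayMediaFailure(
  errorName: string,
  stateAfterFailure: PermissionStateLike,
): FailureGuess {
  // Per Jan-Ivar, Firefox already reports an OS limitation as NotFoundError.
  if (errorName === "NotFoundError") return "os-limitation";
  if (errorName === "NotAllowedError") {
    // Youenn's suggestion: a persistent "denied" state (user setting or
    // iframe policy) would separate this case from a one-off cancellation.
    return stateAfterFailure === "denied"
      ? "persistently-denied"
      : "possibly-cancelled";
  }
  // Anything else is not distinguishable from the name alone.
  return "other";
}

// Example: a rejection followed by a state that is still "prompt".
console.log(guessDisplayMediaFailure("NotAllowedError", "prompt"));
```

Because the state is read separately from the rejection, the guess may not reflect the specific call that failed; whether that matters in practice is the question left open for the github issue.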
Expose capturer/capturee overlap 🎞︎
Youenn: in the example with the PiP, the Web app would like to know the current situation and what would happen with PiP
… PiP might trigger an overlap of its own
Elad: I'll cover this when describing proposal #2
… but note that PiP is only one of the situations you want to manage
… the first question would be whether PiP has any chance to be useful - which it can't be if there is no overlap at all
Jan-Ivar: leaving aside our current open position on document PiP
… this feels like a good use case
… re proposal 1, could we expose only the percentages instead of the values needed to calculate them?
Elad: the absolute values can be recovered from the percentages
Jan-Ivar: but conversely requires exposing fewer attributes
Elad: I can live with either approach
Jan-Ivar: re initial vs dynamic (slide 28), making it dynamic could lead to the browser fighting with the user
Elad: this would be a readonly value
Jan-Ivar: but there could be a fight via PiP
Elad: there is no such mechanism at the moment for the Web app to control the position of the PiP, and if there was, the problem would exist independently of exposing that dynamic data
Jan-Ivar: I think starting with the value at "opening" time would be safer
Elad: the only use case I can imagine is if the web app wants to offer a better layout based on observing the dynamic values changing, but it's arguably hypothetical
… some activity is already observable through the change of the captured window sizes
Jan-Ivar: let's focus on open for now, and discuss if/when to update the values
… I'd support initialPercentage
TimP: I don't see anything problematic in terms of privacy
Youenn: percentage seems fine; I'm wondering whether a hint to the UA would be sufficient
… Could you file an issue which would illustrate how the percentage would be used? sometimes what is overlapping is important
… my primary concern is about managing focus rather than dealing with PiP
… understanding the PiP use cases better would be important
Jan-Ivar: let's distinguish PiP as a remedy vs PiP as a source of overlap
Elad: the MVP for me is not to trigger PiP when disruptive, trigger it when it's clear it would help, and let web app developers explore the in between
RESOLUTION: agreement on the validity of use case, continue discussion on to-be-created github issue
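As an illustration of the "percentage" idea discussed above, the following sketch computes how much of a captured surface is covered by the capturing window, given two rectangles in the same coordinate space. The `Rect` shape, the function name and the choice of the captured surface as the denominator are assumptions; the minutes do not record the attributes actually proposed on the slides.

```typescript
// Hypothetical geometry helper, not an API from the proposal.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns the share (0 to 100) of the capturee's area covered by the capturer.
function overlapPercentage(capturee: Rect, capturer: Rect): number {
  const left = Math.max(capturee.x, capturer.x);
  const top = Math.max(capturee.y, capturer.y);
  const right = Math.min(capturee.x + capturee.width, capturer.x + capturer.width);
  const bottom = Math.min(capturee.y + capturee.height, capturer.y + capturer.height);

  // A negative extent means the rectangles do not intersect on that axis.
  const overlapWidth = Math.max(0, right - left);
  const overlapHeight = Math.max(0, bottom - top);

  const captureeArea = capturee.width * capturee.height;
  if (captureeArea === 0) return 0;
  return ((overlapWidth * overlapHeight) / captureeArea) * 100;
}

// Example: the capturer covers the right half of the captured window.
const captured: Rect = { x: 0, y: 0, width: 100, height: 100 };
const capturing: Rect = { x: 50, y: 0, width: 100, height: 100 };
console.log(overlapPercentage(captured, capturing));
```

An application following Elad's MVP could, for instance, skip PiP when this value is zero, since there is no overlap for PiP to remedy.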
WebRTC: How to find the remote fingerprint? 🎞︎
[Elad departs]
Peter: re slide 32 - does getRemoteCertificates() return an RTCCertificate?
TimP: it returns a blob
Peter: it could be fixed to return something on which we could call getFingerprint(); re c), what's your worry about "being too late"?
TimP: receiving unexpected RTP, earlier than expected; I'm keen to not have a connection up if it's not expected, before any packet exchange. The DTLS handshake is already finished by the time of getRemoteCertificates()
Peter: there are alternatives (although not great): you could set up a send-only RTP transceiver, and switch it to sendrecv after the certificate is verified
TimP: you'd still be talking to someone that is likely hostile - letting this be filtered by the upper layer feels weaker
Peter: but doesn't the existing API already allow achieving this?
TimP: I think you can't; this would protect you against a malicious SCTP ack
Peter: you could wait to negotiate until you've verified the fingerprint
TimP: I'm trying to make a proper API; getFingerprint() was defined for IdP which never happened, I'm trying to make better use of it.
Peter: re slide 36, I like b), but it can only be done if it's fully bundled - which applies to other options
Harald: I think this is at the wrong level - fingerprints are a transport attribute, not a connection attribute. I also don't understand the API here - the fingerprint is what the other end tells you to verify they are who they say, it shouldn't be set by you
TimP: the problem we're trying to solve is persistence: imagine two devices have established trust and stored each other's fingerprints in a trusted context, and want to re-use that trust relationship when exchanging SDP, à la QUIC zero, even in the case of a less trusted signaling mechanism
Youenn: is there a github issue to continue this discussion?
TimP: will file one.
Peter: d) would be sufficient to check the remote certificate is known (without having to manually parse the SDP)
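For context on Peter's last point, this is roughly what "manually parsing the SDP" looks like today: a sketch that pulls the `a=fingerprint` attributes out of a remote description and compares them with fingerprints stored earlier. The function and type names are hypothetical. Note that this only checks what the SDP claims; matching that claim against the certificate actually presented during the DTLS handshake remains the browser's job.

```typescript
// Hypothetical helpers; only the "a=fingerprint:<hash-func> <value>" line
// format is taken from the SDP attribute definition.
interface SdpFingerprint {
  algorithm: string; // e.g. "sha-256"
  value: string; // colon-separated hex, e.g. "AB:CD:..."
}

function extractFingerprints(sdp: string): SdpFingerprint[] {
  const result: SdpFingerprint[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=fingerprint:(\S+) (\S+)$/.exec(line.trim());
    if (match) {
      // Normalize case so comparisons are case-insensitive.
      result.push({
        algorithm: match[1].toLowerCase(),
        value: match[2].toUpperCase(),
      });
    }
  }
  return result;
}

// True only if the SDP carries at least one fingerprint and all of them
// are in the trusted set.
function allFingerprintsKnown(sdp: string, trusted: SdpFingerprint[]): boolean {
  const found = extractFingerprints(sdp);
  return (
    found.length > 0 &&
    found.every((f) =>
      trusted.some(
        (t) =>
          t.algorithm.toLowerCase() === f.algorithm &&
          t.value.toUpperCase() === f.value,
      ),
    )
  );
}

// Example with a shortened, made-up fingerprint value.
const exampleSdp = "v=0\r\na=fingerprint:sha-256 AB:CD:EF\r\n";
console.log(allFingerprintsKnown(exampleSdp, [{ algorithm: "sha-256", value: "ab:cd:ef" }]));
```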
WebRTC Encoded Transform 🎞︎
SFrameTransform mode per-packet vs per-frame 🎞︎
Jan-Ivar: why would it be difficult to support per-packet with ScriptTransform?
Youenn: ScriptTransform generates frames, which would make it difficult to deal with packets. ScriptTransform could not deal with decryption
Jan-Ivar: ScriptTransform could still be used to deal with frame manipulation on top of SFrameTransform
Youenn: right, that's not the use case I'm discussing as out of scope
Youenn: there could be an option C, per transceiver
Jan-Ivar: the global option feels a bit artificially limiting; which mode you pick might depend on which SFU you talk to
Harald: if you want to switch from SFrame to JS or the other way around, you need to add new media lines - that argues against using RTCConfiguration, it should be a transceiver parameter.
Youenn: will there be two m-line sections, one using per-packet and another using per-frame? didn't seem very compelling
Harald: the use case I was thinking of is one using SFrame and the other not using it at all
Youenn: that's controlled by setting the transform on a transceiver basis; this is about setting the sframe flavor
Jan-Ivar: it does seem more like a per-transceiver thing; re slide 42, why an interface for the options?
Youenn: this allows the UA to check whether it's a scripttransformoptions and thus switch to a different processing flow, which allows having the type as a second parameter to the constructor
Jan-Ivar: not sure I like that - we can bikeshed that; the problem is the ambiguity between options and message to the worker - we had that issue with setting codecs as well. There are alternatives we could discuss in the PR.
Youenn: the PR uses a 4th option object as a dictionary
Jan-Ivar: that would be preferable from my perspective; let's discuss in the PR
Youenn: per-transceiver seems fine
Jan-Ivar: Per transform might be better
Youenn: that'll depend on how SDP negotiation happens; last I heard there would be an sframe a-line, per-packet or per-frame, in the m-section. In that case, both ends need to abide by it. If this is only "use sframe", senders and receivers could do different things
Jan-Ivar: why not Transceiver.sframeTransform = true?
Youenn: that's equivalent to option A… Sounds like more discussion needed, and not clear Option B is getting much support. This will depend on the SDP negotiation for SFrame on which I expect progress this month
SFrame cipher suite #256 🎞︎
Jan-Ivar: LGTM
Harald: nothing to add beyond my comments on the PR
RESOLUTION: move forward with proposal
SFrame RTCEncodedVideoFrame on receiver side 🎞︎
Youenn: it's for the case where decryption is done via scripttransform
Peter: if you decrypt yourself, you could look at the payload format yourself
Youenn: the scripttransform on the receiver side exposes an RTCEncodedVideoFrame whose content is encrypted - what should the type attribute indicate?
Peter: if your job is to decrypt, it's not your job to set this; this sounds like 2 different stages
Youenn: it's required to expose a value here - it could be empty, or another value
Peter: saying "unknown" would be better
Youenn: except if that's provided by an RTP header extension
Peter: likewise for spatialIndex/temporalIndex
Youenn: those are optional; they would only be exposed if the RTP header extension is present; the spec doesn't say how they're set in any case at the moment
Jan-Ivar: If I'm using SFrame and ScriptTransform, on the receiver side, do I get an encrypted or a decrypted frame?
Youenn: the former - it's up to you to decrypt it
Jan-Ivar: that's the old model we have today; why not have the browser deal with the decryption?
Youenn: you could do that
Jan-Ivar: that'd seem cleaner; you could imagine specifying separately JS transform and sframe transform since they're orthogonal
Youenn: When doing SFrame at the frame level while already using ScriptTransform, being able to do the decryption as part of ScriptTransform feels simpler; for instance, you could apply different algorithms on the SFrame configuration itself based on inspection of the frame - this couldn't be done with separate transforms. Separating the two processing steps can already be done with the current API.
Jan-Ivar: worth clarifying the use cases on github
TimP: re frame type, it should be "encrypted" or "sframe"
Youenn: is there an RTCEncodedAudioFrame.type? We would also want the audio frame to know it is encrypted, so maybe this needs to be a different attribute to signal encryption that would apply to both audio and video frames
Jan-Ivar: what is the use case for ScriptTransforming an encrypted frame?
Youenn: that's the current way of doing things; ideally, for ease of migration, we should allow apps to continue using ScriptTransform using native decryption, via the SFrameTransform stream writable/readable.