WebRTC TPAC F2F Day 2 -- 20 Sep 2019

<hta1> scribenick: hta1

WebRTC-SVC

Starting with SVC structures.

Decision: Remove drawings from SVC spec; AV1 is normative for drawings. SVC spec is normative for strings; make that clear.

Aboba - Presenting on custom scalability modes.

If a different mode is required, we could define a new mode string. Assigning a new number in AV1 requires AOMedia cooperation.

aboba: we could define strings that reference AV1 custom scalability modes if we can define scalability structures using SS.

could also define modes in prose in this spec.
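
To illustrate what "normative for strings" could mean in practice, here is a minimal Python sketch that splits a mode string into its layer counts. It assumes mode names of the form L<spatial>T<temporal> with an optional suffix (for example "L3T3" or "L2T3_KEY"). The function name, the tuple fields and the regular expression are illustrative and are not taken from the spec.

```python
import re
from typing import NamedTuple


class ScalabilityMode(NamedTuple):
    prefix: str           # "L" or "S" (assumed naming convention)
    spatial_layers: int
    temporal_layers: int
    suffix: str           # anything after the layer counts, e.g. "_KEY"


# Assumed shape of a mode string: prefix, spatial count, "T", temporal count, suffix.
_MODE_RE = re.compile(r"^([LS])(\d+)T(\d+)(.*)$")


def parse_mode(mode: str) -> ScalabilityMode:
    """Split a mode string into its parts, or raise ValueError if it does not match."""
    m = _MODE_RE.match(mode)
    if m is None:
        raise ValueError(f"unrecognized scalability mode: {mode!r}")
    return ScalabilityMode(m.group(1), int(m.group(2)), int(m.group(3)), m.group(4))
```

Under this sketch, a new mode added to the table only needs a new string; callers that parse the string do not depend on any drawing.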

<scribe> ACTION: aboba to Add a paragraph to the table to describe considerations for adding new values to the table of modes.

<scribe> "new modes can be added to this table by updating this specification".

issue 14 - orphis: Encoding parameters for spatial layers

active, maxBitrate and maxFramerate probably make sense.

<hta> youenn: what's the usecase for active?

<hta> aboba: There's a difference between dropping a layer and changing a mode.

<hta> amit: should we value consistency over conformance? Users will use modes that they don't understand, and will be surprised.

<hta> hta: People who don't understand the codec and mode they are using are going to have a hard time anyway.

<hta> orphis: <shows list of possible proposals>

<hta> youenn: a mode would have a sensible default, used when you don't specify anything.

<hta> scribenick:hta

<scribe> scribenick: hta

(discussing the case of bitrate allocation - are percentages right, or are absolute numbers right?)

<weiler> where are minutes being taken?

here. writing slowly.

dr alex: scaleResolutionDownBy is independent for each layer in simulcast? several: yes, it is relative to the source, not to the other streams
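
As a small worked example of the answer above, the following Python sketch computes each simulcast layer's resolution directly from the source size, with no dependency between layers. The function name is hypothetical, and truncating to whole pixels is an assumption made for the sketch.

```python
def layer_resolutions(source_width, source_height, scale_factors):
    """Return (width, height) per layer; each factor applies to the source only."""
    return [
        (int(source_width / factor), int(source_height / factor))
        for factor in scale_factors
    ]


# A 1280x720 source with scaleResolutionDownBy values of 4, 2 and 1:
print(layer_resolutions(1280, 720, [4, 2, 1]))
# -> [(320, 180), (640, 360), (1280, 720)]
```

Changing one layer's factor leaves the other layers untouched, which is the point made in the discussion.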

This discussion will continue at 2PM.

Privacy Issues

<dontcallmeDOM> ScribeNick: dom

<dontcallmeDOM> ScribeNick: dontcallmeDOM

[welcoming PING representatives]

Youenn: we want good privacy reviews on our planned Proposed Rec of WebRTC-PC and Media Capture
... Issue 612
... enumerateDevices is a way for Web pages to enumerate all capture devices (camera, mics, speakers)
... it's meant to help implement a device picker to select a particular camera and microphone
... that information is gated by device-info permission
... without that permission, you don't get labels, but you get persistent device ids and that entails the exact number and type of devices
... our proposal is to hide more info before device-info is granted, only expose generic/low-fingerprint profile of devices
... In Safari, we've looked at Web pages - most device pickers are only available after getUserMedia was granted
... one proposal was to expose one device of each type - that's what Safari ships; it is Web compatible
... (we haven't received negative feedback since we've shipped this a month ago)
... this still exposes whether a given device has access to a capture device of given type
... A second proposal is to lie: pretend there is one capture device of each given type
... you get the truth after the gUM prompt
... this fixes the fingerprinting issue of the 1st proposal
... a 3rd proposal would be to push enumerateDevices() behind a prompt
... but that feels difficult to implement (how to make it meaningful to the user)
... also feels hard from a compat perspective
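
The difference between the first two proposals can be sketched as a filter over the real device list, applied before permission is granted. This is a minimal Python illustration, not any browser's implementation: the Device type, the function name and the choice to return empty ids and labels are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    kind: str       # "audioinput", "videoinput" or "audiooutput"
    device_id: str
    label: str


KINDS = ("audioinput", "videoinput", "audiooutput")


def pre_permission_view(devices, pretend_all_kinds=False):
    """Return the generic list a page would see before permission is granted.

    pretend_all_kinds=False sketches Proposal 1: one entry per kind that exists.
    pretend_all_kinds=True sketches Proposal 2: one entry per kind, regardless.
    """
    present = {d.kind for d in devices}
    kinds = KINDS if pretend_all_kinds else [k for k in KINDS if k in present]
    # Assumption: ids and labels are blank until permission is granted.
    return [Device(kind=k, device_id="", label="") for k in kinds]
```

With two cameras and no microphone, the Proposal 1 view contains a single "videoinput" entry, which still reveals that a camera exists; the Proposal 2 view always contains three entries and so reveals nothing about the real hardware.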

PeterSnyder: is this actually used for fingerprinting?

Youenn: yes

EricC: and it is used for that a lot more than for getting cameras

Harald: hangout chat will display a camera icon only if it finds a camera in its chat mode
... proposal 2 would make that distinction useless

Youenn: this would require a change in the UX of hangout

Sam: what is the deviceId? free-form string?

Youenn: a UUID

JIB: per-origin

Youenn: double-keyed

Dom: what if new capture device types come in?

Harald: e.g. depth cameras which are in the pipeline

Youenn: I think this would have to move to a picker-based approach

Armando: lying about devices availability doesn't sound right

EricC: in Safari, we default to the list of capture devices that are tied to that platform (e.g. 2 cams, 1 mic on iPhones)
... if there are additional devices, we pretend they come up upon gUM grant

PeterSnyder: would moving to Proposal 3 in the long term be a possibility?

EricC: what kind of questions would make sense for Option 3?

Sam: one anti-pattern is asking permissions as soon as users get in
... I've seen that in WebRTC land in particular when it comes to asking for cameras

<weiler> axk wei

Bernard: part of the reason this happens is because fewer and fewer features are available before you ask for permissions
... e.g. network topology detection is gated to camera permission today
... and this causes problems for real deployments

JIB: I doubt going to permission prompt is a direction any browser would go to
... plus, a permission prompt for enumerating devices would be just before another prompt

Harald: I'm hearing moving toward Proposal 1 would be acceptable to all implementors

<hta> we have 5 issues to get through. We have 45 minutes.

<hta> (now we have 22 minutes)

Harald: Proposal 2 is more controversial

PeterSnyder: I do have concerns with proposal 1

Dom: for clarity, all these proposals are improvements over the current situation

PeterSnyder: Proposal 2 then lies about the reality? when does the truth get revealed?

EricC: upon getting permission access for capture
... the list of devices would be updated (with an event) to reflect the reality

PeterSnyder: OK, that sounds appealing

Youenn: Proposal 1 is clearly implementable (shipped in Safari)
... Proposal 2 will break some web sites e.g. if they adapt their default UI to what's available
... hearing Proposal 1 is a good implementors target in the short term; Proposal 2 is appealing to PING
... and appealing to some implementors
... Issue 607 - persistence of device ids
... can be used for cross-site tracking
... deviceIds have already some level of protection
... they're not exposed to cross-origin iframes
... they can also be hidden from iframes with device-info
... A proposal is to partition ids wherever other persistent data is partitioned
... Difficult in making it mandatory today given existing deployment
... Re partitioning, Safari double-keys all persistent data based on the nesting of browsing context (à la iframe) origins
... (existing discussions in this space for http cache in fetch spec, and in service workers)

<Orphis> Double keyed HTTP cache: https://github.com/whatwg/fetch/issues/904

PeterSnyder: the concern is when there is no double-keying

Youenn: the goal is to go there; the question is what to do when double-keying is not available

PeterSnyder: I'm suggesting double-keying the seeding of UUID
... to partition the deviceId space
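
One way to picture double-keying the seed is to derive the exposed deviceId from both the top-level origin and the frame origin. The Python sketch below is only an illustration under stated assumptions: the function name, the per-profile salt and the use of SHA-256 are all hypothetical, and nothing here is taken from a spec or a browser.

```python
import hashlib
import uuid


def partitioned_device_id(raw_id, top_level_origin, frame_origin, salt):
    """Derive a stable, UUID-formatted id that differs per (top-level, frame) pair."""
    # Join the inputs with a separator that cannot appear in an origin.
    material = "\x00".join([salt, top_level_origin, frame_origin, raw_id])
    digest = hashlib.sha256(material.encode("utf-8")).digest()
    # Use the first 16 bytes of the hash, formatted like a UUID.
    return str(uuid.UUID(bytes=digest[:16]))
```

The same iframe embedded under two different top-level sites would then see two unrelated ids for the same camera, which is the case JIB notes would break an embedded Hangout relying on a stored id.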

JIB: this breaks the case of embedded e.g. Hangout
... and it doesn't actually buy privacy until indexeddb is partitioned

PeterSnyder: PING would be happy with requiring partitioning of deviceID

Youenn: Issue 374
... WebRTC Stats exposes the network types in use by the user
... e.g. if the user is on wifi, ethernet, vpn, etc
... mostly for debugging purposes
... the problem is that it exposes fingerprinting surface and privacy-sensitive information
... and could be mis-used for user-hostile adaptation
... A first proposal would be to move this to an extension spec where special mitigations could be discussed
... Second: we have already a note about this, nothing needs to be done
... Third: we hide this behind the getUserMedia

Emad: what's the fingerprinting surface?

Youenn: you can monitor the different types of network a user has been using, when, detect patterns

@@@: different behavior on different type of networks sounds useful

Dom: not what WebRTC Stats are meant for - they're meant for monitoring or debugging

<hta> Amit = amit

Dom: UX adaptation would need to use e.g. network information

<weiler> debugging should be "special case", right? hence it's okay to not provide the info in the ordinary case?

Peter: Network Error Logging is another API that reports information on network traffic

Dom: does it?

Peter: would need to double check

Harald: Proposals A & B don't actually change what's shipping
... C about guidance feels more important

Dom: let's distinguish between the practical considerations of where to define the mitigations
... I think I'm hearing interest in investigating mitigations in more details

Amit: we could fuzz the network types (e.g. network1, network2, ...)
... could help distinguish cases when debugging
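
Amit's fuzzing idea can be sketched as replacing each real network type with an opaque label assigned in order of first appearance. This Python snippet is a hypothetical illustration; the function name and the per-session scope of the mapping are assumptions.

```python
def fuzz_network_types(observed):
    """Map real network types to opaque labels (network1, network2, ...)."""
    labels = {}
    fuzzed = []
    for network_type in observed:
        if network_type not in labels:
            # Labels are numbered by order of first appearance.
            labels[network_type] = f"network{len(labels) + 1}"
        fuzzed.append(labels[network_type])
    return fuzzed


print(fuzz_network_types(["wifi", "ethernet", "wifi", "vpn"]))
# -> ['network1', 'network2', 'network1', 'network3']
```

A debugger can still see that the network changed and changed back, without learning that one of the networks was a VPN.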

Peter: I think that, rather than guidance, we need specific rules

Bernard: in practice, vpn isn't all that useful

Christine: VPNs are actually forbidden in some countries; this shouldn't be broadcast

Harald: vpn should just be deleted from what I'm hearing
... we still need to look at how critical networkType is for debugging purposes given the concerns we're hearing

Henrik: other exposed stats are maybe more actionable (e.g. bitRate)

Dom: proposal D would then be removing it

<weiler> again, isn't debugging a special case, such that it could be hidden behind a permission?

Peter: PING would support removing it if that's not too big a loss, gating it would be the fallback

<Amit> we might want qingsi to give an opinion as an expert on ice debugging

Youenn: Issue 83 in audio output
... We're currently limited to which speakers can be used for audio output
... as it is linked to getting permission for audio input
... We would like to offer an in-chrome picker to give access to new devices

Armando: we're supportive

Peter: without knowing the details, this sounds interesting

Youenn: thank you for all your help

Peter: thank you for having us

WebRTC NV Use Cases

<scribe> ScribeNick: dralex

<dontcallmeDOM> ScribeNick: dralex_

scribing

14 open issues

++> issues 53

advanced codec cap.

HW acc available?

min and max resolutions

support for sic ?

svc?

use case B : identify the implementation (good for debugging, and good for adaptation to a specific implementation features)

in some cases, supported profiles and resolutions are different for HW and software

right now the encoder is assigned during negotiation, and then the bandwidth adaptation might change the setting outside of what the current encoder/decoder can do.

youenn expressed concerns that, from the developer's point of view, it can become really hard to handle.

Action item, henrik will ....

describe one or two use cases, and then make a PR.

JY: it seems to be presented the wrong way: API first, instead of use case first. Can you please write the corresponding use case?

florent: we would like to be in a case where we get helpful error messages to act on it.

dom: that really underlines the need to write a user story

JY: we have to be careful not to augment the fingerprinting surface.

--------------

issue 37: requirement for secure web conferencing.

second attempt since the meeting in Stockholm

PR 49 submitted by Cullen

Google's Emad suggested that we look at the MLS security architecture as an inspiration to write our requirements here

discussion about req. N25 to N29 from MLS

proposal to put that in the secure conferencing requirements

dom's recommendation is to only merge one of these use cases.

mozilla position is to merge in the untrusted case, and needs to think about the trusted case.

google is very seriously considering the trusted case and implementing it

apple is not in favour of anything, but shows interest in the result of investigations from mozilla and google.

---------------

charter expires in March 2020

since a new or updated charter would need 3 months to be approved, we would need a WG-approved draft by EoY

obvious charter changes:

2 specs out (webrtc 1.0, GUM)

update deliverable timelines

<dralex> scribing

<dralex> open questions: what do we do with existing deliverables

<dralex> is there any new deliverable?

<dralex> how long do we need.

<dralex> what should we do about the maintenance of webrtc 1.0 and GUM

<dralex> what about webrtc-stats and webrtc identity

<dralex> there is also a lack of editors

<dralex> some people and/or organizations need to provide resources

<dralex> other specs are stable but much earlier in the standardisation process.

<dralex> image capture, depth cameras

<dralex> content hints

work could only continue with an identified editor, AND a commitment by a browser vendor to implement

emerging use cases

more granular objects

webrtc insertable

end-to-end encryption

<dralex__> really only three ways forward:

<dralex__> in re-chartered group: only with editor and implementor commitment

<dralex__> in incubation (CG group)

<dralex__> or moved to another group (media group)

<dralex__> it seems that the consensus is to keep the 3 activities together (maintenance, dev, ....)

<dralex__> the question is open about whether some media capture should follow web transport, webcodec.

<dralex__> the consensus seems to be that there is a strong link with web transport and webcodec (peter), but as it would be the same implementor as the main webrtc spec, it would make sense to keep things together

<dralex__> interest from mozilla and apple on image capture

<dralex__> no strong interest in depth

<dralex__> stop discussion here because of time constraints

Joint Meeting with Accessible Platform Architecture Working Group

Minutes taken in APA meeting minutes

<jcraig> Web RTC group. Dom said to come to the #APA room (the physical room) in Kashi 1F right behind registration

<dom__> [minutes for this meeting taken on #apa]

<vr000m> trying to join the meeting URL is this URL correct: https://meet.google.com/vxf-wgai-rjn

<hta> we're having a meeting in a different room at the moment, so I guess nobody's watching the "allow to join" button.

<hta> will be back soon.

<vr000m> ok

<dontcallmeDOM> scribenick: steveanton

WebRTC stats

RESOLUTION: issue-365 & issue-470: just remove RTCMediaStreamStats (move to obsolete)
... issue-437: agreement for proposal

Issue 398/PR 495

jib: like these members better than those proposed for transceiver stats

hbos: already have per-encoding stats in the outbound-rtp stats, only thing missing is the rid
... if that's the only thing we want we should just add it to outbound-rtp
... but if we want more information then we should have a separate dictionary

jib: prefer to just add rid to outbound-rtp

vr000m: if getParameters exists then rid should be sufficient
... no concrete objection to just adding rid but would prefer consulting with the list to see if the other stats are useful

jib: might be a bug in getParameters -- every time you call it transaction id changes
... which means you don't know if the transaction id changes because of getParameters or an outside change

dom: let's discuss that in a separate issue

RESOLUTION: add rid and confirm that we don't want to add encoding parameters

issue-401

bernard: along with svc we have a new weird thing of simulcast without different rtp streams
... when talking about layer we should identify the layer and the mode of the layer

hbos: might just have an index into the svc config array

bernard: need temporal id & spatial id to identify a layer. byte for each would work for every known codec
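
Bernard's point that one byte each for the spatial and temporal id is enough can be illustrated with a small packing sketch in Python. The LayerId type, the function names and the choice to put the spatial id in the high byte are assumptions made for the example.

```python
from typing import NamedTuple


class LayerId(NamedTuple):
    spatial_id: int
    temporal_id: int


def pack(layer):
    """Pack a layer identifier into 16 bits: spatial id high byte, temporal id low byte."""
    if not (0 <= layer.spatial_id <= 0xFF and 0 <= layer.temporal_id <= 0xFF):
        raise ValueError("each id must fit in one byte")
    return (layer.spatial_id << 8) | layer.temporal_id


def unpack(value):
    """Inverse of pack()."""
    return LayerId(value >> 8, value & 0xFF)
```

A stats object keyed this way identifies the layer but not the mode it belongs to, which is why the discussion also asks for the mode to be reported.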

vr000m: are sid's numbers?

bernard: possible to have sid represent a simulcast encoding
... could have multiple simulcast encodings in the same rtp ssrc
... need to describe better what the layer is and what mode it is

vr000m: we should discuss this in another meeting since it's getting complicated
... can the svc mode change?

bernard: yes, and would affect the number of layers and stats

hbos: does the basic idea of having one svc stats object per layer with these attributes make sense?

bernard: definitely want to know aggregate stats for the entire stream
... not sure if per-layer loss is useful for congestion control
... might use the stats to realize after the fact that your behavior was sub-optimal

hta: framesSent and bytesSent seem like enough

jib: inconsistency ... width vs. frameWidth

hbos: because in that dictionary there's a mix of audio and video
... frame was added to make it clear it's for video

RESOLUTION: henrik should go write a PR and we'll add it

issue-443

vr000m: isn't the endpoint aware of the deviation of the stats from expectation?

hbos: need to define expectations

vr000m: can do the math. expectation is inter-frame stats

hbos: over what time frame?

alex: expectation changes with sfu

hbos: expectation can change during call (60 fps to 30 fps)
... shouldn't be reported as a glitch

alex: can only define a glitch by knowing what was sent by the peer

hta: calculate the sum of intervals and sum of squares of intervals? allows you to calculate mean and std dev

jib: before or after the jitter buffer?
... on the sender side there's feature creep
... where we have stats on the sources
... worried we are adding stats for playback

hbos: this has to do with received
... but will be visible in playback
... don't intend to measure how the video tag works

alex: need to decouple which part of the video playback pipeline is being measured

jib: a stat that reports glitches when there are no rendering glitches, because the jitter buffer smoothed them over, is confusing

alex: don't want to force the app to query stats too often

hbos: need to be specific with how glitch is defined
... getting the feeling we should move towards proposal B

jib: why do we care about glitches before jitter buffer?

hbos: only care about after jitter buffer
... some existing stats are confusing about this though

varun: want the stat for after the jitter buffer

alex: fps is supposed to be stable by segment
... there will be a step at some point
... can we report the stability of fps within segments?

hbos: that would show up as a glitch in proposal B

alex: can infer expectation by assuming fps is constant for a period of time

hta: sum of square intervals allows you to calculate the average/std dev of arrival rate between any two samples

hbos: that sounds like it would solve the problem

RESOLUTION: add a stat which is the sum of the square of inter frame intervals
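
hta's observation can be worked through as follows: if the cumulative count of intervals, their sum and the sum of their squares are all available, the mean and standard deviation over any window follow from the difference between two samples. This is a minimal Python sketch; the Sample type and its field names are hypothetical and are not stat names from the spec.

```python
import math
from typing import NamedTuple


class Sample(NamedTuple):
    interval_count: int            # cumulative number of inter-frame intervals
    sum_intervals: float           # cumulative sum of intervals, in seconds
    sum_squared_intervals: float   # cumulative sum of squared intervals


def interval_stats(earlier, later):
    """Return (mean, standard deviation) of the intervals between two samples."""
    n = later.interval_count - earlier.interval_count
    if n <= 0:
        raise ValueError("no intervals between the two samples")
    s = later.sum_intervals - earlier.sum_intervals
    q = later.sum_squared_intervals - earlier.sum_squared_intervals
    mean = s / n
    # Variance is E[x^2] - E[x]^2; clamp at zero to absorb rounding error.
    variance = max(q / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)
```

For example, four intervals of 30, 30, 50 and 50 ms give a count of 4, a sum of 0.16 and a sum of squares of 0.0068, from which the sketch recovers a mean of 40 ms and a standard deviation of 10 ms. Because all three inputs are cumulative, the app can poll as rarely as it likes and still compute statistics for the whole window.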

issue-440

hbos: implementation has separate booleans but they are not independent

alex: but this isn't implementation specific

armax: proposal C: remove?

hbos: chrome reports cpu over bandwidth if both are true

discussion about what is the use case for this stat

varun: cpu limiting is interesting when it happens
... bandwidth limiting we have a lot of other stats
... cpu is influenced by unknown things on the system

hbos: issue is not about having this stat but what to do when both values apply

jib: if the value is none you can infer that quality is fine

hbos: any value other than none indicates you are not sending the resolution you want

jib: this is sender side only?

hbos: yes

varun: none can also be source limited
... e.g. if the camera can't capture enough pixels because it's dark

jib: can add the string both

hbos: didn't want to add a vector since you might infer that a false element indicates no problem
... e.g. if you are so cpu limited that you can't use all your bandwidth

jib: if there's a situation where both are limiting, can you improve quality by improving either or both?

hbos: likely limited by both

varun: other implementations may know what is the real limiting factor
... so proposal b seems better

hbos: should we say that if the implementation can't distinguish just say bandwidth?
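
The precedence hbos describes for Chrome (report cpu over bandwidth when both apply) can be sketched as a function of the two internal booleans. This Python snippet is an illustration of that rule only; the function and parameter names are hypothetical, and the "both" value jib suggested is deliberately not modelled.

```python
def limitation_reason(cpu_limited, bandwidth_limited):
    """Collapse two non-independent booleans into a single reported reason."""
    if cpu_limited:
        # cpu takes precedence when both limits apply.
        return "cpu"
    if bandwidth_limited:
        return "bandwidth"
    return "none"
```

As varun notes, "none" in this sketch does not prove that quality is unconstrained, since a source-limited sender would also report it.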

issue-448

RESOLUTION: go with the proposed solution

issue-358

jib: what would happen after an ice restart finishes?
... when do the stats go away?

hbos: they go away automatically with no event

jib: changing the id would mean they are not referenced anywhere
... so you can infer they are no longer in use

hbos: they would not exist in that case

jib: what if you are in the middle of an ice restart?

varun: depends on behavior of make before break
... e.g. if you are still using the transport
... you would see both objects

jib: how would you know what is the old or new one if both are present?

hta: we should do proposal a

hbos: proposals are not mutually exclusive

RESOLUTION: go with proposal a and continue discussing proposal b

issue-376

steveanton: problem is this is an instantaneous stat not a cumulative stat

hbos: can't think of a good cumulative stat

steveanton: could do minimum rtt

hta: building on a standard stat that is part of the protocol machine is ok

hbos: continue looking for a cumulative stat we could use

RESOLUTION: try to find a cumulative stat that could be used for rtt, but otherwise rely on spinfo_srtt -- regardless we want sctp rtt

issue-377

hta: cwindow is orthogonal to bufferedAmount

steveanton: cwindow is useful for application logic, but getstats is not a good place to put it

RESOLUTION: don't add to stats but discuss adding a dedicated API (not in 1.0)

<dontcallmeDOM> scribenick: dontcallmeDOM

Content-Hints API

Harald: Recalling previous discussions, words that could characterize the contents of a track. Close discussion by Harald rewriting draft based on ideas presented at the meeting, including MUST/SHOULD.

Jan-Ivar: And clarify track.clone().

Harald: Need to specify that it copies all the attributes.

Jan-Ivar: And add guidance for future specs.

Wrap-up

[Armando was able to name the bird.]

Conclusions and next steps

Harald: We have a number of PRs to write now. We have a number of documents that need wider review, including TAG. Shepherding needed. We have gone through features at risk; the proposal is to delete all features marked in the document, except for maxFramerate and RTCError.

Youenn: RTCError needs more discussion? It should be marked feature at risk but let’s not delete it for now.

Bernard: PR requirement is to have better error handling, so removing it from the spec might create more work than it saves. It needs more discussion.

Harald: Move the normative identity dependency steps to the extension spec.

Youenn: E.g. call a NO-OP function or step that is overridden by the extension spec.

Bernard: Isolation...?

Jan-Ivar: Both webrtc-pc and media recorder have language like MUST NOT record if there are isolated properties. We can move this out of identity to mediacapture-main.

Youenn: It should stay in the identity spec for now because it is a feature at risk.

Harald: Action item on Jan-Ivar to sort out isolation property (suggesting to move it?).

Youenn: Let’s do ICE forking later on as an extension spec or iterative work.

Harald: Developer feedback; people still not happy with Unified Plan switch and consistency. More SSRC noise.
... SVC, keep it simple please. We only support predefined modes.
... Privacy issues. Details in the minutes.

Youenn: Now is the time for PRs.

Harald: We did not get to issue 68.

Youenn: Don’t know what to do about that, need to pick up discussions and figure out what to do there.

Harald: NV use cases. Split conferencing PR in two, Jan-Ivar will go back to Firefox to clarify if we can merge the untrusted JavaScript case. We should want this, but there is disagreement if we should support trusted JavaScript use case.
... Rechartering. Not too much to say about that. Discussed options, no clear answer. Should we incubate specs? Will discuss further.
... Stats again. And then we ran out of time.
