W3C

WebRTC TPAC F2F Day 2

23 Oct 2018

Attendees

Present
Dom, Harald, Henrik, Karthik, Lennart_(remote), Bernard_(remote), Armando, Youenn, EricC, Varun, Carine, Jianjun, DanBurnett, Seth, Soares, DrAlex, Frode, PeterThatcher, Domenic, sangwhan
Regrets
Chair
Harald, Bernard
Scribe
armax, DrAlex, caribou

<armax> I am scribing

<dom> Slideset day 2 [PDF]

<armax> Starting with the agenda after starting the recording

<dom> scribenick: armax

<dom> [warning: this meeting is being audio-video recorded]

Agenda for today being read by hta

we will go through the topics for today & issues from yesterday that we did not manage to cover.

starting with the Scalable Video Coding extension to WebRTC, by Bernard

SVC Extension for WebRTC

reference https://rawgit.com/aboba/webrtc-sim/master/svc.html

we lost a few slides. hta is catching up on it

Bernard explains some of the screenshots related to simulcast APIs, RTCRtpSendParameters and Encoding

and related examples

Goals: maintain compatibility with previous APIs, support a single RTP stream (SRST), and send temporal & spatial scalability

non-goals: MRST, differential protection, and simulcast-equivalent functionality

how to set these parameters without negotiating them in SDP?

VP8/VP9 can decode independently, so negotiation might not be needed

SVC does not need to be negotiated in the case of AV1 either

(decoder can decode anything in the case of AV1 as well)

Proposal for scalability modes in WebRTC 1.0: add a single attribute "scalabilityMode"

and to discover the modes, add a sequence of modes to RTCRtpCodecCapabilities
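
[A hedged sketch of the proposed shape, for the record; "scalabilityMode" and "scalabilityModes" are the proposed names, not shipped API, and videoTrack is assumed:]

  // Hypothetical usage of the proposed per-encoding "scalabilityMode".
  const pc = new RTCPeerConnection();
  pc.addTransceiver(videoTrack, {
    sendEncodings: [{ scalabilityMode: 'L1T3' }]  // 1 spatial, 3 temporal layers
  });

  // Proposed discovery: a "scalabilityModes" sequence on each codec entry
  // in the capabilities returned by getCapabilities().
  for (const codec of RTCRtpSender.getCapabilities('video').codecs) {
    console.log(codec.mimeType, codec.scalabilityModes);
  }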

Pros & Cons of scalability modes

Listing of the scalability modes, example based on AV1

L1T2: one spatial layer and two temporal layers

S modes are removed because WebRTC 1.0 already has simulcast, so these are not required

Examples graphically represented of L1T2 and L1T3

and L2T1

and other examples of such scalability modes, namely L2T2, L2T3

Scalability modes can partly be described with layer dependency lists, but the mode approach is more powerful (more irregular representations can be expressed with such modes)

ref https://docs.google.com/presentation/d/1paZ-bKeduIdwW6GDM4RJIGJvpFgjyQLZf1mUhiEcwxk/edit?ts=5bb26a5e#slide=id.g4099f108a7_0_4 [PDF]

hbos asks: why can the structure of layer-shifted temporal prediction not be expressed?

answer: the dependency structures are intended for more regular structures

clarification asked [lost the question :/ ]

summary: scalability modes are more expressive. The concern is that if there are changes in the future, we would need to go through the process again

hta: references a related mailing-list request

bernard presents an example using scalability mode for L3T3

clarification: the first would use 3 SSRCs, while the other two only 1

an example is presented of how getCapabilities would work to show which scalability modes are available

Next topic: basic operation

because there is no negotiation of SVC, there is no modification of createOffer/createAnswer

setCodecPreferences could be used to negotiate the scalability mode

setParameters can be used to modify/set the scalability mode
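
[Sketch combining the two calls; hypothetical, assuming a transceiver, the proposed attribute, and an async context:]

  // Pin the codec via setCodecPreferences(), then set/modify the proposed
  // scalabilityMode via setParameters() -- no SDP renegotiation needed.
  const { codecs } = RTCRtpSender.getCapabilities('video');
  transceiver.setCodecPreferences(codecs.filter(c => c.mimeType === 'video/VP9'));

  const params = transceiver.sender.getParameters();
  params.encodings[0].scalabilityMode = 'L2T2';  // proposed attribute
  await transceiver.sender.setParameters(params);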

Clarification: is there a way to request preferences such as L2T2 or L1T2?

Answer: no, there is not; it is assumed that if a mode is in the list, the codec can do it independently of negotiation

Question: if we need something that is not supported, could the string be negotiated?

Answer: SDP would be required, which does not allow expressing such structures

Negotiation does not help that much, and there is no need for negotiation because the decoder can decode anything (for example, AV1)

H264 is a special case

setParameters() is used when the codec is negotiated

which allows determining whether the negotiation worked

Decisions for the WG: adopt this as a work item or leave it to ORTC

last alternative is to leave it up to the implementation

(possibly non-interoperable)

Varun: asks whether a codec could be moved from a hierarchical mode to a non-hierarchical mode

Answer: the proposal is allowing to change the mode

the proposal is to not forbid entries, to avoid a lack of decoder availability

setParameters needs to be able to change the mode, to avoid getting stuck in edge cases that are not recoverable

spatial simulcast could work, and temporal in addition, but this one would get something reasonable for any codec

Discussion: it is not clear if setParameters() makes sense to be called before negotiation

hta: decisions need to be made

youenn: compared to the initial solution, this is a better approach. The overall direction makes sense, but in terms of a WG work item, we have more things to do before that and our plate is already full.

aboba: if it is not standardized now, it will be proprietary, because it is going to be shipped this year

hta: the option of not taking it as a work item poses the risk of having to deal with existing implementations

youenn: next year is WebRTC 1.0, and we will finish simulcast, but simulcast is too big and needs to be finished first

the proposal is to postpone the discussion

aboba: it would make no sense to make it a work item, because it will be included in AV1

DrAlex: it seems the item might already be late, because the RTP encapsulation will come soon and we need to work on it now

youenn: not against the work item, but several pending issues are critical for web developers. Adding work items might prolong this situation further

dom: can we get a deployment plan where we leave space for improvement on SVC later;
... could we arrange things so that new features like this get less "racing" and more experimentation

aboba: if we do not have an API in WebRTC, this will get standardized as part of AV1 and will never be discussed at W3C

hta: a counterpoint to youenn's argument is to encapsulate the SVC work in the WG and attempt to get its contributors to contribute to other WG items too

youenn: hoping to have a proper plan to get to WebRTC 1.0; if we add more extensions, we risk having too much and delaying WebRTC 1.0 further

hta: doubts about this

varun: there is a big piece of work on stats that has been delayed and is not yet finished

not opposed to taking the new item

hbos: the proposal seems fairly simple; it is not clear whether this is a can of worms, so it is difficult to work out the timing

aboba: this is an extension, not part of WebRTC 1.0

options: adopt it now; do not adopt it now but do it later; never do it

decision: adopt it now, but verify the decision on the mailing list (estimated a couple of weeks for consensus)

NEXT PRESENTATION

Access to Raw Audio

hta: Access to Raw Audio

ref https://alvestrand.github.io/audio-worklet/

the problem: people want access to raw audio, either to process it or to read it

the requirements are low delay, reliable delay, and low overhead

until now, WebAudio has been the approach for running JavaScript code on the samples

<scribe> the new approach would be to use an AudioWorklet (a different JS context)

Demo: an audio worklet sends a message to the main page, measuring only the level: https://www.alvestrand.no/audio-worklet-demos/main.html

Samples are converted from WebRTC -> WebAudio, synced to the clock WebAudio uses, and passed to the worklet; then take the output and put it somewhere

The result of this is a 25% overhead
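
[For reference, a minimal sketch of the demoed pattern, assuming an AudioWorklet-capable browser and an async context; 'level-meter' and the track variable are illustrative:]

  // level-meter-processor.js -- runs in the AudioWorkletGlobalScope.
  class LevelMeter extends AudioWorkletProcessor {
    process(inputs) {
      const ch = inputs[0][0] || [];
      let sum = 0;
      for (const s of ch) sum += s * s;
      this.port.postMessage(Math.sqrt(sum / (ch.length || 1)));  // RMS level
      return true;  // keep the processor alive
    }
  }
  registerProcessor('level-meter', LevelMeter);

  // Main page: route a WebRTC track through the worklet.
  const ctx = new AudioContext();
  await ctx.audioWorklet.addModule('level-meter-processor.js');
  const meter = new AudioWorkletNode(ctx, 'level-meter');
  meter.port.onmessage = e => console.log('level', e.data);
  ctx.createMediaStreamSource(new MediaStream([track])).connect(meter);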

<scribe> New proposal: let's do something independent, something that does not require synchronization and goes directly to the WebRTC layer.

The proposed interface works and is interoperable with AudioWorklet, and is easily connected to WebRTC C++ APIs

Question: how many tracks do we process in a worklet? Either one track per worklet or multiple tracks per worklet

Aboba: how would this work for codecs that represent different channels?

hta: it is just a matter of representation; it is not tied to tracks. Lowering the overhead is important, as long as the API is clear on how data is received

An open issue also is how many formats should be supported

Does this API allow for much better performance, or should we get WebAudio to fix their issues?

youenn: adaptation to the local clock instead of depending on the system algorithm cannot be done with WebAudio, because WebAudio already synchronizes it for you

Aboba: there are use cases with ML that would not work with such overhead

youenn: WASM integration seems to be needed, because JS would not perform well enough. We would need to ask WebAudio, to understand based on their expertise

jan-ivar: open issues on WebAudio so they can fix them

Question: is the overhead due to GC or other WebAudio issue

hta: this could be an issue related to Linux, not clear

quick test on site with Mac: it seems the overhead is 10% CPU

hta: we need input from WebAudio folks

hbos: we would need to have the same API for audio and video, with consistency in the API

hta: WASM is limited to certain arrays and cannot use shared buffers, but this is a WIP in the WASM working group

AI: talk with WebAudio and report back; we do not adopt the item but the decision is not to drop it
... possibly the Machine Learning CG might be interested in this too

hence might require investigation from that side too

AI assigned to hta

AI(hta): talk with WebAudio

RTCDataChannels & WHATWG Streams

[OT] https://www.reddit.com/r/funny/comments/9qkxwy/time_to_switch_to_chrome/

<dom> ScribeNick: DrAlex

RTCDataChannel and WHATWG streams

discussion started at Stockholm

first slide is a recap

the original RTCDataChannel is based on the WebSocket design

the buffer needs to be monitored on the sender side

and the receiver has no way to push back

WHATWG streams have useful semantics

they allow outsourcing chunking and back-pressure.

different designs are being discussed

the queue-of-messages and queue-of-streams designs will be described first

the goal is e.g. to send a 2 GB file through without needing a 2 GB buffer anywhere

the one-queue, message-based design is presented

still push-based, but there is back-pressure on the sender side, removing the need to monitor the buffer level

on the receiver side, the situation is also slightly improved,

as the browser will be able to apply back-pressure to a certain extent.

benefits

pretty processing APIs

examples on encryption, binary data, ...

zero-buffering, back-pressure, off-main-thread
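
[Illustration only; RTCDataChannel has no 'writable' stream attribute today -- that is the design being discussed. Async context assumed:]

  // Hypothetical: if the data channel exposed a WHATWG WritableStream,
  // a 2 GB file could be piped through without buffering it in the page.
  const response = await fetch('/big-file.bin');
  await response.body.pipeTo(channel.writable);  // back-pressure handled by the pipe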

what about streaming text?

we follow the Fetch spec

the tradeoff is:

we lose the text-and-file behaviour of WebSocket, but we are closer to more modern APIs

vote: do not do it, or do it as an extension

youenn: same comment as before, we should focus on 1.0, and not dilute the time chairs have during meetings.

jib: basically, the WG needs to figure out how to deal with 1.0 and NV at the same time

bernard: <joke> easy, take it off of the youenn thread (main thread) </joke>

the proposal on the table now is: adopt, do not adopt, or more info ...

decision: more info

peter T. presenting now

<soares> off topic, but why is the title for #webrtc channel has a link to this? https://quickremovevirus.com/tutorial-to-delete-webrtc-completely/

on QUIC and WHATWG streams

Second Screen WG

WebRTC QUIC

first set: QUIC

original goal was only DC over QUIC

and possibly eventually Media Over QUIC

lots of real-life feedback on the wish to use QUIC

challenges list:

- the QUIC transport does not have all the APIs needed to put DC on top

- no way to standardise protocols on top of QUIC at IETF

- the DC API itself was suboptimal anyway

status report:

there is a spec achieving the original goals, and there is an implementation in Chrome

it is compatible with the most recent QUIC updates, including send-only streams

there is a proposal to multiplex

questions for today

what about WHATWG streams?

how to get off the main JS thread

and then considering a client/server use case like WebSocket (without ICE)

point 1 in detail: WHATWG streams

if we were using WHATWG streams for all things, what would it look like

doable for DC, not too hard.

but if that is the only thing we do, it might not be that useful

SCTP could be done on top of streams as well

MSE would like to do it like that as well.

RTP could be done on top of streams as well.

and if we do all of that, then we can connect the underlying pipes in very interesting ways

e.g. with one line you could go from SCTP to MSE

and it does not stop there

receive from QUIC, parse, pipe into MSE

now, replace MSE with WebRTC in all the above

E2EE is also doable by replacing the parser with an encryptor/decryptor
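
[A hedged sketch of that pipe chain; quicStream.readable and sourceBuffer.writable are assumed shapes from the proposal, parse() is a placeholder, and an async context is assumed:]

  // Receive from QUIC, parse, pipe into MSE. For E2EE, swap the parser
  // for a decryptor TransformStream.
  const parser = new TransformStream({
    transform(chunk, controller) { controller.enqueue(parse(chunk)); }
  });
  await quicStream.readable.pipeThrough(parser).pipeTo(sourceBuffer.writable);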

the main part now

is to be able to get off the main thread

pause for discussion on previous example before going into the off-main-thread discussion

mozilla is not in favour of adding QUIC for now; too immature.

peter proposes to discuss only WHATWG streams

using next slide as an example

code example of all the objects that would be used (serialised) to achieve the desired goals

jib: the question is then: is it all or nothing?

questions about how to get off main thread

The TAG representative puts on record that the TAG would support reusing streams, or more specifically would oppose reproducing the same capability with a different API.

peter: to clarify I have more slides, and we still need to explore before we decide. I am not pushing for a decision today.

hta: at the moment we do not have any way to access encoded packets, for instance. If we decide to go in the WHATWG direction, there are some consequences. We could decide to open up the current API, independently of accepting the move to the WHATWG API.

peter: do I get permission from the group to investigate this?

jib: I wouldn't like to expose encoded frames to the main thread; they should only be available to a worker, for example

peter moving to next slide

moving off main thread

one option is to make EVERYTHING transferable

but it is a lot of stuff

(example slide)

option B is to pipe things through a worker,

if we could add readable/writable capability to the message port of a worker, then it becomes syntactically equivalent to a WHATWG stream

and you can pipe things through

back and forth
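
[Sketch of option B under that assumption; neither Worker nor MessagePort has readable/writable attributes today, and source/sink are hypothetical:]

  // Pipe data through a worker as if its message port were a stream pair.
  const worker = new Worker('processor.js');
  source.readable.pipeTo(worker.writable);  // main thread -> worker
  worker.readable.pipeTo(sink.writable);    // worker -> main thread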

this decision cannot be made by the WebRTC WG alone

WHATWG streams chair position: it is the obvious thing to do, but it has been waiting for a customer; so that could work

option C

nothing to do with web workers

but worklets

piping worklets around

jib: there is the question of GC

varun: clarification question: who decides which thread the worklet ends up on?

option D: transferable streams

(example code slide)
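
[Sketch of option D: transfer a WHATWG stream itself to a worker; channel.readable is still an assumed attribute, and makeSink() is app-defined:]

  // Transfer the stream to a worker; all reads then happen off the main thread.
  const worker = new Worker('processor.js');
  worker.postMessage({ readable: channel.readable }, [channel.readable]);
  // In processor.js:
  //   onmessage = ({ data }) => data.readable.pipeTo(makeSink());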

so now

Second Screen WG

<DrAlex_> totally different topic

<DrAlex_> second screen / open screens

<DrAlex_> it overlaps a little bit, so I wanted to give visibility to those

<DrAlex_> APIs that allow showing web content on a second screen

<DrAlex_> and then, separately, there is a community group (@W3C) which takes care of network protocols that implement the Presentation API and remote screen APIs

<DrAlex_> overlaps with ICE, QUIC, and something

<DrAlex_> OSP is currently only intra-LAN, but they are investigating extending beyond LAN, and then it would need ICE

<DrAlex_> here, a possible helpful item would be NV-style IceTransport

<DrAlex_> request for questions

<DrAlex_> other groups have defined a "companion screen" protocol, over WebSocket

<DrAlex_> it does not depend on ICE

<DrAlex_> what is the motivation for not reusing this?

<DrAlex_> "companion screen" was define within ATSC

<DrAlex_> dom: what is the context? an extension of the API to use ICE?

<DrAlex2> peter: these are just FYI slides, no decision needed, let's move on

<DrAlex2> QUIC

<DrAlex2> OSP is build on QUIC. defining specific messages

<DrAlex2> we may want to allow the app to extend the protocol

<DrAlex2> here again, the QuicTransport work done in the WebRTC WG could be useful to the Second Screen group

<DrAlex2> finally, media over QUIC

<DrAlex2> OSP is currently favouring "flinging"

<DrAlex2> but the group might extend the spec to stream

<DrAlex2> it could be of interest to WebRTC WG as an example of media over QUIC, and be a "customer" of work done as WUICTransport.

<DrAlex2> example from IETF CBOR CDDL

<DrAlex2> that is a POSSIBLE direction the second screen WG might be going, in which case it could be of interest to webrtc WG

<DrAlex2> raising discussion about certificates management, as raised by a netflix representative in the past.

<DrAlex2> simpler question: is the second screen WG interested in any of webrtc security feature ?

dom: we saw some push back here about maturity of qui, is there a channel to speak about that at IETF?

peter: not that i know of.

hta: no action needed from the WebRTC WG at this point (beyond Thank you).

peter: correct, it was just FYI
... please keep working on IceTransport and QUIC Transport please :-)

hta: go through agenda

stats, worklets, media over QUIC, encryption, next steps

break now, and reconvene at 1

Webrtc-pc issues

<caribou> scribenick: caribou

issue 1827

proposal: go for onclosing

RESOLUTION: use 'onclosing'
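
[For context, a minimal sketch of the resolved handler name; 'channel' is an RTCDataChannel:]

  channel.onclosing = () => {
    // the transport has begun closing; stop queuing new messages
    console.log(`datachannel "${channel.label}" is closing`);
  };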

Webrtc-stats issues

https://github.com/w3c/webrtc-stats/issues/339

varun: it looks similar to IceConnectionStats

PR

varun: RTT is about STUN packets

issue 365

jib: (described the issue)

hbos: seems non-controversial, we should do it

jib: we would add the mid to the sender stats
... and receiver stats
... that would help correlate the 2
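
[A hedged sketch of the correlation this would enable, assuming a 'mid' field is added to the sender/receiver stats (async context assumed):]

  const report = await pc.getStats();
  for (const s of report.values()) {
    if (s.type === 'sender' || s.type === 'receiver') {
      console.log(s.type, s.mid);  // proposed field, correlates with the transceiver's mid
    }
  }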

issue 365

hbos: we should get rid of the stream object

jib: if you want to find a track, you can enumerate all the senders

hta: a stream is associated with a sender and a receiver; it seems to make sense to put the mid on the sender

jib: currently we attach stats to the sender only

varun: (issue 366 slide)

jib: the only time we need the delete procedure is when all the data will be replaced
... it would not apply to a connection close

[discussion about whether or not we want to delete stats after close]

hbos: today you'd not get stats after peer connection is closed

hta: seems like a bug

varun: we want the data after the pc is closed

jib: browser models were very different

hbos: if we keep stats around after close, it's inconsistent

hta: most stats objects don't end.

varun: onstatsended was written that way to match FF behavior
... when I switch from front camera to back camera and again, do I get stats on same track id?

jib: [points to 8.1]
... [volunteers for the PR for issue 366]

back to issue 365/361

jib: do we still need track stats?

RESOLUTION: do not remove track at this time

hta: are there implementations that produce track stats on the receivers?

Chrome does, FF does not

jib: should the implementations add it? or just backward compat?

FF has receiver stats only; that would be similar to track stats on Chrome

Issue 374:

issue 374

youenn: exposing VPN information is an issue and networkType is leaking part of the information
... proposal: remove networkType, identify Use Cases and restrict to those with mitigation
... either it's important for an application whether it is running on WiFi or 4G
... for other use cases we don't need network type

varun: useful for debugging

bernard: we've had customers with bad stats on a bad wifi

youenn: in safari we expose stats through logging only into the dev console
... going through JS is easy but not essential

hta: sometimes we aggregate data, like quality on wifi or 4G

youenn: maybe it's ok for dev previews, not in general

varun: ok to explore more constrains on this

<dom> http://wicg.github.io/netinfo/

jib: also a group scoping issue. if the rest of the world does not need this, networkType could be an extension for us

varun: it's ok for networkType to be unknown for most cases

hta: proposal to leave in spec + add a note about fingerprinting

youenn: it's an optional stat, right?

varun: yes, all are

bernard: networkInformation spec will ship

jib: if we need to implement it, we will return "unknown" all the time

youenn: ditto

hta: keep it or remove it from spec?

[strawpoll leads to no consensus]

issue 375

youenn: in some cases, IP address could be exposed

varun: it only becomes available when both parties have established connectivity

youenn: in most cases, at the very beginning you don't expose it; then, on receiving candidates, it will be exposed

varun: ok to not expose until the JS has the IP

hta: you can only expose the IP on the add candidate
... seems acceptable to me

Issue 378: Width and Height of Simulcast Layers Sent

Issue 378

jib: option 1 Move all but framesCaptured to "outbound-rtp"
... option 2 Have more than one "sender" stats floating around in the simulcast case
... option 3 Some flavor of "do nothing"

varun: I support option 2

jib: option 2 is more backwards-compatible, although you might get the wrong sender

hbos: we could keep track stats and move them out of sender stats

varun: keeping on track, moving to sender

@@@ will take care of the PR

Worklets and Workers

jib: question about option 2, how do you use SDP?

youenn: from iceTransport
... the ORTC model works, basically

peter: there's also option 4 (what I presented this morning)

lennart: whatwg streams don't really solve the issue, you need to transfer objects to worker

youenn: maybe option 3 would say 'transfer whatwg streams'

peter: that's what I presented

@@: option 2 is much better than option 1

jib: if we add whatwg streams to datachannel, that helps in workers too

hta: how do you see the priority of this work?

youenn: low, very low
... if some people are planning to implement it, I'd be happy to draft it

<dom> https://w3c.github.io/webrtc-ice/#dom-rtcicetransport-addremotecandidate

[back to slides, media]

youenn: it would be very difficult to transfer the MediaStreamTrack without all its context

Create sender/receivers in workers

hta: hearing that the priority on this is even lower

[break]

<vr000m> E2EE for Conference Calls, slide 90 https://docs.google.com/presentation/d/1WOihY0SMJbWvfbc-41GA78F4yzPSwmyDOJ8GbRoU7dw/edit#slide=id.g44b656d873_73_0

<vr000m> switched to alternate slides sent by email, follow hangouts

<vr000m> I seem to have lost scribing while partaking in the discussions.

<vr000m> discussing the decision currently

<vr000m> E2EE use case where we trust the application -- take to the list

<vr000m> QUIC API/BYOT -- take it to the list

<vr000m> Encoder/Decoder streams -- convert slideware into document, and think about next steps

<vr000m> PeterT takes the lead in converting the encoder/decoder streams into a doc

<vr000m> How do we go forward, what is next.

Summary of Action Items

Summary of Resolutions

  1. use 'onclosing'
  2. do not remove track at this time
[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2018/11/06 13:15:33 $