[minutes] WebRTC F2F meeting Quebec City - 23 July 2011 from Francois Daoust on 2011-07-24 (public-webrtc@w3.org from July 2011)

From: Francois Daoust <fd@w3.org>
Date: Sun, 24 Jul 2011 23:12:04 +0200
To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <4E2C8AA4.4060809@w3.org>

Hi,

The minutes of yesterday's F2F meeting are available at:
http://www.w3.org/2011/07/23-webrtc-minutes

... and copied as raw text below. Please let me know if something is missing or incorrectly reported there.

A few actions were given during the meeting. The tracker tool used within the group is available at:
http://www.w3.org/2011/04/webrtc/track/

However, this tool can only track actions assigned to group participants. Here's a summary of actions; please refer to the minutes for more context:
- Cullen to send a server-provider TURN use case and user-provider TURN use case
- DanB to send comments reviewing requirements to list
- Harald to query authors on A15 on what context means
- John Elwell to propose use case on recording
- Matthew Koffman to send some text around SDP
- Roni Even to find some of the use cases in other WG in IETF and send to group

Thanks IETF for hosting the meeting!

Francois.

-----
Web Real-Time Communications Working Group - Quebec City F2F

23 Jul 2011

[2]Agenda

[2] http://www.w3.org/2011/04/webrtc/wiki/July_23_2011

See also: [3]IRC log

[3] http://www.w3.org/2011/07/23-webrtc-irc

Attendees

Present
Stefan_Hakansson, Harald_Alvestrand_(hta), Dan_Burnett_(dan),
Francois_Daoust, Cullen_Jennings, Gonzalo_Camarillo,
Ted_Hardie, Emile_Stephan, Roni_Even, Andrew_Hutton,
Leon_Portran, Alan_Johnston, Ross_Finlayson,
Ram_Ravinaranath, John_Elwell, ThomasRoessler, Alissa_Cooper,
Timothy_Terriberry_(tim), Dan_Romascanu, Jon_Peterson,
Bert_Wijien, Narm_Gadiraju, Xavier_Marjou, Christer_Holmberg,
Miguel_Garcia, Magnus_Westerlund, Colin_Perkins,
Salvatore_Loreto, Dan_Druta, Bert_Greevenbosch,
Matthew_Koffman, Eric_Rescorla_(ekr), Cary_Bran, Daryl_Malis?

Regrets
Rich_Tibbett

Chair
Harald_Alvestrand, Stefan_Hakansson

Scribe
Francois_Daoust, Cullen Jennings, Dan_Burnett

Contents

* [4]Topics
1. [5]WebRTC Architecture
2. [6]Use Cases
3. [7]Derived API requirements
4. [8]Implementation Experience — Google
5. [9]Implementation Experience — Mozilla
6. [10]Implementation Experience — Cisco
7. [11]Implementation Experience — Ericsson
8. [12]API Design Questions
9. [13]Signaling Issues
10. [14]Administrativia
* [15]Summary of Action Items
_________________________________________________________

hta: [introduction]. W3C meeting hosted by IETF. W3C rules.
... No Polycom for the conference today.
... Looking for scribes.

[francois and Cullen step up]

WebRTC Architecture

[Harald projects slides on Web RTC architecture]

hta: Going to present goals, architecture layers, security. I won't
touch upon details.
... Goal is to enable real-time communication between browsers. Real
time means you can wave at someone and he can wave back. 100ms
timescale.
... Media is audio/video but people also want to send other stuff
too.
... Important to drive the design by use cases
... We have to go for general functions to enable innovations. Use
cases are least amount of things possible.
... Basic concept: somehow Javascript, with the help of the server,
can establish a connection to the other browser.
... Media flows through the shortest possible path for latency and
because it makes life simpler.
... Different architecture layers. Apart from the browser, any other
box must be assumed to be able to be controlled by an enemy.
... That is a security context that is slightly different from in
other areas.
... In IETF, we're mostly concerned by attacks on the network.
... Here, we have to take into account all components.
... Data transport means you have to establish some data path. More
or less agreed to use ICE.
... UDP is the obvious transport given the constraints (we need to
be able to fall back to TCP though). Congestion management is
necessary.
... I'll skip rapidly through IETF issues as they will be addressed
on Tuesday and Thursday. Focus on API here.
... There will be data framing, securing, we must negotiate data
formats and we need some baseline that everyone implements for the
negotiation to always succeed.
... We have use cases for setting up connections that require SIP
and others that don't require SIP.
... User interfaces include privacy considerations. The user has to
know that he has allowed the use of camera and microphone and must
be able to revoke that access at any time.
... In scope for W3C, not so much for IETF.
... Talking about API, it shouldn't take too many lines of
JavaScript to setup a connection and tear down a call. Multiple
streams, pictures that jump up, etc. should be possible.
... There are things that are on the wire but are truly relevant for
the user.
... In some cases, security demands that they are hidden to the user
interface.
... Interoperability requires that it all gets specified.
... If you specify control too precisely, it ages badly, e.g. "I want
that precise codec".
... Of course, we have to have interoperability. If you give the
same script to two browsers, it should work. Not exactly the same
resources because different capabilities are possible, but it should
work.
... When data is passed through this API, format has to be
specified.
... In some cases, we have blobs that get passed.
... These blobs will be parsed by different browsers though, so they
need to know how to parse them.
... Summary slide: Having an overview is a means to ensure that we
can talk about different parts of the system and we feel confident
that we have all the pieces covered.
... Questions/Comments/Disagreements?

Cullen: that seems consistent with what I think I'm hearing.

EKR: you said precise control ages badly. I'd like to say "higher
quality than x/y/z", right?
... Problem is that the notion of "higher quality" depends on codecs
and profiles.
... I fear it falls into a rathole designing a new way of describing
codecs and qualities

Matthew_Koffman: I think legacy interoperability is missing from
your slides.

??2: what do you mean with legacy interoperability

Matthew_Koffman: I can show you existing devices that do RTP but not
SRTP. If you want to talk to non-secure devices, you need to relax
the bullet presented that says unencrypted data does not need to be
carried.

hta: one of the things that someone mentioned is that we need to
talk to gateways.

TedHardie: is this the right place to discuss that? Shouldn't this
be handled by IETF RTCWEB group on Tuesday/Thursday?

Matthew_Koffman: I believe it has API implications.
... It's an overview; the overview should talk about legacy systems.

hta: I'll consider including that for Tuesday/Thursday as well.

Francois: Asked question about architecture and if we need to
resolve it in this WG

hta: If we discover that the W3C perspective results in things
needing to change, we should take that change to IETF

<inserted> [quick raise of hands reveals that most of the room
follows both IETF and W3C mailing-lists]

Use Cases

Projected slides: [16]Web RTC use cases (PDF, 869KB).

[16] http://www.w3.org/2011/04/webrtc/wiki/images/4/45/Use_cases_and_reqs_webrtc.pdf

Stefan: presenting use case
... simple use case is two web browsers communicating. One of the
browsers is behind a NAT. One link has packet loss
... works with different browsers and os
... video windows are resizable

scribe: Can move from ethernet to wifi to cellular and the session
should survive
... Moving to second use case between two service providers
... case where you must handle two cameras sending video from one
browser

Roni: asked question about streaming

Stefan: it is not streaming of the game, it is just the two cameras
being sent to the coach
... use case with a mess of video stream

Colin: Question about if there was NATs in this case

Stefan: yes, there are nats

John: Is there an assumption that the video is the same or is
different between peers

Stefan: each peer sends same video to all other peers
... use case with multi party on line game
... Use case with telco interop with PSTN
... need to be able to place and receive calls to PSTN
... not clear how much gateway functionality would be needed
... In the case of calling FedEx, this adds being able to navigate IVR

Dan_Burnett: brought up IVR interaction can be voice rec too

hta: need to tease out the requirements from this use case

Dan_Burnett: does not care about telephone use case but if we are
going to do it, we should do it right

Colin: are there other scenarios for legacy end points

Stefan: these are the only two cases right now

Roni: brought up need to deal with call center cases

Christer: goal is not to limit to PSTN, it is to connect to SIP

Colin: very different to gateway something that uses the same media
formats vs. different media formats

Roni: Also different in terms of security
... do we need to know it is secure end to end?

hta: In his Google role: worried that we are worrying too much about
interoperability
... telco network is only one concern

Cullen: the FedEx use case. It's not only DTMF. There's the initial
prompt. PSTN is not easy. Many attempts to interop with that have
failed with FedEx.
... We're very interested with the legacy use case.
... 2.5 billion users out there without Internet connections.

<Venkatesh> I agree with that comment about worrying too much about
PSTN.

ekr: There is interop with PSTN, legacy SIP devices, partially
standard devices like webex

<Venkatesh> the very same argument was used when other initiatives
started and complicated the heck out of the specifications with very
little benefits IMO.

stefan: Use case video conference server
... doing simulcast where clients send high and low res video
... central server switches the active speaker high res video to all
others plus sends a copy of all low res streams
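
[Illustration, not part of the discussion] As a rough sketch of the switching behaviour described above, the following Python model lists which streams a central server might forward to each participant. The function name, participant names and resolution labels are hypothetical, and leaving a client's own streams out of what it receives is an assumption rather than something stated in the meeting.

```python
def forwarding_plan(participants, active_speaker):
    """Return, per receiver, the (sender, resolution) streams a switching
    server would forward in the simulcast use case (hypothetical model)."""
    if active_speaker not in participants:
        raise ValueError("active speaker must be a participant")
    plan = {}
    for receiver in participants:
        streams = []
        # High-res video of the active speaker goes to all *other* clients.
        if receiver != active_speaker:
            streams.append((active_speaker, "high"))
        # A copy of every other participant's low-res stream is also sent
        # (assumption: a client does not receive its own stream back).
        for sender in participants:
            if sender != receiver:
                streams.append((sender, "low"))
        plan[receiver] = streams
    return plan


plan = forwarding_plan(["alice", "bob", "carol"], active_speaker="alice")
```

In this model a change of active speaker only changes which high-res stream is forwarded; the low-res copies stay the same, which is what would let a client show thumbnails of everyone else.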

Dan_Burnett: Q, we are talking about a display with many people,
plus when speaking each person gets bigger

stefan: does not need to get bigger immediately, can be hysteresis
on staying on room

<Dan> trying to identify the Dan's

<Dan> yep - Dan in Cullen's notes it's not the Dan (Romascanu) in
the IRC :-)

<Dan> just call me DanR if I speak

stefan: the server decides which one to display

colin: very different requirements if users get to decide what
streams get displayed instead of the server

stefan: This use case is inside an organization and introduces a
firewall. People outside the firewall should be able to participate

Derived API requirements

See: [17]WebRTC requirements.

[17] http://lists.w3.org/Archives/Public/public-webrtc/2011Jul/att-0008/webrtc_reqs.html

hta: these requirements are only going to be discussed here not in
IETF

Dan_Burnett: Is A1 asking permission or asking them which one to use
?

Ekr: this is a fundamental invariant: the browser needs to do
this

Dan_Burnett: in W3C we should use the term User Agent not Browser

ekr: The web application needs to be able to request use of the
device. The user agent needs to get consent to allow that
... two ways to do device selection. 1) application finds the devices
and asks the user which one to use 2) application asks for an audio
device and the UA has a way to select one

Matthew: useful to be able to preflight the permissions and find out
if they would be OK or not

hta: getting close to end of time for this

francois: do we have someone willing to review the requirements?

<scribe> ACTION: DanB to send comments reviewing requirements to
list [recorded in
[18]http://www.w3.org/2011/07/23-webrtc-minutes.html#action02]

[18] http://www.w3.org/2011/07/23-webrtc-minutes.html#action02

<trackbot> Created ACTION-5 - to send comments reviewing
requirements to list [on Daniel Burnett - due 2011-07-30].

Alissa: where are the requirements going to live?

hta: open issue - like to hear comments on this at end

stefan: moving on Security consideration slide

<francois> ISSUE: where are requirements going to live?

<trackbot> Created ISSUE-2 - Where are requirements going to live? ;
please complete additional details at
[19]http://www.w3.org/2011/04/webrtc/track/issues/2/edit .

[19] http://www.w3.org/2011/04/webrtc/track/issues/2/edit

John: what about recording of media. Record what is spoken on mic or
received at far end ?
... recording local or recording on a device across the network

John: are people interested in this type of use case?

<francois> ACTION: harald to query authors on A15 on what context
means [recorded in
[20]http://www.w3.org/2011/07/23-webrtc-minutes.html#action05]

[20] http://www.w3.org/2011/07/23-webrtc-minutes.html#action05

<trackbot> Created ACTION-6 - Query authors on A15 on what context
means [on Harald Alvestrand - due 2011-07-30].

<scribe> ACTION: John Elwell - propose use case on recording
[recorded in
[21]http://www.w3.org/2011/07/23-webrtc-minutes.html#action06]

[21] http://www.w3.org/2011/07/23-webrtc-minutes.html#action06

<trackbot> Sorry, couldn't find user - John

<francois> [Note there is no way to action someone who is not a
participant in the WG using Tracker]

stefan: asking question about adding other use case

hta: do we want lots of use cases that differ or a use case that
encompasses lots of aspects
... what style do people want?

Cullen: slight preference that encompasses lots of aspects instead
of having tens of use cases.

hta: I seem to be outnumbered.

Stefan: same as Cullen

Francois: do we need a use case with screen casting between peers,
like VNC?

Roni: There are use cases in other WG in IETF. For example CLUE and
the semantic label.

<scribe> ACTION: Roni Even - find some of the use cases in other WG
in IETF and send to group [recorded in
[22]http://www.w3.org/2011/07/23-webrtc-minutes.html#action07]

[22] http://www.w3.org/2011/07/23-webrtc-minutes.html#action07

<trackbot> Sorry, couldn't find user - Roni

Dan_Druta: We need to look at them from the user perspective. End to
end user experience is the important thing. There are some use cases
that are driven by actors: in our case users, user agents, servers.
We need to think that way about this work. Discovery of capabilities
and matching two browsers together should be a big one. The
timelines of browser development will mandate that we need this.

hta: over time - want to move on

Christer: goal is to come up with use cases that derive new
requirements

Tim: like to include music use case

Cullen: in favour of it

hta: on E911, drop for now

Implementation Experience — Google

Projected slides: [23]WebRTC Chrome implementation status (PDF,
188KB).

[23] http://www.w3.org/2011/04/webrtc/wiki/images/7/7f/Webrtc-chrome-impl-status.pdf

hta: presenting in his google role on their implementations in
chrome

hta: goal, going for production quality code in chrome for everyone
... used to provide concrete feedback to the API and protocols
... they know the version they are shipping in the first version
will not be what is in second version
... they have released key components at code.webrtc.org
... working on integrating into Chromium
... add a webrtc C++ api that wraps the GIPS code
... webkit had a "quite rigorous" review process. Specs are very
unstable.
... rolling out changes to libjingle, webkit and more that I missed
... Got to a working demo with audio and video in browser
... going to work real soon now

ekr: what does that mean?

hta: can't comment on release dates - matter of months before it is
in production chromium
... prefixing everything with webrtc to allow for changes to stable
system later

Cullen: after you get a version into the production code, is the
intention to remain backwards compatible with the API you'll have
shipped?

hta: we'll argue more strongly against cosmetic changes, yes. We're
open for more important changes.

ekr: will it roll out as command line switch, then no switch?

hta: yes, expect to see stage with switch

Implementation Experience — Mozilla

Tim: mostly been focusing on infrastructure work
... for example, speeding up camera pipeline
... doing a new low latency audio backend
... likely to land in firefox 8 or 9
... doing Media Stream API for splitting , mixing, synchronization
... allows for the more complex use cases and innovation
... Plans: using GIPS code from google. First target is firefox
add-on. Want to do this as it is rapidly evolving.
... Makes it easier to rapidly iterate.
... Target is something production ready in Q1 2012 (just a rough
estimate, not a commitment)
... whole bunch of user experience questions, call interrupt, multi
domain conferencing
... been discussing doing SIP directly in browser
... feel this gives you easier way to tie to other devices

Implementation Experience — Cisco

Projected slides: [24]Cisco's WebRTC implementation (PDF, 1.61MB).

[24] http://www.w3.org/2011/04/webrtc/wiki/images/b/b1/RTC-Web-Cisco-Implementations.pdf

cary: started to see can we get two browsers to call each other
using SIP
... have implemented this in Chromium and Mozilla
... can do voice and video calls between browsers and between
browsers and video phones
... using GIPS
... put the Cisco SIP stack in Chromium by implementing a render host
API; also needed to touch the webkit glue
... Did Firefox extension focusing on putting the video and voice
... plan to contribute code to open source projects "soon"

Implementation Experience — Ericsson

Projected slides: [25]PeerConnection implementation experience (PDF,
39KB).

[25] http://www.w3.org/2011/04/webrtc/wiki/images/a/aa/Peerconnection-implementation-experience.pdf

Stefan: working on top of webkitGTK+
... goal is to learn about the API and how it works, learn about
flexibility of API. We learned it can be implemented with reasonable
effort.
... We have sent feedback to editor of spec to add things like label
... there are a bunch of blog posts (URL in slides)
... can demo offline if you want and there is a youtube video of
this

Magnus: How many of you have looked at security issues?

hta: chrome has tough security review process and this is going
through it

Tim: have tough security review process

Cullen: security, what's that? ;) Primary goal was to get something
working.

API Design Questions

Projected slides: [26]WebRTC API Design Questions (PDF, 53KB).

[26] http://www.w3.org/2011/04/webrtc/wiki/images/4/46/Webrtc-jennings.pdf

Cullen: trying to come up with questions and answers that people in
the room may have as things they want to do.
... Looking for feedback on whether we should do this or that.
Consensus on things that don't need to be done.

TedHardie: thinking about whether some of the interfaces between the
browsers and the OS need to be taken into account

Cullen: Right. Today, I'm going to stay high level, but we'll need
to go into much more details later on.
... Design principles: same stuff as said earlier. A simple app does
not need to know a lot about underlying things.
... Looking at use cases that enable things.
... Starting with connecting to media: connecting to devices,
cameras, microphones.
... Do we have an API to enumerate what the various cameras are on a
device?
... Example of laptop with different cameras.
... I'd like some feedback.

hta: one thing that is fairly common is "switching to headset".

ekr: also common that the system picks up the wrong camera. The
feature that is imperative is that the user gets the choice.

<anant> switching to headset is taken care of by the OS though (in the
most common cases)

ekr: whether it's a web app or a chrome issue is still tbd.

<jesup> tablets: front/rear, etc. May be able to group with user
giving permission to use the camera

TedHardie: two cases. One is when you want to set a default. Second
is when you want to switch or mix.
... For the enumeration, I do think that the JavaScript needs to be
able to query that information from the browser, but not for naming.

Cullen: an API to find out the current list of media devices and
some notifications mechanism to tell us what modifications there are
to that.
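
[Illustration, not part of the discussion] The two pieces mentioned here, a queryable list of media devices and a notification mechanism for changes to that list, can be modelled roughly as below. This is a Python sketch with hypothetical class, method and device names; it is not a proposed browser API and it ignores the consent questions raised in the surrounding discussion.

```python
class DeviceRegistry:
    """Hypothetical model: a device list plus change notifications."""

    def __init__(self):
        self._devices = {}    # device id -> kind ("camera" / "microphone")
        self._listeners = []  # callbacks taking (change, device_id)

    def on_change(self, callback):
        # Register a callback to be told about added/removed devices.
        self._listeners.append(callback)

    def list_devices(self):
        # Return a copy so callers cannot mutate the registry's state.
        return dict(self._devices)

    def plug(self, device_id, kind):
        self._devices[device_id] = kind
        self._notify("added", device_id)

    def unplug(self, device_id):
        # Only notify if the device was actually known.
        if self._devices.pop(device_id, None) is not None:
            self._notify("removed", device_id)

    def _notify(self, change, device_id):
        for callback in self._listeners:
            callback(change, device_id)


events = []
registry = DeviceRegistry()
registry.on_change(lambda change, dev: events.append((change, dev)))
registry.plug("camera_1", "camera")
registry.plug("headset", "microphone")
registry.unplug("camera_1")
```

An application built on such a model could react to the "added" event for a headset by offering the user a switch, rather than polling the device list.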

Dan_Druta: that ties with the consent problem.

<jesup> Right: camera/mic plugin/removal. Consent needed for a new
device to be used

TedHardie: I disagree. The need for consent needs to be on a per
call basis.

Dan_Druta: I may not want to give permission to an app to see my
face, but may be ok for it to see my room.

<jesup> Though a user could (at their option) pre-give consent for a
specific device/app combo

Tim: the ability to enumerate the different cameras may raise a
security concern as it gives the ability to fingerprint the browser
more easily.

MatthewKoffman: when you install Skype on a tablet, for instance,
you typically enable the app to access cameras.

<jesup> Related issue: naming of cameras - "standard" names vs user
input names vs generic names (camera_1, etc)

Cullen: the permission problem is incredibly complex.
... I don't think we have enough to nail down the many ways we may
need to access the camera yet.

<jesup> Is the solution to the permission problem part of our spec,
or something for each implementation to decide on?

Alissa: thinking about the use case where you may want to use the
camera to take still pictures but not to stream video

Stefan: coordination with DAP. We'll handle streams, they will
handle still pictures.

<burn> actually, I think Alissa's concern was that this API might be
used to record but not stream

Cullen: you should be able to add new cameras/microphones and switch
to that at any time.

<Alissa> yeah, capture or record, but not stream

<burn> right, capture. and then presumably do evil.

Cullen: the currently proposed API does not give you much in terms
of ICE process.
... The one issue that I want to ask is how do we want to pass
credentials?
... Does the JavaScript see the password?

hta: good question on what the model is. Whether it's on the user,
browser, or server.

Cullen: [examples of different TURN servers configurations found in
the wild]

MatthewKoffman: do we need to have calling use cases that involve
enterprises?

Cullen: there's one.

<scribe> ACTION: cullen to send a server-provider TURN use case and
user-provider TURN use case [recorded in
[27]http://www.w3.org/2011/07/23-webrtc-minutes.html#action08]

[27] http://www.w3.org/2011/07/23-webrtc-minutes.html#action08

<trackbot> Created ACTION-7 - Send a server-provider TURN use case
and user-provider TURN use case [on Cullen Jennings - due
2011-07-30].

Cullen: other things we could possibly want to be notified in JS
about such as:
... can't gather address from one of servers, fail to connect to
TURN server, other side disconnects.
... etc.
... Each time you get a better path to the other side, knowing about
that would help debugging things a lot.

TedHardie: why would we want that other than for debugging?

<burn> Another point Matthew made a moment ago that Cullen wanted
captured: may want to know when my (the user's) address changed.

TedHardie: If you chose 2 instead of 3 or 4, do you want this to be
passed back to the JavaScript?

Matthew_Koffman: yes, you need that for several purposes

Cullen: to tell people to switch to another NAT, because the current
one is evil.

hta: I can imagine that people will say that not passing the address
back to JavaScript is actually a security feature.

<gape> +1

Matthew_Koffman: I can explain why it's a fake security issue.

<jesup> The remote address is trivially available on the wire since
data is going peer-to-peer

<derf> Not to the JS.

<jesup> True

[discussion on aggressive/fast/low mode]

Colin: sometimes you want not to use the best possible connectivity,
but maybe something below.

Christer: not so much an error, rather a choice when you call the
API.

<tedhardie> I'm concerned that the API not force the JS application
to deal with this level of detail; after all, some of these
applications are simply going to say "sorry, video/audio not
available" to the user, where this is an add-on to the basic
application (the poker site video use case)

ekr: connectivity check, you're going to want to know whether the
connection is direct or through the relay, etc.

MatthewKoffman: that's the sort of information you need to be able
to say: "your NAT is fine, it's John's NAT that's crappy".

[calling for a 15mn break. Discussion to continue afterwards]

Signaling Issues

Cullen: for non-ICE signaling, when do you send messages?
... need to add all media codecs before end of javascript (all at
same time). when function call returns, signaling is sent
... Other option is "open" we proposed.
... either add explicit start signaling, or queue up everything and
add at once which means implicit signaling
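
[Illustration, not part of the discussion] The "queue up everything and add at once" option can be sketched as follows. This is a Python model with hypothetical names, assuming that "when the function call returns" can be represented by a wrapper that flushes the queue once the script has finished; the explicit-start variant would instead have the script call flush() itself.

```python
class PeerConnectionModel:
    """Hypothetical model of a connection object that batches signaling."""

    def __init__(self):
        self._pending = []  # changes queued but not yet signaled
        self.sent = []      # signaling messages "sent", one list per batch

    def add_stream(self, label):
        # Nothing is signaled yet; the change is only queued.
        self._pending.append(label)

    def flush(self):
        # Send one signaling message covering everything queued so far.
        if self._pending:
            self.sent.append(list(self._pending))
            self._pending.clear()


def run_script(pc, script):
    """Implicit signaling: flush once, when the script returns control."""
    script(pc)
    pc.flush()


def script(pc):
    pc.add_stream("audio")
    pc.add_stream("video")


pc = PeerConnectionModel()
run_script(pc, script)
```

With the implicit model, both additions end up in a single signaling message without the script ever asking for signaling to start.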

Matthew: do it the way everything else does, whatever that is.
... I think browsers do it implicit way.
... because every time control is returned it re-renders

Christer: who is doing negotiation?

Cullen: not javascript that does signaling

EKR: you express opinions to PeerConnection about what you would
like, and invisible to JS this happens in the background as
necessary

Cullen: some negotiation will happen, done by the browser

Dan_Druta: this is early vs. late binding. either give pref in
advance or control directly.

Cullen: one way as you get permission and access to media streams,
you gather up and then put all in the PeerConnection object at once.
alternatively, could add to PeerConnection one at a time as you get
them but don't start sending media on any until you say go.

(missed Matthew comment)

EKR: they are really equivalent

Stefan: should be able to add and remove during session. confusing
if you have to start session.

EKR: JS VM must not start until control has returned from all JS.

Cullen: this is not true of all JS.
... sounds like leaning towards implicit.

Matthew: yes, but treat everything as an add.

Roni: and need delete as well

Cullen: negotiation is implicit
... most of the APIs were leaning towards SIP-style SDP
offer/answer, thought there was consensus there.
... three models: SIP, Jingle, or raw SDP in offer/answer wrapper.
... another variant is an advertise/propose model that I had sent
in.

<scribe> ACTION: Matthew to send some text around SDP [recorded in
[28]http://www.w3.org/2011/07/23-webrtc-minutes.html#action09]

[28] http://www.w3.org/2011/07/23-webrtc-minutes.html#action09

<trackbot> Sorry, couldn't find user - Matthew

Colin: all payload formats use offer/answer semantics, so keeping
that would be helpful.

Matthew: Need to be able to determine what kinds of coders/decoders
you have.

hta: have never seen a use case where you need to know which
coder/decoder you're using.

Matthew: matters for audio recording. same as determining whether
you can do real-time media. if API allows recording of video, need
to be able to know how to encode it, resolution, etc.
... maybe other groups might do this, but it needs to be done.
... want to be able to choose from JS which encoding, etc. to use.

(missed comment from Harald on why this is necessary).

hta: JS coder needs to just say "I want to communicate" but not
necessarily how.

Matthew: what if browser is a terminal for PBX. want browser to act
more like Skinny phone than SIP phone.

Cullen: replace skinny with MGCP for this discussion. you need to
know things about device. can't negotiate SDP without knowing
additional info.

Roni: there are many parameters, not all are codec-specific. Some
params you need to have anyway.

Ted: maybe middle ground is advertise/offer/answer. First send
what's available, then offer/answer from then on. You get an
informed O/A and can still use O/A.
... gateway should not need to have fundamental semantic shifts.
Adv/O/A leaves you with the same semantics as SDP. Should discuss
over beer.
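
[Illustration, not part of the discussion] A minimal sketch of the advertise/offer/answer idea, in Python: each side first advertises what it has, the offerer then builds an informed offer restricted to what the peer advertised, and the answerer picks from that offer. The function names and the placeholder codec names are hypothetical, and the selection rule (the answerer's own preference order wins) is an assumption made only for the example.

```python
def advertise(capabilities):
    """Step 1: announce what this endpoint supports (order is irrelevant)."""
    return set(capabilities)


def make_offer(own_preferences, peer_advertisement):
    """Step 2: an informed offer lists only what the peer advertised,
    kept in the offerer's own preference order."""
    return [c for c in own_preferences if c in peer_advertisement]


def make_answer(offer, own_preferences):
    """Step 3: the answerer picks its most preferred entry from the offer
    (assumed rule), or None when there is nothing in common."""
    for codec in own_preferences:
        if codec in offer:
            return codec
    return None


a_prefs = ["codec_a", "codec_b", "codec_c"]
b_prefs = ["codec_c", "codec_b"]

offer = make_offer(a_prefs, advertise(b_prefs))
answer = make_answer(offer, b_prefs)
```

After the advertisement step the exchange is still a plain offer and answer, which is the property Ted argues keeps gateways simple.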

Stefan: we need this data to negotiate, but is it part of this API?

JonPeterson: O/A always had the notion of counter-proposal. SDP can
describe sessions well but not negotiate. So you can describe a
complete session and allow a counter-proposal for something better.

Ted: makes gateways too complex.

Jon: if offer or answer described full session, yes, but it doesn't.

hta: no matter how we do this, we will see JS parsing these
negotiation blocks. If we want to support our use cases, this will
need to be gatewayed eventually anyway.

Matthew: it's a horrible hack to use PeerConnection to ask for
capabilities and parse it in JS, when the API could just support it.

Cullen: let's see a proposal and then discuss.
... already decided to add video mid-call.
... do we need to know when other side is sending?
... nice to know in the UI that connection is being set up and when
it's done.
... media in different directions may connect at different times,
nice to have notification.

Roni: when you receive the media you know you're getting it. when
you send you don't know.

Cullen: right. should there be an API that says that both sides are
receiving?
... Will reword this question to be clearer.
... Now let's talk about tracks.
... whatwg API example up on screen
... which kind of media goes in different tracks. when are they in
one track, when are they separate.
... I like for them all to be separate.

Matthew: don't like. many encoders can combine stereo channels into
one codec on one track

Cullen: I like your metaphor, which is based on the codec.

JohnElwell: when is it a track, and when is it a media stream?

Stefan: stream contains 1 or more tracks. keeping them within one
stream helps you with synchronization.

hta: one PeerConnection can be connected to multiple streams, each
with multiple tracks.

Cullen: working definition is that if different pieces of media are
in same codec, they are to be in same track. if multiple tracks need
to be synchronized together, they are in the same media stream.
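
[Illustration, not part of the discussion] The working definition above can be written down as a small grouping rule: media pieces sharing a codec go into one track, and tracks that must be synchronized go into one media stream. The Python sketch below uses hypothetical names, and it represents "needs to be synchronized together" by a sync-group label, which is an assumption of the example.

```python
from collections import defaultdict


def group_media(pieces):
    """Group (name, codec, sync_group) tuples into streams and tracks.

    Returns {sync_group: {codec: [names]}}, i.e. one media stream per
    sync group, and within it one track per codec.
    """
    streams = defaultdict(lambda: defaultdict(list))
    for name, codec, sync_group in pieces:
        streams[sync_group][codec].append(name)
    # Convert nested defaultdicts to plain dicts for easy comparison.
    return {group: dict(tracks) for group, tracks in streams.items()}


pieces = [
    ("left", "audio_codec", "talk"),     # stereo channels share one codec
    ("right", "audio_codec", "talk"),
    ("face", "video_codec", "talk"),     # synchronized with the audio
    ("slides", "video_codec", "screen"), # not synchronized with the talk
]
streams = group_media(pieces)
```

Here the two stereo channels land in one track, the talking-head video is a second track in the same stream, and the unsynchronized slides form a stream of their own.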

Magnus: has to do with mapping to RTP sessions
... sync cannot be across sessions.

hta: i thought media stream mapped to cname, but not sure.

Roni: track and media stream are both logical entities from a w3c
perspective. but we need to know how to map to IETF level

Cullen: want Magnus to work all of this out
... (joking, mostly)
... Need mapping to AVT, for sure.

(general agreement)

Roni: As long as we talk about logical entities, we don't need to
talk RTP or SDP

Cullen: things in one media stream will map to one RTP c-name. This
is how you signal that they are synchronized (rendered together).
... and a track will have a one-to-one correlation with an SSRC in
the simple case.
... receiving video, bit rate is being adjusted, should we know the
other side is doing this? when the media we're receiving changes in
some way, do we want to be notified?

Roni: why would we?

Cullen: may want to change my screen resolution
... for bit rate, if all my streams just dropped their bit rate I
may in the JS decide to close some of my streams.

(general agreement that this is useful info)

Christer: if quality is decreasing, for example, could remove video
to improve audio.

Daryl_Malis?: good to collect and make use of this. My concern is
that this info in practice is often used only to decrease quality of
the end result but never improve.

Tim: bitrate is a terrible proxy for quality
... maybe everyone stopped moving or talking
... exposing quality info is very codec-specific

Magnus: this is really about providing congestion info, right?

hta: this is difficult to do in real time.
... we can get info on sender's changes.

Cullen: trying to keep this simple, e.g. either sender changed
resolution or reduced cap on bandwidth.

Tim: difficult to detect cap on bandwidth

Daryl: with clients using adaptive bitrates, they will lower the
rate when nothing's happening and then increase back up when there
is motion/sound.

EKR: what we need is a way for the sender to say to the receiver
"I'm having to back off here"

Cullen: summary is we like this but it's hard and we don't really
know how to do it properly (like packet loss concealment)
... presuming going to legacy devices via gateways. Do we have
enough signaling info?

Matthew: out of scope.

Cullen: no, for example receiving early media.

Matthew: need SDP for early media.

Cullen: changing from one-way to two-way media.

EKR: where is the call state machine?

Cullen: all current proposals have it in PeerConnection object.

Matthew: this kind of signaling has to happen over the JS channel.
It would otherwise prevent many great use cases.

Daryl: instead of this just being about ringing, can we generalize
to early media?

hta: impacts FedEx use case.

Matthew: no such thing as early media, just media. There are no
signaling implications. what would a skinny phone do calling fedex?
if it didn't work, is the problem in the phone or elsewhere?

Cullen: other question. You'll want some general option to reject an
incoming call based on who's calling.

Matthew: also, how's B notified when A calls B if B does not run his
browser?

hta: out of scope

Matthew: we should have use cases that show that this is needed.

Cullen: sounds like "how do I receive calls when my phone is off"?

Matthew: no.

stefan: notifications in scope of the Web notifications WG. We'll
follow their conclusions.

Christer: if your browser is not running, you're probably not
registered to your SIP provider, so the client will never be able to
figure out someone called in the first place.

TedHardie: basically, you need some architecture that allows people
to receive notifications when things run in the background.
... It's not an API issue.

Matthew: right, it's a use case issue.

TedHardie: I will send a use case.

hta: rejecting a call should be a matter of not creating a
PeerConnection object.

Cullen: question is do you start your ICE before or after? This is
going to make a timing question. My prediction is that ICE
processing will be started before.

Matthew: an evil Web site gets your address.

Cullen: I can't force browsers to go to an evil browser.

Matthew: a Web site that does not want to reveal that information
must be able to go through the state machine and make the process
happen later.
... It must be possible for a Web site that wishes to protect users'
privacy to send JavaScript that has ICE processing happen after.
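
One way to read Matthew's requirement, together with hta's point that rejecting a call can mean never creating a PeerConnection, is sketched below. Everything in it is hypothetical (the `FakePeerConnection` class, its `start_ice` method, the blocklist); it only illustrates the ordering, in which the caller is checked first and ICE begins only after the user accepts.

```python
class FakePeerConnection:
    """Hypothetical stand-in for a PeerConnection; not a real browser API."""

    def __init__(self):
        self.ice_started = False

    def start_ice(self):
        # In a real browser this step would reveal local addresses to the peer.
        self.ice_started = True


def handle_incoming_call(caller, blocked, user_accepts):
    """Return a connection for an accepted call, or None if it is rejected."""
    if caller in blocked:
        return None  # rejected: no PeerConnection is ever created
    if not user_accepts(caller):
        return None  # declined before any ICE processing
    pc = FakePeerConnection()
    pc.start_ice()  # ICE happens only after acceptance
    return pc


blocked = {"spam@example.com"}
rejected = handle_incoming_call("spam@example.com", blocked, lambda c: True)
declined = handle_incoming_call("bob@example.com", blocked, lambda c: False)
accepted = handle_incoming_call("bob@example.com", blocked, lambda c: True)
```

The timing question Cullen raises still applies: starting ICE this late would presumably add to call setup delay.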

[ekr made a comment on presence which I missed]

[discussion on "Msg blob" bad naming]

cullen: moving to msg blob issues. We need more or less the SDP
message. We need to have crypto context set up. It means we need the
identity.
... We probably need some unique identifier for peer connections.
... Those are the minimum amounts of things I can think of.

ekr: Who's the target of this information? The JavaScript, the Peer
connection?

Cullen: in the simple case, it's going to be relayed. Same thing up,
same thing down.
... There will sure be cases when things get manipulated (JavaScript
or server)

ekr: what information is carried here?

Matthew: if you have SIP in the browser, you need to get this right.

hta: media negotiation machine needs to be in the browser. The call
state machine is not.

Cullen: looking forward to someone splitting media state machine
from call state machine that is SIP-mappable.

ekr: re. same message up and message down, do we have consensus
there?

Stefan: there should be as it should be possible to get encryption
from endpoint to endpoint.

Cullen: is it possible, in the simplest case to have the server do
nothing but relay the message from one side to the other? Do we have
consensus on that?
... That's what all proposals have.
... There's always a "you need to send this chunk of data to the
other side", but none of the specs say that the server needs to make
any update.

Christer: well, at the end of the day, the other side needs to
understand what comes in. If you convert between protocols, you may
need to adjust the message.

Cullen: let me rephrase the question. Should the format that comes
from one side be potentially identical to the one that goes to the
other side?

[no pushback heard]
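
The "same thing up, same thing down" case can be sketched as a server that never looks inside the blob. The `RelayServer` class and the user names are hypothetical; the only point is that what one side sends is byte-for-byte what the other side receives.

```python
from collections import defaultdict, deque


class RelayServer:
    """Hypothetical signaling relay that forwards blobs without reading them."""

    def __init__(self):
        self._inboxes = defaultdict(deque)

    def send(self, to, blob):
        # The blob is opaque to the server: stored and delivered unchanged.
        self._inboxes[to].append(blob)

    def receive(self, user):
        inbox = self._inboxes[user]
        return inbox.popleft() if inbox else None


server = RelayServer()
offer_blob = "opaque-offer-from-alice"  # placeholder content, not a real format
server.send("bob", offer_blob)
delivered = server.receive("bob")
```

A gateway that converts between protocols, as Christer notes, would replace the body of `send` with a rewriting step, but the format on both sides could still be the same.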

Cullen: final question is the size of the blobs.

hta/Stefan: no limit. Limit is for datagram.

Cullen: ok, so these blobs can be large enough.
... moving on to media issue.
... Question about hints you give when setting up cameras.
... What I'm proposing here are size, spatial vs temporal quality
are important (spoken voice, or non-spoken voice). Clearly needs to
evolve over time.
... Some people proposed we'd have none of these things.

Roni: Let's assume that we're using SDP. Are you suggesting that we
have a separate set of hints that are not part of SDP?

Cullen: this even applies to which codec I should use.

Roni: I assume you can negotiate everything with SDP.

Cullen: The Web browser can. But the JavaScript?

Matthew: everything can be manipulated through JavaScript before it
goes out.

Cullen: at one end of the range of opinions, JavaScript ought to be
able to construct the SDP offer. At the other end, it ought to
be able to do nothing.

hta: no one objected to the idea that screen size should be
communicated

Cullen: also rough consensus earlier on voice/music.

Matthew: server can strip out any SDP offer/answer as it wishes
before transmitting it.

hta: yes, but it can only subset things. It cannot ask for more
offers.
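
A minimal sketch of the kind of subsetting hta describes: a server removes some RTP payload types from an offer but cannot add anything the browser did not put there. The function name is hypothetical and the handling is deliberately simplified (only `m=`, `a=rtpmap` and `a=fmtp` lines are touched, and an `m=` line left with no formats is not dealt with).

```python
def strip_payload_types(sdp, remove):
    """Drop the given RTP payload types from m= lines and their attributes."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            parts = line.split(" ")
            head, fmts = parts[:3], parts[3:]
            line = " ".join(head + [f for f in fmts if f not in remove])
        elif line.startswith(("a=rtpmap:", "a=fmtp:")):
            payload_type = line.split(":", 1)[1].split(" ", 1)[0]
            if payload_type in remove:
                continue  # attribute belongs to a stripped payload type
        out.append(line)
    return "\r\n".join(out) + "\r\n"


# Abbreviated offer for illustration; a real SDP has more mandatory lines.
offer = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0 8\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
)
filtered = strip_payload_types(offer, {"8"})
```

As Tim points out later in the discussion, filtering of this kind says nothing about processing such as AGC or AEC, which does not appear in SDP.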

Roni: if the Web server does not know how the codecs were chosen in
the first place, how is the Web server to make the right choice?

Cullen: if you don't have the info that there's hardware
acceleration for one codec, right, indeed.
... Propose to stop here in the interest of time.

Tim: one other point. The audio vs. voip has a lot of implications
that do not show in SDP.
... Processing that has no bearing whatsoever on what codec you
choose.
... Filtering SDP will never tell the browser to turn off the AGC,
AEC, etc.

Administrivia

hta: first, an easy one. Next meeting is going to be during TPAC
2011, in Santa Clara, USA, first week of November.
... We'll call out for a next teleconference through some Doodle
poll.

<burn> we could also use a w3c teleconference schedule poll . . .

hta: The interesting question here is how do we get to document our
output in a way that is effective, acknowledged, implemented and
deployed?
... What we do at the moment is discuss changes we need to bring to
the WHATWG spec.

cullen: we'd have more useful feedback in the group if the group
publishes a spec in a W3C space.

Christer: we have one document regarding the requirements.

burn: Common to do both. Requirements doc and spec.

francois: [explaining W3C process]. FPWD triggers call for patent
exclusions. Document needs to be in W3C space.

Dan_Burnett: one way is to take a starting point. Other way is to
redo from scratch.

Cullen: from my point of view, critical thing is to have a document.

Alissa: being able to explicitly state where there is no consensus
in a document is important.

Dan_Burnett: I agree.

Cullen: how many do we have to choose from?
... Only one proposal on the table from actual members of the
working group.

hta: I suggest that the chairs continue the discussion and figure
out how to solve this.

hta: Any other business?
... Thanks all for showing up!

[meeting adjourned]

Summary of Action Items

[NEW] ACTION: Cullen to send a server-provider TURN use case and
user-provider TURN use case [recorded in
[29]http://www.w3.org/2011/07/23-webrtc-minutes.html#action08]
[NEW] ACTION: DanB to send comments reviewing requirements to list
[recorded in
[30]http://www.w3.org/2011/07/23-webrtc-minutes.html#action02]
[NEW] ACTION: Harald to query authors on A15 on what context means
[recorded in
[31]http://www.w3.org/2011/07/23-webrtc-minutes.html#action05]
[NEW] ACTION: John Elwell - propose use case on recording [recorded
in [32]http://www.w3.org/2011/07/23-webrtc-minutes.html#action06]
[NEW] ACTION: Matthew Koffman to send some text around SDP [recorded
in [33]http://www.w3.org/2011/07/23-webrtc-minutes.html#action09]
[NEW] ACTION: Roni Even to find some of the use cases in other WG in
IETF and send to group [recorded in
[34]http://www.w3.org/2011/07/23-webrtc-minutes.html#action07]

[29] http://www.w3.org/2011/07/23-webrtc-minutes.html#action08
[30] http://www.w3.org/2011/07/23-webrtc-minutes.html#action02
[31] http://www.w3.org/2011/07/23-webrtc-minutes.html#action05
[32] http://www.w3.org/2011/07/23-webrtc-minutes.html#action06
[33] http://www.w3.org/2011/07/23-webrtc-minutes.html#action09
[34] http://www.w3.org/2011/07/23-webrtc-minutes.html#action07

[End of minutes]

Received on Sunday, 24 July 2011 21:12:30 UTC