[minutes] WebRTC F2F meeting Quebec City - 23 July 2011

Hi,

The minutes of yesterday's F2F meeting are available at:
  http://www.w3.org/2011/07/23-webrtc-minutes

... and copied as raw text below. Please let me know if something is missing or incorrectly reported there.

A few actions were given during the meeting. The tracker tool used within the group is available at:
  http://www.w3.org/2011/04/webrtc/track/

However, this tool can only track actions assigned to group participants. Here's a summary of actions; please refer to the minutes for more context:
- Cullen to send a server-provider TURN use case and user-provider TURN use case
- DanB to send comments reviewing requirements to list
- Harald to query authors on A15 on what context means
- John Elwell to propose use case on recording
- Matthew Koffman to send some text around SDP
- Roni Even to find some of the use cases in other WG in IETF and send to group

Thanks IETF for hosting the meeting!

Francois.


-----
Web Real-Time Communications Working Group - Quebec City F2F

23 Jul 2011

    [2]Agenda

       [2] http://www.w3.org/2011/04/webrtc/wiki/July_23_2011

    See also: [3]IRC log

       [3] http://www.w3.org/2011/07/23-webrtc-irc

Attendees

    Present
           Stefan_Hakansson, Harald_Alvestrand_(hta), Dan_Burnett_(dan),
           Francois_Daoust, Cullen_Jennings, Gonzalo_Camarillo,
           Ted_Hardie, Emile_Stephan, Roni_Even, Andrew_Hutton,
           Leon_Portran, Alan_Johnston, Ross_Finlayson,
           Ram_Ravinaranath, John_Elwell, ThomasRoessler, Alissa_Cooper,
           Timothy_Terriberry_(tim), Dan_Romascanu, Jon_Peterson,
           Bert_Wijien, Narm_Gadiraju, Xavier_Marjou, Christer_Holmberg,
           Miguel_Garcia, Magnus_Westerlund, Colin_Perkins,
           Salvatore_Loreto, Dan_Druta, Bert_Greevenbosch,
           Matthew_Koffman, Eric_Rescorla_(ekr), Cary_Bran, Daryl_Malis?

    Regrets
           Rich_Tibbett

    Chair
           Harald_Alvestrand, Stefan_Hakansson

    Scribe
           Francois_Daoust, Cullen Jennings, Dan_Burnett

Contents

      * [4]Topics
          1. [5]WebRTC Architecture
          2. [6]Use Cases
          3. [7]Derived API requirements
          4. [8]Implementation Experience — Google
          5. [9]Implementation Experience — Mozilla
          6. [10]Implementation Experience — Cisco
          7. [11]Implementation Experience — Ericsson
          8. [12]API Design Questions
          9. [13]Signaling Issues
         10. [14]Administrativia
      * [15]Summary of Action Items
      _________________________________________________________

    hta: [introduction]. W3C meeting hosted by IETF. W3C rules.
    ... No Polycom for the conference today.
    ... Looking for scribes.

    [francois and Cullen step up]

WebRTC Architecture

    [Harald projects slides on Web RTC architecture]

    hta: Going to present goals, architecture layers, security. I won't
    touch upon details.
    ... Goal is enable realtime Communication between browsers. Real
    Time means you can wave at someone and he can wave back. 100ms
    timescale.
    ... Media is audio/video but people also want to send other stuff
    too.
    ... Important to drive the design by use cases
    ... We have to go for general functions to enable innovations. Use
    cases are least amount of things possible.
    ... Basic concept: somehow Javascript, with the help of the server,
    can establish a connection to the other browser.
    ... Media flows through the shortest possible path for latency and
    because it makes life simpler.
    ... Different architecture layers. Apart from the browser, any other
    box must be assumed to be able to be controlled by an enemy.
    ... That is a security context that is slightly different from in
    other areas.
    ... In IETF, we're mostly concerned by attacks on the network.
    ... Here, we have to take into account all components.
    ... Data transport means you have to establish some data path. More
    or less agreed to use ICE.
    ... UDP is the obvious transport given the constraints (we need to
    be able to backup to TCP though). Congestion management is
    necessary.
    ... I'll skip rapidly through IETF issues as they will be addressed
    on Tuesday and Thursday. Focus on API here.
    ... There will be data framing, securing, we must negotiate data
    formats and we need some baseline that everyone implements for the
    negotiation to always succeed.
    ... We have use cases for setting up connections that require SIP
    and others that don't require SIP.
    ... User interfaces include privacy considerations. The user has to
    know that he has allowed the use of camera and microphone and must
    be able to revoke that access at any time.
    ... In scope for W3C, not so much for IETF.
    ... Talking about API, it shouldn't take too many lines of
    JavaScript to setup a connection and tear down a call. Multiple
    streams, pictures that jump up, etc. should be possible.
    ... There are things that are on the wire but are truly relevant for
    the user.
    ... In some cases, security demands that they are hidden to the user
    interface.
    ... Interoperability requires that it all gets specified.
    ... If you specify control too precisely, it ages badly, e.g. "I
    want that precise codec".
    ... Of course, we have to have interoperability. If you give the
    same script to two browsers, it should work. Not exactly the same
    resources because different capabilities are possible, but it should
    work.
    ... When data is passed through this API, format has to be
    specified.
    ... In some cases, we have blobs that get passed.
    ... These blobs will be parsed by different browsers though, so they
    need to know how to parse them.
    ... Summary slide: Having an overview is a means to ensure that we
    can talk about different parts of the system and we feel confident
    that we have all the pieces covered.
    ... Questions/Comments/Disagreements?

    Cullen: that seems consistent with what I think I'm hearing.

    EKR: you said precise control ages badly. I'd like to say "higher
    quality than x/y/z", right.
    ... Problem is that the notion of "higher quality" depends on codecs
    and profiles.
    ... I fear it falls into a rathole designing a new way of describing
    codecs and qualities

    Matthew_Koffman: I think legacy interoperability is missing from
    your slides.

    ??2: what do you mean with legacy interoperability

    Matthew_Koffman: I can show you existing devices that do RTP but not
    SRTP. If you want to talk to non-secure devices, you need to relax
    the bullet presented that unencrypted data do not need to be
    carried.

    hta: one of the things that someone mentioned is that we need to
    talk to gateways.

    TedHardie: is this the right place to discuss that? Shouldn't this
    be handled by IETF RTCWEB group on Tuesday/Thursday?

    Matthew_Koffman: I believe it has API implications.
    ... It's overview, the overview should talk about legacy system.

    hta: I'll consider including that for Tuesday/Thursday as well.

    Francois: Asked question about architecture and if we need to
    resolve it in this WG

    hta: If we discover that the W3C perspective results in things
    needing to change, we should take that change to IETF

    [quick raise of hands reveals that most of the room follows both
    IETF and W3C mailing-lists]

Use Cases

    Projected slides: [16]Web RTC use cases (PDF, 869KB).

      [16] http://www.w3.org/2011/04/webrtc/wiki/images/4/45/Use_cases_and_reqs_webrtc.pdf

    Stefan: presenting use case
    ... simple use case is two web browsers communicating. One of the
    browsers is behind a NAT. One link has packet loss
    ... works with different browsers and os
    ... video windows are resizable
    ... Can move from ethernet to wifi to cellular and the session
    should survive
    ... Moving to second use case between two service providers
    ... case where you must handle two cameras sending video from one
    browser

    Roni: asked question about streaming

    Stefan: it is not streaming of the game, it is just the two cameras
    being sent to the coach
    ... use case with a mess of video stream

    Colin: asked whether there were NATs in this case

    Stefan: yes, there are NATs

    John: Is there an assumption that the video is the same or is
    different between peers

    Stefan: each peer sends same video to all other peers
    ... use case with multi party on line game
    ... Use case with telco interop with PSTN
    ... need to be able to place and receive calls to PSTN
    ... not clear how much gateway functionality would be needed
    ... In the case of calling FedEx, this adds being able to navigate
    IVR

    Dan_Burnett: brought up IVR interaction can be voice rec too

    hta: need to tease out the requirements from this use case

    Dan_Burnett: does not care about telephone use case but if we are
    going to do it, we should do it right

    Colin: are there other scenarios for legacy end points

    Stefan: these are the only two cases right now

    Roni: brought up need to deal with call center cases

    Christer: goal is not to limit to PSTN, it is to connect to SIP

    Colin: very different to gateway something that uses the same media
    formats vs different media formats

    Roni: Also different in terms of security
    ... do we need to know it is secure end to end

    hta: In his Google role: worried that we are worrying too much about
    interoperability
    ... telco network is only one concern

    Cullen: the Fedex use case. It's not only DTMF. There's the initial
    prompt. PSTN is not easy. Many attempts to interop with that have
    failed with Fedex.
    ... We're very interested with the legacy use case.
    ... 2.5 billion users out there without Internet connections.

    <Venkatesh> I agree with that comment about worrying too much about
    PSTN.

    ekr: There is interop with PSTN, legacy SIP devices, partially
    standard devices like webex

    <Venkatesh> the very same argument was used when other initiatives
    started and complicated the heck out of the specifications with very
    little benefits IMO.

    stefan: Use case video conference server
    ... doing simulcast where clients send high and low res video
    ... central server switches the active speaker high res video to all
    others plus sends a copy of all low res streams

    Dan_Burnett: Q, we are talking about a display with many people,
    plus when speaking each person gets bigger

    stefan: does not need to get bigger immediately, can be hysteresis
    on staying on room
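
    A minimal sketch of the hysteresis Stefan mentions, assuming the
    server already knows which participant is loudest at each instant.
    The class name, the hold time and the participant ids are
    hypothetical and were not discussed in the meeting.

```typescript
// Hypothetical sketch: picks whose high-res stream the server forwards.
class ActiveSpeakerSelector {
  private current: string | null = null;
  private challenger: string | null = null;
  private challengerSinceMs = 0;

  // holdMs: how long a new speaker must stay loudest before the switch.
  constructor(private readonly holdMs: number) {}

  // Report who is loudest now; returns who should be shown in high res.
  update(loudest: string, nowMs: number): string {
    if (this.current === null || loudest === this.current) {
      // First speaker, or the current speaker is still loudest:
      // any pending challenger loses its accumulated time.
      this.current = loudest;
      this.challenger = null;
      return loudest;
    }
    const shown: string = this.current;
    if (loudest !== this.challenger) {
      // A new challenger starts its hold timer from scratch.
      this.challenger = loudest;
      this.challengerSinceMs = nowMs;
    }
    if (nowMs - this.challengerSinceMs >= this.holdMs) {
      this.current = loudest;
      this.challenger = null;
      return loudest;
    }
    return shown;
  }
}
```

    With a hold time of two seconds, a short interjection does not steal
    the large window; only a participant who stays loudest for the whole
    hold time replaces the current speaker.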

    <Dan> trying to identify the Dan's

    <Dan> yep - Dan in Cullen's notes it's not the Dan (Romascanu) in
    the IRC :-)

    <Dan> just call me DanR if I speak

    stefan: the server decides which one to display

    colin: very different requirements if users get to decide what
    streams get displayed instead of server

    stefan: This use case is inside an organization and introduces a
    firewall. People outside the firewall should be able to participate

Derived API requirements

    See: [17]WebRTC requirements.

      [17] http://lists.w3.org/Archives/Public/public-webrtc/2011Jul/att-0008/webrtc_reqs.html

    hta: these requirements are only going to be discussed here not in
    IETF

    Dan_Burnett: Is A1 asking permission or asking them which one to use
    ?

    Ekr: this is a fundamental invariant, the browser needs to do
    this

    Dan_Burnett: in W3C we should use the term User Agent not Browser

    ekr: The web application needs to be able to request use of the
    device. The user agent needs to get consent to allow that
    ... two way to do device selection. 1) application finds the devices
    and asks user which one wants to use 2) application asks for audio
    device and UA has way to select one

    Matthew: useful to be able to preflight the permissions and find out
    if they would be OK or not
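
    A minimal sketch of what such a preflight could look like, assuming
    the user agent remembers consent per origin and per kind of device.
    The ConsentStore class, its methods and the three states are
    hypothetical; nothing like this was specified at the meeting.

```typescript
// Hypothetical sketch of a user-agent-side consent store.
type DeviceKind = "audio" | "video";
type ConsentState = "granted" | "denied" | "prompt";

class ConsentStore {
  private readonly decisions = new Map<string, ConsentState>();

  private static key(origin: string, kind: DeviceKind): string {
    return `${origin}|${kind}`;
  }

  // Preflight: report what a request would yield, without prompting.
  query(origin: string, kind: DeviceKind): ConsentState {
    return this.decisions.get(ConsentStore.key(origin, kind)) ?? "prompt";
  }

  // Called by the user agent once the user has answered a prompt.
  record(origin: string, kind: DeviceKind, granted: boolean): void {
    const state: ConsentState = granted ? "granted" : "denied";
    this.decisions.set(ConsentStore.key(origin, kind), state);
  }

  // The user can withdraw a decision at any time.
  revoke(origin: string, kind: DeviceKind): void {
    this.decisions.delete(ConsentStore.key(origin, kind));
  }
}
```

    An application could call query() before drawing a "start video"
    button; only the user agent, never the page, would call record().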

    hta: getting close to end of time for this

    francois: do we have someone willing to review the requirements?

    <scribe> ACTION: DanB to send comments reviewing requirements to
    list [recorded in
    [18]http://www.w3.org/2011/07/23-webrtc-minutes.html#action02]

      [18] http://www.w3.org/2011/07/23-webrtc-minutes.html#action02

    <trackbot> Created ACTION-5 - to send comments reviewing
    requirements to list [on Daniel Burnett - due 2011-07-30].

    Alissa: where are the requirements going to live?

    hta: open issue - like to hear comments on this at end

    stefan: moving on Security consideration slide

    <francois> ISSUE: where are requirements going to live?

    <trackbot> Created ISSUE-2 - Where are requirements going to live? ;
    please complete additional details at
    [19]http://www.w3.org/2011/04/webrtc/track/issues/2/edit .

      [19] http://www.w3.org/2011/04/webrtc/track/issues/2/edit

    John: what about recording of media. Record what is spoken on mic or
    received at far end ?
    ... recording local or recording on a device across the network

    John: are people interested in this type of use case ?

    <francois> ACTION: harald to query authors on A15 on what context
    means [recorded in
    [20]http://www.w3.org/2011/07/23-webrtc-minutes.html#action05]

      [20] http://www.w3.org/2011/07/23-webrtc-minutes.html#action05

    <trackbot> Created ACTION-6 - Query authors on A15 on what context
    means [on Harald Alvestrand - due 2011-07-30].

    <scribe> ACTION: John Elwell - propose use case on recording
    [recorded in
    [21]http://www.w3.org/2011/07/23-webrtc-minutes.html#action06]

      [21] http://www.w3.org/2011/07/23-webrtc-minutes.html#action06

    <trackbot> Sorry, couldn't find user - John

    <francois> [Note there is no way to action someone who is not a
    participant in the WG using Tracker]

    stefan: asking question about adding other use case

    hta: do we want lots of use cases that differ or a use case that
    encompasses lots of aspects
    ... what style do people want?

    Cullen: slight preference that encompasses lots of aspects instead
    of having tens of use cases.

    hta: I seem to be outnumbered.

    Stefan: same as Cullen

    Francois: do we need a use case with screen casting between peers,
    like VNC?

    Roni: There are uses cases in other WG in IETF. For example CLUE and
    the semantic label.

    <scribe> ACTION: Roni Even - find some of the use cases in other WG
    in IETF and send to group [recorded in
    [22]http://www.w3.org/2011/07/23-webrtc-minutes.html#action07]

      [22] http://www.w3.org/2011/07/23-webrtc-minutes.html#action07

    <trackbot> Sorry, couldn't find user - Roni

    Dan_Druta: We need to look at them from the user perspective. End to
    end user experience is the important thing. There are some use cases
    that are driven by actors: in our case users, user agents, servers.
    We need to think that way about this work. Discovery of capabilities
    and matching two browsers together should be a big one. The
    timelines of browser development will mandate that we need this.

    hta: over time - want to move on

    Christer: goal is to come up with use cases that derive new
    requirements

    Tim: like to include music use case

    Cullen: in favour of it

    hta: on E911, drop for now

Implementation Experience — Google

    Projected slides: [23]WebRTC Chrome implementation status (PDF,
    188KB).

      [23] http://www.w3.org/2011/04/webrtc/wiki/images/7/7f/Webrtc-chrome-impl-status.pdf

    hta: presenting in his google role on their implementations in
    chrome

    hta: goal, going for production quality code in chrome for everyone
    ... used to provide concrete feedback to the API and protocols
    ... they know the version they are shipping in the first version
    will not be what is in second version
    ... they have released key components at code.webrtc.org
    ... working on integrating into Chromium
    ... add a webrtc C++ api that wraps the GIPS code
    ... webkit had a "quite rigorous" review process. Specs are very
    unstable.
    ... rolling out changes to libjingle, webkit and more that the
    scribe missed
    ... Got to a working demo with audio and video in browser
    ... going to work real soon now

    ekr: what does that mean?

    hta: can't comment on release dates - matter of months before it is
    in production chromium
    ... prefixing everything with webrtc to allow for changes to stable
    system later

    Cullen: after you get a version into the production code, is the
    intention to remain backwards compatible with the API you'll have
    shipped?

    hta: we'll argue more strongly against cosmetic changes, yes. We're
    open for more important changes.

    ekr: will it roll out as command line switch, then no switch?

    hta: yes, expect to see stage with switch

Implementation Experience — Mozilla

    Tim: mostly been focusing on infrastructure work
    ... for example, speeding up camera pipeline
    ... doing a new low latency audio backend
    ... likely to land in firefox 8 or 9
    ... doing Media Stream API for splitting, mixing, synchronization
    ... allows for the more complex use cases and innovation
    ... Plans: using GIPS code from google. First target is firefox
    add-on. Want to do this as it is rapidly evolving.
    ... Makes it easier to rapidly iterate.
    ... Target is something production ready in Q1 2012 (just a rough
    estimate, not a commitment)
    ... whole bunch of user experience questions, call interrupt,
    multi domain conferencing
    ... been discussing doing SIP directly in browser
    ... feel this gives you easier way to tie to other devices

Implementation Experience — Cisco

    Projected slides: [24]Cisco's WebRTC implementation (PDF, 1.61MB).

      [24] http://www.w3.org/2011/04/webrtc/wiki/images/b/b1/RTC-Web-Cisco-Implementations.pdf

    cary: started to see can we get two browsers to call each other
    using SIP
    ... have implemented this in Chromium and Mozilla
    ... can do voice and video calls between browsers and between
    browsers and video phones
    ... using GIPS
    ... put Cisco SIP stack into Chromium by implementing a render host
    API; also needed to touch the webkit glue
    ... Did Firefox extension focusing on putting the video and voice
    ... plan to contribute code to open source projects "soon"

Implementation Experience — Ericsson

    Projected slides: [25]PeerConnection implementation experience (PDF,
    39KB).

      [25] http://www.w3.org/2011/04/webrtc/wiki/images/a/aa/Peerconnection-implementation-experience.pdf

    Stefan: working on top of webkitGTK+
    ... goal is to learn about the API and how it works, learn about
    flexibility of API. We learned it can be implemented with reasonable
    effort.
    ... We have sent feedback to editor of spec to add things like label
    ... there are a bunch of blog posts (URL in slides)
    ... can demo offline if you want and there is a youtube video of
    this

    Magnus: How many of you have looked at security issues?

    hta: chrome has tough security review process and this is going
    through it

    Tim: have tough security review process

    Cullen: security, what's that? ;) Primary goal was to get something
    working.

API Design Questions

    Projected slides: [26]WebRTC API Design Questions (PDF, 53KB).

      [26] http://www.w3.org/2011/04/webrtc/wiki/images/4/46/Webrtc-jennings.pdf

    Cullen: trying to come up with questions and answers that people in
    the room may have as things they want to do.
    ... Looking for feedback on whether we should do this or that.
    Consensus on things that don't need to be done.

    TedHardie: thinking about whether some of the interfaces between the
    browsers and the OS need to be taken into account

    Cullen: Right. Today, I'm going to stay high level, but we'll need
    to go into much more details later on.
    ... Design principles: same stuff as said earlier. A simple app does
    not need to know a lot about underlying things.
    ... Looking at use cases that enable things.
    ... Starting with connecting to media: connecting to devices,
    cameras, microphones.
    ... Do we have an API to enumerate what the various cameras are on a
    device?
    ... Example of laptop with different cameras.
    ... I'd like some feedback.

    hta: one thing that is fairly common is "switching to headset".

    ekr: also common that the system picks up the wrong camera. The
    feature that is imperative is that the user gets the choice.

    <anant> switching to headset is taken care of by the OS though (in
    the most common cases)

    ekr: whether it's a web app or a chrome issue is still tbd.

    <jesup> tablets: front/rear, etc. May be able to group with user
    giving permission to use the camera

    TedHardie: two cases. One is when you want to set a default. Second
    is when you want to switch or mix.
    ... For the enumeration, I do think that the JavaScript needs to be
    able to query that information from the browser, but not for naming.

    Cullen: an API to find out the current list of media devices and
    some notifications mechanism to tell us what modifications there are
    to that.
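
    A minimal sketch of the shape Cullen describes (a current list plus
    change notifications), assuming a simple in-memory registry. The
    DeviceRegistry class, its plug/unplug methods and the device labels
    are hypothetical and do not come from any draft discussed here.

```typescript
// Hypothetical sketch: device list with add/remove notifications.
interface MediaDeviceEntry {
  id: string;
  kind: "camera" | "microphone";
  label: string;
}

interface DeviceChange {
  type: "added" | "removed";
  device: MediaDeviceEntry;
}

type DeviceListener = (change: DeviceChange) => void;

class DeviceRegistry {
  private readonly devices = new Map<string, MediaDeviceEntry>();
  private readonly listeners = new Set<DeviceListener>();

  // Snapshot of the devices currently known.
  list(): MediaDeviceEntry[] {
    return Array.from(this.devices.values());
  }

  // Subscribe to changes; the returned function unsubscribes.
  onChange(listener: DeviceListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Called by the platform layer when a device appears.
  plug(device: MediaDeviceEntry): void {
    if (this.devices.has(device.id)) return;
    this.devices.set(device.id, device);
    this.notify({ type: "added", device });
  }

  // Called by the platform layer when a device disappears.
  unplug(id: string): void {
    const device = this.devices.get(id);
    if (!device) return;
    this.devices.delete(id);
    this.notify({ type: "removed", device });
  }

  private notify(change: DeviceChange): void {
    this.listeners.forEach((listener) => listener(change));
  }
}
```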

    Dan_Druta: that ties with the consent problem.

    <jesup> Right: camera/mic plugin/removal. Consent needed for a new
    device to be used

    TedHardie: I disagree. The need for consent needs to be on a per
    call basis.

    Dan_Druta: I may not want to give permission to an app to see my
    face, but may be ok for it to see my room.

    <jesup> Though a user could (at their option) pre-give consent for a
    specific device/app combo

    Tim: the ability to enumerate the different cameras may raise a
    security concern as it gives the ability to fingerprint the browser
    more easily.

    MatthewKoffman: when you install Skype on a tablet, for instance,
    you typically enable the app to access cameras.

    <jesup> Related issue: naming of cameras - "standard" names vs user
    input names vs generic names (camera_1, etc)

    Cullen: the permission problem is incredibly complex.
    ... I don't think we have enough to nail down the many ways we may
    need to access the camera yet.

    <jesup> Is the solution to the permission problem part of our spec,
    or something for each implementation to decide on?

    Alissa: thinking about the use case where you may want to use the
    camera to take still pictures but not to stream video

    Stefan: coordination with DAP. We'll handle streams, they will
    handle still pictures.

    <burn> actually, I think Alissa's concern was that this API might be
    used to record but not stream

    Cullen: you should be able to add new cameras/microphones and switch
    to that at any time.

    <Alissa> yeah, capture or record, but not stream

    <burn> right, capture. and then presumably do evil.

    Cullen: the currently proposed API does not give you much in terms
    of ICE process.
    ... The one issue that I want to ask is how do we want to pass
    credentials?
    ... Does the JavaScript see the password?

    hta: good question on what the model is. Whether it's on the user,
    browser, or server.

    Cullen: [examples of different TURN servers configurations found in
    the wild]

    MatthewKoffman: do we need to have calling use cases that involve
    enterprises?

    Cullen: there's one.

    <scribe> ACTION: cullen to send a server-provider TURN use case and
    user-provider TURN use case [recorded in
    [27]http://www.w3.org/2011/07/23-webrtc-minutes.html#action08]

      [27] http://www.w3.org/2011/07/23-webrtc-minutes.html#action08

    <trackbot> Created ACTION-7 - Send a server-provider TURN use case
    and user-provider TURN use case [on Cullen Jennings - due
    2011-07-30].

    Cullen: other things we could possibly want to be notified in JS
    about such as:
    ... can't gather address from one of servers, fail to connect to
    TURN server, other side disconnects.
    ... etc.
    ... Each time you get a better path to the other side, knowing about
    that would help debugging things a lot.

    TedHardie: why would we want that other than for debugging?

    <burn> Another point Matthew made a moment ago that Cullen wanted
    captured: may want to know when my (the user's) address changed.

    TedHardie: If you chose 2 instead of 3 or 4, do you want this to be
    passed back to the JavaScript?

    Matthew_Koffman: yes, you need that for several purposes

    Cullen: to tell people to switch to another NAT, because the current
    one is evil.

    hta: I can imagine that people will say that not passing the address
    back to JavaScript is actually a security feature.

    <gape> +1

    Matthew_Koffman: I can explain why it's a fake security issue.

    <jesup> The remote address is trivially available on the wire since
    data is going peer-to-peer

    <derf> Not to the JS.

    <jesup> True

    [discussion on aggressive/fast/low mode]

    Colin: sometimes you want not to use the best possible connectivity,
    but maybe something below.

    Christer: not so much an error, rather a choice when you call the
    API.

    <tedhardie> I'm concerned that the API not force the JS application
    to deal with this level of detail; after all, some of these
    applications are simply going to say "sorry, video/audio not
    available" to the user, where this is an add-on to the basic
    application (the poker site video use case)

    ekr: connectivity check, you're going to want to know whether the
    connection is direct or through the relay, etc.

    MatthewKoffman: that's the sort of information you need to know to
    be able to say: "your NAT is fine, it's John's NAT that's crappy".

    [calling for a 15mn break. Discussion to continue afterwards]

Signaling Issues

    Cullen: for non-ICE signaling, when do you send messages?
    ... need to add all media codecs before end of javascript (all at
    same time). when function call returns, signaling is sent
    ... Other option is "open" we proposed.
    ... either add explicit start signaling, or queue up everything and
    add at once which means implicit signaling

    Matthew: do it the way everything else does, whatever that is.
    ... I think browsers do it implicit way.
    ... because every time control is returned it re-renders
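
    A minimal sketch of the implicit option, assuming changes are
    queued while script runs and sent as one batch once control
    returns. The BatchedNegotiator class and its scheduler parameter
    are hypothetical; the default scheduler uses a resolved promise to
    run after the current script has finished.

```typescript
// Hypothetical sketch: batch changes, signal once control returns.
type Scheduler = (task: () => void) => void;

class BatchedNegotiator {
  private pending: string[] = [];
  private scheduled = false;

  constructor(
    // Invoked once per batch with every change queued so far.
    private readonly send: (changes: string[]) => void,
    // Defers the flush until the calling script has returned.
    private readonly schedule: Scheduler = (task) => {
      Promise.resolve().then(task);
    },
  ) {}

  // Queue a change; signaling is not sent from inside this call.
  add(change: string): void {
    this.pending.push(change);
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => this.flush());
    }
  }

  private flush(): void {
    const batch = this.pending;
    this.pending = [];
    this.scheduled = false;
    this.send(batch);
  }
}
```

    Adding audio and video in the same turn then produces one signaling
    message instead of two, without an explicit "start" call.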

    Christer: who is doing negotiation?

    Cullen: not javascript that does signaling

    EKR: you express opinions to PeerConnection about what you would
    like, and invisible to JS this happens in the background as
    necessary

    Cullen: some negotiation will happen, done by the browser

    Dan_Druta: this is early vs. late binding. either give pref in
    advance or control directly.

    Cullen: one way as you get permission and access to media streams,
    you gather up and then put all in the PeerConnection object at once.
    alternatively, could add to PeerConnection one at a time as you get
    them but don't start sending media on any until you say go.

    (missed Matthew comment)

    EKR: they are really equivalent

    Stefan: should be able to add and remove during session. confusing
    if you have to start session.

    EKR: JS VM must not start until control has returned from all JS.

    Cullen: this is not true of all JS.
    ... sounds like leaning towards implicit.

    Matthew: yes, but treat everything as an add.

    Roni: and need delete as well

    Cullen: negotiation is implicit
    ... most of the APIs were leaning towards SIP-style SDP
    offer/answer, thought there was consensus there.
    ... three models: SIP, Jingle, or raw SDP in offer/answer wrapper.
    ... another variant is an advertise/propose model that I had sent
    in.

    <scribe> ACTION: Matthew to send some text around SDP [recorded in
    [28]http://www.w3.org/2011/07/23-webrtc-minutes.html#action09]

      [28] http://www.w3.org/2011/07/23-webrtc-minutes.html#action09

    <trackbot> Sorry, couldn't find user - Matthew

    Colin: all payload formats use offer/answer semantics, so keeping
    that would be helpful.

    Matthew: Need to be able to determine what kinds of coders/decoders
    you have.

    hta: have never seen a use case where you need to know which
    coder/decoder you're using.

    Matthew: matters for audio recording. same as determining whether
    you can do real-time media. if API allows recording of video, need
    to be able to know how to encode it, resolution, etc.
    ... maybe other groups might do this, but it needs to be done.
    ... want to be able to choose from JS which encoding, etc. to use.

    (missed comment from Harald on why this is necessary).

    hta: JS coder needs to just say "I want to communicate" but not
    necessarily how.

    Matthew: what if browser is a terminal for PBX. want browser to act
    more like Skinny phone than SIP phone.

    Cullen: replace skinny with MGCP for this discussion. you need to
    know things about device. can't negotiate SDP without knowing
    additional info.

    Roni: there are many parameters, not all are codec-specific. Some
    params you need to have anyway.

    Ted: maybe middle ground is advertise/offer/answer. First send
    what's available, then offer/answer from then on. You get an
    informed O/A and can still use O/A.
    ... gateway should not need to have fundamental semantic shifts.
    Adv/O/A leaves you with the same semantics as SDP. Should discuss
    over beer.
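
    A minimal sketch of the advertise/offer/answer idea, assuming
    capabilities are just ordered lists of codec names. The types,
    function names and codec names are hypothetical placeholders; real
    negotiation carries many more parameters, as Roni notes above.

```typescript
// Hypothetical: what each side advertises before any offer is made.
interface Capabilities {
  codecs: string[]; // in order of local preference
}

// Offerer: propose only what the peer advertised, in local order.
function buildOffer(local: Capabilities, peer: Capabilities): string[] {
  return local.codecs.filter((c) => peer.codecs.indexOf(c) !== -1);
}

// Answerer: accept the first offered codec it supports, if any.
function buildAnswer(offer: string[], local: Capabilities): string | null {
  for (const codec of offer) {
    if (local.codecs.indexOf(codec) !== -1) return codec;
  }
  return null;
}
```

    Because the offer is built from the advertisement, the answer stays
    a plain offer/answer exchange and should rarely be a rejection.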

    Stefan: we need this data to negotiate, but is it part of this API?

    JonPeterson: O/A always had the notion of counter-proposal. SDP can
    describe sessions well but not negotiate. So you can describe a
    complete session and allow a counter-proposal for something better.

    Ted: makes gateways too complex.

    Jon: if offer or answer described full session, yes, but it doesn't.

    hta: no matter how we do this, we will see JS parsing these
    negotiation blocks. If we want to support our use cases, this will
    need to be gatewayed eventually anyway.

    Matthew: it's a horrible hack to use PeerConnection to ask for
    capabilities and parse it in JS, when the API could just support it.

    Cullen: let's see a proposal and then discuss.
    ... already decided to add video mid-call.
    ... do we need to know when other side is sending?
    ... nice to know in the UI that connection is being set up and when
    it's done.
    ... media in different directions may connect at different times,
    nice to have notification.

    Roni: when you receive the media you know you're getting it. when
    you send you don't know.

    Cullen: right. should there be an API that says that both sides are
    receiving?
    ... Will reword this question to be clearer.
    ... Now let's talk about tracks.
    ... whatwg API example up on screen
    ... which kind of media goes in different tracks. when are they in
    one track, when are they separate.
    ... I like for them all to be separate.

    Matthew: don't like. many encoders can combine stereo channels into
    one codec on one track

    Cullen: I like your metaphor, which is based on the codec.

    JohnElwell: when is it a track, and when is it a media stream?

    Stefan: stream contains 1 or more tracks. keeping them within one
    stream helps you with synchronization.

    hta: one PeerConnection can be connected to multiple streams, each
    with multiple tracks.

    Cullen: working definition is that if different pieces of media are
    in same codec, they are to be in same track. if multiple tracks need
    to be synchronized together, they are in the same media stream.
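
    [Illustrative sketch, not presented at the meeting: one way to model
    the working definition above, assuming media pieces that share a
    codec are grouped into one track and tracks that must stay in sync
    share one stream. The class names, channel names and codec labels
    are hypothetical and are not taken from any proposal.]

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Track:
    # Hypothetical model: a track holds every media piece carried by one codec.
    codec: str
    channels: List[str]


@dataclass
class MediaStream:
    # Hypothetical model: tracks in the same stream are meant to be synchronized.
    label: str
    tracks: List[Track] = field(default_factory=list)


def group_into_tracks(sources: List[Tuple[str, str]]) -> List[Track]:
    """Group (channel, codec) pairs so that pieces sharing a codec share a track."""
    by_codec = {}
    for channel, codec in sources:
        by_codec.setdefault(codec, []).append(channel)
    return [Track(codec=codec, channels=channels)
            for codec, channels in by_codec.items()]


# Stereo audio shares one codec, so both channels land in one track;
# the video gets its own track; all of it sits in one synchronized stream.
stream = MediaStream(
    label="camera-and-mic",
    tracks=group_into_tracks([
        ("audio-left", "stereo-audio-codec"),
        ("audio-right", "stereo-audio-codec"),
        ("video", "video-codec"),
    ]),
)
```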

    Magnus: has to do with mapping to RTP sessions
    ... sync cannot be across sessions.

    hta: i thought media stream mapped to cname, but not sure.

    Roni: track and media stream are both logical entities from a W3C
    perspective. But we need to know how to map to IETF level.

    Cullen: want Magnus to work all of this out
    ... (joking, mostly)
    ... Need mapping to AVT, for sure.

    (general agreement)

    Roni: As long as we talk about logical entities, we don't need to
    talk RTP or SDP

    Cullen: things in one media stream will map to one RTP c-name. This
    is how you signal that they are synchronized (rendered together).
    ... and a track will have a one-to-one correlation with an SSRC in
    the simple case.
    ... receiving video, bit rate is being adjusted, should we know the
    other side is doing this? when the media we're receiving changes in
    some way, do we want to be notified?
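
    [Illustrative sketch, not presented at the meeting: the mapping
    Cullen describes at the start of this turn, assuming every track in
    a media stream shares that stream's RTP CNAME and each track gets
    its own SSRC (the simple case). The function names, the CNAME format
    and the counter used for SSRCs are hypothetical; real SSRCs are
    randomly chosen 32-bit values.]

```python
import itertools
from typing import Dict, List, Tuple


def assign_rtp_identifiers(
    streams: Dict[str, List[str]]
) -> Dict[str, Tuple[str, int]]:
    """Map each track name to a (cname, ssrc) pair.

    `streams` maps a stream label to the names of its tracks.
    """
    ssrc_source = itertools.count(1000)  # stand-in for random 32-bit SSRCs
    mapping = {}
    for label, tracks in streams.items():
        cname = f"{label}@example.invalid"  # one CNAME per media stream
        for track in tracks:
            mapping[track] = (cname, next(ssrc_source))  # one SSRC per track
    return mapping


def rendered_together(mapping: Dict[str, Tuple[str, int]], a: str, b: str) -> bool:
    """Two tracks are signalled as synchronized when they share a CNAME."""
    return mapping[a][0] == mapping[b][0]


ids = assign_rtp_identifiers({
    "talking-head": ["mic", "camera"],
    "presentation": ["slides"],
})
```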

    Roni: why would we?

    Cullen: may want to change my screen resolution
    ... for bit rate, if all my streams just dropped their bit rate I
    may in the JS decide to close some of my streams.

    (general agreement that this is useful info)

    Christer: if quality is decreasing, for example, could remove video
    to improve audio.

    Daryl_Malis?: good to collect and make use of this. My concern is
    that this info in practice is often used only to decrease quality of
    the end result but never improve.

    Tim: bitrate is a terrible proxy for quality
    ... maybe everyone stopped moving or talking
    ... exposing quality info is very codec-specific

    Magnus: this is really about providing congestion info, right?

    hta: this is difficult to do in real time.
    ... we can get info on sender's changes.

    Cullen: trying to keep this simple, e.g. either sender changed
    resolution or reduced cap on bandwidth.

    Tim: difficult to detect cap on bandwidth

    Daryl: with clients using adaptive bitrates, they will lower the
    rate when nothing's happening and then increase back up when there
    is motion/sound.

    EKR: what we need is a way for the sender to say to the receiver
    "I'm having to back off here"

    Cullen: summary is we like this but it's hard and we don't really
    know how to do it properly (like packet loss concealment)
    ... presuming going to legacy devices via gateways. Do we have
    enough signaling info?

    Matthew: out of scope.

    Cullen: no, for example receiving early media.

    Matthew: need SDP for early media.

    Cullen: changing from one-way to two-way media.

    EKR: where is the call state machine?

    Cullen: all current proposals have it in PeerConnection object.

    Matthew: this kind of signaling has to happen over the JS channel.
    It would otherwise prevent many great use cases.

    Daryl: instead of this just being about ringing, can we generalize
    to early media?

    hta: impacts FedEx use case.

    Matthew: no such thing as early media, just media. There are no
    signaling implications. what would a skinny phone do calling fedex?
    if it didn't work, is the problem in the phone or elsewhere?

    Cullen: other question. You'll want some general option to reject an
    incoming call based on who's calling.

    Matthew: also, how's B notified when A calls B if B does not run his
    browser?

    hta: out of scope

    Matthew: we should have use cases that show that this is needed.

    Cullen: sounds like "how do I receive calls when my phone is off"?

    Matthew: no.

    stefan: notifications in scope of the Web notifications WG. We'll
    follow their conclusions.

    Christer: if your browser is not running, you're probably not
    registered to your SIP provider, so the client will never be able to
    figure out someone called in the first place.

    TedHardie: basically, you need some architecture that allows people
    to receive notifications when things run in the background.
    ... It's not an API issue.

    Matthew: right, it's a use case issue.

    TedHardie: I will send a use case.

    hta: rejecting a call should be a matter of not creating a
    PeerConnection object.

    Cullen: question is do you start your ICE before or after? This is
    going to make a timing question. My prediction is that ICE
    processing will be started before.

    Matthew: an evil Web site gets your address.

    Cullen: I can't force browsers to go to an evil browser.

    Matthew: a Web site that does not want to reveal that information
    must be able to go through the state machine and make the process
    happen later.
    ... It must be possible for a Web site that wishes to protect users'
    privacy to send JavaScript that has ICE processing happen after.

    [ekr made a comment on presence which I missed]

    [discussion on "Msg blob" bad naming]

    cullen: moving to msg blob issues. We need more or less the SDP
    message. We need to have crypto context set up. It means we need the
    identity.
    ... We probably need some unique identifier for peer connections.
    ... Those are the minimum amounts of things I can think of.

    ekr: Who's the target of this information? The JavaScript, the Peer
    connection?

    Cullen: in the simple case, it's going to be relayed. Same thing up,
    same thing down.
    ... There will surely be cases when things get manipulated
    (by JavaScript or by the server).

    ekr: what information is carried here?

    Matthew: if you have SIP in the browser, you need to get this right.

    hta: media negotiation machine needs to be in the browser. The call
    state machine is not.

    Cullen: looking forward to someone splitting media state machine
    from call state machine that is SIP-mappable.

    ekr: re. same message up and message down, do we have consensus
    there?

    Stefan: there should be as it should be possible to get encryption
    from endpoint to endpoint.

    Cullen: is it possible, in the simplest case to have the server do
    nothing but relay the message from one side to the other? Do we have
    consensus on that?
    ... That's what all proposals have.
    ... There's always a "you need to send this chunk of data to the
    other side", but none of the specs says that the server needs to
    make any update.

    Christer: well, at the end of the day, the other side needs to
    understand what comes in. If you convert between protocols, you may
    need to adjust the message.

    Cullen: let me rephrase the question. Should the format that comes
    from one side be potentially identical to the one that goes to the
    other side?

    [no pushback heard]
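
    [Illustrative sketch, not presented at the meeting: the simplest
    case discussed above, assuming a server that does nothing but hand
    the opaque blob from one peer to the other. The class, method names
    and blob contents are hypothetical; the actual blob format was still
    under discussion.]

```python
from typing import Dict, List


class RelayServer:
    """Toy signaling relay that never inspects or rewrites a blob."""

    def __init__(self) -> None:
        self.inboxes: Dict[str, List[bytes]] = {}

    def register(self, peer_id: str) -> None:
        self.inboxes[peer_id] = []

    def relay(self, to_peer: str, blob: bytes) -> None:
        # Same thing up, same thing down: the blob is stored byte for byte.
        self.inboxes[to_peer].append(blob)

    def fetch(self, peer_id: str) -> List[bytes]:
        # Hand over everything queued for this peer and empty the inbox.
        pending, self.inboxes[peer_id] = self.inboxes[peer_id], []
        return pending


server = RelayServer()
server.register("alice")
server.register("bob")

sent = b"opaque-negotiation-blob"  # placeholder payload, not a real format
server.relay("bob", sent)
received = server.fetch("bob")
```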

    Cullen: final question is the size of the blobs.

    hta/Stefan: no limit. Limit is for datagram.

    Cullen: ok, so these blobs can be large enough.
    ... moving on to media issue.
    ... Question about hints you give when setting up cameras.
    ... What I'm proposing here are size, which of spatial vs temporal
    quality is important, and spoken voice vs non-spoken voice. Clearly
    needs to evolve over time.
    ... Some people proposed we'd have none of these things.

    Roni: Let's assume that we're using SDP. Are you suggesting that we
    have a separate set of hints that are not part of SDP?

    Cullen: this even bears on the question of which codec I should use.

    Roni: I assume you can negotiate everything with SDP.

    Cullen: The Web browser can. But the JavaScript?

    Matthew: everything can be manipulated through JavaScript before it
    goes out.

    Cullen: one end of the range of opinions is that JavaScript ought to
    be able to construct the SDP offer. The other end is that it ought
    to be able to do nothing.

    hta: no one objected to the idea that screen size should be
    communicated

    Cullen: also rough consensus earlier on voice/music.

    Matthew: server can strip out any SDP offer/answer as it wishes
    before transmitting it.

    hta: yes, but it can only subset things. It cannot ask for more
    offers.
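
    [Illustrative sketch, not presented at the meeting: the point made
    above, assuming the offer is reduced to an ordered list of codec
    names. A server that filters the list can remove entries but cannot
    make the browser offer something it did not list. The function name
    and codec labels are hypothetical, and no real SDP parsing is
    attempted.]

```python
from typing import Iterable, List


def filter_offer(offered_codecs: List[str], allowed: Iterable[str]) -> List[str]:
    """Keep only the offered codecs the server allows, in the offered order."""
    allowed_set = set(allowed)
    return [codec for codec in offered_codecs if codec in allowed_set]


browser_offer = ["codec-a", "codec-b", "codec-c"]

# The server's policy names "codec-d", but the browser never offered it,
# so it cannot appear in the result: filtering can only produce a subset.
filtered = filter_offer(browser_offer, ["codec-b", "codec-d"])
```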

    Roni: if the Web server does not know how the codecs were chosen in
    the first place, how is the Web server to make the right choice?

    Cullen: if you don't have the info that there's hardware
    acceleration for one codec, right, indeed.
    ... Propose to stop here in the interest of time.

    Tim: one other point. The audio vs. voip has a lot of implications
    that do not show in SDP.
    ... Processing that has no bearing whatsoever on what codec you
    choose.
    ... Filtering SDP will never tell the browser to turn off the AGC,
    AEC, etc.

Administrivia

    hta: first, an easy one. Next meeting is going to be during TPAC
    2011, in Santa Clara, USA, first week of November.
    ... We'll call out for a next teleconference through some Doodle
    poll.

    <burn> we could also use a w3c teleconference schedule poll . . .

    hta: The interesting question here is how do we get to document our
    output in a way that is effective, acknowledged, implemented and
    deployed?
    ... What we do at the moment is discuss changes we need to bring to
    the WHATWG spec.

    cullen: we'd have more useful feedback in the group if the group
    publishes a spec in a W3C space.

    Christer: we have one document regarding the requirements.

    burn: Common to do both. Requirements doc and spec.

    francois: [explaining W3C process]. FPWD triggers call for patent
    exclusions. Document needs to be in W3C space.

    Dan_Burnett: one way is to take a starting point. Other way is to
    redo from scratch.

    Cullen: from my point of view, critical thing is to have a document.

    Alissa: being able to explicitly state where there is no consensus
    in a document is important.

    Dan_Burnett: I agree.

    Cullen: how many do we have to choose from?
    ... Only one proposal on the table from actual members of the
    working group.

    hta: I suggest that the chairs continue the discussion and figure
    out how to solve this.

    hta: Any other business?
    ... Thanks all for showing up!

    [meeting adjourned]

Summary of Action Items

    [NEW] ACTION: Cullen to send a server-provider TURN use case and
    user-provider TURN use case [recorded in
    [29]http://www.w3.org/2011/07/23-webrtc-minutes.html#action08]
    [NEW] ACTION: DanB to send comments reviewing requirements to list
    [recorded in
    [30]http://www.w3.org/2011/07/23-webrtc-minutes.html#action02]
    [NEW] ACTION: Harald to query authors on A15 on what context means
    [recorded in
    [31]http://www.w3.org/2011/07/23-webrtc-minutes.html#action05]
    [NEW] ACTION: John Elwell - propose use case on recording [recorded
    in [32]http://www.w3.org/2011/07/23-webrtc-minutes.html#action06]
    [NEW] ACTION: Matthew Koffman to send some text around SDP [recorded
    in [33]http://www.w3.org/2011/07/23-webrtc-minutes.html#action09]
    [NEW] ACTION: Roni Even to find some of the use cases in other WG in
    IETF and send to group [recorded in
    [34]http://www.w3.org/2011/07/23-webrtc-minutes.html#action07]

      [29] http://www.w3.org/2011/07/23-webrtc-minutes.html#action08
      [30] http://www.w3.org/2011/07/23-webrtc-minutes.html#action02
      [31] http://www.w3.org/2011/07/23-webrtc-minutes.html#action05
      [32] http://www.w3.org/2011/07/23-webrtc-minutes.html#action06
      [33] http://www.w3.org/2011/07/23-webrtc-minutes.html#action09
      [34] http://www.w3.org/2011/07/23-webrtc-minutes.html#action07

    [End of minutes]

Received on Sunday, 24 July 2011 21:12:30 UTC