W3C

Web Real-Time Communications Working Group - Quebec City F2F

23 Jul 2011

Agenda

See also: IRC log

Attendees

Present
Harald Alvestrand (hta), Cary Bran, Dan Burnett (dan), Gonzalo Camarillo, Alissa Cooper, Francois Daoust, Dan Druta, John Elwell, Roni Even, Ross Finlayson, Narm Gadiraju, Miguel Garcia, Bert Greevenbosch, Stefan Hakansson, Ted Hardie, Christer Holmberg, Andrew Hutton, Cullen Jennings, Randell Jesup (on IRC), Alan Johnston, Matthew Kaufman, Salvatore Loreto, Daryl Malis?, Xavier Marjou, Colin Perkins, Jon Peterson, Leon Portran, Ram Ravinaranath, Eric Rescorla (ekr), Thomas Roessler, Dan Romascanu, Emile Stephan, Timothy Terriberry (tim), Magnus Westerlund, Bert Wijien
Regrets
Rich Tibbett
Chair
Harald Alvestrand, Stefan Hakansson
Scribe
Francois Daoust, Cullen Jennings, Dan Burnett

Contents

Topics
    WebRTC Architecture
    Use Cases
    Derived API requirements
    Implementation Experience — Google
    Implementation Experience — Mozilla
    Implementation Experience — Cisco
    Implementation Experience — Ericsson
    API Design Questions
    Signaling Issues
    Administrativia
Summary of Action Items

hta: [introduction]. W3C meeting hosted by IETF. W3C rules.
... No Polycom bridge for the conference today.
... Looking for scribes.

[francois and Cullen step up]

WebRTC Architecture

[Harald projects slides on Web RTC architecture]

hta: Going to present goals, architecture layers, security. I won't touch upon details.
... Goal is to enable real-time communication between browsers. Real time means you can wave at someone and they can wave back, on a 100ms timescale.
... Media is audio/video, but people also want to send other stuff too.
... Important to drive the design by use cases
... We have to go for general functions to enable innovation. Use cases cover the least amount of things possible.
... Basic concept: somehow Javascript, with the help of the server, can establish a connection to the other browser.
... Media flows through the shortest possible path for latency and because it makes life simpler.
... Different architecture layers. Apart from the browser, every other box must be assumed to be potentially controlled by an enemy.
... That is a security context that is slightly different from other areas.
... In IETF, we're mostly concerned by attacks on the network.
... Here, we have to take into account all components.
... Data transport means you have to establish some data path. More or less agreed to use ICE.
... UDP is the obvious transport given the constraints (we need to be able to fall back to TCP though). Congestion management is necessary.
... I'll skip rapidly through IETF issues as they will be addressed on Tuesday and Thursday. Focus on API here.
... There will be data framing and securing; we must negotiate data formats, and we need some baseline that everyone implements so the negotiation always succeeds.
... We have use cases for setting up connections that require SIP and others that don't require SIP.
... User interfaces include privacy considerations. The user has to know that he has allowed the use of camera and microphone and must be able to revoke that access at any time.
... In scope for W3C, not so much for IETF.
... Talking about the API, it shouldn't take too many lines of JavaScript to set up a connection and tear down a call [sketch below]. Multiple streams, pictures that jump up, etc. should be possible.
... There are things that are on the wire but are truly relevant for the user.
... In some cases, security demands that they are hidden from the user interface.
... Interoperability requires that it all gets specified.
... If you specify control too precisely, it ages badly, e.g. "I want that precise codec".
... Of course, we have to have interoperability. If you give the same script to two browsers, it should work. Not exactly the same resources because different capabilities are possible, but it should work.
... When data is passed through this API, format has to be specified.
... In some cases, we have blobs that get passed.
... These blobs will be parsed by different browsers though, so they need to know how to parse them.
... Summary slide: Having an overview is a means to ensure that we can talk about different parts of the system and we feel confident that we have all the pieces covered.
... Questions/Comments/Disagreements?
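
[Illustration: a hypothetical sketch of the "few lines of JavaScript" goal mentioned above. The names used (getUserMedia, PeerConnection, addStream, sendToSignalingServer, attachToVideoElement) are assumptions for illustration only, not the draft API.]

    // Hypothetical sketch; all names are illustrative, not the draft API.
    navigator.getUserMedia({ audio: true, video: true }, function (localStream) {
      // The application supplies its own signaling transport to its server.
      var pc = new PeerConnection(function (signalingBlob) {
        sendToSignalingServer(signalingBlob);   // app-defined helper, e.g. XMLHttpRequest or WebSocket
      });
      pc.addStream(localStream);                // hand the local media to the connection
      pc.onaddstream = function (remoteStream) {
        attachToVideoElement(remoteStream);     // app-defined helper to render the remote media
      };
    });
    // Tearing down the call would then be a single pc.close().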

Cullen: that seems consistent with what I think I'm hearing.

EKR: you said precise control ages badly. I'd like to say "higher quality than x/y/z", right?
... Problem is that the notion of "higher quality" depends on codecs and profiles.
... I fear it falls into a rathole of designing a new way of describing codecs and qualities

Matthew: I think legacy interoperability is missing from your slides.

??2: what do you mean by legacy interoperability?

Matthew: I can show you existing devices that do RTP but not SRTP. If you want to talk to non-secure devices, you need to relax the presented bullet that unencrypted data does not need to be carried.

hta: one of the things that someone mentioned is that we need to talk to gateways.

TedHardie: is this the right place to discuss that? Shouldn't this be handled by IETF RTCWEB group on Tuesday/Thursday?

Matthew: I believe it has API implications.
... It's an overview; the overview should talk about legacy systems.

hta: I'll consider including that for Tuesday/Thursday as well.

Francois: asked a question about the architecture and whether we need to resolve it in this WG

hta: If we discover that the W3C perspective results in things needing to change, we should take that change to IETF

<inserted> [quick raise of hands reveals that most of the room follows both IETF and W3C mailing-lists]

Use Cases

Projected slides: Web RTC use cases (PDF, 869KB).

Stefan: presenting use case
... simple use case is two web browsers communicating. One of the browsers is behind a NAT. One link has packet loss
... works with different browsers and OSes
... video windows are resizable

Stefan: can move from Ethernet to WiFi to cellular and the session should survive
... Moving to second use case between two service providers
... case where you must handle two cameras sending video from one browser

Roni: asked question about streaming

Stefan: it is not streaming of the game, it is just the two cameras being sent to the coach
... use case with a mesh of video streams

Colin: question about whether there are NATs in this case

Stefan: yes, there are nats

John: Is there an assumption that the video is the same or is different between peers

Stefan: each peer sends same video to all other peers
... use case with a multi-party online game
... Use case with telco interop with PSTN
... need to be able to place and receive calls to PSTN
... not clear how much gateway functionality would be needed
... In the case of calling FedEx, this adds being able to navigate an IVR

Dan Burnett: brought up that IVR interaction can be voice recognition too

hta: need to tease out the requirements from this use case

Dan Burnett: does not care about telephone use case but if we are going to do it, we should do it right

Colin: are there other scenarios for legacy end points

Stefan: these are the only two cases right now

Roni: brought up need to deal with call center cases

Christer: goal is not to limit to PSTN, it is to connect to SIP

Colin: very different to GW something that uses same media formats vs different media formats

Roni: also different in terms of security
... do we need to know it is secure end-to-end?

hta: In his Google role: worried that we are putting too much concern on interoperability
... telco network is only one concern

Cullen: the FedEx use case. It's not only DTMF. There's the initial prompt. PSTN is not easy. Many attempts to interop with that have failed with FedEx.
... We're very interested with the legacy use case.
... 2.5 billion users out there without Internet connections.

<Venkatesh> I agree with that comment about worrying too much about PSTN.

ekr: There is interop with PSTN, legacy SIP devices, partially-standard devices like WebEx

<Venkatesh> the very same argument was used when other initiatives started and complicated the heck out of the specifications with very little benefits IMO.

stefan: Use case video conference server
... doing simulcast where clients send high and low res video
... central server switches the active speaker high res video to all others plus sends a copy of all low res streams

Dan Burnett: question: we are talking about a display with many people, and each person gets bigger when speaking?

stefan: does not need to get bigger immediately; there can be hysteresis on switching

<Dan> trying to identify the Dan's

<Dan> yep - the Dan in Cullen's notes is not the Dan (Romascanu) in the IRC :-)

<Dan> just call me DanR if I speak

stefan: the server decides which one to display

colin: very different requirements if users get to decide which streams get displayed instead of the server

stefan: This use case is inside an organization and introduces a firewall. People outside the firewall should be able to participate

Derived API requirements

See: WebRTC requirements.

hta: these requirements are only going to be discussed here not in IETF

Dan Burnett: Is A1 asking permission or asking them which one to use ?

Ekr: this is a fundamental invariant; it is the browser that needs to do this

Dan Burnett: in W3C we should use the term User Agent not Browser

ekr: The web application needs to be able to request use of the device. The user agent needs to get consent to allow that
... two ways to do device selection: 1) the application finds the devices and asks the user which one they want to use; 2) the application asks for an audio device and the UA has a way to select one
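
[Illustration: a hedged sketch of the two device-selection models described above; the call shapes (enumerateDevices, getUserMedia, showPickerUI, onStream) are assumptions for illustration only, not the draft API.]

    // Model 1: the application enumerates devices and asks the user to pick.
    enumerateDevices(function (cameras) {              // hypothetical enumeration call
      showPickerUI(cameras, function (chosenCamera) {  // app-drawn chooser
        getUserMedia({ video: { device: chosenCamera.id } }, onStream);
      });
    });

    // Model 2: the application only asks for "a camera"; the user agent
    // presents its own chooser/consent UI and hands back whichever device
    // the user approved.
    getUserMedia({ video: true }, onStream);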

Matthew: useful to be able to preflight the permissions and find out if they would be OK or not

hta: getting close to end of time for this

francois: do we have someone willing to review the requirements?

<scribe> ACTION: DanB to send comments reviewing requirements to list [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action02]

<trackbot> Created ACTION-5 - to send comments reviewing requirements to list [on Daniel Burnett - due 2011-07-30].

Alissa: where are the requirements going to live?

hta: open issue - like to hear comments on this at end

stefan: moving on Security consideration slide

<francois> ISSUE: where are requirements going to live?

<trackbot> Created ISSUE-2 - Where are requirements going to live? ; please complete additional details at http://www.w3.org/2011/04/webrtc/track/issues/2/edit .

John: what about recording of media. Record what is spoken on mic or received at far end ?
... recording local or recording on a device across the network

John: are people interested in this type of use case?

<francois> ACTION: harald to query authors on A15 on what context means [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action05]

<trackbot> Created ACTION-6 - Query authors on A15 on what context means [on Harald Alvestrand - due 2011-07-30].

<scribe> ACTION: John Elwell - propose use case on recording [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action06]

<trackbot> Sorry, couldn't find user - John

<francois> [Note there is no way to action someone who is not a participant in the WG using Tracker]

stefan: asking question about adding other use case

hta: do we want lots of use cases that differ or a use case that encompasses lots of aspects
... what style do people want?

Cullen: slight preference for a use case that encompasses lots of aspects instead of having tens of use cases.

hta: I seem to be outnumbered.

Stefan: same as Cullen

Francois: do we need a use case with screen casting between peers, like VNC?

Roni: There are uses cases in other WG in IETF. For example CLUE and the semantic label.

<scribe> ACTION: Roni Even - find some of the use cases in other WG in IETF and send to group [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action07]

<trackbot> Sorry, couldn't find user - Roni

Dan Druta: We need to look at them from the user perspective. End-to-end user experience is the important thing. There are some use cases that are driven by actors: in our case users, user agents, servers. We need to think that way about this work. Discovery of capabilities and matching two browsers together should be a big one. The timelines of browser development will mandate that we need this.

hta: over time - want to move on

Christer: goal is to come up with use cases that derive new requirements

Tim: like to include music use case

Cullen: in favour of it

hta: on E911, drop for now

Implementation Experience — Google

Projected slides: WebRTC Chrome implementation status (PDF, 188KB).

hta: presenting in his Google role on their implementation in Chrome

hta: goal: going for production-quality code in Chrome for everyone
... used to provide concrete feedback to the API and protocols
... they know that what they ship in the first version will not be what is in the second version
... they have released key components at code.webrtc.org
... working on integrating into Chromium
... adding a WebRTC C++ API that wraps the GIPS code
... WebKit has a "quite rigorous" review process. Specs are very unstable.
... rolling out changes to libjingle, WebKit and more [some missed]
... Got to a working demo with audio and video in the browser
... going to work real soon now

ekr: what does that mean?

hta: can't comment on release dates - a matter of months before it is in production Chromium
... prefixing everything with webrtc to allow for changes to stable system later

Cullen: after you get a version in the production code, is the intention to remain backwards compatible with the API you'll have shipped?

hta: we'll argue more strongly against cosmetic changes, yes. We're open for more important changes.

ekr: will it roll out as command line switch, then no switch?

hta: yes, expect to see stage with switch

Implementation Experience — Mozilla

Tim: mostly been focusing on infrastructure work
... for example, speeding up camera pipeline
... doing a new low latency audio backend
... likely to land in Firefox 8 or 9
... doing Media Stream API for splitting, mixing, synchronization
... allows for the more complex use cases and innovation
... Plans: using GIPS code from Google. First target is a Firefox add-on. Want to do this as it is rapidly evolving.
... Makes it easier to rapidly iterate.
... Target is something production ready in Q1 2012 (just a rough estimate, not a commitment)
... whole bunch of user experience questions, call interrupt, multi-domain conferencing
... been discussing doing SIP directly in browser
... feel this gives you easier way to tie to other devices

Implementation Experience — Cisco

Projected slides: Cisco's WebRTC implementation (PDF, 1.61MB).

cary: started to see whether we can get two browsers to call each other using SIP
... have implemented this in Chromium and Mozilla
... can do browser-to-browser voice and video calls between browsers and between browsers and video phones
... using GIPS
... put the Cisco SIP stack in Chromium by implementing a render host API; also needed to touch the WebKit glue
... Did a Firefox extension focusing on getting the video and voice in
... plan to contribute code to open source projects "soon"

Implementation Experience — Ericsson

Projected slides: PeerConnection implementation experience (PDF, 39KB).

Stefan: working on top of WebKitGTK+
... goal is to learn about the API and how it works, learn about flexibility of the API. We learned it can be implemented with reasonable effort.
... We have sent feedback to the editor of the spec to add things like labels
... there are a bunch of blog posts (URL in slides)
... can demo offline if you want and there is a YouTube video of this

Magnus: How many of you have looked at security issues?

hta: Chrome has a tough security review process and this is going through it

Tim: we have a tough security review process

Cullen: security, what's that? ;) Primary goal was to get something working.

API Design Questions

Projected slides: WebRTC API Design Questions (PDF, 53KB).

Cullen: trying to come up with questions and answers that people in the room may have about things they want to do.
... Looking for feedback on whether we should do this or that. Consensus on things that don't need to be done.

TedHardie: thinking about whether some of the interfaces between the browsers and the OS need to be taken into account

Cullen: Right. Today, I'm going to stay high level, but we'll need to go into much more details later on.
... Design principles: same stuff as said earlier. A simple app does not need to know a lot about underlying things.
... Looking at use cases that enable things.
... Starting with connecting to media: connecting to devices, cameras, microphones.
... Do we have an API to enumerate what the various cameras are on a device?
... Example of laptop with different cameras.
... I'd like some feedback.

hta: one thing that is fairly common is "switching to headset".

ekr: also common that the system picks up the wrong camera. The feature that is imperative is that the user gets the choice.

<anant> switching to headset is taken care of by the OS though (in the most common cases)

ekr: whether it's a web app or a chrome issue is still tbd.

<jesup> tablets: front/rear, etc. May be able to group with user giving permission to use the camera

TedHardie: two cases. One is when you want to set a default. Second is when you want to switch or mix.
... For the enumeration, I do think that the JavaScript needs to be able to query that information from the browser, but not for naming.

Cullen: an API to find out the current list of media devices and some notification mechanism to tell us what modifications there are to that.
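
[Illustration: a sketch of what a list-plus-notification API might look like; names such as enumerateDevices and ondevicechange are invented for illustration, not a proposal.]

    // Hypothetical sketch only.
    enumerateDevices(function (devices) {
      updateDeviceMenu(devices);          // app-defined helper, e.g. populate a camera/mic chooser
    });

    // Notification when a device is plugged in or removed; per the discussion
    // that follows, consent is still needed before a new device is used.
    navigator.ondevicechange = function () {
      enumerateDevices(updateDeviceMenu);
    };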

Dan Druta: that ties with the consent problem.

<jesup> Right: camera/mic plugin/removal. Consent needed for a new device to be used

Ted Hardie: I disagree. The need for consent needs to be on a per call basis.

Dan Druta: I may not want to give permission to an app to see my face, but may be ok for it to see my room.

<jesup> Though a user could (at their option) pre-give consent for a specific device/app combo

Tim: the ability to enumerate the different cameras may raise a security concern as it gives the ability to fingerprint the browser more easily.

Matthew: when you install Skype on a tablet, for instance, you typically enable the app to access cameras.

<jesup> Related issue: naming of cameras - "standard" names vs user input names vs generic names (camera_1, etc)

Cullen: the permission problem is incredibly complex.
... I don't think we have enough to nail down the many ways we may need to access the camera yet.

<jesup> Is the solution to the permission problem part of our spec, or something for each implementation to decide on?

Alissa: thinking about the use case where you may want to use the camera to take still pictures but not to stream video

Stefan: coordination with DAP. We'll handle streams, they will handle still pictures.

<burn> actually, I think Alissa's concern was that this API might be used to record but not stream

Cullen: you should be able to add new cameras/microphones and switch to that at any time.

<Alissa> yeah, capture or record, but not stream

<burn> right, capture. and then presumably do evil.

Cullen: the currently proposed API does not give you much in terms of ICE process.
... The one issue that I want to ask is how do we want to pass credentials?
... Does the JavaScript see the password?

hta: good question on what the model is. Whether it's on the user, browser, or server.

Cullen: [examples of different TURN servers configurations found in the wild]

Matthew: do we need to have calling use cases that involve enterprises?

Cullen: there's one.

<scribe> ACTION: cullen to send a server-provider TURN use case and user-provider TURN use case [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action08]

<trackbot> Created ACTION-7 - Send a server-provider TURN use case and user-provider TURN use case [on Cullen Jennings - due 2011-07-30].

Cullen: other things we could possibly want to be notified about in JS, such as:
... can't gather an address from one of the servers, failure to connect to the TURN server, the other side disconnects,
... etc.
... Each time you get a better path to the other side, knowing about that would help debugging things a lot.
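
[Illustration: the notifications listed above, expressed as hypothetical callbacks on the connection object; the event names are invented for illustration only.]

    // Hypothetical event names on a PeerConnection-like object "pc".
    pc.onicefailure = function (e) {
      // e.g. could not gather an address from a STUN/TURN server,
      // or the connection to the TURN server failed
      console.log("ICE problem: " + e.reason);
    };
    pc.onroutechange = function (e) {
      // a better path to the peer was found (direct vs. relayed);
      // mainly useful for debugging, per the discussion
      console.log("now using " + e.pathType + " path");
    };
    pc.onremoteclose = function () {
      console.log("other side disconnected");
    };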

Ted Hardie: why would we want that other than for debugging?

<burn> Another point Matthew made a moment ago that Cullen wanted captured: may want to know when my (the user's) address changed.

Ted Hardie: If you chose 2 instead of 3 or 4, do you want this to be passed back to the JavaScript?

Matthew: yes, you need that for several purposes

Cullen: to tell people to switch to another NAT, because the current one is evil.

hta: I can imagine that people will say that not passing the address back to JavaScript is actually a security feature.

<gape> +1

Matthew: I can explain why it's a fake security issue.

<jesup> The remote address is trivially available on the wire since data is going peer-to-peer

<derf> Not to the JS.

<jesup> True

[discussion on aggressive/fast/low mode]

Colin: sometimes you want not to use the best possible connectivity, but maybe something below.

Christer: not so much an error, rather a choice when you call the API.

<tedhardie> I'm concerned that the API not force the JS application to deal with this level of detail; after all, some of these applications are simply going to say "sorry, video/audio not available" to the user, where this is an add-on to the basic application (the poker site video use case)

ekr: connectivity check, you're going to want to know whether the connection is direct or through the relay, etc.

Matthew: that's the sort of information you know to be able to say: "your NAT is fine, it's John's NAT that's crappy".

[calling for a 15-minute break. Discussion to continue afterwards]

Signaling Issues

Cullen: for non-ICE signaling, when do you send messages?
... one option: need to add all media codecs before the end of the JavaScript (all at the same time); when the function call returns, signaling is sent
... The other option is the "open" we proposed.
... either add an explicit start-signaling call, or queue up everything and add it at once, which means implicit signaling

Matthew: do it the way everything else does, whatever that is.
... I think browsers do it the implicit way,
... because every time control is returned the browser re-renders

Christer: who is doing negotiation?

Cullen: it is not the JavaScript that does the signaling

EKR: you express opinions to PeerConnection about what you would like, and invisible to JS this happens in the background as necessary

Cullen: some negotiation will happen, done by the browser

Dan Druta: this is early vs. late binding. either give pref in advance or control directly.

Cullen: one way: as you get permission and access to media streams, you gather them up and then put them all in the PeerConnection object at once. Alternatively, you could add them to the PeerConnection one at a time as you get them but not start sending media on any until you say go.
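
[Illustration: a sketch contrasting the two approaches just described; the method names (addStreams, addStream, start) are invented for illustration, not the draft API. micStream and cameraStream are assumed to come from earlier getUserMedia-style calls, and signalingCallback is the app-supplied signaling function.]

    // Approach 1: gather everything first, then hand it all to the
    // PeerConnection at once; signaling is implied by the single call.
    var pc = new PeerConnection(signalingCallback);
    pc.addStreams([micStream, cameraStream]);

    // Approach 2: add streams one at a time as permissions arrive, but
    // nothing is signaled or sent until an explicit "go".
    var pc2 = new PeerConnection(signalingCallback);
    pc2.addStream(micStream);
    pc2.addStream(cameraStream);
    pc2.start();    // explicit trigger for signaling/media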

(missed Matthew comment)

EKR: they are really equivalent

Stefan: should be able to add and remove during the session. Confusing if you have to start a new session.

EKR: JS VM must not start until control has returned from all JS.

Cullen: this is not true of all JS.
... sounds like leaning towards implicit.

Matthew: yes, but treat everything as an add.

Roni: and need delete as well

Cullen: negotiation is implicit
... most of the APIs were leaning towards SIP-style SDP offer/answer; I thought there was consensus there.
... three models: SIP, Jingle, or raw SDP in offer/answer wrapper.
... another variant is an advertise/propose model that I had sent in.

<scribe> ACTION: Matthew to send some text around SDP [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action09]

<trackbot> Sorry, couldn't find user - Matthew

Colin: all payload formats use offer/answer semantics, so keeping that would be helpful.

Matthew: Need to be able to determine what kinds of coders/decoders you have.

hta: have never seen a use case where you need to know which coder/decoder you're using.

Matthew: matters for audio recording. same as determining whether you can do real-time media. if API allows recording of video, need to be able to know how to encode it, resolution, etc.
... maybe other groups might do this, but it needs to be done.
... want to be able to choose from JS which encoding, etc. to use.

(missed comment from Harald on why this is necessary).

hta: JS coder needs to just say "I want to communicate" but not necessarily how.

Matthew: what if the browser is a terminal for a PBX? Want the browser to act more like a Skinny phone than a SIP phone.

Cullen: replace Skinny with MGCP for this discussion. You need to know things about the device. Can't negotiate SDP without knowing additional info.

Roni: there are many parameters, not all are codec-specific. Some params you need to have anyway.

Ted: maybe middle ground is advertise/offer/answer. First send what's available, then offer/answer from then on. You get an informed O/A and can still use O/A.
... gateway should not need to have fundamental semantic shifts. Adv/O/A leaves you with the same semantics as SDP. Should discuss over beer.

Stefan: we need this data to negotiate, but is it part of this API?

JonPeterson: O/A always had the notion of counter-proposal. SDP can describe sessions well but not negotiate. So you can describe a complete session and allow a counter-proposal for something better.

Ted: makes gateways too complex.

Jon: if offer or answer described full session, yes, but it doesn't.

hta: no matter how we do this, we will see JS parsing these negotiation blocks. If we want to support our use cases, this will need to be gatewayed eventually anyway.

Matthew: it's a horrible hack to use PeerConnection to ask for capabilities and parse it in JS, when the API could just support it.

Cullen: let's see a proposal and then discuss.
... already decided to add video mid-call.
... do we need to know when other side is sending?
... nice to know in the UI that connection is being set up and when it's done.
... media in different directions may connect at different times, nice to have notification.

Roni: when you receive the media you know you're getting it. when you send you don't know.

Cullen: right. should there be an API that says that both sides are receiving?
... Will reword this question to be clearer.
... Now let's talk about tracks.
... whatwg API example up on screen
... which kind of media goes in different tracks. when are they in one track, when are they separate.
... I like for them all to be separate.

Matthew: don't like. many encoders can combine stereo channels into one codec on one track

Cullen: I like your metaphor, which is based on the codec.

JohnElwell: when is it a track, and when is it a media stream?

Stefan: stream contains 1 or more tracks. keeping them within one stream helps you with synchronization.

hta: one PeerConnection can be connected to multiple streams, each with multiple tracks.

Cullen: working definition is that if different pieces of media are in the same codec, they are in the same track; if multiple tracks need to be synchronized together, they are in the same media stream.

Magnus: has to do with mapping to RTP sessions
... sync cannot be across sessions.

hta: I thought a media stream mapped to a CNAME, but not sure.

Roni: track and media stream are both logical entities from a W3C perspective, but we need to know how to map them to the IETF level

Cullen: want Magnus to work all of this out
... (joking, mostly)
... Need mapping to AVT, for sure.

(general agreement)

Roni: As long as we talk about logical entities, we don't need to talk RTP or SDP

Cullen: things in one media stream will map to one RTP CNAME. This is how you signal that they are synchronized (rendered together).
... and a track will have a one-to-one correlation with an SSRC in the simple case.
... receiving video, bit rate is being adjusted, should we know the other side is doing this? when the media we're receiving changes in some way, do we want to be notified?

Roni: why would we?

Cullen: may want to change my screen resolution
... for bit rate, if all my streams just dropped their bit rate I may in the JS decide to close some of my streams.

(general agreement that this is useful info)

Christer: if quality is decreasing, for example, could remove video to improve audio.

Daryl Malis?: good to collect and make use of this. My concern is that this info in practice is often used only to decrease the quality of the end result but never to improve it.

Tim: bitrate is a terrible proxy for quality
... maybe everyone stopped moving or talking
... exposing quality info is very codec-specific

Magnus: this is really about providing congestion info, right?

hta: this is difficult to do in real time.
... we can get info on sender's changes.

Cullen: trying to keep this simple, e.g. either sender changed resolution or reduced cap on bandwidth.

Tim: difficult to detect cap on bandwidth

Daryl: with clients using adaptive bitrates, they will lower the rate when nothing's happening and then increase back up when there is motion/sound.

EKR: what we need is a way for the sender to say to the receiver "I'm having to back off here"

Cullen: summary is we like this but it's hard and we don't really know how to do it properly (like packet loss concealment)
... presuming we're going to legacy devices via gateways. Do we have enough signaling info?

Matthew: out of scope.

Cullen: no, for example receiving early media.

Matthew: need SDP for early media.

Cullen: changing from one-way to two-way media.

EKR: where is the call state machine?

Cullen: all current proposals have it in PeerConnection object.

Matthew: this kind of signaling has to happen over the JS channel. It would otherwise prevent many great use cases.

Daryl: instead of this just being about ringing, can we generalize to early media?

hta: impacts FedEx use case.

Matthew: no such thing as early media, just media. There are no signaling implications. What would a Skinny phone do calling FedEx? If it didn't work, is the problem in the phone or elsewhere?

Cullen: other question. You'll want some general option to reject an incoming call based on who's calling.

Matthew: also, how is B notified when A calls B if B is not running his browser?

hta: out of scope

Matthew: we should have use cases that show that this is needed.

Cullen: sounds like "how do I receive calls when my phone is off"?

Matthew: no.

stefan: notifications in scope of the Web notifications WG. We'll follow their conclusions.

Christer: if your browser is not running, you're probably not registered to your SIP provider, so the client will never be able to figure out someone called in the first place.

TedHardie: basically, you need some architecture that allows people to receive notifications when things run in the background.
... It's not an API issue.

Matthew: right, it's a use case issue.

TedHardie: I will send a use case.

hta: rejecting a call should be a matter of not creating a PeerConnection object.

Cullen: the question is do you start your ICE before or after? This is going to be a timing question. My prediction is that ICE processing will be started before.

Matthew: an evil Web site gets your address.

Cullen: I can't force browsers to go to an evil Web site.

Matthew: a Web site that does not want to reveal that information must be able to go through the state machine and make the process happen later.
... It must be possible for a Web site that wishes to protect users' privacy to send JavaScript that has ICE processing happen afterwards.

[ekr made a comment on presence which I missed]

[discussion on "Msg blob" bad naming]

cullen: moving to msg blob issues. We need more or less the SDP message. We need to have the crypto context set up. It means we need the identity.
... We probably need some unique identifier for peer connections.
... Those are the minimum amounts of things I can think of.

ekr: Who's the target of this information? The JavaScript, the PeerConnection?

Cullen: in the simple case, it's going to be relayed. Same thing up, same thing down.
... There will sure be cases when things get manipulated (JavaScript or server)

ekr: what information is carried here?

Matthew: if you have SIP in the browser, you need to get this right.

hta: media negotiation machine needs to be in the browser. The call state machine is not.

Cullen: looking forward to someone splitting media state machine from call state machine that is SIP-mappable.

ekr: re. same message up and message down, do we have consensus there?

Stefan: there should be, as it should be possible to get encryption from endpoint to endpoint.

Cullen: is it possible, in the simplest case to have the server do nothing but relay the message from one side to the other? Do we have consensus on that?
... That's what all proposals have.
... There's always a "you need to send this chunk of data to the other side", but none of the specs say that the server needs to make any update.
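
[Illustration: the "server only relays" case, sketched with a WebSocket signaling channel; the message fields ("to", "blob"), the peerId variable, and the processSignalingMessage method name are assumptions for illustration only.]

    // Client side: pass whatever the PeerConnection emits straight to the
    // server, and feed whatever comes back straight into the PeerConnection.
    var peerId = "other-side";              // assumed identifier supplied by the application
    var channel = new WebSocket("wss://example.org/signaling");
    var pc = new PeerConnection(function (signalingBlob) {
      channel.send(JSON.stringify({ to: peerId, blob: signalingBlob }));
    });
    channel.onmessage = function (event) {
      var msg = JSON.parse(event.data);
      pc.processSignalingMessage(msg.blob); // hypothetical method name
    };
    // The server's only job here is to deliver msg.blob to the peer unmodified.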

Christer: well, at the end of the day, the other side needs to understand what comes in. If you convert between protocols, you may need to adjust the message.

Cullen: let me rephrase the question. Should the format that comes from one side be potentially identical to the one that goes to the other side?

[no pushback heard]

Cullen: final question is the size of the blobs.

hta/Stefan: no limit. Limit is for datagram.

Cullen: ok, so these blobs can be large enough.
... moving on to media issue.
... Question about hints you give when setting up cameras.
... What I'm proposing here: size and spatial vs temporal quality are important (and spoken voice vs non-spoken voice). Clearly needs to evolve over time.
... Some people proposed we'd have none of these things.
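
[Illustration: what such hints could look like if expressed as a dictionary at capture time; every key name here is invented for illustration, not proposed syntax.]

    // Hypothetical hint dictionary.
    var hints = {
      video: {
        width: 320, height: 240,        // screen-size hint
        preferTemporalQuality: true     // favour frame rate over per-frame detail
      },
      audio: {
        content: "speech"               // vs. "music"; affects processing such as AGC/AEC,
                                        // not just codec choice
      }
    };
    getUserMedia(hints, onStream);      // hypothetical call shape and callback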

Roni: Let's assume that we're using SDP. Are you suggesting that we have a separate set of hints that are not part of SDP?

Cullen: this is even at the "which codec should I use" level.

Roni: I assume you can negotiate everything with SDP.

Cullen: The Web browser can. But the JavaScript?

Matthew: everything can be manipulated through JavaScript before it goes out.

Cullen: one end of the range of opinions is that JavaScript ought to be able to construct the SDP offer. The other end is that it ought to be able to do nothing.

hta: no one objected to the idea that screen size should be communicated

Cullen: also rough consensus earlier on on voice/music.

Matthew: the server can strip anything out of the SDP offer/answer as it wishes before transmitting it.

hta: yes, but it can only subset things. It cannot ask for more offers.

Roni: if the Web server does not know how the codecs were chosen in the first place, how is the Web server to make the right choice?

Cullen: if you don't have the info that there's hardware acceleration for one codec, right, indeed.
... Propose to stop here in the interest of time.

Tim: one other point. The audio vs. VoIP distinction has a lot of implications that do not show up in SDP.
... Processing that has no bearing whatsoever on which codec you choose.
... Filtering SDP will never tell the browser to turn off the AGC, AEC, etc.

Administrativia

hta: first, an easy one. Next meeting is going to be during TPAC 2011, in Santa Clara, USA, first week of November.
... We'll call out for a next teleconference through some Doodle poll.

<burn> we could also use a w3c teleconference schedule poll . . .

hta: The interesting question here is how do we get to document our output in a way that is effective, acknowledged, implemented and deployed?
... What we do at the moment is discuss changes we need to bring to the WHATWG spec.

Cullen: we'd have more useful feedback in the group if the group publishes a spec in a W3C space.

Christer: we have one document regarding the requirements.

Dan Burnett: Common to do both. Requirements doc and spec.

Francois: [explaining W3C process]. FPWD triggers call for patent exclusions. Document needs to be in W3C space.

Dan Burnett: one way is to take an existing document as a starting point. The other way is to redo from scratch.

Cullen: from my point of view, critical thing is to have a document.

Alissa: being able to explicitly state where there is no consensus in a document is important.

Dan Burnett: I agree.

Cullen: how many do we have to choose from?
... Only one proposal on the table from actual members of the working group.

hta: I suggest that the chairs continue the discussion and figure out how to solve this.

hta: Any other business?
... Thanks all for showing up!

[meeting adjourned]

Summary of Action Items

[NEW] ACTION: Cullen to send a server-provider TURN use case and user-provider TURN use case [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action08]
[NEW] ACTION: DanB to send comments reviewing requirements to list [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action02]
[NEW] ACTION: Harald to query authors on A15 on what context means [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action05]
[NEW] ACTION: John Elwell - propose use case on recording [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action06]
[NEW] ACTION: Matthew Kaufman to send some text around SDP [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action09]
[NEW] ACTION: Roni Even to find some of the use cases in other WG in IETF and send to group [recorded in http://www.w3.org/2011/07/23-webrtc-minutes.html#action07]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2011/07/25 01:38:38 $