Web Real Time Communications Working Group

05 Oct 2011


See also: IRC log


Harald_Alvestrand, Adam_Bergkvist, Dan_Burnett, Francois_Daoust, Dan_Druta, Christophe_Eyrignoux, Ralph_Giles, Stefan_Håkansson, Cullen_Jennings, Randell_Jesup, Mani_Mahalingam, Tim_Panton, Dan_Romascanu, Neil_Stratford, Tim_Terriberry, Rich_Tibbett, Justin_Uberti


Harald: Agenda sent out on the 27th
... reviewing items

since then, we've added a low-level javascript api proposal

Changes made to the editors draft

cullen: Overview of the changes to the draft
... primary set of changes was around separating the ICE state machine a bit
... Want feedback on the overall model
... ICE state machine checks candidates, and can start using some while still waiting for other candidates; at the same time you have the negotiation with the other endpoint going on
... codecs, ICE parameters, etc
... separated those two apart
... and tried to update the text and semantics around when things change state

Cullen: second change:
... want to redo how we do the data channel and the data api
... commented out what was in the previous draft
... since it didn't have agreement
... will replace it based on discussion today and on the list going forward
... those are the major changes

MediaStream inheritance

Adam: had a discussion on the mailing list about confusing text
... about how you can create a stream from another stream, and if you disable tracks in the parent they affect the derived streams
... the use case was to clone streams e.g. to send a copy of a stream somewhere else
... but you might want different attributes, e.g. to mute the copy you're uploading
... so when you fork a stream, it's better if you create a new stream which is a peer of the original, without any parent/child relationship
... this is easier to understand
... that was the first step
... the second was to be able to create a new stream that contained components gathered from multiple other streams
... e.g. to combine all the audio tracks from multiple incoming streams and record a single file with all the conversation audio
... has implications wrt sync, but we'll take that as a separate discussion
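
As a rough illustration of the two steps Adam describes (forking a stream into a peer with no parent/child link, and gathering tracks from several streams into a new one), a minimal sketch follows. The names SketchTrack, SketchStream, forkStream and combineTracks are hypothetical and are not taken from the editors' draft; the sketch models only the enabled flag and ignores sync.

```typescript
// Hypothetical model types for illustration; not the draft's MediaStream interface.
type Kind = "audio" | "video";

interface SketchTrack {
  kind: Kind;
  label: string;
  enabled: boolean;
}

interface SketchStream {
  tracks: SketchTrack[];
}

// Fork: copy every track object, so the new stream is a peer of the source
// and changes to one stream's tracks do not affect the other.
function forkStream(source: SketchStream): SketchStream {
  return { tracks: source.tracks.map((t) => ({ ...t })) };
}

// Combine: gather copies of all tracks of one kind from several streams.
function combineTracks(sources: SketchStream[], kind: Kind): SketchStream {
  const tracks: SketchTrack[] = [];
  for (const s of sources) {
    for (const t of s.tracks) {
      if (t.kind === kind) tracks.push({ ...t });
    }
  }
  return { tracks };
}

// Usage: mute the audio of the uploaded copy only.
const localStream: SketchStream = {
  tracks: [
    { kind: "audio", label: "mic", enabled: true },
    { kind: "video", label: "cam", enabled: true },
  ],
};
const uploadCopy = forkStream(localStream);
for (const t of uploadCopy.tracks) {
  if (t.kind === "audio") t.enabled = false;
}
```

In this model, muting the copy leaves the original untouched, which is the property the non-inheritance proposal is after.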

Data channel API

harald: next item, assuming people are ok with non-inheritance
... shall we talk about the data channel api for a moment
... the main thing going on is the deletion of that part of the spec
... but the part that has been happening is that eric (sp?)
... has suggested how you could define a protocol for this
... and multiplex multiple streams over the channel
... most of this is happening at the IETF, but what implication does this have for the API

justin: there is a need to have multiple channels
... so it's nice not to do that at the application level
... and also signal reliable vs unreliable

cullen: can we just have a data channel, which you add like an audio or video stream
... regardless of whether it's reliable or not, I care about realtime/low-latency data transmission
... that's the important distinction
... of course if the library does that for me, that's great
... but can always build something

justin: agree

randell: also agree

someone else: requirements address low latency, reliability

harald: requirements are strict on congestion control
... that means we have to throw data away
... so a tradeoff we have to be aware of is that reliable will be slow

tim terriberry: we can't assume the endpoint supports data
... so adding a data stream needs to be able to fail, we should reuse the a/v stream api for that

someone else: you don't even know if the other side supports data right now

harald: in the absence of a draft
... the group is drifting to having a data channel you can add just like a media stream

justin: yes, the other side has a data stream show up just like a/v
... and you can have multiple such data streams with different properties

<fluffy> +1 what Justin said
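
To make the direction of the discussion concrete, here is a minimal sketch of a data stream that is added much like an a/v stream, carries its own reliability property, and can fail when the other endpoint does not support data. Everything in it (SketchPeer, addDataStream, the reliable option, the result shape) is a hypothetical illustration, not an API from any draft, and the negotiation that would really be asynchronous is collapsed into a synchronous check.

```typescript
// Hypothetical types for illustration only; no such API exists in the draft.
interface SketchDataOptions {
  reliable: boolean;
}

interface SketchDataStream {
  label: string;
  reliable: boolean;
}

type AddResult =
  | { ok: true; stream: SketchDataStream }
  | { ok: false; reason: string };

class SketchPeer {
  private remoteSupportsData: boolean;
  private dataStreams: SketchDataStream[] = [];

  constructor(remoteSupportsData: boolean) {
    this.remoteSupportsData = remoteSupportsData;
  }

  // Adding a data stream must be able to fail, as with an a/v stream.
  // A real implementation would negotiate with the peer asynchronously.
  addDataStream(label: string, options: SketchDataOptions): AddResult {
    if (!this.remoteSupportsData) {
      return { ok: false, reason: "remote endpoint does not support data" };
    }
    const stream: SketchDataStream = { label, reliable: options.reliable };
    this.dataStreams.push(stream);
    return { ok: true, stream };
  }

  listDataStreams(): SketchDataStream[] {
    return [...this.dataStreams];
  }
}

// Usage: two data streams with different properties on one connection.
const peer = new SketchPeer(true);
peer.addDataStream("chat", { reliable: true });
peer.addDataStream("game-state", { reliable: false });
```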

question: thoughts on peer streams vs peer tracks

cullen: just make them exactly like a mediastream as much as possible
... one issue is that tracks are SSRC identified, and demuxed that way
... while the data stream may have internal muxing
... the lowest unit of a/v transmission is the track, which is an asymmetry

question: how do you create data streams?

it's strange if the peerconnection is the factory?

harald: could have an .addData() method
... right now we don't have a method of having tracks in a media stream
... so we don't have a symmetric operation of creating data tracks
... right now getUserMedia just makes a stream appear
... and we can add streams to that
... it would be something similar to it
... getUserMedia must be async
... but getDataStream could be sync

Harald: people have argued, we should just have what's effectively capture keyboard and have that be a stream

Next agenda point.

Glare resolution

Cullen: Basic issue is that if we're using browsers with SDP to negotiate
... ignore SIP, etc. for now
... if we're using offer/answer
... and if you and the other side both send offers at the same time (glare)
... SDP doesn't have a way to resolve that
... both sides need to retry
... the way I characterized glare in sip on the IETF thread was sort of wrong but the issue is if we're using SDP we need to handle glare
... if we want to be able to gateway to sip we need to handle it in a way that's compatible with sip's handling and the way sip does it isn't necessarily optimal
... one question here is, will the sip approach change?
... will we invent a backwards-compatible glare resolution which can be used for any SDP protocol?
... the other question is if we want to be able to gateway to SIP wrt glare handling
... either way we have to deal with glare somewhere.
... So, that sets the stage for discussion

Tim Terriberry: what API implications are there?
... your glare resolution had a magic number, do we need to expose that?

cullen: I don't think that has api implications
... the only implication is that things might take longer than you think
... adding a stream will always be async because there's some back and forth there

someone else: if you throw these SDP messages away
... if you find a glare, it would be better to reuse the message
... just a thought

cullen: I think that would be an attempt to resolve this in a better way
... one side backs down, and that would be a faster way to do that but I don't think it affects the api
... I expect to see some drafts or other evolution of the email threads about how we can do glare resolution better, how we can map it.
... it's not as bad as I originally made it sound

harald: no specific comments?

one question for cullen: if we go with this number thing to do faster resolution, would that still work with sip?

cullen: yes. this problem exists in sip as well, so anyone using SDP offer/answer would move to this
... it would have to be an mmusic draft in the IETF
... if you didn't have the number, you'd fall back to something which was compatible with the way older sip devices did it
... but if you had the number it would resolve the glare faster
... maybe even if only one side uses the new resolution it might go faster
... This matters because there's a common metaphor of people starting with voice, and adding video. That escalation is a really common case when you have glare
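
The number-based resolution discussed above could look roughly like the following sketch. The minutes do not say how the number is compared, so the rule used here (the higher number keeps its offer) is an assumption, as are the names SketchOffer, tieBreaker and resolveGlare. When either side lacks the number, the sketch falls back to both sides retrying, which stands in for the behaviour of older devices.

```typescript
// Hypothetical shapes; the tie-breaker field and comparison rule are assumptions.
interface SketchOffer {
  sdp: string;
  tieBreaker?: number; // absent on endpoints that do not use the new resolution
}

type GlareOutcome = "keep-local-offer" | "accept-remote-offer" | "both-retry";

// Called when a remote offer arrives while a local offer is still outstanding.
function resolveGlare(local: SketchOffer, remote: SketchOffer): GlareOutcome {
  // Without a number on both sides, fall back to the older behaviour.
  if (local.tieBreaker === undefined || remote.tieBreaker === undefined) {
    return "both-retry";
  }
  // Equal numbers cannot pick a winner, so both sides retry.
  if (local.tieBreaker === remote.tieBreaker) {
    return "both-retry";
  }
  // Assumed rule: the side with the higher number keeps its offer,
  // the other side backs down and answers instead.
  return local.tieBreaker > remote.tieBreaker
    ? "keep-local-offer"
    : "accept-remote-offer";
}

// Usage: both ends evaluate the same pair and reach complementary decisions.
const mine: SketchOffer = { sdp: "add-video-offer", tieBreaker: 42 };
const theirs: SketchOffer = { sdp: "add-video-offer", tieBreaker: 17 };
const outcomeHere = resolveGlare(mine, theirs);
const outcomeThere = resolveGlare(theirs, mine);
```

Because each end applies the same comparison to the same two numbers, one side backs down without a retry round, which is the speed-up Cullen refers to.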

I wonder if this is premature optimization

jesup: I wonder if we lock ourselves in to glare when we're switching modes, or if there's been an interface change which affects bandwidth

cullen: good questions
... interface changes tend to be one sided, so they don't result in glare
... changes to network are more of a problem
... e.g. congestion causes them to drop video
... we need to design algorithms so both sides don't try to do that at the same time
... cisco telepresence is designed so the two ends aren't likely to switch within a few 100 ms
... that's by design, but so far it hasn't been an issue

Harald: Cullen will work more on the glare problem

Changes to getUserMedia

Harald: we have implementation feedback
... js argument to getUserMedia
... tim's note on json hints for mediastreams
... seems to me these are all reasonable things to do
... modulo suggested modifications
... want to keep getusermedia async
... any other comments?

jesup: As far as the async nature of getUserMedia, it's still async in anant's proposal
... but it's async in a different way
... you get a stream right away but it's inactive until you get an event
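
The pattern jesup describes (a stream object returned immediately, but unusable until an event fires) can be sketched as below. This is not Anant's actual proposal; SketchPendingStream, onActive, markActive and sketchGetUserMedia are hypothetical names, and markActive stands in for whatever the browser would do once the user grants access.

```typescript
// Hypothetical names throughout; a sketch of the pattern, not the proposal itself.
type ActiveListener = () => void;

class SketchPendingStream {
  active = false;
  private listeners: ActiveListener[] = [];

  // Register a listener for the moment the stream becomes usable.
  // If the stream is already active, the listener runs right away.
  onActive(listener: ActiveListener): void {
    if (this.active) {
      listener();
      return;
    }
    this.listeners.push(listener);
  }

  // Stand-in for the browser activating the stream after user consent.
  markActive(): void {
    if (this.active) return;
    this.active = true;
    for (const listener of this.listeners) listener();
    this.listeners = [];
  }
}

// Returns immediately; the caller holds a stream it cannot use yet.
function sketchGetUserMedia(): SketchPendingStream {
  return new SketchPendingStream();
}

// Usage: the application must listen for activation before using the stream.
const pending = sketchGetUserMedia();
const events: string[] = [];
pending.onActive(() => events.push("stream is now usable"));
pending.markActive();
```

The call itself is synchronous, but the application still has to wait for the event, which is why the design remains async in a different way.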

tommy and anant were going back and forth about what the most appropriate design is

Harald: Tommy and Anant will work out a proposal

Adam: I also got into that discussion
... So I think that will continue on the mailing list
... it's a good discussion

Harald: the proposal will actually change the interface to the mediastreamevent and will require that you have a listener on that
... before it was possible to use MediaStreams without using an event listener

Adam: Right, you'll have a stream in a state where you can't use it
... if we make the track list immutable it will help us in the future
... It will make it a lot easier

Harald: you might not get away with that
... in the case of remote streams, you'll have tracks popping up which weren't there before

Adam: right now you have all the tracks, but they're empty

Harald: how is that communicated?

Adam: with addStream signalling

harald: but what you're connecting to doesn't use addStream, they just appear
... Need some kind of gateway to make them look like a peerconnection

Harald: The change to getUserMedia is non-controversial
... but the change to MediaStream is controversial

Tim: I think the issue harald is describing is that if we want to make the mediastream a representation of an RTP CNAME, we have to account for the possibility that a new track is added

Cullen: I don't think we have a very good definition of a track
... Can't have everything in a peerconnection be synchronized

jesup: can't have that; they might come from different sources
... the feeds could be coming from five different devices

Cullen: if you have a gateway
... it may want to tell you that the five streams aren't synchronized

jesup: those streams are inherently unsynchronized

cullen: trying to synchronize them introduces latency and delays them all if one loses a packet

someone else: application asks for video and audio at the same time with getUserMedia, so it can add video later if it wants to

jesup: User may have accepted audio only, so escalating to video requires asking the user again; you can't assume you don't have to call getUserMedia again.

Harald: we're running out of time, can someone take an action item to write down what the media stream is and write to the list

Tim Terriberry: I can do that

Harald: Ok, you got it then

Plans for moving to FPWD by Oct 18

francois: It's a good signal to the rest of the community
... to other groups that we're making progress
... it doesn't have to be complete
... it's a good idea to flag sections which we're still discussing
... apart from that the only thing missing is an abstract

<scribe> ACTION: editors to add an abstract before the next draft [recorded in http://www.w3.org/2011/10/05-webrtc-minutes.html#action01]

francois: we need a resolution that the group agrees to publish a draft

Dan Burnett: can do that on the list

francois: yes, it just has to be recorded somewhere and archived.

Harald: I think it's a good idea

[No objection heard]

Harald: I'll send a note to the list

Inserted agenda point (3 minutes left)

Are some people ok to stay late

(some people are)

Low level API

Neil: based on our experience with phono, we'd like to see the bare minimum in the browser and do the rest of sdp in javascript

Cullen: I think the only thing it would take to convince me this was a good thing
... is a clear idea that this was mapping to an existing signaling only gateway
... my concerns would be the complexity of the js and getting that right
... if it were just audio, I'd be for that, but the parameters with video are complicated
... e.g. the number of macroblocks per second the decoder can process
... there are a bunch of different variables you have to negotiate and many codec-specific constraints
... where's the best place to do that negotiation, browser, or javascript?
... next year, if a better video codec comes out
... I want websites written in a basic way to be able to take advantage of that

jesup: since the parameters are codec-specific
... some of them map to each other
... but when a new codec comes out, there's no prebuilt mapping js can use

cullen: the last time I looked at this, this was the part which was hard and I think that's what we need to look at to figure out what to do

jesup: there's no guarantee websites will upgrade

cullen: e.g. the most prevalent jquery is the first one that really worked
... so many people pick a version and never upgrade

Neil: so we need to prove we have no data in the javascript?

Cullen: we need to talk about the tradeoffs
... let's look at a new video codec, because that's the most complicated case

justin: I'd like to look at RTP, encryption
... I like the flexibility this approach provides
... but I'm concerned it makes it impossible for anyone but experts to do this
... and if it *is* done in the browser, experts can improve the chances of interoperability
... My concern is specifically that at the API if we have an opaque blob generated by the browser
... we'll have a better chance of interoperability than if that blob is generated by a web application because there are fewer things to test

Neil: We could just test/certify the js libraries
... do browsers support downloadable or 3rd party codecs? that raises the same issue

Tim: Safari sort of does, in a limited way, but that's the only one

jesup: this has been considered out of scope

Cullen: downloading a codec wouldn't be enough, you'd need to also download the negotiation logic for the codec

Harald: With the addition of downloaded codecs, we have *three* moving pieces that need to know about each other

Tim: Regardless of the intents of this group to make an api for experts only
... people will use it
... we've seen with video codecs
... that users love starship consoles
... and if you give them knobs to tweak they will tweak them
... regardless of whether it's a good idea or not

Neil: but giving the flexibility to those who want them is a good idea, surely?

Cullen: I like that js can do absolutely anything, but I also want things to be as simple as possible

Justin: can't do both easy and expert interfaces in the same API

jesup: I don't have a problem with a layered api to do that

cullen: if you think the javascript can't do it with the sdp, I think that's why we say "offer answer"
... I think it would be hard to find something you couldn't do by modifying the SDP

Tim Panton: We think it's safer to manipulate the SDP than to have an actual API?

Cullen: well, they'd use a library

jesup: Anything that exposes the encryption keys to js means your media isn't secure
... SDES-SRTP is a problem because of this

Harald: this is covered by the security draft
... To close up the discussion, we don't have consensus for one or the other here
... It occurs to me...
... Can I ask Neil and Cullen to take the action item to figure out this issue?

justin: The other thing I'd like to understand is

<francois> [scribe dropped off the call, some exchanges missed]

Cullen: glad to work with people on this. Doable to do an API on this. It's a trade-off to reach. Complicated cases would need to be mapped to SDP.
... You're re-inventing an alternative to SDP here, make no mistake.
... Watched this several times, and it's not a good path to follow.

DanBurnett: is it reinventing SDP or allowing Web developers who may or may not be experts in codec negotiation to develop alternatives?

Cullen: I can't tell how it's different, but I think you're right.

Tim Panton: In general, the difference is in the intelligence of the client.

Harald: let's take the discussion to the list, Neil and Cullen to drive the discussion.

Liaison with other WG

Stefan: I just want to inform the group of the IETF audio working group
... there was another proposal which was Mozilla proprietary extension to Media streams which could solve this
... and I think there wasn't good progress on this
... we are hoping to have a joint session with the audio working group
... to discuss this further, any questions?

Tim: That sounds like an accurate summary

[Remember to register for TPAC: http://www.w3.org/2002/09/wbs/35125/TPAC2011/ Deadline 14 October]

Cullen: wrt TPAC, I think the other meetings are relatively open
... should I plan on sticking around for those?

DanBurnett: depends on what you want to attend

Cullen: Is there anything you'd recommend, Dan?

DanBurnett: the plenary on wednesday is a good way to find out what the hot topics are
... We're doing potentially the html speech working group on Thursday
... and ultimately there might be a need to mix Interactive Voice Response applications with this group

[On Thursday/Friday: HTML WG, DAP, HTML Speech XG are meeting]

Stefan: Next liaison point with DAP

Rich: Device APIs
... Media Capture api: no work for a year
... hope at TPAC we'll prune this and hand over to WebRTC for getUserMedia
... still got HTML Media Capture but that's not about getting objects but prerecorded audio video
... also web intents, discovery which may be of interest
... meeting Thursday and Friday
... if there's stuff from webrtc which might find a home in dap, we'd welcome that

Harald: we're at the end of the agenda
... only 1.5 hours
... any other business?

Francois: anyone planning to attend remotely? need to schedule if we need a polycom

Justin: I'm hoping to attend in person
... but if not, I'd like to call in

Dan Burnett: I find it very difficult to have remote participants in f2f meetings
... It's ok to have in the room, but I don't think we should assume that's a good option for participants

jesup: what worked at the last IETF meeting was having someone take comments from IRC and speak them at the mic
... which helps a lot with those issues

Francois: If you think you'll attend remotely, let me know by next Wednesday

Harald: Let's close the meeting
... Thanks all!

Summary of Action Items

[NEW] ACTION: editors to add an abstract before the next draft [recorded in http://www.w3.org/2011/10/05-webrtc-minutes.html#action01]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2011/10/06 16:50:27 $