Media Capture Task Force F2F Meeting -- 05 Feb 2013

<trackbot> Date: 05 February 2013

<dom> ScribeNick: ekr

martin: proposes slides...

<stefanh> Martin's slides: http://www.w3.org/wiki/images/7/7f/Device_Enumeration.pdf

<dom> Device Enumeration presentation

requirement: two different sites get different identifiers

<dom> ScribeNick: dom

ekr: what's wrong with just numbers?

burn: they can't have the same meaning across sites

[several]: they may be the same, but there is no guarantee they are

burn: it's important that browsers don't implement it in a way that, in practice, lets people rely on it

<scribe> ScribeNick: ekr

martin: monotonically increasing is a management exercise per browser.

juberti: this is all without any permissions. a site can find out how many audio and video devices you have?

martin: yes
... are people comfortable with the privacy properties and is this a valuable function?

fluffy: this is the right thing to do, but I think you also need to be able to ask for a human-readable string that might be used to identify the device.
... this adds fingerprinting surface.

martin: privacy-preserving option would be to make this available to the site only after granting consent.

fluffy: is there a way to get permission to all cameras?

<Ted_Hardie> ScribeNick: Ted_Hardie

fluffy: is there some easy way to ask: I want permission for all cameras?

Dan: I don't think that's necessary, Cullen. The browser is the one who actually knows.
... the application requests a source id; the browser has the opportunity to name it to the user.
... one of the values became clear to me at WEBRTC expo. A user asked about setting up medical devices.
... The nurses won't know about this, and this would provide them the right info.

Justin: is this the right approach? Hey the browser should have a way to know and expose this (the user selects in the browsers)

Martin: this is a poor user experience.

Justin: do you have a use case in mind?
... We need to have a strong use case that justifies this.

Dan: This is a constraint—if it is not satisfied, then it goes back to the default.

Justin: We need to have a strong use case that justifies this if we are going to do this work; if

Room: HAAAH

(Speaker excitement)

Justin: I get the hint.

ekr: first of all, I can't tell you how much I despise every new (mumble) gets a new name.

We can't have a smellovision constraint.

fluffy: the current API does have a method for getting all the types. There is another "tell me about everything" intended here.

room: can you clarify, ekr?

ekr: I hate that we enumerate only audio and video and those separately. If we come up with something new, like smellovision, testing for it using this system will be painful.
... I do think this functionality is needed. Desktop applications do provide "select camera" functionality. We don't want to go out to OS or browser to get a preference dialog.

Martin: to be clear about interrogation: after the consent only.

ekr: fluffy suggested something different

<fjh> +1 to ekr re ability to control which device from application for usability;

<dom> ScribeNick: ekr

abr: is this functionality we need? I would be unhappy if I always have to select the camera on web sites.

jesup: certain applications are interested in different devices. you don't want to have to flip the browser or OS default whenever you switch applications.

giri: unclear how you implement the unique ids.

martin: plan was to run the GUID through a hash in order to generate the site-specific mappings
... one global browser secret.
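
A minimal sketch of the mapping martin describes, assuming the hash is a keyed HMAC-SHA256 over the origin and the device GUID. The class and method names are hypothetical, and `reset` models the "when you clear cookies" answer as replacing the single global secret; the minutes do not say which hash or which reset granularity was intended.

```python
import hashlib
import hmac
import secrets


class DeviceIdMapper:
    """Derives per-origin device identifiers from one browser-wide secret."""

    def __init__(self) -> None:
        # Hypothetical: a single random secret held by the browser profile.
        self._secret = secrets.token_bytes(32)

    def reset(self) -> None:
        # Assumption: clearing cookies replaces the secret, which changes
        # every identifier handed out so far.
        self._secret = secrets.token_bytes(32)

    def source_id(self, origin: str, device_guid: str) -> str:
        # Keyed hash over (origin, device GUID). The NUL separator keeps
        # the two fields from running together ambiguously.
        message = f"{origin}\0{device_guid}".encode("utf-8")
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()
```

With this shape the same device gets a stable identifier within one origin, while two origins cannot compare identifiers without knowing the secret.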

giri: how does the user clear out the mapping.

martin: when you clear cookies.

giri: native apps? looks like an uninstall/deletion event.

fluffy: in the name of protecting privacy, we have constructed a situation where every app will ask for full permission.
... this is what happens if we don't give you any access before full permissions.

martin: I don't have any research on that.

hirsch: this is like cookies? [yes]. What about correlation across sessions and across users.

<dom> [re use case of device enumeration, it seems to me that Google Hangout exposes available devices in its UI for the user to select FWIW]

hirsch: needs to be more details in the doc.

juberti: the proposal is that initially you can only get ids, but then once you have permission for a device you can interrogate its properties.
... what about access to the names of the other devices.
... the information about camera and microphone seems more sensitive
... this seems like the right compromise

barnett: what if the user could pass in a usable string...

<fjh> medical device case makes it clear that privacy risks and mediations need to be enumerated

<fjh> apart from fingerprinting knowledge of names of devices seems benign, or am I missing something?

clarification on the proposal: the app gets to provide a string that the user can use to choose.

ekr: users ignore all the information in the choosers

jesup: app can explain anything it wants outside of the picker.

<fluffy> "Bad guys would use this for more than the good guys" - Martin

ekr: so new devices that were conceptually like cameras and microphones would be interrogated this way

juberti: once you had permissions you could ask for anything?

fluffy: trying to make certain we leave with a decision
... You can get all the device IDs and then they are stable.

<dom> [not so much "site" as much as "origin"]

fluffy: Once you have the device IDs you can ask for type.
... Given a device ID, you can ask for the device subsequently

juberti: so how do I make the picker? Given that I don't have access to other devices

<dom> ScribeNick: Ted_Hardie

fluffy: how much other information are we now providing without consent?

Justin: Do we return this information off what we get from this or from getDevices?

ekr: one thing people talk about is allowing the javascript to ask for camera and mic, but in a restricted domain.
... This is bound to a different domain, so it can't be seen by the javascript. I think we need to be clear whether we believe giving people access to one camera gives access to all of them or whether we can let them see what the camera sees without "getting access to it", so we can build this picker.

<ekr> to clarify my statement at the microphone, you would probably need to have that request not require a permission grant.

<dom> ScribeNick: ekr

ted_hardie: mobile cameras often point in different directions.

<dom> [I think the first requirement for a good proposal is to be an actual proposal :) ]

fluffy: need to be able to grant access to some cameras and not others. don't want to get dialogs for each camera.

paulk: is it possible to have two levels of permission?

giri: are we considering changing the requirements to potentially accommodate this proposal?

harald: the chairs don't think this is inconsistent with the requirement

giri: it seems inconsistent to me.

hta: we discussed this at the last meeting. the number of devices isn't sensitive

juberti: you probably want some sort of callback if the list of devices changes.

martin: polling?

juberti: what's the rationale for not exposing a callback?

Josh_Soref: do you have WebIDL for this API
... is it mutable or fixed?

hta: closes mic

<gmandyam> From Giri: the requirement P5 in the use cases and requirements doc will not allow for human-readable descriptions of devices not available to the user, but Martin's proposal does not cover this feature

fluffy: when do permissions get re-asked? if someone says no....

Josh_Soref: can't this be an implementation detail.

<Ted_Hardie> speaker: you can have a "re-insertion" type event that makes a new device that looks remarkably similar to one that was previously excluded.

<dan_romascanu> we hear 80% of the speakers

Error Handling

<burn> scribenick: burn

<Ted_Hardie> speaker: that allows you to get past the "accidentally no means forever no"

<timeless> scribe: Josh_Soref

<timeless> scribenick: timeless

hta: define some terms for Error Handling
... there are two things we can call Errors
... application asks API to do something
... what it does succeeds or fails
... the other thing is "oops, something went wrong"
... i'm not talking about the last category
... that's obviously going to be a callback/event thing
... i'm focusing on the Application asks the Browser to do something
... and it goes wrong
... currently, in the API,
... there's a function called
... you know exactly one and only one thing will happen
... you get thrown an exception (e.g. illegal argument)
... you get a success callback
... or you get an error callback
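
The "exactly one and only one thing will happen" rule can be sketched as follows. This is an illustration only: `request_media`, the device sets, and the error name `NoSuchDevice` are hypothetical, and the callbacks run synchronously here whereas a browser would invoke them later.

```python
from typing import Callable

# Hypothetical stand-ins; none of these names come from the spec.
KNOWN_KINDS = {"audio", "video"}
AVAILABLE_KINDS = {"audio"}  # pretend this machine has no camera


def request_media(
    kind: object,
    on_success: Callable[[str], None],
    on_error: Callable[[str], None],
) -> None:
    # Illegal argument: a programming error, so raise and call neither callback.
    if not isinstance(kind, str) or kind not in KNOWN_KINDS:
        raise TypeError(f"illegal argument: {kind!r}")
    # The request was well formed; report the outcome through one callback.
    if kind in AVAILABLE_KINDS:
        on_success(f"{kind}-stream")
    else:
        on_error("NoSuchDevice")  # hypothetical error name
```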

[ Next slide ]

[ Current language ]

hta: this is what's in the spec atm
... i'm not at all happy with the language of the spec
... it's more or less
... saying what i want to say

stefanh: is this WebRTC or MC?

hta: this is WebRTC

burn: we decided these are not the only conditions under which an exception can be thrown
... i need to look up the specific wording

hta: if you parse it as logical statement
... that one [points to something] is not what we want
... we also decided
... error identifiers are strings, not numbers

[ Next slide ]

[ Desirable properties not found yet ]

hta: when we have illegal params
... we should have exactly the same behavior as any other API
... what compilers do... it's specified in WebIDL
... i forget what those errors are
... it should be easy for an application to predict
... if i do this mistake, and if i do that mistake, i get an error callback
... browsers should be reasonably consistent

[ Next slide ]

<martin__> consistency is a crutch for the weak-minded

[ Alternative API structure ]

hta: alternatively, you make a call, and return an object
... and then success/failure is available from the returned object
... indexeddb is doing this
... whereas we're doing what geolocation does

burn: from the January call
... programming errors are thrown exceptions
... others are error callbacks

martin: +1

[ Next slide ]

<martin__> that was +1 to Justin doing more work

[ Emulating "alternative" in JavaScript ]

[ slide shows it's relatively easy to convert from one style to the other ]
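
The slide itself is not reproduced in these minutes. As a rough idea of what such a conversion can look like, here is a sketch that wraps a callback-style call so the outcome is read from a returned object; `Request` and `as_status_object` are hypothetical names, and the wrapped call is assumed to invoke its callbacks before returning.

```python
from typing import Any, Callable, Optional

CallbackApi = Callable[[Any, Callable[[Any], None], Callable[[str], None]], None]


class Request:
    """Status object: the outcome is read from the returned object."""

    def __init__(self) -> None:
        self.state = "pending"  # becomes "success" or "error"
        self.result: Any = None
        self.error: Optional[str] = None


def as_status_object(callback_api: CallbackApi, argument: Any) -> Request:
    request = Request()

    def on_success(value: Any) -> None:
        request.state, request.result = "success", value

    def on_error(name: str) -> None:
        request.state, request.error = "error", name

    # Exceptions for illegal arguments still propagate to the caller.
    callback_api(argument, on_success, on_error)
    return request
```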

[ Next slide ]

[ Evaluating this change proposal ]

hta: there's no clear advantage
... developers will have to deal w/ both patterns anyway
... library developers can mask one with the other
... changing stuff is disruptive
... proposed: No Change

stefanh: you're talking about getUserMedia ?

hta: i'm talking about getUserMedia and also
... AddStream, AddTrack, GetStats

stefanh: that's for the other group

hta: having the two groups being consistent would be a "bad" idea
... there's reasonable overlap between the two groups
... i'd like to see if there are comments on this proposal (no change)

ekr: this is just on promises v. errors

hta: yes

ekr: i support this decision
... i see no benefit
... are we proposing to continue w/ success+error callbacks for all calls?

hta: i think so
... it's trivial to ...
... we'd like people to be unwise explicitly

<martin__> adam.roach speaking

adam.roach: [ something ]

adambe: A style or Session style ?
... we can't force people to ...
... we can't run a call without an error callback

ekr: you could

<martin__> adambe: looking at error callbacks against promises - it's impossible to force the setting of error handlers on a promise

gmandyam: dom exceptions, they're defined
... in the html 5 spec

dom: webapps group is asking that if you want new types/errors, you should coordinate with them
... once we know what we want, we need to communicate with them

jim: we define an object
... where values are only valid during the scope of the error
... there are async errors, like OOM
... which is from global

hta: errors i'm not talking about are a different topic

[ Next slide ]

<scribe> [ New API points - design ]

hta: from previous discussion
... there's no chance of getting consensus in the room to change the api
... we add GetStats(), Recorder
... we should have a principled decision for these
... use Callbacks
... use Status objects
... "no consistency needed"
... i'd like a decision on this

cullen: i'd like consistency with callbacks
... for peer connection

<burn> +1

martin__: i like disagreeing w/ cullen
... consistency is a clear sign of a weak mind
... be as inconsistent as possible
... invent something new

[ laughter ]

dom: question about consistency is defining scope
... i don't know that every other WG is agreeing w/ status object
... in one of the documents that Robin Berjon is editing
... Web API cookbook
... if Callbacks work for all cases, then let's keep using callbacks
... but there are cases involving extending the api

martin__: what dom said is more along the lines of what i believe
... we have a narrow focus in this group

<dom> Specifying callbacks in Web API Design cookbook

martin__: people doing WebRTC
... and IETF
... consistency within this narrow wedge of the web platform
... isn't that wise
... if there's good guidance, then we'd be stupid to ignore it

cullen: i agree with martin__ on that

hta: 0, 0, 0, and we should read another document

stefanh: we have onEnded events
... and onError for recorder
... we already have a mixed model
... as long as it seems fitting, we should do callbacks

hta: you ask the api and it has exactly one result
... and stuff where you ask and things just happen
... this principled decision
... we have arguments for Callback
... and we have pointers to sage advice
... if we look and aren't swayed, we don't change?

[ Next slide ]

hta: i always have a next slide
... oh, i deleted it

jim: we need to decide on error classes/not, maybe not now

martin__: dom's advice
... stuff just happens advice
... we have events
... does that work for you jim?
... we generate an event and fire it
... and everything works?

jim: sounds like another group wants to control those events

martin__: events we know how to define
... dom was talking about DOMError

jim: i'm thinking about Error

gmandyam: DOMException requires modifying the HTML5 spec
... DOMError you don't

jim: you have an attribute inside an object
... it's only valid within scope
... but avoids having to coordinate with another group

hta: what happens if you have 2 errors happening in quick succession?

martin__: it doesn't work that way
... it's a single threaded application
... you have two events
... you generate a callback
... during the scope of that callback, it's set,
... on return, you clear
... and for the next, you set the next values for the next callback

hta: i'm not understanding this
... who is you?

martin__: the browser
... it queues events
... and sets values for the callback scope and clears on return
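
A small model of the behavior martin__ describes: errors are queued, the attribute is set for the duration of one callback, and it is cleared when the callback returns. `ErrorSource` and its methods are hypothetical and only stand in for "the browser".

```python
from collections import deque
from typing import Callable, Deque, Optional


class ErrorSource:
    """Hypothetical object whose `error` attribute is valid only inside a callback."""

    def __init__(self, on_error: Callable[["ErrorSource"], None]) -> None:
        self.error: Optional[str] = None
        self._on_error = on_error
        self._pending: Deque[str] = deque()

    def report(self, name: str) -> None:
        # Called by the "browser" when something goes wrong; nothing runs yet.
        self._pending.append(name)

    def run_event_loop(self) -> None:
        # Single-threaded dispatch: one callback at a time, in arrival order.
        while self._pending:
            self.error = self._pending.popleft()
            try:
                self._on_error(self)
            finally:
                self.error = None  # cleared on return from the callback
```

Two errors reported in quick succession therefore produce two callbacks, each seeing only its own value.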

jim: there's another limitation
... a bunch of our objects now have an error name and an attribute

dom: the DOMError interface has a `name` attribute
... which we should reuse names that exist
... but what to do when we need new names?
... DOM4 says if you need new names, contact us
... we should maybe try the path or discuss it
... on the callback question
... are there many cases where we expect the developer will need to react to several error callbacks?

hta: Recorder...
... you get Data + Ended at roughly the same time
... but only one is really an error...

dom: we should be clear in our algorithms when and if there are cases where an error callback invocation doesn't end the operation

adambe: it seems there's some confusion if an attribute is an error callback
... or an error

jim: the attribute is a string, the name of the error
... you raise a standard DOMError
... when you process that, you read the error and see what it says
... it keeps you from defining new classes

adambe: why have an error string that's overwritten?
... if you dispatch objects, you can store them in a queue

jim: we could also use custom events

hta: we need a presentation on that,
... but by someone who has read the specs
... Further study on Errors that happen

<dom> DOM4 definition of errors

<martin__> ACTION: hta to come up with a concrete proposal on what to do with error classes, based on the discussion on DOMErrors/DOMEvents. [recorded in http://www.w3.org/2013/02/05-mediacap-minutes.html#action01]

<trackbot> Created ACTION-14 - Come up with a concrete proposal on what to do with error classes, based on the discussion on DOMErrors/DOMEvents. [on Harald Alvestrand - due 2013-02-12].

hta: i think we'll have coffee

cullen: what do we change?

hta: atm, nothing
... but we may eventually change every class which has Error to something maybe like DOMError

[ Coffee break for 15 minutes ]

"immediate stream" gUM

martin__: the basic problem that came up in the WebRTC meeting
... it became difficult to make a good UX
... with gUM and PeerConnection
... the consent dialog blocked the creation of the MediaStream

<JonLennox> timeless — are you scribing?

martin__: but the MediaStream was needed to negotiate the stream

<JonLennox> (I am happy not to)

martin__: the idea was that we would create placeholder streams
... we went back and forth a number of ways
... there were concerns about usability
... the conclusion was to create a bastard step child of the two proposals
... and merge them together
... there's an email on the list w/ WebIDL for these things
... it isn't really synchronous anymore
... you have the existing api
... constraints
... pass in success callback, error callback
... if you want two video streams
... you'd call this twice before relinquishing control to the browser

ekr: the premise appears to be
... that i can't generate an Offer-Answer
... the appropriate SDP
... prior to having a Stream Permission grant
... because i don't know the exact characteristics
... i have no way to tell the PeerConnection what i want
... the only info that PeerConnection needs is the number of cameras
... i don't think any syntax lets it do it
... describe addressing the requirement
... regardless of syntax
... for SDP, you need more than just count

martin__: there's an assumption that there's a single SDP

cullen: all existing video codecs transmit Resolution in SDP

martin__: you have 2 cameras and don't know what they are

ekr: browser knows the cameras
... app doesn't know anything
... user hasn't granted permission

martin__: you can't send resolution in SDP because you aren't supposed to reveal it before permission is granted

justin: you can mask it
... send resolution in band

cullen: doesn't work for any video codec
... there isn't an RFC for doing that in VP8

JonLennox: for XR
... there's XXX
... characteristics of the encoder
... the encoder implementation is probably fingerprintable

justin: you can do a reoffer
... what need to do is get the ID in the
... and when i get authorization, i don't need to do additional signalling

martin__: i'm less concerned about audio clipping
... as JonLennox points out

<JonLennox> JonLennox: in H.264 you negotiate profiles and levels

martin__: it isn't possible for video to make sense of the data from camera

cullen: for audio, it's just a clipping issue
... you could always send an invite that's just a data channel
... and then send an invite over the data channel
... be careful about the problems you're trying to solve here

martin__: we spent a lot of time in Lyon
... we concluded on Clipped Hello
... someone answering the phone
... you wouldn't be able to do your ICE negotiation
... so once microphone is granted
... you wouldn't be able to send right away
... for Video, people expect to wait some time

ekr: i think it's useful to define the problem
... for Clipped Audio Hello
... they don't require this

martin__: no one's proposed one

ekr: cullen suggested negotiating the data channel
... in the case where there are devices w/ different capabilities
... what do you expect the Browser SDP to be
... when it isn't known what the user is going to select

justin: is the case that
... the browser doesn't know what to do
... are they enumerable?
... i think they're fairly finite

ekr: camera 1. H264 encoder
... camera 2, no H264 encoder
... and i don't have H264 software

justin: please tell me how i'd set up that configuration

cullen: you have a logitech camera

justin: what percent of people will have an aftermarket camera

cullen: iPad, H264 encoder for the hardware

justin: the H264 there is hardware distinct from the Camera
... depending on which camera is chosen, the offer in SDP will be different
... you could offer the best you can do
... and you could always go lower

jesup: on the receive side, there's an issue about receiving something you can't decode
... to the main point, how we avoid clipping
... there are options
... using fake/empty streams for negotiation
... if the app is asking for permission for audio/video
... you can use fake/disconnected streams
... as soon as it gets permission, it can re-swizzle the streams being fed in
... much as you would do for mute
... you wouldn't need another negotiation
... if you need it, you do, if not, you don't
... cullen's proposal of using a half-RTT
... it's a direct to the Peer, without the server

martin__: that's perfectly reasonable to do

jesup: do we need another solution to the same problem?

martin__: that's where i was going to get into other things this potentially provides
... how you provide multiple cameras is kinda weird
... you call gUM twice
... it's possible for the browser to give you the same camera twice

justin: weird, but is it inconsistent?

martin__: if i want 2 cameras
... what is the obvious api to get 2 cameras?
... we don't do it that way
... you ask for a camera
... you ask for another camera
... it gives you the same camera

justin: it could give you a reference counted object

martin__: but you wanted two cameras
... not the same twice

justin: source ids?

martin__: this slide uses source ids

justin: you call gUM
... you get an object back w/ ids
... you have space for tracks
... and once you get consent, you pipe in w/ media
... once media fills in, you don't need signalling
... you don't have clipping

cullen: you thought that was the current spec or the proposed change?

justin: i thought that was the goal

martin__: there were benefits, providing the constraints to the tracks

justin: you call gUM
... you don't get the stream back immediately?

martin__: you make a dummy stream
... you ask gUM to fill in the stream

ekr: can i pass this stream to peer connection
... prior to gUM filling it in?

martin__: partially, peer connection never really works correctly

hta: it's fairly normal to call something twice when you want to do something twice

martin__: if you call it twice from the same context
... if you call it once, and then settimeout and call it again
... it could behave differently

hta: ok to receive audio, ok to receive video
... we can make more constraints
... for `number of m=lines`

<fluffy> M- lines

burn: i like this proposal
... we'll talk about settings tomorrow
... when we originally talked about how gUM worked
... if you called gUM and
... got access
... then it wasn't available for someone else to get it
... but even in that model
... MediaStreams can be created from other Streams
... that gives you multiple Tracks pointing to the same source
... i don't know if we have to allow gUM to give access to the same source
... i don't know if you need gUM to get multiple tracks pointing to the same source

cullen: i think this discussion has pointed out how confusing this is if you call it multiple times
... you create a PeerConnection, add tracks to it
... it's the Create Offer that's problematic
... i don't think you can have the Offer before you bind the stream
... i think there's existing SDP that won't work that way

martin__: that's another problem w/ offer-answer

cullen: i understand that

ekr: everything is interconnected, unfortunately
... burn pointed out you can synthesize streams from others
... points to difficulty and a way out
... tim maybe
... another way to dig myself out
... instead of rewriting gUM
... you synthesize a dummy media stream
... w/ generic audio-video tracks
... and when gUM returns, we swap those out

martin__: it's kind of what this does

ekr: i'm suggesting we create a fake media stream
... and those objects have no meaningful information
... the only thing you could offer is what you can do in software w/ no hardware support

<Ted_Hardie1> Comfort-noise only audio....

ekr: maybe you can produce an offer
... maybe you can't

burn: how does attachment happen?

<Ted_Hardie1> Fluffy: the peer connection would have a replace track action

cullen: peerconnection would have a replace track
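
To make the "replace track" idea concrete, here is a toy model in which a dummy track is added first and later swapped for a real one. Everything here is hypothetical, including the assumption that a same-kind replacement leaves the negotiated set of tracks unchanged; the minutes leave open when renegotiation is still needed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Track:
    kind: str    # "audio" or "video"
    source: str  # "dummy" until a real device is attached


class PeerConnection:
    """Hypothetical model; only records what has been added for sending."""

    def __init__(self) -> None:
        self.tracks: List[Track] = []

    def add_track(self, track: Track) -> None:
        self.tracks.append(track)

    def replace_track(self, old: Track, new: Track) -> None:
        # Same kind in the same slot, so the list of tracks keeps its shape.
        if old.kind != new.kind:
            raise ValueError("replacement must be of the same kind")
        self.tracks[self.tracks.index(old)] = new
```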

<scribe> scribe: Ted_Hardie1

Jesup: the original idea had the ability to take a track and construct it from other tracks and other places.
... we can re-use that idea. This is similar to mute, so you don't have to have renegotiation.

Martin: turns out that you have to renegotiate if there is a serious change in the characteristics anyway

Paul: apologize, because I don't follow this that well, but it seems like if you have info about the device, even if you don't have access to the device, you can do what you want here.

burn: I was going to say that one thing that is interesting about moving this to be a replace track on a peer connection, is that the issues only show up when you're sending it to a peer.
... they don't show up when you're using it locally
... those already look like a dummy track—the source and sink are local, but there isn't a clear notion of a track

burn: we've created a virtual track, and that's a convenience. All of this trickiness comes about because we're negotiating it over a peerconnection,

burn: so it might be better to do it in peer connection, not gUM

Tim: Jesup's suggestion was what I suggested in Lyon,

<scribe> ACTION: item to Tim Teriberry: investigate ReplaceTrack as a solution to the problem. [recorded in http://www.w3.org/2013/02/05-mediacap-minutes.html#action02]

<trackbot> Error finding 'item'. You can review and register nicknames at <http://www.w3.org/2011/04/webrtc/mediacap/track/users>.

<stefanh> ACTION: Tim Terriberri to investigate replace track [recorded in http://www.w3.org/2013/02/05-mediacap-minutes.html#action03]

<trackbot> Created ACTION-15 - Terriberri to investigate replace track [on Timothy Terriberry - due 2013-02-12].

Martin: if you have an API that shows you all of the devices, but doesn't allow you to turn them on, then you could negotiate in advance of the consent for turning them on.

Fluffy: instead, we're doing dummy tracks that allow you to default blank one. Those might create fingerprinting problems.

Martin: you do one pixel by one pixel

Fluffy: but then you need renegotiation.

Martin: audio doesn't

Justin: what about advertising HD, but then negotiating down

Martin: that's the other way to do it—say everything you can do, then negotiate down to what you will.

Justin: there seem to be two different approaches here: do we want to keep bimodal functionality? Why not just shift this to the second approach?

Martin: you were there in the December teleconf, where we thought of this, and we didn't return promises or streams that weren't open yet etc.

Ekr: Isn't this the same thing? What if we attach this to a "black" stream?

Martin: seems to be reasonable to me

Justin: I would prefer this be a single syntax, so we don't have two ways to do everything.

ekr: Large step back: what's the problem we're trying to solve here. Is it only media clipping?

room: can somebody walk through the clipping problem.

Martin: walks through the set-up/user experience issues.
... when the user clicks "yes, you can have the camera", that is the expectation that it is setting up the call.

Fluffy: send a provisional answer that marks them all receive only, that will set up the ice negotiation, then you can send the video before sending the offer.

room: various questions

ekr: this seems like a pretty heavy weight change to solve this problem.

Martin: I'm just the messenger

ekr: But I'm not the only person who thinks this isn't the most awesome thing ever.
... If we must have a way to set up a temp stream. I am not sold that this last syntax (fake stream plus pieces) is what we want

Martin: accepting proposals

ekr: replace track would have to be mine

Mandy: there are some video conditions in which it is obvious that it needs to be asynchronous. I think you're going to have a difficult time finding a method that works in all conditions.

Giri: You're going to have to set various constraints.

Martin: Note that the set constraints operates on ones that are already live

Sorry, thanks.

Giri: that is not the equivalent of set constraints?

Martin: yes,

Stefan: this is both a method to solve the clipping problem and using getUserMedia with finer grained control.
... what I am saying is that you can start sending media before you send the answer, but that won't work, because the other browser won't be able to process it.

There can be multiple ssrcs arriving, and you have no clue what they correspond to.

Justin: this is a PRANSWER for everything,

Martin: comment 22 (expletive)

Justin: why this is actually useful: you can get DTLS and ICE hot through PRANSWER—that's the critical part of why we need the IDs (to allow the remote UI to set up). The second is we already have a complicated state machine—we have a lot of other asynchronous pieces-having this be asynchronous is going to prove interesting. having this be synchronous makes the application writers' lives much easier.

Fluffy: I think there are ways of solving that track/mapping issue without breaking this.
... but as a general rule of thumb, if I send early media that matches what the "dummy" thing set (two audio and one media claimed and that's what's delivered)

Martin: we don't have a good way of ensuring that the configurations of streams and tracks on one end matches the configurations on the other end.

Justin: you're bundling as a simple thing and you don't have a demultiplex method
... cites rtcp

Fluffy: but that's statistics, not media

Justin: but we can't use them by media type without going down a long complex series of special cases. Let's do this the easy way.

ekr: comment not caught

Jesup: I think Tim's method will simplify the state machine.
... In some of these cases when you change track, you have to renegotiate; in other cases you don't.

Fluffy: the easy way to do this is to pull Bundle, that makes this simple

Justin: All because you don't want to return objects synchronously

<dom> [Justin's (tongue-in-cheek) question was "why don't we get rid of SDP?"]

Fluffy: no, when you have complex negotiations with multiple streams and tracks Bundle is complex. It may be easy now in your situation, but it will be complex ten years from now.

Justin: let's get rid of SDP

(groans, laughs)

Hta: can you repeat your comment for the minutes?

Martin: I need a plan.

<hta> The idea that each video stream needs its separate source/destination IP port pair is incompatible with the SDP of 20 years ago, it's incompatible with stuff that's practiced today, and it's just plain stupid.

Justin: We discarded this but we could revisit: what if media would not flow on a media stream until there was user consent?

ekr: that's syntactic sugar on this.

<hta> The question of whether audio and video can travel on the same port pair is a different question.

ekr: I can barely distinguish these two.

burn: is this a muted stream effectively?

Martin: yes

burn: that works well with the amended setting proposal.
... you can always have things go away. no matter what you have negotiated, that can happen
... the settings proposal talks about an over-constrained situation
... lots of things can cause this. Instead of breaking all your track connections—you can call "muted" any track that has become over constrained.
... mute anything that's wrong

Justin: you get audio and video track, if it was negotiated but there was no video camera? It just stays there?

Martin, Burn: if it's an immediate failure, you can just fail, but if you have it later, that works.

adambe: you didn't show the last slide with advantages of the template method.
... I like return Media stream for some things, but I like this for other reasons.

Fluffy: I have a concrete high level proposal, but without ditching bundle.
... You create all of these tracks, muted; once permission is granted, they are unmuted. Any that did not get permission are treated as if the camera/device were unplugged.

ekr: Doesn't this create a media stream for every camera?

Fluffy: yes, but the ones you don't use get blown away.
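
[Editorial sketch, not part of the discussion: a minimal TypeScript model of the idea Fluffy describes, where every candidate track starts muted and the user's decision either unmutes it or ends it as if the device were unplugged. The names PlaceholderTrack, createPlaceholders and applyConsent are hypothetical and do not come from any spec text.]

```typescript
// Hypothetical model of "create muted, unmute on consent".
// None of these names come from the getUserMedia spec.
type TrackState = "muted" | "live" | "ended";

interface PlaceholderTrack {
  deviceId: string;
  state: TrackState;
}

// One muted placeholder per candidate device, created before any consent.
function createPlaceholders(deviceIds: string[]): PlaceholderTrack[] {
  const result: PlaceholderTrack[] = [];
  for (let i = 0; i < deviceIds.length; i++) {
    result.push({ deviceId: deviceIds[i], state: "muted" });
  }
  return result;
}

// Apply the user's decision: granted devices go live, the rest end as if
// the device had been unplugged.
function applyConsent(tracks: PlaceholderTrack[], granted: string[]): void {
  for (let i = 0; i < tracks.length; i++) {
    const ok = granted.indexOf(tracks[i].deviceId) !== -1;
    tracks[i].state = ok ? "live" : "ended";
  }
}

const placeholders = createPlaceholders(["cam-front", "cam-back"]);
applyConsent(placeholders, ["cam-front"]);
```

[In this model negotiation could start while every placeholder is still muted. ekr's objection still applies: the sketch creates one placeholder per camera without knowing which one the user will pick.]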

Justin: I don't like having two ways to do this.

Martin: I want to get rid of the "constraints can be imprecisely specified" option

Justin: I did not get that

Martin: probably Harald that wanted that.

Justin: I would like to see more info on the cases in the next slide.

ekr: I don't think what Cullen suggested fixes the problem
... the problem is the sdp for the platonic ideal for devices I might have
... it doesn't help me to add five cameras and not know which one the user is going to select
... I am concerned that we're going to end up with replace track anyway.
... I claim replacetrack will solve this.
... I would like to solve this problem once. If we are going to ditch replacetrack or agree it doesn't work, I'm ok on this approach, despite its warts. We're going to need a lot more clarity on how muted/black devices behave.
... (goes back a slide)
... Goes through case with dummying out the streams: you get some cases when you get full permissions, some where you get partial, some none. Need concrete understanding for each of these
... what's the timeline on replacetrack

frederick: not following the full set of technical issues: privacy issues vs. clipping. I thought what Cullen was suggesting was using muting to do call set-up before media flows. Trying to understand if that's the proposal?

Martin: basically yes.

Burn: if you want different constraints for different tracks, this approach (as an alternate syntax) is going to be difficult. The combination of constraints and this is not going to work well.

<fjh> seems like we are combining discovery with call setup with permissions

Burn: I like either Justin's approach or a synchronous approach that doesn't have this issue

Martin: the issue is that the algorithm for which camera gets which constraints is not deterministic.
... this imposes the constraint that you don't get the same camera in two different getUserMedia calls

Burn: You are talking about setting a constraint with source ID

<fjh> cullen's example had a user action both giving permission and accepting a call leading to clipping, the fundamental issue being receiver permission (as opposed to sender)?

Burn: it's a failure of a mandatory constraint if you call it twice

Martin: this is a short-hand for mandatory or optional

Burn: But if it's optional, it will move on when the first one is busy
... the issue is that once you get access to the source you can send them to as many sinks as you like.
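
[Editorial sketch, not part of the discussion: a small TypeScript model of the behaviour Burn describes, assuming already-allocated sources are treated as busy. A mandatory request for a busy source fails, while an optional one moves on to the next free source. SourceRequest, pickSource and the "cam-a"/"cam-b" ids are hypothetical names chosen for illustration.]

```typescript
// Hypothetical model: sourceId, mandatory and busy are illustrative names.
interface SourceRequest {
  sourceId: string;
  mandatory: boolean;
}

// Returns the chosen source id, or null when the request cannot be met.
function pickSource(
  available: string[],
  busy: string[],
  request: SourceRequest
): string | null {
  const isFree = (id: string): boolean =>
    available.indexOf(id) !== -1 && busy.indexOf(id) === -1;

  if (isFree(request.sourceId)) {
    return request.sourceId;
  }
  if (request.mandatory) {
    return null; // over-constrained: the named source is busy or absent
  }
  // Optional: move on to the first source that is not busy.
  for (let i = 0; i < available.length; i++) {
    if (isFree(available[i])) {
      return available[i];
    }
  }
  return null;
}

const cameras = ["cam-a", "cam-b"];
const firstCall = pickSource(cameras, [], { sourceId: "cam-a", mandatory: true });
const secondMandatory = pickSource(cameras, ["cam-a"], { sourceId: "cam-a", mandatory: true });
const secondOptional = pickSource(cameras, ["cam-a"], { sourceId: "cam-a", mandatory: false });
```

[The sketch models only the "busy" reading. A mashup that wants the same camera twice would need the opposite rule, which is the open question in this part of the discussion.]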

<dom> "Subsequent calls to getUserMedia() (in this page or any other) should treat the resource that was previously allocated, as well as resources held by other applications, as busy. Resources marked as busy should not be provided as sources to the current web page, unless specified by the use" http://dev.w3.org/2011/webrtc/editor/getusermedia.html#implementation-suggestions

<dom> (in other words, it's not normative that two calls to getUserMedia results in two different cameras being attached)

Fluffy: the problem started as clipping, and I'm pretty dubious about that; then it changed to disambiguation.

Fluffy: there are various games to solve different problems, but we need to be clear on what the problem really is.
... in most cases that I can think of sending fake sdp won't really help.
... we need to be really crisp about the problem

Harald: next time you send this to the mailing list: please be clear that you're requesting two video streams, because it wasn't clear to me. One thing you haven't addressed is whether the same stream that goes into getUserMedia is the one that comes out?

Martin: yes

Harald: is it changed?

Martin: the object attached to the tracks changes.

Harald: if you think of tracks being replaced by getUserMedia,

Martin: I don't think this proposal sees it that way: it's a pipe and now you're hooking it up to the mains.

ekr: reads spec to mic

Justin: reads different aspect of spec

<dom> "A MediaStreamTrack object represents a media source in the user agent. Several MediaStreamTrack objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to getUserMedia() ." http://dev.w3.org/2011/webrtc/editor/getusermedia.html#mediastreamtrack

ekr: at best, this is inconsistent

Justin: the bits I read were normative.

One use case: mashup application. If you set it up so that the user cannot get multiple copies of the source, then they won't be able to use them in the mashup.

Martin: good point

Burn: it's interesting that the spec says what you sent—when I discussed this with Anant, he added text to that section. I am not sure where to go from that—it doesn't match.

Stefan: Travis in his proposal changed a bit of that

... you would get multiple access, but it would be a read-only device etc.

Burn: on that non-normative section, there was a concern that we did not want to mandate how user agents would represent multiple media.
... reads a new spec section...
... I think the spec is at best inconsistent

Cullen: I'm shocked!
... but is there anything we do have consensus on?

ekr: I propose what we ought to be able to do, without syntax
... somehow acquire the same camera twice and alternatively get two different cameras. Does anyone disagree?

Burn: yes: the question is whether you send the source multiple times or re-use, sending to multiple sinks.

Then you can set different constraints on them.

Martin: In the mashup, you can ask for the same source twice.
... it should be possible to do that.

Harald: surprisingly, that topic is relevant to our next presentation.
... it's clear that we have no consensus, but we have two action items assigned

Tim is sending mail to the list on replace track. Martin is going to mail to the list with the syntax for multiple cameras. It's clear that people have more use cases in mind.

Harald: so we better have more mail to the list, showing where it matters.

Justin: I thought ekr was going down a good road, trying to see what people care about.
... If we can solve those problems, I will be happy.

<fjh> +1 to clarity of use cases on list

Cullen: I want to see use cases before we go down the path for "dummy track" pieces.

Martin: this whole thing hinges on the other thing: what do you do when media arrives. I don't know what to do with that, but it is not what Cullen is proposing.

Burn: we got into a bunch of things there. I want to make sure we preserve the simple case of having multiple different video sources with one audio. I want "your two video cameras".

<timeless> scribe: Josh_Soref

<timeless> scribenick: timeless

Device reservation

jesup: the presentation which only exists as of a few hours ago
... this touches on issues already discussed today
... and issues that came up in the past
... they will provoke discussion

<burn> I said I wanted the simple (and original) use case of calling getUserMedia twice requesting video and get the user's two video cameras

jesup: i have opinions on some of these things
... but not all of them
... i'm hoping to make progress on a few of them
... the basic things
... who gets to access a device
... who gets to modify a device
... how does it affect others who are accessing a device
... how are they notified
... how do you set up a secure call

[ Slide: Who gets to access a device? ]

jesup: basic possibilities for access:
... exclusive
... once one thing has a camera, nothing else can get it

burn: your list of items sounds similar to my Settings proposal

jesup: i'm primarily talking about multiple applications/across tabs
... our original implementation was purely exclusive
... even within a tab
... and you had to use a fake stream
... it wasn't a big deal
... but for mashups
... it matters
... so, exclusive, either by default, or by locking it
... constraint{ mandatory }
... sharing it w/ another tab/app
... so i have it and Engadget has it as well
... the assumption is that the user has in some manner ok'd this
... sharing w/ tabs that are same origin
... mostly same time
... but in some cases, in the same origin
... shared w/ a friend app
... willing to share w/ X but not w/ anyone else
... so normally non same-origin apps could exchange a permission token to allow this other app to have access
... what do we surface to the user?
... in the current Firefox UI implementation
... the user is involved in every access/grant
... this isn't the same for Chrome
... there's an implicit decision that the user is wanting to share the device w/ multiple apps
... it's assumed they know they're sharing
... that's trickier w/ persistent permissions
... the user isn't informed directly
... the second point is speaking to that
... should there be an indication that a source is in use when they're asked
... do you need an explicit grant that it's shared?
... does an application need to know that a source it has is shared w/ someone else
... to warn the user, or to change how it's acting
... before i get to my thoughts on this
... there are UCs for these things
... I have an app that is doing Voice Commands
... it runs in the background in a tab
... and it's listening to the Mic
... and i make a phone call
... and i say "computer, please bring up my spreadsheet"
... and it acts on the command
... a real world application where sharing a stream w/ something in the background is really useful
... perhaps the user needs to give permission for that, on a onetime basis, or a permanent permission
... same for Video, if you do Video-gesture-recognition
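
[Editorial sketch, not part of the discussion: the access possibilities jesup lists (exclusive, shared, same-origin, shared with a "friend" app) can be modelled as a mode on the current holder plus one decision function. ShareMode, Holder, mayShare and the example origins are hypothetical; no API was proposed at this point in the meeting.]

```typescript
// Hypothetical sharing modes taken from the list on the slide; the names
// and the decision function are illustrative, not a proposed API.
type ShareMode = "exclusive" | "same-origin" | "friends" | "shared";

interface Holder {
  appOrigin: string;
  mode: ShareMode;
  friends: string[]; // origins the holder agreed to share with
}

// Decide whether a second requester may read a source that already has a holder.
function mayShare(holder: Holder, requesterOrigin: string): boolean {
  if (holder.mode === "shared") {
    return true;
  }
  if (holder.mode === "same-origin") {
    return requesterOrigin === holder.appOrigin;
  }
  if (holder.mode === "friends") {
    return (
      requesterOrigin === holder.appOrigin ||
      holder.friends.indexOf(requesterOrigin) !== -1
    );
  }
  return false; // "exclusive": nobody else gets the source
}

// The background voice-command app shares only with one calling app.
const voiceApp: Holder = {
  appOrigin: "https://voice.example",
  mode: "friends",
  friends: ["https://calls.example"],
};
const callAllowed = mayShare(voiceApp, "https://calls.example");
const strangerAllowed = mayShare(voiceApp, "https://other.example");
```

[A real user agent would also fold in the user's own grant, which this model leaves out.]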

justin: are there UCs where you wouldn't want this sharing?

jesup: for Secure Call
... even though you gave Voice Recognition permission
... if you're doing a Secure Call, you might not want to give it access
... my gut feeling is there are UCs for that

TedHardie: access to the stream
... doesn't mean access to the media
... i have a screen sharing app
... and i have a tab which is getting access to pixels
... but the source of the media capture
... is the piece of the system that grabs my screen
... not the thing that grabs my camera

<JonLennox> Speaker is Ted Hardie, not sure what nick he uses

TedHardie: we could say that there's a permission for the camera
... but it doesn't necessarily cover screen sharing

jesup: Screen sharing while useful, is fraught with...

TedHardie: users can hang themselves w/ someone else's rope

ekr: is this Read access or Write access?

jesup: i was talking about gUM
... there's a separate slide for writes

ekr: there's a reason to not allow other accessors to fight over pan/zoom

jesup: we could, but we'd be sorry

ekr: there are arguments for having an explicit lock down of a device
... it isn't that you couldn't have secure ...
... i think we'll need locking for write anyway
... for Secure Coms
... i'd be satisfied for the throb that blocks write to block read
... the issue about non-exclusive-read
... is about user privacy
... whether or not having access to a device allows you to find out about other sites with access to the device
... 1-XZ
... 2. if user is crazy to give access to a site they shouldn't trust
... then we can't protect them from being discovered as using a sex site
... 3-XW
... if you think you're allowed to set whitebalance/pan/tilt/zoom

justin: i think we need examples to indicate there's a real threat

jesup: there's certainly a privacy leak issue there
... for x and time y, you're in a hangout
... those are in theory possible
... that said
... i think i've come around to the opinion that it isn't a good argument for disabling the feature
... given implicit permission

ekr: as long as there's a way to lock it

jesup: for a secure call, as long as there's a way to mandate

JonLennox: if there's exclusive access and there are other apps on the device
... is it reasonable that the browser be expected to make the OS api call for exclusive?

<stefanh> josh: the UA should handle exclusivity, not the app

<scribe> scribenick: martin__

Josh_Soref: I don't want the user to be exposed to DoS based on this constraint
... maybe the user needs a way to specify exclusive access for a given app during the permission grant

jesup: for example, a personal recording app might be used, even though you want to make a call, you might want to continue recording despite the permissions requested by the app
... even if the app requests mandatory constraints, the UA doesn't need to respect those

Josh_Soref: I can live with that

<timeless> scribenick: timeless

jesup: right now
... if user gives permission to a second app
... we let them access it
... that gets more interesting with persistent permissions
... locking with exclusive
... pure exclusive seems overly-restrictive
... an alternative is to default to exclusive
... but permissive, if the app/user says so
... i suspect you get better utility out of default-permissive
... the apps interested in mandatory exclusive are a small subset
... but you could go the other way around
... as permission doubles as selection in Firefox
... same origin doesn't matter to us
... once we get persistent, we'll need a better solution
... lastly, if i have access to a device, and someone starts sharing it
... does my app know that it's being shared?
... to indicate to the other person that it's being shared w/ someone else
... i'm not certain about this
... the mechanism is simpler than what you do w/ it and why
... an event perhaps

[ Who gets to modify a shared device? ]

jesup: First user, everyone?
... [random]
... first asks to lock down access?
... there are probably arguments for only one app being able to modify a device
... gets tricky
... my example of the background voice commands
... has very little interest in modifying
... but it's likely the first opener
... so it'd need to be able to specify disinterest in modifying the device
... or a way for multiple to be able to modify
... can you lock a param, e.g. 30fps
... a subset of reader-writer issue
... is it per item or the whole stream
... i don't think that complexity is worthwhile

stefanh: do you have a proposed api?

jesup: i don't think it's worthwhile
... so no api
... i defer to travis's discussions

ekr: there's got to be a way to ask for exclusive access to pan/tilt/zoom
... i'm aware someone asked for access, but ignore him

jesup: at least a lock, or an exclusive writer

ekr: anything manipulable on this device
... i have write, and anyone can share read

<Travis> I think one interesting part of this discussion is being able to manage the exclusivity of a source.

ekr: i could imagine living with a setting where there's a way
... for the other guy to steal the lock
... but i don't know how to have shared devices and have fights over writes

jesup: that could be resolved culturally

martin__: in the settings proposal
... you ask for a track w/ pan/tilt/zoom constraint

ekr: i'm going to move them around

martin__: you don't set a range, you set a value
... if someone else sets a constraint, they'll be told they can't change it

ekr: but that's exclusive writer

martin__: that's my understanding of the settings proposal
... you set pan/tilt/zoom for the source
... and it's incompatible w/ anyone else setting values

jesup: for pan/tilt/zoom
... i can see plenty of cases where multiple people want to have access to control something
... in a shared control situation
... it isn't smart to be able to control at the same time

martin__: exclusive for the first guy to set a property

jim: can you get deadlocks?
... there's a track or device detail

burn: that's why i asked about within an app or across apps/tabs
... solutions may be different
... settings has ways for within an app
... on what you do for a track which is shared
... we can talk about that tomorrow

jesup: i'd like to focus on between apps

TedHardie: i wonder about a way to give up exclusive access
... you set pan/tilt/zoom
... from a remote location
... but allow people to change it

<Travis> +1 This resonates with me...

jesup: multiple rooms dialed in
... you set a value
... but let the room change it

<martin__> which resonates with you specifically travis?

TedHardie: does that work for you?

jim: you could have a way to decide to allow it?

jesup: you get to an area of diminishing returns

<Travis> Exclusive by default w/option to give up your rights...

TedHardie: if you have the ability to give it up, seems enough

<martin__> Yeah, that's sounding good to me.

jim: do you get notified if it's changed if you give it up?

TedHardie: no
... you notice the video looks different
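
[Editorial sketch, not part of the discussion: a TypeScript model of "exclusive by default w/ option to give up your rights" for one setting such as zoom. The first app to set the value owns it, other apps are refused until the owner gives it up, and no change notification is sent, matching TedHardie's answer. SharedSetting, set and giveUp are hypothetical names, not part of the settings proposal.]

```typescript
// Hypothetical model of an exclusive-writer setting with voluntary release.
class SharedSetting {
  private owner: string | null = null;
  private released = false; // true once the owner has given up exclusivity
  value: number;

  constructor(initial: number) {
    this.value = initial;
  }

  // The first app to set the value becomes its owner. Returns false when
  // another app owns the setting and has not released it.
  set(app: string, value: number): boolean {
    if (this.owner === null) {
      this.owner = app;
    }
    if (this.owner !== app && !this.released) {
      return false;
    }
    this.value = value;
    return true;
  }

  // Only the owner can give up exclusivity; others may then change the value.
  giveUp(app: string): boolean {
    if (this.owner !== app) {
      return false;
    }
    this.released = true;
    return true;
  }
}

const zoomSetting = new SharedSetting(1);
const ownerSet = zoomSetting.set("room-a", 2);       // room-a becomes the owner
const refused = zoomSetting.set("room-b", 3);        // refused while exclusive
const releasedOk = zoomSetting.giveUp("room-a");     // owner gives up its rights
const acceptedLater = zoomSetting.set("room-b", 3);  // now allowed
```

[The model has no way to steal the lock, which ekr mentions as something he could live with; adding it would mean one more method that reassigns the owner.]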

JonLennox: pan/tilt/zoom isn't how it actually works

[ Security and Trust Model ]

jesup: this maps to the whole sharing thing
... by default an app gets access to bits
... if you don't trust the app, it can send them anywhere
... not directly related to what we just talked about, but indirectly
... a local photobooth app
... could connect to a peerconnection and send it anywhere
... you play w/ a sample app
... and the person who wrote it could be watching you
... that's bad
... the first time it happens, it'll get in the press
... it'll be bad for all of us
... we need to think about it
... we haven't solved the problem

martin__: i think we had

ekr: tomorrow,
... i'll be talking about restricted access on a binary basis
... for the restricted media streams, that's covered

jesup: preview for tomorrow's discussion
... but think about that for sharing media streams
... you may have to lock down streams in secure mode
... a lot of useful
... UCs may suddenly become hard
... you must trust those apps
... that they don't ship your data off
... the other option would be to constrain
... give access to bits, but not allow them to ship
... you could prevent them from wiring to a PeerConnection
... but they could put them into a Canvas and send them to same-origin

Dan_Druta: the browser should know
... if there's a bit

jesup: but it doesn't have to be an RTC session
... it could be WebSocket or XHR

justin: is this a solvable problem?
... we spent time this morning talking about "easy" problems,
... and didn't make progress

adambe: for a booth
... if i do funny stuff with a secure file

jesup: there are existing cases of people hacking into computers and getting mics/cameras
... and using them for blackmail
... justin's right, if this isn't a solvable problem
... we should just put up warning flags

justin: if you give access to the camera
... is there an expectation that the site won't upload the data to a site?

jesup: right now, there's no way to block this
... there are technical ways we could do so
... via tainting and origin protection
... you could let them manipulate w/o bit access

martin__: if you give bit access, you've given bit access
... if you've tainted the stream, you aren't giving bit access

jesup: in that case you've given stream access but not bit access
... you could hook it up to <Video> but not <Canvas>
... this also applies to image security in browsers... mostly
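
[Editorial sketch, not part of the discussion: the "stream access but not bit access" distinction can be modelled with a taint flag, loosely similar to how browsers refuse pixel reads from a canvas that has drawn cross-origin images. ModelStream, renderToVideo and readPixels are hypothetical and do not exist in any browser API.]

```typescript
// Hypothetical model of a tainted stream; this is not an existing browser API.
class ModelStream {
  constructor(readonly tainted: boolean, private pixels: number[]) {}

  // Rendering is always allowed: the page never sees the pixel values.
  renderToVideo(): string {
    return "rendering " + this.pixels.length + " samples";
  }

  // Bit access (the <Canvas> path) is refused for tainted streams.
  readPixels(): number[] {
    if (this.tainted) {
      throw new Error("SecurityError: stream is tainted");
    }
    return this.pixels.slice();
  }
}

const preview = new ModelStream(true, [1, 2, 3]);
const rendered = preview.renderToVideo();
let blocked = false;
try {
  preview.readPixels();
} catch (e) {
  blocked = true; // the tainted stream refused bit access
}
```

[As jesup notes, a user agent would have to enforce the flag on every path that can reach the bits, not only on an explicit read call like the one modelled here.]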

justin: what can you do that's interesting if you can't access the bits?

gmandyam: on raw bits
... i thought this was encoded data

jesup: gUM is raw-bits

gmandyam: hta corrected me, there's no such thing as raw-bytes
... it's feasible to eventually get raw data
... our devices have face detection and such
... but the argument for taking it out was you could do it in JS

jesup: there are other ways to
... if we revived media stream processing

[ Time Check ]

jesup: such that you could get access to the camera, manipulate it for face recognition
... but not give access to the bits w/in the app
... it's tricky, i agree, it's doable
... i'm not sure it'll get done

ekr: we'll talk about this tomorrow
... interesting reasons for Calling/Preview apps to render or transmit
... anything more complicated will be very complicated

[ Security (cont) ]

jesup: leave most of this for ekr's discussion tomorrow
... enjoy lunch

[ Lunch - 1 hour ]

hta: we've identified needs for some degree of locking

<Travis> See ya all tomorrow!
