Media Capture Task Force Teleconference

06 Dec 2012


See also: IRC log


gmandyam, Jim_Barnett, +1.650.241.aacc, [IPcaller], Dom, [Microsoft], +1.650.678.aaee, Josh_Soref, +1.610.889.aaff, Dan_Burnett, [Mozilla], hta, jesup|laptop, [GVoice], stefanh, Travis_Leithead, Frederick_Hirsch, Stefan_Hakansson, Martin, Thomson, Martin_Thomson, Dominique_Hazael-Massieux
Josh_Soref, dom


<trackbot> Date: 06 December 2012

<Josh_Soref> scribe: Josh_Soref


stefanh: maybe we'll start with Travis
... let's approve the minutes from October 9th?

<stefanh> http://lists.w3.org/Archives/Public/public-media-capture/2012Nov/att-0041/minutes-2012-10-09.html

RESOLUTION: Approve minutes from October 9

Version 5 of device handling/change settings proposal

<Travis> http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/proposals/SettingsAPI_proposal_v5.html

Travis: i'll start w/ the high level changes
... i think we're all familiar w/ the v4 version we discussed at TPAC
... i think we're familiar w/ the changes we discussed
... I wanted to accommodate the idea of synchronous GetUserMedia
... and placeholder streams
... there might be settings exposed on a track that don't make sense to return a value
... until you have a source associated with a track
... previously you couldn't get a track
... until getUserMedia had approved it
... meaning there was likely a source behind the track
... in the new world, it's likely you could get a track without a source
... in section 2,
... the track hierarchy is simplified
... you don't have devices inheriting from tracks
... MediaStreamTrack is still the root object
... derived VideoStreamTrack and AudioStreamTrack

<ekr> can whoever is not muted besides travis please mute?

Travis: I redefined MediaStreamTrack so i can extend the readyState

<ekr> I am hearing a lot of heavy breathing

Travis: I defined the new state in it
... i did some other provisional changes, we'll talk about them later

<ekr> Josh: there is probably a keypress interface to mute

Travis: going to section 2.2

<martin> my comment on the mediastreamtrack was related to states: muted might be orthogonal to readystate

hta: there are some tracks, you defined Facing as a track level attribute
... and why not a device attribute?

Travis: i heard that feedback twice
... my motivation was to have things that i consider Settings only exist on Source objects
... things I don't consider mutable elsewhere
... i've since heard the feedback
... if you exclude Facing, there's a single attribute "source"
... to get from Track to Source, it's now a property instead of via inheritance
... there's a similar source attribute on AudioStreamTrack
... there's now a provisional [Constructor] on these
... to allow you to do `new VideoStreamTrack`
... in JS, without using getUserMedia
... in section 3
... these don't have a hierarchy
... there's a little hierarchy between video stream source and picture source
... there are 4 kinds of sources
... 2 are obvious: source representing video device
... and microphone device
... their settings are attributes
... that you can see directly on these sources
... that you can query with an if statement to see if it exists
... and you can read its value
... so on video you could get the current height, width, framerate
... read mirroring/rotation
... zoom factor
... focus
... light
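
A minimal sketch of the shape Travis describes here, assuming hypothetical names (`SketchVideoSource`, `SketchVideoTrack`) rather than the proposal's actual IDL: the track reaches its source through a `source` property instead of inheritance, a track built with the constructor may have no source yet, and settings are plain attributes that can be feature-tested with an if statement.

```typescript
// Hypothetical stand-ins for the proposal's source and track objects.
// Names and fields are illustrative assumptions, not the proposal's IDL.
interface SketchVideoSource {
  width: number;
  height: number;
  frameRate: number;
  zoom?: number; // optional: not every camera exposes a zoom setting
}

class SketchVideoTrack {
  // A track made with `new` has no source until one is attached.
  source: SketchVideoSource | null = null;
}

// Read a setting only if a source is attached and the attribute exists.
function describeZoom(track: SketchVideoTrack): string {
  const source = track.source;
  if (source === null) {
    return "no source yet";
  }
  if (source.zoom !== undefined) {
    return `zoom is ${source.zoom}`;
  }
  return "zoom not supported";
}

const placeholderTrack = new SketchVideoTrack();
const beforeAttach = describeZoom(placeholderTrack);

placeholderTrack.source = { width: 640, height: 480, frameRate: 30, zoom: 2 };
const afterAttach = describeZoom(placeholderTrack);
```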

ekr: there's a large side channel in this interface
... if i allow you to share cameras
... and i allow you to XXX
... i can determine YYY

Travis: because the video source would be identifiable

ekr: I regularly poll these objects
... Hangout regularly changes its zoom every 3s
... and some other site changes its zoom every 5s
... I'm not saying I oppose this
... but we should acknowledge that sharing these bits

Travis: [summarizing]
... multiple sites may be able to display a single camera source
... if i change the zoom factor value in one application
... that zoom factor could be observable by another site
... knowing the camera is being changed
... could let you determine the site

ekr: certain sites use whitebalance settings
... another factor is competitive manipulation
... if we both try to zoom at the same time

Travis: yep
... there's a large question about Exclusivity of a camera device
... that we could have a discussion of

QQQ: now seems like a good time

<dom> [how does hangout handle this? how do native apps deal with this in general?]

QQQ: maybe we could say access to a device is origin specific

Travis: if i recall my discussion with the OS team
... the OS defines access to devices
... only the front and center app can have access at one time

<ekr> on macos, applications can share the camera

Travis: and if you switch applications, the new application can steal it
... the difference between w8 and classic
... when a classic app takes access of a video device, nothing can steal it

jesup|laptop: that's true

jesup|laptop: but we're talking about different tabs from the same app
... those OS level controls don't help you

Travis: you'd have to redefine it

hta: we have multiple OSs with different semantics
... opening the same camera in different tabs is extremely useful

ekr: BBB
... Chrome currently allows two tabs from different origins to have camera access
... Firefox does not
... i probably agree with martin
... i don't want to decide this without justin on the phone
... and then revisit it
... unless hta can proxy

hta: i can't proxy

jesup|laptop: i'd largely agree with martin

jesup|laptop: it covers most cases safely
... if you want to allow unsafe sharing
... it could be an about:config option

hta: this seems like a best practice implementation concern

ekr: given i know how chrome behaves
... with manipulation
... we could forbid it
... describe interaction
... maybe one tab controls settings
... or just live with it
... i think just live with it won't work

stefanh: i'd like to record an issue/action

Travis: i agree

ekr: justin might be on in 10 minutes

martin: run this at the end if we have the time?

Travis: resuming...
... if you look at video stream source and audio stream source
... there's a method for stopping a source on the source object
... there's a method for getting the number of available devices of this type
... which replaces the devices from the previous version
... as i mentioned in this proposal, there's a fingerprinting issue
... this allows the app developer to skip the request for camera if there aren't any
... and if you'd like to be aware of new devices
... the only mechanism is to poll at regular intervals

stefanh: is there the possibility to change the source of a track?

Travis: good question

<ekr> chairs, should we be using queue discipline here or just jump in.

Travis: i hadn't thought too much about that

<ekr> I tried to be polite this time, but I'm good either way

Travis: with the new model
... it's extremely easy to create a new track
... and then set a new source from it

stefanh: if you create a track and send it over peer connection
... if you create a new one and do a new negotiation

Travis: that's in the peer connection

adambe: if i request 2 different streams
... from getUserMedia
... and get 2 tracks referencing the same video source

Travis: in my proposal

<martin> for this last item :)

Travis: there's an expectation there's a single source object
... no matter how many tracks you create
... you'd be affecting global settings for all tracks

hta: you're doing settings from the source

adambe: seems a bit confusing
... you could have competing settings

Travis: that's why it's important to represent it that way
... you might have two, but if you only have one
... this singleton settings object helps enforce that concept
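
The single-source expectation can be illustrated with a small sketch. The class names and the `zoom` field are hypothetical; the point is only that two tracks holding the same source object see each other's changes.

```typescript
// Hypothetical model: one source object per physical camera.
class SharedCameraSource {
  zoom = 1;
}

class CameraTrack {
  constructor(readonly source: SharedCameraSource) {}
}

// Two getUserMedia-style requests for the same camera would hand back
// two tracks that reference the same source object.
const theCamera = new SharedCameraSource();
const trackA = new CameraTrack(theCamera);
const trackB = new CameraTrack(theCamera);

// Changing the setting through one track...
trackA.source.zoom = 3;

// ...is visible through the other, because there is only one source.
const zoomSeenByB = trackB.source.zoom;
const sameSource = trackA.source === trackB.source;
```

The same sharing is what makes competing settings possible when the two tracks belong to different consumers.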

ekr: looking at this interface
... assigning an identifier for a camera
... for the duration of an origin
... doesn't increase the fingerprinting
... if there are 5 cameras and they're named A,B,C,D,E
... then the site can build up some sense of the cameras you have
... look at the Hangouts interface

<martin> +1 to guid idea, so that you can have some sort of application stability, and request a specific camera again when you return

ekr: with the camera picker
... and you flick back and forth
... to allow the site to present them in the same order
... otherwise it's hard to flip back and forth

<martin> but you must ensure that different origins get different, unlinkable identifiers

ekr: either number 0..N
... or simply have them have hokey identifiers
... and have an attribute to getUserMedia for "get me device N"

<martin> numbering from zero is not going to work, because the set of devices is not stable

Travis: object identity serves that purpose
... but for constraints, you need that

<martin> getDeviceIds() : sequence<guid>

burn: i went to the WebRTC conference last week
... and someone specifically mentioned that
... Nurses coming into a room
... as an application he wants a way to say "i want the last device i set up"
... or "get back the one i got before"
... a constraint that's mandatory/optional

<martin> constraint = deviceId : guid
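
One way to get the property martin asks for (stable for a returning origin, unlinkable across origins) is for the browser to keep a table of random identifiers keyed by origin and device. The sketch below assumes hypothetical names throughout; `Math.random` stands in for the cryptographically secure generator a real implementation would need.

```typescript
// Hypothetical browser-side table: (origin, internal device key) -> public id.
class DeviceIdTable {
  private readonly ids = new Map<string, string>();

  // Returns the same id every time for the same origin and device,
  // and an unrelated random id for any other origin.
  idFor(siteOrigin: string, internalDeviceKey: string): string {
    const key = `${siteOrigin}\n${internalDeviceKey}`;
    let id = this.ids.get(key);
    if (id === undefined) {
      id = DeviceIdTable.randomId();
      this.ids.set(key, id);
    }
    return id;
  }

  // 128 random bits as hex. Math.random is NOT suitable for real use.
  private static randomId(): string {
    let hex = "";
    for (let i = 0; i < 4; i++) {
      const word = Math.floor(Math.random() * 0x100000000);
      hex += ("00000000" + word.toString(16)).slice(-8);
    }
    return hex;
  }
}

const idTable = new DeviceIdTable();
const firstVisit = idTable.idFor("https://a.example", "front-camera");
const secondVisit = idTable.idFor("https://a.example", "front-camera");
const otherSite = idTable.idFor("https://b.example", "front-camera");
```

A site could then pass the identifier it saw earlier back as a `deviceId` constraint, while a second origin holding a different identifier for the same camera learns nothing by comparing notes.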

gmandyam: back to get-num-devices
... hanging off streamSource
... i thought i understood
... but now i'm not really sure
... i thought it was device specific
... but it seems like it'd be better to be a method you obtained at a higher level api
... why is it here instead of elsewhere?

<adambe> +1 on gmandyam's proposal

Travis: the reason, gmandyam, is to separate between Video and Audio categories
... i could have a method on [Navigator]
... but that seemed kludgy
... so i put it here
... the Static moniker on that method
... means there's only a single instance of the method
... on the Constructor rather than on the object instances
... so it'd be on VideoStreamSource.

gmandyam: but VideoStreamSource is specific to a device

<dom> VideoStreamSource::getNumDevices()

Travis: the instance of a VideoStreamSource videoStreamSource is specific to a device
... but the class method isn't
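
The distinction Travis draws is the usual one between a static method and instance members. A sketch, with a hypothetical class and a hard-coded device list standing in for whatever the browser would enumerate:

```typescript
// Hypothetical stand-in for VideoStreamSource; the device list is made up.
class SketchVideoStreamSource {
  private static readonly attachedCameras = ["front", "back"];

  // Static: called on the class itself, before any source object exists.
  static getNumDevices(): number {
    return SketchVideoStreamSource.attachedCameras.length;
  }

  // Instance state: specific to one device.
  constructor(readonly label: string) {}
}

// An app can skip the camera request entirely when this is zero.
const cameraCount = SketchVideoStreamSource.getNumDevices();
const shouldAskForCamera = cameraCount > 0;
```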

hta: we should speed up

Travis: there's a PictureStreamSource
... an inherited source from VideoStreamSource
... but it allows the high-res-photo bits

Travis: new in this proposal are RemoteMediaSources
... things you get from PeerConnection
... i think stefanh had great input
... this source object would allow you to change settings from a remote source
... Section 4 is how i want to frame
... what it means to change settings
... if you haven't read this, this is the most important thing to read through
... sources on a `home device` don't always necessarily impact sources on a client across the network
... there's a divide with a PeerConnection
... sinks themselves can communicate back to a source device what the optimal settings would be
... it doesn't make sense to use a high res for a source if none of the sinks need it
... the way you change settings is in 4.2
... anant and i came up with this over TPAC
... 3 apis
... you provide a setting (attributes defined in the spec)
... getRange()
... given a setting tells you the range for the setting
... get()
... gives the value for a setting
... set() is the request to change a setting
... you pass in a constraint
... these set() requests are queued and the UA attempts to apply them
... anything isMandatory that can't be satisfied raises an error
... there's no feedback for success
... in section 5 i talk about constraints
... that map to other things in the proposal
... in section 6 i redid the syntax of various scenarios in JS
... gmandyam looking at GetNumDevices in 6.1 might help you understand how that works
... there was great feedback on the list
... lots of little things i'd like to have another go at
... it seems this is still moving in the right direction
... we still need a discussion on sync getUserMedia
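
The three calls in section 4.2 might behave roughly as follows. This is a sketch under assumptions: the class, the numeric-only settings, the explicit `flush()` step standing in for the browser's asynchronous queue processing, and the error callback shape are all hypothetical, not taken from the proposal.

```typescript
interface SettingRange { min: number; max: number; }
interface SetRequest { setting: string; value: number; isMandatory: boolean; }

class SketchSettingsSource {
  private readonly pending: SetRequest[] = [];

  constructor(
    private readonly ranges: Record<string, SettingRange>,
    private readonly current: Record<string, number>,
    private readonly onError: (setting: string) => void,
  ) {}

  getRange(setting: string): SettingRange | undefined {
    return this.ranges[setting];
  }

  get(setting: string): number | undefined {
    return this.current[setting];
  }

  // set() only queues the request; nothing changes until the queue runs.
  set(setting: string, value: number, isMandatory = false): void {
    this.pending.push({ setting, value, isMandatory });
  }

  // Apply queued requests in order. Unsatisfiable mandatory requests
  // raise an error; unsatisfiable optional ones are silently dropped.
  // Success produces no feedback.
  flush(): void {
    while (this.pending.length > 0) {
      const request = this.pending.shift()!;
      const range = this.getRange(request.setting);
      const ok = range !== undefined &&
        request.value >= range.min && request.value <= range.max;
      if (ok) {
        this.current[request.setting] = request.value;
      } else if (request.isMandatory) {
        this.onError(request.setting);
      }
    }
  }
}

const failedSettings: string[] = [];
const sketchCamera = new SketchSettingsSource(
  { zoom: { min: 1, max: 4 } },
  { zoom: 1 },
  (setting) => { failedSettings.push(setting); },
);

sketchCamera.set("zoom", 2);        // in range: applied on flush
sketchCamera.set("zoom", 10, true); // out of range and mandatory: error
const zoomBeforeFlush = sketchCamera.get("zoom");
sketchCamera.flush();
const zoomAfterFlush = sketchCamera.get("zoom");
```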

martin: one thing that's a little unclear on Settings
... i see Sources having constraints they're operating under
... within those constraints, it can change its mode at any time
... in response to sinks
... in the example in 4.1
... output video scales down, camera scales down
... kinda wrong on this, the source has constraints
... but the streams should have concrete values
... width/height/framerate
... i'd like to set constraints on source
... but you only get output-value on Stream

burn: related
... at TPAC
... i thought we decided between settings/constraints, we'd just have constraints
... to be very precise, you'd set constraints that only allow a single value
... a single value isn't something you would set

Travis: when you talked about Stream, i mapped that to Track

martin: i meant that
... maybe we need to change the name, it's confusing

Travis: i think that makes sense
... i'd change where you read values to Tracks
... and change how getRange/set/get
... so you could introspect constraints on a source
... and modify the constraints
... so you'd be affecting constraints on a device
... if you narrow constraints to only allow a single value
... it'd either allow a single value, or fail
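
The idea that a setting is just a constraint narrowed to a single value can be sketched as range intersection. The types and the `narrow` helper are hypothetical illustrations, not part of any proposal.

```typescript
interface WidthConstraint { min: number; max: number; }

// Intersect what the device allows with what the app asks for.
// Returns null when nothing satisfies both, i.e. the request fails.
function narrow(device: WidthConstraint, requested: WidthConstraint): WidthConstraint | null {
  const min = Math.max(device.min, requested.min);
  const max = Math.min(device.max, requested.max);
  return min <= max ? { min, max } : null;
}

const deviceWidths: WidthConstraint = { min: 320, max: 1920 };

// min === max pins the source to exactly one value...
const pinned = narrow(deviceWidths, { min: 1280, max: 1280 });

// ...or fails outright if the device cannot produce that value.
const impossible = narrow(deviceWidths, { min: 4096, max: 4096 });
```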

burn: just trying to make clear what we're doing wrt settings

Travis: makes sense
... re: stefanh 's comment about dropping width/height as a constraint
... and make that something that sinks tell sources under the hood

gmandyam: martin asked about
... Settings v. Constraints
... i put this in my email yesterday
... i asked about vendor values in the IANA registry
... without that, it isn't very useful

<martin> we need extensibility, that should be fine with expert review

stefanh: the registry is going to expert review

burn: correct

hta: anyone can request one

Travis: to wrap up
... we ought to have a way to specify a specific source identifier
... so you can request it using a constraint
... and restructuring read()ing settings and how we go about applying settings
... i'll take an action to make those adjustments

stefanh: when do you think you can have that?

Travis: within a couple of weeks
... before Christmas

stefanh: we'd like to move forward

martin: i wanted to make sure we captured the issues
... the other that hasn't been discussed this morning is mandatory constraints
... wrt fingerprinting

stefanh: we should also sort out windows for the same camera

Synchronous getUserMedia

martin: has everyone got the slides i sent around the other day?

<stefanh> http://lists.w3.org/Archives/Public/public-media-capture/2012Dec/att-0027/a_synchronous_choice.pptx

martin: peer connections require media streams
... often device characteristics determine the nature of the stream
... you can't do negotiation without consent
... in call-answer this isn't desirable
... there's a reason to want a placeholder stream
... stable identifier "camera", "microphone"
... browser needs to be able to continue to identify this
... option 1

[ Slide 4 of 8 ]

martin: getUserMedia returns streams that do not start until consent is granted
... A new started event is added to tracks.
... denial of consent ...

burn: when permission is never granted
... you could get end-event without start?

martin: yes
... the user could also get the consent prompt and never click anything
... in which case you get no events
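
The event sequences martin and burn walk through for option 1 can be sketched as a small state machine. The state names, the `resolveConsent` method, and the event log are hypothetical; only the `started` and `ended` events come from the discussion.

```typescript
type ConsentTrackState = "pending" | "live" | "ended";

class ConsentGatedTrack {
  state: ConsentTrackState = "pending";
  readonly events: string[] = [];

  // Called when the user answers the permission prompt.
  // If the user never answers, this never runs and no events fire.
  resolveConsent(granted: boolean): void {
    if (this.state !== "pending") {
      return; // consent is decided at most once
    }
    if (granted) {
      this.state = "live";
      this.events.push("started");
    } else {
      this.state = "ended";
      this.events.push("ended"); // ended without ever having started
    }
  }
}

// The call returns immediately; the track exists before consent.
const grantedTrack = new ConsentGatedTrack();
grantedTrack.resolveConsent(true);

const deniedTrack = new ConsentGatedTrack();
deniedTrack.resolveConsent(false);

const ignoredTrack = new ConsentGatedTrack(); // prompt never answered
```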

burn: thanks

martin: option 2

[ Slide 5 of 8 ]

martin: As option 1, except the return value is a wrapper:

<scribe> scribe: dom

martin: another option is to use an extra arg as harald proposed at some point
... a third option, one that I quite like, is the idea that you can create a new mediastream via a constructor
... getUserMedia connects the object to a source
... constraints are then attached directly to tracks
... there is currently a sort of mismatch between where the constraints are set and where they take effect
... no backwards compatibility option, except if we do tricks with overloading

stefanh: we had discussions around something related — allowing the IVR use case

martin: yes, there are use cases where you wouldn't need to attach to an actual media source
... it could be linked to a made up mediastream

jim: does that mean DTMF could be sent without the user giving consent?

adambe: also, This enables the developer to request more than one device of a certain type at a time (we currently limit a gUM() call to one audio device and one video device)

martin: I like this because it is compatible with the model we have developed, and with where Travis' proposal is going

ekr: [presenting option 4, slide 7]
... one problem is that we're making the simple case hard in favor of the hard case
... also, I'm not sure it is actually solving our problem
... without having selected a particular device, how can you negotiate the proper SDP?
... you don't get to know e.g. the proper codec without knowing the device
... if the app really wants to get ahead of this curve, and indicate what's needed to PeerConnection
... the app can indicate that it doesn't know@@@

-> http://lists.w3.org/Archives/Public/public-media-capture/2012Dec/0042.html EKR's idea for getUserMedia

martin: it seems you're describing an overload for option3

ekr: you're suggesting you could pass a stream object as a parameter instead of a constraint object

martin: the advantage of that is that we get the DTMF use case (although I'm not particularly excited about it, we would get both)
... otherwise, EKR's proposal doesn't let one create a stream to send DTMF without a device attached

Jim: what about a stream built from a file, could you do that without permission?

martin: we don't have that yet

adambe: you would still need to get permission to get the file though

travis: for a file, I would assume you would need to have a constraint indicating to look for a file rather than a device
... and then getUserMedia would suggest a file rather than a source

<martin> my apologies, I will re-enter queue

<ekr> One question is do we want to be able to replace streams

[I'll note that generating mediastream from various sources is something that will be useful in many other ways]

<ekr> Like, what do you do when the user changes the camera

<ekr> jesup++

jesup|laptop: this is similar to the placeholder concept

jesup|laptop: it's important to me: a common use case is mute/pause, where you want to replace the stream with a slate or a pre-recorded video
... also linked to capture from a file (which you can do through captureStreamUntilEnded in our stuff)
... so it doesn't necessarily need to be linked to getUserMedia

gmandyam: re option 2, when you say extra arguments to gUM

gmandyam: could one of the args be peerIdentity=true that ekr suggested during TPAC
... could this also be linked to consent to a muted stream?

martin: I think this covers all of the existing constraints we would have
... thinking about the peerIdentity constraint, this is absolutely essential
... what randell was talking about wrt captureStreamUntilEnded(), that's a perfect example of other ways mediastreams can be handled

<burn> +1 martin

martin: I think getUserMedia would be better limited to *user* media, and have other APIs for other sources


<adambe> +1 martin

ekr: the point that jesup just raised is very important
... it's important to be able to replace streams
... it's not clear to me how to do that with martin+travis proposals

<martin> Swapping streams should be possible, though in many cases it could require a renegotiation. I'm not sure that I like ekr's proposal.

<ekr> I'm using Opus!

ekr: it needs to be worked out, or it's something to hold against it

<ekr> just saying :)

martin: swapping a stream is fairly important, so we should look at fixing that somehow
... peer connection might be the mechanism

stefanh: how do we conclude this discussion?

<martin> for <video>, it's easy, just set <video>.src to a new value; pc.replaceStream sounds feasible

hta: option4 is do nothing, option3 is radical
... combining both seems hard

martin: I think it's doable

ekr: should we try to hash something out together?
... I don't want to side with option3 until I see this dealt with

<scribe> ACTION: martin to look at combining sync gUM option 3 and 4 [recorded in http://www.w3.org/2012/12/06-mediacap-minutes.html#action01]

<trackbot> Sorry, couldn't find martin. You can review and register nicknames at <http://www.w3.org/2011/04/webrtc/mediacap/track/users>.

Recording proposal

stefanh: I would like to make an official FPWD out of the proposal Jim has been working on

travis: I support that

ekr: @@@

ekr: can we have a CfC on this?

gmandyam: I think another revision of the document is necessary
... I don't want to see it go FPWD until then
... due to the call for exclusion

hta: your objections were about file vs blob

gmandyam: no, I sent another round of comments this morning
... I don't think we're far away
... I don't think the current is suitable for FPWD

stefanh: but would you oppose making it an editor's draft?

gmandyam: not at all

stefanh: so let's make it an editor's draft

Travis: +1 on making it an official draft

<martin> +1 travis

Travis: I would also like to make it so that the recorder can set width/height as per our previous discussion

jim: would that be a setting?

travis: I don't know yet

hta: I see a constraint

travis: setting is more user-friendly :)

Moving forward with getUserMedia

hta: I'd like to see something stable on which we can build
... if we think we have all the capabilities we need in gUM
... with tracks and setting changes
... we have a usable functionality set
... So what I would like to do is to get all that stuff in the document before the new gregorian year
... and send out a call after the new year, spending a month nailing the various remaining issues and nits
... trying to solve as many as possible in the list before the F2F
... then hammering as many as we could during the F2F
... and then go to LC
... This depends on doing some heavy editing before the new year, including Travis's proposal which should be ready around Dec 20
... I think that's the fastest we can do, and it would be beneficial for people shipping implementations to get stability on this piece
... I have buy-ins from the editors to spend significant time on this before Christmas

ekr: that sounds like sci-fi to me

<martin> +1 to ekr

ekr: we're entering a period when nobody does anything

<juberti> +1

travis: I'm a little skeptical, but I'm sure I can have a new iteration that addresses the various points that have been raised
... but then it needs to be integrated, which I don't know how much time it will require

<ekr> so dan will have time right at the time that travis is done :)

dan: I have some time until the 18th, and some time around the end of December

<ekr> sorry, will be busy right when travis is done

dan: so I can have some of the changes before christmas, and the constraints and settings end of Dec

Jim: I also have time around christmas if needed
... but there is still a lot of work around error conditions
... I would like to see who is interested in helping on that

hta: we did have a proposal from Anant on error processing
... but we do need to nail the details

Travis: I'd like to participate in that conversation as well

jim: should we do that on the list and see how far we can go?

travis: yes

gmandyam: +1

jim: but I would be surprised if we can get it done and integrated before Dec 30?

dan: I'll have time; but if the proposals aren't ready, that won't help :)

hta: so I have permission to hound you to get consensus on this :)

Travis: as you've been empowered by the Powers That Be
... I'll get to it as it is still fresh in my mind and try to get it out early

stefanh: sounds like we have a plan, if somewhat ambitious :)

hta: let's try it!

F2F meeting

stefanh: there will be a 3-day F2F meeting, half IETF/half W3C
... half of Tuesday would be used for Media Capture
... and the following half day would be for WebRTC
... if our gUM plan works out, we can use some of the remaining time to resolve some of the outstanding issues
... what do you think of that approach?

jim: sounds reasonable. Weren't there complaints about splitting days in half?

hta: not complaints as much as statements that we should nail down the schedule early

<stefanh> MediaCap on Tuesday Feb 5 (half day) is the proposal

ekr: I would really appreciate if the chairs could provide agendas as soon as possible

martin: I think part of our WebRTC call next week should nail the agenda of the F2F

ekr: one potential objection to that split is if we have to drop everything on the floor while in the middle of a W3C discussion because we need to switch to IETF

stefanh: one potential issue is if someone comes only for media capture
... we need to provide them with a schedule plan

gmandyam: one thing I don't want to see happen is things discussed in WebRTC that are relevant to Media Capture
... I think the meeting should be officially a joint meeting between WebRTC and the Media Capture Task Force

hta: the difficulty is that a lot of things in WebRTC are relevant to Media Capture

gmandyam: don't disagree, but I think for instance ekr's presentation at TPAC should have been considered Media Capture stuff rather than WebRTC

travis: that stresses the need for setting the agenda early and clearly

<ekr> No! Don't disagree with me!

burn: I want to make sure there is time left on the agenda to discuss things raised in one context that matter in the other

<gmandyam> Agree with Dan

burn: that justifies the alternative format

hta: flexibility is good


<martin> and no thanks for the progress we made :(

Summary of Action Items

[NEW] ACTION: martin to create proposal for combining options 3 and 4 [recorded in http://www.w3.org/2012/12/06-mediacap-minutes.html#action02]
[NEW] ACTION: martin to look at combining sync gUM option 3 and 4 [recorded in http://www.w3.org/2012/12/06-mediacap-minutes.html#action01]
[End of minutes]
