Media Capture Task Force F2F

19 May 2014


See also: IRC log


Present: HTA, StefanH, Giri, Adambe, AdamR, JIB, EKR, Martin, pthatcher, Juberti, shijun, bernard, Fluffy, burn, JimB, AlexGouaillard, Dom, TedHardie, AndrewH, UweRauschenbach, DanDruta, Suhas, Jesup (remote), DiniMartini, MaryBarnes, RomanShpount
Chair: hta, stefanh
Scribe: gmandyam, juberti


<stefanh> agenda and other stuff at https://www.w3.org/wiki/May_19_2014

<dom> ScribeNick: gmandyam

<stefanh> Minutes last meeting: http://lists.w3.org/Archives/Public/public-media-capture/2014Mar/0176.html

Minutes approved by group

<Ted_> So, I currently see only one remote participant in the Hangout.

Proposed agenda: https://www.w3.org/wiki/May_19_2014

getUserMedia bug runthrough

hta went over open request for bugs. Not all bugs will be addressed in the presentation.

<dom> Harald's presentation on gUM bugs

Presentation covers bugs that were filed up through Friday, May 16, 2014

Bugs broken into 4 categories: Nits (no discussion required), Not specified (missing features - need to decide whether to include), Works Wrong, and Functional Extensions

Not specified bugs: permissions persistence duration, and "when does the light come on?"

Functional extensions: "other" VideoFacingMode, getCapabilities() with no track, event propagation when devices change, separate "access" call, "Ideal" or "tendentious" limit values

Works wrong: getSettings() (async vs. sync)

10 minutes of discussion per bug

First issue (Bug 22214): How long do permissions persist?

Martin: Permissions only persist as long as the web page is part of active browser context (paraphrased - Martin may correct)

juberti: Permissions may also be explicitly turned off, but once light goes off and there are no persistent permissions then the user could be required to re-permit access

ekr: worried about permissions that are not in the user control
... There are two operational modes: one where you get permanent access for any HTTPS site, and one where access is dependent on the RTPpc

juberti: Allowing permissions for HTTP versus only for HTTPS did increase the probability of attack

adambe: Stopping a stream (e.g. unplugging a capture device) results in a MediaStream.stop event, and the user should be re-prompted when this occurs

hta: Hanging up the call (WebRTC) is different from ending the capture stream

ekr: Long-lived sessions (e.g. 3-day FB sessions) should not have persistent permissions

juberti: Users don't trust/pay attention to camera light
... HTTPS could solve the issue of persistent permissions

Ten minutes expired, moving on to next bug

cullen: I propose we do not leave this meeting without solving the permissions model. Maybe we should focus on HTTPS first, and then move on to HTTP. The spec is silent on the topic.

Bug 22337: When does the light come on? - next topic of discussion

ekr: There are two indicators available: the HW light, and the browser chrome

hta: We are talking about browser chrome - the other stuff is not in our control

ekr: As long as in principle you can reacquire the camera, the browser chrome should provide an indication to the end user
... Only the webpages actively making use of the camera should be indicated to the end user; not the ones who have persistent permissions

hta: The indicator that a page has access to the camera is different from an indicator that a page has active access to a capture stream

burn: There may be multiple indicators to the end user. As an end user, if all of those indicators are "off", then I would expect that the capture device would not come on w/o permission.

DanD: Indicator should be consistent for each domain.

hta: Reasonable consensus has emerged: indication that permission has been granted, and distinct indication that audio/video is being captured are both important and should be provided by the UI

juberti: MediaStream.stop should turn off the capture indicator

ekr: We need something more than advice, so that browser vendors will implement it.

<fluffy> Issue 22337

<fluffy> Two indicators - one when there is a permission

<fluffy> - one where media is being captured

<fluffy> When you call media stream stop, the capture light goes out

<fluffy> Will say MUST indicate. We don’t say how we indicate.

<burn> it's media track stop, and the light would only go off if all tracks accessing that source have stopped

Conclusion: Guidance will be provided to browser vendors in specification: provide a UI indication that the permission has been granted, and provide a different UI indication that the capture device is active and the application has access to the MediaStream
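
The two-indicator conclusion, together with burn's note that the capture light only goes off once every track using the source has stopped, can be sketched as a small state model. This is a minimal sketch, not browser code: the class and method names are hypothetical and do not come from the getUserMedia draft. Permission lifetime was still under discussion, so the model neither expires nor revokes permission.

```typescript
// Hypothetical model of one capture source (e.g. a camera).
// None of these names come from the getUserMedia draft.
class CaptureSourceModel {
  private permissionGranted = false;
  private liveTracks: number[] = [];
  private nextTrackId = 1;

  grantPermission(): void {
    this.permissionGranted = true;
  }

  // Returns an id for a new live track; fails without permission.
  openTrack(): number {
    if (!this.permissionGranted) {
      throw new Error("permission has not been granted");
    }
    const id = this.nextTrackId++;
    this.liveTracks.push(id);
    return id;
  }

  // Stopping one track removes it; the source stays active while others remain.
  stopTrack(id: number): void {
    this.liveTracks = this.liveTracks.filter((t) => t !== id);
  }

  // Indicator 1: the page holds permission to capture.
  permissionIndicatorOn(): boolean {
    return this.permissionGranted;
  }

  // Indicator 2: media is being captured right now.
  captureIndicatorOn(): boolean {
    return this.liveTracks.length > 0;
  }
}
```

With two tracks open on the same source, stopping one leaves the capture indicator on; stopping the second turns it off while the permission indicator stays on.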

martin: Some of the mobile platforms may not be able to display indicators (due to lack of display real estate)

shijun: (1) We should not have any dependencies on the HW light, as this is in the control of the driver, (2) We should consider revoking capture permission when the webpage has received a stop event on the MediaStream

hta: Request to ekr and juberti to come up with a proposal during the coffee break

Bugs 25707/25708: Should getSettings() be async?

stefanh: We think that they can be obtained synchronously, and recommend closing the bugs

Conclusion: Bugs 25707/25708 will be closed. Settings can be obtained synchronously.

Bug 23820: Special values in constraints. The idea is to bias the constraint selection algorithm. One proposal is to specify a max value for range-bound constraints, and the other is to specify a third "ideal" constraint in addition to the min/max pair

jimB: These items can be deferred

juberti: max does not fully cover the concepts behind "ideal"

martin: Inf can be used to max out the constraint without "ideal"

burn: -Inf usually works for min possible, Inf works for max possible

jib: Inf is rarely desired and is not what apps are designed for; "ideal" solves this problem

jib: Without ideal, you have to repeat the constraints in ordered sequence going to lower and lower values

cullen: "ideal" is different from a sequence of {max,min}'s.
... I wonder about the uses for this. "Ideal" is usually the largest value possible. What are the use cases when ideal is not the max?

<dom> "ideal:Infinity"

ekr: Agree. "Ideal" is usually max resolution of camera

<Zakim> burn, you wanted to respond to an earlier JIB comment

JimB: You can query capabilities to get max value; right now if you ask for something out of range it fails. We would have to change the spec to allow for a non-specific max value

juberti: We would like to avoid "advanced" constraints

burn: This is too much to add to the spec at this point

jib: WebIDL has Inf/-Inf. So we may already have a way to express indeterminate max and min. There is an expectation however that the max end of range will always be allocated, while the browser may allocate the midpoint. "Ideal" addresses this issue.
... Firefox can also implement ordered constraints if "ideal" is not adopted.

juberti: There are capture cards that only provide fixed res, but "ideal" will allow you to operate even if a lowered res is desired

cullen: I propose that at the coffee break I work offline with jib and any other interested parties to incorporate "ideal"

hta: Include burn too

martin: There are a number of cases where the no. of pixels on the other end (WebRTC) are less than what the local capture device produces. Then "ideal" would not match max.
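
The "ideal" proposal discussed above can be illustrated with a toy selection function: min and max act as hard bounds, and "ideal" biases the choice toward the closest supported value. This is a sketch under stated assumptions, not the algorithm from any draft: the `RangeConstraint` shape and `pickValue` are hypothetical, and treating an infinite ideal as "as large (or small) as possible" follows the "ideal:Infinity" remark only loosely.

```typescript
// Hypothetical constraint shape for one numeric property such as width.
interface RangeConstraint {
  min?: number;
  max?: number;
  ideal?: number;
}

// Returns the chosen value, or undefined if min/max cannot be satisfied.
function pickValue(supported: number[], c: RangeConstraint): number | undefined {
  const min = c.min === undefined ? -Infinity : c.min;
  const max = c.max === undefined ? Infinity : c.max;
  const candidates = supported.filter((v) => v >= min && v <= max);
  if (candidates.length === 0) {
    return undefined;
  }
  const ideal = c.ideal;
  if (ideal === undefined) {
    return candidates[0]; // no preference stated: any candidate is acceptable
  }
  // Distance to the ideal; infinite ideals mean "as large/small as possible".
  const distance = (v: number): number => {
    if (ideal === Infinity) return -v;
    if (ideal === -Infinity) return v;
    return Math.abs(v - ideal);
  };
  let best = candidates[0];
  for (let i = 1; i < candidates.length; i++) {
    if (distance(candidates[i]) < distance(best)) {
      best = candidates[i];
    }
  }
  return best;
}
```

For a camera offering widths 320, 640, 1280 and 1920, an ideal of 640 (say, because the remote end displays fewer pixels) selects 640 rather than the maximum, without the application having to list a descending sequence of min/max pairs.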

<jesup> I think ideal should be added

The chair then put forward an informal poll: (1) Do you have enough info to make a decision? (2) Do you think "ideal" should be added? (3) Do you think "ideal" should not be added?

<jesup> 1: yes, 2: yes, 3: no

More than half the room was ready to make a decision. About 20 persons felt "ideal" should be added, and 2 persons felt that "ideal" should not be added. Therefore no smooth consensus was detected in the room.

<hta> there was rough consensus

Martin requested that the persons who voted against "ideal" make their concerns known.

burn: The "advanced" list enables more than what "ideal" does, and there have been many statements claiming that none of the "ideal" use cases can be accomplished with the existing syntax, which is just not true.

JimB: I don't think we need to add "ideal" now, because "advanced" accomplishes what is required

Conclusion: No smooth consensus. Issue unresolved.

Bug 25298: VideoFacingMode for other directions (i.e. "other")

<Ted_> I think "Dangerously underspecified" as a value.

hta: enum values in WebIDL are not easily extensible

Possible JS representations (acc. to hta slides): "unknown" or "other"

jib: Unknown values in WebIDL-compliant enums end up in thrown exceptions
... Agreeing w/Justin: "other" and "unknown" are different. Absence of a dictionary member should be "unknown"

jib: "other" would be a JS undefined
... We could use a DOMString instead of an enum for future proofing, but the JS undefined will be returned for other/unknown directions

cullen: An undefined value can get into SDP, which could be problematic

Action item for jib: Write up proposed change

Conclusion: VideoFacingMode is no longer an enum - it is a DOMString
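
One way to picture the practical effect of moving from an enum to a DOMString: a script can receive a facing-mode string it does not recognise without anything throwing, and an absent member reads as undefined. The sketch below is an illustration only. `classifyFacingMode` is a hypothetical helper, the list of known values is assumed to be the four values of the former enum, and the mapping of "absent" to "unknown" follows jib's later correction in these minutes.

```typescript
// Values assumed here to be those of the draft's former enum.
const KNOWN_FACING_MODES = ["user", "environment", "left", "right"];

type FacingModeClass = "known" | "other" | "unknown";

// Hypothetical helper: classify a facingMode value read from a track's settings.
function classifyFacingMode(value: string | undefined): FacingModeClass {
  if (value === undefined) {
    return "unknown"; // member absent: direction not reported
  }
  if (KNOWN_FACING_MODES.indexOf(value) >= 0) {
    return "known";
  }
  return "other"; // a string this script does not recognise; no exception
}
```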

Bug 25247: GetCapabilities w/o a track

hta: Would be nice to know what is possible to call w/o first calling gUM

Martin: We don't solve this problem

dom: what I understand Martin was summarizing - fingerprinting is a concern.

JimB: Agree with Dom and Martin. Have an enumerate devices call instead.

Conclusion: We don't do this. Close the bug.

<jesup> yes!

<jib> correction from earlier: "unknown" would be JS undefined

Bug 24015: Event to signal when devices have changed

<jesup> Yes == yes we want an event for device plugged/unplugged

hta: Would track when devices are plugged or unplugged
... We don't have an event target that can track this.

Conclusion: Add an event to listen for device changes

Bug 23128: "get access" call

The bug was closed, but should it be revisited?

<jesup> I want to create a MediaStream with a video track from a Canvas, and tell it to capture frames either by telling it "capture now", or (maybe) by setting a delta time between async captures (though one could do it through setTimeout)

<jesup> I was going to write something up, but ran out of time

dom: This is complicated. I recommend keeping it closed. Also, the problem is more generic than just media capture.
... There is some ongoing work on permissions management in the W3C. Recommend living with the specification as is.

Conclusion: Bug remains closed

<jesup> I may try to piggyback this on Martin's presentation :-)

cullen: Propose that we don't discuss Recording right away. Open bugs should be prioritized.

Moving on to Recording API. Bug topic closed. No change in agenda.

Recording API: https://dvcs.w3.org/hg/dap/raw-file/default/media-stream-capture/MediaRecorder.html

<fluffy> https://dvcs.w3.org/hg/dap/raw-file/default/media-stream-capture/MediaRecorder.html

JimB introduces API. start() is the basic record method, with optional timeslice parameter.

MediaRecorder API

JimB: There are onerror and onwarning events. onwarning events have not been well-justified, so recommend removal.

juberti: Agreed with removal of onwarning.

Martin: Warnings are not programmatically actionable, so agree with removal. Also, Mozilla does not currently implement them anyway.

JimB: Is MIME type sufficient for recording? {Most of the room felt the answer to this is no}

juberti: Recording options should match the track resolution

gmandyam: This was meant to address native media recorders on handheld devices which only offer up fixed resolutions

juberti: Bit rate of recording should be under the app control.

Recording API conclusion: MimeType and bit rate are required for MediaRecorder constraints

hta: This is not sufficient. we need an indication of 'seekability'

cullen: It may be enough to find out what the format is, not to control it

JimB: This is why it is constrainable

juberti: Doohickeys may be a solution to this

Roman: Does MIME type include parameters (answer was yes)
... How is timeslice defined (paraphrased)? {Answer is this is clock time, not audio time}

JimB: There has been no MTI codec defined for recording

martin: Recommend avoiding this topic of MTI codecs for recording

Summary of conclusions for Recording API: Get rid of onwarning, constraints include bitrate and MIME type

juberti: onstart, onpause, etc. can be collapsed into a state change event handler

dom: We should look at other similar specs before going forward with an onstatechange event handler

juberti: WebRTC has onstatechange, but MediaStream has onstarted/onended
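
juberti's suggestion of collapsing onstart, onpause and similar handlers into one state change handler can be sketched with a toy recorder model. This is a minimal sketch under stated assumptions: `RecorderStateModel` and the handler signature are hypothetical, no media is recorded, and the three state names are assumed to match the recorder draft's recording states.

```typescript
// State names assumed to match the MediaRecorder draft.
type RecordingState = "inactive" | "recording" | "paused";

// Hypothetical model: one handler observes every transition.
class RecorderStateModel {
  state: RecordingState = "inactive";
  onstatechange: ((previous: RecordingState, current: RecordingState) => void) | null = null;

  private transition(next: RecordingState): void {
    if (next === this.state) {
      return; // no change, no event
    }
    const previous = this.state;
    this.state = next;
    if (this.onstatechange) {
      this.onstatechange(previous, next);
    }
  }

  start(): void {
    if (this.state === "inactive") this.transition("recording");
  }

  pause(): void {
    if (this.state === "recording") this.transition("paused");
  }

  resume(): void {
    if (this.state === "paused") this.transition("recording");
  }

  stop(): void {
    this.transition("inactive");
  }
}
```

An application that previously registered four separate handlers would register one and switch on the pair of previous and current states.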

<mt_> A foolish consistency is the hobgoblin of little minds,...

<dom> scribenick: juberti

<jesup> mt_: Can you handwave about adding mediastream = canvas.captureStream(), and then either strobing it (canvas.captureFrame()?) or setting a periodic sampling (canvas.captureFramePeriod() or Timeout() or whatever)

<jesup> along with the media.captureStream* stuff

Back to constraints

cullen: will propose simplification of constraints stuff
... ideal allows us to really simplify something that has grown overcomplicated

martin: when

cullen: august 15

martin: seriously?

cullen: lots of stuff in flight

hta: that's not reasonable.

cullen: it will be done well before the WG agrees to the permission model.

stefanhak: what were the results of the breakout on this topic?

burn: some folks were supportive of ideal

cullen: I preferred the syntax of 3 weeks ago.
... I think others prefer it as well.
... I would be willing to propose something before IETF about this


stefanhak: what about permissions?

martin: discussed what does it mean for an app to disconnect a stream while retaining access to the stream?
... through some mechanism, the app could release the track, but retain permission
... we'd need a way to have separate indicators for having access to media and capturing media
... .enabled could be one way to do this

cullen: if this happens, the light will go off?

martin: yes, the light going off is the end of the indicator for capturing media
... but there will be another indicator for permission to capture media
... only when you call stop, and the track ends, would that permission go away
... we also discovered some other significant problems that we need to deal with
... (later)

cullen: what about a delay to the indicator turning off?

martin: yes, we left that as something we can say about the capture indicator

ekr: that should be left to indicators
... security requirements only should go into the draft

martin: don't want a drive-by capturing a single frame

ekr: needs to be noticeable, leave it to the implementors to figure this out

Image Capture

<gmandyam> Latest version of Image Capture spec: http://gmandyam.github.io/image-capture/

juberti: to be clear, .enabled wasn't agreed upon. But we did agree there needs to be a way to release the capture but not the permission

gmandyam: image capture API for photo taking applications
... provides a means to set camera settings, not usually available on a webcam
... also provides two methods for taking a photo, called...
... takePhoto, and...
... grabFrame, which returns a raw buffer, allowing JS image processing
... (takePhoto returns JPG)
... let's talk about settings... last discussed in 2013 TPAC
... zoom - group didn't want this as constraint
... other new setting is autofocus
... open issue - how to deal with non-autofocus cases

juberti: is this specific to image capture cases (i.e. not MSRecorder or PC?)

gmandyam: yes
... TPAC 2013 - agreed on camera preview MediaStream - no special security properties
... takePhoto has been proposed as Promise-based method
... no arguments against it
... designed ImageCapture as an overloaded object to allow use of promise-based methods
... setOptions and onoptions event handler remain

dom: does it make sense to have two interfaces?
... think image capture makes sense to be promise-based
... I think a promise makes sense when you have a single callback, and an event when you have multiple callbacks

shijun: device may have multiple pins. videopin is 1080p, image is 20M
... what does API say about this?

gmandyam: originally there were two modes - high fps for video, low fps for photo
... now we have a single MS that can go to PeerConnection or be used for taking photos
... we'll have that which can be used for preview stream, but we should be able to capture from the hi-res, low fps pin

shijun: I think there may be problems

gmandyam: Who is implementing?
... not in IE 11, Blink. Mozilla has committed to takePhoto
... any remaining questions?

pthatcher: terminology question: why photo in some places, and image in others?

gmandyam: no opposition

pthatcher: suggest replacing photo with image

gmandyam: sure, will take to mailing list

ekr: the problem we realized - changing camera from front to back
... removeStream(frontCam), addStream(backCam)

cullen: use case: want to switch cams while recording

<jesup> I desperately want to be able to change cameras without onnegotiationneeded firing

martin: I have a solution for that

<jesup> mt_: ^

martin has a solution for cullen, not jesup

Creating MediaStreams from DOM

<dom> Martin's slides

martin: discussing captureStreamUntilEnded and captureStream

<jesup> I have some comments to make if I can have the floor after Martin, if he doesn't cover what I wanted to mention (track switching, and mediastream capture from canvas). He may cover it, so I'll wait to see

martin: the idea is to create a media stream from the contents of a <video>/<audio> element
... and feed it into web audio, canvas, peerconnection
... captureStream continues capturing forever from the specified element
... captureStreamUntilEnded captures until the stream stops
... using this all the time in testing

juberti: is this only audio/video, or also canvas?

martin: just a/v for now

<jesup> juberti: I said earlier: "Can you handwave about adding mediastream = canvas.captureStream(), and then either strobing it (canvas.captureFrame()?) or setting a periodic sampling (canvas.captureFramePeriod() or Timeout() or whatever)"

martin: people have asked for P tag, DIV tag

hta: the video that is output from a video element - is it the same as what goes in? no cropping, letterboxing, etc?

martin: no, it's the same, as I understand it

<jesup> hta: Correct, no modifications

martin: one issue with capturing canvas is that canvas has to go to RGB space first

<jesup> Currently it is uncompressed

shijun: is there a way to set the media type?

gmandyam: imagine the input is a MPEG-DASH stream
... usual selection criteria apply

(martin said that)

martin: as long as the stream is playing back without interruptions, there should be no issues
... haven't hit perf issues, not that taxing

randell, you're up

jesup: having a source from a canvas is really useful
... makes mediastreams much more usable
... displaying something while on hold

cullen: when doing audio, can you play something in an audio tag in one of these tags without it going to your speakers

martin: yes. set tag to not render

burn: I like this a lot. What is the 30 second mental model here?

martin: you get a stream with what would be rendered if the input had been rendered

burn: anything you can do to simplify this

martin: the only caveat is that any letterboxing/cropping in the rendering won't apply to the stream

<jesup> fluffy++

cullen: if you want bars, use a canvas


jim: any other differences?

<burn> my comment on simplification was because of the exceptions. It's nice to have a simple explanation for a developer on how to think about this working.

martin: audio is also mixed

juberti: I think thinking about this as "same as input, except for mixing" is good
... Can this be cascaded?

martin: yes. You can cascade this as much as you want.
... Various cross-origin things to think about

martin: If you are rendering content from example.org, on example.com, you don't get access to that stream.
... various other things to think about
... similar to MediaStream isolation issues
... isolation state can change
... no way to send content from example.com to example.com
... over WebRTC. The identity stuff just doesn't work that way right now
... normally, if you're on example.com, and you load content from example.com, you can access that content

ekr: we have not yet enhanced WebRTC to allow you to exfiltrate cross-origin data back to yourself.
... specifically, if you are on example.org, and load content from example.com, you can't send the data to example.com over a peerconnection, even through example.com *should* be allowed to access that content.

juberti: ok. my interest level just went off a cliff.

martin: blocked streams. when paused, the stream becomes blocked, and no more frames come out.
... this can happen for other reasons - becoming isolated, network blips.
... usually manifests as either a frozen frame or solid black in video, silence in audio playback

hta: this should be up to the app - add an option to MSR, decide if it should record last frame, blackness, or whatever

pthatcher: do you get exactly what the element renders? if you seek backwards, would that also show up in the output?

martin: yes
... It should be optional to pause the recording when streams become muted.

burn: is this really only something that applies to recording?
... or is this just a source, and the source could have specific properties?

martin: we could map this state to muted

burn: general definition - we say, this creates a source with the following properties - then it works everywhere

dom: do we need anything here for recording? can we not just react to muted events, etc?

martin: if you do that, you might miss frames when restarting
... frames might arrive before mute notification is dispatched

adambe: if we have media element capturing to stream, going into MSR, should work regardless of stream

martin: that's dan's comment
... we should make this generic
... one more slide... boring process stuff
... captureWossname???
... i.e. what do we want to call it
... haven't seen a lot of interest
... so what do we want to do with it?

gmandyam: what about EME-protected video

martin: isolation addresses this

gmandyam: broadcasters will have to be convinced

<adambe> +q

gmandyam: can we use this for file apis

dom: picker was developed for privacy

hta: taking consensus call
... option 1) abomination, kill it
... option 2) might want to do it, but not right here or now

dom: does that mean not start work on it?

hta: yes

<jesup> I LOVE cute kittens: adopt it!!!

hta: option 3) start work

<mreavy> please adopt

hta: strong consensus for #3
... how do we progress this into the work effort?
... pull request, or new document?

cullen: i think we could go either way
... would this slow down gUM in any way
... pull request would be done in 10 minutes
... if nobody hates this, suggest same document
... avoids problem from SIP of lots of little docs

<jesup> fluffy++

adambe: maybe work on it for a while, then progress it to same doc as mediaelement

martin: argument for the chairs

jim: already a section about putting streams into video elements
... the converse should not be an issue

<jesup> yes

hta: should this be a section in the current doc

<burn> editors were nodding on this

<jesup> same document

hta: option 1) yes

<mreavy> same document please

hta: option 2) no

hta: mostly #1

<jesup> 1

<burn> since we can always pull it out into a separate doc if we get to Candidate Rec and it's holding us up

mt: I would like to see a more fully formed proposal


strike that

Back to constraints

hta: now, we need to figure out what we want to do about constraints, rolling forward or back

cullen: it's a substantial change

jim: it's a change, but not a substantial change

hta: what should the procedure be for making a decision?

cullen: people have hated constraints for a long time. how will we change that?
... ideal will not be trivial.

jim: I know how to do it.

cullen: ekr. constraints, love or hate?

ekr: thumbs down

juberti: thumbs down

jim: thumbs up

burn: constraints exist for a reason. as long as those reasons exist, they will continue to exist

cullen: i think we need to get some people together and discuss how we can move forward

dom: one approach would be a formal CfC to the list, timeframe of two weeks

hta: lack of a clear path forward


hta: we reconvene at 830 EDT, tomorrow. RTCWEB up first.
... 1230 for lunch. 1330 for WEBRTC meeting.

juberti signing off.

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-05-27 12:23:17 $