15:01:13 logging to https://www.w3.org/2021/10/14-webrtc-irc
15:01:53 Chair: harald, bernard, jan-ivar
15:02:13 Present+ Dom, Harald, Youenn, Jan-Ivar, TimP, BernardA, PatrickRockhill, CullenJennings, EladAlon
15:04:31 Slideset: https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf
15:04:35 Present+ Guido
15:04:43 Present+ BrianBaldino
15:04:56 [slide 8]
15:05:07 Bernard: [reviewing agenda]
15:05:24 Topic: The Streams Pipeline Model (Youennf)
15:05:27 [slide 9]
15:05:29 [slide 10]
15:05:51 Youenn: this presentation is about topics and issues we discussed with Jan-Ivar when we explored using Streams for media pipelines
15:06:07 ... the goal is to identify blocking issues when looking at adopting streams for media pipelines
15:06:09 [slide 11]
15:06:18 Youenn: media pipelines connect sources with sinks
15:06:30 ... sources are ReadableStreams and sinks are WritableStreams
15:06:45 ... we would want to go from camera to network just using streams
15:06:53 ... I'll be focusing only on video pipelines
15:07:06 ... and we'll look at threads and the intersection between frames and @@@
15:07:09 [slide 12]
15:07:24 Youenn: dealing with realtime media is better done off the main thread
15:07:45 ... in the Web Audio API, the graph is built on the main thread but the processing is done in a dedicated audio thread
15:08:01 ... in our case, there is no dedicated thread
15:08:15 ... the safest assumption is to assume the video frames flow where they're set up
15:08:17 [slide 13]
15:08:38 Youenn: example 1 is a funny-hat example using pipeThrough and pipeTo
15:08:53 ... it's not clear where the video frames would flow in terms of thread
15:09:31 ... the assumption would be that it runs in the same thread where these operations are being called
15:09:56 ... example 2 uses a JS transform
15:10:13 ... example 3 uses a tee - it makes it very unclear where it would run, and whether the UA would optimize it or not
15:10:29 ... so the safest assumption, with streams being a generic mechanism, is to assume same-thread
15:10:33 [slide 14]
15:10:46 Youenn: one potential related idea is to transfer the stream to a worker
15:11:28 ... this requires optimizations that are not standard and hard to expose to Web developers
15:11:40 ... the current implementation in Chrome is also not compliant
15:11:55 ... it's really hard to predict whether the optimization will kick in or not
15:11:58 [slide 15]
15:12:17 Youenn: a few examples - example 1 is the typical example where Chrome will optimize after a stream transfer
15:12:28 ... in example 2 - not clear whether optimization will happen
15:12:33 ... in example 3 - also unclear
15:12:49 ... and again in example 4, when using non-camera streams
15:13:12 ... let's say you transfer an MST to another frame, and then take a stream transferred to a worker - will it be optimized? as a developer, you can never know
15:13:22 ... as opposed to Web Audio, which gives very clear spec'd guarantees
15:13:25 [slide 16]
15:13:55 Youenn: streams are a generic tool designed for flexibility - we can't guarantee performance
15:14:12 ... we can give that guarantee with transferable MediaStreamTrack
15:14:38 ... this lets us avoid the issues associated with streams when dealing with realtime streams
15:14:56 ... additional optimizations can still happen as a bonus, but they're no longer a prerequisite
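[A minimal sketch of the slide 14 pattern, assuming Chrome's shipped MediaStreamTrackProcessor: a ReadableStream of VideoFrames is created on the main thread and transferred to a worker; whether frames then actually flow off the main thread depends on the non-standardized optimization Youenn describes.]

```js
// Main thread (module script). ReadableStream transfer itself is standard
// WHATWG Streams; MediaStreamTrackProcessor is the API shipped in Chrome 94.
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = stream.getVideoTracks();
const processor = new MediaStreamTrackProcessor({ track });
const worker = new Worker('pipeline-worker.js');
// Per spec, a transferred stream still pipes through its original realm;
// keeping frames off the main thread relies on a UA optimization.
worker.postMessage({ readable: processor.readable }, [processor.readable]);
```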
15:15:02 [slide 17]
15:15:16 Youenn: buffering with streams happens at each transform step in the media pipeline
15:15:43 ... a typical pipeline is like the one at the top, with greedy processing
15:16:17 ... but in some cases you don't want to process all frames, e.g. a 1-second-old frame might be better skipped
15:16:24 ... as MediaStreamTrackGenerator does
15:16:51 ... the second pipeline illustrates sequential processing, which can be beneficial
15:17:05 ... I think that's a safer approach
15:17:07 [slide 18]
15:17:22 Youenn: this is a real issue; VideoFrames are big and scarce resources
15:17:39 ... it's also unclear to web developers what happens; buffering is hidden from them
15:18:05 ... issue 1158 is where this is being described - there is probably a solution that will emerge
15:18:24 ... but it's unlikely that the default behavior will be the safe behavior for streams of frames
15:18:54 [slide 19]
15:19:11 Youenn: in general for streams, the idea is that backpressure will deal with buffering
15:19:35 ... but for us, some limited buffering might be useful to allow
15:19:44 ... but it's hard to deal with in WHATWG streams
15:19:55 ... the stream queue is opaque to the application by design
15:20:07 ... and the queuing strategy is very static, based on the high-water mark
15:20:22 ... updating the strategy requires resetting your pipeline
15:20:45 ... WHATWG streams might be able to cover the use case, but with complexity
15:21:15 [slide 20]
15:21:31 Youenn: tee is the typical way to allow multiple consumers with streams
15:21:58 ... tee is part of the design of the API so we should support it
15:22:01 [slide 21]
15:22:27 Youenn: but we know tee is broken when used with our VideoFrame streams
15:22:38 ... structured clone might solve this, as suggested in issue 1156
15:22:56 ... but the default behavior again won't be the right one for us
15:23:07 [slide 22]
15:23:17 Youenn: but even with structured clone, more changes are needed
15:23:29 ... if you apply structuredClone, you add hidden buffering
15:23:53 ... if the two branches don't consume data at the same pace
15:24:15 ... issue 1157 discusses this - so far, no clear solution
15:24:22 ... streams by design aren't made to drop items
15:25:14 Present+ Carine
15:25:16 [slide 23]
15:25:26 Youenn: the last issue I want to discuss is lifetime management
15:25:50 ... streams rely on garbage collection, whereas we don't want to rely on GC for VideoFrame
15:26:38 ... there is no easy way to enforce who will close a VideoFrame, making it error-prone for Web developers
15:26:54 ... there is no API contract, so it's unclear how to solve this
15:27:17 ... maybe a dedicated subclass with built-in memory management?
15:27:27 ... but no work has started in that direction
15:27:47 ... if you look at the pipeline - if you change the pipeline, you need to cancel streams
15:28:06 ... these streams might have buffered frames, which raises the question of GC again
15:28:18 [slide 25]
15:28:31 Youenn: we need to solve these issues - buffering, tee and lifetime management for VideoFrame
15:28:51 ... there has been progress, but more is needed and it's unclear to me how far we can go
15:28:54 [slide 27]
15:29:17 Youenn: we need high confidence that these issues can be solved before picking streams as our model for designing our APIs
15:29:45 ... if we select streams, we should extend support for them in existing and new APIs (e.g. VideoDecoder/VideoEncoder, BarcodeDetector)
15:29:57 ... this doesn't seem to be part of the plans for e.g. WebCodecs
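[To make the lifetime-management point concrete - a hypothetical per-frame transform showing one possible ownership contract, where the step that consumes a VideoFrame closes it so frame buffers are released deterministically rather than via GC:]

```js
// Hypothetical funny-hat step: close every input frame once a processed
// output frame has been enqueued, instead of waiting for garbage collection.
const funnyHatTransform = new TransformStream({
  transform(frame, controller) {
    const canvas = new OffscreenCanvas(frame.displayWidth, frame.displayHeight);
    const ctx = canvas.getContext('2d');
    ctx.drawImage(frame, 0, 0); // a VideoFrame is a valid CanvasImageSource
    // ...draw the hat onto ctx here...
    controller.enqueue(new VideoFrame(canvas, { timestamp: frame.timestamp }));
    frame.close(); // release the scarce camera buffer promptly
  }
});
```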
15:30:27 Jan-Ivar: a couple of comments
15:31:11 ... on backpressure, I believe a TransformStream with a highWaterMark of 0 will automatically apply backpressure
15:31:50 ... wrt dynamic buffering, the highWaterMark is indeed static, but dynamic buffering can be dealt with in a TransformStream - though not with a highWaterMark of 0
15:32:29 Youenn: I'm not optimistic about seeing the problem solved at the source level
15:32:39 ... my understanding with lifetime management is that there is no API contract
15:32:49 ... you don't know if close will be called; I like consistency
15:33:19 ... memory management would be something we would want to design carefully
15:33:32 Bernard: in the current model where we don't have highWaterMark
15:35:22 Youenn: the camera pool might have 10 video frames; with a 5-step pipeline, 5 frames will be automatically allocated - this leaves only 5 remaining slots, which might not be enough
15:35:29 ... and some devices might have a smaller pool of frames
15:35:52 ... which will create variable framerates
15:36:18 Bernard: the lack of streams integration in WebCodecs creates two queues that need to be managed
15:36:40 ... and that's not particularly transparent, something you have to keep track of
15:36:56 ... this can create significant memory management issues
15:37:13 ... wrapping streams is not particularly satisfactory in our case
15:37:39 Harald: a couple of observations
15:37:58 ... WebCodecs did have a stream-based API for a while; MSTP and MSTG were the reason it got dropped
15:38:10 ... we've had very few people reporting problems with these issues
15:39:09 ... my impression is that the streams model has been somewhat confused with the streams shim implementation
15:39:21 ... we should have a clean model where issues are moved to implementations, not the model
15:39:35 ... wrt tee, I have some experience from reading the CL that added tee to the spec
15:39:48 ... worries were expressed that are very similar to ours
15:39:51 ... tee is a bad design
15:40:07 ... it's fairly easy to write your own JS to get the tee you want, which is quite dependent on your app
15:40:28 ... tee doesn't respect the high-water mark downstream - tee is bad
15:40:58 ... on the contract point, I think it's natural to say that downstream either has to call close, or pass the frame to something that will call close on the VideoFrame
15:41:09 ... we shouldn't depend on upstream to do anything
15:41:22 ... we do have an issue with disrupted pipelines - that needs to be solved
15:41:53 ... my conclusion is that some of these issues are with the description more than the implementations, and some are issues we need to solve but aren't fatal
15:42:10 ... like tee - it's not because it's possible to use it badly that we shouldn't use streams
15:42:30 ... the streams API is superior to callbacks because it avoids re-doing it all
15:42:48 Youenn: I agree with you that tee is bad - salvaging it will be difficult
15:43:09 ... doing one's tee in JS is indeed better - but you'll end up using promise-based callbacks
15:43:17 ... but if so, why use streams?
15:43:35 ... re the other issues not being fatal, I would welcome proposals that address these concerns
15:44:07 ... at the moment, I'm not confident that streams are a good enough match
15:44:15 ... if these issues can be solved, I agree that streams are appealing
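[A sketch of the kind of app-specific tee Harald describes, written in plain JS rather than with ReadableStream.tee(); appTee, onPrimary and onSecondary are hypothetical names, and both callbacks are assumed to return promises. The primary consumer paces the read loop, and the secondary one only gets a clone when idle, so stale frames are dropped instead of queueing:]

```js
// Hand-rolled tee: VideoFrame.clone() gives the second branch its own
// handle on the shared pixel data; each branch closes what it receives.
async function appTee(readable, onPrimary, onSecondary) {
  const reader = readable.getReader();
  let secondaryBusy = false;
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) return;
    if (!secondaryBusy) {
      secondaryBusy = true;
      const clone = frame.clone();
      onSecondary(clone).finally(() => { clone.close(); secondaryBusy = false; });
    }
    await onPrimary(frame); // backpressure: read no faster than the primary
    frame.close();
  }
}
```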
15:44:34 Jan-Ivar: all these issues filed on GitHub are with the model
15:44:46 ... they're not necessarily huge though, and I'm not sure we should block on them
15:45:08 ... given that one API is already shipping, I think we need to converge on a standard sooner rather than later
15:46:07 Youenn: I'd be interested in a pros/cons comparison of promise callbacks vs streams
15:46:11 Topic: Alternative Mediacapture-transform API
15:46:43 [slide 30]
15:47:07 jib: today, the realtime media pipeline is off the main thread
15:47:35 [slide 31]
15:47:44 jib: that remains true in webrtc-encoded-transform
15:48:05 ... the original Chrome API was on the main thread, but we then converged on a standardized API off the main thread
15:48:22 ... this was important for encoded media, all the more so for raw media
15:48:25 [slide 32]
15:48:55 jib: the premise here is that the main thread is bad - "overworked & underpaid", as Surma put it at the Chrome Dev Summit in 2019
15:49:23 ... Surma highlighted web workers as the solution to that problem
15:49:51 ... contention on the main thread is common and unpredictable
15:50:16 ... and hard to detect outside of a controlled environment - as opposed to web workers
15:50:20 [slide 33]
15:50:54 jib: when WebCodecs made the decision to expose the API on the main thread, they based this on non-realtime media use cases
15:51:11 ... and they strongly encourage doing realtime processing off the main thread
15:51:16 [slide 34]
15:51:42 jib: we have a non-adopted document "mediacapture-transform" (which has shipped in Chrome 94 despite not being standardized)
15:52:27 ... my position is that this proposal is not satisfactory: it exposes the realtime pipeline on the main thread by default, it doesn't encourage use in workers, and it relies on non-standardized optimizations
15:52:41 ... also, now that MediaStreamTrack is transferable, this creates new opportunities
15:52:46 [slide 35]
15:53:17 [slide 36]
15:53:58 jib: having to ask the main thread all the time to interact with the API doesn't make sense
15:54:10 ... the assumption of main thread is baked in
15:54:13 hta: that's untrue
15:54:32 [slide 37]
15:55:12 jib: consider a processed (e.g. background replacement) self-view use case combined with WebTransport
15:55:29 ... tee, clone, postMessage(constraints) aren't good approaches
15:56:01 ... whereas with the track available in a worker, we have a natural API
15:56:07 [slide 38]
15:57:07 jib: the tunnel semantics of WHATWG streams are not meant to solve creating streams in the wrong realm
15:57:16 ... MSTP is built on broken assumptions
15:57:30 [slide 39]
15:57:55 jib: I have an alternative proposal based on transferable MediaStreamTrack
15:58:05 ... the proposal focuses on video at the moment
15:58:18 ... it encourages use in workers
15:58:36 ... it still uses streams, despite Youenn's identified issues - which I think we can find solutions for
15:58:39 [slide 40]
15:59:13 jib: we expose a readable attribute on a worker-side version of the MediaStreamTrack
15:59:50 ... this keeps data off the main thread
15:59:52 [slide 41]
16:00:00 jib: a more complicated example, read & write
16:00:18 ... this is the equivalent of MediaStreamTrackGenerator
16:00:29 ... we expose, only in workers, a new VideoTrackSource interface
16:01:07 ... the example is a crop example inspired by WebCodecs
16:01:39 ... it aligns better with the separation of source and track in the mediacapture-streams spec
16:01:47 ... it interacts well with clone and structured cloning
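[A sketch of the proposed worker-side shape; VideoTrackSource and the worker-exposed track.readable are proposal-stage and inferred from the slides, so the names and exact shape here are assumptions, not a shipped API:]

```js
// worker.js - receives a MediaStreamTrack transferred from the main thread
// via postMessage(track, [track]), processes frames, and exposes a new
// track backed by the proposed VideoTrackSource.
onmessage = async ({ data: { track } }) => {
  const source = new VideoTrackSource();
  // hand the processed track back to the main thread, e.g. for a self-view
  postMessage({ track: source.track }, [source.track]);
  await track.readable
    .pipeThrough(new TransformStream({
      transform(frame, controller) {
        // ...crop / background replacement goes here...
        controller.enqueue(frame);
      }
    }))
    .pipeTo(source.writable);
};
```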
16:02:03 [slide 42]
16:02:43 jib: for any video processing, you have a self-view (with high framerate) and a low-fps version to send over the network
16:03:35 ... applyConstraints works well with a peer connection
16:03:37 [slide 43]
16:03:57 jib: now with WebTransport, using track cloning
16:04:37 ... this shows native downscaling with applyConstraints as a workaround to using tee
16:04:49 ... it's not clear how MSTG would let you do this via a worker
16:04:53 [slide 44]
16:05:07 jib: benefits: a simpler API taking advantage of transferable tracks, with fewer APIs to learn
16:05:20 ... doesn't block the real-time media pipeline by default
16:05:39 ... it has parity with MSTP & MSTG features
16:05:52 ... similar in terms of brevity
16:05:59 ... doesn't rely on UA optimizations
16:06:07 ... and deals with muted sources
16:06:13 [slide 45]
16:06:46 jib: bonus: if we want promise callbacks rather than stream-based processing, you can use "for await" on the stream
16:06:54 [slide 46]
16:07:24 jib: if you want more than a readable - this can be done with cloning, but we could also provide a dedicated surface
16:07:47 Harald: I kind of like the proposal - it's almost totally equivalent to MSTG and MSTP
16:08:23 ... re the examples where you have to post messages to the main thread - MSTG and MSTP are designed to be available in the same contexts where tracks are
16:08:42 ... MSTG and MSTP will need to be available in workers when MSTs are
16:09:18 ... re quoting Chris Needham on the WebCodecs decision - one of the motivations for main thread is the availability of other APIs on the main thread
16:10:06 ... transferring streams as a pipeline between origin and destination contexts assumes the source is the main thread, but that's not true
16:10:13 ... with a camera, the source of the stream is the camera, not the main thread
16:10:39 ... otherwise, I like the shape of the API; it's very similar to what I proposed
16:11:02 jib: I didn't mean to misrepresent these aspects; I see now that MSTG and MSTP are available in workers
16:11:07 ... but they're not transferable
16:11:13 ... so they would have to be created in the worker?
16:11:14 Harald: yes
16:12:54 Youenn: re slide 37
16:13:14 ... re not using tee because it's bad - I agree, but I hope we will be able to use it
16:13:23 ... with the example in slide 37, we lose backpressure
16:13:37 ... we might be able to add it back
16:14:07 ... in general, in terms of API shape, if we assume that we use streams, this is a good shape that solves some of the issues I had with the prior proposal
16:14:15 ... mediacapture-main has the concepts of source and track
16:14:27 ... having a JS object that represents the source is a good thing
16:14:38 ... similar to a ReadableStream, which can be native or a JS object
16:14:50 ... I think we should go there; it will make it easier to extend the API and remove edge cases
16:15:12 ... I would prefer not to rely on transferable streams, but instead rely on transferable MST
16:15:34 ... which creates a typed way of transferring that can help fulfill the requirements we need
16:16:07 jib: my example may have a mistake on which track to clone - would flipping it around fix backpressure?
16:16:10 Youenn: I don't think so
16:17:00 ... introducing backpressure on the WritableStream might do the trick
16:17:09 Harald: backpressure cannot deal with framerate
16:17:39 [slide 47]
16:17:59 jib: tee can help with backpressure, at the cost of tee problems
16:18:19 ... the only thing odd is the "createFrameDropper", a transform stream to drop frames
16:18:39 ... clone/applyConstraints is a workaround if we can't solve the tee problem
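["createFrameDropper" is only named on the slide; a hypothetical sketch of what such a helper might look like, assuming frames are dropped to cap the framerate - dropping has to be explicit, since streams won't drop items on their own:]

```js
// Hypothetical frame dropper: forward a frame only if enough time has
// passed since the last forwarded one; otherwise close (drop) it.
function createFrameDropper(maxFps) {
  const minGapUs = 1e6 / maxFps; // VideoFrame.timestamp is in microseconds
  let lastTimestamp = -Infinity;
  return new TransformStream({
    transform(frame, controller) {
      if (frame.timestamp - lastTimestamp >= minGapUs) {
        lastTimestamp = frame.timestamp;
        controller.enqueue(frame);
      } else {
        frame.close(); // drop: release the buffer instead of queueing it
      }
    }
  });
}
```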
16:19:18 Bernard: slides 36 and 37 don't make sense to me
16:19:38 jib: right, I wasn't aware that MSTP and MSTG were to be available in workers
16:19:48 ... but you could still do this, and the situation would need to be handled
16:20:08 ... but Harald is right, there are a lot of similarities between the two proposals
16:20:26 ... the advantage is that we don't need to add a new object
16:21:00 Bernard: re slide 33
16:21:15 ... data channels for instance are only available on the main thread
16:21:28 ... the lack of consistent API support in workers was part of the challenge
16:21:53 jib: MSTG is a bit of an odd duck - it's also a track
16:22:30 ... re the lack of APIs, you can always transfer tracks back to the main thread when needed
16:22:40 ... this doesn't require breaking transferable streams semantics
16:23:11 Harald: if you have to tell some place upstream that your framerate is 30, then backpressure can't carry that information
16:23:28 ... backpressure can't tell the difference between "I'm slightly late" and "I want only every other frame"
16:23:40 ... we need to be able to carry these signals
16:23:56 ... we haven't gotten to it yet
16:24:14 Bernard: there may be several stages of reporting that are needed
16:24:29 Youenn: this depends on whether sources are push or pull
16:24:38 ... consumers need to propagate things up to the source
16:24:49 ... backpressure may not always be the right mechanism, but we need to support it
16:25:06 ... I also agree we need to fix carrying messages back upstream
16:26:11 ... the fact that some of the APIs need to be used on the main thread is sad, but this still moves a lot of the heavy processing to workers, leaving only some of the plumbing on the main thread
16:26:40 ... there may be gaps for doing good media processing - if so, we should make those APIs available in workers, and this API would help accelerate that transition
16:27:40 Guido: in addition to API availability on the main thread, we have first-hand feedback from app developers who WANT to do processing on the main thread for their use cases
16:28:32 ... otherwise, the two APIs are equivalent beyond their shape
16:29:16 dom: re use cases on the main thread, is it a matter of developer experience?
16:29:42 Guido: for certain apps, adding workers to the mix adds a cost, not a value
16:29:57 ... it only adds complexity and extra resource consumption
16:30:16 jib: even if there are such use cases, we're trying to protect the realtime media pipeline
16:30:53 Topic: Mediacapture Transform API
16:31:07 [slide 50]
16:31:24 Harald: [summarizes the API of MSTP and MSTG]
16:31:28 [slide 51]
16:31:46 Harald: it shipped in Chrome 94; it's actually used in products, with new features based on it
16:32:01 ... very few problems reported on it
16:32:07 [slide 52]
16:32:27 Harald: we believe the threading model is something that app developers need to pick, with encouragement from platform developers
16:32:34 ... but dictating it is not the right approach
16:32:40 ... Streams are transferable objects
16:33:17 ... adding worker availability to MSTG and MSTP is a reasonable addition following the transferability of MST
16:33:31 [slide 53]
16:33:53 Harald: we need to make sure we have samples that show realistic working real-time operations
16:33:59 ... including off-thread processing
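[For reference, the shape Harald summarizes - MediaStreamTrackProcessor and MediaStreamTrackGenerator as shipped in Chrome 94; the identity transform below is a placeholder for real per-frame processing:]

```js
// Main thread (module script): camera -> processor.readable -> transform
// -> generator.writable; the generator is itself a MediaStreamTrack.
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = stream.getVideoTracks();
const processor = new MediaStreamTrackProcessor({ track });
const generator = new MediaStreamTrackGenerator({ kind: 'video' });

processor.readable
  .pipeThrough(new TransformStream({
    transform(frame, controller) { controller.enqueue(frame); } // placeholder
  }))
  .pipeTo(generator.writable);

document.querySelector('video').srcObject = new MediaStream([generator]);
```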
16:34:12 [slide 54]
16:34:35 Harald: in terms of improvements, we need better control of source adaptation (backpressure, synchronizing streams, framerate)
16:34:58 ... we need to improve the experience with streams that don't come from a camera - it's not trivial to synchronize them
16:35:13 ... we can work on these aspects once we have agreed on a common base
16:35:20 [slide 55]
16:35:48 Harald: the two proposals agree on Streams for frame delivery
16:35:58 ... there's a difference of opinion on availability on the main thread
16:37:08 ... the proposals differ on whether the generator and the consumer extend MST or use a separate class
16:38:23 ... this can be discussed
16:38:35 ... another difference is that MSTG/MSTP deals with both audio and video
16:38:42 ... whereas jib's proposal is focused on video only
16:39:15 ... clear similarities in the model, and distinctions that can be turned into specific issues
16:40:22 jib: streams are transferable, but implicit transfer of the source isn't web compatible and we should move away from it
16:40:39 Harald: my interpretation is that the stream source is NOT on the main thread - e.g. it's attached to the camera
16:41:31 jib: the optimization Chrome has been doing is not compliant with the spec AFAICT
16:41:46 Harald: I haven't been convinced the issue is not with the spec
16:42:08 jib: the fact that this can't be optimized all the times would make this head-scratching
16:43:20 Harald: I find the streams spec impossible to navigate - happy to get pointers
16:43:29 jib: one of my slides covered the intent of the spec
16:43:55 Harald: but it relies on the interpretation that the source is on the main thread
16:44:25 Youenn: the algorithms described in the streams spec need to run in the context of the stream (not the source)
16:44:36 ... there is some leeway in the streams spec to optimize pipeThrough et al
16:44:40 ... but not for the rest, AFAICT
16:45:09 ... Adam Rice (Streams editor) suggested a specific optimizable stream might be needed
16:45:56 [slide 38]
16:46:14 jib: [quoting from the spec]
16:46:23 ... it's explicitly about transfer between realms
16:49:56 -> https://github.com/whatwg/streams/issues/1063 Transferable streams: the double transfer problem #1063
16:50:35 jib: re exposure on the main thread - for webrtc-encoded-transform, we agreed to focus off-thread only
16:50:50 Harald: I have a bug open to allow re-enabling it on the main thread
16:51:07 ... I think that was a bad decision
16:51:07 Topic: Wrap up and next steps
16:51:31 [slide 55]
16:51:48 Bernard: I would like to get a sense of the room on the major distinctions between the 2 proposals
16:52:51 jib: I would also like to get a sense of whether my proposal is acceptable, and under what changes
16:53:15 Harald: we have 2 potential starting points; I don't see any reason to pick one over the other
16:55:10 Youenn: I want to reiterate my concerns about the difficult stream issues that I raised and for which I'm not seeing progress
16:57:31 dom: I think the question is about API shape (readable/VideoTrackSource vs MSTP/MSTG)
17:00:29 Cullen: I don't feel strongly about any of these questions, not knowing enough about the impact on implementations
17:00:50 ... I would need more background to give an informed opinion
17:04:25 Bernard: so, we will bring these questions to the mailing list
17:05:08 Dom: ... after discussions with the chairs
17:05:21 RRSAgent, draft minutes
17:05:21 I have made the request to generate https://www.w3.org/2021/10/14-webrtc-minutes.html dom
17:05:28 RRSAgent, make log public
17:07:02 Meeting: WebRTC October 2021 Virtual Interim