14:53:24 RRSAgent has joined #me
14:53:28 logging to https://www.w3.org/2024/12/10-me-irc
14:53:47 meeting: Media and Entertainment IG
14:53:52 present+ Kaz_Ashimura
14:53:56 cpn has joined #me
14:54:20 agenda: https://lists.w3.org/Archives/Public/public-web-and-tv/2024Nov/0004.html
14:55:06 slideset: https://www.w3.org/2011/webtv/wiki/images/b/bb/2024-12-10-MEIG-SVTA-MSE-Meeting.pdf
14:55:20 rrsagent, make log public
14:55:24 rrsagent, draft minutes
14:55:25 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
14:55:32 scribe+ cpn
14:58:38 present+ Daniel_Silvahy
14:59:55 present+ Bernd_Czelhan
15:00:11 present+ Hisayuki_Ohmata, Ryo_Yasuoka, Eric_Carlson
15:00:59 present+ Chris_Needham
15:01:04 chair: Chris_Needham
15:01:14 nhk_ryo has joined #me
15:01:40 ohmata has joined #me
15:01:43 present+ Jer_Noble
15:02:02 present+ Thasso_Griebel
15:02:09 present+ Tatsuya_Igarashi
15:02:21 rrsagent, draft minutes
15:02:22 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:02:37 present+ Nigel_Megitt
15:02:52 present+ Nigel_Megitt
15:03:27 nigel has joined #me
15:03:27 igarashi has joined #me
15:03:40 present+
15:04:14 Present+ Nigel_Megitt
15:04:22 JohnRiv has joined #me
15:04:36 present+
15:04:39 present+ Casey_Occhialini
15:04:56 Topic: Introduction
15:04:58 present+ Mark_Young
15:05:06 rrsagent, draft minutes
15:05:08 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:06:06 ChrisLorenzo has joined #me
15:06:06 Chris: This is a joint meeting with W3C and SVTA
15:06:08 rrsagent, draft minutes
15:06:09 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:06:32 Daniel: We're SVTA / DASH-IF members, we want to share feedback around MSE, in the context of media player implementations
15:06:53 ... Pain points and issues we see as developers. Maybe we're doing something wrong, you could provide your feedback
15:06:55 present+ Ali_C_Begen
15:07:18 present+ Chris_Lorenzo
15:07:18 ... Improve existing implementations, and we want to understand what you're working on: EME, other APIs
15:07:26 present+ Francois_Daoust
15:07:29 ... I'd like to join calls more frequently in future
15:07:34 rrsagent, draft minutes
15:07:36 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:07:54 [slide 3]
15:08:03 rrsagent, draft minutes
15:08:04 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:08:13 Daniel: We want to discuss each topic in turn
15:08:37 [slide 4]
15:09:03 Thasso: I work for CastLabs, leading the player team there. We have experience dealing with MSE etc
15:09:27 Daniel: I'm with Fraunhofer Fokus, lead developer of Dash.js, and co-chair of the SVTA players and playback WG
15:09:28 present+ Yuriy_Reznik
15:09:40 rrsagent, draft minutes
15:09:42 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:09:49 ... Ali and Yuriy are chairs with me
15:10:57 s/Daniel_Silvahy/Daniel_Silhavy/
15:11:37 Chris: @@
15:11:44 [slide 6]
15:12:05 Daniel: We have various groups and subgroups in SVTA. DASH-IF and SVTA merged
15:12:20 [slide 8]
15:13:00 Daniel: We have a general structure for the discussion: how it's working today, implementation issues and implications, and workarounds in players, suggested improvements to MSE and related use cases
15:13:12 igarashi has joined #me
15:13:23 Topic: Buffer capacity
15:13:28 [slide 9]
15:13:43 Daniel: Every media player buffers data. Create SourceBuffers and append data
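A minimal sketch of the flow Daniel describes (create a MediaSource, add a SourceBuffer, append fetched segments); the codec string, segment URLs and the appendAndWait helper are illustrative, not from the meeting:

  // Attach a MediaSource to a video element, add one SourceBuffer, and append
  // an init segment followed by media segments as they are fetched.
  const video = document.querySelector('video')!;
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);

  mediaSource.addEventListener('sourceopen', async () => {
    const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.640028"');
    for (const url of ['init.mp4', 'seg1.m4s', 'seg2.m4s']) {
      const data = await (await fetch(url)).arrayBuffer();
      await appendAndWait(sourceBuffer, data);
    }
  });

  // appendBuffer is asynchronous; wait for updateend before the next append.
  function appendAndWait(sb: SourceBuffer, data: ArrayBuffer): Promise<void> {
    return new Promise((resolve, reject) => {
      sb.addEventListener('updateend', () => resolve(), { once: true });
      sb.addEventListener('error', () => reject(new Error('append failed')), { once: true });
      sb.appendBuffer(data);
    });
  }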
15:13:53 rrsagent, draft minutes
15:13:55 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:14:01 ... The app can define the size of the forward and backward buffers
15:14:05 ... The forward buffer has a trade-off with latency
15:14:24 ... A limitation we have is memory or buffer capacity. There's no API to query how much data we can append to the buffer
15:14:43 ... We schedule a request for a media segment, but we get QuotaExceeded if there isn't sufficient capacity
15:15:10 ... What would improve the behaviour is to have a way to query the capacity, then we can delay appending and the fetching of the segment
15:15:33 ... What we do today is wait for the error event, then reduce the max possible buffer
15:16:10 ... And adjust the backward and forward buffer. Would help every player, with downloading segments
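A sketch of the workaround Daniel describes above (wait for the QuotaExceeded error, purge the back buffer, reduce the buffer target, retry the append); targetForwardBuffer and keepBehind are hypothetical player-side knobs, not MSE concepts:

  // On QuotaExceededError, trim the back buffer, lower the forward-buffer
  // target, and retry the append once.
  let targetForwardBuffer = 30; // seconds the player tries to keep buffered ahead

  function updateEnd(sb: SourceBuffer): Promise<void> {
    return new Promise((resolve) => sb.addEventListener('updateend', () => resolve(), { once: true }));
  }

  async function appendWithQuotaHandling(
    sb: SourceBuffer, video: HTMLVideoElement, data: ArrayBuffer, keepBehind = 10
  ): Promise<void> {
    try {
      sb.appendBuffer(data); // throws QuotaExceededError synchronously when the buffer is full
    } catch (e) {
      if ((e as DOMException).name !== 'QuotaExceededError') throw e;
      // Purge most of the back buffer to make room for the forward buffer.
      const start = sb.buffered.length ? sb.buffered.start(0) : 0;
      const end = Math.max(start, video.currentTime - keepBehind);
      if (end > start) {
        sb.remove(start, end); // asynchronous, like appendBuffer
        await updateEnd(sb);
      }
      // Buffer less far ahead from now on.
      targetForwardBuffer = Math.max(10, targetForwardBuffer * 0.8);
      sb.appendBuffer(data); // retry once; may still throw if nothing could be freed
    }
    await updateEnd(sb);
  }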
15:17:01 Jer: I wrote the MSE player implementation in WebKit. Do you want a general idea of how much room is available without doing appends first? Are you asking for remaining buffer size or something more general?
15:17:50 Daniel: I'd be fine with total buffer size. It depends on the segment duration. Giving a feeling of how much data I can append, and combine with bitrate info, would help understand if I can append or not
15:18:16 Nigel: In implementations, is the buffer size constant, or does it vary over time during playback?
15:19:01 Jer: In our implementation, it's somewhat constant. But wouldn't want to design ourselves into a corner. We have to deal with requests from the system to jettison memory, which motivated Managed Media Source
15:19:20 ... I would not want fixed buffer size to be a requirement in the spec
15:19:39 Nigel: Does that imply it's preferable to return how much space there is right now?
15:20:06 Jer: That answer isn't a guarantee. The system could detect a low memory condition and that would change the answer
15:20:19 ... Any such API couldn't provide a guarantee
15:20:31 Eric: And we wouldn't want an API that encourages apps to poll
15:21:16 Thasso: On fixed buffer size, we used to have an implementation where the buffer size was dynamic. Having the ability to deal with dynamic buffers is something that needs to be supported from a player perspective
15:21:29 ... How would it be expressed to the client? Media time, memory?
15:21:59 ... For my use cases, I'd want to poll this infrequently, just before downloading the next segment
15:22:12 q?
15:22:17 Daniel: I had the same comment. If you schedule the request, you can decide if you want to query data
15:22:41 Jer: What's the expectation, when you hit the memory limit?
15:23:09 ... When we designed the APIs, when you get QuotaExceeded, you purge the back buffer to make room for the forward buffer. That wouldn't change this
15:23:40 ... I'm not sure what the benefit would be. You have the data in JS, and as you reach the end of the forward buffer, you need to append the downloaded data to prevent a stall
15:23:54 ... It shouldn't be so expensive you can't do it a few seconds ahead of time
15:24:36 ... If you have at least a minute of forward buffer, and as you get close to the end you'd have to purge the backbuffer to append more data. Is that a problem? Do you want us to handle purging the backbuffer for you?
15:25:19 Thasso: No, but would be fine with it. This goes in the direction of MMS. I could be OK with depleting certain parts of the buffer but not others
15:25:33 ... Most of my pain points are with TVs and STBs
15:26:08 ... We discuss frequently that we like having MediaSource buffers we can retain in memory, and we'd rather not have another buffer implementation client side
15:26:22 ... The machine needs the memory in the end, doesn't matter if it's in JS or on the MSE side
15:26:29 ... Want to avoid splitting it into the two worlds
15:26:59 ... MSE appends take time, so finding the right moment can be challenging, and time could be 2 or 200ms before the frame can be rendered
15:27:29 Jer: I'm mostly familiar with high-powered devices, much more powerful than TV or STBs typically
15:28:09 ... For our implementation, an append doesn't have to be an entire media segment. If you're trying to keep the fwd/back buffers full, with our implementation you can break the buffer into pieces to keep playback uninterrupted
15:28:22 q+
15:28:31 ... Don't know about TV implementations, if they have enough memory
15:28:50 Jer: You can append parts of mdat and moov boxes, but it does require an init segment first
15:29:26 Daniel: We have CMAF low latency chunks, because what you get from fetch API doesn't align with CMAF chunks
15:29:59 Jer: The MSE parsing loop understands this, and the reset step requires an init segment first
15:30:01 ... No requirement that each chunk is entire
15:30:04 q?
15:30:30 Thasso: A lot of implementations require full mdat box structures
15:30:54 Jer: There's no requirement for that, but some implementations might require it
15:31:19 Kaz: This proposed API could be harmful, for hackers to crash systems, so should be careful, discuss pros and cons
15:31:50 Jer: Yes, as an internet-exposed browser we worry about fingerprinting. If you have a dynamic buffer size API, you could use it for cross-site user tracking
15:32:04 ... Could be like a super-cookie
15:32:09 ack k
15:32:47 Nigel: It's interesting you could append partial segment data to the buffer, but at the moment it feels like no client code would know that's a good idea to do
15:33:22 ... If it's a strategy to fill the buffer as much as possible, and the client has a whole segment and half a segment would fit, there's no information back from the API to suggest that
15:34:08 Jer: It's not an unrecoverable error though. Some ideas: I could imagine relaxing the requirement, so if you exceed the quota the data is accepted, but you can't append more until some flag is cleared
15:34:50 ... If some implementations require a full buffer to be appended, an API could be more flexible with its buffer size requirements. Accept the buffer but don't allow further appends until a flag is cleared, e.g., by a remove command
15:35:07 Nigel: A call to append could return the number of bytes successfully appended
15:35:19 Topic: Box parsing
15:35:23 [slide 10]
15:35:48 Daniel: We append ISO BMFF boxes, many players have their own box parser, useful for the player or app
15:35:59 ... Example is EMSG box, for use by player events
15:36:19 ... As of today, MSE doesn't support parsing boxes and dispatching to the player
15:36:29 ... It's done in JS. WASM could be an option
15:36:54 ... With low latency streaming, we parse moov and mdat boxes, we try to append complete moov+mdat combinations
15:37:09 ... Suggest an API to allow clients to register to receive the boxes
15:37:40 ... EMSG, PRFT for latency adjustment, ELST. Have to parse MOOV to get correct timescale value
15:38:18 Jer: An arbitrary MP4 parser with WebCodecs lets you create your own player
15:38:37 ... As long as the STB supports WebCodecs. WebCodecs provides low level access to audio and video decoders
15:38:57 ... Render to a canvas
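An illustration of the JS-side parsing Daniel describes above: a minimal top-level ISO BMFF scanner that pulls out emsg payloads before a segment is appended. A production parser also needs the emsg version fields and the timescale from the moov box, as noted in the discussion; handleEmsg is a hypothetical player callback:

  // Scan a segment's top-level boxes and return the payloads of boxes with the
  // requested four-character type (e.g. 'emsg').
  function findTopLevelBoxes(segment: ArrayBuffer, wanted: string): Uint8Array[] {
    const view = new DataView(segment);
    const out: Uint8Array[] = [];
    let offset = 0;
    while (offset + 8 <= segment.byteLength) {
      let size = view.getUint32(offset);
      const type = String.fromCharCode(
        view.getUint8(offset + 4), view.getUint8(offset + 5),
        view.getUint8(offset + 6), view.getUint8(offset + 7));
      let headerSize = 8;
      if (size === 1) { // 64-bit largesize follows the type field
        size = Number(view.getBigUint64(offset + 8));
        headerSize = 16;
      } else if (size === 0) { // box extends to the end of the data
        size = segment.byteLength - offset;
      }
      if (size < headerSize || offset + size > segment.byteLength) break; // malformed input
      if (type === wanted) {
        out.push(new Uint8Array(segment, offset + headerSize, size - headerSize));
      }
      offset += size;
    }
    return out;
  }

  // Usage, before appendBuffer(segment):
  // for (const payload of findTopLevelBoxes(segment, 'emsg')) handleEmsg(payload);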
15:39:22 scribe+
15:39:30 scribe+ nigel
15:39:37 cpn: The question has come up before in the context of WebCodecs
15:39:55 me thanks tidoust I was about to do that too - please continue!
15:39:59 ... Preferable approach was thought to be JavaScript as it offers flexibility
15:40:01 s/me thanks tidoust I was about to do that too - please continue!//
15:40:23 rrsagent, draft minutes
15:40:25 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:40:44 ... On emsg specifically, WebKit has the DataCue API. In this IG a while ago, we were looking at how we would do emsg parsing, surfaced through DataCue events.
15:41:01 ... That work kind of stalled. We didn't have enough active contributors pushing this forward.
15:41:21 ... If people are interested, I would suggest getting together and getting that moved forward.
15:41:45 ... Very targeted solution towards emsg events. Immediately triggered or triggered at some point on the timeline.
15:41:46 q+ to mention that this could be helpful for subtitle/caption decoding from MSE
15:41:58 ... It wouldn't do the general box parsing that you're talking about.
15:42:22 Chris: @@
15:42:52 Thasso: We're very interested. Essentially, it means we end up with an MSE implementation that we do ourselves.
15:43:08 Thasso: A software implementation based on WebCodecs sounds like a good idea. But we're still lacking a lot of features, e.g., DRM
15:43:48 ... A simple approach, register a listener for any box type, don't need to do heavy lifting
15:44:19 s/Chris: @@//
15:44:31 i/A software/scribenick: cpn/
15:44:33 rrsagent, draft minutes
15:44:34 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:44:51 Jer: I see a couple of problems here. An API to return an arbitrary box, especially if it's one the implementation doesn't understand
15:45:03 ... There are use cases I'd like to address. EMSG is one of them
15:45:22 i/We're very/Chris: @@/
15:45:25 ... The other case is 608/708 caption data, given regulatory requirements
15:45:41 rrsagent, draft minutes
15:45:43 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:45:50 ... Those are embedded in the media stream, but not elevated to the subtitle rendering
15:46:21 ... So we see websites doing parsing themselves. But that might not be accomplished using a box parsing API, they're muxed in the mdat
15:46:48 Nigel: At TPAC we talked about potentially adding subtitles and captions to MSE, but then the question is how do you know on the output side which mdats to pull out
15:47:04 ... so you do what you need to for the player code
15:47:04 ack n
15:47:04 nigel, you wanted to mention that this could be helpful for subtitle/caption decoding from MSE
15:47:26 ... When you say register for ISO BMFF boxes, it's not any mdat, it's some particular mdat
15:47:55 Thasso: I agree, the general problem is that not every implementation understands the boxes, and there's an issue with nested boxes
15:48:19 ... Maybe for CMAF content, all boxes defined there are supported by spec, so I can pull them out
15:48:33 Daniel: Suggest following up offline
15:48:41 Topic: Codec information
15:48:47 [slide 11]
15:49:10 Daniel: You have the changeType method. In dash.js we save the codec info in a variable
15:49:36 ... Not possible to ask the current codec string, so you have to maintain it yourself. Suggest adding an API
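A sketch of the workaround Daniel describes: keep the last MIME/codec string in a player-side variable, since it cannot be read back from the SourceBuffer, and gate changeType() on MediaSource.isTypeSupported(); the switchCodecIfNeeded helper and the HEVC codec string are illustrative:

  // Track the current type per SourceBuffer and only call changeType() when it
  // actually changes and the browser reports support for the new type.
  const currentType = new Map<SourceBuffer, string>();

  function switchCodecIfNeeded(sb: SourceBuffer, mimeType: string): void {
    if (currentType.get(sb) === mimeType) return;       // nothing to do
    if (!MediaSource.isTypeSupported(mimeType)) {
      throw new Error(`Unsupported type: ${mimeType}`);  // caller picks another representation
    }
    sb.changeType(mimeType);                             // must not be called while sb.updating
    currentType.set(sb, mimeType);
  }

  // Example: switching a video SourceBuffer to HEVC content.
  // switchCodecIfNeeded(videoSourceBuffer, 'video/mp4; codecs="hvc1.1.6.L93.B0"');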
15:49:59 rrsagent, draft minutes
15:50:00 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:50:06 Jer: We had an idea in WebKit, to pull codec information from the VideoTrack. It's relevant for MSE clients and for HLS and file-based downloads
15:50:47 ... changeType requires passing a complete codec string. We've seen cargo-culting or magic strings being used for AAC or H.264. How do you know which codec string to use with Media Capabilities
15:51:18 ... Needs info out of band. An API to get the codec string as understood by the browser. Some interest from browser vendors to do this
15:51:41 ... We've heard from other clients what they really want is a timeline based set of information: start, end, properties
15:51:57 ... It's an interesting use case. Want to solve aspects of this. Please bring to the WG
15:52:11 Topic: Dynamic addition of SourceBuffers
15:52:14 [slide 12]
15:52:32 Thasso: MediaSource session, and maintaining a number of buffers in the session
15:53:04 ... Issue is inability to manage the number of buffers. Turn off audio, but I can only mute it. Once removed it's gone, get an error after adding it back
15:53:29 ... Use cases: remove buffer: turn off audio fully, or turn off video fully
15:53:59 ... A text track I definitely want to turn off. Some players have workarounds, difficult to maintain. For audio, pushing silence isn't so complicated
15:54:12 ... Modelling a black frame with H.264 not too bad, but more complex for other codecs
15:54:24 ... Want more dynamic behaviour when adding or removing buffers
15:54:48 Daniel: MoQ group looking at low latency, hard with current implementations if you need to append dummy data
15:55:25 Jer: Two related efforts in Media WG: One is behaviour when you hit a gap in video data, continue playing and catch up, don't stall. Would solve some of these use cases
15:55:52 ... Bigger issue, there's a solution for having multiple SourceBuffers in MS that aren't currently active
15:56:24 ... Tracks associated with media element. Once it's removed from the active source buffers list, it should have no impact on playthrough
15:56:45 ... Shouldn't have to feed black frames through
15:57:03 ... I don't think Chromium has that yet. But would unblock this use case
15:57:26 ... It exists in the spec but not all impls yet
15:57:32 Topic: Multiple Source Buffers
15:57:38 [slide 13]
15:57:40 rrsagent, draft minutes
15:57:41 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:58:16 Thasso: Related use case. We implemented HLS interstitials, ran into a problem. Conditioning not perfect, when timelines overlap
15:58:28 ... Want to use MSE as our buffer, and make use of it later
15:59:10 ... Problem is how to do this even with virtual buffer. If timelines overlap, need to be very precise. currentTime not accurate enough, every 250 ms. So workaround of rAF() and polling time to work out when to append
15:59:58 ... Want to get rid of the data earlier. Best case scenario: not do the switching myself but be able to schedule: when done with video track 1, play number 2, then go back to 1 if there are no gaps
16:00:06 ... Hard to deal with overlapping timelines on the client side
16:00:34 Jer: A couple of ideas. You shouldn't have to poll currentTime. Use synthetic TextTrackCue events.
16:00:38 Thasso: Not accurate on all impls
16:01:24 Jer: We've heard this ue case before, with HLS interstitials. A MediaSource you can detach and re-attach later. Designed to solve use case of switching to differently encoded content.
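A sketch of the alternative Jer suggests to polling currentTime: schedule work at a media time with a synthetic cue on a metadata text track, with the caveat Thasso raises that cue timing accuracy varies across implementations; scheduleAtMediaTime, spliceTime and switchToInterstitial are hypothetical names:

  // Fire a callback when playback reaches a given media time, driven by cue
  // enter events rather than polling currentTime or requestAnimationFrame.
  function scheduleAtMediaTime(video: HTMLVideoElement, time: number, callback: () => void): void {
    const track = video.addTextTrack('metadata'); // created in "hidden" mode, so nothing is rendered
    const cue = new VTTCue(time, time + 0.1, ''); // short cue placed at the splice point
    cue.addEventListener('enter', () => callback(), { once: true });
    track.addCue(cue);
  }

  // Example:
  // scheduleAtMediaTime(video, spliceTime, switchToInterstitial);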
16:01:39 s/ue case/use case/
16:01:54 ... Could be used to play interstitial content, without having to reappend the original data. Only requirement is to seek back to the main timeline position when you do the switch
16:02:35 ... The issue from implementers we heard is there may not be enough memory in low-end impls to support multiple MS instances. Multiple video buffers would have a similar problem, leading to more QuotaExceeded errors
16:02:59 Thasso: I think the limitation on embedded devices isn't necessarily the memory, it's how they initialise the hardware resources
16:03:25 Daniel: That's why we did the virtual buffer in dash.js
16:04:05 Jer: Detachable MediaSource. You have main content attached to the media element. If you want to preload ad insertion in a second MediaSource
16:04:38 ... Use an audio element instead, to avoid the impl instantiating an embedded codec. It's technically allowed by the spec, but needs some experimentation on STBs to see if it would work
16:04:50 ... And would require an impl of detachable MediaSource
16:05:09 Topic: Summary
16:05:22 Daniel: I want to join calls more frequently, and we'll file GH issues
16:06:35 q+
16:06:55 Mark has joined #me
16:07:29 cpn: let's talk about how we can handle this
16:07:36 ... your input is welcome
16:08:07 daniel: yes, let's talk about that offline as well
16:08:23 JohnRiv has left #me
16:08:46 q-
16:10:20 [adjourned]
16:10:24 rrsagent, draft minutes
16:10:26 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
16:15:15 i/let's/scribenick: kaz/
16:15:16 rrsagent, draft minutes
16:15:18 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
16:15:38 i/handle/scribenick: kaz/
16:15:39 rrsagent, draft minutes
16:15:41 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
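A rough sketch of the preloading idea Jer outlines in the Multiple Source Buffers discussion: buffer interstitial content into a second MediaSource attached to an audio element so its segments are already parsed before the splice point. Handing that MediaSource back to the main element would need the proposed detachable MediaSource; without it, players today typically swap media elements or re-append the data. preloadInterstitial, adUrls, mimeType and the segment URLs are illustrative:

  // Prebuffer interstitial segments into a second MediaSource attached to a
  // separate element, so the data is parsed and buffered ahead of the splice.
  async function preloadInterstitial(adUrls: string[], mimeType: string): Promise<MediaSource> {
    const ms = new MediaSource();
    const holder = new Audio();            // per the discussion, an audio element may avoid
    holder.src = URL.createObjectURL(ms);  // instantiating a scarce hardware video decoder
    await new Promise((resolve) => ms.addEventListener('sourceopen', resolve, { once: true }));
    const sb = ms.addSourceBuffer(mimeType);
    for (const url of adUrls) {
      const data = await (await fetch(url)).arrayBuffer();
      sb.appendBuffer(data);
      await new Promise((resolve) => sb.addEventListener('updateend', resolve, { once: true }));
    }
    return ms;
  }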