14:53:24 RRSAgent has joined #me
14:53:28 logging to https://www.w3.org/2024/12/10-me-irc
14:53:47 meeting: Media and Entertainment IG
14:53:52 present+ Kaz_Ashimura
14:53:56 cpn has joined #me
14:54:20 agenda: https://lists.w3.org/Archives/Public/public-web-and-tv/2024Nov/0004.html
14:55:06 slideset: https://www.w3.org/2011/webtv/wiki/images/b/bb/2024-12-10-MEIG-SVTA-MSE-Meeting.pdf
14:55:20 rrsagent, make log public
14:55:24 rrsagent, draft minutes
14:55:25 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
14:55:32 scribe+ cpn
14:58:38 present+ Daniel_Silvahy
14:59:55 present+ Bernd_Czelhan
15:00:11 present+ Hisayuki_Ohmata, Ryo_Yasuoka, Eric_Carlson
15:00:59 present+ Chris_Needham
15:01:04 chair: Chris_Needham
15:01:14 nhk_ryo has joined #me
15:01:40 ohmata has joined #me
15:01:43 present+ Jer_Noble
15:02:02 present+ Thasso_Griebel
15:02:09 present+ Tatsuya_Igarashi
15:02:21 rrsagent, draft minutes
15:02:22 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:02:37 present+ Nigel_Megitt
15:02:52 present+ Nigel_Megitt
15:03:27 nigel has joined #me
15:03:27 igarashi has joined #me
15:03:40 present+
15:04:14 Present+ Nigel_Megitt
15:04:22 JohnRiv has joined #me
15:04:36 present+
15:04:39 present+ Casey_Occhialini
15:04:56 Topic: Introduction
15:04:58 present+ Mark_Young
15:05:06 rrsagent, draft minutes
15:05:08 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:06:06 ChrisLorenzo has joined #me
15:06:06 Chris: This is a joint meeting with W3C and SVTA
15:06:08 rrsagent, draft minutes
15:06:09 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:06:32 Daniel: We're SVTA / DASH-IF members, we want to share feedback around MSE, in the context of media player implementations
15:06:53 ... Pain points and issues we see as developers. Maybe we're doing something wrong, you could provide your feedback
15:06:55 present+ Ali_C_Begen
15:07:18 present+ Chris_Lorenzo
15:07:18 ... Improve existing implementations, and we want to understand what you're working on: EME, other APIs
15:07:26 present+ Francois_Daoust
15:07:29 ... I'd like to join calls more frequently in future
15:07:34 rrsagent, draft minutes
15:07:36 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:07:54 [slide 3]
15:08:03 rrsagent, draft minutes
15:08:04 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:08:13 Daniel: We want to discuss each topic in turn
15:08:37 [slide 4]
15:09:03 Thasso: I work for CastLabs, leading the player team there. We have experience dealing with MSE etc
15:09:27 Daniel: I'm with Fraunhofer Fokus, lead developer of Dash.js, and co-chair of the SVTA players and playback WG
15:09:28 present+ Yuriy_Reznik
15:09:40 rrsagent, draft minutes
15:09:42 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:09:49 ... Ali and Yuriy are chairs with me
15:10:57 s/Daniel_Silvahy/Daniel_Silhavy/
15:11:37 Chris: @@
15:11:44 [slide 6]
15:12:05 Daniel: We have various groups and subgroups in SVTA. DASH-IF and SVTA merged
15:12:20 [slide 8]
15:13:00 Daniel: We have a general structure for the discussion: how it's working today, implementation issues and implications, and workarounds in players, suggested improvements to MSE and related use cases
15:13:12 igarashi has joined #me
15:13:23 Topic: Buffer capacity
15:13:28 [slide 9]
15:13:43 Daniel: Every media player buffers data. Create SourceBuffers and append data
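A minimal sketch of the flow Daniel describes (create a MediaSource, add a SourceBuffer, append fetched segments); the codec string, segment URLs and the appendAndWait helper are illustrative, not from the meeting:

  // Attach a MediaSource to a video element, add one SourceBuffer, and append
  // an init segment followed by media segments as they are fetched.
  const video = document.querySelector('video')!;
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);

  mediaSource.addEventListener('sourceopen', async () => {
    const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.640028"');
    for (const url of ['init.mp4', 'seg1.m4s', 'seg2.m4s']) {
      const data = await (await fetch(url)).arrayBuffer();
      await appendAndWait(sourceBuffer, data);
    }
  });

  // appendBuffer is asynchronous; wait for updateend before the next append.
  function appendAndWait(sb: SourceBuffer, data: ArrayBuffer): Promise<void> {
    return new Promise((resolve, reject) => {
      sb.addEventListener('updateend', () => resolve(), { once: true });
      sb.addEventListener('error', () => reject(new Error('append failed')), { once: true });
      sb.appendBuffer(data);
    });
  }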
15:13:53 rrsagent, draft minutes
15:13:55 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:14:01 ... The app can define the size of the forward and backward buffers
15:14:05 ... The forward buffer has a trade-off with latency
15:14:24 ... A limitation we have is memory or buffer capacity. There's no API to query how much data we can append to the buffer
15:14:43 ... We schedule a request for a media segment, but we get QuotaExceeded if there isn't sufficient capacity
15:15:10 ... What would improve the behaviour is to have a way to query the capacity, then we can delay appending and the fetching of the segment
15:15:33 ... What we do today is wait for the error event, then reduce the max possible buffer
15:16:10 ... And adjust the backward and forward buffer. Would help every player, with downloading segments
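A sketch of the workaround Daniel describes above (wait for the QuotaExceeded error, purge the back buffer, reduce the buffer target, retry the append); targetForwardBuffer and keepBehind are hypothetical player-side knobs, not MSE concepts:

  // On QuotaExceededError, trim the back buffer, lower the forward-buffer
  // target, and retry the append once.
  let targetForwardBuffer = 30; // seconds the player tries to keep buffered ahead

  function updateEnd(sb: SourceBuffer): Promise<void> {
    return new Promise((resolve) => sb.addEventListener('updateend', () => resolve(), { once: true }));
  }

  async function appendWithQuotaHandling(
    sb: SourceBuffer, video: HTMLVideoElement, data: ArrayBuffer, keepBehind = 10
  ): Promise<void> {
    try {
      sb.appendBuffer(data); // throws QuotaExceededError synchronously when the buffer is full
    } catch (e) {
      if ((e as DOMException).name !== 'QuotaExceededError') throw e;
      // Purge most of the back buffer to make room for the forward buffer.
      const start = sb.buffered.length ? sb.buffered.start(0) : 0;
      const end = Math.max(start, video.currentTime - keepBehind);
      if (end > start) {
        sb.remove(start, end); // asynchronous, like appendBuffer
        await updateEnd(sb);
      }
      // Buffer less far ahead from now on.
      targetForwardBuffer = Math.max(10, targetForwardBuffer * 0.8);
      sb.appendBuffer(data); // retry once; may still throw if nothing could be freed
    }
    await updateEnd(sb);
  }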
15:17:01 Jer: I wrote the MSE player implementation in WebKit. Do you want a general idea of how much room is available without doing appends first? Are you asking for remaining buffer size or something more general?
15:17:50 Daniel: I'd be fine with total buffer size. It depends on the segment duration. Giving a feeling of how much data I can append, and combine with bitrate info, would help understand if I can append or not
15:18:16 Nigel: In implementations, is the buffer size constant, or does it vary over time during playback?
15:19:01 Jer: In our implementation, it's somewhat constant. But wouldn't want to design ourselves into a corner. We have to deal with requests from the system to jettison memory, which motivated Managed Media Source
15:19:20 ... I would not want fixed buffer size to be a requirement in the spec
15:19:39 Nigel: Does that imply it's preferable to return how much space there is right now?
15:20:06 Jer: That answer isn't a guarantee. The system could detect a low memory condition and that would change the answer
15:20:19 ... Any such API couldn't provide a guarantee
15:20:31 Eric: And we wouldn't want an API that encourages apps to poll
15:21:16 Thasso: On fixed buffer size, we used to have an implementation where the buffer size was dynamic. Having the ability to deal with dynamic buffers is something that needs to be supported from a player perspective
15:21:29 ... How would it be expressed to the client? Media time, memory?
15:21:59 ... For my use cases, I'd want to poll this infrequently, just before downloading the next segment
15:22:12 q?
15:22:17 Daniel: I had the same comment. If you schedule the request, you can decide if you want to query data
15:22:41 Jer: What's the expectation, when you hit the memory limit?
15:23:09 ... When we designed the APIs, when you get QuotaExceeded, you purge the back buffer to make room for the forward buffer. That wouldn't change this
15:23:40 ... I'm not sure what the benefit would be. You have the data in JS, and as you reach the end of the forward buffer, you need to append the downloaded data to prevent a stall
15:23:54 ... It shouldn't be so expensive you can't do it a few seconds ahead of time
15:24:36 ... If you have at least a minute of forward buffer, and as you get close to the end you'd have to purge the backbuffer to append more data. Is that a problem? Do you want us to handle purging the backbuffer for you?
15:25:19 Thasso: No, but would be fine with it. This goes in the direction of MMS. I could be OK with depleting certain parts of the buffer but not others
15:25:33 ... Most of my pain points are with TVs and STBs
15:26:08 ... We discuss frequently that we like having MediaSource buffers we can retain in memory, and we'd rather not have another buffer implementation client side
15:26:22 ... The machine needs the memory in the end, doesn't matter if it's in JS or on the MSE side
15:26:29 ... Want to avoid splitting it into the two worlds
15:26:59 ... MSE appends take time, so finding the right moment can be challenging, and time could be 2 or 200ms before the frame can be rendered
15:27:29 Jer: I'm mostly familiar with high-powered devices, much more powerful than TV or STBs typically
15:28:09 ... For our implementation, an append doesn't have to be an entire media segment. If you're trying to keep the fwd/back buffers full, with our implementation you can break the buffer into pieces to keep playback uninterrupted
15:28:22 q+
15:28:31 ... Don't know about TV implementations, if they have enough memory
15:28:50 Jer: You can append parts of mdat and moov boxes, but it does require an init segment first
15:29:26 Daniel: We have CMAF low latency chunks, because what you get from fetch API doesn't align with CMAF chunks
15:29:59 Jer: The MSE parsing loop understands this, and the reset step requires an init segment first
15:30:01 ... No requirement that each chunk is entire
15:30:04 q?
15:30:30 Thasso: A lot of implementations require full mdat box structures
15:30:54 Jer: There's no requirement for that, but some implementations might require it
15:31:19 Kaz: This proposed API could be harmful, for hackers to crash systems, so should be careful, discuss pros and cons
15:31:50 Jer: Yes, as an internet-exposed browser we worry about fingerprinting. If you have a dynamic buffer size API, you could use it for cross-site user tracking
15:32:04 ... Could be like a super-cookie
15:32:09 ack k
15:32:47 Nigel: It's interesting you could append partial segment data to the buffer, but at the moment it feels like no client code would know that's a good idea to do
15:33:22 ... If it's a strategy to fill the buffer as much as possible, and the client has a whole segment and half a segment would fit, there's no information back from the API to suggest that
15:34:08 Jer: It's not an unrecoverable error though. Some ideas: I could imagine relaxing the requirement, so if you exceed the quota the data is accepted, but you can't append more until some flag is cleared
15:34:50 ... If some implementations require a full buffer to be appended, an API could be more flexible with its buffer size requirements. Accept the buffer but don't allow further appends until a flag is cleared, e.g., by a remove command
15:35:07 Nigel: A call to append could return the number of bytes successfully appended
15:35:19 Topic: Box parsing
15:35:23 [slide 10]
15:35:48 Daniel: We append ISO BMFF boxes, many players have their own box parser, useful for the player or app
15:35:59 ... Example is EMSG box, for use by player events
15:36:19 ... As of today, MSE doesn't support parsing boxes and dispatching to the player
15:36:29 ... It's done in JS. WASM could be an option
15:36:54 ... With low latency streaming, we parse moov and mdat boxes, we try to append complete moov+mdat combinations
15:37:09 ... Suggest an API to allow clients to register to receive the boxes
15:37:40 ... EMSG, PRFT for latency adjustment, ELST. Have to parse MOOV to get correct timescale value
15:38:18 Jer: An arbitrary MP4 parser with WebCodecs lets you create your own player
15:38:37 ... As long as the STB supports WebCodecs. WebCodecs provides low level access to audio and video decoders
15:38:57 ... Render to a canvas
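An illustration of the JS-side parsing Daniel describes above: a minimal top-level ISO BMFF scanner that pulls out emsg payloads before a segment is appended. A production parser also needs the emsg version fields and the timescale from the moov box, as noted in the discussion; handleEmsg is a hypothetical player callback:

  // Scan a segment's top-level boxes and return the payloads of boxes with the
  // requested four-character type (e.g. 'emsg').
  function findTopLevelBoxes(segment: ArrayBuffer, wanted: string): Uint8Array[] {
    const view = new DataView(segment);
    const out: Uint8Array[] = [];
    let offset = 0;
    while (offset + 8 <= segment.byteLength) {
      let size = view.getUint32(offset);
      const type = String.fromCharCode(
        view.getUint8(offset + 4), view.getUint8(offset + 5),
        view.getUint8(offset + 6), view.getUint8(offset + 7));
      let headerSize = 8;
      if (size === 1) { // 64-bit largesize follows the type field
        size = Number(view.getBigUint64(offset + 8));
        headerSize = 16;
      } else if (size === 0) { // box extends to the end of the data
        size = segment.byteLength - offset;
      }
      if (size < headerSize || offset + size > segment.byteLength) break; // malformed input
      if (type === wanted) {
        out.push(new Uint8Array(segment, offset + headerSize, size - headerSize));
      }
      offset += size;
    }
    return out;
  }

  // Usage, before appendBuffer(segment):
  // for (const payload of findTopLevelBoxes(segment, 'emsg')) handleEmsg(payload);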
15:39:22 scribe+
15:39:30 scribe+ nigel
15:39:37 cpn: The question has come up before in the context of WebCodecs
15:39:55 me thanks tidoust I was about to do that too - please continue!
15:39:59 ... Preferable approach was thought to be JavaScript as it offers flexibility
15:40:01 s/me thanks tidoust I was about to do that too - please continue!//
15:40:23 rrsagent, draft minutes
15:40:25 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:40:44 ... On emsg specifically, WebKit has the DataCue API. In this IG a while ago, we were looking at how we would do emsg parsing, surfaced through DataCue events.
15:41:01 ... That work kind of stalled. We didn't have enough active contributors pushing this forward.
15:41:21 ... If people are interested, I would suggest getting together and getting that moved forward.
15:41:45 ... Very targeted solution towards emsg events. Immediately triggered or triggered at some point on the timeline.
15:41:46 q+ to mention that this could be helpful for subtitle/caption decoding from MSE
15:41:58 ... It wouldn't do the general box parsing that you're talking about.
15:42:22 Chris: @@
15:42:52 Thasso: We're very interested. Essentially, it means we end up with an MSE implementation that we do ourselves.
15:43:08 Thasso: A software implementation based on WebCodecs sounds like a good idea. But we're still lacking a lot of features, e.g., DRM
15:43:48 ... A simple approach, register a listener for any box type, don't need to do heavy lifting
15:44:19 s/Chris: @@//
15:44:31 i/A software/scribenick: cpn/
15:44:33 rrsagent, draft minutes
15:44:34 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:44:51 Jer: I see a couple of problems here. An API to return an arbitrary box, especially if it's one the implementation doesn't understand
15:45:03 ... There are use cases I'd like to address. EMSG is one of them
15:45:22 i/We're very/Chris: @@/
15:45:25 ... The other case is 608/708 caption data, given regulatory requirements
15:45:41 rrsagent, draft minutes
15:45:43 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:45:50 ... Those are embedded in the media stream, but not elevated to the subtitle rendering
15:46:21 ... So we see websites doing parsing themselves. But that might not be accomplished using a box parsing API, they're muxed in the mdat
15:46:48 Nigel: At TPAC we talked about potentially adding subtitles and captions to MSE, but then the question is how do you know on the output side which mdats to pull out
15:47:04 ... so you do what you need to for the player code
15:47:04 ack n
15:47:04 nigel, you wanted to mention that this could be helpful for subtitle/caption decoding from MSE
15:47:26 ... When you say register for ISO BMFF boxes, it's not any mdat, it's some particular mdat
15:47:55 Thasso: I agree, the general problem is that not every implementation understands the boxes, and there's an issue with nested boxes
15:48:19 ... Maybe for CMAF content, all boxes defined there are supported by spec, so I can pull them out
15:48:33 Daniel: Suggest following up offline
15:48:41 Topic: Codec information
15:48:47 [slide 11]
15:49:10 Daniel: You have the changeType method. In dash.js we save the codec info in a variable
15:49:36 ... Not possible to ask the current codec string, so you have to maintain it yourself. Suggest adding an API
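A sketch of the workaround Daniel describes: keep the last MIME/codec string in a player-side variable, since it cannot be read back from the SourceBuffer, and gate changeType() on MediaSource.isTypeSupported(); the switchCodecIfNeeded helper and the HEVC codec string are illustrative:

  // Track the current type per SourceBuffer and only call changeType() when it
  // actually changes and the browser reports support for the new type.
  const currentType = new Map<SourceBuffer, string>();

  function switchCodecIfNeeded(sb: SourceBuffer, mimeType: string): void {
    if (currentType.get(sb) === mimeType) return;       // nothing to do
    if (!MediaSource.isTypeSupported(mimeType)) {
      throw new Error(`Unsupported type: ${mimeType}`);  // caller picks another representation
    }
    sb.changeType(mimeType);                             // must not be called while sb.updating
    currentType.set(sb, mimeType);
  }

  // Example: switching a video SourceBuffer to HEVC content.
  // switchCodecIfNeeded(videoSourceBuffer, 'video/mp4; codecs="hvc1.1.6.L93.B0"');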
15:49:59 rrsagent, draft minutes
15:50:00 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:50:06 Jer: We had an idea in WebKit, to pull codec information from the VideoTrack. It's relevant for MSE clients and for HLS and file-based downloads
15:50:47 ... changeType requires passing a complete codec string. We've seen cargo-culting or magic strings being used for AAC or H.264. How do you know which codec string to use with Media Capabilities
15:51:18 ... Needs info out of band. An API to get the codec string as understood by the browser. Some interest from browser vendors to do this
15:51:41 ... We've heard from other clients what they really want is a timeline based set of information: start, end, properties
15:51:57 ... It's an interesting use case. Want to solve aspects of this. Please bring to the WG
15:52:11 Topic: Dynamic addition of SourceBuffers
15:52:14 [slide 12]
15:52:32 Thasso: MediaSource session, and maintaining a number of buffers in the session
15:53:04 ... Issue is inability to manage the number of buffers. Turn off audio, but I can only mute it. Once removed it's gone, get an error after adding it back
15:53:29 ... Use cases: remove buffer: turn off audio fully, or turn off video fully
15:53:59 ... A text track I definitely want to turn off. Some players have workarounds, difficult to maintain. For audio, pushing silence isn't so complicated
15:54:12 ... Modelling a black frame with H.264 not too bad, but more complex for other codecs
15:54:24 ... Want more dynamic behaviour when adding or removing buffers
15:54:48 Daniel: MoQ group looking at low latency, hard with current implementations if you need to append dummy data
15:55:25 Jer: Two related efforts in Media WG: One is behaviour when you hit a gap in video data, continue playing and catch up, don't stall. Would solve some of these use cases
15:55:52 ... Bigger issue, there's a solution for having multiple SourceBuffers in MS that aren't currently active
15:56:24 ... Tracks associated with media element. Once it's removed from the active source buffers list, it should have no impact on playthrough
15:56:45 ... Shouldn't have to feed black frames through
15:57:03 ... I don't think Chromium has that yet. But would unblock this use case
15:57:26 ... It exists in the spec but not all impls yet
15:57:32 Topic: Multiple Source Buffers
15:57:38 [slide 13]
15:57:40 rrsagent, draft minutes
15:57:41 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
15:58:16 Thasso: Related use case. We implemented HLS interstitials, ran into a problem. Conditioning not perfect, when timelines overlap
15:58:28 ... Want to use MSE as our buffer, and make use of it later
15:59:10 ... Problem is how to do this even with virtual buffer. If timelines overlap, need to be very precise. currentTime not accurate enough, every 250 ms. So workaround of rAF() and polling time to work out when to append
15:59:58 ... Want to get rid of the data earlier. Best case scenario: not do the switching myself but be able to schedule: when done with video track 1, play number 2, then go back to 1 if there are no gaps
16:00:06 ... Hard to deal with overlapping timelines on the client side
16:00:34 Jer: A couple of ideas. You shouldn't have to poll currentTime. Use synthetic TextTrackCue events.
16:00:38 Thasso: Not accurate on all impls
16:01:24 Jer: We've heard this ue case before, with HLS interstitials. A MediaSource you can detach and re-attach later. Designed to solve use case of switching to differently encoded content.
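A sketch of the alternative Jer suggests to polling currentTime: schedule work at a media time with a synthetic cue on a metadata text track, with the caveat Thasso raises that cue timing accuracy varies across implementations; scheduleAtMediaTime, spliceTime and switchToInterstitial are hypothetical names:

  // Fire a callback when playback reaches a given media time, driven by cue
  // enter events rather than polling currentTime or requestAnimationFrame.
  function scheduleAtMediaTime(video: HTMLVideoElement, time: number, callback: () => void): void {
    const track = video.addTextTrack('metadata'); // created in "hidden" mode, so nothing is rendered
    const cue = new VTTCue(time, time + 0.1, ''); // short cue placed at the splice point
    cue.addEventListener('enter', () => callback(), { once: true });
    track.addCue(cue);
  }

  // Example:
  // scheduleAtMediaTime(video, spliceTime, switchToInterstitial);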
16:01:39 s/ue case/use case/
16:01:54 ... Could be used to play interstitial content, without having to reappend the original data. Only requirement is to seek back to the main timeline position when you do the switch
16:02:35 ... The issue from implementers we heard is there may not be enough memory in low-end impls to support multiple MS instances. Multiple video buffers would have a similar problem, leading to more QuotaExceeded errors
16:02:59 Thasso: I think the limitation on embedded devices isn't necessarily the memory, it's how they initialise the hardware resources
16:03:25 Daniel: That's why we did the virtual buffer in dash.js
16:04:05 Jer: Detachable MediaSource. You have main content attached to the media element. If you want to preload ad insertion in a second MediaSource
16:04:38 ... Use an audio element instead, to avoid the impl instantiating an embedded codec. It's technically allowed by the spec, but needs some experimentation on STBs to see if it would work
16:04:50 ... And would require an impl of detachable MediaSource
16:05:09 Topic: Summary
16:05:22 Daniel: I want to join calls more frequently, and we'll file GH issues
16:06:35 q+
16:06:55 Mark has joined #me
16:07:29 cpn: let's talk about how we can handle this
16:07:36 ... your input is welcome
16:08:07 daniel: yes, let's talk about that offline as well
16:08:23 JohnRiv has left #me
16:08:46 q-
16:10:20 [adjourned]
16:10:24 rrsagent, draft minutes
16:10:26 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
16:15:15 i/let's/scribenick: kaz/
16:15:16 rrsagent, draft minutes
16:15:18 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
16:15:38 i/handle/scribenick: kaz/
16:15:39 rrsagent, draft minutes
16:15:41 I have made the request to generate https://www.w3.org/2024/12/10-me-minutes.html kaz
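A rough sketch of the preloading idea Jer outlines in the Multiple Source Buffers discussion: buffer interstitial content into a second MediaSource attached to an audio element so its segments are already parsed before the splice point. Handing that MediaSource back to the main element would need the proposed detachable MediaSource; without it, players today typically swap media elements or re-append the data. preloadInterstitial, adUrls, mimeType and the segment URLs are illustrative:

  // Prebuffer interstitial segments into a second MediaSource attached to a
  // separate element, so the data is parsed and buffered ahead of the splice.
  async function preloadInterstitial(adUrls: string[], mimeType: string): Promise<MediaSource> {
    const ms = new MediaSource();
    const holder = new Audio();            // per the discussion, an audio element may avoid
    holder.src = URL.createObjectURL(ms);  // instantiating a scarce hardware video decoder
    await new Promise((resolve) => ms.addEventListener('sourceopen', resolve, { once: true }));
    const sb = ms.addSourceBuffer(mimeType);
    for (const url of adUrls) {
      const data = await (await fetch(url)).arrayBuffer();
      sb.appendBuffer(data);
      await new Promise((resolve) => sb.addEventListener('updateend', resolve, { once: true }));
    }
    return ms;
  }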