21:02:25 RRSAgent has joined #webrtc
21:02:30 logging to https://www.w3.org/2024/09/26-webrtc-irc
21:02:30 riju has joined #webrtc
21:02:30 Zakim has joined #webrtc
21:02:34 jesup7 has joined #webrtc
21:02:35 Topic: Introduction
21:02:41 RRSAgent, make logs public
21:02:45 scribe+ cpn
21:02:45 eric_carlson has joined #webrtc
21:02:54 padenot has joined #webrtc
21:02:55 RRSAgent, draft minutes
21:02:57 I have made the request to generate https://www.w3.org/2024/09/26-webrtc-minutes.html tidoust
21:03:04 present+
21:03:24 Bernard: We'll talk about new things today, involving WebCodecs: what additional things people want, gaps, enabling wider use
21:03:31 Orphis has joined #webrtc
21:03:33 ... We had breakouts yesterday
21:03:48 ... We'll cover those topics here
21:04:14 ... The RtpTransport breakout discussed custom congestion control
21:04:30 ... WebCodecs and RtpTransport on the encoding side
21:04:54 ... Sync on the Web session. Interest in syncing things like MIDI
21:05:18 ... Any comments?
21:05:21 (nothing)
21:05:29 Topic: Reference Control in WebCodecs
21:05:56 Erik: Reference Frame Control, to repeat the breakout session. And Corruption Detection
21:05:59 rahsin has joined #webrtc
21:06:21 ... Reference Frame Control: the goal is to be able to implement any reference structure we want, with as simple an API as possible
21:06:28 ... Make the encoder as dumb as possible
21:06:43 ... Use as few bits as possible, don't get into how to do feedback etc
21:07:12 Eugene: We propose a new way to spec scalability modes for SVC
21:07:22 ... This allows any kind of pattern of dependencies between frames
21:07:34 ... Most encoders have a frame buffer abstraction
21:07:43 ... For saving frames for future use
21:07:57 ... getAllFrameBuffers() returns a sequence of all the frame buffers
21:08:05 ... No heavy underlying resources
21:08:23 ... Lets us extend the video encode options, to say which frame goes to which frame buffer
21:08:35 ... Signals in which slot the frame should be saved
21:08:44 ... And the dependencies between them
21:08:51 felipc0 has joined #webrtc
21:08:57 ... This is only available under a new "manual" scalability mode
21:09:22 ... Chromium implemented behind a flag for libav1, hopeful for libvpx, HW accel on Windows under DirectX12
21:09:39 Erik: Concrete example of how to use it. Three temporal layers
21:09:49 ... Dependencies are always downwards
21:10:22 ... We create a VideoEncoder with "manual", check the encoder supports this mode, then check the list of reference buffers, then start encoding
21:10:40 ... There are 4 cases in the switch statement.
21:10:53 baboba has joined #webrtc
21:11:05 ... To make this work, we had to make simplifications and tradeoffs
21:11:05 +q
21:11:30 ... We limit it to only use CQP
21:11:31 youenn has joined #webrtc
21:11:38 Bernard: Can I do per-frame QP?
21:11:40 Erik: Yes
21:11:55 ... You have to do per-frame QP at the moment, CBR is a follow-up
21:12:11 ... If the codec implements fewer reference buffers than the spec
21:12:24 ... Don't support spatial SVC or simulcast
21:12:40 ... We limit to updating only a single reference buffer for a single frame today
21:13:05 ... H264 and H265 have a more complex structure for how they reference things. We model only with long-term references
21:13:16 ... We have some limitations around frame dropping
21:13:31 ... To summarise the breakout, most people seem supportive
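[Illustrative sketch of the three-temporal-layer (L1T3) example described above. The "manual" scalability mode, getAllFrameBuffers(), and the referenceFrameBuffers/updateFrameBuffers encode options follow the names used in the breakout slides and are proposals, not shipping WebCodecs API; the codec string, QP value and buffer assignment are illustrative assumptions.]

  // All of "manual", getAllFrameBuffers(), referenceFrameBuffers and
  // updateFrameBuffers are proposed names, not shipping WebCodecs API.
  const config = {
    codec: 'av01.0.04M.08',        // illustrative AV1 codec string
    width: 1280,
    height: 720,
    scalabilityMode: 'manual',     // proposed new mode
    bitrateMode: 'quantizer',      // CQP only for now, per the stated limitations
  };
  const { supported } = await VideoEncoder.isConfigSupported(config);
  if (!supported) throw new Error('manual reference control not supported');

  const encoder = new VideoEncoder({
    output: (chunk, metadata) => { /* packetize and send the chunk */ },
    error: console.error,
  });
  encoder.configure(config);
  const buffers = encoder.getAllFrameBuffers();  // proposed accessor
  let i = 0;

  // L1T3 pattern with a period of four frames: T0 T2 T1 T2.
  function encodeFrame(frame) {
    const options = { av1: { quantizer: 30 } };  // per-frame QP (existing option)
    switch (i % 4) {
      case 0:  // T0: references and updates the base buffer
        options.keyFrame = i === 0;
        options.referenceFrameBuffers = i === 0 ? [] : [buffers[0]];
        options.updateFrameBuffers = [buffers[0]];
        break;
      case 2:  // T1: references T0, saved into a second buffer for the next T2
        options.referenceFrameBuffers = [buffers[0]];
        options.updateFrameBuffers = [buffers[1]];
        break;
      case 1:  // T2: references the last T0, not saved
        options.referenceFrameBuffers = [buffers[0]];
        break;
      case 3:  // T2: references the last T1, not saved
        options.referenceFrameBuffers = [buffers[1]];
        break;
    }
    encoder.encode(frame, options);
    frame.close();
    i++;
  }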
21:14:03 ... We want to take this a step further and support more codecs; the user needs to understand the limitations, so they need to query: need to discuss an isConfigSupported() or MC API
21:14:24 ... Fingerprinting surface: not really new, just a more structured way to look at data that's already there
21:14:47 q+
21:14:47 ... Need examples. Can do L1T3 today, need examples for what you can't do today
21:14:49 +q
21:14:53 ack b
21:15:00 ack j
21:15:11 q+ baboba
21:15:14 chrisguttandin has joined #webrtc
21:15:19 hta has joined #webrtc
21:15:28 Jan-Ivar: There's a videoframebuffer id?
21:16:02 Eugene: Wanted to make it more explicit from a type point of view. The spec in future can say take buffers from a particular encoder instance. Can't do that with strings
21:16:03 hta has joined #webrtc
21:16:16 ... It's a closed cycle
21:16:36 Jan-Ivar: Just a bikeshed, but it's strange to have something called a buffer that isn't actually a buffer
21:16:46 Eugene: Open to renaming, e.g., add Ref at the end?
21:17:05 Erik: It represents a slot where you can put something, not the content
21:17:09 q?
21:17:20 ack baboba
21:17:28 Bernard: The reference has to be the same resolution?
21:17:48 Eugene: We don't have anything for spatial scalability, each layer will have a separate buffer
21:18:02 ... We wanted to have this interface first, and introduce spatial scalability in future
21:18:21 Bernard: Can do simulcast, but in the same way as WebCodecs, by creating multiple encoders
21:18:35 ... WebRTC can have one encoder do multiple resolutions
21:18:37 q?
21:18:54 Topic: Corruption Likelihood Metric
21:19:22 Erik: Detecting codec corruptions, during transport etc, that lead to visible artifacts: pixels on screen with bad values
21:19:47 ... Add a new measurement that tries to capture this, using as little bandwidth and CPU as possible
21:20:09 ... One implementation in mind: use an RTP header extension as a side channel
21:20:32 ... You randomly select a number of samples in the image and put them into an extension header
21:20:51 ... The receiver takes the same locations in the image and looks at the sample values it sees. If they differ, you have a corruption
21:21:06 ... Not just a single sample value. You'll have natural distortions from compression, want to filter those out
21:21:13 ... With QP, take an average around a location
21:21:42 ... Don't want the stats value to be coupled to this particular implementation
21:22:00 ... Allows us to do it completely receive-side, e.g., with an ML model
21:22:38 ... Proposal is to put it in the inbound RTP stats. Could be put in VideoPlaybackQuality. The same thing could apply to any video transmission system
21:22:39 q+
21:22:43 hta has joined #webrtc
21:22:51 ... Looking for feedback or input
21:22:52 q?
21:22:54 ack q
21:22:57 ack f
21:22:58 q+
21:23:40 Cullen: Sympathetic to this use case, concerned about the details. Concern about the RTP header extension: it doesn't get the same security processing as the video, and could reveal a lot of info, e.g., guess what the video was
21:23:42 ... Privacy concern
21:24:02 +q
21:24:03 Erik: That's correct. We'll rely on RFC 6904 to encrypt the header extension in the initial implementation
21:24:19 ... Otherwise you leak a small portion of the frame
21:24:31 hta has joined #webrtc
21:24:38 q?
21:25:03 Cullen: If you try to sample a screenshare video, there are large regions of black or white. For video quality metrics, have you considered other options than just a few sampling points?
21:25:28 Erik: Yes, screen content is difficult, it doesn't generalise as well.
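[A rough sketch of the comparison described above, to make the idea concrete: the sender transmits a handful of sampled luma values over a side channel (e.g. an encrypted RTP header extension), and the receiver compares them against the decoded frame using a QP-dependent tolerance so ordinary compression noise is not flagged. The sample format and threshold formula here are invented for illustration, not part of the proposal.]

  // Receiver-side check. `samples` is the side-channel data: a few
  // {x, y, value} entries taken from the original frame (assumed format).
  function corruptionScore(lumaPlane, width, height, samples, qp) {
    const threshold = 4 + qp / 2;              // assumed QP-dependent tolerance
    let flagged = 0;
    for (const { x, y, value } of samples) {   // ~13 samples per measured frame
      let sum = 0, count = 0;
      for (let dy = -1; dy <= 1; dy++) {       // average a 3x3 neighbourhood to
        for (let dx = -1; dx <= 1; dx++) {     // filter out ordinary coding noise
          const sx = Math.min(width - 1, Math.max(0, x + dx));
          const sy = Math.min(height - 1, Math.max(0, y + dy));
          sum += lumaPlane[sy * width + sx];
          count++;
        }
      }
      if (Math.abs(sum / count - value) > threshold) flagged++;
    }
    return flagged / samples.length;           // 0..1 "corruption likelihood"
  }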
21:25:46 ... With 13 samples/frame it's good at finding errors
21:25:52 tantek has joined #webrtc
21:25:55 Cullen: How many samples are you thinking of using?
21:25:58 present+
21:26:09 Erik: 13 actual samples we transmit
21:26:10 q?
21:26:14 ack fluffy
21:26:30 Harald: Thought about adding it to VideoFrameMetadata instead of Stats?
21:27:02 Erik: That's the issue of exposing it up to the application level. Won't do it on all frames, maybe 1 frame/second. Could involve a GPU to CPU copy, so want to limit that
21:27:20 ... Open to ideas on how to surface it to the user after calculation
21:27:23 Harald: Sounds like we need to experiment
21:27:25 q?
21:27:29 ack h
21:27:54 Bernard: The implementations didn't work, header extensions were sent in the clear, so there's a privacy issue if not fixed
21:28:07 +q
21:28:28 ... Want to think beyond WebRTC - MoQ, etc. Think about making it metadata, e.g., playout quality, to get it multiple ways
21:28:30 ack b
21:28:47 Erik: The RFC 6904 approach is a stop-gap to start experimenting
21:28:58 hta2 has joined #webrtc
21:29:03 ... Not sure how to transmit the samples end to end in a general way
21:29:11 q?
21:29:19 Bernard: Previous discussion on segmentation masks, metadata attached to the VideoFrame
21:29:31 Erik: Please comment in GitHub
21:29:58 Youenn: Hearing it's good to experiment. This can be shimmed, with a transform to get the data you want
21:30:12 ... Considered doing that first, and would that be good enough?
21:30:26 Erik: Considered doing it with encoded transform, but the QP is missing
21:30:42 ... At the native level you can adapt the thresholds to get a better signal to noise ratio
21:31:13 ... We do local experiments in the office, but want to see results from actual usage
21:31:14 q?
21:31:16 ack y
21:31:26 Topic: Audio Evolution
21:31:45 Paul: We'll talk about a few new things, some are codec-specific, some not
21:32:08 ... Two items to discuss. New advances with the Opus codec - 1.5 was released this year, it has new booleans we should take advantage of
21:32:16 sprang has joined #webrtc
21:32:28 ... And we can improve spatial audio capabilities. For surround sound, music applications, etc
21:32:44 ... Link to a blog post that talks about the new features
21:33:00 ... Some are ML techniques to improve quality under heavy packet loss
21:33:25 ... With LBRR+DRED you get good quality with 90% packet loss
21:33:40 +q
21:34:03 ... To use the recent Opus quality improvements, there's a decoder complexity number. In the Opus codec you can trade CPU power for higher quality
21:34:22 ... Complexity is 0-10; if >=5 you get Deep PLC, very high quality packet loss concealment
21:34:32 ... If 6 you get LACE, which improves speech quality
21:34:42 ... NoLACE is more expensive on CPU
21:35:08 ... Needs a few megabytes. Not complex, geared to realtime usage
21:35:21 ... Only works with 20ms packets and wideband bandwidth
21:35:30 ... You'd have a complexity config
21:35:54 ... It's decode-side only, no compatibility issue
21:36:34 +q
21:36:36 ... DRED - Deep REDundancy: you put latent information in every packet, and can use the data in packets you received to reconstruct the data you should have received
21:37:32 ... Increase the jitter buffer size, then the decoder reconstructs. Requires a change of API on the encoder side. Reconstruct PCM from what you didn't receive
21:38:06 ... New parameters when you encode the packet. The bitstream is versioned, so it will be ignored safely and not crash the decoder
21:38:10 PeterThatcher has joined #webrtc
21:38:48 Bernard: It's not trivially integrated in WebCodecs. What to do?
21:39:20 Paul: Add a second parameter to decode(), with a dictionary, to enable this recovery scheme. It would be backwards compatible
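[A sketch of what Paul's suggestion could look like on top of today's AudioDecoder. The second argument to decode() and its recoverLostFrames member are hypothetical; the configure() call and the single-argument decode() are existing WebCodecs API.]

  // Hypothetical: decode() today takes a single EncodedAudioChunk; the options
  // dictionary below is a sketch of the proposed extension for DRED-style recovery.
  const decoder = new AudioDecoder({
    output: (audioData) => { /* queue for playback */ audioData.close(); },
    error: console.error,
  });
  decoder.configure({ codec: 'opus', sampleRate: 48000, numberOfChannels: 1 });

  function onPacket(chunk, lostPacketCount) {
    if (lostPacketCount > 0) {
      // Ask the decoder to reconstruct the missing audio from the DRED
      // side information carried in this packet (hypothetical option).
      decoder.decode(chunk, { recoverLostFrames: lostPacketCount });
    } else {
      decoder.decode(chunk);
    }
  }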
21:39:21 ack b
21:39:48 Harald: Does this introduce additional delay?
21:39:49 solis has joined #webrtc
21:40:08 Paul: The second technique, the one that can reconstruct across heavy packet loss, works like this
21:40:43 ... On detecting packet loss, you increase latency a lot, up to 1 second. If it continues like that, you can still understand what's said
21:40:52 ... If network conditions improve, you go back to normal
21:41:21 Erik: Is the 20ms limit just with the current implementation?
21:41:41 Paul: They say "currently"; it's not clear in the blog post why that is
21:41:52 Erik: Typically you want long frame lengths
21:42:16 Eugene: Slide 36 says 2 new APIs are needed. What are they?
21:42:45 Paul: One is indicating there was packet loss, but we need something for where the packet loss happened
21:43:00 Eugene: Feature detection
21:43:17 Paul: If the decoder doesn't understand the additional info, it's skipped
21:43:32 ... If you change version, it won't break. That's designed into the bitstream
21:43:42 ... Enable it in the encoder, with DRED=true
21:43:54 Eugene: Don't need a configuration parameter
21:44:09 Paul: It affects encoding schemes
21:44:13 hta has joined #webrtc
21:44:26 q?
21:44:30 Bernard: Config parameters in the scheme, some might affect WebCodecs
21:44:37 -q
21:44:52 Topic: Improve spatial audio capabilities
21:45:27 Paul: Opus can now do new things. Opus is mono and stereo, then they tell you how to map multiple channels
21:46:04 ... If the bytestream has channel family 2 or 3, it's ambisonics. Using orientation and trigonometry maps you can reconstruct audio from different directions
21:46:10 ... Straightforward to decode
21:46:20 ... The trig maps can be done by the browser at this point
21:46:30 ... Just need to know what mapping family it is
21:47:03 ... 255 is interesting: you can have up to 256 channels, you know what to do. Have an index, do custom processing in WASM
21:47:13 ... The app layer and the file need to understand each other
21:47:25 ... The web uses a certain channel ordering, in the Web Audio API
21:47:49 ... Propose remapping channels, so you have a consistent mapping regardless of codec and container
21:47:55 ... It's now specced in Web Audio
21:48:22 Paul: The proposal is to map everything to SMPTE order. AAC would need remapping, but others are not touched
21:48:40 ... With ambisonics, decode and the app does the rest
21:48:55 ... For decode and output, don't think the app should be doing that
21:49:33 ... The proposal is to do almost nothing, just remap so it's consistent between the APIs
21:49:33 ... Any concerns?
21:49:45 Harald: How to map multiple channels in RTP?
21:50:09 ... Need to tell which channels are coupled and which are mono
21:50:09 +q
21:50:19 ... Some implementations have something, not standardised
21:50:26 Harald: There are hacks, yes
21:50:52 Paul: So long as it's consistent with the web platform, you get channels in the index you expect, so you don't have to tune the app code for different codecs
21:51:03 fluffy7 has joined #webrtc
21:51:07 Jan-Ivar: What about on encoding? Also use SMPTE there?
21:51:23 Paul: On the web it's supposed to be that audio
21:51:48 Jan-Ivar: If you want to play it, not all platforms will support SMPTE playback
21:51:53 Paul: You'd remap the output
21:52:05 ack baboba
21:52:31 Bernard: In response to Harald, there's nothing in the formats from AOMedia. How to get it into WebRTC?
21:52:49 Paul: This is about getting it into WebCodecs, then figure out SVP
21:52:59 ... There are draft RFCs about it
21:53:13 Bernard: There's no transport for the container
21:53:20 q?
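[A small sketch of the remapping being proposed: decoded 5.1 channels in AAC/MPEG order are reordered to the SMPTE order the web platform uses, and only the lookup table would differ per codec. The channel orders shown are the commonly documented ones; this is illustrative, not a proposed API.]

  // 5.1 in AAC/MPEG order: C, L, R, Ls, Rs, LFE
  // 5.1 in SMPTE order:    L, R, C, LFE, Ls, Rs
  const AAC_TO_SMPTE = [1, 2, 0, 5, 3, 4];  // SMPTE channel i comes from AAC channel AAC_TO_SMPTE[i]

  // `planes` is an array of per-channel Float32Arrays copied out of an AudioData.
  function remapToSmpte(planes) {
    return AAC_TO_SMPTE.map((src) => planes[src]);
  }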
21:53:49 Weiwei: There are several spatial audio codec standards. Does it work with them?
21:54:13 Paul: All will be mapped to this order, but that's been the case for some time. Need to ensure all specs agree
21:54:36 Weiwei: In China there's a spatial audio standard, will it work for them?
21:54:57 Paul: If you have channel info at the decoder level, you can remap and expose in the order you expect
21:55:10 Weiwei: We should look into it
21:55:14 q?
21:55:34 Topic: IAMF
21:55:47 Paul: IAMF and object-based audio, how to deal with it on the web?
21:56:01 ... WebCodecs doesn't concern itself with the container
21:56:24 ... Do we feel that using low level APIs for decoding the streams is enough, and then rendering in script?
21:56:39 ... The DSP involved isn't complicated, just mixing, volume, panning
21:56:44 +q
21:57:27 Eugene: Agree this is an advanced feature, so leave it to the app
21:57:46 Bernard: More complicated than that. Things like the Opus 1.5 extensions
21:58:29 ... IAMF can work with lots of codecs, but they want to do additional stuff
21:58:58 Paul: In that case, we want to have WebCodecs work with it. Don't know if we want the WebRTC WG to do the work, more complications
21:59:04 ack b
21:59:21 Topic: Encoded Source
21:59:44 Guido: In the WebRTC WG we want to support the ultra-low-latency broadcast with fanout use case
22:00:02 ... The UA must be able to forward media from a peer to another peer
22:00:20 ... Timing and bandwidth estimates for congestion control
22:00:48 youenn has joined #webrtc
22:00:51 ... Specifically, we want to support this scenario, where you have a server that provides the media, and a large number of consumers
22:01:15 ... Communication with the server is expensive
22:01:39 ... Assume communication between nodes is cheaper than communication to the server
22:01:46 ... Nodes can appear or disappear at any time
22:02:35 ... Example: two peer connections receiving data from any peer in the network.
22:02:50 ... Use encoded transform to receive frames
22:03:11 ... Depending on network conditions, you might want to drop frames
22:03:14 xhwang has joined #webrtc
22:03:26 ... When the app decides what to forward, it sends the frame to multiple output peer connections
22:03:41 ... The idea is you can fail over and be robust without requiring timeouts
22:03:51 ... So you can provide a glitch-free forward
22:04:10 ... We made a proposal, patterned on RTCRtpEncodedTransform
22:04:47 ... This is similar to a single-sided encoded transform
22:05:01 ... Got WG and developer feedback that we've incorporated
22:05:40 ... Allows more freedom than encoded transform. You can write any frames, so it's easier to make mistakes. Would be good to provide better error signals
22:06:06 ... It's less connected to internal control loops in WebRTC
22:06:21 ... In addition to raw error handling we need bandwidth estimates, etc
22:06:46 ... Basic example. We have a worker and a number of peer connections
22:07:01 ... Each has a sender. For each sender we call createEncodedSource()
22:07:16 ... This method is similar to replaceTrack()
22:07:19 q+
22:07:34 ... On the receiver connection, we use RTCRtpScriptTransform
22:08:15 ... On the worker side, we receive everything, and we use encoded sources. In the example, the source has a writable stream; there's a readable and a writable
22:08:21 +q
22:08:41 ... For the receivers, we can apply a transform
22:08:54 ... Write the frame to all the source writers
22:09:21 ... You might need to adjust the metadata
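[A sketch of the fan-out example walked through above. createEncodedSource() and the transferable source object with a writable stream are the proposal under discussion; RTCRtpScriptTransform, receiver.transform and the worker's rtctransform event are existing WebRTC Encoded Transform API. outgoingPeerConnections and incomingPeerConnections are assumed app-managed arrays.]

  // main.js — wire every outgoing sender to an encoded source in the worker,
  // and every incoming receiver to a script transform.
  const worker = new Worker('forwarder.js');
  for (const pc of outgoingPeerConnections) {
    const sender = pc.addTransceiver('video').sender;
    const source = sender.createEncodedSource();     // proposed, replaceTrack()-like
    worker.postMessage({ source }, [source]);        // assumed to be transferable
  }
  for (const pc of incomingPeerConnections) {
    pc.ontrack = ({ receiver }) => {
      receiver.transform = new RTCRtpScriptTransform(worker, { name: 'incoming' });
    };
  }

  // forwarder.js — forward every received frame to every encoded source.
  const writers = [];
  onmessage = ({ data }) => {
    if (data.source) writers.push(data.source.writable.getWriter());
  };
  onrtctransform = async ({ transformer }) => {
    const reader = transformer.readable.getReader();
    for (;;) {
      const { value: frame, done } = await reader.read();
      if (done) break;
      // The app decides here whether to forward or drop, and may need to
      // rewrite frame metadata (e.g. timestamps) before writing.
      for (const writer of writers) writer.write(frame);
    }
  };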
22:10:26 ... Errors and signals that developers say would be useful include keyframe requests, bandwidth estimates, congestion control, and error handling for incorrect frames
22:10:42 ... e.g., timestamps going backwards
22:11:21 ... Other signals are a counter for frames dropped after being written, which the sender decided to drop
22:11:36 handellm has joined #webrtc
22:11:42 ... Expected queue time once written
22:12:13 ... To handle keyframe requests, there's an event
22:12:34 ... writable stream, and an event handler for the keyframe request
22:13:00 ... For bandwidth we're proposing to use a previous proposal for congestion control from Harald
22:13:22 ... Recommended bitrate
22:13:37 ... Outgoing bitrate is already exposed in stats, convenient to have it here
22:13:52 ... Have an event that fires when there's a change in bandwidth info.
22:14:28 ... [shows BandwidthInfo API]
22:14:59 ... Used with dropped frames after being written, and expected send queue time
22:15:36 ... If the allocated bitrate exceeds a threshold, add extra redundancy data for the frame
22:15:53 ... [Shows API shape; see the sketch at the end of this topic]
22:17:24 ... Pros and cons. Similar pattern to encoded transform, simple to use and easy to understand
22:17:31 ... Good match for frame-centric operations
22:17:47 ... Allows zero-timeout failover from redundant paths
22:18:04 ... Easy to adjust or drop frames due to bandwidth issues
22:18:17 ... It requires waiting for a full frame
22:18:38 ... In future there could be a ReceiverEncodedSource
22:18:41 q?
22:18:51 ... Have fan-in for all the receivers
22:19:17 Jan-Ivar: In general I agree this is a good API to solve the forwarding data use case
22:19:39 ... Seems to be a bit more than a source. Something you can assign to a sender in place of a track
22:20:11 ... Once you associate a sender with a source, that can't be broken again?
22:20:28 Guido: Yes. A handle-like object
22:21:12 ... I like it better with a method; you can play a track with a video element, but with this object there's nothing else you can do with it
22:21:29 ... There isn't a lot of advantage to having this object, e.g., to send to another sender
22:21:55 q+
22:22:00 ... We can iterate on the methods and finer details
22:22:11 ... I prefer methods, as they create the association immediately
22:22:32 ack jib
22:22:32 Jan-Ivar: That ends up being a permanent coupling
22:22:55 Guido: Can create and replace an existing one
22:23:20 Jan-Ivar: The permanent coupling ...
22:23:30 Guido: It's just an immediate coupling
22:23:34 ... You can decouple it
22:24:09 ... Can do the same approach as encoded transform if we think that's better
22:24:11 q?
22:24:35 Youenn: Overall it's in the right direction. Similar feedback on the API shape, but we can converge
22:24:52 ... Not a small API. Good to have shared concepts
22:25:11 ... Encoded transform was very strict. Here we're opening the box. Have to be precise about error cases
22:25:42 ... We're opening the box in the middle. Need to be precise about how it works with encoded transform
22:26:07 ... Improve the API shape and really describe the model and how it works. Implications for stats, encoded transform, etc.
22:26:25 ... I have other feedback, will put it on GitHub
22:26:38 ... Let's go for it, but be careful about describing it precisely
22:26:39 q?
22:26:42 ack y
22:26:44 ack youenn
22:27:10 Guido: So we have agreement on the direction
22:27:35 ack h
22:27:35 Harald: Encoded transport has bandwidth allocation. Should try to harmonise the other part
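[A worker-side sketch of the feedback signals listed in this topic. The keyframerequest and bandwidthinfo events and the BandwidthInfo fields mirror the slides and are proposals, not shipping API; the threshold and helper functions are placeholders.]

  // All event and field names below are proposed / placeholder, not shipping API.
  const HIGH_ALLOCATION = 2_000_000;  // bits/s, arbitrary threshold for this sketch

  function wireUpEncodedSource(source, requestKeyFrameUpstream, setExtraRedundancy) {
    // A downstream peer asked for a keyframe: propagate the request upstream.
    source.addEventListener('keyframerequest', () => requestKeyFrameUpstream());

    // Fires when the sender's bandwidth information changes.
    source.addEventListener('bandwidthinfo', (event) => {
      const info = event.info;  // assumed fields: allocatedBitrate,
                                // framesDroppedAfterWrite, expectedQueueTime
      if (info.framesDroppedAfterWrite > 0 || info.expectedQueueTime > 200) {
        setExtraRedundancy(false);  // back off: we are writing more than is sent
      } else if (info.allocatedBitrate > HIGH_ALLOCATION) {
        setExtraRedundancy(true);   // headroom available: add redundancy (e.g. FEC)
      }
    });
  }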
22:27:46 Topic: Timing Model
22:28:19 scribe+ hta
22:28:26 [slide 60]
22:30:48 i|[slide|Slideset: https://docs.google.com/presentation/d/1d5KdKhwd8PGkGJweJl0qDX9yy9s2GqDRH4EpQ0ccr5g/edit
22:31:05 q+
22:31:14 q+
22:31:40 ack hta
22:31:48 Harald: Recently we added stats counters to MediaStreamTrack, and that should be reflected. Either that shouldn't exist or this should be consistent with it
22:32:12 youennf: we should be able to compute video playback counters based on the track
22:32:13 Youenn: We should define one from the other. Take the MediaStreamTrack definition and define VideoPlaybackQuality in terms of that
22:32:14 baboba6 has joined #webrtc
22:32:37 handell: there's a different proposal that has total video frames in it.
22:32:39 Marcus: I have another proposal that would increment total video frames. So I lean towards proposal 2
22:32:44 ack h
22:32:49 +q
22:33:24 Harald: So it sounds like we should spec the behaviour. We're trying to unify the stats across sources
22:33:46 bernard: we should try to specify it.
22:33:47 Bernard: Suggest we do it more generally via tracks, which is more work
22:33:51 q+
22:34:08 Chris: Where to spec it?
22:34:14 cpn: Agreement to try to specify behavior - within VideoPlaybackQuality?
22:34:35 youennf: each source should describe how they are creating video frame objects.
22:34:35 Youenn: You have different sources in different specs; describe how they create video frame objects, and have definitions of the counters as well
22:35:20 dom: need a burndown list of fixing all the specs to supply that info.
22:35:59 jib: agree we need to define them for each MediaStreamTrack source
22:36:26 Subtopic: -> https://github.com/w3c/mediacapture-transform/issues/87 What is the timestamp value of the VideoFrame/AudioData from a remote track?
22:36:43 scribenick: cpn?
22:36:59 scribenick: cpn
22:37:45 q+
22:37:52 [slide 61]
22:37:53 Bernard: The timestamp is a capture timestamp, not a presentation timestamp. Should we change the definition in WebCodecs? Can we describe this and the rVFC timestamp more clearly?
22:37:58 q- baboba
22:38:00 q- jib
22:38:10 Eugene: For a video file, there's only a presentation timestamp
22:38:42 q+
22:38:47 ... For a camera, it's a capture timestamp by definition. It needs to be source-specific
22:39:03 Bernard: Where would you put the definitions, media-capture-main?
22:39:07 Youenn: Yes
22:39:27 ack handellm
22:39:47 Marcus: In the WebCodecs spec, there's no definition other than presentation timestamp. In Chromium, it starts at 0 and increments by the frame duration
22:40:23 ... It's unspecified what it contains. We have a heuristic in Chromium that puts in the capture timestamp
22:40:36 ... It's sourced up to rVFC
22:40:59 ... It shouldn't really be like that, it should be a presentation timestamp
22:41:09 q?
22:41:09 ack jesup
22:41:30 Randell: The use of the terms presentation and capture timestamp is a bit arbitrary
22:42:09 ... The fact that it comes from a file and is a presentation timestamp, or from a capture and is a capture timestamp, isn't relevant. Just have a timestamp
22:42:17 Bernard: Want to move to the next issue
22:42:54 [slide 66]
22:43:07 Subtopic: -> https://github.com/w3c/webcodecs/pull/813 Add captureTime, receiveTime and rtpTimestamp to VideoFrameMetadata
22:43:23 Marcus: For web apps that depend on the timestamp sequence, we want to expose capture time in VideoFrameMetadata
22:43:39 ... Why? Capture time is async, and it enables end-to-end video delay measurements
22:43:51 ... In WebRTC, we prefer the ? timestamp
22:44:10 ... The capture time in this context is an absolute measure
22:44:23 ... With presentation timestamps it's not clear when they were measured
22:44:35 ... With capture time you can get the time from before the pipeline
22:45:00 ... There are higher quality timestamps from the capture APIs, and we want to expose them
22:45:14 ... PR 183 adds those, we refer to the rVFC text
22:45:27 ... People didn't like that. Now we have 5 PRs
22:45:55 ... We're trying to define this concept in media stream tracks. I'd place those in mediacapture-extensions
22:46:40 ... webrtc-extensions, and mediacapture-transform, then repurpose #813 to add fields to the VideoFrameMetadata registry
22:46:59 ... That's the plan
22:47:04 q?
22:47:10 Eugene: The main problem was video/audio sync
22:47:20 fippo has joined #webrtc
22:47:26 +q
22:47:43 ... Audio frames captured from the mic had one timestamp, and video frames from the camera had a different one, and it was confusing encoding configurations
22:48:18 ... The change made for the VideoFrame timestamp to be a capture timestamp is an important change. Currently it's just Chromium behavior, we want it to be specced behaviour
22:48:35 ... So you can do A/V sync; otherwise any kind of sync is impossible
22:49:24 Paul: Reclocking, skew, compensate for latency, so everything matches
22:49:43 Eugene: Why not have the same clock in both places?
22:49:55 Paul: You take a latency hit as it involves resampling
22:50:19 Marcus: We don't have AudioTrackGenerator
22:50:40 Paul: There is reclocking happening, otherwise it falls apart
22:51:00 Eugene: Need example code to show how to do it correctly, for web developers (see the sketch at the end of these minutes)
22:51:07 Paul: Sure
22:51:37 [slide 71]
22:51:45 Subtopic: -> https://github.com/w3c/mediacapture-transform/issues/96 What is the impact of timestamp for video frames enqueued in VideoTrackGenerator?
22:51:55 Youenn: The VideoTrackGenerator timestamp model isn't defined
22:52:17 ... It's not buffering anything. Each track source will define it
22:52:29 ... The timestamp isn't used in any spec on the sync side
22:52:50 ... We define the timestamp per track source
22:53:22 ... For the video track sink, there's a difference between WebKit and Chromium in implementation
22:53:37 ... If the spec says nothing, it means we don't care about the timestamp
22:53:50 [slide 72]
22:54:08 Bernard: Are those statements about what happens true or not?
22:54:48 Harald: The video element has a jitter buffer
22:55:08 Bernard: So the statements seem accurate.
22:55:19 [slide 73]
22:55:44 Subtopic: -> https://github.com/w3c/mediacapture-transform/issues/80 Expectations/Requirements for VideoFrame and AudioData timestamps
22:55:51 Bernard: What if you append multiple VideoFrames with the same timestamp? Does VTG just pass them on, or look for dupes?
22:56:05 Jan-Ivar: Yes, garbage in, garbage out
22:56:19 Youenn: It's the sink that cares about the timestamp
22:56:27 Bernard: Something to make clear in the PR
22:56:54 Jan-Ivar: Need to consider someone sending data over the channel
22:57:00 [slide 74]
22:57:24 Subtopic: -> https://github.com/w3c/mediacapture-transform/issues/86 Playback and sync of tracks created by VideoTrackGenerator
22:57:33 Bernard: HTMLVideoElement: no normative requirement, it might happen or not
22:57:53 ... Describes issues with losing sync, needing to delay one to get sync back, etc
22:58:28 ... Want to be more specific about this. There's potentially a jitter buffer in HTMLMediaElement. How does it work and what does it take into account?
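[For reference on these issues, a minimal sketch of the path under discussion: frames written into a VideoTrackGenerator keep whatever timestamp the app set on them, and it is then up to each sink, such as a video element, to decide what to do with it. VideoTrackGenerator and MediaStreamTrackProcessor are the mediacapture-transform interfaces (exposed in workers per the spec); Chromium currently ships MediaStreamTrackGenerator and also exposes these on the main thread.]

  // Pipe processed camera frames into a generated track; the timestamps on the
  // written VideoFrames are passed through untouched, and the spec currently
  // says little about how a sink such as <video> must interpret them.
  const [camTrack] =
    (await navigator.mediaDevices.getUserMedia({ video: true })).getVideoTracks();
  const processor = new MediaStreamTrackProcessor({ track: camTrack });
  const generator = new VideoTrackGenerator();  // new MediaStreamTrackGenerator({ kind: 'video' }) in Chromium

  document.querySelector('video').srcObject = new MediaStream([generator.track]);

  await processor.readable
    .pipeThrough(new TransformStream({
      async transform(frame, controller) {
        // ...process pixels here; keep frame.timestamp (the capture time) unchanged
        controller.enqueue(frame);
      },
    }))
    .pipeTo(generator.writable);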
22:58:38 i|[slide 60]|Subtopic: -> https://github.com/w3c/media-playback-quality/issues/19 Clarification needed for HTMLVideoElements that are playing a MediaStream
22:58:43 RRSAgent, draft minutes
22:58:45 I have made the request to generate https://www.w3.org/2024/09/26-webrtc-minutes.html dom
22:59:23 youenn has joined #webrtc
22:59:24 ... It's suggested it's more difficult for remote playout. In RTP, it's used to calculate the sender/receiver offset
22:59:38 ... What's going on inside the black box?
22:59:57 [slide 75]
23:00:13 Youenn: Is it observable? With gUM, the tracks are synchronised. In other cases, we have separate tracks
23:01:18 Jan-Ivar: It depends where the source comes from. MediaStream is a generic implementation for different sources
23:01:26 Jan-Ivar: The very old language on synchronization might be outdated.
23:01:40 Bernard: Thinking about remote audio and video. Need receive time and capture time from the same source
23:02:11 Harald: Looked at this code recently. For a single AudioTrack and VideoTrack from the same peer connection with the same clock source, WebRTC tries to synchronise
23:02:23 Marcus: MediaRecorder sorts samples
23:02:45 Paul: Similar in Firefox: if you have multiple microphones, we consider it a high level API so it should work
23:03:04 Youenn: The spec should clarify that there are some cases where you should do it, and other cases where it's impossible
23:03:23 Bernard: If I'm writing a WebCodecs+WebTransport app, is there something I can do to make it work?
23:03:29 Paul: Implement jitter buffers
23:03:46 Marcus: If you have capture times from all streams, you can sort in JS
23:04:00 Youenn: Make sure they're from the same device
23:04:19 Jan-Ivar: If you have VTG, would it affect playback?
23:04:54 Bernard: You have the capture time from the sender
23:05:53 mjwilson has left #webrtc
23:06:02 RRSAgent, draft minutes
23:06:03 I have made the request to generate https://www.w3.org/2024/09/26-webrtc-minutes.html dom
23:06:09 Chris: Next steps, schedule more time for this discussion?
23:06:09 Bernard: Good idea, yes
23:06:26 Meeting: Joint Media/WebRTC WG meeting at TPAC
23:06:28 RRSAgent, draft minutes
23:06:29 I have made the request to generate https://www.w3.org/2024/09/26-webrtc-minutes.html dom
23:06:54 hta has joined #webrtc
23:29:12 markafoltz has joined #webrtc
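[Following up on Eugene's request for example code and Marcus's point about sorting in JS: a sketch of aligning received audio and video by capture timestamps in a WebCodecs + WebTransport style pipeline. It assumes the AudioData and VideoFrame objects carry capture-based timestamps in microseconds on the same clock, which is the behaviour the group wants specified; the buffering strategy is deliberately simplistic.]

  // A deliberately small "jitter buffer": hold decoded media briefly, then
  // release audio in timestamp order and present the video frames whose
  // capture time is at or before the released audio's capture time.
  const audioQueue = [];  // AudioData objects from an AudioDecoder output callback
  const videoQueue = [];  // VideoFrame objects from a VideoDecoder output callback

  function onAudioData(data)   { audioQueue.push(data); }
  function onVideoFrame(frame) { videoQueue.push(frame); }

  // Call this at a fixed cadence (e.g. every 10 ms) once enough media is buffered.
  function release(renderAudio, renderVideo) {
    audioQueue.sort((a, b) => a.timestamp - b.timestamp);
    videoQueue.sort((a, b) => a.timestamp - b.timestamp);
    const audio = audioQueue.shift();
    if (!audio) return;
    while (videoQueue.length && videoQueue[0].timestamp <= audio.timestamp) {
      const frame = videoQueue.shift();
      renderVideo(frame);   // e.g. draw to a canvas or write to a VideoTrackGenerator
      frame.close();
    }
    renderAudio(audio);     // e.g. schedule into Web Audio
    audio.close();
  }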