This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 28379 - [MSE] should buffering model be an option?
Summary: [MSE] should buffering model be an option?
Status: RESOLVED MOVED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Media Source Extensions
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: CR
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-03-30 20:18 UTC by billconan
Modified: 2015-10-13 23:04 UTC
CC List: 4 users

See Also:


Attachments

Description billconan 2015-03-30 20:18:08 UTC
should buffering model be an option?

I'm working on a remote desktop application using Media Source. What I've noticed is that Media Source is designed only for video-streaming use cases, where smoothness is more important than latency.

However, for the remote desktop use case, low latency is preferred. With the current implementation in Chrome, small network hiccups trigger the buffering behavior of Media Source, and the user ends up seeing video pauses.

Why can't the buffering model be an option instead? For example, when calling addSourceBuffer, we could specify "no-delay" to indicate no buffering.
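A sketch of what such an option might look like at the call site. To be clear, this is purely hypothetical: no "buffering" dictionary or "no-delay" value exists in the MSE spec or in any browser, and addSourceBuffer takes only a MIME type string today.

```javascript
// Hypothetical only: illustrates the shape of the proposal above.
// addSourceBuffer(type) currently takes just a MIME type string; the
// options dictionary below does not exist in any spec or implementation.
const mimeType = 'video/mp4; codecs="avc1.42E01E"'; // example codec string
const options = { buffering: 'no-delay' };          // proposed, not real

// Proposed call site (commented out because the second argument is
// not part of the MSE spec):
//   const sb = mediaSource.addSourceBuffer(mimeType, options);
console.log(options.buffering);
```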

Thanks.
Comment 1 Aaron Colwell 2015-03-31 15:16:50 UTC
This sounds like a quality of implementation issue to me. If the web application has allowed the media element to run out of data, why would it expect playback not to stop? How long of an underflow should the UA endure before actually stopping playback?

You are right that MSE is biased more towards providing smooth playback instead of low latency. The main reasons for that come from the ways the SourceBuffers can be manipulated and because of constraints imposed by media decode pipelines. It seems to me that if you are interested in low latency non-mutable presentations you should be looking more towards WebRTC instead of MSE.
Comment 2 billconan 2015-04-01 02:15:16 UTC
(In reply to Aaron Colwell from comment #1)
> This sounds like a quality of implementation issue to me. If the web
> application has allowed the media element to run out of data, why would it
> expect playback not to stop? How long of an underflow should the UA endure
> before actually stopping playback?
> 
> You are right that MSE is biased more towards providing smooth playback
> instead of low latency. The main reasons for that come from the ways the
> SourceBuffers can be manipulated and because of constraints imposed by media
> decode pipelines. It seems to me that if you are interested in low latency
> non-mutable presentations you should be looking more towards WebRTC instead
> of MSE.

The player stops for much longer than the actual network delay; a small hiccup can trigger a video pause of a few seconds. I never said I expected playback not to stop; the problem is how long it stops.

I can easily repro this issue with Chrome. For example, I create a 60 fps MP4 stream, but instead of generating video frames at 60 frames per second, I generate them at 55 fps.

The remote desktop use case would appreciate no buffering at all, so the video player should play the video at 55 fps if 60 is not achievable. In reality, though, the video player pauses for 2 to 3 seconds for buffering.
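As a rough sanity check on those numbers (a sketch; the 0.5 s starting buffer level below is an assumed figure, not something measured from Chrome): frames are consumed at the declared 60 fps but produced at only 55 fps, so any buffered data drains at the difference.

```javascript
// Seconds until the pipeline underflows when frames are consumed at
// playFps but produced at only produceFps, starting from a buffer of
// bufferedSeconds of media.
function secondsUntilUnderflow(bufferedSeconds, playFps, produceFps) {
  const drainFps = playFps - produceFps;   // net frames lost per second
  if (drainFps <= 0) return Infinity;      // producer keeps up; no underflow
  const bufferedFrames = bufferedSeconds * playFps;
  return bufferedFrames / drainFps;
}

// With 0.5 s buffered, a 60-vs-55 fps mismatch underflows after 6 s.
console.log(secondsUntilUnderflow(0.5, 60, 55)); // 6
```

So the mismatch guarantees periodic underflow; the open question is only how the UA reacts when it happens.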

To be honest, I don't understand the implementation difficulty of a buffering model option. 


The MSE doc says it defines a splicing and buffering model that facilitates use cases like adaptive streaming, ad insertion, time-shifting, and video editing. If I were making this standard, I would make all these use cases options and let the programmer choose, because multi-mission often means failure, like the F-35 jet fighter: http://sploid.gizmodo.com/the-designer-of-the-f-16-explains-why-the-f-35-is-such-1591828468


Speaking of WebRTC, I think it is a mess.

The real world is very hierarchical, I think. There are small fundamental building blocks that form more complex ones, and the more complex ones form even larger building blocks: strings make particles, particles make chemicals, chemicals make cells, cells make creatures. This is how the universe works.

But web standards never respect this philosophy of building things. Before there is a decent way of simply decoding a video frame in a webpage, or of streaming video from one IP to another, there is already WebRTC: a huge monster building block. It's as if all you need is extra ketchup, but you are told ketchup only comes with fries. Why is UDP hole punching needed if the architecture is just servers streaming to clients? No wonder Twitch uses Flash.

My experience with WebRTC is worse: the latency is way higher than MSE + WebSocket. I can't believe it is on UDP. There is no way to control the video quality, and no way to tell it to favor low fps and high bitrate...
Comment 3 Aaron Colwell 2015-04-01 05:03:25 UTC
(In reply to billconan from comment #2)
> (In reply to Aaron Colwell from comment #1)
> > This sounds like a quality of implementation issue to me. If the web
> > application has allowed the media element to run out of data, why would it
> > expect playback not to stop? How long of an underflow should the UA endure
> > before actually stopping playback?
> > 
> > You are right that MSE is biased more towards providing smooth playback
> > instead of low latency. The main reasons for that come from the ways the
> > SourceBuffers can be manipulated and because of constraints imposed by media
> > decode pipelines. It seems to me that if you are interested in low latency
> > non-mutable presentations you should be looking more towards WebRTC instead
> > of MSE.
> 

First of all, please dial it back a little bit. I can understand if you are frustrated by the behavior you are seeing and perhaps the web platform in general, but I don't think your strong language is particularly helpful. I'm sorry if my initial response upset you.

> the player stops for way longer time than the actual network delay. a small
> hiccup can trigger few seconds video pause. I never said I expected playback
> not to stop. the problem is how long it should stop.

OK. This is a quality-of-implementation issue and is not a behavior specified in the MSE spec. If Chrome is pausing for a long time then you should file a bug against Chrome (https://code.google.com/p/chromium/issues/entry?template=Audio/Video%20Issue) with a repro case so they can fix the implementation. If I were to guess, Chrome is probably not detecting your content as a "live stream" and so it isn't triggering the low-latency code path. You'd have to work with the Chrome engineers to figure out what is going on. This isn't the forum for dealing with Chrome-specific issues.

> 
> I can easily repro this issue with chrome. for example, I create a 60 fps
> mp4 stream, but instead of generating video frames at 60 frames per second,
> I generate frames at 55 fps.
> 
> The remote desktop use case would appreciate no buffering at all. so the
> video player should play the video at 55fps if 60 is not achievable. but the
> reality is that the video player pauses for 2 to 3 seconds for buffering.

Great! Please put this repro in the Chrome bug you file.

> 
> To be honest, I don't understand the implementation difficulty of a
> buffering model option. 

Adding an option isn't hard. My point is that I don't think we need it because the spec doesn't actually specify the behavior you are seeing.

My original comments assumed you were suggesting that the UA delay the transition to HAVE_CURRENT_DATA when a "hiccup" occurs. I'm assuming a "hiccup" means data arriving late enough that it causes the playback pipeline to underflow (i.e., triggers a transition to HAVE_CURRENT_DATA). I was trying to determine what type of delay criteria you had in mind. This was not particularly clear from the words I typed. I am sorry.

The 2-3 second delay you are seeing in Chrome likely has more to do with the HTML5 rules for transitioning from HAVE_CURRENT_DATA back to HAVE_ENOUGH_DATA (http://www.w3.org/html/wg/drafts/html/master/semantics.html#dom-media-have_enough_data). It sounds to me like you want a transition to HAVE_ENOUGH_DATA to occur if anything beyond the current playback position is appended. I'm not sure I agree with that, but I agree that pausing for 2-3 seconds is probably not a good idea either. In my mind, better language is needed in the HTML5 spec around low-latency streams, since this is likely an issue for non-MSE-based streams as well. All of this, of course, assumes that a Chrome bug fix doesn't resolve the issue.
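One app-level mitigation along these lines, purely as a sketch (nothing here is mandated by the spec): when the element fires "waiting", seek just behind the newest appended data once the next append lands, instead of waiting for the UA to decide it has HAVE_ENOUGH_DATA again. The helper below only does the arithmetic over buffered [start, end] ranges; the 0.2 s edge offset and the event wiring are illustrative assumptions.

```javascript
// Given buffered time ranges as [start, end] pairs (mirroring the
// element's TimeRanges), return a position just behind the newest
// buffered data, or null if nothing is buffered past currentTime.
function liveEdgeSeekTarget(ranges, currentTime, edgeOffset = 0.2) {
  let edge = -Infinity;
  for (const [, end] of ranges) {
    if (end > edge) edge = end;
  }
  if (edge <= currentTime) return null; // nothing new appended yet
  return Math.max(currentTime, edge - edgeOffset);
}

// Wiring sketch: on 'waiting', compute a target from video.buffered
// after the next appendBuffer() completes, and assign video.currentTime.
console.log(liveEdgeSeekTarget([[0, 10], [12, 15.5]], 9.9)); // 15.3
console.log(liveEdgeSeekTarget([[0, 5]], 5));                // null
```

Jumping forward like this trades playback continuity for latency, which is exactly the trade-off the remote desktop use case wants.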

I'm leaving out the rest of your comments because they aren't relevant or particularly productive to the main issue at hand.
Comment 4 Matt Wolenetz 2015-10-13 23:04:33 UTC
This bug has been migrated to the GitHub issue tracker. Please follow/update progress using the GitHub issue:
https://github.com/w3c/media-source/issues/21