This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 21302 - Support for inserting adverts as non-fragmented MP4 files
Summary: Support for inserting adverts as non-fragmented MP4 files
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Media Source Extensions
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Adrian Bateman [MSFT]
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-15 08:58 UTC by Jon Piesing (OIPF)
Modified: 2013-03-25 17:23 UTC
4 users

See Also:


Attachments

Description Jon Piesing (OIPF) 2013-03-15 08:58:24 UTC
This issue results from a joint meeting between the Open IPTV Forum, HbbTV and the UK DTG. These organizations originally sent a liaison statement to the W3C Web & TV IG:

https://lists.w3.org/Archives/Member/member-web-and-tv/2013Jan/0000.html (W3C member only link)

The broadcasters in our groups have a preference to work with existing web advertising solutions. We are told that a number of these deliver adverts as non-fragmented MP4 files.

While we note the discussion in response to the email http://lists.w3.org/Archives/Public/public-html-media/2013Feb/0070.html, we don't understand a number of the reasons given for not supporting this. Here are comments on some of the reasons given in the public email thread.

>Supporting non-fragmented files would require the UA to hold the whole file in memory which could be very problematic on memory constrained devices. 

The connected TVs and set-top boxes that are the primary target of our 3 organisations already support HTTP streaming of non-fragmented MP4 files without having the whole file in memory.

>If the UA decides to garbage collect part of the presentation timeline to free up space for new appends it is not clear how the web application could reappend the garbage collected regions without appending the whole file again. The fragmented form allows the application to easily select the desired segment and reappend it. Applications can control the level of duplicate appending by adjusting the fragment size appropriately. 

As explained in W3C bugzilla #21299, the specification is rather unclear on memory management and garbage collection. It's hard to react to this comment based on what's currently in the specification.

>Non-fragmented files are so permissive about how they can store samples, there is no simple way to collect segments of the timeline w/o essentially exposing a random access file API.

In practice, the formatting of non-fragmented files must be constrained to be playable on any kind of embedded device. These constraints don't seem to cause particular problems in the industry.

>My expectation is that the web application using MSE will make sure that all the content it intends to use in a presentation is in fragmented format.

Fragmenting a 30s advert would seem to be unnecessary.

>In my opinion the ad's should just be remuxed to fragmented form as part of the ingest process. Tools like GPAC<http://gpac.wp.mines-telecom.fr/> make this super easy.

The broadcasters that we are working with have a requirement to be able to re-use their existing web advertising infrastructure as-is. Any changes that have to be made for a particular platform will either delay or reduce the probability of them providing content to that platform.
Comment 1 Mark Watson 2013-03-15 15:22:11 UTC
(In reply to comment #0)

> 
> Fragmenting a 30s advert would seem to be unnecessary.
> 

On this one point, adaptive streaming requires that the client has the opportunity to switch bitrate. In practice a granularity (fragment size) of 2 seconds works well. In this case it is certainly necessary to fragment a 30s advert.

> >In my opinion the ad's should just be remuxed to fragmented form as part of the ingest process. Tools like GPAC<http://gpac.wp.mines-telecom.fr/> make this super easy.
> 
> The broadcasters that we are working with have a requirement to be able to
> re-use their existing web advertising infrastructure as-is. Any changes that
> have to be made for a particular platform will either delay or reduce the
> probability of them providing content to that platform.

It seems like the adverts in question need to be processed anyway to support adaptive streaming.

If adaptive streaming is not required, why use Media Source at all? Why not just use a separate video element to play the advert?
Comment 2 Jon Piesing (OIPF) 2013-03-16 15:13:46 UTC
> If adaptive streaming is not required, why use Media Source at all ? Why not
> just use a separate video element to play the advert ?

The broadcasters want the transition from the main content item to the adverts and back again to be as seamless as possible. (There's a long debate in progress in various places about how seamless is possible and how the content must be prepared/constrained to achieve that.)

Use of two video elements has been discussed extensively in the workshops of the 3 organisations as one of the options. However, just starting the 2nd video element from JS (based on polling for the advert insertion time, or on a media fragment URI with an end time) was believed to be a long way removed from seamless without some new APIs to link the two video elements together, so that the UA (platform) automatically starts the 2nd at the end of the content in the 1st without any JS in the critical path.
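
For illustration, a minimal sketch of the polled two-element approach described above (the element ids and the 600s splice point are hypothetical; this is exactly the JS-in-the-critical-path pattern that falls short of seamless):

// Hypothetical splice point (in seconds) at which the advert should start.
const AD_START_TIME = 600;

const mainVideo = document.getElementById('main');   // plays the main content
const adVideo   = document.getElementById('advert'); // pre-loaded advert
let adPlayed = false;

// Poll the main content's playback position from JS; the gap between
// pausing one element and starting the other is what makes this approach
// far from seamless without UA-level support.
mainVideo.addEventListener('timeupdate', () => {
  if (!adPlayed && mainVideo.currentTime >= AD_START_TIME) {
    adPlayed = true;
    mainVideo.pause();
    adVideo.style.display = 'block';
    adVideo.play();
  }
});

// Resume the main content when the advert finishes.
adVideo.addEventListener('ended', () => {
  adVideo.style.display = 'none';
  mainVideo.play();
});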
Comment 3 Aaron Colwell (c) 2013-03-16 16:20:47 UTC
(In reply to comment #0)
> This issue results from a joint meeting between the Open IPTV Forum, HbbTV
> and the UK DTG. These organizations originally sent a liaison statement to
> the W3C Web & TV IG:
> 
> https://lists.w3.org/Archives/Member/member-web-and-tv/2013Jan/0000.html
> (W3C member only link)
> 
> The broadcasters in our groups have a preference to work with existing web
> advertising solutions. We are told that a number of these deliver adverts as
> non-fragmented MP4 files.
> 
> While we note the discussion in response to the email http://lists.w3.org
> /Archives/Public/public-html-media/2013Feb/0070.html, we don't understand a
> number of the reasons given for not supporting this. Here are comments on
> some of the reasons given in the public email thread.
> 
> >Supporting non-fragmented files would require the UA to hold the whole file in memory which could be very problematic on memory constrained devices. 
> 
> The connected TVs and set-top boxes that are the primary target of our 3
> organisations already support HTTP streaming of non-fragmented MP4 files
> without having the whole file in memory.

That is because they are given a URL that contains the whole resource. The UA can fetch subsegments of the resource whenever it needs them. MSE intentionally defers media fetching to the web applications so this isn't an option.
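
As a minimal sketch of this application-driven model, using fragmented MP4 (the segment URLs and codec string below are hypothetical):

const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  // Hypothetical codec string; it must match the actual content.
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E,mp4a.40.2"');

  // The application, not the UA, decides what to fetch and when. With a
  // fragmented file, each segment can be fetched and appended independently.
  const segments = ['init.mp4', 'seg1.m4s', 'seg2.m4s']; // hypothetical URLs
  for (const url of segments) {
    const data = await (await fetch(url)).arrayBuffer();
    sb.appendBuffer(data);
    await new Promise(resolve =>
      sb.addEventListener('updateend', resolve, { once: true }));
  }
});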

> 
> >If the UA decides to garbage collect part of the presentation timeline to free up space for new appends it is not clear how the web application could reappend the garbage collected regions without appending the whole file again. The fragmented form allows the application to easily select the desired segment and reappend it. Applications can control the level of duplicate appending by adjusting the fragment size appropriately. 
> 
> As explained in W3C bugzilla #21299 the specification being rather unclear
> on memory management and garbage collection. It's hard to react to this
> comment based on what's currently in the specification.
>

As explained in my response on that bug, remove() and eviction (aka garbage collection) can happen for any time range. This means that any number of coded frames can be removed from portions of the timeline covered by the non-fragmented file. Since the file isn't fragmented, the application has no choice but to reappend the whole file instead of only appending the relevant fragment that covers the evicted range.
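
To make the re-append point concrete, here is a sketch of how an application might detect an evicted range and re-append only the fragment that covers it (the segmentForTime() helper and the 2-second fragment duration are hypothetical):

// Return true if [start, end) is fully covered by the SourceBuffer.
function isBuffered(sourceBuffer, start, end) {
  const ranges = sourceBuffer.buffered;
  for (let i = 0; i < ranges.length; i++) {
    if (ranges.start(i) <= start && ranges.end(i) >= end) return true;
  }
  return false;
}

// Hypothetical helper mapping a presentation time to the URL of the
// 2-second fragment that contains it.
function segmentForTime(t) {
  return 'seg' + Math.floor(t / 2) + '.m4s';
}

// Before playback reaches `time`, re-fetch and re-append only the missing
// fragment. With a non-fragmented file the only equivalent would be to
// re-append the whole file.
async function ensureBuffered(sourceBuffer, time) {
  if (isBuffered(sourceBuffer, time, time + 2)) return;
  const data = await (await fetch(segmentForTime(time))).arrayBuffer();
  sourceBuffer.appendBuffer(data);
}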
 
> >Non-fragmented files are so permissive about how they can store samples, there is no simple way to collect segments of the timeline w/o essentially exposing a random access file API.
> 
> In practise, the formatting of non-fragmented files must be constrained to
> be playable on any kind of embedded device. These constraints don't seem to
> cause particular problems in the industry.
>

I would prefer to make this requirement explicit instead of just providing a blanket statement that non-fragmented files must be supported. I think tighter restrictions should be placed on the format of the files.
 
> >My expectation is that the web application using MSE will make sure that all the content it intends to use in a presentation is in fragmented format.
> 
> Fragmenting a 30s advert would seem to be unnecessary.
> 

You can use a single fragment if you'd like, but it means that you'll have to reappend the whole thing if part of it happens to get evicted.

> >In my opinion the ad's should just be remuxed to fragmented form as part of the ingest process. Tools like GPAC<http://gpac.wp.mines-telecom.fr/> make this super easy.
> 
> The broadcasters that we are working with have a requirement to be able to
> re-use their existing web advertising infrastructure as-is. Any changes that
> have to be made for a particular platform will either delay or reduce the
> probability of them providing content to that platform.

I don't understand this hard requirement. If they want to use MSE at all they are already going to have to reformat their non-ad content so I don't see why it is unreasonable for them to reformat the ad content as well. The required format conversion tools would be the same.
Comment 4 Jon Piesing (OIPF) 2013-03-17 10:01:10 UTC
(In reply to comment #3)
> (In reply to comment #0)

<snip>

> > >Supporting non-fragmented files would require the UA to hold the whole file in memory which could be very problematic on memory constrained devices. 
> > 
> > The connected TVs and set-top boxes that are the primary target of our 3
> > organisations already support HTTP streaming of non-fragmented MP4 files
> > without having the whole file in memory.
> 
> That is because they are given a URL that contains the whole resource. The
> UA can fetch subsegments of the resource whenever it needs them. MSE
> intentionally defers media fetching to the web applications so this isn't an
> option.

Clearly this is the case for append(), but it is not remotely obvious that this is still the case for appendStream(). See some of our other bugzilla issues.

<snip>

> As explained in my response on that bug, remove() and eviction (aka garbage
> collection) can happen for any time range. This means that any number of
> coded
> frames can be removed portions of the timeline covered by the non-fragmented
> file. Since the file isn't fragmented, the application has no choice but to
> reappend the whole file instead of only appending the relevant fragment that
> covers the evicted range.

For the specific case of adverts, this behaviour might be fine. If the advert is appendStream()'d (say) 20-30s before it's needed, then its being evicted once it has been played is acceptable. The most likely scenario communicated by the broadcasters is that adverts would only be played once; if the user played through the content again, they would either see it without adverts or with different ones.
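
As a sketch of this play-once scenario (using appendBuffer() for illustration; the advert URL, 600s splice point and 30s lead time are hypothetical, and `video`/`adSourceBuffer` are assumed to have been set up as in the earlier sketch):

const AD_URL = 'https://ads.example.com/advert.mp4'; // hypothetical, packaged as a single fragment
const SPLICE_TIME = 600; // hypothetical ad insertion point (seconds)
const LEAD_TIME = 30;    // append the advert ~30s ahead of the splice
let adAppended = false;

video.addEventListener('timeupdate', async () => {
  if (adAppended || video.currentTime < SPLICE_TIME - LEAD_TIME) return;
  adAppended = true;
  const data = await (await fetch(AD_URL)).arrayBuffer();
  adSourceBuffer.timestampOffset = SPLICE_TIME; // place the advert on the timeline
  adSourceBuffer.appendBuffer(data);
  // If the UA later evicts this range after the advert has played once,
  // it is simply never re-appended.
});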

We should discuss in the phone conference if these constraints would in fact be fine for the specific case of adverts.

>  
> > >Non-fragmented files are so permissive about how they can store samples, there is no simple way to collect segments of the timeline w/o essentially exposing a random access file API.
> > 
> > In practise, the formatting of non-fragmented files must be constrained to
> > be playable on any kind of embedded device. These constraints don't seem to
> > cause particular problems in the industry.
> >
> 
> I would prefer to make this requirement explicit instead of just providing a
> blanket statement about non-fragmented files must be supported. I think
> tighter restrictions on the format of the files should be made.

I'll see if I can arrange for someone to produce a summary of the restrictions so that they can be discussed.

>  
> > >My expectation is that the web application using MSE will make sure that all the content it intends to use in a presentation is in fragmented format.
> > 
> > Fragmenting a 30s advert would seem to be unnecessary.
> > 
> 
> You can use a single fragment if you'd like, but it means that you'll have
> to reappend the whole thing if part of it happens to get evicted.

See my above comment - for the case of an advert that is played once only from beginning to end then this restriction may be OK.

> 
> > >In my opinion the ad's should just be remuxed to fragmented form as part of the ingest process. Tools like GPAC<http://gpac.wp.mines-telecom.fr/> make this super easy.
> > 
> > The broadcasters that we are working with have a requirement to be able to
> > re-use their existing web advertising infrastructure as-is. Any changes that
> > have to be made for a particular platform will either delay or reduce the
> > probability of them providing content to that platform.
> 
> I don't understand this hard requirement. If they want to use MSE at all
> they are already going to have to reformat their non-ad content so I don't
> see why it is unreasonable for them to reformat the ad content as well. The
> required format conversion tools would be the same.

Firstly, the 3 groups do not have a requirement to use MSE. They have a requirement to support advert insertion for on-demand content. The 3 groups are evaluating MSE as a solution to meet this requirement, with the encouragement of some members common to the 3 groups and the W3C.

The comment "If they want to use MSE at all they are already going to have to reformat their non-ad content so I don't see why it is unreasonable for them to reformat the ad content as well" indicates that some of the realities of how web video advertising works haven't been communicated well enough. 

What I had to learn 6 months ago is that the ad content is provided and served by 3rd parties. The broadcaster's app just makes an XHR request to a server operated by one of these 3rd parties and gets a response back (most likely in a format called VAST). The response is parsed in JS and the URLs of the adverts extracted. The video of the adverts is delivered by those 3rd parties; the broadcaster has no involvement with the video at all. If MSE doesn't support non-fragmented MP4 files then the broadcaster would have to find a video ad provider who delivers video as fragmented MP4 files and switch their contracts to them, or just wait and not launch the service until video ad providers have switched.
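
For readers unfamiliar with that workflow, a minimal sketch of the VAST round-trip described above (using fetch() rather than raw XHR for brevity; the ad-server URL is hypothetical, and MediaFile is the standard VAST element carrying the creative's URL):

// Request a VAST document from a third-party ad server (hypothetical URL).
async function requestAdvertUrls() {
  const response = await fetch('https://ads.example.com/vast?placement=midroll');
  const vastXml = await response.text();

  // Parse the XML in JS and extract the advert media URLs. The creatives
  // themselves are served by the third party, typically as plain
  // (non-fragmented) MP4 files that the broadcaster never touches.
  const doc = new DOMParser().parseFromString(vastXml, 'application/xml');
  return Array.from(doc.querySelectorAll('MediaFile'))
    .map(node => node.textContent.trim());
}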
Comment 5 Aaron Colwell (c) 2013-03-25 17:23:13 UTC
As discussed on last week's teleconference call, it is unlikely that support for non-fragmented MP4 files will be added to MSE.

While ad providers don't support fragmented MP4 files today, they likely will in the future to gain the same adaptive streaming benefits the primary content receives. I understand that this may cause some pain or delays in the short term, but those are business problems and not the result of a technical deficiency in the spec. Adding support for non-fragmented files, when it is pretty clear that these will become less common over time, just adds complexity and legacy support burden for UA vendors.

If you would like to continue discussion of this, I'd recommend starting a thread on the public-html-media@ list. I'm closing this bug for now since there is no action here to update the spec.