26718 – [InbandTracks] Container information vs. Manifest information

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	RESOLVED FIXED

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	Sourcing In-band Media Resource Tracks
Version:	unspecified
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	Silvia Pfeiffer
QA Contact:	HTML WG Bugzilla archive list

Reported:	2014-09-02 14:05 UTC by Cyril Concolato
Modified:	2014-09-24 20:03 UTC
CC List:	5 users

Description Cyril Concolato 2014-09-02 14:05:58 UTC

The specification defines "Mappings [...] for [MPEGDASH], [ISOBMFF], [MPEG2TS], [OGGSKELETON] and [WebM]." Those standards partially overlap. For instance, an MPEG-DASH manifest may be used to describe a streaming session made of ISOBMFF files or MPEG-2 TS files. Unfortunately, some information in the DASH manifest, for example the language, may be redundant with (or contradict) the information in the container file. The specification should indicate how to resolve those conflicts. Ideally, the container file information should prevail unless it is missing, so that playing a file through a manifest and playing it directly in the video element produce the same result.

Comment 1 Bob Lund 2014-09-02 15:12:50 UTC

(In reply to Cyril Concolato from comment #0)
> The specification defines "Mappings [...] for [MPEGDASH], [ISOBMFF],
> [MPEG2TS], [OGGSKELETON] and [WebM]. " Those standards are partially
> overlapping. For instance an MPEG-DASH manifest may be used to describe a
> streaming session made of ISOBMFF files or MPEG-2 TS files. Unfortunately,
> some information in the DASH Manifest may be redundant (or contradictory)
> with the information in the container file, for example the language. The
> specification should indicate how to resolve those conflicts. Ideally, the
> container file information should prevail, unless it's missing information;
> so that the playback of a file from a manifest or directly in the video
> element produce the same result.

As I understand from our recent email exchange on this very topic, the MPD should really be viewed only as a hint and all the track attributes should be sourced from the metadata in the container itself. For instance, there is insufficient specification in DASH to correlate ContentComponent child Descriptors, e.g. Role, with a specific media container track.

Comment 2 Bob Lund 2014-09-05 20:14:51 UTC

(In reply to Bob Lund from comment #1)

> 
> As I understand from our recent email exchange on this very topic, the MPD
> should really be viewed only as a hint and all the track attributes should
> be sourced from the metadata in the container itself. For instance, there is
> insufficient specification in DASH to correlate ContentComponent child
> Descriptors, e.g. Role, with a specific media container track.

Cyril, my comment about the MPD was meant as a question. Is the intent of the DASH spec that the media container information takes precedence over the MPD?

Comment 3 Bob Lund 2014-09-18 19:41:53 UTC

(In reply to Cyril Concolato from comment #0)
> The specification defines "Mappings [...] for [MPEGDASH], [ISOBMFF],
> [MPEG2TS], [OGGSKELETON] and [WebM]. " Those standards are partially
> overlapping. For instance an MPEG-DASH manifest may be used to describe a
> streaming session made of ISOBMFF files or MPEG-2 TS files. Unfortunately,
> some information in the DASH Manifest may be redundant (or contradictory)
> with the information in the container file, for example the language. The
> specification should indicate how to resolve those conflicts. Ideally, the
> container file information should prevail, unless it's missing information;
> so that the playback of a file from a manifest or directly in the video
> element produce the same result.

I have drafted text for subsections 3 and 4 of the DASH section [1] to clarify that the track attributes should first be sourced from data in the media container. If they cannot be sourced from the media container, then they should be sourced from the MPD. This makes the UA behavior consistent with MSE.

This same clarification is being proposed in the MPEG DASH spec.

What do you think?

[1] http://rawgit.com/boblund/HTMLSourcingInbandTracks/bug26718/index.html#mpegdash
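
As an illustration of the priority proposed in this comment (media container first, then MPD), here is a minimal TypeScript sketch. The TrackAttributes shape and the resolveTrackAttributes function are hypothetical names chosen for the example; they are not taken from the specification or from any DASH library. Treating an empty string as "missing" is also an assumption of the sketch.

```typescript
// Hypothetical attribute bag; field names are illustrative only.
interface TrackAttributes {
  language?: string | undefined;
  kind?: string | undefined;
  label?: string | undefined;
}

// Prefer the value found in the media container; fall back to the MPD
// value only when the container does not supply one.
// Assumption: an empty string counts as "not supplied".
function resolveTrackAttributes(
  container: TrackAttributes,
  mpd: TrackAttributes
): TrackAttributes {
  const pick = (
    fromContainer: string | undefined,
    fromMpd: string | undefined
  ): string | undefined =>
    fromContainer !== undefined && fromContainer !== ""
      ? fromContainer
      : fromMpd;

  return {
    language: pick(container.language, mpd.language),
    kind: pick(container.kind, mpd.kind),
    label: pick(container.label, mpd.label),
  };
}

// The container and the MPD disagree on the language, and only the MPD
// carries a kind. The container language wins; the kind comes from the MPD.
const resolvedAttributes = resolveTrackAttributes(
  { language: "fr" },
  { language: "en", kind: "main" }
);
```

With this ordering, a file played through a manifest and the same file played directly expose the same language whenever the container carries one, which is the consistency the original report asks for.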

Comment 4 Silvia Pfeiffer 2014-09-22 08:02:09 UTC

FWIW: Looks good to me.

Comment 5 Cyril Concolato 2014-09-23 07:28:00 UTC

Fine by me too.

Comment 6 Silvia Pfeiffer 2014-09-23 08:30:41 UTC

https://github.com/w3c/HTMLSourcingInbandTracks/pull/21 was merged.

Comment 7 Jon Piesing (HbbTV) 2014-09-23 15:15:25 UTC

I think there are significant drawbacks to what is proposed here.

While it could make sense for representations that are currently being presented, it would seem to force the UA to actually fetch the data for representations that are not being presented in order to populate the properties.

Was this case considered?

Apologies for not reacting earlier but I've been on vacation.

Comment 8 Bob Lund 2014-09-23 19:38:42 UTC

(In reply to Jon Piesing (HbbTV) from comment #7)
> I think there are significant drawbacks to what is proposed here.
> 
> While it could make sense for representations that are currently being
> presented, it would seem to force the UA to actually fetch the data for
> representations that are not being presented in order to populate the
> properties.

The proposed resolution only addresses the priority the UA should use in sourcing in-band tracks - data in media container, then MPD. So, it doesn't require the UA to do anything new with respect to sourcing tracks.

I think the point you are raising has to do with when the UA needs to source Cues in the text tracks. This is a valid point and applies to other media containers besides DASH. We could specify that text tracks should be created with @mode set to "disabled" [1]. In this mode, the track object exists in the DOM but no Cues are created.

But, this is a separate issue that has nothing to do with this bug. I will create a new bug to track this.

[1] http://www.w3.org/TR/html5/embedded-content-0.html#text-track-disabled

> 
> Was this case considered?
> 
> Apologies for not reacting earlier but I've been on vacation.

Comment 9 Jon Piesing (HbbTV) 2014-09-24 14:35:46 UTC

This is nothing about text tracks or Cues - it's about video and audio tracks primarily.

There are two different cases to address:
1) A track whose content is currently being presented by the UA (e.g. the currently selected audio track).
2) A track whose content is not currently being presented by the UA (e.g. audio tracks other than the one currently being presented, or all audio tracks when none is being presented yet).

I believe this change reasonably addresses the first case.
However what about the second case?

Suppose a DASH MPD has 5 audio adaptation sets, 3 for different languages and 2 for accessibility and only one of the 5 is being presented.

If the UA is to create an AudioTrack for each of these 5 adaptation sets then the change made would require it to fetch data for each of the 4 adaptation sets not being presented. I believe that is wrong.

Comment 10 Cyril Concolato 2014-09-24 15:15:30 UTC

(In reply to Jon Piesing (HbbTV) from comment #9)
> Suppose a DASH MPD has 5 audio adaptation sets, 3 for different languages
> and 2 for accessibility and only one of the 5 is being presented.
> 
> If the UA is to create an AudioTrack for each of these 5 adaptation sets
> then the change made would require it to fetch data for each of the 4
> adaptation sets not being presented. I believe that is wrong.

Why? In many cases, you can't know for sure that the track will be playable until you've downloaded the initialization segment anyway, so you shouldn't expose it. In GPAC, when we build the GUI for selecting a track, we base the information on the MP4, not the MPD. For MSE-based implementations in browsers, I would wait for the SourceBuffers to be created and initialized too. You can decide to expose the tracks based on the MPD information only; in that case, make sure your UI can be updated when the real data arrives.
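
The last option in this comment (expose tracks from the MPD alone, then update when the real data arrives) could be sketched as follows. This is only an illustration under stated assumptions: ProvisionalTrackList, ExposedTrack, addFromManifest and confirmFromContainer are hypothetical names, not GPAC, MSE or specification APIs.

```typescript
// Hypothetical track record exposed to a selection UI.
interface ExposedTrack {
  id: string;
  language: string | undefined;
  provisional: boolean; // true while only MPD information is known
}

// Keeps one entry per track and notifies listeners on every change,
// so a UI built from MPD data can be refreshed later.
class ProvisionalTrackList {
  private tracks = new Map<string, ExposedTrack>();
  private listeners: Array<(track: ExposedTrack) => void> = [];

  onChange(listener: (track: ExposedTrack) => void): void {
    this.listeners.push(listener);
  }

  // Called when the manifest is parsed, before any media is fetched.
  addFromManifest(id: string, language: string | undefined): void {
    this.store({ id, language, provisional: true });
  }

  // Called once the initialization segment has been parsed.
  // Container data replaces the MPD value unless the container has none.
  confirmFromContainer(id: string, language: string | undefined): void {
    const existing = this.tracks.get(id);
    const fallback = existing !== undefined ? existing.language : undefined;
    this.store({
      id,
      language: language !== undefined ? language : fallback,
      provisional: false,
    });
  }

  private store(track: ExposedTrack): void {
    this.tracks.set(track.id, track);
    for (const listener of this.listeners) {
      listener(track);
    }
  }
}

// Usage: the MPD claims "en", the initialization segment says "fr".
const trackList = new ProvisionalTrackList();
const uiUpdates: string[] = [];
trackList.onChange((t) => {
  uiUpdates.push(`${t.id}:${String(t.language)}:${String(t.provisional)}`);
});
trackList.addFromManifest("audio-1", "en");
trackList.confirmFromContainer("audio-1", "fr");
```

The listener is invoked twice here: once with the provisional MPD value and once with the confirmed container value, which is the point at which the UI would be refreshed.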

Comment 11 Silvia Pfeiffer 2014-09-24 19:48:17 UTC

Jon, the spec is basically all about how to retrieve metadata about audio, video and text (data) tracks (except for the text track cue sourcing). I agree with Cyril: you can't make sensible decisions about which tracks to activate without this information available.

Comment 12 Bob Lund 2014-09-24 20:03:30 UTC

(In reply to Jon Piesing (HbbTV) from comment #9)
> This is nothing about text tracks or Cues - it's about video and audio
> tracks primarily.
> 
> There are two different cases to address;
> 1) A track whose content is currently being presented by the UA (e.g. the
> currently selected audio track)
> 2) A track whose content is not currently being presented by the UA (e.g. i)
> audio tracks other than the one currently being presented or ii) all audio
> tracks when none are yet being presented)
> 
> I believe this change reasonably addresses the first case.
> However what about the second case?
> 
> Suppose a DASH MPD has 5 audio adaptation sets, 3 for different languages
> and 2 for accessibility and only one of the 5 is being presented.
> 
> If the UA is to create an AudioTrack for each of these 5 adaptation sets
> then the change made would require it to fetch data for each of the 4
> adaptation sets not being presented. I believe that is wrong.

It's not clear what you mean by "fetching data". Are you referring to creating the track? Or are you referring to playing the media? In the case of creating a track, a track isn't created for each representation, only for each adaptation set or content component. The UA must create a track representation for each AdaptationSet/ContentComponent in order for the application or user to select the appropriate ones for rendering.
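
To make the granularity concrete, here is a small TypeScript sketch of one possible reading of the paragraph above: one track per ContentComponent when an AdaptationSet declares any, otherwise one track per AdaptationSet, and never one per Representation. The AdaptationSetInfo shape, the trackIdsFor function and the "set/component" identifier format are assumptions made for the example, not definitions from the DASH or HTML specifications.

```typescript
// Hypothetical, simplified view of an MPD AdaptationSet.
interface AdaptationSetInfo {
  id: string;
  contentComponentIds: string[]; // empty when none are declared
  representationIds: string[];   // bitrate variants; no track per entry
}

// Returns one track identifier per ContentComponent, or one per
// AdaptationSet when it declares no ContentComponent.
function trackIdsFor(sets: AdaptationSetInfo[]): string[] {
  const ids: string[] = [];
  for (const set of sets) {
    if (set.contentComponentIds.length > 0) {
      for (const componentId of set.contentComponentIds) {
        ids.push(`${set.id}/${componentId}`);
      }
    } else {
      ids.push(set.id);
    }
  }
  return ids;
}

// Two adaptation sets with two representations each yield three tracks,
// not four: the number of representations does not affect the result.
const exampleTrackIds = trackIdsFor([
  { id: "audio-en", contentComponentIds: [], representationIds: ["64k", "128k"] },
  { id: "av", contentComponentIds: ["video", "audio"], representationIds: ["low", "high"] },
]);
```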

In any case, this issue is not introduced by the resolution of this bug. This bug should be closed with the proposed resolution, and you should submit a new bug if you feel that the discussion of when to create a track should be pursued.