This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 22134 - When do multiple SourceBuffers have to be used
Summary: When do multiple SourceBuffers have to be used
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Media Source Extensions
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Aaron Colwell
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard: PRE_LAST_CALL
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-22 14:09 UTC by Jon Piesing (OIPF)
Modified: 2013-07-02 20:17 UTC
CC List: 8 users

See Also:


Attachments

Description Jon Piesing (OIPF) 2013-05-22 14:09:11 UTC
The specification is unclear about when multiple SourceBuffers have to be used (i.e. when using a single SourceBuffer will fail) and when they may be used (i.e. using a single SourceBuffer will not fail but may not be recommended).
	
We request that this be clarified and that the clarification cover at least the following scenarios:
1.	Video and audio are each delivered separately (e.g. DASH ISOBMFF content where each representation contains only one media component).
2.	One video track and multiple audio tracks (e.g. multiple languages or accessible audio) are each delivered separately (e.g. DASH ISOBMFF content where each representation contains only one media component).
3. The audio is encoded such that part of the audio does not meet the requirements of section 11 of MSE to form a single logical byte stream, e.g. the audio in the main content is Dolby but a number of adverts are to be inserted where the audio is HE-AAC. There are two variations on this, 
i) where the video during the adverts also does not meet the requirements of section 11 and
ii) where the video during the adverts does meet the requirements of section 11.
4.	A new initialization segment is needed that has a different number and/or type of tracks and/or different track IDs.
5.	The byte stream format changes, e.g. from MPEG-2 transport stream to ISO BMFF or vice-versa.

NOTE: This issue arises from joint discussions between the Open IPTV Forum, HbbTV and the UK DTG. These organizations originally sent a liaison statement to the W3C Web & TV IG which is archived here:

https://lists.w3.org/Archives/Member/member-web-and-tv/2013Jan/0000.html (W3C member only link)
Comment 1 Aaron Colwell 2013-05-23 18:31:04 UTC
Marking all pre-Last Call bugs
Comment 2 Pierre Lemieux 2013-05-28 15:53:52 UTC
[Per editor's request on 20130528 TF call]

In order to enable interoperability, the specification needs to include guidance on the number of SourceBuffers that implementations are expected to support.
Comment 3 Aaron Colwell 2013-05-28 23:38:53 UTC
(In reply to comment #0)
> The specification is unclear about when multiple SourceBuffers have to be
> used (i.e. when using a single SourceBuffer will fail) and when they may be
> used (i.e. using a single SourceBuffer will not fail but may not be
> recommended).

I don't believe the MSE spec should be outlining the various use cases content authors might have. It should provide information on how the UA behaves, what data it accepts, and what it does not. I'm happy to provide guidance on the situations you outline below, but it isn't clear to me what should be changed in the spec since I believe the existing text provides the appropriate hints about what to expect.

> 	
> We request that this be clarified and that the clarification cover at least
> the following scenarios;
> 1.	Video and audio are each delivered separately (e.g. DASH ISOBMFF content
> where each representation contains only one media component).

This would require multiple SourceBuffers because it implies multiple initialization segments with different track types. For example, one initialization segment would only contain an audio track and the other initialization segment would only contain a video track. This is outlined in the rules that apply to initialization segments in Section 11.
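For illustration, a rough sketch of that pattern in application code (TypeScript; the segment URLs and codec strings are placeholders picked for the example, not anything required by the spec):

// Sketch: one SourceBuffer per elementary stream (audio-only + video-only).
// URLs and codec strings are illustrative placeholders.
async function fetchSegment(url: string): Promise<ArrayBuffer> {
  const response = await fetch(url);
  return response.arrayBuffer();
}

const video = document.querySelector('video') as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  // Each single-component byte stream gets its own SourceBuffer.
  const audioBuffer = mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"');
  const videoBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.4d401f"');

  // Append each stream's initialization segment, then its media segments.
  audioBuffer.appendBuffer(await fetchSegment('audio-init.mp4'));
  videoBuffer.appendBuffer(await fetchSegment('video-init.mp4'));
});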

> 2.	One video track and multiple audio tracks (e.g. multiple languages or
> accessible audio) are each delivered separately (e.g. DASH ISOBMFF content
> where each representation contains only one media component).

This would require multiple SourceBuffers for the same reasons as above.

> 3.	The audio is encoded such that part of the audio it does not meet the
> requirements of section 11 of MSE to form a single logical byte stream, e.g.
> the audio in the main content is Dolby but a number of adverts are to be
> inserted where the audio is HE-AAC. There are two variations on this, 
> i) where the video during the adverts also does not meet the requirements of
> section 11 and
> ii) where the video during the adverts does meet the requirements of section
> 11.

This would require multiple SourceBuffers because codec switches are not allowed in a single SourceBuffer. Since multiple SourceBuffers are involved, this also implies that the content for the different codecs is represented by different AudioTrack and VideoTrack objects. The spec makes no guarantees about seamless transitions between tracks so the content author should not assume these track switches will be seamless. I believe Bug 22135 is intended to continue a discussion about this particular constraint.
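A rough sketch of what that looks like on the application side (the codec strings are illustrative placeholders, and as noted above the spec does not promise the track switch is seamless):

// Sketch: main-programme audio and ad audio use different codecs, so each
// gets its own SourceBuffer and therefore its own AudioTrack on the element.
function createAdAudioBuffers(mediaSource: MediaSource) {
  const mainAudio = mediaSource.addSourceBuffer('audio/mp4; codecs="ec-3"');     // Dolby
  const adAudio = mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.5"');  // HE-AAC
  return { mainAudio, adAudio };
}

function enableAudioTrack(video: HTMLVideoElement, trackIndex: number): void {
  // audioTracks is not exposed in every TypeScript DOM lib or browser,
  // hence the loose cast in this sketch.
  const tracks = (video as any).audioTracks;
  for (let i = 0; i < tracks.length; i++) {
    tracks[i].enabled = (i === trackIndex); // enable exactly one audio track
  }
}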

> 4.	A new initialization segment is needed that has a different number and/or
> type of tracks and/or different track IDs.

This is explicitly forbidden in a single SourceBuffer based on the rules in Section 11 so multiple SourceBuffers would need to be used here. I believe that Bug 22137 is about changing this requirement so we may be able to find some middle ground on this one. Right now though, this scenario requires multiple SourceBuffers because the number and type of tracks are not allowed to change within the initialization segments of a bytestream.
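One way an application can deal with that today is to replace the SourceBuffer when the track layout changes; a rough sketch (the type string is a placeholder, and whether the transition is smooth is implementation-dependent):

// Sketch: if the next initialization segment has a different number/type of
// tracks or different track IDs, append it to a fresh SourceBuffer instead of
// the existing one (which would trigger a "decode" error).
function replaceSourceBuffer(mediaSource: MediaSource,
                             oldBuffer: SourceBuffer,
                             newType: string): SourceBuffer {
  mediaSource.removeSourceBuffer(oldBuffer); // drops the old buffer and its tracks
  return mediaSource.addSourceBuffer(newType);
}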

> 5.	The byte stream format changes, e.g. from MPEG-2 transport stream to ISO
> BMFF or vice-versa.

Bytestream format changes are not allowed within a single SourceBuffer because when an application calls addSourceBuffer() to create a SourceBuffer it needs to specify the mimetype of the bytestream format it intends to append to the SourceBuffer object. Obviously MPEG-2 TS data does not conform to ISOBMFF bytestream spec rules, so you can expect that changing to MPEG-2 TS in a SourceBuffer created for ISOBMFF would trigger a decode error in step 2 of the Segment Parser Loop. If you need to support different bytestreams then you need to use separate SourceBuffers for that.
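In code that looks roughly like the following (the type strings are illustrative, and since MPEG-2 TS byte stream support is not universal the sketch checks isTypeSupported() first):

// Sketch: the byte stream format is fixed per SourceBuffer by the type passed
// to addSourceBuffer(), so ISOBMFF and MPEG-2 TS content need separate buffers.
function createFormatBuffers(mediaSource: MediaSource) {
  const isobmff = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.4d401f"');
  const mp2t = MediaSource.isTypeSupported('video/mp2t; codecs="avc1.4d401f"')
    ? mediaSource.addSourceBuffer('video/mp2t; codecs="avc1.4d401f"')
    : null; // this UA does not support the MPEG-2 TS byte stream format
  return { isobmff, mp2t };
}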

I hope this clarifies things.

> 
> NOTE: This issue arises from joint discussions between the Open IPTV Forum,
> HbbTV and the UK DTG. These organizations originally sent a liaison
> statement to the W3C Web & TV IG which is archived here;
> 
> https://lists.w3.org/Archives/Member/member-web-and-tv/2013Jan/0000.html
> (W3C member only link)
Comment 4 Aaron Colwell 2013-06-05 17:23:31 UTC
Changes committed
https://dvcs.w3.org/hg/html-media/rev/63675668846c

Added text to indicate the minimum number of SourceBuffers implementations are expected to support.
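For reference, addSourceBuffer() signals the point at which a user agent will not accept another SourceBuffer by throwing QuotaExceededError, so an application can probe the limit defensively; a rough sketch (the type string is whatever the application actually needs):

// Sketch: addSourceBuffer() throws QuotaExceededError once the UA cannot
// handle another SourceBuffer (or the resulting configuration is unsupported).
function tryAddSourceBuffer(mediaSource: MediaSource, type: string): SourceBuffer | null {
  try {
    return mediaSource.addSourceBuffer(type);
  } catch (e) {
    if (e instanceof DOMException && e.name === 'QuotaExceededError') {
      return null; // over the UA's SourceBuffer limit for this configuration
    }
    throw e; // NotSupportedError, InvalidStateError, etc.
  }
}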
Comment 5 Jon Piesing (OIPF) 2013-06-11 09:55:57 UTC
This comment is submitted on behalf of the participants in the joint discussions between the Open IPTV Forum, HbbTV and the UK DTG.

Thank you for these clarifications which are very useful in helping us understand how MSE would work in our context.

Comment #3 says “I'm happy to provide guidance on the situations you outline below, but it isn't clear to me what should be changed in the spec since I believe the existing text provides the appropriate hints about what to expect.”.

We have tried to understand what “existing text” is referred to here. As far as we can see, the only link from the byte stream format to the source buffer behaviour is the following text in section 3.5.1.

"If the input buffer starts with bytes that violate the byte stream format specifications, then run the end of stream algorithm with the error parameter set to "decode" and abort this algorithm.

Remove any bytes that the byte stream format specifications say must be ignored from the start of the input buffer."

If this is the only “existing text” then:

- We recommend some text referencing this failure mode be added to the introduction to the byte stream formats section, e.g. ‘The behavior in the event that the bytes provided to a SourceBuffer do not meet the rules defined in this section is defined in section 3.5.1, “Segment Parser Loop”.’ (A sketch of how an application observes this failure mode follows after this list.)

- The text from section 3.5.1 talks about the input buffer starting with bytes that violate the byte stream format specifications. What if the violation of the byte stream format specs isn’t at the start of the input buffer? (If you prefer we can open a new issue for this question.)
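For completeness, a rough sketch of how an application observes this failure mode (it assumes the end of stream algorithm ran with error = "decode" after the element already had media data, so the element reports MEDIA_ERR_DECODE):

// Sketch: when appended bytes violate the byte stream format, MSE runs the
// end of stream algorithm with error = "decode"; the application typically
// sees this as a decode error on the media element.
function watchForDecodeErrors(video: HTMLVideoElement, mediaSource: MediaSource): void {
  video.addEventListener('error', () => {
    if (video.error && video.error.code === MediaError.MEDIA_ERR_DECODE) {
      // mediaSource.readyState is "ended" here; recovery generally requires
      // resetting the element's source.
      console.warn('A SourceBuffer received bytes that violated its byte stream format');
    }
  });
}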
Comment 6 Cyril Concolato 2013-06-11 13:25:00 UTC
(In reply to comment #3)
> Bytestream format changes are not allowed within a single SourceBuffer
> because when an application calls addSourceBuffer() to create a SourceBuffer
> it needs to specify the mimetype of the bytestream format it intends to
> append to the SourceBuffer object. Obviously MPEG-2 TS data does not conform
> to ISOBMFF bytestream spec rules so you can expect a changing to MPEG2 TS in
> a SourceBuffer created for ISOBMFF would trigger a decode error in step 2 of
> the Segment Parser Loop. If you need to support different bytestreams then
> you need to use seperate SourceBuffers for that.
> 
> I hope this clarifies things.
Actually, the spec isn't very clear about that. In Segment Parser Loop it says:
"If the input buffer starts with bytes that violate the byte stream format specifications, then run the end of stream algorithm with the error parameter set to "decode" and abort this algorithm."
Note the 's' at the end of "byte stream format specifications". So strictly speaking, MPEG-2 TS could be allowed in an ISO BMFF SourceBuffer. This should probably be fixed.

Additionally, the MIME type provided to the addSourceBuffer call is never referred to. The fix should probably mention it. 

Also, it is unclear if the codec parameters passed to the addSourceBuffer call (if any) should be used to reject input data in the Segment Parser Loop (i.e. if initialization segments declaring additional tracks compared to what was passed to the addSourceBuffer call should be accepted).
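As a side note on the application-facing half of this: the full type string, codecs parameter included, is what addSourceBuffer() receives, and isTypeSupported() can be used to vet it up front. A rough sketch (codec strings are illustrative placeholders):

// Sketch: the container (byte stream format) and the codecs are both declared
// in the type string passed to addSourceBuffer().
function createCheckedSourceBuffer(mediaSource: MediaSource, type: string): SourceBuffer {
  if (!MediaSource.isTypeSupported(type)) {
    throw new Error('Unsupported byte stream/codec combination: ' + type);
  }
  return mediaSource.addSourceBuffer(type);
}

// e.g. createCheckedSourceBuffer(mediaSource, 'video/mp4; codecs="avc1.4d401f, mp4a.40.2"');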
Comment 7 Aaron Colwell 2013-07-02 20:17:41 UTC
Change committed.
https://dvcs.w3.org/hg/html-media/rev/b98190a4472c

- Clarified that the "decode" error should be signalled if the input buffer contains bytes that violate the byte stream specification instead of just checking to see if the start of the buffer contained invalid bytes.

- Added "SourceBuffer byte stream format specification" definition and reference to clarify that only one byte stream format can be used with a SourceBuffer and the format is selected based on the type passed to addSourceBuffer().

- Added a step to the "Initialization Segment Received" algorithm to clarify what happens if an initialization segment contains tracks with unsupported codecs.
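For application authors, a rough sketch of how such a failure would typically surface (this assumes the failed append fires "error" on the SourceBuffer, followed by the usual decode error handling on the element):

// Sketch: if an appended initialization segment declares a track whose codec
// the UA does not support, the append fails and the SourceBuffer fires
// "error" (and "updateend") instead of "update".
function appendWithErrorLogging(buffer: SourceBuffer, data: ArrayBuffer): void {
  buffer.addEventListener('error', () => {
    console.warn('Append failed; possibly an init segment with an unsupported codec');
  }, { once: true });
  buffer.appendBuffer(data);
}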