This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 22136 - Inband Storage for SPS/PPS in ISO BMFF
Summary: Inband Storage for SPS/PPS in ISO BMFF
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Media Source Extensions
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Aaron Colwell
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard: PRE_LAST_CALL
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-22 14:12 UTC by Jon Piesing (OIPF)
Modified: 2013-07-27 14:23 UTC

Description Jon Piesing (OIPF) 2013-05-22 14:12:40 UTC
The DASH Industry Forum implementation guidelines include the following:

“Clients are expected to support Inband Storage for SPS/PPS based on Draft Amendment 32 for ISO/IEC 14496-15 as issued from MPEG#101 [23].” 

Please include a reference to this in the definition of the initialization segment part of the byte stream format for the ISO BMFF so that it is clear how these fit in the format.

NOTE: This issue arises from joint discussions between the Open IPTV Forum, HbbTV and the UK DTG. These organizations originally sent a liaison statement to the W3C Web & TV IG which is archived here:

https://lists.w3.org/Archives/Member/member-web-and-tv/2013Jan/0000.html (W3C member only link)
Comment 1 Mark Watson 2013-05-22 15:23:49 UTC
Are you suggesting that clients must support both SPS/PPS in the Initialization Segment and inband SPS/PPS, or that segments must include inband SPS/PPS ?
Comment 2 Jon Piesing (OIPF) 2013-05-22 19:44:56 UTC
>Are you suggesting that clients must support both SPS/PPS in the Initialization Segment and inband SPS/PPS, or that segments must include inband SPS/PPS ?

My understanding is that many broadcasters really would like the first of your options. Even if this group won't agree to that, clarity about how MPEG's solution for inband SPS/PPS in ISOBMFF fits into this group's vision / concept of Initialization Segments would be a big step forwards. 

I know some are hoping that MSE could fully support an implementation of the DASH-IF guidelines and while inband SPS/PPS are only "expected" in that document, broadcasters see them as particularly useful for live content.
Comment 3 Aaron Colwell 2013-05-23 18:31:10 UTC
Marking all pre-Last Call bugs
Comment 4 Aaron Colwell 2013-05-24 00:52:04 UTC
(In reply to comment #2)
> >Are you suggesting that clients must support both SPS/PPS in the Initialization Segment and inband SPS/PPS, or that segments must include inband SPS/PPS ?
> 
> My understanding is that many broadcasters really would like the first of
> your options. Even if this group won't agree to that, clarity about how
> MPEG's solution for inband SPS/PPS in ISOBMFF fits into this group's vision
> / concept of Initialization Segments would be a big step forwards. 
> 
> I know some are hoping that MSE could fully support an implementation of the
> DASH-IF guidelines and while inband SPS/PPS are only "expected" in that
> document, broadcasters see them as particularly useful for live content.

I don't have access to this MPEG document so I can't look at the details. 
Here are the questions I have:
1. Why is this beneficial over just providing a new init segment? 
2. Won't you need to have a new init segment anyways to allow people to join the broadcast at arbitrary points? 
3. Are inline SPS/PPS required at the beginning of every random access point?
4. What would break if the UA used the last appended init segment and started decoding at the first random access point encountered?
5. Is this data included in the sample or is it side information that needs to be combined with the sample before handing it to the decoder?
Comment 5 David Evans (BBC) 2013-05-24 15:18:40 UTC
In response to the questions in Comment 4, here is the BBC's take on how this might be used:

1. Why is this beneficial over just providing a new init segment? 
Two reasons:
- Updating the init segment in a live stream isn't easy, as there is no way in most streaming specifications to force the client to pick up a new init segment.
- Compatibility with other specifications where the use of in-band SPS/PPS is required or advised for certain use cases.

2. Won't you need to have a new init segment anyways to allow people to join the broadcast at arbitrary points? 
You still have an init segment, it just doesn't necessarily contain the SPS/PPS.

3. Are inline SPS/PPS required at the beginning of every random access point?
Yes.

4. What would break if the UA used the last appended init segment and started decoding at the first random access point encountered?
Nothing, that would work.

5. Is this data included in the sample or is it side information that needs to be combined with the sample before handing it to the decoder?
It's included in the sample.
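
For readers following answers 3 and 5, here is a minimal Python sketch of what "included in the sample" can look like to a parser. It assumes each NAL unit in the sample is preceded by a big-endian length field (4 bytes here; the real size is signalled in the sample entry) and uses the H.264 NAL unit type values 7 (SPS), 8 (PPS) and 5 (IDR slice). The function names and `make_sample` helper are hypothetical, not taken from any implementation discussed in this bug.

```python
from typing import List

NAL_IDR = 5  # H.264 IDR slice
NAL_SPS = 7  # H.264 sequence parameter set
NAL_PPS = 8  # H.264 picture parameter set


def make_sample(nals: List[bytes], length_size: int = 4) -> bytes:
    """Hypothetical helper: build a length-prefixed sample from raw NAL units."""
    return b"".join(len(n).to_bytes(length_size, "big") + n for n in nals)


def nal_types(sample: bytes, length_size: int = 4) -> List[int]:
    """Return the NAL unit types found in one length-prefixed sample."""
    types = []
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated NAL length field")
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if nal_len == 0 or pos + nal_len > len(sample):
            raise ValueError("truncated or empty NAL unit")
        # The NAL unit type is the low 5 bits of the first header byte.
        types.append(sample[pos] & 0x1F)
        pos += nal_len
    return types


def has_inband_parameter_sets(sample: bytes, length_size: int = 4) -> bool:
    """True if the sample carries both an SPS and a PPS in-band."""
    types = nal_types(sample, length_size)
    return NAL_SPS in types and NAL_PPS in types
```

With a sample built as `make_sample([b"\x67\x42", b"\x68\xce", b"\x65\x88"])`, `nal_types` returns `[7, 8, 5]`: the parameter sets travel in the same sample as the IDR slice, so no side information has to be merged before decoding.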
Comment 6 Aaron Colwell 2013-06-05 15:41:27 UTC
Would adding the text below satisfy your concerns?

Implementation must support inband SPS/PPS. Media segments that contain inband SPS/PPS must contain this information at every random access point. Inband SPS/PPS must not apply across media segment boundaries. An initialization segment with SPS/PPS must be appended before any sequence of media segments that do not contain inband SPS/PPS.
Comment 7 Jon Piesing (OIPF) 2013-06-11 09:59:36 UTC
This comment is submitted on behalf of the participants in the joint discussions between the Open IPTV Forum, HbbTV and the UK DTG.

You don’t say where this text would go. If this text would go in the ISOBMFF byte stream format section then this is fine.

If this text would be applicable to all byte stream formats then perhaps each byte stream format section needs to define what inband SPS/PPS mean for that format.
Comment 8 Aaron Colwell 2013-06-11 14:04:10 UTC
(In reply to comment #7)
> This comment is submitted on behalf of the participants in the joint
> discussions between the Open IPTV Forum, HbbTV and the UK DTG.
> 
> You don’t say where this text would go. If this text would go in the ISOBMFF
> byte stream format section then this is fine.
> 
Yes. It would go in the ISOBMFF section only.
Comment 9 Jerry Smith 2013-07-09 02:57:32 UTC
This suggestion makes sense in principle, since it helps align MSE with MPEG-DASH.  The ISO-BMFF part 15 spec it references is still evolving though.  Our internal review of it suggests that the in-band elementary stream requirements are likely to change further.  If we require it in MSE, it will be difficult for implementations to support the requirement soon.

SPS/PPS in-band data would be used to adapt support for live content to changing layout parameters.  These are currently at least partially satisfied by packaging SPS/PPS data in avcn boxes within the video data.  The avcn implementation can and should be supported for MSE, though I don’t believe it requires additional specification language.
Comment 10 Jon Piesing (OIPF) 2013-07-09 06:17:24 UTC
(In reply to comment #9)
> This suggestion makes sense in principle, since it helps align MSE with
> MPEG-DASH.  The ISO-BMFF part 15 spec it references is still evolving
> though.  Our internal review of it suggests that the in-band elementary
> stream requirements are likely to change further.  If we require it in MSE,
> it will be difficult for implementations to support the requirement soon.
> 
> SPS/PPS in-band data would be used to adapt support for live content to
> changing layout parameters.  These are currently at least partially
> satisfied by packaging SPS/PPS data in avcn boxes within the video data. 
> The avcn implementation can and should be supported for MSE, though I don’t
> believe it requires additional specification language.

I was led to believe that inband SPS/PPS (i.e. 'avc3'/'avc4') was partly the result of discussing a proposal to include 'avcn' in MPEG. Is the specification for 'avcn' even suitable to reference? Even if it could be referenced with a little flexibility, why would you do that when there's an official MPEG solution to the same problem - 'avc3'/'avc4'?
Comment 11 Jerry Smith 2013-07-23 01:00:55 UTC
Responding to the last comment about AVC3/AVC4:  There is language in the part 15 3rd FDIS draft that says decoders must support avc1/avc2 with or without avcn boxes, and avc3/avc4 as well. I believe this means supporting PPS/SPS stored in separate elementary streams, in avc sample entries and as part of the sample itself.

On the maturity of the requirement:  We believe now that this requirement is sufficiently mature for MSE to reference it.  We recommend that a SHOULD clause be added to the informative section of the ISO BMFF byte stream that accepts the change proposed initially in this bug.
Comment 12 Chris Poole (BBC) 2013-07-23 11:29:04 UTC
For us as a content provider, this issue is primarily about content interoperability.  We want to be sure that MPEG DASH content can be created once and played by both MSE and non-MSE based clients.  We do not want to have to create separate versions of our content for different devices where the actual codecs used are the same.  This interoperability was one of the main aims of MPEG DASH in the first place of course.

We know that use of the traditional 'avc1' sample entry creates challenges for implementations on some classes of device and that the in-band carriage of codec parameter sets enabled by the 'avc3' option addresses those problems.  As Jon Piesing notes at the top of this bug, the DASH Industry Forum also recognised this issue and the DASH 264 interoperability document expects clients to support in-band carriage of the codec parameter sets.  So we believe MSE needs to support this if it is to allow for MPEG DASH implementations that are widely interoperable.

There are a few other points in favour of this:
- it's a more appropriate way of delivering a continuous live stream of segments as it allows changes to be made to the codec parameters over time
- looking ahead, using the avc3 approach is consistent with what is required for HEVC
- implementing avc3 is likely to be straightforward since the avc3 format is closer to what is generally required as input to the video decoder.  In contrast to the older 'avcn' approach, there are no new data structures to parse as the codec parameters are delivered in NAL structures which video decoders already handle.  In fact we were able to add support for avc3 in two open source projects simply by making them accept the 'avc3' identifier for the sample entry - everything else was handled correctly by existing code.

As far as we're aware the spec is stable for this and it is just progressing through the final stages of the ISO process. 

Building on Aaron's proposed text in comment 6, I'd propose the following:

"Implementations supporting content packaged according to ISO/IEC 14496-15 must support in-band carriage of codec Parameter Sets.  Media segments that contain in-band Parameter Sets must contain this information at every random access point. In-band Parameter Sets must not apply across media segment boundaries. An initialization segment containing Parameter Sets must be appended before any sequence of media segments that do not contain in-band Parameter Sets."

It's hard to see how having this as a 'should' would address the content interoperability issues so I'm suggesting it should remain a 'must', as originally proposed.
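
As an illustration of "accepting the 'avc3' identifier", the following Python sketch maps each AVC sample entry type to where its parameter sets may be stored. The table reflects how the sample entry types are described in this bug (sample entry storage for avc1/avc2, in-band storage allowed for avc3/avc4); treating avc3/avc4 as also permitting sample entry storage is an assumption based on Comment 5, and the names are hypothetical rather than taken from either open source project mentioned above.

```python
# Where each AVC sample entry type may carry parameter sets.
# Assumption: avc3/avc4 allow both locations; avc1/avc2 only the sample entry.
SAMPLE_ENTRY_STORAGE = {
    "avc1": {"sample_entry"},
    "avc2": {"sample_entry"},
    "avc3": {"sample_entry", "inband"},
    "avc4": {"sample_entry", "inband"},
}


def allows_inband_parameter_sets(fourcc: str) -> bool:
    """True if the given sample entry type permits in-band parameter sets."""
    if fourcc not in SAMPLE_ENTRY_STORAGE:
        raise ValueError("not an AVC sample entry type: %r" % fourcc)
    return "inband" in SAMPLE_ENTRY_STORAGE[fourcc]
```

A demuxer that only recognised 'avc1' would, under this sketch, need little more than the extra table rows to route 'avc3' tracks to the same decoder path.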
Comment 13 Jerry Smith 2013-07-23 13:12:55 UTC
Aaron proposed this:  

"Implementation must support inband SPS/PPS. Media segments that contain inband SPS/PPS must contain this information at every random access point. Inband SPS/PPS must not apply across media segment boundaries. An initialization segment with SPS/PPS must be appended before any sequence of media segments that do not contain inband SPS/PPS."

I would prefer making a higher level reference to avoid stating requirements details from the ISO standards.   I propose:  

"The implementation must support PPS/SPS stored in the sample entry (as defined for avc1/avc2), and should support PPS/SPS stored inband in the samples themselves (as defined for avc3/avc4).  An initialization segment with SPS/PPS must be appended before any sequence of media segments that do not contain inband SPS/PPS."
Comment 14 Chris Poole (BBC) 2013-07-23 13:48:26 UTC
It would need to be a 'must' for avc3/4 otherwise you don't have content interoperability which is our primary concern.

Otherwise, your proposed text gets the point across but I think it would be made clearer if 14496-15 is mentioned (because PPS/SPS are AVC/HEVC concepts) and also possibly better to talk more generally about Parameter Sets (because HEVC introduces a new one).
Comment 15 Aaron Colwell 2013-07-25 16:25:37 UTC
How about this as a compromise between Jerry's and Chris' suggestions?

1. Place the following right below "The user agent must handle Edit Boxes (edts) ..."
"The user agent must support parameter sets (e.g., PPS/SPS) stored in the sample entry (as defined for avc1/avc2), and should support parameter sets stored inband in the samples themselves (as defined for avc3/avc4).
Note: For maximum content interoperability user agents are strongly advised to support avc3/avc4 if possible."

2. Add the following to the end of the bulleted list in Section 12.2.2.
"6. Inband parameter sets are not present in the appropriate samples and parameter sets are not present in the last initialization segment appended."

I think this captures the discussion in this bug as well as what was discussed on the call.
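
The error condition proposed in item 2 could be checked roughly as in the Python sketch below. It assumes that "the appropriate samples" means the random access points of the media segment (following answer 3 in Comment 5), and the `Sample` type and function name are hypothetical; this is not the user agent algorithm from the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    is_random_access_point: bool
    has_inband_parameter_sets: bool


def violates_parameter_set_rule(init_has_parameter_sets: bool,
                                samples: List[Sample]) -> bool:
    """True when neither the last init segment nor the samples supply parameter sets.

    Assumption: only random access points are required to carry in-band
    parameter sets; a segment with no random access point never triggers
    this particular check.
    """
    if init_has_parameter_sets:
        return False
    return any(
        s.is_random_access_point and not s.has_inband_parameter_sets
        for s in samples
    )
```

For example, a segment whose only random access point lacks in-band parameter sets is flagged when the last appended initialization segment carried none, and accepted when it did.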
Comment 16 Aaron Colwell 2013-07-26 14:42:02 UTC
Change committed.
https://dvcs.w3.org/hg/html-media/rev/9035359fe231
Comment 17 Chris Poole (BBC) 2013-07-26 17:22:19 UTC
Nicely worded.  Just one suggestion: can we drop the "if possible" from the end of the note?  It's already just advisory and those words tend to weaken the impact of "strongly advised" earlier in the sentence.
Comment 18 Aaron Colwell 2013-07-27 14:23:49 UTC
(In reply to comment #17)
> Nicely worded.  Just one suggestion: can we drop the "if possible" from the
> end of the note?  It's already just advisory and those words tend to weaken
> the impact of "strongly advised" earlier in the sentence.
Change committed.
https://dvcs.w3.org/hg/html-media/rev/39742e588e63