This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 22117 - Add a conformance section
Summary: Add a conformance section
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Media Source Extensions
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Aaron Colwell
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard: PRE_LAST_CALL
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-21 11:24 UTC by Cyril Concolato
Modified: 2013-07-18 17:42 UTC

Description Cyril Concolato 2013-05-21 11:24:29 UTC
The current spec does not define what conformance to MSE means. In particular, it uses 'must' or 'should' statements which seem to imply that there are 2 conformant products: MSE-conformant segment generators and MSE-conformant user agents. It would be good to define them clearly and to review the normative statements in the light of these products to clarify them.
Comment 1 Aaron Colwell 2013-05-23 18:31:06 UTC
Marking all pre-Last Call bugs
Comment 2 Aaron Colwell 2013-05-28 17:06:17 UTC
The 6-7 'should' statements in the normative text accidentally snuck in. I'll be fixing them in the next update since they are all intended to be 'must'. 

I'm not sure I understand why there need to be 2 conformant products. The MSE spec is intended to outline what the UA accepts. It is not meant to specify what a generator must do. Generators just need to create content that conforms to what the UA accepts.
Comment 3 Cyril Concolato 2013-06-03 02:32:36 UTC
I was referring to sentences like the following:
"If frames can be decoded out of order, then the decode timestamp must be present in the bytestream."
"Each track description inside a single initialization segment must have a unique Track ID."
"Coded frames within a media segment must be adjacent in time"

or to the sentences in the byte stream specifications for WebM, ISOBMFF or MPEG-2 TS, which to me imply that there is a notion of "conformant" MSE content and thus of a conformant generator.
Comment 4 Aaron Colwell 2013-06-05 16:08:05 UTC
Please provide the text and suggested location(s) in the spec where you would like it placed.
Comment 5 Aaron Colwell 2013-06-25 14:52:28 UTC
Assigning to Cyril since he agreed to provide suggested text on the last call.
Comment 6 Cyril Concolato 2013-07-10 12:50:29 UTC
I've searched for the MUST statements in the spec, and propose to change those statements which do not have the UA as subject. I propose to rephrase as follows:

In the "Coded Frame" definition:
Replace:
"If frames can be decoded out of order, then the decode timestamp must be present in the byte stream."
With:
"If frames can be decoded out of order, then the decode timestamp is present in the byte stream. The user agent must report a "decode" error if this is not the case."

In the "Track Description" definition:
Replace:
"Each track description inside a single initialization segment must have a unique Track ID."
With:
"Each track description inside a single initialization segment has a Track ID. UA must report a "decode" error if the Track ID is not unique within the initialization segment."

In "SourceBuffer Monitoring"
The following sentence uses MUST in a NOTE and for the media element. I suggest keeping the sentence in the NOTE but using a SHOULD and rephrasing it from:
"When the media element needs more data, it must transition from HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application to be able to respond without causing an interruption in playback. "
to
"When the media element needs more data, the user agent SHOULD transition it from HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application to be able to respond without causing an interruption in playback. "

In "Source Buffer - Enumeration description" 
The following sentence uses MUST for a media segment. However, the first part of the sentence is already covered in the coded frame processing algorithm. My suggestion is:
Replace:
"Coded frames within a media segment must be adjacent in time, but media segments can be appended in any order."
with
"Media segments can be appended in any order."

In "Prepare Append Algorithm":
The sentence uses MUST in a NOTE and the subject is a "web application". Since the MSE spec is about defining conformant UA behavior and not conformant Web apps, it should be rewritten from:
"The web application must use remove() to explicitly free up space and/or reduce the size of the append."
to
"The web application SHOULD use remove() to explicitly free up space and/or reduce the size of the append."

Similarly in "Stream Append Loop":
The sentence uses MUST in a NOTE for a web application. I suggest rephrasing from:
"The web application must use remove() to free up space in the SourceBuffer."
to
"The web application SHOULD use remove() to free up space in the SourceBuffer."

Similarly in "Coded Frame Eviction Algorithm"
Rephrase:
"Implementations may use different methods for selecting removal ranges so web applications must not depend on a specific behavior."
to:
"Implementations may use different methods for selecting removal ranges so web applications do not depend on a specific behavior."

In section 12 "Byte Stream Formats", the subject of MUST statements is not the UA but another "format". So, the MSE spec actually defines "conformant User Agent" and "conformant Byte Stream Formats". See for instance the following sentences which cannot easily be replaced:

"A byte stream format specification must define initialization segments and media segments."
"It must be possible to identify segment boundaries and segment type (initialization or media) by examining the byte stream alone."
"Byte stream specifications must at a minimum define constraints which ensure that the above requirements hold."

I think this should be clarified in the introduction. I would suggest adding to section "1.1 Goals" the following sentence (or similar):
"This specification defines:
- normative behavior for User Agents to enable interoperability between user agents and web applications when processing media data;
- normative requirements to enable other specifications to define media formats to be used with this specification".

The rest of section 12 can actually be rewritten into normative UA behavior, as follows:
"The user agent MUST report a "decode" error when one of the following conditions is met:
1. The number and type of tracks is not consistent across initialization segments.
2. Track IDs are not the same across initialization segments, for segments describing multiple tracks of a single type (e.g. 2 audio tracks).
3. Codecs change across initialization segments.

The user agent MUST support: 
1. Track IDs change across initialization segments if the segments describe only one track of each type.
2. Video frame size changes. The user agent MUST support seamless playback.
3. Audio channel count changes. The user agent MAY support seamless playback and could trigger downmixing."
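
The error conditions and the Track ID allowance proposed above can be sketched as a small check. This is only an illustration under stated assumptions: the `Track` record and `check_init_segment` function are hypothetical names, the initialization segments are assumed to be already parsed, and Track IDs are assumed to be unique within each segment. It is not how any user agent actually implements the rules.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Hypothetical, already-parsed track description."""
    track_id: int
    kind: str   # "audio" or "video"
    codec: str  # e.g. "avc1" or "mp4a"


def check_init_segment(first, new):
    """Return None if `new` may follow `first`, else a reason for a "decode" error."""
    # Condition 1: the number and type of tracks must be consistent.
    kinds = Counter(t.kind for t in first)
    if kinds != Counter(t.kind for t in new):
        return "number or type of tracks changed"
    if any(count > 1 for count in kinds.values()):
        # Condition 2: several tracks of one type, so Track IDs must match.
        by_id = {t.track_id: t for t in new}
        if set(by_id) != {t.track_id for t in first}:
            return "track IDs changed"
        pairs = [(t, by_id[t.track_id]) for t in first]
    else:
        # Only one track of each type: Track IDs are allowed to change.
        by_kind = {t.kind: t for t in new}
        pairs = [(t, by_kind[t.kind]) for t in first]
    # Condition 3: the codec of each track must not change.
    for old, cur in pairs:
        if (old.kind, old.codec) != (cur.kind, cur.codec):
            return "codec changed"
    return None
```

Frame size and channel count changes are deliberately not rejected here, since the proposal lists them as changes the user agent supports.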

The paragraph and bullet points starting with:
"The following rules apply to all media segments within a byte stream: ..."
should be rephrased as:
"The following rules apply to all media segments within a byte stream. A user agent MUST:
1. map all timestamps to the same media timeline.
2. support seamless playback of media segments having a timestamp gap smaller than the audio frame size. The user agent MUST NOT reflect these gaps in the buffered attribute."

Additionally, that second bullet should actually be removed, as the processing of gaps should already be covered in the coded frame processing algorithm.
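
To illustrate what "not reflecting small gaps in the buffered attribute" could mean, here is a minimal sketch. The function name `buffered_ranges`, the millisecond units and the 23 ms threshold (roughly one 1024-sample audio frame at 44.1 kHz) are assumptions made for the example, not values taken from the spec.

```python
def buffered_ranges(intervals, max_gap):
    """Merge (start, end) intervals, hiding any gap smaller than max_gap."""
    ranges = []
    for start, end in sorted(intervals):
        if ranges and start - ranges[-1][1] < max_gap:
            # Gap is too small to report: extend the previous range.
            ranges[-1][1] = max(ranges[-1][1], end)
        else:
            ranges.append([start, end])
    return [tuple(r) for r in ranges]


# Times in milliseconds; the 10 ms gap is hidden, the 1000 ms gap is kept.
print(buffered_ranges([(0, 1000), (1010, 2000), (3000, 4000)], max_gap=23))
# prints [(0, 2000), (3000, 4000)]
```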

Note: I don't know how to rephrase the last bulleted paragraph starting with "The combination of "

In section 12.2:
In this section, when a sentence indicates that a byte stream syntactical element must (resp. must not) have a specific feature, it should be rewritten as "The user agent MUST report a "decode" error when encountering a byte stream construct that does not (resp. does) have that feature."

"The tracks in the Movie Header Box must not contain any samples"
should be rephrased as:
"UA must report a "decode" error when parsing tracks in the Movie Header Box that contain samples (i.e. when the entry_count in the stts, stsc and stco boxes is not set to zero)"

" A Movie Extends (mvex) box must be contained in the Movie Header Box to indicate that Movie Fragments are to be expected."
should be rephrased as:
"UA must report a decode error when a Movie Extends (mvex) box is not contained in the Movie Header Box".

The following paragraph:
"The following rules apply to ISO BMFF media segments:
- The Movie Fragment Box must contain at least one Track Fragment Box (traf).
- The Movie Fragment Box must use movie-fragment relative addressing and the flag default-base-is-moof must be set; absolute byte-offsets must not be used.
- External data references must not be used.
- If the Movie Fragment contains multiple tracks, the duration by which each track extends should be as close to equal as practical.
- Each Track Fragment Box must contain a Track Fragment Decode Time Box (tfdt)
- The Media Data Boxes must contain all the samples referenced by the Track Fragment Run Boxes (trun) of the Movie Fragment Box."
should be rephrased as:
"The following rules apply to ISO BMFF media segments. UA MUST report a "decode" error when:
- The Movie Fragment Box does not contain at least one Track Fragment Box (traf).
- The Movie Fragment Box does not use movie-fragment relative addressing or if the flag default-base-is-moof is not set.
- External data references are used.
- A Track Fragment Box does not contain a Track Fragment Decode Time Box (tfdt)
- The Media Data Boxes do not contain all the samples referenced by the Track Fragment Run Boxes (trun) of the Movie Fragment Box."
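
Some of the rephrased ISO BMFF rules can be sketched as a structural check over the raw bytes. This is a rough illustration under stated assumptions: `iter_boxes`, `check_moof` and `box` are hypothetical helper names, only 32-bit box sizes are handled, and the external data reference and Media Data Box rules are not checked. The two tfhd flag values are the ones defined in ISO/IEC 14496-12.

```python
import struct

DEFAULT_BASE_IS_MOOF = 0x020000      # tfhd flag: default-base-is-moof
BASE_DATA_OFFSET_PRESENT = 0x000001  # tfhd flag: absolute byte offset in use


def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        # Sketch only: 64-bit sizes (size == 1) and size == 0 are rejected.
        if size < 8 or pos + size > end:
            raise ValueError("decode error: unsupported or malformed box size")
        yield box_type, pos + 8, pos + size
        pos += size


def check_moof(data):
    """Return the list of rule violations for the first moof box in data."""
    for box_type, lo, hi in iter_boxes(data):
        if box_type == b"moof":
            break
    else:
        return ["no Movie Fragment Box (moof)"]
    errors = []
    trafs = [(a, b) for t, a, b in iter_boxes(data, lo, hi) if t == b"traf"]
    if not trafs:
        errors.append("moof has no Track Fragment Box (traf)")
    for a, b in trafs:
        children = {t: s for t, s, _ in iter_boxes(data, a, b)}
        if b"tfdt" not in children:
            errors.append("traf has no Track Fragment Decode Time Box (tfdt)")
        if b"tfhd" in children:
            # tfhd is a FullBox: 1 byte of version, then 24 bits of flags.
            off = children[b"tfhd"]
            flags = int.from_bytes(data[off + 1:off + 4], "big")
            if not flags & DEFAULT_BASE_IS_MOOF or flags & BASE_DATA_OFFSET_PRESENT:
                errors.append("traf does not use movie-fragment relative addressing")
    return errors


def box(box_type, payload=b""):
    """Build a box with a 32-bit size header (helper for trying the check)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload
```

A user agent following the proposal would report a "decode" error whenever the returned list is non-empty. For example, `check_moof(box(b"moof"))` returns a single violation for the missing traf.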

Similar changes should be made for the MPEG-2 TS and WebM parts.
Comment 7 Aaron Colwell 2013-07-18 17:42:43 UTC
Changes committed.
https://dvcs.w3.org/hg/html-media/rev/2b2d8865de83


Done.

> 
> In the "Track Description" definition:
> Replace:
> "Each track description inside a single initialization segment must have a
> unique Track ID."
> With:
> "Each track description inside a single initialization segment has a Track
> ID. UA must report a "decode" error if the Track ID is not unique within the
> initialization segment."

Done.

> 
> In "SourceBuffer Monitoring"
> The following sentence uses MUST in a NOTE and for the media element. I
> suggest keeping the sentence in the NOTE but using a SHOULD and rephrasing
> it from:
> "When the media element needs more data, it must transition from
> HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application to
> be able to respond without causing an interruption in playback. "
> to
> "When the media element needs more data, the user agent SHOULD transition it
> from HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application
> to be able to respond without causing an interruption in playback. "

Done

> 
> In "Source Buffer - Enumeration description" 
> The following sentence uses MUST for a media segment. However, the first
> part of the sentence is already covered in the coded frame processing
> algorithm. My suggestion is:
> Replace:
> "Coded frames within a media segment must be adjacent in time, but media
> segments can be appended in any order."
> with
> "Media segments can be appended in any order."

Done.

> 
> In "Prepare Append Algorithm":
> The sentence uses MUST in a NOTE and the subject is a "web application".
> Since the MSE spec is about defining conformant UA behavior and not
> conformant Web apps, it should be rewritten from:
> "The web application must use remove() to explicitly free up space and/or
> reduce the size of the append."
> to
> "The web application SHOULD use remove() to explicitly free up space and/or
> reduce the size of the append."
> 

Done.

> Similarly in "Stream Append Loop":
> The sentence uses MUST in a NOTE for a web application. I suggest
> rephrasing from:
> "The web application must use remove() to free up space in the SourceBuffer."
> to
> "The web application SHOULD use remove() to free up space in the
> SourceBuffer."
> 

Done.

> Similarly in "Coded Frame Eviction Algorithm"
> Rephrase:
> "Implementations may use different methods for selecting removal ranges so
> web applications must not depend on a specific behavior."
> to:
> "Implementations may use different methods for selecting removal ranges so
> web applications do not depend on a specific behavior."

Done.

> 
> In section 12 "Byte Stream Formats", the subject of MUST statements is not
> the UA but another "format". So, the MSE spec actually defines "conformant
> User Agent" and "conformant Byte Stream Formats". See for instance the
> following sentences which cannot easily be replaced:
> 
> "A byte stream format specification must define initialization segments and
> media segments."
> "It must be possible to identify segment boundaries and segment type
> (initialization or media) by examining the byte stream alone."
> "Byte stream specifications must at a minimum define constraints which
> ensure that the above requirements hold."
> 
> I think this should be clarified in the introduction. I would suggest adding
> to section "1.1 Goals" the following sentence (or similar):
> "This specification defines:
> - normative behavior for User Agents to enable interoperability between user
> agents and web applications when processing media data;
> - normative requirements to enable other specifications to define media
> formats to be used with this specification".

Done.

> 
> The rest of section 12 can actually be rewritten into normative UA behavior,
> as follows:
> "The user agent MUST report a "decode" error when one of the following
> conditions is met:
> 1. The number and type of tracks is not consistent across initialization
> segments.
> 2. Track IDs are not the same across initialization segments, for segments
> describing multiple tracks of a single type (e.g. 2 audio tracks).
> 3. Codecs change across initialization segments.
> 
> The user agent MUST support: 
> 1. Track IDs change across initialization segments if the segments describe
> only one track of each type.
> 2. Video frame size changes. The user agent MUST support seamless playback.
> 3. Audio channel count changes. The user agent MAY support seamless playback
> and could trigger downmixing."

Done.

> 
> The paragraph and bullet points starting with:
> "The following rules apply to all media segments within a byte stream: ..."
> should be rephrased as:
> "The following rules apply to all media segments within a byte stream. A
> user agent MUST:
> 1. map all timestamps to the same media timeline.
> 2. support seamless playback of media segments having a timestamp gap
> smaller than the audio frame size. The user agent MUST NOT reflect these gaps in
> the buffered attribute."

Done.

> 
> Note: I don't know how to rephrase the last bulleted paragraph starting with
> "The combination of "

I reworded this along the lines of your other suggestions. The UA signals an error if any of a set
of conditions holds true. I then made each into a "... is not provided" type sentence.

> 
> In section 12.2:
> In this section, when a sentence indicates that a byte stream syntactical
> element must (resp. must not) have a specific feature, it should be
> rewritten as "The user agent MUST report a "decode" error when encountering
> a byte stream construct that does not (resp. does) have that feature."
> 
> "The tracks in the Movie Header Box must not contain any samples"
> should be rephrased as:
> "UA must report a "decode" error when parsing tracks in the Movie Header Box
> that contain samples (i.e. when the entry_count in the stts, stsc and stco
> boxes is not set to zero)"
> 
> " A Movie Extends (mvex) box must be contained in the Movie Header Box to
> indicate that Movie Fragments are to be expected."
> should be rephrased as:
> "UA must report a decode error when a Movie Extends (mvex) box is not
> contained in the Movie Header Box".
> 
> The following paragraph:
> "The following rules apply to ISO BMFF media segments:
> - The Movie Fragment Box must contain at least one Track Fragment Box (traf).
> - The Movie Fragment Box must use movie-fragment relative addressing and the
> flag default-base-is-moof must be set; absolute byte-offsets must not be
> used.
> - External data references must not be used.
> - If the Movie Fragment contains multiple tracks, the duration by which each
> track extends should be as close to equal as practical.
> - Each Track Fragment Box must contain a Track Fragment Decode Time Box
> (tfdt)
> - The Media Data Boxes must contain all the samples referenced by the Track
> Fragment Run Boxes (trun) of the Movie Fragment Box."
> should be rephrased as:
> "The following rules apply to ISO BMFF media segments. UA MUST report a
> "decode" error when:
> - The Movie Fragment Box does not contain at least one Track Fragment Box
> (traf).
> - The Movie Fragment Box does not use movie-fragment relative addressing or
> if the flag default-base-is-moof is not set.
> - External data references are used.
> - A Track Fragment Box does not contain a Track Fragment Decode Time Box
> (tfdt)
> - The Media Data Boxes do not contain all the samples referenced by the
> Track Fragment Run Boxes (trun) of the Movie Fragment Box."
> 
> Similar changes should be made for the MPEG-2 TS and WebM parts.

I did this slightly differently but I believe I've addressed what you wanted here.