21375 – Decoding dependencies

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	RESOLVED FIXED

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	Media Source Extensions
Version:	unspecified
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	Adrian Bateman [MSFT]
QA Contact:	HTML WG Bugzilla archive list

Reported:	2013-03-22 17:26 UTC by Cyril Concolato
Modified:	2013-04-23 16:38 UTC
CC List:	4 users

Description Cyril Concolato 2013-03-22 17:26:11 UTC

The spec currently requires in the coded frame processing algorithm that implementations detect dependencies between frames so that removal of a frame removes all dependent frames.

This is a strong requirement on implementations which is impossible to fulfill (at least with current media formats) without decoding the frames (codec specific parsing). 

The requirement should be removed. Applications should make sure that they append frames that don't depend on non appended frames or that overlapped frames are not needed by non-overlapped frames.

(In reply to comment #0)
> The spec currently requires in the coded frame processing algorithm that
> implementations detect dependencies between frames so that removal of a frame
> removes all dependent frames. 
> 
> This is a strong requirement on implementations which is impossible to
> fulfill (at least with current media formats) without decoding the frames
> (codec specific parsing). 

This is not entirely true. We are doing this in Chrome right now. While it is true that the exact dependencies require parsing the coded frame, deleting everything between the current frame and the last random access point achieves this goal without codec-specific knowledge. One can simply use the keyframe indicators and the differences between DTS and PTS in the bytestream to approximate the dependencies. More consistent terminology could probably be used in the various places where coded frames are removed, but supporting this is not impossible.
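To illustrate the keyframe-based approximation described above, here is a minimal Python sketch. It is not Chrome's implementation: the `CodedFrame` type, its fields, and the `remove_with_dependents` helper are hypothetical names introduced only for this example. It assumes the frames of one track are held in decode order and that the container exposes a keyframe flag per frame. Removal runs forward from the removed frame to the next keyframe, which matches the spec wording quoted later in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodedFrame:
    """Hypothetical container-level view of a frame; no codec parsing needed."""
    dts: int           # decode timestamp
    pts: int           # presentation timestamp
    is_keyframe: bool  # random access point flag taken from the bytestream


def remove_with_dependents(frames: List[CodedFrame], index: int) -> List[CodedFrame]:
    """Remove frames[index] and every following frame up to, but not
    including, the next keyframe. `frames` must be in decode order."""
    end = index + 1
    while end < len(frames) and not frames[end].is_keyframe:
        end += 1
    return frames[:index] + frames[end:]


# Two groups of pictures: K P P | K P
gop = [
    CodedFrame(0, 0, True),
    CodedFrame(1, 1, False),
    CodedFrame(2, 2, False),
    CodedFrame(3, 3, True),
    CodedFrame(4, 4, False),
]
remaining = remove_with_dependents(gop, 1)
print([f.dts for f in remaining])  # [0, 3, 4]
```

The approximation is conservative: it may drop frames that did not actually depend on the removed one, but it never needs to look inside the coded data.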

> 
> The requirement should be removed. Applications should make sure that they
> append frames that don't depend on non appended frames or that overlapped
> frames are not needed by non-overlapped frames.

I disagree. The application is unlikely to have such intimate details of the encoding, especially since it may not have created all the content it wants to insert into the presentation.

Comment 2 Aaron Colwell 2013-04-08 21:20:55 UTC

Change committed.
https://dvcs.w3.org/hg/html-media/rev/f7f2b7226543

Added text to clarify what should happen if detailed information about the exact decoding dependencies is not available.

Comment 3 Cyril Concolato 2013-04-10 15:36:47 UTC

(In reply to comment #2)
> Change committed.
> https://dvcs.w3.org/hg/html-media/rev/f7f2b7226543
> 
> Added text to clarify what should happen if detailed information about the
> exact decoding dependencies is not available.

Thanks, the text helps, but I don't think it covers everything.

"Remove all coded frames between the coded frames removed in the previous step and the next random access point after those removed frames."

What if you have no RAP picture, as in videos using Gradual Decoding Refresh, where it is the decoding of N consecutive frames that produces a RAP?

Also, what if you have an Open GoP? Removing up to the next RAP does not guarantee that the frames after the RAP will be decodable.

Basically, there should be only simple SAPs (as defined in DASH) in the sequence for this to work.
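
The Open GoP concern can be made concrete with a small Python sketch. The `Frame` type and the `leading_frames` helper are hypothetical names used only for illustration. The assumption is that frames are stored in decode order; a frame that follows a RAP in decode order but is presented before it is a candidate for depending on frames from the previous group, so it may become undecodable once that group is removed. This is only a timestamp heuristic: it flags candidates and cannot prove a dependency without codec-specific parsing.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    """Hypothetical container-level frame record."""
    dts: int           # decode timestamp
    pts: int           # presentation timestamp
    is_keyframe: bool  # random access point flag


def leading_frames(frames: List[Frame], rap_index: int) -> List[Frame]:
    """Return frames that follow the RAP at rap_index in decode order but
    are presented before it. `frames` must be in decode order."""
    rap = frames[rap_index]
    result = []
    for frame in frames[rap_index + 1:]:
        if frame.is_keyframe:
            break  # reached the next random access point
        if frame.pts < rap.pts:
            result.append(frame)
    return result


frames = [
    Frame(0, 0, True),   # I  (first group)
    Frame(1, 3, False),  # P
    Frame(2, 1, False),  # B
    Frame(3, 2, False),  # B
    Frame(4, 6, True),   # I  (second group, open)
    Frame(5, 4, False),  # B, may reference the P frame at pts=3
    Frame(6, 5, False),  # B, may reference the P frame at pts=3
    Frame(7, 9, False),  # P
]
at_risk = leading_frames(frames, 4)
print([f.pts for f in at_risk])  # [4, 5]
```

If the first group is removed up to the RAP at index 4, the two B frames at pts 4 and 5 remain in the buffer even though their reference frame may be gone.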

Comment 4 Aaron Colwell 2013-04-23 16:38:35 UTC

(In reply to comment #3)
> (In reply to comment #2)
> > Change committed.
> > https://dvcs.w3.org/hg/html-media/rev/f7f2b7226543
> > 
> > Added text to clarify what should happen if detailed information about the
> > exact decoding dependencies is not available.
> Thanks, the text helps but I don't think it covers everything.
>  
> "Remove all coded frames between the coded frames removed in the previous
> step and the next random access point after those removed frames."
> 
> What if you have no RAP picture as in videos using Gradual Decoding Refresh,
> where  it's the decoding of N consecutive frames that produces a RAP.
> 
> Also, what if you have an Open GoP ? Removing up to the next RAP does not
> guarantee that the frames after the RAP will be decodable. 
> 
> Basically, there should be only simple SAP (as defined in DASH) in the
> sequence for this to work.

Sorry for the delayed response. I think we only have to worry about SAP types 1 & 2, since that is what the ISO BMFF bytestream section of the spec defines as a RAP. WebM currently doesn't use the other SAP types. I'm not sure what the story is for MPEG2-TS, but at least the defined behavior ...encourages... closed GOPs. Worst case scenario, removal is too aggressive for these other SAP types, which either encourages the UA to collect more detailed dependency info or encourages content owners to insert type 1 or type 2 SAPs where necessary.