This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Guidelines for generating MPEG-2 TS metadata text track cues have been proposed in the in-band tracks community group [1]. [2] shows a new table to be added in [3].

[1] https://www.w3.org/community/inbandtracks/wiki/Main_Page
[2] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Guidelines_for_creating_metadata_text_track_cues
[3] http://www.w3.org/TR/html5/embedded-content-0.html#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues
Bob, I don't really understand what the table in [2] means. I don't follow how a cue is created in DASH or MPEG-4. Even for MPEG-2 the statement is strange. What is the time that the cue is created? Where can I find that in the file? Would it be better to say something like "Every PSI section is encapsulated in a DataCue with the cue's start time provided by the activation time of that PSI section and the cue's end time by the deactivation time of that PSI or by a repeat PSI packet." Does that make sense?
(In reply to Silvia Pfeiffer from comment #1)
> Bob, I don't really understand what the table in [2] means.
>
> I don't follow how a cue is created in DASH or MPEG-4.

I don't think there is a general rule - it depends on the MIME type. For example, WebVTT (text/vtt) has a cue format, TTML (application/xml+ttml) has another cue format. Don't the specs for these formats define how a UA would create a cue?

> Even for MPEG-2 the statement is strange. What is the time that the cue is
> created? Where can I find that in the file? Would it be better to say
> something like "Every PSI section is encapsulated in a DataCue with the
> cue's start time provided by the activation time of that PSI section and the
> cue's end time by the deactivation time of that PSI or by a repeat PSI
> packet."
>
> Does that make sense?

There is no explicit presentation/activation and deactivation time - the cue should fire as soon as the UA can create it, which implies the UA has received all TS packets comprising the section data; hence the proposed language. endTime is not defined, so it could be the same as startTime, a really big number, or an indication of infinity/no endTime.
(In reply to Bob Lund from comment #2)
> I don't think there is a general rule - it's dependent on the MIME. For
> example, WebVTT (text/vtt) has a cue format, TTML (application/xml+ttml) has
> another cue format. Don't the specs for these formats define how a UA would
> create a cue?

Correct. So, nothing needs to be added to the HTML spec for these.

> There is no explicit presentation/activation and deactivation time - the cue
> should fire as soon as the UA can create it, which implies the UA has
> received all TS packets comprising the section data; hence, the proposed
> language. endTime is not defined so could be same as start, a really big
> number or an indication of infinity/no endTime.

Cue start and end time are not about where something happens to be encapsulated in the media file, but have a semantic meaning. The cue start time means that at that time the data in the cue becomes active and end time when it becomes inactive. IIUC the PSI has an implied activation time at the time in the media stream where it appears and its value is implied to be correct until another PSI arrives. So, to represent it correctly in a cue, these would need to be start and end times.
(In reply to Silvia Pfeiffer from comment #3)
> Cue start and end time are not about where something happens to be
> encapsulated in the media file, but have a semantic meaning. The cue start
> time means that at that time the data in the cue becomes active and end time
> when it becomes inactive.
> IIUC the PSI has an implied activation time at the
> time in the media stream where it appears

These streams are not PSI actually but the principle you outline is correct, and what I tried to convey. So, startTime = 'time in the media stream where it appears'.

> and its value is implied to be
> correct until another PSI arrives. So, to represent it correctly in a cue,
> these would need to be start and end times.

The endTime then wouldn't be known until the next private section arrives. This raises the question of what the endTime should be set to in the intervening time and requires the UA to update the endTime of the prior cue.

A preferred alternative would be to set endTime to startTime.
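For illustration only, the bookkeeping described above (leave the end time open, then patch it when the next section arrives) might look like the following TypeScript sketch. `SketchCue` and `SectionCueList` are hypothetical names, not the real DataCue or TextTrackCue interfaces, and using `Infinity` as the placeholder end time is an assumption based on the "indication of infinity/no endTime" option mentioned earlier in the thread.

```typescript
// Hypothetical cue shape for this sketch; not the real DataCue interface.
interface SketchCue {
  startTime: number; // seconds on the media timeline
  endTime: number;   // seconds; Infinity while the cue is still open
  data: Uint8Array;  // raw bytes of the private section
}

class SectionCueList {
  readonly cues: SketchCue[] = [];

  // Called once all TS packets of a private section have been received.
  // `time` is the media-timeline position where the section appears.
  addSection(time: number, data: Uint8Array): SketchCue {
    const previous = this.cues[this.cues.length - 1];
    if (previous !== undefined && previous.endTime === Infinity) {
      // The arrival of a new section ends the previous one.
      previous.endTime = time;
    }
    const cue: SketchCue = { startTime: time, endTime: Infinity, data };
    this.cues.push(cue);
    return cue;
  }
}

const list = new SectionCueList();
const first = list.addSection(1.0, new Uint8Array([0x01]));
const second = list.addSection(4.5, new Uint8Array([0x02]));
```

After the second call, `first.endTime` has been rewritten to 4.5 while `second` stays open. This is exactly the retroactive update that the comment above would prefer the UA not to have to perform.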
(In reply to Bob Lund from comment #4)
> The endTime then wouldn't be known until the next private section arrives.
> This raises the question of what the endTime should be set to in the
> intervening time and requires the UA to update the endTime of the prior cue.
>
> A preferred alternative would be to set endTime to startTime.
That would create a cue of duration 0, which would never end up in the list of "current cues", but always only in the list of "missed cues" (see the "time marches on" algorithm). In any case, do you still want a sentence on how to create cues from PSI information in MPEG-2 in the spec?
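The zero-duration point can be checked with a small TypeScript sketch. It is a deliberately simplified reading of the "current cues" and "missed cues" conditions in the "time marches on" algorithm (it ignores the active flag, track modes, and seeking), and `TimedCue`, `isCurrent`, and `isMissed` are hypothetical names introduced here.

```typescript
// Hypothetical minimal cue shape for this sketch.
interface TimedCue {
  startTime: number;
  endTime: number;
}

// Simplified "current cues" condition: start <= position < end.
function isCurrent(cue: TimedCue, position: number): boolean {
  return cue.startTime <= position && cue.endTime > position;
}

// Simplified "missed cues" condition during normal playback: the cue
// started at or after the previous run and ended at or before this one.
function isMissed(cue: TimedCue, lastTime: number, position: number): boolean {
  return cue.startTime >= lastTime && cue.endTime <= position;
}

const zeroDuration: TimedCue = { startTime: 10, endTime: 10 };
const shortCue: TimedCue = { startTime: 10, endTime: 10.25 };

// No position satisfies 10 <= position < 10, so the cue is never current.
const everCurrent = [9.9, 10, 10.1].some((p) => isCurrent(zeroDuration, p));
// It is picked up as missed by a run that spans t = 10.
const missed = isMissed(zeroDuration, 9.9, 10.1);
// A cue with a non-zero duration can be current.
const shortIsCurrent = isCurrent(shortCue, 10.1);
```

Under these assumptions `everCurrent` is false and `missed` is true, which matches the statement that such a cue only ever shows up in the list of missed cues.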
(In reply to Silvia Pfeiffer from comment #5)
> That would create a cue of duration 0, which would never end up in the list
> of "current cues", but always only in the list of "missed cues" (see the
> "time marches on" algorithm).
>
> In any case, do you still want a sentence on how to create cues from PSI
> information in MPEG-2 in the spec?

Yes. What about endTime = startTime + 250msec? It really doesn't matter what the endTime is. The application only needs to know the startTime.
(In reply to Bob Lund from comment #6)
> Yes. What about endTime = startTime + 250msec? It really doesn't matter what
> the endTime is. The application only needs to know the startTime.

Sure, that works.
(In reply to Silvia Pfeiffer from comment #7)
> (In reply to Bob Lund from comment #6)
> > Yes. What about endTime = startTime + 250msec? It really doesn't matter what
> > the endTime is. The application only needs to know the startTime.
>
> Sure, that works.

What happens if JavaScript ties up execution for 250+ ms though? Wouldn't we have the same problem, where the cue gets missed?

Could we fix this problem at the source by adding a list of changed cues to the cuechange event?

interface CueChangeEvent : Event {
  attribute TextTrackCue[] cues;
}
(In reply to Brendan Long from comment #8)
> What happens if JavaScript ties up execution for 250+ ms though? Wouldn't we
> have the same problem, where the cue gets missed?

Sure - cues can always get missed.

> Could we fix this problem at the source by adding a list of changed cues to
> the cuechange event?
>
> interface CueChangeEvent : Event {
>   attribute TextTrackCue[] cues;
> }

That was the point of the discussion in bug #24161 - see particularly the discussion in bug #24382
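As a rough illustration of why a list of changed cues would help, the TypeScript sketch below models a 250 ms cue and a script stall that spans it. `CueChangeSketch` and `cueChangeBetween` are hypothetical names; the payload only mimics the shape of the `cues` attribute proposed above and is not an existing browser API.

```typescript
// Hypothetical cue shape for this sketch.
interface TimedCue {
  id: string;
  startTime: number;
  endTime: number;
}

// Hypothetical payload modelled on the proposed CueChangeEvent.cues attribute.
interface CueChangeSketch {
  cues: TimedCue[];
}

// Returns true when t lies in the half-open interval (from, to].
function within(t: number, from: number, to: number): boolean {
  return t > from && t <= to;
}

// Collects every cue that started or ended between two script runs.
function cueChangeBetween(all: TimedCue[], lastTime: number, now: number): CueChangeSketch {
  const cues = all.filter(
    (c) => within(c.startTime, lastTime, now) || within(c.endTime, lastTime, now)
  );
  return { cues };
}

const track: TimedCue[] = [{ id: "section-1", startTime: 10, endTime: 10.25 }];

// The script last ran at 9.9 s and is blocked until 10.4 s.
const lastRun = 9.9;
const nowRun = 10.4;

// What a handler would see if it only inspected the currently active cues.
const activeNow = track.filter((c) => c.startTime <= nowRun && c.endTime > nowRun);
// What the proposed event payload would still report.
const change = cueChangeBetween(track, lastRun, nowRun);
```

In this scenario `activeNow` is empty because the cue has already ended, while `change.cues` still contains `section-1`.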
(In reply to Silvia Pfeiffer from comment #7)
> (In reply to Bob Lund from comment #6)
> > Yes. What about endTime = startTime + 250msec? It really doesn't matter what
> > the endTime is. The application only needs to know the startTime.
>
> Sure, that works.

As discussed in http://lists.w3.org/Archives/Public/public-inbandtracks/2014Mar/0079.html, the proposal is now

- startTime = 0
- endTime = video.currentTime equivalent of video frame PTS where cue is in the media resource
- pause_on_exit = false.
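A minimal TypeScript sketch of the proposal above, under stated assumptions: MPEG-2 PTS values are 33-bit counters ticking at 90 kHz, and the PTS that corresponds to currentTime = 0 is already known to the UA (`timelineZeroPts` is a hypothetical parameter). The function names are made up for this example, and at most one PTS wrap between the two values is assumed.

```typescript
const PTS_CLOCK_HZ = 90000;  // PTS ticks per second
const PTS_MODULUS = 2 ** 33; // PTS is a 33-bit counter and wraps around

// Converts a PTS to seconds on the media timeline, assuming
// `timelineZeroPts` is the PTS that maps to currentTime = 0.
function ptsToMediaTime(pts: number, timelineZeroPts: number): number {
  const delta = (pts - timelineZeroPts + PTS_MODULUS) % PTS_MODULUS;
  return delta / PTS_CLOCK_HZ;
}

// Hypothetical result shape; pauseOnExit mirrors the cue flag of that name.
interface ProposedCueTimes {
  startTime: number;
  endTime: number;
  pauseOnExit: boolean;
}

// Cue timing as proposed: start at 0, end at the media time of the video
// frame where the section appears, and never pause on exit.
function cueTimesForSection(framePts: number, timelineZeroPts: number): ProposedCueTimes {
  return {
    startTime: 0,
    endTime: ptsToMediaTime(framePts, timelineZeroPts),
    pauseOnExit: false,
  };
}

// A section found 450000 ticks (5 s) after the start of the timeline.
const times = cueTimesForSection(900000 + 450000, 900000);
```

With these inputs `times.endTime` is 5, so the cue spans from 0 to the point in the stream where the section was found.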
Er sorry I think I just accidentally added people to the CC list on this. I was looking at the wrong bug :\
(In reply to Bob Lund from comment #10)
> As discussed in
> http://lists.w3.org/Archives/Public/public-inbandtracks/2014Mar/0079.html,
> the proposal is now
>
> - startTime = 0
> - endTime = video.currentTime equivalent of video frame PTS where cue is in
>   the media resource
> - pause_on_exit = false.

I am confused: where do you want me to make that change?
(In reply to Silvia Pfeiffer from comment #12)
> I am confused: where do you want me to make that change?

The bug proposed the table [1] in the existing guidelines for creating cues section [2]. This would change if all of this were moved to another spec.

[1] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Guidelines_for_creating_metadata_text_track_cues
[2] http://www.w3.org/TR/html5/embedded-content-0.html#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues
(In reply to Bob Lund from comment #13)
> The bug proposed the table [1] in the existing guidelines for creating cues
> section [2]. This would change if all of this were moved to another spec.
>
> [1] https://www.w3.org/community/inbandtracks/wiki/Main_Page#Guidelines_for_creating_metadata_text_track_cues
> [2] http://www.w3.org/TR/html5/embedded-content-0.html#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues

Is http://rawgit.com/silviapfeiffer/HTMLSourcingInbandTracks/master/index.html sufficient now to close this bug?
(In reply to Silvia Pfeiffer from comment #14)
> Is
> http://rawgit.com/silviapfeiffer/HTMLSourcingInbandTracks/master/index.html
> sufficient now to close this bug?

Currently there are no guidelines for cue creation in http://rawgit.com/silviapfeiffer/HTMLSourcingInbandTracks/master/index.html. Once we have them there I think we can close this bug. I will do that and create a pull request.
I have created a pull request in [1] for adding an MPEG-2 TS metadata cue creation guideline.

[1] https://github.com/silviapfeiffer/HTMLSourcingInbandTracks
Bug 25733 [1] has been opened to add an HTML5 to the sourcing spec. [1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=25733
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the Editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the Tracker Issue; or you may create a Tracker Issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: Cue creation guidelines have been added to http://rawgit.com/w3c/HTMLSourcingInbandTracks/master/index.html and will be maintained there.
Rationale: This use case is being dealt with in a different spec.