RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions] from John Birch on 2012-10-03 (public-tt@w3.org from October 2012)

From: John Birch <John.Birch@screensystems.tv>
Date: Wed, 3 Oct 2012 15:07:40 +0000
To: Michael A Dolan <mdolan@newtbt.com>, "public-tt@w3.org" <public-tt@w3.org>
Message-ID: <0981DC6F684DE44FBE8E2602C456E8AB01024D33@SS-IP-EXMB-01.screensystems.tv>
Hi Mike,

Yes, I guess I have... I was not really thinking that you might break a VOD asset into fragments. That's similar to the reels concept in Digital Cinema.

I was also assuming that subtitle documents and movie fragments would relate to the same temporal period in the presentation (thinking MXF operational practices here), but what you are saying is that a subtitle fragment could be of considerably different temporal length to any movie fragment?

So yes, agreed, it would be possible to find common document boundaries where there was no subtitle on air for any language.

But this does raise some more questions for me...

Given the typical size of a subtitle document, why fragment it at all?

Getting back to live streaming, I would assume short movie fragments would be used (perhaps to support adaptive bit rate). What would be the strategy for subtitle document duration? You don't have a subtitle document for the complete 'movie' because the subtitle content is being generated live (by a steno typist or re-speaker). So I am assuming that you could send subtitle documents, each representing a complete single subtitle, as they are completed by the subtitler / captioner. But what about cumulative captioning (roll-on / paint-on) in the live environment?

Best regards,
John

John Birch | Screen Systems | Strategic Partnerships Manager
Main Line : +44 1473 831700 | Ext : 270 | Direct Dial : +44 1473 834532
Mobile : +44 7919 558380 | Fax : +44 1473 830078
John.Birch@screensystems.tv | www.screensystems.tv | http://twitter.com/ScreenSubtitles

Visit us at
SMPTE Annual Technical Conference & Exhibition, 23-24 October, Stand 112
Loews Hollywood hotel, Hollywood

-----Original Message-----
From: Michael A Dolan [mailto:mdolan@newtbt.com]
Sent: 03 October 2012 15:38
To: public-tt@w3.org
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

John-

I think you have mixed in the live scenario here.  And you have assumed that multiple subtitle tracks have to all have the same duration fragments.

As Sean and I have noted, CFF subtitle documents are typically 5-30 minutes in duration (an authoring decision). And the tracks don't all have to be the same duration.  It is therefore statistically overwhelmingly likely that an appropriate authoring boundary can be found for all of them.
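
As an illustration of the point above, here is a minimal sketch of how an authoring tool might look for boundaries common to all tracks. It assumes each track is simply a list of (begin, end) times in seconds; the function name and data layout are hypothetical and are not taken from CFF or TTML.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (begin, end) in seconds, end exclusive


def common_gaps(tracks: List[List[Interval]], duration: float) -> List[Interval]:
    """Return the time ranges in which no track has an active subtitle."""
    # Pool the busy intervals of every track and walk them in start order.
    busy = sorted(iv for track in tracks for iv in track)
    gaps = []
    cursor = 0.0  # end of the latest busy period seen so far
    for begin, end in busy:
        if begin > cursor:
            gaps.append((cursor, begin))
        cursor = max(cursor, end)
    if cursor < duration:
        gaps.append((cursor, duration))
    return gaps


english = [(1.0, 4.0), (6.0, 9.0)]
italian = [(1.0, 5.0), (5.5, 9.5)]
print(common_gaps([english, italian], duration=12.0))
```

Any instant inside one of the returned ranges is a candidate document boundary at which nothing is on screen in any language.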

        Mike

-----Original Message-----
From: John Birch [mailto:John.Birch@screensystems.tv]
Sent: Tuesday, October 02, 2012 9:09 AM
To: 'mdolan@newtbt.com'; 'public-tt@w3.org'
Subject: Re: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

Good.

It is very unlikely that authors would be able to align subtitles with fragment boundaries.

Subtitles may often persist for longer than 4 seconds. A typical presentation time for each subtitle in a sequence might be 3 seconds... So statistically it is likely that there will be many boundary crossings.
Further, the alignment of different languages of subtitles for the same content is uncommon, due primarily to the verbosity of different languages... Extreme example, Italian cf Chinese...
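
To put a rough number on "statistically likely": under the simplifying assumption (mine, not stated in the thread) that fragment boundaries fall at a fixed interval and a subtitle starts at a uniformly random offset within a fragment, the chance that it crosses a boundary is its duration divided by the fragment length, capped at 1. A small sketch:

```python
def crossing_probability(subtitle_s: float, fragment_s: float) -> float:
    """Probability that a subtitle overlaps at least one fragment boundary.

    Assumes boundaries every fragment_s seconds and a start time that is
    uniformly distributed within a fragment.
    """
    return min(1.0, subtitle_s / fragment_s)


for fragment_s in (2.0, 4.0):
    print(fragment_s, crossing_probability(3.0, fragment_s))
```

With a 3 second subtitle this gives 1.0 for 2 second fragments and 0.75 for 4 second fragments, which is consistent with expecting many boundary crossings.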

A comment to this effect in TTML would be useful, and perhaps also in CFF. (Yes, I know it is implicit in the model :-) but it wouldn't hurt to make it more explicit.)

Best regards,
John


----- Original Message -----
From: Michael A Dolan [mailto:mdolan@newtbt.com]
Sent: Tuesday, October 02, 2012 04:06 PM
To: 'Timed Text Working Group' <public-tt@w3.org>
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

CFF is a fragmented file construction (2-4 second samples for video and audio), so it is "streaming friendly" and perhaps suitable for distribution.
To stream it directly would require a custom transport (i.e. not DASH). But because of the fragmented construction, it would be easy to make DASH assets from it.

Authors should strive to create document boundaries when nothing is active.
Decoder manufacturers should strive to detect identical active content across document boundaries and not flash. As I mentioned, SMPTE has suggested some "hint" additions to TTML 1.1 to better support this.
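
A minimal sketch of the decoder-side check suggested above, assuming (hypothetically) that the content on screen can be summarised as a set of (region id, text) pairs. Real decoders would also need to compare styling and layout, and nothing here is defined by TTML 1.0 or CFF.

```python
from typing import FrozenSet, Tuple

# Hypothetical summary of what is on screen: a set of (region_id, text) pairs.
Active = FrozenSet[Tuple[str, str]]


def needs_redraw(previous_tail: Active, next_head: Active) -> bool:
    """True if the subtitle plane must be repainted at a document boundary.

    previous_tail is what the old document shows at its last instant;
    next_head is what the new document shows at its first instant.
    """
    return previous_tail != next_head


tail = frozenset({("r1", "Hello there")})
head_same = frozenset({("r1", "Hello there")})
head_new = frozenset({("r1", "Goodbye")})
print(needs_redraw(tail, head_same), needs_redraw(tail, head_new))
```

When the two sets are equal, the decoder can leave the plane untouched and so avoid the redraw 'flash'.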

        Mike

-----Original Message-----
From: John Birch [mailto:John.Birch@screensystems.tv]
Sent: Monday, October 01, 2012 1:59 AM
To: Michael A Dolan; 'Timed Text Working Group'
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

Hi Mike,

Thanks for your comments re clarifications.

With respect to 'progressive download' in CFF.

a) Would this be suitable for live streaming? I hope so... a robust video container standard that includes well defined support for subtitle tracks is definitely needed for live video on the Internet!

With respect to movie fragments, is the case where a subtitle persists across the boundary between movie fragments described?
I note there are many paragraphs about synchronisation between tracks, but clarification of the persistence of a subtitle across such a boundary eluded me...

It might be a subtle point, but without the notion of persistence, an implementation might result in an annoying subtitle redraw 'flash' at such a boundary.

Best regards,

John


-----Original Message-----
From: Michael A Dolan [mailto:mdolan@newtbt.com]
Sent: 28 September 2012 17:59
To: John Birch; 'Timed Text Working Group'
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

CIL

-----Original Message-----
From: John Birch [mailto:John.Birch@screensystems.tv]
Sent: Friday, September 28, 2012 9:15 AM
To: Michael A Dolan; 'Timed Text Working Group'
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

Hi Mike,

Sorry, I don't think I'm getting my point across - I'll try one more time.

The problem is with how I feel some readers may interpret the CFF model. On cursory examination, the model behaviour is consistent with (and indicates / implies) just a pop mode of subtitling or captioning.
There are no statements in the CFF document that indicate that multiple (other) modes of presentation might be expected... and not all readers of CFF will be aware of different captioning modalities.

It is my fear that many implementations will follow this rendering model at a cursory level (for such is the nature of implementation under time and fiscal pressure).
Consequently there is a risk that implementations will arise that do not effectively support the more sophisticated begin, begin, begin, end, end, end, kind of sequence of SubtitleEvents that would occur for a Paint-on or Roll On sequence.

I am simply suggesting adding a couple of sentences to highlight the fact that there is the potential for non-pop presentations, and the subsequent implications that arise for the rendering model (i.e. the necessity to reconstruct the intermediate model and redraw all the content).

[MD> ] Happy to add some informative clarifications.
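
To make the "begin, begin, begin, end, end, end" case concrete, here is a small sketch of how a paint-on sequence with overlapping timings resolves into a series of full re-renders, one per begin or end instant. The tuple layout and function name are assumptions for illustration and are not CFF or TTML APIs.

```python
from typing import List, Tuple

# Hypothetical cue type: (begin_seconds, end_seconds, text)
Cue = Tuple[float, float, str]


def synchronic_snapshots(cues: List[Cue]) -> List[Tuple[float, List[str]]]:
    """Return (time, active_texts) for every begin or end instant."""
    times = sorted({t for begin, end, _ in cues for t in (begin, end)})
    snapshots = []
    for t in times:
        # Full re-render: start from an empty plane and draw all cues active at t.
        active = [text for begin, end, text in cues if begin <= t < end]
        snapshots.append((t, active))
    return snapshots


paint_on = [(0.0, 3.0, "Hello"), (0.5, 3.5, "there,"), (1.0, 4.0, "world")]
for t, active in synchronic_snapshots(paint_on):
    print(t, active)
```

Each snapshot repeats the text that was already visible, which is the redraw of previously drawn content discussed above; an implementation is free to draw only the difference between consecutive snapshots.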

On the issue of inefficiency, clearly deleting the entire buffer and then redrawing all of it plus a bit more in Paint-on is inefficient.
For Roll on when a scroll is needed a change to a pointer to where to read from in the buffer would be more efficient than deleting and rewriting.

[MD> ] Of course.  But why on earth would you do that in a decoder implementation?  As I've noted, the rendering model is a constraint on the document complexity (the synchronic documents, actually).
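
For illustration only, the pointer-based roll-on scroll mentioned above could look something like the following sketch. The class and its methods are hypothetical; they show one way a decoder might avoid deleting and rewriting rows, and are not part of any rendering model in CFF or SDP-US.

```python
from typing import List


class RollUpBuffer:
    """Fixed-size ring of caption rows; scrolling only moves a start index."""

    def __init__(self, rows: int) -> None:
        self._rows: List[str] = [""] * rows
        self._start = 0  # index of the oldest visible row
        self._count = 0  # number of rows currently in use

    def push(self, line: str) -> None:
        n = len(self._rows)
        if self._count < n:
            # Still filling the window: append after the newest row.
            self._rows[(self._start + self._count) % n] = line
            self._count += 1
        else:
            # Window full: overwrite the oldest row and advance the pointer.
            self._rows[self._start] = line
            self._start = (self._start + 1) % n

    def visible(self) -> List[str]:
        n = len(self._rows)
        return [self._rows[(self._start + i) % n] for i in range(self._count)]


buf = RollUpBuffer(rows=2)
for line in ("first line", "second line", "third line"):
    buf.push(line)
print(buf.visible())
```

Only one row is written per scroll step; the rows that remain visible are never copied or redrawn in the buffer.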

Also it would appear that the CFF model implies that a tt document in CFF must span the time frame that all the content is on screen for... since a new document clears the buffers.
If there is a duration limit for a tt document (perhaps due to streaming chunking) then a repetition of content would often be necessary.

Is there any chunking limitation on the length (temporal) of a document?

[MD> ] No.  But there is a total document size limit.

And if so, then chunking would dictate that the new document starting at the chunk boundary must carry some / all of the information that was present in the previous document.

[MD> ] There are no published semantics (anywhere in the world as far as I know) for continuity or persistent content between TTML documents.  SMPTE has proposed some be added in TTML 1.1, but I don't understand why a TTML 1.0 rendering model would need to concern itself with such a future scenario.  SDP-US certainly doesn't define any.

Further if such a case arose, does the CFF model effectively state that a back to back presentation of this same content across a chunk boundary does NOT result in a visible flash on the screen?

[MD> ]  Although decoders could probably mitigate this, authors should create boundaries at sensible points in time to minimize this.  But this is not really about a rendering model.

But you are correct that CFF does not force an implementation that cannot do paint-on and roll-on; further, I don't believe I stated that it does.
Instead my issue is that CFF does not obviously (at a high level of abstraction) represent a model that anticipates paint-on and roll-on behaviour.
IMHO such a model would include scrolling and cumulative properties.

[MD> ] Sorry it is not obvious.  Nevertheless it does.

I agree that developing a different model in CFF is unlikely, and actually I believe it unnecessary.
A few (abstract) sentences addressing the expected styles of input in the SDP documents (or in CFF) would suffice to indicate these potential implementation requirements.

[MD> ] OK. That's easily solved and is not a barrier to its application for SDP-US.

Best regards,
John


-----Original Message-----
From: Michael A Dolan [mailto:mdolan@newtbt.com]
Sent: 28 September 2012 16:20
To: 'Timed Text Working Group'
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

John-

I'm not sure how to respond where your argument has no backup and your reading is cursory, except to restate that the CFF rendering model in no way forbids simple paint on and animation authoring and decoder behavior, or as far as anyone knows, any feature at all defined in SDP-US.

Please cite a feature of SDP-US that you believe the CFF rendering model forces either the document or the decoder to operate inefficiently, and explain why you believe that it does.

Regards,

        Mike

-----Original Message-----
From: John Birch [mailto:John.Birch@screensystems.tv]
Sent: Friday, September 28, 2012 4:04 AM
To: Michael A Dolan; 'Timed Text Working Group'
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

Hi Mike,

My concern is that primarily (and my reading of CFF is cursory) CFF appears to imply a simplistic pop model for subtitles.

Clearly it is possible using overlapping timing to create a paint on effect in SDP. And clearly the CFF model does not explicitly preclude such a mechanism. However, CFF does not, in any obvious fashion, indicate that a renderer may need to redraw parts of the bitmap that have just been cleared as a result of a subtitle event (begin or end). If SDP did reference CFF as a potential rendering model, it would seem wise (to me) to add a note (probably in SDP) about how paint on and roll on modes might cause such redraw possibilities, and how they might be handled more efficiently by the renderer for SDP.

Regards,
John

-----Original Message-----
From: Michael A Dolan [mailto:mdolan@newtbt.com]
Sent: 27 September 2012 21:15
To: 'Timed Text Working Group'
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

John-

Although CFF-TT does not explicitly address incremental additions to the region, that does not mean the model does not apply.  As drafted, it just takes the time events as a full re-rendering.  A simplification, yes; but it is incorrect to say that incremental flow ("paint-on") is not supported.
The model is therefore a constraint on the complexity of the Intermediate Synchronic Documents, not the authored document. And it is definitely not a constraint on the decoder - it can do whatever it wants for efficiency.

I've started a discussion in DECE about the interest in making the model more complex to explicitly deal with incremental additions. My guess is that it will not be worth the effort.  And, decoders can always implement whatever efficiencies that they want.

There is a question as to whether regions scroll at all.  If they do, the behavior in TTML 1.0 needs a good deal of work, and the same rendering model would apply as for paint-on described above.  If not, that would be irrelevant to the CFF-TT (or any) rendering model.  Hence the new issue 189.

Regards,

        Mike

-----Original Message-----
From: John Birch [mailto:John.Birch@screensystems.tv]
Sent: Thursday, September 27, 2012 1:23 AM
To: Timed Text Working Group
Subject: RE: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

On a quick inspection, the CFF-TT rendering model does not appear to support Paint on or Roll on (cumulative) subtitles, as every Subtitle Event causes a clear of the subtitle plane root container?
Certainly a Paint on / Roll On effect could be emulated by resending the previous caption content already 'assumed' to be displayed (although note what is currently 'on screen' does depend on when the caption stream was acquired)... but such a repetitious approach would be markedly inefficient!
Is this not a fundamental limitation for using the CFF model in SDP-US?

Regards,
John Birch

-----Original Message-----
From: Timed Text Working Group Issue Tracker [mailto:sysbot+tracker@w3.org]
Sent: 26 September 2012 21:16
To: public-tt@w3.org
Subject: ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

ISSUE-188 (render-complexity): Bounding SDP-US rendering complexity [Simple Delivery Profile for Closed Captions]

http://www.w3.org/AudioVideo/TT/tracker/issues/188

Raised by: Pierre-Anthony Lemieux
On product: Simple Delivery Profile for Closed Captions

Bounding SDP-US rendering complexity
====================================

What
----

SDP-US is a profile of TTML that specifies constraints such as supported TTML features and number of regions active at any given time. It does not however impose bounds on key aspects of rendering complexity, such as character and background drawing rates. Without such bounds, a valid SDP-US document might not successfully play on all implementations or, equivalently, determining the processing requirements of an implementation is not possible.

CFF-TT is a profile of TTML developed by the DECE consortium (http://uvvu.com) for internet delivery of subtitles and captions. Consumer devices implementing CFF-TT are expected to be widely deployed. The CFF-TT specification is publicly available at http://uvvu.com/docs/public/tspec/CFFMediaFormat-1.0.4.pdf.

As with SDP-US, CFF-TT specifies supported TTML features -- largely a superset of the features supported by SDP-US. To further simplify implementation and improve interoperability, CFF-TT also imposes bounds on rendering complexity through the use of a hypothetical rendering model.

SDP-US should consider adopting, in part or in its entirety, the rendering complexity bounds (and rendering model) defined by CFF-TT.

Why
---

Such adoption would further:
        - simplify implementations and improve interoperability by bounding rendering (and thus document) complexity
        - encourage adoption of SDP-US and TTML by ensuring that SDP-US content can be played on any CFF-compliant CE device

How
---

Adopting the CFF-TT hypothetical renderer and bounds on document complexity could be achieved in a number of ways, including:

(a) mapping the CFF-TT rendering model to the existing (XSL-based) TTML rendering model
(b) referencing the relevant sections of the CFF-TT specification defining the CFF-TT rendering model
(c) importing the CFF-TT rendering model into the SDP-US specification




Received on Wednesday, 3 October 2012 15:08:04 UTC