Interdependency between fontSize, lineHeight and cellResolution in TTML

Dear all,

We had an extensive discussion on the EBU mailing list regarding the 
relationship between cell resolution, font-size and line-height. At some 
point we found out that the TTML mailing list is possibly the better 
place to discuss some of the questions that came up.

For completeness I include part of the mailing list thread below.

Some questions are highlighted below:

----------------------------
Font-Size
----------------------------
In TTML scaling is applied to the glyph's EM square. As Nigel noted 
below "the font has an EM square and each glyph has its own width and 
height that may be different from the EM square". So some clarification 
may be needed.

As I understand it, the rendering processor would choose a font that best 
matches the specified font characteristics (including the font-size) and 
then scale the font/the EM square to the computed font-size. Is this 
correct?

So, assuming there is no ancestor element with a specified font-size, the 
root container height is 720px, the grid is "32 15" and you choose a 
font-size of 100%, then the computed font-size would be 720px/15 = 48px?
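
A minimal sketch of this arithmetic, assuming that a percentage font-size 
without an ancestor font-size is resolved against the height of one cell 
(root container height divided by the number of rows). The function name 
and parameters are hypothetical and only serve as illustration; they are 
not taken from the TTML specification.

```python
def computed_font_size_px(root_height_px: float, rows: int,
                          percent: float = 100.0) -> float:
    """Return the computed font size in pixels.

    Assumption: the percentage is relative to the height of one cell,
    i.e. root container height / number of rows in the cell resolution.
    """
    cell_height_px = root_height_px / rows
    return cell_height_px * percent / 100.0


# 720px root container, cell resolution "32 15", font-size 100%
size_full = computed_font_size_px(720, 15)
# Same grid, font-size 80%
size_reduced = computed_font_size_px(720, 15, 80.0)
print(size_full, size_reduced)
```

With the values from the question the first call yields 48px, which is the 
result asked about above.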

Another question is how this will be mapped into CSS. Assuming the 
font-family is specified as Arial, should the calculated value of the 
CSS property font-size be 48px? Would the scaling in current browser 
implementations work as intended by the TTML definition and scale the EM 
square of the chosen Arial font?

----------------------------
Line height
----------------------------
Obviously the relationship between font-size and line-height is very 
important for subtitling. In legacy formats subtitles are positioned on 
an exact number of lines. To control the grid of lines in TTML the 
line-height has to be specified explicitly. But as the font-size would 
not shrink or increase automatically according to a fixed line-height 
this has to be done with care (e.g. to avoid colliding glyphs).

If you give up the control over the rendered line height you could 
choose the initial value of "normal". The computed value for the 
line-height would be the same as the largest font size that applies to 
any descendant element[1]. So if the font-size is 48px, the value of 
line-height will be 48px as well.

This could actually result in unwanted presentation because, as I 
understand it, there will be no white space between the content of two 
adjacent lines (so there will be no leading?).

In XSL-FO 1.1 (same as XSL 1.1) the value of "normal" for line-height is 
defined as follows [2]:

 > 7.16.4 "line-height"
 > [Normal] tells user agents to set the computed value to a "reasonable"
 > value based on the font size of the element. [...] We recommend a
 > computed value for "normal" between 1.0 to 1.2.

The same definition can be found in the CSS 2 spec [3].

This user agent dependent behaviour is reflected in current browser 
implementations. Authors cannot assume a specific line-height when 
setting the value to "normal", even if they know the font-family and 
font-size. So there may be a problem when mapping TTML lineHeight with the 
value of "normal" to the CSS property line-height with the value "normal"?
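
To illustrate how much room this leaves, here is a rough sketch that 
applies the recommended 1.0 to 1.2 range quoted above to a 48px font-size 
and counts how many whole lines would fit into a 720px root container. The 
function names are hypothetical, and a real user agent is free to pick a 
value outside this range.

```python
def normal_line_height_range_px(font_size_px: float,
                                low_factor: float = 1.0,
                                high_factor: float = 1.2) -> tuple:
    """Return (min, max) line height in px that "normal" might resolve to.

    The default factors are the 1.0 to 1.2 range recommended in the
    XSL 1.1 text quoted above; they are a recommendation, not a guarantee.
    """
    return font_size_px * low_factor, font_size_px * high_factor


def lines_that_fit(container_height_px: float, line_height_px: float) -> int:
    """Number of whole lines that fit into the container height."""
    return int(container_height_px // line_height_px)


low_px, high_px = normal_line_height_range_px(48)
print(lines_that_fit(720, low_px), lines_that_fit(720, high_px))
```

Under these assumptions the same document could end up with 15 lines in one 
renderer and only 12 in another.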

-------------------------------
Font-Size / Line Height
-------------------------------
Currently the cell resolution is the only way to relate the font-size to 
the height of the video (if a specification explicitly sets the root 
container to the size of the video).
As Sean stated, the "vh" strategy for font-size is currently being 
evaluated as a way to relate the font-size directly to the size of the 
video. I assume that this should be similar (or identical) to what is 
proposed for viewport-relative lengths in CSS3 [4] and also defined in the 
CSS files of "Conversion of 608/708 captions to WebVTT" [5]. Perhaps the 
list can discuss how this could be applied to TTML and whether it would be 
a solution for ISSUE-225 [6].
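
As a rough sketch of how the two approaches relate, the following assumes 
that 1vh equals 1% of the video height (as in the CSS3 definition of 
viewport-relative lengths) and that the root container has the size of the 
video. How a "vh" unit would actually be defined for TTML is not settled 
here; the function names are hypothetical.

```python
def cell_font_size_as_vh(rows: int, size_in_cells: float = 1.0) -> float:
    """Express a font size given in cells (c) as a vh value.

    Assumption: one cell row is 100 / rows percent of the video height.
    """
    return 100.0 * size_in_cells / rows


def vh_to_px(vh: float, video_height_px: float) -> float:
    """Resolve a vh value against a concrete video height in pixels."""
    return vh * video_height_px / 100.0


# Font size of 1c in a grid with 15 rows, expressed in vh
size_vh = cell_font_size_as_vh(15)
# The same vh value resolved for two different video heights
print(size_vh, vh_to_px(size_vh, 720), vh_to_px(size_vh, 1080))
```

A font-size of 1c in a grid with 15 rows corresponds to about 6.67vh, 
which resolves to roughly 48px for a 720px video and 72px for a 1080px 
video without any reference to a cell grid.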

Best regards,

Andreas

[1] 
https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml10/spec/ttaf1-dfxp.html?content-type=text/html%3bcharset=utf-8#style-attribute-lineHeight
[2] http://www.w3.org/TR/xsl/#line-height
[3] http://www.w3.org/TR/CSS2/visudet.html#propdef-line-height
[4] http://www.w3.org/TR/css3-values/#viewport-relative-lengths
[5] 
https://dvcs.w3.org/hg/text-tracks/raw-file/default/608toVTT/608toVTT.html#browsers
[6] http://www.w3.org/AudioVideo/TT/tracker/issues/225


-------- Original Message --------
Subject: Re: [EBU-TT-D] Updated list of proposed TTML features
Date: Mon, 8 Jul 2013 09:14:19 +0000
From: Nigel Megitt <nigel.megitt@bbc.co.uk>
To: John Birch <John.Birch@screensystems.tv>, Andreas Tai <tai@irt.de>, 
"EBU-TT-D@list.ebu.ch" <EBU-TT-D@list.ebu.ch>



I agree the concepts of the line spacing and font height need to be
separately and clearly defined to allow implementations to be able to
render text as it's intended and to avoid the confusion you've described
John. I think this is what the TTML spec is trying to do by allowing
lineHeight and fontSize to be specified with a clear relationship. However
it falls short as you've pointed out. I'd propose the following remedial
steps, certainly in EBU-TT and hopefully in a future iteration of TTML:

1. State that we (TTML) assume that any presentation device will apply
appropriate rules to generate a font of the required size, regardless of
what algorithm is used either to scale or select a pre-defined font of a
similar size.

The problem with the current TTML wording is that it says (in
http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-fontSize) both "font size
is interpreted as a scaling transform to the font's design EM square" and
"horizontal and vertical scaling of a glyph's em square" which seem to
conflict. Is it each individual glyph that should be scaled, or the entire
font? As I understand it the font has an em square and each glyph has its
own width and height that may be different from the em square.

2. State that TTML assumes that the em square unit is a suitable line
spacing size for the chosen font, i.e. that it includes the ascent,
descent and extra space needed above and below, left and right. The
article http://www.microsoft.com/typography/otspec/TTCH01.htm includes a
good picture of this in the section headed "FUnits and the em square".

I think both of these could be inferred from the current spec but by
making them explicit it would help to avoid the confusion.

The result should be that each row in a cell grid is 1c and there's no
need for 80%s and 120%s here and there (unless a particular visual effect
squeezing or stretching the baseline spacing is desired!).

Kind regards,

Nigel



-------- Original Message --------
Subject: Re: [EBU-TT-D] Updated list of proposed TTML features
Date: Wed, 3 Jul 2013 18:13:19 +0200
From: Andreas Tai <tai@irt.de>
To: John Birch <John.Birch@screensystems.tv>
Cc: EBU-TT-D@list.ebu.ch <EBU-TT-D@list.ebu.ch>



Thanks for the comments, John. In general I think that we won't 
constrain the supported TTML feature list for EBU-TT-D. This is more 
about a best practice recommendation.

See further comments in-line.

Best regards,

Andreas


> *From:*Andreas Tai [mailto:tai@irt.de]
> *Sent:* 02 July 2013 15:10
> *To:* John Birch
> *Cc:* Nigel Megitt; EBU-TT-D@list.ebu.ch
> *Subject:* Re: [EBU-TT-D] Updated list of proposed TTML features
>
> Hi John,
>
> I see a problem if both font-size and line-height are specified 
> explicitly. Given the uncertainties (e.g. the chosen font), in my 
> view there is a high probability of unwanted presentation. Worst case 
> would be that the lines overlap because of a font that is not 
> appropriate for the line-height.
>
> >> I see the opposite. By specifying both line height and font size you 
> are defining exactly the desired outcome. There is NO interpretation 
> possible. If the font size is less than the line height then the EM 
> cell must be smaller than the line height. If a 'badly designed font' 
> where the glyph exceeds the em square by a large amount is specified, 
> then that problem exists regardless of whether you are explicit about 
> line height or choose a value of 'normal'. Fonts that exceed the em 
> square are unlikely to be used in subtitling, as (at least in my 
> experience) they are usually those that represent cursive styles.
>
>

I am not sure if you would have problems in current CSS browser 
implementations even if you have a "badly designed font". I would still 
expect that the displayed font will not exceed the line.


> To set the line-height to "normal" is a common solution in CSS and the 
> default value in CSS as in TTML. I therefore think that this concept 
> would be understood by the web community. Of course it will be far 
> better, if you had a reverse dependency: you set a fixed line-height 
> and the rendering machine has to choose the appropriate font/font-size 
> to fit in this line. But I do not expect that this will be the chosen 
> solution in future editions of TTML or CSS.
>
> >> The problem is that CSS does not typically use a concept of directly 
> controlling line positions... the use of 'normal' effectively leaves 
> the line height up to the renderer, based on the font size and text 
> content. This is absolutely contrary to what is required for 
> subtitling, where the extent of the text MUST be controlled.
>
I would not take this for granted. The input I get from our broadcasters 
is that exact line-height and exact positions are no hard requirements, 
while colours are of high importance.

> The fact that this effect is 'understood by the community' in itself 
> creates a problem. The community needs to re-understand that, in the 
> context of subtitling, controlling the exact text size and position is 
> more important.
>

I am sceptical about "educating" the web community. In the past (and in 
the present) this was not very successful. What I get from our 
discussions is that a good integration in HTML and CSS is important for 
EBU-TT-D. I don't think that these standards and implementations will 
worry too much about specific subtitling and captioning requirements.

I agree exactly, that shrinking to fit a line (or maybe a region) would 
be far better, but this again is an unknown concept within CSS. In fact 
I am not sure I would like this any better, since the likelihood is that 
you would then get subtitles of varying text sizes throughout a 
presentation. However, I'm pretty sure most implementations will support 
line height values other than 'normal'.

As said above: I think both strategies (line-height = normal or choose 
exact line-height) will be allowed in EBU-TT-D.


> I agree, that we should not change mapping of the root container to 
> the size of the video. I think that this interpretation has become 
> accepted. From an interoperability perspective this is of high value : )
>
> Yes, absolutely.
>
>
> Best regards,
>
> Andreas
>
> On 02.07.2013 14:16, John Birch wrote:
>
>     Hi Andreas,
>
>     Yes, these are important considerations... For me, both the line
>     height and the font-size would be specified as percentages (the
>     line height would be slightly larger than the font-size).
>
>     E.g. line height 7%, font size 6%. This would mean 12 rows of
>     characters would occupy 84% of the root container. Roughly
>     equivalent to a Teletext presentation. A 7% / 6% line to font
>     ratio is approx. 117%.
>
>     Personally I find the alternative approach to be more difficult to
>     comprehend. Particularly when you factor in the 'safe area' concept.
>
>     If the cell resolution could be applied to a 'super region' (i.e.
>     one that could be defined as the safe area) then it might be more
>     straight forward. In other words conceptually the root container
>     is not the full extent of the active video... but I don't really
>     want to go there -- you then have problems when you want (and
>     need) to write outside of the safe area (e.g. speech marks).
>
>     Best regards,
>
>     John
>
>     *John Birch | Strategic Partnerships Manager | Screen
>     *Main Line : +44 1473 831700 | Ext : 270 | Direct Dial : +44 1473
>     834532
>     Mobile : +44 7919 558380 | Fax : +44 1473 830078
>     John.Birch@screensystems.tv <mailto:John.Birch@screensystems.tv> |
>     www.screensystems.tv <http://www.screensystems.tv> |
>     https://twitter.com/screensystems <https://twitter.com/screensystems>
>
>     *Visit us at
>     SMPTE conference & exhibition, Stand G35, Sydney Exhibition
>     Centre, Darling Harbour, 23-26th July*
>
>     *P**Before printing, think about the environment*
>
>     *From:*Andreas Tai [mailto:tai@irt.de]
>     *Sent:* 02 July 2013 12:32
>     *To:* John Birch
>     *Cc:* Nigel Megitt; EBU-TT-D@list.ebu.ch <mailto:EBU-TT-D@list.ebu.ch>
>     *Subject:* Re: [EBU-TT-D] Updated list of proposed TTML features
>
>     I don't want to let go of cell resolution for EBU-TT-D so easily  ; )
>     I think there is value in this concept regardless of the legacy
>     argument. For font-size it gives you a tool to design a grid of
>     lines and decide how many lines you "intend" to address. After
>     that you can choose the appropriate font-size in relation to this
>     grid.
>
>     The font-size does not match exactly 1c. The rows
>     should define the height of the line in the intended grid, not the
>     height of the font.
>
>     An important use case will be to translate the values for
>     line-height and font-size to CSS. As in TTML the relationship
>     between font-size and line-height can be expressed in CSS through
>     the value "normal" for line-height. Then a line height that fits
>     the font-size will be set through the renderer (the browser in the
>     case of CSS). The recommended line-height in the CSS spec is 100
>     to 120% of the font-size. After some browser tests I found that a
>     font-size of 0.8c or 80% would be a good choice so that the grid
>     will be filled but not extend beyond the root container.
>
>     This approach has some variables that cannot be computed (not only
>     the concrete font that is used for presentation but also, for
>     HTML/CSS, the browser behaviour). Nevertheless I think this can be
>     a good and transparent guide to select a font-size that is
>     independent of the size of the video and preserves the concept
>     of "lines".
>
>     Best regards,
>
>     Andreas
>
>
>     On 02.07.2013 12:16, John Birch wrote:
>
>         I have no problem at all with retaining cell resolution and
>         grid based philosophies in Part 1 files... i.e. in archived
>         exchanged subtitle files.
>
>         Where I think the cell resolution grid strategy falls down is
>         in the delivered distribution format, where arguably having a
>         single way of expressing the presentation, in as simple a way
>         as possible, is desirable.
>
>         In my world there would (almost always) be a computer based
>         conversion *from Part 1 to EBU-TT-D*. This conversion is not
>         (necessarily) reversible.
>
>         So, for example, we can translate from 'cell resolution /
>         grid' into 'percentage of root container' when we move from a
>         (part 2 style) Part 1 document to an EBU-TT-D document.
>
>         A conversion away from mono spaced fonts might also be
>         performed here too. Loss of some metadata is expected.
>         Addition of some metadata (e.g. language track identification)
>         might be necessary since although in the Part 1 world we talk
>         about an external asset management system, that may not exist
>         in the distribution context.
>
>         Best,
>
>         John
>
>         *From:*Nigel Megitt [mailto:nigel.megitt@bbc.co.uk]
>         *Sent:* 02 July 2013 10:56
>         *To:* John Birch; Andreas Tai
>         *Cc:* EBU-TT-D@list.ebu.ch <mailto:EBU-TT-D@list.ebu.ch>
>         *Subject:* Re: [EBU-TT-D] Updated list of proposed TTML features
>
>         Hi John,
>
>         Thanks for the welcome back!
>
>         On the authoring for legacy argument I don't particularly
>         /like/ it either but I think we have to recognise it as a
>         stage that a lot of adopters will feel they have to go
>         through. If it looks as though they're blocked at that stage
>         they may never get any further. And if they're doing that then
>         they need to ensure that if the subtitles will be presented
>         using a mono-spaced font there is enough space to fit the text
>         on each row. Happily TTML supports mono-spaced fonts and
>         there's been no suggestion so far that we should remove this
>         support.
>
>         Kind regards,
>
>         Nigel
>
>         *--*
>
>         *Nigel Megitt*
>
>         Lead Technologist, BBC Technology, Distribution & Archives
>
>         Telephone: +44 (0)208 0082360
>
>         BC4 A3 Broadcast Centre, Media Village, 201 Wood Lane, London
>         W12 7TP
>
>         On 02/07/2013 10:25, "John Birch" <John.Birch@screensystems.tv
>         <mailto:John.Birch@screensystems.tv>> wrote:
>
>             Hi Nigel,
>
>             Welcome back :)
>
>             Yep, definitely an elephant... and I agree that we should
>             very much move away from grid based mentalities. In fact I
>             don't really have much 'sympathy' with the authoring for
>             legacy argument, since realistically the required
>             constraints are in the number of characters per line and the
>             number of rows per screen. I don't think there is a strong
>             requirement for retaining a mono-spaced font concept.
>
>             In terms of multiples, 160 by 360 also works (with a
>             rather strange higher resolution in the vertical
>             dimension), giving a 4 by 15 cell for 40 by 24, and a 5 by
>             24 cell for 32 by 15.
>
>             Personally though, *for EBU-TT-D*, I actually favour a
>             default cell resolution of '1c 1c' across the root
>             container, and using (potentially fractional) percentages
>             for font size. *In effect this abandons grids altogether.*
>
>             I completely agree with your comment on font selection. I
>             believe an implementation should be guided to choose a
>             closest fit font 'point size' that fits the scaled font
>             box, even if it is 'slightly' smaller or larger than
>             calculated.
>
>             Best regards,
>
>             John
>
>             *From:*Nigel Megitt [mailto:nigel.megitt@bbc.co.uk]
>             *Sent:* 02 July 2013 10:05
>             *To:* John Birch; Andreas Tai
>             *Cc:* EBU-TT-D@list.ebu.ch <mailto:EBU-TT-D@list.ebu.ch>
>             *Subject:* Re: [EBU-TT-D] Updated list of proposed TTML
>             features
>
>             It's been interesting to read this thread on returning
>             from holiday. A few thoughts from me:
>
>             - The 'elephant in the room' that everyone has been
>             politely avoiding is that the cell resolution grid is
>             derived from pre-existing standards that carry the
>             emotional baggage of 'this is what we're used to and
>             therefore like'. In the US it was convenient to choose one
>             cell resolution, presumably to make translating from
>             existing documents easier (I don't know the exact
>             reasons). In much of the rest of the world a different
>             cell resolution has historically been used, so the US
>             choice is somewhat less convenient. If we're interested in
>             driving adoption then we have to understand the negative
>             impact of sticking with the US resolution as a default,
>             especially if we then put barriers in the way to changing
>             it on a document by document basis. The simple maths
>             described earlier shows that this is not a technical issue
>             but a perception problem.
>
>             - However there is also a technical problem: If authors
>             also wish to use cell resolution for positioning, perhaps
>             to make downstream conversion to teletext
>             subtitles straightforward (still likely to be in use in a
>             lot of countries for several years), then the choice of
>             cell resolution becomes a significant constraint. In this
>             case the 32 by 15 grid would be very unhelpful indeed for
>             anyone targeting a 40 by 24 grid downstream. Similarly it
>             would be inconvenient the other way around. I think we do
>             need to consider this 'stepping stone' use case even
>             though it's not where we want to end up, i.e. without the
>             dependency on legacy representations for subtitles.
>
>             - Three strategies that might make it equally convenient
>             for both 'histories' are, in no particular order:
>
>             o A) Create a new initial cell resolution that has integer
>             multiples of both current grids, which would be 32x40 by
>             15x24 = 1280 by 360, to allow an equally complex or simple
>             mapping from whatever prior standard has been in use,
>             anywhere.
>
>             o B) Abandon grids altogether and relate font size directly
>             to the root container dimension. This would make the
>             'stepping stone' use case described above more complicated
>             but still feasible.
>
>             o C) Require the cell grid to be explicitly specified if
>             used directly or by implication, i.e. make the concept of
>             initial value carry no meaning. So if fontSize is not
>             specified, a cell resolution for the root container
>             *must* be specified, or alternatively if a fontSize is
>             specified but not in units of c and cell resolution is not
>             used for positioning purposes elsewhere in the document
>             then the cell resolution may be omitted as it isn't used
>             anywhere.
>
>             - I can't see how in a global context we could require that
>             the root cell resolution is only permitted to have a
>             single value, be it 32 by 15 or 40 by 24 or anything else,
>             except perhaps for 1 by 1 as the mechanism for abandoning
>             grids altogether.
>
>             Something else to note:
>
>             - Typographical scaling of fonts is not straightforward,
>             and can't be done linearly without impacting readability:
>             the use of percentages suggests that an implementation
>             might use a single master font and scale it. We should be
>             clear that, regardless of the mechanism for specifying the
>             EM-square size (ultimately to be in pixels), the font size
>             is a guide for the implementation to select an appropriate
>             font to fit that box.
>
>             Kind regards,
>
>             Nigel
>
>


-- 
------------------------------------------------
Andreas Tai
Production Systems Television IRT - Institut fuer Rundfunktechnik GmbH
R&D Institute of ARD, ZDF, DRadio, ORF and SRG/SSR
Floriansmuehlstrasse 60, D-80939 Munich, Germany

Phone: +49 89 32399-389 | Fax: +49 89 32399-200
http: www.irt.de | Email: tai@irt.de
------------------------------------------------

registration court & managing director:
Munich Commercial, RegNo. B 5191
Dr. Klaus Illgner-Fehns
------------------------------------------------

Received on Tuesday, 16 July 2013 13:15:17 UTC